Set Up Airflow on Ubuntu Server
Prepare the System (on all nodes)
sudo apt update && sudo apt upgrade -y
sudo apt install -y python3-pip python3-venv libpq-dev gcc build-essential libssl-dev libffi-dev python3-dev
Create Airflow User and Environment (on all nodes)
sudo adduser airflow
sudo usermod -aG sudo airflow
su - airflow
# Inside user shell
python3 -m venv airflow-venv
source airflow-venv/bin/activate
Install Apache Airflow
Use constraints to install a specific version:
AIRFLOW_VERSION=3.0.1
PYTHON_VERSION="$(python --version | cut -d " " -f 2 | cut -d "." -f 1,2)"
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
pip install "apache-airflow[celery,postgres,mysql,redis,auth,google_auth]==${AIRFLOW_VERSION}" --constraint "$CONSTRAINT_URL"
This installs Airflow with CeleryExecutor support (for distributed tasks), PostgreSQL (metadata DB), and Redis (as message broker).
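The package version and the constraints file need to refer to the same Airflow release, so it can help to derive both from one place. The following is a minimal sketch rather than part of the original steps: airflow_constraint_url and airflow_requirement are hypothetical helper names, and the URL pattern is the one used above.

```shell
#!/bin/sh
# Hypothetical helpers: build the pip requirement and the matching
# constraints URL from one Airflow version and one Python version.

airflow_constraint_url() {
    # $1 = Airflow version (e.g. 3.0.1), $2 = Python major.minor (e.g. 3.12)
    printf 'https://raw.githubusercontent.com/apache/airflow/constraints-%s/constraints-%s.txt\n' "$1" "$2"
}

airflow_requirement() {
    # $1 = Airflow version, $2 = comma-separated extras
    printf 'apache-airflow[%s]==%s\n' "$2" "$1"
}

# Usage inside the activated virtualenv (not executed here):
#   PYTHON_VERSION="$(python --version | cut -d " " -f 2 | cut -d "." -f 1,2)"
#   pip install "$(airflow_requirement 3.0.1 celery,postgres,redis)" \
#       --constraint "$(airflow_constraint_url 3.0.1 "$PYTHON_VERSION")"
```

Keeping the version in a single argument makes it harder to upgrade the package while forgetting to switch the constraints file.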
PostgreSQL (Metadata DB)
Install and configure PostgreSQL:
sudo apt install postgresql postgresql-contrib
sudo -u postgres psql
CREATE DATABASE airflow;
CREATE USER airflow WITH PASSWORD 'airflowpass';
GRANT ALL PRIVILEGES ON DATABASE airflow TO airflow;
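-- Switch to the new database so the schema change applies to it, not to the default postgres database
\c airflow
ALTER SCHEMA public OWNER TO airflow;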
Edit /etc/postgresql/*/main/pg_hba.conf to allow remote access:
host airflow airflow 0.0.0.0/0 md5
Edit /etc/postgresql/*/main/postgresql.conf:
listen_addresses = '*'
Restart PostgreSQL:
sudo systemctl restart postgresql
Open firewall port if needed:
sudo ufw allow 5432
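A 0.0.0.0/0 entry accepts connections from any address. As a hedged alternative, the sketch below prints a pg_hba.conf entry limited to one subnet. The helper name pg_hba_entry and the 10.0.0.0/24 range are assumptions for illustration, not values from this setup.

```shell
#!/bin/sh
# Hypothetical helper: print one pg_hba.conf "host" entry.
pg_hba_entry() {
    # $1 = database, $2 = user, $3 = client CIDR range,
    # $4 = auth method (defaults to md5, as used in this guide)
    printf 'host %s %s %s %s\n' "$1" "$2" "$3" "${4:-md5}"
}

# Example: only clients in an assumed private subnet may connect.
pg_hba_entry airflow airflow 10.0.0.0/24
```

The firewall rule could be narrowed the same way, for example with sudo ufw allow from 10.0.0.0/24 to any port 5432 proto tcp.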
Redis (Message Broker)
sudo apt install redis-server
sudo systemctl enable redis-server
sudo systemctl start redis-server
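Test the connection: redis-cli ping should return PONG.
The packaged default listens on localhost only, so for workers on other nodes adjust bind (and protected-mode or requirepass) in /etc/redis/redis.conf, then restart Redis.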
Airflow Configuration (on ALL nodes)
Initialize Airflow
export AIRFLOW_HOME=~/airflow
airflow db check # only on one node
Configure ~/airflow/airflow.cfg
Set the executor (in the [core] section):
executor = CeleryExecutor
Set the PostgreSQL connection (in the [database] section):
sql_alchemy_conn = postgresql+psycopg2://airflow:airflowpass@<db_host>:5432/airflow
Set Redis as the broker and PostgreSQL as the result backend (in the [celery] section):
broker_url = redis://<redis_host>:6379/0
result_backend = db+postgresql://airflow:airflowpass@<db_host>:5432/airflow
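Set other options (e.g., parallelism in [core], worker_concurrency in [celery]) as needed. With Airflow 3, workers on other nodes also reach the API server over HTTP, so it is worth checking that execution_api_server_url in [core] points at the API server host rather than its localhost default.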
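Airflow also reads any option from an environment variable named AIRFLOW__<SECTION>__<KEY>, which takes precedence over airflow.cfg. The sketch below expresses the settings above that way, which can be convenient when the same values must be pushed to several nodes. DB_HOST and REDIS_HOST are placeholder variables introduced here, and the default host names are assumptions.

```shell
#!/bin/sh
# Placeholder hosts (assumptions); replace with the real addresses.
DB_HOST="${DB_HOST:-db.internal}"
REDIS_HOST="${REDIS_HOST:-redis.internal}"

# [core] executor
export AIRFLOW__CORE__EXECUTOR="CeleryExecutor"
# [database] sql_alchemy_conn
export AIRFLOW__DATABASE__SQL_ALCHEMY_CONN="postgresql+psycopg2://airflow:airflowpass@${DB_HOST}:5432/airflow"
# [celery] broker_url and result_backend
export AIRFLOW__CELERY__BROKER_URL="redis://${REDIS_HOST}:6379/0"
export AIRFLOW__CELERY__RESULT_BACKEND="db+postgresql://airflow:airflowpass@${DB_HOST}:5432/airflow"
```

For systemd-managed services these would go into the unit's Environment= lines or an EnvironmentFile rather than a login shell.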
Checking DB Connections
airflow db check # success
airflow db migrate
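# Create an admin user (you will be prompted for a password).
# In Airflow 3 this command is available only when the FAB auth manager is configured; the default simple auth manager does not provide it.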
airflow users create \
--username admin \
--firstname Peter \
--lastname Parker \
--role Admin \
--email <admin_email>
Running Airflow Components
Run these as long-lived processes, either as systemd services or under tmux or supervisord.
API server (serves the web UI; replaces the Airflow 2 webserver command)
airflow api-server --port 8080
Scheduler
airflow scheduler
DAG processor (a separate, required process in Airflow 3)
airflow dag-processor
Worker (on worker nodes)
airflow celery worker
Flower (optional: Celery monitoring UI)
airflow celery flower --port=5555
Systemd Service Setup
You can create systemd unit files to run the services automatically.
Example for webserver:
# /etc/systemd/system/airflow-webserver.service
[Unit]
Description=Airflow webserver daemon
After=network.target
[Service]
Environment="AIRFLOW_HOME=/home/airflow/airflow"
User=airflow
Group=airflow
Type=simple
ExecStart=/home/airflow/airflow-venv/bin/airflow api-server
[Install]
WantedBy=multi-user.target
Then:
sudo systemctl daemon-reload
sudo systemctl enable airflow-webserver
sudo systemctl start airflow-webserver
Repeat for the other components (scheduler, worker), changing the unit file name, Description, and the subcommand in ExecStart.
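Since the units differ only in their description and subcommand, they can be generated from one template. This is a minimal sketch under that assumption: airflow_unit is a hypothetical helper, and the paths are the ones from the webserver example.

```shell
#!/bin/sh
# Hypothetical helper: print a unit file for one Airflow component,
# reusing the paths from the webserver example.
airflow_unit() {
    # $1 = description word, remaining args = airflow subcommand
    desc="$1"
    shift
    cat <<EOF
[Unit]
Description=Airflow ${desc} daemon
After=network.target

[Service]
Environment="AIRFLOW_HOME=/home/airflow/airflow"
User=airflow
Group=airflow
Type=simple
ExecStart=/home/airflow/airflow-venv/bin/airflow $*

[Install]
WantedBy=multi-user.target
EOF
}

# Print the scheduler unit. To install it, pipe the output through
#   sudo tee /etc/systemd/system/airflow-scheduler.service
airflow_unit scheduler scheduler
```

A worker unit would be produced with airflow_unit worker celery worker, then enabled and started in the same way as the webserver service.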
Add .bashrc env config
echo "export AIRFLOW_HOME=/home/airflow/airflow" >> ~/.bashrc
echo "source /home/airflow/airflow-venv/bin/activate" >> ~/.bashrc
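Running the echo commands more than once leaves duplicate lines in .bashrc. One way to avoid that, sketched here with a hypothetical append_once helper, is to append a line only when it is not already present. The demonstration writes to a scratch file rather than the real .bashrc.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if it is not already there.
append_once() {
    # $1 = exact line, $2 = target file
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Demonstrate on a scratch file instead of the real ~/.bashrc.
demo_rc="$(mktemp)"
append_once 'export AIRFLOW_HOME=/home/airflow/airflow' "$demo_rc"
append_once 'export AIRFLOW_HOME=/home/airflow/airflow' "$demo_rc"   # no-op the second time
append_once 'source /home/airflow/airflow-venv/bin/activate' "$demo_rc"
cat "$demo_rc"
```

The scratch file ends up with each line exactly once; pointing the second argument at ~/.bashrc would apply the same behaviour to the real file.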
Web UI Access
Open the web UI port (8080) in the firewall:
sudo ufw allow 8080
Visit: http://<webserver-ip>:8080
(Optional) Serving using NGINX
Install Nginx
sudo apt update
sudo apt install nginx -y
Create a config file
sudo nano /etc/nginx/sites-available/airflow
server {
    listen 80;
    server_name airflow.diybaazar.com;

    # Redirect HTTP to HTTPS
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name airflow.diybaazar.com;

    ssl_certificate /etc/ssl/certs/diybaazar/ssl.pem;
    ssl_certificate_key /etc/ssl/certs/diybaazar/ssl.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
Enable it
sudo ln -s /etc/nginx/sites-available/airflow /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx