Set Up Airflow on Ubuntu Server

Prepare the System (on all nodes)

sudo apt update && sudo apt upgrade -y
sudo apt install -y python3-pip python3-venv libpq-dev gcc build-essential libssl-dev libffi-dev python3-dev

Create Airflow User and Environment (on all nodes)

sudo adduser airflow
sudo usermod -aG sudo airflow
su - airflow

# Inside user shell
python3 -m venv airflow-venv
source airflow-venv/bin/activate

Install Apache Airflow

Use constraints to install a specific version:

AIRFLOW_VERSION=3.0.1
PYTHON_VERSION="$(python --version | cut -d " " -f 2 | cut -d "." -f 1,2)"
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"

pip install "apache-airflow[celery,postgres,mysql,redis,auth,google_auth]==${AIRFLOW_VERSION}" --constraint "$CONSTRAINT_URL"

This installs Airflow with CeleryExecutor support (for distributed tasks), PostgreSQL (metadata DB), and Redis (as message broker).
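
The constraints file is selected by two values: the Airflow version and the Python major.minor version. The following is a minimal sketch of how the URL above is assembled. The helper names python_minor and constraint_url are hypothetical, introduced only for illustration:

```shell
# Hypothetical helpers, for illustration only.

# Reduce a version string such as "Python 3.12.3" to "3.12",
# using the same cut pipeline as the install step.
python_minor() {
    printf '%s\n' "$1" | cut -d " " -f 2 | cut -d "." -f 1,2
}

# Assemble the constraints URL from an Airflow version ($1)
# and a Python major.minor version ($2).
constraint_url() {
    printf 'https://raw.githubusercontent.com/apache/airflow/constraints-%s/constraints-%s.txt\n' "$1" "$2"
}

# Example: Airflow 3.0.1 on a Python 3.12.x interpreter.
constraint_url "3.0.1" "$(python_minor "Python 3.12.3")"
```

Printing the URL before installing makes it easy to confirm that a constraints file exists for your interpreter version.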

PostgreSQL (Metadata DB)

Install and configure PostgreSQL:

sudo apt install postgresql postgresql-contrib
sudo -u postgres psql
CREATE DATABASE airflow;
CREATE USER airflow WITH PASSWORD 'airflowpass';
GRANT ALL PRIVILEGES ON DATABASE airflow TO airflow;
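-- Connect to the airflow database so the schema change applies there, not to the default database
\c airflow
ALTER SCHEMA public OWNER TO airflow;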

Edit /etc/postgresql/*/main/pg_hba.conf to allow remote access from the other nodes. Where possible, replace 0.0.0.0/0 with the subnet of your Airflow nodes:

host    airflow    airflow    0.0.0.0/0    md5

Edit /etc/postgresql/*/main/postgresql.conf:

listen_addresses = '*'

Restart PostgreSQL:

sudo systemctl restart postgresql

Open the PostgreSQL port in the firewall if needed:

sudo ufw allow 5432
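
To confirm that remote access works, you can run a trivial query from one of the other nodes. This is a sketch that assumes the psql client is installed on that node (for example via the postgresql-client package). The helper names pg_uri and check_metadata_db and the host db.example.internal are hypothetical:

```shell
# Hypothetical helper: build a PostgreSQL connection URI from its parts.
# $1=user  $2=password  $3=host  $4=port  $5=database
pg_uri() {
    printf 'postgresql://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Run a trivial query against the metadata DB on host $1.
# The function's exit status is psql's exit status.
check_metadata_db() {
    psql "$(pg_uri airflow airflowpass "$1" 5432 airflow)" -c 'SELECT 1;'
}

# Usage from a worker node (placeholder host name):
#   check_metadata_db db.example.internal
```

A connection failure here usually points at pg_hba.conf, listen_addresses, or the firewall rule rather than at Airflow itself.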

Redis (Message Broker)

sudo apt install redis-server
sudo systemctl enable redis-server
sudo systemctl start redis-server

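Test connection: redis-cli ping → should return PONG

Note: on Ubuntu, Redis listens only on localhost by default. If workers run on other nodes, adjust the bind directive in /etc/redis/redis.conf and restrict access (password or firewall) before exposing it.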

Airflow Configuration (on ALL nodes)

Initialize Airflow

export AIRFLOW_HOME=~/airflow
airflow db check  # only on one node

Configure ~/airflow/airflow.cfg

Set the executor (in the [core] section):

executor = CeleryExecutor

Set the PostgreSQL connection (in the [database] section):

sql_alchemy_conn = postgresql+psycopg2://airflow:airflowpass@<db_host>:5432/airflow

Set Redis as the broker and PostgreSQL as the result backend (in the [celery] section):

broker_url = redis://<redis_host>:6379/0
result_backend = db+postgresql://airflow:airflowpass@<db_host>:5432/airflow

Set other configs (e.g., parallelism, worker concurrency) as needed.
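
The same settings can also be supplied as environment variables, which Airflow reads using the AIRFLOW__{SECTION}__{KEY} naming pattern and which take precedence over airflow.cfg. This is a minimal sketch. DB_HOST and REDIS_HOST are hypothetical helper variables, and the example.internal names are placeholders:

```shell
# Placeholder hosts; replace with the real addresses of your DB and Redis nodes.
DB_HOST="${DB_HOST:-db.example.internal}"
REDIS_HOST="${REDIS_HOST:-redis.example.internal}"

# Each variable overrides the matching airflow.cfg entry:
# [core] executor
export AIRFLOW__CORE__EXECUTOR="CeleryExecutor"
# [database] sql_alchemy_conn
export AIRFLOW__DATABASE__SQL_ALCHEMY_CONN="postgresql+psycopg2://airflow:airflowpass@${DB_HOST}:5432/airflow"
# [celery] broker_url and result_backend
export AIRFLOW__CELERY__BROKER_URL="redis://${REDIS_HOST}:6379/0"
export AIRFLOW__CELERY__RESULT_BACKEND="db+postgresql://airflow:airflowpass@${DB_HOST}:5432/airflow"
```

Environment variables can be convenient when the same configuration has to be distributed to several nodes, since they can live in one file that every service sources.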

Checking DB Connections

airflow db check    # should report a successful connection
airflow db migrate  # run on one node only

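# Create an admin user (prompts for a password).
# Note: in Airflow 3 the users command is provided by the FAB auth manager
# (apache-airflow-providers-fab); the default SimpleAuthManager does not offer it.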
airflow users create \
    --username admin \
    --firstname Peter \
    --lastname Parker \
    --role Admin \
    --email <admin_email>

Running Airflow Components

Run these processes as systemd services, or inside tmux or supervisord, so that they keep running after you log out.

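API server (serves the web UI; replaces the airflow webserver command from Airflow 2)
airflow api-server --port 8080
Scheduler
airflow scheduler
DAG processor (a separate, required process in Airflow 3)
airflow dag-processor
Worker (on worker nodes)
airflow celery worker
Flower (optional: Celery monitoring UI)
airflow celery flower --port=5555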

Systemd Service Setup

You can create systemd unit files to run the services automatically.

Example for webserver:

# /etc/systemd/system/airflow-webserver.service
[Unit]
Description=Airflow webserver daemon
After=network.target

[Service]
Environment="AIRFLOW_HOME=/home/airflow/airflow"
User=airflow
Group=airflow
Type=simple
ExecStart=/home/airflow/airflow-venv/bin/airflow api-server

[Install]
WantedBy=multi-user.target

Then:

sudo systemctl daemon-reload
sudo systemctl enable airflow-webserver
sudo systemctl start airflow-webserver

Repeat for the scheduler and worker, changing the Description and the ExecStart command (airflow scheduler, airflow celery worker).
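
Because the unit files differ only in their description and command, they can be generated from one template. This is a sketch that reuses the paths from the example above. The function name render_airflow_unit and the unit file names airflow-scheduler.service and airflow-worker.service are hypothetical:

```shell
# Hypothetical helper: print a systemd unit file for one Airflow component.
# $1 = short description (e.g. "scheduler")
# $2 = airflow subcommand (e.g. "scheduler" or "celery worker")
render_airflow_unit() {
    cat <<EOF
[Unit]
Description=Airflow $1 daemon
After=network.target

[Service]
Environment="AIRFLOW_HOME=/home/airflow/airflow"
User=airflow
Group=airflow
Type=simple
ExecStart=/home/airflow/airflow-venv/bin/airflow $2

[Install]
WantedBy=multi-user.target
EOF
}

# Preview the scheduler unit on stdout.
render_airflow_unit scheduler scheduler

# To install (requires root), for example:
#   render_airflow_unit scheduler scheduler | sudo tee /etc/systemd/system/airflow-scheduler.service
#   render_airflow_unit worker "celery worker" | sudo tee /etc/systemd/system/airflow-worker.service
```

Previewing on stdout first lets you review each unit before anything is written under /etc/systemd/system.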

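Add Environment Setup to .bashrc

As the airflow user, append the following so that every new shell has AIRFLOW_HOME set and the virtual environment activated:

echo "export AIRFLOW_HOME=/home/airflow/airflow" >> ~/.bashrc
echo "source /home/airflow/airflow-venv/bin/activate" >> ~/.bashrc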
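
Running the echo commands more than once leaves duplicate lines in .bashrc. The following is a minimal sketch of an idempotent alternative. The helper name append_once is hypothetical, and the demonstration writes to a scratch file rather than to the real .bashrc:

```shell
# Hypothetical helper: append a line to a file only if that exact line is not already present.
# $1 = line to add, $2 = target file
append_once() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Demonstrate on a scratch file; use "$HOME/.bashrc" as the target for real use.
rc_file="$(mktemp)"
append_once 'export AIRFLOW_HOME=/home/airflow/airflow' "$rc_file"
append_once 'export AIRFLOW_HOME=/home/airflow/airflow' "$rc_file"   # second call adds nothing
append_once 'source /home/airflow/airflow-venv/bin/activate' "$rc_file"

# Show the result: each line appears exactly once.
cat "$rc_file"
```

The grep options -x and -F match the whole line literally, so a line that merely contains the text as a substring does not count as already present.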

Web UI Access

Open the web UI port (8080) in the firewall:

sudo ufw allow 8080

Visit: http://<webserver-ip>:8080

(Optional) Serving using NGINX

Install Nginx

sudo apt update
sudo apt install nginx -y

Create a config file

sudo nano /etc/nginx/sites-available/airflow

Add the following configuration:

server {
    listen 80;
    server_name airflow.diybaazar.com;

    # Redirect HTTP to HTTPS
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name airflow.diybaazar.com;

    ssl_certificate /etc/ssl/certs/diybaazar/ssl.pem;
    ssl_certificate_key /etc/ssl/certs/diybaazar/ssl.key;

    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

Enable it

sudo ln -s /etc/nginx/sites-available/airflow /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
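
After reloading, you can check the proxy from the command line. This is a sketch that assumes curl is installed. The helper name http_status is hypothetical, and the first call uses a canned status line so that it runs without network access:

```shell
# Hypothetical helper: read HTTP response headers on stdin and print the status code
# (the second field of the first line).
http_status() {
    awk 'NR == 1 { print $2 }'
}

# Offline demonstration with a canned status line.
printf 'HTTP/1.1 301 Moved Permanently\r\n' | http_status

# Against the live proxy:
#   curl -sI http://airflow.diybaazar.com/  | http_status   # 301 would indicate the HTTPS redirect is active
#   curl -sI https://airflow.diybaazar.com/ | http_status   # status as returned through the proxy
```

If the HTTPS request fails outright, sudo nginx -t and the Nginx error log are the first places to look, since a certificate path problem prevents the server block from loading.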