
Unable to start Airflow worker/flower and need clarification on Airflow architecture to confirm that the installation is correct

Running a worker on a different machine results in the errors shown below. I have followed the configuration instructions and have synced the dags folder.

I would also like to confirm that RabbitMQ and PostgreSQL only need to be installed on the Airflow core machine and do not need to be installed on the workers (the workers only connect to the core).

The specification of the setup is detailed below:

Airflow core/server computer

Has the following installed:

  • Python 2.7 with
    • airflow (AIRFLOW_HOME = ~/airflow)
    • celery
    • psycopg2
  • RabbitMQ
  • PostgreSQL

Configurations made in airflow.cfg:

  • sql_alchemy_conn = postgresql+psycopg2://username:password@192.168.1.2:5432/airflow
  • executor = CeleryExecutor
  • broker_url = amqp://username:password@192.168.1.2:5672//
  • celery_result_backend = postgresql+psycopg2://username:password@192.168.1.2:5432/airflow

Tests performed:

  • RabbitMQ is running
  • Can connect to PostgreSQL and have confirmed that Airflow has created tables
  • Can start and view the webserver (including custom dags)


Airflow worker computer

Has the following installed:

  • Python 2.7 with
    • airflow (AIRFLOW_HOME = ~/airflow)
    • celery
    • psycopg2

Configurations made in airflow.cfg are exactly the same as in the server:

  • sql_alchemy_conn = postgresql+psycopg2://username:password@192.168.1.2:5432/airflow
  • executor = CeleryExecutor
  • broker_url = amqp://username:password@192.168.1.2:5672//
  • celery_result_backend = postgresql+psycopg2://username:password@192.168.1.2:5432/airflow

Output from commands run on the worker machine:

When running airflow flower:

JavaScript

When running airflow worker:

JavaScript

When celery_result_backend is changed to the default db+mysql://airflow:airflow@localhost:3306/airflow and airflow worker is run again, the result is:

JavaScript

What am I missing? How can I diagnose this further?


Answer

The ImportError: No module named postgresql error is due to the invalid prefix used in your celery_result_backend. When using a database as a Celery backend, the connection URL must be prefixed with db+. See https://docs.celeryproject.org/en/stable/userguide/configuration.html#conf-database-result-backend

So replace:

celery_result_backend = postgresql+psycopg2://username:password@192.168.1.2:5432/airflow

with something like:

celery_result_backend = db+postgresql://username:password@192.168.1.2:5432/airflow
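
To see why the prefix matters, here is a minimal sketch of how the backend name appears to be derived from the URL. It is an approximation written for illustration, not Celery's actual code: split_backend and to_celery_db_url are hypothetical helper names, and the assumption is that Celery treats everything before the first + in the scheme as the backend name and hands the rest to SQLAlchemy.

```python
def split_backend(url):
    """Approximate how a backend name is picked from a result-backend URL.

    Assumption: the text before the first '+' in the scheme is the backend
    name, and the remainder is the URL passed on to that backend.
    """
    scheme, _, _ = url.partition("://")
    if "+" in scheme:
        backend, rest = url.split("+", 1)
        return backend, rest
    return scheme, url


def to_celery_db_url(sqlalchemy_url):
    """Prefix a SQLAlchemy URL with 'db+' unless it already has that prefix."""
    if sqlalchemy_url.startswith("db+"):
        return sqlalchemy_url
    return "db+" + sqlalchemy_url


# The value used for sql_alchemy_conn in the question.
conn = "postgresql+psycopg2://username:password@192.168.1.2:5432/airflow"

# Without the prefix, "postgresql" is taken as the backend name, which is
# consistent with the "No module named postgresql" ImportError.
print(split_backend(conn))

# With the prefix, the backend name is "db" and the rest is a normal
# SQLAlchemy URL.
print(split_backend(to_celery_db_url(conn)))
```

Under that assumption, the same sql_alchemy_conn value can be reused for celery_result_backend by adding db+ in front of it, so both db+postgresql://... and db+postgresql+psycopg2://... should work.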