Airflow 2.2.2 remote worker logging getting 403 Forbidden

I have a setup where Airflow runs in Kubernetes (EKS) and a remote worker runs in docker-compose on a VM behind a firewall in a different location.

Problem: The Airflow webserver in EKS gets a 403 Forbidden error when it tries to fetch logs from the remote worker.

Build Version

  • Airflow – 2.2.2
  • OS – Linux – Ubuntu 20.04 LTS

Kubernetes

  • 1.22 (EKS)
  • Redis (Celery Broker) – Service Port exposed on 6379
  • PostgreSQL (Celery Backend) – Service Port exposed on 5432

Airflow ENV config setup

  AIRFLOW__API__AUTH_BACKEND: airflow.api.auth.backend.basic_auth
  AIRFLOW__CELERY__BROKER_URL: redis://<username>:<password>@redis-master.airflow-dev.svc.cluster.local:6379/0
  AIRFLOW__CELERY__RESULT_BACKEND: >-
    db+postgresql://<username>:<password>@db-postgresql.airflow-dev.svc.cluster.local/<db>
  AIRFLOW__CLI__ENDPOINT_URL: http://{hostname}:8080
  AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
  AIRFLOW__CORE__EXECUTOR: CeleryExecutor
  AIRFLOW__CORE__FERNET_KEY: <fernet_key>
  AIRFLOW__CORE__HOSTNAME_CALLABLE: socket.getfqdn
  AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
  AIRFLOW__CORE__SQL_ALCHEMY_CONN: >-
    postgresql+psycopg2://<username>:<password>@db-postgresql.airflow-dev.svc.cluster.local/<db>
  AIRFLOW__LOGGING__BASE_LOG_FOLDER: /opt/airflow/logs
  AIRFLOW__LOGGING__WORKER_LOG_SERVER_PORT: '8793'
  AIRFLOW__WEBSERVER__BASE_URL: http://{hostname}:8080
  AIRFLOW__WEBSERVER__SECRET_KEY: <secret_key>
  _AIRFLOW_DB_UPGRADE: 'true'
  _AIRFLOW_WWW_USER_CREATE: 'true'
  _AIRFLOW_WWW_USER_PASSWORD: <password-webserver>
  _AIRFLOW_WWW_USER_USERNAME: <username-webserver>

Airflow is using CeleryExecutor

Setup Test

  1. Network reachability by ping – OK
  2. Celery Broker reachability from both EKS and the remote worker – OK
  3. Celery Backend reachability from both EKS and the remote worker – OK
  4. Firewall port exposed for the remote worker Gunicorn API – OK
  5. curl -v telnet://:8793 test – OK (Connected)
  6. Airflow Flower recognizes both the Kubernetes workers and the remote worker – OK
  7. All the ENV settings on the webserver, the workers (EKS, remote) and the scheduler are identical
  8. A queue is set up so the DAG runs only on that particular worker
  9. Time in Docker, on the VM and in EKS is set to UTC. There is a slight difference of 5 to 8 seconds between the Docker container and the pod in EKS
  10. Ran a webserver on the remote VM as well, which can pick up and show the logs
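
The port check in step 5 can also be scripted. The following is a minimal sketch using only the Python standard library; the function name `port_is_open` is hypothetical, and the host and port are whatever the worker's log server uses. Note that a successful connection only shows TCP reachability. A 403 is returned at the HTTP level, after the connection has already succeeded.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, `port_is_open("<remote-worker>", 8793)` would be run from the webserver pod with the real worker hostname filled in.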

Description: Airflow is able to execute the DAG on the remote worker, and the logs can be seen on the remote worker. I have tried every combination of settings but still keep getting 403.

Another test was a plain curl with the webserver credentials.

This curl was run both from EKS and from the remote server that hosts docker-compose. The result is the same on every server.

curl --user <username-webserver> -vvv http://<remote-worker>:8793/logs/?<rest-of-the-log-url>
Getting 403 Forbidden

I might have misconfigured something, but I doubt that is the case. Any tips on what I am missing here? Many thanks in advance.

Answer

https://github.com/apache/airflow/discussions/26624#discussioncomment-3715688

Following the above discussion with the Airflow community on GitHub, I synced the servers using NTP. EKS and the remote worker had a time drift of 135 seconds.
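
To illustrate why a drift of that size matters, here is a minimal sketch of a time-limited signed token. It is not Airflow's actual implementation: the function names `sign_token` and `verify_token`, the token layout and the 30-second validity window are all assumptions made for illustration. The idea is that the requesting side signs the request with the shared secret key and a timestamp, and the receiving side rejects it if the signature does not match or if the timestamp is too far from its own clock.

```python
import base64
import hashlib
import hmac
import json


def sign_token(payload: str, secret: str, issued_at: float) -> str:
    """Sign a payload together with the sender's clock reading (hypothetical format)."""
    body = json.dumps({"payload": payload, "iat": issued_at}).encode()
    sig = hmac.new(secret.encode(), body, hashlib.sha512).digest()
    return (
        base64.urlsafe_b64encode(body).decode()
        + "."
        + base64.urlsafe_b64encode(sig).decode()
    )


def verify_token(token: str, secret: str, now: float, max_age: float = 30.0) -> bool:
    """Accept the token only if the signature matches and the clocks roughly agree."""
    body_b64, sig_b64 = token.split(".")
    body = base64.urlsafe_b64decode(body_b64)
    expected = hmac.new(secret.encode(), body, hashlib.sha512).digest()
    if not hmac.compare_digest(expected, base64.urlsafe_b64decode(sig_b64)):
        return False  # wrong or differently encoded secret key
    issued_at = json.loads(body)["iat"]
    # max_age is an assumed window; a drift larger than it fails the check.
    return abs(now - issued_at) <= max_age
```

In this sketch, a token signed when the sender's clock reads 1000 is accepted by a receiver whose clock reads 1005, but rejected by one whose clock reads 1135, even though both sides hold the same secret key.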

Next I worked on the auth.

I rebuilt the curl auth based on this file from branch 2.2: https://github.com/apache/airflow/blob/main/airflow/utils/log/file_task_handler.py

I later realized that the auth doesn't like special characters in the secret key, and on top of that the time drift of 135 seconds (2 min 15 s) added to the confusion.

I would recommend that anyone facing this problem avoid special characters in the secret key. This is just a recommendation from an Airflow user. I wouldn't say it is the only solution, but it is what helped me.

The special characters combined with the time drift made the issue confusing to debug. Resolve the time drift first, then look at the auth.
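
One way to get a key without special characters is to generate it as a hex string. This is a minimal sketch using Python's standard `secrets` module; the helper names `generate_secret_key` and `has_only_alphanumerics` are hypothetical, and the key length is an arbitrary choice.

```python
import secrets
import string


def generate_secret_key(nbytes: int = 32) -> str:
    """Return a random key made only of the characters 0-9 and a-f."""
    return secrets.token_hex(nbytes)


def has_only_alphanumerics(key: str) -> bool:
    """Check that an existing key contains only ASCII letters and digits."""
    allowed = set(string.ascii_letters + string.digits)
    return bool(key) and all(ch in allowed for ch in key)
```

The generated value would then be set as AIRFLOW__WEBSERVER__SECRET_KEY on the webserver, the scheduler and every worker, since all of them must share the same key.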

User contributions licensed under: CC BY-SA