Use existing celery workers for Airflow’s Celeryexecutor workers

Question

I am trying to introduce dynamic workflows into my landscape that involves multiple steps of different model inference where the output from one model gets fed into another model.Currently we have few Celery workers spread across hosts to manage the inference chain. As the complexity increase, we are attempti…

Accepted Answer

Thanks @DejanLekic. I got it working (though the DAG task scheduling latency was too much that I dropped the approach). If someone is looking to see how this was accomplished, here are few things I did to get it working.Change the airflow.cfg to change the executor,queue and result back-end settings (Obvious)If we have to use Celery worker spawned outside the airflow umbrella, change the celery_app_name setting to celery.execute instead of airflow.executors.celery_execute and change the Executor to &#8220;LocalExecutor&#8221;. I have not tested this, but it may even be possible to avoid switching to celery executor by registering airflow&#8217;s Task in the project&#8217;s celery App.Each task will now call send_task(), the AsynResult object returned is then stored in either Xcom(implicitly or explicitly) or in Redis(implicitly push to the queue) and the child task will then gather the Asyncresult ( it will be an implicit call to get the value from Xcom or Redis) and then call .get() to obtain the result from the previous step.Note: It is not necessary to split the send_task() and .get() between two tasks of the DAG. By splitting them between parent and child, I was trying to take advantage of the lag between tasks. But in my case, the celery execution of tasks completed faster than airflow&#8217;s inherent latency in scheduling dependent tasks.

Advertisement

Answer