I am new to Airflow and am trying to set it up to run ETL pipelines. I have successfully installed:
- Airflow
- PostgreSQL
- Celery
- RabbitMQ
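I can test-run the tutorial DAG. However, when I try to schedule a job, the scheduler adds it to the queue (I can see it in the UI), but the task never runs. Can anyone help me resolve this?
Here is my config file: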
[core]
airflow_home = /root/airflow
dags_folder = /root/airflow/dags
base_log_folder = /root/airflow/logs
executor = CeleryExecutor
sql_alchemy_conn = postgresql+psycopg2://xxxx.amazonaws.com:5432/airflow
api_client = airflow.api.client.local_client
[webserver]
web_server_host = 0.0.0.0
web_server_port = 8080
web_server_worker_timeout = 120
worker_refresh_batch_size = 1
worker_refresh_interval = 30
[celery]
celery_app_name = airflow.executors.celery_executor
celeryd_concurrency = 16
worker_log_server_port = 8793
broker_url = amqp://rabbit:rabbit@x.x.x.x/rabbitmq_vhost
celery_result_backend = db+postgresql+psycopg2://postgres:airflow@xxx.amazonaws.com:5432/airflow
flower_host = 0.0.0.0
flower_port = 5555
default_queue = default
DAG: I am using the tutorial DAG.
The start date of my DAG is: 'start_date': datetime(2017, 4, 11),
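For reference, here is a minimal sketch of how the start date relates to the time of the first scheduled run. Airflow triggers a scheduled run only once the interval it covers has ended, so the first run starts one interval after start_date. The daily schedule_interval below is an assumption (the question does not state it), and first_trigger_time is a hypothetical helper, not an Airflow function.

```python
from datetime import datetime, timedelta

start_date = datetime(2017, 4, 11)      # taken from the question
schedule_interval = timedelta(days=1)   # assumption: not stated in the question


def first_trigger_time(start, interval):
    """Earliest moment the scheduler would start the first run.

    The run covering [start, start + interval) is triggered only
    after that interval has ended.
    """
    return start + interval


print(first_trigger_time(start_date, schedule_interval))
# prints 2017-04-12 00:00:00
```

If the current time is already past that point, the start date alone should not be what keeps the task from running.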
celery_result_backend, the same dags_folder, and the same broker_url. - jhnclvr
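The three settings named in the comment above can be compared mechanically. This is only a sketch: it assumes the comparison is between the airflow.cfg used by the scheduler and the one used by a Celery worker (the comment as preserved does not say which machines are meant), the section names are taken from the config file in the question, and the example values and the helper names load and mismatches are hypothetical.

```python
import configparser

# Settings named in the comment, paired with the section each one
# lives in (sections taken from the config file in the question).
KEYS = [
    ("core", "dags_folder"),
    ("celery", "broker_url"),
    ("celery", "celery_result_backend"),
]


def load(text):
    # interpolation=None so '%' characters in URLs are read literally
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def mismatches(cfg_a, cfg_b):
    """Return the (section, key) pairs whose values differ between two configs."""
    a, b = load(cfg_a), load(cfg_b)
    return [
        (section, key)
        for section, key in KEYS
        if a.get(section, key, fallback=None) != b.get(section, key, fallback=None)
    ]


# Hypothetical example values, not taken from the question
scheduler_cfg = """
[core]
dags_folder = /root/airflow/dags
[celery]
broker_url = amqp://user:pass@host-a/vhost
celery_result_backend = db+postgresql://user:pass@db/airflow
"""
worker_cfg = scheduler_cfg.replace("host-a", "host-b")

print(mismatches(scheduler_cfg, worker_cfg))
# prints [('celery', 'broker_url')]
```

An empty list means the three values agree. In practice the two strings would be read from the airflow.cfg file on each machine.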