Apache Airflow : Complete Distributed Configuration

Apache Airflow Distributed Setup using Celery Executor

Airflow is a platform created by the community to programmatically author, schedule and monitor workflows. It is one of the best open source orchestrators and is widely used because of its simplicity, scalability and extensibility.

What you’ll learn

  • Gain a complete understanding of Apache Airflow.
  • Learn the different types of Executors and their working principles (a short sketch of where the executor choice lives follows this list).
  • Own a scalable, distributed Airflow setup that can be shared by multiple teams in your organisation.
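As a minimal sketch (not part of the course material): Airflow reads the executor from the [core] section of airflow.cfg, or from the AIRFLOW__CORE__EXECUTOR environment variable if it is set, and this single setting is what switches a deployment between the Sequential, Local and Celery Executors.

    # Hedged sketch: print which executor the current Airflow installation
    # is configured to use ([core] executor in airflow.cfg, or the
    # AIRFLOW__CORE__EXECUTOR environment variable if it is set).
    from airflow.configuration import conf

    # Typically one of: SequentialExecutor, LocalExecutor, CeleryExecutor
    print(conf.get("core", "executor"))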

Course Content

  • Introduction –> 2 lectures • 3min.
  • Install Airflow and Web Server walkthrough –> 8 lectures • 27min.
  • Sequential Executor with SQLite –> 1 lecture • 2min.
  • Local Executor with MySQL –> 11 lectures • 58min.
  • Celery Executor with MySQL and RabbitMQ –> 10 lectures • 30min.

Requirements

  • Python >= 2.7.

The main goal of this course is to achieve a distributed Airflow setup using the Celery Executor and be able to run more than 100 jobs or DAGs in parallel at any given point in time. I cover the Sequential, Local and Celery Executors in this course. We acquire a few EC2 instances from AWS and configure each of these executors on them.
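As a rough illustration of what running many DAGs or tasks in parallel looks like, the sketch below (not the course's exact code; the dag_id and task count are illustrative) defines a DAG that fans out many independent tasks. With the executor set to CeleryExecutor, the metadata database pointing at MySQL and the broker URL pointing at RabbitMQ in airflow.cfg, these tasks are distributed across the Celery workers on the EC2 instances instead of running on a single machine.

    # Hedged sketch of a fan-out DAG (dag_id and task count are illustrative).
    # Assumes airflow.cfg already points at CeleryExecutor, a MySQL metadata
    # database and a RabbitMQ broker, as configured in the course.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    dag = DAG(
        dag_id="parallel_fan_out",
        start_date=datetime(2020, 1, 1),
        schedule_interval=None,   # trigger manually for the demo
        catchup=False,
    )

    # 100 independent tasks; under the Celery Executor each one is picked up
    # by whichever worker node is free, so they can run concurrently.
    for i in range(100):
        BashOperator(
            task_id="task_{}".format(i),
            bash_command="echo running task {}".format(i),
            dag=dag,
        )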

In addition to this, we explore salient features such as login, email alerting and log management.
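For example, email alerting in Airflow is driven by the [smtp] section of airflow.cfg together with a few keys in a DAG's default_args. The sketch below (not from the course; the recipient address and dag_id are illustrative) shows the task-failure case.

    # Hedged sketch of email-on-failure alerting. Assumes the [smtp] section
    # of airflow.cfg is configured; the address and dag_id are illustrative.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    default_args = {
        "owner": "airflow",
        "email": ["data-team@example.com"],  # illustrative recipient
        "email_on_failure": True,
        "email_on_retry": False,
    }

    dag = DAG(
        dag_id="email_alert_demo",
        default_args=default_args,
        start_date=datetime(2020, 1, 1),
        schedule_interval=None,
    )

    # This task always fails, which should trigger a failure email.
    BashOperator(task_id="always_fails", bash_command="exit 1", dag=dag)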

By the end of this course, you will own a robust distributed Airflow setup that can be shared by multiple teams in your organisation.