Apache Airflow #2: Concepts and Architecture

DATA ENGINEERING SERIES

Rahul S
3 min read | Dec 3, 2023


Airflow lets you build and run workflows. Each workflow is represented as a directed acyclic graph (DAG), and Airflow refers to the workflows themselves as DAGs. Every task, or unit of work, that you want to perform is a node (vertex) in this graph. The graph cannot contain cycles, because a cycle is a circular dependency: each task in the cycle would be waiting on another, so the workflow could never run through to completion. This is why the graph must be both directed and acyclic.
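To make the "no cycles" rule concrete, here is a minimal sketch in plain Python. It uses the standard library's graphlib module rather than Airflow itself, and the task names (extract, transform, load) are hypothetical. It only illustrates why a cyclic graph has no valid execution order; it is not how Airflow implements the check.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical workflow: each task maps to the set of tasks it depends on.
workflow = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

def execution_order(graph):
    """Return one valid run order; raises CycleError if the graph has a cycle."""
    return list(TopologicalSorter(graph).static_order())

# An acyclic graph always has at least one valid order.
order = execution_order(workflow)
print(order)

# Making "extract" depend on "load" closes the loop:
# extract -> transform -> load -> extract.
cyclic = {**workflow, "extract": {"load"}}
try:
    execution_order(cyclic)
    has_cycle = False
except CycleError:
    # No task in the loop can ever start, so no order exists.
    has_cycle = True
print("cycle detected:", has_cycle)
```

With a circular dependency, every task in the loop is waiting for another task in the same loop, so nothing can be scheduled first.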

The edges that connect these nodes represent dependencies. Each directed edge is a dependency from one task to another: it specifies which task must complete before another can begin. Edges also indicate how data might flow between tasks, since a task in the graph can access data generated by any previously executed task.
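The idea of dependencies controlling both ordering and data flow can be sketched in plain Python. This is an illustration under assumptions, not Airflow's API: the task functions, the run helper, and the results dictionary are all hypothetical stand-ins for how a task might read data produced by tasks that ran before it.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks. Each receives the results of all previously run tasks.
def extract(upstream):
    return [1, 2, 3]

def transform(upstream):
    return [x * 10 for x in upstream["extract"]]

def load(upstream):
    return sum(upstream["transform"])

tasks = {"extract": extract, "transform": transform, "load": load}

# Directed edges: each task maps to the tasks that must finish before it starts.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

def run(tasks, dependencies):
    """Run every task after its dependencies, collecting each task's output."""
    results = {}
    for name in TopologicalSorter(dependencies).static_order():
        # By the time a task runs, its upstream outputs are already in results.
        results[name] = tasks[name](results)
    return results

results = run(tasks, dependencies)
print(results["load"])
```

Because tasks run in dependency order, each one can safely look up the output of anything that finished before it.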

Here are the basic components that make up Airflow.

SCHEDULER

  • Now at the center of…
