Airflow 2.x to 3.x Migration Made Easy: How Automation Cut Manual Work by 90%

Upgrading Apache Airflow isn't like a typical software update; it's a complete overhaul. When Airflow 3.0 dropped, it brought tougher validation rules, killed off old operators, and introduced sneaky little changes that can quietly break your production pipelines. If you're managing dozens or hundreds of DAGs, updating each one by hand is simply not practical.
Here's the thing: you can't sit on Airflow 2.x forever. Security patches are becoming less frequent, the community is moving on, and the new features that actually fix your day-to-day headaches are only coming to 3.x. Keep waiting, and you're setting yourself up for system failures, audit nightmares, and a pile of technical debt that will make your platform engineers want to quit. We've all seen what happens to teams that put this off until the last second. It isn't pretty.
This insight shows how we automated the Airflow 2.x to 3.x migration process end-to-end. Learn how our tool reduced manual effort by 90%, flagged risky patterns before they caused breakage, and pushed clean, compliant DAGs directly back to GitHub. If you’re a data or platform engineer preparing for this transition, this guide will give you a proven blueprint to migrate with confidence.
Author
Nikhil Mohod
I'm a Data Engineer with 8 years of experience specialising in the Azure data ecosystem. I design and implement scalable data pipelines, lakes and ETL/ELT solutions using tools like ADF, Airflow, Databricks, Synapse and SQL Server. Focused on building high-quality, secure, and optimised cloud data architecture.
Why It Matters
Your data platform is only as good as the trust people have in it. When teams know their pipelines will deliver fresh data exactly when they need it, everything flows smoothly. But the moment Airflow goes down? That trust evaporates. Analytics teams can't do their jobs, executives stare at blank dashboards, and suddenly everyone's wondering if they can count on the data they've been using to make decisions.
From a business perspective, automation translates directly into savings. Cut 90% of manual upgrade work, and your engineering team gets weeks back that can be spent on higher-value initiatives. Plus, automated checks catch the human errors that lead to expensive outages or those compliance headaches when someone misses a critical deprecation.
On the technical side, you're setting yourself up for the long haul. Airflow 3.x gives you access to the latest features, better performance, and tighter security. When you standardise DAGs across teams, you're not just cleaning up code; you're making life easier for platform engineers who won't have to wrestle with a mess of legacy systems. Bottom line: invest in smooth migrations now, and you'll see lower costs, faster innovation, and systems you can actually count on down the road.
Architecture Overview
Migrating DAGs at scale required more than search-and-replace scripts; it demanded a repeatable, auditable pipeline that fit naturally into existing GitOps workflows. Here’s how we designed Cloudaeon’s Airflow migration tool to make that happen.

Tools & technologies used:
Apache Airflow – Target platform (migrating DAGs from 2.x to 3.x)
GitHub API – Read and update DAG files securely
Python (AST, re) – For both structural and simple code rewrites
Ruff (AIR301) – Checks for deprecated imports and paths (the approach officially recommended by Apache Airflow)
CI/CD compatible – Commits and pushes the changes back to GitHub
Key components explained (a short sketch tying them together follows this list):
GitHub Integration – Connects with repos via access token, retrieves DAGs from the dags/ folder, and pushes updates back without manual cloning.
Pattern Scanning – Uses regex for quick wins and AST for deeper analysis, detecting deprecated operators, execution patterns, and hidden edge cases.
Linting & Validation – Runs Ruff against every DAG to enforce Airflow 3.x compliance, ensuring clean imports and correct operator usage.
Automated Fixes – Applies replacements where safe, stages changes, and commits them back to GitHub. Context-sensitive cases are flagged for manual review.
CI/CD Friendly – Fully GitOps-native, enabling pull request reviews, audit trails, and automated testing before rollout.
Logging Layer – Every migration run produces a timestamped log file that developers can review, making the whole process traceable and team-friendly.
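To make the architecture concrete, here is a minimal driver sketch of how these components could fit together. It is not the actual tool: it reuses the helper functions shown later in the walkthrough (list_files_in_folder, run_ruff_check_on_content, write_file_to_github) and the repo_owner/repo_name/branch/github_token settings they rely on, while apply_replacements is a hypothetical helper standing in for the pattern-scanning and fixing logic.

import requests

headers = {"Authorization": f"token {github_token}"}

def migrate_dags(folder="dags"):
    # Walk the repo's dags/ folder and migrate each Python DAG file
    for item in list_files_in_folder(folder):
        if item["type"] != "file" or not item["name"].endswith(".py"):
            continue
        content = requests.get(item["download_url"], headers=headers).text
        # Rewrite deprecated patterns; flag context-sensitive cases (hypothetical helper)
        new_content, needs_review = apply_replacements(content)
        # Validate the rewritten DAG against the Airflow 3.x rule set (AIR301)
        passes_ruff, ruff_report = run_ruff_check_on_content(new_content, file_name=item["name"])
        if new_content != content and passes_ruff and not needs_review:
            # Commit safe, validated fixes back to the repository
            write_file_to_github(item["path"], new_content, item["sha"])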
Step-by-Step Walkthrough
1. Connect to GitHub and retrieve DAG files
Using a personal access token, we securely connect to your GitHub repository and search the dags/ subdirectory for Python-based DAG files.
import requests

def list_files_in_folder(folder_path):
    # Call the GitHub Contents API for the given folder on the target branch
    url = f"https://api.github.com/repos/{repo_owner}/{repo_name}/contents/{folder_path}?ref={branch}"
    headers = {"Authorization": f"token {github_token}"}
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    # Return the JSON listing of files in the folder
    return response.json()

By allowing programmatic reading and subsequent updating of DAGs, this keeps the workflow fully Git-based and CI/CD-ready.
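For context, a quick usage sketch (not part of the original tool) showing how the folder listing can be narrowed down to the Python DAG files; the path and blob SHA returned by the API are exactly what the later update call needs:

# Keep only Python files from the GitHub Contents API listing
dag_files = [
    item for item in list_files_in_folder("dags")
    if item["type"] == "file" and item["name"].endswith(".py")
]
for item in dag_files:
    print(item["path"], item["sha"])  # the SHA is required when committing updates back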
2. Scan DAGs for deprecated patterns
We identify deprecated elements that are incompatible with Airflow 3.x using both regex-based and AST-based analysis. Some of these include:
DummyOperator
schedule_interval
execution_date
"tasks": {
r"\bDummyOperator\s*\(": "EmptyOperator(",{ "params":
r"\bschedule_interval\b":"schedule",AST was necessary for a more thorough examination, even though regex works well for typical patterns.
Identifying several cluster configurations contained within a single DAG, for instance.
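A minimal illustration of that kind of AST pass, assuming (purely for illustration) that cluster settings are passed to operators through a cluster_config keyword argument and that dag_source holds the DAG file's text:

import ast

def find_cluster_configs(dag_source):
    # Collect every keyword argument named "cluster_config" (assumed name)
    # passed to any call in the DAG, so mixed-cluster DAGs can be spotted
    configs = []
    for node in ast.walk(ast.parse(dag_source)):
        if isinstance(node, ast.Call):
            for kw in node.keywords:
                if kw.arg == "cluster_config":
                    configs.append(ast.dump(kw.value))
    return configs

# Flag the DAG for manual review when more than one distinct config appears
needs_review = len(set(find_cluster_configs(dag_source))) > 1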
3. Run Ruff (AIR301) to validate compatibility
We incorporate Ruff, the static analysis tool that the Apache Airflow project has officially approved, to make sure every DAG complies with Airflow 3.x specifications.
We specifically check for Airflow rule AIR301, which flags deprecated import paths and operator usage.
import subprocess
import tempfile

def run_ruff_check_on_content(content, file_name="dag_file.py"):
    # Write the DAG source to a temporary .py file so Ruff can lint it
    with tempfile.NamedTemporaryFile("w", delete=False, suffix=".py", prefix="dag_", encoding="utf-8") as temp_file:
        temp_file.write(content)
        temp_file_path = temp_file.name
    # Run Ruff with the Airflow migration rule (AIR301); exit code 0 means the file is clean
    result = subprocess.run(["ruff", "check", "--select", "AIR301", temp_file_path], capture_output=True, text=True)
    return result.returncode == 0, result.stdout

This helps us lint and auto-correct issues like deprecated imports or outdated function usage before committing any changes.
4. Perform replacements and push to GitHub
Once checks pass, the tool stages the changes, applies replacements straight into the code, and pushes the revised DAGs back to the original GitHub repository. We apply automatic modifications wherever it is safe to do so; anything context-specific, like unusual operator logic or unsupported plugins, is explicitly flagged for manual follow-up.
import base64
import requests

def write_file_to_github(file_path, new_content, sha):
    # Commit the updated DAG back to the repo via the GitHub Contents API
    url = f"https://api.github.com/repos/{repo_owner}/{repo_name}/contents/{file_path}"
    headers = {"Authorization": f"token {github_token}"}
    data = {
        "message": f"Update deprecated usage in {file_path}",
        "content": base64.b64encode(new_content.encode()).decode(),
        "branch": branch,
        "sha": sha,
    }
    response = requests.put(url, headers=headers, json=data)
    response.raise_for_status()
    return response.json()

Every step is logged into process_log.txt, so developers can see exactly what changed, which DAGs were touched, and which patterns need manual review.
Results
Reduced manual effort by over 90%
Migrated DAGs across four teams in under two weeks
Cleaned up more than 25 deprecated patterns across dozens of DAGs
Flagged 20+ DAGs with multi-cluster setups, a finding useful even outside this upgrade
Next: Integrating with GitHub Actions for nightly DAG hygiene runs
Conclusion
Migrating from Airflow 2.x to 3.x can be a smooth, automated process rather than a manual, error-prone slog. Cloudaeon developed a migration framework that combines Abstract Syntax Tree (AST) parsing, Ruff linting, and GitHub-native automation, and it scales effectively across multiple teams and repositories. Its success came from automating the repetitive work, which freed engineers to concentrate on the complex edge cases.
GitOps made the process reliable and auditable: automated GitHub commits flowed through pull request reviews and CI/CD checks, so every change was traceable. Combining regex with AST parsing struck a good balance between speed and accuracy; rule-based replacements handled the deprecated patterns that were replicated across the codebase, while AST analysis caught the complex cases, such as embedded cluster logic, that regex alone couldn't detect. Surfacing DAGs with mixed cluster usage also provided valuable input for future platform consolidation.
Ready to streamline your Airflow migration? Contact us to learn more about our automated migration framework.


