A Claude Code skill that guides the end-to-end implementation of a new Airflow lint rule (the AIR prefix) in the Ruff codebase.
What it does
When invoked, the skill walks the agent through:
Pre-implementation interview — establishes the Airflow version target (2, 3, or both), the DAG-API surface (TaskFlow vs. operator API vs. both), and whether a local Airflow checkout is available for validation.
Rule numbering — picks a code in the right range (AIR0xx for general best-practice, AIR3xx for migration rules).
Code generation — produces a rule file from one of two skeletons, registers it in codes.rs, exports it from rules/mod.rs, and wires up the dispatch in checkers/ast/analyze/{statement,expression}.rs.
Fixture + snapshot workflow — creates a positive/negative test fixture, runs cargo nextest, accepts insta snapshots, regenerates schemas via cargo dev generate-all, and runs clippy + uvx prek run -a.
Real-world validation — runs the new rule against a local Airflow checkout (if available) to surface false positives before finalizing.
The skill encodes conventions specific to this codebase — Dag/DAG terminology, seen_module(Modules::AIRFLOW) guards, is_guarded_by_try_except for migration rules, the dual-path ["airflow", "decorators", ...] | ["airflow", "sdk", ...] matching pattern, any_qualified_base_class for transitive inheritance, and so on — so the agent doesn't have to rediscover them from the source tree on every rule.
When it activates
The skill's frontmatter description triggers it on phrases like "add a new airflow rule", "create an airflow lint rule", "implement an airflow inspection", "new AIR rule", and similar formulations. The agent invokes it via Claude Code's Skill tool before writing any code.
SKILL.md is the entry point and is loaded into the agent's context when the skill is invoked. It contains the workflow, the 12-step checklist, the rule-numbering table, and key conventions — everything the agent always needs.
The references/ files are loaded on demand. Each checklist step points to the relevant reference (e.g., Step 2 → templates.md, Step 6 → pitfalls.md for the deferred-annotation gotcha). This split keeps the always-resident context small while still making the long-tail detail available when a step needs it.
Reference file purposes
| File | Read when … |
| --- | --- |
| import-paths.md | The rule must match both Airflow 2 (deprecated) and Airflow 3 (airflow.sdk) import paths. |
| templates.md | Creating a new rule file (Step 2 of the checklist). |
| helpers.md | Looking for an existing helper before writing a new utility, or unsure which existing one applies. |
| patterns.md | Implementing a non-trivial pattern: decorator detection, dual-dispatch for TaskFlow + operator API, recursive return collection, mixed-content string scanning, etc. |
| documentation.md | Writing the rule's ## What it does / ## Why is this bad? / message / fix-title text. |
| pitfalls.md | Writing predicates that resolve names, chain fallback checks, or work with annotation expressions. |
Maintaining this skill
When a Ruff PR review surfaces a bug, missing convention, or footgun that future rule authors should know about:
If it's a review-informed gotcha (e.g., a bug pattern an agent could repeat), add it to references/pitfalls.md.
If it's a new shared helper or idiomatic API, add it to references/helpers.md (and ensure the helper itself lives in airflow/helpers.rs).
If it's a recurring code pattern, add it to references/patterns.md with a pointer to the rule that exemplifies it.
If it's a docstring/message convention, add it to references/documentation.md.
If it's a workflow correction (the order of steps was wrong, a command needs a different flag), update SKILL.md directly.
Keep SKILL.md short — under ~250 lines is a good target. Detail belongs in references/.
Conciseness: Keep "What it does" to one sentence. Expand details in "Why is this bad?"
Active voice: Prefer "Using X causes..." over "X may cause..." or "It is possible that X causes..."
Imperative starts: Begin sentences in "Why is this bad?" with action verbs: "Using...", "This leads to...", "These symbols were removed..."
Specific consequences: Don't just say "this is bad practice"—explain the actual impact (performance, compatibility, maintainability)
Code examples: Keep them minimal but complete enough to demonstrate the issue. Include necessary imports.
Message Guidelines
Error messages (the message() method) should:
Be a generic description of the problem — do NOT include context-specific suggestions in the message
Use backticks around code symbols: "`{deprecated}` is removed in Airflow 3.0"
Avoid starting with "Checks for" or "Detects" (those openers belong in the rule documentation, not in messages)
Fix titles (the fix_title() method) should:
Contain context-specific suggestions (e.g., "Use Jinja templates" vs "Move into a @task-decorated function")
Start with an imperative verb: "Use schedule", "Replace with...", "Remove..."
Be very brief (2-5 words when possible)
Even without an actual auto-fix, fix_title() is displayed as a separate help: ... line in diagnostics
Pattern: Generic message + context-specific fix title (from AIR003):
fn message(&self) -> String {
    "`Variable.get()` outside of a task".to_string() // Generic
}

fn fix_title(&self) -> Option<String> {
    if self.in_function {
        Some("Move into a `@task`-decorated function".to_string()) // Context-specific
    } else {
        Some("Use Jinja templates instead".to_string()) // Context-specific
    }
}
Examples from Existing Rules
Good documentation structure (from AIR002):
/// ## What it does
/// Checks for a `DAG()` class or `@dag()` decorator without an explicit
/// `schedule` parameter.
///
/// ## Why is this bad?
/// The default value of the `schedule` parameter on Airflow 2 is
/// `timedelta(days=1)`, which is almost never what a user is looking for.
/// Airflow 3 changed the default value to `None`, which would break
/// existing Dags using the implicit default.
///
/// ## Example
/// ```python
/// from airflow import DAG
///
/// # Using the implicit default schedule.
/// dag = DAG(dag_id="my_dag")
/// ```
///
/// Use instead:
/// ```python
/// from datetime import timedelta
/// from airflow import DAG
///
/// dag = DAG(dag_id="my_dag", schedule=timedelta(days=1))
/// ```
Note the use of:
"Dag" (proper noun) in prose: "would break existing Dags"
`DAG()` and `@dag()` for class and decorator
Backticks for parameters: `schedule`, `timedelta(days=1)`
"Airflow 2" and "Airflow 3" (capitalized, no backticks)
ProviderReplacement — for provider migrations (AIR302/312):
Rename { module, name, provider, version } — moved to provider with rename
SourceModuleMovedToProvider { module, name, provider, version } — module changed
FunctionSignatureChange — for AIR303:
Message(&'static str) — describes the signature change
Try-Except Guarding
Prevents false positives when symbols are conditionally imported:
use crate::rules::airflow::helpers::is_guarded_by_try_except;

// Skip if the usage is inside a try-except that catches ImportError/AttributeError
if is_guarded_by_try_except(expr, "airflow.old_module", "OldName", checker.semantic()) {
    return;
}
This checks whether the expression is in a try-except block that:
For imports: catches ImportError or ModuleNotFoundError, and the try block imports from the new location
For attributes: catches AttributeError, and the try block accesses the new attribute
Fix Generation
use crate::rules::airflow::helpers::{generate_import_edit, generate_remove_and_runtime_import_edit};

// When symbol name changes (e.g., Dataset -> Asset):
if let Some(fix) = generate_import_edit(checker, stmt, "old_name", "new_module", "new_name") {
    diagnostic.set_fix(fix); // Safe edit
}

// When module changes but name stays (provider migration):
if let Some(fix) = generate_remove_and_runtime_import_edit(checker, stmt, "new_module", "name") {
    diagnostic.set_fix(fix); // Unsafe edit
}
Class/Module Identification
use crate::rules::airflow::helpers::{is_airflow_builtin_or_provider, is_method_in_subclass};

// Check if qualified name matches airflow.<module>.**.*<suffix> or provider equivalent:
is_airflow_builtin_or_provider(segments, "operators", "Operator")
is_airflow_builtin_or_provider(segments, "secrets", "Backend")
is_airflow_builtin_or_provider(segments, "hooks", "Hook")

// Check if a method is defined in a subclass of a specific base:
is_method_in_subclass(function_def, semantic, "execute", |qn| {
    matches!(qn.segments(), ["airflow", "models" | "sdk", .., "BaseOperator"])
})
Note: BaseOperator task-execution-time methods include execute, pre_execute, and post_execute.
All three run at task execution time (not DAG parse time).
Operator Argument Scope (template_fields)
Airflow operators declare a template_fields class attribute that determines which keyword arguments receive Jinja template rendering at runtime. This varies per operator (e.g., BashOperator templates bash_command and env; PythonOperator templates op_args, op_kwargs, templates_dict). When writing rules that inspect operator string arguments for template patterns, check all arguments (both positional and keyword) rather than restricting to known names — the pattern matching is typically specific enough to avoid false positives, and restricting to known names would miss custom operators and providers. Add a code comment explaining the rationale (see AIR201 for an example).
Transitive Inheritance Checks
When checking if a class inherits from a base class, use any_qualified_base_class from ruff_python_semantic::analyze::class instead of directly iterating class_def.bases(). This handles transitive inheritance (e.g., class A(BaseOperator) → class B(A) → class C(B)):
use ruff_python_semantic::analyze::class::any_qualified_base_class;

any_qualified_base_class(class_def, semantic, &|qn| {
    matches!(qn.segments(), ["airflow", "models" | "sdk", .., "BaseOperator"])
})
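To make the difference between a direct-base check and a transitive one concrete, here is a minimal standalone sketch. It does not use Ruff's API: `ClassTable`, `any_base_class`, and `direct_base_only` are hypothetical names, and the class table is a plain map standing in for the semantic model.

```rust
use std::collections::HashMap;

/// Toy class table: class name -> names of its direct bases.
/// Hypothetical stand-in for the semantic model; not Ruff's API.
type ClassTable = HashMap<&'static str, Vec<&'static str>>;

/// Returns true if any direct or transitive base of `class` satisfies `pred`.
/// No cycle guard: the sketch assumes an acyclic table.
fn any_base_class(table: &ClassTable, class: &str, pred: &dyn Fn(&str) -> bool) -> bool {
    let Some(bases) = table.get(class) else {
        return false;
    };
    bases
        .iter()
        .any(|&base| pred(base) || any_base_class(table, base, pred))
}

/// Direct-only check, shown for contrast: it inspects one level of bases.
fn direct_base_only(table: &ClassTable, class: &str, pred: &dyn Fn(&str) -> bool) -> bool {
    table
        .get(class)
        .is_some_and(|bases| bases.iter().any(|&base| pred(base)))
}

/// The chain from the text: class A(BaseOperator), class B(A), class C(B).
fn example_table() -> ClassTable {
    HashMap::from([
        ("A", vec!["BaseOperator"]),
        ("B", vec!["A"]),
        ("C", vec!["B"]),
    ])
}
```

With `example_table()`, the transitive check finds `BaseOperator` for `C`, while the direct-only check does not, which is the false negative that iterating `class_def.bases()` by hand would produce.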
Resolving Named Annotations to Class Definitions
When an annotation is a Name (e.g., def f() -> MyTD:) and the rule needs to inspect the referenced class — e.g., to check whether MyTD is a TypedDict subclass, an Enum, a Pydantic model, etc. — resolve the name to its binding, confirm it's a ClassDefinition, fetch the ClassDef statement, and then run any_qualified_base_class on it:
use ruff_python_ast::{Expr, Stmt};
use ruff_python_semantic::{BindingKind, SemanticModel, analyze};

fn annotation_is_typed_dict_subclass(annotation: &Expr, semantic: &SemanticModel) -> bool {
    let Expr::Name(name) = annotation else {
        return false;
    };

    // `resolve_name` consults the `resolved_names` map populated by the
    // visitor. With `from __future__ import annotations`, names *inside*
    // annotations are not added to that map (they are treated as forward
    // references and deferred). `lookup_symbol` does a live scope-based
    // lookup that works regardless of deferred-annotation context, so use it
    // as a fallback so the rule behaves consistently with and without the
    // `__future__` import.
    let Some(binding_id) = semantic
        .resolve_name(name)
        .or_else(|| semantic.lookup_symbol(&name.id))
    else {
        return false;
    };

    let binding = semantic.binding(binding_id);
    if !matches!(binding.kind, BindingKind::ClassDefinition(_)) {
        return false;
    }

    let Some(Stmt::ClassDef(class_def)) = binding.statement(semantic) else {
        return false;
    };

    analyze::class::any_qualified_base_class(class_def, semantic, &|qualified_name| {
        semantic.match_typing_qualified_name(&qualified_name, "TypedDict")
    })
}
This pattern catches the in-module case (the class is defined in the same file). Cross-module imports of subclasses would require re-export following and are usually out of scope for a linter — call that out explicitly in ## Fix safety rather than trying to handle it.
See AIR202 (task_implicit_multiple_outputs.rs) for a working example.
Detecting DAG Files
To check if a file is a Dag definition file, check for imports of DAG or dag from airflow. This is simpler and more reliable than checking for actual DAG() calls or @dag decorators, since types must be imported before use:
When determining whether code is at module level vs inside a function (e.g., to vary diagnostic messages), prefer semantic.current_scope().kind over semantic.current_statements().any(...). The scope approach correctly handles nested classes inside functions:
let in_function = matches!(
    checker.semantic().current_scope().kind,
    ScopeKind::Function(_) | ScopeKind::Lambda(_)
);
current_scope() returns the innermost scope, so for a Variable.get() call inside a class body nested in a function it yields ScopeKind::Class, not ScopeKind::Function.
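The distinction between the innermost scope and any enclosing scope can be sketched without Ruff at all. The `ScopeKind` enum below is a simplified, hypothetical stand-in (Ruff's real variants carry data), and the scope stack is modelled as a slice ordered from outermost to innermost.

```rust
/// Toy scope kinds. Hypothetical; Ruff's `ScopeKind` variants carry more data.
#[derive(Clone, Copy)]
enum ScopeKind {
    Module,
    Class,
    Function,
    Lambda,
}

/// Innermost-scope check: looks only at the last entry of the stack,
/// which models inspecting `current_scope().kind`.
fn innermost_is_function(scope_stack: &[ScopeKind]) -> bool {
    matches!(
        scope_stack.last(),
        Some(ScopeKind::Function | ScopeKind::Lambda)
    )
}

/// Ancestor check: true if any enclosing scope is a function or lambda.
fn any_ancestor_is_function(scope_stack: &[ScopeKind]) -> bool {
    scope_stack
        .iter()
        .any(|kind| matches!(kind, ScopeKind::Function | ScopeKind::Lambda))
}
```

For the stack module, function, class, `innermost_is_function` is false while `any_ancestor_is_function` is true; the two approaches disagree exactly in the nested-class case described above.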
Rules that target both Airflow 2 and 3 must handle both old (deprecated) and new (airflow.sdk) import paths. Match against both in qualified_name.segments().
| Old Import Path (Deprecated) | New Import Path (airflow.sdk) |
| --- | --- |
| airflow.decorators.dag | airflow.sdk.dag |
| airflow.decorators.task | airflow.sdk.task |
| airflow.decorators.task_group | airflow.sdk.task_group |
| airflow.decorators.setup | airflow.sdk.setup |
| airflow.decorators.teardown | airflow.sdk.teardown |
| airflow.models.dag.DAG | airflow.sdk.DAG |
| airflow.models.baseoperator.BaseOperator | airflow.sdk.BaseOperator |
| airflow.models.param.Param | airflow.sdk.Param |
| airflow.models.param.ParamsDict | airflow.sdk.ParamsDict |
| airflow.models.baseoperatorlink.BaseOperatorLink | airflow.sdk.BaseOperatorLink |
| airflow.sensors.base.BaseSensorOperator | airflow.sdk.BaseSensorOperator |
| airflow.hooks.base.BaseHook | airflow.sdk.BaseHook |
| airflow.notifications.basenotifier.BaseNotifier | airflow.sdk.BaseNotifier |
| airflow.utils.task_group.TaskGroup | airflow.sdk.TaskGroup |
| airflow.utils.context.Context | airflow.sdk.Context |
| airflow.datasets.Dataset | airflow.sdk.Asset |
| airflow.datasets.DatasetAlias | airflow.sdk.AssetAlias |
| airflow.datasets.DatasetAll | airflow.sdk.AssetAll |
| airflow.datasets.DatasetAny | airflow.sdk.AssetAny |
| airflow.models.connection.Connection | airflow.sdk.Connection |
| airflow.models.variable.Variable | airflow.sdk.Variable |
| airflow.io.* | airflow.sdk.io.* |
Example pattern for matching both paths:
match qualified_name.segments() {
    // Match both old and new import paths
    ["airflow", "decorators", "task"] | ["airflow", "sdk", "task"] => { /* ... */ }
    ["airflow", "models", "dag", "DAG"] | ["airflow", "sdk", "DAG"] => { /* ... */ }
    _ => return,
}
Important: Before writing a new decorator-checking function, check if one already exists in helpers.rs (e.g., is_airflow_task in AIR301). If a decorator check is needed by multiple rules, extract it to helpers.rs as a shared utility.
Use map_callable to handle both @decorator and @decorator() forms:
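As a standalone illustration of the idea (not Ruff's implementation), the sketch below models a decorator expression with a tiny hypothetical `Expr` enum. Unwrapping a call to its callee lets one predicate cover both `@task` and `@task()`; `is_task_decorator` is an assumed name for this example only.

```rust
/// Toy expression type. Hypothetical; far smaller than Ruff's `Expr`.
enum Expr {
    Name(String),
    Call { func: Box<Expr> },
}

/// Returns the callee for a call expression, or the expression itself otherwise.
fn map_callable(decorator: &Expr) -> &Expr {
    match decorator {
        Expr::Call { func } => &**func,
        other => other,
    }
}

/// Accepts both the bare `@task` form and the called `@task()` form.
fn is_task_decorator(decorator: &Expr) -> bool {
    matches!(map_callable(decorator), Expr::Name(name) if name == "task")
}
```

Because the unwrapping happens once in `map_callable`, the predicate itself never needs a separate branch for the parenthesised form.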
Also see ruff_python_semantic::analyze::visibility for generic decorator utilities: is_staticmethod, is_classmethod, is_overload, is_abstract, is_property, etc.
Dual-Dispatch: Targeting Both TaskFlow and Operator APIs
When a rule must detect the same anti-pattern in both @task.<variant> decorated functions and operator callables (e.g., BranchPythonOperator(python_callable=func)), use this architecture:
Shared analysis helper — extract the core logic into a private function that operates on a function body:
Statement-level entry point (for @task.<variant> decorators) — dispatched from statement.rs on StmtFunctionDef:
pub(crate) fn my_rule_decorator(checker: &Checker, function_def: &StmtFunctionDef) {
    // Check decorator, then call shared helper on function_def.body
}
Expression-level entry point (for operator calls) — dispatched from expression.rs on ExprCall. Resolve the python_callable argument to the function definition using the semantic model:
pub(crate) fn my_rule_operator(checker: &Checker, call: &ExprCall) {
    // 1. Resolve call.func to qualified name, match operator paths
    // 2. Extract python_callable keyword argument
    // 3. Resolve name to function definition:
    let Expr::Name(name_expr) = &keyword.value else {
        return;
    };
    let Some(binding_id) = semantic.only_binding(name_expr) else {
        return;
    };
    let BindingKind::FunctionDefinition(scope_id) = semantic.binding(binding_id).kind else {
        return;
    };
    let ScopeKind::Function(function_def) = semantic.scopes[scope_id].kind else {
        return;
    };
    // 4. Call shared helper on function_def.body
}
Operator import paths to match (example for BranchPythonOperator):
See AIR003 (task_branch_as_short_circuit.rs) for a complete working example.
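The shape of the dual-dispatch architecture can also be sketched in isolation. Everything below is a toy model with hypothetical names (`Stmt`, `FunctionDef`, `body_has_bare_return`, and a name-to-function map in place of the semantic model); the "anti-pattern" is arbitrarily chosen as a bare `return`. The point is only that both entry points funnel into one shared helper.

```rust
use std::collections::HashMap;

/// Toy statement type. Hypothetical stand-in for a function body.
enum Stmt {
    Return(Option<&'static str>),
    Other,
}

struct FunctionDef {
    decorators: Vec<&'static str>,
    body: Vec<Stmt>,
}

/// Shared analysis helper: the core check, independent of how the function was found.
fn body_has_bare_return(body: &[Stmt]) -> bool {
    body.iter().any(|stmt| matches!(stmt, Stmt::Return(None)))
}

/// Statement-level entry point: a decorated function (TaskFlow style).
fn check_decorated(function_def: &FunctionDef) -> bool {
    function_def.decorators.contains(&"task.branch")
        && body_has_bare_return(&function_def.body)
}

/// Expression-level entry point: an operator call whose callable is looked up by name.
fn check_operator_call(
    operator: &str,
    python_callable: &str,
    functions: &HashMap<&'static str, FunctionDef>,
) -> bool {
    operator == "BranchPythonOperator"
        && functions
            .get(python_callable)
            .is_some_and(|def| body_has_bare_return(&def.body))
}
```

Keeping the body analysis in one function means a fix to the detection logic applies to both API styles at once.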
Checking Function Calls and Arguments
let Expr::Call(ast::ExprCall { func, arguments, .. }) = expr else {
    return;
};

// Resolve the function being called:
checker
    .semantic()
    .resolve_qualified_name(func)
    .is_some_and(|qn| matches!(qn.segments(), ["airflow", .., "SomeClass"]))

// Check keyword arguments:
if let Some(keyword) = arguments.find_keyword("some_arg") {
    // keyword.value is the argument value expression
}
Matching Airflow Operators (builtin + providers)
match qualified_name.segments() {
    // Builtin operators
    ["airflow", "operators", ..] => true,
    // Provider operators (operators must appear somewhere in the middle)
    ["airflow", "providers", rest @ ..] => {
        rest.iter()
            .position(|&s| s == "operators")
            .is_some_and(|pos| pos + 1 < rest.len())
    }
    _ => false,
}
Collecting Return Statements (recursive)
Use ReturnStatementVisitor to find all returns including those in nested blocks:
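As a self-contained illustration of recursive return collection, the sketch below uses a toy AST rather than Ruff's node types or `ReturnStatementVisitor` itself. `Stmt` and `collect_returns` are hypothetical names, and skipping nested function definitions is an assumption of this sketch (their returns belong to the inner function).

```rust
/// Toy AST. Hypothetical; not Ruff's node types.
enum Stmt {
    Return(&'static str),
    If { body: Vec<Stmt>, orelse: Vec<Stmt> },
    /// A nested function definition: its returns belong to the inner function.
    FunctionDef { body: Vec<Stmt> },
    Pass,
}

/// Collects the value of every `return` in `body`, descending into nested
/// blocks but not into nested function definitions.
fn collect_returns(body: &[Stmt], out: &mut Vec<&'static str>) {
    for stmt in body {
        match stmt {
            Stmt::Return(value) => out.push(*value),
            Stmt::If { body, orelse } => {
                collect_returns(body, out);
                collect_returns(orelse, out);
            }
            Stmt::FunctionDef { .. } | Stmt::Pass => {}
        }
    }
}
```

A flat scan over the top-level statements would find only the final `return`; the recursion is what surfaces returns inside `if`/`else` branches.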
When adding new helper functions or methods, place the higher-level predicates or public-facing helpers above the lower-level utilities they call. That way reviewers see the intent/entry point first, followed by the supporting helpers (e.g., in_airflow_task_function before is_airflow_task).
Mixed-content string scanning
When a rule must detect a pattern within a string argument — not just recognize the string's overall structure — use a raw-source scanner rather than operating on to_str(). This handles all quote styles (single, double, triple) uniformly and produces accurate sub-range TextRange values for in-place fixes.
When to use: the pattern appears inside a larger Jinja template or shell command string that has surrounding content (e.g., "echo {{ ti.xcom_pull('task') }}"). The fix should replace only the detected sub-expression, not the entire string argument.
Pattern (from AIR201):
use ruff_python_trivia::Cursor;
use ruff_text_size::TextSize;

struct MyMatch {
    start: TextSize, // absolute file position of match start
    end: TextSize,   // absolute file position just past match end
    // ... extracted fields ...
}

fn scan_my_patterns(source: &str, literal_start: TextSize) -> Vec<MyMatch> {
    let mut matches = Vec::new();
    let mut pos = 0;
    while pos < source.len() {
        let remaining = &source[pos..];
        let mut cursor = Cursor::new(remaining);
        // Try to match at this position; on mismatch, advance by one byte.
        let Some(token) = parse_identifier(&mut cursor) else {
            pos += 1;
            continue;
        };
        if token != "expected_receiver" {
            pos += token.len(); // skip entire identifier to avoid suffix re-matches
            continue;
        }
        // ... parse rest of pattern using eat_char, parse_identifier, etc. ...
        let consumed = remaining.len() - cursor.as_str().len();
        if let (Ok(s), Ok(e)) = (u32::try_from(pos), u32::try_from(pos + consumed)) {
            matches.push(MyMatch {
                start: literal_start + TextSize::new(s),
                end: literal_start + TextSize::new(e),
                // ...
            });
            pos += consumed;
            continue;
        }
        pos += 1;
    }
    matches
}
In the rule function:
// Pure-template path first (entire string is the expression):
if let Some(result) = parse_pure_template(string_value) {
    // whole-argument replacement fix
    continue;
}

// Mixed-content path: pattern appears within larger string
let raw_source = checker.locator().slice(string_literal.range());
for m in scan_my_patterns(raw_source, string_literal.start()) {
    let mut diagnostic = checker.report_diagnostic(MyViolation { .. }, arg_value.range());
    if in_scope {
        diagnostic.set_fix(Fix::unsafe_edit(Edit::range_replacement(
            "replacement_text".to_string(),
            TextRange::new(m.start, m.end),
        )));
    }
}
Key invariants:
checker.locator().slice(string_literal.range()) returns the raw source starting at string_literal.start(), so adding a within-source byte offset to string_literal.start() gives the correct absolute TextRange.
The scanner is quote-agnostic: the target pattern cannot occur within the quote delimiters themselves (", ', or """), so scanning the raw source, delimiters included, works for every quote style.
Advancing pos += token.len() (not just pos += 1) when a non-matching identifier is found prevents false positives from identifier suffixes (e.g., multi_ti matching as ti).
Store TextSize in the match struct (not usize) and use u32::try_from to convert — avoids as u32 cast truncation lint.
Multiple matches in one string each get their own Fix with IsolationLevel::NonOverlapping (the default), which ruff applies in a single pass when ranges do not overlap.
See crates/ruff_linter/src/rules/airflow/rules/xcom_pull_in_template_string.rs (scan_xcom_pull_patterns) for a complete working example.
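The offset arithmetic and the skip-the-whole-identifier invariant can be tried out with the standard library alone. The sketch below is not the AIR201 scanner: `parse_identifier` and `scan_xcom_pull` are simplified, hypothetical helpers that work on plain `usize` offsets instead of `TextSize`, and the pattern is hard-coded to `ti.xcom_pull`.

```rust
/// Reads an ASCII identifier (letters, digits, `_`; not starting with a digit)
/// at the start of `s`, or returns `None` if `s` does not start with one.
fn parse_identifier(s: &str) -> Option<&str> {
    let first = s.chars().next()?;
    if !(first.is_ascii_alphabetic() || first == '_') {
        return None;
    }
    let end = s
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(s.len());
    Some(&s[..end])
}

/// Finds each `ti.xcom_pull` in `source` and returns absolute `(start, end)`
/// byte offsets, where `literal_start` is the offset of `source` in the file.
fn scan_xcom_pull(source: &str, literal_start: usize) -> Vec<(usize, usize)> {
    const SUFFIX: &str = ".xcom_pull";
    let mut matches = Vec::new();
    let mut pos = 0;
    while pos < source.len() {
        let remaining = &source[pos..];
        let Some(token) = parse_identifier(remaining) else {
            // Not at an identifier: advance by one whole character.
            pos += remaining.chars().next().map_or(1, char::len_utf8);
            continue;
        };
        if token == "ti" && remaining[token.len()..].starts_with(SUFFIX) {
            let consumed = token.len() + SUFFIX.len();
            matches.push((literal_start + pos, literal_start + pos + consumed));
            pos += consumed;
        } else {
            // Skip the whole identifier so `multi_ti` is never re-read as `ti`.
            pos += token.len();
        }
    }
    matches
}
```

The sketch advances by a whole character rather than a single byte on a mismatch, so slicing `source` always lands on a UTF-8 boundary even when the string contains non-ASCII text.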
Bugs and footguns surfaced in past PR reviews. Check this list whenever you write code that resolves names or chains fallback predicates.
Chained return matches!(...) swallows the false branch
When a predicate has multiple fallback strategies (e.g., "is this a known qualified name? if not, is it a TypedDict subclass?"), an early return matches!(qn.segments(), ...) exits the function on false as well as on true, skipping the rest:
// BUG: returns false instead of falling through to the next check.
if let Some(qn) = semantic.resolve_qualified_name(head) {
    return matches!(qn.segments(), ["collections", "abc", "Mapping" | ..]);
}
annotation_is_typed_dict_subclass(head, semantic) // never reached when qn resolves
Fix: make the early return conditional on the match being true, so the function falls through on a non-match:
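A minimal standalone sketch of the buggy and fixed control flow follows. The two helper functions are hypothetical stand-ins (a dotted name "resolves", and a name ending in `MyTypedDict` counts as a TypedDict subclass); only the shape of the early return is the point.

```rust
/// Hypothetical stand-in: pretend only dotted names resolve to a qualified name.
fn resolve_qualified_name(head: &str) -> Option<Vec<&str>> {
    head.contains('.').then(|| head.split('.').collect())
}

/// Hypothetical stand-in for the fallback strategy.
fn is_typed_dict_subclass(head: &str) -> bool {
    head.ends_with("MyTypedDict")
}

/// Buggy: a resolved-but-non-matching name returns `false` immediately.
fn is_mapping_buggy(head: &str) -> bool {
    if let Some(segments) = resolve_qualified_name(head) {
        return matches!(segments.as_slice(), ["collections", "abc", "Mapping"]);
    }
    is_typed_dict_subclass(head)
}

/// Fixed: return early only on a positive match, otherwise fall through.
fn is_mapping_fixed(head: &str) -> bool {
    if let Some(segments) = resolve_qualified_name(head) {
        if matches!(segments.as_slice(), ["collections", "abc", "Mapping"]) {
            return true;
        }
    }
    is_typed_dict_subclass(head)
}
```

For the input `models.MyTypedDict`, the buggy version answers `false` because the name resolves but does not match, while the fixed version reaches the fallback and answers `true`.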
When the file has from __future__ import annotations (PEP 563), names inside return-type / parameter annotations are forward references and do not appear in the resolved_names map. semantic.resolve_name(&name) returns None. Always pair it with semantic.lookup_symbol(&name.id) as a fallback when resolving annotation-position names — see "Resolving Named Annotations to Class Definitions" in helpers.md for the canonical pattern.
This is easy to miss: a hand-rolled fixture without the __future__ import will pass, but the standard airflow fixtures use it and silently exhibit the bug.
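The fallback chain can be modelled without Ruff to show why the eager map alone is not enough. `Model` below is a toy, hypothetical structure: one map plays the role of the eagerly populated `resolved_names`, the other plays the role of a live scope lookup, and binding IDs are plain integers.

```rust
use std::collections::HashMap;

/// Toy semantic model. Hypothetical; not Ruff's `SemanticModel`.
struct Model {
    /// Names resolved eagerly by the visitor. Deferred annotation names are
    /// absent from this map.
    resolved_names: HashMap<&'static str, u32>,
    /// Live scope table, consulted on demand.
    scope_symbols: HashMap<&'static str, u32>,
}

impl Model {
    fn resolve_name(&self, name: &str) -> Option<u32> {
        self.resolved_names.get(name).copied()
    }

    fn lookup_symbol(&self, name: &str) -> Option<u32> {
        self.scope_symbols.get(name).copied()
    }

    /// Eager map first, live lookup as the fallback.
    fn resolve_annotation_name(&self, name: &str) -> Option<u32> {
        self.resolve_name(name).or_else(|| self.lookup_symbol(name))
    }
}
```

With an empty `resolved_names` map (the deferred-annotation situation), `resolve_name` alone returns `None`, whereas the chained version still finds the binding through the scope table.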
Field naming: semantics, not derivation
A violation field used to drive fix-title selection should be named for what it captures, not how it was derived. A reader of the struct should be able to tell what condition the bool represents without reading the rule body.
// Worse: reader has to find the call site to learn what `inferred` means.
pub(crate) struct AirflowTaskImplicitMultipleOutputs {
    inferred: bool,
}

// Better: the field name is the assertion.
pub(crate) struct AirflowTaskImplicitMultipleOutputs {
    annotation_is_mapping: bool,
}
Eager let before short-circuit &&
If one of two predicates is much more expensive than the other (e.g., a function-body traversal versus a name lookup), don't bind both to let before combining them — that forces the expensive one to run even when the cheap one would have short-circuited. Inline the expensive call into the && so it only runs when necessary:
// Worse: body traversal runs for every flagged decorator, even when the
// annotation already determined the answer.
let annotation_is_mapping = ...;
let body_returns_dict = body_has_dict_return(function_def, semantic);
if !annotation_is_mapping && !body_returns_dict {
    return;
}

// Better: short-circuits when the annotation suffices.
let annotation_is_mapping = ...;
if !annotation_is_mapping && !body_has_dict_return(function_def, semantic) {
    return;
}
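The cost difference is easy to observe with a call counter. In the standalone sketch below, `body_has_dict_return` is a hypothetical stand-in that just counts its invocations, and the condition is written in its positive form (`a || b`, which by De Morgan is the negation of `!a && !b`).

```rust
use std::cell::Cell;

/// Hypothetical expensive predicate; counts how often it runs.
fn body_has_dict_return(calls: &Cell<u32>, returns_dict: bool) -> bool {
    calls.set(calls.get() + 1);
    returns_dict
}

/// Eager: the traversal runs even when the annotation already decides the outcome.
fn should_flag_eager(annotation_is_mapping: bool, returns_dict: bool, calls: &Cell<u32>) -> bool {
    let body_returns_dict = body_has_dict_return(calls, returns_dict);
    annotation_is_mapping || body_returns_dict
}

/// Lazy: `||` skips the traversal when the cheap check is already true.
fn should_flag_lazy(annotation_is_mapping: bool, returns_dict: bool, calls: &Cell<u32>) -> bool {
    annotation_is_mapping || body_has_dict_return(calls, returns_dict)
}
```

Both functions return the same answer for every input; only the number of times the expensive predicate runs differs.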
Single-use const &[&str] allowlists
If a const &[&str] is only consulted at one call site, inline it as a matches! arm at the use site. Named constants pay for themselves only when they have multiple callers or independent documentation value; a single-caller constant just adds a layer of indirection.
Choose the template below based on rule category. Place the resulting file at crates/ruff_linter/src/rules/airflow/rules/<snake_case_name>.rs.
Template: General Best-Practice Rule (AIR001-099)
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::{self as ast};
use ruff_python_semantic::Modules;
use ruff_text_size::Ranged;

use crate::Violation;
use crate::checkers::ast::Checker;

/// ## What it does
/// <One-line description of what the rule checks for.>
///
/// ## Why is this bad?
/// <Explanation of why the flagged pattern is problematic.>
///
/// ## Example
/// ```python
/// <Python code that triggers the violation>
/// ```
///
/// Use instead:
/// ```python
/// <Corrected Python code>
/// ```
#[derive(ViolationMetadata)]
#[violation_metadata(preview_since = "NEXT_RUFF_VERSION")]
pub(crate) struct MyRuleName;

impl Violation for MyRuleName {
    #[derive_message_formats]
    fn message(&self) -> String {
        "<Diagnostic message shown to the user>".to_string()
    }
}

/// AIRxxx
pub(crate) fn my_rule_name(checker: &Checker, /* appropriate AST node */) {
    if !checker.semantic().seen_module(Modules::AIRFLOW) {
        return;
    }

    // Rule logic here...
    checker.report_diagnostic(MyRuleName, node.range());
}
Template: Airflow 3 Migration Rule (AIR3xx)
For rules that flag removed/moved/renamed symbols, use the existing infrastructure in helpers.rs:
use ruff_macros::{ViolationMetadata, derive_message_formats};
use ruff_python_ast::{self as ast, Expr};
use ruff_python_semantic::Modules;
use ruff_text_size::{Ranged, TextRange};

use crate::{FixAvailability, Violation};
use crate::checkers::ast::Checker;
use crate::rules::airflow::helpers::{Replacement, is_guarded_by_try_except};

/// ## What it does
/// Checks for uses of deprecated Airflow symbols removed in Airflow X.Y.
///
/// ## Why is this bad?
/// These symbols were removed/moved in Airflow X.Y and will cause runtime errors.
///
/// ## Example / Use instead blocks...
#[derive(ViolationMetadata)]
#[violation_metadata(preview_since = "NEXT_RUFF_VERSION")]
pub(crate) struct MyMigrationRule {
    deprecated: String,
    replacement: String,
}

impl Violation for MyMigrationRule {
    const FIX_AVAILABILITY: FixAvailability = FixAvailability::Sometimes;

    #[derive_message_formats]
    fn message(&self) -> String {
        let MyMigrationRule { deprecated, replacement } = self;
        format!("`{deprecated}` is removed in Airflow X.Y; use `{replacement}` instead")
    }

    fn fix_title(&self) -> Option<String> {
        let MyMigrationRule { replacement, .. } = self;
        Some(format!("Use `{replacement}`"))
    }
}

pub(crate) fn my_migration_rule(checker: &Checker, expr: &Expr) {
    if !checker.semantic().seen_module(Modules::AIRFLOW) {
        return;
    }

    // Dispatch based on expression type:
    match expr {
        Expr::Attribute(ast::ExprAttribute { attr, .. }) => {
            check_name(checker, expr, attr.range());
        }
        Expr::Name(_) => {
            check_name(checker, expr, expr.range());
        }
        _ => {}
    }
}

fn check_name(checker: &Checker, expr: &Expr, ranged: TextRange) {
    let Some(qualified_name) = checker.semantic().resolve_qualified_name(expr) else {
        return;
    };

    let (replacement, module, name) = match qualified_name.segments() {
        ["airflow", "old_module", "OldName"] => (
            Replacement::Rename {
                module: "airflow.new_module",
                name: "NewName",
            },
            "airflow.old_module",
            "OldName",
        ),
        _ => return,
    };

    // Skip if guarded by try-except (conditional import):
    if is_guarded_by_try_except(expr, module, name, checker.semantic()) {
        return;
    }

    let mut diagnostic = checker.report_diagnostic(
        MyMigrationRule {
            deprecated: name.to_string(),
            replacement: replacement.to_string(),
        },
        ranged,
    );

    // Optionally generate a fix:
    // if let Some(fix) = generate_import_edit(...) {
    //     diagnostic.set_fix(fix);
    // }
}
This skill should be used when the user asks to "add a new airflow rule", "create an airflow lint rule", "implement an airflow inspection", "new AIR rule", or discusses creating Ruff linter rules in the airflow category.
license: Apache-2.0
Creating a New Airflow Lint Rule in Ruff
This skill guides creating new Airflow-specific lint rules (AIR prefix) in the Ruff codebase. Detail content lives in references/ files — load them as needed instead of carrying them in context up front.
Reference Index
Read the relevant file when you reach a step that needs it:
references/import-paths.md — Airflow 2-to-3 import path mapping table (use when the rule must match both old and new SDK paths).
references/templates.md — Full rule-file skeletons for general best-practice (AIR001-099) and Airflow 3 migration (AIR3xx) rules.
references/helpers.md — helpers.rs utilities: Replacement enums, try-except guarding, fix generation, class/module identification, template_fields scope, transitive inheritance, resolving named annotations to class defs, Dag-file detection, scope-based context detection.
references/pitfalls.md — Review-informed bugs to avoid: chained return matches!(), resolve_name under from __future__ import annotations, field naming, eager let before &&, single-use allowlists. Skim this before writing predicates that resolve names or chain fallback checks.
Pre-Implementation: Gather Context
Before writing any code, ask the user the following questions (if not already answered in their request):
Airflow version target:
Which version of Airflow does this rule target?
Airflow 2 only
Airflow 3 onward
Both Airflow 2 and 3
DAG API targeting (for general best-practice rules AIR001-099 only):
Airflow DAGs can be written using the TaskFlow API (decorators like @task.branch) or the standard operator API (BranchPythonOperator). Both are equally supported. Should this rule target:
TaskFlow API only (decorator-based)
Operator API only (operator-based)
Both (recommended — the rule should be DAG implementation-agnostic)
If both: see the dual-dispatch pattern in references/patterns.md.
Local Airflow repository for validation:
Do you have a local clone of the Airflow repository? If so, provide the path (e.g., ~/repositories/airflow).
If available, use it after implementation to validate against real-world code (see Step 12 below).
Code prefix validation: If a rule targets Airflow 3 (onward), its code MUST start with AIR3## (e.g., AIR301, AIR302, AIR311). If the user specified a code that doesn't follow this pattern, WARN them before proceeding.
Important distinction for AIR3xx rules: AIR3xx rules are migration rules that flag old-style (Airflow 2) imports/patterns to help migrate to Airflow 3. They should only match deprecated import paths — NOT the new airflow.sdk paths (which are the correct replacements). However, shared helpers that check context (e.g., "is this function decorated with @task?") should match both old and new paths, since deprecated patterns inside a task function need to be flagged regardless of which import style the decorator uses.
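The asymmetry between the two kinds of matcher can be shown with plain slice patterns. The sketch below is standalone and uses hypothetical function names; the `@task` paths come from the import-path table, and the segments are passed as a simple `&[&str]` rather than a resolved qualified name.

```rust
/// Migration-rule matcher: flags only the deprecated (Airflow 2) path.
fn is_deprecated_task_import(segments: &[&str]) -> bool {
    matches!(segments, ["airflow", "decorators", "task"])
}

/// Shared context helper: recognises the decorator under either import style.
fn is_task_decorator(segments: &[&str]) -> bool {
    matches!(
        segments,
        ["airflow", "decorators", "task"] | ["airflow", "sdk", "task"]
    )
}
```

A migration rule built on the first predicate stays silent on code that already uses `airflow.sdk`, while a context check built on the second still recognises task functions in migrated files.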
Rule Numbering Scheme
| Range | Category | Description |
| --- | --- | --- |
| AIR001-099 | General best-practice | Style, readability, common mistakes (not version-specific) |
| AIR301 | Removed in 3.0 | Symbols/args fully removed in Airflow 3.0 with no compat layer |
| AIR302 | Moved to provider in 3.0 | Symbols moved to external provider packages (required migration) |
|  |  | Deprecated with compat layer; still works but will break later |
| AIR312 | Suggested provider move for 3.0 | Deprecated compat layer for provider migrations |
| AIR321 | Moved in 3.1 | Symbols moved/deprecated in Airflow 3.1 |
Checklist
Follow these steps in order. Each step is mandatory.
0. Search for Reusable Utilities
Before writing rule logic, search for existing utilities that can be reused:
ruff_python_semantic: Check SemanticModel methods (e.g., resolve_qualified_name, match_builtin_expr, match_typing_expr) and analyze/visibility.rs for decorator-checking patterns.
ruff_python_ast: Check helpers.rs for AST traversal utilities (e.g., map_callable, ReturnStatementVisitor).
ruff_python_trivia::Cursor: For rules that need ad-hoc parsing of string content (e.g., Jinja templates, SQL fragments), use Cursor instead of chaining strip_prefix/strip_suffix/find (see AIR201 for an example).
crate::rules::airflow::helpers: Check for existing airflow-specific helpers — full inventory in references/helpers.md.
Existing airflow rules: Check rules like AIR301 (removal_in_3.rs) for patterns that may already exist or could be extracted.
If a pattern would be useful in multiple rules, extract it into helpers.rs as a shared utility rather than duplicating code.
1. Choose a Rule Code and Name
Check existing codes in crates/ruff_linter/src/codes.rs under the // airflow section.
Pick the next available code in the appropriate range.
Name the struct following the convention: the name should make sense as "allow ${name}". For example, TaskBranchAsShortCircuit reads as "allow task branch as short circuit".
Use the appropriate skeleton from references/templates.md (general best-practice or migration rule). For Airflow-2-and-3 rules, also consult references/import-paths.md for the dual-path matching pattern.
3. Register the Rule Code
In crates/ruff_linter/src/codes.rs, add an entry under the // airflow section:
Cases that SHOULD trigger the rule (with # AIRxxx comments)
Cases that should NOT trigger the rule (edge cases, similar but valid patterns)
For migration rules: cases guarded by try-except (should NOT trigger)
If the rule resolves names inside annotations: include at least one fixture with from __future__ import annotations to catch the deferred-annotation bug — see references/pitfalls.md.
7. Add the Test Case
In crates/ruff_linter/src/rules/airflow/mod.rs, add a #[test_case] line:
Always run cargo fmt before testing or committing. Rust formatting issues will cause CI failures.
cargo fmt -p ruff_linter
9. Run Tests and Accept Snapshots
# Verify output manually first:
cargo run -p ruff -- check crates/ruff_linter/resources/test/fixtures/airflow/AIRxxx.py --no-cache --preview --select AIRxxx
# Run the test (will fail first time, generating a snapshot):
RUFF_UPDATE_SCHEMA=1 cargo nextest run -p ruff_linter -- "airflow::tests"
# Accept the snapshot:
cargo insta accept
# Verify the test passes:
RUFF_UPDATE_SCHEMA=1 cargo nextest run -p ruff_linter -- "airflow::tests"
10. Regenerate Docs and Schemas
cargo dev generate-all
11. Run All Checks
cargo clippy -p ruff_linter --all-targets --all-features -- -D warnings
uvx prek run -a
12. Validate Against Airflow (if local repo available)
If the user provided a path to a local Airflow repository during pre-implementation, run the new rule against it:
True positives: Violations that correctly flag the anti-pattern. Report the count and sample locations.
False positives: Violations that flag code that is actually correct. If found:
Identify the pattern causing the false positive.
Update the rule logic to exclude it (e.g., add a guard clause).
Add the false-positive pattern as a non-violation test case in the fixture.
Re-run steps 8–11 to update snapshots and verify.
Report findings to the user before finalizing.
Key Conventions
Use checker.report_diagnostic(ViolationStruct, range) — NOT Diagnostic::new().
Add #[violation_metadata(preview_since = "NEXT_RUFF_VERSION")] for new rules.
Always guard with checker.semantic().seen_module(Modules::AIRFLOW).
For migration rules, use is_guarded_by_try_except to avoid false positives on conditional imports.
For rules targeting both Airflow 2 and 3, match both old and new (airflow.sdk) import paths (see references/import-paths.md).
Reuse over duplication: before writing a utility function, search ruff_python_semantic, ruff_python_ast, and airflow/helpers.rs for existing implementations. If a pattern is useful in multiple rules, add it to helpers.rs rather than keeping it local.
Follow early-return style (guard clauses) rather than deeply nested if-let chains.
Prefer let chains (if let combined with &&) over nested if let when possible.
Avoid panic!, unreachable!, or .unwrap().
Use #[expect()] over #[allow()] for suppressing clippy lints.
For internal (non-public) functions, implementation notes should be /// doc comments, not // comments.
Short-circuit expensive checks: when a violation requires either of two predicates and one of them traverses the function body, order the cheap one first in && so the expensive one only runs when needed. Avoid eager let-bindings of expensive traversals.
Inline single-use allowlists: if a const &[&str] is only consulted at one call site, inline it as a matches! arm at the use site.
Name violation fields by their semantics, not their derivation: a field used to decide between two fix titles is annotation_is_mapping, not inferred.