Workflow Finding Type Race Condition

Problem

When Workflow A calls Workflow B via a run_workflow node, both workflows must share the same entity_name_type (finding type). This is validated at save time and at every WorkflowManager instantiation. However, entity_name_type lives on the mutable Workflow record — not in the versioned snapshot — so it can be changed at any time via PUT /workflows/<cuid>, breaking running executions.

Root Cause

Two separate API paths update a workflow:

| Operation | Endpoint | Versioned? |
| --- | --- | --- |
| Settings (entity_name_type, trigger, title…) | `PUT /workflows/<cuid>` | No, writes directly to the `workflows` table |
| Graph (nodes, edges) | `POST /workflows/<cuid>/<version>` | Yes, creates a new version if active executions exist |

The Version source JSON only contains {nodes, edges}. All entity context (entity_name_type, entity_name, primary_record_type_id) is read from the live Workflow record at runtime. When resuming an execution, validate_source() compares the current entity_name_type values — not the values at the time the execution started.

Data flow on resume

execution.get_workflow_manager()
  └─> WorkflowManager.__init__(workflow=self.workflow, version=self.version)
        └─> version_to_use.validate_source()
              └─> NodesManager.validate_nodes_data(
                    workflow=self.workflow,     ← CURRENT mutable record
                    nodes=version.source["nodes"] ← frozen graph
                  )
                  └─> _validate_run_workflow_node():
                        workflow.entity_name_type  (current, mutable)
                        vs target_workflow.entity_name_type (current, mutable)
                        → WorkflowValidationError if mismatch

Reproduction

Setup

Workflow #3: "Model Limitation Artifact Workflow 1", finding type = Model Limitation
Workflow #4: "Policy Exception Artifact Workflow 1", finding type originally Policy Exception, changed to Model Limitation so that it matches Workflow #3
Workflow #4 has a run_workflow node pointing to Workflow #3
With both workflows on the same finding type, Workflow #4 saves successfully and an execution can be started

Steps

1. Start execution of Workflow #4 on a Policy Exception artifact
2. Execution pauses at user_action node (waiting for user input)
3. Edit Workflow #4 settings → change entity_name_type back to Policy Exception (different from Workflow #3's Model Limitation)
4. Return to the running execution → submit the user action form

Result

WorkflowValidationError:
  Run Workflow: Incompatible entity name types.
  Current workflow has type 'cmnnax6gd002m73i5xjmwdn9q'
  but referenced workflow has type 'cmnnax6ha002q73i5cqfdkjnn'.

The execution is permanently stuck — every interaction re-triggers the same validation. Only reverting the finding type via DB or UI unblocks it.
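
The reproduction above can be condensed into a toy model. This is only a sketch under stated assumptions: the `Workflow`, `Version`, and `validate_source` names mirror the real ones, but the classes, the `registry` lookup, and the node keys `type` and `target_cuid` are hypothetical stand-ins rather than the actual implementation. It shows that a frozen graph validated against live, mutable records passes at start and fails on resume once the setting is edited.

```python
from dataclasses import dataclass


class WorkflowValidationError(Exception):
    """Stand-in for the real validation error."""


@dataclass
class Workflow:
    cuid: str
    entity_name_type: str  # mutable setting, not part of any version


@dataclass(frozen=True)
class Version:
    # Frozen graph: only nodes and edges, no entity context.
    source: dict


def validate_source(workflow, version, registry):
    """Compare the live finding types of a workflow and its run_workflow targets."""
    for node in version.source["nodes"]:
        if node["type"] != "run_workflow":
            continue
        target = registry[node["target_cuid"]]
        if workflow.entity_name_type != target.entity_name_type:
            raise WorkflowValidationError(
                f"Incompatible entity name types: {workflow.entity_name_type!r} "
                f"vs {target.entity_name_type!r}"
            )


def reproduce():
    wf3 = Workflow("wf3", "model_limitation")
    wf4 = Workflow("wf4", "model_limitation")
    registry = {"wf3": wf3, "wf4": wf4}
    version = Version(
        {"nodes": [{"type": "run_workflow", "target_cuid": "wf3"}], "edges": []}
    )

    validate_source(wf4, version, registry)    # execution starts: types match
    wf4.entity_name_type = "policy_exception"  # settings edited while paused

    try:
        validate_source(wf4, version, registry)  # resume re-validates live records
    except WorkflowValidationError:
        return "stuck"
    return "resumed"
```

Because the version never changes, nothing the paused execution itself does can make the second call pass; only changing the live setting back does.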

Key Code Locations

| File | Lines | What |
| --- | --- | --- |
| `src/backend/db/workflow.py` | 231-232 | `entity_name_type` column on Workflow |
| `src/backend/db/workflow.py` | 2977-3030 | Version model: `source` only stores nodes/edges |
| `src/backend/db/workflow.py` | 2012-2150 | `update_workflow`: no guards for active executions |
| `src/backend/db/workflow.py` | 3137-3162 | `validate_source`: uses `self.workflow` (current, not snapshot) |
| `src/backend/db/workflow.py` | 2549-2637 | `get_workflow_dependencies`: finds workflows referencing this one |
| `src/backend/workflows/managers.py` | 1258-1322 | `_validate_run_workflow_node`: the check that fails |
| `src/backend/workflows/managers.py` | 2163-2169 | `validate_source()` called in `WorkflowManager.__init__` |
| `src/backend/handlers/workflows_handlers.py` | 50-117 | Handler that calls `update_workflow` |

Proposed Solution: Guard at Settings Update Time

Add checks in Workflow.update_workflow() (src/backend/db/workflow.py:2012) when entity_name_type is changing:

Guard 1: Block if this workflow has active executions

# Inside Workflow.update_workflow(); assumes `from sqlalchemy import func, select`.
if entity_name_type != workflow.entity_name_type:
    # Count this workflow's executions that are still in flight.
    active_count = db.session.execute(
        select(func.count()).select_from(Execution).where(
            Execution.workflow_id == workflow.id,
            Execution.status.in_([
                Execution.STATUS_ACTIVE,
                Execution.STATUS_WAITING,
                Execution.STATUS_SCHEDULED,
            ])
        )
    ).scalar_one()
    if active_count > 0:
        raise BadRequestError(
            "Cannot change finding type while this workflow has active executions."
        )

Guard 2: Block if other workflows reference this one via `run_workflow` and have active executions

get_workflow_dependencies(cuid) already returns workflows that have run_workflow nodes pointing to this workflow (used today to block deletion). Extend the check:

# Inside Workflow.update_workflow(); assumes `from sqlalchemy import func, select`.
if entity_name_type != workflow.entity_name_type:
    # Workflows whose run_workflow nodes point at this workflow.
    deps = cls.get_workflow_dependencies(workflow.cuid)
    dependent_workflows = deps.get("dependent_workflows", [])
    if dependent_workflows:
        # Check if any dependent workflow has active executions
        dep_cuids = [dw["cuid"] for dw in dependent_workflows]
        dep_active = db.session.execute(
            select(func.count()).select_from(Execution).where(
                Execution.workflow_id.in_(
                    select(cls.id).where(cls.cuid.in_(dep_cuids))
                ),
                Execution.status.in_([
                    Execution.STATUS_ACTIVE,
                    Execution.STATUS_WAITING,
                    Execution.STATUS_SCHEDULED,
                ])
            )
        ).scalar_one()
        if dep_active > 0:
            raise BadRequestError(
                "Cannot change finding type: workflows referencing this one "
                "have active executions."
            )
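
Guards 1 and 2 apply the same "still in flight" status filter, so one way to keep them in sync is to route both through a single counting helper. The following is a minimal in-memory sketch of that idea, not the proposed database code: `count_active`, `guard_finding_type_change`, the `referencing` mapping, and the lowercase status strings are all hypothetical stand-ins (the strings stand in for the `Execution.STATUS_*` constants, and `referencing` plays the role of `get_workflow_dependencies`).

```python
from dataclasses import dataclass

# Placeholder values standing in for Execution.STATUS_ACTIVE / _WAITING / _SCHEDULED.
ACTIVE_STATUSES = frozenset({"active", "waiting", "scheduled"})


class BadRequestError(Exception):
    """Stand-in for the application's BadRequestError."""


@dataclass
class Execution:
    workflow_cuid: str
    status: str


def count_active(executions, workflow_cuids):
    """Count in-flight executions belonging to any of the given workflows."""
    wanted = set(workflow_cuids)
    return sum(
        1 for e in executions
        if e.workflow_cuid in wanted and e.status in ACTIVE_STATUSES
    )


def guard_finding_type_change(cuid, old_type, new_type, executions, referencing):
    """Raise BadRequestError if changing the finding type could strand an execution.

    `referencing` maps a workflow cuid to the cuids of workflows whose
    run_workflow nodes point at it.
    """
    if new_type == old_type:
        return  # nothing is changing, so no guard applies
    # Guard 1: this workflow's own executions.
    if count_active(executions, [cuid]) > 0:
        raise BadRequestError(
            "Cannot change finding type while this workflow has active executions."
        )
    # Guard 2: executions of workflows that call this one.
    if count_active(executions, referencing.get(cuid, [])) > 0:
        raise BadRequestError(
            "Cannot change finding type: workflows referencing this one "
            "have active executions."
        )
```

In the real code the helper would presumably wrap the `select(func.count())` query shown above, so that the status list is defined in exactly one place.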

Guard 3 (optional): Block if this workflow references other workflows with different finding type

A graph save already catches this case because it runs validate_source(). A settings save does not: the settings update never calls validate_source, so if a referenced workflow's type has changed in the meantime, the parent's next settings save will not detect the mismatch. Consider also running validate_source on the latest version during a settings update whenever entity_name_type changes.
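
As a rough illustration of Guard 3, the check can be run against the proposed finding type before the settings change is committed. This is a sketch only: `find_mismatched_targets` is a hypothetical helper, and the node keys `type` and `target_cuid` and the `target_types` mapping are assumptions about the data shape, not the real schema.

```python
def find_mismatched_targets(proposed_type, nodes, target_types):
    """Return cuids of run_workflow targets whose finding type differs.

    proposed_type: the entity_name_type the settings update wants to write.
    nodes:         node list from the latest version's source.
    target_types:  mapping of workflow cuid -> its current entity_name_type.
    """
    mismatched = []
    for node in nodes:
        if node.get("type") != "run_workflow":
            continue  # only run_workflow nodes link two workflows
        cuid = node["target_cuid"]
        if target_types[cuid] != proposed_type:
            mismatched.append(cuid)
    return mismatched
```

A settings update would reject the change when the returned list is non-empty, which is the same outcome validate_source produces at graph save time.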

Why This Approach

Matches existing pattern: deletion already uses get_workflow_dependencies() — same guard, different mutation
Minimal scope: one method, two/three checks, no migration
Doesn't require versioning entity_name_type: that would need a migration + changes to Version model + changes to how validate_source resolves entity context
Closes the gap: save-time validation already works for the graph; this fix prevents the mismatch from being created after a valid save

Long-Term Consideration

Versioning entity_name_type alongside the node graph in source would make the system fundamentally resilient — executions would always use the finding type from when the version was created. But this is a larger change:

Schema migration to add entity_name_type to Version or source JSON
Changes to validate_source to read from version context instead of live Workflow
Migration path for existing versions without the field
Needs careful analysis of all places that read workflow.entity_name_type
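
One possible shape for the versioned approach, including the migration path for existing versions, is sketched below. It assumes the finding type would be stored under an `entity_name_type` key inside the source JSON; that key placement and both function names are hypothetical, since the document leaves open whether the field belongs on Version or in source.

```python
def snapshot_source(nodes, edges, live_type):
    """Build a version source that freezes the finding type with the graph."""
    return {"nodes": nodes, "edges": edges, "entity_name_type": live_type}


def resolve_entity_name_type(version_source, live_type):
    """Prefer the type frozen in the version; fall back to the live record.

    Versions created before the change have no snapshot, so they keep
    today's behaviour and read the mutable Workflow value.
    """
    snapshot = version_source.get("entity_name_type")
    return snapshot if snapshot is not None else live_type
```

With this fallback, new versions become immune to later settings edits while legacy versions behave exactly as they do now, so no backfill is strictly required on day one.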

Additional Notes

entity_name (Finding vs InventoryModel) cannot be changed in the UI, but this is a frontend restriction only; the backend accepts the change. Not a concern for this issue.
active_execution_count() only exists on Version, not Workflow — the guard needs a direct query on Execution.workflow_id.

panchicore/workflow-finding-type-race-condition.md
