Action Readiness
When a workflow is running, Waymark needs to decide when each node is ready to execute. The answer is one rule, applied uniformly to every node type - actions, barriers, joins, returns:
Every node tracks readiness. A node is only enqueued when all its required predecessors have completed.
That's the entire scheduling model. The corollary - and the reason it scales - is that completions push their effects forward to the immediate downstream nodes; we never rescan the graph to find runnable work.
The core rule
Same shape applied to every node:
- An action is ready when its predecessors have completed.
- A join (fan-in) is ready when its required predecessor count is met.
- A return is ready when its predecessor produces a value.
There's no special-case scheduler for "actions" vs "barriers" vs "loops" - there's one push-based propagation, and the node type only affects what happens after it becomes ready.
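The single rule above can be sketched in a few lines. This is an illustrative model, not Waymark's actual API: Node, required_count, and record_completion are hypothetical names.

```python
# Hypothetical sketch of the uniform readiness rule; names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    required_count: int                     # predecessors that must complete first
    completed_preds: set = field(default_factory=set)

    def record_completion(self, pred: str) -> bool:
        """Mark one predecessor complete; return True if the node just became ready."""
        self.completed_preds.add(pred)
        return len(self.completed_preds) == self.required_count

# An action with one predecessor is ready after that single completion;
# a join with required_count=2 waits for both branches.
join = Node("join", required_count=2)
assert not join.record_completion("branch_a")   # not ready yet
assert join.record_completion("branch_b")       # second completion makes it ready
```

The same record_completion call services an action, a join, or a return; only required_count differs.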
Vocabulary
A few terms used throughout the runtime:
- State machine edges - execution order (who can run after whom).
- Data-flow edges - variable writes from one node to another.
- Inline nodes - assignments, expressions, branches, returns. These don't need worker dispatch; they advance in the runloop.
- Frontier nodes - actions, barriers, outputs. Inline traversal stops at these because they need external coordination.
- Readiness - determined by predecessor completion status.
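The inline/frontier split can be made concrete with a small sketch. The sets and helper below are assumptions for illustration, not the runtime's real representation:

```python
# Illustrative split of node kinds; names are assumptions, not Waymark's enum.
INLINE = {"assign", "expr", "branch", "return"}   # advance in the runloop
FRONTIER = {"action", "barrier", "output"}        # need external coordination

def traversal_continues(node_kind: str) -> bool:
    """Inline traversal keeps advancing through inline nodes and stops at frontier ones."""
    return node_kind in INLINE

assert traversal_continues("assign")      # runloop keeps going
assert not traversal_continues("action")  # stop: dispatch to a worker
```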
Push-based scheduling
When a node completes, the runloop:
- Marks the node's completion in the execution graph.
- Stores the result and updates variables in the workflow scope.
- Evaluates guards on outgoing edges.
- For each successor, checks whether all required predecessors are complete.
- Adds newly-ready successors to the ready queue.
All of this is in-memory; durable state is batched periodically to Postgres. The runloop never does a global scan to ask "what's runnable right now?" - every completion knows what it unblocks, and only those nodes are touched.
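The completion steps above can be sketched as a single handler. This is a minimal in-memory model under assumed names (RunLoop, on_complete, edge guards as predicates); the real runtime also batches durable state to Postgres, which is omitted here:

```python
# Minimal sketch of the push-based completion handler; names are assumptions.
from collections import defaultdict, deque

class RunLoop:
    def __init__(self, edges, required):
        self.edges = edges                  # node -> list of successor nodes
        self.required = required            # node -> required predecessor count
        self.done_preds = defaultdict(int)  # node -> completed predecessor count
        self.scope = {}                     # workflow variables
        self.ready = deque()                # newly-ready nodes awaiting execution

    def on_complete(self, node, result=None, guard=lambda succ: True):
        # 1-2. Mark completion and store the result in the workflow scope.
        if result is not None:
            self.scope[node] = result
        # 3-5. Push the effect forward to immediate successors only - no graph scan.
        for succ in self.edges.get(node, []):
            if not guard(succ):             # edge guard failed: don't propagate
                continue
            self.done_preds[succ] += 1
            if self.done_preds[succ] == self.required[succ]:
                self.ready.append(succ)     # all required predecessors done

edges = {"a": ["join"], "b": ["join"]}
rl = RunLoop(edges, {"join": 2})
rl.on_complete("a", result=1)
rl.on_complete("b", result=2)
assert list(rl.ready) == ["join"]
```

Each call touches only the completing node's successors, which is exactly why there is never a "what's runnable?" scan.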
Frontier nodes
Inline traversal stops at three node kinds, because each requires external coordination:
- Action: dispatched to a Python worker.
- Barrier: waits for multiple predecessors before firing - spread aggregators, joins of parallel branches.
- Output: the workflow's terminal node.
Joins with required_count = 1 collapse to inline (no actual barrier needed). Joins with more predecessors become real barriers.
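The collapse rule is simple enough to state as a predicate. The helper below is hypothetical, for illustration only:

```python
# Illustrative collapse rule for joins; the helper name is an assumption.
def needs_barrier(required_count: int) -> bool:
    """A join waiting on one predecessor is just inline flow;
    more than one requires a real barrier node."""
    return required_count > 1

assert not needs_barrier(1)   # collapses to inline traversal
assert needs_barrier(3)       # real barrier: waits for all three
```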
A worked example
Take a fan-out with spread:
items = @fetch_items()
results = spread items:item -> @process_item(item=item)
summary = @summarize(items=results)
The completion flow:
- @fetch_items() completes; items lands in the workflow scope.
- The spread node creates N action instances, one per item, each tagged with a spread_index.
- Each @process_item completion stores its result against its spread_index.
- The barrier becomes ready once all N results have arrived.
- The barrier aggregates results into an ordered list and writes it to scope.
- @summarize becomes ready and receives the aggregated results.
At every step, only the immediate downstream nodes are touched. No scan, no quadratic walk.
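The barrier's bookkeeping in that flow can be sketched directly: results keyed by spread_index, with the ordered aggregate produced only once all N arrive. The class and method names are illustrative, not the runtime's:

```python
# Sketch of a spread barrier's bookkeeping; names are assumptions for illustration.
class SpreadBarrier:
    def __init__(self, n: int):
        self.n = n
        self.results = {}                    # spread_index -> result

    def store(self, spread_index: int, result):
        """Record one @process_item result; return the ordered list once all N arrived."""
        self.results[spread_index] = result
        if len(self.results) == self.n:
            # Aggregate in spread order, regardless of completion order.
            return [self.results[i] for i in range(self.n)]
        return None                          # barrier not yet ready

barrier = SpreadBarrier(3)
assert barrier.store(1, "b") is None         # out-of-order arrival is fine
assert barrier.store(0, "a") is None
assert barrier.store(2, "c") == ["a", "b", "c"]
```

Because results are keyed by spread_index, workers can complete in any order and the aggregate stays ordered.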
Loops and branches
Loops and branches use the same model. They're just nodes and edges, arranged so back-edges and guards do the right thing.
A loop's IR looks roughly like:
fn main(input: [items], output: [results]):
    results = []
    for item in items:
        processed = @process_item(item=item)
        results = results + [processed]
    return results
One iteration flows like this:
- loop_init sets the internal index (inline).
- loop_cond evaluates the guard and picks continue vs break.
- loop_extract assigns item = items[__loop_i] (inline).
- @process_item is a frontier action - dispatched to a worker.
- On completion, the result lands as processed and results is updated.
- The append assignment runs inline; results is now in scope.
- loop_incr advances __loop_i; the back-edge routes to loop_cond.
- When the guard fails, the break edge routes to loop_exit.
A few specifics worth knowing:
- A loop head is a branch node with guarded edges.
- Loop back-edges are marked and do not count toward readiness.
- Each iteration updates state and resets node status where needed.
- Branch joins become barriers only when multiple paths can converge.
This keeps loops inside the same push-based model - no separate scheduler mode for iteration.
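The back-edge rule is what keeps the loop from deadlocking on iteration one: readiness counts only forward predecessors, so loop_cond never waits for loop_incr the first time around. A sketch of that counting rule, with an assumed edge representation:

```python
# Sketch: back-edges are excluded from the readiness count. The edge dicts
# and helper name are assumptions for illustration.
def readiness_threshold(incoming_edges) -> int:
    """Count only forward predecessors; marked back-edges never gate readiness."""
    return sum(1 for e in incoming_edges if not e["back_edge"])

# loop_cond has a forward edge from loop_init and a back-edge from loop_incr.
loop_cond_in = [
    {"from": "loop_init", "back_edge": False},
    {"from": "loop_incr", "back_edge": True},
]
# Only loop_init must complete before loop_cond can run the first iteration.
assert readiness_threshold(loop_cond_in) == 1
```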
Why this scales
Push-based scheduling costs O(d) per completion, where d is the number of downstream nodes touched by that completion. The cost is local to the completion, not to the graph size - a workflow with a million completed nodes costs no more per step than a workflow with a hundred. That's the key reason fan-outs of arbitrary width and long-running multi-stage pipelines stay tractable on Postgres.