
Inside the Multi-Section Orchestrator: How CodeLoop Builds Whole Apps Without You

CodeLoop Team · April 26, 2026 · 7 min read


Most AI coding tools are happy to write a function or a single screen. CodeLoop's multi-section orchestrator is built for the harder case: ship an entire app, end to end, while you go for a walk. This post unpacks how that actually works.

The master spec

Every multi-section project starts with a master_spec.md. It is a single Markdown file that lists every section the app needs — a landing page, an auth flow, a settings panel, a billing dashboard — together with the acceptance criteria for each section and the dependencies between them. Drop the file at the root of your project, run codeloop init, and the orchestrator parses it into a typed plan.
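Concretely, a master spec might look something like this. This is an invented sketch, not CodeLoop's documented grammar — the section names, "Accepts", and "Depends on" labels are assumptions for illustration:

```markdown
# Master Spec: Invoicing App

## section-1: landing
- Accepts: hero, pricing table, and signup CTA all render
- Depends on: nothing

## section-2: auth
- Accepts: email/password signup, login, and logout pass e2e tests
- Depends on: section-1

## section-3: billing
- Accepts: test-mode checkout completes end to end
- Depends on: section-2
```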

The plan is intentionally lightweight: each section gets a name, an acceptance file under docs/acceptance/section-N.md, and a list of upstream dependencies. CodeLoop does not prescribe a directory layout or a framework — the spec is pure intent, the agent decides how to translate it into code.
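If you wanted to model the parsed plan yourself, a minimal sketch could look like the following. The class and field names are assumptions, not CodeLoop's actual schema — the point is only how little the plan needs to carry:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionPlan:
    """One entry in the typed plan parsed from master_spec.md."""
    name: str
    acceptance_file: str          # e.g. docs/acceptance/section-2.md
    depends_on: tuple[str, ...] = ()


# A toy three-section plan: auth waits on landing, billing waits on auth.
plan = [
    SectionPlan("landing", "docs/acceptance/section-1.md"),
    SectionPlan("auth", "docs/acceptance/section-2.md", ("landing",)),
    SectionPlan("billing", "docs/acceptance/section-3.md", ("auth",)),
]
```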

The dependency graph

Once the master spec is parsed, CodeLoop builds a dependency graph. Sections that depend on nothing are eligible to start; sections that depend on section-1 wait for it to reach ready_for_review. The graph is re-read before every section transition, so a late-arriving dependency (e.g. you edit the spec mid-run) is honoured automatically.
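The readiness rule is easy to express. Here is a toy version in Python — the function name and the dict-of-dependencies shape are ours, not CodeLoop's:

```python
def eligible_sections(deps: dict[str, set[str]], done: set[str]) -> list[str]:
    """Sections whose every upstream dependency has reached ready_for_review."""
    return [
        section for section, upstream in deps.items()
        if section not in done and upstream <= done  # subset check
    ]


deps = {
    "landing": set(),
    "auth": {"landing"},
    "billing": {"auth"},
    "settings": {"auth"},
}

print(eligible_sections(deps, done=set()))        # only "landing" can start
print(eligible_sections(deps, done={"landing"}))  # now "auth" unblocks
```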

The graph also enforces global invariants. If the spec mentions an integration check between sections 3 and 4, the orchestrator inserts a codeloop_integration_check step before either section can be marked complete. Sections cannot ship in isolation if the spec says they must be wired together.

The section state machine

Each section walks a deterministic state machine:

planning → implementing → verifying → diagnosing → repairing → gate_check → ready_for_review

The agent advances the section by calling the appropriate MCP tool. codeloop_section_status returns the current state and any blocking repair tasks. codeloop_replan is available for the rare case where a section needs a different approach mid-flight; it preserves evidence already gathered (screenshots, test runs, build logs) so the agent does not pay for the same proof twice.

Importantly, the state machine is *resumable*. If you close your IDE in the middle of section 3, the next time the agent runs codeloop_section_status it picks up exactly where it left off — same evidence, same repair list, same confidence baseline.
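The linear happy path keeps the transition table tiny, and resumability falls out of persisting a single field. A sketch — the state names come from the post, but the class and its restore-from-saved-state constructor are invented for illustration:

```python
STATES = [
    "planning", "implementing", "verifying", "diagnosing",
    "repairing", "gate_check", "ready_for_review",
]


class Section:
    """A section that persists its position so a restart resumes in place."""

    def __init__(self, name: str, saved_state: str = "planning"):
        self.name = name
        self.state = saved_state  # restored from disk on resume

    def advance(self) -> str:
        i = STATES.index(self.state)
        if i < len(STATES) - 1:       # ready_for_review is terminal
            self.state = STATES[i + 1]
        return self.state


# Simulate closing the IDE mid-run: rebuild the section from its saved state.
s = Section("billing", saved_state="verifying")
print(s.advance())  # resumes exactly where it left off: "diagnosing"
```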

The integration check

A common failure mode in multi-section work is "all sections passed but the app is broken". CodeLoop guards against this with codeloop_integration_check: a synthetic verify run that exercises the entire app at once. The check fires each time two more sections complete, and again before the final gate_check. It is the moment when signup-then-billing-then-settings actually has to work as a single user flow.

If the integration check fails, the orchestrator does not unwind — it surfaces the failure as a new repair task on whichever section the diagnostic points to. The repair flows through that section's state machine like any other failure, then the integration check is retried. This keeps the loop monotonic: every iteration moves towards green.
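In Python-flavoured pseudocode, the monotonic retry loop might look like this. Everything here — run_integration_check, file_repair_task, the diagnostic's dict shape — is a hypothetical stand-in, not a real CodeLoop API:

```python
def converge(run_integration_check, file_repair_task, max_rounds: int = 5) -> bool:
    """Retry the whole-app check, routing each failure to one section's repair queue."""
    for _ in range(max_rounds):
        result = run_integration_check()
        if result["ok"]:
            return True
        # Don't unwind completed sections; repair the one the diagnostic blames.
        file_repair_task(section=result["blamed_section"], detail=result["failure"])
    return False


# Toy harness: the check fails once (billing webhook), then passes after repair.
outcomes = iter([
    {"ok": False, "blamed_section": "billing", "failure": "webhook 500"},
    {"ok": True},
])
repairs = []
print(converge(lambda: next(outcomes),
               lambda section, detail: repairs.append(section)))  # True
print(repairs)  # ["billing"]
```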

Evidence and lineage

Every transition is recorded in .codeloop/runs// with the full set of evidence: build logs, test JSON, screenshots, video, repair history. The run_id is bound to a commit_sha and a branch so reproducing any decision later — including the gate-check confidence — is a one-line lookup.

This is the same evidence the local dashboard renders, which means you can audit a multi-section build in the browser the moment it finishes. No spreadsheet, no manual collation.
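The one-line lookup works because the lineage key is small: a run, a commit, a branch. A sketch of that binding — field names, the example IDs, and the evidence path are all assumed for illustration, not CodeLoop's real schema:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """Binds a run's evidence directory to the exact code it verified."""
    run_id: str
    commit_sha: str
    branch: str
    evidence_dir: str  # build logs, test JSON, screenshots, video live here


runs = {
    "run-042": RunRecord("run-042", "9f1c2ab", "main", ".codeloop/runs/run-042"),
}

# Reproducing any later decision is a single lookup away.
print(runs["run-042"].commit_sha)  # "9f1c2ab"
```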

What this enables

Once the orchestrator is in place, the unit of work changes. You stop micromanaging "implement the login form, now the password reset, now the email verification". You hand the agent a master spec, you go for a walk, and you come back to a fully verified app with structured evidence per section. The agent never asks for human input mid-flight unless a section's state machine genuinely cannot make progress — and even then, the question is precise enough to answer in one sentence.

Try it

A complete sample lives in examples/multi-section-sample. It exercises five interlocking sections — landing, auth, dashboard, billing, settings — and ships in roughly 25 minutes on a clean machine.

Start your free trial → | Read the docs →