Architecture

CodeLoop is a small set of well-separated pieces that all agree on a single contract: every run produces a RunArtifact directory on disk. Everything else — the dashboard, the GitHub Action sticky comment, the badge, the gate score — is a view over that directory. This page is the architectural overview for engineers evaluating the project, contributing, or self-hosting.

High-level diagram

            ┌────────────────────────────────────────────────────┐
            │   Cursor / Claude Code / Devin (any MCP-aware agent)│
            └────────────────────────────────────────────────────┘
                              │  29 MCP tools (stdio JSON-RPC)
                              ▼
            ┌────────────────────────────────────────────────────┐
            │              codeloop-mcp-server (Node)            │
            │  verify · diagnose · gate-check · screenshots ·   │
            │  recording · interaction · design compare · ...    │
            └────────────────────────────────────────────────────┘
                              │
            ┌─────────────────┼──────────────────┐
            ▼                 ▼                  ▼
   ┌──────────────────┐  ┌──────────────────┐ ┌──────────────────┐
   │  Local runners   │  │  Plugin sandbox  │ │ Artifact writer  │
   │  (Playwright,    │  │  (.codeloop/     │ │ artifacts/runs/  │
   │   Maestro, adb,  │  │   plugins.json)  │ │   <run_id>/...   │
   │   simctl, ffmpeg)│  │                  │ │ RunArtifact JSON │
   └──────────────────┘  └──────────────────┘ └──────────────────┘
                                                       │
                                ┌──────────────────────┼─────────────────┐
                                ▼                      ▼                 ▼
                      ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
                      │  Local dashboard │ │  GitHub Action   │ │  CLI reporters   │
                      │  localhost:3737  │ │ codeloop-verify  │ │  (status, doctor)│
                      └──────────────────┘ └──────────────────┘ └──────────────────┘
                                ▲                      │
                                │ optional             │ usage events (counts only)
                                │ --share tunnel       ▼
                      ┌──────────────────────────────────────────────┐
                      │   codeloop-backend-api  (cloud or self-host) │
                      │   auth · billing · usage · badge · OSS apply │
                      └──────────────────────────────────────────────┘

The MCP server

codeloop-mcp-server is the binary that AI agents talk to. It speaks the Model Context Protocol over stdio, exposing the 29 tools documented in Tool reference. It does not call any LLM — every tool is a deterministic local computation.

  • Stateless. The server holds no session state across tool calls. State lives in the artifact directory and .codeloop/.
  • Multi-agent safe. Two MCP clients can connect simultaneously (Cursor in one window, Claude Code in another). Each run gets a unique run_id; the agents see consistent state.
  • No code modification. The server reads files but never writes source code. Repairs are sent back as task lists for the agent to apply.
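
To make the stdio transport concrete, here is a minimal sketch of how an MCP client might frame a single tool call. The `tools/call` method and the `{ name, arguments }` params come from the Model Context Protocol itself. The tool name `codeloop_verify` and the `project_root` argument are hypothetical placeholders, not names taken from the Tool reference.

```typescript
// Shape of a JSON-RPC 2.0 request for MCP's `tools/call` method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build one request. Because the server keeps no session state, everything
// a tool needs has to travel in `arguments`.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// The MCP stdio transport sends one JSON message per line.
function frame(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}

// Hypothetical tool name and argument, for illustration only.
const line = frame(buildToolCall("codeloop_verify", { project_root: "." }));
```

A client would write `line` to the server's stdin and read the matching response, identified by the same `id`, from its stdout.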

The CLI

codeloop (npm package codeloop) is the same toolbox exposed as commands. The CLI and the MCP server share a common @codelooptech/shared module that holds the runners, the artifact format, and the project setup logic. New functionality lands in shared first; the CLI and MCP server are thin adapters.

See CLI reference.

The plugin sandbox

Built-in runners cover Node, Web (Playwright), Flutter, Xcode, Android, .NET. Anything else — Python/Django, Ruby/Rails, Go, custom monorepo scripts — plugs in via .codeloop/plugins.json. Plugins run in a child process with the project root as cwd, get the run id injected, and emit either:

  • A standard parser output (pytest JSON, RSpec JSON, Jest, TAP, exit code).
  • A custom RunArtifact fragment: advanced plugins write their own JSON, which is merged into the main artifact.
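
The plugin execution model described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual sandbox: the `PluginSpec` shape and the `CODELOOP_RUN_ID` environment variable are hypothetical, and only the simplest parser (exit code) is shown.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical plugin entry; the real plugins.json schema may differ.
interface PluginSpec {
  name: string;
  command: string;
  args: string[];
}

interface PluginResult {
  plugin: string;
  passed: boolean;
  exitCode: number | null;
  stdout: string;
}

// Run one plugin in a child process with the project root as cwd and the
// run id injected through the environment (variable name is an assumption).
function runPlugin(spec: PluginSpec, projectRoot: string, runId: string): PluginResult {
  const child = spawnSync(spec.command, spec.args, {
    cwd: projectRoot,
    env: { ...process.env, CODELOOP_RUN_ID: runId },
    encoding: "utf8",
  });
  // Exit-code parser: 0 means pass, anything else (or a failed spawn) means fail.
  return {
    plugin: spec.name,
    passed: child.status === 0,
    exitCode: child.status,
    stdout: child.stdout ?? "",
  };
}
```

A richer parser (pytest JSON, TAP, and so on) would replace the exit-code check with a function that reads `stdout` and returns structured results.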

See Plugin SDK.

The artifact writer

Every tool that produces evidence (verify, diagnose, screenshot, recording, gate-check) writes into a per-run directory rooted at artifacts/runs/<run_id>/. The shape is documented in Core concepts » Artifact and is the public contract that decouples the agent from the downstream consumers (dashboard, action, badge).

Two design choices worth understanding:

  • Append-only. Runs never overwrite each other. Cleanup is a separate concern (codeloop_run_history --gc).
  • Self-describing. manifest.json at the root contains every relative path the run produced, so consumers can parse a single file to discover everything else.
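
As a sketch of what "self-describing" buys a consumer, the snippet below reads manifest.json and resolves every file a run produced. The `Manifest` shape (a `run_id` plus a flat `files` array) is an assumption made for illustration; the authoritative schema is the one in Core concepts.

```typescript
import { readFileSync } from "node:fs";
import { join } from "node:path";

// Assumed manifest shape, for illustration only.
interface Manifest {
  run_id: string;
  files: string[]; // paths relative to the run directory
}

// Parse and minimally validate the manifest text.
function parseManifest(json: string): Manifest {
  const data = JSON.parse(json);
  if (typeof data.run_id !== "string" || !Array.isArray(data.files)) {
    throw new Error("manifest.json does not match the assumed shape");
  }
  return data as Manifest;
}

// Resolve every path listed in a run's manifest against the run directory.
function listRunFiles(runDir: string): string[] {
  const manifest = parseManifest(readFileSync(join(runDir, "manifest.json"), "utf8"));
  return manifest.files.map((relativePath) => join(runDir, relativePath));
}
```

A viewer built this way never has to walk the directory tree or guess at file names; one read of the manifest is enough to find everything else.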

The local dashboard

The dashboard is a Next.js web app served on localhost:3737. It reads the artifact directory directly: there is no API call, no auth, no telemetry. Because it is "just a viewer", it is safe to run on any developer machine, in CI, or behind a private VPN.

For team sharing, --share spawns a temporary cloudflared tunnel and prints a public URL that terminates when you stop the dashboard.

The GitHub Action

codeloop/codeloop-verify@v1 wraps the CLI to:

  1. Install Node + the CodeLoop CLI on the runner.
  2. Run codeloop verify, then codeloop diagnose, then codeloop gate-check.
  3. Post (or update) a sticky PR comment with the gate result.
  4. Optionally publish the gate score to /badge/<repo>.svg for the README badge.

The action runs the same artifact pipeline as a local run — same run_id shape, same dashboard view if you download the artifact and open it locally.
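
Step 2 of the list above can be approximated locally with a short script. This is a sketch rather than the action's actual implementation: it assumes each subcommand signals failure through a non-zero exit code and that the gate-check result decides the overall outcome. Neither assumption is confirmed on this page.

```typescript
import { spawnSync } from "node:child_process";

// A runner takes a command plus arguments and returns its exit code.
type StepRunner = (command: string, args: string[]) => number;

interface StepResult {
  step: string;
  exitCode: number;
}

// Default runner: invoke the real CLI, treating a failed spawn as exit code 1.
const runCli: StepRunner = (command, args) =>
  spawnSync(command, args, { stdio: "inherit" }).status ?? 1;

// Run verify, diagnose, and gate-check in order. All three always run, so
// diagnose can inspect a failed verify; the gate decides pass/fail (assumed).
function runPipeline(run: StepRunner = runCli): { steps: StepResult[]; passed: boolean } {
  const steps: StepResult[] = ["verify", "diagnose", "gate-check"].map((step) => ({
    step,
    exitCode: run("codeloop", [step]),
  }));
  const gate = steps[steps.length - 1];
  return { steps, passed: gate.exitCode === 0 };
}
```

Passing the runner in as a parameter keeps the sequencing logic testable without the CLI installed.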

The backend API

codeloop-backend-api is the small server at api.codeloop.tech. Its surface is intentionally minimal:

Endpoint group   Purpose
/v1/auth         Email + password sign-in, browser-key handshake, key validate.
/v1/keys         Create / rotate / revoke API keys (requires auth).
/v1/billing      Subscription state, plan changes, invoice URL (Stripe pass-through).
/v1/usage        Counters batched in by the MCP server. { kind, count, project_hash, ts } only.
/v1/badge        Returns the SVG badge with the latest gate score, plus the public showcase page for the run.
/v1/oss          Self-serve OSS application + auto-approve based on GitHub repo metadata.

See Security & data handling for what counters look like and what is never transmitted.
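
To illustrate what a counts-only payload for /v1/usage could look like, here is a minimal sketch of batching tool calls into counters. The four field names come from the endpoint table; their types, the ISO 8601 timestamp format, and the use of SHA-256 over the project path for `project_hash` are assumptions.

```typescript
import { createHash } from "node:crypto";

// Field names from the /v1/usage row; types and formats are assumed.
interface UsageEvent {
  kind: string;
  count: number;
  project_hash: string;
  ts: string; // ISO 8601 (assumed)
}

// One-way hash so the project path itself is never sent (algorithm assumed).
function hashProject(projectRoot: string): string {
  return createHash("sha256").update(projectRoot).digest("hex");
}

// Collapse individual tool calls into one counter per kind.
function batchUsage(kinds: string[], projectRoot: string, now: Date = new Date()): UsageEvent[] {
  const counts = new Map<string, number>();
  for (const kind of kinds) {
    counts.set(kind, (counts.get(kind) ?? 0) + 1);
  }
  const project_hash = hashProject(projectRoot);
  const ts = now.toISOString();
  return Array.from(counts, ([kind, count]) => ({ kind, count, project_hash, ts }));
}
```

Nothing in the resulting events carries file names, source text, or test output; only the kind of call and how often it happened.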

Self-host stack

For environments that want zero data exfiltration, the same pieces (API + dashboard + Postgres + Redis + MinIO) ship as a single Docker Compose file in deploy/self-hosted/. The CLI respects CODELOOP_API_URL and CODELOOP_MODE=local, so any cl_test_* key is accepted and no metering events ever leave your network. See Self-host runbook.
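
The environment-driven switch described above might be resolved along these lines. This is a sketch under assumptions: the `https://` default for the hosted API and the exact offline key check are illustrative, and the real CLI may validate keys differently.

```typescript
interface ClientConfig {
  apiUrl: string;
  localMode: boolean;
}

// Assumed default: HTTPS on the hosted API host named earlier on this page.
const DEFAULT_API_URL = "https://api.codeloop.tech";

// Read CODELOOP_API_URL and CODELOOP_MODE from the environment.
function resolveConfig(env: Record<string, string | undefined> = process.env): ClientConfig {
  return {
    apiUrl: env.CODELOOP_API_URL ?? DEFAULT_API_URL,
    localMode: env.CODELOOP_MODE === "local",
  };
}

// In local mode, any cl_test_* key is accepted without a cloud round trip.
function acceptsKeyOffline(key: string, config: ClientConfig): boolean {
  return config.localMode && key.startsWith("cl_test_");
}
```

For example, `resolveConfig({ CODELOOP_MODE: "local", CODELOOP_API_URL: "http://localhost:8080" })` yields a config that points at the self-hosted API and accepts `cl_test_` keys offline.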

Project layout (monorepo)

packages/
  cli/                # codeloop CLI
  mcp-server/         # codeloop-mcp-server
  shared/             # runners, templates, RunArtifact contract
  dashboard-ui/       # Next.js dashboard
  backend-api/        # api.codeloop.tech
  cursor-extension/   # VSIX
  claude-toolkit/     # Claude Code agents + memory templates
  website/            # codeloop.tech
deploy/
  self-hosted/        # docker-compose + .env.example
docs/
  RUNBOOK_CROSS_OS.md
  E2E_TEST_CHECKLIST.md
.github/
  workflows/          # CI for the project itself + scheduled cross-OS sweep

Contributing

See CONTRIBUTING.md. Most pull requests touch packages/shared/ first (runners + templates) and then add thin adapters in packages/cli/ and packages/mcp-server/. Tests live next to the code; integration sweeps run nightly across macOS, Linux, and Windows.
