Running Claude Inside Cursor? Here's How to Add Automated QA
Cursor's "switch model to Claude" toggle is now the default for a lot of senior engineers. The reasoning is solid: you get Cursor's agent UX, file context, and terminal — with Claude's depth on long edits and refactors. The combo writes code faster than any single-tool stack we've measured.
But there's a missing layer. Neither Cursor nor Claude ships with an automated QA loop. The agent edits, you read the diff, you switch to the browser, you click around, you paste failures back into chat. Repeat. The thing that's *fast* is the writing. The thing that's *slow* is still you.
This is exactly the gap CodeLoop fills.
The 60-second setup
CodeLoop is a local MCP server. It registers itself with Cursor (so Claude inside Cursor can call its tools) and adds a User Rule that says "after every code change, verify and gate-check." Once that rule is in place, every Claude edit triggers a real verify pass before the chat moves on.
```bash
npx codeloop install-cursor-extension
```
That's it. No config file to edit, no MCP JSON to paste. The extension wires up ~/.cursor/mcp.json, drops the User Rule into ~/.cursor/codeloop-user-rule.md, and reloads Cursor. Claude (or any model you switch to) now has access to 29 verification tools.
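For reference, the registration the installer writes into ~/.cursor/mcp.json is a standard Cursor MCP entry. This is a sketch rather than the literal generated file; the server key and launch arguments shown here are assumptions, since the installer produces them for you:

```json
{
  "mcpServers": {
    "codeloop": {
      "command": "npx",
      "args": ["codeloop", "serve"]
    }
  }
}
```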
What changes in your loop
Before:

- Claude edits; you read the diff.
- You switch to the browser and click around.
- You paste failures back into chat. Repeat.

After:

- Claude edits, then runs codeloop_verify automatically.
- On failure, it calls codeloop_diagnose and fixes the listed issues.
- It calls codeloop_capture_screenshot for every changed page.
- It runs codeloop_gate_check and only stops when confidence ≥ 94%.

The difference is real-test evidence in the chat instead of agent confidence theater.
Why this works specifically with Claude
Claude is unusually good at *reading* structured tool output. When codeloop_verify returns a 2-KB JSON object with pass/fail counts, artifact paths, and a "next-step" suggestion, Claude follows it deterministically. It's the same trait that makes Claude great at function calling — it doesn't pretend the output isn't there.
That means CodeLoop's verify → diagnose → gate flow turns into a clean state machine instead of a probabilistic suggestion. Claude rarely declares a task done before the gate actually passes.
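To make that state machine concrete, here is a minimal TypeScript sketch of the control flow the User Rule induces. The `VerifyResult` and `GateResult` shapes, plus the `callTool` and `applyFixes` helpers, are hypothetical stand-ins; only the tool names, the payload fields described above (pass/fail counts, artifact paths, a next-step suggestion), and the 94% threshold come from the post itself:

```typescript
// Hypothetical shapes; CodeLoop's real JSON payloads may differ.
interface VerifyResult {
  passed: number;
  failed: number;
  artifacts: string[];   // paths to screenshots, videos, logs
  nextStep?: string;     // the "next-step" suggestion Claude follows
}

interface GateResult {
  confidence: number;    // 0..1
}

// Stand-ins for an MCP tool invocation made by the agent,
// and for the agent applying code edits.
declare function callTool(name: string, args?: object): Promise<any>;
declare function applyFixes(issues: string[]): Promise<void>;

// The loop the User Rule induces: verify -> diagnose -> fix -> gate.
async function verifyAndGate(changedPages: string[]): Promise<void> {
  while (true) {
    const verify: VerifyResult = await callTool("codeloop_verify");
    if (verify.failed > 0) {
      const issues: string[] = await callTool("codeloop_diagnose");
      await applyFixes(issues);          // agent edits, then re-verifies
      continue;
    }
    for (const page of changedPages) {
      await callTool("codeloop_capture_screenshot", { page });
    }
    const gate: GateResult = await callTool("codeloop_gate_check");
    if (gate.confidence >= 0.94) break;  // only stop once the gate passes
  }
}
```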
Cost: zero extra LLM tokens
CodeLoop never spawns its own model calls. All reasoning is delegated to Claude (or whatever model Cursor is currently pointed at). CodeLoop just runs your tests, captures screenshots, records videos, and posts the results back. Your Claude bill doesn't change.
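To make the zero-token claim concrete, here is a minimal sketch of the pattern using the MCP TypeScript SDK: the tool handler shells out to your test runner and returns the raw output, and any reasoning about that output happens on the agent side. This is not CodeLoop's source; the tool name and test command are placeholders:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);
const server = new McpServer({ name: "codeloop-sketch", version: "0.0.1" });

// Runs the project's tests and returns the output verbatim.
// No model call happens here; the agent that invoked the tool
// does all the reasoning about the result.
server.tool("run_tests", async () => {
  // promisified execFile rejects on nonzero exit; the error object
  // still carries stdout/stderr, so surface those too.
  const { stdout = "", stderr = "" } = await run("npm", ["test"]).catch((e) => e);
  return { content: [{ type: "text" as const, text: stdout + stderr }] };
});

await server.connect(new StdioServerTransport());
```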
What to try next
- Run npx codeloop init in any project; it autodetects Flutter / web / Python / Ruby / Rails / Rust. (A full bootstrap is sketched after this list.)
- Add designs/ PNGs and let codeloop_design_compare gate visual regressions against Figma.
- Plug the same MCP server into Claude Code (npx codeloop install) so your CLI workflow gets the same gates.
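Put together, the steps above look like this. The commands are the ones named in the list; the Figma export path is just an illustrative example:

```bash
cd my-project
npx codeloop init      # autodetects the stack (Flutter / web / Python / Ruby / Rails / Rust)

# Reference PNGs for codeloop_design_compare (source path is hypothetical)
mkdir -p designs && cp ~/Figma/home.png designs/

npx codeloop install   # same MCP server, registered for Claude Code
```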
Frequently asked questions
Does CodeLoop work with Claude inside Cursor?
Yes. CodeLoop registers as an MCP server in Cursor's mcp.json. Whatever model Cursor is pointed at — Claude, GPT, Gemini, the Cursor-tuned models — can call the 29 CodeLoop tools.
Will CodeLoop add to my Claude API bill?
No. CodeLoop never spawns its own LLM calls. It runs tests, captures screenshots, and records videos locally; the calling agent (Claude, in this case) does all the reasoning.
How does CodeLoop differ from Cursor Bugbot?
Bugbot reports issues; CodeLoop runs the full verify → diagnose → fix → gate-check loop, ships real screenshots and videos as evidence, and works in both Cursor and Claude Code.
Do I have to write tests for CodeLoop to work?
No. CodeLoop uses whatever tests you have (Vitest, Jest, Playwright, Flutter, Maestro, etc.). If a project has zero tests, it still runs lint and build, plus screenshot and design-comparison gates for UI projects.