How to Automate Testing for Cursor AI-Generated Code

CodeLoop Team · April 24, 2026 · 6 min read

Cursor is the fastest way to write code with AI. But there's a gap between "the code compiles" and "the code works." Every Cursor user knows the cycle: ask the agent to implement a feature, manually test it, find 5 bugs, paste them back, fix 3, introduce 2 new ones, test again.

CodeLoop closes this gap by automating the entire verification loop inside Cursor.

What you get

After a one-time setup, your Cursor agent will automatically:

  • Run codeloop_verify after each implementation — build, lint, test, and screenshots in one call
  • Call codeloop_diagnose when failures occur — categorized repair tasks, prioritized by severity
  • Fix the issues using the structured repair tasks
  • Check the gate with codeloop_gate_check — pass/fail at 94% confidence
  • Loop until done — up to 15 iterations without human intervention
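The control flow in the list above can be sketched roughly as follows. This is a minimal illustration, not CodeLoop's implementation: the `verify`, `diagnose`, `apply_fixes`, and `gate_check` callables are hypothetical stand-ins for the agent calling codeloop_verify, codeloop_diagnose, and codeloop_gate_check, and the `"status"` values and return shapes are assumptions based on the sample output later in this post.

```python
from typing import Callable

MAX_ITERATIONS = 15  # iteration cap quoted in the article


def run_verification_loop(
    verify: Callable[[], dict],
    diagnose: Callable[[dict], list],
    apply_fixes: Callable[[list], None],
    gate_check: Callable[[], bool],
    max_iterations: int = MAX_ITERATIONS,
) -> tuple[bool, int]:
    """Run verify -> diagnose -> fix until the gate passes or the cap is hit.

    Returns (gate_passed, iterations_used).
    """
    for iteration in range(1, max_iterations + 1):
        result = verify()
        if result.get("status") == "fail":
            # Failures: turn them into repair tasks and fix before re-verifying.
            apply_fixes(diagnose(result))
            continue
        if gate_check():
            return True, iteration
    return False, max_iterations  # cap reached: hand back to a human


# Simulated tools (purely illustrative): two failing tests, one fixed per pass.
failing = ["test_email_format", "test_password_length"]


def fake_verify() -> dict:
    return {"status": "fail" if failing else "pass", "failed": list(failing)}


def fake_diagnose(result: dict) -> list:
    return result["failed"][:1]  # one repair task per iteration


def fake_apply_fixes(tasks: list) -> None:
    for task in tasks:
        failing.remove(task)


passed, iterations = run_verification_loop(
    fake_verify, fake_diagnose, fake_apply_fixes, gate_check=lambda: not failing
)
print(passed, iterations)  # True 3
```

In the simulation, two iterations each fix one failure and the third verifies cleanly, so the gate passes on iteration 3. A run that never passes stops at the cap instead of looping forever.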

    Setup (under 2 minutes)

    Step 1: Get your API key

    Sign up at codeloop.tech/signup (free 14-day trial, no credit card) and copy your API key.

    Add it to your shell profile (~/.zshrc or ~/.bashrc):

    export CODELOOP_API_KEY="cl_live_your_key_here"
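A quick way to confirm the variable is visible to new processes is a small check like the one below. It is only a sketch: the `api_key_status` helper is hypothetical, and it tests for nothing beyond presence and the placeholder value shown above.

```python
import os
from typing import Mapping

PLACEHOLDER = "cl_live_your_key_here"  # the example value from the article


def api_key_status(env: Mapping[str, str] = os.environ) -> str:
    """Classify CODELOOP_API_KEY as 'missing', 'placeholder', or 'set'."""
    key = env.get("CODELOOP_API_KEY", "").strip()
    if not key:
        return "missing"
    if key == PLACEHOLDER:
        return "placeholder"
    return "set"


print(api_key_status())
```

If this prints `missing` right after editing the profile, open a new terminal or `source` the profile file so the export takes effect.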

    Step 2: Initialize in your project

    cd your-project

    npx codeloop init

    This creates the MCP config at .cursor/mcp.json and sets up agent rules that tell Cursor when and how to call CodeLoop tools.
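To check what init actually wrote, you can list the servers registered in the config file. The sketch below assumes the file follows Cursor's usual layout, with a top-level `mcpServers` object keyed by server name. The name under which CodeLoop registers itself is not stated here, so the helper (a hypothetical `list_mcp_servers`) simply prints whatever it finds.

```python
import json
from pathlib import Path


def list_mcp_servers(config_path: Path) -> list[str]:
    """Return the server names registered in a Cursor MCP config file."""
    if not config_path.is_file():
        return []  # init has not been run here, or the path is wrong
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # Assumption: servers live under the top-level "mcpServers" key.
    return sorted(config.get("mcpServers", {}))


# Run from the project root after `npx codeloop init`.
print(list_mcp_servers(Path(".cursor") / "mcp.json"))
```

The same function works for any other MCP config path you pass in.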

    Step 3: Enable Auto-Run mode

    By default, Cursor prompts you to approve every terminal command. To let the verification loop run uninterrupted:

  • Open Settings: Cmd+Shift+J (Mac) or Ctrl+Shift+J (Windows/Linux)
  • Go to Features > Terminal
  • Set Auto-Run Mode to "Yolo" (runs everything) or "Auto-Run with Allowlist" (safer)

    Step 4 (optional): Global activation

    Want CodeLoop active in every future project without running init again?

    npx codeloop init --global

    This registers the MCP server globally in ~/.cursor/mcp.json so CodeLoop tools are available in every workspace.

    What the loop looks like in practice

    You ask Cursor: *"Implement the login screen with email/password validation."*

    The agent writes the code, then automatically calls codeloop_verify. The output looks like:

    {
      "status": "fail",
      "build": { "passed": true },
      "tests": { "passed": 8, "failed": 2 },
      "confidence": 0.72
    }

    The agent calls codeloop_diagnose, gets repair tasks, fixes the two failures, and calls codeloop_verify again. This time: 10/10 tests pass, confidence 0.94, gate passes. Done — without you touching anything.
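For readers who want to inspect such output themselves, here is a small sketch that parses the sample result and compares its confidence with 0.94. The threshold is an assumption taken from the figures quoted in this post, and how the real gate combines build, test, and confidence signals is not specified here. The `summarize` helper is hypothetical.

```python
import json

GATE_CONFIDENCE = 0.94  # assumed threshold, based on the figures in the article


def summarize(raw: str, threshold: float = GATE_CONFIDENCE) -> dict:
    """Reduce a verify result to build status, test pass rate, and threshold check."""
    result = json.loads(raw)
    tests = result["tests"]
    total = tests["passed"] + tests["failed"]
    return {
        "build_ok": result["build"]["passed"],
        "pass_rate": tests["passed"] / total if total else 0.0,
        "meets_threshold": result["confidence"] >= threshold,
    }


sample = (
    '{"status": "fail", "build": {"passed": true}, '
    '"tests": {"passed": 8, "failed": 2}, "confidence": 0.72}'
)
print(summarize(sample))
# {'build_ok': True, 'pass_rate': 0.8, 'meets_threshold': False}
```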

    Design comparison with Figma

    If you have Figma designs, CodeLoop can compare your coded UI against them:

  • Export your Figma frames to designs/ or configure .codeloop/figma.json with your Figma API token
  • The agent calls codeloop_design_compare to pixel-diff across viewports
  • A blocker gate (design_compare_evidence) prevents shipping until the match score meets the threshold

    This is particularly powerful for UI-heavy projects where "it works" isn't enough: the UI also needs to *look right*.
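To give a feel for what a pixel-diff match score means, here is a deliberately simple sketch. It is not CodeLoop's algorithm, which this post does not describe. It assumes both images are the same size and are given as rows of RGB tuples, and it counts the fraction of pixels that agree within a per-channel tolerance. The `match_score` name and the `tolerance` parameter are illustrative.

```python
def match_score(design, rendered, tolerance=0):
    """Fraction of pixels whose RGB channels all differ by at most `tolerance`."""
    if len(design) != len(rendered) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(design, rendered)
    ):
        raise ValueError("images must have identical dimensions")
    total = matching = 0
    for row_a, row_b in zip(design, rendered):
        for px_a, px_b in zip(row_a, row_b):
            total += 1
            if all(abs(ca - cb) <= tolerance for ca, cb in zip(px_a, px_b)):
                matching += 1
    return matching / total if total else 1.0


# Two 2x2 images: one pixel is slightly off, one is clearly wrong.
design = [[(255, 255, 255), (0, 0, 0)], [(10, 10, 10), (200, 0, 0)]]
rendered = [[(255, 255, 255), (0, 0, 0)], [(10, 10, 12), (90, 0, 0)]]

print(match_score(design, rendered))               # 0.5
print(match_score(design, rendered, tolerance=2))  # 0.75
```

A small tolerance absorbs anti-aliasing and rounding noise, which is why the second score is higher. A gate would then compare the score against whatever threshold the project sets.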

    Video recording and interaction testing

    For interactive apps, CodeLoop goes beyond screenshots:

  • codeloop_start_recording begins a window-scoped video recording
  • codeloop_interact performs real UI actions — click, type, swipe, scroll
  • codeloop_stop_recording finalizes the video
  • codeloop_interaction_replay extracts key frames for visual verification

    The video is motion-validated: static recordings (where the app didn't actually respond) are automatically rejected by the gate.
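One way to picture motion validation is a frame-difference check like the sketch below. It illustrates the general idea only and says nothing about how CodeLoop actually validates recordings. Frames are assumed to be equal-length flat lists of grayscale values, and `has_motion` and `min_changed_fraction` are hypothetical names.

```python
def has_motion(frames, min_changed_fraction=0.01):
    """True if any pair of consecutive frames differs in enough pixels."""
    for prev, curr in zip(frames, frames[1:]):
        pixels = list(zip(prev, curr))
        if not pixels:
            continue  # empty frames carry no information
        changed = sum(1 for a, b in pixels if a != b)
        if changed / len(pixels) >= min_changed_fraction:
            return True
    return False


static_recording = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
responsive_recording = [[0, 0, 0, 0], [0, 9, 0, 0], [0, 9, 9, 0]]

print(has_motion(static_recording))      # False
print(has_motion(responsive_recording))  # True
```

A recording in which no frame changes would be flagged as static, which matches the rejection behavior described above.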

    Tips for best results

  • Use test filters for focused verification: the test_filter parameter lets you run only relevant tests
  • Start with the verify-fix loop, then add visual review and design comparison as your project matures
  • Let the agent iterate — the rules enforce up to 15 fix attempts before escalating to you
  • Check the development log: codeloop_generate_dev_report creates a structured evidence trail of every run

    Pricing

    CodeLoop is $5/mo for solo developers. The 14-day trial gives you the full Team-tier allowance — unlimited verifications, 5,000 visual reviews, 2,000 design comparisons. No credit card required.
