Design compare (Figma)

One-time setup, then automatic. Drop a designs/ folder in your repo (or paste a Figma file key into .codeloop/figma.json) and CodeLoop runs design compare on every verify, with no further input. Read this page when you want to wire it up the first time, or when you need to interpret a low match score.

Design compare is the gate that closes the loop with your designer. For every screen × viewport combination, CodeLoop fetches the canonical Figma frame (or a local PNG reference), pixel-diffs it against the coded UI, and returns a match score the gate can act on. It is the QA layer that keeps a Figma file and a shipped product actually aligned.

Two modes

  • Figma mode: designs live in a Figma file. You give CodeLoop a file key and a frame map; it fetches the frames over the Figma REST API at run time.
  • Local mode: designs live on disk under designs/<screen>/<viewport>.png. No credentials, no network. Useful for non-Figma teams or air-gapped builds.

Both modes feed into the same codeloop_design_compare tool and the same gate.

Figma mode

1. Get a Figma personal access token

  1. Open Figma » Settings » Personal access tokens.
  2. Click Create new token, give it a name (e.g. codeloop), copy it.
  3. Export the token in your shell (and your CI secrets):
# macOS / Linux
export FIGMA_API_TOKEN="figd_..."

# Windows PowerShell
[System.Environment]::SetEnvironmentVariable("FIGMA_API_TOKEN", "figd_...", "User")

2. Map screens to Figma frames

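Drop a .codeloop/figma.json in the project root. To find a frame's node ID, right-click the frame in Figma and choose Copy link; the ID is the node-id parameter of that URL (written 1-24 in the URL, 1:24 in the frame map).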

{
  "file_key": "ABC123abcXYZ",
  "frames": {
    "home": {
      "desktop": "1:24",
      "tablet":  "1:48",
      "mobile":  "1:72"
    },
    "checkout": {
      "desktop": "1:96",
      "mobile":  "1:120"
    }
  },
  "scale": 2
}

scale: 2 exports the frame at @2x so the diff matches a retina screenshot. Use 1 for non-retina runners.
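For orientation, here is a minimal sketch of how a frame map like the one above could translate into a Figma REST API export request. The endpoint (GET /v1/images/:key with ids, scale, and format parameters) and the X-Figma-Token header belong to Figma's public API. The helper name build_export_request, and the idea of batching every frame into one request, are assumptions for illustration and not a description of CodeLoop's internals.

```python
import json
import os
from urllib.parse import urlencode

FIGMA_IMAGES_ENDPOINT = "https://api.figma.com/v1/images"

def build_export_request(config: dict, token: str) -> tuple[str, dict]:
    """Return (url, headers) for exporting every mapped frame as PNG.

    Hypothetical helper: it only builds the request, it does not send it.
    """
    # Flatten {"screen": {"viewport": "node_id"}} into a sorted list of node IDs.
    node_ids = sorted(
        node_id
        for viewports in config["frames"].values()
        for node_id in viewports.values()
    )
    query = urlencode({
        "ids": ",".join(node_ids),
        "scale": config.get("scale", 1),  # 2 requests an @2x export
        "format": "png",
    })
    url = f"{FIGMA_IMAGES_ENDPOINT}/{config['file_key']}?{query}"
    return url, {"X-Figma-Token": token}

example_config = json.loads("""
{
  "file_key": "ABC123abcXYZ",
  "frames": {
    "home": {"desktop": "1:24", "tablet": "1:48", "mobile": "1:72"},
    "checkout": {"desktop": "1:96", "mobile": "1:120"}
  },
  "scale": 2
}
""")

# Fall back to a placeholder so the sketch runs without a real token.
token = os.environ.get("FIGMA_API_TOKEN", "figd_placeholder")
url, headers = build_export_request(example_config, token)
print(url)
```

The response to such a request is a JSON map from node ID to a temporary image URL, so a real client would need a second download step per frame.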

3. Run the compare

The agent calls codeloop_design_compare as part of the verify loop. You can also run it manually from the CLI:

# everything mapped in figma.json
npx codeloop design

# one screen
npx codeloop design --screen home

# fail (non-zero exit) if score below threshold
npx codeloop design --threshold 0.85

Local mode

For teams that don't use Figma, drop PNGs into designs/ at the project root. The directory shape is enough; no config is required:

designs/
  home/
    desktop.png
    tablet.png
    mobile.png
  checkout/
    desktop.png
    mobile.png

codeloop_design_compare matches each PNG to the screenshot under artifacts/runs/<run_id>/screenshots/<viewport>/<screen>.png and runs the same diff.
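The pairing rule above swaps the order of screen and viewport between the two trees. A minimal sketch of that mapping, assuming a hypothetical helper name screenshot_for and a made-up run ID:

```python
from pathlib import PurePosixPath

def screenshot_for(design_png: str, run_id: str) -> str:
    """Map designs/<screen>/<viewport>.png to the screenshot it is diffed against.

    Hypothetical helper that mirrors the documented path layout.
    """
    design = PurePosixPath(design_png)
    screen = design.parent.name   # e.g. "home"
    viewport = design.stem        # e.g. "desktop"
    target = (
        PurePosixPath("artifacts/runs") / run_id / "screenshots"
        / viewport / f"{screen}.png"
    )
    return str(target)

# "run_001" is an illustrative run ID, not a real CodeLoop value.
print(screenshot_for("designs/home/desktop.png", "run_001"))
```

If a design PNG has no screenshot at the computed path, that screen × viewport pair has nothing to diff against, which is worth checking first when a score is missing.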

Match score

Each screen × viewport gets a score in [0, 1]:

  • 1.0: pixel-perfect match.
  • 0.9 – 0.99: minor differences (sub-pixel anti-aliasing, font smoothing).
  • 0.7 – 0.89: recognisable drift (spacing, colour, font weight).
  • < 0.7: the coded UI does not match the design.
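As a reading aid, the bands above can be expressed as a small lookup. This is a sketch only: the function name and the band labels are hypothetical, and the gaps between the documented ranges (for example 0.89 to 0.9) are closed here by using lower bounds, which is an assumption.

```python
def describe_score(score: float) -> str:
    """Translate a match score in [0, 1] into the band it falls in."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score == 1.0:
        return "pixel-perfect"
    if score >= 0.9:
        return "minor"       # sub-pixel anti-aliasing, font smoothing
    if score >= 0.7:
        return "drift"       # spacing, colour, font weight
    return "mismatch"        # coded UI does not match the design

for value in (1.0, 0.93, 0.8, 0.42):
    print(value, describe_score(value))
```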

The default gate threshold is 0.85. Tune it in .codeloop/config.json:

{
  "design_compare": {
    "threshold": 0.9,
    "ignore_regions": [
      { "screen": "home", "rect": [0, 0, 1440, 64] }
    ],
    "scoring": "weighted_lab"
  }
}

scoring picks the diff metric: pixel (raw pixelmatch), weighted_lab (perceptual, weights luminance higher than chroma; recommended), or structural (SSIM, ignores small colour drift).
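To make the interaction between scoring and ignore_regions concrete, here is a deliberately simplified sketch of a raw pixel score. It is not CodeLoop's implementation: it compares pixels for exact equality (pixelmatch itself applies a colour-distance threshold), it assumes rect means [x, y, width, height], and the function name is hypothetical.

```python
Pixel = tuple[int, int, int]
Rect = tuple[int, int, int, int]  # assumed layout: x, y, width, height

def pixel_match_score(
    design: list[list[Pixel]],
    shot: list[list[Pixel]],
    ignore: tuple[Rect, ...] = (),
) -> float:
    """Fraction of non-ignored pixels that are identical in both images."""
    if len(design) != len(shot) or any(
        len(a) != len(b) for a, b in zip(design, shot)
    ):
        raise ValueError("images must share dimensions")

    def ignored(x: int, y: int) -> bool:
        return any(
            rx <= x < rx + rw and ry <= y < ry + rh
            for rx, ry, rw, rh in ignore
        )

    compared = matched = 0
    for y, (design_row, shot_row) in enumerate(zip(design, shot)):
        for x, (d, s) in enumerate(zip(design_row, shot_row)):
            if ignored(x, y):
                continue  # masked pixels neither help nor hurt the score
            compared += 1
            if d == s:
                matched += 1
    # An image that is fully masked has nothing to disagree about.
    return matched / compared if compared else 1.0

WHITE, BLACK = (255, 255, 255), (0, 0, 0)
design_img = [[WHITE, WHITE], [WHITE, WHITE]]
shot_img = [[BLACK, WHITE], [WHITE, WHITE]]  # top-left pixel differs

print(pixel_match_score(design_img, shot_img))
print(pixel_match_score(design_img, shot_img, ignore=((0, 0, 1, 1),)))
```

Masking a region removes it from both the numerator and the denominator, which is why ignoring a dynamic header can raise a score without hiding drift elsewhere on the screen.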

The gate

Design compare contributes the design_compare_evidence sub-gate inside visual_regression_threshold. By default it is warning severity. Promote it to blocker when your team is enforcing pixel-accurate delivery:

{
  "gate_check": {
    "design_severity": "blocker"
  }
}
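The effect of the severity setting can be sketched as follows. The outcome names ("pass", "warn", "fail") and the function design_gate are hypothetical; only the 0.85 default threshold and the warning and blocker severities come from this page.

```python
def design_gate(
    scores: dict[str, float],
    threshold: float = 0.85,
    severity: str = "warning",
) -> str:
    """Summarise design compare scores as a single gate outcome."""
    if severity not in ("warning", "blocker"):
        raise ValueError(f"unknown severity: {severity}")
    below = [name for name, score in scores.items() if score < threshold]
    if not below:
        return "pass"
    # Only a blocker-severity sub-gate stops the run; a warning is reported.
    return "fail" if severity == "blocker" else "warn"

run_scores = {"home/desktop": 0.97, "checkout/mobile": 0.81}
print(design_gate(run_scores))                      # warning severity
print(design_gate(run_scores, severity="blocker"))  # blocker severity
```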

Dashboard view

The local dashboard renders the Figma frame next to the coded screenshot, sorted worst-to-best by score. Hover over a region to see the per-pixel diff overlay; click Open in Figma to jump to the source frame.

What changes between runs

Each run captures the design references it used into artifacts/runs/<run_id>/designs/. This makes the run reproducible — if a designer edits the Figma frame between runs, you can see exactly which version of the design any historical run was diffed against.

CI and the GitHub Action

Add FIGMA_API_TOKEN to your GitHub repo Secrets. The CodeLoop Verify Action picks it up automatically when present and surfaces the worst design regressions in the sticky PR comment.

Common gotchas

  • Frames not exported correctly. Make sure each Figma frame is a top-level frame (not a group) and has Export enabled in the right panel.
  • Token rate-limited. Figma personal access tokens are limited to roughly 6000 requests/hour. For very large frame maps, set scale: 1 and run the compare on changed screens only (--scope affected).
  • Aspect-ratio mismatch. The Figma frame and the captured screenshot must share an aspect ratio for a fair diff. Match your viewport widths to the frame widths in your design system.
  • Coloured background fills. Set the Figma frame fill to match the app's actual page background (e.g. dark mode); a mismatched fill is the most common cause of a “huge diff in an obvious place”.
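The aspect-ratio gotcha is easy to check before running a compare. The sketch below reads width and height from a PNG's IHDR chunk (the fixed layout defined by the PNG format) and compares two sizes with a tolerance. The helper names and the 1% tolerance are assumptions for illustration.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) from the first 24 bytes of a PNG file."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR stores width and height as big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height

def same_aspect_ratio(
    a: tuple[int, int], b: tuple[int, int], tolerance: float = 0.01
) -> bool:
    """True when two (width, height) pairs differ in ratio by at most tolerance."""
    ratio_a = a[0] / a[1]
    ratio_b = b[0] / b[1]
    return abs(ratio_a - ratio_b) <= tolerance * ratio_b

# Build a synthetic 1440x900 PNG header instead of reading a real file.
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 1440, 900)
print(png_size(header))
print(same_aspect_ratio((2880, 1800), (1440, 900)))   # @2x export vs @1x shot
print(same_aspect_ratio((1440, 900), (1440, 1024)))   # different frame height
```

In practice you would pass the first 24 bytes of the design PNG and of the captured screenshot, for example via open(path, "rb").read(24).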
