Design compare (Figma)
One-time setup, then automatic. Drop a designs/ folder in your repo (or paste a Figma file key into .codeloop/figma.json) and CodeLoop runs design compare on every verify, with no further input. Read this page when you want to wire it up the first time, or when you need to interpret a low match score.
Design compare is the gate that closes the loop with your designer. For every screen × viewport combination, CodeLoop fetches the canonical Figma frame (or a local PNG reference), pixel-diffs it against the coded UI, and returns a match score the gate can act on. It is the QA layer that keeps a Figma file and a shipped product actually aligned.
Two modes
- Figma mode: designs live in a Figma file. You give CodeLoop a file key and a frame map; it fetches the frames over the Figma REST API at run time.
- Local mode: designs live on disk under designs/<screen>/<viewport>.png. No credentials, no network. Useful for non-Figma teams or air-gapped builds.
Both modes feed into the same codeloop_design_compare tool and the same gate.
Figma mode
1. Get a Figma personal access token
- Open Figma » Settings » Personal access tokens.
- Click Create new token, give it a name (e.g. codeloop), and copy it.
- Export the token in your shell (and your CI secrets):
# macOS / Linux
export FIGMA_API_TOKEN="figd_..."
# Windows PowerShell
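[System.Environment]::SetEnvironmentVariable("FIGMA_API_TOKEN", "figd_...", "User")
2. Map screens to Figma frames
Drop a .codeloop/figma.json in the project root. Get each frame URL from the Figma right-click menu » Copy link. The value to put in the map is the frame's node ID, which appears as the node-id parameter of that URL with the hyphen written as a colon (for example, node-id=1-24 becomes 1:24).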
{
"file_key": "ABC123abcXYZ",
"frames": {
"home": {
"desktop": "1:24",
"tablet": "1:48",
"mobile": "1:72"
},
"checkout": {
"desktop": "1:96",
"mobile": "1:120"
}
},
"scale": 2
}
scale: 2 exports the frame at @2x so the diff matches a retina screenshot. Use 1 for non-retina runners.
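To make the frame map concrete, here is a minimal sketch of how a figma.json like the one above could be turned into a request against the Figma REST API image-export endpoint (GET /v1/images/:key with ids, scale and format query parameters, authenticated with the X-Figma-Token header). The helper names node_id_from_link and build_export_url are hypothetical, and this is not necessarily how CodeLoop issues the request internally; it only builds the URL and does not touch the network.

```python
from urllib.parse import parse_qs, urlencode, urlparse

FIGMA_IMAGES_ENDPOINT = "https://api.figma.com/v1/images"


def node_id_from_link(link: str) -> str:
    """Extract a node ID such as '1:24' from a Figma 'Copy link' URL.

    Simplification: assumes a plain frame ID, where the URL form
    'node-id=1-24' maps to '1:24'.
    """
    query = parse_qs(urlparse(link).query)
    return query["node-id"][0].replace("-", ":")


def build_export_url(config: dict, screen: str) -> str:
    """Build the image-export URL for every viewport of one screen."""
    frames = config["frames"][screen]
    params = {
        "ids": ",".join(frames.values()),   # comma-separated node IDs
        "scale": config.get("scale", 1),    # 2 requests an @2x export
        "format": "png",
    }
    # safe=":," keeps the node IDs readable instead of percent-encoding them.
    return f"{FIGMA_IMAGES_ENDPOINT}/{config['file_key']}?{urlencode(params, safe=':,')}"


config = {
    "file_key": "ABC123abcXYZ",
    "frames": {"home": {"desktop": "1:24", "tablet": "1:48", "mobile": "1:72"}},
    "scale": 2,
}
print(build_export_url(config, "home"))
# A real request would also send the header: X-Figma-Token: $FIGMA_API_TOKEN
```

The response to such a request is a JSON document mapping each node ID to a temporary image URL, which is then downloaded as the reference PNG.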
3. Run the compare
The agent calls codeloop_design_compare as part of the verify loop. Or run it manually from the CLI:
# everything mapped in figma.json
npx codeloop design
# one screen
npx codeloop design --screen home
# fail (non-zero exit) if score below threshold
npx codeloop design --threshold 0.85
Local mode
For teams that don't use Figma, drop PNGs into designs/ at the project root. The directory shape is enough; no config is required:
designs/
home/
desktop.png
tablet.png
mobile.png
checkout/
desktop.png
    mobile.png
codeloop_design_compare matches each PNG to the screenshot under artifacts/runs/<run_id>/screenshots/<viewport>/<screen>.png and runs the same diff.
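Note that the two trees are nested in opposite order: references are laid out as <screen>/<viewport>, screenshots as <viewport>/<screen>. A minimal sketch of that pairing, using a hypothetical helper name and a made-up run ID, might look like this:

```python
from pathlib import PurePosixPath


def screenshot_for(design_path: str, run_id: str) -> str:
    """Map designs/<screen>/<viewport>.png to the matching run screenshot."""
    ref = PurePosixPath(design_path)
    screen = ref.parent.name    # e.g. "home"
    viewport = ref.stem         # e.g. "desktop"
    shot = PurePosixPath("artifacts/runs", run_id, "screenshots", viewport, f"{screen}.png")
    return str(shot)


# "run_42" is an illustrative run ID, not a real CodeLoop identifier format.
print(screenshot_for("designs/home/desktop.png", "run_42"))
```

If a screenshot is missing for a reference, check that the screen and viewport names in designs/ match the names used by the capture step exactly.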
Match score
Each screen × viewport gets a score in [0, 1]:
- 1.0: pixel-perfect match.
- 0.9 – 0.99: minor differences (sub-pixel anti-aliasing, font smoothing).
- 0.7 – 0.89: recognisable drift (spacing, colour, font weight).
- < 0.7: the coded UI does not match the design.
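As a reading aid, the bands can be expressed as a small lookup. This is a sketch only: the function names are hypothetical, and treating each band as a half-open interval (so that 0.895, say, falls under recognisable drift) is an assumption, since the list leaves the boundaries between bands unspecified.

```python
def describe_score(score: float) -> str:
    """Translate a match score in [0, 1] into the band it falls in."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score >= 1.0:
        return "pixel-perfect"
    if score >= 0.9:
        return "minor"
    if score >= 0.7:
        return "recognisable drift"
    return "mismatch"


def passes_gate(score: float, threshold: float = 0.85) -> bool:
    """A score fails only when it is below the threshold."""
    return score >= threshold


for s in (1.0, 0.93, 0.85, 0.62):
    print(s, describe_score(s), "pass" if passes_gate(s) else "fail")
```

With the default threshold, a score can sit in the recognisable drift band and still pass, which is worth keeping in mind when deciding whether to raise the threshold.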
The default gate threshold is 0.85. Tune it in .codeloop/config.json:
{
"design_compare": {
"threshold": 0.9,
"ignore_regions": [
{ "screen": "home", "rect": [0, 0, 1440, 64] }
],
"scoring": "weighted_lab"
}
}
scoring picks the diff metric: pixel (raw pixelmatch), weighted_lab (perceptual, weights luminance higher than chroma; recommended), or structural (SSIM, ignores small colour drift).
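To illustrate how ignore_regions interacts with a score, here is a deliberately simplified sketch of a pixel-style metric: the fraction of exactly matching pixels outside the ignored rectangles. It is not CodeLoop's scorer (pixelmatch, for instance, uses a colour-distance tolerance and anti-aliasing detection rather than exact equality), and it assumes rect means [x, y, width, height], which the config above does not state.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]
Rect = Tuple[int, int, int, int]  # assumed to be (x, y, width, height)


def in_any_rect(x: int, y: int, rects: Sequence[Rect]) -> bool:
    """True if pixel (x, y) lies inside at least one ignored rectangle."""
    return any(rx <= x < rx + w and ry <= y < ry + h for rx, ry, w, h in rects)


def pixel_score(design: Sequence[Sequence[Pixel]],
                coded: Sequence[Sequence[Pixel]],
                ignore: Sequence[Rect] = ()) -> float:
    """Fraction of compared pixels that are identical in both images."""
    if len(design) != len(coded) or any(len(a) != len(b) for a, b in zip(design, coded)):
        raise ValueError("images must have identical dimensions")
    compared = matched = 0
    for y, (row_d, row_c) in enumerate(zip(design, coded)):
        for x, (pd, pc) in enumerate(zip(row_d, row_c)):
            if in_any_rect(x, y, ignore):
                continue  # masked out, e.g. a header with dynamic content
            compared += 1
            matched += pd == pc
    return matched / compared if compared else 1.0


black, white = (0, 0, 0), (255, 255, 255)
design = [[black, black], [black, black]]
coded = [[white, black], [black, black]]   # one differing pixel in the top row
print(pixel_score(design, coded))                         # 0.75
print(pixel_score(design, coded, ignore=[(0, 0, 2, 1)]))  # 1.0
```

Masking the top row removes the only differing pixel from the comparison, which is why ignore regions should be kept as small as possible: anything inside them is never checked.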
The gate
Design compare contributes the design_compare_evidence sub-gate inside visual_regression_threshold. By default it is warning severity. Promote it to blocker when your team is enforcing pixel-accurate delivery:
{
"gate_check": {
"design_severity": "blocker"
}
}
Dashboard view
The local dashboard renders the Figma frame next to the coded screenshot, sorted worst-to-best by score. Hover over a region to see the per-pixel diff overlay; click Open in Figma to jump to the source frame.
What changes between runs
Each run captures the design references it used into artifacts/runs/<run_id>/designs/. This makes the run reproducible — if a designer edits the Figma frame between runs, you can see exactly which version of the design any historical run was diffed against.
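Because every run keeps its own copy of the references, it is possible to tell whether a score changed because the code changed or because the design did. The sketch below compares the captured references of two runs by content hash. The function names are hypothetical and the layout inside each designs/ folder is not assumed; any PNG found under it is compared by relative path.

```python
import hashlib
from pathlib import Path


def reference_digests(run_dir: Path) -> dict[str, str]:
    """SHA-256 of every captured reference PNG, keyed by relative path."""
    designs = run_dir / "designs"
    return {
        p.relative_to(designs).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(designs.rglob("*.png"))
    }


def changed_references(old_run: Path, new_run: Path) -> list[str]:
    """References that were added, removed, or edited between two runs."""
    old, new = reference_digests(old_run), reference_digests(new_run)
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))
```

Calling changed_references(Path("artifacts/runs/<old_run_id>"), Path("artifacts/runs/<new_run_id>")) with two real run IDs substituted returns the reference files that differ; an empty list means both runs were diffed against identical designs.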
CI and the GitHub Action
Add FIGMA_API_TOKEN to your GitHub repo Secrets. The CodeLoop Verify Action picks it up automatically when present and surfaces the worst design regressions in the sticky PR comment.
Common gotchas
- Frames not exported correctly. Make sure each Figma frame is a top-level frame (not a group) and has Export enabled in the right panel.
- Token rate-limited. Figma personal access tokens are limited to roughly 6000 requests/hour. For very large frame maps, set scale: 1 and run the compare on changed screens only (--scope affected).
- Aspect-ratio mismatch. The Figma frame and the captured screenshot must share an aspect ratio for a fair diff. Match your viewport widths to the frame widths in your design system.
- Coloured background fills. Set the Figma frame fill to match the app's actual page background (e.g. dark mode); a mismatched fill is the most common “huge diff in an obvious place” cause.
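The aspect-ratio gotcha is easy to check before running a compare. A minimal sketch, with hypothetical helper names and an arbitrary 1% tolerance, given the frame size in Figma and the pixel size of the captured screenshot:

```python
from typing import Tuple

Size = Tuple[int, int]  # (width, height)


def expected_export_size(frame: Size, scale: int = 1) -> Size:
    """Pixel size of a frame exported at the given scale (2 means @2x)."""
    return frame[0] * scale, frame[1] * scale


def same_aspect_ratio(frame: Size, screenshot: Size, tolerance: float = 0.01) -> bool:
    """True if width/height of both images agree within the tolerance."""
    return abs(frame[0] / frame[1] - screenshot[0] / screenshot[1]) <= tolerance


frame = (1440, 900)
print(expected_export_size(frame, scale=2))      # (2880, 1800)
print(same_aspect_ratio(frame, (2880, 1800)))    # True: same shape at @2x
print(same_aspect_ratio(frame, (1280, 900)))     # False: viewport is narrower
```

A False result here usually means the capture viewport and the design frame were set up with different widths, so the diff would be dominated by layout shift rather than real design drift.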
Related
- Visual review: baseline regressions between runs (different from design drift).
- Core concepts — design reference
- Tool reference