CLI Reference

Installation

curl -fsSL https://app.gaffer.sh/install.sh | sh

Installs the gaffer binary to ~/.local/bin. Supports Linux (x86_64, aarch64) and macOS (Apple Silicon, Intel). For a guided walkthrough, see the Getting Started guide.

`gaffer test`

Run a test command and analyze results.

gaffer test -- npm test
gaffer test -- pytest -x
gaffer test -- go test ./...
gaffer test -- cargo test

Flags

Flag	Env var	Description
`--token <token>`	`GAFFER_TOKEN`	API token for cloud sync
`--report <path>` / `-r <path>`	—	Report file path(s) to parse (repeatable)
`--root <dir>`	—	Project root directory (default: `.`)
`--format <human|json>`	—	Output format (default: `human`)
`--show-errors`	—	Show full error messages, stack traces, and context files for failed tests
`--compare <branch>`	—	Compare against the latest run on a branch (e.g. `--compare=main`)
`--fail-on <mode>`	—	Override exit code based on failure classification. `new` exits 0 when only pre-existing or flaky failures exist
`--api-url <url>`	`GAFFER_API_URL`	Override API endpoint

Behavior

Runs your command as a child process, passing through stdout/stderr
Discovers report files via glob patterns (config or defaults)
Parses test results and coverage reports
Computes health score, flaky tests, failure clusters, duration analysis
Classifies each failure as new, pre_existing, flaky, or unknown (auto-compares against the default branch)
Prints enriched summary to stderr
Syncs results to cloud (if token configured)
Exits with the child process’s exit code, unless --fail-on overrides it

Example output

gaffer  40 passed  2 failed  3 skipped  12.4s
Health: 87 (good) ^  Slow: p95 245.3ms
Flaky: 2 tests
   src/auth.test.ts > login — 40% flip rate (4/10 runs)
   src/api.test.ts > timeout handler — 20% flip rate (2/10 runs)
Clusters: 1 pattern (3 tests)
   "Connection refused" — 3 tests
New failures: 1
   src/billing.test.ts > charge card
Pre-existing: 1
   src/db.test.ts > connection timeout
Coverage: 78.5% lines (1234/1572)
Synced: 1 run uploaded

Branch comparison

Compare the current run against a baseline branch:

gaffer test --compare=main -- npm test

vs main: 2 new failures, 1 fixed, 3 pre-existing  pass rate -5.0%  duration +1.2s
   NEW  src/auth.test.ts > login > OAuth redirect
   NEW  src/billing.test.ts > charge card
   FIX  src/api.test.ts > timeout handler

Failure classification

Every failure is automatically classified by comparing against your default branch:

Classification	Meaning
`new`	Failed now, passed on the baseline branch. Likely caused by your changes.
`pre_existing`	Already failing on the baseline branch. Not your fault.
`flaky`	Known flaky test (high flip rate in historical data).
`unknown`	No baseline data available (first run or no runs on the default branch).

Classification runs automatically on every gaffer test invocation. The default branch is detected via git (falls back to main, then master).
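
The rules in the table can be read as a small decision function. The sketch below is one plausible ordering, not Gaffer's actual implementation: it assumes the flaky check takes precedence over the baseline comparison, and that a test missing from the baseline counts as unknown. The function name, parameters, and the "passed"/"failed" status strings are hypothetical.

```python
from __future__ import annotations


def classify_failure(
    test_name: str,
    baseline_status: dict[str, str] | None,
    known_flaky: set[str],
) -> str:
    """Classify one failing test against the baseline branch.

    baseline_status maps test name -> "passed" or "failed" for the latest
    baseline run, or is None when no baseline run exists (assumed shape).
    """
    # Assumption: known-flaky tests are labeled before any comparison.
    if test_name in known_flaky:
        return "flaky"
    # No baseline run, or the test has no result on the baseline.
    if baseline_status is None or test_name not in baseline_status:
        return "unknown"
    # Already failing on the baseline branch.
    if baseline_status[test_name] == "failed":
        return "pre_existing"
    # Passed on the baseline, fails now.
    return "new"
```

For example, a test that failed on the baseline run and is not known to be flaky comes back as `pre_existing`.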

Smart exit codes

Use --fail-on=new to exit 0 when only pre-existing or flaky failures exist:

gaffer test --fail-on=new -- npm test

This is useful in CI to avoid blocking PRs on failures that existed before your changes. If a failure is classified as unknown (no baseline), it’s treated as new for safety.

Signal exits (e.g. SIGTERM killing the test process) always propagate regardless of --fail-on.
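
The exit-code rules described above can be sketched as a single function. This is an illustration under stated assumptions, not the CLI's source: the function name and parameters are hypothetical, and the handling of a failing command with no classified failures (the child's code is kept) is a guess at the safe behavior.

```python
from __future__ import annotations


def resolve_exit_code(
    child_code: int,
    classifications: list[str],
    fail_on: str | None = None,
    killed_by_signal: bool = False,
) -> int:
    """Return the exit code the wrapper should use."""
    # Signal exits always propagate, regardless of --fail-on.
    if killed_by_signal:
        return child_code
    if fail_on == "new":
        # "unknown" is treated as "new" for safety.
        blocking = {"new", "unknown"}
        # Assumption: with no classified failures at all, keep the child's code.
        if classifications and not any(c in blocking for c in classifications):
            return 0
    return child_code
```

With `fail_on="new"`, a run whose only failures are `pre_existing` or `flaky` resolves to 0, while a single `new` or `unknown` failure keeps the child's exit code.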

JSON output

Use --format=json to get machine-readable output on stdout:

gaffer test --format=json -- npm test | jq .health.score

The JSON output includes a classification object with each failure’s type:

gaffer test --format=json -- npm test | jq '.classification.classified_failures[] | {name, classification}'
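
If jq is not available, the same fields can be read from a script. The sketch below groups failures by classification. The sample payload is hypothetical and contains only the fields named in this section (`health.score` and `classification.classified_failures[]` with `name` and `classification`); the real output may carry more.

```python
from __future__ import annotations

import json
from collections import defaultdict

# Hypothetical payload limited to the fields referenced in this section.
SAMPLE_OUTPUT = """
{
  "health": {"score": 87},
  "classification": {
    "classified_failures": [
      {"name": "src/billing.test.ts > charge card", "classification": "new"},
      {"name": "src/db.test.ts > connection timeout", "classification": "pre_existing"}
    ]
  }
}
"""


def group_failures(report: dict) -> dict[str, list[str]]:
    """Map each classification to the names of the failures that carry it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for failure in report["classification"]["classified_failures"]:
        groups[failure["classification"]].append(failure["name"])
    return dict(groups)


report = json.loads(SAMPLE_OUTPUT)
grouped = group_failures(report)
print(report["health"]["score"], grouped.get("new", []))
```

In a real pipeline the JSON would come from the stdout of `gaffer test --format=json -- npm test` instead of the inline sample.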

`gaffer affected-tests`

Map changed source files to relevant test specs. Returns test files and a suggested run command.

gaffer affected-tests --files src/auth.ts src/api.ts

Flags

Flag	Description
`--files <paths>`	Source files that changed (required, repeatable)
`--root <dir>`	Project root directory (default: `.`)
`--format <human|json>`	Output format (default: `json`)

Detection strategies

Strategy	Confidence	Example
Naming convention	90%	`src/auth.ts` finds `src/auth.test.ts`, `src/auth.spec.ts`, `src/__tests__/auth.test.ts`
Directory proximity	30%	`src/utils.ts` finds test files in sibling `__tests__/` or `tests/` directories

Results are deduplicated. When both strategies find the same test file, the higher-confidence match is kept.
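
The deduplication rule can be sketched as follows. The record shape mirrors the JSON output of this command. Keying on `test_file` alone is an assumption, and the function name is hypothetical.

```python
from __future__ import annotations


def dedupe_matches(matches: list[dict]) -> list[dict]:
    """Keep one match per test file, preferring the higher confidence."""
    best: dict[str, dict] = {}
    for match in matches:
        current = best.get(match["test_file"])
        # First sighting, or a stronger strategy found the same file.
        if current is None or match["confidence"] > current["confidence"]:
            best[match["test_file"]] = match
    return list(best.values())


matches = [
    {"test_file": "src/__tests__/auth.test.ts", "confidence": 0.3,
     "strategy": "directory_proximity", "source_file": "src/auth.ts"},
    {"test_file": "src/__tests__/auth.test.ts", "confidence": 0.9,
     "strategy": "naming_convention", "source_file": "src/auth.ts"},
]
deduped = dedupe_matches(matches)
```

Here both strategies hit the same file, so only the `naming_convention` entry survives.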

Example output

{
  "affected": [
    { "test_file": "src/auth.test.ts", "confidence": 0.9, "strategy": "naming_convention", "source_file": "src/auth.ts" },
    { "test_file": "src/__tests__/api.test.ts", "confidence": 0.3, "strategy": "directory_proximity", "source_file": "src/api.ts" }
  ],
  "run_command": "pnpm vitest src/auth.test.ts src/__tests__/api.test.ts",
  "framework": "vitest"
}

The run command is built for your detected framework and package manager (pnpm, yarn, bun, or npm, inferred from lock files).
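
Lock-file detection of the package manager might look like the sketch below. The lock file names are the standard ones for each tool. The check order and the npm fallback are assumptions, and `detect_package_manager` is a hypothetical name.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Lock file -> package manager. The check order is an assumption.
LOCK_FILES = (
    ("pnpm-lock.yaml", "pnpm"),
    ("yarn.lock", "yarn"),
    ("bun.lockb", "bun"),
    ("bun.lock", "bun"),
    ("package-lock.json", "npm"),
)


def detect_package_manager(root: str | Path) -> str:
    """Return the package manager implied by the lock file found in root."""
    root = Path(root)
    for filename, manager in LOCK_FILES:
        if (root / filename).is_file():
            return manager
    return "npm"  # assumed fallback when no lock file is present


# Demo against throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "pnpm-lock.yaml").touch()
    detected = detect_package_manager(tmp)

with tempfile.TemporaryDirectory() as empty:
    fallback = detect_package_manager(empty)
```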

Use with AI agents

Agents can pipe changed files from git into affected-tests to run only relevant tests:

gaffer affected-tests --files $(git diff --name-only main) | jq -r .run_command | sh
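
An agent that wants to skip low-confidence matches can filter the JSON before deciding what to run. This is a minimal sketch: the payload is the example output shown earlier in this section, while `select_tests` and the 0.5 threshold are hypothetical choices.

```python
from __future__ import annotations

# Example output of `gaffer affected-tests`, copied from this section.
payload = {
    "affected": [
        {"test_file": "src/auth.test.ts", "confidence": 0.9,
         "strategy": "naming_convention", "source_file": "src/auth.ts"},
        {"test_file": "src/__tests__/api.test.ts", "confidence": 0.3,
         "strategy": "directory_proximity", "source_file": "src/api.ts"},
    ],
    "run_command": "pnpm vitest src/auth.test.ts src/__tests__/api.test.ts",
    "framework": "vitest",
}


def select_tests(result: dict, min_confidence: float = 0.5) -> list[str]:
    """Return the test files whose confidence meets the threshold."""
    return [
        entry["test_file"]
        for entry in result["affected"]
        if entry["confidence"] >= min_confidence
    ]


confident = select_tests(payload)
everything = select_tests(payload, min_confidence=0.3)
```

Note that `run_command` covers every match, so a filtered list needs its own command line.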

`gaffer doctor`

Diagnose common setup issues. Checks config, database, token validity, report discovery, framework detection, and CLI version.

gaffer doctor

gaffer doctor

  OK  Config  .gaffer/config.toml
  OK  Database  .gaffer/data.db (48KB), has data
  OK  Token  gaf_...x4f2 (valid, API reachable)
  OK  Reports  12 files match current patterns
  OK  Frameworks  vitest (vitest.config.ts), playwright (playwright.config.ts)
  OK  Version  gaffer 0.1.0

Useful as a first diagnostic step when test results fail to sync or report files aren’t detected. Each check outputs OK, WARN, or FAIL with actionable detail.

`gaffer init`

Interactive project setup.

gaffer init

Steps:

Detects test frameworks (Vitest, Playwright, Jest, pytest, Go, RSpec, .NET, Cargo, PHPUnit, Mocha)
Shows reporter setup instructions for each detected framework
Optionally authenticates via browser (creates API token)
Writes .gaffer/config.toml
Adds .gaffer/ to .gitignore

`gaffer query`

Query local test intelligence without running tests. Output is JSON by default; use --pretty for human-readable output. AI agents can access the same data via the MCP server.

`gaffer query health`

Health score, trend, and label.

gaffer query health
gaffer query health --pretty
gaffer query health | jq .score

`gaffer query flaky`

Flaky tests ranked by composite score.

gaffer query flaky
gaffer query flaky | jq '.[].test_name'

`gaffer query slowest`

Top N slowest tests by duration.

gaffer query slowest
gaffer query slowest --limit 5

`gaffer query runs`

Recent test runs with pass/fail counts.

gaffer query runs
gaffer query runs --limit 5

`gaffer query history "<test>"`

Pass/fail history for a specific test (name matched with LIKE).

gaffer query history "login"
gaffer query history "auth > login" --limit 10
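
To illustrate what LIKE matching means for the argument, here is a sketch against an in-memory SQLite database. Everything about the storage is an assumption: the table name, the columns, the use of SQLite, and the wrapping of the argument in `%` wildcards are all hypothetical, chosen only to show the matching behavior.

```python
import sqlite3


def search_history(conn: sqlite3.Connection, pattern: str, limit: int = 10):
    """Return (test_name, status) rows whose name contains the pattern."""
    query = (
        "SELECT test_name, status FROM test_results "
        "WHERE test_name LIKE ? ORDER BY id DESC LIMIT ?"
    )
    # Assumption: the argument is wrapped in % so it matches as a substring.
    return conn.execute(query, (f"%{pattern}%", limit)).fetchall()


# Hypothetical schema and rows, newest run last.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_results (id INTEGER PRIMARY KEY, test_name TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO test_results (test_name, status) VALUES (?, ?)",
    [
        ("src/auth.test.ts > login", "passed"),
        ("src/api.test.ts > timeout handler", "passed"),
        ("src/auth.test.ts > login", "failed"),
    ],
)
matches = search_history(conn, "login")
```

In SQLite, LIKE ignores case for ASCII letters, and `%` or `_` inside the argument act as wildcards.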

`gaffer query failures "<pattern>"`

Search failures across runs by test name or error message.

gaffer query failures "timeout"
gaffer query failures "connection refused" --limit 10

`gaffer sync`

Force-sync pending uploads. Use when a previous gaffer test run was interrupted before syncing, or to retry failed uploads. To upload reports without the CLI, see the Upload API.

gaffer sync
gaffer sync --token gaf_xxx

Configuration

Config file: .gaffer/config.toml (or gaffer.toml at project root)

[project]
token = "gaf_..."
api_url = "https://app.gaffer.sh"

[test]
report_patterns = [
    "**/.gaffer/reports/**/*.xml",
    "**/.gaffer/reports/**/*.json",
    "**/junit*.xml",
    "**/test-results/**/*.xml",
    "**/test-reports/**/*.xml",
    "**/target/nextest/**/*.xml",
    "**/ctrf/**/*.json",
    "**/ctrf-report.json",
    "**/coverage/lcov.info",
    "**/lcov.info",
]

Resolution order: CLI flags > environment variables > config file > defaults.
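
The resolution order can be sketched as a lookup chain. The function and its parameters are hypothetical, the URLs are placeholders, and treating an empty environment variable as unset is an assumption.

```python
from __future__ import annotations

import os
from typing import Mapping


def resolve_setting(
    flag_value: str | None,
    env_name: str | None,
    config_value: str | None,
    default: str | None,
    environ: Mapping[str, str] | None = None,
) -> str | None:
    """Pick a value by precedence: flag, then env var, then config, then default."""
    env = os.environ if environ is None else environ
    if flag_value is not None:
        return flag_value
    # Assumption: an empty environment variable counts as unset.
    if env_name and env.get(env_name):
        return env[env_name]
    if config_value is not None:
        return config_value
    return default


# The env var beats the config file but loses to an explicit flag.
env = {"GAFFER_API_URL": "https://env.example"}
from_env = resolve_setting(None, "GAFFER_API_URL", "https://config.example", None, env)
from_flag = resolve_setting("https://flag.example", "GAFFER_API_URL", None, None, env)
```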

Config discovery: Walks up from the working directory looking for .gaffer/config.toml or gaffer.toml. The directory containing the config becomes the project root.
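
The upward walk described above can be sketched like this. `find_project_root` is a hypothetical name, and checking `.gaffer/config.toml` before `gaffer.toml` within one directory is an assumption.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Candidate config locations, relative to each directory on the way up.
CONFIG_CANDIDATES = (Path(".gaffer") / "config.toml", Path("gaffer.toml"))


def find_project_root(start: str | Path) -> Path | None:
    """Walk up from start and return the first directory holding a config."""
    current = Path(start).resolve()
    for directory in (current, *current.parents):
        for candidate in CONFIG_CANDIDATES:
            if (directory / candidate).is_file():
                return directory
    return None  # reached the filesystem root without finding a config


# Demo: the config sits three levels above the working directory.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp).resolve() / "project"
    nested = project / "packages" / "app" / "src"
    nested.mkdir(parents=True)
    (project / ".gaffer").mkdir()
    (project / ".gaffer" / "config.toml").touch()
    found_root = find_project_root(nested)
    found_expected = found_root == project
```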

Environment variables

Variable	Purpose
`GAFFER_TOKEN`	API token for cloud sync (overridden by `--token`)
`GAFFER_API_URL`	API endpoint URL (overridden by `--api-url`)

Default report patterns

When no --report flag or report_patterns config is set, Gaffer auto-discovers:

**/.gaffer/reports/**/*.xml — Gaffer’s own report directory
**/.gaffer/reports/**/*.json — Gaffer’s own report directory
**/junit*.xml — JUnit XML reports
**/test-results/**/*.xml — Common test result directories
**/test-reports/**/*.xml — Common test report directories
**/target/nextest/**/*.xml — Cargo nextest JUnit output
**/ctrf/**/*.json — CTRF JSON reports
**/ctrf-report.json — Default CTRF output
**/coverage/lcov.info — Default coverage output
**/lcov.info — Root-level coverage
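
To preview which files these patterns would pick up in a project, the sketch below applies them with Python's `pathlib`. Gaffer's own glob matching may differ in edge cases (symlinks, ignored directories), so treat the result as an approximation. `discover_reports` is a hypothetical helper.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# The default patterns listed in this section.
DEFAULT_PATTERNS = (
    "**/.gaffer/reports/**/*.xml",
    "**/.gaffer/reports/**/*.json",
    "**/junit*.xml",
    "**/test-results/**/*.xml",
    "**/test-reports/**/*.xml",
    "**/target/nextest/**/*.xml",
    "**/ctrf/**/*.json",
    "**/ctrf-report.json",
    "**/coverage/lcov.info",
    "**/lcov.info",
)


def discover_reports(root: str | Path, patterns=DEFAULT_PATTERNS) -> list[Path]:
    """Return the files under root that match any pattern, without duplicates."""
    root = Path(root)
    found: set[Path] = set()
    for pattern in patterns:
        found.update(p for p in root.glob(pattern) if p.is_file())
    return sorted(found)


# Demo on a throwaway tree: three report files and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ("junit.xml", "pkg/test-results/unit.xml", "coverage/lcov.info", "notes.txt"):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    discovered = [p.relative_to(root).as_posix() for p in discover_reports(root)]
```

`coverage/lcov.info` matches two patterns but is reported once because results are collected in a set.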