
CLI Reference

curl -fsSL https://app.gaffer.sh/install.sh | sh

Installs the gaffer binary to ~/.local/bin. Supports Linux (x86_64, aarch64) and macOS (Apple Silicon, Intel). For a guided walkthrough, see the Getting Started guide.

Run a test command and analyze results.

gaffer test -- npm test
gaffer test -- pytest -x
gaffer test -- go test ./...
gaffer test -- cargo test

| Flag | Env var | Description |
| --- | --- | --- |
| --token <token> | GAFFER_TOKEN | API token for cloud sync |
| --report <path> / -r <path> | | Report file path(s) to parse (repeatable) |
| --root <dir> | | Project root directory (default: .) |
| --format <human\|json> | | Output format (default: human) |
| --show-errors | | Show full error messages, stack traces, and context files for failed tests |
| --compare <branch> | | Compare against the latest run on a branch (e.g. --compare=main) |
| --fail-on <mode> | | Override the exit code based on failure classification. --fail-on=new exits 0 when only pre-existing or flaky failures exist |
| --api-url <url> | GAFFER_API_URL | Override API endpoint |

  1. Runs your command as a child process, passing through stdout/stderr
  2. Discovers report files via glob patterns (config or defaults)
  3. Parses test results and coverage reports
  4. Computes health score, flaky tests, failure clusters, duration analysis
  5. Classifies each failure as new, pre_existing, flaky, or unknown (auto-compares against the default branch)
  6. Prints enriched summary to stderr
  7. Syncs results to cloud (if token configured)
  8. Exits with the child process’s exit code (or overrides via --fail-on)

Example output:

gaffer 40 passed 2 failed 3 skipped 12.4s
Health: 87 (good) ^ Slow: p95 245.3ms
Flaky: 2 tests
src/auth.test.ts > login — 40% flip rate (4/10 runs)
src/api.test.ts > timeout handler — 20% flip rate (2/10 runs)
Clusters: 1 pattern (3 tests)
"Connection refused" — 3 tests
New failures: 1
src/billing.test.ts > charge card
Pre-existing: 1
src/db.test.ts > connection timeout
Coverage: 78.5% lines (1234/1572)
Synced: 1 run uploaded

Compare the current run against a baseline branch:

gaffer test --compare=main -- npm test
vs main: 2 new failures, 1 fixed, 3 pre-existing pass rate -5.0% duration +1.2s
NEW src/auth.test.ts > login > OAuth redirect
NEW src/billing.test.ts > charge card
FIX src/api.test.ts > timeout handler
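
The comparison summary above can be read as set arithmetic over failing test names. The following is a minimal sketch under that assumption; `compare_runs` and its inputs are hypothetical names for illustration, not Gaffer's actual implementation.

```python
def compare_runs(baseline_failed: set[str], current_failed: set[str]) -> dict[str, list[str]]:
    """Split failures into new, fixed, and pre-existing relative to a baseline run."""
    return {
        # Failing now but not failing on the baseline branch.
        "new": sorted(current_failed - baseline_failed),
        # Failing on the baseline branch but not failing now.
        "fixed": sorted(baseline_failed - current_failed),
        # Failing in both runs.
        "pre_existing": sorted(current_failed & baseline_failed),
    }


# Hypothetical inputs reusing test names from the example output.
baseline = {"src/api.test.ts > timeout handler", "src/db.test.ts > connection timeout"}
current = {"src/billing.test.ts > charge card", "src/db.test.ts > connection timeout"}
result = compare_runs(baseline, current)
print(result)
```

Note that this sketch cannot tell a fixed test from a deleted one; a real comparison would likely also check that the test still exists in the current run.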

Every failure is automatically classified by comparing against your default branch:

| Classification | Meaning |
| --- | --- |
| new | Failed now, passed on the baseline branch. Likely caused by your changes. |
| pre_existing | Already failing on the baseline branch. Not your fault. |
| flaky | Known flaky test (high flip rate in historical data). |
| unknown | No baseline data available (first run or no runs on the default branch). |

Classification runs automatically on every gaffer test invocation. The default branch is detected via git (falls back to main, then master).
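
As a rough illustration of the four classifications, here is a minimal sketch. The function name, the shape of the baseline data, and the order of the checks (flaky first) are assumptions made for the example; the page does not document how Gaffer resolves overlaps between categories.

```python
from typing import Optional


def classify_failure(
    test_name: str,
    baseline: Optional[dict[str, str]],
    known_flaky: set[str],
) -> str:
    """Return new, pre_existing, flaky, or unknown for one failing test."""
    # Assumption: the flaky check takes precedence over the baseline lookup.
    if test_name in known_flaky:
        return "flaky"
    # No baseline run at all, or the test is absent from it (assumed to count as unknown).
    if baseline is None or test_name not in baseline:
        return "unknown"
    # The test was already failing on the baseline branch.
    if baseline[test_name] == "failed":
        return "pre_existing"
    # Any other baseline status is treated as passing here (assumption).
    return "new"


# Hypothetical baseline statuses keyed by test name.
baseline = {"billing > charge card": "passed", "db > connection timeout": "failed"}
flaky = {"auth > login"}
print(classify_failure("billing > charge card", baseline, flaky))
```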

Use --fail-on=new to exit 0 when only pre-existing or flaky failures exist:

gaffer test --fail-on=new -- npm test

This is useful in CI to avoid blocking PRs on failures that existed before your changes. If a failure is classified as unknown (no baseline), it’s treated as new for safety.

Signal exits (e.g. SIGTERM killing the test process) always propagate regardless of --fail-on.
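
The exit-code rules described above can be sketched as a small decision function. This is an illustration only: `resolve_exit_code` and its parameters are hypothetical, and the handling of a failing run with no classified failures is an assumption.

```python
from typing import Optional


def resolve_exit_code(
    child_exit_code: int,
    classifications: list[str],
    fail_on: Optional[str] = None,
    killed_by_signal: bool = False,
) -> int:
    """Decide the final exit code from the child's code and failure classifications."""
    # Signal exits always propagate, regardless of --fail-on.
    if killed_by_signal:
        return child_exit_code
    # Without --fail-on=new (or when the child succeeded), pass the code through.
    if fail_on != "new" or child_exit_code == 0:
        return child_exit_code
    # Assumption: a failing run with no classified failures keeps its exit code.
    if not classifications:
        return child_exit_code
    # unknown is treated as new for safety, so both block the run.
    if any(c in {"new", "unknown"} for c in classifications):
        return child_exit_code
    # Only pre-existing or flaky failures remain.
    return 0


print(resolve_exit_code(1, ["pre_existing", "flaky"], fail_on="new"))
```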

Use --format=json to get machine-readable output on stdout:

gaffer test --format=json -- npm test | jq .health.score

The JSON output includes a classification object with each failure’s type:

gaffer test --format=json -- npm test | jq '.classification.classified_failures[] | {name, classification}'
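
When jq is not available, the same fields can be read from a script. The sketch below counts failures per classification; the sample payload is hypothetical and contains only the fields referenced by the jq examples above (`health.score`, `classification.classified_failures[].name` and `.classification`), so the real output may carry more.

```python
import json
from collections import Counter


def summarize_classifications(report_json: str) -> Counter:
    """Count classified failures by type from the JSON report text."""
    report = json.loads(report_json)
    failures = report.get("classification", {}).get("classified_failures", [])
    return Counter(failure["classification"] for failure in failures)


# Hypothetical sample payload for illustration.
sample = json.dumps({
    "health": {"score": 87},
    "classification": {
        "classified_failures": [
            {"name": "src/billing.test.ts > charge card", "classification": "new"},
            {"name": "src/db.test.ts > connection timeout", "classification": "pre_existing"},
        ]
    },
})
counts = summarize_classifications(sample)
print(dict(counts))
```

In practice the report text would come from the command's stdout, for example by passing `sys.stdin.read()` to the function.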

Map changed source files to relevant test specs. Returns test files and a suggested run command.

gaffer affected-tests --files src/auth.ts src/api.ts

| Flag | Description |
| --- | --- |
| --files <paths> | Source files that changed (required, repeatable) |
| --root <dir> | Project root directory (default: .) |
| --format <human\|json> | Output format (default: json) |

| Strategy | Confidence | Example |
| --- | --- | --- |
| Naming convention | 90% | src/auth.ts finds src/auth.test.ts, src/auth.spec.ts, src/__tests__/auth.test.ts |
| Directory proximity | 30% | src/utils.ts finds test files in sibling __tests__/ or tests/ directories |

Results are deduplicated. When both strategies find the same test file, the higher confidence wins.
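
The deduplication rule can be sketched as keeping the highest-confidence match per test file. The field names follow the JSON output shown on this page; `dedupe_matches` and the final sort order are assumptions for the example.

```python
def dedupe_matches(matches: list[dict]) -> list[dict]:
    """Keep one entry per test file, preferring the higher confidence."""
    best: dict[str, dict] = {}
    for match in matches:
        current = best.get(match["test_file"])
        if current is None or match["confidence"] > current["confidence"]:
            best[match["test_file"]] = match
    # Assumption: order by confidence (descending), then by path for stability.
    return sorted(best.values(), key=lambda m: (-m["confidence"], m["test_file"]))


# Hypothetical matches: the same test file is found by both strategies.
matches = [
    {"test_file": "src/__tests__/auth.test.ts", "confidence": 0.3,
     "strategy": "directory_proximity", "source_file": "src/auth.ts"},
    {"test_file": "src/__tests__/auth.test.ts", "confidence": 0.9,
     "strategy": "naming_convention", "source_file": "src/auth.ts"},
    {"test_file": "src/__tests__/api.test.ts", "confidence": 0.3,
     "strategy": "directory_proximity", "source_file": "src/api.ts"},
]
deduped = dedupe_matches(matches)
print([(m["test_file"], m["strategy"]) for m in deduped])
```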

{
  "affected": [
    { "test_file": "src/auth.test.ts", "confidence": 0.9, "strategy": "naming_convention", "source_file": "src/auth.ts" },
    { "test_file": "src/__tests__/api.test.ts", "confidence": 0.3, "strategy": "directory_proximity", "source_file": "src/api.ts" }
  ],
  "run_command": "pnpm vitest src/auth.test.ts src/__tests__/api.test.ts",
  "framework": "vitest"
}

The run command auto-detects your framework and package manager (pnpm, yarn, bun, or npm from lock files).
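
Lock-file detection of this kind usually amounts to checking for well-known file names. The sketch below uses the standard lock-file names of each package manager; the precedence order when several lock files exist, and returning `None` when none is found, are assumptions rather than documented Gaffer behavior.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Standard lock-file names; the order (precedence) is an assumption.
LOCK_FILES = [
    ("pnpm-lock.yaml", "pnpm"),
    ("yarn.lock", "yarn"),
    ("bun.lockb", "bun"),
    ("bun.lock", "bun"),
    ("package-lock.json", "npm"),
]


def detect_package_manager(root: Path) -> Optional[str]:
    """Return the package manager implied by the first lock file found in root."""
    for lock_file, manager in LOCK_FILES:
        if (root / lock_file).is_file():
            return manager
    return None


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    empty_result = detect_package_manager(root)
    (root / "package-lock.json").write_text("{}")
    npm_result = detect_package_manager(root)
    (root / "pnpm-lock.yaml").write_text("lockfileVersion: '9.0'\n")
    pnpm_result = detect_package_manager(root)
print(empty_result, npm_result, pnpm_result)
```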

Agents can pipe changed files from git into affected-tests to run only relevant tests:

gaffer affected-tests --files $(git diff --name-only main) | jq -r .run_command | sh

Diagnose common setup issues. Checks config, database, token validity, report discovery, framework detection, and CLI version.

gaffer doctor
gaffer doctor
OK Config .gaffer/config.toml
OK Database .gaffer/data.db (48KB), has data
OK Token gaf_...x4f2 (valid, API reachable)
OK Reports 12 files match current patterns
OK Frameworks vitest (vitest.config.ts), playwright (playwright.config.ts)
OK Version gaffer 0.1.0

Useful as a first diagnostic step when tests fail to sync or report files aren’t detected. Each check outputs OK, WARN, or FAIL with actionable detail.

Interactive project setup.

gaffer init

Steps:

  1. Detects test frameworks (Vitest, Playwright, Jest, pytest, Go, RSpec, .NET, Cargo, PHPUnit, Mocha)
  2. Shows reporter setup instructions for each detected framework
  3. Optionally authenticates via browser (creates API token)
  4. Writes .gaffer/config.toml
  5. Adds .gaffer/ to .gitignore

Query local test intelligence without running tests. Output is JSON by default; pass --pretty for human-readable output. AI agents can access the same data via the MCP server.

Health score, trend, and label.

gaffer query health
gaffer query health --pretty
gaffer query health | jq .score

Flaky tests ranked by composite score.

gaffer query flaky
gaffer query flaky | jq '.[].test_name'

Top N slowest tests by duration.

gaffer query slowest
gaffer query slowest --limit 5

Recent test runs with pass/fail counts.

gaffer query runs
gaffer query runs --limit 5

Pass/fail history for a specific test (name matched with LIKE).

gaffer query history "login"
gaffer query history "auth > login" --limit 10
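
To illustrate what LIKE matching implies for the pattern argument, here is a sketch against an in-memory SQLite database. The table name, columns, and the wrapping of the pattern in `%` wildcards are all assumptions for the example; the page only states that the name is matched with LIKE.

```python
import sqlite3


def query_history(conn: sqlite3.Connection, pattern: str, limit: int = 10) -> list[tuple[str, str]]:
    """Return (test_name, status) rows whose name contains the pattern, newest first."""
    # Assumption: the pattern is wrapped in % so it matches anywhere in the name.
    return conn.execute(
        "SELECT test_name, status FROM results "
        "WHERE test_name LIKE ? ORDER BY id DESC LIMIT ?",
        (f"%{pattern}%", limit),
    ).fetchall()


# Hypothetical schema and rows for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, test_name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO results (test_name, status) VALUES (?, ?)",
    [
        ("auth > login", "passed"),
        ("auth > logout", "passed"),
        ("auth > login", "failed"),
    ],
)
history = query_history(conn, "login")
print(history)
```

With LIKE semantics, `%` and `_` inside the pattern act as wildcards, so a name containing those characters may match more broadly than expected.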

Search failures across runs by test name or error message.

gaffer query failures "timeout"
gaffer query failures "connection refused" --limit 10

Force-sync pending uploads. Use when a previous gaffer test run was interrupted before syncing, or to retry failed uploads. To upload reports without the CLI, see the Upload API.

gaffer sync
gaffer sync --token gaf_xxx

Config file: .gaffer/config.toml (or gaffer.toml at project root)

[project]
token = "gaf_..."
api_url = "https://app.gaffer.sh"

[test]
report_patterns = [
  "**/.gaffer/reports/**/*.xml",
  "**/.gaffer/reports/**/*.json",
  "**/junit*.xml",
  "**/test-results/**/*.xml",
  "**/test-reports/**/*.xml",
  "**/target/nextest/**/*.xml",
  "**/ctrf/**/*.json",
  "**/ctrf-report.json",
  "**/coverage/lcov.info",
  "**/lcov.info",
]

Resolution order: CLI flags > environment variables > config file > defaults.
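
The resolution order can be sketched as "first value that is set wins". The helper below is hypothetical, and using `https://app.gaffer.sh` as the built-in default for the API URL is an assumption taken from the config example above.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    flag_value: Optional[str],
    env_name: str,
    config_value: Optional[str],
    default: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Apply CLI flag > environment variable > config file > default."""
    env = os.environ if env is None else env
    for candidate in (flag_value, env.get(env_name), config_value):
        if candidate is not None:
            return candidate
    return default


# The flag is absent, so the environment variable beats the config file value.
token = resolve_setting(None, "GAFFER_TOKEN", "gaf_config", env={"GAFFER_TOKEN": "gaf_env"})
print(token)
```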

Config discovery: Walks up from the working directory looking for .gaffer/config.toml or gaffer.toml. The directory containing the config becomes the project root.
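
The upward walk can be sketched with `pathlib`. The function name is hypothetical, and checking `.gaffer/config.toml` before `gaffer.toml` within the same directory is an assumption; the page does not say which wins when both exist.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Candidate config locations, relative to each directory on the way up.
CONFIG_CANDIDATES = (Path(".gaffer") / "config.toml", Path("gaffer.toml"))


def find_project_root(start: Path) -> Optional[Path]:
    """Walk up from start and return the first directory containing a config file."""
    start = start.resolve()
    for directory in (start, *start.parents):
        for candidate in CONFIG_CANDIDATES:
            if (directory / candidate).is_file():
                return directory
    return None


# Demo: the config sits two levels above the working directory.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp).resolve() / "project"
    nested = project / "packages" / "api"
    nested.mkdir(parents=True)
    (project / "gaffer.toml").write_text("[project]\n")
    found_matches = find_project_root(nested) == project
print(found_matches)
```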

| Variable | Purpose |
| --- | --- |
| GAFFER_TOKEN | API token for cloud sync (overridden by --token) |
| GAFFER_API_URL | API endpoint URL (overridden by --api-url) |

When no --report flag or report_patterns config is set, Gaffer auto-discovers:

  • **/.gaffer/reports/**/*.xml — Gaffer’s own report directory
  • **/.gaffer/reports/**/*.json — Gaffer’s own report directory
  • **/junit*.xml — JUnit XML reports
  • **/test-results/**/*.xml — Common test result directories
  • **/test-reports/**/*.xml — Common test report directories
  • **/target/nextest/**/*.xml — Cargo nextest JUnit output
  • **/ctrf/**/*.json — CTRF JSON reports
  • **/ctrf-report.json — Default CTRF output
  • **/coverage/lcov.info — Default coverage output
  • **/lcov.info — Root-level coverage
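
To see which files a set of patterns like these would pick up, the discovery step can be approximated with `pathlib` globbing. This is a sketch only: Gaffer's own glob engine may treat hidden directories, symlinks, or ignored paths differently, and `discover_reports` is a hypothetical name.

```python
import tempfile
from pathlib import Path

# The default patterns listed above.
DEFAULT_PATTERNS = [
    "**/.gaffer/reports/**/*.xml",
    "**/.gaffer/reports/**/*.json",
    "**/junit*.xml",
    "**/test-results/**/*.xml",
    "**/test-reports/**/*.xml",
    "**/target/nextest/**/*.xml",
    "**/ctrf/**/*.json",
    "**/ctrf-report.json",
    "**/coverage/lcov.info",
    "**/lcov.info",
]


def discover_reports(root: Path, patterns: list[str] = DEFAULT_PATTERNS) -> list[Path]:
    """Return files under root matching any pattern, without duplicates."""
    found: set[Path] = set()
    for pattern in patterns:
        # Several patterns can match the same file, so collect into a set.
        found.update(p for p in root.glob(pattern) if p.is_file())
    return sorted(found)


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".gaffer" / "reports").mkdir(parents=True)
    (root / ".gaffer" / "reports" / "junit.xml").write_text("<testsuites/>")
    (root / "coverage").mkdir()
    (root / "coverage" / "lcov.info").write_text("TN:\n")
    (root / "notes.txt").write_text("not a report")
    reports = sorted(p.relative_to(root).as_posix() for p in discover_reports(root))
print(reports)
```

Both demo files match two patterns each, which is why the results are collected into a set before sorting.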