CLI Reference
Installation
Install via Homebrew (macOS, Linux):

    brew install gaffer-sh/tap/gaffer

Or via the install script (macOS, Linux):

    curl -fsSL https://app.gaffer.sh/install.sh | sh

The install script places the `gaffer` binary in `~/.local/bin`. Both methods support Linux (x86_64, aarch64) and macOS (Apple Silicon, Intel). For a guided walkthrough, see the Getting Started guide.
gaffer test
Run a test command and analyze results.

    gaffer test -- npm test
    gaffer test -- pytest -x
    gaffer test -- go test ./...
    gaffer test -- cargo test

| Flag | Env var | Description |
|---|---|---|
| `--token <token>` | `GAFFER_TOKEN` | API token for cloud sync |
| `--report <path>` / `-r <path>` | — | Report file path(s) to parse (repeatable) |
| `--root <dir>` | — | Project root directory (default: `.`) |
| `--format <human\|json>` | — | Output format (default: `human`) |
| `--show-errors` | — | Show full error messages, stack traces, and context files for failed tests |
| `--compare <branch>` | — | Compare against the latest run on a branch (e.g. `--compare=main`) |
| `--fail-on <mode>` | — | Override exit code based on failure classification. `new` exits 0 when only pre-existing or flaky failures exist |
| `--affected` | — | Derive the wrapped command from `affected-tests`. Use with `--files`. The trailing `-- <cmd>` is ignored when set |
| `--files <paths>` | — | Changed source files. Only meaningful with `--affected` (repeatable) |
| `--no-graph` | — | With `--affected`, disable the import-graph strategy and fall back to naming + proximity heuristics |
| `--no-cache` | — | With `--affected`, force an in-memory graph build instead of using `.gaffer/graph.db` |
| `--on-empty <auto\|skip\|fail>` | — | With `--affected`, behavior when no tests are affected. `auto` (default) exits 0 only when all signals were available; `skip` always exits 0; `fail` always exits non-zero |
| `--api-url <url>` | `GAFFER_API_URL` | Override API endpoint |
Behavior
- Runs your command as a child process, passing through stdout/stderr
- Discovers report files via glob patterns (config or defaults)
- Parses test results and coverage reports
- Computes health score, flaky tests, failure clusters, duration analysis
- Classifies each failure as `new`, `pre_existing`, `flaky`, or `unknown` (auto-compares against the default branch)
- Prints enriched summary to stderr
- Syncs results to cloud (if token configured)
- Exits with the child process's exit code (or overrides via `--fail-on`)
Example output
    gaffer 40 passed 2 failed 3 skipped 12.4s
    Health: 87 (good) ^ Slow: p95 245.3ms
    Flaky: 2 tests
      src/auth.test.ts > login — 40% flip rate (4/10 runs)
      src/api.test.ts > timeout handler — 20% flip rate (2/10 runs)
    Clusters: 1 pattern (3 tests)
      "Connection refused" — 3 tests
    New failures: 1
      src/billing.test.ts > charge card
    Pre-existing: 1
      src/db.test.ts > connection timeout
    Coverage: 78.5% lines (1234/1572)
    Synced: 1 run uploaded

Branch comparison
Compare the current run against a baseline branch:

    gaffer test --compare=main -- npm test

    vs main: 2 new failures, 1 fixed, 3 pre-existing
      pass rate -5.0% duration +1.2s
      NEW src/auth.test.ts > login > OAuth redirect
      NEW src/billing.test.ts > charge card
      FIX src/api.test.ts > timeout handler

Failure classification
Every failure is automatically classified by comparing against your default branch:

| Classification | Meaning |
|---|---|
| `new` | Failed now, passed on the baseline branch. Likely caused by your changes. |
| `pre_existing` | Already failing on the baseline branch. Not your fault. |
| `flaky` | Known flaky test (high flip rate in historical data). |
| `unknown` | No baseline data available (first run or no runs on the default branch). |
Classification runs automatically on every gaffer test invocation. The default branch is detected via git (falls back to main, then master).
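
The rules in the table can be modeled with a few lines of Python. This is an illustrative sketch, not Gaffer's implementation: `classify_failure`, the `baseline` mapping, and the `known_flaky` set are hypothetical names, and both the order of the checks and the handling of a test that is missing from the baseline run are assumptions.

```python
from typing import Dict, Optional, Set


def classify_failure(
    test: str,
    baseline: Optional[Dict[str, str]],
    known_flaky: Set[str],
) -> str:
    """Classify one failing test (hypothetical model of the table above)."""
    if test in known_flaky:
        return "flaky"  # assumption: historical flakiness is checked first
    if baseline is None:
        return "unknown"  # first run, or no runs on the default branch
    status = baseline.get(test)
    if status == "failed":
        return "pre_existing"
    if status == "passed":
        return "new"
    return "unknown"  # assumption: test not present in the baseline run


# Hypothetical baseline: test name -> status on the default branch.
baseline = {
    "src/db.test.ts > connection timeout": "failed",
    "src/billing.test.ts > charge card": "passed",
}
flaky = {"src/auth.test.ts > login"}

for name in [*baseline, "src/auth.test.ts > login"]:
    print(name, "->", classify_failure(name, baseline, flaky))
```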
Smart exit codes
Use `--fail-on=new` to exit 0 when only pre-existing or flaky failures exist:

    gaffer test --fail-on=new -- npm test

This is useful in CI to avoid blocking PRs on failures that existed before your changes. If a failure is classified as `unknown` (no baseline), it's treated as `new` for safety.
Signal exits (e.g. SIGTERM killing the test process) always propagate regardless of --fail-on.
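
A minimal Python sketch of the exit-code decision described in this section follows. The function and its parameters are hypothetical. Representing a signal exit as a negative return code is the convention of Python's `subprocess` module, used here only for illustration; how Gaffer encodes signal exits is not specified above.

```python
from typing import Optional, Sequence

# Classifications that do not block a run under --fail-on=new.
NON_BLOCKING = {"pre_existing", "flaky"}


def exit_code(
    child_code: int,
    classifications: Sequence[str],
    fail_on: Optional[str] = None,
) -> int:
    """Return the code the wrapper would exit with (illustrative model)."""
    if child_code < 0:
        # Killed by a signal: always propagate, regardless of --fail-on.
        return child_code
    if (
        fail_on == "new"
        and classifications
        and all(c in NON_BLOCKING for c in classifications)
    ):
        # Only pre-existing or flaky failures: do not block.
        return 0
    # "new" and "unknown" failures (or no override) keep the child's code.
    return child_code
```

Because `unknown` is not in `NON_BLOCKING`, a run with no baseline still fails, which matches the "treated as new for safety" rule.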
JSON output
Use `--format=json` to get machine-readable output on stdout:

    gaffer test --format=json -- npm test | jq .health.score

The JSON output includes a `classification` object with each failure's type:

    gaffer test --format=json -- npm test | jq '.classification.classified_failures[] | {name, classification}'

Run only affected tests
`--affected` collapses the two-step agentic loop (`gaffer affected-tests` followed by `gaffer test`) into one invocation. Pass the changed source files; Gaffer maps them to test files, scopes the runner, and parses results as usual.

    gaffer test --affected --files src/auth.ts src/api.ts

When no tests are affected, `--on-empty=auto` (default) exits 0 only when all detection signals were available. When some signals were unavailable (degraded mode), `auto` exits non-zero so CI doesn't silently green-light a partial run. Use `--on-empty=skip` to always exit 0 or `--on-empty=fail` to always exit non-zero.
The CLI currently reports coverage_history and failure_history as unavailable on every run (those signals require a future Gaffer-history connection), so auto will exit non-zero on any empty result today. If you want silent-skip on empty, pass --on-empty=skip explicitly.
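
The three `--on-empty` modes reduce to a small decision function. The sketch below is a hypothetical model in Python. The concrete non-zero value `1` is an assumption, since the text above only says "non-zero".

```python
from typing import Sequence


def on_empty_exit_code(mode: str, unavailable: Sequence[str]) -> int:
    """Exit code when no tests are affected (illustrative model)."""
    if mode == "skip":
        return 0
    if mode == "fail":
        return 1  # assumption: any non-zero value
    if mode == "auto":
        # Confirmed-empty only if every detection signal was available.
        return 0 if not unavailable else 1
    raise ValueError(f"unknown --on-empty mode: {mode}")


# With the two history signals unavailable, auto does not exit 0.
print(on_empty_exit_code("auto", ["coverage_history", "failure_history"]))
```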
gaffer affected-tests
Map changed source files to relevant test specs. Returns test files and a suggested run command.

    gaffer affected-tests --files src/auth.ts src/api.ts

| Flag | Description |
|---|---|
| `--files <paths>` | Source files that changed (required, repeatable) |
| `--root <dir>` | Project root directory (default: `.`) |
| `--format <human\|json>` | Output format (default: `json`) |
| `--pretty` | Human-readable output to stderr. Equivalent to `--format human` |
| `--no-graph` | Disable the import-graph strategy. Falls back to naming + proximity heuristics only. Faster on huge codebases at the cost of missing indirect dependencies |
| `--no-cache` | Force an in-memory graph build instead of using `.gaffer/graph.db`. Useful for ephemeral CI runs and read-only filesystems |
| `--print-cmd` | Print only the bare `run_command` string to stdout. Exits 1 when no command is available so `gaffer test -- $(gaffer affected-tests --files X --print-cmd)` fails fast on the empty case |
Detection strategies
| Strategy | Example |
|---|---|
| Naming convention | src/auth.ts finds src/auth.test.ts, src/auth.spec.ts, src/__tests__/auth.test.ts |
| Directory proximity | src/utils.ts finds test files in sibling __tests__/ or tests/ directories |
| Import graph | Reverse-reachability over the static import graph. First call walks the project; subsequent calls incrementally update files whose mtime has changed, persisted to .gaffer/graph.db |
The import graph runs by default. Pass --no-graph to opt out. Results are deduplicated across strategies; the JSON payload reports which signals were attempted and which were unavailable so callers can detect degraded runs.
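
To make the naming-convention row concrete, the following Python sketch generates the three candidate paths shown in the table for a given source file. It covers only those three patterns. The function name is hypothetical, and the real strategy may recognize more conventions and presumably checks which candidates exist on disk.

```python
from pathlib import PurePosixPath
from typing import List


def naming_candidates(source: str) -> List[str]:
    """Candidate test paths for a source file, per the table's example."""
    path = PurePosixPath(source)
    stem, ext, parent = path.stem, path.suffix, path.parent
    return [
        str(parent / f"{stem}.test{ext}"),
        str(parent / f"{stem}.spec{ext}"),
        str(parent / "__tests__" / f"{stem}.test{ext}"),
    ]


print(naming_candidates("src/auth.ts"))
```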
Example output
    {
      "affected": [
        {
          "test_file": "src/auth.test.ts",
          "source_file": "src/auth.ts",
          "confidence": 0.97,
          "strategy": "naming_convention",
          "signals": [
            { "strategy": "naming_convention", "confidence": 0.9 },
            { "strategy": "import_graph", "confidence": 0.7 }
          ]
        },
        {
          "test_file": "src/__tests__/api.test.ts",
          "source_file": "src/api.ts",
          "confidence": 0.3,
          "strategy": "directory_proximity",
          "signals": [
            { "strategy": "directory_proximity", "confidence": 0.3 }
          ]
        }
      ],
      "run_command": "pnpm vitest src/auth.test.ts src/__tests__/api.test.ts",
      "framework": "vitest",
      "signals": {
        "attempted": ["naming_convention", "directory_proximity", "import_graph"],
        "unavailable": ["coverage_history", "failure_history"]
      }
    }

Per-test fields: `confidence` is the noisy-OR combination across signals, and `strategy` is the highest-confidence individual signal (kept flat for legacy consumers). The `signals` array carries every signal that selected the test with its per-signal confidence.
The run command auto-detects your framework and package manager (pnpm, yarn, bun, or npm from lock files). signals.unavailable lists detection sources that weren’t reachable on this run; when affected is empty and this list is non-empty, the result is degraded rather than confirmed-empty.
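
The noisy-OR combination can be checked against the example payload: signals of 0.9 and 0.7 combine to 1 - (1 - 0.9)(1 - 0.7) = 0.97. The Python sketch below reproduces that arithmetic and adds a check for the degraded case. Both function names are hypothetical, and the payload shape is taken from the example output.

```python
from typing import Iterable


def noisy_or(confidences: Iterable[float]) -> float:
    """Probability that at least one independent signal is right."""
    miss = 1.0
    for c in confidences:
        miss *= 1.0 - c
    return 1.0 - miss


def is_degraded(payload: dict) -> bool:
    """Empty result while some detection sources were unreachable."""
    return not payload["affected"] and bool(payload["signals"]["unavailable"])


print(round(noisy_or([0.9, 0.7]), 2))  # 0.97, as in the example payload

empty_run = {
    "affected": [],
    "signals": {
        "attempted": ["naming_convention"],
        "unavailable": ["coverage_history", "failure_history"],
    },
}
print(is_degraded(empty_run))  # True
```

A caller that sees `is_degraded(...)` return true should treat the empty list as "could not tell" rather than "nothing to run".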
Use with AI agents
The integrated `gaffer test --affected` flag is the simplest path. To pipe through `affected-tests` directly, use `--print-cmd`:

    gaffer test -- $(gaffer affected-tests --files $(git diff --name-only main) --print-cmd)

`--print-cmd` exits 1 when no command is available, so the `gaffer test` invocation never runs with an empty wrapped command.
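
An agent that drives the two-step pipeline from Python rather than a shell could follow the sketch below. It uses only the flags documented above. The function name, the injectable `run` parameter (which makes the function testable without the CLI installed), and the early return for an empty file list are assumptions made for illustration.

```python
import shlex
import subprocess
from typing import Callable, Sequence


def run_affected(changed: Sequence[str], run: Callable = subprocess.run) -> int:
    """Ask for the run command, then hand it to `gaffer test`."""
    if not changed:
        return 0  # assumption: nothing changed, so nothing to run
    probe = run(
        ["gaffer", "affected-tests", "--files", *changed, "--print-cmd"],
        capture_output=True,
        text=True,
    )
    if probe.returncode != 0:
        # --print-cmd exits 1 when no command is available: stop here
        # instead of invoking `gaffer test` with an empty command.
        return probe.returncode
    wrapped = shlex.split(probe.stdout.strip())
    return run(["gaffer", "test", "--", *wrapped]).returncode
```

The list of changed files would typically come from `git diff --name-only main`, as in the shell pipeline above. Passing arguments as a list avoids the word-splitting issues that unquoted `$(...)` can cause for paths containing spaces.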
gaffer doctor
Diagnose common setup issues. Checks config, database, token validity, report discovery, framework detection, and CLI version.

    gaffer doctor

    gaffer doctor

      OK  Config      .gaffer/config.toml
      OK  Database    .gaffer/data.db (48KB), has data
      OK  Token       gaf_...x4f2 (valid, API reachable)
      OK  Reports     12 files match current patterns
      OK  Frameworks  vitest (vitest.config.ts), playwright (playwright.config.ts)
      OK  Version     gaffer 0.1.0

Useful as a first diagnostic step when tests fail to sync or report files aren't detected. Each check outputs OK, WARN, or FAIL with actionable detail.
gaffer init
Interactive project setup.

    gaffer init

Steps:
- Detects test frameworks (Vitest, Playwright, Jest, pytest, Go, RSpec, .NET, Cargo, PHPUnit, Mocha)
- Shows reporter setup instructions for each detected framework
- Optionally authenticates via browser (creates API token)
- Writes `.gaffer/config.toml`
- Adds `.gaffer/` to `.gitignore`
gaffer query
Query local test intelligence without running tests. Output is JSON by default; use `--pretty` for human-readable output. AI agents can access the same data via the MCP server.
gaffer query health
Health score, trend, and label.

    gaffer query health
    gaffer query health --pretty
    gaffer query health | jq .score

gaffer query flaky
Flaky tests ranked by composite score.

    gaffer query flaky
    gaffer query flaky | jq '.[].test_name'

gaffer query slowest
Top N slowest tests by duration.

    gaffer query slowest
    gaffer query slowest --limit 5

gaffer query runs
Recent test runs with pass/fail counts.

    gaffer query runs
    gaffer query runs --limit 5

gaffer query history "<test>"
Pass/fail history for a specific test (name matched with LIKE).

    gaffer query history "login"
    gaffer query history "auth > login" --limit 10

gaffer query failures "<pattern>"
Search failures across runs by test name or error message.

    gaffer query failures "timeout"
    gaffer query failures "connection refused" --limit 10

gaffer sync
Force-sync pending uploads. Use when a previous `gaffer test` run was interrupted before syncing, or to retry failed uploads. To upload reports without the CLI, see the Upload API.

    gaffer sync
    gaffer sync --token gaf_xxx

Configuration
Config file: `.gaffer/config.toml` (or `gaffer.toml` at project root)

    [project]
    token = "gaf_..."
    api_url = "https://app.gaffer.sh"

    [test]
    report_patterns = [
      "**/.gaffer/reports/**/*.xml",
      "**/.gaffer/reports/**/*.json",
      "**/junit*.xml",
      "**/test-results/**/*.xml",
      "**/test-reports/**/*.xml",
      "**/target/nextest/**/*.xml",
      "**/ctrf/**/*.json",
      "**/ctrf-report.json",
      "**/coverage/lcov.info",
      "**/lcov.info",
    ]

Resolution order: CLI flags > environment variables > config file > defaults.
Config discovery: Walks up from the working directory looking for .gaffer/config.toml or gaffer.toml. The directory containing the config becomes the project root.
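
The resolution order and the walk-up discovery can be sketched in Python as follows. Both functions are hypothetical illustrations rather than Gaffer's code. Treating an empty environment variable as unset is an assumption, and the sketch does not model which file takes precedence when both config files exist in the same directory.

```python
import os
from pathlib import Path
from typing import Optional

CONFIG_NAMES = (Path(".gaffer") / "config.toml", Path("gaffer.toml"))


def find_project_root(start: Path) -> Optional[Path]:
    """Walk up from `start`; the first directory with a config is the root."""
    start = start.resolve()
    for directory in (start, *start.parents):
        if any((directory / name).is_file() for name in CONFIG_NAMES):
            return directory
    return None


def resolve(
    flag: Optional[str],
    env_name: str,
    config: dict,
    key: str,
    default: Optional[str] = None,
) -> Optional[str]:
    """CLI flag > environment variable > config file > default."""
    if flag is not None:
        return flag
    if os.environ.get(env_name):  # assumption: empty string means unset
        return os.environ[env_name]
    if key in config:
        return config[key]
    return default
```

For example, `resolve(None, "GAFFER_API_URL", project_table, "api_url", "https://app.gaffer.sh")` would return the environment value when it is set, otherwise the `[project]` entry, otherwise the default. Here `project_table` stands for the parsed `[project]` table.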
Environment variables
| Variable | Purpose |
|---|---|
| `GAFFER_TOKEN` | API token for cloud sync (overridden by `--token`) |
| `GAFFER_API_URL` | API endpoint URL (overridden by `--api-url`) |
Default report patterns
When no `--report` flag or `report_patterns` config is set, Gaffer auto-discovers:
- `**/.gaffer/reports/**/*.xml`: Gaffer's own report directory
- `**/.gaffer/reports/**/*.json`: Gaffer's own report directory
- `**/junit*.xml`: JUnit XML reports
- `**/test-results/**/*.xml`: Common test result directories
- `**/test-reports/**/*.xml`: Common test report directories
- `**/target/nextest/**/*.xml`: Cargo nextest JUnit output
- `**/ctrf/**/*.json`: CTRF JSON reports
- `**/ctrf-report.json`: Default CTRF output
- `**/coverage/lcov.info`: Default coverage output
- `**/lcov.info`: Root-level coverage
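
To preview which files these patterns would pick up in a project, the following Python sketch applies them with `pathlib`. It assumes `pathlib`'s glob semantics (where `**` matches zero or more directories, including hidden ones) are close enough to Gaffer's matcher for a rough check. The actual matcher may differ, and `discover_reports` is a hypothetical helper.

```python
from pathlib import Path
from typing import List, Sequence

DEFAULT_PATTERNS = [
    "**/.gaffer/reports/**/*.xml",
    "**/.gaffer/reports/**/*.json",
    "**/junit*.xml",
    "**/test-results/**/*.xml",
    "**/test-reports/**/*.xml",
    "**/target/nextest/**/*.xml",
    "**/ctrf/**/*.json",
    "**/ctrf-report.json",
    "**/coverage/lcov.info",
    "**/lcov.info",
]


def discover_reports(
    root: Path, patterns: Sequence[str] = DEFAULT_PATTERNS
) -> List[Path]:
    """Return files under `root` matching any pattern, deduplicated and sorted."""
    found = set()
    for pattern in patterns:
        found.update(p for p in root.glob(pattern) if p.is_file())
    return sorted(found)
```

A file such as `coverage/lcov.info` matches both of the last two patterns; collecting results in a set keeps it from being listed twice.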