Playwright Reports: HTML, JSON, JUnit & Sharing in CI

Playwright ships with eight built-in reporters, and the one most people use produces a folder called playwright-report/ that only opens on the machine that made it and gets overwritten on the next run. This guide covers every reporter honestly, then keeps going to the two things the other Playwright reporting guides skip: turning a CI run into one shareable URL, and keeping a searchable history of every run instead of just the latest.

What Playwright generates by default

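With no reporter configured, npx playwright test prints results through the list reporter (dot on CI) and writes nothing to disk. The config scaffolded by npm init playwright sets reporter: "html", which is why most projects also end up with an HTML report on disk. That HTML report is the artifact most teams mean when they say “the Playwright report.”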

The HTML reporter: what it includes and where it lives

The HTML reporter writes a self-contained report to playwright-report/ in your project root. Open it with:

npx playwright show-report

That serves playwright-report/index.html on a local web server. The server matters because the trace viewer has to be loaded over HTTP: double-clicking index.html shows the test list, but traces will not open from a file:// URL. Inside you get a filterable list of every test, pass/fail/flaky status, durations, retries, and per-failure attachments: screenshots, videos, and the full Playwright trace with a step-by-step DOM timeline. For debugging a single failing run, it is the best report any JS test framework produces.

Why the default report disappears after the next run

The HTML reporter writes to the same playwright-report/ directory every time. There is no run ID in the path, no timestamp, no history. Run the suite again and yesterday’s report is gone. This is fine on your laptop, where “yesterday’s report” usually means “the thing I already looked at.” It stops being fine the moment the report you want lives on a CI runner that was destroyed an hour after the job finished, or the moment a teammate asks “did this test fail last week too?” and the only honest answer is “the report that would tell you no longer exists.”

Hold that gap. Everything in the second half of this guide, from CI setup to sharing a report with your team, exists to close it.

Built-in Playwright reporters

Playwright’s built-in reporters are: list, line, dot, html, json, junit, blob, and github. You set them in playwright.config.ts or with the --reporter CLI flag. Each one targets a different consumer.

List, Line, and Dot reporters (CI console output)

These three write to the terminal only. They produce no files.

list (the default for local runs) prints one line per test as it finishes, with status and duration.
line is more compact: a single updating status line, with failures printed as they occur.
dot prints one character per test (a dot for pass, an F for fail). It is the default on CI and the quietest option, useful when you have thousands of tests and want CI logs to stay short.

npx playwright test --reporter=line

None of these survive the job. They are progress indicators, not artifacts.

HTML reporter (local interactive report)

The reporter covered above, configured explicitly:

import { defineConfig } from "@playwright/test";

export default defineConfig({
  reporter: [["html", { open: "never" }]],
});

open: "never" stops Playwright from auto-launching a browser after every local run, which is the setting you want on CI and usually want locally too. The report still gets written; you just open it on demand with show-report.

JSON reporter (machine-readable output)

The json reporter emits one structured JSON file describing the whole run: every test, its status, retries, timings, errors, and attachment paths. It is what you reach for when you want to post-process results yourself.

import { defineConfig } from "@playwright/test";

export default defineConfig({
  reporter: [["json", { outputFile: "results/results.json" }]],
});

Without outputFile (or the PLAYWRIGHT_JSON_OUTPUT_NAME env var) the JSON goes to stdout, which is rarely what you want on CI because it mixes with the rest of the log.
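As a rough illustration of that post-processing, the sketch below walks the nested suites of a parsed results file and tallies outcomes. The interfaces cover only the fields it reads and are a simplified assumption about the JSON reporter's output, not its full schema; summarize and the sample object are names made up for this example.

```typescript
// Subset of the JSON report that this sketch reads (assumed shape, not the full schema).
interface JsonTest { status: "expected" | "unexpected" | "flaky" | "skipped" }
interface JsonSpec { title: string; file: string; tests: JsonTest[] }
interface JsonSuite { title: string; specs?: JsonSpec[]; suites?: JsonSuite[] }
interface JsonReport { suites: JsonSuite[] }

interface Summary { passed: number; failed: number; flaky: number; skipped: number; failures: string[] }

// Recursively visit every suite and count each test by its final outcome.
function summarize(report: JsonReport): Summary {
  const summary: Summary = { passed: 0, failed: 0, flaky: 0, skipped: 0, failures: [] };
  const visit = (suite: JsonSuite): void => {
    for (const spec of suite.specs ?? []) {
      for (const test of spec.tests) {
        if (test.status === "expected") summary.passed++;
        else if (test.status === "flaky") summary.flaky++;
        else if (test.status === "skipped") summary.skipped++;
        else {
          summary.failed++;
          summary.failures.push(`${spec.file} > ${spec.title}`);
        }
      }
    }
    for (const child of suite.suites ?? []) visit(child);
  };
  report.suites.forEach(visit);
  return summary;
}

// Hypothetical sample standing in for JSON.parse over results/results.json.
const sample: JsonReport = {
  suites: [
    {
      title: "checkout.spec.ts",
      specs: [
        { title: "pays with card", file: "checkout.spec.ts", tests: [{ status: "expected" }] },
        { title: "applies coupon", file: "checkout.spec.ts", tests: [{ status: "unexpected" }] },
      ],
      suites: [
        {
          title: "guest",
          specs: [{ title: "keeps cart", file: "checkout.spec.ts", tests: [{ status: "flaky" }] }],
        },
      ],
    },
  ],
};

console.log(summarize(sample));
```

In a real project the input would come from reading and parsing the file named in outputFile rather than from an inline object.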

JUnit reporter (CI pipeline consumption)

The junit reporter writes JUnit XML, the lowest-common-denominator test-result format. Every major CI provider and test-analytics platform knows how to read it.

import { defineConfig } from "@playwright/test";

export default defineConfig({
  reporter: [["junit", { outputFile: "results/junit.xml" }]],
});

GitHub Actions can surface failures inline from this file with a third-party reporter action, GitLab CI has a native junit: artifact type, and analytics tools ingest it directly. JUnit XML carries the pass/fail/duration data but not the screenshots or traces, so it complements the HTML report rather than replacing it.
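To make concrete what the XML does carry, here is a minimal sketch that pulls test cases out of a JUnit file with regular expressions. It assumes well-formed output with escaped attribute values and does not decode XML entities; a real pipeline would probably use a proper XML parser. parseJUnit, attr, and the sample document are hypothetical names for this example.

```typescript
interface JUnitCase { name: string; classname: string; seconds: number; failed: boolean }

// Read one attribute from a tag's attribute string; undefined when absent.
function attr(attrs: string, key: string): string | undefined {
  const m = new RegExp(`\\b${key}="([^"]*)"`).exec(attrs);
  return m ? m[1] : undefined;
}

// Collect every <testcase>, whether self-closing (passed) or with child elements.
function parseJUnit(xml: string): JUnitCase[] {
  const cases: JUnitCase[] = [];
  const pattern = /<testcase\b([^>]*?)(?:\/>|>([\s\S]*?)<\/testcase>)/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(xml)) !== null) {
    const attrs = m[1] ?? "";
    const body = m[2] ?? "";
    cases.push({
      name: attr(attrs, "name") ?? "",
      classname: attr(attrs, "classname") ?? "",
      seconds: Number(attr(attrs, "time") ?? 0),
      // A <failure> or <error> child marks the case as failed.
      failed: /<(failure|error)\b/.test(body),
    });
  }
  return cases;
}

// Hand-written sample in the general JUnit shape (not captured from a real run).
const sampleXml = `
<testsuites tests="2" failures="1">
  <testsuite name="login.spec.ts" tests="2" failures="1">
    <testcase name="logs in" classname="login.spec.ts" time="1.2"/>
    <testcase name="rejects bad password" classname="login.spec.ts" time="0.8">
      <failure message="expected 401">stack trace here</failure>
    </testcase>
  </testsuite>
</testsuites>`;

const failedNames = parseJUnit(sampleXml).filter((c) => c.failed).map((c) => c.name);
console.log(failedNames);
```

Note what is absent from the parsed records: there is no screenshot, video, or trace path, which is the gap the HTML report fills.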

Blob reporter (merging sharded parallel runs)

The blob reporter is the one most guides mention in a single sentence and then move past, which is a shame because it solves a real problem. When you shard a Playwright suite across multiple CI machines, each shard runs a slice of the tests and produces its own report. The blob reporter writes an intermediate .zip per shard that can later be merged into one complete report covering the whole suite.

import { defineConfig } from "@playwright/test";

export default defineConfig({
  reporter: process.env.CI ? "blob" : [["html", { open: "never" }]],
});

After all shards finish, merge their blobs into a single HTML (or any other) report:

npx playwright merge-reports --reporter=html ./all-blob-reports

The full shard, merge, host, share path is walked through in the “From CI run to shareable URL” section below. It is the part of Playwright reporting that breaks teams the first time they shard, and the part no current guide walks end to end.

Using multiple reporters at once

The reporter option takes an array, so you can run several reporters in a single pass. There is no extra plugin to install, unlike Cypress. A common production setup is HTML for humans, JUnit for CI annotations, and JSON for any custom processing:

import { defineConfig } from "@playwright/test";

export default defineConfig({
  reporter: [
    ["list"],
    ["html", { open: "never" }],
    ["junit", { outputFile: "results/junit.xml" }],
    ["json", { outputFile: "results/results.json" }],
  ],
});

Every test runs once; each reporter sees the same results and writes its own format. The same works from the CLI with a comma-separated list:

npx playwright test --reporter=list,html,junit

The CLI form can’t set per-reporter options like outputFile, so for anything beyond the defaults the config-file array is the better home.

Custom Playwright reporters

Playwright exposes its reporter API directly, so you can write a class that receives test events and does whatever you want with them. Most teams never need this. It is worth knowing where the line is.

When to build a custom reporter vs use third-party

Build a custom reporter when you have an integration that doesn’t exist yet: posting results to an internal dashboard, formatting output for a chat tool with a specific schema, or emitting metrics to a system that has no off-the-shelf reporter. Use a third-party or hosted reporter when the thing you want is “see results across runs,” “share a report,” or “track flaky tests,” because those are solved problems and a custom reporter that reimplements them is maintenance you will regret.

A minimal custom reporter example

A reporter is a class implementing the Reporter interface. This one prints a one-line summary, which is roughly the smallest useful example:

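import type { Reporter, TestCase, TestResult } from "@playwright/test/reporter";

class SummaryReporter implements Reporter {
  private passed = 0;
  private failed = 0;

  // Called once per attempt, so a retried test is counted once per try.
  onTestEnd(test: TestCase, result: TestResult) {
    if (result.status === "passed") this.passed++;
    // Timeouts are failures too; "skipped" and "interrupted" are left uncounted.
    if (result.status === "failed" || result.status === "timedOut") this.failed++;
  }

  // Runs after every test has finished.
  onEnd() {
    console.log(`Done: ${this.passed} passed, ${this.failed} failed`);
  }
}

export default SummaryReporter;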

Point the config at the file path:

import { defineConfig } from "@playwright/test";

export default defineConfig({
  reporter: "./my-reporter.ts",
});

The full interface has hooks for run start, step begin/end, stdout/stderr capture, and errors. The Playwright reporter API docs list them all.
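For the integration case described earlier (posting results to an internal dashboard), a reporter might look roughly like the sketch below. To stay runnable without Playwright installed it declares small local shapes in place of the real TestCase, TestResult, and FullResult types, and it takes a hypothetical send option instead of making an HTTP call. DashboardReporter, RunPayload, and send are all assumptions for illustration.

```typescript
// Local stand-ins for the parts of Playwright's reporter types used here.
interface CaseLike { title: string; titlePath(): string[] }
interface ResultLike { status: string; duration: number; retry: number; error?: { message?: string } }

interface RunPayload {
  status: string;
  failures: { test: string; retry: number; message: string }[];
}
type Sender = (payload: RunPayload) => void | Promise<void>;

class DashboardReporter {
  private failures: RunPayload["failures"] = [];
  private send: Sender;

  // Hypothetical option: a function that delivers the payload.
  // The default just prints it; a real reporter would likely POST it somewhere.
  constructor(options: { send?: Sender } = {}) {
    this.send = options.send ?? ((payload) => console.log(JSON.stringify(payload)));
  }

  // Record every failed or timed-out attempt with its full title path.
  onTestEnd(test: CaseLike, result: ResultLike): void {
    if (result.status === "failed" || result.status === "timedOut") {
      this.failures.push({
        test: test.titlePath().filter(Boolean).join(" > "),
        retry: result.retry,
        message: result.error?.message ?? "",
      });
    }
  }

  // Deliver one payload when the run finishes; the runner waits on a returned promise.
  onEnd(result: { status: string }): void | Promise<void> {
    return this.send({ status: result.status, failures: this.failures });
  }
}

export default DashboardReporter;
```

The design choice worth noting is that the reporter only accumulates in onTestEnd and sends once in onEnd, so a slow or failing dashboard cannot stall individual tests.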

Third-party and hosted reporters

Once you want results that outlive a single run, you leave the built-ins behind. The options split into “richer report generators” and “places to put reports so people can see them.”

Allure for Playwright (what it adds, what it costs to run)

Allure generates a richer report than Playwright’s built-in HTML: test history graphs, severity tagging, step-level traces, and behavior-driven groupings. The trade-off is that Allure is a two-step process. The Playwright run writes raw result files, then the Allure CLI builds them into a report, and the history graphs only persist if you carry the previous results directory forward between runs yourself.

npm install --save-dev allure-playwright allure-commandline
npx playwright test --reporter=line,allure-playwright
npx allure generate ./allure-results --clean -o ./allure-report

Pick Allure when you want step-level traces and you are willing to run and host the report-generation pipeline. The history feature still leaves you owning the storage and the carry-forward logic.

Gaffer: persistent HTML report hosting without configuration

Gaffer takes the report files Playwright already produces (HTML, JUnit XML, or JSON) and gives them a durable home: upload from CI, get a URL that doesn’t expire when the runner dies, and view pass-rate trends and flaky-test detection across every run without writing any analytics code. There is no custom reporter to install. You keep your existing playwright.config.ts and add one upload step. The sharing and history sections below walk through it.

Testmo, ReportPortal (when you also need test management)

If your team needs test-case management on top of reporting (manual test plans, requirement traceability, QA sign-off workflows), Testmo and ReportPortal both ingest Playwright results and add that layer. They are heavier tools aimed at QA orgs with a dedicated test manager. For a solo developer or a small team that just wants to run Playwright in CI and see the results, that machinery is overhead you will not use.

Setting up Playwright reporting in CI/CD

The reporter config is the same everywhere. What changes per provider is how you run the suite and what you do with the files afterward.

GitHub Actions: run tests and expose a report URL on every PR

A workflow that runs Playwright, produces HTML and JUnit reports, and uploads them to Gaffer on every push and PR regardless of pass or fail:

name: Playwright

on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm

      - name: Install dependencies
        run: npm ci

      - name: Install Playwright browsers
        run: npx playwright install --with-deps

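      - name: Run Playwright tests
        # The CLI flag cannot set outputFile, so name the JUnit file through the environment.
        env:
          PLAYWRIGHT_JUNIT_OUTPUT_NAME: results/junit.xml
        run: npx playwright test --reporter=line,html,junit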

      - name: Upload results to Gaffer
        if: always()
        uses: gaffer-sh/gaffer-uploader@v2
        with:
          gaffer_upload_token: ${{ secrets.GAFFER_UPLOAD_TOKEN }}
          report_path: ./results
          commit_sha: ${{ github.sha }}
          branch: ${{ github.ref_name }}
          test_framework: playwright

The if: always() matters. The runs you most want to keep are the failing ones, so gating the upload on success throws away exactly the data you need. The same upload works from the CLI if you’d rather not use the Action:

gaffer upload ./results \
  --token $GAFFER_UPLOAD_TOKEN \
  --commit-sha $GITHUB_SHA \
  --branch $GITHUB_REF_NAME \
  --test-framework playwright

The CLI flags map one-to-one to the Action inputs: --token accepts a gfr_… project token (or falls back to GAFFER_PROJECT_TOKEN, then GAFFER_UPLOAD_TOKEN, then GAFFER_TOKEN), --commit-sha and --branch get recorded as tags so you can filter runs by branch later, and --test-framework labels the run.

Reporters on CI: suppressing noise, capturing failures

Two adjustments make CI output useful instead of overwhelming. Use line or dot instead of list so the console stays readable when hundreds of tests run, and configure retries plus traces so the report captures what actually went wrong:

import { defineConfig } from "@playwright/test";

export default defineConfig({
  retries: process.env.CI ? 2 : 0,
  use: {
    trace: "on-first-retry",
    screenshot: "only-on-failure",
  },
  reporter: process.env.CI
    ? [["line"], ["html", { open: "never" }], ["junit", { outputFile: "results/junit.xml" }]]
    : [["list"], ["html", { open: "never" }]],
});

trace: "on-first-retry" records a trace only on the first retry of a failed test, which keeps artifact size down while still capturing any failure that reproduces on retry. A test that passes on retry is also exactly the signal you want flagged as flaky later.
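The retry logic behind that flaky signal can be sketched in a few lines. This is a simplified illustration of how a sequence of attempts maps to one outcome, not Playwright's own implementation; classify and the two type names are hypothetical.

```typescript
type AttemptStatus = "passed" | "failed" | "timedOut" | "skipped" | "interrupted";
type Outcome = "passed" | "flaky" | "failed" | "skipped";

// Collapse the attempts of one test (first try plus retries) into one outcome.
function classify(attempts: AttemptStatus[]): Outcome {
  if (attempts.length === 0 || attempts.every((s) => s === "skipped")) return "skipped";
  const last = attempts[attempts.length - 1];
  // If the final attempt did not pass, the retries were exhausted.
  if (last !== "passed") return "failed";
  // Passed in the end: flaky when any earlier attempt went wrong.
  const hadTrouble = attempts.slice(0, -1).some((s) => s !== "passed" && s !== "skipped");
  return hadTrouble ? "flaky" : "passed";
}

console.log(classify(["passed"]));
console.log(classify(["failed", "passed"]));
console.log(classify(["failed", "timedOut", "failed"]));
```

With retries set to 2, a test gets at most three attempts, so the longest input this function sees per test in that setup has three entries.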

This is where most guides stop and most real setups break.

The problem: the HTML report needs the machine that made it

The Playwright HTML report is not a single portable file. It is a directory whose trace viewer has to be served over HTTP, which is why show-report spins up a local server. You cannot email it, and a teammate cannot meaningfully open it without recreating that server. On CI, the report is a zipped artifact buried behind the Actions tab: a teammate has to find the workflow run, click into the job, scroll to artifacts, download a zip, unzip it, and run a local server to view it. To compare two runs, do all of that twice. That is the friction that keeps Playwright reports from ever being looked at.

Hosting the HTML report: static file hosting vs dedicated tools

Two ways to give the report a URL.

You can dump the playwright-report/ directory into static hosting (an S3 bucket, GitHub Pages, Netlify) and get a permalink. That works for a single run, but it is plumbing you maintain, and it gives you a static file with a URL and nothing else: no cross-run pass rates, no flaky detection, no way to ask “which test failed most this month.” Each run overwrites the last unless you build path-versioning yourself.
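For a sense of what building path-versioning yourself involves, the sketch below derives a per-run folder name from the commit SHA and run attempt that GitHub Actions exposes, falling back to a timestamp locally. reportFolder and the reports/ prefix are hypothetical choices for this example; the result would be passed to the HTML reporter's outputFolder option.

```typescript
// Build a folder name that is unique per CI run so reports stop overwriting each other.
function reportFolder(env: Record<string, string | undefined>, now: Date): string {
  const sha = env.GITHUB_SHA?.slice(0, 12);
  const attempt = env.GITHUB_RUN_ATTEMPT ?? "1";
  if (sha) return `reports/${sha}-attempt${attempt}`;
  // Outside CI: fall back to a filesystem-safe UTC timestamp.
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  return `reports/local-${stamp}`;
}

// Assumed usage inside playwright.config.ts:
//   reporter: [["html", { outputFolder: reportFolder(process.env, new Date()), open: "never" }]]

console.log(reportFolder({ GITHUB_SHA: "0123456789abcdef0123", GITHUB_RUN_ATTEMPT: "2" }, new Date()));
console.log(reportFolder({}, new Date(Date.UTC(2024, 0, 2, 3, 4, 5))));
```

This only solves the overwrite. Syncing the folders to a bucket, pruning old ones, and indexing them so someone can find a given run are all still yours to build.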

A dedicated report-hosting layer closes that gap. Gaffer’s Playwright report sharing solution ingests the report from CI, hands back a persistent URL, and adds the cross-run view that static hosting can’t: pass-rate trends, flaky-test detection, and failure search across every run, without you writing or maintaining any of it.

From CI run to shareable URL in one step (Gaffer walkthrough)

The sharded case is where this pays off most, because it is the case the built-in tooling handles worst. Here is the full shard, merge, host, share path.

Shard the suite across four machines and have each one write a blob:

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - run: npx playwright install --with-deps
      - name: Run shard
        run: npx playwright test --shard=${{ matrix.shard }}/4 --reporter=blob
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: blob-${{ matrix.shard }}
          path: blob-report/

Then a single merge job collects the blobs, merges them into one HTML report, and uploads that one report to Gaffer:

  merge:
    needs: test
    if: always()
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - uses: actions/download-artifact@v4
        with:
          path: all-blobs
          pattern: blob-*
          merge-multiple: true
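      - name: Merge into one report
        # Without an output file name the JUnit XML is printed to the job log.
        env:
          PLAYWRIGHT_JUNIT_OUTPUT_NAME: results/junit.xml
        run: npx playwright merge-reports --reporter=html,junit ./all-blobs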
      - name: Upload merged report to Gaffer
        uses: gaffer-sh/gaffer-uploader@v2
        with:
          gaffer_upload_token: ${{ secrets.GAFFER_UPLOAD_TOKEN }}
          report_path: ./playwright-report
          commit_sha: ${{ github.sha }}
          branch: ${{ github.ref_name }}
          test_framework: playwright

Four shards, one merge, one upload, one persistent URL covering the entire suite. That is the workflow you can’t get from the built-in HTML reporter, which only ever knows about the slice of tests that ran on its own machine.
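One failure mode worth guarding against is a merge that runs while a shard's blob is missing, which quietly produces a report that looks complete but is not. A minimal sketch of that check, assuming you have the list of artifact names available and that they follow the blob-N naming used in the workflow above (missingShards is a hypothetical helper):

```typescript
// Return the shard numbers (1..total) that have no matching "blob-N" artifact.
function missingShards(artifactNames: string[], total: number): number[] {
  const seen: Record<number, boolean> = {};
  for (const name of artifactNames) {
    const m = /^blob-(\d+)$/.exec(name);
    if (m) seen[Number(m[1])] = true;
  }
  const missing: number[] = [];
  for (let shard = 1; shard <= total; shard++) {
    if (!seen[shard]) missing.push(shard);
  }
  return missing;
}

const gaps = missingShards(["blob-1", "blob-2", "blob-4"], 4);
if (gaps.length > 0) {
  console.log(`Merged report would be partial; missing shards: ${gaps.join(", ")}`);
}
```

Whether a gap should fail the merge job or only annotate the report is a team decision; the point is that the merged report should say when it is partial.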

Keeping report history across runs

A single Playwright report answers “did this build pass?” Fifty reports answer “has this test been flaking for three weeks while everyone re-ran the job and shrugged?”

Why overwriting hurts: debugging yesterday’s failure

The default HTML reporter overwrites playwright-report/ every run. When a test that passed on main yesterday fails on a PR today, the report that would let you compare the two is already gone. You are left reading two log dumps side by side instead of two structured reports. Keeping each run as its own record, addressable by commit and branch, turns “did this fail before?” from an archaeology problem into a single lookup.

We run this on Gaffer’s own Playwright suite. Each CI run uploads its report, and the dashboard keeps every run within the plan’s analytics window rather than overwriting. That is what makes the flaky view possible: a single overwritten HTML report can only ever show you the latest run, so it can’t tell you a test flipped pass-fail-pass across the last several builds. The cross-run record can, and that flip history is the signal that distinguishes a genuinely broken test from a flaky one. (Retention follows the plan: 7 days on Free, 30 on Pro, 90 on Team, with a longer analytics window on top.)
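The flip signal itself is simple to express once per-run results are kept. The sketch below counts status changes across a test's run history, oldest first. It is an illustration of the idea rather than Gaffer's detection logic, and the two-flip threshold is an arbitrary assumption.

```typescript
type RunStatus = "passed" | "failed";

// Count how many times the status changes between consecutive runs.
function countFlips(history: RunStatus[]): number {
  let flips = 0;
  for (let i = 1; i < history.length; i++) {
    if (history[i] !== history[i - 1]) flips++;
  }
  return flips;
}

// Assumed heuristic: two or more flips suggests flakiness rather than a clean break.
function looksFlaky(history: RunStatus[], minFlips = 2): boolean {
  return countFlips(history) >= minFlips;
}

console.log(looksFlaky(["passed", "failed", "passed"]));
console.log(looksFlaky(["passed", "passed", "failed", "failed"]));
console.log(looksFlaky(["failed", "failed", "failed"]));
```

A single flip from passing to failing that then stays failed reads as a regression; repeated flips read as flakiness. Neither is visible from one report.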

Retention and searchable history: what to look for

A report-history layer is worth having only if you can actually query it. The questions that matter: which test failed most often this month, which test got slower since the last release, which specs fail only on PR branches and never on main. The data to answer all three is sitting in the JUnit XML you already generate; it just needs to be treated as a time series instead of a throwaway artifact. Gaffer’s test report hosting keeps that history and makes it searchable, so a postmortem that references a run from two months ago can still open the report instead of finding a deleted CI artifact.
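Two of those questions can be sketched against a plain list of run records to show what treating results as a time series means in practice. The RunRecord shape, the function names, and the sample data are all hypothetical; a hosted tool would store and index this differently.

```typescript
interface RunRecord { branch: string; finishedAt: string; failedTests: string[] }

// "Which test failed most often since a given date?"
function mostFailed(runs: RunRecord[], since: Date, top = 3): [string, number][] {
  const counts: Record<string, number> = {};
  for (const run of runs) {
    if (new Date(run.finishedAt).getTime() < since.getTime()) continue;
    for (const t of run.failedTests) counts[t] = (counts[t] ?? 0) + 1;
  }
  return Object.keys(counts)
    .map((name): [string, number] => [name, counts[name]])
    // Highest count first; ties broken by name so the order is stable.
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, top);
}

// "Which tests fail only on non-main branches and never on main?"
function failsOnlyOffMain(runs: RunRecord[]): string[] {
  const onMain: Record<string, boolean> = {};
  const offMain: Record<string, boolean> = {};
  for (const run of runs) {
    for (const t of run.failedTests) (run.branch === "main" ? onMain : offMain)[t] = true;
  }
  return Object.keys(offMain).filter((t) => !onMain[t]).sort();
}

// Made-up history for illustration.
const runs: RunRecord[] = [
  { branch: "main", finishedAt: "2024-03-01T10:00:00Z", failedTests: ["login"] },
  { branch: "main", finishedAt: "2024-05-02T10:00:00Z", failedTests: ["checkout"] },
  { branch: "feature/x", finishedAt: "2024-05-03T10:00:00Z", failedTests: ["checkout", "search"] },
  { branch: "feature/y", finishedAt: "2024-05-10T10:00:00Z", failedTests: ["search"] },
];

console.log(mostFailed(runs, new Date("2024-05-01T00:00:00Z")));
console.log(failsOnlyOffMain(runs));
```

Neither query is hard once the records exist. The hard part is keeping every run's results around long enough to ask.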

Common mistakes in Playwright reporting

Treating CI artifacts as a viewing surface. A zipped report behind the Actions tab is storage, not a place anyone will actually look. If viewing the report takes seven clicks and a local server, it won’t get viewed.
Gating the upload on test success. if: always() is not optional. The failing runs are the ones you most need to keep.
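Forgetting open: "never" on CI. Setting it explicitly guarantees the HTML reporter never tries to serve and open the report after a failed run, which can hold a job open on any runner that does not set the CI environment variable.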
Sharding without merging. Four shards produce four partial reports. Without merge-reports, no single report shows the whole suite, and the partial reports mislead anyone who opens just one.
Relying on a single overwritten report for trends. One report is a snapshot. Flaky detection and pass-rate trends need the history that overwriting throws away.

Playwright reporting best practices at scale

The setup that holds up as a suite grows from fifty tests to thousands:

One config, multiple reporters. line for the console, html for debugging a run, junit for analytics. One run, three outputs.
Blob reporter for sharded runs, merged once. Shard for speed, merge for a complete report, upload the merged result rather than the fragments.
Traces on retry, screenshots on failure. Enough detail to debug, small enough to not bloat every artifact.
Host the report, don’t just store it. A persistent URL beats a zipped artifact for every use that isn’t “the person who ran the tests, debugging within the hour.”
Keep the history, query the trend. The return on all the reporter config is knowing what changed across the last fifty runs, which only exists if you stop overwriting.

The reporter configuration is fifteen minutes of work. Making those reports useful across hundreds of CI runs is the part that pays back. If your Playwright tests are written or maintained by an AI agent, the same stability question applies even more sharply: see Playwright Test Agents for how to tell whether agent-written tests actually stay green over time.
