June 10, 2026
What to Log When Browser Tests Fail: Video, Console, Network, and DOM State
A practical checklist for browser test failure logs, including video, console errors, network traces, and DOM snapshots, so teams can separate real defects from flaky noise.
When a browser test fails in CI, the hardest part is usually not reproducing the failure but understanding what actually happened. A red build might come from a real product defect, a timing issue, a misconfigured test environment, a third-party dependency, or a flaky assertion that only fails under specific rendering or network conditions. If the only artifact you keep is a stack trace, you are forcing engineers to guess.
Good browser test failure logs are not about collecting everything. They are about keeping the minimum evidence that lets you answer a few practical questions quickly:
- Did the browser render the expected UI state?
- Did the page throw a JavaScript error?
- Did a network request fail, time out, or return unexpected data?
- Was the DOM different from what the test expected at the moment of failure?
- Was this an infrastructure issue, an app issue, or a test issue?
This checklist focuses on the artifacts that matter most in CI, especially for teams running test automation at scale across Chrome, Firefox, WebKit, or remote browser grids. The goal is not to turn every test into a movie archive. It is to preserve enough context to debug failures without drowning your pipeline in noise.
The short version: what to keep on every meaningful failure
For most browser automation stacks, the most useful browser test failure logs are:
- A short failure summary with test name, browser, version, and environment.
- A screenshot at the failure point.
- A video or screen recording for flows with multiple steps or asynchronous UI behavior.
- Console logs, including warnings and errors.
- Network traces or request logs for API-driven UI flows.
- A DOM snapshot or HTML excerpt from the relevant page state.
- Metadata such as URL, viewport, timestamps, and retry count.
- Any browser, driver, or grid logs that explain execution problems.
If you only save one thing beyond the stack trace, make it the state of the page at the moment the assertion failed.
The rest of this article explains what each artifact tells you, what it does not tell you, and how to avoid generating so much output that nobody wants to inspect it.
1) Start with failure metadata, not just artifacts
Before you attach a video or DOM dump, capture the context that gives the evidence meaning. A screenshot from a checkout page is not very useful if you do not know which browser, locale, viewport, or test retry produced it.
Capture this metadata for every failure
- Test name and suite name
- Commit SHA or build number
- Branch and pull request ID
- Browser name and version
- Operating system and runtime image
- Viewport size and device emulation settings
- URL at failure time
- Retry number, if retries are enabled
- Test duration and failure timestamp
- Parallel worker ID, if applicable
- Grid node ID or container ID, if using Selenium Grid or a remote runner
This metadata helps you correlate browser test failure logs with CI logs, infrastructure logs, and application deploys. It also helps distinguish deterministic regressions from test flakiness. A failure that happens only on the third retry, or only in one browser, points you toward a different root cause than a failure that happens consistently across all runs.
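One way to make this concrete is a small typed record that every failure handler fills in. The sketch below is illustrative only: the field names, the `buildFailureMetadata` helper, and the `COMMIT_SHA` and `WORKER_ID` variable names are assumptions, not part of any framework, so substitute whatever your runner and CI system actually expose.

```typescript
// Hypothetical shape for per-failure metadata; field names are illustrative.
interface FailureMetadata {
  testName: string;
  suiteName: string;
  commitSha: string;
  browserName: string;
  browserVersion: string;
  viewport: { width: number; height: number };
  url: string;
  retry: number;
  workerId: string;
  failedAt: string; // ISO 8601 timestamp
}

// Values supplied by the test runner rather than the CI environment.
type RunContext = Omit<FailureMetadata, "commitSha" | "workerId" | "failedAt">;

// Combines runner context with environment variables.
// COMMIT_SHA and WORKER_ID are placeholder names, not a CI standard.
function buildFailureMetadata(
  ctx: RunContext,
  env: Record<string, string | undefined>,
  now: Date = new Date()
): FailureMetadata {
  return {
    ...ctx,
    commitSha: env["COMMIT_SHA"] ?? "unknown",
    workerId: env["WORKER_ID"] ?? "unknown",
    failedAt: now.toISOString(),
  };
}
```

Passing the environment in as a parameter, rather than reading it globally, keeps the helper easy to unit test and makes missing values show up as an explicit "unknown" instead of an empty field.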
Why this matters in practice
A browser test that fails on WebKit but not Chromium could be a browser compatibility issue. A test that fails only on a 1366x768 viewport might be suffering from a responsive layout break. A test that fails only in a container image with a newer font stack could indicate a visual or text-measurement sensitivity. Without metadata, those signals get lost.
2) Keep a screenshot, but do not stop there
Screenshots are still the fastest way to understand many UI failures. They answer the simplest question: what did the user see when the test stopped?
A screenshot is most useful when it includes
- The full viewport or a full-page capture when the page is short enough
- Browser chrome, if your setup allows it and it helps identify the environment
- The exact state at the assertion point, not just the end of the test
- The visible error message, toast, modal, or loading indicator
Screenshots are weak at explaining
- Why the UI was in that state
- Whether a request failed before the screenshot was captured
- Whether the page was still animating, rerendering, or waiting on a background job
- Whether the issue was only visible in the DOM and not the pixels
For that reason, treat screenshots as the first clue, not the final answer. In React, Angular, Vue, and similar single-page apps, the visual state can look stable while the DOM is still changing or a pending request is about to alter the page again.
Practical rule
Capture a screenshot automatically on failure, and, for flaky-prone flows, capture one after each critical step. This makes it much easier to identify the moment when the UI diverged from the expected path.
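For the per-step captures, a small recorder can number the files so the sequence is obvious in the artifact folder. This is a minimal sketch, assuming a Playwright-style `page.screenshot({ path })` call; the `ScreenshotCapable` type, the `createStepRecorder` helper, and the file layout are hypothetical.

```typescript
// Minimal structural type covering the single page method used here.
// A Playwright Page satisfies it; so does a test double.
interface ScreenshotCapable {
  screenshot(options: { path: string }): Promise<unknown>;
}

// Returns a recorder that saves one numbered screenshot per critical step.
function createStepRecorder(page: ScreenshotCapable, dir: string) {
  let step = 0;
  const saved: string[] = [];
  return {
    async capture(label: string): Promise<string> {
      step += 1;
      // Turn the label into a filesystem-safe slug.
      const slug = label
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, "-")
        .replace(/^-+|-+$/g, "");
      const index = step < 10 ? `0${step}` : String(step);
      const path = `${dir}/${index}-${slug}.png`;
      await page.screenshot({ path });
      saved.push(path);
      return path;
    },
    // Paths captured so far, in step order.
    paths: (): string[] => saved.slice(),
  };
}
```

In a test this might be called as `await recorder.capture("Add to cart")` after each step that has historically been flaky, rather than after every action.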
3) Add video logs for multi-step flows and timing-sensitive bugs
Video logs are especially valuable when a test failure is caused by transitions, overlays, slow rendering, or user interaction timing. A single screenshot can show the final state, but a video shows how the state evolved.
Video is worth keeping when the test involves
- Login and MFA flows
- Drag and drop interactions
- Infinite scroll or lazy loading
- Modals and popovers that depend on animations
- File uploads and download flows
- Navigation across multiple pages or tabs
- Complex form validation with asynchronous save behavior
What video helps you see
- The page taking too long to stabilize
- A spinner that never disappears
- A dialog opening and closing too quickly
- An element shifting position during click targeting
- The test clicking before the UI is ready
- Unexpected browser prompts, permission dialogs, or popups
When video is less useful
Video can be expensive to store and can become hard to search. If every failed test uploads a long video, teams often stop looking at them. For stable, low-risk tests, a screenshot plus logs may be enough. For flows with known timing sensitivity, video is often the best artifact you have.
Keep it short and purposeful
If your runner supports it, start recording at test start and stop at failure, or record only failing tests in full. Long recordings of passing tests rarely help diagnose a specific issue.
4) Capture console logs, including warnings
Console output is a direct line to front-end runtime problems. Many browser failures are not purely test failures; they are app errors that the test exposed.
Save these console events
- error messages
- warning messages that indicate broken assumptions or deprecations
- Unhandled promise rejections
- Stack traces from client-side exceptions
- CSP violations, if relevant
- Failed resource loads that appear in the console
Console errors often explain why the UI never reached the expected state. A button might not render because a JavaScript exception interrupted the component tree. A validation message might never appear because an API response handler crashed. A flaky failure might be tied to a warning that becomes an error only under one browser engine.
Filter carefully
Not every console warning deserves a red build. Some apps emit noisy messages from analytics scripts, browser extensions, or third-party widgets. If you treat every warning as a failure, teams will quickly ignore the signal.
A better approach is:
- Always store all console messages for failed tests
- Optionally fail the test only on a curated list of severe message patterns
- Suppress known benign noise by source, not by broad message patterns
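The three rules above can be sketched as a small triage function. Everything named here is an assumption for illustration: the `ConsoleEntry` shape, the benign source URL, and the severe patterns are placeholders to tune for your own app. With Playwright, the fields could be filled from `msg.type()`, `msg.text()`, and `msg.location().url`.

```typescript
interface ConsoleEntry {
  type: string;      // e.g. "error", "warning", "log"
  text: string;
  sourceUrl: string; // URL of the script that emitted the message
}

// Hypothetical policy values; replace with your own.
const BENIGN_SOURCES = ["https://analytics.example.com/"];
const SEVERE_PATTERNS = [
  /Uncaught/i,
  /Unhandled promise rejection/i,
  /Content Security Policy/i,
];

function triageConsole(entries: ConsoleEntry[]) {
  // Rule 1: every message is stored, whatever its severity.
  const stored = entries.slice();

  // Rules 2 and 3: only errors matching a severe pattern can fail the
  // test, and known noisy sources are excluded by URL, not by message text.
  const fatal = entries.filter(
    entry =>
      entry.type === "error" &&
      !BENIGN_SOURCES.some(src => entry.sourceUrl.indexOf(src) === 0) &&
      SEVERE_PATTERNS.some(pattern => pattern.test(entry.text))
  );

  return { stored, fatal, shouldFail: fatal.length > 0 };
}
```

Keeping storage and failure policy separate means a noisy third-party script can never hide evidence; it only stops being a reason to turn the build red.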
Playwright example: collecting console events
```typescript
import { test } from '@playwright/test';

test('records console output', async ({ page }) => {
  const messages: string[] = [];
  page.on('console', msg => messages.push(`${msg.type()}: ${msg.text()}`));

  await page.goto('https://example.com');
  // assertions here

  console.log(messages.join('\n'));
});
```
If you are using Selenium, you will often need browser-specific log capabilities or driver APIs. The exact setup varies by browser and driver version, so the important part is the principle: keep the logs with the test result instead of hoping someone opens the browser console later.
5) Record network traces for data-dependent failures
A browser test often looks like a UI check, but many failures originate in the network layer. If a page renders incorrectly because a backend request was slow, returned a 500, or produced a shape change in the JSON response, the DOM alone may not reveal enough.
Keep network information such as
- Request method, URL, and status code
- Request timing and duration
- Failed requests and retries
- Redirect chains
- Response payload summaries for critical API calls
- Cache or service worker involvement, when relevant
- Throttling or offline simulation settings
Network traces are especially useful when tests fail intermittently because the app has race conditions or depends on non-deterministic backend data. They also help when a UI failure is actually caused by authentication expiration, CORS misconfiguration, CDN issues, or a third-party API outage.
What to inspect first
If a UI assertion fails, check whether the app made all expected requests. A missing API call can be more important than a wrong DOM assertion. Similarly, a 200 response may still be a failure if the payload shape changed and the client code silently rejected it.
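Checking for missing calls is easy to automate once requests are recorded. The sketch below assumes you already collect requests into a simple array during the test; the `RecordedRequest` and `ExpectedRequest` shapes and the `findMissingRequests` helper are hypothetical names, not a framework API.

```typescript
interface RecordedRequest {
  method: string;
  url: string;
}

interface ExpectedRequest {
  method: string;
  urlPattern: RegExp;
}

// Returns the expected requests that never appeared in the recording.
function findMissingRequests(
  recorded: RecordedRequest[],
  expected: ExpectedRequest[]
): ExpectedRequest[] {
  return expected.filter(
    exp =>
      !recorded.some(
        req => req.method === exp.method && exp.urlPattern.test(req.url)
      )
  );
}
```

Attaching the list of missing requests to the failure summary answers "did the app even ask for the data?" before anyone opens the DOM snapshot.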
Example: logging failed requests in Playwright
```typescript
page.on('requestfailed', request => {
  console.log('FAILED', request.method(), request.url(), request.failure()?.errorText);
});

page.on('response', async response => {
  if (response.status() >= 400) {
    console.log('HTTP', response.status(), response.url());
  }
});
```
Tradeoff
Full network HAR files can be helpful, but they can also be large and noisy. If your app makes dozens of static asset requests, you may only need to retain API calls, failures, and requests related to the current route.
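A filter like the following is one way to trim a request log before upload. It is a sketch under stated assumptions: the `NetworkEntry` shape is hypothetical, and the API prefix is a placeholder for however your app distinguishes API calls from static assets.

```typescript
interface NetworkEntry {
  method: string;
  url: string;
  status: number | null; // null when no response was received
}

// Keeps failures and anything under the API prefix;
// drops successful static asset requests.
function retainUsefulEntries(
  entries: NetworkEntry[],
  apiPrefix: string
): NetworkEntry[] {
  return entries.filter(entry => {
    const failed = entry.status === null || entry.status >= 400;
    const isApiCall = entry.url.indexOf(apiPrefix) === 0;
    return failed || isApiCall;
  });
}
```

Failed asset requests are kept on purpose, since a missing script or stylesheet can explain a broken page just as well as a failed API call.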
6) Save a DOM snapshot at the moment of failure
The DOM snapshot is often the most underused and most valuable artifact. A screenshot shows pixels, but a DOM snapshot shows structure, attributes, text content, hidden elements, and sometimes state that is not visually obvious.
Good DOM evidence includes
- The container element around the failed assertion
- The HTML of the component or region under test
- Key attributes such as aria-*, data-*, disabled, checked, value, and class
- Text content of the target element and nearby context
- The computed presence or absence of expected nodes
Why DOM snapshots matter
Many flaky failures are caused by timing mismatches between the test and the UI. The element might exist but be hidden. The element might be visible but detached and reinserted. The text might be correct in the DOM but not yet reflected in the rendered screenshot. The snapshot helps explain which of those states existed at failure time.
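A short helper can record which of those states applied when the assertion failed. This is a minimal sketch: `LocatorLike` is a hypothetical structural type that mirrors the `count()`, `first()`, and `isVisible()` methods of a Playwright locator, and the three state names are illustrative.

```typescript
// Structural subset of a locator; a Playwright Locator satisfies it.
interface LocatorLike {
  count(): Promise<number>;
  first(): LocatorLike;
  isVisible(): Promise<boolean>;
}

type ElementState = "missing" | "hidden" | "visible";

// Distinguishes "not in the DOM" from "in the DOM but not visible".
async function describeElementState(
  locator: LocatorLike
): Promise<{ matches: number; state: ElementState }> {
  const matches = await locator.count();
  if (matches === 0) {
    return { matches, state: "missing" };
  }
  // Only the first match is checked, so extra matches do not cause an error.
  const visible = await locator.first().isVisible();
  return { matches, state: visible ? "visible" : "hidden" };
}
```

Logging the match count alongside the state also surfaces a common flake source: a selector that briefly matches two nodes while a component is being replaced.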
Use the smallest useful fragment
Dumping the entire page HTML is often too much. Start with the subtree around the failing locator or the component root. That gives enough context without overwhelming the artifact store.
Example: capturing a targeted DOM snapshot in Playwright
```typescript
const el = page.locator('[data-testid="cart-summary"]');
console.log(await el.evaluate(node => node.outerHTML));
```
For Selenium, a similar approach is to retrieve element attributes or outerHTML via JavaScript execution. The key is to snapshot the relevant part of the page before the DOM changes again.
7) Include browser and driver logs for infrastructure issues
Sometimes the app is fine and the test runner is not. Browser test failure logs should include enough execution detail to diagnose infrastructure problems separately from product problems.
Retain these when available
- Browser driver logs
- Selenium Grid node logs
- Container startup and teardown logs
- Browser crash messages
- WebSocket or remote debugging connection errors
- Resource exhaustion signs, such as out-of-memory messages
- Timeouts from the test framework itself
These logs help with issues like browser startup failures, session creation problems, version mismatches, and disconnected nodes. In distributed setups, they can explain failures that never reached the app under test.
If you use continuous integration with parallel workers, this separation becomes even more important. A red job might be due to one bad node, not a regression in the codebase.
8) Decide what to retain based on failure type
Not every failure needs the same evidence. A practical logging strategy uses different artifact sets depending on the failure category.
For assertion failures
Keep:
- Screenshot
- DOM snapshot
- Console logs
- Relevant network requests
- Test metadata
This is the common case when the UI loaded but the expected state did not appear.
For timeouts
Keep:
- Video
- Screenshot
- Network traces
- Console logs
- Step timings
Timeouts usually mean the app was late, stuck, or waiting on a dependency. Video often makes the root cause obvious.
For browser crashes or session failures
Keep:
- Browser driver logs
- Grid or container logs
- Screenshot if one exists
- Last known console and network events
These failures can happen before the test reaches the application. You need runner evidence more than UI evidence.
For flaky intermittent failures
Keep:
- Full artifact bundle for every failure occurrence
- Retry number
- The exact step that failed
- Any preceding warnings, slow requests, or layout changes
For flakes, the failure itself may be less important than the sequence that led to it.
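The four artifact sets above can live in one lookup table so the failure handler does not hard-code them. This is a sketch; the category and artifact names are illustrative labels for the lists in this section, not values from any test framework.

```typescript
type FailureKind = "assertion" | "timeout" | "crash" | "flaky";

type Artifact =
  | "screenshot"
  | "dom"
  | "console"
  | "network"
  | "metadata"
  | "video"
  | "stepTimings"
  | "driverLogs"
  | "gridLogs";

const ALL_ARTIFACTS: Artifact[] = [
  "screenshot", "dom", "console", "network", "metadata",
  "video", "stepTimings", "driverLogs", "gridLogs",
];

// Mirrors the per-category lists in this section.
const ARTIFACTS_BY_KIND: Record<FailureKind, Artifact[]> = {
  assertion: ["screenshot", "dom", "console", "network", "metadata"],
  timeout: ["video", "screenshot", "network", "console", "stepTimings"],
  crash: ["driverLogs", "gridLogs", "screenshot", "console", "network"],
  flaky: ALL_ARTIFACTS, // full bundle for every occurrence
};

function artifactsFor(kind: FailureKind): Artifact[] {
  // Return a copy so callers cannot mutate the shared table.
  return ARTIFACTS_BY_KIND[kind].slice();
}
```

How a failure gets classified into one of these kinds is left out deliberately, since that depends on the error types your runner reports.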
9) Use structured naming so artifacts can be searched later
Artifacts are only useful if engineers can find them. A folder full of screenshot.png and video.mp4 files is not enough.
Good naming conventions include
- Test name
- Browser
- Build number
- Retry count
- Timestamp
- Worker or node ID
Example:
```text
checkout-add-to-cart_chromium_build-1842_retry-1_node-3_dom.html
```
This makes it easier to trace failures across CI runs and compare artifacts across browsers.
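A single helper that builds these names keeps them consistent across screenshots, videos, and DOM dumps. The sketch below assumes the underscore-separated layout shown in the example; the `ArtifactNameParts` type and `artifactFileName` function are hypothetical.

```typescript
interface ArtifactNameParts {
  test: string;    // human-readable test name
  browser: string;
  build: number;
  retry: number;
  node: number;
  kind: string;    // e.g. "dom", "screenshot", "video"
  ext: string;     // file extension without the dot
}

function artifactFileName(parts: ArtifactNameParts): string {
  // Slugify the test name so it is safe in file paths and URLs.
  const slug = parts.test
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return (
    `${slug}_${parts.browser}_build-${parts.build}` +
    `_retry-${parts.retry}_node-${parts.node}_${parts.kind}.${parts.ext}`
  );
}
```

Because hyphens only appear inside a field and underscores only appear between fields, the name can be split back into its parts later when searching an artifact store.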
Store metadata alongside the files
A small JSON or text manifest can make artifact inspection much easier.
```json
{
  "test": "checkout-add-to-cart",
  "browser": "chromium",
  "version": "125",
  "retry": 1,
  "url": "https://app.example.com/cart",
  "status": "failed"
}
```
10) Do not over-log by default
The most common mistake is to turn on every possible artifact for every test run. That creates expensive storage, slow CI, and low signal.
Problems caused by over-logging
- Artifact storage grows quickly
- Uploads slow down the pipeline
- Debugging tools become harder to use
- Engineers stop checking the logs because there is too much noise
- Sensitive data may accidentally be retained longer than intended
A better default
Use a tiered approach:
- Always keep metadata and minimal logs
- Keep screenshots on failure
- Keep video for selected suites or failed retries
- Keep DOM and network evidence for high-value or flaky paths
- Keep driver logs for environment or session failures
This gives you observability without overwhelming the CI system.
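The tiers can be expressed as a small decision function that runs once per test result. This is a sketch under assumptions: the `RunInfo` flags are hypothetical, and how a suite gets marked as a video suite or a high-value path is up to your own configuration.

```typescript
interface RunInfo {
  failed: boolean;
  retry: number;               // 0 for the first attempt
  videoSuite: boolean;         // suite selected for video recording
  highValueOrFlaky: boolean;   // path flagged as high-value or flaky
  environmentFailure: boolean; // session or infrastructure problem
}

function artifactsToKeep(run: RunInfo): string[] {
  // Tier 1: always kept.
  const keep = ["metadata", "minimal-logs"];

  // Tier 2: screenshots on any failure.
  if (run.failed) keep.push("screenshot");

  // Tier 3: video for selected suites or failed retries.
  if (run.videoSuite || (run.failed && run.retry > 0)) keep.push("video");

  // Tier 4: DOM and network evidence for high-value or flaky paths.
  if (run.failed && run.highValueOrFlaky) keep.push("dom", "network");

  // Tier 5: driver logs for environment or session failures.
  if (run.environmentFailure) keep.push("driver-logs");

  return keep;
}
```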
11) Redact secrets and user data before upload
Browser test failure logs can capture real user content, tokens, emails, API responses, and session identifiers. That is useful for debugging, but it is also a security risk.
Redact or avoid storing
- Authorization headers
- Session cookies
- Password fields
- One-time codes
- Personal data from production-like fixtures
- Full request bodies when they include secrets
Safer practices
- Run end-to-end tests against synthetic data
- Mask sensitive fields in logs before uploading
- Avoid full-page screenshots of pages with PII unless access is tightly controlled
- Limit artifact retention windows
If your CI environment tests against data that resembles real customer information, treat artifact retention as part of your security policy, not just a debugging preference.
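Masking can be done with two small functions applied before anything is written to disk. The sketch below is deliberately narrow: the header list and the key pattern are assumptions to extend for your own app, and it does not attempt to detect personal data such as emails inside free text.

```typescript
// Header names are compared case-insensitively.
const SENSITIVE_HEADERS = ["authorization", "cookie", "set-cookie"];

// Hypothetical key pattern; extend it for your own payloads.
const SENSITIVE_KEYS = /password|token|secret|otp/i;

function redactHeaders(headers: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    const sensitive = SENSITIVE_HEADERS.indexOf(name.toLowerCase()) !== -1;
    out[name] = sensitive ? "[REDACTED]" : headers[name];
  }
  return out;
}

// Recursively masks values whose key looks sensitive.
function redactBody(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(item => redactBody(item));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = SENSITIVE_KEYS.test(key) ? "[REDACTED]" : redactBody(source[key]);
    }
    return out;
  }
  return value;
}
```

Both functions return new objects instead of mutating their input, so the test itself can keep using the original request data after the redacted copy is logged.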
12) A practical failure checklist for CI
Use this as a working standard for browser test failure logs.
Always capture
- Test name
- Browser and version
- Environment and build metadata
- Failure message and stack trace
- Screenshot
Capture when the test is UI-heavy or flaky-prone
- Video
- DOM snapshot of the failing region
- Console logs
- Key network requests and failures
Capture when the failure looks environmental
- Browser driver logs
- Grid node logs
- Container logs
- Session creation logs
Capture when the failure is data-driven
- API response summaries
- Request identifiers or correlation IDs
- Timing of critical requests
- Cache or auth state if relevant
13) How to wire this into a Playwright or Selenium workflow
The implementation details differ, but the strategy is the same: collect evidence at the point of failure and attach it to the CI artifact bundle.
Playwright pattern
Playwright makes it relatively straightforward to store traces, screenshots, and videos per test. A common pattern is to enable richer artifacts only on retry or failure.
```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
    trace: 'retain-on-failure',
  },
});
```
The trace file can be especially valuable because it combines actions, snapshots, console events, and network activity in one place.
Selenium pattern
Selenium usually requires a bit more manual plumbing. A practical setup is to:
- Capture screenshots in an after-each hook
- Save browser console logs where supported
- Export page HTML for the failing element or page
- Pull driver or grid logs from the execution environment
The framework matters less than the discipline of preserving evidence close to the failure.
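For the Selenium side, the after-each hook can call a collector like the one below. It is a sketch, not a complete integration: `DriverLike` is a hypothetical structural type covering the `takeScreenshot()`, `getPageSource()`, and `getCurrentUrl()` methods of the JavaScript WebDriver bindings, and writing the results to disk is left to the caller.

```typescript
// Structural subset of a WebDriver instance.
interface DriverLike {
  takeScreenshot(): Promise<string>; // base64-encoded PNG
  getPageSource(): Promise<string>;
  getCurrentUrl(): Promise<string>;
}

interface Evidence {
  screenshotBase64: string | null;
  pageSource: string | null;
  url: string | null;
}

// Runs one step and swallows its error, so a dead session on one
// call does not discard evidence from the others.
async function attempt<T>(fn: () => Promise<T>): Promise<T | null> {
  try {
    return await fn();
  } catch {
    return null;
  }
}

async function collectEvidence(driver: DriverLike): Promise<Evidence> {
  // Sequential on purpose: the commands share one browser session.
  const url = await attempt(() => driver.getCurrentUrl());
  const screenshotBase64 = await attempt(() => driver.takeScreenshot());
  const pageSource = await attempt(() => driver.getPageSource());
  return { screenshotBase64, pageSource, url };
}
```

Returning `null` for a step that failed is itself useful evidence: a run where every field is `null` usually points at a crashed session rather than an application bug.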
14) Interpreting the evidence: what each artifact tells you
A useful mental model is to map each artifact to the question it answers.
- Screenshot: what was visible?
- Video: how did the state evolve?
- Console logs: did the page throw or warn?
- Network traces: did the app receive the right data at the right time?
- DOM snapshot: what was actually in the page structure?
- Driver logs: did the browser infrastructure fail?
When these artifacts agree, debugging becomes much faster. When they disagree, that is often the most interesting clue. For example, a DOM snapshot may show the correct text while the screenshot still shows the loading state, which can point to rendering or timing behavior rather than a missing API response.
15) A simple retention policy that works for many teams
If you need a starting point for CI policy, use this:
- Keep screenshots for every failure for 30 days
- Keep DOM snapshots and console logs for every failure for 14 to 30 days
- Keep videos only for failed tests, retries, and critical user journeys for 7 to 14 days
- Keep network traces for high-value suites or failed API-dependent flows for 7 to 14 days
- Keep driver and grid logs for environment failures for 7 days
Adjust retention based on compliance needs, artifact size, and how often the same failure recurs. The right answer is the one your team will actually inspect.
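A policy like this is easiest to enforce when it lives in code next to the cleanup job. The sketch below assumes the upper bound of each range above (for example, 30 days for DOM snapshots and 14 for videos); the `RETENTION_DAYS` table and `isExpired` helper are hypothetical names.

```typescript
type ArtifactKind =
  | "screenshot"
  | "dom"
  | "console"
  | "video"
  | "network"
  | "driverLogs";

// Upper bounds of the ranges in the starting policy; an assumption,
// so lower them if storage or compliance requires it.
const RETENTION_DAYS: Record<ArtifactKind, number> = {
  screenshot: 30,
  dom: 30,
  console: 30,
  video: 14,
  network: 14,
  driverLogs: 7,
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True when the artifact is older than its retention window.
function isExpired(kind: ArtifactKind, createdAt: Date, now: Date): boolean {
  const ageMs = now.getTime() - createdAt.getTime();
  return ageMs > RETENTION_DAYS[kind] * MS_PER_DAY;
}
```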
Conclusion
The best browser test failure logs are not the largest ones; they are the ones that let a developer or QA engineer answer the next question immediately. Screenshots show the state. Video shows the sequence. Console logs reveal runtime problems. Network traces expose data and dependency issues. DOM snapshots show what the page actually contained. Infrastructure logs explain failures outside the app.
If you treat these artifacts as a minimal evidence set, not as an unlimited dump, you get faster triage, better flake detection, and cleaner ownership between test problems, application bugs, and environment failures. That is the difference between a red build that gets ignored and one that gets fixed.