June 15, 2026
How to Debug Flaky Browser Tests Caused by Service Workers, Caches, and Offline State
Learn how to debug flaky browser tests caused by service workers, cache storage, and offline state. Practical steps for Playwright, Selenium, and CI troubleshooting.
Browser tests that fail only sometimes are annoying enough when the root cause is a race condition or a locator problem. They become much harder when the failure is tied to state the browser keeps between runs, especially service workers, Cache Storage, and offline-like behavior. A test may pass on a clean profile, then fail on the second or third run because the browser reuses an asset cached by a previous session, a service worker intercepts a request, or a page thinks it is offline even though the machine has network access.
These failures are common in teams that run test automation against modern web apps with progressive web app features, aggressive caching, or data-heavy frontends. They are also a frequent source of confusion in continuous integration systems where browsers are reused for speed, profiles persist across retries, or parallel jobs interfere with shared test fixtures. If you are investigating flaky browser tests that involve service workers, the goal is not just to make the current test pass. The goal is to understand which layer is hiding the real behavior of the app.
This guide focuses on practical debugging techniques for SDETs, QA engineers, frontend engineers, and test infrastructure owners. It explains how to identify service worker interference, how to inspect cached assets and offline state, and how to make browser automation more deterministic without disabling useful production behavior everywhere.
Why these failures feel different from ordinary flakiness
Most flaky UI tests are caused by timing. The element is not ready yet, the API response was slower than usual, or the test clicked before the page settled. Cache and service worker problems feel different because the test often fails in ways that look logically impossible:
- The page loads, but shows stale data.
- A request appears to succeed, but the UI never updates.
- The app works in a fresh browser context, then breaks on rerun.
- A test passes in headed mode, but fails headless in CI.
- A page is reported offline even though the browser can reach the network.
If a browser test changes behavior when you reuse the same profile, suspect storage or service worker state before you suspect the app code.
The reason is that browser runtime state is layered. Your test interacts with page JavaScript, but the network request may be intercepted by a service worker, the response may come from a cache, and the page’s own offline indicator may derive from navigator.onLine, a failed fetch, or some custom app state. A single failed assertion can be the result of a chain of decisions made long before your test action.
The three state layers that usually cause trouble
1. Service workers
A service worker runs separately from the page and can intercept requests, cache responses, and serve assets offline. That makes it useful in production, but risky in tests if you do not control registration and updates.
Common failure patterns:
- The first test run registers a worker and caches assets.
- The second test run loads the page from the worker’s cached response, not the network.
- A code change has been deployed, but the worker still serves an old shell or old API response.
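- A worker update is available, but the new worker stays in the waiting state until every page controlled by the old worker has closed (or until it calls skipWaiting()), so the test page keeps using the old worker, even across a reload.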
2. Cache Storage and HTTP cache
Cache Storage is a browser API often used by service workers. It is distinct from the normal HTTP cache, but both can affect what your test sees. An app might read assets or JSON from Cache Storage, while the browser itself may satisfy a fetch from its HTTP cache before the request ever reaches the network.
This matters when your assertions depend on:
- request counts,
- updated response bodies,
- cache invalidation after login or logout,
- request headers like authentication tokens,
- per-test mock data.
3. Offline or offline-like state
Browsers support explicit offline simulation in automation, but apps also infer offline state from failed network calls, timeouts, or service worker behavior. You can end up with a test that is not truly offline, yet the page logic thinks it is.
This can happen when:
- a service worker falls back to an offline shell,
- a proxy or browser context blocks requests,
- network stubbing is incomplete,
- the app has retry logic that masks the original failure,
- the browser profile preserves a stale navigator.onLine-adjacent app state after a reload.
Start by proving which layer is responsible
Before changing code, isolate the failure mode. Do not assume the service worker is the culprit just because the app is a PWA.
Check whether the failure depends on browser profile reuse
Run the same test in these modes:
- fresh browser context each run,
- same browser but a new context,
- same user data directory or profile across runs.
If the failure only appears when the profile is reused, storage or service worker state is likely involved. If it appears only after the first navigation, the worker may be registering during the test and affecting later requests.
Compare a clean profile with a dirty profile
A simple diagnostic is to run once on a clean profile, then intentionally rerun against the same profile without deleting browser storage. If the second run fails, you have a reproducibility clue.
Useful artifacts to compare:
- network log,
- console messages,
- application storage snapshot,
- response bodies for the same URL,
- navigator.serviceWorker.controller state,
- cache entries under Cache Storage.
Confirm whether the browser is offline or just behaving like it is
If the app shows offline UI, distinguish these cases:
- the browser context is explicitly offline,
- the fetch request is blocked or mocked,
- the service worker intercepted the request and returned fallback content,
- the app’s own connectivity probe failed.
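In Playwright, for example, you can force the context online with setOffline(false), which rules out emulated offline mode, and then use request logging to see whether the page is really reaching the network.

```typescript
// Force the context online so any offline UI cannot come from emulation.
await context.setOffline(false);

// Log every request and response the page makes.
page.on('request', request => {
  console.log('request', request.url());
});
page.on('response', response => {
  console.log('response', response.status(), response.url());
});
```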
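The four cases listed above can be turned into a small triage helper. This is only a sketch: the observation fields and cause labels are names invented for illustration, and the ordering (explicit emulation first, app probe last) is an assumption about which evidence is most trustworthy.

```typescript
// Observations collected during one failing run. Field names are illustrative.
interface ConnectivityObservation {
  appShowsOfflineUi: boolean;     // the offline banner or shell is visible
  contextForcedOffline: boolean;  // the test itself enabled offline emulation
  servedByServiceWorker: boolean; // the failing URL was answered by a service worker
  requestFailed: boolean;         // the failing URL errored, was blocked, or was aborted
}

type OfflineCause =
  | "not-offline"
  | "browser-offline"
  | "service-worker-fallback"
  | "request-blocked-or-failed"
  | "app-connectivity-probe";

// Map the observations onto the four cases, most explicit cause first.
function classifyOfflineCause(o: ConnectivityObservation): OfflineCause {
  if (!o.appShowsOfflineUi) return "not-offline";
  if (o.contextForcedOffline) return "browser-offline";
  if (o.servedByServiceWorker) return "service-worker-fallback";
  if (o.requestFailed) return "request-blocked-or-failed";
  // Nothing at the network layer explains the banner, so suspect the app's own probe.
  return "app-connectivity-probe";
}
```

In a Playwright run, servedByServiceWorker could be filled from response.fromServiceWorker() and requestFailed from the requestfailed event; how the other two fields are gathered depends on your harness.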
How to inspect service worker behavior
Look for worker registration and activation
If your app registers a service worker, find out when it happens and what scope it covers. Many flaky tests happen because a worker is registered on the first page load, but subsequent navigations run under worker control.
For debugging, inspect:
- registration scope,
- worker script URL,
- activation status,
- whether the current page is controlled,
- whether update checks are happening during test execution.
In browser devtools, the Application panel is helpful. In automation, you can often evaluate a few properties on the page.
```typescript
const controlled = await page.evaluate(() => Boolean(navigator.serviceWorker?.controller));
console.log({ controlled });
```
If controlled is true, requests from that page can be intercepted by the service worker.
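To go beyond a single boolean, the scope, script URL, activation status, and pending updates can be summarized from the registrations themselves. The sketch below is meant to run in the page (for example from a test-only debug hook). The interfaces are local stand-ins that mirror the members of navigator.serviceWorker used here, and the snapshot shape is an assumption made for illustration.

```typescript
// Local stand-ins for the browser's service worker types, so the sketch
// type-checks outside a browser. navigator.serviceWorker has these members.
interface WorkerLike { scriptURL: string; state: string; }
interface RegistrationLike {
  scope: string;
  installing: WorkerLike | null;
  waiting: WorkerLike | null;
  active: WorkerLike | null;
}
interface ContainerLike {
  controller: WorkerLike | null;
  getRegistrations(): Promise<readonly RegistrationLike[]>;
}

interface RegistrationSnapshot {
  scope: string;
  activeScript: string | null;
  activeState: string | null;
  updatePending: boolean; // a newer worker is installing or waiting
}

// Summarize the origin's registrations, plus whether this page is controlled.
async function snapshotServiceWorkers(container: ContainerLike): Promise<{
  controlled: boolean;
  registrations: RegistrationSnapshot[];
}> {
  const registrations = await container.getRegistrations();
  return {
    controlled: container.controller !== null,
    registrations: registrations.map(r => ({
      scope: r.scope,
      activeScript: r.active ? r.active.scriptURL : null,
      activeState: r.active ? r.active.state : null,
      updatePending: r.installing !== null || r.waiting !== null,
    })),
  };
}
```

A snapshot with updatePending set to true during a failing run is a hint that the suite is straddling two worker versions.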
Temporarily disable service worker caching in test builds
The cleanest test strategy is often to disable registration in test builds or behind an environment flag. If the app does not need offline behavior under test, do not load worker code in the test environment.
Typical options:
- gate registration on NODE_ENV !== 'test',
- use a build-time flag for E2E environments,
- register only in production domains,
- unregister in a test-only setup step.
This is not always practical if you specifically need to test PWA behavior. In that case, isolate the PWA scenarios into a dedicated suite, and keep the rest of your tests on a fresh profile with worker registration blocked or removed.
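The gating options above could be combined into one predicate that the app consults before calling register. This is a sketch under assumptions: the field names, the e2eBuild flag, and the host allowlist are hypothetical and would need to match however your build exposes its environment.

```typescript
// Inputs the app could read at startup. All names here are hypothetical.
interface RegistrationEnv {
  nodeEnv: string;           // for example, the build's NODE_ENV value
  e2eBuild: boolean;         // a build-time flag set for E2E environments
  hostname: string;          // location.hostname at runtime
  productionHosts: string[]; // domains where the worker should be active
}

// Register only outside test builds and only on known production hosts.
function shouldRegisterServiceWorker(env: RegistrationEnv): boolean {
  if (env.nodeEnv === "test") return false;
  if (env.e2eBuild) return false;
  return env.productionHosts.indexOf(env.hostname) !== -1;
}
```

If changing the app is not an option, Playwright can also block registration from the harness side with the serviceWorkers: 'block' context option.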
Verify updates explicitly
Service workers update asynchronously. A test may pass on the first load because the old worker still handles requests, then fail after the browser notices a newer worker and switches control mid-suite.
To make update behavior visible during debugging, log worker lifecycle events in the app or inspect them in automation. If the app exposes a page-level event bus or telemetry hook, record worker transitions in test logs.
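One way to record worker transitions is to listen for updatefound on the registration and statechange on the installing worker. The sketch below assumes it runs in app code next to the registration call. The interfaces are local stand-ins for the members of ServiceWorkerRegistration and ServiceWorker that it uses, and the log sink is a hypothetical hook.

```typescript
// Minimal stand-ins for the browser types; ServiceWorkerRegistration and
// ServiceWorker expose the members used here.
interface WorkerLike {
  scriptURL: string;
  state: string;
  addEventListener(type: "statechange", listener: () => void): void;
}
interface RegistrationLike {
  installing: WorkerLike | null;
  addEventListener(type: "updatefound", listener: () => void): void;
}

// Report each state change of any worker that starts installing on this
// registration. `log` is a hypothetical sink: console.log, a telemetry hook, etc.
function logWorkerLifecycle(
  registration: RegistrationLike,
  log: (message: string) => void,
): void {
  registration.addEventListener("updatefound", () => {
    const worker = registration.installing;
    if (!worker) return;
    log("updatefound " + worker.scriptURL + " state=" + worker.state);
    worker.addEventListener("statechange", () => {
      log("statechange " + worker.scriptURL + " state=" + worker.state);
    });
  });
}
```

With console.log as the sink, these lines show up in the page's console output, so a runner's console listener can place each worker transition on the same timeline as the test steps.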
How to inspect Cache Storage and HTTP cache
Dump Cache Storage entries during a failing run
Cache Storage is often the hidden source of stale responses. You can inspect it from the page context.
```typescript
// List the names of all caches for the page's origin.
const cacheNames = await page.evaluate(async () => await caches.keys());
console.log(cacheNames);

// Map each cache name to the request URLs stored in it.
const entries = await page.evaluate(async () => {
  const names = await caches.keys();
  const result: Record<string, string[]> = {};
  for (const name of names) {
    const cache = await caches.open(name);
    const requests = await cache.keys();
    result[name] = requests.map(r => r.url);
  }
  return result;
});
console.log(entries);
```
If a failed test is reading a stale HTML shell or JSON response from cache, the URLs in this dump usually make the cause obvious.
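When the dump is long, comparing a clean run against a dirty run narrows it down. The helper below is a sketch that takes two dumps in the shape produced above (cache name mapped to URLs) and returns only what the dirty profile has in addition. The function and type names are illustrative.

```typescript
// Shape of a Cache Storage dump: cache name -> list of cached request URLs.
type CacheSnapshot = Record<string, string[]>;

// Return the entries that exist in `dirty` but not in `clean`, grouped by cache name.
function diffCacheSnapshots(clean: CacheSnapshot, dirty: CacheSnapshot): CacheSnapshot {
  const extra: CacheSnapshot = {};
  for (const name of Object.keys(dirty)) {
    const known = clean[name] || [];
    const added = dirty[name].filter(url => known.indexOf(url) === -1);
    if (added.length > 0) extra[name] = added;
  }
  return extra;
}
```

Entries that appear only in the dirty snapshot are the first candidates for the stale response.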
Remember that browser HTTP cache is separate
Disabling service workers does not necessarily disable browser HTTP caching. If your failure is due to 304 Not Modified behavior or asset reuse, you may need to:
- create a new browser context,
- set cache-related headers in the app or test environment,
- append cache-busting query params for deterministic test fixtures,
- clear browser data between runs.
For HTTP-level debugging, logs from a proxy, a browser network trace, or a test runner HAR can be more informative than a UI screenshot.
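Of the options listed, cache-busting query parameters are the easiest to sketch. The helper below appends a per-run value to a fixture URL. The parameter name _e2e and the idea of a run ID are assumptions made for this example, not a convention of any tool.

```typescript
// Append a per-run query parameter so a cached copy from an earlier run cannot match.
// The parameter name "_e2e" is an arbitrary choice for this sketch.
function withCacheBuster(url: string, runId: string): string {
  const parsed = new URL(url);
  parsed.searchParams.set("_e2e", runId);
  return parsed.toString();
}
```

This only helps when the cache keys on the full URL. A service worker that matches with ignoreSearch, or that rewrites requests before caching, may still return the older entry.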
Avoid overusing global cache disabling
It is tempting to disable all caching for all tests. That can mask the real issue and slow the suite enough that timing shifts create new flakes.
A better approach is usually:
- disable caching only in the suites that do not test cache behavior,
- keep one explicit suite for cache and service worker behavior,
- make test data versioned so stale responses are obvious.
When offline state is the real problem
Offline bugs often show up when automation is run on CI workers with strict network controls, local Dockerized browsers, or proxies that make some requests fail intermittently.
Distinguish app offline UI from browser offline mode
An app might show offline UI because one API call failed, even though the browser is online. That means the root cause could be:
- backend availability,
- DNS or proxy resolution,
- certificate problems,
- blocked third-party assets,
- an auth redirect that did not complete,
- a service worker fallback route.
If the test only checks the visible offline banner, it may hide the original network issue. Add request-level logging and assert the exact endpoint or asset that failed.
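A small ledger that records outcomes in arrival order makes it possible to assert on the first endpoint that failed instead of on the banner. This is a sketch: the class and field names are invented, and treating any status of 400 or above as a failure is an assumption you may want to tighten.

```typescript
interface RequestOutcome {
  url: string;
  ok: boolean;
  detail: string; // HTTP status or error text
}

// Collects outcomes in arrival order so the first failure is not lost.
class RequestLedger {
  private outcomes: RequestOutcome[] = [];

  recordResponse(url: string, status: number): void {
    this.outcomes.push({ url, ok: status < 400, detail: String(status) });
  }

  recordFailure(url: string, errorText: string): void {
    this.outcomes.push({ url, ok: false, detail: errorText });
  }

  // The earliest recorded failure, or undefined if everything succeeded.
  firstFailure(): RequestOutcome | undefined {
    for (const outcome of this.outcomes) {
      if (!outcome.ok) return outcome;
    }
    return undefined;
  }
}

// Possible wiring in Playwright (not executed here):
//   page.on('response', r => ledger.recordResponse(r.url(), r.status()));
//   page.on('requestfailed', r =>
//     ledger.recordFailure(r.url(), r.failure()?.errorText ?? 'unknown'));
```

Because the ledger keeps the earliest failure, it also preserves the original error when app-level retries later succeed.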
Reproduce offline behavior intentionally
You should have at least one test that deliberately simulates offline mode. This gives you a reference for what genuine offline behavior looks like in your app.
Playwright example:
```typescript
await context.setOffline(true);
await page.reload();
await expect(page.getByText('You are offline')).toBeVisible();
```
This is useful because it tells you whether the app handles true offline mode correctly, separate from accidental offline-like failures.
Be careful with retries
Retries can make offline issues harder to diagnose. A request that fails once and succeeds on retry may look like a transient backend issue, when the real problem is a short-lived race with worker activation or app boot code.
If a test is flaky around offline detection, capture the first failure before retry logic changes the state again.
A debugging workflow that usually works
Step 1, minimize the test
Strip the test down to the smallest path that still fails. Keep the same browser, same profile behavior, and same app entry point.
Ask:
- Does it fail on initial load or after navigation?
- Does it fail only after login?
- Does it fail after a page reload?
- Does it fail only in one browser engine?
Step 2, capture network and storage state
Record:
- all requests and responses,
- console logs,
- localStorage and sessionStorage contents,
- Cache Storage contents,
- service worker controller status.
In Selenium, you may need browser-specific devtools integration or app-side logging to get enough visibility. In Playwright, request and console listeners are usually enough for a first pass.
Step 3, repeat with a pristine context
Create a new context with no shared storage. If the failure disappears, you are dealing with state leakage, not a pure timing problem.
Step 4, selectively remove sources of hidden state
Try these one at a time:
- unregister service workers,
- clear caches,
- clear cookies and storage,
- use a different browser profile,
- disable offline simulation,
- bypass service worker in the test environment.
Step 5, confirm the same failure without the UI
If possible, hit the same backend endpoint directly with a test client or inspect the response from the page context. This helps answer whether the data is stale before it reaches the UI, or whether the page is rendering stale data after it receives the correct response.
Practical Playwright patterns for these bugs
Playwright is often a good fit because it gives you fine-grained control over browser contexts and network inspection.
Create a fresh context per test
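```typescript
import { test } from '@playwright/test';

test('uses a clean browser state', async ({ browser }) => {
  // A new context starts without cookies or storage from earlier contexts.
  const context = await browser.newContext();
  try {
    const page = await context.newPage();
    await page.goto('https://example.test');
  } finally {
    // Close the context even if the test body throws.
    await context.close();
  }
});
```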
Log requests that are likely to be intercepted
```typescript
page.on('request', request => {
  if (request.resourceType() === 'fetch' || request.resourceType() === 'xhr') {
    console.log('request', request.method(), request.url());
  }
});
```
Clear storage when the app under test allows it
```typescript
await page.evaluate(async () => {
  localStorage.clear();
  sessionStorage.clear();
  const keys = await caches.keys();
  await Promise.all(keys.map(key => caches.delete(key)));
});
```
This does not remove the service worker itself, but it helps reveal whether a stale cache entry is the problem.
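If the worker itself needs to go, registrations can be removed with getRegistrations() and unregister(). The sketch below uses local stand-in interfaces so it type-checks outside a browser. In a Playwright test the same two calls would typically sit inline in a page.evaluate callback against navigator.serviceWorker.

```typescript
// Stand-ins for the browser types; navigator.serviceWorker and
// ServiceWorkerRegistration expose the members used here.
interface RegistrationLike {
  scope: string;
  unregister(): Promise<boolean>;
}
interface ContainerLike {
  getRegistrations(): Promise<readonly RegistrationLike[]>;
}

// Unregister every service worker for the origin; returns the scopes that were removed.
async function unregisterAllServiceWorkers(container: ContainerLike): Promise<string[]> {
  const registrations = await container.getRegistrations();
  const removed: string[] = [];
  for (const registration of registrations) {
    if (await registration.unregister()) removed.push(registration.scope);
  }
  return removed;
}
```

A page that is already controlled can stay controlled until it unloads, so reload or navigate after unregistering before drawing conclusions.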
Practical Selenium patterns for these bugs
Selenium can still debug this class of failure well, especially in environments already built around WebDriver and continuous integration.
Use a clean profile whenever possible
If your grid allows it, avoid reusing user profiles across test jobs. A reused profile can preserve worker registration, cache data, and storage from previous runs.
Attach browser logs or devtools trace data
Browser console logs are often the fastest path to a clue. If your app logs when a worker takes control or when offline mode is detected, those logs can explain the failure without a full trace.
Do not assume driver.refresh() clears the problem
A refresh may still use the same profile, the same service worker, and the same cached assets. If the issue survives a refresh, try a new browser session instead.
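When starting a new session is too expensive, a script sent through WebDriver can reset worker and cache state for the current origin instead. This is a sketch under assumptions: ScriptExecutor is a local stand-in for the driver (selenium-webdriver's executeAsyncScript has this shape), and the result object is an illustrative format.

```typescript
// Minimal stand-in for a WebDriver session.
interface ScriptExecutor {
  executeAsyncScript(script: string, ...args: unknown[]): Promise<unknown>;
}

// Browser-side script. WebDriver passes a completion callback as the last argument.
const RESET_BROWSER_STATE = `
  const done = arguments[arguments.length - 1];
  (async () => {
    const regs = await navigator.serviceWorker.getRegistrations();
    await Promise.all(regs.map(r => r.unregister()));
    const names = await caches.keys();
    await Promise.all(names.map(n => caches.delete(n)));
    localStorage.clear();
    sessionStorage.clear();
    return { workers: regs.length, caches: names.length };
  })().then(done, err => done({ error: String(err) }));
`;

// Run the reset in the page the driver currently has open and return its report.
async function resetBrowserState(driver: ScriptExecutor): Promise<unknown> {
  return driver.executeAsyncScript(RESET_BROWSER_STATE);
}
```

The script only affects the origin of the page that is currently loaded, and it does not touch the HTTP cache, so a new session with a clean profile remains the stronger reset.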
How to make these bugs less likely in the first place
Separate production behavior from test determinism
Not every test should exercise service worker caching. In fact, most functional tests should not. Keep production behavior enabled only in suites that explicitly need it.
A useful rule is:
- feature tests for app logic, use fresh contexts and minimal storage,
- PWA or offline tests, use a dedicated suite with controlled cache state,
- cross-browser smoke tests, keep caching behavior predictable and easy to reset.
Version your test data and assets
If the UI is sensitive to cached data, use versioned API fixtures or unique cache keys per test run. That makes stale cache entries easier to detect and invalidate.
Make service worker changes visible
When app code or a deploy changes the service worker, include clear logging in test environments. A few extra lines in telemetry or console output can save hours of guessing later.
Have a cleanup step that is actually effective
A teardown that only closes the tab is not enough. Depending on your setup, cleanup may need to clear storage, close the browser context, or delete the temporary profile directory.
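For setups that launch the browser with an explicit profile directory, creating that directory per run and deleting it in teardown keeps state from leaking. A minimal sketch using Node's standard library follows. The function names and the e2e-profile- prefix are illustrative.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Create a throwaway browser profile directory under the OS temp folder.
function createTempProfileDir(prefix = "e2e-profile-"): string {
  return fs.mkdtempSync(path.join(os.tmpdir(), prefix));
}

// Delete the profile directory and everything the browser wrote into it.
function removeProfileDir(dir: string): void {
  fs.rmSync(dir, { recursive: true, force: true });
}
```

The returned path can be passed to whatever accepts a profile location, such as Playwright's launchPersistentContext or Chrome's --user-data-dir argument. Close the browser before removing the directory, since a running browser may still hold files open.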
Decide whether to block or embrace caching in E2E
For each suite, answer one question clearly: is caching part of the thing under test? If not, disable it or isolate it. If yes, write explicit assertions about cached behavior rather than letting it affect unrelated tests.
A quick decision tree for debugging
If a test fails only on rerun
Suspect cached state or service worker registration.
If a test fails only in CI
Suspect profile reuse, network policy, proxy behavior, or slower worker activation timing.
If a test fails after a login/logout flow
Suspect storage leakage, stale cache entries, or cached auth responses.
If a test fails only in one browser engine
Check service worker support, cache eviction differences, and offline behavior across engines.
If a test fails after a page reload but not on initial load
Suspect worker control, cache revalidation, or app boot code that reads old data on startup.
What to capture in a bug report
When you hand this off to a teammate, include enough detail to separate network bugs from browser state bugs:
- browser and version,
- test runner and browser automation tool,
- whether a clean profile fixes it,
- whether service workers are enabled,
- whether offline mode was set explicitly,
- the exact URLs loaded from cache, if known,
- request and response logs for the failing action,
- whether the failure reproduces after clearing storage.
That information makes it much easier to decide whether the fix belongs in the app, the test harness, or the infrastructure layer.
Final takeaway
Flaky browser tests caused by service workers are rarely just about one bad assertion. They usually point to hidden browser state that outlives a single test step and changes what the page sees on the next run. The fastest path to a stable fix is to isolate the layer causing the behavior, then make your test setup either fully control that state or eliminate it altogether.
If you treat service workers, Cache Storage, and offline state as first-class variables in your debugging process, the failures become much easier to explain. More importantly, you can decide when caching is a real product behavior worth testing, and when it is just noise getting in the way of reliable automation.