Browser tests that pass all morning and then start failing right after a CDN purge are usually telling you something important, even if the failure message looks random. The page still loads, selectors still exist, and the app did not obviously break. Yet one environment is seeing a bundle, stylesheet, or image that another environment is not, and your browser automation is getting caught in the middle.

This class of issue sits at the intersection of test automation, deployment pipelines, and cache behavior. In other words, it is not just a test problem. It is often an asset delivery problem that browser tests happen to expose first. If you are working in software testing or test automation, this is one of those cases where the quickest fix is often to understand the delivery chain rather than to add another wait.

What makes these failures different

A normal flaky test often fails because of timing, nondeterminism, or an unstable selector. A CDN or asset rebuild issue fails because the browser is seeing a mixed version of the application. Common examples include:

  • HTML from version N, JavaScript bundle from version N-1
  • A stylesheet update that changes layout before the test is ready
  • A hashed asset name changing while some caches still point to the old path
  • A service worker or browser cache serving stale files after a deploy
  • A CI environment that resolves a different CDN edge than your local laptop

The result is often subtle. Your login test passes, but the dashboard smoke test fails because a button moved. A visual assertion changes, but only in one browser. A click intercepts on an overlay that should not exist. The app is technically up, but the runtime state is inconsistent.

If the failure appears immediately after a purge or rebuild, treat it as a version skew problem first, and a test problem second.

The main failure modes to look for

1. Mixed asset versions

This is the classic cache invalidation issue. The HTML response references one build, but the browser fetches old or partially updated assets from cache, a CDN edge, or an intermediate proxy. Because modern frontends use hashed filenames, this often shows up as a missing chunk, a runtime exception, or a page that renders with unexpected layout.

A few common causes:

  • HTML is cached too aggressively
  • JS chunk URLs are not updated atomically
  • The app shell and API responses are released at different times
  • CDN propagation is still in progress when tests begin

2. Stale browser state

Even if the server-side caches are perfect, the browser can preserve state from a previous run. Service workers, localStorage, IndexedDB, and cache storage can keep old application code or user state alive. Tests then behave differently after a rebuild because the browser is not starting from a clean slate.

This is especially common when:

  • The test runner reuses browser profiles
  • CI jobs are optimized for speed by persisting workspace or user data
  • Local debugging uses a profile with previous sessions and service worker registrations

3. Asset rebuild flakiness

A rebuild can be functionally correct but operationally noisy. For example, the new build may contain:

  • Different image dimensions, which shift layouts
  • A font change that alters text wrapping
  • A CSS class rename that breaks a brittle locator or assertion
  • A JS chunk split that changes load order and introduces transient loading states

These are not failures in the traditional sense. They are changes in the assumptions your tests made about the UI contract.

4. Cross-environment drift

Your local environment, CI, staging, and production-like test environment may not share the same caching rules, CDN topology, or deployment ordering. A test that passes on localhost can fail only in CI because CI hits a different edge node or starts before cache propagation settles.

Start by classifying the symptom

Before changing the test, figure out what kind of failure you are observing. The classification often points directly to the fix.

Network-level symptoms

Look for these in the browser devtools network tab or test runner traces:

  • 404 or 403 on JS, CSS, or image requests
  • ChunkLoadError or similar runtime errors
  • Long tail of requests that succeed in one run and fail in another
  • HTML loaded from a new deploy, but assets loaded from the old one

UI-level symptoms

These typically show up as:

  • Layout shifts causing clicks to hit the wrong element
  • Hidden elements becoming visible or vice versa
  • Text wrapping changing, breaking visual or locator-based assertions
  • Race conditions around spinners, skeleton screens, or lazy-loaded components

State-level symptoms

If the app only fails after a purge or rebuild on a reused browser profile, inspect:

  • service worker registration
  • localStorage values
  • sessionStorage values
  • IndexedDB entries
  • cached static assets

The browser may be carrying state that masks the real version of the application until a fresh context is used.

A practical debug sequence

When you suspect asset delivery flakiness, do not start by widening timeouts. Start by proving whether the browser and server are aligned on the same version.

1. Capture the exact build identifier

Your app should expose a build or release identifier somewhere easy to inspect, such as in the HTML, a response header, or a small endpoint. Without that, debugging becomes guesswork.

A lightweight pattern is to render a version marker into the DOM:

<meta name="app-build" content="2026.07.05.1">

Then assert against it in tests or log it during failures.
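As a minimal sketch of that assertion, assuming the marker is rendered exactly as in the meta tag above, the helpers below read the build ID out of an HTML string and fail loudly on a mismatch. The function names `extractBuildId` and `assertBuildId` are hypothetical, not part of any library.

```typescript
// Hypothetical helper: read the build ID from a meta tag named "app-build".
// The tag name matches the example above; adjust it to whatever your app renders.
function extractBuildId(html: string): string | null {
  // Assumes the name attribute comes before the content attribute.
  const match = html.match(
    /<meta\s+name=["']app-build["']\s+content=["']([^"']+)["']/i,
  );
  return match?.[1] ?? null;
}

// Hypothetical helper: throw when the page reports a different build than expected.
function assertBuildId(html: string, expected: string): void {
  const actual = extractBuildId(html);
  if (actual !== expected) {
    throw new Error(
      `Build mismatch: expected ${expected}, page reports ${actual ?? "no marker"}`,
    );
  }
}
```

In a Playwright test, the HTML string could come from `await page.content()`, so the check can run at the start of a scenario or inside a failure hook.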

2. Log the browser’s network requests

In Playwright, you can inspect failed requests and responses during a problematic flow:

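```typescript
// Log the status of every static asset response.
page.on('response', response => {
  const url = response.url();
  if (url.includes('/static/') || url.includes('/assets/')) {
    console.log(response.status(), url);
  }
});

// Requests that never receive a response (DNS, TLS, aborted) surface here instead.
page.on('requestfailed', request => {
  console.log('FAILED', request.url(), request.failure()?.errorText);
});
```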

For Selenium, you usually need browser logs or a proxy. The important part is not the tool, but the evidence. You want to know whether the browser is requesting an old chunk, getting a 404, or receiving a different asset than expected.
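Whatever tool collects the evidence, it helps to reduce the log to the entries that matter. The sketch below assumes you have recorded each response as a plain `{ url, status }` object (the `AssetResponse` shape is hypothetical) and that static files live under `/static/` or `/assets/`, as in the listener above.

```typescript
// Hypothetical log entry. With Playwright it could be built from a response
// as { url: response.url(), status: response.status() }.
interface AssetResponse {
  url: string;
  status: number;
}

// Assumption: static files are served from /static/ or /assets/.
function isAssetUrl(url: string): boolean {
  return url.includes("/static/") || url.includes("/assets/");
}

// Keep only asset responses that came back with a 4xx or 5xx status.
function failedAssets(log: AssetResponse[]): AssetResponse[] {
  return log.filter((entry) => isAssetUrl(entry.url) && entry.status >= 400);
}
```

An empty result does not prove the assets are correct, only that none of them failed outright. A stale but valid bundle still returns 200.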

3. Check whether the page and assets agree on version

A common failure pattern is this sequence:

  1. Browser loads HTML from the latest deploy.
  2. HTML references hashed assets that were just invalidated.
  3. CDN edge still has the old bundle, or the old bundle has already been removed.
  4. App fails to boot or renders inconsistently.

If you can, compare:

  • HTML build identifier
  • JS bundle hash in the URL
  • CSS hash in the URL
  • Any runtime version marker exposed by the app

If they do not match, you have found the root cause class.
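One way to automate that comparison is to list the script and stylesheet URLs the HTML references and check that each one was actually loaded. This is a rough sketch: the function names are hypothetical, the regular expression only handles simple `src` and `href` attributes, and it assumes loaded URLs carry no query string.

```typescript
interface LoadedAsset {
  url: string;
  status: number;
}

// Collect .js and .css URLs referenced by <script> and <link> tags.
function referencedAssets(html: string): string[] {
  const pattern =
    /<(?:script|link)\b[^>]*?\b(?:src|href)=["']([^"']+\.(?:js|css))["']/gi;
  const urls: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const url = match[1];
    if (url) urls.push(url);
  }
  return urls;
}

// Return referenced assets that were never loaded with a successful status.
function missingReferencedAssets(html: string, loaded: LoadedAsset[]): string[] {
  const okUrls = loaded.filter((a) => a.status < 400).map((a) => a.url);
  return referencedAssets(html).filter(
    (ref) => !okUrls.some((url) => url.endsWith(ref)),
  );
}
```

A non-empty result means the HTML points at files the browser could not fetch, which is the mixed-version pattern described in the sequence above.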

4. Reproduce with a clean browser context

Run the test against a fresh browser profile or incognito context. If the failure disappears, stale browser state is likely involved.

In Playwright, that looks like using a new context per test or per suite:

```typescript
const context = await browser.newContext();
const page = await context.newPage();
await page.goto(process.env.BASE_URL!);
```

If a reused context is required for performance, add explicit cleanup for service workers, localStorage, and cookies before the scenario.

5. Compare CI and local request paths

CI often differs in ways that matter:

  • Different geographic CDN edge
  • Different DNS resolution
  • Different TLS termination
  • Different cache headers preserved by an internal proxy
  • Different test startup timing after deployment

If the failure is CI-only, add logging for request URLs, response headers, and environment metadata so you can compare runs.
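Once both runs log their response headers, a small diff makes the comparison mechanical. The sketch below assumes headers were captured as plain objects with lower-cased names; the `HeaderMap` type, the function name, and the header list are illustrative, and headers such as `x-cache` are CDN-specific and may not exist in your setup.

```typescript
type HeaderMap = Record<string, string>;

// Headers that commonly reveal caching or edge differences.
// Assumption: adjust this list to the headers your CDN actually emits.
const CACHE_HEADERS = ["cache-control", "etag", "x-cache", "via"];

// Describe every cache-related header that differs between two runs.
function diffCacheHeaders(local: HeaderMap, ci: HeaderMap): string[] {
  const differences: string[] = [];
  for (const name of CACHE_HEADERS) {
    const a = local[name];
    const b = ci[name];
    if (a !== b) {
      differences.push(
        `${name}: local=${a ?? "(absent)"} ci=${b ?? "(absent)"}`,
      );
    }
  }
  return differences;
}
```

Running this against the HTML response and one hashed bundle is usually enough to show whether CI sits behind a different cache layer than your laptop.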

Where browser tests usually go wrong

Brittle selectors that depend on layout

After an asset rebuild, CSS changes can move elements just enough to break selectors that are too positional or too dependent on visible text. For example, a button may still exist, but now a sibling element overlays it.

Prefer stable hooks such as data-testid or accessible roles over CSS position-based selectors. In Playwright, a role-based locator is usually more resilient:

```typescript
await page.getByRole('button', { name: 'Save changes' }).click();
```

This does not fix asset delivery problems, but it makes the test less sensitive to incidental UI changes triggered by a rebuild.

Waiting for the wrong thing

Tests often wait for the page load event, then assume the app is ready. With code-split bundles, background hydration, or lazy-loaded components, that is not enough.

A better wait condition is usually a domain-specific ready signal, such as:

  • a root element becomes visible
  • a spinner disappears
  • a specific API response completes
  • a known app state is rendered

Avoid sleep-based waits unless you are confirming a specific race condition during debugging.
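For ready signals that the test framework does not cover directly, a small polling helper is often enough. The following is a sketch rather than a recommended utility: `waitUntil` is a hypothetical name, and the default timeout and interval are arbitrary.

```typescript
// Poll a readiness check until it returns true or the timeout elapses.
async function waitUntil(
  check: () => boolean | Promise<boolean>,
  timeoutMs = 10000,
  intervalMs = 100,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await check()) return;
    if (Date.now() >= deadline) {
      throw new Error(`Not ready after ${timeoutMs} ms`);
    }
    // Pause between attempts so the check does not spin.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

In Playwright the check could be something like `() => page.locator('#app-root').isVisible()`, where `#app-root` is a placeholder selector. Playwright's built-in auto-waiting assertions cover many of these cases already, so a helper like this is mainly useful for app-specific signals such as a version marker or a custom ready flag.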

Ignoring service worker behavior

Service workers can make failures appear inconsistent because they intercept requests independently from the browser cache. After a deploy, an old service worker may keep serving old assets while the page shell has already changed.

If service workers are part of your app, make their lifecycle visible in staging and test environments. A test may need to unregister them between runs when validating release behavior.

How to prove cache invalidation is the real issue

A good debugging goal is to answer one question: did the browser receive the version you think it received?

Here is a checklist that usually narrows it down quickly:

  • Hard refresh the page in a clean context. Does the failure change?
  • Disable cache in devtools. Does the failure disappear?
  • Load the same page from a different network or geography. Does behavior change?
  • Remove service workers and retry. Does the app boot normally?
  • Compare the asset hash in the HTML with the actual network requests
  • Inspect whether the CDN still serves a purged path for some edges

If you can make the test pass by clearing state or disabling cache, the problem is almost never the assertion itself.

What to fix in the application and delivery pipeline

Use immutable, hashed assets, but serve HTML carefully

Hashed filenames are good because they make static files cacheable for a long time. The tradeoff is that the HTML or manifest that references those files must not be cached in a way that outlives the release.

A healthy pattern is:

  • HTML and app entry points: short cache or no cache
  • hashed static assets: long cache with immutable semantics
  • deploys that update references atomically

If HTML can point to assets that have already been purged, browser tests will find that gap quickly.
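As an illustration of that split, the function below picks a `Cache-Control` value from the request path. It is a sketch under one loud assumption: hashed assets are recognizable as `name.<8 or more hex characters>.ext`. Adjust the pattern to your bundler's naming scheme before relying on it.

```typescript
// Assumption: hashed assets look like "app.3f9a1c2b.js".
const HASHED_ASSET = /\.[0-9a-f]{8,}\.(?:js|css|png|jpg|svg|woff2)$/i;

function cacheControlFor(path: string): string {
  if (HASHED_ASSET.test(path)) {
    // The content for a hashed URL never changes, so cache it for a year.
    return "public, max-age=31536000, immutable";
  }
  // HTML, manifests, and unhashed files: revalidate with the origin every time.
  return "no-cache";
}
```

Note that `no-cache` still allows a cache to store the response; it only forces revalidation before reuse, which is usually what you want for HTML.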

Make deployments atomic from the browser’s perspective

If you deploy frontend code and supporting assets separately, there is a window where users and tests can see mixed versions. A safer approach is to make a release either fully visible or not visible at all.

That can mean:

  • writing the build to a versioned path and switching traffic after upload completes
  • using a manifest file that is replaced last
  • keeping the previous asset set available long enough for stragglers
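The "replace the manifest last" idea comes down to upload ordering. Here is a minimal sketch, assuming entry points can be identified by file name; `isEntryPoint` and `uploadOrder` are hypothetical helpers, and the actual upload step is left out.

```typescript
// Assumption: files that reference other assets are HTML pages and manifest.json.
function isEntryPoint(path: string): boolean {
  return path.endsWith(".html") || path.endsWith("manifest.json");
}

// Order files so referenced assets are uploaded before the files that point at them.
function uploadOrder(files: string[]): string[] {
  const assets = files.filter((f) => !isEntryPoint(f));
  const entries = files.filter((f) => isEntryPoint(f));
  return [...assets, ...entries];
}
```

With this ordering, a browser that fetches the new HTML can always find the bundles it references, because they were already in place when the HTML changed.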

Keep old bundles around during rollout windows

Purging a CDN immediately after publishing a new build can expose clients still referencing the previous version. If your app relies on hashed chunk paths, removing old files too early can create runtime failures that only happen during rollout.

The test signal here is valuable. If browser tests start failing only after a purge, it is often safer to change the purge strategy than to make the tests more tolerant.
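A retention rule can make the purge strategy explicit. The sketch below assumes you keep a record of releases with deploy timestamps (the `Release` shape is hypothetical) and treats a release as safe to purge only after it has been superseded for longer than a retention window.

```typescript
interface Release {
  buildId: string;
  deployedAt: number; // epoch milliseconds
}

// Return build IDs whose assets can be deleted. The newest release is never
// purgeable; an older one qualifies only if the release that replaced it
// went live more than retentionMs ago.
function purgeable(releases: Release[], now: number, retentionMs: number): string[] {
  const newestFirst = [...releases].sort((a, b) => b.deployedAt - a.deployedAt);
  const result: string[] = [];
  let supersededAt: number | null = null;
  for (const release of newestFirst) {
    if (supersededAt !== null && now - supersededAt > retentionMs) {
      result.push(release.buildId);
    }
    supersededAt = release.deployedAt;
  }
  return result;
}
```

The right window depends on how long clients can keep an old page open, so the value is a product decision rather than something this code can determine.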

Add a release marker

A simple release marker helps a lot:

  • build ID in HTML
  • build ID in response headers
  • build ID in the app footer or debug panel

That marker lets test logs and observability data answer whether the failing run saw the expected release.

CI-specific tactics

Continuous integration (CI) systems are especially prone to this class of bug because they optimize for parallelism and speed. That means they may start browser tests before caches settle or after a deploy has only partially propagated.

Gate tests on release readiness

If browser tests run against a freshly deployed environment, add an explicit readiness check that validates version alignment before the suite starts.

For example, a simple CI step can poll a release endpoint until the build ID matches the expected value:

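```bash
#!/usr/bin/env bash
set -euo pipefail

# Expected build ID, passed as the first argument.
expected="$1"

# Poll up to 30 times, 5 seconds apart.
for i in {1..30}; do
  # "|| true" keeps set -e from aborting while the endpoint is still unreachable.
  actual=$(curl -fsS https://staging.example.com/version || true)
  if [ "$actual" = "$expected" ]; then
    exit 0
  fi
  sleep 5
done

echo "Release not ready" >&2
exit 1
```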

This is not a substitute for fixing the deployment flow, but it reduces false failures caused by premature test execution.

Separate deploy validation from regression testing

A smoke test that confirms the new build is live should not be the same job that runs the full browser suite. Keep a small, version-aware validation step close to the deployment and let the broader suite run once the environment is stable.

This separation helps you answer whether a failure is due to the release process or the product code.

Capture traces on first failure

Browser traces, screenshots, and network logs are more useful here than in many other flaky test cases because they can show the exact mixed state. If a failure is intermittent, preserve the first failing run rather than re-running immediately and losing the evidence.
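If you use Playwright, its recording options can keep this evidence automatically. The fragment below is a sketch of the relevant `use` settings; the constant name `evidenceSettings` is hypothetical, and the object would need to be merged into your own `playwright.config.ts`.

```typescript
// Recording options for the `use` section of a Playwright config.
// Each value keeps artifacts for failing tests and discards them for passing ones.
const evidenceSettings = {
  use: {
    trace: "retain-on-failure",
    screenshot: "only-on-failure",
    video: "retain-on-failure",
  },
} as const;
```

Recording a trace for every test adds some overhead, so teams with large suites sometimes enable this only for the post-deploy jobs where version skew is most likely.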

How to stabilize tests without hiding real regressions

It is tempting to solve these failures by adding retries or larger timeouts. That usually masks the symptom, not the cause.

Better stabilization options include:

  • Use fresh browser contexts for flows sensitive to version or cache state
  • Wait for app-specific readiness conditions, not just load
  • Assert on version markers when the test is meant to validate a release
  • Avoid brittle assumptions about exact layout or resource ordering
  • Keep test data isolated from browser cache and storage

A good retry policy is narrow

Retries can still help, but only when they are used to absorb genuine environmental noise after you have reduced the underlying race. A retry policy that keeps passing mixed-version tests is risky, because it turns a deployment issue into a hidden test issue.
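One way to keep a retry policy narrow is to refuse retries whenever the failure carries evidence of version skew. The sketch below assumes your failure hook can collect the fields in `FailureEvidence`; that shape and the `shouldRetry` name are hypothetical.

```typescript
interface FailureEvidence {
  expectedBuildId: string;
  observedBuildId: string | null; // null when the page exposed no marker
  failedAssetCount: number;       // asset requests that returned 4xx or 5xx
  attempt: number;                // 1 for the first run
}

function shouldRetry(failure: FailureEvidence, maxAttempts = 2): boolean {
  const versionSkew =
    failure.observedBuildId !== failure.expectedBuildId ||
    failure.failedAssetCount > 0;
  // Surface deployment problems instead of retrying past them.
  if (versionSkew) return false;
  return failure.attempt < maxAttempts;
}
```

The design choice here is that a mixed-version failure is reported on the first attempt, while failures with no delivery evidence still get one more chance.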

Prefer explicit reset steps

For tests that must run in an already-used profile, explicitly reset state:

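```typescript
await page.context().clearCookies();
// Run this after navigating to the app's origin: web storage is scoped per
// origin, so clearing it on a blank page does not touch the app's data.
await page.evaluate(() => {
  localStorage.clear();
  sessionStorage.clear();
});
```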

If your app uses service workers, also consider unregistering them in the test setup for environments where that is appropriate.
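A sketch of that unregistration step, using the standard `navigator.serviceWorker.getRegistrations()` and Cache Storage APIs inside the page. The `EvaluatingPage` interface is a narrow, hypothetical stand-in for a Playwright page, and the function name is illustrative.

```typescript
// Narrow stand-in for a page object that can run a function in the browser.
interface EvaluatingPage {
  evaluate<T>(fn: () => Promise<T>): Promise<T>;
}

// Unregister all service workers and delete all Cache Storage entries for the
// page's current origin. Resolves to the number of registrations removed.
async function clearServiceWorkersAndCaches(page: EvaluatingPage): Promise<number> {
  return page.evaluate(async () => {
    // This callback runs in the browser, not in the test process.
    const registrations = await navigator.serviceWorker.getRegistrations();
    await Promise.all(registrations.map((r) => r.unregister()));
    const keys = await caches.keys();
    await Promise.all(keys.map((key) => caches.delete(key)));
    return registrations.length;
  });
}
```

Like the storage reset, this only affects the origin the page is currently on, so call it after navigating to the app. Logging the returned count is a cheap way to notice a service worker you did not expect to find.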

A quick decision guide

Use this rough triage when a browser test fails right after a purge or rebuild:

  • 404 or chunk load error: investigate deployment atomicity and asset retention
  • UI shifted or element overlaid: inspect CSS and layout changes from the rebuild
  • Passes in fresh profile, fails in reused one: stale browser state or service worker issue
  • Fails only in CI: environment drift, cache propagation, or release timing
  • Fails only after a purge: old asset references are being removed before all clients stop using them

If more than one category applies, fix the release process first. Otherwise, the same failure will reappear under a different label.
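The triage above is simple enough to encode, which can help when you want failure reports to carry a suggested starting point. This is only a sketch: the `Symptoms` flags are hypothetical and would have to be filled in by your own tooling or by hand.

```typescript
interface Symptoms {
  assetLoadError: boolean;       // 404 or chunk load error on JS or CSS
  layoutShift: boolean;          // UI shifted or element overlaid
  passesInFreshProfile: boolean; // fails only in a reused profile
  ciOnly: boolean;               // fails only in CI
  onlyAfterPurge: boolean;       // fails only after a CDN purge
}

interface Triage {
  causes: string[];
  fixReleaseProcessFirst: boolean;
}

function triage(s: Symptoms): Triage {
  const causes: string[] = [];
  if (s.assetLoadError) causes.push("deployment atomicity and asset retention");
  if (s.layoutShift) causes.push("CSS and layout changes from the rebuild");
  if (s.passesInFreshProfile) causes.push("stale browser state or service worker");
  if (s.ciOnly) causes.push("environment drift, cache propagation, or release timing");
  if (s.onlyAfterPurge) causes.push("old assets removed before clients stopped using them");
  // More than one category points at the release process rather than a single test.
  return { causes, fixReleaseProcessFirst: causes.length > 1 };
}
```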

A minimal checklist for your team

Before declaring a browser test flaky, ask these questions:

  1. What build ID did the browser see?
  2. Did the HTML and static assets come from the same release?
  3. Was the browser context clean?
  4. Did a service worker or cache interfere?
  5. Did the CDN purge happen before old clients were safe?
  6. Is the test asserting the right readiness signal?

If you cannot answer at least the first three, you are debugging blind.

The engineering tradeoff to remember

The deeper lesson behind failures that appear after a CDN purge is that modern frontend delivery is a distributed system. Your browser test is not just checking a page; it is validating the consistency of HTML, assets, caches, and runtime state across several layers.

That is why some failures only appear after a rebuild or purge, and why they can be so frustrating. The test is often the first consumer to notice a release coordination bug.

The best long-term fix is not adding arbitrary waits. It is making the application release observable, making asset delivery atomic enough for browsers, and making test setup deterministic enough to avoid inherited state. Once those are in place, the flaky failures usually become either real product regressions or easy-to-explain environment issues.

Closing thought

If a test fails only after a CDN purge or asset rebuild, treat that failure as a diagnostic clue. It is pointing at version drift, stale browser state, or a deployment that is not yet browser-safe. The fastest path to stability is to align release behavior, cache policy, and test setup so the browser always sees one coherent version of the app, not a blend of old and new.