Caching Layers Explained: Browser, CDN, and App Caching

Marcus White
10 Min Read

If you’ve ever opened a website and thought, “Why did that load instantly?”, chances are caching did most of the work.

Modern web applications rarely serve every request from scratch. Instead, they rely on multiple caching layers working together. These layers store previously generated responses closer to the user, dramatically reducing latency, server load, and infrastructure costs.

At a high level, caching simply means saving a copy of data so it can be served faster later. But in real-world systems, caching isn’t just one thing. It usually happens across three main layers:

  1. Browser caching
  2. CDN (edge) caching
  3. Application-level caching

Understanding how these layers interact is essential if you want to build scalable systems. Many performance problems come not from missing caches, but from misconfigured ones fighting each other.

Let’s break down how each layer works and how they fit together in a modern architecture.

Why Modern Systems Use Multiple Caching Layers

A single cache cannot solve every performance problem. Each layer exists because different parts of the request lifecycle have different bottlenecks.

Consider a typical request:

User → Browser → CDN → Application Server → Database

Every hop introduces latency. Caching works best when data is stored as close to the user as possible.

This creates a hierarchy:

Layer               Location                  Purpose
Browser cache       On the user's device      Prevents unnecessary network requests
CDN cache           Edge servers near users   Reduces distance to the server
Application cache   Backend servers           Avoids expensive database queries

Think of it like a grocery supply chain.

  • Browser cache is your refrigerator.

  • CDN cache is the neighborhood store.

  • Application cache is the warehouse.

If something is already in your fridge, you don’t drive to the store. The same logic applies to HTTP requests.

Browser Caching: The First Line of Defense

The browser cache sits directly on the user’s device. It stores static assets like:

  • Images
  • CSS files
  • JavaScript bundles
  • Fonts
  • Sometimes HTML pages

When the browser already has a resource locally, it can skip the network entirely.

How Browser Caching Works

Browsers rely primarily on HTTP headers such as:

Cache-Control
Expires
ETag
Last-Modified

For example:

Cache-Control: public, max-age=31536000

This tells the browser it can reuse the resource for one year without checking the server.

Two Common Browser Cache Strategies

1. Strong caching

The browser uses the cached file without contacting the server.

Example:

Cache-Control: max-age=86400

The file is reused for 24 hours.
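The freshness decision behind strong caching can be sketched as a small helper. This `is_fresh` function is a hypothetical illustration of the rule browsers apply, not a real browser API:

```python
import time

def is_fresh(fetched_at, max_age, now=None):
    """Return True while a cached response is within its max-age window (seconds)."""
    if now is None:
        now = time.time()
    # Strong caching: reuse the local copy without contacting the server
    # as long as its age is below max-age.
    return (now - fetched_at) < max_age

# A file fetched 1 hour ago under max-age=86400 (24h) is still fresh:
print(is_fresh(fetched_at=0, max_age=86400, now=3600))   # True
print(is_fresh(fetched_at=0, max_age=86400, now=90000))  # False
```

Once the window expires, the browser must revalidate or refetch.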

2. Validation caching

The browser asks the server if the file changed.

Example:

ETag: "abc123"

The browser sends:

If-None-Match: "abc123"

If unchanged, the server returns:

304 Not Modified

This response contains no body, making it much faster than downloading the file again.
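On the server side, the validation exchange boils down to a simple comparison. This `handle_conditional_get` function is a hypothetical sketch; real frameworks handle conditional requests for you:

```python
def handle_conditional_get(request_headers, current_etag, body):
    """Return (status, body) for a GET that may carry If-None-Match.

    Sketch of ETag validation; real web frameworks implement this built in.
    """
    if request_headers.get("If-None-Match") == current_etag:
        # The client's copy is current: 304 with an empty body,
        # far cheaper than re-sending the resource.
        return 304, b""
    return 200, body

status, payload = handle_conditional_get(
    {"If-None-Match": '"abc123"'}, '"abc123"', b"<html>...</html>"
)
print(status)  # 304
```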

CDN Caching: Moving Content Closer to Users

A Content Delivery Network (CDN) sits between users and your origin server. Popular providers include:

  • Cloudflare
  • Fastly
  • AWS CloudFront
  • Akamai

CDNs operate edge servers distributed around the world. Instead of every user requesting data from your origin server, requests are served from the nearest edge node.

How CDN Caching Works

When a request reaches the CDN:

  1. CDN checks if the content exists in the edge cache
  2. If yes → serve immediately
  3. If no → request from origin server
  4. Store response for future requests

This dramatically reduces:

  • latency
  • origin server load
  • bandwidth usage
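The four-step lookup flow above can be modeled as a tiny in-memory TTL cache. This `EdgeCache` class is a toy illustration, not how any real CDN is implemented:

```python
import time

class EdgeCache:
    """Toy model of a CDN edge node's check-then-fetch flow (illustrative only)."""

    def __init__(self):
        self._store = {}  # url -> (response, expires_at)

    def get(self, url, fetch_from_origin, ttl=3600):
        entry = self._store.get(url)
        if entry and entry[1] > time.time():
            return entry[0], "HIT"                        # step 2: serve from edge
        response = fetch_from_origin(url)                 # step 3: go to origin
        self._store[url] = (response, time.time() + ttl)  # step 4: store for later
        return response, "MISS"

edge = EdgeCache()
origin_calls = []

def origin(url):
    origin_calls.append(url)
    return b"logo bytes"

edge.get("https://example.com/logo.png", origin)  # MISS: origin is contacted
edge.get("https://example.com/logo.png", origin)  # HIT: served from the edge
print(len(origin_calls))  # 1
```

Only the first request reaches the origin; every later request within the TTL is absorbed at the edge.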

Example Flow

User in Germany requests:

https://example.com/logo.png

Without CDN:

User → US Server

With CDN:

User → Frankfurt edge server → (cache hit)

Response time can drop from 200–300ms to under 20ms.

CDN Cache Headers

CDNs respect similar headers to browsers:

Cache-Control
Surrogate-Control
s-maxage

Example optimized for CDNs:

Cache-Control: public, max-age=300, s-maxage=3600

Meaning:

  • Browser cache: 5 minutes
  • CDN cache: 1 hour
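A CDN or proxy extracts these directives by parsing the header. This `parse_cache_control` function is a simplified sketch (it ignores quoted values and edge cases in the real grammar):

```python
def parse_cache_control(header):
    """Parse a Cache-Control header into a directive dict (simplified sketch)."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if "=" in part:
            key, value = part.split("=", 1)
            directives[key] = int(value)  # real values are not always integers
        elif part:
            directives[part] = True
    return directives

cc = parse_cache_control("public, max-age=300, s-maxage=3600")
print(cc["max-age"])   # 300  -> browser TTL: 5 minutes
print(cc["s-maxage"])  # 3600 -> shared (CDN) TTL: 1 hour
```

Shared caches prefer `s-maxage` when present, which is why one header can give browsers and CDNs different lifetimes.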

Application-Level Caching: Avoiding Expensive Backend Work

Even with browser and CDN caching, dynamic requests still reach your backend.

Application caching reduces the cost of generating responses by storing computed results in memory or fast storage.


Common tools include:

  • Redis
  • Memcached
  • in-process caches (like Node LRU cache)
  • database query caches

Example Problem

Imagine an API endpoint:

GET /top-posts

Without caching:

Request → App → Database query → Response

If the query runs 10,000 times per minute, your database becomes the bottleneck.

With Application Cache

Request → Cache → Response

If cache misses:

Request → App → Database → Cache → Response

This pattern is called cache-aside.

Example pseudocode:

posts = redis.get("top_posts")

if posts is None:
    posts = database.query("SELECT * FROM posts ORDER BY score DESC LIMIT 10")
    redis.set("top_posts", posts, ttl=300)

return posts

Now the expensive query runs once every 5 minutes instead of thousands of times.

How These Layers Work Together

The real power comes from combining them.

A request typically flows like this:

User

Browser Cache

CDN Edge Cache

Application Cache

Database

Most requests never reach the bottom of the stack.

Example distribution in a well-optimized system:

  • 60–80% served by browser cache

  • 15–30% served by CDN

  • 5–10% reach application

  • <1% hit database

This layered architecture is why companies like Netflix, Shopify, and Reddit can handle millions of requests per second.

How to Design a Practical Caching Strategy

Here’s a simple approach used in many production systems.

Step 1: Cache Static Assets Aggressively

Use long TTLs for immutable files.

Example:

Cache-Control: public, max-age=31536000

Pair this with content hashing:

app.34f9a.js

If the file changes, the name changes.
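Build tools generate these names by hashing the file contents. This `hashed_filename` helper is a hypothetical sketch of the idea bundlers like webpack implement:

```python
import hashlib

def hashed_filename(name, content, digest_len=5):
    """Embed a short content hash in a filename, e.g. app.js -> app.34f9a.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    stem, dot, ext = name.rpartition(".")
    return f"{stem}.{digest}.{ext}" if dot else f"{name}.{digest}"

v1 = hashed_filename("app.js", b"console.log(1)")
v2 = hashed_filename("app.js", b"console.log(2)")
print(v1 != v2)  # True: any content change yields a new name,
                 # so the year-long cache entry can never go stale
```

Because the old URL never changes meaning, you get aggressive caching with zero invalidation risk.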

Step 2: Use CDN Edge Caching for Public Pages

Cache pages that do not depend on user identity.

Good candidates:

  • blog posts
  • documentation
  • marketing pages

CDN edge caching often provides the largest performance gain.

Step 3: Cache Expensive Backend Queries

Use Redis or Memcached for:

  • API responses
  • expensive database queries
  • session data

Focus on endpoints with:

  • high request volume
  • expensive computation

Step 4: Plan for Cache Invalidation

Caching is easy. Cache invalidation is the hard part.


Common strategies include:

  • TTL expiration
  • cache busting
  • versioned keys
  • event-based invalidation

For example:

user_profile_v2:1234

Changing the version automatically invalidates older cache entries.
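Versioned keys can be sketched with a thin wrapper around any key-value store. This `VersionedCache` class is an illustrative sketch using a dict in place of Redis:

```python
class VersionedCache:
    """Cache-key versioning sketch: bumping a version invalidates old entries."""

    def __init__(self):
        self._store = {}
        self._versions = {}  # namespace -> current version number

    def _key(self, namespace, entity_id):
        version = self._versions.get(namespace, 1)
        return f"{namespace}_v{version}:{entity_id}"  # e.g. user_profile_v2:1234

    def set(self, namespace, entity_id, value):
        self._store[self._key(namespace, entity_id)] = value

    def get(self, namespace, entity_id):
        return self._store.get(self._key(namespace, entity_id))

    def invalidate_all(self, namespace):
        # Old keys are simply never read again; TTLs can reap them later.
        self._versions[namespace] = self._versions.get(namespace, 1) + 1

cache = VersionedCache()
cache.set("user_profile", "1234", {"name": "Ada"})
cache.invalidate_all("user_profile")
print(cache.get("user_profile", "1234"))  # None: the old version is unreachable
```

Nothing is deleted eagerly; invalidation is just a pointer move, which makes it cheap and atomic.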

Common Caching Mistakes Engineers Make

Even experienced developers run into caching problems.

Here are the big ones.

Caching personalized content

Never cache user-specific responses globally.

Example:

/dashboard

This should bypass CDN caching.
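Assuming standard HTTP caching semantics, one way to express this is to mark the response as private so shared caches (CDNs) refuse to store it:

Cache-Control: private

For fully sensitive responses, `no-store` forbids caching anywhere, including the browser.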

Forgetting cache headers

Without headers like Cache-Control, neither browsers nor CDNs know what they are allowed to cache or for how long.

Over-caching dynamic data

Real-time data like stock prices or chat messages should not have long TTLs.

Ignoring cache observability

You should monitor:

  • cache hit ratio
  • latency improvements
  • eviction rates

Without this data, you cannot tune caching effectively.
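The core metric, hit ratio, is just hits over total lookups. This `CacheStats` class is a minimal sketch of the counters you would export to your monitoring system:

```python
class CacheStats:
    """Minimal hit/miss counters for computing a cache hit ratio (sketch)."""

    def __init__(self):
        self.hits = 0
        self.misses = 0

    def record(self, hit):
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

stats = CacheStats()
for outcome in [True, True, True, False]:
    stats.record(outcome)
print(stats.hit_ratio)  # 0.75
```

In practice you would track this per cache layer and per endpoint, since a single global ratio hides which caches are underperforming.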

FAQ

What is a cache hit vs cache miss?

A cache hit means the requested data was found in cache.
A cache miss means the system had to fetch the data from the source.

Higher hit ratios mean better performance.

Which caching layer provides the biggest speed boost?

Usually CDN caching, because it reduces the network distance between users and servers and applies to the widest range of content.

Should APIs use CDN caching?

Yes, for public API responses. Many modern APIs cache responses at the edge to reduce backend load.

Is Redis required for application caching?

Not strictly. But Redis is popular because it provides:

  • in-memory speed

  • TTL expiration

  • distributed caching

Honest Takeaway

Caching layers are one of the most powerful performance tools in modern web architecture. When used correctly, they can reduce server load by orders of magnitude and make applications feel instant to users.

But caching is not a magic switch. It requires thoughtful design around TTL strategies, invalidation rules, and observability. The teams that do this well treat caching as part of system architecture, not just an optimization added later.

If you remember one principle, make it this:

Cache as close to the user as possible, and only compute when you absolutely must.

Marcus is a news reporter for Technori. He is an expert in AI and loves to keep up-to-date with current research, trends and companies.