CloudFront CDN in Practice (1) — How a CDN and CloudFront Work


Introduction

A service that runs fine on a single API server slows down once its users spread worldwide. A Seoul-region server is fast for Tokyo users but far from São Paulo. On top of that, the origin serves the same images, JS, CSS, and static responses on every request, so load piles up.

A CDN (Content Delivery Network) solves both at once. It caches copies close to users and serves those copies instead of the origin. AWS’s CDN is CloudFront.

This series covers CloudFront from concepts to hands-on practice. The hands-on part uses an example that puts a Spring Boot + Kotlin origin behind CloudFront and is reproducible with Terraform.

This post covers the concepts you must understand before the Part 2 hands-on. If you don’t know why a cache hits or misses, you’ll hit “why isn’t it caching?” and “why does the old response keep showing?” in the lab.


TL;DR

  • A CDN keeps copies near users. Instead of the origin server, an edge server close to the user serves a cached copy. Latency drops and origin load falls.
  • CloudFront works per distribution. One distribution points at origins, and per-path behaviors decide “what to cache and what to send to the origin.”
  • Cache is found by key, expires by lifetime. A request becomes a cache key to look up a copy (hit). If absent, it fetches from the origin (miss). Lifetime comes from the origin’s Cache-Control header.
  • Cache static, pass through dynamic. Cache images/JS/CSS for a long time; don’t cache per-user API responses — send them to the origin. Split rules by path.
  • To update, prefer versioning over invalidation. Two ways to swap cached copies: invalidation and filename versioning. For static assets, versioning is cheaper and safer.

1. Why a CDN

Serving worldwide traffic from a single origin server causes two problems.

| Problem | Description |
| --- | --- |
| Latency | Physically distant users have long round trips. Seoul server ↔ São Paulo user is hundreds of ms |
| Origin load | Serving the same static file from the origin on every request wastes bandwidth and CPU |

A CDN caches copies at edge locations scattered worldwide. User requests route to the nearest edge; if a copy exists there, it responds immediately without reaching the origin.

flowchart LR
    user["User<br/>(São Paulo)"]
    edge["Edge location<br/>(near São Paulo)"]
    origin["Origin<br/>(Seoul region)"]

    user -->|close · fast| edge
    edge -.only on miss.-> origin

The key is the dashed “only on miss” line. When most static requests end at the edge, origin load drops and users see faster responses.
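To put rough numbers on that claim, here is a minimal Python sketch of the average latency a user sees at a given hit ratio. The round-trip figures (20 ms to the edge, 300 ms from edge to origin) are illustrative assumptions, not measurements, and the model ignores transfer time and connection setup.

```python
def expected_latency_ms(hit_ratio: float, edge_rtt_ms: float, origin_rtt_ms: float) -> float:
    """Average latency when a hit costs one user-to-edge round trip and a
    miss additionally costs one edge-to-origin round trip."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_cost_ms = edge_rtt_ms + origin_rtt_ms
    return hit_ratio * edge_rtt_ms + (1.0 - hit_ratio) * miss_cost_ms


if __name__ == "__main__":
    # Hypothetical round-trip times: 20 ms to the edge, 300 ms edge to origin.
    for ratio in (0.0, 0.5, 0.9, 0.99):
        print(f"hit ratio {ratio:.2f}: {expected_latency_ms(ratio, 20, 300):.1f} ms")
```

Under these assumed numbers, moving from a 0% to a 90% hit ratio takes the average from 320 ms down to about 50 ms, which is why the hit ratio is the number to watch.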


2. CloudFront Building Blocks

Configuring CloudFront means distinguishing four concepts.

flowchart TB
    subgraph dist["Distribution"]
      b1["Behavior: /static/*<br/>(cache long)"]
      b2["Behavior: /api/*<br/>(no cache)"]
      b3["Behavior: default (*)"]
    end
    edges["Edge locations + Regional edge caches"]
    o1["Origin A: S3 (static)"]
    o2["Origin B: ALB → Spring Boot (dynamic)"]

    edges --> dist
    b1 --> o1
    b2 --> o2
    b3 --> o2

| Component | Role |
| --- | --- |
| Distribution | The top-level unit of CloudFront config. Has one domain (d123.cloudfront.net or a custom domain) |
| Origin | The source server: an S3 bucket, an ALB, or any HTTP server |
| Behavior | A rule, per path pattern (/static/*, /api/*), deciding which origin to use, whether to cache, and which headers/cookies to forward |
| Edge location | A cache server (PoP) near users. The first-tier cache |
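The way a distribution picks a behavior for a request can be sketched as "first matching path pattern wins, otherwise the default." The Python below is an illustration only: the `Behavior` class, the origin names, and the `caching` labels are hypothetical, and `fnmatchcase` is a stand-in for CloudFront's matcher (it also accepts `[seq]` patterns, which is an extra that CloudFront path patterns are not assumed to have).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Behavior:
    path_pattern: str
    origin: str    # hypothetical origin identifier
    caching: str   # label taken from the diagram: "long", "none", or "unspecified"


# Behaviors are checked in order; the default behavior is the fallback.
BEHAVIORS = [
    Behavior("/static/*", "s3-static", "long"),
    Behavior("/api/*", "alb-spring-boot", "none"),
]
DEFAULT_BEHAVIOR = Behavior("*", "alb-spring-boot", "unspecified")


def select_behavior(path: str) -> Behavior:
    """Return the first behavior whose pattern matches the request path."""
    for behavior in BEHAVIORS:
        if fnmatchcase(path, behavior.path_pattern):  # case-sensitive match
            return behavior
    return DEFAULT_BEHAVIOR


if __name__ == "__main__":
    for path in ("/static/img/logo.png", "/api/me", "/about"):
        chosen = select_behavior(path)
        print(f"{path} -> {chosen.path_pattern} ({chosen.origin}, cache: {chosen.caching})")
```

Because the first match wins, the order of the list matters: a broad pattern placed ahead of a narrower one would shadow it.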

2.1 Edge Locations and Regional Edge Caches

CloudFront’s cache is two-tier. On a miss at the nearest edge location, it doesn’t go straight to the origin — it checks a larger Regional Edge Cache first. Many edges share the same regional cache, so a copy fetched by one edge is reused by others, cutting origin load further.

flowchart LR
    e1["Edge A"] --> rec["Regional Edge Cache"]
    e2["Edge B"] --> rec
    rec -.only on miss.-> origin["Origin"]
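The sharing effect is easy to see in a toy model. The following Python sketch chains two hypothetical edges to one regional cache and counts origin requests; it deliberately ignores TTLs, eviction, and everything else a real cache does.

```python
class Origin:
    """Stand-in origin that counts how many requests reach it."""

    def __init__(self, objects: dict[str, str]) -> None:
        self.objects = objects
        self.requests = 0

    def fetch(self, path: str) -> str:
        self.requests += 1
        return self.objects[path]


class CacheTier:
    """A cache that asks its upstream (another tier or the origin) on a miss."""

    def __init__(self, name: str, upstream) -> None:
        self.name = name
        self.upstream = upstream
        self.store: dict[str, str] = {}

    def fetch(self, path: str) -> str:
        if path not in self.store:            # miss: go one tier up
            self.store[path] = self.upstream.fetch(path)
        return self.store[path]               # hit, or freshly stored copy


if __name__ == "__main__":
    origin = Origin({"/static/app.js": "console.log('hi')"})
    regional = CacheTier("regional", origin)
    edge_a = CacheTier("edge-a", regional)
    edge_b = CacheTier("edge-b", regional)

    edge_a.fetch("/static/app.js")  # misses at edge A and regional, reaches origin
    edge_b.fetch("/static/app.js")  # misses at edge B, but regional already has it
    print("origin requests:", origin.requests)
```

Two edges requested the object, but only the first request travelled all the way to the origin.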

3. How Caching Works

3.1 Request Flow — hit and miss

A request reaching the edge is turned into a cache key to look up a copy.

flowchart TB
    req["Request arrives"] --> key["Compute cache key"]
    key --> check{"Copy exists<br/>and not expired?"}
    check -->|yes = HIT| serve["Serve from cache"]
    check -->|no = MISS| fetch["Fetch from origin"]
    fetch --> store["Store in cache"] --> serve

The X-Cache response header shows the result — Hit from cloudfront or Miss from cloudfront. Part 2 verifies cache behavior with this header.
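The flow in the diagram can be sketched in a few lines of Python. `EdgeCache` and its injected `clock` are hypothetical names for illustration, and the sketch reports an expired copy as a plain miss; a real edge may instead revalidate the copy with the origin and report a different X-Cache value.

```python
import time
from typing import Callable


class EdgeCache:
    """Toy edge cache: look up by cache key, serve if present and not expired."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[str, float]] = {}  # key -> (body, expires_at)

    def get(self, cache_key: str) -> tuple[str, str]:
        """Return (body, X-Cache-style label) for the given cache key."""
        now = self._clock()
        entry = self._entries.get(cache_key)
        if entry is not None and now < entry[1]:
            return entry[0], "Hit from cloudfront"
        body = self._fetch(cache_key)                      # miss: fetch from origin
        self._entries[cache_key] = (body, now + self._ttl)  # store with expiry
        return body, "Miss from cloudfront"


if __name__ == "__main__":
    cache = EdgeCache(lambda key: f"body of {key}", ttl_seconds=60)
    print(cache.get("/img/logo.png")[1])  # first request
    print(cache.get("/img/logo.png")[1])  # repeated request within the TTL
```

Injecting the clock keeps the expiry logic testable without sleeping.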

3.2 Cache Key — What Counts as the Same Request

CloudFront treats requests with the same cache key as “the same request” and serves the same copy. By default the key is just the distribution’s domain name plus the URL path, but a cache policy can add:

| Key component | Including it means |
| --- | --- |
| Path | Always included (/img/logo.png) |
| Query string | A separate copy per ?v=2. Used for versioning |
| Headers | e.g. Accept-Encoding (compression variants), Accept-Language (per language) |
| Cookies | For responses that vary per user. But the hit ratio plummets, so be careful |

Key point: The more variables in the cache key, the more the copies fragment and the lower the hit ratio. Include only “what genuinely makes the response differ.” Putting all cookies in the key, for instance, gives every user a different copy and effectively disables the cache.
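The fragmentation effect can be made concrete by counting distinct keys. This Python sketch builds a key from the path plus only the query parameters, headers, and cookies that were explicitly opted in; the function name and the `key_*` parameters are hypothetical and not CloudFront's actual key format.

```python
from urllib.parse import parse_qsl, urlsplit


def cache_key(url, headers, cookies, *, key_query=(), key_headers=(), key_cookies=()):
    """Build a hashable key from the path plus only the opted-in inputs."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    lowered = {name.lower(): value for name, value in headers.items()}
    return (
        parts.path,
        tuple((q, query.get(q)) for q in sorted(key_query)),
        tuple((h.lower(), lowered.get(h.lower())) for h in sorted(key_headers)),
        tuple((c, cookies.get(c)) for c in sorted(key_cookies)),
    )


if __name__ == "__main__":
    # Three users request the same image; only their session cookies differ.
    requests = [
        ("/img/logo.png?v=2", {"Accept-Language": "en"}, {"session": s})
        for s in ("alice", "bob", "carol")
    ]
    path_only = {cache_key(u, h, c) for u, h, c in requests}
    with_cookie = {cache_key(u, h, c, key_cookies=("session",)) for u, h, c in requests}
    print("copies with path-only key:", len(path_only))
    print("copies with session cookie in key:", len(with_cookie))
```

With the path-only key all three users share one copy; adding the session cookie produces one copy per user, so no request can ever hit a copy stored for someone else.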

3.3 TTL and Cache-Control — How Long to Keep

How long to keep a copy is set by TTL (Time To Live). TTL is mainly determined by the Cache-Control response header from the origin.

Note: Cache-Control is a two-way header. It can appear on both requests and responses, but the lifetime (“how long to cache”) is decided by the response from the server/origin. A request-side Cache-Control (e.g. the no-cache a browser sends on a hard refresh) is just a signal like “fetch fresh this time.” That is why Part 2 sets the CDN cache lifetime by attaching headers to the origin (Spring Boot) response.

Cache-Control: public, max-age=31536000, immutable   # cache 1 year (static assets)
Cache-Control: no-store                              # never cache (private API)
Cache-Control: no-cache                              # revalidate with origin each time

A CloudFront behavior has Min/Default/Max TTL settings that interact with the origin header.

| Setting | Meaning |
| --- | --- |
| Min TTL | Minimum lifetime. Kept at least this long even if the origin says shorter |
| Max TTL | Maximum lifetime. Capped at this even if the origin says longer |
| Default TTL | Applied when the origin sends no Cache-Control |

Caution: When “why does the old response keep showing?” comes up, the usual cause is the origin sending a long max-age or a long Min TTL on the behavior. For dynamic responses, the origin must clearly send no-store/no-cache (set with Kotlin in Part 2).
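How the three TTL settings combine with the origin header can be sketched as a clamp. The Python below is a simplified model, not CloudFront's implementation: it ignores the Expires header, and two details are assumptions to verify against the CloudFront documentation, namely that s-maxage takes precedence over max-age and that a Min TTL above zero overrides no-store/no-cache/private.

```python
from typing import Optional


def parse_cache_control(value: Optional[str]) -> dict:
    """Split a Cache-Control header into {directive: argument} (argument may be '')."""
    directives = {}
    for part in (value or "").split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"')
    return directives


def effective_ttl(cache_control: Optional[str], min_ttl: int, default_ttl: int, max_ttl: int) -> int:
    """Seconds the edge keeps a copy under this simplified model (0 = not cached)."""
    directives = parse_cache_control(cache_control)
    if any(d in directives for d in ("no-store", "no-cache", "private")):
        # Assumption: the origin's wish is respected only when Min TTL is 0.
        return min_ttl
    for name in ("s-maxage", "max-age"):  # assumption: s-maxage wins if both exist
        arg = directives.get(name, "")
        if arg.isdigit():
            return max(min_ttl, min(int(arg), max_ttl))  # clamp into [min, max]
    return default_ttl  # origin gave no lifetime


if __name__ == "__main__":
    settings = dict(min_ttl=0, default_ttl=86400, max_ttl=31536000)
    for header in ("public, max-age=31536000, immutable", "no-store", None):
        print(repr(header), "->", effective_ttl(header, **settings))
```

The second branch is the one behind stale dynamic responses: in this model, with a Min TTL of 300 even a `no-store` response is kept for five minutes.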


4. Invalidation vs Versioning

Two ways to swap a deployed copy.

| Method | How | Pros/cons |
| --- | --- | --- |
| Invalidation | Specify a path like /static/app.js to remove it from edge caches | Quick and needs no URL change, but with many paths it costs money and takes longer to complete. Billed beyond the monthly free tier |
| Versioning | Put a version in the filename/query and deploy a new URL (app.abc123.js, app.js?v=2) | The new URL naturally misses and fetches a fresh copy. No invalidation needed, easy rollback |

Bottom line: Versioning is standard for static assets. Hash the filename at build time (app.[hash].js) and cache long with Cache-Control: max-age=31536000, immutable (one year). When content changes, the hash changes, the URL changes, and the new version applies instantly without invalidation. Reserve invalidation for the few files whose URL can’t change, like HTML.
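Build tools normally do the hashing for you, but the idea fits in a few lines. This Python sketch derives the versioned filename from a SHA-256 digest of the content; the 8-character digest length and the function name are illustrative choices, not a convention from any particular bundler.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename: str, content: bytes, digest_chars: int = 8) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


if __name__ == "__main__":
    old = versioned_name("static/app.js", b"console.log('v1');")
    new = versioned_name("static/app.js", b"console.log('v2');")
    print(old)
    print(new)
    print("URL changed:", old != new)
```

Identical content always yields the same name, so unchanged files keep their cached copies across deploys, and only files that actually changed get new URLs.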


5. What to Cache and What Not To

The core CDN design decision is splitting cache policy by path.

| Content | Example path | Policy |
| --- | --- | --- |
| Static assets | /static/*, /assets/* | Cache long + versioning (max-age=1 year, immutable) |
| Public, identical responses | /, /about (same HTML for everyone) | Cache briefly (max-age=minutes) |
| Per-user dynamic | /api/*, /me | Don’t cache (no-store), pass to origin |

In CloudFront this split is implemented as per-path-pattern behaviors. /api/* is a behavior with caching off that forwards all headers/cookies to the origin; /static/* is a behavior that caches long and ignores cookies. Part 2 builds this behavior split with a Spring Boot origin, in Terraform.
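On the origin side, the same split comes down to choosing a Cache-Control value per path. The following is a language-neutral sketch in Python rather than the Spring Boot code of Part 2; the five-minute lifetime for public pages is an assumed stand-in for "minutes," and the function name is hypothetical.

```python
ONE_YEAR_SECONDS = 31536000
PUBLIC_PAGE_SECONDS = 300  # assumption: "minutes" taken as five minutes


def cache_control_for(path: str) -> str:
    """Pick the Cache-Control header the origin should attach for a path."""
    if path.startswith(("/static/", "/assets/")):
        # Versioned static assets: safe to cache for a year.
        return f"public, max-age={ONE_YEAR_SECONDS}, immutable"
    if path.startswith("/api/") or path == "/me":
        # Per-user dynamic responses: never store a copy.
        return "no-store"
    # Public pages that are identical for everyone: cache briefly.
    return f"public, max-age={PUBLIC_PAGE_SECONDS}"


if __name__ == "__main__":
    for path in ("/static/app.abc123.js", "/api/orders", "/me", "/about"):
        print(f"{path}: {cache_control_for(path)}")
```

Keeping this mapping at the origin means the lifetime travels with the response, whichever behavior the request came through.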


Recap

| Concept | One-liner |
| --- | --- |
| CDN | Caches copies at edges near users to cut latency and origin load |
| Distribution | The top-level CloudFront unit, one domain |
| Origin | The source server (S3, ALB, HTTP server) |
| Behavior | Per-path-pattern cache/forwarding rules |
| Cache key | What counts as the “same request”: fewer inputs, higher hit ratio |
| Cache-Control / TTL | Copy lifetime. Dynamic must say no-store |
| Versioning | The standard for updating static assets, before invalidation |

Part 2 applies these concepts for real. We bring up a Spring Boot + Kotlin app as the origin, set Cache-Control/ETag in Kotlin, split CloudFront behaviors into /api/* (no cache) and /static/* (cached) with Terraform, and verify hit/miss directly via the X-Cache header.


Appendix

A. Glossary

| Term | Description |
| --- | --- |
| CDN | Content Delivery Network. Caches and serves copies near users |
| Origin | The server holding the original content (S3/ALB/HTTP) |
| Edge location | A cache server (PoP) near users. CloudFront’s first-tier cache |
| Regional Edge Cache | A larger second-tier cache between edges and the origin |
| Distribution | A CloudFront deployment unit (one domain) |
| Behavior | Per-path-pattern cache/routing rule |
| Cache key | The key identifying a request to find a copy (path/query/header/cookie combo) |
| TTL | Copy lifetime (Time To Live) |
| hit / miss | Copy present in cache / absent so fetched from origin |
| hit ratio | Share of requests served from cache |
| Invalidation | Removing a copy of a specific path from edge caches |
