CDN caching and TTL optimization
Brief Summary
A CDN cache is an "accelerator + shield" between the user and the origin. It works well when:
1. The cache key is stable and contains no "noise."
2. The TTL policy holds under load: 's-maxage'/'max-age' + 'stale-while-revalidate'/'stale-if-error'.
3. Invalidation is managed: by tags/prefixes + "soft" purge.
4. Tiered cache/origin shield and negative cache are enabled.
5. Observability is in place: hit ratio by layer, p95 TTFB, share of 304 responses.
Base headers and what they mean
- `Cache-Control`:
  - 'max-age=<seconds>' - TTL for the browser.
  - 's-maxage=<seconds>' - TTL for CDN/proxy (overrides 'max-age' in shared caches).
  - 'stale-while-revalidate=<seconds>' - serve the stale copy while refreshing it in the background.
  - 'stale-if-error=<seconds>' - serve the stale copy when origin returns an error.
  - 'immutable' - the resource does not change (suitable for versioned assets).
- 'ETag'/'Last-Modified' - validators for conditional requests (304); they save bytes and origin CPU.
- 'Vary' - a list of request headers that affect the cache key (use with restraint!).
- 'Surrogate-Control' - an "extended" Cache-Control for the CDN (if supported).
- 'Expires' - obsolete, but still honored by clients.
Example (versioned static asset):
Cache-Control: public, max-age=31536000, immutable
Example (semi-dynamic content with safe staleness):
Cache-Control: public, s-maxage=300, max-age=60, stale-while-revalidate=600, stale-if-error=86400
ETag: "a1c3..."
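To make the interaction of these directives concrete, here is a minimal Python sketch that classifies a cached object by its age. It assumes, following RFC 5861, that both stale windows start at the moment the object stops being fresh. The names `parse_cache_control`, `classify`, and `Freshness` are illustrative, and the parser handles only simple `name=value` directives.

```python
from enum import Enum


class Freshness(Enum):
    FRESH = "fresh"                                   # serve from cache
    STALE_WHILE_REVALIDATE = "stale-while-revalidate" # serve stale, refresh in background
    STALE_IF_ERROR = "stale-if-error"                 # serve stale only if origin fails
    EXPIRED = "expired"                               # must go to origin


def parse_cache_control(header: str) -> dict:
    """Parse a Cache-Control value into {directive: seconds, or True if valueless}."""
    directives = {}
    for part in header.split(","):
        name, sep, value = part.strip().partition("=")
        name = name.strip().lower()
        if not name:
            continue
        value = value.strip()
        directives[name] = int(value) if sep and value.isdigit() else True
    return directives


def classify(header: str, age: int, shared_cache: bool = True) -> Freshness:
    """Classify an object of the given age (seconds) under the given policy."""
    d = parse_cache_control(header)
    ttl = int(d.get("max-age", 0))
    if shared_cache and "s-maxage" in d:
        ttl = int(d["s-maxage"])  # s-maxage wins over max-age in a CDN/proxy
    swr = int(d.get("stale-while-revalidate", 0))
    sie = int(d.get("stale-if-error", 0))
    if age < ttl:
        return Freshness.FRESH
    if age < ttl + swr:
        return Freshness.STALE_WHILE_REVALIDATE
    if age < ttl + sie:
        return Freshness.STALE_IF_ERROR
    return Freshness.EXPIRED
```

With the second example header above, a CDN would treat a 100-second-old object as fresh, while a browser (`shared_cache=False`) would already be in the stale-while-revalidate window, because 'max-age=60' applies there.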
Cache Key Design and Normalization
The goal is for requests that are essentially the same to map to the same cached object.
URL normalization: case, double slashes, trailing slash, order of query parameters.
Ignore "noise": 'utm_*', 'fbclid', 'gclid', arbitrary ref tags.
Limited 'Vary': only truly significant headers ('Accept-Encoding', sometimes 'Accept', 'Accept-Language' for locale).
Device class: if necessary, use 2-3 classes (mobile/desktop/tablet), not endless User-Agent branches.
Auth context: do not cache private content by default; use signed URLs, cookie-based bypass, or separate public/private paths.
Surrogate-Key: product:123 catalog
Cache-Control: public, s-maxage=300, stale-while-revalidate=600
Vary: Accept-Encoding
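The normalization rules in this section can be sketched in Python roughly as follows. This is an illustration, not a CDN's actual algorithm: the function name `normalize_cache_key` and the exact tracking-parameter list are assumptions, and only the scheme and host are lowercased because paths may be case-sensitive on some origins.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit

# Assumed "noise" parameters; adjust to the real traffic.
TRACKING_PREFIXES = ("utm_",)
TRACKING_PARAMS = {"fbclid", "gclid", "ref"}


def normalize_cache_key(url: str) -> str:
    """Return a canonical form of the URL suitable for use as a cache key."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Collapse repeated slashes and drop the trailing slash (except for root).
    path = re.sub(r"/{2,}", "/", parts.path) or "/"
    if len(path) > 1:
        path = path.rstrip("/")
    # Drop tracking parameters, then sort the rest for a stable order.
    params = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k.lower() not in TRACKING_PARAMS
        and not k.lower().startswith(TRACKING_PREFIXES)
    ]
    params.sort()
    query = urlencode(params)
    key = f"{parts.scheme.lower()}://{host}{path}"
    return f"{key}?{query}" if query else key
```

Two URLs that differ only in parameter order, tracking tags, or slashes then produce the same key, which is the property the section asks for.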
TTL strategies by content type
Invalidation policies
By URL/prefix: "purge everything under '/static/2025-11-05/'."
By tag/key: "invalidate everything tagged 'catalog' and 'product:123'."
Soft purge: mark the object as stale without deleting it, so the refill is faster.
Event-driven: a CI/CD or admin event calls an "invalidate tags" webhook.
Recommendation: combine both tactics: versioned paths for assets + tag purge for content/pages.
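The difference between tag purge and soft purge is easier to see in a toy in-memory model. The sketch below is an assumption-laden illustration (the `TaggedCache` class and its methods are hypothetical, not any CDN's API): a soft purge only flags entries as stale so they can still be served while a refresh happens, whereas a hard purge removes them.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass
class Entry:
    body: bytes
    tags: frozenset
    stale: bool = False


class TaggedCache:
    """Toy cache with a tag index, supporting soft and hard purge by tag."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}
        self._by_tag: Dict[str, Set[str]] = {}

    def put(self, key: str, body: bytes, tags: Iterable[str]) -> None:
        self._remove(key)  # drop old tag index entries on overwrite
        entry = Entry(body, frozenset(tags))
        self._entries[key] = entry
        for tag in entry.tags:
            self._by_tag.setdefault(tag, set()).add(key)

    def get(self, key: str) -> Optional[Entry]:
        return self._entries.get(key)

    def soft_purge(self, tag: str) -> int:
        """Mark every entry with this tag as stale; keep the bodies."""
        keys = self._by_tag.get(tag, set())
        for key in keys:
            self._entries[key].stale = True
        return len(keys)

    def hard_purge(self, tag: str) -> int:
        """Delete every entry with this tag."""
        keys = list(self._by_tag.get(tag, set()))
        for key in keys:
            self._remove(key)
        return len(keys)

    def _remove(self, key: str) -> None:
        entry = self._entries.pop(key, None)
        if entry is None:
            return
        for tag in entry.tags:
            keys = self._by_tag.get(tag)
            if keys is not None:
                keys.discard(key)
                if not keys:
                    del self._by_tag[tag]
```

After `soft_purge("catalog")`, the affected entries still return their bodies with `stale=True`, which is what allows serving them during background revalidation.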
Tiered-cache, origin-shield and prewarm
Tiered-cache: regional CDN layers → fewer requests to origin.
Origin-shield: a single "shield" POP in front of origin; improves locality and hit-ratio.
Prewarm (pre-fetch): warm up hot URLs/caches before an event/release.
Negative-cache: cache 5xx/timeouts briefly (30-120 s) so that a storm of retries does not overwhelm origin.
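A negative cache can be sketched as a small map of recent failures with a short TTL. The Python below is only an illustration under assumptions: the `NegativeCache` class is hypothetical, the 60-second default sits inside the 30-120 s range mentioned above, and the clock is injectable so the behavior can be checked without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class NegativeCache:
    """Remember recent 5xx results for a short time to absorb retry storms."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._failures: Dict[str, Tuple[float, int]] = {}  # key -> (expires_at, status)

    def record(self, key: str, status: int) -> None:
        """Store a 5xx result; any other status clears the stored failure."""
        if status >= 500:
            self._failures[key] = (self._clock() + self._ttl, status)
        else:
            self._failures.pop(key, None)

    def lookup(self, key: str) -> Optional[int]:
        """Return the cached error status, or None if origin should be tried."""
        item = self._failures.get(key)
        if item is None:
            return None
        expires_at, status = item
        if self._clock() >= expires_at:
            del self._failures[key]
            return None
        return status
```

While `lookup` returns a status, the edge can answer immediately (or serve stale content) instead of forwarding every retry to a degraded origin.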
API Cache: When You Can
Only GET/HEAD, and only idempotent endpoints.
Key: path + essential query parameters (for example, '?category=...&page=...').
Validation: 'ETag'/'Last-Modified' and a short 's-maxage'.
Per-user filters: move personalization to the client/edge function, or use signed requests with a "public" response.
Cache-Control: public, s-maxage=30, max-age=5, stale-while-revalidate=120, stale-if-error=600
ETag: "feed-v42"
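On the origin side, ETag validation for a cached GET endpoint might look like the sketch below. It is a simplified illustration: `make_etag` and `respond` are hypothetical helpers, the ETag is derived from a SHA-256 hash of the body (one common choice, not the only one), and the comparison is the weak one, ignoring a `W/` prefix.

```python
import hashlib
from typing import Dict, Optional, Tuple

API_POLICY = "public, s-maxage=30, max-age=5, stale-while-revalidate=120, stale-if-error=600"


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload), answering 304 when the validator matches."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": API_POLICY}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # Weak comparison: a W/ prefix does not affect the match.
        candidates = [c[2:] if c.startswith("W/") else c for c in candidates]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""
    return 200, headers, body
```

A CDN that revalidates an expired object with `If-None-Match` then receives an empty 304 instead of the full payload whenever the feed has not changed.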
Cache poisoning protection
Strict URL/header normalization; an allowlist of parameters in the key.
Strip suspicious or duplicate headers ('X-Forwarded-*', oversized 'Accept').
Limit 'Vary' and cap the size/number of headers.
Domain separation: private/admin on a separate hostname with no cache.
Response validation: do not cache 4xx (except 404 for static); do not cache "user" pages without an explicit policy.
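The response-validation rule above can be expressed as a small decision function. The sketch below is one possible reading, with assumptions stated: the function name `is_cacheable` is hypothetical, "explicit policy" is interpreted as a 'public' directive in Cache-Control, and static content is assumed to live under a '/static/' prefix.

```python
from typing import Dict


def is_cacheable(method: str, status: int, path: str, headers: Dict[str, str]) -> bool:
    """Decide whether a response may be stored in a shared cache."""
    h = {name.lower(): value for name, value in headers.items()}
    if method.upper() not in ("GET", "HEAD"):
        return False
    if "set-cookie" in h:
        return False  # likely a per-user response
    directives = {d.strip().lower() for d in h.get("cache-control", "").split(",")}
    if directives & {"private", "no-store"}:
        return False
    if status == 200:
        return "public" in directives  # require an explicit policy
    if status == 404:
        return path.startswith("/static/")  # the only 4xx allowed, static only
    return False
```

Everything not explicitly allowed is rejected, which keeps an unexpected 401/403 or a personalized page from ending up in the shared cache.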
Compression and Formats
Brotli for text (js/css/json), gzip as a fallback; pre-compressed assets are acceptable.
Images: webp/avif where supported; use 'Vary: Accept' + derived variants.
Range requests for video/audio: the CDN caches chunks.
Content negotiation: keep the key cardinality low (device class instead of raw User-Agent strings).
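Keeping cardinality low usually means mapping raw request headers to a handful of buckets before they reach the cache key. The Python sketch below illustrates the idea; the function names, the JPEG fallback, and the substring-based User-Agent heuristic are assumptions, and q-values in 'Accept' are deliberately ignored for brevity.

```python
def pick_image_format(accept: str) -> str:
    """Map an Accept header to one of three image variants."""
    accepted = {part.split(";")[0].strip().lower() for part in accept.split(",")}
    for mime, fmt in (("image/avif", "avif"), ("image/webp", "webp")):
        if mime in accepted:
            return fmt
    return "jpeg"  # assumed universal fallback


def device_class(user_agent: str) -> str:
    """Collapse a raw User-Agent into mobile/tablet/desktop (crude heuristic)."""
    ua = user_agent.lower()
    if "ipad" in ua or "tablet" in ua:
        return "tablet"
    if "mobi" in ua or "iphone" in ua:
        return "mobile"
    return "desktop"


def variant_key(path: str, accept: str, user_agent: str) -> str:
    """Build a cache key with at most 3 x 3 variants per path."""
    return f"{path}|{pick_image_format(accept)}|{device_class(user_agent)}"
```

With this mapping each image path has at most nine cached variants, no matter how many distinct 'Accept' and 'User-Agent' strings arrive.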
Observability and SLO
Key Metrics
Hit-ratio (by bytes/requests) at edge/tier/shield.
p50/p95/p99 TTFB by region and type (static/API).
Fill-rate/origin egress: how much traffic goes to origin.
304 rate and average response size.
Error budget: share of responses served via 'stale-if-error'/'SWR'; purge frequency.
SLO examples
p95 TTFB: static, per region ≤ 120-150 ms; cached API GET ≤ 200-250 ms.
Edge hit-ratio: static ≥ 90%, semi-dynamic content ≥ 60%.
Share of responses served from the stale branch due to errors ≤ 0.5% over 30 days.
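As an illustration of how these metrics could be derived from edge logs, the sketch below computes the hit-ratio by requests and by bytes, plus a nearest-rank percentile for TTFB. The `LogRecord` shape is a hypothetical assumption; real CDN logs have their own field names and cache statuses.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class LogRecord:
    cache_status: str  # assumed values: "HIT" or "MISS"
    bytes_sent: int
    ttfb_ms: float


def hit_ratio(records: Sequence[LogRecord]) -> Tuple[float, float]:
    """Return (hit-ratio by requests, hit-ratio by bytes)."""
    if not records:
        raise ValueError("no records")
    hits = [r for r in records if r.cache_status == "HIT"]
    total_bytes = sum(r.bytes_sent for r in records)
    by_requests = len(hits) / len(records)
    by_bytes = sum(r.bytes_sent for r in hits) / total_bytes if total_bytes else 0.0
    return by_requests, by_bytes


def percentile(values: Sequence[float], p: int) -> float:
    """Nearest-rank percentile, e.g. p=95 for p95."""
    if not values:
        raise ValueError("no values")
    ordered: List[float] = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]
```

The two ratios can diverge sharply: a few large misses (video, big images) may leave the request hit-ratio high while most bytes still come from origin, which is why the metric list asks for both.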
Config cheat sheets
Nginx (reverse proxy before the CDN or in your own PoP)
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=CDN:512m max_size=100g inactive=7d;

# Simplified example: if any tracking parameter is present, drop the whole query string.
map $args $clean_args {
    "~(^|&)(utm_|gclid|fbclid)" "";
    default $args;
}

server {
    listen 443 ssl http2;
    set $cache_key "$scheme$request_method$host$uri?$clean_args $http_accept $http_accept_encoding";

    location /static/ {
        proxy_cache CDN;
        proxy_cache_key $cache_key;
        proxy_ignore_headers Set-Cookie;
        add_header Cache-Control "public, s-maxage=86400, max-age=3600, stale-while-revalidate=600" always;
        proxy_pass https://origin_static;
    }

    location /api/public/ {
        proxy_cache CDN;
        proxy_cache_key $cache_key;
        proxy_cache_valid 200 30s;
        # Revalidate expired entries with If-None-Match/If-Modified-Since
        # ($upstream_http_etag is not available when the request is sent).
        proxy_cache_revalidate on;
        add_header Cache-Control "public, s-maxage=30, max-age=5, stale-while-revalidate=120, stale-if-error=600" always;
        proxy_pass https://origin_api;
    }
}
Envoy (SWR + negative-cache, concept)
http_filters:
- name: envoy.filters.http.cache
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.cache.v3.CacheConfig
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.cache.simple_http_cache.v3.SimpleHttpCacheConfig
# Cache policies come from the Cache-Control/Surrogate-Control headers.
# 5xx errors are cached briefly via route/retry policy + local_rate_limit.
Headers for "fast" assets
Cache-Control: public, max-age=31536000, immutable
ETag: "hash"
Content-Encoding: br
Headers for semi-dynamic content (catalogs)
Cache-Control: public, s-maxage=600, max-age=120, stale-while-revalidate=1800, stale-if-error=86400
Vary: Accept-Encoding, Accept
FinOps: How caching saves money
Origin egress ↓ and less CPU/DB load → lower infrastructure costs.
Fewer requests to paid backends (search/index/images).
Target metrics: cost ($) per unit of p95 reduction and cost ($) per 1 GB of egress reduction; track the effect after launch.
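The egress side of this can be estimated with simple arithmetic: origin egress is roughly total traffic times (1 - byte hit-ratio). The sketch below is a back-of-the-envelope model with hypothetical function names; the traffic volume and the per-GB price in the usage note are made-up inputs, not figures from any provider.

```python
def origin_egress_gb(total_traffic_gb: float, byte_hit_ratio: float) -> float:
    """Traffic that still reaches origin for a given byte hit-ratio."""
    if not 0.0 <= byte_hit_ratio <= 1.0:
        raise ValueError("byte_hit_ratio must be between 0 and 1")
    return total_traffic_gb * (1.0 - byte_hit_ratio)


def egress_savings_usd(total_traffic_gb: float, ratio_before: float,
                       ratio_after: float, price_per_gb_usd: float) -> float:
    """Money saved on origin egress when the byte hit-ratio improves."""
    saved_gb = (origin_egress_gb(total_traffic_gb, ratio_before)
                - origin_egress_gb(total_traffic_gb, ratio_after))
    return saved_gb * price_per_gb_usd
```

For example, with an assumed 10,000 GB of monthly traffic and an assumed origin egress price of $0.05 per GB, raising the byte hit-ratio from 0.80 to 0.90 removes about 1,000 GB of origin egress, or roughly $50 per month, before counting the CPU/DB savings.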
iGaming/fintech specifics
Provider catalogs/assets: versioned paths + a one-year TTL.
Event/tournament landing pages: 's-maxage' of 1-5 min + 'SWR' of 10-30 min; tag purge on update.
Live pages (odds/tables): partial caching of JSON blocks with a short TTL (5-30 s); render personal blocks on the client.
PSP/payment endpoints: do not cache, strict 'no-store'; cache only reference data (BIN tables, statuses).
Antibot: cache static/GET, use gray routes for suspicious ASNs; keep noisy headers out of 'Vary'.
Implementation checklist
- Cache key is documented: URL normalization, list of allowed query parameters, 'Vary' only where needed.
- Public/private paths are separated; private paths use 'no-store' and bypass the CDN.
- TTL ladders by content type are defined; 'SWR'/'stale-if-error' are configured.
- Tiered-cache + origin-shield are configured; short negative-cache for 5xx is enabled.
- Tag/URL purge and soft purge are available and integrated with CI/CD.
- Compression (br/gzip), modern image formats, and range responses are enabled.
- Metrics: hit-ratio by layer, p95 TTFB, 304 rate, origin egress; alerts on failures.
- Playbooks: cache warm-up before peaks, emergency purge, origin degradation.
Common errors
Non-versioned assets with a long TTL → "sticky" bundles on users' devices.
Excessive 'Vary' (on 'User-Agent', on all headers) → a cardinality explosion and a low hit-ratio.
Caching 4xx (including 401/403) or private content.
Lack of negative-cache → an avalanche of requests to a degraded origin.
No tag purge → mass URL-by-URL purges and a refill storm.
The cache key includes "noisy" UTM/ref parameters.
TTL too short for static → extra load on the CDN and origin.
Mini playbooks
1) Warm up the cache before the event
1. Collect the top-N URLs from logs → 2. Prefetch in parallel (rate-limited) by region → 3. Check that hit-ratio goes up and p95 goes down.
2) Emergency soft purge of catalogs
1. Send 'PURGE'/tag-clear → 2. The CDN serves stale and fetches fresh content in the background → 3. Check that there are no spikes on origin.
3) Origin failure
1. 'stale-if-error' covers X hours → 2. Show a "maintenance in progress" banner at the edge → 3. After recovery, run a targeted warm-up.
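The first playbook (cache warm-up) can be sketched in Python as follows. This is a minimal illustration under assumptions: the log format ("METHOD URL" per line) is hypothetical, the `fetch` callable is injected so the sketch stays independent of any HTTP client, and bounded concurrency stands in for real rate limiting.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def top_urls(log_lines: Iterable[str], n: int) -> List[str]:
    """Return the n most requested GET URLs from 'METHOD URL' log lines."""
    counts: Counter = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "GET":
            counts[fields[1]] += 1
    return [url for url, _ in counts.most_common(n)]


def prewarm(urls: List[str], fetch: Callable[[str], int],
            max_workers: int = 4) -> Dict[str, int]:
    """Request each URL through the CDN and return {url: status}.

    max_workers bounds parallelism so the warm-up itself does not
    overload origin while the cache is still cold.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        statuses = list(pool.map(fetch, urls))
    return dict(zip(urls, statuses))
```

In practice `fetch` would issue a GET against each regional edge hostname and return the status code; comparing hit-ratio and p95 before and after the run covers the third step of the playbook.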
Result
A strong CDN strategy = a correct cache key + meaningful TTLs with SWR/stale-if-error + managed invalidation + tiered/shield + observability. Codify the policy in headers and IaC, measure hit-ratio and p95, and plan warm-ups ahead of peaks, so that users get fast responses and origin stays alive even in the hottest hour.