Edge caches and POP
1) What a POP is and why the "edge" matters
A POP (Point of Presence) is a content delivery network (CDN/edge) node located geographically close to the user. An edge cache is a layer that stores responses directly in the POP, which reduces:
- Latency (shorter RTT to the client).
- Load and cost on the origin (offload).
- Traffic between regions/clouds (egress savings).
The edge is not only a cache. Modern POPs support L7 routing, WAF/bot filtering, rate limiting, A/B and canary releases, transformations, and edge compute (scripts/functions).
2) Edge caching architectures
2.1 Flat vs tiered
Flat: every POP goes to the origin on a miss. Simple, but expensive for the origin.
Tiered/Shield: POP → Shield POP (central cache) → origin. The shield absorbs cache misses from the other POPs and acts as an umbrella for the origin.
2.2 Regional segments
Separate caching domains by region/jurisdiction (GDPR/data localization).
Variant: "EU-only POPs" and "Global POPs" with separate keys/rules.
2.3 Anycast + latency/geo-aware routing
Anycast brings the client to the nearest POP (in routing terms) via BGP.
Geo/latency-aware routing switches between POPs/regional pools based on active RTT/error measurements.
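The latency- and error-aware choice between POPs can be sketched roughly as follows. This is only an illustration: the dictionary fields (`name`, `rtt_ms`, `error_rate`) and the 5% error threshold are assumptions, not part of any particular CDN's API.

```python
def pick_pop(pops, max_error_rate=0.05):
    """Pick the lowest-RTT POP among those with an acceptable error rate."""
    healthy = [p for p in pops if p["error_rate"] <= max_error_rate]
    # Fail open: if every POP looks unhealthy, still route to the fastest one.
    candidates = healthy or pops
    return min(candidates, key=lambda p: p["rtt_ms"])["name"]


# Hypothetical measurements collected by active probes.
measurements = [
    {"name": "fra", "rtt_ms": 12, "error_rate": 0.20},
    {"name": "ams", "rtt_ms": 18, "error_rate": 0.01},
    {"name": "lhr", "rtt_ms": 25, "error_rate": 0.00},
]
print(pick_pop(measurements))
```

Here the fastest POP is skipped because its error rate exceeds the threshold, so the next-fastest healthy POP is chosen.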
3) Cache keys, `Vary`, TTL and freshness
3.1 Key design
Normalize queries: sort query parameters, remove noise (utm, ref).
Include semantic axes: `tenant`, `locale`, schema version (`v=3`), but avoid PII.
For private content, separate the public and private caches (see § 7).
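A minimal sketch of the key normalization described above, assuming the noise parameters are `ref` and anything prefixed with `utm_`. The key layout (`tenant|locale|v=N|host/path?query`) is a hypothetical format chosen for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Assumed noise list; adjust to the tracking parameters you actually see.
NOISE_PREFIXES = ("utm_",)
NOISE_PARAMS = {"ref"}


def cache_key(url: str, tenant: str, locale: str, schema_version: int) -> str:
    """Build a deterministic cache key from a URL plus semantic axes."""
    parts = urlsplit(url)
    params = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in NOISE_PARAMS and not k.startswith(NOISE_PREFIXES)
    ]
    params.sort()  # parameter order must not create distinct keys
    host = parts.netloc.lower()  # host names are case-insensitive
    return f"{tenant}|{locale}|v={schema_version}|{host}{parts.path}?{urlencode(params)}"


print(cache_key("https://Example.com/list?b=2&utm_source=x&a=1", "t1", "en", 3))
```

Two URLs that differ only in parameter order or tracking noise map to the same key, while a different tenant yields a different key.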
3.2 Cache-Control (HTTP)
Headers:
- `Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=60, stale-if-error=120`
- `ETag`/`Last-Modified` for conditional GETs (304).
- `Vary`: minimize cardinality (`Accept-Encoding`, `Accept-Language`, sometimes `Authorization`/`Cookie` for private paths).
- Micro-cache for "near-dynamic" content: 1-5 seconds + SWR.
3.3 Stale strategies
SWR (stale-while-revalidate): serve a stale response and refresh it in the background.
SIE (stale-if-error): if the origin returns an error, serve from the cache until the SIE TTL expires.
Soft/Hard TTL: after the soft TTL the object may still be served stale; after the hard TTL it is a full miss.
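The interaction of the soft TTL, SWR and SIE windows can be sketched as a small decision function. This is a simplified model rather than any specific CDN's algorithm: the `Entry` fields and the returned labels are assumptions, and `origin_ok` stands for "a synchronous fetch to the origin would succeed".

```python
from dataclasses import dataclass


@dataclass
class Entry:
    stored_at: float  # time the response was cached, in seconds
    max_age: float    # soft TTL: fresh until stored_at + max_age
    swr: float        # stale-while-revalidate window after the soft TTL
    sie: float        # stale-if-error window after the soft TTL


def decide(entry: Entry, now: float, origin_ok: bool) -> str:
    age = now - entry.stored_at
    if age <= entry.max_age:
        return "HIT"                 # still fresh
    stale_for = age - entry.max_age
    if stale_for <= entry.swr:
        return "STALE"               # serve stale, refresh in the background
    if origin_ok:
        return "MISS"                # past SWR: fetch synchronously
    if stale_for <= entry.sie:
        return "STALE_IF_ERROR"      # origin failed, stale copy still allowed
    return "ERROR"                   # hard TTL exceeded and origin is down


# Values mirror s-maxage=300, stale-while-revalidate=60, stale-if-error=120.
entry = Entry(stored_at=0, max_age=300, swr=60, sie=120)
for t, ok in [(100, True), (330, True), (400, True), (400, False), (500, False)]:
    print(t, ok, decide(entry, t, ok))
```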
4) Invalidation: how to update the "edge"
4.1 By key and by tags
PURGE/BAN by URL/prefix: crude but fast.
Surrogate-Key/Tags: assign tags to objects (`article:42`, `category:7`) and ban by tag. This gives mass invalidation without enumerating URLs.
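Tag-based invalidation boils down to keeping a reverse index from tag to cached URLs. The in-memory class below is a toy sketch of that idea; `TaggedCache` and its methods are hypothetical names, and a real edge would distribute this index across POPs.

```python
from collections import defaultdict


class TaggedCache:
    """Toy cache with a tag -> URLs index for mass invalidation."""

    def __init__(self):
        self._objects = {}                     # url -> body
        self._urls_by_tag = defaultdict(set)   # tag -> {url, ...}

    def put(self, url, body, tags):
        self._objects[url] = body
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def get(self, url):
        return self._objects.get(url)

    def purge_tag(self, tag):
        """Drop every object carrying the tag; return how many were removed."""
        urls = self._urls_by_tag.pop(tag, set())
        return sum(1 for url in urls if self._objects.pop(url, None) is not None)


cache = TaggedCache()
cache.put("/a/42", "x", ["article:42", "category:7"])
cache.put("/a/43", "y", ["article:43", "category:7"])
cache.put("/a/50", "z", ["article:50", "category:8"])
print(cache.purge_tag("category:7"))
```

A single purge by `category:7` removes both articles in that category without the caller knowing their URLs.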
4.2 Event-driven invalidation
When data changes at the origin, publish events (Kafka/NATS) → an edge invalidator calls BAN/PURGE/soft-expire.
4.3 Artifact versioning
For static assets: a content hash in the file name.
For APIs, change the key version (`v=4`) on incompatible changes.
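Content-hash naming for static assets might look like the sketch below. The `name.<hash>.ext` layout and the 8-character digest length are assumptions; build tools differ in the exact format.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


print(hashed_name("static/app.js", b"console.log(1)"))
print(hashed_name("static/app.js", b"console.log(2)"))
```

Because any change in content yields a new URL, such files can be cached with a very long TTL and never need an explicit purge.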
5) Origin protection and performance
5.1 Origin shielding
Turn on a Shield POP as the single point of misses → the miss storm on the origin shrinks many times over.
5.2 Coalescing/single-flight
At the edge, one request goes through to the origin on a miss; the rest wait for its result (no cache stampede).
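Request coalescing can be illustrated with a small thread-based single-flight helper. It is a sketch only: `SingleFlight` and `slow_origin` are hypothetical names, and a real edge would typically do this inside the proxy rather than in application code.

```python
import threading


class SingleFlight:
    """Let one caller per key run fn(); concurrent callers reuse its result."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> shared call record

    def do(self, key, fn):
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if leader:
                call = {"done": threading.Event(), "value": None, "error": None}
                self._inflight[key] = call
        if leader:
            try:
                call["value"] = fn()          # the only trip to the origin
            except Exception as exc:
                call["error"] = exc
            finally:
                with self._lock:
                    del self._inflight[key]
                call["done"].set()            # wake the waiters
        else:
            call["done"].wait()               # wait for the leader's result
        if call["error"] is not None:
            raise call["error"]
        return call["value"]


origin_calls = []

def slow_origin():
    origin_calls.append(1)
    threading.Event().wait(0.3)  # simulate a slow origin response
    return "body"

sf = SingleFlight()
results = []
threads = [
    threading.Thread(target=lambda: results.append(sf.do("/api/list", slow_origin)))
    for _ in range(10)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(results), "responses,", len(origin_calls), "origin call(s)")
```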
5.3 Rate limiting/queueing/shedding at the edge
If overloaded, drop low-priority/anonymous requests at the POP, not at the origin.
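One simple way to shed low-priority traffic first is to reserve part of the concurrency budget for high-priority requests. The class below is a sketch of that idea; the names and the capacity numbers are assumptions.

```python
class LoadShedder:
    """Admit requests while capacity remains; shed low-priority ones first."""

    def __init__(self, capacity: int, reserved_for_high: int):
        self.capacity = capacity
        self.reserved = reserved_for_high
        self.inflight = 0

    def try_admit(self, high_priority: bool) -> bool:
        # Low-priority traffic may not use the reserved slots.
        limit = self.capacity if high_priority else self.capacity - self.reserved
        if self.inflight >= limit:
            return False  # reject at the POP (e.g. 429/503); origin never sees it
        self.inflight += 1
        return True

    def release(self) -> None:
        self.inflight -= 1


shedder = LoadShedder(capacity=3, reserved_for_high=1)
print([shedder.try_admit(high_priority=False) for _ in range(3)])
print(shedder.try_admit(high_priority=True))
```

The third anonymous request is rejected while a high-priority request is still admitted into the reserved slot.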
5.4 Signed URL / Signed Cookie
The origin is hidden behind the edge. Access to private content goes through signed links/cookies with a TTL and attributes (IP/Geo/Path), so that it is not served to "everyone."
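A signed URL with an expiry can be sketched with an HMAC over the path and the expiry timestamp. The query parameter names (`expires`, `sig`) and the signed message layout are assumptions; real CDNs each define their own format and may also bind IP or geo attributes.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit


def sign_url(path: str, secret: bytes, expires_at: int) -> str:
    """Return path?expires=...&sig=... where sig covers path and expiry."""
    msg = f"{path}?expires={expires_at}".encode()
    sig = hmac.new(secret, msg, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires_at, 'sig': sig})}"


def verify_url(url: str, secret: bytes, now: int) -> bool:
    """Check the expiry and signature at the edge before serving."""
    parts = urlsplit(url)
    qs = parse_qs(parts.query)
    try:
        expires_at = int(qs["expires"][0])
        sig = qs["sig"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False
    msg = f"{parts.path}?expires={expires_at}".encode()
    expected = hmac.new(secret, msg, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode(), expected.encode())


secret = b"demo-secret"  # hypothetical key; keep real keys out of source code
url = sign_url("/private/report.pdf", secret, expires_at=1000)
print(verify_url(url, secret, now=999), verify_url(url, secret, now=1001))
```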
6) Transport and transformation
6.1 HTTP/2, HTTP/3 and QUIC
HTTP/2: multiplexing, header compression.
HTTP/3/QUIC: less head-of-line (HOL) blocking and better behavior on lossy links → lower p95/p99 TTFB.
6.2 Compression and images
Brotli for text, AVIF/WebP for images, image resizing at the edge (responsive sizes, DPR).
Cache variants by format/size: keys include `width/format` (or `Vary: Accept`/Client Hints).
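Putting width and format into the key only helps if both are normalized to a small set of values. The sketch below assumes three hypothetical width breakpoints and a simplified `Accept` parser that ignores q-values.

```python
ALLOWED_WIDTHS = (320, 640, 1280)  # hypothetical breakpoints


def pick_format(accept: str) -> str:
    """Choose the best image format the client advertises (q-values ignored)."""
    accepted = {part.split(";")[0].strip().lower() for part in accept.split(",")}
    for mime, fmt in (("image/avif", "avif"), ("image/webp", "webp")):
        if mime in accepted:
            return fmt
    return "jpeg"


def snap_width(requested: int) -> int:
    """Round the requested width up to the nearest allowed breakpoint."""
    for width in ALLOWED_WIDTHS:
        if requested <= width:
            return width
    return ALLOWED_WIDTHS[-1]


def variant_key(path: str, accept: str, width: int) -> str:
    return f"{path}|w={snap_width(width)}|f={pick_format(accept)}"


print(variant_key("/img/a.jpg", "image/avif,image/webp,*/*;q=0.8", 500))
```

Snapping widths keeps the number of cached variants bounded, which is the same cardinality concern that applies to `Vary`.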
6.3 TLS/0-RTT (with care)
Session resumption speeds up connection setup; 0-RTT data may be vulnerable to replay → enable it only for idempotent GETs.
7) Public vs private edge cache
7.1 Public
`Cache-Control: public, s-maxage=...` and a minimal `Vary`.
Suitable for catalogs, news, images, and static CDN assets.
7.2 Private/Personalized
Options:
- Do not cache at the shared level: `Cache-Control: private` (browser cache only).
- Key segmentation: include tenant/user-id (or token-hash) in the key and mark as private-shared (careful with storage and PII).
- Signed cookies and edge auth: the cache is public, but access requires a signature (variants with encrypted session state at the edge).
8) Edge-compute (Workers/Functions)
Lightweight functions at the POP: rewriting paths/headers, A/B split, key normalization, SWR logic, prefetch of neighboring resources.
Local KV/Cache API at the POP for millisecond operations.
Limitations: short timeouts and limited memory, no long-lived connections, careful handling of PII/regionality.
Pseudo-example (Workers-like)
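```js
export default {
  async fetch(req, env, ctx) {
    // normalize, computeTTL and withHitHeader are helpers you provide
    const key = normalize(req);
    const cache = caches.default;
    let res = await cache.match(key);
    if (res) return withHitHeader(res, "HIT");
    res = await fetch(req, { cf: { cacheEverything: true } });
    const ttl = computeTTL(res);
    // The Cache API takes its TTL from the stored response's Cache-Control
    const toStore = new Response(res.clone().body, res);
    toStore.headers.set("Cache-Control", `public, s-maxage=${ttl}`);
    // Store in the background without delaying the response
    ctx.waitUntil(cache.put(key, toStore));
    return withHitHeader(res, "MISS");
  }
};
```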
9) Configuration examples
9.1 Nginx: micro-cache + SWR
```nginx
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=api:200m inactive=30m;

# Skip the cache for non-idempotent methods
map $request_method $skip_cache { default 0; POST 1; PUT 1; DELETE 1; }

server {
  location /api/list {
    proxy_cache api;
    proxy_cache_key "$scheme://$host$uri$is_args$args";
    proxy_cache_valid 200 2s;                       # micro-cache
    proxy_cache_use_stale error timeout updating;   # SIE + SWR
    proxy_cache_background_update on;
    proxy_cache_bypass $skip_cache;                 # do not answer from cache
    proxy_no_cache $skip_cache;                     # do not store the response
    add_header X-Edge-Cache $upstream_cache_status;
    proxy_pass http://origin_pool;
  }
}
```
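9.2 Varnish: surrogate keys and BAN
```vcl
sub vcl_recv {
  if (req.method == "BAN") {
    # In production, restrict BAN to trusted clients with an ACL
    if (req.http.Surrogate-Key) {
      ban("obj.http.Surrogate-Key ~ " + req.http.Surrogate-Key);
      return (synth(200, "Banned"));
    }
  }
}

sub vcl_backend_response {
  # Set headers on the stored object (beresp); headers set only in
  # vcl_deliver are not visible to bans on obj.http.Surrogate-Key
  set beresp.http.Surrogate-Key = "article:42 tag:author:7";
  set beresp.http.Cache-Control = "public, s-maxage=300, stale-while-revalidate=60";
}
```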
9.3 Envoy (edge-cache filter)
```yaml
http_filters:
- name: envoy.filters.http.cache
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.cache.v3.CacheConfig
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.http.cache.simple_http_cache.v3.SimpleHttpCacheConfig
```
9.4 CloudFront-style behavior (sketch)
Behavior A: `/images/`: long TTL, compression, variants by format.
Behavior B: `/api/`: short TTL, SWR, signed cookie, WAF/bot protection.
Origin Shield is enabled; statuses 500/502/504 → `stale-if-error`.
10) Observability, SLO and reporting
10.1 Metrics
cache_hit_ratio (by POP/region/route), byte_hit_ratio.
origin_offload = 1 − (origin_requests / edge_requests).
TTFB/TTL by quantiles, stale_responses_total, revalidations_total.
stampede_prevented_total, coalesced_waiters.
shield_hit_ratio (if tiered), origin_egress_bytes (cost).
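The ratio metrics above are simple to derive from raw counters. The function below is a sketch with assumed counter names; the sample numbers are made up for illustration.

```python
def edge_metrics(edge_requests, edge_hits, origin_requests,
                 bytes_served, bytes_from_cache):
    """Derive hit ratio, byte hit ratio and origin offload from counters."""
    return {
        "cache_hit_ratio": edge_hits / edge_requests,
        "byte_hit_ratio": bytes_from_cache / bytes_served,
        # origin_offload = 1 - (origin_requests / edge_requests)
        "origin_offload": 1 - origin_requests / edge_requests,
    }


# Made-up counters: 200 misses at the edge, but only 150 reach the origin
# because the shield and request coalescing absorb the rest.
metrics = edge_metrics(
    edge_requests=1000, edge_hits=800, origin_requests=150,
    bytes_served=10_000, bytes_from_cache=9_000,
)
print(metrics)
```

Tracking offload separately from hit ratio is useful precisely because shielding and coalescing make the two diverge.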
10.2 Logs/Traces
Logs labeled `HIT/MISS/STALE/UPDATING/BYPASS`, with key, TTL, POP, tenant.
In distributed traces, mark the source (`edge`, `origin`) and the cause (revalidate/stale/error).
10.3 SLO examples
"For `/api/list`: p99 TTFB ≤ 250 ms, edge hit ≥ 70%, byte-hit ≥ 80%, origin error-offload ≥ 90%."
"The rate of `stale-if-error` responses ≤ 1% per day."
11) Security, privacy, compliance
WAF/bot management: at the edge, filtering traffic before it reaches the origin.
Data regionality: store private artifacts only in permitted POPs; use region-specific keys and ACLs.
Verify signatures and tokens at the edge; do not serve private responses from the public cache.
PII minimization: do not include personal data in keys; encrypt cookies; short TTLs for personalization.
12) Typical recipes
12.1 "Almost dynamic" (feeds/lists)
Micro-cache of 1-3 s + SWR at the edge, shield enabled, single-flight, negative cache of 1-5 s for empty results.
12.2 Image/media clouds
Edge resizing/reformatting (WebP/AVIF), cache variants by `width/format`, long TTL, invalidation by content tags.
12.3 APIs with personalization
`Cache-Control: private` or signed cookie + key segmentation (tenant), short TTLs, SWR for "almost public" parts of the response.
12.4 Big sales/peaks
Prewarming key resources, longer TTLs for static assets, aggressive SWR/SIE, hard limits toward the origin, Shield enabled.
13) Anti-patterns
No `Vary` when responses differ → leaks/incorrect data.
Huge `Vary` → high cardinality → low hit ratio.
Shared cache for prod/experiments → contamination.
No single-flight → a storm of misses on the origin.
SWR without limits → update races and an avalanche of revalidation requests.
Edge cache of private responses as public → security incidents.
No tiered/shield layer under worldwide load → origin overload.
14) Implementation checklist
- Map POP coverage, enable anycast + latency-routing.
- Select tiered/shield and single-flight/coalescing policies.
- Design keys and Vary (minimum cardinality, no PII).
- Configure TTL/SWR/SIE (soft/hard TTL) and negative-cache.
- Enable signed URL/cookie, hide origin, enable WAF/bot filters.
- Organize invalidation: Surrogate-Key/BAN + event-driven.
- Set up hit/byte-hit/offload/TTFB metrics and per-POP dashboards.
- Warm up before peaks; write runbooks for storms/overload.
- Privacy/regionality tests, key and policy audits.
- SLO/error budget for the edge and criteria for auto-tuning TTL/SWR.
15) FAQ
Q: How do I choose a TTL at the edge?
A: Start from the acceptable staleness and the hit-ratio goal. For "near-dynamic" content: 1-5 s + SWR; for catalogs/images: minutes/hours with invalidation by events/tags.
Q: When is a Shield POP needed?
A: With global traffic or hot keys: the shield sharply reduces misses on the origin and smooths out "catch-up" waves.
Q: How do I cache authorized responses?
A: Either `private` (browser), or public with signed cookie/URL and key segmentation (without PII), or bypass the cache entirely for critical personal data.
Q: What about HTTP/3?
A: Enable it: the gain is especially large on mobile/lossy links. Check proxy compatibility and the fallback to HTTP/2.
16) Summary
Edge caches and the POP network are the foundation of fast and economical platforms. Success depends on correct keys and `Vary`, reasonable TTL/SWR/SIE, tag/event invalidation, tiered/shield origin protection, as well as observability (hit/offload/TTFB) and security/privacy discipline. Follow the checklist, and the "edge" will be your accelerator, not a source of surprises.