Cache tiers and data storage
1) Why you need a multi-layer cache
A cache is a short path to an answer that avoids hitting "expensive" subsystems (databases, external APIs, the network). Layering distributes the load: browser → CDN/edge → application layer → distributed cache → database/storage. Goals: lower P95/P99 latency, offload the origin, absorb traffic peaks, and cut the cost per byte served.
2) Cache layer map
1. Browser: `Cache-Control`, `ETag`, `Last-Modified`, `stale-while-revalidate` (example headers after this list).
2. CDN/Edge: TTL/key, Vary, signed URLs, image resize; tiered/shield caching.
3. API Gateway/Service Mesh: short-lived response cache for safe (idempotent) GETs.
4. Application (in-process): LRU/LFU, near-cache for hot keys, milliseconds.
5. Distributed cache (Redis/Memcached): the main layer for dynamics.
6. DB caches: PostgreSQL/InnoDB buffers, PgBouncer multiplexing, materialized views.
7. Disk/object stores: precomputed snapshots, blob cache (for example, S3 + CDN).
Principle: "the closer to the user, the shorter the TTL and the less personalization; the closer to the data, the richer the consistency policy."
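As an illustration of the browser/CDN tiers, a minimal sketch of per-content-class response headers; the concrete values and the `Vary` choice are assumptions, not universal recommendations:

```python
# Illustrative Cache-Control policies per content class (values are assumed)
STATIC_ASSETS = {
    # hashed files: effectively immutable, cache for a year at browser and CDN
    "Cache-Control": "public, max-age=31536000, immutable",
}
CATALOG_API = {
    # semi-static API: short shared TTL plus a stale-while-revalidate window
    "Cache-Control": "public, max-age=300, stale-while-revalidate=60",
    "Vary": "Accept-Language",  # vary only on what actually changes the response
}
```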
3) Cache patterns
Cache-Aside (Lazy): read → on MISS, load from the source → write to the cache. Simple, gives full TTL control.
Read-Through: the application reads through a cache that pulls from the source itself. Convenient for centralizing policy.
Write-Through: writes go to the cache and to the source synchronously. More consistent, but writes cost more.
Write-Back (Write-Behind): write to the cache; the source is updated asynchronously via a queue (see the sketch after this list). Fast, but requires delivery guarantees and idempotency.
Refresh-Ahead: for hot keys, refresh the value before the TTL expires.
Where to use what: game cards/catalogs - cache-aside/read-through; counters/leaderboards - write-back + CRDT/aggregation; currency/limit reference data - read-through with a controlled TTL.
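For write-back specifically, a minimal sketch under assumptions: the cache is the redis-py client, `db_upsert` is a hypothetical idempotent write to the source, and a production version would need a durable queue and retries:

```python
import queue
import threading

import redis  # assumes the redis-py client

r = redis.Redis()
write_queue: queue.Queue = queue.Queue()

def write_back(key: str, value: bytes, ttl: int) -> None:
    r.setex(key, ttl, value)       # fast path: the cache takes the write first
    write_queue.put((key, value))  # the source is updated asynchronously

def flusher() -> None:
    while True:
        key, value = write_queue.get()
        db_upsert(key, value)      # hypothetical idempotent upsert into the source

threading.Thread(target=flusher, daemon=True).start()
```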
4) Keys, segmentation and naming
Template: `domain:entity:{id}:v{schema}|region={R}|currency={C}|lang={L}` (a builder helper follows this list).
Include in the key only what actually changes the response (region, currency, language, schema version).
Schema versioning: for incompatible changes, bump `vN` in the key and avoid a mass purge.
Namespacing by product/tenant: `tenant:{t}:...` is critical for multi-tenant setups.
A Bloom filter for key existence can reduce trips to the source.
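A small helper that materializes the template above; the signature and the tenant prefix are illustrative:

```python
def cache_key(domain: str, entity: str, id_: str, schema: int, *,
              region: str, currency: str, lang: str, tenant: str | None = None) -> str:
    # only dimensions that actually change the response go into the key
    key = f"{domain}:{entity}:{id_}:v{schema}|region={region}|currency={currency}|lang={lang}"
    return f"tenant:{tenant}:{key}" if tenant else key

# cache_key("game", "card", "g123", 2, region="BR", currency="BRL", lang="pt-BR")
# -> "game:card:g123:v2|region=BR|currency=BRL|lang=pt-BR"
```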
5) TTL, freshness and invalidation
TTL matrix:
- static (hashed files): 30-365 days + `immutable`;
- catalogs/banners: 5-60 minutes + `stale-while-revalidate`;
- leaderboards/quotes: 2-15 seconds;
- reference data (currencies/limits): 1-10 minutes.
- Invalidation events: publish `product.updated` → targeted invalidation of keys/prefixes.
- Tag-based purge: group purges by tag (promo/catalog releases).
- Soft expiry: after the TTL expires, serve the stale value and refresh in parallel (SWR/SIE); see the sketch after this list.
- Versioned keys > mass purge: cheaper and safer.
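A sketch of soft expiry (SWR): the logical freshness deadline travels inside the value, the physical TTL is longer and bounds staleness, and an expired-but-present value is served while a background refresh runs. Assumes the redis-py client and JSON-serializable values; error handling is omitted:

```python
import json
import threading
import time

import redis  # assumes the redis-py client

r = redis.Redis()

def get_swr(key: str, soft_ttl: int, hard_ttl: int, loader):
    raw = r.get(key)
    if raw is None:
        return _refresh(key, soft_ttl, hard_ttl, loader)  # cold MISS: load inline
    rec = json.loads(raw)
    if rec["fresh_until"] < time.time():
        # soft-expired: serve the stale value, refresh in the background
        threading.Thread(target=_refresh, daemon=True,
                         args=(key, soft_ttl, hard_ttl, loader)).start()
    return rec["value"]

def _refresh(key, soft_ttl, hard_ttl, loader):
    value = loader()  # request to the source
    rec = {"value": value, "fresh_until": time.time() + soft_ttl}
    r.setex(key, hard_ttl, json.dumps(rec))  # the hard TTL caps how stale we can get
    return value
```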
6) Stampede, hot keys and concurrency
Dogpile/stampede protection:
- Single-flight (request coalescing): one leader refreshes the key, the rest wait.
- TTL jitter: spread expirations out to avoid a simultaneous collapse.
- Local SWR: serve the expired value to the user, refresh in the background.
- Hot-key replication across multiple `key#1..N` slots with reads distributed among them (jitter and fan-out sketch after this list);
- near-cache in process memory;
- prewarm/refresh-ahead before peaks (tournaments/matches).
- Concurrency limits on updates of heavy keys.
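Two of the techniques above in a few lines: TTL jitter and hot-key fan-out. The replica count and jitter spread are illustrative; assumes the redis-py client:

```python
import random

import redis  # assumes the redis-py client

r = redis.Redis()
REPLICAS = 8  # illustrative number of copies for a hot key

def ttl_with_jitter(ttl: int, spread: float = 0.1) -> int:
    # blur expirations by +/-10% so copies do not expire simultaneously
    return max(1, int(ttl * random.uniform(1 - spread, 1 + spread)))

def hot_get(base_key: str):
    # reads are spread across key#0..N-1 replicas
    return r.get(f"{base_key}#{random.randrange(REPLICAS)}")

def hot_set(base_key: str, value: bytes, ttl: int) -> None:
    for i in range(REPLICAS):
        r.setex(f"{base_key}#{i}", ttl_with_jitter(ttl), value)
```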
7) Consistency across layers
Write-invalidate: when writing to the source, synchronously invalidate the corresponding keys (pub/sub).
Read-repair: on detecting a discrepancy, update the cache with the correct value.
Eventual vs strong: critical money transactions are read directly or with a short TTL; UI showcases and statistics can be eventual.
CRDT/aggregators: for distributed counters/ratings, use merge-safe structures (G-Counter, Top-K over streams).
Cascading invalidation: updating a game invalidates its card + lists + personalized recommendation caches (see the sketch below).
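A minimal sketch of write-invalidate with a cascade: a hypothetical map from an entity to the key patterns it touches, fanned out over pub/sub so downstream layers (including near-caches) can react; the channel name and patterns are assumptions:

```python
import redis  # assumes the redis-py client

r = redis.Redis()

# Hypothetical dependency map: entity -> cache key patterns to invalidate
CASCADE = {
    "game": ["game:card:{id}", "catalog:list:", "reco:user:"],
}

def invalidate(entity: str, id_: str) -> None:
    for pattern in CASCADE[entity]:
        # channel consumers translate patterns into DEL/UNLINK or tag purges
        r.publish("cache.invalidate", pattern.format(id=id_))
```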
8) Serialization, compression and format
Formats: protobuf/MessagePack are faster than JSON; for CDN/browser, JSON with Brotli.
Compression in Redis: beneficial for objects > 1-2 KB, but watch the CPU (threshold sketch after this list).
Partial responses/on-demand fields: fewer bytes → lower TTFB and less RAM.
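A sketch of the size threshold: compress only values above ~2 KB and prepend a one-byte marker so readers know how to decode; `zlib` stands in for whatever codec you standardize on:

```python
import zlib

COMPRESS_MIN = 2048  # assumed threshold: below ~2 KB the CPU cost outweighs the savings

def pack(data: bytes) -> bytes:
    if len(data) > COMPRESS_MIN:
        return b"Z" + zlib.compress(data)  # marker byte + compressed payload
    return b"R" + data                     # marker byte + raw payload

def unpack(blob: bytes) -> bytes:
    return zlib.decompress(blob[1:]) if blob[:1] == b"Z" else blob[1:]
```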
9) Eviction policies and sizing
LRU (the default) is safe; LFU is better for "popular" content (config sketch after this list).
Key/value size: keep it under control (track `avg value size` and `max value size` metrics).
Namespace/tenant quotas so that one product does not "eat" the entire cache.
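For Redis specifically, the eviction policy and memory ceiling can be set in `redis.conf` or at runtime; the 4 GB value here is illustrative:

```python
import redis  # assumes the redis-py client

r = redis.Redis()
# equivalent to redis.conf directives: maxmemory 4gb / maxmemory-policy allkeys-lfu
r.config_set("maxmemory", "4gb")                 # illustrative ceiling
r.config_set("maxmemory-policy", "allkeys-lfu")  # LFU favors "popular" content
```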
10) Security and PII/PCI
Personal/financial data: do not cache at the CDN/edge or in shared layers; use tokens/projections.
Encryption of sensitive values in Redis via client-side crypto (with caution about the loss of TTL control).
Strict ACLs and network isolation; fixed NAT/IP for egress to providers.
11) Observability and cache SLO
Metrics:
- Hit ratio (by layer and prefix), origin offload.
- TTFB/P95/P99 before/after the cache, Redis latency.
- Evictions, OOM, keyspace hits/misses (computation sketch after this list).
- Stampede rate, refresh time.
- Stale served % and freshness lag.
Example SLO targets:
- Game catalog: hit ratio ≥ 85%, TTFB P95 ≤ 150 ms (edge).
- API reference data: revalidation hit ≥ 60%, P95 ≤ 200 ms.
- Redis: P99 per operation ≤ 5 ms, evictions at most 1% per hour.
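Per-node hit ratio and evictions can be derived directly from Redis `INFO stats` counters; a redis-py sketch:

```python
import redis  # assumes the redis-py client

r = redis.Redis()
stats = r.info("stats")
hits, misses = stats["keyspace_hits"], stats["keyspace_misses"]
hit_ratio = hits / max(1, hits + misses)  # guard against division by zero
print(f"hit ratio: {hit_ratio:.2%}, evicted keys: {stats['evicted_keys']}")
```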
12) FinOps: the cost of caching
$/GB-month of RAM vs $/RPS at the origin: calculate the break-even point (see the sketch after this list).
Offload and egress: CDN + Redis reduce outbound traffic from the origin region.
Images (WebP/AVIF) and denormalization give the greatest byte savings.
Limit "expensive MISSes": analyze bytes × MISS rate × region.
13) Examples (fragments)
13.1 Cache-aside with single-flight (pseudocode)
```python
# single_flight, ttl_with_jitter and serialize are assumed helpers (see section 6)
def get(key, ttl, loader):
    val = redis.get(key)
    if val:
        return val
    with single_flight(key):      # only one caller refreshes the key
        val = redis.get(key)      # double check after acquiring the lock
        if val:
            return val
        data = loader()           # request to the source
        redis.setex(key, ttl_with_jitter(ttl), serialize(data))
        return data
```
13.2 Publishing an invalidation event
```json
{
  "event": "game.updated",
  "game_id": "g123",
  "affected": ["catalog:list:region=TR", "game:card:g123:"]
}
```
The consumer subscribes to the channel and performs `DEL`/`PUBLISH` on the matching keys/tags, as in the sketch below.
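A matching consumer sketch with redis-py: subscribe to a channel (name assumed), drop the affected keys, and notify near-caches:

```python
import json

import redis  # assumes the redis-py client

r = redis.Redis()
sub = r.pubsub()
sub.subscribe("cache.invalidate")

for msg in sub.listen():
    if msg["type"] != "message":
        continue
    event = json.loads(msg["data"])
    for key in event.get("affected", []):
        r.unlink(key)                            # non-blocking DEL
        r.publish("near-cache.invalidate", key)  # fan out to in-process caches
```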
13.3 Key with schema version and locale
`game:card:v2:id=g123|region=BR|currency=BRL|lang=pt-BR`
14) Implementation checklist
1. Cache layer map and TTL matrix (static/semi-static/API).
2. Key naming: domain, schema version, locale/region/currency, tenant.
3. Choose a pattern per endpoint (cache-aside/read-through/write-through/write-back).
4. SWR/SIE, single-flight and TTL jitter against stampedes.
5. Event-driven invalidation (pub/sub), tag-based purge for groups.
6. Near-cache for hot keys and prewarm before peaks.
7. Formats and compression (protobuf/MsgPack, Brotli), size control.
8. LRU/LFU policies, namespace/tenant quotas.
9. SLO/metrics: hit ratio, latency, evictions, stale %, freshness lag.
10. Security: no-store for personal data, tokenization, network/ACL isolation.
15) Anti-patterns
`no-cache` "just in case" and zero TTLs: zero offload.
The key includes every query parameter/header → cardinality explosion.
Bulk purge of the entire CDN/Redis on every release.
No protection against stampedes and simultaneous expiration of "top" keys.
A single shared Redis without quotas/isolation: a "hot" tenant eats the entire cache.
Caching personalized responses at the edge/CDN.
No freshness/evictions telemetry → flying blind.
16) iGaming/fintech context: practical notes
Leaderboards/ratings: TTL 2-10 s, aggregation over streams + CRDT, SWR during outages.
Game catalog/banners: CDN + Redis; key by region/currency/language; invalidation via `promo:update` tags.
Payment statuses: no cache in the write path; for reads, a short TTL (≤3-5 seconds) or a direct request.
KYC/AML responses: cache non-PII derivatives (statuses); never store images/documents in Redis.
VIP path: separate namespace/Redis pool, priority service.
Summary
A strong cache strategy means a layered architecture, the right update patterns, thoughtful TTLs and invalidation, stampede resistance, clean keys and versioning, plus observability and FinOps. Following these principles stabilizes the P95/P99 tails, reduces load on the sources and yields a predictable cost per millisecond, exactly where it matters most to the product and the business.