Caching architecture: Redis, Memcached
1) When and why to cache
Objectives: reduce latency, offload the DB/PSP/external APIs, and absorb traffic peaks.
Cache layers are often multilevel: in-process (L1) → service-level (Redis/Memcached, L2) → edge/CDN. The in-process cache speeds up hot reads, L2 is shared across a service's instances, and the edge serves public content.
2) Redis vs Memcached: in brief
Rule of thumb: if you need complex data structures, persistence, pub/sub, streams, or scripting, take Redis. If you need a super-simple, fast, cheap KV cache layer without durability, take Memcached.
3) Caching patterns
3.1 Cache-aside (lazy)
The application reads from the cache → on a miss, reads from the database → writes the result to the cache with a TTL.
+ Simple TTL control, independence from the cache. − Possible miss "storms."
3.2 Read-through
On a miss, the cache/proxy itself pulls the value from the origin and stores it.
+ Centralized logic. − More infrastructure complexity.
3.3 Write-through / Write-behind
Write-through: write to the cache first, then to the database.
Write-behind: write to a queue, flush to the database asynchronously (potential loss on a crash; keep a durable log).
3.4 Two-tier (L1+L2)
L1 (in-process) with a short TTL and a soft TTL; L2 (Redis/Memcached) as the shared "cache truth." Invalidation via pub/sub.
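A minimal two-tier sketch in Node.js. The `TwoTierCache` class and the in-memory `l2Stub` are illustrative; a real L2 would be a Redis/Memcached client:

```js
// Two-tier cache: in-process L1 (Map) in front of an L2 client.
class TwoTierCache {
  constructor(l2, l1TtlMs = 1000) {
    this.l1 = new Map(); // key -> { value, expiresAt }
    this.l2 = l2;
    this.l1TtlMs = l1TtlMs;
  }
  async get(key, loader, l2TtlSec = 60) {
    const hit = this.l1.get(key);
    if (hit && hit.expiresAt > Date.now()) return hit.value; // L1 hit
    let value = await this.l2.get(key);                      // L2 lookup
    if (value === undefined || value === null) {
      value = await loader();                                // origin (DB) on full miss
      await this.l2.set(key, value, l2TtlSec);
    }
    this.l1.set(key, { value, expiresAt: Date.now() + this.l1TtlMs });
    return value;
  }
  invalidate(key) { this.l1.delete(key); }  // hook for the pub/sub invalidation event
}

// In-memory stand-in for a Redis/Memcached client, for demonstration only.
const l2Stub = {
  store: new Map(),
  async get(k) { return this.store.get(k); },
  async set(k, v, _ttlSec) { this.store.set(k, v); },
};
```

Within the L1 TTL, repeated reads never touch L2 or the origin; invalidation events only need to clear L1 because L2 is shared.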
4) TTL, storms and consistency
Set TTLs close to the data's rate of change. For hot keys, use TTL randomization (jitter): `ttl = base ± rand(0 .. 0.1*base)` removes synchronized expirations.
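A one-line jitter helper, assuming a ±10% spread (the function name is illustrative):

```js
// Jittered TTL: uniform noise in [-spread, +spread] of the base,
// so keys written at the same moment do not all expire in the same second.
function jitteredTtl(baseSec, spread = 0.1) {
  const delta = (Math.random() * 2 - 1) * baseSec * spread;
  return Math.max(1, Math.round(baseSec + delta));
}
```

Usage: `redis.set(key, value, { EX: jitteredTtl(300) })` instead of a fixed `EX: 300`.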
Dogpile (thundering herd): protect misses with:
- Singleflight: only one process regenerates the value (see the Lua example in section 8).
- Soft-TTL + background refresh: after `soft_ttl`, serve the stale value and refresh in the background.
- Semaphore/lock: `SET key:lock 1 NX PX 2000`.
- Near-stale: `stale-while-revalidate` for API responses (see section 8).
5) Keys, namespaces, serialization
5.1 Key naming
Template: `{domain}:{entity}:{id}:{field}`
Examples: `user:profile:42`, `catalog:product:1001:v2`, `psp:rates:2025-11-03`
Add a schema version (`:v2`): it makes mass invalidation easier.
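A small key-builder sketch following the template above (the helper name is an assumption, not a library API):

```js
// Builds {domain}:{entity}:{id}[:{field}][:v{version}] keys.
function cacheKey(domain, entity, id, { field, version } = {}) {
  const parts = [domain, entity, String(id)];
  if (field) parts.push(field);
  if (version) parts.push(`v${version}`);
  return parts.join(":");
}
```

Centralizing key construction in one function prevents the ad-hoc format drift that makes mass invalidation painful.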
5.2 Namespaces via a "space version"
Keep a key `ns:catalog = 17`. Real keys: `catalog:17:product:1001`. To invalidate the whole catalog directory, simply increment `ns:catalog`.
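The namespace-version trick can be sketched as follows; the `kv` object is an in-memory stand-in for Redis `GET`/`INCR`:

```js
// In-memory stand-in for a Redis client (GET/INCR only).
const kv = {
  store: new Map(),
  async get(k) { return this.store.get(k); },
  async incr(k) { const v = (this.store.get(k) ?? 1) + 1; this.store.set(k, v); return v; },
};

// Real keys embed the current namespace version.
async function nsKey(domain, rest) {
  const ver = (await kv.get(`ns:${domain}`)) ?? 1; // missing counter => version 1
  return `${domain}:${ver}:${rest}`;
}

// Bumping the version makes every old key unreachable; old entries age out via TTL.
async function invalidateNamespace(domain) {
  return kv.incr(`ns:${domain}`);
}
```

The cost of a global invalidation is one `INCR`; the trade-off is one extra `GET` per read (usually served from a local L1 copy of the counter).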
5.3 Serialization/compression
JSON is convenient but heavy; use MessagePack/CBOR.
Enable compression (LZ4/ZSTD) for large payloads (> 1-2 KB). In Redis, do it on the client side.
6) Hot keys and sharding
Hot keys: monitor the top-N by hits/misses/bytes. For extremely hot keys:
- Replicated-read pattern: duplicate the value under several shard keys `hot:k:1..N` and pick one at random when reading.
- Local L1: keep an in-process copy, with a subscription for invalidation.
- Redis Cluster shards natively (16384 hash slots).
- Memcached shards via client-side consistent hashing.
- A hash tag in Redis (`{...}`) pins the slot for a set of keys: `user:{42}:profile` and `user:{42}:limits` land in the same slot, hence on the same shard.
7) Eviction policies and sizing
Redis `maxmemory-policy`: `allkeys-lru`, `volatile-lru`, `allkeys-lfu`, `noeviction`, etc. For a cache, usually `allkeys-lru`/`allkeys-lfu`.
Memcached: LRU per slab class.
Key and value sizes: watch the max item size (Memcached defaults to 1 MB; tunable via slab/`-I` settings).
Memory pressure should degrade predictably: do not run `noeviction` on the hot path.
```
maxmemory 32gb
maxmemory-policy allkeys-lfu
hz 50
tcp-keepalive 60
```
8) Storm-protection patterns: code
8.1 Redis Lua singleflight (sketch)

```lua
-- KEYS[1] = data_key, KEYS[2] = lock_key
-- ARGV[1] = now_ms, ARGV[2] = soft_ttl_ms, ARGV[3] = hard_ttl_ms, ARGV[4] = lock_ttl_ms
local payload = redis.call("GET", KEYS[1])
if payload then
  -- the :meta hash stores the timestamp of the last refresh
  local meta = redis.call("HGETALL", KEYS[1] .. ":meta")
  local last = tonumber(meta[2] or "0")
  if tonumber(ARGV[1]) - last < tonumber(ARGV[2]) then
    return { "HIT", payload }           -- still fresh
  end
  if redis.call("SET", KEYS[2], "1", "NX", "PX", ARGV[4]) then
    return { "REFRESH", payload }       -- one worker refreshes, the rest serve stale
  end
  return { "STALE", payload }           -- another worker is already refreshing
end
if redis.call("SET", KEYS[2], "1", "NX", "PX", ARGV[4]) then
  return { "MISS", false }              -- this worker generates the value
end
return { "BUSY", false }                -- back off briefly and retry the GET
```
8.2 Node.js cache-aside (simplified)

```js
const v = await redis.get(key);
if (v) return decode(v);

// try to take the regeneration lock (SET NX with a short TTL)
const lock = await redis.set(key + ":lock", "1", { NX: true, PX: 1500 });
if (lock) {
  try {
    const fresh = await loadFromDB(id);
    await redis.set(key, encode(fresh), { EX: ttl });
    return fresh;
  } finally {
    await redis.del(key + ":lock");
  }
} else {
  await sleep(60);                      // short backoff
  const retry = await redis.get(key);   // someone else has likely filled the key
  if (retry) return decode(retry);
  return loadFromDB(id);                // fallback in case the lock holder died
}
```
9) Invalidation and coherence
By event: on a change in the database, publish a pub/sub event `invalidate:{ns}:{id}` → subscribers delete the keys.
By timer: short TTLs for frequently changing data.
By version: see the `ns:` keys above.
Outbox: a delivery guarantee for invalidations (event log/topic, retries).
Idempotency of cache operations: use `SET ... XX`/`SET ... NX`, versions (etag), and hash fields for increments.
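A version-guarded write sketch showing the idempotency idea. In Redis this check-and-set would live in a short Lua script; here an in-memory map shows the logic:

```js
// key -> { version, value }; stand-in for the real cache store.
const cache = new Map();

// Only overwrite if the incoming version is strictly newer, so replayed or
// out-of-order invalidation/update events become no-ops.
function setIfNewer(key, version, value) {
  const cur = cache.get(key);
  if (cur && cur.version >= version) return false; // stale or duplicate write
  cache.set(key, { version, value });
  return true;
}
```

With this guard, an invalidation event delivered twice (or after a newer update) cannot roll the cached value back.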
10) Replication, cluster, failover
10.1 Redis
Sentinel: automatic master-replica failover (clients follow the advertised master name/IP).
Cluster: sharding + automatic failover; clients must support `MOVED`/`ASK` redirects.
AOF/RDB: for a cache, usually `appendfsync everysec`; running without persistence is fine for a pure cache.
10.2 Memcached
No replication out of the box. Reliability comes from multi-server sharding plus client-side redundancy.
When nodes fail, expect a spike in misses and a cache re-warm.
10.3 K8s and networking
Redis/Memcached dislike frequent pod re-creation; use a StatefulSet with AZ anti-affinity and stable PVCs/pod identities.
Set a PodDisruptionBudget and TopologySpreadConstraints.
11) Transactions, scripts and atomicity (Redis)
INCR/DECR, HINCRBY: counters, quotas, rate limits (just mind persistence).
MULTI/EXEC: a batch of commands executed atomically.
Lua (EVAL): read-modify-write without races.
Pipelining: reduces RTT (especially across network hops).
A token-bucket limiter sketch (the body is completed here; the `tokens`/`ts` field names are illustrative):

```lua
-- KEYS[1]=bucket, ARGV[1]=capacity, ARGV[2]=refill_rate_per_sec, ARGV[3]=now_ms
-- Returns 1 if a token is issued, otherwise 0
local tokens = tonumber(redis.call("HGET", KEYS[1], "tokens") or ARGV[1])
local ts     = tonumber(redis.call("HGET", KEYS[1], "ts") or ARGV[3])
tokens = math.min(tonumber(ARGV[1]),
                  tokens + (tonumber(ARGV[3]) - ts) / 1000 * tonumber(ARGV[2]))
local allowed = 0
if tokens >= 1 then
  tokens = tokens - 1
  allowed = 1
end
redis.call("HSET", KEYS[1], "tokens", tokens, "ts", ARGV[3])
redis.call("PEXPIRE", KEYS[1], 60000) -- GC for idle buckets
return allowed
```
12) Queues, pub/sub and Streams (Redis)
Pub/Sub: invalidation, signaling. No persistence; only currently connected listeners receive messages.
Streams: event queues with acknowledgments (ACK), consumer groups, retries; handy for write-behind/fan-out.
Lists (`BRPOP`): simple queues.
Do not use Redis as the "single bus for everything" without a backup: it is a cache/fast bus, not Kafka.
13) Security and access
Network isolation/VPC, mTLS at the ingress level, ACLs/passwords (`requirepass`/ACL in Redis 6+).
Disable dangerous commands in Redis (`CONFIG`, `FLUSHALL`, `KEYS`) via ACL.
For Memcached: do not listen on public interfaces, run with `-U 0` (UDP disabled), private networks only.
Do not store PII; if unavoidable, use short TTLs plus encryption at the application level.
14) Observability and maintenance
Key metrics:
- Hit ratio / miss ratio (per namespace/route).
- Latency p95/p99 for `GET`/`SET`/`MGET` commands; timeouts.
- Evictions and OOM errors.
- Replication lag (Redis), cluster state, migration/rehash events.
- Top-N keys by traffic/bytes (sampling).
- Logs: slow commands (`SLOWLOG`), network errors.
- Dashboards: general (CPU/RAM/connections), commands, cluster slots, sentinels; fed via Prometheus exporters.
15) Configs and deployments: examples
15.1 Redis Sentinel (snippet)

```
port 6379
protected-mode yes
appendonly yes
appendfsync everysec
maxmemory-policy allkeys-lfu
```

`sentinel.conf`:

```
sentinel monitor m1 10.0.0.11 6379 2
sentinel auth-pass m1 <password>
sentinel down-after-milliseconds m1 5000
sentinel failover-timeout m1 60000
```
15.2 Redis Cluster (Helm values, simplified)

```yaml
cluster:
  enabled: true
  nodes: 6            # 3 masters + 3 replicas
persistence:
  size: 100Gi
resources:
  requests: { cpu: "500m", memory: "2Gi" }
```
15.3 Memcached (deployment)

```yaml
containers:
  - image: memcached:1.6
    args: ["-m", "32768", "-I", "2m", "-v", "-t", "8", "-o", "modern"]
    ports: [{ containerPort: 11211 }]
```
15.4 NGINX as a read-through proxy (API tier)

```nginx
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=api:100m max_size=10g inactive=10m;
map $request_uri $cache_key { default "api:$request_uri"; }

location /api/ {
    proxy_cache api;
    proxy_cache_key $cache_key;
    proxy_cache_valid 200 1m;
    proxy_cache_use_stale updating error timeout http_500 http_502 http_503 http_504;
    proxy_cache_lock on;   # singleflight at the NGINX level
    proxy_pass http://backend;
}
```
16) Testing and gates
Load profiles for cold/warm/hot cache.
Miss injection (mass purge): the origin must withstand a full cache re-warm.
Alerts: a sharp drop in hit ratio, rising miss latency, an avalanche of evictions, rising timeouts.
17) Anti-patterns
Storing the "source of truth" in Redis without AOF/RDB and without redundancy.
TTL = 0 (no expiry) for volatile data → permanent inconsistency.
Mass `KEYS` in prod.
No jitter/soft-TTL → synchronized expirations and storms.
A single instance for all commands, without sharding/replicas.
Using Memcached for tasks that require atomicity/scripting.
18) Implementation checklist (0-45 days)
0-10 days
Choose a pattern (cache-aside + L1/L2); describe keys, TTLs, namespaces.
Enable jitter/soft-TTL and singleflight; set up basic alerts/dashboards.
For Redis: configure ACLs, protected-mode, SLOWLOG, maxmemory-policy.
11-25 days
Move to sharding (Redis Cluster or client-side hashing), add replicas.
Invalidation via pub/sub or a namespace version; an outbox in the database.
Load-test cache re-warming; rate-limit the origin.
26-45 days
Canary TTL rollout with automatic promotion; cache warm-up before release.
Streams for write-behind/background rebuilds.
Weekly reports on hit ratio, top keys, memory cost.
19) Maturity metrics
L2 hit ratio ≥ 80% (statistics per route/namespace).
GET p95 < 2-3 ms (in-DC); miss latency within the origin's SLO.
Zero storms during mass invalidation (proven by tests).
Automatic invalidation and namespace versioning.
Sharding/replication survives a single-node failure without appreciable degradation.
20) Conclusion
A strong cache architecture is a discipline of keys and TTLs, storm protection, proper sharding, and predictable eviction. Redis gives rich semantics, persistence, and atomicity; Memcached gives maximum simplicity and speed. Add observability, event-driven invalidation, and L1+L2, and the cache becomes a platform accelerator instead of a source of random outages and "mystical" bugs.