Read Models and Projections
A Read Model is a table/index/view designed for fast reads in a specific product scenario. A Projection is the process that converts source events/changes into Read Model updates (usually an idempotent upsert). Combined with CQRS, this offloads the OLTP core and stabilizes read p95/p99 while keeping "freshness" under explicit control.
Main ideas:
- Denormalize for the query, not toward a "universal schema."
- Update incrementally and idempotently.
- Manage staleness and ordering explicitly.
1) When to use Read Models (and when not)
Fits:
- Frequent heavy reads (joins/aggregations/sorts) with acceptable update latency.
- Dashboards, catalogs, landing pages, top-N lists, personal feeds, search result lists.
- Load separation: the write core stays strict, the read plane stays fast and scalable.
Does not fit:
- Operations requiring strict per-record invariants (money, uniqueness); keep those on the strongly consistent write path.
2) Architectural outline (verbal scheme)
1. Source of changes: domain events (event sourcing) or CDC from OLTP.
2. Projection pipeline: parser → aggregation/denormalization → idempotent upsert.
3. Read store: a database/index optimized for the query (RDBMS, columnar, search).
4. API/client: fast SELECT/GET, with `as_of`/freshness attributes.
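The four steps above can be sketched end to end. This is a minimal illustration, not a fixed contract: the event shape, table, and SQLite stand-in for the read store are all assumptions.

```python
import sqlite3

# Minimal projection pipeline sketch: parse event -> denormalize -> idempotent upsert.
# SQLite stands in for the read store; the event/table shapes are illustrative.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE read_orders (
    tenant_id TEXT, order_id TEXT, status TEXT, total REAL, updated_at INTEGER,
    PRIMARY KEY (tenant_id, order_id))""")

def project(event: dict) -> None:
    """Apply one source event to the read model (idempotent upsert)."""
    row = (event["tenant_id"], event["order_id"], event["status"],
           event["total"], event["ts"])
    db.execute("""INSERT INTO read_orders VALUES (?, ?, ?, ?, ?)
        ON CONFLICT(tenant_id, order_id) DO UPDATE SET
            status = excluded.status,
            total = excluded.total,
            updated_at = MAX(excluded.updated_at, read_orders.updated_at)""", row)

# Replaying the same object twice converges on the latest state.
for e in [{"tenant_id": "t1", "order_id": "o1", "status": "new", "total": 10.0, "ts": 1},
          {"tenant_id": "t1", "order_id": "o1", "status": "paid", "total": 10.0, "ts": 2}]:
    project(e)

print(db.execute("SELECT status FROM read_orders").fetchone()[0])  # paid
```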
3) Read Model Design
Start from the query: which fields, filters, sorting, pagination, top-N?
Denormalize: store already-joined data (names, amounts, statuses).
- Partitioning: by `tenant_id`, date, region.
- Primary key: business key + time bucket (for example, `(tenant_id, entity_id)` or `(tenant_id, bucket_minute)`).
- Indexes: on frequent WHERE/ORDER BY columns.
- TTL/retention: for time-bounded views (e.g. 90 days).
4) Update flow and idempotency
Idempotent upsert is the basis of projection stability.
Pseudocode (PostgreSQL syntax):

```sql
-- Projection table
CREATE TABLE read_orders (
  tenant_id  TEXT,
  order_id   UUID,
  status     TEXT,
  total      NUMERIC(12,2),
  customer   JSONB,
  updated_at TIMESTAMP,
  PRIMARY KEY (tenant_id, order_id)
);

-- Idempotent update on event
INSERT INTO read_orders(tenant_id, order_id, status, total, customer, updated_at)
VALUES (:tenant, :id, :status, :total, :customer, :ts)
ON CONFLICT (tenant_id, order_id) DO UPDATE
SET status     = EXCLUDED.status,
    total      = EXCLUDED.total,
    customer   = COALESCE(EXCLUDED.customer, read_orders.customer),
    updated_at = GREATEST(EXCLUDED.updated_at, read_orders.updated_at);
```
Rules:
- Each message carries a version/timestamp; accept only "fresh or equal" updates (idempotency).
- For aggregates (counters, sums), store state and use commutative updates (or CRDT approaches).
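The "fresh or equal" rule can be enforced directly in the upsert with a version guard. A minimal sketch, assuming a `version` column and SQLite as the store (table and column names are illustrative):

```python
import sqlite3

# Version-guarded idempotency: an event is applied only if its version is >= the
# stored one, so duplicates and out-of-order retries are harmless.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE read_orders (
    tenant_id TEXT, order_id TEXT, status TEXT, version INTEGER,
    PRIMARY KEY (tenant_id, order_id))""")

def apply_event(tenant: str, order: str, status: str, version: int) -> None:
    db.execute("""INSERT INTO read_orders VALUES (?, ?, ?, ?)
        ON CONFLICT(tenant_id, order_id) DO UPDATE SET
            status = excluded.status, version = excluded.version
        WHERE excluded.version >= read_orders.version""",
        (tenant, order, status, version))

apply_event("t1", "o1", "paid", 5)
apply_event("t1", "o1", "new", 3)   # stale retry: ignored by the WHERE guard
apply_event("t1", "o1", "paid", 5)  # duplicate: re-applied, same end state

print(db.execute("SELECT status, version FROM read_orders").fetchone())  # ('paid', 5)
```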
5) Source of Change: Events vs CDC
Events (event sourcing): rich semantics, easy to build different projections; schema evolution matters.
CDC (logical replication): easy to attach to an existing database; requires DML→event mapping and filtering of noisy updates.
- Delivery guarantees (at-least-once) and a DLQ for "poison" messages.
- Ordering per key (partition key = `tenant_id:entity_id`).
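Under at-least-once delivery, a handler that keeps failing on one message must not block its partition. A minimal sketch of the retry-then-DLQ pattern; the message shape, retry count, and handler are all illustrative:

```python
# At-least-once handling with a dead-letter queue (DLQ): each message is retried
# a few times; a "poison" message is parked in the DLQ instead of blocking the
# partition. All names and the retry budget are illustrative.
MAX_ATTEMPTS = 3

def consume(messages, handler):
    dlq = []
    for msg in messages:
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                handler(msg)
                break
            except Exception as exc:
                if attempt == MAX_ATTEMPTS:
                    dlq.append({"msg": msg, "error": str(exc), "attempts": attempt})
    return dlq

def handler(msg):
    if msg.get("bad"):
        raise ValueError("cannot parse payload")

dead = consume([{"id": 1}, {"id": 2, "bad": True}, {"id": 3}], handler)
print(len(dead))  # 1 poison message parked in the DLQ; ids 1 and 3 processed
```

Because the handler itself is an idempotent upsert, retries and later redrives from the DLQ are safe.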
6) Order, causality and "freshness"
Order per key: events of a single object must arrive sequentially; use partitioning and versions.
Session/causal: for an author to see their own changes (read-your-writes, RYW), pass watermark versions in queries.
Bounded staleness: return `as_of`/`X-Data-Freshness` and hold an SLO (e.g. p95 ≤ 60 s).
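The RYW watermark idea can be shown in a few lines. A sketch under stated assumptions: the writer hands the client a version watermark, and the reader serves from the read model only if the projection has caught up, otherwise falling back to the source (names and the fallback policy are illustrative; waiting is another valid policy):

```python
# Read-your-writes sketch: compare the client's watermark with the projection's
# applied version before serving from the read model.
projection_version = 0   # highest source version applied to the read model
read_model = {}          # key -> value (the fast read path)
source_of_truth = {}     # strongly consistent write store

def write(key, value, version):
    source_of_truth[key] = value
    return version       # watermark handed back to the client

def read(key, min_version=0):
    if projection_version >= min_version:
        return read_model.get(key), "read-model"
    return source_of_truth.get(key), "fallback-to-source"

wm = write("profile", "v2", version=7)
value, served_by = read("profile", min_version=wm)
print(served_by)  # fallback-to-source: projection (version 0) is behind watermark 7
```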
7) Incremental aggregates and top N
Example of per-minute sales buckets in SQL:

```sql
CREATE TABLE read_sales_minute (
  tenant_id TEXT,
  bucket    TIMESTAMP,      -- toStartOfMinute (truncated to the start of the minute)
  revenue   NUMERIC(14,2),
  orders    INT,
  PRIMARY KEY (tenant_id, bucket)
);

-- Update on event
INSERT INTO read_sales_minute(tenant_id, bucket, revenue, orders)
VALUES (:tenant, :bucket, :amount, 1)
ON CONFLICT (tenant_id, bucket) DO UPDATE
SET revenue = read_sales_minute.revenue + EXCLUDED.revenue,
    orders  = read_sales_minute.orders + 1;
```
For top-N:
- Maintain a ranked read model (for example, ordered by `revenue DESC`) and update only the changed positions (heap/skiplist/bounded table).
- Store only a "window" of the top (for example, 100-1000 rows per segment).
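A minimal sketch of the bounded-window idea: keep the full per-key aggregate state, but publish only a small ranked slice to readers (top 3 here; top 100-1000 per segment in practice; names are illustrative):

```python
import heapq

# Bounded top-N window: full aggregate state stays internal, only the ranked
# window is served to readers.
TOP_N = 3
revenue = {}  # per-key aggregate state (segment-local)

def on_sale(product: str, amount: float) -> list:
    revenue[product] = revenue.get(product, 0.0) + amount
    # Recompute the published window; only this small slice reaches readers.
    return heapq.nlargest(TOP_N, revenue.items(), key=lambda kv: kv[1])

for product, amount in [("a", 10), ("b", 5), ("c", 7), ("d", 1), ("b", 9)]:
    top = on_sale(product, amount)

print([p for p, _ in top])  # ['b', 'a', 'c']
```

Recomputing the whole window on every event is the simplest correct version; an incremental heap or a "changed positions only" update is the optimization the bullet above describes.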
8) Search and geo-projection
Search (Elasticsearch/OpenSearch): denormalized document, pipeline transformations, document version = source version.
Geo: store `POINT`/`lat, lon`, pre-aggregate into tiles/quads.
9) Multi-tenant and regions
`tenant_id` is required in projection keys and events.
Fairness: limit projection throughput per tenant (WFQ/DRR) so that a "noisy" tenant does not slow down the rest.
Residency: the projection lives in the same region as the write core; cross-region read models carry aggregates/summaries only.
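Deficit round-robin, mentioned above, is simple to sketch. Assumptions: unit-cost messages and a per-round quantum (in practice the quantum would be bytes or events, and queues would be real consumer backlogs):

```python
from collections import deque

# Deficit round-robin (DRR) sketch: each tenant earns a quantum per round, so a
# noisy tenant with a deep backlog cannot starve the others.
def drr_schedule(queues: dict, quantum: int, rounds: int) -> list:
    processed = []
    deficit = {t: 0 for t in queues}
    for _ in range(rounds):
        for tenant, q in queues.items():
            deficit[tenant] += quantum
            while q and deficit[tenant] >= 1:   # unit cost per message
                processed.append((tenant, q.popleft()))
                deficit[tenant] -= 1
            if not q:
                deficit[tenant] = 0             # no credit hoarding while idle
    return processed

queues = {"noisy": deque(range(100)), "quiet": deque(range(3))}
order = drr_schedule(queues, quantum=2, rounds=2)
print(order[:4])  # noisy and quiet interleave instead of noisy draining first
```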
10) Observability and SLO
Metrics:
- `projection_lag_ms` (source → read model), `freshness_age_ms` (since the last applied delta).
- Update throughput, error rate, DLQ rate, redrive success rate.
- Window size, read latency p95/p99.
- Tags: `tenant_id`, `entity_id`, `event_id`, `version`, `projection_name`, `attempt`.
- Annotations: merge decisions, skipped stale versions.
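The two core metrics above reduce to simple timestamp arithmetic; a sketch (timestamps in milliseconds, names taken from this section):

```python
# projection_lag_ms: source event time vs. the time it was applied to the read model.
# freshness_age_ms: how long ago the read model last received any delta.
def projection_lag_ms(event_ts_ms: int, applied_ts_ms: int) -> int:
    return max(0, applied_ts_ms - event_ts_ms)

def freshness_age_ms(last_applied_ts_ms: int, now_ms: int) -> int:
    return max(0, now_ms - last_applied_ts_ms)

print(projection_lag_ms(event_ts_ms=1_000, applied_ts_ms=1_450))  # 450
print(freshness_age_ms(last_applied_ts_ms=5_000, now_ms=65_000))  # 60000, at the SLO edge
```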
11) Playbooks (runbooks)
1. Lag growth: check the connector/broker, add partitions/consumers, enable prioritization of key read models.
2. Many schema errors: freeze redrive, migrate schemas (backfill), restart with a new mapper version.
3. Repeated DLQ hits: reduce batch size, enable a "shadow" handler, strengthen idempotency.
4. Window inconsistency: rebuild windows from the log/source per window (selectively by tenant/partition).
5. Hot keys: limit concurrency per key, add local queues, move the entity into a separate read model.
12) Full recount (rebuild) and backfill
Approach:
- Stop consumption (or switch to a new version of the read model).
- Recalculate in batches (by batch/date/tenant).
- Use a two-phase switch: first fill the new v2 table, then atomically switch read routing.
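The two-phase switch can be sketched in a few lines. Assumptions: in-memory dicts stand in for the v1/v2 tables, and a routing entry stands in for read routing:

```python
# Two-phase rebuild sketch: backfill a v2 read model in batches while v1 keeps
# serving, then atomically switch read routing. All names are illustrative.
source = [{"id": i, "total": i * 10} for i in range(10)]  # system of record
v1 = {r["id"]: r["total"] for r in source[:7]}            # stale/partial old model
v2 = {}

routing = {"read_orders": "v1"}  # readers resolve the active version here

def backfill(batch_size: int = 4) -> None:
    for start in range(0, len(source), batch_size):
        for row in source[start:start + batch_size]:
            v2[row["id"]] = row["total"]

def read(order_id: int):
    model = v1 if routing["read_orders"] == "v1" else v2
    return model.get(order_id)

backfill()                      # phase 1: fill v2 completely (v1 still serves)
routing["read_orders"] = "v2"   # phase 2: atomic routing switch
print(read(9))  # 90, served by v2; readers never saw "holes" during the backfill
```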
13) Schema evolution (versioning)
Carry `schema_version` in events/documents.
A projection can read several versions and migrate on the fly.
For major changes, build a new v2 read model and shift traffic via canary.
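A sketch of on-the-fly multi-version reading: the projection normalizes each `schema_version` into one internal shape before the upsert. The per-version field names here are invented for illustration:

```python
# A projection mapper that accepts several schema_versions of the same event.
def normalize(event: dict) -> dict:
    v = event.get("schema_version", 1)
    if v == 1:
        # v1 (hypothetical): flat amount in cents
        return {"order_id": event["order_id"], "total": event["amount_cents"] / 100}
    if v == 2:
        # v2 (hypothetical): nested money object with currency
        return {"order_id": event["order_id"], "total": event["money"]["amount"]}
    raise ValueError(f"unsupported schema_version: {v}")  # -> DLQ, not silent drop

rows = [normalize(e) for e in [
    {"schema_version": 1, "order_id": "o1", "amount_cents": 1250},
    {"schema_version": 2, "order_id": "o2", "money": {"amount": 20.0, "currency": "EUR"}},
]]
print([r["total"] for r in rows])  # [12.5, 20.0]
```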
14) Security and access
Inherit RLS/ACL from the source; never make a read model broader in access than the original data.
Mask PII in projections that do not need it for UX/analytics.
Audit redrives, rebuilds, and manual edits.
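PII masking fits naturally into the projection step, before the upsert. A minimal sketch; the field list is an assumption, and dropping fields entirely is the stricter alternative:

```python
# Mask PII at projection time so it never lands in read models that do not need it.
PII_FIELDS = {"email", "phone"}  # illustrative; drive this from a real data catalog

def mask(doc: dict) -> dict:
    out = {}
    for key, value in doc.items():
        out[key] = "***" if key in PII_FIELDS else value  # or drop the key entirely
    return out

print(mask({"order_id": "o1", "email": "a@b.c", "total": 10}))
# {'order_id': 'o1', 'email': '***', 'total': 10}
```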
15) Configuration template
```yaml
projections:
  read_orders:
    source: kafka.orders.events
    partition_key: "{tenant_id}:{order_id}"
    idempotency: version_ts
    upsert:
      table: read_orders
      conflict_keys: [tenant_id, order_id]
    freshness_slo_ms: 60000
    dlq:
      topic: orders.events.dlq
      redrive:
        batch: 500
        rate_limit_per_sec: 50
  read_sales_minute:
    source: cdc.orders
    partition_key: "{tenant_id}:{bucket_minute}"
    aggregate: increment
    retention_days: 90
limits:
  per_tenant_parallelism: 4
  per_key_serial: true
observability:
  metrics: [projection_lag_ms, dlq_rate, redrive_success, read_p95_ms]
```
16) Typical errors
"One read model for all cases" → heavy updates and bad p99.
Lack of idempotency → duplicates and jumps in aggregates.
Dual-writing directly to the read model and OLTP → discrepancies.
Zero visibility into freshness → conflicting expectations with the product.
Rebuild without a two-phase switch → "holes" in responses.
No partitioning/indexes → cost and latency growth.
17) Quick recipes
Catalog/search: document read model + incremental upsert, lag ≤ 5-15 s, indexes for the filters.
Dashboards: minute/hour buckets, `SUM/COUNT` aggregates, freshness p95 ≤ 60 s.
Personal feed: per-user projection + causal/RYW for the author, fallback to cache.
Global SaaS: regional read models, cross-region aggregates; fairness per tenant.
18) Pre-sale checklist
- The read model is designed for a specific query; indexes and partitions exist.
- The source of change is chosen (events/CDC); delivery guarantees and per-key ordering are in place.
- Idempotent upsert with versions/timestamps; protection against "stale" events.
- The freshness SLO is defined and reported (`as_of`/freshness).
- DLQ and safe redrive are configured; a rebuild/backfill playbook exists.
- Per-key serial processing and fairness per tenant.
- Lag/error/latency metrics, p95/p99 alerts, and DLQ-growth alerts.
- Schema versioning and a migration strategy (v2 + switch).
- Access/PII policies are inherited and validated.
Conclusion
Read Models and projections are an engineering accelerator for reads: you pay a small price in "freshness" and streaming infrastructure to get predictable millisecond reads and to offload the write core. Design read models around the query, make updates idempotent, measure lag, and promise freshness explicitly, and your APIs will stay fast as load, data, and geography grow.