Indexing and Query Optimization
1) Indexing and optimization goals
Latency: P50/P95/P99 reduction.
Throughput: QPS growth without scale-out.
Predictability: stable plans and no "jumps" in response time.
Savings: Less IO/CPU, less cloud bill.
Reliability: reducing locks and deadlocks due to correct access.
- Any optimization must maintain correctness and consistency.
- Track the effect in metrics and plan logs.
2) Basic index structures and when to apply them
2. 1 B-Tree (default)
Equals/ranges, sort, 'ORDER BY'.
Good for most time/ID/status filters.
2. 2 Hash
Pure equalities ('='), cheaper in memory but out of order (PG: constraints removed but still niche choice).
2. 3 GIN / GiST (PostgreSQL)
GIN: arrays/JSONB keys, full text (tsvector), containment ('@>').
GiST: geo, ranges, kNN.
2. 4 BRIN (PostgreSQL)
Super-cheap index by "naturally sorted" tables (append-only by time). Good for time-series with large tables.
2. 5 Bitmap (MySQL/InnoDB: none native; DW-DBMS/OLAP)
Effective for low cardinality and facets, more often in column storage.
2. 6 Column indices (ClickHouse)
Primary key + data skipping (minmax), secondary через `skip indexes` (bloom, set).
OLAP queries with aggregations and ranges.
2. 7 Inverted indexes (Elasticsearch/OpenSearch)
Full text, facets, hybrid search. For precise filters, use keyword fields and doc values.
2. 8 MongoDB
Single, compound, multikey (arrays), partial, TTL, text, hashed (for uniform key sharding).
3) Key and composite index design
3. 1 Left prefix rule
The order of the fields in the index determines usability.
Query'WHERE tenant_id =? AND created_at >=? ORDER BY created_at DESC` → индекс `(tenant_id, created_at DESC, id DESC)`.
3. 2 Tie-breaker
Add a unique tail (usually 'id') for stable sorting and seek pagination.
3. 3 Partial/filtered indices
Index only "hot" subsets:sql
CREATE INDEX idx_orders_paid_recent
ON orders (created_at DESC, id DESC)
WHERE status = 'paid' AND created_at > now() - interval '90 days';
3. 4 Covering indices
Include "readable" fields in the index (MySQL: 'INCLUDE' none; PG 11+: `INCLUDE`):sql
CREATE INDEX idx_user_lastseen_inc ON users (tenant_id, last_seen DESC) INCLUDE (email, plan);
3. 5 Functional/calculated
Normalize keys in index:sql
CREATE INDEX idx_norm_email ON users (lower(email));
4) Partitioning and sharding
4. 1 Partitioning (PG native/table inheritance; MySQL RANGE/LIST)
Rotation of parties by time ('daily/weekly') simplifies' VACUUM/DELETE '.
Indexes are local partitions → smaller than B-Tree, faster plan.
sql
CREATE TABLE events (
tenant_id bigint,
ts timestamptz,
...
) PARTITION BY RANGE (ts);
4. 2 Partitioning key
In OLTP - by'tenant _ id '(load localization).
In time-series/OLAP - by 'ts' (range queries).
Hybrid: '(tenant_id, ts)' + sub-parties.
4. 3 Sharding
Consistent hashing/range-shard by 'tenant _ id' or by time.
Cross-shard query → scatter-gather and k-way merge; hold the per-shard cursor.
5) Statistics, cardinality and plans
5. 1 Up-to-date statistics
Enable auto-analysis ('autovacuum/autoanalyze'), increase 'default _ statistics _ target' for dirty distributions.
5. 2 Advanced Statistics (PG)
Correlated columns:sql
CREATE STATISTICS stat_user_country_city (dependencies) ON country, city FROM users;
ANALYZE users;
5. 3 Execution plan
See 'EXPLAIN (ANALYZE, BUFFERS, VERBOSE)'; key fields:- `Rows`, `Loops`, `Actual time`, `Shared Read/Hit`, `Recheck Cond`.
- Типы join: Nested Loop, Hash Join, Merge Join.
- Seq Scan vs Index Scan/Only Scan/Bitmap Heap Scan.
5. 4 Stability of plans
Parameterization (prepared statements) can "stick" on a bad plan. Use plan cache guardrails (PG: 'plan _ cache _ mode = force_custom_plan' for problem queries) or "forwarding" constants.
6) Optimization of joins and sorts
6. 1 Strategies
Nested Loop: small external, fast index on internal.
Hash Join: large sets, enough memory for hash table.
Merge Join: sorted entries, advantageous in the order already available.
6. 2 Indexes under join
For'A JOIN B ON B.a_id = A.id ', → the index to'B (a_id)'.
For the filter after join - the index on the columns of the filter of the internal table.
6. 3 Triage
Avoid'ORDER BY'without a corresponding index; sorting on large sets is expensive by memory/disk.
7) Query rewrite
Get rid of the "snowflakes" of subqueries; expand in JOIN.
Use CTE-inline (PG ≥12 CTE default inlines, but'MATERIALIZED 'can commit an intermediate result if needed).
Remove'SELECT '→ list the fields (IO/network savings).
Transfer calculations from'WHERE 'to the indexed form (pre-computed columns).
Aggregations: preliminary summary tables/materialized views with incremental update.
8) Butching, limiting and pagination
Batch-insert/update: 500-5000 batches instead of one by one.
Seek pagination by '(sort_key, id)' instead of deep 'OFFSET'.
Limit the dialing before sorting/joyne (push-down 'LIMIT').
9) Caching and denormalization
Application-level query-cache (key = SQL + bind-vars + rights version).
Materialized views for heavy aggregates; rotation/reference plan.
Denormalization - Store frequently read calculated fields (price including discount) but with trigger/background task for consistency.
Redis as L2 for hot keys (with TTL and event disability).
10) The specifics of popular engines
10. 1 PostgreSQL
Индексы: B-Tree, Hash, GIN/GiST, BRIN, partial, functional, INCLUDE.
Example:sql
CREATE INDEX idx_orders_tenant_created_desc
ON orders (tenant_id, created_at DESC, id DESC)
INCLUDE (amount, status);
Full Text:
sql
CREATE INDEX idx_docs_fts ON docs USING GIN (to_tsvector('russian', title ' ' body));
10. 2 MySQL/InnoDB
Composite, spanning indexes (by including fields in the key), invisible indexes for tests:sql
ALTER TABLE orders ALTER INDEX idx_old INVISIBLE; -- check risk-free plans
Histogram statistics ('ANALYZE TABLE... UPDATE HISTOGRAM` в 8. 0).
10. 3 ClickHouse
Primary key = sort; 'ORDER BY (tenant_id, ts, id)'.
Skip indexes:sql
CREATE TABLE events (
tenant_id UInt64,
ts DateTime64,
id UInt64,
payload String,
INDEX idx_bloom_payload payload TYPE bloom_filter GRANULARITY 4
) ENGINE = MergeTree()
ORDER BY (tenant_id, ts, id);
10. 4 MongoDB
Composite/cartoons: order is important, filter and sort must match index:js db. orders. createIndex({ tenant_id: 1, created_at: -1, _id: -1 });
db. orders. createIndex({ status: 1 }, { partialFilterExpression: { archived: { $ne: true } } });
Use 'hint ()' for diagnostics, watch for 'covered query'.
10. 5 Elasticsearch/OpenSearch
Keyword vs text fields; doc_values for sorting/aggregates.
Heap segmentation: aggregations - heavy; restrict'size 'and use'composite' aggregations (paging).
Do not include analyzers where an accurate comparison is required.
11) Competitiveness, interlocks and MVCC
Short transactions; avoid "long" reads under 'REPEATABLE READ' needlessly.
Index operations also take locks (write throughput reduction).
Plan online indexing: 'CREATE INDEX CONCRETELY' (PG), 'ALGORITHM = INPLACE '/' ONLINE' (MySQL).
Inserts in the tail for an hour/ID → "hot pages" of the index; distribute the key (UUIDv7/salt).
12) Observability and SLO
Metrics:- 'db _ query _ latency _ ms' (P50/P95/P99) by query name.
- `rows_examined`, `rows_returned`, `buffer_hit_ratio`.
- `deadlocks`, `lock_wait_ms`, `temp_sort_disk_usage`.
- Share of plans with'Seq Scan'where'Index Scan' was expected.
- Regression alerts when changing the version/parameters of the DBMS.
- Enable the slow query log with a threshold (for example, 200 ms).
- Correlation of queries with spans (trace_id).
- Remove problem query plans and save to object storage for retrospective.
- Read P95 '<= 150 ms' with'LIMIT <= 50'and hot tenant.
- P95 records' <= 200 ms' with batches up to 1000 lines.
13) Safety and multi-tenancy
Indexes on access control fields ('tenant _ id', 'owner _ id') are required.
Policies (RLS/ABAC) must be pre-filter; otherwise, the optimizer plans incorrectly.
Do not index sensitive fields in clear text; use hashes/tokens.
14) Anti-patterns
Deep 'OFFSET' without seek-cursor alternative.
"One index for all" - memory overload and write-path.
'SELECT'in critical paths.
Functions above column in'WHERE 'without function index.
Unstable plans due to old statistics.
Missing'ORDER BY'while waiting for stable order.
Indexes for the sake of indexes: ROI <0 due to expensive write/support.
15) Implementation checklist
1. Top N requests by QPS and time → select 3-5 candidates.
2. Remove plans' EXPLAIN ANALYZE ', check cardinality vs actual.
3. Design indexes: field order, INCLUDE/partial/functional.
4. Implement partitioning for large tables (temporary/tenant keys).
5. Overwrite queries: remove'SELECT ', inline simple CTEs, restrict set.
6. Enable butching and seek pagination.
7. Configure cache: L1/L2, disability by events.
8. Introduce monitoring of plans and slow-log, alerts for regressions.
9. Perform load tests with real data distribution.
10. Update development guidelines (ORM hints, indexing, limits).
16) Before/after examples
Before:sql
SELECT FROM orders
WHERE status = 'paid'
ORDER BY created_at DESC
LIMIT 50 OFFSET 5000;
After:
sql
-- Индекс: (status, created_at DESC, id DESC) INCLUDE (amount, currency)
SELECT id, amount, currency, created_at
FROM orders
WHERE status = 'paid'
AND (created_at, id) < (:last_ts,:last_id) -- seek
ORDER BY created_at DESC, id DESC
LIMIT 50;
17) ORM and API protocols
Avoid N + 1: greedy samples ('includes', 'JOIN FETCH', 'preload').
Explicit field projections, paginate by cursor.
gRPC/REST: limit 'page _ size', fix 'sort', use opaque tokens.
Plan cache: use parameterization; do not generate "unique" SQL per call.
18) Migrations and Operations
Add indexes online and mark as INVISIBLE/CONCURRENT, test plans, then switch.
Index revisions - regular sanitary cleaning: duplicates, unused, "dead" for old features.
Party rotation plan (drop old) and 'VACUUM/OPTIMIZE' schedule.
19) Summary
Query optimization is systems engineering: correct keys and indexes, neat plans, thoughtful partitioning and sharding, discipline in queries and ORM, caching and observability. By following the described patterns, you will get a fast, predictable and economical system that is resistant to data growth and load.