KPIs and Benchmarks

KPIs and benchmarks

KPIs (Key Performance Indicators) translate strategy into measurable goals, and benchmarks give a "horizon line" - what to compare the results with (yesterday, competitors, market). Below is a practical framework: from the choice of metrics and goals to normalization, statistics, visualization and management rituals.

1) Taxonomy of metrics

North Star Metric (NSM): A key measure of product value (e.g., Active Paying Users for 30 Days).
Outcome vs Process: result (revenue, retention) and process (release speed, SLA fichestore).
Leading vs Lagging: Leading predictors (step conversion) and lagging totals (LTV).
Guardrail metrics: safety restrictions (FPR models ≤ 1%, p95 latency ≤ 200 ms).
Hierarchy: corporate → product/functional → team → individual.

2) Good KPI: Criteria

SMART: Specific, Measurable, Achievable, Relevant, Time-bound.
Controllability: KPIs are influenced by team, not external volatility.
Low manipulability: resistant to "cheating," described calculation method and data sources.
Signaling: sensitive to change but not noisy (reasonable variance).

3) Formulas and standards (constructor)

Activity: DAU/WAU/MAU, Stickiness = DAU/MAU.
Hold: Retention d = Users active day d/Cohort size; Churn = 1 − Retention.
Conversion: CR = Conversions/Visitors (per funnel - per-step CR).
Monetization: ARPU = Revenue/Users; ARPPU = Revenue / Paying users; LTV = Σ (Net cashflowt · discountt).
Quality of models: ROC-AUC/PR-AUC; logloss; Calibration (Brier); Recall@FPR≤x%; uplift@k.
Operations/Infrastructure: Availability = Uptime/Total time; SLA breach rate; p50/p95/p99 latency.
Data: Freshness, Completeness, Consistency, PSI.
Development: Deploy Frequency, Lead Time for Changes, Change Failure Rate, MTTR.

💡 It is recommended to fix the slice date, source and math (SQL/laptop) for each KPI.

4) Goal setting: OKR + KPI

OKR: "Lens → 3-5 measurable results (KR)." KPI - KR numeric form.

Targets:

Commit (base bar, ≥80% probability).
Stretch (ambitiously, 30-50%).
Ceiling (top of the reasonable).
Increment vs absolute: target is set as Δ (for example, "+ 10% to Retention D30") or as level ("MAU ≥ 1 million").

5) Benchmarks: where to get the "norm"

Internal: past periods (YoY/Yo2Y), neighboring markets/segments, control groups, best teams.
External: industry reports, open datasets, academic benchmarks for models (MNIST/GLUE/ROCStories, etc. - by domain).
Competitive: market intelligence, public metrics, regulator/association reviews.

Comparison types:

Absolute: KPI ≥ industry threshold.
Percentile: "in the top 25% of the market."
Gap analysis: Δ to median/leader; rate of rupture closure.

6) Normalization and adjustments

Seasonality and calendar: holidays, promotions, weekends → use seasonal indices or YoY comparison.
Mix shifts: the structure of traffic/segments has changed → do mix-adjusted KPI (weighing).
Smoothing: EMA/7-day medians for tactical reviews; store both "raw" and smoothed rows.
Sampling and scales: result in "per user/session/1000 requests"; watch for denominator stability.

7) Statistics and reliability

Trust in change: effect ≥ minimal significant (MDE); confidence intervals (bootstrap).
A/B culture: guardrail metrics (errors/latency); time of experiment ≥ full cycle of user.
Anomalies and outliers: robust metrics (median, Huber), vinzorization p1/p99.
Small samples: Bayesian intervals; aggregation by week.

8) Dashboards and management rituals

Layers: Executive (NSM + 3-5 leaders), Product/Domain (funnels, cohorts), Ops/ML (SLA, drift, model metrics).
Graph standards: YoY/DoD, quantiles p50/p95, factorization (mix, price, volume).
Rhythms: daily standup (incidents/alerts), weekly review (tactics), monthly QBR (strategy), quarterly OKR retrospectives.
Runbooks: what to do if the KPI is rejected (RCA → threshold → correction plan).

9) Anti-patterns and risks

Goodhart's Law: "when a metric is a goal, it ceases to be a metric." Use metric and guardrails packages.
Proxy optimization: click growth without revenue growth; track North Star.
Not accounting for delays: KPIs of the "effect" are late - keep leading metrics.
Change of definition: "hidden" formula editing breaks trends → versionize KPIs and store the dictionary of terms.
Funnel without a denominator: an increase in conversion when traffic falls - show both absolutes and fractions.

10) KPI map by area (cheat sheet)

Area	KPI Core	Guardrails
Product/growth	NSM, MAU/WAU/DAU, Retention D7/D30, Activation rate	Crash-free %, NPS/CSAT
Marketing	CAC, ROMI, CPL/CPA, Organic share	Spam rate, Brand safety
Sales	Win rate, Pipeline velocity, ACV	Churn MRR, Discount rate
Monetization	ARPU/ARPPU, LTV, Take rate	Refund %, Chargeback rate
Data	Freshness, Completeness, PSI	Data SLA, Schema errors
Models	PR-AUC, Recall@FPR≤x%, Calibration	Latency p95, Drift alerts
Infra/DevOps	Availability, MTTR, Change Failure Rate	Error budget burn

11) KPI & Benchmark Implementation Process

1. Define the objective and impact hypothesis (which action drives the KPI).
2. Describe the formula, source, frequency, aggregation levels (day/week/month, segments).
3. Select benchmarks (internal/external), agree on targets (commit/stretch).
4. Collect dashboard and alerts (threshold, hysteresis, suppression of the window).
5. Start a cycle of reviews (weekly/monthly), record decisions and effect.
6. Audit once a quarter: relevance, manipulability, communication with NSM.
7. Version: KPI v1 → v2 (history recalculation/mapping).

12) Patterns and artifacts

KPI passport template

Name and code: 'RET _ D30 _ v2'

Definition: Proportion of cohort users returning on day 30

Formula/SQL: link to laptop/script (versioned)

Data source: showcase 'dm _ user _ cohorts _ v3'

Granularity/latency: daily, lag ≤ 12 h

Segmentation: country, channel, platform

Guardrails: sampling error ≤ 2 pp; emissions vinzorize p1/p99

Owner/Contacts Product Intelligence Team

Revision History - Version/Date History

Target template (KPI-target)

Base (Q0): 24% Retention D30

Commit (Q1): 26% (YoY neutralized)

Stretch: 28%

Initiatives: improving onboarding, recommendations, email chains

Risks: seasonality, traffic mix change

Impact check: A/B, causal lift

13) Metrics quality checklist

Formula and source documented, KPI versioned
There is segmentation and guardrails
Seasonality and mix change taken into account
Confidence intervals/bootstrap on dashboard
Alerts with hysteresis; Runibook for deviations
Quarterly audit of KPI portfolio

Total

The key to control is not in the "ideal" one metric, but in a balanced set of KPIs associated with North Star, equipped with clear benchmarks, correctly normalized and built into decision-making rituals. This contour makes goals transparent, comparisons honest, and changes manageable.

KPIs and Benchmarks