Data visualization

Visualization is a way to turn data into decisions. A good chart saves time, reduces cognitive load, and helps readers see real patterns rather than imagined ones. Below is a field guide: from goals and chart selection to design, storytelling, and running dashboards in production.

1) Goals and audiences

Goals: exploration (EDA), explanation (insight → action), monitoring (dashboards), persuasion (presentations).
Audiences: management (high-level view and trends), product/marketing (funnels, cohorts), engineers/ML (SLA, drift, model metrics), compliance (risks/controls).
The golden rule: one visualization answers one main question.

2) Chart selection (cheat sheet)

Question	Data	Chart
Compare values	Categories (up to 15)	Bar chart (horizontal for long labels)
Change over time	Time series	Line chart, area (for cumulative values), sparklines
Distribution	Continuous	Histogram, KDE, box/violin
Part-to-whole	Whole → parts	Stacked bar/100% stacked bar; donut/pie - for 2-3 parts only
Correlations	Two or more variables	Scatter/bubble, heatmap, pair plots
Ranks/leaderboards	Sorted values	Ranked bar chart, dumbbell
Compositions	Many metrics	Small multiples, facets
Flows	Transitions	Sankey, alluvial, chord (use with caution)

Anti-patterns: 3D charts, dual axes without a clear need, overloaded legends.
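
The cheat sheet above can be encoded as a small lookup, which is handy for linting dashboards or seeding chart templates. This is a minimal sketch: the key names (`compare`, `trend`, and so on), the function `suggest_chart`, and the "top 15 plus Other" fallback are illustrative assumptions, not part of any library.

```python
from typing import Optional

# Hypothetical keys that mirror the rows of the cheat sheet.
CHART_BY_QUESTION = {
    "compare": "bar chart",
    "trend": "line chart",
    "distribution": "histogram",
    "part_to_whole": "stacked bar",
    "correlation": "scatter plot",
    "ranking": "sorted bar chart",
    "many_metrics": "small multiples",
    "flow": "Sankey diagram",
}


def suggest_chart(question: str, n_categories: Optional[int] = None) -> str:
    """Return a default chart type for a question, following the cheat sheet."""
    if question not in CHART_BY_QUESTION:
        raise ValueError(f"unknown question type: {question!r}")
    # Pie/donut is only suggested for 2-3 parts, as the table recommends.
    if question == "part_to_whole" and n_categories is not None and n_categories <= 3:
        return "pie or donut chart"
    # Assumption: beyond 15 categories, collapse the tail into "Other".
    if question == "compare" and n_categories is not None and n_categories > 15:
        return "bar chart (top 15 + Other)"
    return CHART_BY_QUESTION[question]
```

For example, `suggest_chart("part_to_whole", 3)` returns a pie or donut, while the same question with eight parts falls back to a stacked bar.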

3) Composition and readability

Hierarchy: title → key insight → details.
Grid and spacing: remove unnecessary gridlines; use numeric labels sparingly, only where they help.
Fonts: three sizes (title, axes, labels); avoid all caps and tiny text.
Annotations: label peaks/anomalous points and policy/campaign changes.
Dashboard layout: follow a "Z" or "F" reading pattern, 3-6 cards per screen, one NSM (North Star Metric) on top.

4) Color and coding

Color semantics: categorical data - qualitative palettes; ordinal data - sequential gradients; diverging palettes - for "above/below normal."
Contrast: ratio ≥ 4.5:1 for text; check that palettes are color-blind safe.
Minimum colors: ideally 1 accent + 1-2 auxiliary colors.
Encoding channels: position/length first, then angle/area; color only as reinforcement.
Accent: highlight the main thing; keep the rest gray.
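
The 4.5:1 text contrast threshold can be checked in code. The sketch below follows the WCAG 2.x definition of relative luminance and contrast ratio for sRGB colors; the function names are illustrative, and the linearization cutoff of 0.04045 is the sRGB value (older WCAG text uses 0.03928, which gives the same result for 8-bit colors).

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb) -> float:
    """Relative luminance of an (R, G, B) tuple with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg, bg) -> float:
    """Contrast ratio between two colors, from 1 (identical) to 21."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_text_contrast(fg, bg, threshold: float = 4.5) -> bool:
    """True if the pair meets the given minimum contrast for text."""
    return contrast_ratio(fg, bg) >= threshold
```

Running this over every text/background pair in a chart theme is a cheap way to catch pale gray labels before they ship.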

5) Storytelling

Frame: context → conflict (question/anomaly) → resolution (conclusion/action).
Narrative on the chart: leading title (the insight), subtitle (how to read it), notes (why it matters).
Comparisons: before/after, control/test, YoY/DoD, normalized values.
Units and scales: explicit units, reasonable rounding, zero baseline on bar charts.
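
"Explicit units, reasonable rounding" usually means labels such as 1.2M rather than 1234567.0. A minimal sketch of such a label formatter follows; the name `format_compact`, the K/M/B suffixes, and the one-decimal precision are assumptions chosen for illustration.

```python
def format_compact(value: float, unit: str = "") -> str:
    """Format a number with at most one decimal and a K/M/B suffix."""
    suffix = ""
    scaled = float(value)
    for factor, label in ((1e9, "B"), (1e6, "M"), (1e3, "K")):
        if abs(value) >= factor:
            scaled = value / factor
            suffix = label
            break
    text = f"{scaled:.1f}"
    # Drop a trailing ".0" so whole numbers stay short ("42", not "42.0").
    if text.endswith(".0"):
        text = text[:-2]
    return f"{text}{suffix}{unit}"
```

A call such as `format_compact(1500, " USD")` yields a label with the unit attached, which keeps axis ticks and KPI tiles consistent.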

6) Dashboards: from layout to operation

Layers: Executive (1-2 NSM + 3 drivers), Domain (funnels/cohorts), Ops/ML (SLA/drift/alerts).
Filters: time, segments (country/channel/platform), experiments.
Cards: KPI tiles with a trend/sparkline; drill-down on click.
States: empty (no data), error, loading.
Update: show the refresh frequency/lag (e.g. "updated 10 min ago").
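
A freshness label like "updated 10 min ago" is easy to derive from the last refresh timestamp. The sketch below is one possible implementation; the function name and the exact wording of the labels are assumptions.

```python
from datetime import datetime, timedelta, timezone


def freshness_label(last_updated: datetime, now: datetime) -> str:
    """Return a short human-readable data freshness label."""
    seconds = int((now - last_updated).total_seconds())
    if seconds < 0:
        raise ValueError("last_updated is in the future")
    if seconds < 60:
        return "updated just now"
    minutes = seconds // 60
    if minutes < 60:
        return f"updated {minutes} min ago"
    hours = minutes // 60
    if hours < 24:
        return f"updated {hours} h ago"
    return f"updated {hours // 24} d ago"


# Example: pass timezone-aware timestamps so the lag is unambiguous.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
label = freshness_label(now - timedelta(minutes=10), now)
```

Passing `now` explicitly keeps the function deterministic and easy to test.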

7) Visualization quality metrics

Time to insight (TTI): seconds needed to understand "what is happening here."
Cognitive load: number of elements/legends; the goal is to minimize gaze shifts.
Reading accuracy: discrepancy between "by eye" estimates and real values.
Usage: clicks/scrolling/saves; which cards lead to decisions.
Trust: the proportion of correct interpretations in a user test.
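
Two of these metrics reduce to simple arithmetic over user-test results. The sketch below assumes reading accuracy is measured as the mean absolute relative error of by-eye estimates and trust as the share of correct answers; both definitions and the function names are illustrative choices, not a standard.

```python
def reading_error(estimates, actuals) -> float:
    """Mean absolute relative error of 'by eye' estimates vs real values."""
    if len(estimates) != len(actuals) or not actuals:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [abs(e - a) / abs(a) for e, a in zip(estimates, actuals)]
    return sum(errors) / len(errors)


def correct_share(answers) -> float:
    """Share of correct interpretations; `answers` is a list of booleans."""
    if not answers:
        raise ValueError("no answers recorded")
    return sum(1 for a in answers if a) / len(answers)
```

Tracking both numbers across chart redesigns shows whether a change actually made the chart easier to read.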

8) Accessibility and localization

Alt texts and descriptive headings.
Colors distinguishable for color-blind users; back up color with shape/stroke.
Number/date locales; right-to-left scales for some languages.
Keyboard navigation and screen-reader shortcuts for web dashboards.

9) Anti-patterns

Chartjunk: decorative elements that carry no meaning.
Pies with 7+ sectors: replace with a bar chart.
Two Y-axes without a clear need: it is better to normalize or show two panels.
False precision: 12 decimal places, broken (truncated) scales without warning.
Excessive interactivity: it hides the main idea; start with a static key view.

10) Visualization templates for common data tasks

Cohorts and retention: heatmap/calendar + D7/D30 trend lines.
Funnels: step bars + conversion deltas; annotations for experiments.
ML monitoring: metrics (PR-AUC, Recall@FPR≤x%), calibration (reliability curve), drift (PSI heatmap), latency p95.
Finance: waterfall (bridge) for factor contributions to GGR/revenue.
Anomalies: line with a confidence band + event/release markers.
Segmentation: small multiples by segment; UMAP scatter colored by segment.
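
The drift heatmap mentioned above needs a PSI (Population Stability Index) value per feature and period. A minimal sketch of the usual formula, sum over bins of (actual share - expected share) × ln(actual share / expected share), is shown below; the function name and the `eps` floor for empty bins are assumptions.

```python
import math


def psi(expected_counts, actual_counts, eps: float = 1e-6) -> float:
    """Population Stability Index for two histograms with identical bins."""
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both histograms must use the same bins")
    e_total, a_total = sum(expected_counts), sum(actual_counts)
    if e_total == 0 or a_total == 0:
        raise ValueError("histograms must not be empty")
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        # Floor the shares so empty bins do not produce log(0).
        e_share = max(e / e_total, eps)
        a_share = max(a / a_total, eps)
        total += (a_share - e_share) * math.log(a_share / e_share)
    return total
```

A commonly cited rule of thumb treats values below 0.1 as stable and above 0.25 as significant drift; these thresholds are conventions and should be tuned per metric.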

11) Tools and stack

Research: notebooks + matplotlib/plotly, ggplot-like grammars.
BI/dashboards: Tableau/Power BI/Looker/Metabase/Superset.
Web front end: D3/Observable, Plotly.js, Vega-Lite; for production widgets - lightweight canvas/WebGL libraries.
Standards: a chart design system (colors, grids, fonts), template components.

12) Performance and data

Calculate aggregates on the DWH side; lazily load large series.
Downsampling/binning for long series; small multiples instead of a giant heatmap.
Cache popular slices; precompute sparklines.
Limit the number of unique categories (≤ 12 per chart).
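
Two of the points above, downsampling long series and capping the number of categories, can be sketched in a few lines. Bucket averaging is only one of several downsampling strategies, and the names `downsample_mean`, `cap_categories`, and the "Other" label are illustrative assumptions.

```python
def downsample_mean(values, bucket_size: int):
    """Reduce a long series by averaging consecutive buckets of points."""
    if bucket_size < 1:
        raise ValueError("bucket_size must be at least 1")
    result = []
    for start in range(0, len(values), bucket_size):
        bucket = values[start:start + bucket_size]
        result.append(sum(bucket) / len(bucket))
    return result


def cap_categories(counts: dict, max_categories: int = 12,
                   other_label: str = "Other") -> dict:
    """Keep the largest categories and merge the tail into one bucket."""
    if max_categories < 2:
        raise ValueError("max_categories must be at least 2")
    if len(counts) <= max_categories:
        return dict(counts)
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    kept = dict(ranked[:max_categories - 1])
    # The merged bucket counts as one of the max_categories slots.
    kept[other_label] = sum(v for _, v in ranked[max_categories - 1:])
    return kept
```

Averaging smooths out spikes, so for anomaly charts a min/max-preserving method may be a better fit than the mean used here.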

13) Uncertainty and comparison visualization

Confidence intervals/bands, error bars, box/violin for distributions.
Transparency/hatching for plan vs. actual.
Normalize the units; for relative changes - index (t0 = 100).
Do not mix linear and logarithmic scales without explicit explanation.
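
Indexing to a base period (t0 = 100) and computing an interval for an error bar are both short calculations. The sketch below uses a normal approximation (z = 1.96) for a 95% interval of the mean, which is an assumption that suits reasonably large samples; the function names are illustrative.

```python
import math
import statistics


def index_to_base(values, base: float = 100.0):
    """Rescale a series so that its first point equals `base`."""
    first = values[0]
    if first == 0:
        raise ValueError("cannot index a series that starts at zero")
    return [v * base / first for v in values]


def mean_ci(sample, z: float = 1.96):
    """Return (low, mean, high) using a normal approximation.

    Requires at least two observations; for small samples a
    t-distribution quantile would be more appropriate than z.
    """
    mean = statistics.mean(sample)
    half_width = z * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - half_width, mean, mean + half_width
```

Indexed series let metrics with different units share one axis, which avoids the dual-axis anti-pattern.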

14) Visualization review and governance

Review checklist: Is the goal clear? Is the chart type right? Is the legend readable? Are units/source/update date shown?
Glossary: uniform KPI definitions; formula version shown on charts.
Versioning: "dashboard vX," release date, changelog.
Security: mask PII; aggregate to a safe level.
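
"Aggregate to a safe level" often means hiding groups that are too small to be anonymous. The sketch below suppresses segments under a minimum size before they reach a chart; the threshold of 5, the function name, and the label are assumptions, and real privacy rules depend on your own policy.

```python
def suppress_small_groups(counts: dict, min_group_size: int = 5,
                          label: str = "Other (suppressed)") -> dict:
    """Hide segments with fewer than `min_group_size` members."""
    safe = {k: v for k, v in counts.items() if v >= min_group_size}
    hidden_total = sum(v for v in counts.values() if v < min_group_size)
    # Only show the merged bucket if it is itself large enough;
    # otherwise it would still expose a small group.
    if hidden_total >= min_group_size:
        safe[label] = hidden_total
    return safe
```

Note that dropping the merged bucket changes the visible total, so the chart should state that small groups are excluded.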

15) Pre-publication checklist

Title states the insight, not the chart type
Axis labels/units/source/update date are present
Scales and zero baseline are correct; no misleading axes
Colors have sufficient contrast and are color-blind safe; legend is minimal
Annotations for key events/experiments are added
Empty/error states exist and an update SLA is agreed
Visualization passes the "5-second comprehension test"

Mini Glossary

Small multiples: a series of identical graphs for different segments/periods.
Chartjunk: visual "garbage" that does not carry data.
Diverging palette: a palette with a neutral middle (below/above normal).
Sparklines: miniature trend charts shown alongside KPIs.

Summary

Strong visualization is not about "beautiful charts" but about a clear message, the right chart type, disciplined composition and color, an honest display of uncertainty, and a tidy dashboard experience. Start with a simple view, emphasize the main point, document definitions, and monitor how dashboards are used and maintained: this is how visualization becomes a management tool rather than decoration.
