Correlation and Causation
Correlation captures joint changes in variables. Causation answers a different question: what happens if we intervene? In analytics, product, and risk management, it is the causal effect that carries value: it lets you evaluate the incremental impact of a decision, not just an association.
1) Basic concepts
Correlation (association): a statistical relationship with no interpretation of "why." It may arise from a common cause, reverse causation, or chance.
Treatment effect: the expected difference between the world "with intervention" and "without intervention."
Counterfactual: the unobservable outcome "what would have happened to the same object without the intervention."
Confounder: a variable that affects both the cause and the outcome → creates a spurious relationship.
Collider: a variable affected by both the cause and the outcome; conditioning on a collider distorts the association.
Simpson's paradox: the direction of the effect flips once a hidden variable/segment is taken into account (see the sketch below).
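A minimal numerical sketch of Simpson's paradox (all numbers are hypothetical): within each segment variant B converts better, yet the pooled rates favor A because A is over-represented in the "easy" segment.

```python
import pandas as pd

# Hypothetical segment-level data
data = pd.DataFrame({
    "segment": ["easy", "easy", "hard", "hard"],
    "variant": ["A", "B", "A", "B"],
    "users":   [800, 200, 200, 800],
    "conversions": [640, 164, 40, 176],
})

# Per-segment conversion rates: B wins in both segments
per_segment = data.assign(rate=data["conversions"] / data["users"])
print(per_segment[["segment", "variant", "rate"]])

# Pooled rates: A appears to win once segments are collapsed
pooled = data.groupby("variant")[["users", "conversions"]].sum()
pooled["rate"] = pooled["conversions"] / pooled["users"]
print(pooled)
```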
2) When correlation is sufficient and when it is not
Descriptive analytics, monitoring, EDA: correlations/ranks/heatmap → detect hypotheses and risks.
Decision-making and impact assessment: causal methods (experiments or quasi-experiments) are required.
Prediction models: correlations are useful, but for ROI and policy decisions move to causal estimates or uplift models.
3) Experiments: Gold Standard
A/B tests (randomization): eliminate confounding, make groups comparable.
Guardrails: duration ≥ one cycle of behavior, stable exposure, control of seasonality and interference (spillover).
Metrics: effect, confidence intervals, MDE/power, heterogeneity of effect by segment (Heterogeneous Treatment Effect).
Practice: canary releases, phased rollout, CUPED/covariate control to reduce variance.
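A minimal sketch of CUPED-style covariate adjustment for an A/B test, using simulated data and illustrative parameter values: the pre-experiment covariate shrinks variance before the effect and its confidence interval are computed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000
pre = rng.normal(100, 20, size=2 * n)          # pre-experiment metric (covariate)
treated = np.repeat([0, 1], n)                 # randomized assignment
true_effect = 1.5                              # hypothetical lift
post = pre * 0.8 + rng.normal(0, 10, size=2 * n) + true_effect * treated

# CUPED adjustment: y_adj = y - theta * (x - mean(x)), theta = cov(x, y) / var(x)
theta = np.cov(pre, post)[0, 1] / np.var(pre)
post_adj = post - theta * (pre - pre.mean())

# Effect estimate with a normal-approximation 95% confidence interval
diff = post_adj[treated == 1].mean() - post_adj[treated == 0].mean()
se = np.sqrt(post_adj[treated == 1].var(ddof=1) / n
             + post_adj[treated == 0].var(ddof=1) / n)
print(f"effect ≈ {diff:.2f}, 95% CI ≈ ({diff - 1.96 * se:.2f}, {diff + 1.96 * se:.2f})")
```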
4) If experiment is not possible: quasi-experiments
Difference-in-Differences (DiD): the difference in before/after changes between "test" and "control." Key assumption: parallel trends before the intervention (see the sketch after this list).
Synthetic control: build a "synthetic" control as a weighted mixture of donor groups. More robust to differing trend dynamics.
Regression Discontinuity (RDD): a threshold rule assigns the treatment; compare units on both sides of the threshold. Important: no "manipulation" of the threshold.
Instrumental variables (IV): the instrument affects the treatment but affects the outcome only through the treatment. Required: relevance and validity of the instrument.
PSM/Matching: match test and control on similar covariates; useful as preprocessing, but does not eliminate hidden confounders.
Interrupted Time Series (ITS): evaluation of a trend break at a policy point in the absence of other shocks.
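A minimal DiD sketch on simulated panel data (hypothetical effect size and noise, statsmodels for the regression): the coefficient on the treated×post interaction is the effect estimate, valid under parallel trends.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 2_000
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),   # test vs control group
    "post": rng.integers(0, 2, n),      # before vs after intervention
})
true_effect = 3.0                        # hypothetical effect
df["y"] = (10 + 2 * df["treated"] + 5 * df["post"]
           + true_effect * df["treated"] * df["post"]
           + rng.normal(0, 2, n))

# DiD regression: the treated:post coefficient is the effect estimate
model = smf.ols("y ~ treated + post + treated:post", data=df).fit()
print(model.summary().tables[1])
```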
5) Causal graphs and the back-door/front-door criteria
DAG (directed acyclic graph): a visual map of causal relationships. Helps you choose which variables to adjust for.
Back-door criterion: block all back-door paths (confounders) to obtain an unbiased effect estimate (see the sketch after this list).
Front-door criterion: use a mediator that fully carries the influence to bypass hidden confounders.
Do not condition on colliders or on descendants of the outcome: this introduces bias.
Practice: first draw a DAG with domain experts, then choose the minimal set of covariates.
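A minimal back-door sketch on simulated data (all coefficients are hypothetical): Z confounds T → Y, so regressing Y on T alone is biased, while adjusting for Z (blocking the back-door path T ← Z → Y) recovers the true effect.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 10_000
z = rng.normal(size=n)                           # confounder
t = (z + rng.normal(size=n) > 0).astype(float)   # treatment depends on Z
y = 2.0 * t + 3.0 * z + rng.normal(size=n)       # true effect of T is 2.0

naive = sm.OLS(y, sm.add_constant(t)).fit()
adjusted = sm.OLS(y, sm.add_constant(np.column_stack([t, z]))).fit()
print("naive estimate:   ", round(naive.params[1], 2))     # biased upward
print("adjusted estimate:", round(adjusted.params[1], 2))  # ≈ 2.0
```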
6) Potential outcomes and effect estimates
ATE/ATT/ATC: mean effect over everyone / the treated / the controls (see the sketch after this list).
CATE/HTE: effect by segment (country, channel, risk class).
Uplift modeling: train the model to rank objects by the expected increase from the intervention, not by the baseline probability of the event.
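A minimal potential-outcomes sketch on simulated data (effect sizes are hypothetical): because both Y(0) and Y(1) are known only in simulation, ATE, ATT, and ATC reduce to averages of individual effects over different subsets.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
x = rng.normal(size=n)                     # covariate driving effect size
y0 = 1.0 + x + rng.normal(size=n)          # potential outcome without treatment
y1 = y0 + 2.0 + 0.5 * x                    # potential outcome with treatment
treated = (x + rng.normal(size=n) > 0)     # treatment correlates with x

tau = y1 - y0                              # individual treatment effects
print("ATE:", tau.mean())                  # averaged over everyone
print("ATT:", tau[treated].mean())         # averaged over the treated
print("ATC:", tau[~treated].mean())        # averaged over the controls
```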
7) Frequent traps
Reverse causality: e.g., "more discounts ↔ falling demand," where the discounts react to the fall in demand, not the other way around.
Omitted variables: unrecorded stock levels/seasonality/regional changes.
Survivorship bias: analyzing only those who "remained."
Leakage: use of future information in training/assessment.
Mixing metrics: optimizing proxy metrics instead of the business effect (Goodhart).
Regression to the mean: a natural return to the trend masquerades as an "effect."
8) Causality in product, marketing and risk
Marketing/campaigns: uplift targeting, differentiated contact frequency, causal LTV estimates, ROMI via DiD/synthetic control.
Pricing/promotion: RDD (threshold rules), SKU/region sampling experiments.
Recommendations: off-policy evaluation (IPS/DR) and bandits; account for interference.
Anti-fraud/RG policies: handle causality with care, since blocks change behavior and the data; use quasi-experiments and guardrails on FPR and appeals.
Operations management: ITS for releases and incidents; causal graphs for RCA.
9) Analysis procedure: from hypothesis to solution
1. Formulate the question as causal: "What is the effect of X on Y in horizon T?"
2. Draw a DAG: align it with domain experts, mark confounders/mediators/colliders.
3. Select design: RCT/A-B, DiD, RDD, IV, synthetic control, matching.
4. Define metrics: main (effect), guardrails (quality/ethics/operations), CATE segments.
5. Prepare data: point-in-time, covariates "before" impact, calendar and seasonality.
6. Estimate the effect: baseline models + robustness tests (placebo tests, sensitivity analysis).
7. Check robustness: alternative specifications, exclusion of suspect covariates, leave-one-out.
8. Put it into action: policy/rollout, SLOs, monitoring, and retesting on drift.
10) Robustness practices and verification
Pre-trend checks (for DiD): test/control trends are similar before intervention.
Placebo/permutation tests: with "fictitious dates" or "fictitious groups" the effect must disappear (see the sketch after this list).
Sensitivity analysis: how much a hidden confounder will distort the result.
Bounds/partial-identification intervals: partially identified models → report bounds rather than point estimates.
Multiple testing: BH/Holm adjustments for multiple segments.
External validity: portability of the effect to other markets/channels (meta-analysis).
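A minimal placebo/permutation sketch (simulated groups, hypothetical lift): treatment labels are repeatedly reshuffled into "fictitious groups" and the observed effect is compared with the placebo distribution; a real effect should sit far out in the tail.

```python
import numpy as np

rng = np.random.default_rng(4)
control = rng.normal(10.0, 2.0, size=1_000)
test = rng.normal(10.4, 2.0, size=1_000)      # hypothetical true lift of 0.4
observed = test.mean() - control.mean()

pooled = np.concatenate([control, test])
placebo = []
for _ in range(5_000):
    rng.shuffle(pooled)                       # fictitious group assignment
    placebo.append(pooled[:1_000].mean() - pooled[1_000:].mean())

# Two-sided permutation p-value: share of placebo effects at least as large
p_value = (np.abs(placebo) >= abs(observed)).mean()
print(f"observed effect = {observed:.3f}, permutation p-value ≈ {p_value:.4f}")
```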
11) Effect Reporting Metrics
Absolute effect: Δ in natural units (percentage points, currency units, minutes).
Relative effect: % relative to baseline.
NNT/NNH: how many objects must be treated to obtain one additional outcome/harm.
Cost-effectiveness: effect per unit cost; used to prioritize budgets (see the sketch after this list).
Uplift @ k/Qini/AUUC: for targeted interventions.
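A minimal sketch of the reporting arithmetic with hypothetical numbers: absolute and relative effect, NNT, and a simple cost-effectiveness ratio.

```python
# All rates and costs below are illustrative assumptions
base_rate = 0.040          # conversion without the intervention
treated_rate = 0.052       # conversion with the intervention
cost_per_contact = 0.50    # hypothetical cost of treating one object

abs_effect = treated_rate - base_rate        # 0.012 = 1.2 percentage points
rel_effect = abs_effect / base_rate          # +30% relative to baseline
nnt = 1 / abs_effect                         # ~83 contacts per extra outcome
cost_per_outcome = nnt * cost_per_contact    # cost of one incremental outcome

print(f"Δ = {abs_effect:.3f} ({rel_effect:.0%} vs baseline)")
print(f"NNT ≈ {nnt:.0f}, cost per incremental outcome ≈ {cost_per_outcome:.2f}")
```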
12) Causality in ML practice
Causal features: do not always improve predictive accuracy, but are better suited for policies.
Causal Forest / meta-learners (T/X/S-learner): CATE estimation and personalized uplift (see the sketch after this list).
Counterfactual fairness: fairness of models taking into account causal paths; blocking "unfair" paths.
Do-operator vs. predict: distinguish "predict Y" from "what happens if we do X." The latter requires causal models/simulators.
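A minimal T-learner sketch on simulated data (scikit-learn models, hypothetical effect structure): fit separate outcome models for the treated and control arms and use the difference of their predictions as a CATE/uplift score.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(6)
n = 20_000
X = rng.normal(size=(n, 3))
treated = rng.integers(0, 2, n).astype(bool)      # randomized assignment
cate_true = 1.0 + 2.0 * X[:, 0]                   # effect varies with feature 0
y = X.sum(axis=1) + cate_true * treated + rng.normal(size=n)

# T-learner: one outcome model per arm
m1 = GradientBoostingRegressor().fit(X[treated], y[treated])
m0 = GradientBoostingRegressor().fit(X[~treated], y[~treated])
cate_hat = m1.predict(X) - m0.predict(X)          # CATE / uplift score

print("corr(cate_hat, cate_true) ≈", round(np.corrcoef(cate_hat, cate_true)[0, 1], 2))
```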
13) Causal Checklist
- The question is framed as an intervention/policy effect
- DAG built and agreed with domain experts; minimal set of covariates (back-door) selected
- Design selected (RCT/quasi experiment) and key assumptions tested
- Point-in-time data; leakage excluded; calendar/seasonality taken into account
- Effect and confidence intervals calculated; robustness checks carried out
- Effect heterogeneity (CATE) and risks (guardrails) assessed
- Value quantified (ROI, NNT/NNH, cost of errors)
- Implementation and monitoring plan; retest criteria
14) Mini glossary
Back-door/Front-door: criteria for selecting covariates for effect identification.
IV (instrumental variable): "lever" changing treatment but not outcome directly.
DiD: difference in before/after changes between groups.
RDD: effect estimate near the rule threshold.
Synthetic Control: control as a weighted combination of donors.
HTE/CATE: heterogeneous/conditional effect by segment.
Uplift: the expected increase from the impact, not the probability of an event.
Summary
Correlations help you find hypotheses; causality helps you make decisions. Build a DAG, choose an appropriate design (experiment or quasi-experiment), test assumptions and robustness, measure heterogeneous effects, and translate the conclusions into policy with guardrails and monitoring. That way analytics stops being "about relationships" and becomes an engine of change.