GH GambleHub

KPI forecasting

KPI forecasting is not "guessing the graph," but a controlled loop: correct data → an adequate model → scenarios and interpretation → operational monitoring. Below is a system checklist and architecture that scales from simple series to portfolio, hierarchical and probabilistic forecasts.

1) Task statement

What we predict: level, delta, quantile, interval, or event (spike).
Horizon/step: hours/days/weeks/months; rolling windows for short-term control.
Unit: Product/Brand/Country/Platform/Channel.
Business context: controlled levers (promo, prices, releases) and restrictions (SLA, RG/compliance).
Value and risks: cost of over-/under-forecasting, penalty for false alerts.

2) Data and preparation

Grain and calendar: single calendar (holidays/weekends/payroll days), time locale (UTC + local views).
Aggregates and consistency: DAU/WAU/MAU, GGR/Net, ARPPU, retention (D7/D30), funnel conversions, latency p95 - store as separate data marts with explicit formulas.
Regressors (X): promotions/bonuses, campaigns, price changes, content releases, sports events, exchange rates, weather (if relevant).
Anomalies and missing values: label them, do not remove blindly; mark one-off events with flags.
Schema stability: record change points of product versions/dimensions as events.

3) KPI types and modeling features

Additive volumes (revenue, deposits): ETS/ARIMA/GBM/Temporal-NN perform well.
Shares and conversions: logit-link models, beta-binomial models, regression bounded to [0,1].
Coefficients and ratios (ARPPU): model the numerator and denominator separately, then compose the ratio.
Intermittent series (rare events, chargeback): Croston/SBA/TSB, zero-inflated approaches.
Hierarchies (country→brand→channel): reconciliation via Bottom-Up, Top-Down, MinT.
Composite KPIs (for example, GGR): decompose into drivers: traffic × conversion × frequency × average check.
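For the intermittent-series item above, Croston's method can be sketched in a few lines. This is a minimal illustration, not a production implementation: the function name `croston`, the default `alpha`, and the choice to initialize from the first non-zero observation are assumptions (other initializations are common). The `sba=True` option applies the Syntetos-Boylan correction factor (1 − alpha/2).

```python
def croston(series, alpha=0.1, sba=False):
    """One-step-ahead per-period rate forecast for an intermittent series.

    Smooths non-zero demand sizes and the intervals between them separately,
    updating only in periods with a non-zero observation.
    """
    size = None      # smoothed non-zero demand size
    interval = None  # smoothed number of periods between non-zero demands
    gap = 1          # periods elapsed since the last non-zero observation
    for y in series:
        if y > 0:
            if size is None:
                # Assumption: initialize from the first non-zero observation.
                size, interval = float(y), float(gap)
            else:
                size += alpha * (y - size)
                interval += alpha * (gap - interval)
            gap = 1
        else:
            gap += 1
    if size is None:
        return 0.0  # no demand observed at all
    rate = size / interval
    return rate * (1 - alpha / 2) if sba else rate


# Hypothetical chargeback counts: 3 events every third day.
history = [0, 0, 3, 0, 0, 3, 0, 0, 3]
forecast = croston(history)  # about 1 event per day on average
```

The forecast is a rate per period, so it suits capacity and reserve planning better than predicting the exact day of the next event.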

4) Models: from basic to advanced

Baselines: Naive, Seasonal Naive, Drift - needed for an honest assessment.
Classical time-series models: ETS/ARIMA/SARIMA; Prophet for quick seasonality and holidays.
Regressors: ARIMAX/ETS + X, dynamic regressions, TBATS for multiple seasonalities.
Gradient boosting/tabular NN: LightGBM/XGBoost/TabNet with lag features, window statistics, calendar and promo.
Temporal NN: N-Beats, TFT (Temporal Fusion Transformer) - for multi-series and rich X.
Probabilistic: quantile regression (pinball loss), Gaussian/Student-t, quantile forests/GBM.

Causality and scenarios: DiD/synthetic control (SC) to assess promo effects; uplift models to plan "what happens if we turn it on."
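The three baselines named at the top of this section are simple enough to write out in plain Python. The sketch below assumes a daily series with weekly seasonality (`m=7`); the function names are illustrative and not taken from any library.

```python
def naive(y, h):
    """Repeat the last observed value for h steps."""
    return [y[-1]] * h


def seasonal_naive(y, h, m=7):
    """Repeat the last full season of length m (requires len(y) >= m)."""
    return [y[len(y) - m + (i % m)] for i in range(h)]


def drift(y, h):
    """Extend the straight line through the first and last observations."""
    slope = (y[-1] - y[0]) / (len(y) - 1)
    return [y[-1] + slope * (i + 1) for i in range(h)]


history = list(range(1, 15))              # 1, 2, ..., 14
print(naive(history, 2))                  # [14, 14]
print(seasonal_naive(history, 3, m=7))    # [8, 9, 10]
print(drift(history, 2))                  # [15.0, 16.0]
```

Any candidate model is then compared against these on the same backtest folds.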

5) Decomposition and features

T + S + R: trend + seasonality (day of week/month/hour) + remainder.

Lags and windows: y_{t-1}, ..., y_{t-28}; moving averages/std; exponential smoothing; "holiday tails."

Categorical: country/channel/OS as embeddings/one-hot.
Events: releases/promotions/banners - as binary flags or intensities.

Leakage control: only information "from the past."
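A minimal sketch of lag and window features that respects the leakage rule: every feature for time t is computed only from values strictly before t. The function name `make_features` and the row layout are assumptions for illustration; in practice this is usually done with a dataframe library.

```python
from statistics import mean, pstdev


def make_features(y, lags=(1, 7), windows=(7,)):
    """Build one feature row per time step using only past values."""
    max_back = max(max(lags), max(windows))
    rows = []
    for t in range(max_back, len(y)):
        row = {"t": t, "target": y[t]}
        for k in lags:
            row[f"lag_{k}"] = y[t - k]
        for w in windows:
            past = y[t - w:t]  # the slice ends at t-1, so y[t] is excluded
            row[f"roll_mean_{w}"] = mean(past)
            row[f"roll_std_{w}"] = pstdev(past)
        rows.append(row)
    return rows


rows = make_features(list(range(10)), lags=(1, 2), windows=(3,))
# First usable row is t=3: lag_1=2, lag_2=1, roll_mean_3 = mean of [0, 1, 2]
```

A quick leakage check is to change y[t] and confirm that the features of row t stay the same.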

6) Scoring and backtesting

Splits: rolling/expanding origin; test blocks aligned to seasonality (multiples of a week/month).
Level metrics: MAE, RMSE, MAPE/sMAPE, WAPE (more reliable at zeros).
Probabilistic metrics: pinball loss (q = 0.1/0.5/0.9), CRPS, interval calibration (coverage, sharpness).
Event/spike metrics: precision/recall on the outlier detector.
Baseline rule: the model should beat Seasonal Naive.
Stability: error variances by segment/holiday; out-of-time (last N weeks).
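The splitting scheme, WAPE, and the baseline rule from this section fit together as follows. This is a sketch under stated assumptions: an expanding training window, test blocks of one `horizon` each, and illustrative function names (`rolling_origin_splits`, `backtest`).

```python
def wape(actual, forecast):
    """Weighted absolute percentage error: sum|a - f| / sum|a|."""
    denom = sum(abs(a) for a in actual)
    if denom == 0:
        return float("nan")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / denom


def rolling_origin_splits(n, horizon, folds):
    """Yield (train_end, test_end) pairs, oldest fold first; train is y[:train_end]."""
    for k in range(folds - 1, -1, -1):
        test_end = n - k * horizon
        train_end = test_end - horizon
        if train_end > 0:
            yield train_end, test_end


def seasonal_naive(train, h, m=7):
    return [train[len(train) - m + (i % m)] for i in range(h)]


def backtest(y, forecaster, horizon=7, folds=3):
    """Average WAPE of forecaster(train, horizon) over rolling-origin folds."""
    scores = []
    for train_end, test_end in rolling_origin_splits(len(y), horizon, folds):
        train, test = y[:train_end], y[train_end:test_end]
        scores.append(wape(test, forecaster(train, horizon)))
    return sum(scores) / len(scores)


# Synthetic, perfectly weekly series: Seasonal Naive is exact, Naive is not.
y = [10, 12, 11, 13, 20, 30, 25] * 6
baseline_score = backtest(y, seasonal_naive)
naive_score = backtest(y, lambda train, h: [train[-1]] * h)
```

A candidate model passes the baseline rule only if its score is below `baseline_score` on the same folds.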

7) Hierarchical forecasting and reconciliation

Bottom-Up: sum the bottom-level forecasts; simple but noisy.
Top-Down: distribute the top-level forecast by historical shares.
MinT (optimal reconciliation): minimizes the trace of the reconciled forecast-error covariance matrix - the best compromise when the bottom level is rich.
Practice: train base models at each level, then reconcile.
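The two simplest schemes above can be sketched directly; MinT requires an estimate of the error covariance and matrix algebra, so it is left out here. The function names and the two-node hierarchy are illustrative assumptions.

```python
def bottom_up(bottom_forecasts):
    """Sum bottom-level forecasts period by period to get the top level."""
    return [sum(values) for values in zip(*bottom_forecasts.values())]


def top_down(top_forecast, bottom_history):
    """Split a top-level forecast by each node's share of historical totals."""
    totals = {name: sum(values) for name, values in bottom_history.items()}
    grand_total = sum(totals.values())
    return {
        name: [f * totals[name] / grand_total for f in top_forecast]
        for name in totals
    }


# Hypothetical two-channel hierarchy with a 60/40 historical split.
history = {"web": [60, 60], "app": [40, 40]}
split = top_down([200, 100], history)   # web: [120.0, 60.0], app: [80.0, 40.0]
total = bottom_up(split)                # sums back to the top-level forecast
```

Both schemes produce coherent numbers by construction: the bottom level always adds up to the top.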

8) Probabilistic predictions and interpretation

Quantiles: q10/q50/q90 → planning "pessimist/base/optimist."

Intervals: target coverage (e.g. 80%/95%); check calibration.
Cost of risk: plan according to the conditional VaR/expected shortfall for KPIs with asymmetric losses (when under-forecasting costs more than over-forecasting, or vice versa).
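The quantities in this section can be computed with a few helper functions. The sketch below shows pinball loss for a quantile forecast, empirical interval coverage, and, as one simple way to handle asymmetric losses, the newsvendor critical ratio, which gives the quantile to plan at when under- and over-forecasting have linear per-unit costs. That linear-cost assumption and all function names are illustrative, not part of the original text.

```python
def pinball_loss(actual, forecast, q):
    """Average pinball (quantile) loss of a q-quantile forecast."""
    total = 0.0
    for a, f in zip(actual, forecast):
        diff = a - f
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(actual)


def coverage(actual, lower, upper):
    """Share of actuals that fall inside [lower, upper]."""
    hits = sum(lo <= a <= hi for a, lo, hi in zip(actual, lower, upper))
    return hits / len(actual)


def planning_quantile(cost_under, cost_over):
    """Quantile minimizing expected cost under linear asymmetric costs."""
    return cost_under / (cost_under + cost_over)


# Under-forecasting is 3x as costly as over-forecasting: plan at q = 0.75.
q_star = planning_quantile(cost_under=3.0, cost_over=1.0)

# An 80% interval is calibrated if its empirical coverage is close to 0.80.
cov = coverage([1, 5, 10], lower=[0, 0, 0], upper=[2, 4, 12])  # 2 of 3 inside
```

Pinball loss at q = 0.9 penalizes an actual above the forecast nine times more than one below it, which is what pushes a well-trained q90 forecast upward.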

9) Scenario modeling

Exogenous scenarios: "without promo/with promo," "exchange rate ±10%," "football final."

What-if: change X (campaign intensity, limits, prices) → KPI forecast and confidence intervals.
Plan vs. actual: factor bridge showing the contributions of seasonality, promo, prices, trend, and shock/incident.
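One simple way to build a plan-versus-actual bridge is chain substitution over multiplicative drivers: replace plan values with actual values one driver at a time and record the change in the KPI at each step. The sketch assumes the multiplicative decomposition traffic × conversion × frequency × average check; the function name and the figures are hypothetical. Note that the attribution depends on the order of substitution, so this is one convention among several.

```python
from math import prod


def factor_bridge(plan, actual, order):
    """Attribute (actual KPI - plan KPI) to drivers by sequential substitution."""
    current = dict(plan)
    base = prod(current[name] for name in order)
    contributions = {}
    for name in order:
        current[name] = actual[name]          # switch this driver to its actual value
        new_value = prod(current[k] for k in order)
        contributions[name] = new_value - base
        base = new_value
    return contributions


order = ["traffic", "conversion", "frequency", "avg_check"]
plan = {"traffic": 1000, "conversion": 0.1, "frequency": 2.0, "avg_check": 10.0}
actual = {"traffic": 1100, "conversion": 0.1, "frequency": 2.0, "avg_check": 12.0}

bridge = factor_bridge(plan, actual, order)
# Contributions sum to actual KPI minus plan KPI (about 2640 - 2000 = 640).
```

The contributions always add up to the total gap, which makes the bridge easy to present as a waterfall chart.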

10) Production loop and MLOps

Retraining frequency: short-term KPIs - daily/weekly; monthly KPIs - at T+1/T+3.
Layers/artifacts: feature store (online/offline parity), model registry, versions of KPI data and formulas.
Monitoring: WAPE/sMAPE over a sliding window, interval coverage, feature drift (PSI), data feed delay, generation SLA.
Alerts: error spike > threshold, miscalibrated intervals, seasonality breakdown.
Fail-safe: on degradation, roll back to Seasonal Naive/ETS; freeze models during holiday peaks.

Hysteresis: different on/off thresholds for "promo regressors" to avoid flapping.
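Hysteresis can be sketched as a two-threshold switch: a flag turns on only when the signal reaches the upper threshold and turns off only when it drops to the lower one. The function name, thresholds, and the promo-intensity signal below are illustrative assumptions.

```python
def hysteresis_flags(signal, on_threshold, off_threshold):
    """Return an on/off flag per step using separate on and off thresholds."""
    if off_threshold > on_threshold:
        raise ValueError("off_threshold must not exceed on_threshold")
    active = False
    flags = []
    for x in signal:
        if not active and x >= on_threshold:
            active = True
        elif active and x <= off_threshold:
            active = False
        flags.append(active)
    return flags


# Hypothetical normalized promo intensity hovering around 0.5.
intensity = [0.2, 0.6, 0.45, 0.55, 0.35, 0.25, 0.5]
flags = hysteresis_flags(intensity, on_threshold=0.5, off_threshold=0.3)
# [False, True, True, True, True, False, True]
```

With a single threshold of 0.5 the same signal would switch state on almost every step; the gap between the two thresholds absorbs that noise.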

11) Specifics of product and iGaming KPIs (approximate map)

Traffic/activity: DAU/WAU/MAU, including match days/game releases.
Monetization: GGR/Net, deposits, ARPU/ARPPU - strong evening/weekend/holiday seasonality.
Retention: D1/D7/D30 - it is better to predict as a probability (logit) with a calendar.
Risks: chargeback rate (intermittent), RG indicators (policies/holidays), anti-fraud signals.
Operations: latency p95/p99, transaction errors - pair with anomaly detection and causal analysis of release effects.

12) Artifact templates

A. KPI Forecast Passport

KPI/Code: 'GGR_EUR' (formula version)

Horizon/step: 8 weeks, daily

Hierarchy: brand→country→platform

Regressors: 'promo_spend', 'fixtures_flag', 'holiday', 'fx_rate'

Model: 'TFT_v4' (q10/q50/q90) + MinT reconciliation

Metrics: WAPE (absolute target ≤ 8%), empirical coverage of the 90% interval ≥ 85%

SLO: generation ≤ 10 min after 06:00; data lag ≤ 1 hour

Owners: Monetization Analytics; revision date: 2025-10-15

B. Decision-ready report (skeleton)

Headline: "GGR: Forecast 8 weeks, q10/q50/q90"

Key: risk of under-forecast in week 3 is 22% (ES = −€X)

Drivers: + weekend seasonality, + promo effect, − FX

Recommendations: shift budget to low-risk weeks, raise limits on A/B channels

C. Pseudo-code of the pipeline (sketch)

python
from datetime import timedelta
# 1) load data (all helper functions and classes here are placeholders, not a real library)
y, X, calendar = load_series_and_regressors()
# 2) build features
ds = make_lags_and_windows(y, X, lags=[1, 7, 14, 28], roll=[7, 14, 28])
ds = add_calendar_features(ds, calendar)  # holidays, dow, month_end
# 3) split
cv = rolling_backtest(ds, folds=6, horizon=28)
# 4) models
m1 = ETSx().fit(ds.train)  # baseline
m2 = LightGBMQuantiles(q=[0.1, 0.5, 0.9]).fit(ds.train)
# 5) evaluate
scores = evaluate([m1, m2], cv, metrics=['WAPE', 'pinball'])
# 6) retrain on full data + reconcile (hierarchy is assumed to be defined upstream)
forecasts = reconcile_minT(train_and_forecast([m2], hierarchy))
# 7) report + push
publish(forecasts, scores, sla=timedelta(minutes=10))

13) Frequent errors and anti-patterns

MAPE at zeros: use WAPE/sMAPE.
Averaging averages: aggregate numerators and denominators separately.
Ignoring holidays/releases: add regressors and "after-effect" dates.
Leakage: features containing future information (target leakage).
Overly "smart" models without a baseline: first beat Seasonal Naive.
Uncalibrated intervals: "beautiful but empty" - check coverage.
Hierarchy inconsistency: without reconciliation, the overall plan does not add up.
Lack of fail-safe: at the peak of the holidays, the model "hangs," plans collapse.
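The "averaging averages" pitfall from the list above is easy to see on a ratio KPI such as ARPPU. The sketch uses made-up segment figures; the function names are illustrative.

```python
def arppu_by_segment(revenue, payers):
    """ARPPU per segment: revenue divided by paying users."""
    return {seg: revenue[seg] / payers[seg] for seg in revenue}


def arppu_total(revenue, payers):
    """Correct overall ARPPU: total revenue divided by total paying users."""
    return sum(revenue.values()) / sum(payers.values())


# Hypothetical segments: one large, one tiny with a single high spender.
revenue = {"A": 1000.0, "B": 100.0}
payers = {"A": 100, "B": 1}

per_segment = arppu_by_segment(revenue, payers)              # A: 10.0, B: 100.0
mean_of_means = sum(per_segment.values()) / len(per_segment) # 55.0, misleading
correct = arppu_total(revenue, payers)                       # 1100 / 101, about 10.89
```

The same rule applies to forecasts: forecast revenue and payers separately, aggregate each, and only then divide.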

14) Monitoring in production

Quality: WAPE rolling, pinball by quantile, coverage 80/95%.
Stability: PSI by key attributes, seasonality drift.
Operations: generation time, data lag, % of fallbacks.
Alerts: 3σ rule on error, SLO violation, hierarchy breakdown.
Runbook: freeze mode, disabling "noisy" regressors, forced override.
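The PSI check in the stability line can be computed from two samples of a feature and a set of bin edges. This is a minimal sketch: the function name, the fixed `edges` argument, and the `eps` floor for empty bins are assumptions. Thresholds such as 0.1 (watch) and 0.25 (act) are a common rule of thumb rather than a standard.

```python
from math import log


def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index between two samples over shared bins.

    edges must be sorted; bins are (-inf, e0], (e0, e1], ..., (e_last, +inf).
    """
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(v > e for e in edges)] += 1  # index of the bin holding v
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e_shares, a_shares = shares(expected), shares(actual)
    return sum((a - e) * log(a / e) for e, a in zip(e_shares, a_shares))


# Hypothetical feature samples: training window vs. the latest week.
train_sample = [1, 2, 3, 4]
same = psi(train_sample, [1, 2, 3, 4], edges=[2.5])      # 0.0, no drift
shifted = psi(train_sample, [3, 4, 5, 6], edges=[2.5])   # large, strong drift
```

In monitoring, the bin edges are usually fixed from the training data (for example, its deciles) so that values are comparable from day to day.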

15) Pre-release checklist

  • KPI defined and versioned (semantic layer)
  • Calendar/holidays/regressors aligned and tested
  • Baselines (Naive/Seasonal Naive) beaten in backtesting
  • Metrics (WAPE/pinball) and target thresholds selected
  • Intervals calibrated; pessimist/base/optimist scenarios prepared
  • Hierarchies reconciled (MinT/Top-Down)
  • MLOps: training schedule, monitoring, alerts, fail-safe
  • Documentation: forecast passport, SQL/feature recipes, incident runbook

Summary

KPI forecasting is a decision architecture: clear definitions, a rich calendar and regressors, honest baselines, probabilistic forecasts, hierarchical reconciliation, stable MLOps and scenario planning. Such a loop yields realistic expectations, manageable risks and "decision-ready" reports that directly feed planning, marketing, operations and compliance.
