Revenue forecasting
Revenue results from the interaction of many factors: content/product offers, user behavior, prices and promotions, and external conditions (holidays, sports events, exchange rates, regulatory changes). A reliable forecast is not a single "model" but a managed loop: definitions → data → model → scenarios → operation → verification → improvement.
1) Task statement
What we forecast: gross revenue (GGR), net revenue (Net), revenue after bonuses/commissions, in the base currency and in local currencies.
Horizon/step: daily/weekly/monthly; daily for cash-gap planning, monthly/quarterly for budgeting.
Forecast unit: brand × country × platform × channel (minimum), followed by hierarchy reconciliation.
Purpose: budgeting, traffic/content procurement, infrastructure limits, financial covenants.
Cost of error: under-forecasting (lost demand, under-provisioning) vs. over-forecasting (excess purchases, over-commitment).
2) Definitions and alignment with finance
Formulas: GGR, Net, deductions (taxes, bonuses, affiliate commissions), versioned in the semantic layer.
Calendar: UTC storage + local views; holidays/paydays; sports schedules (if relevant).
FX policy: exchange rate source, conversion rule (transaction-date rate or period-average rate), single base currency.
Reconciliations: mandatory reconciliation procedure with accounting (discrepancy within acceptable limits).
3) Decomposition of revenue into drivers
The basic formula is:
\[
\text{Revenue} = \text{Traffic} \times \text{Conversion} \times \text{Frequency} \times \text{Average check}
\]
Traffic/active users: users/sessions/logins.
Conversion: share of paying users, conversion rate (CR) to target events.
Frequency: number of transactions per payer per period.
Average check: average transaction amount (account for bonuses/discounts).
It is recommended to forecast the drivers separately and then assemble the composite, so that the contribution of each factor is visible (plan vs. actual bridge).
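One way to turn the multiplicative formula into an additive plan vs. actual bridge is a log-mean split (an LMDI-style allocation). This is a minimal sketch, not a method prescribed by the text: the allocation rule, the function names, and the plan/actual numbers are all illustrative assumptions, and every driver is assumed to be strictly positive.

```python
import math

DRIVERS = ("traffic", "conversion", "frequency", "avg_check")

def revenue(d):
    """Revenue as the product of the four drivers."""
    return math.prod(d[k] for k in DRIVERS)

def log_mean(a, b):
    """Logarithmic mean of two positive numbers."""
    return a if a == b else (a - b) / (math.log(a) - math.log(b))

def driver_bridge(plan, actual):
    """Split (actual - plan) revenue into additive per-driver contributions.

    Each driver receives log_mean(R_actual, R_plan) * ln(actual / plan),
    so the contributions sum to the total revenue gap.
    """
    weight = log_mean(revenue(actual), revenue(plan))
    return {k: weight * math.log(actual[k] / plan[k]) for k in DRIVERS}

# Illustrative numbers only (not taken from any real data set).
plan = {"traffic": 100_000, "conversion": 0.05, "frequency": 2.0, "avg_check": 20.0}
actual = {"traffic": 110_000, "conversion": 0.05, "frequency": 2.0, "avg_check": 18.0}

bridge = driver_bridge(plan, actual)
gap = revenue(actual) - revenue(plan)
for name, value in bridge.items():
    print(f"{name:>10}: {value:+.0f}")
print(f"{'total gap':>10}: {gap:+.0f}")
```

Because the split is exact, the per-driver lines can be plotted directly as a waterfall chart without a residual "other" bar.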
4) Data and regressors
Time series: daily/weekly aggregates by forecast unit.
X regressors:
- promo/bonuses (intensity, type, coverage);
- marketing spend/impressions/clicks;
- content events (releases, tournaments, major matches);
- price/limit/catalog changes;
- FX/inflation, weather/calendar (if they have an effect);
- regulatory events (restrictions/relaxations).
Anomalies/one-offs: flag them explicitly; do not "smooth" them silently.
No leakage: use only the information available at the time of the forecast.
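The no-leakage rule can be made concrete with lag and rolling features that are computed strictly from history. The sketch below is an assumption-laden illustration: `make_features`, the lag set, and the window length are hypothetical choices, not part of the original pipeline.

```python
def make_features(y, lags=(1, 7), window=7):
    """Build one feature row per day t using only values strictly before t."""
    rows = []
    start = max(max(lags), window)  # first day with enough history
    for t in range(start, len(y)):
        history = y[:t]  # excludes y[t], so the target day cannot leak in
        row = {f"lag_{k}": history[-k] for k in lags}
        row[f"roll_mean_{window}"] = sum(history[-window:]) / window
        row["target"] = y[t]
        rows.append(row)
    return rows

# Toy daily series (illustrative values only).
series = list(range(1, 15))
features = make_features(series)
print(features[0])
```

A quick sanity check for leakage is to change the value of the target day and confirm that none of that day's features move.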
5) Modeling
5.1 Baselines
Naive/Seasonal Naive/Drift: required for an honest evaluation.
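The three baselines are simple enough to write by hand. The following is a minimal sketch in plain Python; the function names and the toy series are illustrative assumptions.

```python
def naive(y, h):
    """Repeat the last observed value for h steps."""
    return [y[-1]] * h

def seasonal_naive(y, h, m):
    """Repeat the last full season of length m for h steps."""
    return [y[-m + (i % m)] for i in range(h)]

def drift(y, h):
    """Extend the straight line through the first and last observations."""
    slope = (y[-1] - y[0]) / (len(y) - 1)
    return [y[-1] + slope * (i + 1) for i in range(h)]

# Toy series with a season of length 3 (illustrative values only).
history = [10, 20, 30, 12, 22, 32]
print(naive(history, 2))
print(seasonal_naive(history, 4, m=3))
print(drift(history, 2))
```

For daily revenue, `m=7` is the natural first choice, since weekly seasonality usually dominates.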
5.2 Classical time series
ETS/ARIMA/SARIMA, TBATS (multiple seasonalities), Prophet (fast start with holidays).
5.3 Regressors
ARIMAX/ETS + X, dynamic regressions with calendar and promo/FX.
5.4 Multi-series/Tabular
LightGBM/XGBoost/linear models with lags/windows/calendar features;
temporal neural networks (TFT, N-BEATS) for portfolios of series and long regressor histories.
5.5 Probabilistic
Quantile regression (pinball loss), Student-t/Gaussian predictive distributions, quantile ensembles for intervals (q10/q50/q90).
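The pinball loss behind quantile regression, and the empirical coverage of a q10/q90 interval, can be sketched in a few lines. The function names and the sample numbers below are illustrative assumptions.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for quantile level q in (0, 1)."""
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-forecasts (diff >= 0) cost q per unit, over-forecasts cost 1 - q.
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)

def coverage(y_true, lower, upper):
    """Share of actuals that fall inside [lower, upper]."""
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)

# Illustrative values only.
actuals = [100, 120, 90, 150]
q10 = [80, 100, 95, 110]
q90 = [130, 140, 120, 145]
print(pinball_loss(actuals, q90, q=0.9))
print(coverage(actuals, q10, q90))
```

A q10/q90 pair forms a nominal 80% interval, so empirical coverage well below 0.8 over a backtest suggests the intervals are too narrow.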
5.6 Hierarchies and reconciliation
Bottom-Up/Top-Down/MinT for the country → brand → channel → platform structure.
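To show what reconciliation does mechanically, here is a small sketch on a two-level hierarchy (total = A + B) using NumPy. It covers bottom-up and OLS reconciliation; the latter is the special case of MinT in which the forecast-error covariance is taken to be the identity matrix. Full MinT needs an estimate of that covariance, which is omitted here. The hierarchy, names, and numbers are illustrative assumptions.

```python
import numpy as np

# Summing matrix: rows are (total, A, B), columns are bottom-level series (A, B).
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

def bottom_up(y_hat):
    """Keep bottom-level forecasts and rebuild upper levels by summing."""
    n_bottom = S.shape[1]
    return S @ y_hat[-n_bottom:]

def ols_reconcile(y_hat):
    """Project base forecasts onto coherent forecasts (MinT with W = I)."""
    bottom = np.linalg.solve(S.T @ S, S.T @ y_hat)
    return S @ bottom

# Base forecasts made independently per level: incoherent, since 40 + 50 != 100.
y_hat = np.array([100.0, 40.0, 50.0])

print("bottom-up:", bottom_up(y_hat))
print("OLS      :", ols_reconcile(y_hat))
```

Bottom-up trusts only the lowest level, while the OLS projection spreads the 10-unit discrepancy across all three series.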
6) Specifics of revenue metrics
Shares/ratios (margin, commission rate): model the numerator and denominator separately, then combine.
Intermittent components (chargebacks, high rollers): Croston/TSB, zero-inflated models, separate components with quantiles.
Cannibalization: model cross-segment flows (multi-output models or constrained regressions) when launching a new activity/product.
Price/bonus elasticity: log-log models or causal estimates (difference-in-differences, synthetic control) to estimate coefficients, then what-if analysis.
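As an illustration of the log-log approach to elasticity, the sketch below fits the slope of ln(quantity) on ln(price) by ordinary least squares and then runs a what-if under constant elasticity. The data are synthetic, generated from a known elasticity of -1.5 purely for demonstration, and the function names are assumptions. Real data would need controls for promo, seasonality, and confounding, which is where DiD or synthetic control come in.

```python
import math

def loglog_elasticity(prices, quantities):
    """OLS slope of ln(quantity) on ln(price): the price elasticity of demand."""
    xs = [math.log(p) for p in prices]
    ys = [math.log(q) for q in quantities]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def revenue_multiplier(price_change, elasticity):
    """Revenue multiplier for a relative price change, assuming constant elasticity.

    Revenue = price * quantity and quantity is proportional to price ** elasticity,
    so revenue scales as (1 + price_change) ** (1 + elasticity).
    """
    return (1 + price_change) ** (1 + elasticity)

# Synthetic data generated with a true elasticity of -1.5 (not real observations).
prices = [8.0, 9.0, 10.0, 11.0, 12.0]
quantities = [1000 * p ** -1.5 for p in prices]

elasticity = loglog_elasticity(prices, quantities)
print(f"estimated elasticity: {elasticity:.3f}")
print(f"revenue multiplier for +10% price: {revenue_multiplier(0.10, elasticity):.3f}")
```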
7) Quality assessment and backtesting
Splits: rolling/expanding origin, with fold lengths that are multiples of the seasonal period (weeks/months).
Point metrics: WAPE/sMAPE (robust to zeros), MAE/RMSE.
Probabilistic: pinball loss, coverage of the 80%/95% intervals.
Stability: errors by segment/holiday/channel; out-of-time checks.
Baseline rule: the model must beat Seasonal Naive on the key horizons.
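The backtesting scheme and the WAPE metric can be sketched as follows. This is a minimal, hedged illustration: `rolling_origin_splits`, its parameters, and the fold sizes are assumptions chosen so that every fold spans whole weeks.

```python
def wape(y_true, y_pred):
    """Weighted absolute percentage error: sum of |errors| over sum of |actuals|."""
    abs_err = sum(abs(a - f) for a, f in zip(y_true, y_pred))
    return abs_err / sum(abs(a) for a in y_true)

def rolling_origin_splits(n, horizon, step, min_train):
    """Yield (train_index, test_index) ranges for an expanding-window backtest."""
    origin = min_train
    while origin + horizon <= n:
        yield range(0, origin), range(origin, origin + horizon)
        origin += step

# 70 days of history, 28-day horizon, origin moved forward one week at a time.
splits = list(rolling_origin_splits(n=70, horizon=28, step=7, min_train=28))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}, test {test_idx[0]}..{test_idx[-1]}")

# WAPE stays defined when individual actuals are zero, unlike MAPE.
print(wape([100, 0, 50], [90, 10, 50]))
```

Applying the baseline rule then amounts to computing `wape` per fold for both the model and Seasonal Naive and requiring the model to win on the horizons that matter.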
8) Scenarios and uncertainty
Quantiles: q10/q50/q90 → "pessimistic/base/optimistic."
X scenarios: "without promo/with promo," "FX ± 10%," "major event," "regulatory restrictions."
Parameter risk: stress tests for shifts in elasticity and seasonality.
Cost of risk: plan against the expected shortfall; the penalties for under-forecasting and over-forecasting are asymmetric.
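When the costs of the two error directions are asymmetric and roughly linear per unit, the cost-minimizing plan is a quantile of the forecast distribution rather than its median: the classic critical-ratio result. The sketch below assumes linear per-unit costs and uses hypothetical names and scenario values; real penalty functions may be non-linear.

```python
def optimal_quantile(cost_under, cost_over):
    """Quantile level that minimizes expected linear asymmetric cost.

    cost_under: cost per unit when the actual exceeds the plan (plan too low).
    cost_over:  cost per unit when the plan exceeds the actual (plan too high).
    """
    return cost_under / (cost_under + cost_over)

def expected_shortfall(samples, plan):
    """Average gap to plan over the scenarios in which revenue misses the plan."""
    misses = [plan - s for s in samples if s < plan]
    return sum(misses) / len(misses) if misses else 0.0

# Illustrative scenario values only (for example, draws from a quantile ensemble).
scenarios = [80, 90, 100, 110, 120]

print(optimal_quantile(cost_under=3.0, cost_over=1.0))
print(expected_shortfall(scenarios, plan=100))
```

If under-forecasting is three times as costly as over-forecasting, the plan should sit near q75 rather than q50.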
9) Plan vs. actual and factor contributions (revenue bridge)
Show the bridge: trend + seasonality + promo + price/limits + FX + shocks/incidents → final deviation. This increases trust and supports action (add budget, reschedule promo, change pricing).
10) MLOps and operation
Schedule: daily forecasts at T+1 by 06:00 local time; weekly forecasts N times a week; monthly forecasts at T+1/T+3.
Artifacts: feature store (online/offline parity), model registry, versioned revenue formulas.
Monitoring: WAPE/coverage over rolling windows, feature drift (PSI), data feed latency, generation SLA.
Alerts: error above threshold, miscalibrated intervals, broken hierarchy coherence.
Fail-safe: rollback to ETS/Seasonal Naive; freeze mode during peak holidays.
Hysteresis: use different thresholds for switching promo regressors on and off so that they do not "flicker."
Reconciliations: daily/weekly reconciliations with financial statements.
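The hysteresis rule for promo regressors can be sketched as a small state machine: the flag switches on at a higher threshold and switches off only below a lower one. The function name, the signal, and the threshold values are illustrative assumptions.

```python
def hysteresis_flags(values, on_threshold, off_threshold):
    """Return on/off flags that switch on at on_threshold and off at off_threshold."""
    if off_threshold >= on_threshold:
        raise ValueError("off_threshold must be below on_threshold")
    active = False
    flags = []
    for v in values:
        if not active and v >= on_threshold:
            active = True
        elif active and v <= off_threshold:
            active = False
        flags.append(active)
    return flags

# Illustrative promo-intensity signal hovering around 0.5.
signal = [0.10, 0.60, 0.45, 0.55, 0.30, 0.45]
print(hysteresis_flags(signal, on_threshold=0.5, off_threshold=0.4))
```

With a single threshold at 0.5, the same signal would switch off at 0.45 and back on at 0.55; the gap between the two thresholds absorbs that noise.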
11) Artifact templates
A. Revenue forecast passport
KPI: `NET_REVENUE_EUR_v3`
Horizon/step: 8 weeks/day
Units: brand × country × platform × channel; reconciliation: MinT
Regressors: `promo_spend`, `content_event_flag`, `price_index`, `fx_rate`, `holiday`
Models: `ARIMAX_v2` + `LightGBM_Quantiles_v4` (ensemble, q10/50/90)
Targets: WAPE ≤ 8% (daily), coverage of the 90% interval ≥ 85%
SLO: generation ≤ 10 min after 06:00; data lag ≤ 1 hour
Owners: Finance & Growth Analytics; revision date, version
B. Decision-ready report (skeleton)
Headline: "Revenue, Forecast 8 Weeks: q10/q50/q90"
Risks: shortfall in week 3: 21% (expected shortfall €X to €Y)
Contributing factors: + holidays, + content event, − FX, − promo withdrawal
Recommendations: increase promo in countries A/B, reschedule the campaign, hedge FX
C. Pipeline pseudocode
```python
# Pseudocode: every helper and model class below is a placeholder for
# project-specific code, not a call into a real library.

# 1) load
y = load_revenue_series(grain=['brand', 'country', 'platform', 'channel'], step='D')
X = load_regressors(['promo_spend', 'content_event', 'price_idx', 'fx_rate', 'holiday'])

# 2) features
ds = make_lags(y, lags=[1, 7, 14, 28])
ds = add_rolling_stats(ds, windows=[7, 14, 28])
ds = join_regressors(ds, X)

# 3) cv
cv = rolling_backtest(ds, folds=6, horizon=28, step=7)

# 4) models
m_baseline = ETS().fit(ds.train)
m_gbm = LGBMQuantiles(q=[0.1, 0.5, 0.9]).fit(ds.train)
m_arimax = ARIMAX().fit(ds.train)

# 5) evaluate & ensemble
scores = evaluate([m_baseline, m_gbm, m_arimax], cv, metrics=['WAPE', 'pinball'])
best = ensemble_quantiles([m_gbm, m_arimax])

# 6) reconcile & publish
f = reconcile_minT(forecast(best), hierarchy=['country', 'brand', 'platform', 'channel'])
publish(f, sla='06:10', owners=['Finance', 'Growth'])
```
12) Common mistakes and anti-patterns
MAPE with zeros/low values: use WAPE/sMAPE instead.
Averaging ratios: aggregate the numerator and denominator rather than averaging percentages across segments.
Ignoring calendar/content/FX: without regressors, the forecast goes "blind."
Leakage: features from the future or after-the-fact adjustments in the training data.
Hierarchy inconsistency: totals do not add up → apply reconciliation.
No fail-safe: the model "drifts" during holidays.
No reconciliation: the forecast does not match management/accounting figures.
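The ratio-averaging anti-pattern listed above is easy to demonstrate. In this sketch the two segments and their figures are made up for illustration; the point is only that the unweighted mean of segment percentages differs from the ratio of the aggregated numerator and denominator.

```python
# Illustrative segments: a small one with a 10% ratio and a large one with 50%.
segments = [
    {"name": "A", "net": 10.0, "gross": 100.0},
    {"name": "B", "net": 450.0, "gross": 900.0},
]

def mean_of_ratios(rows):
    """Anti-pattern: average the per-segment percentages, ignoring segment size."""
    return sum(r["net"] / r["gross"] for r in rows) / len(rows)

def ratio_of_sums(rows):
    """Correct aggregate: sum numerators and denominators, then divide."""
    return sum(r["net"] for r in rows) / sum(r["gross"] for r in rows)

print(f"mean of ratios: {mean_of_ratios(segments):.2f}")
print(f"ratio of sums : {ratio_of_sums(segments):.2f}")
```

The small segment pulls the naive average far below the true portfolio ratio, which is why ratios should be forecast as numerator and denominator and composed afterwards.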
13) Pre-release checklist
- Definitions of income and deductions are consistent and versioned
- Calendar/FX/Regressors connected and tested
- Baselines beaten in backtesting; WAPE/coverage targets met
- Intervals are calibrated; pessimistic/base/optimistic scenarios prepared
- Hierarchical forecast reconciled (MinT/Top-Down)
- MLOps: schedule, monitoring, alerts, fail-safe, runbook
- Daily/weekly reconciliations with financial control/accounting are set up
- Decision-ready report with factor bridge and recommendations
Summary
Revenue forecasting is agreed definitions + driver decomposition + regressors + probabilistic and hierarchical models + scenarios and intervals + disciplined MLOps and reconciliations. Such a setup turns "reading tea leaves from a chart" into a tool for budget, marketing, and operations planning with an explicit cost of risk and transparent actions.