GambleHub

Stream vs Batch analytics

1) Summary

Stream: continuous processing of events within seconds of their occurrence: antifraud/AML, RG triggers, SLA alerts, operational dashboards.
Batch: periodic recomputation with full reproducibility: regulatory reporting (GGR/NGR), financial reconciliations, ML datasets.

Typical targets: Stream p95 end-to-end latency of 0.5-5 s; Batch ready on D+1 by 06:00 (local time).

2) Selection matrix (TL;DR)

| Criterion | Stream | Batch |
|---|---|---|
| Reaction SLA | seconds/minutes | hours/days |
| Completeness | high, but late corrections are possible | very high, controlled at D+1 |
| "As-of" reproducibility | harder (replay) | simpler (time-travel/snapshots) |
| Unit cost | more expensive on the online path | cheaper per unit of volume |
| Typical tasks | AML/RG alerts, SRE, real-time marts | reports, reconciliations, offline ML |
| Historization (SCD) | limited | full |
| Regulator/WORM | via Gold recomputation | native (Gold/D+1) |

80/20 rule: anything that does not need a reaction in under 5 minutes goes to Batch; everything else goes to Stream, backed by nightly Batch validation.
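The rule can be written down as a one-line routing decision. The sketch below is illustrative only: the function name and the returned labels are assumptions, and the 5-minute cut-off is the one stated in the rule.

```python
from datetime import timedelta

# Assumption: the cut-off is the 5 minutes named in the 80/20 rule.
STREAM_CUTOFF = timedelta(minutes=5)


def choose_path(required_reaction: timedelta) -> str:
    """Route a workload by how quickly someone has to react to its result."""
    if required_reaction < STREAM_CUTOFF:
        # Stream handles it, and the nightly Batch run validates the numbers.
        return "stream+nightly-batch-validation"
    return "batch"
```

For instance, an AML alert that must fire within 30 seconds lands on the Stream path, while a report that is only read the next morning lands on Batch.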

3) Architecture

3.1 Lambda

Stream for the online path plus Batch for consolidation. Pro: flexibility. Con: the logic exists in two copies.

3.2 Kappa

Everything is treated as a stream; Batch is a replay through the log. Pro: a single codebase. Con: replay is complex and costly.

3.3 Lakehouse-Hybrid (recommended)

Stream → operational OLAP mart (minutes) and Bronze/Silver; Batch rebuilds Gold (D+1) and publishes the reports.

4) Data and time

Stream

Windows: tumbling/hopping/session.
Watermarks: 2-5 minutes; late data is flagged and still emitted rather than dropped.
Stateful processing: CEP, dedup, TTL.
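To make the interaction between windows and watermarks concrete, here is a minimal in-memory sketch. It is not how Flink or Beam implement it; the class name, the 10-minute window, and the 3-minute allowed lateness (picked from the 2-5 minute range above) are all assumptions.

```python
from dataclasses import dataclass, field

WINDOW_SEC = 600            # assumed 10-minute tumbling window
ALLOWED_LATENESS_SEC = 180  # assumed 3 minutes, inside the 2-5 minute range


@dataclass
class TumblingCounter:
    """Counts events per (key, window) and flags events that arrive behind the watermark."""
    max_event_time: int = 0
    counts: dict = field(default_factory=dict)

    @property
    def watermark(self) -> int:
        # The watermark trails the largest event time seen so far.
        return self.max_event_time - ALLOWED_LATENESS_SEC

    def process(self, key: str, event_time: int) -> dict:
        window_start = event_time - event_time % WINDOW_SEC
        window_end = window_start + WINDOW_SEC
        # Late: the watermark had already passed the window end when the event arrived.
        is_late = window_end <= self.watermark
        self.max_event_time = max(self.max_event_time, event_time)
        slot = (key, window_start)
        self.counts[slot] = self.counts.get(slot, 0) + 1
        # The late event is still counted and emitted, only marked as a correction.
        return {"key": key, "window_start": window_start,
                "count": self.counts[slot], "late": is_late}
```

Downstream consumers can then decide whether a record with `late=True` updates a dashboard in place or is only picked up by the nightly validation.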

Batch

Increments/CDC: `updated_at`, log-based replication.
SCD I/II/III: attribute history.
Snapshots: daily/monthly layers for "as-of" queries.
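As an illustration of how SCD II history supports "as-of" queries, the sketch below keeps versions as plain dictionaries. The field names (`valid_from`, `valid_to`), the open-ended date marker, and both function names are assumptions, not part of the original design.

```python
from datetime import date
from typing import Optional

OPEN_END = date(9999, 12, 31)  # assumed marker for "still current"


def apply_scd2(history: list, key: str, attrs: dict, as_of: date) -> list:
    """Close the current version of `key` if its attributes changed, then append the new one."""
    current = next((r for r in history
                    if r["key"] == key and r["valid_to"] == OPEN_END), None)
    if current is not None:
        if current["attrs"] == attrs:
            return history           # nothing changed, history stays as is
        current["valid_to"] = as_of  # close the old version
    history.append({"key": key, "attrs": dict(attrs),
                    "valid_from": as_of, "valid_to": OPEN_END})
    return history


def as_of_lookup(history: list, key: str, day: date) -> Optional[dict]:
    """Return the attributes valid on `day`, using half-open [valid_from, valid_to) intervals."""
    for r in history:
        if r["key"] == key and r["valid_from"] <= day < r["valid_to"]:
            return r["attrs"]
    return None
```

A report rebuilt for a past date then reads attributes through `as_of_lookup` instead of the current value, which is what keeps the rebuilt numbers identical to the originally published ones.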

5) Application patterns in iGaming

AML/Antifraud: Stream (velocity/structuring) + Batch reconciliations and case files.
Responsible Gaming: Stream monitors limits and self-exclusions; Batch maintains the reporting registers.
Operations/SRE: Stream raises SLA alerts; Batch handles post-incident and trend analysis.
Product/marketing: Stream for personalization/missions; Batch for cohorts/LTV.
Finance/reporting: Batch (Gold D+1, WORM packages); Stream for operational dashboards.

6) DQ, reproducibility, replay

Stream DQ: schema validation, dedup on `(event_id, source)`, window completeness, late-ratio, dup-rate; critical failures → DLQ.
Batch DQ: uniqueness/FK/range/temporal checks, reconciliation against OLTP/providers; critical failures → fail the job + report.
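The Stream-side checks can be sketched for a single window as follows. This is a simplified illustration: the event field names and the definition of "late" (event time strictly behind the watermark) are assumptions.

```python
def stream_dq(events: list, watermark: int) -> dict:
    """Deduplicate on (event_id, source) and report dup-rate and late-ratio for one window."""
    seen = set()
    unique, dups, late = [], 0, 0
    for e in events:
        dedup_key = (e["event_id"], e["source"])
        if dedup_key in seen:
            dups += 1
            continue
        seen.add(dedup_key)
        unique.append(e)
        if e["event_time"] < watermark:
            late += 1
    total = len(events)
    return {
        "unique": unique,
        "dup_rate": dups / total if total else 0.0,          # share of all received events
        "late_ratio": late / len(unique) if unique else 0.0,  # share of deduplicated events
    }
```

The two ratios are the values that would be compared against the alert thresholds; records failing a critical check would be routed to the DLQ instead of being returned in `unique`.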

Reproducibility:
  • Stream: replay of topics over an offset or time range + deterministic transformations.
  • Batch: time-travel/logic versioning (`logic_version`) + Gold snapshots.
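Replay is only reproducible if the transformation is a pure function of the event. The sketch below checks that property by hashing the output of a replayed range; the in-memory list standing in for a topic, the field names, and the `LOGIC_VERSION` value are assumptions.

```python
import hashlib
import json

LOGIC_VERSION = "v1"  # hypothetical tag, mirroring `logic_version` above


def transform(event: dict) -> dict:
    """Pure function of the event: no wall-clock time, randomness, or external lookups."""
    return {
        "user_id": event["user_id"],
        "amount_cents": round(event["amount"] * 100),
        "logic_version": LOGIC_VERSION,
    }


def replay(log: list, start: int, end: int) -> str:
    """Re-run `transform` over offsets [start, end) and return a digest of the output."""
    out = [transform(e) for e in log[start:end]]
    payload = json.dumps(out, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Two replays of the same range under the same logic version should yield the same digest; a mismatch points at hidden non-determinism in the transformation.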

7) Privacy and residency

Stream: pseudonymization, online masking, regional pipelines (EEA/UK/BR), timeouts for external PII lookups.
Batch: isolation of PII mappings, RLS/CLS, DSAR/RTBF, Legal Hold, WORM archives.
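One common way to pseudonymize identifiers on the Stream path is a keyed hash, shown below together with a simple masking helper. This is a sketch under assumptions: the per-region key, both function names, and the masking format are illustrative, and key management is out of scope.

```python
import hashlib
import hmac


def pseudonymize(user_id: str, region_key: bytes) -> str:
    """Keyed hash: stable within a region, not reversible without the key."""
    return hmac.new(region_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_email(email: str) -> str:
    """Online masking: keep the first character and the domain, hide the rest."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}" if domain else "***"
```

Using a different key per region (for example EEA vs UK) means the same user maps to unrelated pseudonyms in each regional pipeline, which limits cross-region joins by construction.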

8) Cost engineering

Stream: avoid "hot" keys (salting), use async lookups, bound state with TTLs, pre-aggregate.
Batch: partitioning/clustering, small-file compaction, materialization of stable aggregates, quotas and run windows.
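Salting can be illustrated with a two-stage count: partial aggregates per salted key, then a merge per original key. The bucket count, the `#` separator, and the function names below are assumptions.

```python
import zlib
from collections import defaultdict

SALT_BUCKETS = 8  # assumed fan-out for hot keys


def salted_key(key: str, event_id: str) -> str:
    """Spread one hot key over SALT_BUCKETS sub-keys, deterministically per event."""
    bucket = zlib.crc32(event_id.encode("utf-8")) % SALT_BUCKETS
    return f"{key}#{bucket}"


def two_stage_count(events: list) -> dict:
    """Stage 1: partial counts per salted key. Stage 2: merge partials per original key."""
    partial = defaultdict(int)
    for key, event_id in events:
        partial[salted_key(key, event_id)] += 1
    final = defaultdict(int)
    for salted, n in partial.items():
        final[salted.rsplit("#", 1)[0]] += n
    return dict(final)
```

In a real job the first stage would be partitioned by the salted key so that a single hot user or game no longer pins one worker, and only the small partial results are shuffled for the second stage.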

9) Examples

9.1 Stream - Flink SQL (10-minute deposit velocity)

sql
-- Deposits per user in 10-minute tumbling windows (event time).
SELECT
  user_id,
  TUMBLE_START(event_time, INTERVAL '10' MINUTE) AS win_start,
  COUNT(*) AS deposits_10m,
  SUM(amount_base) AS sum_10m
FROM stream.payments
GROUP BY user_id, TUMBLE(event_time, INTERVAL '10' MINUTE);

9.2 Stream - CEP (AML pseudocode)

python
# Pseudocode: count_deposits, sum_deposits, window, THRESH and REPORTING_LIMIT are placeholders.
if (count_deposits(10MIN) >= 3 and sum_deposits(10MIN) > THRESH
        and all(d.amount < REPORTING_LIMIT for d in window)):
    emit_alert("AML_STRUCTURING", user_id, snapshot())
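The pseudocode above can be turned into a small runnable sketch. It is only an illustration: the threshold values, the `Deposit` type, and the `StructuringDetector` class are assumptions, and events are assumed to arrive in event-time order per user.

```python
from collections import deque
from dataclasses import dataclass

WINDOW_SEC = 600            # 10 minutes, as in the pseudocode
MIN_DEPOSITS = 3
THRESH = 9_000.0            # hypothetical total-amount threshold
REPORTING_LIMIT = 5_000.0   # hypothetical per-transaction reporting limit


@dataclass
class Deposit:
    user_id: str
    ts: int        # event time in seconds
    amount: float


class StructuringDetector:
    def __init__(self):
        self.windows = {}  # user_id -> deque of recent deposits

    def on_deposit(self, d: Deposit) -> bool:
        """Return True when the user's last 10 minutes match the structuring pattern."""
        win = self.windows.setdefault(d.user_id, deque())
        win.append(d)
        # Evict deposits that fell out of the sliding window.
        while win and win[0].ts <= d.ts - WINDOW_SEC:
            win.popleft()
        return (len(win) >= MIN_DEPOSITS
                and sum(x.amount for x in win) > THRESH
                and all(x.amount < REPORTING_LIMIT for x in win))
```

Three deposits of 4,000 within a few minutes trigger the rule here, while a single 6,000 deposit in the window does not, because it is already above the per-transaction limit and would be reported anyway.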

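9.3 Batch - MERGE (Silver increment)

sql
-- Upsert the staged increment into Silver, keyed by transaction_id.
-- Assumption: UPDATE SET * / INSERT * is the Delta Lake-style "all columns" shorthand;
-- list the columns explicitly on engines that do not support it.
MERGE INTO silver.payments s
USING stage.delta_payments d
  ON s.transaction_id = d.transaction_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;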

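9.4 Batch - Gold GGR (D+1)

sql
-- Each fact is pre-aggregated to (date, user, game) before the join, so several bets and
-- payouts on the same key do not multiply each other. Assumes one market per user.
CREATE OR REPLACE VIEW gold.ggr_daily AS
WITH bets AS (
  SELECT DATE(event_time) AS event_date, market, user_pseudo_id, game_id,
         SUM(stake_base) AS stakes
  FROM silver.fact_bets
  GROUP BY 1, 2, 3, 4
),
payouts AS (
  SELECT DATE(event_time) AS event_date, user_pseudo_id, game_id,
         SUM(amount_base) AS payouts
  FROM silver.fact_payouts
  GROUP BY 1, 2, 3
)
SELECT
  b.event_date,
  b.market,
  g.provider_id,
  SUM(b.stakes) AS stakes_eur,
  SUM(COALESCE(p.payouts, 0)) AS payouts_eur,
  SUM(b.stakes) - SUM(COALESCE(p.payouts, 0)) AS ggr_eur
FROM bets b
LEFT JOIN payouts p
  ON p.user_pseudo_id = b.user_pseudo_id
 AND p.game_id = b.game_id
 AND p.event_date = b.event_date
JOIN dim.games g ON g.game_id = b.game_id
GROUP BY 1, 2, 3;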

10) Metrics and SLOs

Stream

  • p95 ingest→alert ≤ 2-5 s
  • window completeness ≥ 99.5%
  • schema-errors ≤ 0.1%
  • late-ratio ≤ 1%
  • availability ≥ 99.9%
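A check of one window's measurements against the Stream targets might look like the sketch below. The function names and input shape are assumptions, the 5 s bound is the upper end of the 2-5 s target, and p95 is computed with the nearest-rank method.

```python
def p95(samples: list) -> float:
    """Nearest-rank 95th percentile (integer arithmetic avoids float rounding in the rank)."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


def check_stream_slo(latencies_s: list, expected: int, received: int,
                     late: int, schema_errors: int) -> dict:
    """Return True per SLO that is met for this window."""
    return {
        "p95_latency": p95(latencies_s) <= 5.0,           # upper bound of the 2-5 s target
        "completeness": received / expected >= 0.995,
        "schema_errors": schema_errors / received <= 0.001,
        "late_ratio": late / received <= 0.01,
    }
```

Availability is left out on purpose: it is normally measured over a longer period by the monitoring system rather than per window.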

Batch

  • Gold.daily ready by 06:00
  • completeness ≥ 99.5%
  • validity ≥ 99.9%
  • MTTR for a DQ incident ≤ 24-48 hours

11) Testing and releases

Contracts/schemas: consumer-driven tests; backward-compatibility checks in CI.
Stream: canary rules, dark launches, a replay simulator.
Batch: dry runs on samples, metric comparison, checksums (reconciliation).

12) Anti-patterns

Duplicated logic: Stream and Batch compute the same metric differently because the formulas were never aligned.
Synchronous external API calls on the Stream hot path without a cache or timeouts.
Full reload "just in case" instead of increments.
No watermark or late-data policy.
PII in analytical layers; no CLS/RLS.
Gold marts that are changed retroactively.

13) Recommended hybrid (playbook)

1. Stream loop: ingest → bus → Flink/Beam (watermarks, dedup, CEP) → OLAP (ClickHouse/Pinot) for 1-5 minute dashboards + Bronze/Silver (append).
2. Batch loop: increments/CDC → Silver normalization/SCD → Gold daily marts/reports (WORM).
3. Alignment: a single semantic layer for metrics; nightly Stream vs Batch reconciliations; discrepancies above a threshold → tickets.
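The alignment step can be sketched as a comparison of the same metrics from both loops, with Batch (Gold, D+1) taken as the reference. The function name, the ticket shape, and the 0.5% relative threshold are assumptions.

```python
def reconcile(stream: dict, batch: dict, rel_threshold: float = 0.005) -> list:
    """Return one ticket per metric that is missing on one side or deviates beyond the threshold."""
    tickets = []
    for metric in sorted(set(stream) | set(batch)):
        s, b = stream.get(metric), batch.get(metric)
        if s is None or b is None:
            tickets.append({"metric": metric, "reason": "missing",
                            "stream": s, "batch": b})
            continue
        # Relative difference against the Batch value; guard against a zero reference.
        diff = abs(s - b) / abs(b) if b else (0.0 if s == 0 else float("inf"))
        if diff > rel_threshold:
            tickets.append({"metric": metric, "reason": "mismatch",
                            "stream": s, "batch": b, "rel_diff": diff})
    return tickets
```

This only works if both sides compute the metric from the same definition, which is why the single semantic layer comes first in the step above.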

14) RACI

R (Responsible): Streaming Platform (Stream infrastructure), Data Engineering (Batch models), Domain Analytics (metrics/rules), MLOps (features/Feature Store).
A (Accountable): Head of Data / CDO.
C (Consulted): Compliance/Legal/DPO, Finance (FX/GGR), Risk (RG/AML), SRE (SLO/cost).
I (Informed): BI/Product/Marketing/Operations.

15) Roadmap

MVP (2-4 weeks):
1. Kafka/Redpanda + 2 critical topics (`payments`, `auth`).
2. Flink job: watermark + dedup + 1 CEP rule (AML or RG).
3. OLAP mart with 1-5 minute freshness + dashboards for lag/late/dup.
4. Lakehouse Silver (ACID), first Gold.ggr_daily (D+1 by 06:00).

Phase 2 (4-8 weeks):
  • Increments/CDC, SCD II, a semantic layer for metrics.
  • Streaming DQ and nightly Stream vs Batch reconciliation.
  • Regionalization (EEA/UK/BR), DSAR/RTBF, Legal Hold.

Phase 3 (8-12 weeks):
  • Replay simulator, canary/A-B releases of rules and metrics.
  • Cost dashboards and quotas; tiered storage; DR drills.
  • Auto-generated documentation for marts/metrics and lineage.

16) Implementation checklist

  • Schemas/contracts are in the registry; backward-compatibility tests are green.
  • Stream: watermarks/allowed-lateness, dedup, DLQ; OLAP dashboards in production.
  • Batch: increments/CDC, SCD II, Gold D+1 and WORM export.
  • A single semantic layer for metrics; nightly Stream vs Batch reconciliations.
  • DQ dashboards for Freshness/Completeness/Validity; alerts on lag/late/dup.
  • RBAC/ABAC, encryption, residency; DSAR/RTBF/Legal Hold.
  • Cost under control (cost/GB, cost/query, state size, replays under quota).

17) Summary

Stream and Batch are not competitors but two gears of the same drive. Stream provides the "here and now" reaction; Batch provides the verifiable "morning truth". A Lakehouse hybrid, a single metrics layer, and DQ/lineage discipline make it possible to build analytics pipelines that are fast, reproducible, and compliant, with a sound balance of SLA and cost.
