Stream vs Batch Analytics
1) Summary
Stream: continuous processing of events within seconds: antifraud/AML, RG triggers, SLA alerts, operational dashboards.
Batch: periodic recomputation with full reproducibility: regulatory reporting (GGR/NGR), financial reconciliations, ML datasets.
Targets: Stream p95 end-to-end latency 0.5-5 s; Batch D+1 ready by 06:00 (local time).
2) Selection matrix (TL;DR)
80/20 rule: everything that does not require a reaction within 5 minutes goes to Batch; the rest goes to Stream, with nightly Batch validation.
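The rule of thumb above can be written as a tiny routing function. This is a minimal sketch: the name `choose_mode` and the constant `REACTION_THRESHOLD` are illustrative, and the only input taken from the text is the 5-minute boundary.

```python
from datetime import timedelta
from typing import Optional

# Boundary from the 80/20 rule: reactions needed faster than this go to Stream.
REACTION_THRESHOLD = timedelta(minutes=5)


def choose_mode(required_reaction: Optional[timedelta]) -> str:
    """Return "stream" or "batch" for a use case (hypothetical helper).

    required_reaction is the longest acceptable delay before a reaction;
    None means the use case needs no reaction at all.
    """
    if required_reaction is None or required_reaction >= REACTION_THRESHOLD:
        return "batch"
    # Stream results are still validated by the nightly Batch run.
    return "stream"
```

For example, an AML alert that must fire within seconds is routed to Stream, while a D+1 regulatory report is routed to Batch.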
3) Architecture
3.1 Lambda
Stream for online processing + Batch for consolidation. Pro: flexibility. Con: the same logic is implemented twice.
3.2 Kappa
Everything is a stream; Batch = a replay through the log. Pro: a single codebase. Con: complexity and cost of replays.
3.3 Lakehouse-Hybrid (recommended)
Stream → operational OLAP mart (minutes) and Bronze/Silver; Batch rebuilds Gold (D+1) and publishes reports.
4) Data and time
Stream
Windows: tumbling/hopping/session.
Watermarks: 2-5 minutes; late data is flagged.
Stateful: CEP, dedup, TTL.
Batch
Increments/CDC: `updated_at`, log replication.
SCD I/II/III: attribute history.
Snapshots: daily/monthly slices for "as-of" queries.
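The interaction between tumbling windows and watermarks can be illustrated in plain Python. This is a sketch under assumptions, not Flink's implementation: the window size, the 2-minute lag (the lower end of the range above), and the names `window_start` and `WatermarkTracker` are all illustrative.

```python
from dataclasses import dataclass

WINDOW_MS = 10 * 60 * 1000           # assumed 10-minute tumbling window
MAX_OUT_OF_ORDER_MS = 2 * 60 * 1000  # assumed watermark lag (2 minutes)


def window_start(event_ts_ms: int, size_ms: int = WINDOW_MS) -> int:
    """Start of the tumbling window that contains the event timestamp."""
    return event_ts_ms - (event_ts_ms % size_ms)


@dataclass
class WatermarkTracker:
    """Watermark = highest event time seen so far minus a fixed lag."""
    max_out_of_order_ms: int = MAX_OUT_OF_ORDER_MS
    max_event_ts_ms: int = 0

    @property
    def watermark_ms(self) -> int:
        return self.max_event_ts_ms - self.max_out_of_order_ms

    def observe(self, event_ts_ms: int) -> bool:
        """Record an event and return True if it should be flagged as late.

        An event is late when its window had already closed
        (window end <= watermark) before the event arrived.
        """
        window_end = window_start(event_ts_ms) + WINDOW_MS
        is_late = window_end <= self.watermark_ms
        self.max_event_ts_ms = max(self.max_event_ts_ms, event_ts_ms)
        return is_late
```

An out-of-order event whose window is still open is accepted normally; only events arriving after their window has closed are flagged, which is what feeds the late-ratio metric.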
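SCD II and "as-of" lookups can be sketched with an in-memory history list. The `Version` record, the half-open validity interval, and both function names are assumptions made for illustration; a real Batch job would express the same logic as a MERGE over a dimension table.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Version:
    key: str
    value: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_scd2(history: list, key: str, value: str, as_of: date) -> None:
    """Close the current version of `key` and append a new one if the value changed."""
    current = next((v for v in history if v.key == key and v.valid_to is None), None)
    if current is not None:
        if current.value == value:
            return  # unchanged attribute: keep the open version as-is
        current.valid_to = as_of
    history.append(Version(key, value, valid_from=as_of))


def value_as_of(history: list, key: str, day: date) -> Optional[str]:
    """Return the value valid on `day`, using the interval [valid_from, valid_to)."""
    for v in history:
        if v.key == key and v.valid_from <= day and (v.valid_to is None or day < v.valid_to):
            return v.value
    return None
```

Because old versions are closed rather than overwritten, a report for any past date can be recomputed with the attribute values that applied on that date.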
5) Usage patterns in iGaming
AML/Antifraud: Stream (velocity/structuring) + Batch reconciliations and cases.
Responsible Gaming: Stream monitoring of limits/self-exclusions; Batch reporting registers.
Operations/SRE: Stream SLA alerts; Batch analysis of incidents and trends.
Product/marketing: Stream personalization/missions; Batch cohorts/LTV.
Finance/Reporting: Batch (Gold D+1, WORM packages); Stream for operational dashboards.
6) DQ, reproducibility, replay
Stream DQ: schema validation, dedup on `(event_id, source)`, window completeness, late-ratio, dup-rate; critical → DLQ.
Batch DQ: uniqueness/FK/range/temporal checks, reconciliation against OLTP/providers; critical → fail job + report.
- Stream: replay topics by offset range with deterministic transformations.
- Batch: time-travel/logic versioning (`logic_version`) + Gold snapshots.
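The Stream DQ checks listed above (schema validation, dedup on `(event_id, source)`, DLQ routing, dup-rate) can be sketched as one pass over a batch of events. The required-field list and the function name `run_stream_dq` are assumptions; in a real job the `seen` set would be keyed state with a TTL rather than an unbounded in-memory set.

```python
from typing import Iterable

# Assumed minimal schema; a real contract would come from the schema registry.
REQUIRED_FIELDS = ("event_id", "source", "event_time")


def run_stream_dq(events: Iterable[dict]) -> dict:
    """Split events into clean output and a dead-letter queue, and report dup-rate."""
    seen = set()
    clean, dlq = [], []
    total = duplicates = 0
    for event in events:
        total += 1
        if any(event.get(field) is None for field in REQUIRED_FIELDS):
            dlq.append(event)  # critical schema violation goes to the DLQ
            continue
        key = (event["event_id"], event["source"])
        if key in seen:
            duplicates += 1  # duplicate delivery: drop it, but count it
            continue
        seen.add(key)
        clean.append(event)
    dup_rate = duplicates / total if total else 0.0
    return {"clean": clean, "dlq": dlq, "dup_rate": dup_rate}
```

Since the function is deterministic for a given input order, replaying the same offset range yields the same clean output, which is the property the replay bullet relies on.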
7) Privacy and residency
Stream: pseudonymization, online masking, regional pipelines (EEA/UK/BR), timeouts for external PII lookups.
Batch: PII-mapping isolation, RLS/CLS, DSAR/RTBF, Legal Hold, WORM archives.
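Pseudonymization and online masking can be sketched with the Python standard library. Using HMAC-SHA256 as the keyed hash is an assumption (the text does not name an algorithm), and `pseudonymize` and `mask_email` are hypothetical helper names; key storage and rotation are out of scope here.

```python
import hashlib
import hmac


def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from a user ID with a keyed hash (HMAC-SHA256).

    The same ID and key always give the same pseudonym, so joins still work,
    while reversing it requires the key (or the isolated PII mapping).
    """
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def mask_email(email: str) -> str:
    """Mask the local part of an email, keeping its first character and the domain."""
    local, _, domain = email.partition("@")
    if not local or not domain:
        return "***"  # not a recognizable address: mask everything
    return f"{local[0]}***@{domain}"
```

Using a different key per region is one way to keep pseudonyms from being joined across regional pipelines.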
8) Cost engineering
Stream: avoid hot keys (salting), use async lookups, bound state with TTL, pre-aggregate.
Batch: partitioning/clustering, small-file compaction, materialization of stable aggregates, quotas/launch windows.
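Key salting for hot keys can be sketched as a two-stage aggregation: partial aggregates per salted key, then a merge that strips the salt. The `#` separator, the bucket count, and both function names are assumptions. The salt is derived from the event ID rather than a random number so that replays stay deterministic.

```python
import zlib


def salted_key(key: str, event_id: str, buckets: int = 8) -> str:
    """Spread one hot key across `buckets` sub-keys, deterministically per event."""
    salt = zlib.crc32(event_id.encode("utf-8")) % buckets
    return f"{key}#{salt}"


def merge_partials(partials: dict) -> dict:
    """Second aggregation stage: strip the salt and sum partial results per key."""
    totals = {}
    for salted, value in partials.items():
        key = salted.rsplit("#", 1)[0]
        totals[key] = totals.get(key, 0.0) + value
    return totals
```

The first stage runs in parallel across the salted sub-keys; only the much smaller partial results meet in the second stage.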
9) Examples
9.1 Stream - Flink SQL (10-minute deposit velocity)
sql
-- Deposit count and total per user in 10-minute tumbling event-time windows.
SELECT user_id,
       TUMBLE_START(event_time, INTERVAL '10' MINUTE) AS win_start,
       COUNT(*) AS deposits_10m,
       SUM(amount_base) AS sum_10m
FROM stream.payments
GROUP BY user_id, TUMBLE(event_time, INTERVAL '10' MINUTE);
9.2 Stream - CEP (AML pseudocode)
python
# Pseudocode: the helper functions and `window` stand for CEP primitives.
if count_deposits(10MIN) >= 3 and sum_deposits(10MIN) > THRESH \
        and all(d.amount < REPORTING_LIMIT for d in window):
    emit_alert("AML_STRUCTURING", user_id, snapshot())
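A runnable version of the structuring rule, as a sketch only: it keeps a per-user deque of recent deposits instead of a real CEP engine, assumes events arrive in timestamp order, and uses made-up values for `THRESH` and `REPORTING_LIMIT`. The `Deposit` and `StructuringDetector` names are hypothetical.

```python
from collections import defaultdict, deque
from dataclasses import dataclass

WINDOW_S = 10 * 60         # 10-minute lookback, as in the rule
MIN_DEPOSITS = 3
THRESH = 9_000.0           # hypothetical total that triggers the rule
REPORTING_LIMIT = 5_000.0  # hypothetical per-deposit reporting limit


@dataclass
class Deposit:
    user_id: str
    ts: int        # event time in seconds
    amount: float


class StructuringDetector:
    def __init__(self) -> None:
        self._windows = defaultdict(deque)  # user_id -> recent deposits

    def on_deposit(self, d: Deposit) -> bool:
        """Add a deposit and return True if the structuring pattern matches."""
        window = self._windows[d.user_id]
        window.append(d)
        # Evict deposits older than the lookback; this doubles as state TTL.
        while window and window[0].ts <= d.ts - WINDOW_S:
            window.popleft()
        return (
            len(window) >= MIN_DEPOSITS
            and sum(x.amount for x in window) > THRESH
            and all(x.amount < REPORTING_LIMIT for x in window)
        )
```

Three deposits of 4,000 within two minutes match the pattern: each is under the per-deposit limit, but together they exceed the total threshold.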
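9.3 Batch - MERGE (Silver increment)
sql
-- Upsert the staged delta into Silver by transaction_id.
-- `UPDATE SET *` / `INSERT *` is the Delta Lake shorthand for "all columns";
-- other engines need explicit column lists.
MERGE INTO silver.payments s
USING stage.delta_payments d
ON s.transaction_id = d.transaction_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;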
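9.4 Batch - Gold GGR (D+1)
sql
-- Each fact table is pre-aggregated to (user, game, day) before the join;
-- joining raw bets to raw payouts multiplies rows and inflates both sums.
CREATE OR REPLACE VIEW gold.ggr_daily AS
WITH bets AS (
  SELECT DATE(event_time) AS event_date, market, user_pseudo_id, game_id,
         SUM(stake_base) AS stake_sum
  FROM silver.fact_bets
  GROUP BY 1, 2, 3, 4
),
payouts AS (
  SELECT DATE(event_time) AS event_date, user_pseudo_id, game_id,
         SUM(amount_base) AS payout_sum
  FROM silver.fact_payouts
  GROUP BY 1, 2, 3
)
SELECT
  b.event_date,
  b.market,
  g.provider_id,
  SUM(b.stake_sum) AS stakes_eur,
  SUM(COALESCE(p.payout_sum, 0)) AS payouts_eur,
  SUM(b.stake_sum) - SUM(COALESCE(p.payout_sum, 0)) AS ggr_eur
FROM bets b
LEFT JOIN payouts p
  ON p.user_pseudo_id = b.user_pseudo_id
 AND p.game_id = b.game_id
 AND p.event_date = b.event_date
JOIN dim.games g ON g.game_id = b.game_id
GROUP BY 1, 2, 3;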
10) Metrics and SLOs
Stream
p95 ingest→alert ≤ 2-5 s
window completeness ≥ 99.5%
schema-errors ≤ 0.1%
late-ratio ≤ 1%
availability ≥ 99.9%
Batch
Gold.daily ready by 06:00.
completeness ≥ 99.5%
validity ≥ 99.9%
MTTR for DQ incidents ≤ 24-48 hours
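Two of the Stream SLOs above can be checked with a few lines of Python. This is a sketch: the nearest-rank percentile method is one common choice and an assumption here, as are the function names; the default targets use the upper latency bound (5 s) and the 1% late-ratio from the list.

```python
def p95(values: list) -> float:
    """95th percentile by the nearest-rank method."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def check_stream_slo(latencies_s: list, late: int, total: int,
                     p95_target_s: float = 5.0,
                     late_ratio_target: float = 0.01) -> dict:
    """Evaluate p95 ingest-to-alert latency and late-ratio against their targets."""
    late_ratio = late / total if total else 0.0
    return {
        "p95_ok": p95(latencies_s) <= p95_target_s,
        "late_ratio_ok": late_ratio <= late_ratio_target,
    }
```

A failing check would normally raise an alert on the lag/late dashboards rather than fail the pipeline.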
11) Testing and releases
Contracts/schemas: consumer-driven tests; back-compat CI.
Stream: canary rules, dark launch, replay simulator.
Batch: dry-run on samples, metric comparison, checksums (reconciliation).
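A back-compat check of the kind a CI step could run is sketched below. The schema representation (field name mapped to `{"type": ..., "required": ...}`) and the three rules are assumptions chosen for illustration; a real setup would rely on the compatibility modes of its schema registry.

```python
def backward_compat_errors(old: dict, new: dict) -> list:
    """List the changes in `new` that would break consumers of `old`.

    Assumed schema format: field name -> {"type": str, "required": bool}.
    """
    errors = []
    for name, spec in old.items():
        if name not in new:
            errors.append(f"removed field: {name}")
        elif new[name]["type"] != spec["type"]:
            errors.append(f"type changed: {name}")
    for name, spec in new.items():
        # New fields are fine only when optional.
        if name not in old and spec.get("required", False):
            errors.append(f"new required field: {name}")
    return errors
```

An empty list means the change can ship; anything else fails the build.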
12) Anti-patterns
Duplicated logic: separate Stream and Batch calculations whose formulas are never aligned.
Synchronous external API calls in the Stream hot path without a cache or timeout.
Full reloads "just in case" instead of increments.
No watermarks or late-data policy.
PII in analytical layers; no CLS/RLS.
Gold marts that "mutate" retroactively.
13) Recommended hybrid (playbook)
1. Stream loop: ingest → bus → Flink/Beam (watermarks, dedup, CEP) → OLAP (ClickHouse/Pinot) for 1-5 minute dashboards + Bronze/Silver (append).
2. Batch loop: increments/CDC → Silver normalization/SCD → Gold daily marts/reports (WORM).
3. Alignment: a single semantic layer for metrics; nightly Stream-vs-Batch reconciliation; discrepancies above the threshold → tickets.
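The nightly reconciliation in step 3 can be sketched as a comparison of per-metric values from both loops. Treating Batch as the reference, the 0.5% relative threshold, and the ticket dictionary shape are all assumptions; `reconcile` is a hypothetical name.

```python
def reconcile(stream: dict, batch: dict, rel_threshold: float = 0.005) -> list:
    """Compare Stream and Batch values per metric and return ticket candidates.

    Batch is treated as the reference value (assumption). A metric present on
    only one side is always reported.
    """
    tickets = []
    for metric in sorted(set(stream) | set(batch)):
        s, b = stream.get(metric), batch.get(metric)
        if s is None or b is None:
            tickets.append({"metric": metric, "reason": "missing"})
            continue
        denom = abs(b) if b != 0 else 1.0  # avoid division by zero
        rel_diff = abs(s - b) / denom
        if rel_diff > rel_threshold:
            tickets.append({"metric": metric, "reason": "mismatch",
                            "rel_diff": rel_diff})
    return tickets
```

Small differences are expected because Stream excludes late data that Batch later picks up; the threshold separates that noise from real divergence in logic.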
14) RACI
R (Responsible): Streaming Platform (stream infrastructure), Data Engineering (batch models), Domain Analytics (metrics/rules), MLOps (features/Feature Store).
A (Accountable): Head of Data / CDO.
C (Consulted): Compliance/Legal/DPO, Finance (FX/GGR), Risk (RG/AML), SRE (SLO/cost).
I (Informed): BI/Product/Marketing/Operations.
15) Roadmap
MVP (2-4 weeks):
1. Kafka/Redpanda + 2 critical topics (`payments`, `auth`).
2. Flink job: watermark + dedup + 1 CEP rule (AML or RG).
3. OLAP mart with 1-5 minute freshness + dashboards for lag/late/dup.
4. Lakehouse Silver (ACID), first Gold.ggr_daily (D+1 by 06:00).
Phase 2 (4-8 weeks):
- Increments/CDC by domain, SCD II, semantic layer for metrics.
- Stream DQ and Stream-vs-Batch reconciliation.
- Regionalization (EEA/UK/BR), DSAR/RTBF, Legal Hold.
- Replay simulator, canary/A-B rollout of rules/metrics.
- Cost dashboards and quotas; tiered storage; DR drills.
- Auto-generation of mart/metric documentation and lineage.
16) Implementation checklist
- Schemas/contracts in the registry; back-compat tests green.
- Stream: watermarks/allowed-lateness, dedup, DLQ; OLAP dashboards.
- Batch: increments/CDC, SCD II, Gold D+1 with WORM export.
- Single semantic layer for metrics; Stream-vs-Batch reconciliation.
- DQ dashboards for Freshness/Completeness/Validity; alerts on lag/late/dup.
- RBAC/ABAC, encryption, residency; DSAR/RTBF/Legal Hold.
- Cost under control (cost/GB, cost/query, state size, replay quotas).
17) Conclusion
Stream and Batch are not competitors but two gears of the same engine. Stream delivers the "here and now" reaction; Batch delivers the "verified truth by morning". A hybrid Lakehouse approach, a single metrics layer, and DQ/lineage discipline make it possible to build analytical loops that are fast, reproducible, and sound, at the best balance of SLA and cost.