GH GambleHub

Maʼlumotlarni qayta ishlash

1) Maqsadi va qiymati

Batch-konveyerlar quyidagilar uchun ishonchli kundalik/soatbay vitrinalarni shakllantiradi:
  • Tartibga solish va moliyaviy hisobotlar (GGR/NGR, soliqlar, RG/AML reyestrlari).
  • BI va oziq-ovqat analitikasi (kogortlar, LTV, konversion hunilar).
  • Aniqlikni solishtirish (OLTP, DWH, provayderlar/PSP), tarixlashtirish (SCD).
  • ML uchun fichlar va o’quv to’plamlarini tayyorlash.

Asosiy xossalari: bashorat qilish, to’liqlik, takrorlanuvchanlik, ma’lumotlar birligiga past qiymat.

2) Arxitektura (referens)

1. Ingest (raw capture): HTTP/gRPC, OLTP dan CDC, provayder yuklamalari → Bronze.
2. Lakehouse: Bronze (raw, append-only) → Silver (clean/conform) → Gold (serve).
3. Orkestr: Airflow/Dagster/Prefect (DAG’i, qaramlik, retray, SLA).
4. Ishlov berish: Spark/Trino/DBT/SQL-dvigatellar; partizatsiya va ACID formatlari (Delta/Iceberg/Hudi).
5. DQ va Kontraktlar: Schema Registry, DQ-qoidalar (YAML/SQL), consumer-tests.
6. Serving: BI/semantik qatlam, hisobot eksportlari (CSV/PDF/JSON + hash), API/GraphQL.
7. Kuzatilganlik: payplaynlar metrikasi, lineage, loglar, qiymati (cost/GB, cost/query).

3) Chastotalar va SLAs

Kundalik (D + 1 dan 06:00 gacha) : GGR hisobotlari, tartibga soluvchi yuklamalar, solishtirmalar.
Soatbay/kvazirealtaym: Ops/Moliya uchun operativ panellar.
Haftalik/oylik: moliyaviy konsolidatsiya, modellar va retro-protsesslar.

Tavsiya etilayotgan SLOlar:
  • Gold-kundalik vitrinalar mahalliy vaqt bilan soat 06:00 gacha tayyor.
  • Freshness Silver p95 ≤ 15 min mikrobatchey uchun/ ≤ 2 soat kunduzgi uchun.
  • Completeness ≥ 99. 5%, Validity (sxema) ≥ 99. 9%.

4) Inkremental yuklamalar va CDC

Yondashuvlar:
  • CDC (Change Data Capture): Debezium/log replikatsiyasi → Bronze → Silver.
  • Watermark vaqti boʻyicha:’updated _ at> max_loaded_ts'.
  • Xesh taqqoslash:’md5 (row)’oʻzgarishlar detektori uchun.
  • Upsert/Merge: Silver/Gold.
MERGE (Delta/Iceberg) misoli:
sql
MERGE INTO silver. payments AS s
USING staging. payments_delta AS d
ON s. transaction_id = d. transaction_id
WHEN MATCHED THEN UPDATE SET
WHEN NOT MATCHED THEN INSERT;

5) SCD (o’lchovlarni tarixlashtirish)

SCD I: qayta yozish (imlo, minor tuzatishlar).
SCD II: Toʻliq funksional tarix (’valid _ from/valid _ to/is _ current’).
SCD III: qisqacha taqqoslash uchun «oldin/keyin».

SCD II (misol):
sql
MERGE INTO dim. users_scd t
USING stage. users u
ON t. user_pseudo_id = u. user_pseudo_id AND t. is_current = TRUE
WHEN MATCHED AND (t. country <> u. country OR t. rg_status <> u. rg_status)
THEN UPDATE SET t. is_current = FALSE, t. valid_to = CURRENT_TIMESTAMP
WHEN NOT MATCHED
THEN INSERT (user_pseudo_id, country, rg_status, valid_from, valid_to, is_current)
VALUES (u. user_pseudo_id, u. country, u. rg_status, CURRENT_TIMESTAMP, NULL, TRUE);

6) Backfill и Reprocessing

Backfill: birlamchi toʻldirish/qoʻshimcha yuklash.
Reprocessing: mantiq/maʼlumotlar tuzatilgandan keyin vitrinalarni qayta hisoblash.

Prinsiplar:
  • Idempotentlik (MERGE/upsert), Bronze o’zgarmasligi, mantiqni versiyalash.
  • takroriy progonlar uchun Time-travel; meta ma’lumotlar snapshotlari.
  • Guardrails: diapazonlar, kvotalar va raqobatbardosh joblarni cheklash.
  • Hujjatlar: runbook bosqichlari va tugash mezonlari.

7) Qatlamlarni modellashtirish

Bronze:
  • Append-only,’event _ date’,’jurisdiction’,’tenant’partiyalari.
  • Boshlang’ich payload’ni saqlaymiz (forensika uchun),’ingested _ at’ni o’rnatamiz.
Silver:
  • Normallashtirish va standartlashtirish: FK/ma’lumotnomalar, dedup, FX/taymzonlar.
  • Faktlar/o’lchovlar jadvallari (3NF/BCNF), asosiy o’lchovlar uchun SCD.
Gold:
  • BI/regulyator/moliya ostidagi denormallashtirilgan vitrinalar, tayyorgarlikning SLA.
  • Agregatlarni materiallashtirish; o’zgarmas eksport artefaktlari (hash + WORM).

8) Ma’lumotlar sifati (DQ-kod sifatida)

Silver uchun YAML qoidalari namunasi:
yaml table: silver. payments slo:
freshness_minutes: 15 completeness_percent: 99. 5 rules:
- name: amount_positive type: range column: amount_base min: 0. 01 severity: critical
- name: currency_whitelist type: in_set column: currency set: [EUR,USD,GBP,TRY,BRL]
severity: major
- name: unique_tx type: unique columns: [transaction_id]
severity: critical
- name: fk_user type: foreign_key column: user_pseudo_id ref_table: dim. users_scd severity: critical

Reaksiya siyosati: critical → fail job + DLQ; major/minor → tag + hisobot.

9) Semantik qatlam va hisobot

Semantic-layer/metrics-store da metriklarning yagona aniqliklari (GGR/NGR, ARPPU, Retention).
Metriklarni versionlash; BI/eksport paketlari bilan integratsiya qilish.
Hisobotlar: CSV/JSON/PDF + sha256, tushirish jurnali va zarur hollarda Legal Hold.

10) Maxfiylik, rezidentlik, xavfsizlik

PII-minimallashtirish: foydalanuvchilarni taxalluslashtirish; mapping - alohida himoyalangan konturda.
Data residency: EEA/UK/BR bo’yicha alohida kataloglar/kalitlar; huquqiy asossiz kross-mintaqaviy join’onlarni taqiqlash.
Shifrlash: TLS in-transit; KMS/CMK at-rest; eksportni nazorat qilish.
DSAR/RTBF: hisoblab chiqiladigan proyeksiyalar, selektiv tahrirlash; kirish auditi.
Legal Hold: tartibga soluvchi artefaktlar uchun WORM arxivlari.

11) Unumdorlik va qiymat

Sana/bozor/tenant bo’yicha partiyalashtirish; Tez-tez uchraydigan predikatlar boʻyicha Z-order/cluster.
Formatlar: Parquet + ACID jadvallari; kompressiya/statistika, OPTIMIZE/VACUUM.
Materiallashtirish: Goldda barqaror agregatsiyalar; «monolit» joblardan qochish.
Kvotalar/budjetlar: jamoalar bo’yicha chargeback; backfill/og’ir so’rovlar uchun limitlar.
Rejalashtirish: past yuklamali derazalar (kecha/dam olish kunlari), navbatlarning ustuvor yo’nalishlari.

12) Kuzatuv va boshqaruv

Payplaynlar metrikasi: duration, success rate, retries, rows processed, cost/query.
DQ-metriklar: completeness, validity, uniqueness, FK-xatolar, drift.
Freshness heatmap: domenlar va bozorlar bo’yicha; SLA dashbordlari.
Lineage: Bronzadan hisobotlargacha; oʻzgarishlardan oldin impact-tahlil.
Alertlar: SLO-byudjetlar, DQ tanazzullari, kechikishlar, qiymatning oshishi.

13) SQL/modellar namunalari

Valyutalarni normallashtirish (Silver):
sql
CREATE OR REPLACE TABLE silver. payments AS
SELECT p. transaction_id,
p. user_pseudo_id,
p. currency,
p. amount_orig,
r. rate AS fx_rate_used,
p. amount_orig r. rate AS amount_base,
p. market,
CAST(p. event_time AS TIMESTAMP) AS event_time
FROM bronze. payment_events p
JOIN dim. fx_rates r
ON r. date = DATE(p. event_time)
AND r. ccy_from = p. currency AND r. ccy_to = 'EUR';
GGR (Gold) ning kundalik vitrini:
sql
CREATE OR REPLACE VIEW gold. ggr_daily AS
SELECT
DATE(b. event_time) AS event_date,
b. market,
g. provider_id,
SUM(b. stake_base) AS stakes_eur,
SUM(p. amount_base) AS payouts_eur,
SUM(b. stake_base) - SUM(p. amount_base) AS ggr_eur
FROM silver. fact_bets b
LEFT JOIN silver. fact_payouts p
ON p. user_pseudo_id = b. user_pseudo_id
AND p. game_id = b. game_id
AND DATE(p. event_time) = DATE(b. event_time)
JOIN dim. games g ON g. game_id = b. game_id
GROUP BY 1,2,3;
To’liqlikni nazorat qilish (DQ SQL):
sql
SELECT market, event_date, COUNT() AS n
FROM silver. fact_bets
GROUP BY market, DATE(event_time) AS event_date
HAVING n = 0;

14) Jarayonlar va RACI

R (Responsible): Data Engineering (DAG’i, Silver/Gold modellari), Data Platform (infra, sxemalar registri, DQ).
A (Accountable): Head of Data / Chief Data Officer.
C (Consulted): Compliance/Legal/DPO (PII/retention), Finance (FX/GGR), Risk (RG/AML), SRE (SLO/стоимость).
I (Informed): BI/Mahsulot/Marketing/Operatsiyalar.

15) Joriy etish yo’l xaritasi

MVP (4-6 hafta):

1. Lakehouse Bronze/Silver (ACID formati), CDC/2-3 domen uchun inkrementlar.

2. DQ-kod sifatida: Payments/Gameplay + CI-validatsiya uchun 10-15 qoidalar.

3. Birinchi Gold-vitrin (GGR Daily) SLA dan 06:00 gacha; hisobot eksporti + hash.

4. Dashbordlar Freshness/Completeness/Cost, bazaviy alertlar.

2-faza (6-12 hafta):
  • SCD II для users/games/providers; domenlarni kengaytirish.
  • Metriklarning semantik qatlami; OLTP/provayderlar (accuracy) bilan taqqoslash.
  • backfill/reprocessing, lineage va impact-tahlil, hududlashtirish (EEA/UK) tartib-taomillari.
3-faza (12 + hafta):
  • Oʻzgarishlarni avtosimulyatsiya qilish (dry-run), byudjetlar/kvotalar, chargeback.
  • Avtomatik hujjatlar (data product pages), DR-mashqlar va time-travel-tiklash.
  • Qiymatni optimallashtirish (klaster, materiallashtirish, TTL, vakuum).

16) Sotishdan oldingi chek-varaq

  • Registridagi kontraktlar va sxemalar, muvofiqlik testlari yashil rangda.
  • Inkremental yuklash/CDC ishlamoqda, MERGE idempotentdir.
  • DQ qoidalari faol; critical → fail + DLQ; qoidabuzarliklar to’g "risidagi hisobot.
  • SLA/yangi/to’liq dashbordlar; Alertlar sozlangan.
  • PII/DSAR/RTBF/Legal Hold siyosati Legal/DPO tomonidan tasdiqlangan.
  • Runbook’va backfill/reprocessing/DR sinovdan o’tkazildi.
  • Nazorat ostidagi narx (cost/query, cost/GB, kvoutlar).

17) Anti-patternlar va qanday qilib

Monolit tungi joblar: mustaqil qadamlarga bo’ling, partiyalarga parallel bo’ling.
Full-reload kerak emas :/CDC/merji inkrementlaridan foydalaning.
Tahlilda PII aralashmasi: mappinglarni alohida saqlang, CLS/RLS qo’llang.
DQ/lineage yo’qligi: DQ kodini kiriting va kelib chiqishini kuzating.
«Qo’lda ishlatiladigan» backfill’lar: avtomatlashtirish va hujjatlashtirish, diapazonlarni cheklash.
Boshqarilmaydigan qiymat: klasterlashtirish, materiallashtirish, retenshn-siyosat.

18) Lugʻat (qisqacha)

CDC - OLTP’dan oʻzgarishlarni olib tashlash.
SCD - sekin o’zgaruvchan o’lchovlar (I/II/III).
Lakehouse - data lake + ACID jadvallari.
MERGE/Upsert - indempotent yangilash operatsiyalari.
Time-travel - jadvallarning tarixiy versiyalarini o’qish.
WORM - artefaktlarni o’zgarmas saqlash.

19) Jami

Paketda qayta ishlash - bu bashorat qilinadigan, takrorlanadigan va komplayent konveyerlarning intizomi. Schema-first, inkrementlar/CDC, SCD-tarixlashtirish, DQ-kod, kuzatuv va ongli iqtisodiyot printsiplariga amal qilib, siz istalgan vaqtda tekshiriladigan va auditga tayyor bo’lgan barqaror Gold-vitrinalar va hisobotlarni olasiz.

Contact

Biz bilan bog‘laning

Har qanday savol yoki yordam bo‘yicha bizga murojaat qiling.Doimo yordam berishga tayyormiz.

Integratsiyani boshlash

Email — majburiy. Telegram yoki WhatsApp — ixtiyoriy.

Ismingiz ixtiyoriy
Email ixtiyoriy
Mavzu ixtiyoriy
Xabar ixtiyoriy
Telegram ixtiyoriy
@
Agar Telegram qoldirilgan bo‘lsa — javob Email bilan birga o‘sha yerga ham yuboriladi.
WhatsApp ixtiyoriy
Format: mamlakat kodi va raqam (masalan, +998XXXXXXXX).

Yuborish orqali ma'lumotlaringiz qayta ishlanishiga rozilik bildirasiz.