Data Mesh: federated data model
(Section: Technology and Infrastructure)
Brief Summary
Data Mesh is an organizational and technical model in which data is treated as products owned by domain teams, while the central platform's role is to provide self-service tooling, standards, and compliance. For iGaming this means: the Payments team owns "Deposit Events" and the "Net Deposits Mart," Risk owns "Fraud Signals," Games owns "Bet Events" and "Leaderboards," and the central platform supplies a catalog, schema contracts, access control, quality monitoring, FinOps, and streaming/ELT tooling.
1) Data Mesh principles
1. Domain ownership: each domain (Payments, Risk, Games, KYC/Compliance, CRM, Affiliate) owns its datasets and their life cycle.
2. Data as a product: each dataset has an owner, description, SLOs, access SLAs, documentation, versioning, a feedback channel and a roadmap.
3. Self-serve platform: standard ingest/transform/serve pipelines, templates, secure defaults, a catalog and observability.
4. Federated governance: common standards for schemas, metrics, PII/localization and quality are set centrally; implementation and evolution happen in the domains.
2) Operating model and roles
Domain Data Product Owner (DPO): prioritization, SLOs, backlog of data-product improvements.
Domain Data Engineer / Analytics Engineer: schemas, pipelines, DQ tests, versioning.
Domain Steward: field semantics, conformance to the metrics dictionary, PII classification.
Platform Team: catalog, IAM/RBAC, Policy-as-Code, table formats (Delta/Iceberg/Hudi), orchestration, observability, FinOps.
Federated Governance Board: approves standards (schemas, metrics, security) and resolves cross-domain disputes.
3) "Data Product" - passport and artifacts
Minimum data product composition:
- Contract (schema, types, evolution, compatibility).
- Access API (SQL/table, topic/stream, file/share).
- SLA/SLO (freshness, availability, quality).
- DQ tests (uniqueness, ranges, referential integrity).
- Documentation (description of fields, examples of requests, owner, contact).
- Versioning (semantic versioning of schemas, deprecation policy).
- Policies (PII, localization, retention/TTL, access rights).
Passport template (YAML, example)
```yaml
name: bets.events.v1
domain: games
owner: games-data@company
interface:
  sql: lakehouse.silver.bets_events
  stream: kafka://bets.events.v1
  share: read-only (EU only)
schema_version: 1.3.0
slo:
  freshness: "<= 5 min (p95)"
  availability: ">= 99.9%"
dq:
  - unique: bet_id
  - valid_values: currency in [EUR, USD, TRY, BRL]
  - non_negative: [stake, payout]
security:
  pii: false
  region: EU
  retention: 365d
lineage:
  sources: [game_engine.outbox, payments.psp.webhooks]
  consumers: [crm.triggers, risk.realtime, dwh.fact_bets]
versioning:
  compat: backward
  deprecation_policy: "60 days"
```
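A passport like this can be validated in CI before it is published to the catalog. A minimal sketch, assuming the passport has already been parsed into a dict; the required field names follow the example above and should be adjusted to your own passport schema:

```python
# Minimal passport validator: checks that a data-product passport
# declares the required top-level sections before catalog publication.
# Field names mirror the example passport; adapt them to your standard.

REQUIRED_KEYS = {"name", "domain", "owner", "interface", "schema_version",
                 "slo", "dq", "security", "versioning"}

def validate_passport(passport: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - passport.keys())]
    slo = passport.get("slo", {})
    for field in ("freshness", "availability"):
        if field not in slo:
            problems.append(f"slo.{field} not declared")
    if not passport.get("dq"):
        problems.append("no DQ checks declared")
    return problems

passport = {
    "name": "bets.events.v1", "domain": "games", "owner": "games-data@company",
    "interface": {"sql": "lakehouse.silver.bets_events"},
    "schema_version": "1.3.0",
    "slo": {"freshness": "<= 5 min (p95)", "availability": ">= 99.9%"},
    "dq": [{"unique": "bet_id"}],
    "security": {"pii": False}, "versioning": {"compat": "backward"},
}
print(validate_passport(passport))  # []
```

Running such a check as a CI gate on the contract repository keeps incomplete passports out of the catalog.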
4) Interoperability and standards
Schemas/contracts: Avro/Protobuf/JSON Schema + Schema Registry; backward-compatibility policy, no breaking changes without a new major version.
Semantic layer: unified definitions of GGR, NGR, Net Deposits, LTV and cohorts as code (dbt metrics / semantic layer).
Identifiers: global `player_id`, `tenant_id`, `bet_id`; unified country/currency/provider reference directories.
Metadata: required columns `ingest_ts`, `schema_version`, `trace_id`, `source`, `region`.
Access: SQL (lakehouse/OLAP), stream (Kafka/Pulsar), table/snapshot sharing; the exchange format is Parquet/Delta/Iceberg.
5) Process reference standard (agnostic to vendors)
Ingest: Outbox/CDC from OLTP → Kafka → Lakehouse (Bronze).
Transform: ELT/dbt into Silver/Gold; incremental `MERGE`, SCD, materialized marts.
Serve: OLAP (ClickHouse/BigQuery/Snowflake), real-time engines (Pinot/Druid) for near-real-time queries.
Catalog/Lineage: a single catalog, auto-generated documentation, dependency graph.
Observability: freshness/SLO metrics, DQ asserts, stream lags, cost.
Policies: IAM/RBAC/ABAC, encryption, localization (region-aware data routing).
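The Silver-layer incremental `MERGE` step can be illustrated with a toy upsert over plain Python dicts; a real pipeline would use `MERGE INTO` on a Delta/Iceberg table, but the keyed-upsert semantics are the same:

```python
# Toy illustration of the incremental MERGE into Silver: each new Bronze
# batch is upserted by primary key, so Silver keeps the latest record
# per bet_id. Real pipelines would run MERGE INTO on a table format.

def merge_incremental(silver: dict[str, dict], batch: list[dict],
                      key: str = "bet_id") -> dict[str, dict]:
    for row in batch:
        silver[row[key]] = row  # insert new key or overwrite existing (upsert)
    return silver

silver: dict[str, dict] = {}
merge_incremental(silver, [{"bet_id": "b1", "stake": 10, "status": "open"}])
merge_incremental(silver, [{"bet_id": "b1", "stake": 10, "status": "settled"},
                           {"bet_id": "b2", "stake": 5, "status": "open"}])
print(len(silver), silver["b1"]["status"])  # 2 settled
```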
6) SLO/SLA for data products
Examples of target SLOs:
- Freshness: Bets Events (p95) ≤ 5 min; Fraud Signals ≤ 30 sec; Net Deposits Mart ≤ 15 min.
- Availability: ≥ 99.9% for read interfaces.
- Quality: duplicates ≤ 0.01%, share of empty required fields ≤ 0.1%, currency consistency 100%.
- Cost SLO: cost of window scans ≤ N $/day, small-files ratio < 10%.
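An observability job can evaluate such a freshness SLO from observed event-to-availability lags. A minimal sketch using the nearest-rank p95 (the 5-minute threshold matches the Bets Events target above):

```python
# Sketch: checking a freshness SLO (p95 <= 5 min) against observed
# ingestion lags, as a periodic observability job might do.

import math

def p95(values: list[float]) -> float:
    """Nearest-rank 95th percentile."""
    ordered = sorted(values)
    return ordered[math.ceil(0.95 * len(ordered)) - 1]

def freshness_slo_met(lags_sec: list[float], threshold_sec: float = 300) -> bool:
    return p95(lags_sec) <= threshold_sec

lags = [30, 45, 60, 90, 120, 180, 240, 280, 290, 600]  # seconds
print(p95(lags), freshness_slo_met(lags))  # one slow sample breaches p95
```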
7) Safety, PII and localization
Classification: PII/sensitive financial/operational.
Technical measures: encryption at rest/in transit; PII tokenization; column masking; row-level filters by `tenant_id`.
Localization: domain products are published only in authorized regions (EU/TR/LATAM); cross-border sharing is limited to PII-free datasets.
Audit: who published/read what and which schema version; access-escalation requests go through an approval workflow.
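Deterministic tokenization can replace raw PII with a keyed hash, so joins across products still work while the raw value never leaves the domain. A minimal sketch; the hard-coded key and field names are illustrative only (in practice the key comes from a KMS):

```python
# Sketch of deterministic PII tokenization: an HMAC of the raw value
# replaces it, preserving joinability without exposing the identifier.
# Key management (rotation, KMS) is deliberately out of scope here.

import hmac
import hashlib

SECRET = b"rotate-me-in-kms"  # assumption: fetched from a KMS in production

def tokenize(value: str) -> str:
    return hmac.new(SECRET, value.encode(), hashlib.sha256).hexdigest()[:16]

def mask_row(row: dict, pii_fields: set[str]) -> dict:
    return {k: (tokenize(v) if k in pii_fields else v) for k, v in row.items()}

row = {"player_id": "p-42", "email": "a@b.com", "stake": 10}
safe = mask_row(row, {"email"})
print(safe["email"] != row["email"], safe["stake"])  # token differs, stake untouched
```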
8) FinOps and Value Management
Budgets by domain: compute limits, overspend alerts.
Storage: storage classes + TTL (Bronze short, Silver medium, Gold long/aggregates).
Query optimization: partitions/clustering, materialized views, BI results cache.
Small files: compaction/OPTIMIZE policies; target file size 128-1024 MB.
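A compaction policy like the one above can be driven by a simple planner that measures the small-files ratio and groups small files into batches near a target size. A sketch; the 128 MB threshold and 512 MB target are illustrative values, not a recommendation:

```python
# Sketch: computing the small-files ratio and planning compaction
# batches close to a target size (all sizes in MB; thresholds are
# illustrative and should come from the FinOps policy).

def compaction_plan(file_sizes_mb: list[int], small_threshold: int = 128,
                    target: int = 512) -> tuple[float, list[list[int]]]:
    small = [s for s in file_sizes_mb if s < small_threshold]
    ratio = len(small) / len(file_sizes_mb)
    batches, current, acc = [], [], 0
    for size in sorted(small, reverse=True):   # pack largest small files first
        current.append(size)
        acc += size
        if acc >= target:
            batches.append(current)
            current, acc = [], 0
    if current:
        batches.append(current)                # leftover partial batch
    return ratio, batches

sizes = [900, 700, 64, 32, 16, 200, 8, 120, 90]
ratio, batches = compaction_plan(sizes)
print(round(ratio, 2), len(batches))
```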
9) Life cycle and evolution
Versioning: `domain.product.v{major}`; minor changes must be backward compatible.
Deprecation: consumer notification, a dual-run period, automatic alerts to consumers of old versions.
Schema changes: pull request to the contract repository; CI compatibility tests; auto-publish to the catalog.
Feedback: product channel (issue tracker), consumer NPS, incident response time.
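The CI compatibility gate can be sketched as a simplified backward-compatibility check: a change passes if no existing field is removed or retyped, while new optional fields are allowed. Real registries (e.g. Confluent Schema Registry) apply richer rules; this only illustrates the idea:

```python
# Sketch of a CI backward-compatibility gate: the new schema must keep
# every existing field with the same type; adding fields is allowed.
# Schemas are modeled as {field_name: type_name} for simplicity.

def is_backward_compatible(old: dict[str, str], new: dict[str, str]) -> bool:
    return all(field in new and new[field] == ftype
               for field, ftype in old.items())

v1   = {"bet_id": "string", "stake": "double"}
v1_1 = {"bet_id": "string", "stake": "double", "bonus": "double"}  # field added
v2   = {"bet_id": "string"}                                        # field dropped

print(is_backward_compatible(v1, v1_1), is_backward_compatible(v1, v2))  # True False
```

A failing check should block the merge unless the contract bumps the major version, matching the "no breaking changes without a new major version" policy above.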
10) Concretization for iGaming - domain and product map
Payments
`payments.psp.webhooks.v1` (stream)
`mart_net_deposits_daily.v1` (SQL) - SLO freshness ≤ 15 min; PII-free
Games
`bets.events.v1` (stream/SQL) - freshness p95 ≤ 5 min
`mart_ggr_daily.v1` (SQL/MV) - aggregates by country/game
Risk/Anti-fraud
`risk.signals.v1` (stream) - p95 ≤ 30 sec
`risk.case_mgmt.v1` (SQL) - SCD2 investigation history
CRM/Personalization
`crm.triggers.v1` (stream) - segment triggers
`profile.features.online.v1` (KV/SQL) - online features (TTL)
KYC/Compliance
`kyc.status.v1` (SQL) - PII protected, row-level policies
`responsible_gaming.events.v1` (stream) - limits/signals
11) Platform processes and artifacts
Catalog: search by domain/fields/PII labels, preview of schemas and sample queries.
Template generators: cookiecutter for a new product (passport, CI, DQ tests, SLO dashboard).
Policy-as-Code: access and export rules, PII handling, cross-region sharing.
Observability: ready-made dashboards for freshness, DQ errors, cost, lineage, stream lag.
Runbooks: freshness/DQ/schema incidents, emergency deprecation, version rollback.
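The template-generator idea can be sketched as a tiny scaffold function that renders a passport stub from a few parameters, mirroring what a cookiecutter template would produce. All names and defaults here are illustrative:

```python
# Sketch of a "new data product" scaffold: renders a minimal passport
# stub from a few inputs, the way a cookiecutter template would.
# Defaults (SLO values, compat mode) are illustrative placeholders.

def scaffold_passport(domain: str, product: str, owner: str) -> str:
    name = f"{domain}.{product}.v1"
    return "\n".join([
        f"name: {name}",
        f"domain: {domain}",
        f"owner: {owner}",
        "schema_version: 1.0.0",
        "slo:",
        '  freshness: "<= 15 min (p95)"',
        '  availability: ">= 99.9%"',
        "versioning:",
        "  compat: backward",
    ])

stub = scaffold_passport("payments", "psp.webhooks", "payments-data@company")
print(stub.splitlines()[0])  # name: payments.psp.webhooks.v1
```

A real generator would also emit the CI config, DQ test skeletons and an SLO dashboard definition alongside the passport.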
12) Migration to Data Mesh (roadmap)
1. Inventory current datasets → group them by domain.
2. Pilot with 2-3 domains (Payments, Games, Risk): publish their datasets as products with passports.
3. Catalog and standards: schemas, metrics, PII/localization, DQ.
4. Self-serve: pipeline templates, CI/CD, SLO monitoring.
5. Split monolithic data marts by domain; dual-run support for old interfaces.
6. Federated governance board: regular sessions, review of contract changes.
7. Scale to CRM/Affiliates/Marketing, then to partner sharing.
13) Implementation checklist
Domains defined; owners and communication channels assigned.
Catalog launched; each product's passport published.
Schemas live in the contract repository; CI tests compatibility/DQ.
SLOs/SLAs declared; freshness/DQ/cost dashboards available.
PII/localization policies as code; audit enabled.
FinOps: budgets, alerts, cost-by-domain reporting.
Versioning/deprecation process documented and automated.
Incident runbooks available and rehearsed (game days).
14) Antipatterns
"Renamed Data Mesh, but all through the central data command" - the narrow neck is not eliminated.
The lack of a single dictionary of metrics → GGR/NGR differ between domains.
Schemes without contracts and compatibility tests → "breaking" releases.
No Self-serve → each table is created manually, high time-to-data.
Ignoring PII/localization in cross-regional sharing.
Micro products without owners/SLO - "abandoned" data.
15) Data Mesh Success KPI
Time-to-Data: from idea to available data product (median ↓).
Reuse: number of consumer domains per product.
Quality: share of successful DQ checks, defects per million events.
Reliability: SLO compliance with freshness/availability.
Cost: $/query/user, small-files ratio, compute utilization.
Rate of change: schema/mart releases per week.
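SLO compliance, the reliability KPI above, can be computed as the share of measurement windows in which a product met its freshness target. A minimal sketch with illustrative daily p95 lags:

```python
# Sketch: SLO-compliance KPI as the fraction of measurement windows
# (here, days) where the product's freshness stayed within target.

def slo_compliance(window_lags_sec: list[float], threshold_sec: float) -> float:
    met = sum(1 for lag in window_lags_sec if lag <= threshold_sec)
    return met / len(window_lags_sec)

daily_p95_lags = [240, 280, 310, 200, 180, 290, 260]  # seconds, one per day
print(round(slo_compliance(daily_p95_lags, 300), 3))  # one breach out of seven
```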
Summary
Data Mesh is not just a technology but a governed federation of domains, where datasets are products with owners, SLOs, contracts and quality metrics. In iGaming this approach removes bottlenecks, speeds up integration (anti-fraud, payments, CRM), makes metrics transparent (GGR/NGR/LTV) and keeps cost under control. Build a strong self-serve platform, introduce federated standards and a data-as-a-product culture, and your analytics ecosystem will scale with the business - without losing quality, speed or compliance.