GH GambleHub

Data access interfaces

1) Why a thoughtful interface

Speed ​ ​ and predictability: business metrics and reports fit into SLAs, without "manual uploads."

Security and privacy: PII/biometrics under control, k-anonymity, geo-boundaries.
Flexibility: different customers (BI, services, partners, DS/ML) get exactly what they need.
Reuse "data as a product" with contracts and versions.

2) Interface map (when what)

SQL/ANSI + vendor dialects: interactive analytics, BI, ad-hoc.
REST JSON: stable aggregates and operational data, integrations with partners.
GraphQL: flexible "selective" reading and navigation graph (dimensions/facts).
gRPC (protobuf): low latency of online surfing (Feature Store, scoring).
Arrow Flight/Parquet over HTTP/S3-presigned: fast column dumps for DS/ML.
OData: enterprise tools, table-as-a-service model.
Streams (Kafka/Pulsar) + CDC/Webhooks: real-time events, reactive integrations.
Federation (Trino/Presto): Single entry point to multiple sources.

Rule: aggregates and stable slices → REST/MV, rich arbitrary queries → SQL, low latency/online features → gRPC, flexible response form → GraphQL, mass binary exchange → Arrow/Parquet.

3) Contracts and versions (semver)

`MAJOR. MINOR. PATCH 'for each API/schema/event.
MAJOR: incompatible changes (new path/topic/table).
MINOR: Compatible field/argument additions.
PATCH: edit descriptions/limits.
Contracts are fixed: scheme, filters, limits, privacy, SLO.

OpenAPI (fragment, REST metrics):
yaml openapi: "3. 0. 3"
info: {title: "Analytics API", version: "2. 4. 0"}
paths:
/v2/payments/metrics:
get:
parameters:
- {name: brand, in: query, schema: {type: string}, required: true}
- {name: country, in: query, schema: {type: string}}
- {name: from, in: query, schema: {type: string, format: date-time}}
- {name: to, in: query, schema: {type: string, format: date-time}}
- {name: group_by, in: query, schema: {type: string, enum: [psp,status,day]}}
- {name: limit, in: query, schema: {type: integer, default: 500}}
responses:
"200": {description: "OK"}
x-slo: {p95_latency_ms: 1200, freshness_max: "PT5M"}
x-privacy: {pii: false, min_group_size: 20}

4) Access to analytics (SQL and federation)

SQL gateway with roles/masks (row/column-level security).
Blizzards/BI projections: stable names and semantics; heavy requests go to preaggregation.
Federation (Trino/Presto): single entry point, but with policies: what directories and what features are available.
Lakehouse (Iceberg/Delta/Hudi): time-travel, snapshot-retrievals via SQL/REST.
Квоты: scanned bytes/query, concurrency, wall-time.

5) GraphQL (flexible form)

We give the client to collect the desired field, but execute over the prepared blizzards/projections, with depth/bone limits.

graphql type Query {
payments(
brand: String!, country: String, from: DateTime!, to: DateTime!,
first: Int = 200, after: String
): PaymentConnection
}

Policies: depth ≤ 5, total nodes ≤ 5k, prohibit arbitrary regex/like by lines; we cache frequent requests.

6) gRPC/Feature Store (low latency)

Online features for scoring anti-fraud/recommendations/RG.

proto service FeatureStore {
rpc GetFeatures (FeatureRequest) returns (FeatureResponse);
}
message FeatureRequest { string user_tok = 1; repeated string features = 2; }
message FeatureResponse { map<string, FeatureValue> values = 1; int64 ts_micros = 2; }

Requirements: p95 ≤ 50-100 ms, exact offlayn↔onlayn consistency, TTL feature, LRU cache, idempotency and mTLS.

7) Flows and CDC

Domain events: 'payments. deposit_accepted`, `game. round_finished`.
CDC (from OLTP): status/limit changes in near-real-time.
Webhooks for partners: subscription to aggregates (e.g. "PSP failures> threshold").
Retray/acknowledgement policies: exactly-once for critical, at-least-once for monitoring.

8) Lakes and large samples

Arrow Flight for fast column discharges to DS/ML.
Signed-URL to Parquet/Feather, with short TTL and signed request.
Chunked transfer and file size control; Download log (WORM audit).

9) Filters, pagination, sorting

Keyset pagination (cursor) instead of OFFSET for large sets.
Filters: whitelists by fields, types and operators ('=, IN, BETWEEN, prefix').
Sorting: limited list of fields, default order.
Partial response: 'fields = brand, country, amount' reduces payload.

http
GET /v2/game-rounds? brand=X&from=...&to=...&first=1000&after=eyJkYXRlIjoi...

10) Caching and cost

Result cache for template requests, disabled by the snapshot id.
Edge cache/CDN for public/semi-public aggregates (without PII).
Budget parameters: scanned bytes limit, request timeout, rps/min quotas.
Prioritization of pools: 'bi _ hot', 'adhoc', 'partner _ api'.

11) Security and privacy

AuthN: OAuth2/OIDC (client credentials for services, PKCE for people).
AuthZ: RBAC + ABAC (attributes: brand, country, license, role).
mTLS between services, TLS 1. 2 + out.
PII hygiene: masks/tokenization on API layer, column masks, k-anonymity of aggregates.
Geo/tenant-isolation: routing requests to the license region; encryption keys per brand/region.
DSAR/Legal Hold: search by subject token, secrets for freezing sets.

12) Observability (SLI/SLO) and protection

SLI: p50/p95/p99 lat, error-rate, rps, bytes scanned, cache hit, quotas/limits, share of masked columns, share of authorization failures.
SLO: p95 latency, data freshness,% successful requests, min-group-size on responses.
Alerts: scanned bytes rise, hit-rate fall, 429/5xx spike, PII access attempts, cursor leaks.

Example policy:
yaml slo:
p95_latency_ms: 1200 success_rate: 0. 995 freshness_max: "PT5M"
privacy:
pii_allowed: false min_group_size: 20 quotas:
rps: 50 max_scanned_mb: 256

13) Formats and compression

JSON for compatibility; CSV - only for small and simple exports.
Parquet/Arrow - default for large uploads.
Compression: gzip/zstd (negotiation via 'Accept-Encoding').
Content-negotiation: `Accept: application/x-parquet`.

14) Metrics as API (Analytics/OLAP gateway)

Top-level metrics: GGR/NET, CR, hold, RG incidents - as resources with the parameters' brand, country, window, group _ by '.
Approx (HLL/TDigest) для distinct/percentiles.
Key cache: '(metric, params, snapshot_id)'.

15) iGaming specificity - ready-made endpoints

'GET/v2/payments/metrics' - failures/updates by PSP/country/brand with 7/30d windows.
'GET/v2/game-rounds/metrics' - top games/providers, p95 duration, RTP windows.
'GET/v2/rg/cases' - active restrictions/self-exclusions (k-anonymous aggregates).
'POST/v1/features: get '(gRPC) - online features for scoring fraud/recommendation.

'POST/v1/webhooks/psp-alerts' - notifications "decline rate> threshold."

16) Contract examples

GraphQL query thin slice:
graphql query {
payments(brand:"X", country:"TR", from:"2025-10-01", to:"2025-10-31", first:500) {
edges { node { day totalAmount declines psp } cursor }
pageInfo { hasNextPage endCursor }
}
}
Kafka (event, Avro):
json
{"event_id":"...","occurred_at":169..., "brand":"X","psp":"Papara","status":"declined","amount":"100. 00","currency":"TRY"}
Arrow Flight (pen):

/flight/v1/query? dataset=gold. payments&from=...&to=...&brand=X&format=arrow

17) New interface publishing process

1. ADR: Issue/Value/Customers/Security/Cost.
2. Contract: scheme, filters, limits, privacy, SLO, versions.
3. Load modeling: top-N requests, p95/scan bytes, cost.
4. Validation/cache/quotas: enable by default.
5. Documentation and SDK: examples, limits, errors, retrays, idempotency.
6. Canary:% of clients, regression tests, alerts.
7. GA: Data Products catalog version, effects report.

18) Anti-patterns

Open "raw" SQL to everyone - PII leaks, unpredictable cost.
OFFSET pagination and 'SELECT' - pain by latency and counting.
GraphQL without depth/cost constraints.
REST, which returns too many columns without 'fields =...'.
Lack of k-anonymity and min-group-size in aggregates.
Zero quotas/limits and disabled cache.
No versioning/contracts - we "break" clients with each change.
The same interface for all countries/brands is a disregard for regional rules.

19) Implementation Roadmap

0-30 days (MVP)

1. Data Products catalog (metrics/slices) and their OpenAPI/GraphQL contracts.
2. SQL gateway with RLS/CLS, k-anonymity of aggregates, basic quotas.
3. One REST-metric endpoint ('/payments/metrics') + cache + pools' bi _ hot/adhoc '.
4. gRPC Feature Store: reading 10-20 key online features (p95 ≤ 80 ms).

30-90 days

1. Stream interfaces (Kafka/Webhook) for PSP alerts/game events.
2. Arrow/Parquet uploads from the presented-URL; snapshot catalog.
3. Federation Gateway (Trino/Presto) with explicit policies.
4. Observability: dashboard SLI/SLO, alerts on cost/latency/PII.

3-6 months

1. SDK (TypeScript/Python/Go) with retrays/idempotency/quotas.
2. Thin GraphQL slices for products and partners.
3. gRPC/FS extension, offlayn↔onlayn negotiation; shadow→canary releases.
4. Privacy audit/DSAR; access compliance reports.

20) RACI

Data Platform (R): gateways, cache, quotas, federation, observability.
Data Governance (A/R): contracts, versions, privacy/k-anonymity.
Domain Owners (R): field semantics, business invariants, Data Products.
Security/DPO (A/R): AuthN/Z, geo-isolation, DSAR/Legal Hold.
SRE/Observability (C): SLO/SLI, alerts, capacity.
Analytics/BI/DS (C): requirements for forms/aggregates, SDK.

21) Related Sections

Analytical Storage Indexing, Analytical Query Optimization, Data Schemas and Evolution, Data Validation, DataOps Practices, Analytics and Metrics APIs, Feature Store, Data Security and Encryption, Access Control, Data Retention Policies.

Total

Properly designed data access interfaces turn storage and flows into a reliable "product": predictable SLAs, controlled cost, privacy compliance, and a single language for product teams, analytics, compliance, and partners. In iGaming, this means catching PSP crashes faster, understanding player behavior and meeting regulatory requirements - without manual uploads and night migrations.

Contact

Get in Touch

Reach out with any questions or support needs.We are always ready to help!

Telegram
@Gamble_GC
Start Integration

Email is required. Telegram or WhatsApp — optional.

Your Name optional
Email optional
Subject optional
Message optional
Telegram optional
@
If you include Telegram — we will reply there as well, in addition to Email.
WhatsApp optional
Format: +country code and number (e.g., +380XXXXXXXXX).

By clicking this button, you agree to data processing.