
Telemetry and Event Collection

1) Purpose and principles

Objectives:
  • Single and predictable event flow for analytics, anti-fraud, RG, compliance and ML.
  • End-to-end tracing (user/session/request/trace) and reproducibility.
  • PII minimization and privacy compliance.

Principles: schema-first, privacy-by-design, idempotency-by-default, observability-by-default, cost-aware.

2) Taxonomy of events

Payment: `payment.deposit`, `payment.withdrawal`, `payment.chargeback`.
Gaming: `game.session_start/stop`, `game.bet`, `game.payout`, `bonus.applied`.
Customer: `auth.login`, `profile.update`, `kyc.status_changed`, `rg.limit_set`.
Operational: `api.request`, `error.exception`, `release.deploy`, `feature.flag_changed`.
Compliance: `aml.alert_opened`, `sanctions.screened`, `dsar.requested`.

Each type has a domain owner, a schema, and a freshness SLO.

3) Schemas and contracts

Required fields (minimum):
  • `event_time` (UTC), `event_type`, `schema_version`, `event_id` (UUID/ULID),
  • `trace_id`/`span_id`, `request_id`, `user.pseudo_id`, `session_id`,
  • `source` (client|server|provider), `market` (jurisdiction), `labels.*`.
Example (JSON):
```json
{
  "event_id": "01HFY1S93R8X",
  "event_time": "2025-11-01T18:45:12.387Z",
  "event_type": "game.bet",
  "schema_version": "1.4.0",
  "user": {"pseudo_id": "p-7a2e", "age_band": "25-34", "country": "EE"},
  "session": {"id": "s-2233", "device_id": "d-9af0"},
  "game": {"id": "G-BookOfX", "provider": "StudioA", "stake": {"value": 2.00, "currency": "EUR"}},
  "ctx": {"ip": "198.51.100.10", "trace_id": "f4c2...", "request_id": "req-7f91"},
  "labels": {"market": "EE", "affiliate": "A-77"}
}
```

Schema evolution: semantic versioning; backward-compatible changes add nullable fields; breaking changes go only into a new version (`/v2`) with a dual-write period.
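
A minimal sketch of the base envelope as a TypeScript type, mirroring the required-fields list above (the exact field shapes beyond the listed names are assumptions):

```typescript
// Base event envelope; every domain event extends it.
interface EventEnvelope {
  event_id: string;                   // UUID/ULID, generated on the client
  event_time: string;                 // ISO-8601, UTC
  event_type: string;                 // e.g. "game.bet"
  schema_version: string;             // semver of the payload schema
  trace_id: string;
  span_id?: string;
  request_id: string;
  user: { pseudo_id: string };
  session: { id: string; device_id?: string };
  source: "client" | "server" | "provider";
  labels: { market: string; [key: string]: string }; // jurisdiction plus free-form labels
}

// Illustrative domain event extending the envelope (payload shape is an assumption).
interface GameBetEvent extends EventEnvelope {
  event_type: "game.bet";
  game: { id: string; provider: string; stake: { value: number; currency: string } };
}
```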

4) Instrumentation: where and how

4.1 Client (Web/Mobile/Desktop)

Telemetry SDK with a local buffer, batch submission, exponential retries.
Auto-events: visits, clicks, block visibility, web vitals (TTFB, LCP, CLS), JS errors.
Identifiers: `device_id` (stable but privacy-safe), `session_id` (rotated), `user.pseudo_id`.
Protection against noise: dedup by `event_id`, throttling, client-side sampling.
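
A condensed sketch of the client-side buffering and retry loop under these rules; the batch shape follows the simplified spec in section 9, while the retry caps and backoff constants are assumptions:

```typescript
const MAX_BATCH = 100;   // matches the /telemetry/batch limit in section 9
const MAX_RETRIES = 5;   // assumed cap before events are kept for the next flush

const queue: object[] = [];

function track(event: object): void {
  queue.push(event);                      // dedup/throttling happens before this point
  if (queue.length >= MAX_BATCH) void flush();
}

async function flush(): Promise<void> {
  const events = queue.splice(0, MAX_BATCH);
  if (events.length === 0) return;
  const body = JSON.stringify({
    sdk: { name: "igsdk-js", version: "2.7.1" },
    sent_at: new Date().toISOString(),
    events,
  });
  for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
    try {
      const res = await fetch("/telemetry/batch", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body,
      });
      if (res.ok) return;
    } catch {
      // network error: fall through to backoff
    }
    // exponential backoff with jitter
    const delay = Math.min(30_000, 500 * 2 ** attempt) * (0.5 + Math.random());
    await new Promise((r) => setTimeout(r, delay));
  }
  queue.unshift(...events); // give up for now; keep events for the next flush
}
```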

4.2 Server/backend

Logger/tracer wrappers (OpenTelemetry) → emission of domain events.
Mandatory propagation of `trace_id` from the edge/gateway to all downstream services.
Outbox pattern for transactional publishing of domain events.
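
A simplified outbox sketch using node-postgres; the table names, columns and the relay process are assumptions, the point being that the domain write and the event insert share one transaction:

```typescript
import { Pool } from "pg";

const pool = new Pool(); // connection settings come from the environment

// Write the domain change and its event in one transaction;
// a separate relay process later publishes outbox rows to the bus.
async function placeBet(bet: object, event: { event_id: string }): Promise<void> {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    await client.query("INSERT INTO bets (payload) VALUES ($1)", [bet]);
    await client.query(
      "INSERT INTO outbox (event_id, payload) VALUES ($1, $2)",
      [event.event_id, event]
    );
    await client.query("COMMIT");
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  } finally {
    client.release();
  }
}
```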

4.3 Providers/Third Parties

Connectors (PSP/KYC/game studios) with normalization to the host schemas; versioned adapters.
Signature/payload integrity check, perimeter logging (ingest audit).

5) OpenTelemetry (OTel)

Traces: each request receives a `trace_id`; logs/events are correlated via `trace_id`/`span_id`.
Logs: use OTel Logs/converters; environment labels `service.name`, `deployment.env`.
Metrics: RPS/latency/error rate per service, business metrics (GGR, conversion).
Collector: single point of receipt/buffering/export to Kafka/HTTP/the observability stack.
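
A small sketch with the OpenTelemetry JS API showing how a domain event picks up the active `trace_id`/`span_id`; the `emit` function is a placeholder for the project's event publisher:

```typescript
import { trace } from "@opentelemetry/api";

const tracer = trace.getTracer("bet-service");

function emit(event: object): void {
  // placeholder: hand the event to the outbox / telemetry pipeline
  console.log(JSON.stringify(event));
}

function handleBet(userPseudoId: string, stake: number): void {
  tracer.startActiveSpan("game.bet", (span) => {
    try {
      const { traceId, spanId } = span.spanContext();
      emit({
        event_type: "game.bet",
        event_time: new Date().toISOString(),
        user: { pseudo_id: userPseudoId },
        game: { stake: { value: stake, currency: "EUR" } },
        ctx: { trace_id: traceId, span_id: spanId },
      });
    } finally {
      span.end();
    }
  });
}
```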

6) Identifiers and correlation

`event_id`: uniqueness and idempotence.
`user.pseudo_id`: stable pseudonymization (the mapping is kept separately with restricted access).
`session_id`, `request_id`, `trace_id`, `device_id`: required for end-to-end analysis.
ID consistency at the API gateway and SDK level.

7) Sampling and volume control

Rules: per event type, per market, dynamic (adaptive) by load.
Precisely captured events: payment/compliance/incidents are never sampled.
Analytical events: 10-50% with corrective weights in the marts is acceptable.
Server-side downsampling: acceptable for high-frequency metrics.
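
A sketch of a per-event-type sampling decision under these rules; the rates and hash-based keying are assumptions, and keying by `session_id` keeps the decision consistent within a session:

```typescript
import { createHash } from "node:crypto";

// Event types that must never be sampled out (payment/compliance/incidents).
const ALWAYS_KEEP_PREFIXES = ["payment.", "aml.", "sanctions.", "dsar.", "error."];

// Illustrative per-type rates for analytical events.
const SAMPLE_RATES: Record<string, number> = { "ui.click": 0.1, "api.request": 0.25 };

function shouldKeep(eventType: string, sessionId: string): boolean {
  if (ALWAYS_KEEP_PREFIXES.some((p) => eventType.startsWith(p))) return true;
  const rate = SAMPLE_RATES[eventType] ?? 1.0; // default: keep everything
  // Deterministic decision: the same session always lands on the same side of the cut.
  const digest = createHash("sha256").update(`${eventType}:${sessionId}`).digest();
  const bucket = digest.readUInt32BE(0) / 0xffffffff;
  return bucket < rate;
}
```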

8) Privacy and compliance

PII minimization: tokenize PAN/IBAN/email; map IP → geo codes/ASN at ingest.
Regionalization: send to regional ingest endpoints (EEA/UK/BR).
DSAR/RTBF: support selective hiding of projections; legally significant transaction log.
Retention policies: timelines by type (analytics shorter, regulatory longer); Legal Hold.
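
A sketch of ingest-time PII minimization; the keyed tokenization, the /24 truncation and the field names are assumptions, not the actual pipeline API:

```typescript
import { createHmac } from "node:crypto";

const TOKEN_KEY = process.env.PII_TOKEN_KEY ?? "dev-only-key"; // kept in a KMS in practice

// Replace a sensitive value with a keyed, non-reversible token.
function tokenize(value: string): string {
  return createHmac("sha256", TOKEN_KEY).update(value).digest("hex").slice(0, 16);
}

// Reduce an IPv4 address to its /24 network before enrichment.
function truncateIp(ip: string): string {
  const parts = ip.split(".");
  return parts.length === 4 ? `${parts[0]}.${parts[1]}.${parts[2]}.0/24` : "unknown";
}

// Applied at ingest so raw email/IP never enter the analytics payload unmasked.
function scrub(event: { user?: { email?: string }; ctx?: { ip?: string } }): void {
  if (event.user?.email) event.user.email = tokenize(event.user.email);
  if (event.ctx?.ip) event.ctx.ip = truncateIp(event.ctx.ip);
}
```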

9) Transport and buffering

Client → Edge: HTTPS (HTTP/2/3), `POST /telemetry/batch` (up to 100 events).
Edge → Bus: Kafka/Redpanda partitioned by `user.pseudo_id`/`tenant_id`.
Formats: JSON (ingest), Avro/Protobuf (in the bus), Parquet (in the lake).
Reliability: retries with jitter, DLQ, poison-pill isolation.

Batch specification (simplified):
```json
{
  "sdk": {"name": "igsdk-js", "version": "2.7.1"},
  "sent_at": "2025-11-01T18:45:12.500Z",
  "events": [ {...}, {...} ]
}
```
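
For the Edge → bus hop, a minimal kafkajs sketch keyed by `user.pseudo_id`; broker addresses and topic names are assumptions. Keying by the user's pseudo ID keeps all of that user's events in one partition, which preserves per-user ordering downstream:

```typescript
import { Kafka } from "kafkajs";

const kafka = new Kafka({ clientId: "edge-ingest", brokers: ["kafka-1:9092"] });
const producer = kafka.producer();

// Called once at service start-up.
export async function start(): Promise<void> {
  await producer.connect();
}

// Partition by user.pseudo_id for per-user ordering.
export async function publish(event: { user: { pseudo_id: string } }): Promise<void> {
  await producer.send({
    topic: "telemetry.events.v1",
    messages: [{ key: event.user.pseudo_id, value: JSON.stringify(event) }],
  });
}

// Events that fail schema validation are routed to a dead-letter topic instead.
export async function publishToDlq(raw: string, reason: string): Promise<void> {
  await producer.send({
    topic: "telemetry.events.dlq",
    messages: [{ value: JSON.stringify({ raw, reason }) }],
  });
}
```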

10) Reliability and idempotency

Client-generated `event_id` + server-side dedup by `(event_id, source)`.
Outbox on services, exactly-once semantics in streams (keyed state + dedupe).
Ordering within a key: partitioning by `user/session`.
Time control: NTP/PTP, allowed drift (for example, ≤ 200 ms), `received_at` on the server.
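
A sketch of server-side dedup over a time window, keyed by `(event_id, source)`; the in-memory map stands in for Redis or keyed stream state, and the 24 h window is an assumption:

```typescript
const DEDUP_WINDOW_MS = 24 * 60 * 60 * 1000; // assumed dedup window

// In production this map would be Redis or keyed state in the stream processor.
const seen = new Map<string, number>();

function isDuplicate(eventId: string, source: string): boolean {
  const key = `${source}:${eventId}`;
  const now = Date.now();
  const firstSeen = seen.get(key);
  if (firstSeen !== undefined && now - firstSeen < DEDUP_WINDOW_MS) return true;
  seen.set(key, now);
  return false;
}

// Usage: drop duplicates before they reach the Silver layer.
// if (isDuplicate(event.event_id, event.source)) return;
```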

11) Telemetry Quality (TQ) and SLO

Completeness: ≥ 99.5% of critical-type events per period T.
Freshness: p95 delivery delay to Silver ≤ 15 min.
Correctness: valid schemas ≥ 99.9%, drop rate < 0.1%.
Trace coverage: share of requests with `trace_id` ≥ 98%.
Cost/GB: target budget for ingest/storage by domain.

12) Observability and dashboards

Minimum widgets:
  • Ingest lag (p50/p95) by source and region.
  • Completeness by event type and market.
  • Schema validation errors / oversized payloads.
  • SDK version map and share of legacy clients.
  • Correlation of web vitals ↔ conversion/failures.

13) Client SDK Requirements

Light footprint, offline buffer, deferred initialization.
Settings: sampling, max batch size, max queue age, privacy mode (no-PII); see the configuration sketch below.
Protection: payload signing/anti-tamper, key obfuscation.
Updates: feature flags to disable noisy events.
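
An illustrative SDK configuration object matching the settings listed above; the option names and the endpoint URL are assumptions, not the actual igsdk-js API:

```typescript
interface TelemetryConfig {
  endpoint: string;         // regional ingest endpoint
  maxBatchSize: number;     // events per POST /telemetry/batch
  maxQueueAgeMs: number;    // flush even a small batch after this age
  sampleRate: number;       // 0..1, applied to analytical events only
  privacyMode: "full" | "no-pii";
  disabledEvents: string[]; // driven by feature flags to silence noisy types
}

const config: TelemetryConfig = {
  endpoint: "https://ingest.eea.example.com/telemetry/batch",
  maxBatchSize: 100,
  maxQueueAgeMs: 30_000,
  sampleRate: 0.25,
  privacyMode: "no-pii",
  disabledEvents: ["ui.hover"],
};
```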

14) Edge layer and protection

Rate limiting, WAF, schema validation, compression (gzip/br).
Token bucket per client; anti-replay (`request_id`, TTL).
IP and UA stripping → normalization/enrichment outside the raw payload.
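
A sketch of the anti-replay check and per-client token bucket at the edge; the in-memory maps stand in for Redis, and the TTL, capacity and refill rate are assumptions:

```typescript
const REPLAY_TTL_MS = 5 * 60 * 1000;   // assumed request_id validity window
const BUCKET_CAPACITY = 50;            // assumed per-client burst size
const REFILL_PER_SEC = 10;             // assumed sustained rate

const seenRequests = new Map<string, number>();
const buckets = new Map<string, { tokens: number; updated: number }>();

// Reject a request whose request_id was already seen within the TTL (anti-replay).
function isReplay(requestId: string): boolean {
  const now = Date.now();
  const prev = seenRequests.get(requestId);
  if (prev !== undefined && now - prev < REPLAY_TTL_MS) return true;
  seenRequests.set(requestId, now);
  return false;
}

// Classic token bucket per client (device_id or API key).
function allow(clientId: string): boolean {
  const now = Date.now();
  const b = buckets.get(clientId) ?? { tokens: BUCKET_CAPACITY, updated: now };
  b.tokens = Math.min(BUCKET_CAPACITY, b.tokens + ((now - b.updated) / 1000) * REFILL_PER_SEC);
  b.updated = now;
  if (b.tokens < 1) {
    buckets.set(clientId, b);
    return false;
  }
  b.tokens -= 1;
  buckets.set(clientId, b);
  return true;
}
```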

15) Integration with the data pipeline

Bronze: append-only raw payload (for forensics).
Silver: normalized tables with deduplication/enrichment.
Gold: marts for BI/AML/RG/product.
Lineage between events and reports; versioned transformations.

16) Client Quality Analytics

Share of quiet clients (no events in N hours).
"Storm" anomalies (mass duplicates/bursts).
Share of legacy SDKs by version and platform.

17) Processes and RACI

R: Data Platform (ingest/bus/validators), App Teams (SDK instrumentation).
A: Head of Data/Architecture.
C: Compliance/DPO (PII/retention), SRE (SLO/incidents).
I: BI/Marketing/Risk/Product.

18) Implementation Roadmap

MVP (2-4 weeks):

1. Event taxonomy v1 + JSON schemas for 6-8 types.

2. SDK (Web/Android/iOS) with batch and sampling; Edge `/telemetry/batch`.

3. Kafka + Bronze layer; basic validators and dedup.

4. Dashboards for ingest lag/completeness, alerts on drops/validation errors.

Phase 2 (4-8 weeks):
  • OTel Collector, trace correlation; Silver normalization and DQ rules.
  • Regional endpoints (EEA/UK), privacy mode, DSAR/RTBF procedures.
  • SDK version map, staged auto-rollout of updates by rings.
Phase 3 (8-12 weeks):
  • Exactly-once in streams, Feature Store connections, online anti-fraud feeds.
  • Rules-as-Code for schemas and validators, impact analysis.
  • Cost optimization: adaptive sampling, Z-ordering/clustering in the lake.

19) Quality checklist before release

  • Required schema fields and correct types are filled in.
  • `trace_id`/`request_id`/`session_id` are present.
  • SDK supports batch, retry, sampling.
  • Edge validates the schema and limits the payload size.
  • Privacy filters and tokenization of sensitive fields are enabled.
  • Configured SLO/alerts and dashboards.
  • Documentation for domains (example event, owner, SLA).

20) Frequent mistakes and how to avoid them

Raw events without schemas: introduce a registry and CI validation.
No idempotency: require `event_id` and maintain deduplication windows.
Mixing PII and analytics: keep mappings separate, mask fields.
No tracing: propagate `trace_id` through gateway → services → events.
Unmanaged volumes: use sampling/throttling and budget quotas.
Global endpoint without regions: use regionalization and data residency.

21) Glossary (brief)

OpenTelemetry (OTel): an open standard for traces/metrics/logs.
Outbox: transactional publishing of domain events.
DLQ: dead-letter queue for broken messages.
Sampling: keeping only a share of events to reduce volume.
Data Residency: storing data in the required jurisdiction.

22) Bottom line

Well-designed telemetry is about agreements, not just "sending logs": strict schemas, consistent identifiers, privacy by default, reliable transport, observability and cost awareness. Following this approach, you get a steady stream of events ready for analytics, compliance and machine learning, with predictable SLOs.
