
Message queues: Kafka and RabbitMQ


(Section: Technology and Infrastructure)

Brief Summary

Message queues are the foundation of event-driven architecture (EDA) in iGaming. They link the microservices for bets, payments, anti-fraud, CRM, notifications, and analytics. In practice, two classes of solutions dominate:
  • Apache Kafka is a distributed event log focused on streaming, replication, and horizontal scaling through partitions.
  • RabbitMQ is an AMQP broker with flexible routing (exchanges/bindings), priorities, TTL, acknowledgements, and classic task queues.

Both tools are mature but solve different problems: Kafka - for scalable streams and analytics; RabbitMQ - for operational task orchestration, RPC, and varied routing.

Where each is appropriate in iGaming

Kafka - choose when:
  • You need high-TPS event streams (bets, game events, telemetry) and horizontal scaling through partitions.
  • Cold/hot re-consumption (re-reading the log), retention, and compaction for aggregates (balance, player state) matter.
  • You need stream processing (Kafka Streams/ksqlDB/Flink) for real-time aggregates: tournament leaderboards, responsible-gaming limits, anti-fraud signals.
RabbitMQ - choose when:
  • You need classic task queues: KYC checks, deferred/repeated payments, e-mail/SMS/push sending, webhooks to PSPs.
  • Flexible routing (topic/direct/fanout), priorities, TTL, delayed delivery, dead-lettering, and RPC patterns.
  • Strict per-consumer limits (prefetch/QoS), simple load management, and fast retries are required.

A frequent outcome: Kafka for events and analytics + RabbitMQ for orchestration and integrations.

Data model and routing

Kafka

Topics are divided into partitions; each partition is an ordered log.
The message key determines the partition → ordering within a key.
Consumers read by offset; consumer groups scale processing.
Retention by time/volume; log compaction keeps the latest value per key.
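The key-to-partition mapping above can be sketched in a few lines. This is for intuition only: the real Java client hashes keys with murmur2, while the sketch below substitutes a stdlib hash.

```python
# Illustrative Kafka-style key -> partition assignment (the actual client
# uses murmur2; md5 here is a stand-in with the same deterministic property).
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    """Deterministically map a message key to a partition index."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# All events for one player land in the same partition, so they stay ordered.
assert partition_for("player-42", 24) == partition_for("player-42", 24)
```

Because the mapping is a pure function of the key, every event carrying the same `player_id` preserves its order relative to other events with that key.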

RabbitMQ

Exchanges (direct/fanout/topic/headers) + bindings → messages are routed into queues.
Acknowledgements (ack/nack/reject), publisher confirms, priorities, TTL, dead-lettering (DLX/DLQ).
Quorum queues (Raft) for high availability; lazy queues to save RAM.
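Topic-exchange routing is the least obvious of the exchange types, so here is a pure-Python re-implementation of its matching rules for intuition (`*` matches exactly one dot-separated word, `#` matches zero or more); this is a sketch, not broker code.

```python
# Illustrative AMQP topic-exchange matching:
# '*' matches exactly one word, '#' matches zero or more words.
def topic_matches(binding: str, routing_key: str) -> bool:
    def match(b, r):
        if not b:                       # binding exhausted: key must be too
            return not r
        if b[0] == "#":                 # '#' absorbs any number of words
            return any(match(b[1:], r[i:]) for i in range(len(r) + 1))
        if r and (b[0] == "*" or b[0] == r[0]):
            return match(b[1:], r[1:])  # consume one word
        return False
    return match(binding.split("."), routing_key.split("."))

assert topic_matches("payments.*.created", "payments.deposit.created")
assert topic_matches("risk.#", "risk.fraud.signal.v1")
assert not topic_matches("payments.*", "payments.deposit.created")
```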

Delivery guarantees and idempotency

At-most-once: no retries; risk of loss, minimum latency.
At-least-once: the default standard → duplicates are possible → idempotent handlers (request/transaction keys, upserts, a dedup table, outbox).
Exactly-once: in Kafka it is achievable with an idempotent producer + transactions + read-committed consumers, but it is usually more expensive and more complex; in RabbitMQ it is limited and requires workarounds. In real payment/bet flows, at-least-once + strict idempotency is the norm.

Idempotency practice:
  • Unique idempotency keys (UUID/ULID) per event/command.
  • Outbox pattern in the service database + Change Data Capture (Debezium) → prevents double writes.
  • Dedup by (key, created_at) in a separate table with TTL.
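The dedup-table idea above can be sketched as follows. Assumptions: table and function names are illustrative, and SQLite stands in for the service database; the point is that the dedup insert and the state change commit in one transaction.

```python
# Sketch of an idempotent handler: a unique idempotency key makes
# at-least-once redeliveries harmless. Names are illustrative.
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE processed (idempotency_key TEXT PRIMARY KEY)")
db.execute("CREATE TABLE balances (player_id TEXT PRIMARY KEY, amount INTEGER)")

def credit_deposit(player_id: str, amount: int, idempotency_key: str) -> bool:
    """Apply the deposit once; return False for a duplicate delivery."""
    try:
        with db:  # one transaction: dedup row + balance change commit together
            db.execute("INSERT INTO processed VALUES (?)", (idempotency_key,))
            db.execute(
                "INSERT INTO balances VALUES (?, ?) "
                "ON CONFLICT(player_id) DO UPDATE SET amount = amount + ?",
                (player_id, amount, amount),
            )
        return True
    except sqlite3.IntegrityError:  # key already seen -> duplicate, skip
        return False

key = str(uuid.uuid4())
assert credit_deposit("p1", 100, key) is True
assert credit_deposit("p1", 100, key) is False  # redelivery is a no-op
```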

Message Ordering

Kafka guarantees order within a partition. Choose the key so that the whole "life" of an entity (for example, `player_id` for balance) lands under one key.
RabbitMQ does not strictly guarantee order with redeliveries/multiple consumers; pipelines where order is critical are better served by Kafka, or by a single active consumer with stream serialization.

Topic and Queue Design

Kafka:
  • Granularity: `domain.event` (for example, `payments.deposit.created`).
  • Keys: `player_id`, `account_id`, `bet_id` for ordering.
  • Partitions = N by target TPS (rule of thumb: 1 partition ≈ X messages/sec/consumer); leave headroom for growth.
  • Retention: events - hours/days; compaction - for "states."
RabbitMQ:
  • Exchanges by domain: `payments.direct`, `risk.topic`.
  • Queues per consumer: `kyc.checker.q`, `psp.webhooks.retry.q`.
  • A DLQ and delay queues per work queue for backoff.
  • Prefetch sets concurrency; quorum queues for HA.
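Applying the Kafka sizing guidance above, topic creation might look like this (broker address and topic names are assumptions for illustration):

```shell
# Event topic: 24 partitions, RF=3, 7-day retention (604800000 ms).
kafka-topics.sh --bootstrap-server kafka:9092 --create \
  --topic bets.placed.v1 --partitions 24 --replication-factor 3 \
  --config retention.ms=604800000

# Compacted "state" topic keeping only the latest value per key.
kafka-topics.sh --bootstrap-server kafka:9092 --create \
  --topic player.balance.v1 --partitions 24 --replication-factor 3 \
  --config cleanup.policy=compact
```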

Errors, Retries, and DLQs

Classify errors: transient (network/PSP 5xx) → retries; fatal (validation, schema) → straight to the DLQ.
Exponential backoff + jitter, a retry limit, poison-pill detection.
Separate retry queues by step (5s, 1m, 5m, 1h).
DLQ handling: alerting, tracing, manual triage, re-injection after a fix.
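The backoff and tiered-retry logic above can be sketched as follows; queue names and the retry budget are illustrative assumptions.

```python
# Exponential backoff with full jitter, mapped to tiered retry queues.
import random
from typing import Optional

RETRY_QUEUES = ["retry.5s.q", "retry.1m.q", "retry.5m.q", "retry.1h.q"]

def backoff_delays(base: float = 5.0, cap: float = 3600.0, attempts: int = 5):
    """Yield randomized delays: full jitter over min(cap, base * 2^n)."""
    for n in range(attempts):
        yield random.uniform(0, min(cap, base * 2 ** n))

def retry_queue(attempt: int) -> Optional[str]:
    """Pick the retry tier for this attempt; None means dead-letter it."""
    return RETRY_QUEUES[attempt] if attempt < len(RETRY_QUEUES) else None

assert retry_queue(0) == "retry.5s.q"
assert retry_queue(4) is None  # retry budget exhausted -> DLQ
```

Full jitter (a uniform draw over the whole window rather than a fixed delay) spreads retries out and avoids synchronized "thundering herd" bursts against a recovering PSP.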

Data Contracts and Schemas

Use Avro/Protobuf + a Schema Registry (the de facto standard for Kafka).
Versioning: backward-compatible changes (adding optional fields); breaking migrations are prohibited.
PII fields - encryption/tokenization; comply with GDPR and local regulations.

Monitoring, observability and SLO

Producer/consumer metrics: lag, throughput, errors, retries, processing time.
Logs + tracing (correlation IDs: `trace_id`, `message_id`).
SLOs: p99 latency of publication/delivery, permissible consumer lag, recovery time after failures.
Alerts on DLQ growth, lag thresholds, partition/quorum loss.
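Consumer lag, the central metric above, is simply the log-end offset minus the committed offset, summed over partitions. A minimal sketch with illustrative numbers (in production the offsets come from the broker and the consumer group's commits):

```python
# Consumer lag per partition = log-end offset minus committed offset.
def total_lag(end_offsets: dict, committed: dict) -> int:
    """Sum of per-partition lag across all partitions of a topic."""
    return sum(end_offsets[p] - committed.get(p, 0) for p in end_offsets)

end = {0: 1500, 1: 1480, 2: 1510}   # broker log-end offsets
done = {0: 1500, 1: 1200, 2: 1505}  # committed offsets of the group
lag = total_lag(end, done)
assert lag == 285                   # 0 + 280 + 5
LAG_SLO = 1000
assert lag < LAG_SLO                # within the permissible lag budget
```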

Security and compliance

TLS in transit, secret encryption (SOPS/Vault), limited ACL/RBAC.
Separate topics/queues for sensitive domains (payments, KYC).
Audit log of publications/subscriptions, storage of keys outside the code.
Regional requirements (EU/Turkey/LatAm): retention, storage localization, masking.

High availability, fault tolerance and DR

Kafka:
  • A cluster of at least 3-5 brokers; `replication.factor` ≥ 3.
  • `min.insync.replicas` and `acks=all` for durable writes.
  • Cross-region replication (MirrorMaker 2) for DR.
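Conceptually, the durability settings above combine like this (values are the common illustrative choice for a 3-broker cluster, not a universal prescription):

```properties
# Topic level:
replication.factor=3
min.insync.replicas=2      # a write survives the loss of one broker
# Producer level:
acks=all                   # wait for all in-sync replicas
enable.idempotence=true    # no duplicates from producer retries
```

With RF=3 and `min.insync.replicas=2`, an `acks=all` write succeeds only when at least two replicas have it, so one broker can fail without data loss or unavailability.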
RabbitMQ:
  • Quorum queues for HA; an odd number of nodes for quorum.
  • Federation/Shovel for inter-datacenter replication, DR scenarios.
  • Cold/warm standby, failover tests.

Performance and tuning

Kafka (producer):
  • `linger.ms` and `batch.size` for batching; `compression.type` (lz4/zstd).
  • `acks=all`, but watch latency; tune `max.in.flight.requests.per.connection` with idempotence enabled.
Kafka (broker/topic):
  • Enough partitions; NVMe drives, a 10/25G network; JVM GC settings.
Kafka (consumer):
  • Correct group management, `max.poll.interval.ms`, pause partitions during backoff.
RabbitMQ (producer):
  • Publisher confirms in batches; reuse channels.
RabbitMQ (queues/consumers):
  • `prefetch` (e.g. 50-300) tuned to processing time; lazy queues for large backlogs.
  • Pin hot queues to specific nodes; tune TCP/file descriptors.

Typical patterns for iGaming

Outbox + Kafka for reliable publication of domain events (bet placed, deposit credited).
RabbitMQ RPC for synchronous requests to integrations (KYC document checks, rebate calculation).
Saga pattern: orchestration through events (Kafka) and commands (RabbitMQ) with compensating steps.
Fan-out notifications: one event → CRM, anti-fraud, analytics.
Smart retries for PSP webhooks with progressive delays and a DLQ.
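The outbox pattern above can be sketched as follows. Assumptions: table, topic, and function names are illustrative, SQLite stands in for the service database, and the relay step (CDC/poller publishing `outbox` rows to Kafka) is out of scope.

```python
# Outbox sketch: the domain change and the outgoing event commit in ONE
# local transaction, so an event exists iff the bet exists.
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE bets (bet_id TEXT PRIMARY KEY, stake INTEGER)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, topic TEXT, payload TEXT)")

def place_bet(bet_id: str, stake: int) -> None:
    with db:  # atomic: either both rows are written or neither is
        db.execute("INSERT INTO bets VALUES (?, ?)", (bet_id, stake))
        db.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("bets.placed.v1", json.dumps({"bet_id": bet_id, "stake": stake})),
        )

place_bet("b-1", 500)
pending = db.execute("SELECT topic, payload FROM outbox").fetchall()
assert pending == [("bets.placed.v1", '{"bet_id": "b-1", "stake": 500}')]
```

Because publishing to Kafka happens from the committed `outbox` rows rather than inside the request path, a crash between "write DB" and "publish event" can no longer lose or orphan an event.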

Migration and hybrid architectures

Start with RabbitMQ for operational tasks, then add Kafka for events and analytics.
Duplicate publications: service → outbox → connectors in both directions (Kafka + RabbitMQ) until the migration stabilizes.
Gradually migrate analytics/stream-aggregation subscribers to Kafka Streams/ksqlDB.

Mini Selection Checklist

1. Load/TPS > tens of thousands/sec? → Kafka.
2. Need retention and log-style re-reading? → Kafka.
3. Flexible routing, priorities, delayed delivery, RPC? → RabbitMQ.
4. Strict per-key ordering and horizontal scale? → Kafka (keys/partitions).
5. Simple task/work queues with concurrency control? → RabbitMQ.
6. Ideally, a combination: Kafka (events) + RabbitMQ (orchestration).

Examples of minimum configurations

Example: delayed retries and a DLQ in RabbitMQ (via policies)

Work queue: `psp.webhooks.q`

Retry queue: `psp.webhooks.retry.1m.q` (TTL = 60s, DLX points back to the work exchange)

DLQ: `psp.webhooks.dlq`

Policies (conceptually):
  • `psp.webhooks.q` → `x-dead-letter-exchange=psp.retry.exchange`
  • `psp.webhooks.retry.1m.q` → `x-message-ttl=60000`, `x-dead-letter-exchange=psp.work.exchange`
  • `psp.webhooks.dlq` → monitoring and manual debugging.
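One possible realization of these policies via `rabbitmqctl` (exchange and queue names follow the example above; vhost and policy names are assumptions):

```shell
# Route rejected/expired messages from the work queue to the retry exchange.
rabbitmqctl set_policy psp-webhooks-dlx "^psp\.webhooks\.q$" \
  '{"dead-letter-exchange":"psp.retry.exchange"}' --apply-to queues

# After 60s in the retry queue, dead-letter back to the work exchange.
rabbitmqctl set_policy psp-webhooks-retry "^psp\.webhooks\.retry\.1m\.q$" \
  '{"message-ttl":60000,"dead-letter-exchange":"psp.work.exchange"}' \
  --apply-to queues
```

Using policies instead of `x-` queue arguments lets you change TTLs and DLX targets without redeclaring queues.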

Example: a Kafka topic for bets

Topic: `bets.placed.v1`, partitions: 24, RF = 3, retention 7 days.
Message key: `player_id` or `bet_id` (choose whichever matters more for ordering).
Schema: Protobuf/Avro with `bet_id`, `player_id`, `stake`, `odds`, `ts`, `idempotency_key`.
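One possible Avro rendering of those fields (namespace and exact types are illustrative assumptions):

```json
{
  "type": "record",
  "name": "BetPlaced",
  "namespace": "igaming.bets",
  "fields": [
    {"name": "bet_id", "type": "string"},
    {"name": "player_id", "type": "string"},
    {"name": "stake", "type": "long"},
    {"name": "odds", "type": "double"},
    {"name": "ts", "type": {"type": "long", "logicalType": "timestamp-millis"}},
    {"name": "idempotency_key", "type": "string"}
  ]
}
```

Registering this in the Schema Registry lets later versions add optional fields (with defaults) without breaking existing consumers.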

Testing and quality

Contract tests for producers/consumers + Schema Registry.
Chaos tests: node failures, network delays, split-brain.
Load runs at target TPS; verify p99, lag growth, and recovery.

Summary

Kafka - the event highway and streaming: per-key ordering, retention/compaction, high TPS, real-time analytics.
RabbitMQ - the operational task queue: flexible routing, acknowledgements, priorities, retries/DLQ, RPC.
In iGaming, best practice is complementary use: events and analytics in Kafka, integration/orchestration tasks in RabbitMQ, with uniform schema standards, idempotency, monitoring, and strict SLOs.
