Message queues: RabbitMQ, Kafka
1) When to choose
RabbitMQ (AMQP 0-9-1 / 1.0, classic queues, Quorum Queues, Streams)
Suitable for: RPC/commands, workflows, short tasks, fanout/topic routing, flexible confirmations, priority control.
Pros: rich routing semantics (exchanges), `basic.qos` (prefetch), per-message TTL/delays, convenient RPC (reply-to) patterns, easy to get started.
Cons: history lives in queues, so horizontal scaling goes through queues/shards; throughput becomes costly at very large volumes.
Apache Kafka (event log, partitions, consumer groups)
Suitable for: event streams, auditing, event sourcing, ETL/integrations (Connect), high RPS/MBps, replay/re-processing, stream processing (Streams/ksqlDB).
Pros: durable long-term log, partition-based scaling, reliable replay, compaction by key.
Cons: the pull + partitions model is a poor fit for small RPC calls; ordering holds only within a partition; schema management/compatibility is the team's responsibility.
2) Delivery semantics and invariants
At-most-once: no retries; fast, with risk of loss.
At-least-once: with retries; requires consumer idempotency.
Exactly-once: achievable under limited conditions (Kafka transactions + idempotent producer + consistent sink; RabbitMQ — via a deduplication table/idempotency keys).
Ordering: RabbitMQ — queue order (may be violated by retries/multiple consumers); Kafka — order within a partition, and the key determines the partitioning.
Domain invariants: money/balances via ledgers/sagas and idempotent commands; do not rely on LWW.
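At-least-once delivery means a handler may see the same message twice, so the side effect must be guarded by a dedup check. A minimal Python sketch (the `handle_credit` function, the in-memory `processed_ids` set, and the `balance` dict are illustrative stand-ins; in production the dedup store would be a database table with a TTL, keyed by message ID):

```python
# At-least-once delivery can redeliver a message; the consumer must be
# idempotent. The dedup store here is an in-memory set; in production it
# would be a processed-messages table (with TTL), keyed by message_id.

processed_ids = set()
balance = {"acct-1": 0}

def handle_credit(message_id: str, account: str, amount: int) -> bool:
    """Apply a credit at most once per message_id. Returns True if applied."""
    if message_id in processed_ids:
        return False  # duplicate delivery: skip the side effect
    balance[account] += amount
    processed_ids.add(message_id)  # record only after the side effect succeeds
    return True

# Simulate an at-least-once redelivery of the same message:
handle_credit("msg-42", "acct-1", 100)
handle_credit("msg-42", "acct-1", 100)  # duplicate, ignored
```

In a real system the balance update and the `processed_ids` insert must share one transaction, otherwise a crash between them reintroduces duplicates.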
3) Integration patterns
Outbox/Inbox: atomically record the event in the database together with the state change, then publish it to the queue (outbox); consume idempotently with a processing log (inbox).
DLQ (dead letters): after N attempts/errors, route to the DLQ and alert.
Retry/Delay: RabbitMQ — TTL + dead-letter exchange; Kafka - retry topics with backoff.
Request/Reply: RabbitMQ — `reply_to` + `correlation_id`; Kafka - rarely, only with special patterns.
Compensations: sagas over events; each operation has an inverse.
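The outbox pattern above can be sketched in a few lines of Python: the business row and the outbox event are written in one local transaction, and a separate relay publishes pending events. sqlite3 stands in for the production database, and the `published` list is a stub for the broker client; table and function names are illustrative:

```python
import sqlite3
import json

# Outbox sketch: the state change and the event are committed in ONE local
# transaction, so an event exists if and only if the state change committed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, total INTEGER)")
conn.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, "
             "payload TEXT, published INTEGER DEFAULT 0)")

def create_order(order_id: str, total: int) -> None:
    with conn:  # single atomic transaction: business row + outbox event
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)",
                     (json.dumps({"type": "OrderCreated", "id": order_id}),))

published = []  # stands in for the broker

def relay_outbox() -> int:
    """Separate relay process: publish pending events, then mark them sent."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        published.append(json.loads(payload))  # broker publish (stubbed)
        conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    conn.commit()
    return len(rows)

create_order("o-1", 500)
relay_outbox()
```

Because the relay may crash after publishing but before marking a row, the broker side still sees at-least-once delivery, which is why the inbox/idempotency half of the pattern remains necessary.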
4) Key and Topology Design
RabbitMQ
Exchanges: `direct`, `topic`, `fanout`, `headers`.
Routing key: determines which queue(s) a message lands in. For prioritization, use separate queues.
QoS: `prefetch` (e.g. 50–300) balances throughput vs latency.
Quorum Queues: Raft-replicated queues; the replacement for mirrored classic queues.
Streams: an offset-based log (Kafka-like) for high throughput/replay.
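Topic-exchange routing matches a dot-separated routing key against binding patterns where `*` matches exactly one word and `#` matches zero or more words. A small sketch of that matching rule (the `topic_match` helper is illustrative, not a RabbitMQ API):

```python
def topic_match(pattern: str, key: str) -> bool:
    """AMQP topic semantics: '*' = exactly one word, '#' = zero or more words."""
    return _match(pattern.split("."), key.split("."))

def _match(pat: list, words: list) -> bool:
    if not pat:
        return not words  # pattern exhausted: match only if key is too
    head, rest = pat[0], pat[1:]
    if head == "#":
        # '#' may consume zero, one, ... up to all remaining words
        return any(_match(rest, words[i:]) for i in range(len(words) + 1))
    if not words:
        return False
    return (head == "*" or head == words[0]) and _match(rest, words[1:])

# "task.*" matches exactly one word after "task.";
# "task.#" matches any depth under "task".
```

This is why a binding like `task.*` receives `task.email` but not `task.email.send`, while `task.#` receives both.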
Kafka
Topic → partitions: plan the partition count from target throughput and parallelism (increasing later is easier than decreasing).
Key: all records with the same key go to the same partition (per-key ordering guarantee).
Replication factor: 3 for production topics; `min.insync.replicas=2` + `acks=all` for reliability.
Retention: by time/size; compaction keeps the latest value per key, with tombstones for deletion.
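The key→partition invariant can be illustrated with a toy partitioner. The real Kafka Java client hashes keys with murmur2; `crc32` here merely demonstrates the invariant that one key always maps to one partition (the function name is illustrative):

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Same key -> same partition, hence per-key ordering.
    Kafka's default partitioner uses murmur2; crc32 here is just a
    deterministic stand-in to show the invariant."""
    return zlib.crc32(key) % num_partitions

# All records for one key land in one partition:
p1 = partition_for(b"user-42", 12)
p2 = partition_for(b"user-42", 12)

# Note: changing num_partitions changes the mapping, which is why
# adding partitions breaks per-key ordering for existing keys.
```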
5) Retries, DLQ, idempotency
RabbitMQ
Retries: per-message TTL + DLX (dead-letter exchange) with backoff (for example, 1m → 5m → 15m).
Idempotency: `correlation_id`/`message-id` plus a processed-messages table (with TTL), or deterministic commands.
Acknowledgements: manual `basic.ack` after the transaction succeeds; `basic.nack(requeue=false)` routes to the DLQ.
Kafka
Retries: dedicated retry topics; the consumer commits the offset only after the side effect succeeds.
Exactly-once processing (EOS): producer with `enable.idempotence=true`, transactional producer/consumer, `read_committed` on the consumer; synchronize the sink carefully (e.g. Kafka→Kafka, or Kafka→DB through a transaction).
Dedup: by key/idempotency key on the database side, or via a compacted topic.
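Retry topics are a naming convention, not a broker feature: on failure a message hops to the next retry topic with a larger delay, and lands in the DLQ once the ladder is exhausted. A minimal routing sketch (topic names, delays, and the `next_destination` helper are illustrative assumptions):

```python
# Retry-topic ladder with growing backoff; after the last rung, the DLQ.
RETRY_LADDER = [
    ("orders.retry.1m", 60),
    ("orders.retry.5m", 300),
    ("orders.retry.15m", 900),
]
DLQ = "orders.dlq"

def next_destination(attempt: int) -> tuple:
    """Return (topic, delay_seconds) for a message that has failed
    `attempt` times (attempt >= 1)."""
    if attempt <= len(RETRY_LADDER):
        return RETRY_LADDER[attempt - 1]
    return (DLQ, 0)  # ladder exhausted: dead-letter it and alert
```

The attempt counter itself typically travels in a message header, so each retry consumer can decide the next hop without shared state.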
6) Performance and dimension
Little's Law: `L = λ × W`.
For workers: required count `N ≈ arrival_rate × avg_processing_time × headroom (1.2–1.5)`.
RabbitMQ prefetch: start with `prefetch=100` and measure p99 latency/in-flight time.
Kafka partitions: derive the count from the desired consumer parallelism and throughput goal (a single partition typically sustains ~5–20 MB/s on SSD/10GbE).
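The worker-sizing formula above is a direct application of Little's Law; a sketch of the arithmetic (the `required_workers` helper and its numbers are illustrative):

```python
import math

def required_workers(arrival_rate: float, avg_processing_s: float,
                     headroom: float = 1.3) -> int:
    """Little's Law (L = lambda * W): the average in-flight work L must not
    exceed the worker count; headroom (1.2-1.5) absorbs bursts."""
    return math.ceil(arrival_rate * avg_processing_s * headroom)

# 200 msg/s arriving, 250 ms average handling, 1.3 headroom:
# L = 200 * 0.25 = 50 in flight -> 65 workers with headroom.
workers = required_workers(200, 0.25, 1.3)
```

The same calculation bounds sensible Kafka consumer counts: more consumers than partitions sit idle, so the partition count should be at least the worker count this formula yields.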
7) Observability and alerts
General:
- Lag/backlog (messages/bytes), message age (p95/p99), processing error rate, DLQ rate.
- End-to-end "publish→process" time.
- Dependency map: producer → broker → consumer.
RabbitMQ:
- Connections, channels, unacked messages, `memory_alarm`, `disk_free_limit`, queue length p95.
- Quorum queue health (leader, Raft log, "quorum lost" alerts).
Kafka:
- Under-replicated partitions, ISR shrink/expand, controller changes.
- Producer errors (timeouts, request latency), consumer lag per group/partition.
- Broker I/O, page cache hit rate, GC, ZooKeeper/KRaft health.
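Consumer lag, the single most-watched Kafka metric, is just the log end offset minus the committed offset per partition. A sketch of the computation (plain dicts stand in for what the admin API and the consumer group's committed offsets would return; the `consumer_lag` helper is illustrative):

```python
# Lag per partition = log end offset - committed consumer offset.
# In production, end offsets come from the broker/admin API and committed
# offsets from the consumer group; dicts stand in for both here.

def consumer_lag(end_offsets: dict, committed: dict) -> dict:
    """Per-partition lag; a partition with no committed offset counts
    from offset 0 (the whole log is unread)."""
    return {p: end - committed.get(p, 0) for p, end in end_offsets.items()}

lag = consumer_lag({0: 1000, 1: 500}, {0: 990, 1: 500})
total_lag = sum(lag.values())  # alert on total and on per-partition p99 age
```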
8) Safety and multi-tenancy
TLS in-transit encryption, authentication (SASL/PLAIN/SCRAM/OAuth, mTLS).
Authorization: vhost/permissions (RabbitMQ), ACL to topics/groups (Kafka).
Quotas: on connections, channels, queue/topic size, publish/consume rates.
Isolation by environments (dev/stage/prod) and by namespace/vhost.
9) Operation and tuning
RabbitMQ
Distribute exchanges/queues across nodes (mind CPU/IO capacity).
Lazy queues (messages spill to disk) for large buffers; avoid "hot" queues without sharding.
Quorum Queues for HA; plan Raft log size and disk space.
TTL/length-limit policies; priority queues only when genuinely needed (they are expensive).
```bash
rabbitmqctl set_policy DLX "^task\." \
  '{"dead-letter-exchange":"dlx","message-ttl":60000,"max-length":100000}' --apply-to queues
```
Kafka
SSD/NVMe, fast networks; OS tuning (swappiness low, file limits).
`acks=all`, `linger.ms` (batching), `compression.type=zstd`/`lz4` for bandwidth.
Consumer options: `max.poll.interval.ms`, `max.poll.records`, `fetch.min.bytes`.
Retention and compaction: balance storage cost against replay needs.
```java
props.put("acks", "all");
props.put("enable.idempotence", "true");
props.put("max.in.flight.requests.per.connection", "1");
props.put("retries", "10");
```
10) Integrations and ecosystem
Kafka Connect (sinks/sources), Schema Registry (Avro/JSON/Protobuf) and compatibility modes (`BACKWARD`/`FORWARD`/`FULL`).
Kafka Streams/ksqlDB: stateful operations, windows, aggregates.
RabbitMQ Shovel/Federation: message transfer between clusters/data centers.
K8s operators: Strimzi (Kafka), RabbitMQ Cluster Operator; GitOps manifests.
11) Implementation checklist (0-45 days)
0-10 days
Define use-cases: commands/tasks (RabbitMQ), events/audits (Kafka).
Select keys (`routing key`/`partition key`), set an SLO for publish→process latency.
Basic security policies (TLS, ACL), quotas, DLQ/TTL.
11-25 days
Implement outbox/inbox, idempotency and dedup.
Set up retries with backoff (Rabbit: TTL + DLX; Kafka: retry topics).
Dashboards: lag, age, DLQ-rate, end-to-end latency; alerts.
26-45 days
Tune throughput: prefetch/acks (Rabbit); partitions/acks/batching (Kafka).
DR procedures (mirroring/replication), node failure tests.
Document event contracts (schemas) and compatibility policies.
12) Anti-patterns
One "universal" tool for all tasks.
No DLQ/TTL: poison messages loop forever.
Unlimited `prefetch` → consumer starvation, p99 growth.
Kafka without keys → no per-key ordering and hot partitions by default.
"Exactly-once," with no real need/discipline, is a false sense of security.
Secrets/logins in the code, without TLS/ACL.
Hardcode of schemes/versions of messages without Registry and migrations.
13) Maturity metrics
Lag/age SLO is met ≥ 99% of the time; DLQ rate is under control.
Idempotency covers 100% of critical pathways; outbox/inbox implemented.
Retention/compaction are documented, replay does not break consumers.
Alerts on ISR/URP (Kafka) and Raft/disk limits (Rabbit) are set up.
Event contracts are versioned (Schema Registry), compatibility is tested in CI.
Regular game-days: node/broker/AZ failure, recovery check.
14) Examples of configs (summary)
RabbitMQ: prefetch and acknowledgements (pseudocode):
```python
channel.basic_qos(prefetch_count=200)
for msg in consume("tasks"):
    try:
        handle(msg)
        channel.basic_ack(msg.delivery_tag)
    except Transient:
        channel.basic_nack(msg.delivery_tag, requeue=False)  # routes to DLQ
```
Kafka consumer (sketch):
```java
props.put("enable.auto.commit", "false");
props.put("isolation.level", "read_committed"); // with EOS
// ...
// poll -> process(idempotent) -> commitSync()
```
15) Conclusion
RabbitMQ and Kafka solve different classes of problems: commands/tasks with rich routing versus a durable event log with scalable streaming. Success lies in choosing the right delivery semantics, the discipline of idempotency, thoughtful keying, retries/DLQ, observability and strict security. Build engineering practices around queues — outbox/inbox, schemas, and GitOps policies — and your integration becomes predictable, scalable, and sustainable.