
Webhooks: retries and acknowledgements

1) Basic delivery model

At-least-once (default): each event is delivered one or more times. Exactly-once effects are achieved through receiver idempotency.
Acknowledgement (ACK): only a 2xx response (usually 200/204) from the recipient means success. Anything else is treated as a failure and normally leads to a retry.
Fast ACK: respond with 2xx as soon as the event has been placed in a queue, not after full business processing.

2) Event format and required headers

Payload (example)

```json
{
  "id": "evt_01HXYZ",
  "type": "order.created",
  "occurred_at": "2025-11-03T18:10:12Z",
  "sequence": 128374,
  "source": "orders",
  "data": { "order_id": "o_123", "amount": "49.90", "currency": "EUR" },
  "schema_version": 1
}
```

Sender headers

`X-Webhook-Id: evt_01HXYZ` - unique event ID (use for deduplication).
`X-Webhook-Seq: 128374` - monotonic sequence number (per subscription/topic).
`X-Signature: sha256=<base64(hmac_sha256(body, secret))>` - HMAC signature.
`X-Retry: 0, 1, 2...` - the attempt number.
`X-Webhook-Version: 1` - contract version.
(optional) `traceparent` - trace correlation.
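
A minimal sketch of how the `X-Signature` value could be produced and checked, assuming the secret is shared as raw bytes and the signature covers the exact bytes of the request body. The function names `sign_body` and `verify_signature` are illustrative and not part of any contract.

```python
import base64
import hashlib
import hmac


def sign_body(body: bytes, secret: bytes) -> str:
    """Return the header value 'sha256=<base64 HMAC-SHA256 of body>'."""
    digest = hmac.new(secret, body, hashlib.sha256).digest()
    return "sha256=" + base64.b64encode(digest).decode("ascii")


def verify_signature(body: bytes, header_value: str, secret: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_body(body, secret)
    # Compare as bytes so a non-ASCII header value cannot raise TypeError.
    return hmac.compare_digest(expected.encode("ascii"), header_value.encode("utf-8"))
```

The receiver should verify against the raw body as received, before any JSON parsing or re-serialization, since re-encoding can change the bytes.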

Response from recipient

2xx - accepted successfully (no further retries for this `id`).
410 Gone - endpoint deleted/inactive → the sender stops retrying and deactivates the subscription.
429/5xx/timeout - the sender retries according to the retry policy.

3) Retry policy

Recommended backoff ladder (+ jitter)

`1s, 3s, 10s, 30s, 2m, 10m, 30m, 2h, 6h, 24h` (stop after the limit, for example 48-72 hours).

Rules:
  • Exponential backoff + random jitter (±20-30%) to avoid the "thundering herd" effect.
  • Classify errors: retry only transient failures (for example, 5xx or a network timeout).
  • Respect 429: never retry sooner than the `Retry-After` header asks, i.e. wait `max(Retry-After, next backoff window)`.
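
The ladder and rules above can be sketched as a small delay calculator. This is an illustration only: the function name `next_delay`, the 25% jitter default, and the interpretation that respecting `Retry-After` means never retrying sooner than it asks are assumptions.

```python
import random

# Ladder from the text, in seconds: 1s, 3s, 10s, 30s, 2m, 10m, 30m, 2h, 6h, 24h.
LADDER = [1, 3, 10, 30, 120, 600, 1800, 7200, 21600, 86400]


def next_delay(attempt, retry_after=None, jitter=0.25, rng=random):
    """Delay in seconds before retry number `attempt` (1-based).

    Returns None once the ladder is exhausted, meaning the event
    should be moved to the DLQ instead of being retried.
    """
    if attempt > len(LADDER):
        return None
    base = LADDER[attempt - 1]
    # Spread retries by +/- jitter so many subscriptions do not fire at once.
    delay = base * rng.uniform(1 - jitter, 1 + jitter)
    if retry_after is not None:
        # Assumption: a 429 with Retry-After sets a lower bound on the wait.
        delay = max(delay, retry_after)
    return delay
```

For example, `next_delay(2, retry_after=60)` returns 60, because the jittered 3-second step is shorter than what the receiver asked for.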

Timeouts and sizes

Connection timeout ≤ 3-5 seconds; total response timeout ≤ 10 seconds.

Body size is limited by the contract (for example, ≤ 256 KB); a larger body gets 413, so use chunking or a "pull URL" instead.

4) Idempotency and deduplication

Idempotent processing: handling a repeat of the same `id` must return the same result and must not change state again.
Dedup storage on the recipient's side: store `(X-Webhook-Id, processed_at, checksum)` with TTL ≥ the retry window (24-72 hours).
Composite key: with several subscriptions or topics, use `(subscription_id, event_id)`.
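
One possible shape for such a dedup store, sketched with SQLite from the Python standard library. The table name `inbox` and the helpers `open_dedup_store`, `record_once`, and `purge_expired` are hypothetical; a production receiver would more likely use its main database or a cache with native TTL support.

```python
import sqlite3
import time


def open_dedup_store(path=":memory:"):
    """Create the dedup table keyed by (subscription_id, event_id)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS inbox ("
        " subscription_id TEXT NOT NULL,"
        " event_id TEXT NOT NULL,"
        " processed_at REAL NOT NULL,"
        " checksum TEXT NOT NULL,"
        " PRIMARY KEY (subscription_id, event_id))"
    )
    return db


def record_once(db, subscription_id, event_id, checksum, now=None):
    """Return True if the event is new, False if it was already recorded."""
    now = time.time() if now is None else now
    with db:  # commits on success, rolls back on error
        cur = db.execute(
            "INSERT OR IGNORE INTO inbox VALUES (?, ?, ?, ?)",
            (subscription_id, event_id, now, checksum),
        )
    return cur.rowcount == 1


def purge_expired(db, ttl_seconds=72 * 3600, now=None):
    """Delete entries older than the TTL and return how many were removed."""
    now = time.time() if now is None else now
    with db:
        cur = db.execute(
            "DELETE FROM inbox WHERE processed_at < ?", (now - ttl_seconds,)
        )
    return cur.rowcount
```

The primary key does the deduplication: a second insert for the same pair is ignored, so the check and the write happen in one atomic statement rather than as a separate "exists, then put" pair.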

5) Order and "exactly-once effects"

It is difficult to guarantee strict order in distributed systems. Use:
  • Partition by key: events for the same logical entity (for example, one `order_id`) always go through the same delivery "channel".
  • Sequence: reject events whose `X-Webhook-Seq` is older than the last one applied, and hold events that arrive ahead of a gap in a "parking lot" until the missing ones arrive.
Exactly-once effects are achieved through:
  • log of applied operations (outbox/inbox pattern),
  • transactional upsert by `event_id` in the database,
  • sagas/compensations for complex processes.
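
The sequence and "parking lot" idea can be sketched as follows. This assumes sequence numbers within one partition are contiguous (each is the previous plus one), which the text does not guarantee; the class name `Resequencer` and its return values are illustrative.

```python
class Resequencer:
    """Applies events of one partition in sequence order, parking early arrivals."""

    def __init__(self, apply, last_applied=0):
        self.apply = apply              # callback that performs the side effect
        self.last_applied = last_applied
        self.parked = {}                # seq -> event waiting for a gap to fill

    def receive(self, seq, event):
        # Old or already parked sequence numbers are treated as duplicates.
        if seq <= self.last_applied or seq in self.parked:
            return "duplicate"
        # An event ahead of a gap waits in the parking lot.
        if seq != self.last_applied + 1:
            self.parked[seq] = event
            return "parked"
        self.apply(event)
        self.last_applied = seq
        # Drain parked events that have become contiguous.
        while self.last_applied + 1 in self.parked:
            self.last_applied += 1
            self.apply(self.parked.pop(self.last_applied))
        return "applied"
```

A real parking lot also needs a size or age limit, after which the receiver would request a replay of the missing range rather than wait indefinitely.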

6) Error resolution by status codes (Table)

| Response code | Meaning for sender | Action |
|---|---|---|
| 2xx | ACK received | Consider delivered, stop retries |
| 4xx (except 408/410/429) | Persistent error (payload/authorization) | Put in DLQ, notify the integration owner |
| 410 | Endpoint deleted/deprecated | Stop retries, deactivate subscription |
| 408/429 | Temporary overload/timeout | Retry with backoff + jitter; honor `Retry-After` |
| 5xx | Temporary server error | Retry with backoff + jitter |
| 3xx | Redirects should not be used for webhooks | Treat as a configuration error |
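
The table can be expressed as a small classifier on the sender side. This is a sketch: the function name `classify_response` and the action labels are made up for illustration, and `None` is used here to represent a timeout or network error.

```python
from typing import Optional


def classify_response(status: Optional[int]) -> str:
    """Map a receiver response to a sender action.

    `status` is the HTTP status code, or None for a timeout/network error.
    """
    if status is None:
        return "retry"
    if 200 <= status < 300:
        return "delivered"
    if status == 410:
        return "deactivate"
    if status in (408, 429) or 500 <= status < 600:
        return "retry"
    if 300 <= status < 400:
        return "config_error"
    # Everything else, in practice the remaining 4xx codes, is a persistent error.
    return "dlq"
```

For instance, `classify_response(503)` gives `"retry"`, while `classify_response(401)` gives `"dlq"`.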

7) Channel security

HMAC signature on every message; verify it at the receiver together with a "time window" check (protects against MITM and replay attacks).
mTLS for sensitive domains (LCC/payments).
IP allowlist of outgoing addresses, TLS 1.2+, HSTS.
PII minimization: do not send unnecessary personal data; mask it in logs.
Rotation of secrets: keep two valid keys (active/next) and use the `X-Key-Id` header to indicate which one signed the message.
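
A rough sketch of receiver-side verification during key rotation, assuming `X-Key-Id` names the key that signed the request and that the signature uses the same `sha256=<base64>` form as `X-Signature` elsewhere in this article. The function name `verify_with_rotation` and the plain-dict header lookup are simplifications; real frameworks expose case-insensitive header maps.

```python
import base64
import hashlib
import hmac


def verify_with_rotation(body: bytes, headers: dict, keys: dict) -> bool:
    """Verify X-Signature with the key named by X-Key-Id.

    `keys` maps key id -> secret bytes and holds both the active
    and the next key while a rotation is in progress.
    """
    secret = keys.get(headers.get("X-Key-Id", ""))
    if secret is None:
        return False  # unknown or missing key id
    digest = hmac.new(secret, body, hashlib.sha256).digest()
    expected = "sha256=" + base64.b64encode(digest).decode("ascii")
    received = headers.get("X-Signature", "")
    # Constant-time comparison on bytes.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

Because both keys stay valid during the overlap, the sender can switch to the next key at any moment without the receiver rejecting in-flight retries signed with the old one.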

8) Queues, DLQs and Replays

Events must be written to an outgoing queue/log on the sender side (for reliable replay).
If the maximum number of retries is exceeded, the event goes to the DLQ (Dead Letter Queue) together with the failure reason.
Replay API (for the recipient/operator): re-send by `id`, time range, or topic, with a rate (RPS) limit and additional signature/authorization.

Replay API Example (Sender):

POST /v1/webhooks/replay
{ "subscription_id": "sub_123", "from": "2025-11-03T00:00:00Z", "to": "2025-11-03T12:00:00Z" }
→ 202 Accepted

9) Contract and versioning

Version both the event (the `schema_version` field) and the transport (`X-Webhook-Version`).
Add new fields only as optional; when removing a field, use a minor migration and a transition period (dual-write).
Document event types, examples, schemas (JSON Schemas), error codes.

10) Observability and SLO

Sender Key Metrics:
  • `delivery_success_rate` (2xx / all attempts), `first_attempt_success_rate`
  • `retries_total`, `max_retry_age_seconds`, `dlq_count`
  • `latency_p50/p95` (occurred_at → ack_received_at)
Recipient Key Metrics:
  • `ack_latency` (receive → 2xx), `processing_latency` (enqueue → done)
  • `duplicates_total`, `invalid_signature_total`, `out_of_order_total`
SLO examples:
  • 99.9% of events receive their first ACK within 60 seconds (28-day window).
  • DLQ ≤ 0.1% of all events; DLQ replay within 24 hours.
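
As an illustration, the sender-side rates could be derived from a log of delivery attempts. The record shape `(event_id, attempt_number, status)` and the reading of `first_attempt_success_rate` as "successful first attempts / all first attempts" are assumptions, not definitions from a contract.

```python
def is_ack(status):
    """A 2xx status is an ACK; None stands for a timeout or network error."""
    return status is not None and 200 <= status < 300


def delivery_metrics(attempts):
    """Compute rates from (event_id, attempt_number, status) records.

    attempt_number starts at 0, matching the X-Retry header.
    """
    attempts = list(attempts)
    first = [a for a in attempts if a[1] == 0]
    acked = sum(1 for a in attempts if is_ack(a[2]))
    first_acked = sum(1 for a in first if is_ack(a[2]))
    return {
        "delivery_success_rate": acked / len(attempts) if attempts else 0.0,
        "first_attempt_success_rate": first_acked / len(first) if first else 0.0,
        "retries_total": sum(1 for a in attempts if a[1] > 0),
    }
```

In practice these values would be exported as counters to a metrics system rather than recomputed from a list, but the ratios are the same.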

11) Timing and network breaks

Use UTC in the time fields; synchronize NTP.
Send `occurred_at` and record `delivered_at` so the delivery lag can be computed.
During long network/endpoint outages, accumulate events in the queue and limit its growth (backpressure + quotas).
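
A small sketch of the lag computation, assuming both timestamps are RFC 3339 strings that carry an explicit UTC marker or offset, as in the payload example. The helper names `parse_utc` and `delivery_lag_seconds` are illustrative.

```python
from datetime import datetime, timezone


def parse_utc(ts: str) -> datetime:
    """Parse a timestamp such as '2025-11-03T18:10:12Z' into an aware UTC datetime."""
    # Replace the 'Z' suffix so the call also works on Python versions before 3.11.
    return datetime.fromisoformat(ts.replace("Z", "+00:00")).astimezone(timezone.utc)


def delivery_lag_seconds(occurred_at: str, delivered_at: str) -> float:
    """Seconds between the event happening and its delivery."""
    return (parse_utc(delivered_at) - parse_utc(occurred_at)).total_seconds()
```

Timestamps without any offset would be interpreted as local time by `astimezone`, which is one more reason to always send UTC explicitly.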

12) Recommended limits and hygiene

RPS per subscription (e.g. 50 RPS, burst 100) + concurrency (e.g. 10).
Max. body: 64-256 KB; for larger payloads, send a "notification + URL" with a signed download.
Event names in `snake.case` or `dot.type` style (`order.created`).
Strict idempotency of write operations of the receiver.
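
The per-subscription RPS limit with a burst allowance is commonly implemented as a token bucket. The sketch below is one such variant under that assumption; the class name `TokenBucket` and the injectable `clock` parameter are illustrative.

```python
import time


class TokenBucket:
    """Allows `rate` sends per second on average, with bursts up to `burst`."""

    def __init__(self, rate=50.0, burst=100, clock=time.monotonic):
        self.rate = rate
        self.capacity = burst
        self.clock = clock
        self.tokens = float(burst)   # start full so an initial burst is allowed
        self.updated = clock()

    def try_acquire(self):
        """Take one token if available; return False when the caller must wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A sender would keep one bucket per subscription and requeue the event with a short delay when `try_acquire()` returns False; the concurrency cap needs a separate mechanism such as a semaphore.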

13) Examples: Sender and Receiver

13.1 Sender (pseudocode)

```python
import json

# http, endpoint, secret, hmac_sha256_base64, mark_delivered,
# deactivate_subscription and schedule_retry are placeholders for your own code.
def send_event(event, attempt=0):
    body = json.dumps(event)
    sig = hmac_sha256_base64(body, secret)
    headers = {
        "X-Webhook-Id": event["id"],
        "X-Webhook-Seq": str(event["sequence"]),
        "X-Retry": str(attempt),
        "X-Signature": f"sha256={sig}",
        "Content-Type": "application/json",
    }
    res = http.post(endpoint, body, headers, timeout=10)
    if 200 <= res.status < 300:
        mark_delivered(event["id"])
    elif res.status == 410:
        deactivate_subscription()
    else:
        schedule_retry(event, attempt + 1)  # backoff + jitter, respect 429 Retry-After
```

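13.2 Receiver (pseudocode)

```python
from flask import Flask, request

app = Flask(__name__)

# verify_hmac, dedup_store, enqueue_for_processing and secret are placeholders.
@app.post("/webhooks")
def handle():
    body = request.data
    headers = request.headers
    if not verify_hmac(body, headers["X-Signature"], secret):
        return "", 401  # reject instead of assert, which would surface as a 500
    evt_id = headers["X-Webhook-Id"]
    if dedup_store.exists(evt_id):
        return "", 204  # duplicate: acknowledge without reprocessing
    enqueue_for_processing(body)  # fast path
    dedup_store.put(evt_id, ttl=72 * 3600)
    return "", 202  # or 204
```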

14) Testing and chaos practices

Negative cases: invalid signature, 429/5xx, timeout, 410, large payloads.
Behavioral: out-of-order delivery, duplicates, delays of 1-10 minutes, a 24-hour outage.
Load: 10× burst; check backpressure and DLQ persistence.
Contracts: JSON Schema, required headers, stable event types.

15) Implementation checklist

  • 2xx = ACK, and quick return after enqueue
  • Exponential backoff + jitter, respect 'Retry-After'
  • Receiver idempotency and dedup by `X-Webhook-Id` (TTL ≥ retry window)
  • HMAC signatures, secret rotation, optional mTLS
  • DLQ + Replay API, Monitoring and Alerts
  • Limits: Timeouts, RPS, Body Size
  • Order: partition by key or 'sequence' + "parking lot"
  • Documentation: schemas, examples, error codes, versions
  • Chaos tests: delays, duplicates, network failure, long replay

16) Mini-FAQ

Do I always need to answer 200?

Any 2xx counts as a success. 202/204 is normal practice for "accepted to queue."

Can retries be stopped?
Yes: respond with 410 and/or unsubscribe via the sender's console/API.

What about large payloads?
Send a "notification + secure URL," sign the download request and set a TTL on the link.

How to ensure order?
Partition by key + `sequence`; in case of discrepancy - "parking lot" and replay.

Summary

Reliable webhooks come down to clear ACK (2xx) semantics, sensible retries with backoff + jitter, strict idempotency and deduplication, solid security (HMAC/mTLS), queues + DLQ + replays, and transparent observability. Fix the contract, set limits and metrics, and run chaos scenarios regularly, and your integrations will stop falling apart at the first failure.
