
Retry strategies and idempotency

1) Why you need it

In networks, failures are the norm: timeouts, transient errors, network flapping, overload. Retries improve reliability only if:

1. the repeated operation is safe (idempotent),

2. delays between attempts are observed,

3. limits/quotas and the health of dependencies are respected.

The goal is effectively-once behavior at the level of business operations, without duplicates or races.

2) Taxonomy of delivery semantics

At-most-once: no retries, risk of loss (logging, fire-and-forget).
At-least-once: duplicates are possible → the consumer must be idempotent (most queues, webhooks).
Effectively-once: duplicates are possible, but they are deduplicated correctly (keys, transactions, outbox).

3) When to retry and when not to

Retrying makes sense for: '408', '429' (honoring 'Retry-After'), '425' (Too Early), '499' (client closed the connection, as seen at the perimeter), '5xx' (including '502' at the gateway and '504'), network timeouts/disconnects, "connection reset."

Do not retry without changing the request: '400/401/403/404/422'.
Controversial cases: '409 Conflict' (usually not retried; first read the status of the operation and reconfirm the intent).
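
The classification above can be sketched as a small predicate. This is a minimal illustration, not a complete policy: the name `should_retry` and its signature are assumptions made for the example, and '409' is deliberately left out of the retryable set because the text recommends checking the operation status first.

```python
from typing import Optional

# Codes the text lists as worth retrying. 499 is a non-standard code that
# some proxies use for "client closed request".
RETRYABLE_4XX = {408, 425, 429, 499}
# Codes that need a changed request before another attempt makes sense.
NON_RETRYABLE = {400, 401, 403, 404, 422}


def should_retry(status: Optional[int], network_error: bool = False) -> bool:
    """Return True when repeating the same request unchanged is reasonable."""
    if network_error or status is None:
        return True  # timeout, broken connection, connection reset
    if status in NON_RETRYABLE:
        return False
    if status in RETRYABLE_4XX or 500 <= status <= 599:
        return True
    # 409 Conflict and anything unlisted: do not retry blindly.
    return False
```

A caller would still honor 'Retry-After' on '429' and apply backoff before the next attempt.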

4) Timeouts, backoff and jitter

4.1 Rules

Timeout first, then retry: each request must have a deadline.

Exponential backoff: 'delay_n = base * 2^n', capped at 'max_delay'.

Jitter is required: add randomness to break up synchronized waves of retries.
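
These three rules can be combined into one retry loop. The sketch below is illustrative only: `call_with_retries` and its parameters are hypothetical names, `op` is assumed to enforce its own per-attempt timeout, and only `TimeoutError` and `ConnectionError` are treated as retryable.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    op: Callable[[], T],
    deadline_s: float,
    base: float = 0.1,
    max_delay: float = 2.0,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Run op() until it succeeds, attempts run out, or the deadline would pass."""
    deadline = clock() + deadline_s
    attempt = 0
    while True:
        try:
            return op()
        except (TimeoutError, ConnectionError):
            attempt += 1
            # Capped exponential backoff with random jitter.
            delay = random.uniform(0, min(max_delay, base * 2 ** attempt))
            if attempt >= max_attempts or clock() + delay > deadline:
                raise  # give up: the caller sees the last error
            sleep(delay)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting.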

4.2 Jitter patterns

Full jitter: 'sleep = rand(0, base * 2^n)' - the best overall choice.
Decorrelated jitter: 'sleep = min(max_delay, rand(base, sleep_prev * 3))' - for long-running interactions.
Equal jitter: 'sleep = base * 2^n / 2 + rand(0, base * 2^n / 2)' - a milder variant.
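
The three formulas translate almost directly into code. A minimal sketch: the function names are chosen for this example, and capping the exponential term at `max_delay` in the full and equal variants is an assumption carried over from the backoff rule in 4.1.

```python
import random


def full_jitter(base: float, attempt: int, max_delay: float) -> float:
    """sleep = rand(0, min(max_delay, base * 2^n))"""
    return random.uniform(0, min(max_delay, base * 2 ** attempt))


def equal_jitter(base: float, attempt: int, max_delay: float) -> float:
    """Half of the capped delay is fixed, the other half is random."""
    half = min(max_delay, base * 2 ** attempt) / 2
    return half + random.uniform(0, half)


def decorrelated_jitter(base: float, prev_sleep: float, max_delay: float) -> float:
    """sleep = min(max_delay, rand(base, prev_sleep * 3))"""
    return min(max_delay, random.uniform(base, prev_sleep * 3))
```

For decorrelated jitter, the caller keeps the previous sleep value between attempts and starts with `prev_sleep = base`.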

4.3 Retry budget

Limit the share of retries:
  • `retry_budget_per_min = max(α * success_rps, β)`, where β is a fixed floor; usually `α = 0.1–0.2`.
  • If the budget is exhausted, switch to fail-fast or open the circuit breaker.
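
One possible reading of this budget is a sliding window in which retries may not exceed `max(α * successes, floor)`. The sketch below follows that interpretation; the class name `RetryBudget`, the window length, and counting successes per window rather than per second are all assumptions of this example.

```python
import time
from collections import deque
from typing import Callable, Deque


class RetryBudget:
    """Sliding-window budget: retries <= max(alpha * successes, floor)."""

    def __init__(self, alpha: float = 0.1, floor: int = 5,
                 window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.alpha = alpha
        self.floor = floor
        self.window = window
        self.clock = clock
        self._successes: Deque[float] = deque()
        self._retries: Deque[float] = deque()

    def _trim(self, events: Deque[float], now: float) -> None:
        # Drop events that fell out of the window.
        while events and now - events[0] > self.window:
            events.popleft()

    def record_success(self) -> None:
        self._successes.append(self.clock())

    def allow_retry(self) -> bool:
        now = self.clock()
        self._trim(self._successes, now)
        self._trim(self._retries, now)
        budget = max(self.alpha * len(self._successes), self.floor)
        if len(self._retries) >= budget:
            return False  # budget exhausted: fail fast
        self._retries.append(now)
        return True
```

When `allow_retry()` returns False, the caller fails fast instead of sleeping and trying again.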

5) Interaction with rate limiting and Circuit Breaker

Respect 'Retry-After' and 'RateLimit-Reset' and factor them into the backoff.
When the rate of '5xx' responses or timeouts is high, lower the retry frequency and overall concurrency.
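
A sketch of how a client might fold 'Retry-After' into its own backoff. 'Retry-After' may carry either a number of seconds or an HTTP date, so both forms are handled. The helper names are hypothetical, and 'RateLimit-Reset' is not covered here.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Parse Retry-After (delta-seconds or HTTP-date) into seconds to wait."""
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable header: ignore it
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def next_delay(backoff: float, retry_after: Optional[str]) -> float:
    """Wait at least as long as the server asked, never less than our backoff."""
    hinted = retry_after_seconds(retry_after) if retry_after else None
    return max(backoff, hinted) if hinted is not None else backoff
```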

Circuit breaker:
  • Closed: normal operation.
  • Open: rejects immediately (saves resources).
  • Half-open: lets a limited number of probe requests through.
  • On write operations, it is better to return 409/503 with a clear hint than to retry aggressively.
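
The three states can be sketched as a small state machine. This is a simplified, single-threaded illustration: the class name, the consecutive-failure threshold, and the single probe allowed in the half-open state are assumptions of this example, not requirements from the text.

```python
import time
from typing import Callable


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0
        self._probe_in_flight = False

    def allow_request(self) -> bool:
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_timeout:
                return False  # reject immediately
            self.state = "half_open"
            self._probe_in_flight = False
        if self.state == "half_open":
            if self._probe_in_flight:
                return False  # only one probe at a time
            self._probe_in_flight = True
        return True

    def record_success(self) -> None:
        self.state = "closed"
        self.failures = 0
        self._probe_in_flight = False

    def record_failure(self) -> None:
        if self.state != "half_open":
            self.failures += 1
            if self.failures < self.failure_threshold:
                return
        # Threshold reached, or the half-open probe failed: open the circuit.
        self.state = "open"
        self.opened_at = self.clock()
        self.failures = 0
        self._probe_in_flight = False
```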

6) Idempotency of write operations

6.1 General idea

The same intent → one result. The basis is an idempotency key plus stored execution records.

6.2 HTTP contract

The client sends the headers:

Idempotency-Key: 7a6b7f9e-2a46-4d0b-9c3a-2b30e1c3c9e3
Idempotency-Key-Expiry: 24h # optional

Server:
  • stores (key, request body hash, result: status and response) on first success;
  • on a repeat, returns the stored response with the header 'Idempotency-Replay: true';
  • on a body conflict (same key, different payload), returns '409 Conflict'.

6.3 Storage and TTL

Table or key-value store: 'idempotency_key', 'request_hash', 'result', 'status', 'expiry_at'.
TTL = the window of possible replays and late deliveries (usually 24-72 hours for payments).
Index on 'idempotency_key'; under high load, shard by hash.

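6.4 Example schema (SQL)

```sql
CREATE TABLE idempo_store (
    key        UUID        PRIMARY KEY,            -- Idempotency-Key value
    req_hash   BYTEA       NOT NULL,               -- hash of the request body
    status     INT         NOT NULL,               -- stored HTTP status
    response   JSONB       NOT NULL,               -- stored response body
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    expiry_at  TIMESTAMPTZ NOT NULL                -- TTL boundary for cleanup
);
```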

6.5 Handler pseudocode

```pseudo
handle_write(req):
    k = req.headers["Idempotency-Key"]
    h = hash(req.body)
    rec = idempo_store.get(k)

    # Same key, same payload: replay the stored response.
    if rec and rec.req_hash == h:
        return rec.status, rec.response, {"Idempotency-Replay": "true"}

    # Same key, different payload: conflict.
    if rec and rec.req_hash != h:
        return 409, problem("IDEMPOTENT_CONFLICT")

    begin tx
        result = apply_business_mutation(req)   # change state
        upsert_once(idempo_store, key=k, req_hash=h, status=201,
                    response=result, expiry=now() + 2d)
    commit

    return 201, result
```
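
For experimentation, the same flow can be written as runnable Python against an in-memory store. This is a sketch under stated assumptions: a `threading.Lock` stands in for the database transaction, the dictionary stands in for `idempo_store`, TTL handling is omitted, and `IdempotentHandler` and `Record` are names invented for the example.

```python
import hashlib
import json
import threading
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class Record:
    req_hash: str
    status: int
    response: Any


class IdempotentHandler:
    def __init__(self, mutate: Callable[[dict], Any]) -> None:
        self._mutate = mutate                  # the business mutation
        self._store: Dict[str, Record] = {}    # stand-in for idempo_store
        self._lock = threading.Lock()          # stand-in for a DB transaction

    def handle_write(self, key: str, body: dict) -> Tuple[int, Any, Dict[str, str]]:
        payload = json.dumps(body, sort_keys=True).encode()
        req_hash = hashlib.sha256(payload).hexdigest()
        with self._lock:
            rec = self._store.get(key)
            if rec is not None:
                if rec.req_hash != req_hash:
                    return 409, {"error": "IDEMPOTENT_CONFLICT"}, {}
                return rec.status, rec.response, {"Idempotency-Replay": "true"}
            # First time this key is seen: mutate and record together.
            result = self._mutate(body)
            self._store[key] = Record(req_hash, 201, result)
            return 201, result, {}
```

Holding the lock across lookup, mutation and save closes the gap in which two concurrent requests with the same key could both miss the lookup. A real store would rely on a unique-key insert inside the transaction for the same effect.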

7) "effectively-once" patterns

Transactional Outbox: the business change and the outgoing event are written in the same database transaction; a background relay then sends the message; the consumer is idempotent.
Inbox/processed table at the consumer: store 'event_id' to ignore duplicates.
Exactly-once in Kafka ≠ exactly-once in business: even with producer/consumer EOS, the application logic must still be idempotent.
Compensating transactions (Saga): if steps are retried and cause side effects, compensations return the system to its invariant.
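
The inbox pattern can be illustrated with SQLite from the Python standard library. In this sketch, the table name `processed_events` and the function names are hypothetical. The point is that the duplicate check and the side effect share one transaction, so a failed handler leaves no "processed" mark behind.

```python
import sqlite3
from typing import Callable


def make_inbox(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events (event_id TEXT PRIMARY KEY)"
    )


def process_once(conn: sqlite3.Connection, event_id: str,
                 handler: Callable[[sqlite3.Connection], None]) -> bool:
    """Apply handler at most once per event_id; return False for duplicates."""
    with conn:  # commits on success, rolls back if handler raises
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event_id,),
        )
        if cur.rowcount == 0:
            return False  # already processed: skip the side effect
        handler(conn)
    return True
```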

8) Special cases: payments and financial transactions

Strong idempotency: the key is bound to the business operation (e.g. 'external_payment_id').
Deduplication at the PSP: store 'merchant_reference' → on a repeat, the PSP returns the same result.
Client-side retries: allow only with 'Idempotency-Key'; otherwise there is a risk of a double charge.
Concurrency: locks on the account/instrument/contract for the duration of execution; on a repeat, return 409/423.
Observability: metrics 'idempo_replay_total', 'idempo_conflict_total'.

9) Webhooks and external calls

HMAC signatures and a time window; verify first, then process.
Sender retries: exponential backoff + jitter, 'max_attempts' and a DLQ.
The consumer is idempotent: 'event_id' → table/in-memory cache; strict ordering is not guaranteed.
Codes: 2xx = success, 4xx = do not retry, 5xx/timeout = retry.
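
Signature and time-window verification might look like the sketch below. The signing scheme (HMAC-SHA256 over `timestamp.body`), the 300-second tolerance, and the function names are assumptions made for the example. Real providers each define their own format.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign(secret: bytes, timestamp: int, body: bytes) -> str:
    """Hypothetical scheme: HMAC-SHA256 over b"<timestamp>.<body>"."""
    message = str(timestamp).encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, timestamp: int, body: bytes, signature: str,
           tolerance: int = 300, now: Optional[float] = None) -> bool:
    """Reject stale timestamps first, then compare signatures in constant time."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > tolerance:
        return False  # outside the time window: possible replay
    expected = sign(secret, timestamp, body)
    return hmac.compare_digest(expected, signature)
```

Only after `verify` returns True should the consumer look up 'event_id' and process the event.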

10) Queues and background tasks

At-least-once by default → duplicates are inevitable.

Store 'task_id'/'event_id' and the execution status; for duplicates, take the short "replay" path.

DLQ and poison messages: attempt counter, quarantine, manual triage.
Concurrency limits (semaphores) and idempotent workers.
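
A toy worker loop showing duplicate skipping, an attempt counter, and a dead-letter list. It is an in-memory sketch: `run_worker`, the task dictionary shape, and the `attempts` field are assumptions of this example, and a real queue would add delays between attempts.

```python
from collections import deque
from typing import Callable, Deque, List, Set, Tuple


def run_worker(queue: Deque[dict], handler: Callable[[dict], None],
               max_attempts: int = 3) -> Tuple[Set[str], List[dict]]:
    """Drain the queue; return (ids of completed tasks, dead-lettered tasks)."""
    done: Set[str] = set()
    dlq: List[dict] = []
    while queue:
        task = queue.popleft()
        if task["task_id"] in done:
            continue  # duplicate delivery: short "replay" path
        try:
            handler(task)
            done.add(task["task_id"])
        except Exception:
            task["attempts"] = task.get("attempts", 0) + 1
            if task["attempts"] >= max_attempts:
                dlq.append(task)    # poison message: quarantine it
            else:
                queue.append(task)  # re-enqueue for another attempt
    return done, dlq
```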

11) Versioning and "natural" keys

Natural keys (account number + date + document number) make operations more robust to repeats.
When the schema/version changes, include the version in the 'Idempotency-Key' or in the request hash.
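
One way to derive a stable key from a natural key plus a schema version is a name-based UUID. The namespace string, the field separator, and the function name below are hypothetical choices for illustration.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "idempotency.example.invalid")


def natural_idempotency_key(account: str, date: str, doc_number: str,
                            schema_version: int) -> str:
    """Same natural key and version → same UUID; a new version → a new key."""
    material = f"v{schema_version}|{account}|{date}|{doc_number}"
    return str(uuid.uuid5(NAMESPACE, material))
```

Because the derivation is deterministic, a client that lost its state can recompute the key and still hit the stored result.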

12) HTTP headers and hints to the client

'Idempotency-Key', 'Idempotency-Replay', 'Retry-After', 'Prefer: wait=<sec>' (for long operations), 'If-Match'/'ETag' (optimistic locking).
409 for a key conflict; 425/429/503 with a valid 'Retry-After'.
For long operations, accept asynchronously and expose the status ('202 Accepted' + 'Location' pointing to a status resource).

13) Testing and chaos scenarios

Negative tests: double submission, repeat with a different body, clock skew.
Out of order: 't2' arrives before 't1'.
Injection of timeouts/'RST'/'EOF', partial requests (slow POST).
Idempotency store down → fail-closed behavior (a failure is better than a double charge).

14) Metrics and alerts

`retries_total{reason}`, `retry_budget_used{route}`, `backoff_seconds_bucket`.
`idempo_replay_total`, `idempo_conflict_total`, `duplicate_detected_total`.
Share of 409/425/429/5xx by route; p95/p99 "time to success" including retries.
Alerts: retry budget burn rate, surge in idempotency conflicts, DLQ growth.

15) Antipatterns

Retrying every error indiscriminately.
No jitter → synchronized waves of retries.
Long-lived keys without TTL and cleanup.
Saving the result as a separate step after the side effect has committed, rather than in the same transaction (outbox violation).
Logs without 'trace_id'/'idempotency_key' → incidents are impossible to trace.
Aggressive parallel retries on write operations.

16) Production readiness checklist

  • Unified policy: what to retry and what not; codes and client hints.
  • Exponential backoff + full jitter; 'retry_budget' specified.
  • 'Idempotency-Key' contract + result storage with TTL.
  • Outbox/Inbox for events; DLQ; concurrency limits.
  • Integration with the circuit breaker; respect for 'Retry-After'.
  • Metrics/alerts for retries, duplicates and conflicts.
  • A set of chaos tests and network failure emulation.
  • Client documentation: examples of backoff and statuses.

17) TL;DR

Retries are only useful together with idempotency. Introduce 'Idempotency-Key' and result storage, apply exponential backoff with jitter and a retry budget, respect 'Retry-After', integrate with the circuit breaker. For events, use outbox/inbox; for payments, strict deduplication and locks. Measure retries and conflicts; test duplicates and timeouts.
