GH GambleHub

Replication and eventual consistency

1) Why eventual consistency

When the system is distributed across zones/regions, synchronous writes everywhere mean high latency and low availability during network failures. Eventual consistency (EC) allows replicas to diverge temporarily in exchange for:
  • low write latency (writes are accepted locally),
  • better availability during network partitions,
  • horizontal scaling.

The key task is controlled relaxation of consistency: the user sees "reasonably fresh" data, domain invariants are preserved, and conflicts are detected and resolved predictably.


2) Consistency models - what we promise the client

Strong: reads immediately see the latest write.
Bounded staleness / read-not-older-than (RNOT): reads are no older than a given mark (LSN/version/time).
Causal: causally related writes (A happened before B) are observed in that order.
Read-Your-Writes: a client sees its own recent writes.
Monotonic Reads: successive reads never go backward in time.
Session: a bundle of these guarantees within one session.
Eventual: if no new writes arrive, all replicas converge.

Practice: combine Session + RNOT on critical paths and Eventual on storefronts/caches.


3) Replication: mechanics and anti-entropy

Synchronous (quorum/Raft): a write is acknowledged after confirmation by N nodes; minimal RPO, but higher p99 latency.
Asynchronous: the leader commits locally and ships the log later; low latency, RPO > 0.
Physical (WAL/binlog): fast, but requires homogeneous storage engines.
Logical/CDC: a row/event-level change stream; flexible routing and filtering.
Anti-entropy: periodic reconciliation and repair (Merkle trees, hash comparison, background re-sync).
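The hash-comparison flavor of anti-entropy can be sketched in a few lines (a toy illustration with in-memory dicts; real systems hash whole key ranges via Merkle trees instead of comparing every key):

```python
import hashlib

def digest(value: str) -> str:
    # Per-key digest; production systems hash key ranges (Merkle trees)
    return hashlib.sha256(value.encode()).hexdigest()

def diff_keys(primary: dict, replica: dict) -> set:
    # Keys that are missing on the replica or whose digests differ
    return {
        key for key in primary
        if key not in replica or digest(primary[key]) != digest(replica[key])
    }

def repair(primary: dict, replica: dict) -> None:
    # Background re-sync: copy only the diverged keys
    for key in diff_keys(primary, replica):
        replica[key] = primary[key]
```

Comparing digests instead of values keeps reconciliation traffic proportional to the divergence, not to the dataset size.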


4) Version identifiers and causality orders

Monotonic versions: increment/LSN/epoch; simple, but do not encode concurrency.
Lamport timestamps: a partial order via logical clocks.
Vector clocks: capture parallel branches and let you detect conflicting (concurrent) updates.
Hybrid clocks/TrueTime/Clock-SI: "not before T" logic for global ordering.

Recommendation: for CRDTs/conflicting updates, use vector clocks; for "not older than" reads, LSN/GTID.
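The "detect concurrent updates" property can be illustrated with a minimal vector-clock comparison (a hypothetical helper; clocks are modeled as dicts of node → counter):

```python
def compare(vc_a: dict, vc_b: dict) -> str:
    # Compare two vector clocks: "before", "after", "equal" or "concurrent"
    keys = set(vc_a) | set(vc_b)
    a_le_b = all(vc_a.get(k, 0) <= vc_b.get(k, 0) for k in keys)
    b_le_a = all(vc_b.get(k, 0) <= vc_a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither clock dominates → conflicting update
```

If neither clock dominates the other, the updates are concurrent and a merge strategy must decide the outcome.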


5) Conflicts: Discovery and Resolution

Typical situation: writes from two regions to the same object.

Strategies:

1. Last-Write-Wins (LWW) by wall-clock or logical timestamp: simple, but may silently lose updates.

2. Merge functions driven by domain logic:
  • counter fields are summed (G-Counter/PN-Counter),
  • sets are merged with add-wins/remove-wins semantics,
  • amounts/balances go only through transaction journals, never through plain LWW.

3. CRDTs (convergent types): G-Counter, OR-Set, LWW-Register, RGA for lists.

4. Operational transformation (rare for databases, more common in collaborative editors).

5. Manual resolution: the conflict lands in an "inbox" and the user picks the correct version.

Rule: domain invariants dictate the strategy. For money/balances, avoid LWW; use compensating transactions/events.
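The "counter fields are summed" rule maps directly onto a PN-Counter; a minimal state-based sketch (illustrative, not a production CRDT library):

```python
class PNCounter:
    # State-based PN-Counter: per-node increment and decrement maps
    def __init__(self):
        self.inc = {}
        self.dec = {}

    def add(self, node: str, n: int) -> None:
        if n >= 0:
            self.inc[node] = self.inc.get(node, 0) + n
        else:
            self.dec[node] = self.dec.get(node, 0) - n

    def value(self) -> int:
        return sum(self.inc.values()) - sum(self.dec.values())

    def merge(self, other: "PNCounter") -> None:
        # Element-wise max is commutative, associative and idempotent,
        # so replicas converge regardless of merge order
        for mine, theirs in ((self.inc, other.inc), (self.dec, other.dec)):
            for node, n in theirs.items():
                mine[node] = max(mine.get(node, 0), n)
```

Because the merge is order-independent, two regions can apply concurrent updates and still converge to the same value.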


6) Write guarantees and idempotency

Idempotency keys on commands (payment, withdraw, create) → retries are safe.
Inbox and outbox deduplicate by idempotency key/sequence number.
Exactly-once is unattainable without strong assumptions; in practice, use at-least-once + idempotency.
Outbox/Inbox pattern: the database write and the event publication are atomic (one local transaction); the recipient deduplicates by idempotency key.
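A deduplicating inbox is small enough to sketch (an in-memory toy; in production the "seen" set is a unique-key table written in the same transaction as the handler's effects):

```python
class Inbox:
    # At-least-once delivery + dedup by idempotency key = effectively-once processing
    def __init__(self, handler):
        self.handler = handler
        self.seen = set()  # production: unique-constraint table in the same DB

    def receive(self, key: str, payload: dict) -> bool:
        if key in self.seen:
            return False  # duplicate delivery → safely skipped
        self.handler(payload)
        self.seen.add(key)  # production: same transaction as the handler
        return True
```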


7) Reads no older than X (RNOT)

Techniques:
  • LSN/GTID gate: the client passes the minimum version (taken from the write response); the router/proxy sends the read to a replica that has caught up to LSN ≥ X, otherwise to the leader.
  • Time bound: "no older than 2 seconds" is a simple SLA that needs no versions.
  • Session pinning: for N seconds after a write, read only from the leader (Read-Your-Writes).
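The LSN gate reduces to a tiny routing function (a sketch with LSNs simplified to integers; Postgres-style "X/Y" values would be parsed and compared the same way):

```python
def route_read(min_lsn: int, replicas: dict, leader: str) -> str:
    # replicas: name → last replayed LSN.
    # Pick any replica that has caught up to min_lsn; otherwise use the leader.
    for name, replayed_lsn in replicas.items():
        if replayed_lsn >= min_lsn:
            return name
    return leader
```

The client obtains min_lsn from its last write response, which is exactly what gives it Read-Your-Writes on top of replicas.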

8) Change streams and cache reconciliation

CDC → event bus (Kafka/Pulsar) → consumers (caches, indexes, storefronts).
Cache invalidation: topics 'invalidate:{ns}:{id}'; idempotent processing.
Rebuild/Backfill: if a projection drifts out of sync, rebuild it from the event log.
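A consumer for the 'invalidate:{ns}:{id}' topics is naturally idempotent, because deleting an absent key is a no-op (sketch with an in-memory dict standing in for the cache):

```python
def handle_invalidate(cache: dict, topic: str) -> None:
    # Topic format assumed: "invalidate:{ns}:{id}"
    _, ns, obj_id = topic.split(":")
    cache.pop((ns, obj_id), None)  # absent key → no-op, so redelivery is safe
```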


9) Sagas and compensations (inter-service transactions)

In the EC world, long-running operations are split into steps with compensating actions:
  • Orchestration: a coordinator invokes the steps and their compensations.
  • Choreography: steps react to events and publish the next events themselves.

Invariants (example): "balance ≥ 0" is checked at step boundaries, with compensation on violation.
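A minimal orchestrator illustrates the compensation flow (a sketch: steps are (action, compensation) pairs; real sagas also persist progress so they survive coordinator crashes):

```python
def run_saga(steps) -> bool:
    # Run actions in order; on failure, run the compensations
    # of the already-completed steps in reverse order
    done = []
    for action, compensate in steps:
        try:
            action()
            done.append(compensate)
        except Exception:
            for comp in reversed(done):
                comp()
            return False
    return True
```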


10) Multi-region and network partitions

Local-write, async-replicate: write in the local region and deliver to the others (EC).
Geo-fencing: data is pinned to its region (low latency, fewer conflicts).
Quorum databases (Raft) for CP data; caches/storefronts are AP/EC.
Split-brain plan: if connectivity is lost, regions keep operating within domain limits (write fencing, quotas), then reconcile.


11) Observability and SLO

Metrics:
  • Replica lag: time/LSN distance/offset (p50/p95/p99).
  • Staleness: the percentage of responses older than the threshold (e.g., > 2 s, or behind the required LSN).
  • Conflict rate: the rate of conflicts and of successful merges.
  • Convergence time: how long replicas take to converge after a load peak.
  • Reconcile backlog: volume/age of lagging batches.
  • RPO/RTO per data category (CP/AP).
Alerts:
  • Lag > target, a rise in conflicts, long windows of unavailability.
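The staleness metric reduces to a share-within-threshold computation (a sketch; lag samples are assumed to be per-read replica lag in seconds):

```python
def staleness_slo(lag_seconds: list, threshold: float = 2.0) -> float:
    # Share of reads served within the staleness threshold
    fresh = sum(1 for lag in lag_seconds if lag <= threshold)
    return fresh / len(lag_seconds)
```

Alert when this ratio drops below the target (e.g., 0.99 on strict routes).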

12) EC Data Scheme Design

An explicit version/vector in every record (columns 'version', 'vc').
Append-only logs for critical invariants (balances, accruals).
Event identifiers (snowflake/ULID) for ordering and deduplication.
Commutative fields (counters, sets) → CRDT candidates.
API design: PUT with If-Match/ETag, PATCH with preconditions.


13) Storage and reading patterns

Read model/CQRS: write to the source of truth, read from projections (they may lag → show an "updating…" hint).
Stale-OK routes (catalog/feed) vs strict routes (wallet/limits).
Sticky/bounded-stale flags in the request (header 'x-read-consistency').


14) Implementation checklist (0-45 days)

0-10 days

Categorize data: CP-critical (money, orders) vs EC/stale-OK (catalogs, search indexes).
Define staleness SLOs (e.g., "no older than 2 s") and target lags.
Enable object versioning and idempotency keys in the API.

11-25 days

Implement CDC and outbox/inbox, plus cache invalidation routes.
Add RNOT (LSN gate) and session pinning on write-critical paths.
Implement at least one merge strategy (LWW/CRDT/domain) and a conflict log.

26-45 days

Automate anti-entropy (reconciliation/repair) and staleness reporting.
Run a game day: network partition, conflict surge, recovery.
Visualize on dashboards: lag, staleness, conflict rate, convergence.


15) Anti-patterns

Blind LWW for critical invariants (loss of money/points).
No idempotency → duplicated operations on retries.
Strong consistency everywhere → bloated p99 tails and fragility under failures.
No RNOT/Session guarantees → the UX "flickers" and users do not see their own changes.
Silent drift between cache and source (no CDC/invalidation).
No reconcile/anti-entropy tooling → data diverges indefinitely.


16) Maturity metrics

Replica lag p95 ≤ target (e.g., ≤ 500 ms within a region, ≤ 2 s cross-region).
The staleness SLO is met for ≥ 99% of requests on "strict" routes.
Conflict resolution success ≥ 99.9%; average resolution time ≤ 1 min.
Convergence time after peaks is minutes, not hours.
100% of "money" operations are covered by idempotency keys and outbox/inbox.


17) Recipes (snippets)

If-Match/ETag (HTTP)


PUT /profile/42
If-Match: "v17"
Body: { "email": "new@example.com" }

If the version has changed, the server returns '412 Precondition Failed' → the client resolves the conflict.

Query "not older than LSN" (pseudo)


x-min-lsn: 16/B373F8D8

The router chooses a replica with replay_lsn ≥ x-min-lsn; otherwise it routes to the leader.

CRDT G-Counter (idea)

Each region keeps its own counter component; the total is the sum of all components; the merge operation is commutative, so replicas converge.
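That idea fits in a few lines (a toy state-based G-Counter; a real implementation would also handle serialization and node membership):

```python
class GCounter:
    # Grow-only counter: one non-decreasing slot per region
    def __init__(self):
        self.slots = {}

    def inc(self, region: str, n: int = 1) -> None:
        self.slots[region] = self.slots.get(region, 0) + n

    def value(self) -> int:
        return sum(self.slots.values())

    def merge(self, other: "GCounter") -> None:
        # Per-slot max: commutative, associative, idempotent
        for region, n in other.slots.items():
            self.slots[region] = max(self.slots.get(region, 0), n)
```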


18) Conclusion

Eventual consistency is not a quality compromise but a deliberate contract: in some places we trade freshness for speed and availability, while protecting critical invariants with domain strategies and tooling. Introduce versions, idempotency, RNOT/Session guarantees, CDC and anti-entropy; measure lag/staleness/conflicts, and your distributed system will be fast, stable, and predictably convergent even under failures and peak loads.
