Conflict Detection and Resolution
1) What is considered a conflict
A conflict is a situation where two or more change sources claim incompatible states of the same entity, resource, or invariant.
Syntactic: overlapping changes to one file/key (merge conflict in Git, patch collision in Kustomize).
Semantic: a document that is valid against the schema still violates a business invariant (debit amount ≠ credit, limit exceeded).
Operational/temporal: write/read races, duplicate events, causality violations.
Domain: competing operations on the same resource (double ticket sale, overselling stock).
The task: detect the conflict as early as possible, explain its cause, and safely choose one of the actions: auto-recovery, retry, merge, compensation, escalation.
2) Detection mechanisms
2.1 Versioning and state comparison
ETag/If-Match in REST, rowversion/xmin in DB - lost update detection.
3-way merge (base, ours, theirs) - highlighting incompatible edits.
Checksum/Hash by field/document - cheap comparison.
2.2 Time and causal labels
Lamport clock: a logical counter giving a total order (ties broken by node id) that respects causality but only roughly follows real time.
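To make the Lamport clock concrete, here is a minimal sketch. The class and method names (`LamportClock`, `tick`, `receive`) are illustrative assumptions, not taken from any particular library.

```python
from dataclasses import dataclass


@dataclass
class LamportClock:
    """Logical clock: a counter that only moves forward."""
    node_id: str
    time: int = 0

    def tick(self) -> tuple[int, str]:
        """Local event or send: advance the counter and return a stamp."""
        self.time += 1
        return (self.time, self.node_id)

    def receive(self, remote_time: int) -> tuple[int, str]:
        """On receipt, jump past both the local and the remote counter."""
        self.time = max(self.time, remote_time) + 1
        return (self.time, self.node_id)


a, b = LamportClock("a"), LamportClock("b")
sent = a.tick()                 # event on node a
local_b = b.tick()              # event on node b, concurrent with `sent`
received = b.receive(sent[0])   # causally after `sent`

# Tuples compare by counter first, then node id, which yields a total order.
ordered = sorted([received, local_b, sent])
```

Note that `sent` and `local_b` are concurrent, yet the ordering still ranks one before the other. A Lamport stamp alone cannot tell "concurrent" from "happened before".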
2.3 Invariants and constraints
Schemas and validators (JSON Schema/OpenAPI) - syntactic validity.
Invariants: uniqueness, non-negativity, balance, ACL rules.
Integrity checks: FK/UNIQUE/EXCLUDE constraints, partial indexes.
Domain assertions in code/policies (OPA/Kyverno/Conftest).
2.4 Detection in event streams
Idempotency key/dedup store (e.g. Redis/DB with TTL): rejection of duplicates.
Transactional/exactly-once streaming: transactional id, producer epoch, consumer offsets.
Sequence gap detection: gaps, repetitions (n, n + 1, n + 1).
2.5 Observability and alerting
Error/collision/retry metrics.
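The two simplest detectors above can be sketched as follows. This is an in-memory illustration only: `DedupStore` stands in for Redis or a database table with TTL, and `classify_sequence` is a hypothetical helper name. Expired keys are never purged here, which a real store would handle.

```python
import time
from typing import Optional


class DedupStore:
    """In-memory stand-in for Redis/DB: remembers keys for ttl seconds."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._seen: dict[str, float] = {}  # key -> expiry time

    def first_time(self, key: str, now: Optional[float] = None) -> bool:
        """Return True if the key is new (or expired), False for a duplicate."""
        now = time.monotonic() if now is None else now
        expiry = self._seen.get(key)
        if expiry is not None and expiry > now:
            return False
        self._seen[key] = now + self.ttl
        return True


def classify_sequence(seqs: list[int]) -> list[tuple[int, str]]:
    """Label each number as ok, duplicate, or gap relative to the last accepted one."""
    result: list[tuple[int, str]] = []
    last: Optional[int] = None
    for n in seqs:
        if last is None or n == last + 1:
            result.append((n, "ok"))
            last = n
        elif n <= last:
            result.append((n, "duplicate"))
        else:
            result.append((n, "gap"))  # something between last and n is missing
            last = n
    return result
```

The `now` parameter exists only so the TTL behaviour can be exercised deterministically; production code would rely on the store's own expiry.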
3) Resolution strategies
3.1 Fully automatic (safe by construction)
CRDT (Conflict-free Replicated Data Types): G-Counter, PN-Counter, OR-Set, LWW-Register, Map/Graph CRDT.
They guarantee convergence without coordination; the choice of loss/retention semantics matters.
Commutative operations: can be applied in any order (increments, log appends).
Idempotent handlers: repetition does not change the result (upsert by key, put-if-absent).
Optimistic merging of structures: 'deep merge + policy' with deterministic order.
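As an illustration of the simplest CRDT in the list, here is a G-Counter sketch. The class shape is an assumption made for this example; real CRDT libraries differ in API.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merge = element-wise max."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.counts: dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # max() is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    def value(self) -> int:
        return sum(self.counts.values())


a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
a.merge(b)  # merging again changes nothing
```

After the exchange both replicas report the same value without any coordination. A PN-Counter can be built from two such counters, one for increments and one for decrements.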
3.2 Semi-automatic (with policy)
3-way merge + array rules ('replace | append | uniqueBy(key) | patchBy(key)').
LWW (Last-Write-Wins): simple, but it can violate causal order and silently drop concurrent writes.
Source priorities: "interactive input > config file > defaults."
Business rules: "if the limit is exceeded - partial confirmation/compensation."
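The array rules named above can be sketched as a single function. The exact semantics of each policy are an assumption made for this example: `replace` takes theirs, `append` concatenates, `uniqueBy` keeps the first item per key, and `patchBy` shallow-merges items that share a key.

```python
from typing import Optional


def merge_arrays(ours: list, theirs: list, policy: str, key: Optional[str] = None) -> list:
    """Merge two arrays of dicts under an explicit, deterministic policy."""
    if policy == "replace":
        return list(theirs)
    if policy == "append":
        return list(ours) + list(theirs)
    if policy in ("uniqueBy", "patchBy"):
        if key is None:
            raise ValueError(f"{policy} needs a key")
        merged: dict = {}  # dicts preserve insertion order
        for item in list(ours) + list(theirs):
            k = item[key]
            if k not in merged:
                merged[k] = dict(item)       # first occurrence fixes the position
            elif policy == "patchBy":
                merged[k].update(item)       # later fields override earlier ones
        return list(merged.values())
    raise ValueError(f"unknown array policy: {policy}")
```

Raising on an unknown policy is deliberate: a silent default is exactly how configuration items get lost.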
3.3 Coordination
OCC/MVCC (optimistic concurrency control/multi-version concurrency control): version check, retry.
Pessimistic locks: 'SELECT ... FOR UPDATE', distributed locks (Redlock/DB-lock/etcd).
Consensus (Raft/Paxos): one leader decides the order; there are fewer conflicts, the price is latency.
3.4 Human-in-the-loop (HITL)
UI for manual merge/arbitration (especially content, pricing plans, catalogs).
Diff preview, explanation of the policy, buttons: "accept ours/theirs," "merge fields," "create compensation."
4) Patterns by architecture layer
4.1 API/REST/gRPC
Optimistic concurrency: 'If-Match: <etag>', 409/412 in case of conflict → the client retries using the fresh ETag.
Idempotency-Key in POST (payments/orders).
Semantic 409: Communicate the reason and proposed actions.
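A minimal sketch of server-side Idempotency-Key handling for POST follows. `PaymentAPI` and its fields are hypothetical names for illustration; a real service would persist the key-to-response map with a TTL and guard it against concurrent requests.

```python
class PaymentAPI:
    """Toy handler: the first request per key has an effect, repeats are replayed."""

    def __init__(self):
        self._responses: dict[str, dict] = {}  # idempotency key -> stored response
        self.charges: list[int] = []           # side effects actually performed

    def post_payment(self, idempotency_key: str, amount: int) -> dict:
        if idempotency_key in self._responses:
            # Retry of a request we already processed: return the same answer,
            # perform no second charge.
            return self._responses[idempotency_key]
        self.charges.append(amount)
        response = {"status": 201, "payment_id": len(self.charges), "amount": amount}
        self._responses[idempotency_key] = response
        return response
```

Storing the full response, not just the key, lets a client that lost the first reply recover the original result instead of an opaque "duplicate" error.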
4.2 Data stores
RDBMS: MVCC (snapshot isolation), unique indexes, partial indexes.
KV/Doc stores: versions/revisions (rev), compare-and-swap (CAS).
Multi-master replication: use CRDTs, or write to the leader only for critical entities.
4.3 Queues/Streaming
Exactly-once (practically - "effectively once"): transactional producer + atomic write-to-sink.
Dedup on the consumer: storing the last N ids, upsert/merge logic.
Outbox/Inbox pattern: consistent event publishing.
4.4 Configurations and IaC
3-way merge in GitOps, policy gates (OPA/Kyverno) before apply.
Kustomize/Helm: deterministic merge strategies and prohibition of "unknown keys."
Terraform: plan diff as a "drift vs. desired state" conflict signal.
5) Algorithms and examples
5.1 3-way merge (simplified)
    resolve(base, ours, theirs):
        diff1 = delta(base, ours)
        diff2 = delta(base, theirs)
        if independent(diff1, diff2): return apply(base, diff1 ⊕ diff2)
        if conflictsOnlyInArrays: return arrayPolicyMerge(...)
        else:
            return CONFLICT with hunks
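The pseudocode above can be made runnable for the simplest case, flat key/value documents. This is a sketch under that assumption: nested structures and array policies are out of scope, and `three_way_merge` is a name chosen for the example.

```python
_MISSING = object()  # marks "key absent", distinct from a stored None


def three_way_merge(base: dict, ours: dict, theirs: dict) -> tuple[dict, dict]:
    """Return (merged, conflicts); conflicts maps key -> (base, ours, theirs)."""
    merged, conflicts = {}, {}
    for k in base.keys() | ours.keys() | theirs.keys():
        b, o, t = (d.get(k, _MISSING) for d in (base, ours, theirs))
        if o == t:        # both sides agree (same edit, or both deleted)
            pick = o
        elif o == b:      # only theirs changed
            pick = t
        elif t == b:      # only ours changed
            pick = o
        else:             # both changed, and differently
            conflicts[k] = (b, o, t)
            continue
        if pick is not _MISSING:
            merged[k] = pick
    return merged, conflicts
```

A key deleted on one side and untouched on the other is dropped from the result; a key deleted on one side and edited on the other is reported as a conflict, with the `_MISSING` sentinel in the deleted position.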
5.2 OCC for REST resource
    # Client reads
    GET /accounts/42 -> ETag: "v17", body: {balance: 100}
    # Attempt to debit
    PUT /accounts/42
    If-Match: "v17"
    {balance: 50}
    # If someone else wrote first
    HTTP/1.1 412 Precondition Failed
    {error: "version_mismatch", currentEtag: "v18"}
The client re-reads, applies the delta to the current state, and retries.
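The client-side loop from the exchange above might look like the following sketch. `AccountStore` is an in-memory stand-in for the server and `VersionMismatch` stands in for the 412 response; both are assumptions for illustration, not a real HTTP client.

```python
class VersionMismatch(Exception):
    """Stands in for HTTP 412 Precondition Failed."""


class AccountStore:
    """In-memory stand-in for the resource server."""

    def __init__(self, balance: int):
        self.version, self.balance = 17, balance

    def get(self) -> tuple[str, int]:
        return f"v{self.version}", self.balance

    def put(self, if_match: str, balance: int) -> str:
        if if_match != f"v{self.version}":
            raise VersionMismatch(f"v{self.version}")
        self.version += 1
        self.balance = balance
        return f"v{self.version}"


def debit_with_retry(store: AccountStore, amount: int, max_attempts: int = 3) -> int:
    """Re-read and re-apply the delta until the conditional write succeeds."""
    for _ in range(max_attempts):
        etag, balance = store.get()
        try:
            store.put(if_match=etag, balance=balance - amount)
            return balance - amount
        except VersionMismatch:
            continue  # someone else wrote first; recompute from fresh state
    raise RuntimeError("gave up after repeated version conflicts")
```

The important detail is that the retry recomputes `balance - amount` from the fresh read. Resending the original body ({balance: 50}) would reintroduce the lost update.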
5.3 Semantic conflict (invariant)
    on Debit(accountId, amount):
        current = read(accountId)
        if current.balance - amount < 0:
            return REJECT("insufficient_funds")  # early detection
        write(accountId, version = current.version + 1, balance = current.balance - amount)
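One way to combine the invariant check with a versioned write is sketched below using Python's built-in `sqlite3`. The table layout, column names, and return strings are assumptions for this example; the CHECK constraint acts as a last-line backstop if application code misses the check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, version INTEGER NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES (42, 17, 100)")
conn.commit()


def debit(account_id: int, amount: int) -> str:
    row = conn.execute(
        "SELECT version, balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    if row is None:
        return "not_found"
    version, balance = row
    if balance - amount < 0:
        return "insufficient_funds"  # early detection, nothing is written
    cur = conn.execute(
        "UPDATE accounts SET version = version + 1, balance = ?"
        " WHERE id = ? AND version = ?",
        (balance - amount, account_id, version),
    )
    conn.commit()
    # rowcount == 0 means another writer bumped the version in between
    return "ok" if cur.rowcount == 1 else "version_mismatch"
```

The `WHERE ... AND version = ?` clause is what turns the read-check-write sequence into a compare-and-swap: a stale read can never overwrite a newer balance.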
5.4 CRDT: OR-Set (sketch)
Each add attaches a unique tag to the element; a remove targets specific tags.
The "add vs remove" conflict is resolved by having a remove delete only the add tags it has observed, so a concurrent add survives.
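The tag mechanics can be shown with a small state-based sketch. It keeps tombstones forever and uses random UUIDs as tags; both are simplifying assumptions, and the class is illustrative, not a production CRDT.

```python
import uuid


class ORSet:
    """Observed-remove set: per-element add tags plus a set of removed tags."""

    def __init__(self):
        self.adds: dict[str, set[str]] = {}   # element -> tags from add()
        self.removed: set[str] = set()        # tombstoned tags

    def add(self, element: str) -> None:
        self.adds.setdefault(element, set()).add(uuid.uuid4().hex)

    def remove(self, element: str) -> None:
        # Tombstone only the tags this replica has observed so far.
        self.removed |= self.adds.get(element, set())

    def contains(self, element: str) -> bool:
        return bool(self.adds.get(element, set()) - self.removed)

    def merge(self, other: "ORSet") -> None:
        for element, tags in other.adds.items():
            self.adds.setdefault(element, set()).update(tags)
        self.removed |= other.removed


a, b = ORSet(), ORSet()
a.add("x")
b.merge(a)
b.remove("x")   # b removes the tag it has seen
a.add("x")      # concurrent re-add on a, with a fresh tag
a.merge(b)
b.merge(a)
```

After both merges the element is present on both replicas, because the remove never observed the second tag. That is the add-wins behaviour described above.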
6) Resolution policy: how to formalize
Describe in the architecture guidelines:
1. Priority chain.
2. Strategies by data type: scalars/objects/arrays/multimedia.
3. Causal model: whether you use versions, Lamport clocks, or vector clocks.
4. Loss semantics: what can be lost in LWW, where consensus is needed.
5. Time windows: TTL for deduplication, idempotency windows.
6. Escalation: when auto-resolution is prohibited, requirements for UI/approval.
7. Compensations: SAGA "cancel/compensate" strategies for restoring invariants.
7) Metrics and SLO
conflicts_total{type} - conflict count by type.
conflicts_resolved_auto_ratio - share of conflicts resolved automatically.
mean_time_to_resolution - average time to resolution.
lost_update_incidents - incidents of lost updates.
idempotency_hit_rate - share of requests deduplicated by an idempotency key.
divergence_depth - depth of replica divergence (version vectors).
SLO example: "≥ 99% of syntactic conflicts are resolved automatically in ≤ 5 seconds, semantic conflicts in ≤ 15 minutes with HITL."
8) Practical scenarios
8.1 Payments
Key tools: Idempotency-Key, OCC on balance, SAGA for reversible steps.
Conflict: double debit → dedup + balance version check → partial compensation.
8.2 Inventory/Tickets
Options: pessimistic slot/seat blocking; optimistic reservation with expiring TTL; compare-and-reserve queue.
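The "optimistic reservation with expiring TTL" option can be sketched as follows. This is a single-process illustration with hypothetical names (`SeatReservations`, `reserve`, `confirm`); in a real system each step would have to be an atomic compare-and-set against shared storage.

```python
import time
from typing import Optional


class SeatReservations:
    """Holds expire after hold_seconds; only the current holder may confirm."""

    def __init__(self, hold_seconds: float):
        self.hold = hold_seconds
        self._holds: dict[str, tuple[str, float]] = {}  # seat -> (user, expiry)
        self._sold: dict[str, str] = {}                 # seat -> buyer

    def reserve(self, seat: str, user: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        if seat in self._sold:
            return False
        holder = self._holds.get(seat)
        if holder and holder[1] > now and holder[0] != user:
            return False  # someone else holds a live reservation
        self._holds[seat] = (user, now + self.hold)
        return True

    def confirm(self, seat: str, user: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        holder = self._holds.get(seat)
        if seat in self._sold or not holder or holder[0] != user or holder[1] <= now:
            return False  # expired, taken over, or already sold
        self._sold[seat] = user
        del self._holds[seat]
        return True
```

The TTL bounds how long an abandoned checkout can block a seat, at the cost of rejecting a slow buyer whose hold has lapsed.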
8.3 Content/catalogs
3-way merge + HITL: the editor picks the final version; automatic rules for "safe" fields (SEO tags that do not affect the price).
8.4 GitOps/Kubernetes
Render and validate before apply; reject unknown keys; no "--force" without review.
Drift detection and policy-enforced rollback.
9) Anti-patterns
LWW everywhere: simplicity at the cost of causality loss.
Hidden retries without idempotency: an avalanche of duplicates.
No explicit array policy: silent loss of configuration items.
Global mutexes over the network: SPOF and long-held locks.
"Blind" compensations without a root-cause audit: repeated conflicts.
10) Implementation checklist
- Define domain conflict types and invariants.
- Select the versioning mechanism (ETag/xmin/vector clock).
- Enable idempotency in critical POST/commands.
- Set the merge policy by data type (scalars/arrays/objects).
- Enable schema validators and pre-commit domain checks.
- Configure conflict metrics and alerts.
- For critical entities - leader/consensus, or CRDT.
- Work out HITL flow and UX (diff, comments, audit log).
- Document SLOs and compensation procedures (SAGAs).
11) FAQ
Q: When to choose CRDT and when to choose consensus?
A: CRDTs fit when eventual consistency is acceptable and high availability and local writes matter. Consensus fits data with strict invariants and a strict order of operations (cash balances, access rights).
Q: Is LWW enough?
A: For caches, metrics, and secondary indexes - often yes. For user data and money - almost never.
Q: How do I select a deduplication window?
A: Start from the maximum expected redelivery delay plus network jitter, then add a margin of 3-5 × p99.
Q: Should you always do HITL?
A: No. Reserve HITL for disputed or high-value conflicts; automate and log the rest.
12) Summary
Effective conflict detection and resolution combines versioning, causal labels, invariants, and a clear policy, backed by appropriate algorithms (CRDT/OT/OCC/MVCC/consensus) and observability. Systems that treat conflict as a "normal" situation remain available and predictable; systems that treat it as an "exception" break down at the worst possible time. Choose a model, formalize the rules, and measure the result.