Consistency
How it works
- Choose model (strong/eventual)
- Apply quorum or leader rules
- Handle conflicts (CRDTs/outbox)
- Validate with invariants
🎯 What is Consistency?
Consistency in distributed systems ensures that all nodes see the same data at the same time. It's like having a shared notebook - everyone writes on the same page, and no one can write on a different page until the first one is done.🖥️ Node 1
🖥️ Node 2
🖥️ Node 3
⬇️
Overview
- Governs what readers observe relative to recent writes across replicas.
- Strong: reads see latest committed write. Eventual: replicas converge over time.
- Tunable consistency lets callers pick (e.g., QUORUM vs ONE) per request.
When to use stronger guarantees
- Monetary transfers, inventory updates, auth/entitlements, uniqueness constraints.
- Workflows where dual-writes or replays cause significant harm.
Trade-offs
- Stronger consistency increases latency and reduces availability during partitions.
- Weaker consistency improves availability but requires idempotency and reconciliation.
Patterns
- Quorum reads/writes: W+R > N to ensure overlap.
- CRDTs/event sourcing for conflict-free merges.
- Sagas and outbox pattern for cross-service updates.
- Read your own writes via sticky routing or session tokens.
Anti-patterns
- Blind dual-writes to multiple stores without ordering/transactions.
- Assuming clock sync; prefer logical timestamps/vector clocks.
- Relying on caches without invalidation strategy.
📐 Quick Diagrams
# Quorum (N=3, W=2, R=2)
Client ▶ LB ▶ Replicas A,B,C
└─ write to any 2
└─ read from any 2
# Outbox + Poller
Service ▶ DB(outbox) ▶ Poller ▶ Broker ▶ Consumers
❓ Interview Q&A (concise)
- Q: CAP vs PACELC? A: CAP focuses on partitions; PACELC adds latency vs consistency trade-offs even without partitions.
- Q: Ensure idempotency? A: Use request IDs, upserts, de-dup tables, and at-least-once consumers with exactly-once semantics per key.
- Q: Read-after-write consistency? A: Route to primary, use sticky sessions, or quorum reads; avoid stale replicas.
📚 Consistency Models
🔒 Strong Consistency
All reads return the most recent write
✨ Characteristics:
- Immediate consistency across nodes
- High latency due to coordination
- Example: Traditional RDBMS
⏳ Eventual Consistency
System will become consistent over time
✨ Characteristics:
- Low latency, high availability
- Data may be stale for a period
- Example: DNS, Amazon DynamoDB
⚠️ Weak Consistency
No guarantees when all nodes will be consistent
✨ Characteristics:
- Best effort to be consistent
- Used in real-time systems
- Example: Video streaming, gaming
ACID Properties
- Atomicity: All or nothing execution
- Consistency: Database remains in a valid state
- Isolation: Concurrent transactions don't interfere
- Durability: Committed transactions persist
BASE Properties (Alternative to ACID)
- Basically Available: System remains available
- Soft state: State may change over time
- Eventually consistent: System becomes consistent eventually
🛠️ Achieving Consistency
🔄 Data Replication
Maintain copies of data across nodes
Synchronous: Immediate consistency, higher latency
Asynchronous: Eventual consistency, lower latency
🛠️ Distributed Transactions
Ensure all-or-nothing across services
Two-Phase Commit: Coordinator and participants agree on commit
Saga Pattern: Chained transactions with compensation actions
🔑 Consensus Algorithms
Agree on values among distributed nodes
Paxos: A family of protocols for solving consensus in a network of unreliable processors
Raft: A consensus algorithm that is easy to understand and implements a replicated log
🎯 Consistency Best Practices
🔄 Choose the right consistency model: Based on application needs
🛠️ Use distributed transactions judiciously: Be aware of the trade-offs
🔑 Implement consensus algorithms: For critical distributed data
📊 Monitor and tune: Consistency-related metrics and performance