abhishek
Consistency Models, Demystified

February 2025 · ...

Consistency is one of those words in distributed systems that means different things in different contexts. The C in ACID (preserving data invariants) and the C in CAP (linearizability) are different ideas wearing the same name, and eventual consistency is a much weaker guarantee of the same kind as the CAP one. This post is about that second kind: what ordering and visibility guarantees does a distributed data store give you?

Why it matters

If you write a value to a database and immediately read it back, do you see your write? What if the write went to one replica and the read goes to a different one? The answer depends on the consistency model the system provides.

Getting this wrong is a common source of subtle bugs. A cache that reads stale data after an update, a balance that shows the wrong amount for a few seconds, an inventory count that goes negative because two processes read the same value before either wrote — all consistency bugs.
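The inventory case can be replayed deterministically. What follows is a minimal sketch, not a real database client: run_interleaved and run_serialized are hypothetical names, and the "store" is just a Python integer. It only shows why a check made on a stale read lets the count go negative.

```python
def run_interleaved(initial_stock, quantities):
    """Bad interleaving: every buyer reads before any buyer writes."""
    stock = initial_stock
    observed = [stock for _ in quantities]        # phase 1: all reads
    for seen, qty in zip(observed, quantities):   # phase 2: all writes
        if seen >= qty:                           # check passes on stale data
            stock -= qty
    return stock


def run_serialized(initial_stock, quantities):
    """Each buyer's check and decrement happen as one indivisible step."""
    stock = initial_stock
    for qty in quantities:
        if stock >= qty:                          # check sees the latest value
            stock -= qty
    return stock


# One item in stock, two buyers who each want one.
print(run_interleaved(1, [1, 1]))  # -1: both checks saw stock == 1
print(run_serialized(1, [1, 1]))   # 0: the second buyer is turned away
```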

Linearizability

The strongest model for a single object. A system is linearizable if every operation appears to take effect instantaneously at some point between its invocation and completion. Once a write completes, every subsequent read (by any client) sees that write or a later one.

In practice: if you write to a linearizable store and hand a friend the confirmation, they can immediately read the value and see it. No "eventually."

Client A: write(x, 1) ---[ok]--->
Client B:              ---read(x)---> 1   ✓ must see 1

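Linearizability is expensive. Achieving it requires coordination, usually a consensus round or a leader that serializes all writes, which is why it comes with a latency cost. Routing reads to the Raft leader is not enough on its own: the leader must first confirm it is still the leader (for example with a heartbeat round to a majority, or by putting the read through the log), because a deposed leader answering from local state can return stale data. ZooKeeper's reads are not linearizable by default (they can serve stale data from followers); you have to call sync() first.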
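To make the definition concrete, here is a brute-force checker for a single register. It is a sketch under stated assumptions: Op, is_linearizable, and the timestamps are all made up for illustration, and trying every permutation is only workable for tiny histories. A history passes if some ordering both respects real time and makes every read return the latest write.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Optional, Sequence


@dataclass(frozen=True)
class Op:
    client: str
    kind: str              # "write" or "read"
    value: Optional[int]   # value written, or value the read returned
    start: float           # invocation time
    end: float             # completion time


def legal_for_register(order: Sequence[Op], initial: Optional[int] = None) -> bool:
    """True if every read returns the most recent write in this order."""
    current = initial
    for op in order:
        if op.kind == "write":
            current = op.value
        elif op.value != current:
            return False
    return True


def respects_real_time(order: Sequence[Op]) -> bool:
    """True if no op is placed ahead of one that completed before it began."""
    for i, earlier in enumerate(order):
        for later in order[i + 1:]:
            if later.end < earlier.start:
                return False
    return True


def is_linearizable(history: Sequence[Op], initial: Optional[int] = None) -> bool:
    return any(
        respects_real_time(order) and legal_for_register(order, initial)
        for order in permutations(history)
    )


write = Op("A", "write", 1, start=0.0, end=1.0)
fresh_read = Op("B", "read", 1, start=2.0, end=3.0)
stale_read = Op("B", "read", None, start=2.0, end=3.0)
overlapping_read = Op("B", "read", None, start=0.5, end=1.5)

print(is_linearizable([write, fresh_read]))        # True
print(is_linearizable([write, stale_read]))        # False: the write had completed
print(is_linearizable([write, overlapping_read]))  # True: the ops overlap
```

The last case is worth noting: a read that overlaps the write may return either the old or the new value, because the write's "instant" can fall on either side of the read's.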

Sequential Consistency

Weaker than linearizability. All operations appear to execute in a single sequential order, and that order preserves the order in which each individual process issued its own operations (its program order), but it need not match real time. Concretely: your own operations take effect in the order you issued them, and everyone agrees on the same global order, but that global order might not reflect wall-clock time.

The key difference: in linearizability, if operation A completes before operation B starts in real time, A must appear before B in the global order. Sequential consistency drops this requirement.
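A small checker shows what dropping the real-time requirement allows. This is an illustrative sketch for a single register: interleavings and is_sequentially_consistent are hypothetical names, and each client's program is a list of (kind, value) pairs with no timestamps at all, since real time plays no role in the definition.

```python
def interleavings(programs):
    """Yield every merge of the per-client lists that keeps each list's order."""
    if all(len(ops) == 0 for ops in programs.values()):
        yield []
        return
    for client, ops in programs.items():
        if ops:
            rest = dict(programs)
            rest[client] = ops[1:]
            for tail in interleavings(rest):
                yield [ops[0]] + tail


def legal_for_register(order, initial=None):
    """True if every read returns the most recent write in this order."""
    current = initial
    for kind, value in order:
        if kind == "write":
            current = value
        elif value != current:
            return False
    return True


def is_sequentially_consistent(programs, initial=None):
    return any(legal_for_register(order, initial) for order in interleavings(programs))


# B reads the initial value even if A's write finished first in real time.
# Allowed: the global order "B reads, then A writes" keeps both program orders.
stale_but_ok = {"A": [("write", 1)], "B": [("read", None)]}

# B sees A's second write and then A's first write. No single order that
# keeps A's program order can explain that.
reordered = {"A": [("write", 1), ("write", 2)], "B": [("read", 2), ("read", 1)]}

print(is_sequentially_consistent(stale_but_ok))  # True
print(is_sequentially_consistent(reordered))     # False
```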

Causal Consistency

Operations that are causally related must appear in causal order. If you read a value and then write based on it, anyone who sees your write must also see the value you read (that is, the earlier write that produced it). Independent (causally unrelated) operations can appear in any order.

Causal consistency is, roughly, the strongest model a system can offer while staying available during network partitions, which makes it attractive for geo-distributed systems. COPS and Occult are academic systems built around causal consistency. MongoDB's causal sessions offer a version of this.
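One common way to track "causally related" is a vector clock. The sketch below is generic and is not a description of how COPS, Occult, or MongoDB track causality; the function names and the alice/bob/carol scenario are assumptions made for illustration. Clocks are plain dicts mapping a node name to a counter, with missing entries treated as zero.

```python
def leq(a, b):
    """True if clock a is less than or equal to clock b in every entry."""
    return all(count <= b.get(node, 0) for node, count in a.items())


def happened_before(a, b):
    """True if the event stamped a causally precedes the event stamped b."""
    return leq(a, b) and not leq(b, a)


def concurrent(a, b):
    """True if neither event could have influenced the other."""
    return not leq(a, b) and not leq(b, a)


def merge_and_tick(local, received, node):
    """Clock for a new event on `node` that has seen both input clocks."""
    merged = {
        n: max(local.get(n, 0), received.get(n, 0))
        for n in set(local) | set(received)
    }
    merged[node] = merged.get(node, 0) + 1
    return merged


question = merge_and_tick({}, {}, "alice")     # alice posts a question
reply = merge_and_tick({}, question, "bob")    # bob reads it, then replies
unrelated = merge_and_tick({}, {}, "carol")    # carol posts independently

print(happened_before(question, reply))  # True: show the question first
print(concurrent(reply, unrelated))      # True: either order is fine
```

A causally consistent store would hold back bob's reply on any replica that has not yet applied alice's question, while carol's post can be shown whenever it arrives.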

Eventual Consistency

The weakest useful model. If no new updates are made, all replicas will eventually converge to the same value. That's it. No guarantees about when, and no guarantees about what you see in the interim.

DNS is eventually consistent. S3 used to be for overwrites, deletes, and listings (since December 2020 it has been strongly consistent for all of them). Cassandra is tunable but defaults to eventual.

"Eventual" doesn't mean slow: replicas might converge in milliseconds. It means the system gives you no guarantee that any specific read returns the latest write.
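Convergence can be sketched with a last-writer-wins register, one simple merge rule among several. Everything here is an assumption made for illustration: the Replica class, the caller-supplied timestamps, and the all-pairs sync loop standing in for a real anti-entropy protocol. Note that last-writer-wins silently discards the losing write, which is a trade-off in its own right.

```python
def merge(a, b):
    """Keep the state with the higher (timestamp, writer) pair."""
    return max(a, b, key=lambda state: (state[0], state[1]))


class Replica:
    def __init__(self, name):
        self.name = name
        self.state = (0, "", None)   # (timestamp, writer, value)

    def write(self, value, timestamp):
        self.state = merge(self.state, (timestamp, self.name, value))

    def read(self):
        return self.state[2]

    def sync_from(self, other):
        # Merging is order-independent, so syncs can happen in any order.
        self.state = merge(self.state, other.state)


replicas = [Replica("r1"), Replica("r2"), Replica("r3")]
replicas[0].write("blue", timestamp=1)
replicas[1].write("green", timestamp=2)

before = [r.read() for r in replicas]
print(before)   # ['blue', 'green', None]: three replicas, three answers

for target in replicas:          # one round of everyone pulling from everyone
    for source in replicas:
        target.sync_from(source)

after = [r.read() for r in replicas]
print(after)    # ['green', 'green', 'green']
```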

Which to choose?

Most systems need linearizability for at least some of their data: account balances, inventory counts, or anything where a stale read has real consequences. Eventual consistency works for user profiles, recommendation systems, or any case where seeing slightly stale data is acceptable and availability matters more.

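The mistake is assuming, without measuring, that you need the weakest model for performance. Modern consensus-based systems (etcd, CockroachDB, Spanner) deliver strong consistency at latencies that would have seemed impossible a decade ago, though the exact guarantee differs between them: CockroachDB, for instance, documents serializable transactions with per-key linearizability, not linearizability across the whole database.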