logo

System Design - The CAP Theorem

  • Consistent
  • Available
  • Partition tolerant: "The network will be allowed to lose arbitrarily many messages sent from one node to another", meaning one server is detached from the other servers, either it can answer request immediately(available) or confirm it gets the latest answer before replying(consistent), but not both.
    • Cause of Partitions: network failures or machine failures

Discussion

  • can have only two at once
  • to scale you have to partition(distributed system), so you are left with choosing either high consistency or high availability for a particular system. You must find the right overlap of availability and consistency.
  • any delay between nodes can be modeled as a temporary network partition and in that event you have but two choices either wait to return the latest data at a peer node (C) or return the last available data at a peer node (A).

System Examples

  • consistent and available (CA): Redis, PostgreSQL, Neo4J(they don’t distribute data)
  • consistent and partition tolerant (CP): MongoDB and HBase. In the event of a network partition, they can become unable to respond to certain types of queries (for example, in a Mongo replica set you flag slaveok to false for reads).
  • available and partition tolerant (AP): CouchDB. Even though two or more CouchDB servers can replicate data between them, CouchDB doesn’t guarantee consistency between any two servers. DNS: the newly registered domain may take some time to propagate to all DNS servers. All DNS servers are available with their current records.
  • consistent, available, and distributed(CAP): Google's Spanner, by using the TrueTime.

Real World Examples

  • For the checkout process you always want to honor requests to add items to a shopping cart because it's revenue producing. In this case you choose high availability. Errors are hidden from the customer and sorted out later.
  • When a customer submits an order you favor consistency because several services--credit card processing, shipping and handling, reporting--are simultaneously accessing the data.