Service Discovery, Configuration
- Provides service discovery and a service registry to enable inter-service communication
- System-wide configurations can be quickly rolled out and kept in sync
Client-side Discovery
The client queries a service registry, then uses a load-balancing algorithm to select one of the available service instances and makes the request directly (see the sketch after the examples below).
Examples:
- Netflix Eureka (service registry) + Netflix Ribbon (IPC client that works with Eureka to load balance requests across the available service instances)
- Amazon Cloud Map
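A minimal sketch of the flow in Python, assuming a hypothetical lookup_instances() registry query (standing in for a Eureka/Consul/etcd client) and a random pick as the load-balancing step:

```python
import random
import urllib.request

def lookup_instances(service_name):
    """Stub for a registry query; in practice this would call Eureka, Consul, or etcd."""
    return [{"host": "10.0.0.5", "port": 8080},
            {"host": "10.0.0.6", "port": 8080}]

def call_service(service_name, path):
    instances = lookup_instances(service_name)   # 1. ask the registry for live instances
    instance = random.choice(instances)          # 2. client-side load balancing (random pick here)
    url = f"http://{instance['host']}:{instance['port']}{path}"
    with urllib.request.urlopen(url) as resp:    # 3. call the chosen instance directly
        return resp.read()
```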
Server-side Discovery
The client makes a request to a service via a load balancer. The load balancer queries the service registry and routes each request to an available service instance (a minimal contrast sketch follows the pros/cons below).
Examples:
- AWS Elastic Load Balancer (ELB)
- Kubernetes runs a proxy on each host in the cluster. The proxy plays the role of a server‑side discovery load balancer.
Pros:
- Details of discovery are abstracted away from the client; clients simply make requests to the load balancer.
- Some deployment environments provide this functionality for free.
Cons:
- If the load balancer is not provided by the deployment environment, it is yet another highly available system component to set up and maintain.
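For contrast, a client under server-side discovery only needs a stable virtual address; the hostname below is a made-up Kubernetes-style service DNS name, and kube-proxy or the ELB picks the instance:

```python
import urllib.request

def call_orders_service(path):
    # The client addresses a stable virtual hostname (hypothetical name);
    # the load balancer / kube-proxy resolves it to a healthy instance.
    with urllib.request.urlopen(f"http://orders.default.svc.cluster.local{path}") as resp:
        return resp.read()
```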
Service Registry
A database containing the network locations of service instances.
A service registry needs to be highly available and up to date.
Examples:
- etcd
- consul
- Apache Zookeeper
Kubernetes and AWS do not have an explicit service registry; it is a built-in part of the infrastructure.
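When you run your own registry, each instance typically registers itself on startup. A sketch using ZooKeeper via the kazoo client (the ensemble address and paths are assumptions); the ephemeral znode disappears automatically when the instance's session ends, which keeps the registry up to date:

```python
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")   # assumed local ensemble
zk.start()

# Register this instance as an ephemeral, sequential znode; ZooKeeper
# removes it automatically if the instance dies, keeping the registry fresh.
zk.ensure_path("/services/orders")
zk.create("/services/orders/instance-", b"10.0.0.5:8080",
          ephemeral=True, sequence=True)

# A client discovers instances by listing the children:
print(zk.get_children("/services/orders"))
```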
Tools
- Build your own:
- Consul: from HashiCorp; agents use a gossip protocol to exchange data between cluster nodes.
- etcd: a distributed, reliable key-value store for the most critical data of a distributed system.
- ZooKeeper: originally a Hadoop subproject, now a top-level Apache project.
- Built-in: Kubernetes, AWS
Consul vs Zookeeper/etcd
https://www.consul.io/intro/vs/zookeeper.html
Consul has native support for multiple datacenters.
All of these systems have roughly the same semantics when providing key/value storage: reads are strongly consistent and availability is sacrificed for consistency in the face of a network partition.
ZooKeeper et al. provide only a primitive K/V store and require that application developers build their own system to provide service discovery. Consul, by contrast, provides an opinionated framework for service discovery and eliminates the guesswork and development effort.
ZooKeeper
ZooKeeper is:
- an atomic messaging system that keeps all of the servers in sync.
- a distributed lock server.
- a fast, highly available, fault tolerant, distributed coordination service.
- a distributed, hierarchical file system that facilitates loose coupling between clients and provides an eventually consistent view of its znodes, which are like files and directories in a traditional file system.
It’s hard to use correctly.
One server acts as the leader while the rest are followers. When the ensemble starts, a leader is elected and all followers synchronize their state with the leader. All write requests are routed through the leader, and changes are broadcast to all followers; this broadcast is termed atomic broadcast.
It provides basic operations such as creating, deleting, and checking the existence of znodes. It provides an event-driven model in which clients can watch for changes to specific znodes, for example when a new child is added to an existing znode. ZooKeeper achieves high availability by running multiple ZooKeeper servers, called an ensemble, with each server holding an in-memory copy of the distributed file system to service client read requests.
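A sketch of that event-driven model using the kazoo client (ensemble address and paths are assumptions); the decorated callback fires with the current children and again whenever a child znode is added or removed:

```python
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")   # assumed local ensemble
zk.start()
zk.ensure_path("/app/config")

# Called immediately with the current children, then again on every change.
@zk.ChildrenWatch("/app/config")
def on_children_change(children):
    print("children of /app/config are now:", children)

zk.create("/app/config/feature-flag", b"on")   # adding a child triggers the watch
```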
Consensus Algorithms
- Raft
  - the newer, easier-to-understand algorithm
  - https://raft.github.io/
  - interactive tutorial: http://thesecretlivesofdata.com/raft/
- Paxos: the "classic" algorithm; used for lightweight transactions in Cassandra and by Chubby, Bigtable's lock manager.
- ZAB: ZooKeeper's atomic broadcast protocol
- blockchain consensus (e.g., proof of work)
Use cases
Coordination service and distributed configuration store:
- Leader election (primary/secondary selection)
- Distributed locking (see the sketch after this list)
- Task queue / producer-consumer queue
- Metadata store: as a storage service for some classes of data
- Name server
- System-wide configuration
  - fast to roll out (compared to a code push)
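A sketch of the distributed-locking use case with kazoo's Lock recipe (lock path, identifier, and ensemble address are assumptions); ZooKeeper ensures only one process holds the lock at a time:

```python
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")   # assumed local ensemble
zk.start()

lock = zk.Lock("/locks/reindex-job", "worker-1")   # identifier is informational
with lock:   # blocks until this process acquires the lock
    print("only one worker runs this critical section at a time")
```

kazoo's Election recipe covers the leader-election use case in the same style.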