ZooKeeper and Distributed Systems

1.1.  Distributed Systems Basics

The era when a single Tomcat could run everything has not completely passed. That setup is still common in management systems and small projects, which is reasonable: for cost reasons it is even worth encouraging.

1.1.1.  What is a distributed system  

A distributed system is a system whose hardware or software components are located on different networked computers and communicate and coordinate with each other only by passing messages.

In other words: different hardware and different software, on different computers across different networks, communicating and coordinating only through messages.

Several characteristics follow from this definition: distribution, peer equivalence (redundant, interchangeable replicas), concurrency, the lack of a global clock, and failures that can occur at any time.

 

1.1.1.1.  Distribution

Since this is a distributed system, its most obvious characteristic is distribution. Take an e-commerce project as a simple example. The project is split by function into separate microservices, such as a user service, a product service, and an order service. These are deployed on different Tomcat instances, different servers, or even different clusters. The machines can be located anywhere in space, and server nodes may be added or removed at any time. This is the first characteristic.

1.1.1.2.  Peer equivalence

Peer equivalence is a design goal of distributed systems. Take the e-commerce site again. Building a distributed architecture is not just a matter of splitting a large monolith into microservices and deploying them on different server clusters: any one of those microservices may fail, and its failure would take away part of the site's functionality.

Consider the order service. To guard against failure it generally needs a backup that can take over when the original order service has a problem.

This requires two (or more) order service instances that are completely equivalent, with exactly the same functionality. In effect, each one is a redundant replica of the service.

The other kind of redundancy is data replicas, such as databases and caches. As with the order service above, safety requires backups that are exactly identical. This is what peer equivalence means.

1.1.1.3.  Concurrency

Concurrency is nothing new to us: we have all studied multithreading to some degree, and multithreading is the basis of concurrency.

Here, however, we look at concurrency not at the level of threads but one level higher, at the level of multiple processes and multiple JVMs. Multiple nodes in a distributed system may operate concurrently on shared resources, and the question is how to coordinate those concurrent distributed operations accurately and efficiently.

The hands-on distributed lock section later addresses exactly this problem.
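
To make the shared-resource problem concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `Inventory` class and both functions are hypothetical, threads stand in for separate nodes or JVMs, and `threading.Lock` stands in for a real distributed lock. The first function replays one bad interleaving step by step; the second performs the read-modify-write as a single unit.

```python
import threading


class Inventory:
    """A shared resource: a stock count that several nodes update."""

    def __init__(self, stock):
        self.stock = stock

    def read(self):
        return self.stock

    def write(self, value):
        self.stock = value


def uncoordinated_interleaving(inventory):
    """Two nodes each sell one item, but their steps interleave."""
    seen_by_a = inventory.read()    # node A reads the current stock
    seen_by_b = inventory.read()    # node B reads the same value before A writes
    inventory.write(seen_by_a - 1)  # node A writes stock - 1
    inventory.write(seen_by_b - 1)  # node B overwrites with the same value


def sell_one(inventory, lock):
    """Read-modify-write executed as one unit while holding the lock."""
    with lock:
        inventory.write(inventory.read() - 1)


racy = Inventory(10)
uncoordinated_interleaving(racy)

safe = Inventory(10)
lock = threading.Lock()  # assumption: stand-in for a distributed lock
workers = [threading.Thread(target=sell_one, args=(safe, lock)) for _ in range(2)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(racy.stock)  # 9: two items were sold but only one was deducted
print(safe.stock)  # 8
```

Within one process a local lock is enough. Across processes and machines the lock itself has to live in a shared coordination service, which is the role the later distributed lock material assigns to ZooKeeper.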

1.1.1.4.  Lack of a global clock

In a distributed system, nodes can be located anywhere, and each node has its own clock. It is therefore hard to decide which of two events happened first, because there is no global clock to establish the order. In practice this is a smaller problem today, since many time servers are available for systems to synchronize against.
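
The ordering problem can be shown with a small Python sketch. Everything here is hypothetical and for illustration only: the `Node` class, the 5-second clock offset, and the two events are assumptions, not measurements from a real system.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    clock_offset_s: float  # how far this node's clock is from true time

    def stamp(self, event, true_time_s):
        """Record an event with the node's local (skewed) timestamp."""
        return {
            "event": event,
            "node": self.name,
            "local_ts": true_time_s + self.clock_offset_s,
        }


node_a = Node("A", clock_offset_s=5.0)  # assumed to run 5 seconds fast
node_b = Node("B", clock_offset_s=0.0)

first = node_a.stamp("create order", true_time_s=100.0)   # really happens first
second = node_b.stamp("cancel order", true_time_s=102.0)  # really happens second

# Ordering by local timestamps puts the later event first.
by_local_time = sorted([first, second], key=lambda e: e["local_ts"])
print([e["event"] for e in by_local_time])  # ['cancel order', 'create order']
```

Synchronizing against a time server shrinks the offsets, but as long as they are not exactly zero, two events that are close together can still be ordered incorrectly by local timestamps alone.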

1.1.1.5.  Failures can occur at any time

Any node may suffer a power outage, a crash, and so on. The larger the server cluster, the more likely a failure is, and as the cluster grows, failure becomes the norm. A system architect must consider how to keep the system accessible to users while failures are occurring.

 

1.1.2.  Large-site architecture diagram review

Now that we know what a distributed system is, let's look at the architecture diagram of a large website, which the earlier material on the evolution of distributed architectures should have covered. The whole architecture is divided into several layers: the application layer, the service layer, the infrastructure layer, and the data service layer. Each layer consists of many nodes. This is a typical distributed architecture, and much of the later material studies the individual parts inside it.

So what role does ZooKeeper play in all this? Think of zk as the traffic police, and of each node as a vehicle on the road (cars, buses). To keep the whole transportation system (the distributed system) available, ZooKeeper must know the health of every node (if a bus breaks down, a new one must be dispatched [service registration and discovery]), and it must manage contention (during rush hour, a very narrow road may only let cars through in one direction [distributed lock]).

If the traffic police command the transportation system, ZooKeeper commands the nodes of a distributed system.

 

1.1.2.1.  Distributed Systems Coordination "Methodology"

 

1.1.2.1.1.  Problems that distributed systems face

Compare a distributed system with the everyday transportation system: even the best transportation system has accidents. Likewise, distributed systems have many problems to overcome, such as communication failures, network partitions, the three-state problem, and node failures.

 

1.1.2.1.1.1.  Communication failures

A communication failure is really a network failure. The network itself is unreliable: a distributed system must transmit data over the network, and problems with fiber, routers, and other hardware are inevitable. Whenever the network has a problem, sending and receiving messages is affected, so lost or delayed messages are very common.

1.1.2.1.1.2.  Network partition

A network partition is the split-brain phenomenon. Suppose one traffic officer manages the traffic of an entire area and everything is orderly. Suddenly there is a power outage, or a natural disaster such as an earthquake, and some roads stop receiving the officer's instructions. In that situation a temporary stand-in, such as a local beat officer, may step in to direct traffic.

Note, however, that the original traffic officer is still on duty; only the communication system has been interrupted. Now there is a problem: the same area has different people directing traffic, which is bound to cause congestion and chaos.

When faults like this leave the same area (the distributed cluster) with two conflicting commanders, the situation is called split brain, also known as a network partition.

1.1.2.1.1.3.  Three states

What are the three states? The third state is the one besides success and failure: timeout.

Inside a single JVM, when an application calls a method it gets a definite response: the call either succeeds or fails. In a distributed system, most calls also receive a success or failure response, but once the network misbehaves a timeout becomes very likely. When a timeout occurs, the initiator of the request cannot determine whether the request was processed successfully.
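
The three outcomes can be sketched in Python. This is an illustrative simulation, not a real RPC client: a thread pool stands in for the remote server, and `call_remote`, the handler functions, and the delay values are all hypothetical.

```python
import concurrent.futures
import enum
import time


class Outcome(enum.Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    TIMEOUT = "timeout"


def call_remote(pool, handler, request, timeout_s):
    """Send a request and classify the result as one of three states."""
    future = pool.submit(handler, request)
    try:
        future.result(timeout=timeout_s)
        return Outcome.SUCCESS
    except concurrent.futures.TimeoutError:
        # The caller gave up waiting; the server may still finish the work.
        return Outcome.TIMEOUT
    except Exception:
        return Outcome.FAILURE


processed = []  # what the simulated server has actually completed


def fast_handler(request):
    processed.append(request)


def failing_handler(request):
    raise RuntimeError("request rejected")


def slow_handler(request):
    time.sleep(0.3)  # simulated network or processing delay
    processed.append(request)


with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
    ok = call_remote(pool, fast_handler, "order-1", timeout_s=1.0)
    failed = call_remote(pool, failing_handler, "order-2", timeout_s=1.0)
    timed_out = call_remote(pool, slow_handler, "order-3", timeout_s=0.05)
    known_at_timeout = list(processed)
# Leaving the with-block waits for the slow request to finish.

print(ok, failed, timed_out)
print("order-3" in known_at_timeout, "order-3" in processed)  # False True
```

At the moment of the timeout the caller sees no evidence that `order-3` was handled, yet the server completes it afterwards. A caller that simply retries after a timeout may therefore apply the same request twice.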

1.1.2.1.1.4.  Node failures

As mentioned earlier, node failure is a common problem in distributed systems. It refers to server nodes in the cluster going down or becoming unresponsive ("hanging"), and it happens frequently.

 

1.1.2.1.2.  CAP theory

We have spent a lot of space on the characteristics of distributed systems and the troublesome problems they run into. Naturally, some theory has been developed to address these problems.

Next we take some time to go over these theories. CAP and BASE are the foundations, and they are also frequently asked about in interviews.

First, CAP. CAP is an abbreviation of three words: consistency, availability, and partition tolerance.

 

1.1.2.1.2.1.  Consistency

Consistency is one of the ACID properties of transactions [atomicity, consistency, isolation, durability], which was covered when we studied database optimization.

The consistency discussed here is similar in spirit, but it now concerns a distributed environment rather than a single database.

In a distributed system, consistency means that multiple replicas of the data can be kept identical; this is similar to the redundant replicas discussed earlier. If, after a data item is successfully updated, every user can immediately read the latest value from the distributed system, the system is considered to have [strong consistency].

 

1.1.2.1.2.2.  Availability

Availability means that the service provided by the system must always be in a usable state: for every user request, the system returns a result within a bounded time.

The key points here are [bounded time] and [returns a result].

To respond within a bounded time, we need caching and load balancing, and this is when server nodes are added for performance;

To always return a result, we need primary and backup servers, so that when the primary node fails a backup can take over as quickly as possible. The system must not respond with OutOfMemory errors or other errors such as 500 or 404; otherwise we consider it unavailable.
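
The primary/backup idea can be sketched as follows. This is a simplified, hypothetical illustration: `make_node`, `call_with_failover`, and the `NodeDown` exception are invented for the example, and real failover also involves health checks and state replication that are left out here.

```python
class NodeDown(Exception):
    """Raised by a simulated node that cannot serve the request."""


def make_node(name, healthy):
    """Build a simulated server as a plain function."""
    def handle(request):
        if not healthy:
            raise NodeDown(name)
        return f"{name} handled {request}"
    return handle


def call_with_failover(nodes, request):
    """Try the primary first, then each backup; return the first result."""
    failed = []
    for node in nodes:
        try:
            return node(request)
        except NodeDown as exc:
            failed.append(str(exc))
    raise RuntimeError(f"all nodes failed: {failed}")


primary = make_node("primary", healthy=False)  # simulated outage
backup = make_node("backup", healthy=True)

print(call_with_failover([primary, backup], "GET /orders/42"))
# backup handled GET /orders/42
```

The user still gets a result even though the primary is down. Only when every node fails does the call surface an error, which is the case the text counts as unavailable.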

1.1.2.1.2.3.  Partition tolerance

When a distributed system encounters any network partition failure, it must still be able to provide services that satisfy consistency and availability to the outside, unless the entire network has failed.

In other words, split brain must not occur.

 

 

 

1.1.2.1.2.4.  Specific description

Here is how the CAP theorem is stated:

A distributed system cannot satisfy all three basic requirements of consistency, availability, and partition tolerance at the same time; it can satisfy at most two of them.

TIPS: it is impossible to put every application on a single node, so partitions have to be tolerated. Architects therefore usually spend their effort on finding the right balance between A and C for the business scenario at hand;

1.1.2.1.3.  BASE theory

According to the CAP theorem above, the architect must find a balance between consistency and availability. It is certainly not acceptable for the system to be completely unavailable, even for a short time, so by the CAP theorem strong consistency cannot be achieved in a distributed environment either.

BASE theory: even if strong consistency cannot be achieved, each distributed system can use an approach suited to its own business characteristics to bring the system to eventual consistency;

 

1.1.2.1.3.1.  Basically Available

When a distributed system suffers an unforeseen failure, it is allowed to lose part of its availability, as long as the system remains "basically available". This shows up as a "loss in response time" or a "loss of functionality";

E.g.: during the Double Eleven peak, Taobao pages are slow or degraded for some users;

 

1.1.2.1.3.2.  Soft state

This relates to the three states mentioned earlier. The system's data is allowed to be in an intermediate state: synchronization between the data replicas on different nodes may be delayed, and this delay is considered not to affect the system's availability;

E.g.: when the 12306 website sells tickets, requests are placed in a queue;

1.1.2.1.3.3.  Eventual consistency

After a period of synchronization, all replicas of the data eventually reach a consistent state;

E.g.: the total top-up amount shown on a financial product's home page may be briefly inconsistent;
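
Soft state and eventual consistency can be illustrated together with a toy replicated store in Python. This is a sketch under simplifying assumptions: the `ReplicatedStore` class is hypothetical, replication is modeled as a queue drained by an explicit `sync()` call, and failures are ignored.

```python
from collections import deque


class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}


class ReplicatedStore:
    """Writes go to the primary; followers catch up when sync() runs."""

    def __init__(self, follower_names):
        self.primary = Replica("primary")
        self.followers = [Replica(name) for name in follower_names]
        self.pending = deque()  # updates not yet applied to the followers

    def write(self, key, value):
        self.primary.data[key] = value
        self.pending.append((key, value))

    def sync(self):
        """Apply queued updates to every follower, in write order."""
        while self.pending:
            key, value = self.pending.popleft()
            for follower in self.followers:
                follower.data[key] = value


store = ReplicatedStore(["f1", "f2"])
store.write("total_amount", 100)

stale = store.followers[0].data.get("total_amount")  # None: not synced yet
store.sync()
fresh = store.followers[0].data.get("total_amount")  # 100 after syncing

print(stale, fresh)  # None 100
```

Between `write` and `sync` the replicas disagree, which is the soft, intermediate state. Once synchronization completes, every replica holds the same value.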

 

1.2.  ZooKeeper overview

1.2.1.  ZooKeeper overview (What)

ZooKeeper aims to provide a high-performance, highly available distributed coordination service with strictly ordered access control. It was created at Yahoo, is an open-source implementation of Google's Chubby, and is an important component of Hadoop and HBase.

 

1.2.1.1.  Design goals

Simple data structure: a shared tree structure, similar to a file system, stored in memory;

Can be clustered: avoids a single point of failure; 3 to 5 machines can form a cluster, and the cluster can serve requests as long as more than half of the machines are working normally;

Sequential access: for each update request, ZK assigns a globally unique, increasing number, and this feature can be used to implement higher-level coordination services;

High performance: operations are memory-based and it serves non-transactional requests, which suits read-dominated business scenarios. A 3-machine zk cluster can reach 130,000 QPS (13W);
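
The "more than half" rule from the clustering goal above is simple arithmetic, sketched here in Python. The function names are hypothetical and this is not ZooKeeper code; it only works through the majority condition stated in the list.

```python
def quorum_size(ensemble_size):
    """Smallest number of machines that is more than half of the cluster."""
    return ensemble_size // 2 + 1


def can_serve(ensemble_size, alive):
    """The cluster serves requests only while a majority is working."""
    return alive >= quorum_size(ensemble_size)


def tolerated_failures(ensemble_size):
    """How many machines can fail while a majority still remains."""
    return ensemble_size - quorum_size(ensemble_size)


for size in (3, 4, 5):
    print(size, quorum_size(size), tolerated_failures(size))
# 3 2 1
# 4 3 1
# 5 3 2

print(can_serve(3, alive=2))  # True: one of three machines is down
print(can_serve(3, alive=1))  # False: two of three machines are down
```

A 4-machine cluster tolerates no more failures than a 3-machine one, which is one reason odd cluster sizes such as 3 or 5 are the usual choice.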

 

1.2.2.  Common scenarios that use ZK (Why)

Data publish/subscribe

Load Balancing

Naming Service

Master election

Cluster Management

Configuration Management

Distributed Queue

Distributed Lock

 

1.2.3.  Why learn ZooKeeper? (Why)

An essential skill for Internet architects

Senior positions test this knowledge in interviews

Full walkthrough of zk interview questions

What kind of framework is ZooKeeper?

Usage scenarios

The Paxos algorithm and the protocol ZooKeeper uses

Election algorithm and process

What node types does ZooKeeper have?

Are ZooKeeper watch notifications on a node permanent?

Deployment? What roles do the machines in a cluster have? What is the minimum number of machines in a cluster?

If a cluster has 3 machines and one goes down, can the cluster still work? What if two go down?

Does the cluster support adding machines dynamically?


Origin www.cnblogs.com/Soy-technology/p/11351350.html