Several major issues that distributed systems need to address

Technology and architecture are hard to evolve, and while upgrading monolithic architectures I have not personally run into many scenarios that truly required a distributed system. In most cases we may not need distributed services at all, but business growth can require us to understand the relevant technologies in advance, as a technical reserve, so that we are ready when the hard problem arrives. This article is a theoretical essay: my own understanding, typed out by hand, of the problems a distributed system needs to solve. Later articles may go deeper into distributed systems theory.

First, node trust

By its nature, a distributed system can have each node in a different state at any given moment. So if you are designing a distributed system, or studying one, the first issue to solve or understand is how the nodes can reliably reach one another; at the most basic level, the cluster nodes must be able to ping each other. The issue becomes concrete as soon as nodes have to handle some operation or some data consistently. For example, when nodes a and b both try to modify the same record, a distributed lock guarantees that only one of the two commits succeeds; in that sense a distributed lock also acts as a coordinator between nodes.
As another example, suppose node a needs to access nodes b, c, and d. If b goes down, or becomes slow because of network problems, a needs to find out quickly, otherwise a itself may stop working. Or suppose a and b start out in the same partition and a network problem splits them into two partitions. How the two partitions are handled is really a question of trust between nodes: for instance, the partition with more nodes (say a's) continues to provide service, while the nodes in the smaller partition (b's) stop serving.
Node trust depends heavily on the network and on data transmission between nodes. If trust between nodes breaks down, the work may not get done at all.
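The lock and partition examples above can be sketched in a few lines of Python. This is only a minimal sketch under loud assumptions: `LeaseLock` is a hypothetical in-memory stand-in for a lock service that all nodes can reach, the lease (time-to-live) is my own addition so that a crashed holder cannot block others forever, and `has_quorum` assumes the "larger partition keeps serving" rule means a strict majority of the whole cluster.

```python
import time


class LeaseLock:
    """In-memory stand-in for a lock service shared by all nodes (an assumption)."""

    def __init__(self):
        self._holders = {}  # key -> (owner, lease expiry timestamp)

    def acquire(self, key, owner, ttl, now=None):
        """Grant the lock if it is free, expired, or already held by this owner."""
        now = time.monotonic() if now is None else now
        holder = self._holders.get(key)
        if holder is not None and holder[1] > now and holder[0] != owner:
            return False  # someone else holds a live lease
        self._holders[key] = (owner, now + ttl)
        return True

    def release(self, key, owner):
        """Release only if the caller is the current holder."""
        holder = self._holders.get(key)
        if holder is not None and holder[0] == owner:
            del self._holders[key]


def has_quorum(reachable, cluster):
    """True if the reachable nodes form a strict majority of the cluster."""
    return 2 * len(set(reachable) & set(cluster)) > len(cluster)


lock = LeaseLock()
a_wins = lock.acquire("record-42", "a", ttl=5.0, now=0.0)   # True
b_wins = lock.acquire("record-42", "b", ttl=5.0, now=1.0)   # False, a holds the lease
b_later = lock.acquire("record-42", "b", ttl=5.0, now=6.0)  # True, a's lease expired

cluster = {"a", "b", "c", "d", "e"}
majority_side = has_quorum({"a", "b", "c"}, cluster)  # True, keeps serving
minority_side = has_quorum({"d", "e"}, cluster)       # False, stops serving
```

With an even split neither side has a strict majority, so both stop serving; that is the price of never letting two partitions accept writes at once.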
A third example is a distributed ID generator. When node a requests an ID from the generator and the request succeeds, a later request from node b must never return the same ID. No matter how the clock changes, a and b can never obtain the same ID, even when they ask at the same moment. In other words, a and b have a tacit mutual trust that rests on the distributed ID generator.
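One common way to get that guarantee is a Snowflake-style layout, where each ID packs a timestamp, a node id, and a per-millisecond sequence. The article does not say which scheme it has in mind, so the sketch below is only an illustration: the class name, the bit widths (10 node bits, 12 sequence bits), and the injectable `clock` parameter are all my assumptions.

```python
import threading
import time


class IdGenerator:
    """Snowflake-style generator: timestamp | node id | sequence (assumed layout)."""

    NODE_BITS = 10
    SEQ_BITS = 12

    def __init__(self, node_id, clock=None):
        if not 0 <= node_id < (1 << self.NODE_BITS):
            raise ValueError("node_id out of range")
        self._node_id = node_id
        # clock returns milliseconds; injectable so the sketch can be tested
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._seq = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # clock moved backwards: reuse the last timestamp so ids never repeat
                now = self._last_ms
            if now == self._last_ms:
                self._seq = (self._seq + 1) & ((1 << self.SEQ_BITS) - 1)
                if self._seq == 0:
                    # sequence exhausted for this millisecond: borrow the next one
                    now += 1
            else:
                self._seq = 0
            self._last_ms = now
            shift = self.NODE_BITS + self.SEQ_BITS
            return (now << shift) | (self._node_id << self.SEQ_BITS) | self._seq


# Two nodes with a frozen clock still never produce the same id.
node_a = IdGenerator(1, clock=lambda: 1000)
node_b = IdGenerator(2, clock=lambda: 1000)
ids = [node_a.next_id() for _ in range(5000)] + [node_b.next_id() for _ in range(5000)]
```

Uniqueness across nodes comes from the node id bits, so the scheme still relies on something else handing out distinct node ids, which is a coordination problem in its own right.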
The fourth example concerns database transactions: how to preserve the ACID properties of a transaction over distributed data, and how to keep a distributed transaction consistent when several cluster nodes are involved. Once a transaction commits or rolls back, the nodes holding the affected data need to learn the outcome quickly. If a transaction is left half rolled back or half executed, other nodes read wrong data, and a distributed transaction that inconsistent is a poor one.
Here the nodes have to trust each other about the state of the data: every node that reads a piece of data should, as far as possible, see it in the same state.
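A classic way to avoid the "half committed" outcome is two-phase commit. The article does not name a protocol, so treat the following as one possible illustration rather than what the author had in mind. `Participant` and `two_phase_commit` are hypothetical names, everything runs in one process, and the sketch leaves out the hard parts (timeouts, coordinator crashes, durable logs).

```python
class Participant:
    """One node holding part of the data (in-process stand-in, an assumption)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates a node that must refuse
        self.state = "init"
        self.data = {}
        self._staged = None

    def prepare(self, changes):
        """Phase 1: stage the changes and vote yes or no."""
        if not self.can_commit:
            self.state = "aborted"
            return False
        self._staged = dict(changes)
        self.state = "prepared"
        return True

    def commit(self):
        """Phase 2 (all voted yes): make the staged changes visible."""
        self.data.update(self._staged)
        self._staged = None
        self.state = "committed"

    def rollback(self):
        """Phase 2 (someone voted no): discard the staged changes."""
        self._staged = None
        self.state = "aborted"


def two_phase_commit(participants, changes):
    """Commit on every participant or on none of them."""
    votes = [p.prepare(changes) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


healthy = [Participant("a"), Participant("b"), Participant("c")]
ok = two_phase_commit(healthy, {"balance": 100})

mixed = [Participant("a"), Participant("b", can_commit=False)]
failed = two_phase_commit(mixed, {"balance": 100})
```

After the first call every node holds the same data; after the second call no node does, which is exactly the "all nodes see the same state" property the section asks for.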

Second, node coordination

Once node trust is understood, node coordination is easier to grasp: some nodes have to take on a coordinating role. Node coordination describes the operating mode of a distributed system: centralized or decentralized, master-slave or dual-master. Redis is a good system to learn from here, because it has a master-slave mode, a Cluster mode, and a Sentinel mode. Built on node trust, node coordination is the second problem a distributed system has to solve.
To build or study a distributed system you need to understand the characteristics of each cluster mode, for example the advantages and disadvantages of a centralized design, or how a one-master, multiple-slave mode solves a business problem. Once the operating mode is settled, the nodes are roughly divided into roles such as master node, slave node, observer node, data node, backup node, and service node. How nodes in different roles coordinate and synchronize with each other on top of that trust is what this section is about.
Take the centralized mode as an example. The center of the cluster, acting as coordinator, master node, or arbiter, decides which nodes respond with data, which nodes store data, and which nodes may become the new coordinator or arbiter. You can compare this with organizational and team structures in the real world. I will not elaborate on the advantages and disadvantages of centralization here; you can look them up on Baidu yourself.
Another example is the decentralized mode. Viewed locally, the nodes are all alike: each one is a data node or a service node. Viewed as a whole, every node serves requests and stores data, and any single node can go down without affecting the service. One algorithm used for node coordination in this mode is consistent hashing; understanding what the algorithm does is a good way to understand what node coordination means.
Finally, ZooKeeper's mechanisms also illustrate node coordination. It has leader, follower, and observer nodes, and the different roles play different parts in leader election, in cluster broadcast, and in serving requests. For details, see a book on ZooKeeper.
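To make the consistent hashing idea concrete, here is a minimal sketch of a hash ring with virtual nodes. The class name `HashRing`, the choice of SHA-256, and the default of 100 virtual nodes per physical node are assumptions for illustration; real systems differ in all three.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit hash of a string (SHA-256 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent hash ring: a key belongs to the first node clockwise from it."""

    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node name)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # each physical node is placed on the ring many times to even out load
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def lookup(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.remove("node-b")  # simulate node-b going down
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Only the keys that lived on `node-b` move when it disappears; keys on the other nodes stay where they were. That is why no central coordinator has to reassign everything when a node joins or leaves.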

Third, data and network protocols

A distributed system has to carry business data, so another issue to consider is how that data is transferred over a network protocol. Before talking about network protocols, it helps to fully understand the data itself. Many distributed frameworks build their network communication component on Netty or MINA, which lets them define a custom data transfer protocol: HTTP, an RPC protocol, an easier-to-handle binary protocol on top of Netty, or a self-developed JSON-like format. Beyond that, systems such as Redis, Kafka, and Hadoop have developed their own special protocols for their usage scenarios.
"Data and network protocols" here means two things. On one hand it means the data protocol: the format that different nodes share, which is parsed and packed at the application layer. On the other hand it means the network protocol, such as long-lived TCP connections, long or short HTTP connections, or UDP.
If a distributed system has no special scenario and is not very sensitive to transmission or processing efficiency, it usually uses a common, publicly available protocol for its underlying communication component. In most cases, though, once a system reaches the scale of a distributed system, data and network protocols that are handled badly can easily drag it down and cause performance, security, and stability problems. Specific scenarios need specific data handling. For example, serialization, deserialization, and the parsing and packing of data can be implemented in many ways in a distributed system: JSON, XML, Hessian, binary data, and custom protocols.
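As a small illustration of "packing and parsing at the application layer", here is a sketch of a custom protocol over a byte stream such as TCP: each message is JSON, prefixed with a 4-byte length so the receiver knows where one message ends and the next begins. The frame layout and the function names are assumptions of mine, not a protocol used by any of the systems mentioned above.

```python
import json
import struct

HEADER = struct.Struct(">I")  # 4-byte big-endian payload length


def encode_frame(message: dict) -> bytes:
    """Serialize a message to JSON and prefix it with its byte length."""
    payload = json.dumps(message, separators=(",", ":")).encode("utf-8")
    return HEADER.pack(len(payload)) + payload


def decode_frames(buffer: bytes):
    """Parse every complete frame; return (messages, leftover bytes)."""
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # frame is incomplete, wait for more bytes
        body = buffer[offset + HEADER.size:end]
        messages.append(json.loads(body.decode("utf-8")))
        offset = end
    return messages, buffer[offset:]


stream = encode_frame({"op": "ping"}) + encode_frame({"op": "get", "key": "k1"})

# Simulate the network delivering the stream in two arbitrary chunks.
first, rest = decode_frames(stream[:-3])
second, rest = decode_frames(rest + stream[-3:])
```

The leftover bytes matter because a stream transport does not preserve message boundaries; the receiver keeps them and retries once more data arrives. Swapping JSON for a binary encoding would change only the two lines that serialize and deserialize the payload.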

Fourth, data storage

I think data storage is the last issue a distributed system has to face. Many of the problems a distributed system faces involve data storage: a monolithic system cannot cope with massive data and high concurrency, so the data can no longer live in a single database, a single file, or a single data container.
Data storage, data changes, and data distribution in a distributed system all follow rules of their own. If the data storage protocol is organized badly, or the storage middleware is chosen wrongly or inappropriately, the distributed system designed around it will certainly not provide a better, more robust service, and it will be very hard to iterate on later. Hadoop, HBase, and Kafka, and of course Redis and MongoDB, have all put a lot of sophisticated design and technical improvement into their data storage in order to adapt to more complex scenarios.
Some distributed systems do not focus on data storage, but they cannot ignore it either. Service-oriented frameworks such as nginx, Dubbo, Mycat, and migraine still need to store the metadata and configuration data they depend on. Because the system is distributed, data storage takes many forms, such as database-level sharding, partitioning, horizontal splitting, and vertical splitting; Hadoop's big-data storage is one example. The storage design also determines what order of magnitude of data a distributed system can support to meet business needs.
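Horizontal splitting, in its simplest form, routes each row to one of several tables or databases by hashing a key. The sketch below is only an assumption-laden illustration: the `orders_N` naming scheme, the default shard count of four, and the use of CRC32 are mine, not taken from any of the systems above.

```python
import zlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index with a stable hash (CRC32)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return zlib.crc32(key.encode("utf-8")) % shard_count


def table_for(user_id: str, shard_count: int = 4) -> str:
    """Hypothetical naming scheme: orders_0 .. orders_3."""
    return f"orders_{shard_for(user_id, shard_count)}"


# Group some rows by the table they would be written to.
rows = [{"user_id": f"user-{i}", "amount": i} for i in range(20)]
by_table = {}
for row in rows:
    by_table.setdefault(table_for(row["user_id"]), []).append(row)
```

CRC32 is used instead of Python's built-in `hash()` because string hashes are randomized per process, and every node must agree on where a key lives. The weakness of plain modulo routing is that changing `shard_count` moves most keys, which is one reason schemes like consistent hashing exist.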
Take Redis: it is positioned as an in-memory NoSQL database, and it can also be used as a cache in front of a database. Because it keeps data in memory, it cannot store very large or very complex data. Even spread across multiple nodes it is not suited to TB-level data, and even if it could hold that much, other factors would limit its performance and stability. Another example is metadata storage: Hadoop and its built-in database rely on ZooKeeper to store information such as IP addresses, and Kafka relies on ZooKeeper to store cluster and partition metadata.
If you are developing your own distributed configuration service, you also need to consider how to store a large amount of key-value data and how to keep that data consistent.
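One simple building block for keeping key-value configuration consistent is a version number on every key, with writes accepted only when the writer has seen the latest version (compare-and-set). The sketch below is an in-memory illustration with hypothetical names; a real configuration service would also need replication and persistence, which are out of scope here.

```python
class VersionedStore:
    """Key-value store where every write must name the version it replaces."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        """Return (version, value); a missing key has version 0 and value None."""
        return self._data.get(key, (0, None))

    def put(self, key, value, expected_version):
        """Write only if nobody else has written since the caller last read."""
        version, _ = self.get(key)
        if version != expected_version:
            return False  # stale writer: re-read and retry
        self._data[key] = (version + 1, value)
        return True


store = VersionedStore()

# Nodes a and b both read the key at version 0, then both try to write.
version_seen_by_a, _ = store.get("timeout")
version_seen_by_b, _ = store.get("timeout")
a_ok = store.put("timeout", "30s", version_seen_by_a)  # True
b_ok = store.put("timeout", "60s", version_seen_by_b)  # False, b's view is stale
current = store.get("timeout")                         # (1, "30s")
```

Node b does not silently overwrite a's change; it has to read the new value first and decide again. ZooKeeper exposes a similar idea through the version argument of its set operations.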

Summary: this article only draws some conclusions, at the macro level and from my personal theoretical understanding, about the problems a distributed system needs to solve. In practice we more often have to look carefully at one vertical direction, such as distributed transactions, caching, messaging, configuration, or databases, and at how a given cluster operates: whether it is inherently single-threaded or multithreaded, how much concurrency it can handle, and how much data it can store. Each direction has many solutions and industry implementations, but the key is to design, develop, and select technology based on the business scenario. If you have the ability, develop your own; if you cannot afford that, borrow from open source. Like the Eight Immortals crossing the sea, each shows their own skill.


Origin blog.csdn.net/u010504064/article/details/104199960