Distributed system architecture theory and components

1. Development of distributed systems

In the early days of computing, processing was centralized and computing power relied on large mainframes. As the Internet developed, heavy workloads demanded enormous computing power that centralized computing could not provide, and mainframes were also very expensive. Distributed computing decomposes a task into smaller parts and assigns them to multiple computers for processing, which shortens overall computing time and greatly improves efficiency.

Large-scale Internet websites often face problems such as high concurrency and massive data processing, and must remain highly available and easy to scale. A distributed architecture uses multiple machines working together to scale capacity dynamically, and uses redundant nodes to eliminate single points of failure and improve system availability.

2. Challenges of distributed systems

There is no silver bullet in software development; every architecture has pros and cons. Distributed systems face three main challenges:

1) Limited network resources: Nodes communicate over the network, which has limited bandwidth and nonzero latency, so no node can achieve both instant response and unlimited throughput.
2) Node management costs: A distributed system may grow to tens of thousands of nodes, making operation and maintenance very expensive.
3) Lack of a global clock: Clock synchronization over a network has limited accuracy, so there is no consistent global time. Machines are distributed across space, and it is difficult to define the order in which events occur on different machines.

3. Basic theory of distributed systems

3.1 CAP theorem

A distributed system has three desirable properties: Consistency, Availability, and Partition tolerance. At most two of them can be satisfied at the same time; all three cannot be achieved simultaneously.

  • Consistency: After an update, all nodes hold exactly the same data at the same moment, so concurrently accessing clients always receive consistent results. The server replicates updates across the whole system as quickly as possible.
  • Availability: The system always responds to users without failures or timeouts. Availability over a period of time is commonly measured in "N nines", such as 99.999%.
  • Partition tolerance: A distributed system is composed of many nodes internally but appears as a single whole from the outside. When a node or network partition fails, the system can still provide service satisfying consistency or availability; a few machines may be down, but the rest keep operating and users notice nothing.

Large-scale Internet applications have many cluster nodes, and node or network failures are the norm. Such systems must tolerate partitions, so ultimately they can only choose between C and A.

Traditional industry projects are different. Taking financial systems as an example, any operation involving money must guarantee data consistency; when a network failure occurs, it is better to stop the service than to give up C. With C fixed, the trade-off ends up being between A and P.

3.2 PACELC theory

CAP theory alone cannot guide real system architecture very well. Take Availability: if an interface takes a very long time to return a result, it is technically available but unacceptable to the business. Most of the time a system runs without partitions, and its design must then balance latency against data consistency: guaranteeing stronger consistency inevitably increases read and write latency.

When a Partition occurs, choose between Availability and Consistency, abbreviated PAC. Else (when there is no partition), choose between Latency and Consistency, abbreviated LC. The E stands for Else; together these spell PACELC.

Much storage software implements PACELC-style trade-offs, and users pick different configurations for different business scenarios. Taking MySQL master-slave replication as an example, three modes are provided (a toy sketch follows the list):

  • Asynchronous replication: After executing a transaction committed by the client, the master returns the result immediately, regardless of whether any slave has received or applied it. Because of replication lag, a client may not read the latest data from a slave. This mode gives MySQL the best performance, but users must weigh whether the business can tolerate the lag.
  • Fully synchronous replication: The master returns the result only after all slaves have executed the transaction. This guarantees strong consistency, but response time grows.
  • Semi-synchronous replication: After completing the client's transaction, the master waits for at least one slave to receive it and write it to its relay log before returning to the client. Latency is much lower than with full synchronization, and data is less likely to be lost than with asynchronous replication.
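
To make the differences concrete, here is a toy Java sketch (not real MySQL internals; the replica names and delays are invented for illustration) showing the point at which each mode acknowledges the client: asynchronous returns immediately, semi-synchronous waits for at least one replica, and fully synchronous waits for all of them.

    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Toy model of the three replication modes; it illustrates only when the
    // client acknowledgment happens, not how MySQL replication really works.
    public class ReplicationModes {
        enum Mode { ASYNC, SEMI_SYNC, FULL_SYNC }

        static final ExecutorService POOL = Executors.newCachedThreadPool();

        // Simulate shipping a committed transaction to one replica.
        static Future<?> shipToReplica(String replica, String txn) {
            return POOL.submit(() -> {
                try { Thread.sleep(50); } catch (InterruptedException ignored) {}
                System.out.println(replica + " applied " + txn);
            });
        }

        static void commit(Mode mode, String txn, List<String> replicas) throws Exception {
            System.out.println("master committed " + txn + " locally");
            List<Future<?>> acks = replicas.stream()
                    .map(r -> shipToReplica(r, txn))
                    .toList();
            switch (mode) {
                case ASYNC:                         // acknowledge immediately
                    break;
                case SEMI_SYNC:                     // wait for at least one replica
                    acks.get(0).get();
                    break;
                case FULL_SYNC:                     // wait for every replica
                    for (Future<?> f : acks) f.get();
                    break;
            }
            System.out.println("ack to client under " + mode);
        }

        public static void main(String[] args) throws Exception {
            commit(Mode.SEMI_SYNC, "txn-1", List.of("replica-1", "replica-2"));
            POOL.shutdown();
        }
    }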

3.3 BASE model

BASE stands for Basically Available, Soft state, and Eventually consistent. For most distributed systems, partition tolerance is a baseline requirement, so consistency and availability must be balanced. BASE emphasizes sacrificing strong consistency for availability: data is allowed to be inconsistent for a period of time, as long as it eventually becomes consistent.

3.4 Consensus algorithm

Data consistency is among the most critical and difficult problems in distributed system design. The industry has proposed several mature consensus algorithms.

  • Paxos algorithm
    In 1998, Leslie Lamport first published the Paxos protocol in the paper "The Part-Time Parliament", using the Greek island of Paxos as a metaphor and describing how its parliament passed resolutions. In 2001, Lamport published a simpler description of the algorithm, "Paxos Made Simple".

  • Raft algorithm
    Because the Paxos algorithm is difficult to understand and implement, Diego Ongaro and John Ousterhout of Stanford University proposed the easier-to-understand Raft algorithm. Raft decomposes the consensus problem into simpler, relatively independent sub-problems (leader election, log replication, and safety) while delivering performance equivalent to Multi-Paxos. A minimal sketch of Raft's vote-granting rule follows this list.

  • ZAB protocol
    ZAB stands for ZooKeeper Atomic Broadcast. The distributed coordination service ZooKeeper designed this consistency protocol with crash-recovery support; based on it, ZooKeeper implements a master-slave architecture that keeps replicas in the cluster consistent. By design, ZAB is very similar to Raft.
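
As a taste of why Raft is considered understandable, here is a minimal Java sketch of its vote-granting rule, simplified from the Raft paper (no networking, timers, or log replication): a node grants its vote only if the candidate's term is current, it has not already voted for someone else in that term, and the candidate's log is at least as up to date as its own.

    // Simplified Raft RequestVote handling; a sketch, not a full implementation.
    public class RaftNode {
        int currentTerm = 0;
        Integer votedFor = null;       // candidate this node voted for in currentTerm
        int lastLogTerm = 0;
        int lastLogIndex = 0;

        // Handle a RequestVote RPC; returns true if the vote is granted.
        synchronized boolean onRequestVote(int term, int candidateId,
                                           int candLastLogTerm, int candLastLogIndex) {
            if (term < currentTerm) {
                return false;                      // reject a stale candidate
            }
            if (term > currentTerm) {              // newer term seen: step down
                currentTerm = term;
                votedFor = null;
            }
            // Election restriction: candidate's log must be at least as up to date.
            boolean logUpToDate =
                    candLastLogTerm > lastLogTerm
                    || (candLastLogTerm == lastLogTerm && candLastLogIndex >= lastLogIndex);
            if ((votedFor == null || votedFor == candidateId) && logUpToDate) {
                votedFor = candidateId;
                return true;                       // grant the vote
            }
            return false;
        }

        public static void main(String[] args) {
            RaftNode follower = new RaftNode();
            // Candidate 1 asks for a vote in term 1 with an equally fresh log: granted.
            System.out.println(follower.onRequestVote(1, 1, 0, 0));   // true
            // Candidate 2 asks in the same term: already voted, so rejected.
            System.out.println(follower.onRequestVote(1, 2, 0, 0));   // false
        }
    }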

4. Distributed architecture components

4.1 Main components

  • Service registration and discovery:
    Spring Cloud Eureka, Apache Nacos, Apache Zookeeper, ETCD
  • Service call:
    Spring Cloud Feign, Apache Dubbo, Motan, gRPC
  • API gateway:
    Spring Cloud Zuul, Spring Cloud Gateway, Apache ShenYu, Kong
  • Circuit breaking and degradation:
    Spring Cloud Hystrix, Alibaba Sentinel
  • Load Balancer:
    Spring Cloud Ribbon, Spring Cloud LoadBalancer
  • Distributed monitoring:
    Spring Boot Admin, Meituan CAT, Zabbix, Prometheus + Grafana + Alertmanager, Open-Falcon
  • Configuration management:
    Spring Cloud Config, Alibaba Nacos, Baidu Disconf, Ctrip Apollo
  • Message queue:
    RocketMQ, Kafka, RabbitMQ
  • Task scheduling:
    Apache Dolphinscheduler, Apache ElasticJob, XXL-JOB
  • Distributed transactions:
    Alibaba Seata
  • Call chain tracking:
    Spring Cloud Sleuth + Zipkin, Apache SkyWalking
  • Log collection:
    Flume + Kafka + HDFS, Elasticsearch + Logstash + Kafka + Kibana
  • Database and table sharding:
    Apache ShardingSphere, MyCat, Meituan DBProxy
  • Distributed lock:
    Redisson + Redis (see the sketch after this list)
  • Permission control:
    Spring Security, Shiro + JWT
  • File storage:
    FastDFS, MinIO, HDFS
  • Reverse proxy:
    Nginx
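
As an example of one component from the list, the sketch below acquires a distributed lock with Redisson. This is a minimal sketch assuming a local single-node Redis at 127.0.0.1:6379 and the Redisson dependency on the classpath; the lock key and timeouts are invented. tryLock with a lease time lets the lock expire automatically if the holding node crashes.

    import org.redisson.Redisson;
    import org.redisson.api.RLock;
    import org.redisson.api.RedissonClient;
    import org.redisson.config.Config;
    import java.util.concurrent.TimeUnit;

    public class LockExample {
        public static void main(String[] args) throws InterruptedException {
            Config config = new Config();
            config.useSingleServer().setAddress("redis://127.0.0.1:6379"); // assumed local Redis
            RedissonClient redisson = Redisson.create(config);

            RLock lock = redisson.getLock("order:1001"); // hypothetical lock key
            // tryLock(waitTime, leaseTime, unit): wait up to 3s, auto-release after 10s
            if (lock.tryLock(3, 10, TimeUnit.SECONDS)) {
                try {
                    // critical section: only one node in the cluster runs this at a time
                } finally {
                    lock.unlock();
                }
            }
            redisson.shutdown();
        }
    }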

4.2 Auxiliary tools

Java application diagnostics: Alibaba Arthas

4.3 Common architectures

(Figure: diagram of a common distributed architecture.)

5. Commonly used databases

5.1 Development of database

The database industry has a long history: nearly fifty years since its birth, constantly evolving in technology, business, and application scenarios. In the 1990s, famous relational databases such as MySQL and Microsoft SQL Server were born on x86 servers, serving personal office, personal entertainment, and enterprise information scenarios.

NoSQL databases arose with the birth of Internet businesses. In 2006, Google introduced BigTable, followed by HBase, Cassandra, MongoDB, and Redis, each using a different underlying data organization to solve a different problem. Around 2010, Google introduced a new generation of products represented by Spanner and F1, and NewSQL databases such as SequoiaDB and TiDB appeared, using SQL to solve application problems while retaining the scalability of NoSQL.

NoSQL databases were designed to overcome the shortcomings of traditional relational databases. They have four characteristics:

  • Easy to scale: A common feature of NoSQL databases is that they drop the relational characteristics of relational databases; with no relationships between data items, they are very easy to scale out, bringing scalability at the architectural level.
  • High performance: NoSQL databases offer very high read and write performance, especially over large amounts of data, because the absence of relationships between data keeps the database structure simple.
  • High availability: NoSQL can easily implement a highly available architecture without hurting performance; for example, Cassandra and HBase achieve high availability through data replication.
  • Flexible data model: NoSQL does not require fields to be defined before data is stored and can accept custom data formats at any time, whereas adding or removing fields in a relational database is very troublesome.

5.2 OLTP and OLAP

From the perspective of business scenarios, data processing can be divided into OLTP and OLAP. Which database to use in each scenario depends on the developer's skill and experience; generally speaking, OLTP uses a strongly consistent relational database, and OLAP uses a NoSQL or columnar database.

  • OLTP (On-Line Transaction Processing)
    OLTP is online transaction processing, mainly used to record business events as they occur. When a behavior happens, the system records who did what, when, and where, inserting, deleting, and updating data in the database. It requires high real-time performance, strong stability, and data consistency.

  • OLAP (On-Line Analytical Processing)
    OLAP is online analytical processing, focused on querying large volumes of data. When the business grows to a certain scale, offline data must be analyzed to support decision-making.

5.3 Commonly used NoSQL databases

  • MongoDB
    MongoDB is a document-oriented database that stores data as JSON-style documents. It is mainly used for website data storage, content management, and caching. MongoDB supports full-text search, offers very rich query forms, is highly flexible in data processing and aggregation, and is extremely scalable and available (a minimal usage sketch follows this list).

  • Cassandra
    Cassandra is an open source distributed database system. Originally developed by Facebook, it is used to store simple format data such as inboxes. It integrates the data model of Google BigTable and the fully distributed architecture of Amazon Dynamo. Due to its good scalability, Cassandra has been adopted by well-known Web 2.0 websites such as Digg and Twitter, and has become a popular distributed structured data storage solution.

  • CouchDB
    CouchDB is a document-oriented database that stores data in JSON format. It can be used to store website data and content and to provide caching. It supports MapReduce queries written in JavaScript and also provides a very convenient web-based management console.

  • Redis
    Redis is an in-memory key-value database. Beyond plain strings, it stores and operates on higher-level data types that are the basic data structures familiar to most developers (lists, maps, sets). Redis reads and writes data extremely efficiently, far faster than conventional databases, and is often used as the caching layer of large projects (a cache-aside sketch follows this list).

  • HBase
    HBase is a distributed, column-oriented database whose design comes from Google's BigTable paper. Its underlying storage is built on HDFS, and cluster management is built on ZooKeeper. HBase's distributed architecture enables fast storage of and random access to massive data; based on its data replication and partitioning mechanisms, online scale-out, scale-in, and disaster recovery are easy to achieve. It is one of the most commonly used key-value stores in the big data field.

  • Elasticsearch
    Elasticsearch is a distributed, highly scalable, near-real-time search and data analysis engine. It provides a multi-tenant-capable full-text search engine over a RESTful web interface. Developed in Java and released as open source under the Apache license, Elasticsearch is a popular enterprise search engine.

  • ClickHouse
    ClickHouse is a column-oriented database open-sourced by the Russian company Yandex (comparable to Baidu). It is mainly used for online analytical processing queries and can generate analytical reports in real time from SQL queries. Its usage scenarios are similar to Elasticsearch's, with even higher performance.
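
To illustrate the document model mentioned under MongoDB, here is a minimal sketch using the official MongoDB Java sync driver; the database, collection, and field names are invented, and a local mongod on the default port is assumed.

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import org.bson.Document;

    public class MongoExample {
        public static void main(String[] args) {
            // Connect to a local mongod; the client is closed automatically.
            try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
                MongoCollection<Document> articles =
                        client.getDatabase("cms").getCollection("articles");
                // Documents need no predeclared schema: fields can vary per document.
                articles.insertOne(new Document("title", "hello")
                        .append("tags", java.util.List.of("intro", "demo")));
                Document first = articles.find(new Document("title", "hello")).first();
                System.out.println(first.toJson());
            }
        }
    }

And here is the cache-aside pattern that puts Redis in front of a slower database, sketched with the Jedis client; the key scheme, TTL, and the stand-in loadFromDatabase method are invented for illustration.

    import redis.clients.jedis.Jedis;

    public class CacheAside {
        // Cache-aside: try Redis first, fall back to the database on a miss.
        static String getUserName(Jedis jedis, long userId) {
            String key = "user:name:" + userId;
            String cached = jedis.get(key);
            if (cached != null) return cached;            // cache hit
            String fromDb = loadFromDatabase(userId);     // slow path
            jedis.setex(key, 300, fromDb);                // cache for 5 minutes
            return fromDb;
        }

        // Stand-in for a real database query.
        static String loadFromDatabase(long userId) { return "user-" + userId; }

        public static void main(String[] args) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                System.out.println(getUserName(jedis, 42));
            }
        }
    }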

5.4 Commonly used relational databases

  • Oracle
    Oracle is the world's largest information management software and service provider, headquartered in Redwood Shores, California, USA. Oracle database products are used by the top 1,000 companies on the Fortune list and are the most well-known and widely used enterprise databases.

  • DB2
    DB2 is a relational database management system developed by IBM, mainly used in large-scale application systems, with good scalability. DB2 was the second relational database launched by IBM, hence the name. It provides high levels of data utilization, integrity, security, parallelism, and recoverability for small to large applications, with platform-independent base functionality and a SQL command environment, and it runs on multiple operating systems including Linux, UNIX, and Windows.

  • Microsoft SQL Server
    Microsoft SQL Server is a comprehensive database platform that provides enterprise-class data management using integrated business intelligence (BI) tools. The Microsoft SQL Server database engine provides more secure and reliable storage capabilities for relational and structured data, allowing you to build and manage highly available and high-performance data applications for your business.

  • MySQL
    MySQL is the most widely used open source relational database. It was developed by the Swedish company MySQL AB and has since been acquired by Oracle. Compared with the mainstream commercial databases Oracle and SQL Server, MySQL is free, runs on any platform, and consumes few resources, so it is favored by individual users and small and medium-sized enterprises (a JDBC access sketch follows this list). For large projects, MySQL's capacity and security are slightly inferior to Oracle.

  • MariaDB
    MariaDB is a fork of MySQL, maintained mainly by the open source community and licensed under the GPL. MariaDB aims to be fully compatible with MySQL, including its API and command line, making it a drop-in MySQL replacement. As its storage engine it used XtraDB in place of MySQL's InnoDB. The name MariaDB comes from Maria, the daughter of founder Michael Widenius.

  • PostgreSQL
    PostgreSQL is a powerful open source object-relational database system that uses and extends the SQL language and adds many features for safely storing and scaling the most complex data workloads. Its origins trace back to 1986 as part of the POSTGRES project at the University of California, Berkeley. PostgreSQL's architecture, reliability, data integrity, feature set, and scalability have been thoroughly proven, and its open source community is very active. Its query language is among the closest to the industry SQL standard, implementing at least 160 of the 179 core features required by SQL:2011 (note: currently no database management system fully implements all of them).

  • TiDB
    TiDB is an open source distributed relational database independently designed and developed by PingCAP. It is a Hybrid Transactional and Analytical Processing (HTAP) database that supports both online transaction processing and online analytical processing, with key features such as horizontal scale-out and scale-in, financial-grade high availability, real-time HTAP, a cloud-native distributed architecture, and compatibility with the MySQL 5.7 protocol and ecosystem. It offers users a one-stop solution covering OLTP, OLAP, and HTAP, and suits scenarios that demand high availability, strong consistency, and large data scale.

  • TBase
    TBase is a database developed by Tencent on the basis of Postgres-XC. Postgres-XC (eXtensible Cluster) is an open source project that provides a scalable, synchronous, symmetric, and transparent PostgreSQL cluster solution. Compared with Postgres-XC, TBase greatly improves stability. By introducing the GROUP concept into the kernel, it proposes a dual-key distribution strategy that effectively solves data skew; it also divides data into cold and hot data by timestamp and stores them on different storage devices, effectively reducing storage cost.

  • OceanBase
    OceanBase is an enterprise-grade distributed relational database independently developed by Ant Group; the project began in 2010. OceanBase is a domestically developed native distributed database that has set world records in both the TPC-C and TPC-H benchmarks. It features strong data consistency, high availability, high performance, online scaling, high compatibility with SQL standards and mainstream relational databases, and low cost.

  • SequoiaDB
    SequoiaDB is a financial-grade distributed relational database that mainly provides high-performance, reliable, stable, and horizontally scalable database services for high-concurrency online transaction scenarios. Users can create multiple types of database instances in SequoiaDB to meet the needs of different upper-layer applications: it supports four relational instance types (MySQL, MariaDB, PostgreSQL, and SparkSQL), JSON document instances, and S3 object storage instances for unstructured data.
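
All of the relational databases above can be accessed from Java through the same JDBC interface; only the driver and connection URL change. Here is a minimal sketch against MySQL; the URL, credentials, and the user table are invented, and the MySQL Connector/J driver is assumed to be on the classpath.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class JdbcExample {
        public static void main(String[] args) throws SQLException {
            String url = "jdbc:mysql://localhost:3306/test"; // hypothetical database
            try (Connection conn = DriverManager.getConnection(url, "app", "secret");
                 PreparedStatement ps =
                         conn.prepareStatement("SELECT id, name FROM user WHERE id = ?")) {
                ps.setLong(1, 42);                           // bind the parameter safely
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getLong("id") + " " + rs.getString("name"));
                    }
                }
            }
        }
    }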
