00-Opening: The right way to learn an open source database and table sharding framework

1 Introduction

The rapid growth of the Internet has brought massive volumes of data and, with them, new technical challenges. Smart terminal devices such as cameras and vehicle-mounted equipment report tens of millions of business records every day, to say nothing of Internet industries such as e-commerce and social networking. Processing data at this scale is far beyond what the single-database, single-table architecture of a traditional relational database can support. How to store and access this data efficiently has become a real and urgent problem.

However, thanks to their mature ecosystem, relational databases remain the cornerstone of core business on data platforms and still hold a huge market. The industry does offer a number of NoSQL databases with distributed sharding built in, but many of them lack core capabilities such as transaction management.

To cope with ever-growing data volumes, a common industry practice is to introduce a database and table sharding architecture (often translated literally as "sub-database and sub-table"). It combines two design approaches: vertical splitting, which places different business modules in separate databases, and horizontal splitting, which spreads the rows of one large table across several databases or tables.
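To make the horizontal part concrete, here is a minimal sketch of how a sharding key can be mapped to a physical database and table. It is not ShardingSphere's API. The class name `ShardRouter`, the `ds_` and `t_order_` naming scheme, and the plain modulo strategy are all assumptions chosen for illustration.

```java
// Hypothetical sketch: route a row to a physical database and table by user ID.
// The names ds_N and t_order_N are illustrative, not ShardingSphere conventions.
final class ShardRouter {
    private final int dbCount;
    private final int tablesPerDb;

    ShardRouter(int dbCount, int tablesPerDb) {
        if (dbCount <= 0 || tablesPerDb <= 0) {
            throw new IllegalArgumentException("dbCount and tablesPerDb must be positive");
        }
        this.dbCount = dbCount;
        this.tablesPerDb = tablesPerDb;
    }

    /** Returns a target such as "ds_1.t_order_3" for the given sharding key. */
    String route(long userId) {
        // floorMod keeps the slot non-negative even for negative keys.
        int slot = (int) Math.floorMod(userId, (long) dbCount * tablesPerDb);
        int dbIndex = slot / tablesPerDb;    // which database holds the row
        int tableIndex = slot % tablesPerDb; // which table inside that database
        return "ds_" + dbIndex + ".t_order_" + tableIndex;
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(2, 4);
        for (long userId : new long[] {1L, 7L, 10L}) {
            System.out.println(userId + " -> " + router.route(userId));
        }
    }
}
```

A fixed modulo is the simplest possible strategy. Its weakness is that changing the number of databases or tables remaps most keys, which is one reason real middleware offers pluggable sharding algorithms.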

2 Putting database and table sharding into practice

Even setting aside business-level planning and design, developers who want to implement a sharding architecture that supports massive data storage and access face a series of problems at the technical implementation level:

  • Data sharding: How can a relational database be sharded at the lowest cost?
  • Proxy mechanism: How can ordinary client tools access data under a sharded architecture?
  • Distributed transactions: How can business data that is spread across different databases and tables be kept consistent?
  • Database governance: How can database resources such as data sources and configuration information, scattered across various environments, be kept consistent?

As a powerful tool for database and table sharding, the distributed database middleware ShardingSphere addresses these pain points well and has advantages over other sharding frameworks such as Cobar and MyCat.

3 Advantages

3.1 Technical authority

ShardingSphere is the first distributed database middleware project in the history of the Apache Foundation and represents the latest technical direction in this field.

3.2 Solution completeness

It integrates the core functions of client sharding, proxy server, and distributed database, and provides a complete set of open source distributed database middleware solutions and ecosystem suitable for Internet application architecture and cloud service architecture.

3.3 Development friendliness

It is easy to integrate. Business developers only need to add a JAR dependency to embed features such as data sharding, read-write splitting, distributed transactions, and database governance in their business code.

3.4 Pluggable system scalability

Many of its core functions are provided as plug-ins that developers can combine to build a system tailored to their own needs.

These features have given ShardingSphere a leading position among database and table sharding middleware, and more and more well-known companies (such as JD.com, Dangdang, Telecom, ZTO Express, and Bilibili) use it to build their own powerful and robust data platforms. If you are struggling to find mature and stable sharding middleware, ShardingSphere can help.

4 Why should you study this column?

Almost any enterprise that processes massive data will need database and table sharding sooner or later. How to design the sharding scheme, migrate existing data into it, and store and access massive business data effectively has become a major topic that many architects and developers must plan and implement. It has also become a requirement for high-paying positions at many well-regarded companies such as Pinduoduo and Dewu.

<img src="/Users/javaedge/Downloads/IDEAProjects/java-edge-master/assets/image-20240101205701475.png" style="zoom: 50%;" />

But quality talent is in short supply:

  • Working on massive data processing requires the right application scenarios, and the technical threshold is high.
  • Few mature frameworks in the industry meet real-world needs, so engineers who have mastered mainstream sharding and distributed database middleware such as ShardingSphere are sought after by major companies.

Since there is no systematic introduction to ShardingSphere on the market, I hope this column can fill the gap. In addition, although the concept of database and table sharding is relatively simple, putting it into practice in real development is not. It calls for systematic learning that moves from the basics to the internals.

5 Outline

Based on the ShardingSphere open source framework, this column introduces mainstream sharding solutions and engineering practices. It aims to be the first column in the industry to systematically cover both the core functions and the implementation principles of ShardingSphere.

  1. Part 1: Introducing ShardingSphere. This part starts with how to understand the database and table sharding architecture correctly, explains the relationship between the JDBC specification and ShardingSphere, and shows the specific ways to use ShardingSphere in business systems through the configuration system it provides.

  2. Part 2: ShardingSphere core functions. ShardingSphere offers many features. This part covers the usage and development techniques for core functions such as data sharding, read-write splitting, distributed transactions, data desensitization, and orchestration and governance.

Parts 3 to 6 are the focus. They analyze the core architecture of ShardingSphere in depth from different dimensions, explain the design and implementation of database and table sharding at the source code level, and help you improve your ability to read source code.

  1. Part 3: ShardingSphere source code analysis: infrastructure. This part focuses on the infrastructure of ShardingSphere. It first gives you a method for reading the ShardingSphere source code efficiently, then introduces the design concepts of the microkernel architecture and distributed primary keys, along with how ShardingSphere implements them.

  2. Part 4: ShardingSphere source code analysis: sharding engine. This part focuses on how the core sharding engine of ShardingSphere is implemented. It starts with the SQL parsing engine and then analyzes the source code of the other key components of the sharding engine: the routing engine, rewriting engine, execution engine, and merging engine.

  3. Part 5: ShardingSphere source code analysis: distributed transactions. Distributed transactions are an essential function of distributed database middleware, and ShardingSphere provides its own internal abstraction for them. I will analyze this abstraction in detail and show how strongly consistent transactions and flexible (eventually consistent) transactions are implemented.

  4. Part 6: ShardingSphere source code analysis: governance and integration. This part discusses database governance topics: how to implement a low-intrusion data desensitization solution based on the rewriting engine, how to manage configuration information dynamically based on the configuration center, how to implement a circuit breaker for database access based on the registry center, and how to implement data access link tracing based on the hook mechanism and the OpenTracing protocol.

6 What you will gain

Application methods and implementation principles of database and table sharding

You will understand the core features of ShardingSphere well enough to meet the needs of daily development work, and you will see the design principles and implementation mechanisms behind these features explained from the source code.

Learn excellent open source frameworks and improve technical understanding and application capabilities

The technical principles are similar. Take ZooKeeper, a distributed coordination framework, as an example: both ShardingSphere and Dubbo use it to build their registry center.

In ShardingSphere, we can use the dynamic listening mechanism provided by ZooKeeper to determine whether a database instance is available and whether it needs an operation such as a data access circuit break. We can also use the same ZooKeeper feature to manage configuration information dynamically in a distributed environment.

As you study ShardingSphere in depth, you will find many similar examples, including the microkernel architecture based on the SPI mechanism, distributed primary keys based on the snowflake algorithm, the configuration center based on Apollo, the registry center based on Nacos, flexible transactions based on Seata, and link tracing based on the OpenTracing specification. These technologies also appear in mainstream development frameworks such as Dubbo and Spring Cloud. This column can therefore strengthen your systematic understanding of them and teach you their concrete application scenarios and implementation methods, so that what you learn in one framework carries over to others.
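As one example of these shared ideas, the snowflake algorithm can be sketched in a few lines. This is an illustrative version, not ShardingSphere's implementation. It assumes the common layout of a millisecond timestamp, a 10-bit worker ID, and a 12-bit sequence. The class name `SnowflakeIdGenerator`, the 2020-01-01 epoch, and the injected clock are hypothetical choices made to keep the sketch testable.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch of a snowflake-style ID generator.
// Layout assumed here: [timestamp since EPOCH | 10-bit worker ID | 12-bit sequence].
final class SnowflakeIdGenerator {
    private static final long EPOCH = 1577836800000L; // 2020-01-01T00:00:00Z, an arbitrary assumption
    private static final int WORKER_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER_ID = (1L << WORKER_BITS) - 1;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private final LongSupplier clock; // returns the current time in milliseconds
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeIdGenerator(long workerId, LongSupplier clock) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId must be in [0, " + MAX_WORKER_ID + "]");
        }
        this.workerId = workerId;
        this.clock = clock;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastTimestamp) {
            throw new IllegalStateException("Clock moved backwards");
        }
        if (now == lastTimestamp) {
            // Same millisecond: bump the sequence, wrapping at 4096.
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted: busy-wait until the clock advances.
                while ((now = clock.getAsLong()) <= lastTimestamp) {
                    // spin
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    public static void main(String[] args) {
        SnowflakeIdGenerator generator = new SnowflakeIdGenerator(1, System::currentTimeMillis);
        System.out.println(generator.nextId());
        System.out.println(generator.nextId());
    }
}
```

Because the timestamp occupies the high bits, IDs from one worker increase over time, and the worker ID keeps IDs from different nodes distinct without any coordination at generation time. Handling a clock that moves backwards by throwing is the simplest policy. Production implementations often tolerate small drifts instead.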

Carry skills from source code analysis into daily development

Moving from source code analysis to daily application is a core goal of this column. From ShardingSphere, an excellent open source framework, we can extract a series of ideas and implementation strategies, including the application of design patterns (such as the factory pattern, strategy pattern, and template method), architectural patterns such as the microkernel architecture, and component design and class hierarchy division. We can also extract development techniques such as the use of common caches, the implementation of custom cache mechanisms, and integration with the Spring family of frameworks. These techniques can be applied directly in daily development.

7 Summary

Technology changes with each passing day. As architectural concepts such as data centers and all kinds of artificial intelligence applications become widespread, ever-growing data volumes are a major challenge for most software systems. Database and table sharding frameworks like ShardingSphere will move into a new period of development and be adopted by more enterprises.

However, few sharding frameworks are both mature and actively developed, so enterprises do not have much choice. ShardingSphere is the only top-level Apache project in this field so far. It also provides the richest set of core functions and represents a direction of technical development in this field. I hope this column helps you learn ShardingSphere well and master a way of learning that carries over to other frameworks.



Origin my.oschina.net/u/3494859/blog/10545194