TiDB from Beginner to Proficient, Part 2: Introduction to the TiDB Database

1. Introduction to the TiDB database

  • TiDB is an open-source distributed relational database. It is a converged distributed database product that supports both online transactional processing and online analytical processing (Hybrid Transactional and Analytical Processing, HTAP). Its important features include real-time HTAP, a cloud-native distributed design, and compatibility with the MySQL 5.7 protocol and the MySQL ecosystem. The goal is to provide users with one-stop OLTP (Online Transactional Processing), OLAP (Online Analytical Processing), and HTAP solutions. TiDB suits application scenarios that require high availability, strong consistency, and large data scale.

2. Five core features

One-click horizontal scale-out or scale-in

  • Thanks to TiDB's architecture, which separates computing from storage, computing and storage can each be scaled out or in online as needed, and the scaling process is transparent to application operations and maintenance staff.

Financial-grade high availability

  • Data is stored in multiple replicas, and the replicas synchronize transaction logs through the Multi-Raft protocol. A transaction can be committed only after it has been written successfully to a majority of replicas, which ensures strong data consistency and keeps data available when a minority of replicas fail. Policies such as the geographic location and number of replicas can be configured as needed to meet different disaster recovery levels.
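
The majority rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic behind majority commit, not TiDB or TiKV code; the function names `quorum`, `tolerated_failures`, and `can_commit` are hypothetical.

```python
def quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Replicas that can fail while a majority is still reachable."""
    return replicas - quorum(replicas)


def can_commit(replicas: int, acknowledged: int) -> bool:
    """A write is committable once a majority has persisted the log entry."""
    return acknowledged >= quorum(replicas)


if __name__ == "__main__":
    for n in (3, 5):
        print(f"{n} replicas: quorum={quorum(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

With 3 replicas a write needs 2 acknowledgements and 1 replica may fail; with 5 replicas it needs 3 and 2 may fail, which is why replica counts are usually odd.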

Real-time HTAP

  • TiDB provides two storage engines: the row-based storage engine TiKV and the columnar storage engine TiFlash. TiFlash replicates data from TiKV in real time through the Multi-Raft Learner protocol, which keeps the data in TiKV and TiFlash strongly consistent. TiKV and TiFlash can be deployed on different machines as needed, which addresses the problem of HTAP resource isolation.
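
To see why a row layout suits transactions and a column layout suits analysis, here is a toy sketch in Python. It is a conceptual illustration only and does not reflect how TiKV or TiFlash store data internally; the table contents and function names are made up for the example.

```python
# Row-oriented layout: each record is kept together.
rows = [
    {"order_id": 1, "customer": "alice", "amount": 30},
    {"order_id": 2, "customer": "bob", "amount": 45},
    {"order_id": 3, "customer": "alice", "amount": 25},
]


def to_columns(rows):
    """Pivot a row-oriented table into a column-oriented one."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def point_lookup(rows, order_id):
    """Transactional access pattern: fetch one whole record."""
    for row in rows:
        if row["order_id"] == order_id:
            return row
    return None


def total_amount(columns):
    """Analytical access pattern: scan a single column only."""
    return sum(columns["amount"])


columns = to_columns(rows)
```

A point lookup touches one complete record in the row layout, while the aggregate reads only the `amount` list in the column layout and never touches the other fields.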

Cloud-native distributed database

  • TiDB is a distributed database designed for the cloud. With TiDB Operator, it can be deployed on public, private, and hybrid clouds in a tool-driven, automated way.

Compatible with MySQL 5.7 protocol and MySQL ecosystem

  • TiDB is compatible with the MySQL 5.7 protocol, common MySQL features, and the MySQL ecosystem, so applications can be migrated from MySQL to TiDB with little or no code modification. TiDB also provides a rich set of data migration tools to help applications migrate data easily.
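
Because TiDB speaks the MySQL protocol, an ordinary MySQL driver can be pointed at it. The sketch below assumes the third-party PyMySQL driver is installed and that a TiDB server listens on its default port 4000; the host, the `root` user with an empty password, and the `test` database are assumptions for a local test cluster, and both function names are hypothetical.

```python
def tidb_connection_kwargs(host="127.0.0.1", port=4000, user="root",
                           password="", database="test"):
    """Build keyword arguments for a MySQL driver's connect() call.

    All defaults are assumptions for a local test cluster.
    """
    return {
        "host": host,
        "port": port,
        "user": user,
        "password": password,
        "database": database,
    }


def fetch_version(**overrides):
    """Connect with a MySQL driver and return the server's version string."""
    import pymysql  # third-party MySQL driver; assumed to be installed

    conn = pymysql.connect(**tidb_connection_kwargs(**overrides))
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT VERSION()")
            return cur.fetchone()[0]
    finally:
        conn.close()
```

Calling `fetch_version()` requires a reachable server. The point of the sketch is that nothing in it is TiDB-specific except the port number.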

3. Four core application scenarios

Financial-industry scenarios that require high data consistency, high reliability, high system availability, scalability, and disaster recovery

  • The financial industry has high requirements for data consistency, reliability, system availability, scalability, and disaster recovery. The traditional solution is to serve traffic from two data centers in the same city and to keep a third data center in a remote location that provides disaster recovery for data but serves no traffic. This solution has the following disadvantages: low resource utilization, high maintenance costs, and RTO (Recovery Time Objective) and RPO (Recovery Point Objective) values that fall short of what the enterprise expects. TiDB uses multiple replicas together with the Multi-Raft protocol to schedule data across different data centers, racks, and machines. When some machines fail, the system fails over automatically, keeping RTO <= 30s and RPO = 0.
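
Whether a replica layout survives the loss of a whole data center comes down to whether a majority of replicas remains. The Python sketch below checks this for two hypothetical layouts; the site names and replica counts are made up for illustration and are not TiDB defaults.

```python
def survives_site_loss(replicas_per_site):
    """For each site, report whether a majority of replicas remains
    if that entire site is lost."""
    total = sum(replicas_per_site.values())
    majority = total // 2 + 1
    return {
        site: (total - count) >= majority
        for site, count in replicas_per_site.items()
    }


# Hypothetical layout: 3 replicas across two data centers.
two_sites = {"dc-a": 2, "dc-b": 1}

# Hypothetical layout: 5 replicas across two data centers in one city
# and one data center in a remote city.
three_sites = {"city1-dc-a": 2, "city1-dc-b": 2, "city2-dc": 1}

if __name__ == "__main__":
    print(survives_site_loss(two_sites))
    print(survives_site_loss(three_sites))
```

In the two-site layout, losing `dc-a` leaves only 1 of 3 replicas, so no majority remains. In the three-site layout, losing any single site leaves at least 3 of 5.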

Massive-data, high-concurrency OLTP scenarios with high requirements for storage capacity, scalability, and concurrency

  • As a business grows rapidly, its data grows explosively, and a traditional standalone database can no longer meet the capacity requirements. Feasible alternatives are sharding middleware (splitting databases and tables), a NewSQL database, or high-end storage devices. Of these, a NewSQL database such as TiDB is the most cost-effective. TiDB separates computing from storage, so the two can be scaled out and in independently. The computing layer supports up to 512 nodes, each node supports a maximum concurrency of 1,000, and cluster capacity scales up to the PB level.

Real-time HTAP scenarios

  • With the rapid development of 5G, the Internet of Things, and artificial intelligence, enterprises produce more and more data, and its scale may reach hundreds of terabytes or even petabytes. The traditional solution is to process online transactions in an OLTP database and to synchronize the data to an OLAP database through ETL tools for analysis. This approach has problems such as high storage cost and poor real-time performance. TiDB introduced the columnar storage engine TiFlash in version 4.0; combined with the row-based storage engine TiKV, it forms a true HTAP database. For a small increase in storage cost, online transaction processing and real-time data analysis can run in the same system, which greatly reduces enterprise costs.

Data aggregation and secondary processing scenarios

  • At present, most enterprises' business data is scattered across different systems with no unified view. As the business grows, decision makers need to understand the state of the whole company in order to make timely decisions. This requires gathering data from the various systems into one system and processing it a second time to generate T+0 or T+1 reports. The common traditional solution is ETL + Hadoop, but the Hadoop stack is complex, and its operations, maintenance, and storage costs are too high to meet users' needs. Compared with Hadoop, TiDB is much simpler: the business synchronizes data to TiDB with ETL tools or TiDB's synchronization tools, and reports can then be generated in TiDB directly through SQL.
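
A report of this kind is typically a plain aggregate query over the collected data. The sketch below uses Python's built-in SQLite as a stand-in so that it runs without a cluster; it is not TiDB itself. The `orders` table, its columns, and the sample rows are hypothetical, and against TiDB the same `SELECT` would be sent through a MySQL-compatible driver.

```python
import sqlite3


def daily_revenue_report(conn):
    """Aggregate revenue per source system and day with plain SQL."""
    sql = """
        SELECT source_system, order_date, SUM(amount) AS revenue
        FROM orders
        GROUP BY source_system, order_date
        ORDER BY source_system, order_date
    """
    return conn.execute(sql).fetchall()


# SQLite in-memory database as a stand-in for the central database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (source_system TEXT, order_date TEXT, amount INTEGER)"
)
# Made-up sample rows, as if synchronized from two upstream systems.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("crm", "2023-08-01", 100),
        ("crm", "2023-08-01", 50),
        ("shop", "2023-08-01", 70),
        ("shop", "2023-08-02", 30),
    ],
)
report = daily_revenue_report(conn)
```

Each tuple in `report` holds the source system, the day, and the summed revenue, which is the shape a T+0 or T+1 report would start from.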

4. Compatibility between TiDB and MySQL

TiDB is highly compatible with the MySQL 5.7 protocol and with the commonly used features and syntax of MySQL 5.7.

TiDB does not support the MySQL replication protocol, but provides dedicated tools for replicating data with MySQL:

  • Replication from MySQL: TiDB Data Migration (DM) is a tool for migrating MySQL/MariaDB data to TiDB and can be used to replicate incremental data.
  • Replication to MySQL: TiCDC is a tool that replicates TiDB's incremental data by pulling TiKV change logs; it can replicate incremental TiDB data to MySQL through its MySQL sink.

Origin blog.csdn.net/zhengzaifeidelushang/article/details/132307479