Big news: open source distributed NewSQL database TiDB 2.0 officially released

  

TiDB 1.0 was released last October. Over the following six months, the development team maintained the stability of 1.0 and added necessary new features while developing 2.0 in parallel. After six RC releases, TiDB 2.0 GA was officially released on April 27.

2.0 version planning

Based on the needs of existing users, technology trends, and community feedback, TiDB 2.0 focuses on the following points:

  • Ensure the stability and correctness of TiDB. These two properties are the most basic requirements of database software. As the cornerstone of the business, any jitter or error in the database can have a huge impact on it. A large number of users already run TiDB in production, their data volumes keep growing, and their services keep evolving.

  • Improve TiDB's query performance on large data volumes. Many TiDB users currently hold anywhere from hundreds of gigabytes to hundreds of terabytes of data, so faster queries at that scale are a significant help to them.

  • Make TiDB easier to use and maintain. The TiDB system as a whole is fairly complex, and it is harder to operate, maintain, and use than a stand-alone database, so the team wants to offer the most convenient solutions possible. For example, deployment, upgrade, and scaling should be as simple as possible, and abnormal states in the system should be as easy to locate as possible.

Guided by these three principles, TiDB has made many improvements. Some are visible from the outside, such as significantly better OLAP performance, a large number of new monitoring items, and various optimizations to the operation and maintenance tools. Many more are hidden inside the database, quietly improving the stability and correctness of the whole system.

Correctness and stability

After the release of 1.0, TiDB began building and refining Schrodinger, an automated testing platform, and said goodbye to manually deploying clusters for testing. Many test cases were added at the same time, so that testing covers every layer from RocksDB at the bottom, through Raft and Transaction, up to SQL.

For Chaos testing, TiDB introduced more fault injection tools, such as using systemtap to delay I/O, and also ran fault injection tests against specific business logic in the code, to make sure TiDB runs stably under abnormal conditions.

The TiDB development team had previously done a fair amount of TLA+ verification work, along with some simple tests. After 1.0, they began using TLA+ systematically to verify that the design is correct.

In the storage engine, to improve the stability and performance of large clusters, TiDB optimized the Raft process and introduced new features such as Region Merge and Raft Learner; optimized the hotspot scheduling mechanism, which now collects more information and uses it to schedule more sensibly; and optimized RocksDB performance, using features such as DeleteFilesInRanges to reclaim space more efficiently, reduce disk load, and use disk resources more evenly.

OLAP performance optimization

TiDB 2.0 refactors the SQL optimizer and execution engine, aiming to select the optimal query plan as quickly as possible and execute it as efficiently as possible.

Version 1.0 had already moved from a rule-based to a cost-based query optimizer, but it was not perfect. Version 2.0 improves the accuracy and freshness of statistics on the one hand, and strengthens the SQL optimizer on the other: query cost estimation is more accurate, analysis of complex filter conditions is more thorough, correlated sub-queries are handled more elegantly, and physical operator selection is more flexible and accurate.

In this version, the SQL execution engine introduces a new internal data representation, `Chunk`, which stores a batch of rows in one structure instead of a single row. Data in the same column is stored contiguously in memory, making memory usage more compact. This brings several advantages: 1. Memory consumption is significantly reduced; 2. Memory is allocated in batches, reducing GC overhead; 3. Data can be passed between operators in batches, reducing call overhead; 4. In some scenarios, vectorized computation becomes possible and CPU cache misses are reduced.
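To make the column-oriented batch idea concrete, here is a minimal Python sketch. It is not TiDB's actual `Chunk` implementation (TiDB is written in Go); the class layout, method names, and type codes below are assumptions chosen purely for illustration. Each column lives in its own contiguous typed array, and an operator works on a whole column of the batch at once.

```python
from array import array


class Chunk:
    """A batch of rows stored column by column (hypothetical, illustrative only)."""

    def __init__(self, type_codes):
        # One contiguous typed array per column, e.g. "q" = int64, "d" = float64.
        self.columns = [array(code) for code in type_codes]

    def append_row(self, row):
        # A row is split up so that each value lands in its own column array.
        for column, value in zip(self.columns, row):
            column.append(value)

    def num_rows(self):
        return len(self.columns[0]) if self.columns else 0

    def column(self, idx):
        return self.columns[idx]


def sum_column(chunk, idx):
    # The operator consumes a whole column of the batch in one call,
    # instead of being invoked once per row.
    return sum(chunk.column(idx))


chunk = Chunk(["q", "d"])
for row in [(1, 10.0), (2, 20.5), (3, 30.0)]:
    chunk.append_row(row)

total_ids = sum_column(chunk, 0)
total_amount = sum_column(chunk, 1)
```

Because values of one column sit next to each other, a scan over that column touches memory sequentially, which is where the cache and batching benefits described above would come from.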

With these two changes, TiDB's performance in OLAP scenarios has improved greatly. In TPC-H comparisons, all queries run faster in 2.0, most of them by several times or even orders of magnitude. In particular, some queries that could not produce results in 1.0 now run smoothly in 2.0.

Ease of Use and Operability

In order to be easier to install and use, TiDB 2.0 has also made many optimizations in monitoring, operation and maintenance, and tools.

In terms of monitoring, more than 100 monitoring items have been added, and some runtime information is exposed through HTTP interfaces, SQL statements, etc., which are used for system tuning or locating problems in the system.

In terms of operation and maintenance, the tools have been optimized to simplify operations, reducing both their complexity and their impact on online services. They are also more capable, supporting automatic deployment of Binlog components and enabling TLS.

2.0 Detailed update list

TiDB:

1. SQL Optimizer

  • Streamlined statistics data structure to reduce memory usage

  • Speed up loading statistics on process startup

  • Support dynamic update of statistics [experimental]

  • Optimize the cost model for more accurate cost estimation

  • Use `Count-Min Sketch` to estimate the cost of point queries more accurately

  • Supports analysis of more complex conditions, using indexes as fully as possible

  • Support for manually specifying Join order via `STRAIGHT_JOIN` syntax

  • Use the Stream Aggregation operator when the `GROUP BY` clause is empty to improve performance

  • Support for calculating `Max/Min` functions using indexes

  • Optimize the processing of correlated sub-queries to support decorrelating more types of correlated sub-queries and transforming them into `Left Outer Join`

  • Expand the scope of `IndexLookupJoin`, so the algorithm can also be used in index prefix matching scenarios
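For readers unfamiliar with the `Count-Min Sketch` mentioned in the list above, the following is a minimal generic sketch of the data structure in Python. It is not TiDB's implementation; the class name, the default `depth` and `width`, and the use of salted BLAKE2 hashes are all assumptions made for illustration. The structure keeps approximate per-value counts in fixed memory, which is how an optimizer could estimate how many rows match a given value without storing every value.

```python
import hashlib


class CountMinSketch:
    """Approximate per-value counters in fixed memory (illustrative only)."""

    def __init__(self, depth=4, width=256):
        self.depth = depth
        self.width = width
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, value):
        # A different salt per row acts as a separate hash function for that row.
        digest = hashlib.blake2b(
            repr(value).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, value, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, value)] += count

    def estimate(self, value):
        # Collisions can only inflate a counter, so the minimum is the tightest bound.
        return min(
            self.table[row][self._index(row, value)] for row in range(self.depth)
        )


sketch = CountMinSketch()
for user_id in [7, 7, 7, 42, 42, 1001]:
    sketch.add(user_id)

estimated_rows = sketch.estimate(7)  # never below the true count of 3
```

The estimate can overshoot when values collide in every row, but it never undershoots, so a wider table trades memory for accuracy.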

2. SQL execution engine

  • Use Chunk structure to reconstruct all executor operators, improve analytical statement execution performance, reduce memory usage, and significantly improve TPC-H results

  • Support Streaming Aggregation operator pushdown

  • Optimized the performance of `Insert Into Ignore` statement by more than 10 times

  • Optimized the performance of `Insert On Duplicate Key Update` statement by more than 10 times

  • Push down more data types and functions to TiKV calculation

  • Optimized `Load Data` performance, increased by more than 10 times

  • Supports statistics on the memory usage of physical operators, and specifies the processing behavior after the threshold is exceeded through configuration files and system variables

  • Supports limiting the size of memory used by a single SQL statement to reduce the risk of program OOM

  • Support for implicit row IDs in CRUD operations

  • Improve check performance
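The memory accounting items in the list above can be pictured with a small tracker tree. This is a hedged sketch only: the `MemoryTracker` class, the `QuotaExceeded` exception, and the `"log"` and `"cancel"` action names are hypothetical and do not correspond to TiDB's actual configuration options or system variables. Each physical operator reports its usage to a statement-level tracker, which applies the configured behavior once the threshold is exceeded.

```python
class QuotaExceeded(RuntimeError):
    """Raised when a statement goes over its memory quota (hypothetical)."""


class MemoryTracker:
    def __init__(self, label, quota_bytes=None, action="log", parent=None):
        self.label = label
        self.quota_bytes = quota_bytes  # None means "no limit at this level"
        self.action = action            # assumed values: "log" or "cancel"
        self.parent = parent
        self.used_bytes = 0
        self.warnings = []

    def consume(self, nbytes):
        # Record usage at this level first.
        self.used_bytes += nbytes
        if self.quota_bytes is not None and self.used_bytes > self.quota_bytes:
            message = (
                f"{self.label}: {self.used_bytes} bytes used, "
                f"quota {self.quota_bytes}"
            )
            if self.action == "cancel":
                raise QuotaExceeded(message)
            self.warnings.append(message)
        # Then propagate the same amount up to the statement-level tracker.
        if self.parent is not None:
            self.parent.consume(nbytes)


statement = MemoryTracker("statement", quota_bytes=1024, action="cancel")
hash_join = MemoryTracker("HashJoin", parent=statement)
sort = MemoryTracker("Sort", parent=statement)

hash_join.consume(600)
sort.consume(300)
```

With these numbers the statement has used 900 of its 1024 bytes; a further `sort.consume(200)` would push it over the quota and raise `QuotaExceeded`, whereas an action of `"log"` would only record a warning.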

3. Server

  • Support Proxy Protocol

  • Add a lot of monitoring items, optimize logs

  • Validation of configuration files is supported

  • Support HTTP API to obtain TiDB parameter information

  • Use Batch Resolve Lock to improve garbage collection speed

  • Supports multi-threaded garbage collection

  • TLS support

4. Compatibility

  • Support for more MySQL syntax

  • Support configuration file to modify `lower_case_table_names` system variable to support OGG data synchronization tool

  • Improve compatibility with Navicat

  • Support showing table creation time in `Information_Schema`

  • Fix the problem that the return type of some functions/expressions is different from that of MySQL

  • Improve compatibility with JDBC

  • Support more `SQL_MODE`

5. DDL

  • Optimize the execution speed of `Add Index`, the speed is greatly improved in some scenarios

  • The `Add Index` operation is changed to a low priority to reduce the impact on online business

  • `Admin Show DDL Jobs` outputs more detailed DDL job status information

  • Support `Admin Show DDL Job Queries JobID` to query the raw statement of currently running DDL jobs

  • Supports the `Admin Recover Index` command for recovering index data in disaster recovery situations

  • Support modifying Table Options via `Alter` statement

PD:

1. Add `Region Merge` support, merge empty Regions generated after data deletion [experimental]

2. Add `Raft Learner` support [experimental]

3. Scheduler optimization

  • The scheduler adapts to different Region sizes

  • Improve the priority and speed of data recovery when TiKV is down

  • Improve the speed of data migration for offline TiKV nodes

  • Optimize the scheduling strategy when the TiKV node space is insufficient, and try to prevent the disk from being full when the space is insufficient

  • Improve the scheduling efficiency of the balance-leader scheduler

  • Reduce the scheduling overhead of the balance-region scheduler

  • Optimize the execution efficiency of hot-region scheduler

4. Operation and maintenance interface and configuration

  • Add TLS support

  • Support setting PD leader priority

  • Support label-based configuration properties

  • Support configuring nodes with specific labels so that Region leaders are not scheduled to them

  • Support manual Split Region, which can be used to deal with single-Region hotspots

  • Support scattering a specified Region, for manually adjusting the distribution of hotspot Regions in some cases

  • Added configuration parameter inspection rules to improve the validity of configuration items

5. Debug interface

  • Added `Drop Region` debugging interface

  • Add an interface for listing the health status of each PD

6. Statistics related

  • Add statistics for abnormal Regions

  • Add Region isolation level statistics

  • Add scheduling related metrics

7. Performance optimization

  • The PD leader tries to keep pace with the etcd leader to improve write performance

  • Optimized Region heartbeat performance, now supports over 1 million Regions

TiKV:

1. Function

  • Protect critical configurations from erroneous modifications

  • Support `Region Merge` [experimental]

  • Add `Raw DeleteRange` API

  • Add `GetMetric` API

  • Add `Raw Batch Put`, `Raw Batch Get`, `Raw Batch Delete`, and `Raw Batch Scan`

  • Add Column Family parameter to Raw KV API, can operate on specific Column Family

  • Coprocessor supports streaming mode and streaming aggregation

  • Supports configuring the timeout period for Coprocessor requests

  • Heartbeat packets carry timestamps

  • Support online modification of some RocksDB parameters, such as `block-cache-size`

  • Support for configuring the behavior of the Coprocessor when it encounters certain errors

  • Support starting in data import mode to reduce write amplification during data import

  • Supports manually splitting the region in half

  • Improve the data repair tool tikv-ctl

  • Coprocessor returns more statistics to guide TiDB's behavior

  • Support ImportSST API, can be used for SST file import [experimental]

  • Added TiKV Importer binary, integrated with TiDB Lightning for fast data import [experimental]

2. Performance

  • Use ReadPool to optimize read performance, `raw_get/get/batch_get` improves by 30%

  • Improve metrics performance

  • Notify PD immediately after Raft snapshot is processed to speed up scheduling

  • Solve the performance jitter problem caused by RocksDB flushing

  • Improve space reclamation after data deletion

  • Speed up the garbage cleanup process during startup

  • Use `DeleteFilesInRanges` to reduce I/O overhead during replica migration

3. Stability

  • Solve the problem that the gRPC call does not return when the PD leader switches

  • Solve the problem that taking a node offline is slow due to snapshots

  • Limit the amount of space temporarily occupied when moving replicas

  • If there is a Region without a Leader for a long time, report it

  • Update the statistical Region size in time according to the compaction event

  • Limit the amount of data scanned by a single scan lock request to prevent timeouts

  • Limit the memory usage in the process of receiving snapshots to prevent OOM

  • Improve the speed of CI tests

  • Solve the OOM problem caused by too many snapshots

  • Configure gRPC's `keepalive` parameter

  • Fix the problem that OOM easily occurs as the number of Regions grows

In addition, TiSpark 1.0 GA was released at the same time. TiSpark 1.0 provides the ability to use Apache Spark for distributed computing on data in TiDB. Updates include:

1. Provides a gRPC communication framework for TiKV reading

2. Provides encoding and decoding of TiKV component data and communication protocol parts

3. Provides a calculation pushdown function, including

  • Aggregate pushdown

  • Predicate pushdown

  • TopN pushdown

  • Limit pushdown
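As an illustration of what computation pushdown buys, here is a minimal Python sketch of TopN pushdown. It does not use the TiSpark or TiKV APIs; the function names and the sample data are hypothetical. Each storage Region returns only its local top N rows, and the compute layer merges those small partial results instead of pulling every row across the network.

```python
import heapq


def region_topn(rows, n, key):
    # Runs next to the data: at most n rows leave each Region.
    return heapq.nlargest(n, rows, key=key)


def topn_with_pushdown(regions, n, key):
    partials = [region_topn(rows, n, key) for rows in regions]
    candidates = [row for partial in partials for row in partial]
    # The compute layer merges at most n * len(regions) candidate rows.
    return heapq.nlargest(n, candidates, key=key)


regions = [
    [("a", 5), ("b", 17), ("c", 3)],
    [("d", 42), ("e", 8)],
    [("f", 11), ("g", 29), ("h", 1)],
]
top3 = topn_with_pushdown(regions, 3, key=lambda row: row[1])
```

The result is the same as sorting all rows in one place, because any row in the global top N must also be in the top N of its own Region. Limit, predicate, and aggregate pushdown follow the same pattern of shrinking the data before it is transferred.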

4. Provides index related support

  • Transform predicates into clustered index ranges

  • Transform predicates into secondary index lookups

  • Index Only query optimization

  • Optimization for downgrading an index scan to a table scan at runtime

5. Provides cost-based optimization

  • Statistics support

  • Index selection

  • Broadcast table cost estimation

6. Multiple Spark Interface support

  • Spark Shell support

  • ThriftServer/JDBC support

  • Spark-SQL interactive support

  • PySpark Shell support

  • SparkR support
