On April 27, 2018, TiDB released version 2.0 GA. Compared with version 1.0, many improvements have been made in MySQL compatibility, system stability, the optimizer, and the executor.
TiDB
SQL optimizer
Streamlined statistics data structure to reduce memory usage
Speed up loading stats on process startup
Support dynamic update of statistics [experimental]
Optimize the cost model for more accurate cost estimation
Use Count-Min Sketch to estimate the cost of point queries more precisely
Support analyzing more complex conditions, to use indexes as fully as possible
Support manually specifying the join order through the STRAIGHT_JOIN syntax
Use the Stream Aggregation operator when the GROUP BY clause is empty, to improve performance
Support using indexes to compute the Max/Min functions
Optimize the handling of correlated subqueries, to decorrelate and transform more types of correlated subqueries into Left Outer Join
Expand the scope of IndexLookupJoin, so the algorithm can also be used in index prefix matching scenarios
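A Count-Min Sketch estimates how often a value occurs by hashing it into several counter rows and taking the minimum across rows, so the estimate may over-count but never under-counts. The following is a minimal illustrative sketch of the data structure in Python; it is not TiDB's implementation, and all names and parameters here are hypothetical:

```python
import hashlib

class CountMinSketch:
    """Minimal Count-Min Sketch: frequency estimates in sub-linear space.

    Illustrative only; TiDB's statistics module differs in hashing,
    sizing, and integration with the cost model.
    """

    def __init__(self, depth=4, width=1024):
        self.depth = depth  # number of independent hash rows
        self.width = width  # counters per row
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Derive a per-row hash by salting the item with the row number.
        digest = hashlib.md5(f"{row}:{item}".encode()).hexdigest()
        return int(digest, 16) % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Each cell over-counts due to collisions; the minimum across
        # rows is the tightest available bound, never below the truth.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))

cms = CountMinSketch()
for value in ["a"] * 100 + ["b"] * 5:
    cms.add(value)
print(cms.estimate("a"))  # at least 100, close to it with this width
```

Because the estimate is an upper bound on the true count, a point-query selectivity derived from it errs toward scanning slightly more rows rather than missing rows.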
SQL execution engine
Use the Chunk structure to rebuild all executor operators, improving the execution performance of analytical statements, reducing memory usage, and significantly improving TPC-H results
Support Streaming Aggregation operator pushdown
Optimize Insert Into Ignore statement performance, improved by more than 10 times
Optimize Insert On Duplicate Key Update statement performance, improved by more than 10 times
Push down more data types and functions to TiKV for computation
Optimize Load Data performance, improved by more than 10 times
Support collecting statistics on the memory usage of physical operators, and specifying the handling behavior after the threshold is exceeded through configuration files and system variables
Supports limiting the size of memory used by a single SQL statement to reduce the risk of program OOM
Support for implicit row IDs in CRUD operations
Improve check performance
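The idea behind Chunk-based execution is that each operator call processes a whole batch of rows instead of a single row, amortizing per-call overhead and keeping data cache-friendly. A minimal Python sketch of the batching idea follows; it is illustrative only (TiDB's Chunk is a columnar in-memory format implemented in Go), and the function names are hypothetical:

```python
CHUNK_SIZE = 1024  # rows handed to an operator per call

def chunks(column, size=CHUNK_SIZE):
    """Yield the column in fixed-size batches ("chunks")."""
    for i in range(0, len(column), size):
        yield column[i:i + size]

def chunked_filtered_sum(column, predicate):
    """Filter + aggregate, one chunk at a time rather than one row at a time."""
    total = 0
    for chunk in chunks(column):
        # The per-chunk loop body is where a real engine applies
        # vectorized evaluation over the whole batch.
        total += sum(v for v in chunk if predicate(v))
    return total

data = list(range(10_000))
print(chunked_filtered_sum(data, lambda v: v % 2 == 0))  # 24995000
```

The win comes from crossing operator boundaries once per 1024 rows instead of once per row, which is what drives the TPC-H improvements mentioned above.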
Server
Support Proxy Protocol
Add many monitoring metrics and optimize logs
Validation of configuration files is supported
Support HTTP API to obtain TiDB parameter information
Use Batch Resolve Lock to improve garbage collection speed
Supports multi-threaded garbage collection
Support TLS
Compatibility
Support for more MySQL syntax
Support modifying the lower_case_table_names system variable through the configuration file, to support the OGG data synchronization tool
Improve compatibility with Navicat
Support displaying the table creation time in Information_Schema
Fix the problem that the return types of some functions/expressions differ from those of MySQL
Improve compatibility with JDBC
Support more SQL_MODE values
DDL
Optimize Add Index execution speed, which is greatly improved in some scenarios
Change the Add Index operation to low priority, to reduce the impact on online business
Output more detailed DDL task status information in Admin Show DDL Jobs
Support Admin Show DDL Job Queries JobID for querying the original statements of currently running DDL tasks
Support the Admin Recover Index command to repair index data in disaster recovery situations
Support modifying Table Options through the Alter statement
PD
Add Region Merge support, to merge empty Regions generated after data deletion [experimental]
Add Raft Learner support [experimental]
Scheduler optimization
The scheduler adapts to different Region sizes
Improve the priority and speed of data recovery when TiKV is down
Improve the speed of data migration for offline TiKV nodes
Optimize the scheduling strategy when TiKV node space is insufficient, to try to prevent the disk from filling up
Improve the scheduling efficiency of the balance-leader scheduler
Reduce the scheduling overhead of the balance-region scheduler
Optimize the execution efficiency of hot-region scheduler
Operation and maintenance interface and configuration
Add TLS support
Support setting PD leader priority
Support label-based configuration properties
Support configuring specific labels so that Region leaders are not scheduled to nodes with those labels
Supports manual Split Region, which can be used to deal with single Region hotspots
Support scattering a specified Region, to manually adjust the distribution of hotspot Regions in some cases
Added configuration parameter inspection rules to improve the validity of configuration items
Debug interface
Add the Drop Region debug interface
Add an interface for enumerating the health status of each PD
Statistics related
Add statistics for abnormal Regions
Add Region isolation level statistics
Add scheduling related metrics
Performance optimization
Keep the PD leader and the etcd leader on the same node as much as possible, to improve write performance
Optimized Region heartbeat performance, now supports over 1 million Regions
TiKV
Features
Protect critical configurations from erroneous modifications
Support Region Merge [experimental]
Add the Raw DeleteRange API
Add the GetMetric API
Add Raw Batch Put, Raw Batch Get, Raw Batch Delete, and Raw Batch Scan APIs
Add a Column Family parameter to the Raw KV API, to support operating on a specific Column Family
Coprocessor supports streaming mode and streaming aggregation
Supports configuring the timeout period for Coprocessor requests
Heartbeat packets carry timestamps
Support online modification of some RocksDB parameters, including block-cache-size
Support configuring the behavior of the Coprocessor when it encounters certain errors
Support to start in data import mode to reduce write amplification during data import
Supports manually splitting the region in half
Improve the data repair tool tikv-ctl
Coprocessor returns more statistics to guide TiDB's behavior
Support ImportSST API, can be used for SST file import [experimental]
Added TiKV Importer binary, integrated with TiDB Lightning for fast data import [experimental]
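The manual split-in-half feature above can be pictured as choosing a key near the middle of a Region's key range and cutting the range there, yielding two Regions that can then be scheduled to different nodes. The following is a toy Python sketch of that idea only; TiKV actually derives the split key from SST file properties, and every name here is hypothetical:

```python
def split_in_half(region_keys):
    """Given the sorted keys inside a Region, cut the range at the middle key.

    Returns (split_key, left_half, right_half); by convention the split key
    begins the right-hand Region, mirroring half-open [start, end) ranges.
    """
    mid = len(region_keys) // 2
    split_key = region_keys[mid]
    return split_key, region_keys[:mid], region_keys[mid:]

# A hot Region holding ten keys gets cut into two five-key Regions.
keys = sorted(f"user{i:04d}" for i in range(10))
split_key, left, right = split_in_half(keys)
print(split_key, len(left), len(right))  # user0005 5 5
```

Splitting a hot Region this way is what makes the PD-side "manual Split Region" handling of single-Region hotspots possible: once there are two Regions, their leaders can be balanced separately.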
Performance
Use ReadPool to optimize read performance, improving raw_get/get/batch_get performance by 30%
Improve metrics performance
Notify PD immediately after Raft snapshot is processed to speed up scheduling
Solve the performance jitter problem caused by RocksDB flushing
Improve space reclamation after data deletion
Speed up the garbage-cleaning process during startup
Use DeleteFilesInRanges to reduce I/O overhead during replica migration
Stability
Fix the problem that a gRPC call does not return when the PD leader switches
Fix the problem that taking a node offline is slow due to snapshots
Limit the temporary space occupied by replica migration
Report Regions that have had no Leader for a long time
Update the Region size statistics in time according to compaction events
Limit the amount of data scanned by a single scan lock request to prevent timeouts
Limit the memory usage in the process of receiving snapshots to prevent OOM
Improve the speed of CI tests
Solve the OOM problem caused by too many snapshots
Configure gRPC keepalive parameters
Fix the problem that an increase in the number of Regions easily causes OOM
TiSpark
TiSpark now uses a separate version number, starting with 1.0 GA. The TiSpark 1.0 components provide the ability to use Apache Spark for distributed computing on data stored in TiDB.
Provides a gRPC communication framework for TiKV reading
Provides encoding and decoding of TiKV component data and communication protocol parts
Provides computational pushdown capabilities, including
Aggregation pushdown
Predicate pushdown
TopN pushdown
Limit pushdown
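The point of pushdown is that predicates, TopN, and Limit are evaluated on the storage side, so only qualifying rows cross the network to Spark. A minimal Python sketch of the effect follows; the function and field names are illustrative, not TiSpark or TiKV APIs:

```python
def storage_scan(rows, predicate=None, limit=None):
    """Simulated storage node: applies a pushed-down predicate and limit
    locally, returning only the rows that would be shipped to the compute layer."""
    out = []
    for row in rows:
        if predicate is None or predicate(row):
            out.append(row)
            if limit is not None and len(out) >= limit:
                break  # limit pushdown: stop scanning early
    return out

table = [{"id": i, "score": i * 7 % 100} for i in range(1000)]

# Without pushdown: all 1000 rows are shipped, then filtered and limited
# by the compute layer.
shipped_all = storage_scan(table)
filtered = [r for r in shipped_all if r["score"] > 90][:5]

# With pushdown: the storage node ships only the 5 matching rows.
shipped_pushed = storage_scan(table, predicate=lambda r: r["score"] > 90, limit=5)

assert filtered == shipped_pushed
print(len(shipped_all), len(shipped_pushed))  # 1000 5
```

The same reasoning applies to aggregation pushdown: partial aggregates computed per storage node are far smaller than the raw rows they summarize.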
Provides index-related support
Transform predicates into clustered index ranges
Transform predicates into secondary index ranges
Index Only query optimization
Runtime degradation of index scans to table scans
Cost-based optimization
Statistics support
Index selection
Broadcast table cost estimation
Support for various Spark Interfaces
Spark Shell support
ThriftServer/JDBC support
Spark-SQL interactive support
PySpark Shell support
SparkR support
Now, with the joint efforts of the community and the PingCAP technical team, the TiDB 2.0 GA version has been released. We would like to thank our community partners for their long-term participation and contributions.
As a world-class open-source distributed relational database, TiDB is inspired by Google Spanner/F1, with core features such as distributed strongly consistent transactions, online elastic horizontal scaling, self-healing high availability, and cross-data-center multi-active deployment. TiDB was created on GitHub in May 2015; the Alpha version was released in December of the same year, the Beta version in June 2016, RC1 in December 2016, RC2 in March 2017, RC3 in June 2017, RC4 in August 2017, TiDB 1.0 in October 2017, and 2.0 RC1 in March 2018.