TiDB 2.0 GA Release

On April 27, 2018, TiDB 2.0 GA was released. Compared with TiDB 1.0, this release brings many improvements in MySQL compatibility, system stability, the optimizer, and the executor.

TiDB

  • SQL optimizer

    • Streamlined statistics data structure to reduce memory usage

    • Speed up loading statistics when the process starts

    • Support dynamic update of statistics [experimental]

    • Optimize the cost model for more accurate cost estimation

    • Use Count-Min Sketch to estimate the cost of point queries more precisely

    • Supports analysis of more complex conditions, using indexes as fully as possible

    • Support manually specifying the join order through the STRAIGHT_JOIN syntax (see the example after this list)

    • Use the Stream Aggregation operator when the GROUP BY clause is empty to improve performance

    • Support using indexes to compute the Max/Min functions

    • Optimize the processing algorithm of correlated subqueries to support decorrelating more types of correlated subqueries and transforming them into Left Outer Join

    • Expand the scope of IndexLookupJoin so that the algorithm can also be used for index prefix matching

  • SQL execution engine

    • Use the Chunk structure to refactor all executor operators, improving the execution performance of analytical statements, reducing memory usage, and significantly improving TPC-H results

    • Support Streaming Aggregation operator pushdown

    • Optimize Insert Into Ignore statement performance, improved by more than 10 times

    • Optimize Insert On Duplicate Key Update statement performance, improved by more than 10 times

    • Push down more data types and functions to TiKV for computation

    • Optimize Load Data performance, improved by more than 10 times

    • Support tracking the memory usage of physical operators, and specifying through configuration files and system variables the behavior when the threshold is exceeded

    • Support limiting the memory used by a single SQL statement to reduce the risk of OOM (see the example after this list)

    • Support for implicit row IDs in CRUD operations

    • Improve point query performance

  • Server

    • Support Proxy Protocol

    • Add many monitoring metrics and refine logging

    • Support validating configuration files

    • Support HTTP API to obtain TiDB parameter information

    • Use Batch Resolve Lock to improve garbage collection speed

    • Supports multi-threaded garbage collection

    • TLS support

  • Compatibility

    • Support for more MySQL syntax

    • Support modifying the lower_case_table_names system variable through the configuration file, to support the OGG data synchronization tool

    • Improve compatibility with Navicat

    • Support displaying the table creation time in Information_Schema

    • Fix the problem that the return type of some functions/expressions is different from that of MySQL

    • Improve compatibility with JDBC

    • Support more SQL_MODE values

  • DDL

    • Optimize the Add Index execution speed; the speed is greatly improved in some scenarios

    • Change the Add Index operation to low priority to reduce the impact on online business

    • Output more detailed DDL task status information in Admin Show DDL Jobs

    • Support querying the original statements of currently running DDL tasks through Admin Show DDL Job Queries JobID (see the example after this list)

    • Support the Admin Recover Index command to repair index data in disaster recovery scenarios

    • Support modifying Table Options through the Alter statement
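
Several of the SQL-layer features above can be tried directly from a MySQL client. The sketch below is illustrative only: the tables t1 and t2, the index idx_c1, and the DDL job ID 41 are hypothetical, and the tidb_mem_quota_query variable name should be checked against the documentation of the exact version in use.

    -- Manually fix the join order (STRAIGHT_JOIN syntax)
    SELECT * FROM t1 STRAIGHT_JOIN t2 ON t1.id = t2.id;

    -- Limit the memory a single statement may use (assumed session variable, in bytes)
    SET tidb_mem_quota_query = 1073741824;

    -- Inspect DDL tasks and the original statement of a running job
    ADMIN SHOW DDL JOBS;
    ADMIN SHOW DDL JOB QUERIES 41;

    -- Repair index data in a disaster recovery scenario
    ADMIN RECOVER INDEX t1 idx_c1;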

PD

  • Add Region Merge support to merge empty Regions generated after data deletion [experimental]

  • Add Raft Learner support [experimental]

  • Scheduler optimization

    • The scheduler adapts to different Region sizes

    • Improve the priority and speed of data recovery when TiKV is down

    • Improve the speed of data migration for offline TiKV nodes

    • Optimize the scheduling strategy when TiKV node space is insufficient, to try to prevent disks from becoming full

    • Improve the scheduling efficiency of the balance-leader scheduler

    • Reduce the scheduling overhead of the balance-region scheduler

    • Optimize the execution efficiency of hot-region scheduler

  • Operation and maintenance interface and configuration

    • Add TLS support

    • Support setting PD leader priority

    • Support configuring scheduling properties based on labels

    • Support configuring nodes with specific labels so that Region leaders are not scheduled to them

    • Supports manual Split Region, which can be used to deal with single Region hotspots

    • Support scattering a specified Region, for manually adjusting the distribution of hotspot Regions in some cases

    • Add configuration parameter check rules to improve the validity checking of configuration items

  • Debug interface

    • Add the Drop Region debug interface

    • Add an interface for enumerating the health status of each PD

  • Statistics related

    • Add statistics for abnormal Regions

    • Add Region isolation level statistics

    • Add scheduling related metrics

  • Performance optimization

    • The PD leader tries to stay on the same node as the etcd leader to improve write performance

    • Optimized Region heartbeat performance, now supports over 1 million Regions

TiKV

  • Features

    • Protect critical configurations from erroneous modifications

    • Support Region Merge [experimental]

    • Add the Raw DeleteRange API

    • Add the GetMetric API

    • Add Raw Batch Put, Raw Batch Get, Raw Batch Delete, and Raw Batch Scan

    • Add a Column Family parameter to the Raw KV API so that operations can target a specific Column Family

    • Coprocessor supports streaming mode and streaming aggregation

    • Supports configuring the timeout period for Coprocessor requests

    • Heartbeat packets carry timestamps

    • Support online modification of some RocksDB parameters, including block-cache-size, etc.

    • Support for configuring the behavior of the Coprocessor when it encounters certain errors

    • Support starting in data import mode to reduce write amplification during data import

    • Supports manually splitting the region in half

    • Improve the data repair tool tikv-ctl

    • Coprocessor returns more statistics to guide TiDB's behavior

    • Support the ImportSST API, which can be used to import SST files [experimental]

    • Added TiKV Importer binary, integrated with TiDB Lightning for fast data import [experimental]

  • Performance

    • Use ReadPool to optimize read performance, improving raw_get/get/batch_get performance by 30%

    • Improve metrics performance

    • Notify PD immediately after Raft snapshot is processed to speed up scheduling

    • Solve the performance jitter problem caused by RocksDB flushing

    • Improve space reclamation after data deletion

    • Speed up the garbage cleaning process during startup

    • Use DeleteFilesInRanges to reduce I/O overhead during replica migration

  • Stability

    • Solve the problem that a gRPC call does not return when the PD leader switches

    • Solve the problem that taking a node offline is slow due to snapshots

    • Limit the amount of space temporarily occupied when moving replicas

    • If there is a Region without a Leader for a long time, report it

    • Update Region size statistics promptly according to compaction events

    • Limit the amount of data scanned by a single scan lock request to prevent timeouts

    • Limit the memory usage in the process of receiving snapshots to prevent OOM

    • Improve the speed of CI tests

    • Solve the OOM problem caused by too many snapshots

    • Configure gRPC keepalive parameters

    • Fix the problem that an increase in the number of Regions easily causes OOM

TiSpark

TiSpark uses a separate version number and is now at 1.0 GA. The TiSpark 1.0 component provides the ability to run distributed computations with Apache Spark on data stored in TiDB.

  • Provides a gRPC communication framework for reading data from TiKV

  • Provides encoding and decoding of TiKV data and the communication protocol

  • Provides computational pushdown capabilities (see the query example at the end of this section), including

    • Aggregate pushdown

    • Predicate pushdown

    • TopN pushdown

    • Limit pushdown

  • Provides index-related support

    • Transforms predicates into clustered index ranges

    • Transforms predicates into secondary index ranges

    • Index Only query optimization

    • Optimization that degrades an index scan to a table scan at runtime

  • Cost-based optimization

    • Statistics support

    • Index selection

    • Broadcast table cost estimation

  • Support for various Spark Interfaces

    • Spark Shell support

    • ThriftServer/JDBC support

    • Spark-SQL interactive support

    • PySpark Shell support

    • SparkR support
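
As an illustration of the pushdown and interface support listed above, a query like the one below, run through Spark-SQL, the Spark Shell, or a JDBC connection to the ThriftServer, lets TiSpark push the predicate, aggregation, ordering (TopN), and limit down toward TiKV. The lineitem table and its columns are hypothetical, used only as an example of a TPC-H-style schema.

    SELECT l_returnflag,
           SUM(l_quantity)      AS sum_qty,
           SUM(l_extendedprice) AS sum_price
    FROM   lineitem
    WHERE  l_shipdate <= '1998-09-01'   -- predicate pushdown
    GROUP  BY l_returnflag              -- aggregate pushdown
    ORDER  BY sum_qty DESC              -- TopN pushdown (with LIMIT)
    LIMIT  10;                          -- limit pushdown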

Now, with the joint efforts of the community and the PingCAP technical team, the TiDB 2.0 GA version has been released. We would like to thank our community partners for their long-term participation and contributions.

As a world-class open source distributed relational database, TiDB is inspired by Google Spanner/F1 and offers core features such as distributed strongly consistent transactions, online elastic horizontal scaling, self-healing high availability, and cross-data-center multi-active deployment. The TiDB project was created on GitHub in May 2015; the Alpha version was released in December of the same year, the Beta version in June 2016, RC1 in December 2016, RC2 in March 2017, RC3 in June 2017, RC4 in August 2017, TiDB 1.0 in October 2017, and 2.0 RC1 in March 2018.
