Tencent Relational Database Reaches the "Double Hundred" Milestone: A Comprehensive Look at Six Enterprise MySQL Capabilities

Tencent's relational database, enterprise MySQL (formerly CDB; on Tencent Cloud, TencentDB for MySQL), has reached the "double hundred" milestone of one million cores and one hundred PB of storage. Storage scale is growing at 80% year on year, the highest rate among the world's top five public cloud vendors for two consecutive years. As Tencent Cloud's largest database product, it was selected in November, together with the rest of Tencent Cloud Database, into the Gartner Magic Quadrant for Cloud Database Management Systems, which places Tencent Cloud Database among the world's leading vendors. To date it has served major customers such as Bilibili, Shuidichou, Xiaohongshu, Weimob, Futu Securities, Yunji, and Changyou, and has supported emergency guarantees for large-scale events such as 618 and Double 11, elastically expanding capacity to absorb 10 times the usual traffic. Inside Tencent it also serves a range of flagship products, including WeChat, Honor of Kings, Weishi, and Tencent Meeting.

The brand upgrade to "Tencent Relational Database Tencent Database SQL Enterprise MySQL" will further pool the company's database expertise to help CDB/CynosDB grow and develop.

With explosive growth in scale and industry coverage, CDB faces greater challenges in performance, stability, cost, and SaaS product capabilities. Centering on the goal of an enterprise-level cloud database, we therefore continue to improve CDB's six enterprise-level capabilities: a customized enterprise-level kernel, enterprise-level high availability and reliability, enterprise-level data security, enterprise-level scalability, intelligent tuning, and self-developed storage engines.

Enterprise-level customized kernel

The official Oracle MySQL community edition and enterprise edition kernels can no longer keep up with the rapid development of the CDB/CynosDB products. Many business and operations problems cannot be solved effectively upstream, such as performance in e-commerce flash-sale ("spike") scenarios, performance jitter caused by table operations, and service suspension when a game service adds or drops columns. To solve the problems that cloud businesses face, reduce the complexity of user operations, and improve operations efficiency, we defined and implemented our own cloud database kernel, TXSQL (Tencent MySQL). TXSQL has the following advantages:

  1. Leading performance, with optimization for extreme business scenarios. TXSQL optimizes the execution of SQL statements across the full link: the query optimizer, operator pushdown, primary-standby consistency, and the storage engine (concurrent access control, log system, lock system, rollback segments, crash recovery, and so on) have all been optimized. Tests show that in high-concurrency scenarios CDB's read and write performance is more than 150% of the official version's, and the startup time of large-memory instances is 1/5 of the official version's. CynosDB for MySQL, the product that separates computing and storage, delivers 120% of CDB's read performance and 2 times CDB's write performance.

    While serving businesses we have solved many performance problems, such as performance in e-commerce flash-sale scenarios, server suspension when a game alters its tables, slow crash recovery under high-concurrency pressure, and primary-standby delay. We have also contributed more than 20 patches to official MySQL and MariaDB, which were officially recognized.

  2. Rich enterprise-level features and complete capabilities for industry-specific databases, such as the data auditing, encryption, and strong data consistency required by the financial industry; the custom keys and key management services required by Tai Fu Cloud; the high concurrency required by the gaming industry; and the extreme performance required by the e-commerce industry.

    Diversified storage engines. To meet diverse business needs, TXSQL not only provides traditional transaction processing but has also built the following two storage engines:

  • TxRocks, a high-compression engine based on LSM-Tree. InnoDB and TxRocks can coexist, seamlessly meeting users' need to reduce storage space. It has been used for WeChat red envelopes for more than two years, where it needs only 1/5 of the machines that the InnoDB engine required, which greatly reduces cost;

  • CSTORE, a column storage engine with lightweight single-node transaction processing. Users do not need to solve data synchronization between heterogeneous databases; they can use the CDB column storage engine directly for data analysis, which greatly reduces the complexity of synchronizing data between TP and AP workloads.

Enterprise-level high availability and high reliability

Availability is the lifeline of CDB, and current availability exceeds 99.99%. There are more than 1,300 failovers per month, the average switching time is 33 seconds, and more than 50% of switchovers complete within 20 seconds. With the rapid growth in scale, however, we found that the original detection mechanism, which relied on a single read-write probe, could no longer meet users' higher availability requirements. An unavailable instance shows up not only as a failed probe but also as a backlog of slow requests, hardware anomalies on the host, or rapid, repeated instance restarts, and the original mechanism detected these poorly. We have therefore done a lot of optimization in fault discovery, with the core goal of finding deterministic and common faults as quickly as possible:

  • [✓] Real-time monitoring of the TXSQL error log. A number of monitoring indicators have been added to TXSQL, such as IO and lock-wait latency, so that a switch can be triggered when requests have piled up on the instance but it has not yet crashed.

  • [✓] Real-time monitoring of system logs, for example frequent port up/down events and soft lockups.

  • [✓] Detection of abnormal disks on the host, such as bad blocks, hung disks, slow disks, and read-only disks.

  • [✓] Rapid, repeated instance crashes. Probing cannot catch these, but the actual business impact is large.

  • [✓] High-load scenarios. Under high load the replication thread may be starved of CPU, so the primary and standby cannot synchronize, which may cause data loss.

With these measures we cover 97.71% of known extreme failure scenarios, and we are still optimizing further. A failover nevertheless affects users. We have therefore integrated the alarm strategy of the company's network management system: when a machine reports a fault or warning, we isolate the at-risk machine in advance and trigger the switch proactively.

In terms of data reliability, CDB supports cross-campus deployment and automatic switchover when a campus fails. Now that CDB has completed its upgrade to a peer-to-peer node architecture, the nodes of a single instance can be expanded arbitrarily and each assigned to a campus independently. The classic deployment is one primary and one standby in the same campus, plus one standby in another campus. When the host fails, a fast switch is possible; when the primary campus fails, the instance switches automatically to the cross-campus node within minutes. Because the database behind the public cloud VPC network control plane is itself CDB, in extreme cases a cross-campus CDB switch and the network depend on each other. To make automatic cross-campus switchover work on the network side, CDB also worked with that business on a full-link cross-campus disaster recovery solution.

Avoiding double writes between instances across campuses is the key to data consistency. To this end, every version of the TXSQL kernel implements a work mode and a lease mechanism. When a campus is isolated, machines inside the campus may still reach one another while the network between campuses is down. The ha_agent deployed on the same machine maintains a lease of length T1 with TXSQL, and ha_agent holds a lease of length T2 with ZK, with 2·T1 < T2 guaranteed. If the T1 lease is not renewed, TXSQL enters offline mode and stops accepting user requests. Once T2 has elapsed, the original primary is therefore certain to be non-writable, and failover can proceed safely.
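The timing argument behind the two leases can be sketched in a few lines of Python. This is only an illustration, not TXSQL or ha_agent code: the names `PrimaryLease`, `old_primary_writable_until`, and `failover_allowed_at` are hypothetical, the values of T1 and T2 are made up, and the sketch assumes that ha_agent may renew the TXSQL lease as late as T1 after its last successful contact with ZK, which is one way to read the 2·T1 < T2 condition.

```python
from dataclasses import dataclass


def lease_config_is_safe(t1: float, t2: float) -> bool:
    """The safety condition stated in the text: 2*T1 < T2."""
    return 2 * t1 < t2


def old_primary_writable_until(last_zk_contact: float, t1: float) -> float:
    """Worst-case time at which the old primary stops accepting writes.

    Assumption: ha_agent may renew the TXSQL lease as late as t1 after its
    last successful ZK contact, and TXSQL stays online for t1 after that.
    """
    return last_zk_contact + 2 * t1


def failover_allowed_at(last_zk_contact: float, t2: float) -> float:
    """Time at which the T2 lease has expired and failover may start."""
    return last_zk_contact + t2


@dataclass
class PrimaryLease:
    """Hypothetical model of TXSQL's side of the T1 lease."""
    t1: float
    renewed_at: float = 0.0

    def renew(self, now: float) -> None:
        self.renewed_at = now

    def mode(self, now: float) -> str:
        # Without a renewal inside t1, the server goes offline.
        return "online" if now - self.renewed_at < self.t1 else "offline"


T1, T2 = 3.0, 10.0  # illustrative values, not taken from the source
lease = PrimaryLease(t1=T1)
lease.renew(now=T1)  # last renewal, T1 after the last ZK contact at t=0
```

With these values the old primary is offline from t = 6 at the latest, while failover cannot start before t = 10, so the two primaries never accept writes at the same time.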

Enterprise-level data security

Data security carries more weight than Mount Tai. The CDB team builds data security from four aspects: data backup, deployment model, access control, and separation of permissions.

  1. Data backup. Jiuding, the CDB backup system, provides the following enterprise-level backup and recovery capabilities:

  • [✓] High-performance backup and recovery. The backup speed is 700 MB/s, and the recovery speed (cloning an instance, including incremental data synchronization) is 540 MB/s. For 1 TB of data, backup takes about 25 minutes and recovery about 33 minutes.

  • [✓] Online hot backup. Backup Lock and Binlog Lock are supported; writes are blocked only at the moment the backup completes, so users are almost unaffected during the backup.

  • [✓] Lock-free backup. Using idempotent Binlog replay, backups take no locks when the InnoDB tables have unique keys.

  • [✓] Streaming backup with "zero storage". Backup data is uploaded as a stream without being staged locally.

  • [✓] Compressed backup. With quicklz compression, the compression ratio is 3:1.

  • [✓] Backup of an entire instance or of specified databases and tables. Any designated database or table can be backed up.

  • [✓] Backup with exclusions. Specified databases and tables, such as the mysql database, can be excluded from a backup.

  • [✓] Multi-level resource control. Backup bandwidth is load balanced across AZs, with per-machine flow control and strong resource isolation.

  • [✓] MySQL TDE support. Encrypted data can be backed up and recovered.

  2. Deployment model. Nodes within an instance are deployed across racks and switches, and across campuses with multiple nodes; cross-city disaster recovery instances are available; and with the placement group feature, different services can be assigned to different groups, with groups deployed across machines and switches.

  3. Access control. CDB provides enterprise-level data auditing. In extreme stress tests the performance loss is less than 3%, and the execution path of every SQL statement is recorded, including internal execution indicators such as CPU consumed, rows scanned and returned, lock wait time, and IO consumption. Beyond after-the-fact auditing, you can control service concurrency through SQL rate limiting, or even reject a certain type of SQL outright, which intercepts problems in advance.

    TDE data encryption ensures that even if backup data is leaked, the data cannot be read without the key.

  4. Separation of permissions. Permissions for online production data and for backup data are separated. Production environment permissions are approved and recorded by operations, and backup data is governed by a "write once, read many" permission so that R&D and the backup system hold only data write permissions, avoiding human-induced data loss.
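The backup and recovery times quoted for Jiuding follow directly from the stated throughput. The short calculation below reproduces them; it assumes binary units (1 TB = 1024 × 1024 MB), which is what makes the quoted 25 and 33 minutes come out, and the function names are illustrative only.

```python
MB_PER_TB = 1024 * 1024  # assumption: binary units
GB_PER_TB = 1024


def transfer_minutes(size_tb: float, rate_mb_per_s: float) -> float:
    """Minutes needed to move size_tb of data at a sustained rate."""
    return size_tb * MB_PER_TB / rate_mb_per_s / 60


def compressed_gb(size_tb: float, ratio: float) -> float:
    """Stored size in GB after compression at ratio:1."""
    return size_tb * GB_PER_TB / ratio


backup_min = transfer_minutes(1, 700)    # backup at 700 MB/s
restore_min = transfer_minutes(1, 540)   # recovery at 540 MB/s
stored_gb = compressed_gb(1, 3)          # quicklz at 3:1

print(f"backup  ~{backup_min:.1f} min")   # ~25.0 min
print(f"restore ~{restore_min:.1f} min")  # ~32.4 min
print(f"stored  ~{stored_gb:.0f} GB")     # ~341 GB
```

The recovery figure of about 32.4 minutes is consistent with the "about 33 minutes" quoted for 1 TB.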

Enterprise-level scalability

Scalability has two parts: expanding single-node read and write capacity, and expanding reads across multiple nodes. The TXSQL kernel provides a thread pool that supports 100,000 connections. It reduces the runtime cost of thread switching, creation, and destruction, and it avoids the sharp performance drop caused by many threads competing for system resources, which significantly improves system performance under high concurrency.

Figure: TXSQL thread pool
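The idea behind a thread pool, many connections served by a small, fixed set of worker threads, can be shown with Python's standard `concurrent.futures`. This is a generic sketch and not TXSQL's implementation, which lives inside the database kernel; `POOL_SIZE`, `CONNECTIONS`, and `execute_statement` are illustrative assumptions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 8        # worker threads (assumed value)
CONNECTIONS = 1000   # simulated client connections (assumed value)

_active = 0
_peak = 0
_lock = threading.Lock()


def execute_statement(conn_id: int) -> int:
    """Stand-in for running one statement on behalf of a connection."""
    global _active, _peak
    with _lock:
        _active += 1
        _peak = max(_peak, _active)
    try:
        return conn_id * 2  # placeholder for real query work
    finally:
        with _lock:
            _active -= 1


def serve(connections: int = CONNECTIONS, pool_size: int = POOL_SIZE):
    """Run one statement per connection on a fixed-size pool.

    Returns the results and the peak number of statements that were
    executing at the same time.
    """
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(execute_statement, range(connections)))
    return results, _peak


results, peak = serve()
```

However many connections submit work, the peak number of concurrently executing statements never exceeds the pool size, which is what keeps thread creation and contention costs bounded.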

For multi-node read expansion, load balancing and disaster recovery between nodes follow the weighted round-robin (WRR) algorithm. When a read-only node becomes unavailable or reaches the configured primary-standby delay threshold, it is removed from rotation within 30 seconds. To prevent every node from being removed, you can set a minimum number of read-only nodes to retain, which avoids an avalanche caused by high load on the instance. When a read-only node becomes available again and its delay falls below the threshold, the add-back strategy is triggered automatically to rebalance the load.
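The removal rule and the weighted round-robin selection can be sketched as follows. This is a minimal illustration under stated assumptions, not CDB's code: the `ReadNode` type and both function names are hypothetical, the scheduler is the common "smooth" WRR variant, and the sketch assumes that the minimum-retention rule re-admits the least-delayed nodes that are still alive.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReadNode:
    name: str
    weight: int
    delay_s: float = 0.0
    alive: bool = True


def eligible_nodes(nodes: List[ReadNode], delay_threshold_s: float,
                   min_reserved: int) -> List[ReadNode]:
    """Drop dead or lagging nodes, but keep at least min_reserved if possible."""
    healthy = [n for n in nodes if n.alive and n.delay_s <= delay_threshold_s]
    if len(healthy) >= min_reserved:
        return healthy
    kept = {n.name for n in healthy}
    # Assumption: re-admit the least-delayed alive nodes to reach the minimum.
    lagging = sorted((n for n in nodes if n.alive and n.name not in kept),
                     key=lambda n: n.delay_s)
    return healthy + lagging[: min_reserved - len(healthy)]


def wrr_schedule(nodes: List[ReadNode], count: int) -> List[str]:
    """Pick `count` nodes using smooth weighted round-robin."""
    if not nodes:
        return []
    total = sum(n.weight for n in nodes)
    current = {n.name: 0 for n in nodes}
    picks = []
    for _ in range(count):
        for n in nodes:
            current[n.name] += n.weight
        best = max(nodes, key=lambda n: current[n.name])
        current[best.name] -= total
        picks.append(best.name)
    return picks


nodes = [ReadNode("ro-a", 2), ReadNode("ro-b", 1),
         ReadNode("ro-c", 1, delay_s=100.0)]
active = eligible_nodes(nodes, delay_threshold_s=10.0, min_reserved=1)
picks = wrr_schedule(active, 6)
```

Here `ro-c` is removed for exceeding the delay threshold, and the remaining traffic is split 2:1 between `ro-a` and `ro-b`. Calling `eligible_nodes` again after `ro-c` catches up would add it back.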

After a read-only node is added, the instance exposes two addresses, so the business layer has to separate reads from writes itself. For this reason CDB is about to launch a database proxy service: the business needs only one read-write address, and the proxy performs read-write separation and read load balancing automatically.
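To make the routing decision concrete, here is a deliberately naive sketch of read-write separation. It is not the CDB proxy: the `route` function is hypothetical, it classifies statements by prefix instead of parsing SQL, and the rules that locking reads and statements inside a transaction go to the primary are assumptions added for the example.

```python
READ_PREFIXES = ("select", "show")


def route(statement: str, in_transaction: bool = False) -> str:
    """Return "read_only" or "primary" for a single SQL statement."""
    sql = statement.strip().rstrip(";").strip().lower()
    if in_transaction:
        # Assumption: keep a transaction's statements on one node.
        return "primary"
    if sql.startswith(READ_PREFIXES) and not sql.endswith("for update"):
        return "read_only"
    return "primary"


print(route("SELECT * FROM orders WHERE id = 1"))
print(route("UPDATE orders SET paid = 1 WHERE id = 1"))
```

Statements routed to `read_only` would then be spread across the read-only nodes by the load-balancing policy, so the application only ever sees one address.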

Intelligent tuning

Performance is a key indicator for a database. Besides SQL optimization and index optimization, parameter tuning is an important way to improve it. Tuning used to rely on expert experience, which is costly in man-hours, and with more than 400 MySQL parameters the space of parameter combinations is very large. The intelligent database parameter tuning service, CDBTune, was created to address this. It tunes database parameters with deep reinforcement learning (Deep RL). Compared with existing methods, CDBTune does not need to subdivide load types or accumulate a large number of samples; it learns the tuning process itself and obtains better tuning results.

Figure: CDBTune architecture

Through carefully designed load replay and parallel tuning, parameters can be tuned in parallel under a specified load without affecting the business. A better parameter template can be recommended within hours, giving higher TPS and lower RT for the same resource consumption.

Take a mixed read-write scenario as an example: after tuning parameters with CDBTune, TPS increased by 33% and RT decreased by 70%.

The results were also published at SIGMOD, a top database conference, in the paper "An End-to-End Automatic Cloud Database Tuning System Using Deep Reinforcement Learning".

Self-developed storage engine (TxRocks and CSTORE)

TxRocks is a transactional storage engine that the TXSQL team built on RocksDB. Thanks to RocksDB's LSM-Tree storage structure, it avoids the waste from half-full pages and fragmentation in InnoDB and can store data in a compact format. TxRocks therefore keeps performance close to InnoDB's while saving half or more of the storage space, which suits businesses that need transactional read and write performance and store large amounts of data.

CSTORE is a column storage engine developed by the TXSQL kernel team for OLAP scenarios. With CSTORE, users can query and analyze large data sets; it applies to the analysis and processing of historical archive data, log data, big data, infrequently updated OLTP data, and data warehouses, with data volumes reaching the PB level. For performance, it implements high compression ratios, fast loading, and targeted query optimization, so that a single node can handle queries over tens of billions of rows within seconds. TXSQL's column storage can reach a compression ratio of 10:1, and by reducing disk usage it can write and read large batches of data in seconds. Building on the existing MySQL query optimization, it uses several forms of sparse index to filter and select data at high speed, and it loads and processes data quickly through multi-column parallel processing.
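The article does not say which forms of sparse index CSTORE uses. As one common example of the general idea, the sketch below keeps only a minimum and maximum per block of column values and uses them to skip blocks that cannot match a range filter. The `Block` type and both functions are hypothetical and make no claim about CSTORE's actual format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    values: List[int]  # one block of a single column
    lo: int            # sparse index entry: minimum value in the block
    hi: int            # sparse index entry: maximum value in the block


def build_blocks(column: List[int], block_size: int) -> List[Block]:
    """Split a column into fixed-size blocks and record min/max for each."""
    blocks = []
    for start in range(0, len(column), block_size):
        chunk = column[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def scan_range(blocks: List[Block], low: int, high: int) -> Tuple[List[int], int]:
    """Return the values in [low, high] and the number of blocks read."""
    hits: List[int] = []
    blocks_read = 0
    for block in blocks:
        if block.hi < low or block.lo > high:
            continue  # skipped using only the min/max entry
        blocks_read += 1
        hits.extend(v for v in block.values if low <= v <= high)
    return hits, blocks_read


column = list(range(1000))
blocks = build_blocks(column, block_size=100)
hits, blocks_read = scan_range(blocks, 250, 260)
```

In this example the filter touches 1 of 10 blocks. The benefit depends on how well the data is clustered on the filtered column, which is why such indexes suit append-heavy archive and log data.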

Cloud native database CynosDB

Under the MySQL architecture, a few dark clouds still have not dispersed: primary-standby delay is uncontrollable, expansion is bounded by the capacity of a single machine, backup and crash recovery take a long time, and resource utilization is low. To address this, the CDB/CynosDB team launched CynosDB for MySQL, a new-generation database that separates computing from storage. It adopts a shared-storage architecture based on the principle that "the log is the database" and is 100% compatible with MySQL syntax.

Separating computing and storage enables resource pooling and elastic scaling. Resource utilization can break through the limit of a single machine, and expansion completes within seconds.

Primary-standby synchronization no longer goes through the Binlog but replicates physically through Redo. This removes the root cause of primary-standby delay and keeps the delay at the millisecond level.

The underlying storage uses TXStore multi-replica storage, and backups use snapshots to complete within seconds, which avoids the primary-standby delay and performance impact caused by backup locking.

Search notices

Space is limited, so some implementation details have not been covered. If you are interested in specific implementations, follow our WeChat public account "Tencent Database Technology", where we regularly share database expertise and accumulated production experience, and grow together with everyone. If you have questions about database technology or requests for new features, please leave a message or ask questions in the code guest, and we will do our best to answer.

WeChat public account: Tencent Database Technology

Finally, an advertisement. Databases are a very attractive field: here you can manage hundreds of thousands of database instances, push high availability and elastic scaling to the limit, redefine a new generation of database proxy services, and use AI to build autonomous AI4DB databases with features such as database parameter tuning and SQL tuning. If you have a strong interest in databases, a clear understanding of how they work, a solid foundation in C or C++, and, most importantly, the courage to take on difficult problems and a strong sense of responsibility, then you are the person we are looking for. There are no restrictions on your major and no rigid rules; we will work together to witness the charm of databases on the cloud!

Please send your resume to [email protected], we look forward to your joining!

Origin blog.csdn.net/Tencent_TEG/article/details/110944056