Storage costs reduced by 85%: cost-reduction practice in Ctrip's history database scenario

Ctrip, a leading online travel-ticketing company in China, has replaced its database system three times since its founding in 1999. In the era of mobile Internet and cloud computing, facing massive accumulated data, Ctrip adopted a new database solution that reduced storage costs by about 85% and improved performance several times over. This article describes how Ctrip addressed pain points such as horizontal scaling, storage cost, and import performance in its history database scenario, and how the solution was formulated.

In its early years, Ctrip's business ran mainly on SQL Server. As MySQL gained popularity in China, Ctrip gradually migrated from SQL Server to MySQL between 2016 and 2018. However, as the technology stack diversified and the business kept growing, MySQL gradually revealed bottlenecks, such as complex scaling procedures, high storage costs, and time-consuming table maintenance operations, and it could no longer meet Ctrip's needs.

Therefore, when the data volume in the production environment grows too large, Ctrip archives cold data that the business rarely accesses into a history database. This reduces the data volume in production and improves key performance indicators such as query latency and the time needed for table schema changes.
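In practice a dedicated archiving tool moves rows between the production and history clusters, but logically the flow amounts to the following sketch (table names, schema, and the 12-month cutoff are hypothetical):

-- Copy cold rows into the history database, then purge them from production.
INSERT INTO history_db.orders
  SELECT * FROM prod_db.orders
  WHERE created_at < DATE_SUB(CURDATE(), INTERVAL 12 MONTH);

DELETE FROM prod_db.orders
  WHERE created_at < DATE_SUB(CURDATE(), INTERVAL 12 MONTH);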

Initially, Ctrip used MyRocks (MySQL on RocksDB) as the history database because it is compatible with the MySQL primary-replica architecture and has built-in compression, which keeps costs down. As shown in the figure below, domestic and international air ticket bookings doubled in the first half of the year, and order volume grew accordingly. The surge in business data exposed MyRocks' difficulty with capacity expansion: it could not keep up with such rapid data growth, so Ctrip needed to revisit its history database selection and introduce a new architecture.

[Figure: domestic and international air ticket bookings and order volume, first half of the year]

During history database selection, Ctrip focused on the following four aspects:

  • Horizontal scaling: whether the database can scale out and in horizontally, and whether load balancing after scaling is convenient;

  • Ease of migration: whether data can be migrated conveniently;

  • Storage cost: whether storage costs are sufficiently low;

  • Write performance: whether data can be written into the history database fast enough to meet requirements.

Based on its previous history-database selection experience and product research, Ctrip decided to evaluate OceanBase, a fully self-developed, open-source distributed database. In addition to high MySQL compatibility, OceanBase offers transparent horizontal scaling, high reliability, and strong data compression. Ctrip therefore conducted further research and testing of OceanBase against its business scenarios.

As shown in the figure below, Ctrip compared storage capacity before and after migrating a MySQL business database to OceanBase. Because a table's compression ratio depends heavily on data types, duplication, and so on, Ctrip compared the migration of an entire database in order to stay close to the real scenario. The results: the database occupied 2.1 TB in MySQL and 264 GB after migration to OceanBase, a compression ratio of about 8:1.
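As an aside, the MySQL-side footprint in such a comparison is typically read from information_schema; a hedged sketch (the OceanBase-side figure comes from its own internal views, which vary by version):

SELECT table_schema,
       ROUND(SUM(data_length + index_length) / 1024 / 1024 / 1024, 1) AS size_gb
FROM information_schema.tables
GROUP BY table_schema;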

[Figure] Space comparison: MySQL (blue) vs. OceanBase (orange)

During testing, Ctrip also examined OceanBase's horizontal scalability, ease of migration, and performance, and finally decided to adopt it. After going live, the results met expectations.


1. Horizontal scaling

Ctrip initially used MySQL as the archive history database. When cluster capacity reached its limit, Ctrip adopted database and table sharding. Sharding provides a degree of horizontal scalability and reduces systemic risk to the business.


At the same time, the sharding scheme caused several problems for Ctrip's history database:

  • Problem 1: when scaling out by adding nodes, a DBA must intervene in the data migration process and rebalance the data manually.

  • Problem 2: extra data-management burden and operational pressure. In some scenarios the same statement must be executed against every ShardDB and the results merged, multiplying the request volume; once slow SQL appears, it easily leads to request pile-ups.

  • Problem 3: sharding schemes vary widely; if developers design or use them improperly, data ends up unevenly distributed.

Given these issues, when selecting the history database, Ctrip hoped to find a native distributed database supporting online scale-out and scale-in, automatic load balancing, and distributed transactions, all transparent to the business.

OceanBase is a native distributed database. It implements a distributed consensus protocol based on Multi-Paxos and supports distributed transactions. It also supports transparent horizontal scaling to keep pace with rapid business growth: after expansion, the data of partitioned tables is automatically rebalanced onto the new nodes, transparently to upper-layer services, which saves migration effort.

As Ctrip's business grows rapidly, expanding the history database remains transparent to the business. These features resolve the pain points of the earlier sharding approach in Ctrip's history database scenario.


Under the MySQL sharding scheme, Ctrip's history data was split into tables by month or day, which required developers to configure the split on the release system and modify application code; whenever a capacity alarm fired, a DBA had to intervene manually to split the data. In OceanBase, you only need to create a range-partitioned table with time as the partition key (see the sketch below), and OceanBase automatically distributes the table's partitions evenly across the nodes of each Zone. Every node can execute SQL independently; if the data an application needs lives on a different machine, the node automatically routes the request to the machine holding the data, completely transparently to the business.
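For illustration, a minimal sketch of such a table in OceanBase's MySQL mode (the table name and columns are hypothetical):

-- History data range-partitioned by month on the time column.
CREATE TABLE order_history (
  order_id   BIGINT NOT NULL,
  order_date DATE NOT NULL,
  detail     VARCHAR(255),
  PRIMARY KEY (order_id, order_date)
) PARTITION BY RANGE COLUMNS (order_date) (
  PARTITION p202301 VALUES LESS THAN ('2023-02-01'),
  PARTITION p202302 VALUES LESS THAN ('2023-03-01'),
  PARTITION pmax VALUES LESS THAN MAXVALUE
);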


To expand capacity, you only need to add resource nodes at the cluster level and adjust tenant resource specifications to allocate the new resources to specific tenants, as sketched below. The cluster's RootService then schedules partition replicas to migrate within each Zone until the load difference between nodes falls below a user-configured threshold. The leader replicas of the partitions are also balanced automatically across nodes, so that a skewed leader distribution does not overload some nodes with requests.
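A hedged sketch of the expansion commands, run in the sys tenant (the server address and pool name are hypothetical, and exact syntax differs slightly across OceanBase versions):

ALTER SYSTEM ADD SERVER '10.0.0.4:2882' ZONE 'zone1';  -- register the new observer node
ALTER RESOURCE POOL history_pool UNIT_NUM = 3;         -- grow the tenant's pool onto it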

2. Data migration

Traditional heterogeneous data migration usually takes one of two approaches: static migration, where the data is first frozen and then moved with an export tool; or double-writing, where developers write to both databases in the business code. For migrating the MySQL history database, Ctrip ultimately chose OMS.

OMS (OceanBase Migration Service) is a service for data interaction between OceanBase and homogeneous or heterogeneous data sources. It can migrate existing data online and synchronize incremental data in real time, which allowed the Ctrip history database to be migrated without downtime; throughout the migration, business applications were unaware of it.


OMS provides a visual, centralized control platform covering the full life cycle of a migration: creating, configuring, monitoring, and managing data migration and synchronization tasks can all be done from its console, with simple and convenient interaction. It also offers multiple data-consistency verification methods to safeguard data quality comprehensively and efficiently.


Using OMS's non-stop migration capability, Ctrip smoothly migrated the existing MySQL history database to OceanBase with no business-side changes. Throughout the migration, the original MySQL history database kept serving traffic, minimizing the impact on the business.


3. Storage cost

Data compression is a key means of reducing the storage space occupied by massive data. OceanBase's high-compression distributed storage engine abandons the fixed-length data blocks of traditional databases in favor of an LSM-Tree-based storage architecture with adaptive compression, resolving the traditional trade-off between performance and compression ratio.

OceanBase supports four general-purpose compression algorithms: zlib, snappy, lz4, and zstd. On top of general compression, OceanBase has developed its own hybrid row-column encoding: dictionary, delta, prefix, and other encodings are applied to the data before the general compression algorithm runs, yielding a higher overall compression ratio and further reducing storage cost.
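For illustration, in MySQL mode the general compression algorithm can be chosen per table; a minimal sketch (the table is hypothetical, and the exact algorithm string varies by version):

CREATE TABLE flight_archive (
  id      BIGINT PRIMARY KEY,
  payload VARCHAR(1024)
) COMPRESSION = 'zstd_1.3.8';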

The storage layer adaptively selects the optimal encoding rules based on data characteristics, calculating compression ratios during compaction. If an encoding's compression ratio turns out to be low, it falls back quickly and tries other encodings, ensuring that the encoding process does not affect normal write performance.

OceanBase performs very well on compression: a 475 GB table in Ctrip's MySQL history database occupies only 55 GB after migration to OceanBase. On average, storage usage is about 1/8 of the original, a storage cost reduction of roughly 85%.

4. Import performance

Besides horizontal scalability and storage cost, Ctrip's history database also demands high performance when importing large volumes of archived data. OceanBase's parallel execution framework can run DML statements concurrently (Parallel DML, or PDML). In a multi-node cluster this enables concurrent writes across machines while preserving the consistency of large transactions. Combined with the asynchronous minor-compaction (dump) mechanism, it also greatly improves the LSM-Tree storage engine's handling of large transactions under memory pressure.

PDML can be demonstrated with a simple experiment: take the TPC-H lineitem table and create an empty table lineitem2 with the same structure, then use INSERT INTO ... SELECT to copy all 6 million rows of lineitem into lineitem2, once with PDML disabled and once with it enabled, and compare the results.

First, copy the table structure of lineitem to create lineitem2. Note that in OceanBase we use partitioned tables for scaling; the example here uses 16 partitions, and lineitem2 must match exactly:

obclient [test]> SHOW CREATE TABLE lineitem;
CREATE TABLE `lineitem` (
  `l_orderkey` bigint(20) NOT NULL,
  `l_partkey` bigint(20) NOT NULL,
  `l_suppkey` bigint(20) NOT NULL,
  `l_linenumber` bigint(20) NOT NULL,
  `l_quantity` bigint(20) NOT NULL,
  `l_extendedprice` bigint(20) NOT NULL,
  `l_discount` bigint(20) NOT NULL,
  `l_tax` bigint(20) NOT NULL,
  `l_returnflag` char(1) DEFAULT NULL,
  `l_linestatus` char(1) DEFAULT NULL,
  `l_shipdate` date NOT NULL,
  `l_commitdate` date DEFAULT NULL,
  `l_receiptdate` date DEFAULT NULL,
  `l_shipinstruct` char(25) DEFAULT NULL,
  `l_shipmode` char(10) DEFAULT NULL,
  `l_comment` varchar(44) DEFAULT NULL,
  PRIMARY KEY (`l_orderkey`, `l_linenumber`),
  KEY `I_L_ORDERKEY` (`l_orderkey`) BLOCK_SIZE 16384 LOCAL,
  KEY `I_L_SHIPDATE` (`l_shipdate`) BLOCK_SIZE 16384 LOCAL
) partition by key(l_orderkey)
(partition p0, partition p1, partition p2, partition p3,
 partition p4, partition p5, partition p6, partition p7,
 partition p8, partition p9, partition p10, partition p11,
 partition p12, partition p13, partition p14, partition p15);
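One way to create the matching empty table is MySQL's CREATE TABLE ... LIKE syntax, which copies the structure including the partitioning (a sketch; replaying the DDL above with the new table name works equally well):

obclient [test]> CREATE TABLE lineitem2 LIKE lineitem;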

First, with PDML disabled, we run the insert under the default configuration, without parallelism. Because this is a single large transaction of 6 million rows, the default transaction timeout of OceanBase needs to be raised to a larger value; the unit is microseconds.
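For example (ob_query_timeout and ob_trx_timeout are the relevant session variables, both in microseconds; the values here are illustrative):

SET SESSION ob_query_timeout = 1000000000;  -- statement timeout: 1000 s
SET SESSION ob_trx_timeout = 1000000000;    -- transaction timeout: 1000 s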


obclient [test]> INSERT INTO lineitem2 SELECT * FROM lineitem;
Query OK, 6001215 rows affected (1 min 47.312 sec)
Records: 6001215  Duplicates: 0  Warnings: 0

Without parallelism, inserting 6 million rows in a single transaction takes OceanBase about 107 seconds.

Next, we enable PDML via a hint. Before inserting again, we clear the previously inserted data, then look at the execution time:


obclient [test]> TRUNCATE TABLE lineitem2;
Query OK, 0 rows affected (0.108 sec)

obclient [test]> INSERT /*+ parallel(16) enable_parallel_dml */ INTO lineitem2 SELECT * FROM lineitem;
Query OK, 6001215 rows affected (22.117 sec)
Records: 6001215  Duplicates: 0  Warnings: 0

With PDML enabled, inserting the same 6 million rows into the table takes only about 22 seconds, a roughly 5x speedup, with write performance far exceeding MySQL. This parallel DML capability supports Ctrip's need to quickly bulk-import archived data into the history database.
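As a hedged aside, whether the hint takes effect can be checked with EXPLAIN before running the statement; a parallel plan shows PX (parallel execution) operators:

obclient [test]> EXPLAIN INSERT /*+ parallel(16) enable_parallel_dml */ INTO lineitem2 SELECT * FROM lineitem;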


To date, Ctrip has migrated many core businesses to OceanBase. After replacing MySQL with OceanBase, the history database scenario realized four main benefits.

First, seamless scale-out and scale-in: an ultra-high-throughput OceanBase cluster can be built from ordinary PC servers, with no sharding required and rapid on-demand expansion, flattening the cost growth curve of the Ctrip history database.

Second, migration transparent to the business: OMS supports full data migration and incremental synchronization for mainstream databases in one stop, and efficiently completed the migration of Ctrip's history data to OceanBase.

Third, roughly 85% lower storage cost: OceanBase's advanced compression saves nearly 85% of storage space while preserving performance, letting the Ctrip history database store more data on the same hardware investment.

Fourth, excellent write performance: OceanBase's shared-nothing architecture, partition-level leader distribution, and the Parallel DML capability of its parallel execution framework deliver genuinely efficient multi-node writes. With these features, write performance improved several times over, easily meeting the history database's highly concurrent write requirements.

Ctrip currently runs OceanBase 3.x and plans to upgrade gradually to 4.x for better performance and write efficiency. OceanBase 4.x is also highly compatible with MySQL 8.0 and supports tenant-level physical backup, which will help Ctrip back up data offline and restore it faster.
