[Tencent Cloud TDSQL-C Serverless product experience] Helping enterprises reduce costs and increase efficiency based on TDSQL-C Serverless best practices

1. Business pain points MySQL runs into as the business grows:

1. Performance bottleneck:

As the company's business has grown rapidly, the volume of data in the database has soared and access has slowed down. A single MySQL instance can no longer handle data management and request traffic at this scale, so database performance degrades and the database becomes the bottleneck.
Relational databases are inherently prone to becoming system bottlenecks, whether in single-machine storage capacity, number of connections, or processing capability.
Once a single table reaches about 10 million rows, queries and operations span so many dimensions that performance still declines severely, even after applying read/write splitting, index optimization, and similar measures.

2. Synchronization delay for strongly consistent data:

The architecture adds middleware such as Redis and RabbitMQ, and every extra component adds synchronization delay.
Synchronizing large statistical heterogeneous tables from OLTP can only meet the business's T+1 requirement.
With an asynchronous design, middleware failures cause data to be resent or lost.

2. Types of database applications:

Item	Aspect	OLAP	OLTP
1	Name	Online Analytical Processing	Online Transaction Processing
2	Purpose	Handles enterprise-level decision analysis, strategic analysis, and business analysis	Handles routine enterprise business operations such as procurement, sales, warehousing, and payment
3	Focus	Multidimensional data analysis techniques and aggregation algorithms that make data analysis easier	Data accuracy, transaction atomicity, and concurrency
4	Type of data	Historical, aggregated, non-real-time, immutable data	Real-time, detailed, mutable data
5	Scenarios	Analytical databases such as data warehouses	Routine business operations
6	Query mode	Complex algorithms and storage structures, such as multidimensional databases and cube structures	Simple SQL statements, such as basic transaction-related queries
7	Performance requirements	Higher storage and processing capacity	Fast and stable response, scalability, and high availability
8	Application scenarios	Enterprise-level decision support, strategic analysis, and similar fields	Procurement, sales, inventory management, bank transactions, and similar fields, where user requests must be answered in a very short time to keep the business running normally

Therefore, in daily enterprise-level applications, OLAP and OLTP have different solutions for different business scenarios. OLAP is mainly used for enterprise-level decision-making and strategic analysis, which requires fast data query and analysis technology. In contrast, OLTP is mainly used for daily enterprise operations that require rapid data updates and processing technology.

3. Optimization approach used in projects: database and table sharding ("sub-database and sub-table"):

1. Business pain points:

The problem of reduced database performance due to excessive data volume

2. Problems solved:

Addresses the performance problems caused by an oversized single table: each table holds less data, retrieval gets faster, and query bottlenecks are relieved to some extent.
Avoids IO contention and reduces the chance of table locks.
Decouples data at the business level and makes business boundaries clear.
Allows data belonging to different businesses to be managed, maintained, monitored, and scaled at different levels.
In high-concurrency scenarios, raises the available IO and database connections to some extent and eases the bottleneck of single-machine hardware resources.
Supports the "hot and cold data separation" used in some systems, with historical data kept in a backup database.
In scenarios with high concurrency and massive data, database and table sharding effectively relieves the performance bottlenecks and pressure of a single machine and a single database, breaking through the limits of IO, connection count, and hardware resources.

3. New problems introduced:

Cross-database joins (generally prohibited for security and other reasons)
Distributed transactions
Increased business complexity
Higher hardware cost
Complex cross-shard queries and cross-shard transactions
Join queries across nodes
When querying across nodes and multiple databases, LIMIT pagination and ORDER BY sorting become more complicated.
Primary keys must avoid duplication: IDs generated independently in each database are not guaranteed to be globally unique.
Common tables, parameter tables, data dictionary tables, and the like hold little data and rarely change, so every database has to keep its own copy.
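
To make the pagination problem concrete, here is a minimal Python sketch of a cross-shard "ORDER BY order_id LIMIT offset, size" query. It is an illustration only: the in-memory lists stand in for real databases, and the names query_shard and paged_query are hypothetical, not part of any sharding middleware.

```python
import heapq
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # (order_id, payload); order_id is the sort key


def query_shard(shard_rows: List[Row], limit: int) -> List[Row]:
    """Stand-in for `SELECT ... ORDER BY order_id LIMIT <limit>` on one shard."""
    return sorted(shard_rows)[:limit]


def paged_query(shards: Dict[str, List[Row]], offset: int, size: int) -> List[Row]:
    """Return rows [offset, offset + size) of the global ordering.

    Every shard has to return its first offset + size rows, because the
    coordinator cannot know in advance which shard holds the requested page.
    """
    per_shard = [query_shard(rows, offset + size) for rows in shards.values()]
    merged = heapq.merge(*per_shard)  # k-way merge of already sorted lists
    return list(merged)[offset:offset + size]


if __name__ == "__main__":
    # Hypothetical data spread over three shards.
    shards = {
        "db0": [(1, "a"), (4, "d"), (7, "g")],
        "db1": [(2, "b"), (5, "e"), (8, "h")],
        "db2": [(3, "c"), (6, "f"), (9, "i")],
    }
    print(paged_query(shards, offset=3, size=3))
```

The cost grows with the page number: a deep page forces every shard to return offset + size rows, which is one reason pagination across shards is more expensive than on a single database.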

4. Pain points solved by TDSQL-C MySQL Serverless:

TDSQL-C MySQL Serverless instances scale CPU and memory elastically in real time, a new form of MySQL product built on the cloud resource architecture.

1. Pain points in regular business:

Item	Business plan	Business pain points	Serverless instance solution
1	Self-built MySQL instance	(1) A large number of cloud servers must be purchased to build the MySQL cluster, so equipment cost is high. (2) Dedicated staff are needed to operate, maintain, and deploy the service. (3) For events such as Double 11, the team must scale the service out and back in ahead of time.	(1) Charged by usage, with no charge when not in use. (2) In the style of cloud function computing, everything from CI/CD to service deployment and scaling is completed automatically, so customers can focus on business code.
2	Traditional cloud database	(1) A range of memory/CPU specifications is offered for users to purchase. (2) Users have to buy the configuration that covers their peak load and pay for the selected specification even when it sits idle.	(1) Automatic scaling: capacity expands when traffic rises and contracts when traffic falls, so users need not think about specifications. (2) Payment follows the resources actually used. (3) No billing when not in use: no access, no charge.

2. Serverless database features:

Consider an analogy. In the past, if you wanted to get around, you had to buy a car or a motorcycle. Now you can hail a ride through a third-party platform such as Didi: you enter your destination and no longer have to worry about buying a car, driving it, the risk of an accident, or maintenance. The core need is met better.

A Serverless database can be thought of in the same way: you no longer purchase virtual machines on the cloud, deploy services yourself, or take responsibility for scaling them. From CI/CD to service deployment and scaling, everything is done automatically, and users only need to focus on business code.

The basic characteristics of a Serverless database are that it needs no operation and maintenance, provides services through APIs, is billed on actual usage, and costs nothing when it is not used.

3. Resource usage and specification changes of ordinary instances and Serverless instances under large business fluctuations:

[Figure: resource usage and specification changes of ordinary instances and Serverless instances under fluctuating load]

As the figure shows, when business fluctuates widely:

Item	Instance type	Low-load period	Peak period	Elasticity
1	Ordinary instance	Many resources are wasted during troughs	Resources are insufficient during peaks and the business suffers	Relatively fixed resources
2	Serverless instance	Unneeded resources are released dynamically during troughs, reducing waste	Business needs are fully met during peaks, so the business is not harmed and system stability improves	Dynamic elastic scaling

Summary: Because the specification of a Serverless instance adjusts at any time to business demand, little is wasted overall, which raises resource utilization and lowers resource consumption.

4. Advantages of TDSQL-C MySQL Serverless:

Item	Advantage	Description
1	Lower cost	(1) For early-stage companies, MySQL Serverless does not depend on other infrastructure or related services. (2) It is ready to use out of the box and provides stable, efficient data access. (3) While in use, you pay only for the resources actually consumed.
2	More storage space	(1) Storage can reach up to 32 TB. (2) Storage expands automatically with the instance's data volume, which avoids business impact from insufficient cluster storage.
3	Automatic elastic scaling of computing resources	(1) The computing resources that reads and writes require scale elastically. (2) No manual scaling is needed, which greatly reduces operation and maintenance cost and system risk.
4	Fully managed, no operation and maintenance	(1) Users need not care about underlying work such as version upgrades, system deployment, scaling, and alarm handling. (2) The process is transparent to users, does not affect the business, and keeps the service continuously available, so the service is truly maintenance-free.

5. Applicable scenarios:

Low-frequency database usage scenarios such as development and testing environments
SaaS application scenarios such as website building services for small and medium-sized enterprises
Individual developer users
Educational scenes such as school teaching and student experiments
Uncertain load scenarios such as Internet of Things (IoT) and edge computing
Users who want a fully managed service, completely free of operation and maintenance
Users whose business fluctuates or is unpredictable
Business scenarios with intermittent scheduled tasks

5. Three major features of TDSQL-C Serverless database:

Serverless is the serverless-architecture version of TDSQL-C MySQL, Tencent's self-developed cloud-native database. It scales out and in automatically and is billed only on actual usage, with no charge when it is not used, so it copes easily with dynamic changes and continuous growth in business data volume.

1. Automatic start and stop, no billing if not used:

The Serverless service lets you customize the auto-pause time of an instance, and the instance pauses automatically when there are no connections. When a task connects, the instance wakes up automatically within seconds and the connection is not interrupted.

2. Billing based on usage:

The range of CCU elastic expansion and contraction can be adjusted, and the Serverless cluster will automatically increase or decrease CCU within this range according to actual business pressure.

3. Automatic expansion and contraction:

Serverless clusters will continuously monitor users' CPU, memory and other workload conditions, and trigger automatic expansion and contraction policies based on certain rules.

6. Automatic start and stop, no billing if not used:

1. Requirement description:

1. Background:
Automatic pause can be turned on or off to suit business needs.

2. Implementation approach:
(1) The auto start/stop logic is fairly simple. By default, if no access is detected for 1 hour (user configurable), the computing node is reclaimed; the node is restarted when access resumes.
(2) You can also select a database instance in the console and pause it manually.
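
The start/stop rule can be sketched as a small state machine in Python. This is a toy model under stated assumptions, not the actual control plane: the class name AutoPauseInstance, its fields, and the idea of a periodic tick are all hypothetical; only the 1-hour default and the on/off switch come from the description above.

```python
from dataclasses import dataclass


@dataclass
class AutoPauseInstance:
    """Toy model of the auto start/stop rule (not the real control plane)."""

    idle_timeout_s: float = 3600.0   # default from the text: 1 hour, user configurable
    auto_pause_enabled: bool = True  # the setting can be switched on or off
    paused: bool = False
    last_access_s: float = 0.0

    def on_access(self, now_s: float) -> None:
        """Any access resumes a paused instance and resets the idle timer."""
        self.paused = False
        self.last_access_s = now_s

    def tick(self, now_s: float) -> None:
        """Periodic check: pause once the idle time reaches the timeout."""
        idle = now_s - self.last_access_s
        if self.auto_pause_enabled and not self.paused and idle >= self.idle_timeout_s:
            self.paused = True

    def pause_manually(self) -> None:
        """Counterpart of the manual pause operation in the console."""
        self.paused = True
```

For example, an instance last accessed at t = 0 is still running when tick(1800) is called, becomes paused at tick(3600), and resumes on the next on_access call.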

2. Test plan:

date +"%T.%N" && mysql -h gz-cynosdbmysql-grp-9eujfhd.sql.tencentcdb.com -P 27304 -u root -pDb123. -Nse "show databases" && date +"%T.%N"

3. Test result analysis:

The Linux date command displays or sets the system date and time. Taking one timestamp before the command starts and another after it finishes gives the total time the command took.

Command parameters:
%T time in 24-hour format (hours, minutes, and seconds).
%N nanoseconds.
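
The same before/after measurement can be scripted. The following Python sketch is one possible way to do it, using a monotonic clock instead of two date calls; the helper name time_command is hypothetical, and the placeholder command only makes the sketch runnable without a database.

```python
import subprocess
import sys
import time
from typing import List


def time_command(argv: List[str]) -> float:
    """Run a command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere; swap in the mysql
    # invocation from the test plan to time an actual wake-up.
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"elapsed: {elapsed:.3f} s")
```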

[Figure: output of the timing command]

First pause TDSQL-C MySQL using either of the two methods above, then run MySQL's built-in "show databases" command to list all databases. Because this command involves no slow query, the measurement reflects only the no-load case. The difference between the two timestamps shows that the connection is established within seconds.

[Figure: execution time difference for the connection]

The real-time monitoring report of TDSQL-C MySQL shows the same thing: the instance is paused in the period before 14:15, and at around 15:00, when a connection arrives, the server starts within seconds. This confirms that when there are no database requests, the monitoring service triggers the reclaiming of computing resources, and when the user accesses the database again, the access layer wakes up the cluster and service resumes.

4. Principle description:

A TDSQL-C Serverless cold start completes within seconds. How does it get close to the startup time of cloud functions?

When a connection arrives, the system automatically starts the paused database within seconds, and the user does not need to implement a reconnection mechanism.

The access layer of TDSQL-C MySQL adds a recovery perceptron module (perceptron for short) that forwards requests. After the perceptron completes the handshake with the client, it keeps the client connection open while the cluster is restored. Once the cluster is restored, the perceptron performs a handshake with TDSQL-C MySQL and from then on forwards layer-4 packets.
The process uses two random challenges for authentication, which lets the perceptron act as a relay: username and password verification completes without the perceptron storing either one. This keeps user passwords secure and avoids the inconsistency problems that a stored copy of the password would introduce.

[Figure: request forwarding and routing through the perceptron]

While the cluster is paused, only the perceptron's route is kept.
When the cluster recovers, the system keeps both the perceptron's route and TDSQL-C's route and sets the perceptron's routing weight to 0. New connections then go directly to TDSQL-C, while connections already established through the perceptron are left unchanged and continue to communicate.
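
The effect of the routing weights on new connections can be illustrated with a short Python sketch. It assumes a simple weighted random choice; how the real access layer selects a route is not described in the text, and the names routes_for and pick_target as well as the state labels are hypothetical.

```python
import random
from typing import Dict


def routes_for(cluster_state: str) -> Dict[str, int]:
    """Routing table (target -> weight) for a given cluster state."""
    if cluster_state == "paused":
        return {"perceptron": 1}  # only the perceptron route is kept
    if cluster_state == "running":
        # Both routes exist; weight 0 keeps new connections off the perceptron.
        return {"perceptron": 0, "tdsqlc": 1}
    raise ValueError(f"unknown cluster state: {cluster_state}")


def pick_target(routes: Dict[str, int], rng: random.Random) -> str:
    """Choose the target for a NEW connection in proportion to route weights."""
    targets = list(routes)
    weights = [routes[t] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]
```

Because the weight only affects the choice made for new connections, sessions that were already relayed through the perceptron are untouched by the change, which matches the behavior described above.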

7. Billing based on usage:

1. Billing instructions for TDSQL-C MySQL version of Serverless service:

1. Billing model
(1) Compute and storage in the Serverless service are billed independently.
(2) Compute is billed by the number of CCUs and storage by GB used; usage is metered per second and settled hourly.

2. Billing formula
Total Serverless cost = compute node cost + storage cost = Serverless compute price × CCU amount + storage price × storage used

One computing unit, the CCU (TDSQL-C Compute Unit), is defined as CCU = max(CPU, MEM/2, minimum specification).

MEM/2 reflects the fixed 1:2 CPU-to-memory ratio of the specifications: dividing the memory (in GB) by 2 converts it into the equivalent number of CPU cores.
Overall, the CPU determines the computing power.
Users are billed on the average CCU in each hour.
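
The CCU definition and the cost formula translate directly into code. The Python sketch below is a worked illustration only: the function names are hypothetical and any prices passed in are placeholders, not Tencent Cloud's published rates.

```python
def ccu(cpu_cores: float, mem_gb: float, min_spec: float) -> float:
    """CCU = max(CPU, MEM/2, minimum specification)."""
    return max(cpu_cores, mem_gb / 2, min_spec)


def serverless_cost(ccu_amount: float, ccu_price: float,
                    storage_gb: float, storage_price: float) -> float:
    """Total cost = compute price x CCU amount + storage price x storage used."""
    return ccu_price * ccu_amount + storage_price * storage_gb


if __name__ == "__main__":
    # 0.3 cores and 1 GB of memory in use, minimum specification 0.25:
    # memory dominates, because 1 GB / 2 = 0.5 > 0.3 cores.
    print(ccu(cpu_cores=0.3, mem_gb=1.0, min_spec=0.25))
    # Placeholder prices, chosen only to show the arithmetic.
    print(serverless_cost(ccu_amount=0.5, ccu_price=2.0,
                          storage_gb=10.0, storage_price=0.1))
```

Note how the minimum specification acts as a floor: a running instance that uses 0.1 cores and 0.2 GB with a 0.25 minimum still counts as 0.25 CCU.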

2. Create an order form:

Create an order table to facilitate writing SQL statements for stress testing. Below we will conduct a stress test on insert writing scenarios and analyze how TDSQL-C MySQL implements usage billing.

CREATE TABLE `orders` (
  `order_id` bigint(20) NOT NULL COMMENT 'Order ID',
  `customer_id` bigint(10) NOT NULL COMMENT 'ID of the customer who placed the order',
  `product_id` bigint(15) NOT NULL COMMENT 'Product ID',
  `product_name` varchar(30) NOT NULL COMMENT 'Product name',
  `product_price` decimal(10,2) NOT NULL COMMENT 'Product price',
  `quantity` int(11) NOT NULL COMMENT 'Product quantity',
  `total_price` decimal(10,2) NOT NULL COMMENT 'Total price',
  `order_time` datetime NOT NULL COMMENT 'Order time',
  `delivery_time` datetime NOT NULL COMMENT 'Delivery time',
  `status` int(1) NOT NULL DEFAULT '1' COMMENT 'Order status 1: completed 0: not completed',
  `address` varchar(100) NOT NULL COMMENT 'Home address',
  `phone` bigint(11) NOT NULL COMMENT 'Contact phone number',
  PRIMARY KEY (`order_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Order information table';

The table is created and the SQL executed successfully.

[Figure: result of executing the CREATE TABLE statement]

3. Use Tencent Cloud’s “Cloud Stress Test” to test:

Cloud stress testing (Performance Testing Service, PTS) is a distributed performance testing service that simulates the real business scenarios of a large number of users and comprehensively verifies system availability and stability. It can launch stress test tasks on demand and generate millions of concurrent requests from multiple regions. It offers traffic recording, scenario orchestration, traffic customization, and advanced script customization, so stress test scenarios can be defined quickly from business models, large-scale business access can be reproduced faithfully, and application performance problems can be found in advance.

Create a stress test scenario. To illustrate usage-based billing more clearly, we chose an incremental stress test that simulates different loads for different business volumes. The Serverless cluster automatically increases or decreases CCU within the configured range according to actual business pressure, and the fees follow the computing power consumed.

Stress test scenario description:

The test used 2000 VUM in total, with a maximum concurrency of 10 VUs. To reproduce a production scenario, the load was applied in incremental stages. The stress test ran for 4 minutes in total; within 1 minute the concurrency was raised in 3 steps, from 3.33 VUs to 6.66 VUs to 10 VUs, so the simulated traffic grows gradually. Let's see how CCU scales elastically in response.
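
The step targets of such a staged ramp are just evenly spaced fractions of the maximum concurrency. A minimal Python sketch, with the hypothetical helper name ramp_steps (this is not the PTS API):

```python
from typing import List


def ramp_steps(max_vus: float, steps: int) -> List[float]:
    """Evenly spaced concurrency targets, ending at max_vus."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [round(max_vus * k / steps, 2) for k in range(1, steps + 1)]


if __name__ == "__main__":
    print(ramp_steps(10, 3))
```

With 10 VUs and 3 steps this gives 3.33, 6.67, and 10.0. The middle value differs from the 6.66 quoted above only because this sketch rounds where the report truncates.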

[Figure: configuration of the staged incremental stress test]

The following is the report for this incremental test: 576,809 requests in total (roughly 577,000), no failed network requests, an average response time of 3.75 ms, and a peak concurrency of 10 VUs. The line chart also shows the load rising in 3 regular steps.

[Figure: result report of the incremental stress test]

For example, we configure the computing power with a minimum specification of 0.25 cores and a maximum of 1 core:

As the figure shows, the incremental stress test falls into 3 waves:

Early in the first wave, CCU is around 0.25 and billing is based on 0.25 computing power.
In the second wave, CCU is around 0.5 and billing is based on 0.5 computing power.
After the business peak of the third wave, CCU gradually rises to about 0.691 and billing is based on 0.69, which handles the business load well.
After the stress test, CCU gradually drops from 0.5 to 0.25 and then to 0. When the instance is not used, it is not billed.
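
Since billing uses the average CCU in each hour, short bursts like these waves contribute in proportion to how long they last. The Python sketch below computes a time-weighted average; the helper name average_ccu is hypothetical, and the durations in the example are invented for illustration because the report does not state how long each wave lasted.

```python
from typing import List, Tuple


def average_ccu(segments: List[Tuple[float, float]]) -> float:
    """Time-weighted average CCU over (duration_seconds, ccu) segments."""
    total_s = sum(duration for duration, _ in segments)
    if total_s <= 0:
        raise ValueError("total duration must be positive")
    return sum(duration * value for duration, value in segments) / total_s


if __name__ == "__main__":
    # Hypothetical hour: three short waves, then the instance idles at 0 CCU.
    hour = [(60, 0.25), (60, 0.5), (120, 0.69), (3360, 0.0)]
    print(f"average CCU for the hour: {average_ccu(hour):.4f}")
```

The idle remainder of the hour pulls the average down sharply, which is the mechanism behind "no charge if not used".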

[Figure: CCU usage during the incremental stress test]

As the figure below shows, the CCU cost rises while the stress test runs. Capacity expands automatically when traffic increases and shrinks automatically when traffic decreases, and you pay for the resources actually used: no use, no charge; no access, no charge.

[Figure: CCU cost during the stress test]

The CPU and memory performance data below are given for reference. Both grow proportionally, which is why the CCU billing method described above depends on both CPU and memory.

[Figure: CPU and memory usage during the stress test]

8. Automatic expansion and contraction:

The goal is scaling within seconds, with a process that is smooth and imperceptible to users.

[Figure: automatic scaling between the minimum and maximum specifications]

Taking the figure above as an example, the user selects a minimum specification of 1 core 2 GB and a maximum specification of 2 cores 4 GB at purchase time.

1. If it is a traditional MySQL database instance:

The instance starts at 1 core and 2 GB. When business traffic arrives and the CPU is saturated, the user has to change the specification manually, and the instance must be restarted for the change to take effect.

2. If it is a TDSQL-C MySQL Serverless database instance:

At the start, the user gets the maximum CPU specification, while memory starts at the minimum specification. If the user then uses more than 1 CPU core for a sustained period, memory is expanded from 2 GB to 4 GB.
CPU in TDSQL-C Serverless is not throttled and can be used freely up to the configured maximum specification. The advantage is that user performance is not restricted; the drawback is that the host machine as a whole may become fully loaded.
Because TDSQL-C uses a storage-compute separation architecture, once monitoring detects that overall machine resources exceed a threshold, the instance is migrated quickly. Migration simply means restarting the instance on another, relatively idle machine, which completes within seconds, so resource load can be controlled precisely.
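
The "more than 1 core for a sustained period" rule can be sketched as a check over recent CPU samples. This Python sketch is an assumption about how such a rule might look, not the product's actual policy: the function name, the sampling model, and the number of samples that count as "sustained" are all hypothetical.

```python
from typing import List


def should_expand_memory(cpu_samples: List[float], cpu_threshold: float,
                         sustain: int) -> bool:
    """True when the last `sustain` CPU samples all exceed the threshold."""
    if sustain < 1:
        raise ValueError("sustain must be at least 1")
    if len(cpu_samples) < sustain:
        return False  # not enough history to call the load sustained
    return all(sample > cpu_threshold for sample in cpu_samples[-sustain:])


if __name__ == "__main__":
    # Cores in use at successive sampling points (made-up values).
    samples = [0.5, 1.2, 1.3, 1.4]
    print(should_expand_memory(samples, cpu_threshold=1.0, sustain=3))
```

Requiring several consecutive samples above the threshold keeps a single brief spike from triggering a memory expansion.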

3. Create a stress test scenario (using 80VUs):

The staged incremental method is used here as well, but with a maximum concurrency of 80 VUs and the whole test finishing within 1 minute: 40 VUs for the first 30 s, then 80 VUs for the remaining 30 s.

[Figure: configuration of the 80 VU stress test]

4. Analysis of pressure test results:

This test simulates how TDSQL-C MySQL expands in response to a sudden, sharp increase in traffic.

In the first interval, normal business traffic such as shoppers placing orders runs at 40 concurrent VUs, and the peak computing power used is about 0.8 CCU.
In the second interval, a Double 11 style event starts suddenly and business traffic surges. TDSQL-C MySQL automatically expands to nearly 4 CCU, which fully covers the surge.
In the third interval, the event is over and traffic falls back; CCU shrinks elastically to around 0.8.
In the fourth interval there is no access, so the monitoring service triggers the reclaiming of computing resources. CCU drops to 0, and nothing is charged while the instance is unused.

[Figure: CCU over the four intervals of the 80 VU stress test]

From the above figure, we can draw a conclusion:

Scaling happens within seconds, and the process is smooth and imperceptible to users.
The Serverless service charges for the computing and storage resources actually used, with no charge when nothing is used.

Because the concurrency in this stress test is fairly high, TDSQL-C MySQL also shows some diagnostic prompts. These warnings can guide system optimizations to keep up with business changes.

9. The architectural principles behind second-level expansion:

1. The existing architecture at mainstream companies:

Many companies currently run a single-machine redundant architecture (one master, multiple slaves). This architecture has serious scalability problems:

Upgrading or downgrading an instance and adding read capacity all require data migration.
As the data volume keeps growing, migration takes longer and longer.

2. Storage and calculation separation architecture:

To solve the scalability problem of the one-master-multiple-slaves architecture, most solutions adopt a storage-compute separation architecture.

One type is the shared-nothing architecture, which scales horizontally in both compute and storage and has very strong expansion capability. Its biggest problem is SQL compatibility, which requires continuously building and improving its own ecosystem.
The other type is the shared-storage architecture. It does not change the basic features of the query engine or the ACID guarantees, and can achieve 100% compatibility.

Tencent Cloud chose TDSQL-C, its database product based on the shared-storage architecture, as the first to offer Serverless services.

[Figure: TDSQL-C shared-storage architecture]

TDSQL-C is Tencent Cloud's cloud-native database based on the shared-storage architecture. Because ToB business has high stability requirements, it reuses relatively mature components on the cloud.

The computing layer uses TXSQL, the MySQL kernel branch maintained by Tencent, and reuses its bug fixes and new features.
The storage layer builds on Tencent's internal cloud disk service CBS: CBS's core storage is separated from its disk logic to form the unified storage platform HiSTOR.

As the storage base, together with cloud products such as the cloud disk CBS and the cloud distributed file system CFS, it provides a complete set of data safety capabilities such as replica synchronization, automatic failover, and data verification. This is an important cornerstone that has let the TDSQL-C MySQL Serverless product run stably for several years.

10. Summary:

With the development of cloud computing, the advantage of the Serverless architecture is that users no longer need to think about server-related operation and maintenance, including instance size, storage type, and network bandwidth. The Serverless architecture scales capacity automatically and removes the need for server operation and maintenance, data backup, software configuration, and the like.

As the serverless concept has spread rapidly, Tencent Cloud Database TDSQL-C has also launched a serverless product form, which gives users lower-cost, more flexible cloud database services and relieves them of the burden of managing database specifications.

This article examined three aspects of TDSQL-C Serverless: automatic scaling, billing based on usage, and no billing when not in use. TDSQL-C Serverless incurs no fees when there is no load, and when business traffic arrives it expands capacity automatically within seconds to handle sudden requests, allocating computing resources and charging according to usage.

For developers and enterprises, the Serverless service is the serverless-architecture version of TDSQL-C MySQL, Tencent Cloud's self-developed new-generation cloud-native relational database, and is a cloud-native database with a fully Serverless architecture. It charges for the computing and storage resources actually used, with no charge when nothing is used, which helps enterprises reduce costs and increase efficiency.