Tens of billions of records, millisecond-level writes: TDengine in practice at the Dewu app

Author | Lynx
Introduction: Many of Dewu's systems and scenarios require traffic monitoring and protection. They can generate hundreds of millions of records a day, with write rates reaching 10,000 TPS, a scale that traditional relational databases cannot handle. After comparing the performance of InfluxDB, OpenTSDB, Cassandra, and other time-series databases, we finally chose TDengine.

Background

As an Internet e-commerce company, Dewu has many systems and scenarios that require traffic monitoring and protection. When we deeply customized the open-source flow-control component Sentinel, we therefore added many features to strengthen flow-control protection for our various business systems.
 
Note: Sentinel component address:
 
During development, we found that the open-source version of Sentinel does not persist flow-control data, and we badly needed that capability: a database that can hold a large volume of traffic-monitoring data and both store and query it efficiently.
 
In production, we currently have hundreds of business systems and thousands of servers connected to Sentinel, so the flow-control data generated is undoubtedly huge. Choosing a suitable database for this requirement is therefore extremely important; a good choice achieves twice the result with half the effort.
 

01 Database selection

First, a rough estimate of the theoretical upper bound of the current data volume:
There are currently thousands of Sentinel resources in production, and Sentinel's monitoring data is aggregated at one-second granularity. In theory this can produce hundreds of millions of records per day, with a theoretical write rate of 10,000 TPS. As the business keeps growing fast, the data volume will foreseeably explode further. Clearly, data at this scale cannot be handled by a traditional relational database.
Because some internal applications already use TiDB, I first looked at its feasibility, but quickly gave up: as a general-purpose distributed database, it is not designed for data with strong time-series characteristics such as monitoring data.
 
After ruling that out, we focused our research on time-series databases.
 
Each mainstream time series database has its own advantages and disadvantages:
  • InfluxDB: probably the most widely used time-series database, and its target scenarios fit ours; however, its clustering feature is only available in the commercial edition.
  • OpenTSDB: built on HBase, too heavyweight for our current, fairly simple needs.
  • Cassandra: in the several comparison reports we found, its performance was not satisfactory.
 
Just as I was about to look into ClickHouse, a colleague recommended a domestic IoT big-data platform: TDengine.
 
A quick look online showed that its reputation is good and its community is active. I then checked the official comparison reports between TDengine and other databases and found its performance to be excellent as well.
So we wrote a demo and tried TDengine briefly. Throughout the process, the clear documentation kept the learning cost acceptable, so we finally decided to adopt TDengine.
 

02 Data structure and modeling

Data structure

 
First, let's look at how Sentinel's traffic data is presented.
 
As the figure above shows, the left side is the application list, and each application has its own monitoring panel in its menu. The monitoring panel shows traffic data for all resources at resource granularity: passing QPS, rejected QPS, response time, and so on.
So from the perspective of front-end rendering, the unique key of the data should be application-resource.
 
Then we look at the structure of the data from the perspective of internal implementation.
 
The Sentinel client counts traffic data for all resources on each server, aggregates it per second, and records it to a local log. The console pulls the collected traffic data through an interface exposed by the client, then aggregates the per-machine data along the application dimension and keeps it in memory.
 
Therefore, the data we need to persist is keyed by the unique attribute application-resource.
 

Data modeling

The official TDengine documentation suggests the following data modeling approach:
To make full use of the time-series characteristics of the data, TDengine requires a separate table for each data collection point. This one-table-per-collection-point approach guarantees optimal insert and query performance for a single collection point. In TDengine's design, a table represents a specific data collection point, while a super table represents a set of collection points of the same type. When creating the table for a specific collection point, the user takes the super table definition as a template and specifies the tag values of that collection point (table). Unlike in traditional relational databases, a table (one collection point) carries static tags, and these tags can be added, deleted, or modified afterwards. A super table contains multiple tables with the same time-series schema but different tag values.
 
Clearly, the modeling method suggested by the official documentation fits the data characteristics of this scenario perfectly: each application-resource becomes a table, and all application-resource tables are placed under one super table for aggregate queries. We therefore followed the officially recommended approach in the table-structure design.
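The modeling described above can be sketched in TDengine SQL as follows. This is a minimal illustration only: the table, column, and tag names (`sentinel_stats`, `block_qps`, `rt`, `app1_res1`, and so on) are assumptions, not the actual production schema; only `pass_qps` and `appid` appear in the queries later in this article.

```sql
-- Super table: one schema for all application-resource collection points,
-- with application information recorded as tags (illustrative columns).
CREATE TABLE IF NOT EXISTS sentinel_stats (
    ts        TIMESTAMP,
    pass_qps  BIGINT,
    block_qps BIGINT,
    rt        DOUBLE
) TAGS (
    appid    BINARY(64),
    resource BINARY(256)
);

-- Sub-table: one table per application-resource, created from the
-- super table template with that collection point's tag values.
CREATE TABLE IF NOT EXISTS app1_res1
    USING sentinel_stats TAGS ('app1', '/api/order/create');
```

Aggregate queries such as `group by appid` then run against the super table, while per-resource queries hit the individual sub-tables.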
 
As for tag selection: although we have no aggregation needs at present, future aggregations are most likely to run along the application dimension, so we decided to record some application information as tags in each table.
 

03 Overall Architecture

 
The current overall architecture is shown above. Each business system connected to Sentinel periodically sends heartbeat requests to the console to report machine health.
 
The console periodically polls all machines, pulls the monitoring data recorded by the Sentinel client in each business system, aggregates it, and writes it to the TDengine cluster in batches.
Since the scenario is simple, this is not the primary monitoring system, and a small amount of data loss is currently acceptable, no elaborate failure-handling mechanism has been designed.
 

04 Technical selection

Connector

For the Connector, the company's main development language is Java and its ecosystem is the most mature, so the JDBC Connector was the natural choice.
 
In addition, JDBC performs better than HTTP, and the JDBC driver supports automatic switching to another node when a node becomes unavailable.
 
The only inconvenience is that the JDBC method depends strongly on native library functions, so the TDengine client must be installed on each machine running the application, which adds a little work at deployment time. Overall, though, the advantages outweigh the disadvantages.
Recently, the official JDBC-RESTful method has added cross-platform support. Since all of the company's servers run Linux, we have no cross-platform requirement and continue to use the JDBC-JNI Connector.
 

Database connection pool and ORM

For the database connection pool and ORM, we chose the company mainstream: Druid + MyBatis. Following the official demo code, access was completed quickly. In practice, however, MyBatis is used only for queries, to map the ResultSet into more convenient entities; for writes, we skip MyBatis for simplicity and directly execute SQL statements spliced together in memory.
 
In general, TDengine is very friendly toward mainstream frameworks, supporting HikariCP, Druid, Spring JdbcTemplate, MyBatis, and more. Access can be implemented quickly by following the official demos, which saves a lot of time, and the caveats are clearly listed in the documentation.
 

Cluster construction

The TDengine cluster currently has three physical nodes, each with 16 cores, 64 GB of memory, and 1 TB of storage.
The official cluster-setup documentation is very detailed; simply following it step by step is enough to bring up a TDengine cluster.
 

Creating the database

During preliminary research we realized that with only three machines in the cluster, a replica count of 3 would mean every machine stores a complete copy of the data, and with a large data volume the pressure on each machine would be high. So the replica count was set to 1 when creating the database. If the cluster is expanded later, TDengine also supports changing the replica count dynamically, which makes switching to a highly available cluster easy.
 
In addition, with query performance in mind, blocks was set to 16 and cache to 64 MB:
CREATE DATABASE sentinel KEEP 365 DAYS 1 BLOCKS 16 CACHE 64;
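The dynamic replica change mentioned above is a single statement. A sketch, assuming the database name `sentinel` from the statement above and a cluster that has been expanded to at least three data nodes:

```sql
-- Raise the replica count in place after adding nodes,
-- switching the database to a highly available configuration.
ALTER DATABASE sentinel REPLICA 3;
```

No data migration or rebuild is needed; TDengine replicates existing vgroups to the new replicas in the background.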

 

05 Performance

TDengine currently holds tens of billions of records and runs smoothly in production: daily CPU usage is under 1%, and memory usage stays below 25%.
 
The following figure shows the monitoring chart of one machine in the cluster:
When we evaluated an early TDengine version (2.0.7.0), it had some memory defects, but with version iteration the memory problems have largely been resolved.
 

Write performance

The console machine has 4 cores and 16 GB of memory. The batch-write thread pool's maximum core thread count is set to 16, and the database connection pool's maximum is 20, of which about 14 are used in practice.
 
The writing process is as follows:
 
The maximum batch size for a single write is set to 400 records, and the write latency is as follows:
 
As the chart shows, even large batch writes basically stay around 10 ms, which is an ideal range. We have not yet tuned the maximum SQL statement length; increasing it may further optimize write performance.
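The spliced batch write described above relies on TDengine's ability to target multiple sub-tables in a single INSERT statement. A sketch, where the sub-table names and column values are illustrative assumptions rather than the production schema:

```sql
-- One statement writes batches of rows into several sub-tables at once;
-- the batch size (here 400 in production) is ultimately bounded by the
-- configured maximum SQL statement length.
INSERT INTO
    app1_res1 VALUES ('2021-01-01 19:00:00.000', 100, 2, 3.5)
              VALUES ('2021-01-01 19:00:01.000', 102, 0, 3.1)
    app1_res2 VALUES ('2021-01-01 19:00:00.000', 80, 0, 2.1);
```

Batching rows this way amortizes parsing and network round trips, which is why a larger maximum SQL length could allow bigger batches and better throughput.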
 

Query performance

The timings below exclude network overhead and come from running the given SQL statements on the client. The super table queried contains on the order of tens of billions of rows. Timings for several typical scenarios are given below:
  • last_row function: 8.6 ms / 8.8 ms / 5.6 ms
select last_row(*) from stable;

 

  • Query all data for a single application-resource in a given five-minute window: 3.4 ms / 3.3 ms / 3.3 ms
select * from table where ts >= '2021-01-01 19:00:00' and ts < '2021-01-01 19:05:00';

 

  • Average passing QPS per 2 minutes for a single application-resource over a given 3-hour window: 1.4 ms / 1.3 ms / 1.4 ms
select avg(pass_qps) from table where ts >= '2021-01-01 19:00:00' and ts < '2021-01-01 22:00:00' interval (2m);

 

  • Average passing QPS per 2 minutes over one day, grouped by application: 2.34 s / 2.34 s / 2.35 s
select avg(pass_qps) from stable where ts >= '2021-01-01 00:00:00' and ts < '2021-01-02 00:00:00' interval (2m) group by appid;

It is worth mentioning that in TDengine 2.0.7.0 this query took about ten seconds, which we considered unacceptable at the time; after several versions, the optimization has been remarkable.

 
  • Average passing QPS per hour over three days, grouped by application: 2.17 s / 2.16 s / 2.17 s
select avg(pass_qps) from stable where ts >= '2021-01-01 00:00:00' and ts < '2021-01-03 00:00:00' interval (60m) group by appid;

Whether for aggregate queries over large data ranges or point queries of all data within a small window, query efficiency is very good.

Compared with the figures from our earlier research, the new version's query performance has improved a great deal, and we believe future iterations will improve it further.

 

Storage

Currently, Sentinel's data is not replicated; the full data set is spread across the three machines. By our calculation, TDengine's compression ratio for the Sentinel monitoring data is 10%, which is quite impressive.
 

06 Summary

At present, TDengine is only a small-scale time-series database pilot at Dewu, and advanced features such as stream computing and built-in query functions are not yet used. As a time-series database, its read/write performance and storage performance are satisfying. Beyond that, its operational difficulty and learning cost are surprisingly low, and a usable cluster is very easy to set up, which is a huge advantage. TDengine also iterates very quickly: several problems we hit in older versions were fixed promptly, and the performance optimizations have been significant.
 
Another strong impression from our research and use of TDengine is how detailed the official documentation is. The technical articles explain TDengine's architecture and design in accessible terms and teach a lot, while the how-to guides have clear, simple steps that greatly reduce the learning cost, allowing developers to quickly complete framework adaptation, cluster setup, and SQL writing.
 
Going forward, we will keep following TDengine's release notes for new features, optimizations, and bug fixes, and will upgrade versions when necessary.
 
We expect TDengine's performance and stability to keep improving, and it will be one of the candidates in technology selection for other suitable business scenarios in the future.
 
Note: The data in this article are based on TDengine versions 2.0.7.0 and 2.0.12.1.