What is TSBS, and why did the time-series database TDengine choose it as its performance benchmarking platform?

TSBS is a performance benchmarking platform for time-series data processing systems (databases). It provides two typical application scenarios, IoT and DevOps, and it is open source and maintained by Timescale. As a benchmarking platform, TSBS is convenient, easy to use, and easy to extend. It covers time-series data generation, writing (loading), and typical queries of various categories, and it automatically summarizes the final results. Because it is open source, many database vendors support it and use it as a professional platform for benchmarking their products.

The following performance benchmark reports all use TSBS as their underlying benchmark platform. Their time span and the prominence of their publishers show that TSBS is already widely recognized:

  • In November 2018, Aliaksandr Valialkin, the founder of VictoriaMetrics, released "High-cardinality TSDB benchmarks: VictoriaMetrics vs TimescaleDB vs InfluxDB", which compared the performance of VictoriaMetrics with TimescaleDB and InfluxDB.

  • In November 2018, the article "ClickHouse Crushing Time Series" compared the performance of TimescaleDB, InfluxDB, and ClickHouse in time series data scenarios.

  • In March 2020, Cloudera released "Benchmarking Time Series workloads on Apache Kudu using TSBS" on its blog, comparing the overall performance of Apache Kudu, InfluxDB, VictoriaMetrics, ClickHouse, and others in DevOps scenarios.

  • In March 2020, Redis released the TSBS-based performance report "RedisTimeSeries Version 1.2 Benchmarks".

  • In August 2020, Timescale released a performance comparison report "TimescaleDB vs. InfluxDB: Purpose Built Differently for Time-Series Data" on its official blog.

  • In August 2021, QuestDB released a performance comparison report between QuestDB and TimescaleDB - "QuestDB vs. TimescaleDB".

To evaluate the performance of TDengine 3.0 objectively, accurately, and effectively, TDengine decided to use TSBS (Time Series Benchmark Suite) as its benchmark platform and to run an overall performance evaluation of TDengine 3.0 covering writing, querying, storage, and resource consumption.

The DevOps scenario is a typical time-series application scenario. The TSBS DevOps scenario provides simulated CPU status data: for each device (CPU), every record contains 10 measurement values (metrics), 1 timestamp (nanosecond resolution), and 10 tag values (tags). The generator produces one record per device every 10 seconds. The specific content and sample data are as follows:
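As a rough illustration of that record shape, the sketch below builds records with 10 tags, 10 metrics, and a nanosecond timestamp that advances by 10 seconds. The function `generate_cpu_records` is hypothetical, and the tag and metric names are assumed to mirror the TSBS cpu-only data set; check them against the TSBS repository before relying on them. The values are random placeholders, not TSBS output.

```python
import random

# Assumed field names (not taken from this article); verify against TSBS.
TAG_KEYS = [
    "hostname", "region", "datacenter", "rack", "os",
    "arch", "team", "service", "service_version", "service_environment",
]
METRIC_KEYS = [
    "usage_user", "usage_system", "usage_idle", "usage_nice", "usage_iowait",
    "usage_irq", "usage_softirq", "usage_steal", "usage_guest", "usage_guest_nice",
]
INTERVAL_NS = 10 * 1_000_000_000  # one record every 10 seconds, in nanoseconds


def generate_cpu_records(hostname, start_ns, count, seed=0):
    """Yield `count` illustrative records for one device (hypothetical helper)."""
    rng = random.Random(seed)
    tags = {key: f"{key}-demo" for key in TAG_KEYS}
    tags["hostname"] = hostname
    for i in range(count):
        yield {
            "timestamp_ns": start_ns + i * INTERVAL_NS,
            "tags": dict(tags),  # copy so callers can modify a record safely
            "metrics": {key: rng.randint(0, 100) for key in METRIC_KEYS},
        }


if __name__ == "__main__":
    for record in generate_cpu_records("host_0", start_ns=0, count=2):
        print(record["timestamp_ns"], record["tags"]["hostname"], record["metrics"])
```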

TSBS testing can be divided into two main parts: data writing and data querying. This benchmark evaluation involves the following five scenarios; the data scale and characteristics of each are shown in the table below:

As the table above shows, the five scenarios differ mainly in the number of devices in the data set and the number of records per device, while the data interval stays at 10 seconds. Overall, none of the five data sets is large: the largest is scenario 5, with 180 million records, and the smallest is scenario 1, with only 26.78 million records. In scenarios 4 and 5, because the number of devices is relatively large, the data set covers a time span of only 3 minutes.
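The record counts follow from simple arithmetic: devices multiplied by the number of 10-second intervals in the time span. The sketch below checks the two quoted figures. It assumes scenario 5 is the 10-million-device case mentioned in the write results; the 100 devices and 31 days used for scenario 1 are assumptions chosen only because they reproduce the quoted 26.78 million. `expected_records` is a hypothetical helper.

```python
def expected_records(devices, span_seconds, interval_seconds=10):
    """Total records = devices x (time span / sampling interval)."""
    return devices * (span_seconds // interval_seconds)


# Scenario 5: assumed 10 million devices over the stated 3-minute span.
scenario_5 = expected_records(10_000_000, 3 * 60)

# Scenario 1: assumed 100 devices over 31 days (not stated in this article).
scenario_1 = expected_records(100, 31 * 24 * 60 * 60)

print(scenario_5)  # 180000000
print(scenario_1)  # 26784000
```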

To ensure the fairness, reliability, and reproducibility of the test results, we built the benchmark hardware environment on a public IaaS platform and adopted the setup used in most performance comparison reports: r4.8xlarge instances in the Amazon EC2 environment as the basic operating platform. The instances are located in North America and comprise 1 server and 1 client. The client and server hardware configurations are identical, and both use a 10 Gbps network connection. The configuration profile is as follows:

The software compared in this test is InfluxDB 1.8.10 and TimescaleDB 2.6.0. Note that because InfluxDB 2.0 is not supported in the main branch of TSBS, this test uses the latest InfluxDB version that the main branch does support, which is 1.8.10.

The TSBS test process itself is relatively simple. To compare write performance, we configure the parameters, run the TSBS framework script directly, and wait for the results. For queries, we run the tests as an automated batch: each query statement is executed 5,000 times, and the arithmetic mean of the query latency is taken as the final latency result. We also monitored and recorded the system resource overhead and load on the server and client nodes throughout the process.
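The latency calculation described here can be sketched as follows. This is not the TSBS implementation, only a minimal illustration of "run a query 5,000 times and take the arithmetic mean"; `mean_latency_ms` and the `run_query` callable are hypothetical stand-ins for whatever executes a single query against the database under test.

```python
import time
from statistics import fmean


def mean_latency_ms(run_query, repetitions=5000):
    """Run `run_query` repeatedly and return the arithmetic mean latency in ms."""
    latencies_ms = []
    for _ in range(repetitions):
        started = time.perf_counter()
        run_query()  # hypothetical: executes one query and waits for the result
        latencies_ms.append((time.perf_counter() - started) * 1000.0)
    return fmean(latencies_ms)


if __name__ == "__main__":
    # Placeholder workload standing in for a real database query.
    print(f"{mean_latency_ms(lambda: sum(range(1000))):.4f} ms")
```

An arithmetic mean is sensitive to a few slow outliers, so recording the raw latencies alongside it makes it possible to inspect percentiles later if needed.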

The following is a brief summary of the test results. As shown in the table below, in all five scenarios TDengine's write performance is better than that of InfluxDB and TimescaleDB, and its resource consumption during writing is the lowest. Compared with InfluxDB, TDengine's largest write advantage is in the 10-million-device scenario, where its write performance is 10.6 times that of InfluxDB; compared with TimescaleDB, its largest advantage is in the 4,000-device scenario, where its write performance is 6.7 times that of TimescaleDB.

For the query test, we compared queries in 5 categories and 15 subcategories. The results summary in the figure below shows that in all 15 query types TDengine outperforms InfluxDB and TimescaleDB, with lower latency on every query. The most striking result is in the Double Rollups query types, where TDengine's performance is up to 34 times that of InfluxDB and 24 times that of TimescaleDB.

The above is the background to TDengine's TSBS-based test report. Click "Comparative Test of Performance of TimescaleDB, InfluxDB and TDengine Based on TSBS Standard Dataset Time Series Database" to view the full report.


To learn more about TDengine, you can view the source code on GitHub.


Origin blog.csdn.net/taos_data/article/details/129277924