Slow queries on traditional database tables? How TDengine solved the application problems of "Sohu Fund"

This project needs to display the net value and returns of domestic funds (money market funds) in real time. Beyond line-chart display, it also has to support statistical rankings and paginated results so that users get comprehensive, real-time query services. The MySQL database that the Sohu Fund team used previously hit a capacity bottleneck when faced with massive data, so the team decided to try a new solution based on TDengine.

Selection background

Before using TDengine, we were using the MySQL database.

The fund data we purchase from data sources arrives mixed together: about 20,000 domestic funds and tens of millions of rows of wide data spanning decades (from the 1990s to the present). To store it properly in MySQL, we would first have had to split the data into one table per fund, which is a considerable amount of work, so we decided to put everything into a single table first.

However, such a large table made queries very slow. To work around this, we generated images of the daily fund data through offline queries and returned those to users; we did not yet offer customized queries externally.

After this experience, we began to doubt whether a traditional relational database could cope with data at this scale. Around that time we learned about TDengine, which is at its core a time-series database. Its "one device, one table" design happens to match the "one fund, one table" split we were working toward, so we decided to try a new solution based on TDengine.

Usage experience

After thorough research and testing, we found:

Thanks to the "super table", data modeling became very clear, and almost all queries can be written as simple SQL centered on the super table.

In addition, with TDengine's "automatic table creation" feature, we can write data without first checking whether the target table exists, which made it easy to create one table per fund and write each fund's data into it.
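
As a rough illustration of automatic table creation (not the team's actual code), the sketch below builds an `INSERT ... USING ... TAGS ... VALUES` statement, the form TDengine uses to create a child table on first write. The super table name `fund_net_value` comes from the queries in this article; the child table name `fund_000001`, the choice of fund code and type as tags, and the single value column are assumptions made for the example.

```python
def quote(value):
    """Render a Python value as a SQL literal; strings are single-quoted."""
    if isinstance(value, str):
        # This sketch refuses awkward input instead of guessing escape rules.
        if "'" in value or "\\" in value:
            raise ValueError(f"unsupported character in {value!r}")
        return f"'{value}'"
    return str(value)


def auto_create_insert(stable, child, tags, row):
    """Build one INSERT that creates `child` from `stable` if it is missing."""
    tag_sql = ", ".join(quote(t) for t in tags)
    row_sql = ", ".join(quote(v) for v in row)
    return (
        f"INSERT INTO {child} USING {stable} "
        f"TAGS ({tag_sql}) VALUES ({row_sql})"
    )


# Hypothetical fund: tags are (code, type); the row is (timestamp, net value).
sql = auto_create_insert(
    stable="fund_net_value",
    child="fund_000001",
    tags=["000001", "003009"],
    row=["2023-11-01 00:00:00", 1.0234],
)
print(sql)
```

Because the statement names both the child table and the super table, the writer does not need a separate existence check or `CREATE TABLE` step for each fund.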

Overall, the preparation for adopting TDengine went very smoothly.

We set up a TDengine cluster on three servers, each with 4 CPU cores and 16 GB of memory.

The database creation statement is as follows:

[Image in the original post: database creation statement]

It is worth mentioning that fund data arrives as one record per fund per day, which makes it low-frequency data. The default configuration is not well suited to this kind of data: at first our queries were slow, typically taking a second or longer.

With help from the documentation, blog posts, and the official team, we increased the duration and stt_trigger parameters so that excessive file fragmentation would not hurt read and write performance. After that, all queries ran at the millisecond level.
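
To make the tuning concrete, here is a minimal sketch of a `CREATE DATABASE` statement that sets both options. `DURATION` and `STT_TRIGGER` are database options in TDengine 3.x, but the database name `fund` and the values 365 and 8 are placeholders chosen for illustration, not the team's real settings; valid ranges depend on the TDengine version and edition.

```python
def create_database_sql(name, duration_days, stt_trigger):
    """Build a CREATE DATABASE statement with the two tuned options."""
    if duration_days < 1:
        raise ValueError("duration_days must be at least 1")
    if stt_trigger < 1:
        raise ValueError("stt_trigger must be at least 1")
    return (
        f"CREATE DATABASE IF NOT EXISTS {name} "
        f"DURATION {duration_days}d STT_TRIGGER {stt_trigger}"
    )


# Placeholder values: a longer time span per data file for low-frequency data.
ddl = create_database_sql("fund", duration_days=365, stt_trigger=8)
print(ddl)
```

A larger duration means each data file covers a longer time span, so a dataset with one row per fund per day is spread over fewer files.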

Our takeaway: data written at different frequencies belongs to different business scenarios. It is best not to put it in the same database, but to handle each in a separate database.

The super table is modeled as follows:

[Image in the original post: super table definition]

In daily use today, business queries return data in milliseconds at 100 QPS.

Because of how the super table is designed, statistical analysis across the whole super table is much more convenient. For example, a query over all funds filtered by type and date:

select time, code, name, manager_id, manager_name, unit_net_value, pre_unit_net_value, accumulate_net_value, pre_day_rate, pre_week_rate, pre_month_rate, pre_three_month_rate, pre_half_year_rate, pre_year_rate, pre_cur_year_rate, pre_start_rate, last_time, last_unit_net_value, last_accumulate_net_value, asset_size from fund_net_value where time = #{date} and (type = '003009' or type = '003010')

As another example, the current net value ranking and return ranking of funds can be queried with simple SQL:

select time, code, name, manager_id, manager_name, unit_net_value, pre_unit_net_value, accumulate_net_value, pre_day_rate, pre_week_rate, pre_month_rate, pre_three_month_rate, pre_half_year_rate, pre_year_rate, pre_cur_year_rate, pre_start_rate, last_time, last_unit_net_value, last_accumulate_net_value, asset_size from fund_net_value where time = #{date} order by ${column} ${sort} limit #{offset}, #{size}
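
The `${column}` and `${sort}` placeholders in the ranking query are substituted into the SQL text directly, so they are worth validating before use. The sketch below shows one possible way to do that with a whitelist. The column names are taken from the query above, while the function name `order_clause` and the particular set of sortable columns are assumptions for illustration. It emits the explicit `LIMIT ... OFFSET ...` form to express the same page.

```python
# Columns a caller may sort by; a subset of the columns in the query above.
SORTABLE_COLUMNS = {
    "unit_net_value",
    "accumulate_net_value",
    "pre_day_rate",
    "pre_week_rate",
    "pre_month_rate",
    "pre_year_rate",
}


def order_clause(column, sort, offset, size):
    """Return a validated ORDER BY / LIMIT clause for the ranking query."""
    if column not in SORTABLE_COLUMNS:
        raise ValueError(f"cannot sort by {column!r}")
    direction = sort.upper()
    if direction not in ("ASC", "DESC"):
        raise ValueError(f"invalid sort direction {sort!r}")
    offset, size = int(offset), int(size)
    if offset < 0 or size <= 0:
        raise ValueError("offset must be >= 0 and size must be > 0")
    return f"ORDER BY {column} {direction} LIMIT {size} OFFSET {offset}"


# Third page of 20 funds ranked by daily return, highest first.
clause = order_clause("pre_day_rate", "desc", offset=40, size=20)
print(clause)
```

Anything outside the whitelist is rejected before it reaches the database, so the sort column can still be chosen by the user without being trusted as raw SQL.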

We have also built a Grafana-based visual monitoring system. It uses various monitoring tools to collect, store, and analyze monitoring data, and provides real-time charts and alerts through a visual interface, helping project leads spot and fix problems promptly and further improving service reliability and stability.

Closing remarks

All in all, while keeping the service stable and efficient, we have gradually and smoothly replaced the original functionality with TDengine. The domestic fund project is only the beginning; we still need to research and learn more about this domestically developed time-series database for other projects such as stocks.

Company profile

Sohu is a well-known comprehensive Internet company in China. Its main businesses include new media, communications, and mobile value-added services, and it serves as a hub for entertainment, sports, fashion, and culture.

About the author

Wupeng, senior development engineer at Sohu Intelligent Platform.

Origin my.oschina.net/u/4248671/blog/10141605