GreptimeDB v0.3 Officially Released | Comprehensive Improvements in Distributed Capabilities

With June nearly half over and the cicadas of summer beginning to sing, GreptimeDB v0.3 has arrived on schedule.

In v0.2, released in mid-April, our main goals were the standalone mode, PromQL compatibility, and write performance optimization. With v0.2 giving the standalone version a solid foundation, the keyword for v0.3 is "distributed": all new capabilities and updates are delivered on the distributed version, which adds the scalability, high availability, and fault tolerance that the standalone version lacks. In summary, the main improvements are as follows:

  • Distributed performance optimization: Region-level high availability is implemented, with fast failover scheduling for disaster recovery. Distributed write performance has also been optimized.
  • Query capability improvements: support for distributed query optimization, improvements to important SQL queries (such as TopK), and optimized data compression strategies to speed up queries.
  • Stability enhancements: To make the system more robust and reliable, the Procedure framework is introduced to ensure the eventual consistency of multi-step operations. A finer-grained hybrid flush strategy improves write stability, and more metrics have been added to improve the observability of the system, along with support for tools such as Tokio console.

From v0.2 to v0.3, in one and a half months, the GreptimeDB project alone merged 222 PRs touching 674 files, including 120+ feature improvements, 50+ fixes, and 20+ refactorings. Behind these numbers is the work of 27 community contributors. Special thanks to the community and the team for their efforts. Next, we review the core content of v0.3.

GreptimeDB v0.3 key contents

Achieve Region-level high availability

Ensuring high availability in a distributed system requires Region-level disaster recovery. GreptimeDB supports this in v0.3, an important milestone toward distributed high availability. Because it requires changes to several components, including Frontend, Meta, and Datanode, the impact is broad. We track the entire implementation through Issue#1126; if you are interested, you are welcome to follow it. Of course, it all started with RFC: Region Fault Tolerance.

Important SQL query scenario optimization

Drawing on past business experience, in developing GreptimeDB v0.3 we focused on optimizing the most commonly used query scenarios, such as TopK, mainly by applying the idea of pruning. We divided the work into multiple sub-tasks and tracked them through Issue#1286.
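To make the pruning idea concrete, here is a minimal Python sketch. It is not GreptimeDB's implementation: the chunk layout and the function name top_k_with_pruning are hypothetical, and the sketch assumes each chunk of data carries a precomputed maximum. Chunks whose maximum cannot beat the current k-th largest value are skipped without being read.

```python
import heapq

def top_k_with_pruning(chunks, k):
    """Return the k largest values and the number of chunks actually scanned.

    `chunks` is a list of (chunk_max, values) pairs. chunk_max is a
    hypothetical precomputed statistic, not a GreptimeDB structure.
    """
    heap = []      # min-heap holding the current top-k candidates
    scanned = 0
    # Visit chunks with the largest maxima first so the threshold rises quickly.
    for chunk_max, values in sorted(chunks, key=lambda c: c[0], reverse=True):
        if len(heap) == k and chunk_max <= heap[0]:
            break  # no remaining chunk can improve the top-k: prune them all
        scanned += 1
        for v in values:
            if len(heap) < k:
                heapq.heappush(heap, v)
            elif v > heap[0]:
                heapq.heapreplace(heap, v)
    return sorted(heap, reverse=True), scanned

chunks = [
    (9, [3, 9, 1]),
    (4, [4, 2, 0]),
    (8, [8, 7, 5]),
    (2, [1, 2, 2]),
]
result, scanned = top_k_with_pruning(chunks, k=3)
print(result, scanned)  # [9, 8, 7] 2
```

In this toy input only two of the four chunks are read, because the other two have maxima below the third-largest value already found.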

Distributed query optimization, supporting common operator pushdown

To improve the analytical capabilities of databases, the Near-Data Processing approach was proposed long ago: give the storage layer some computing power so that simple calculations are completed where the data resides before results are returned, which avoids transferring large amounts of raw data. Operator pushdown is the most common implementation. GreptimeDB v0.3 supports pushing down most PromQL operators and predicates to optimize distributed queries. For such an important feature, we discussed the implementation plan in advance through RFC: Distributed Planner. For the implementation itself, see PR#1660. (Greptime development lore: this PR modifies more than 1,000 lines of code, which we strongly discourage internally, but because the author hands out red envelopes, everyone reluctantly forgives him and even looks forward to it.)
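The effect of pushdown can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory DATANODES dictionary, the (host, cpu) row shape, and all function names are hypothetical and are not part of GreptimeDB. The sketch compares pulling every raw row to the frontend with pushing the predicate and a partial aggregate down to each datanode.

```python
# Hypothetical in-memory stand-ins for datanodes: each holds (host, cpu) rows.
DATANODES = {
    "datanode-1": [("a", 10.0), ("b", 30.0), ("a", 20.0)],
    "datanode-2": [("a", 40.0), ("b", 50.0)],
}

def avg_without_pushdown(host):
    """The frontend pulls every raw row, then filters and aggregates itself."""
    transferred = 0
    total, count = 0.0, 0
    for rows in DATANODES.values():
        transferred += len(rows)      # every row crosses the network
        for h, cpu in rows:
            if h == host:
                total += cpu
                count += 1
    return total / count, transferred

def partial_avg_on_datanode(rows, host):
    """Runs next to the data: apply the predicate, return a (sum, count) partial."""
    matching = [cpu for h, cpu in rows if h == host]
    return sum(matching), len(matching)

def avg_with_pushdown(host):
    """The frontend only merges one small partial result per datanode."""
    transferred = 0
    total, count = 0.0, 0
    for rows in DATANODES.values():
        s, c = partial_avg_on_datanode(rows, host)
        transferred += 1              # a single partial crosses the network
        total += s
        count += c
    return total / count, transferred

print(avg_without_pushdown("a"))
print(avg_with_pushdown("a"))
```

Both functions return the same average, but the pushdown variant moves two partial results instead of five raw rows. The gap grows with the amount of data stored on each node.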

Introducing the Procedure framework to ensure the eventual consistency of multi-step operations

To make the system more robust and reliable, and inspired by Apache HBase's ProcedureV2 framework, GreptimeDB implemented its own Procedure framework in Rust to ensure the eventual consistency of multi-step operations. This is another story that began with RFC: Procedure Framework and is tracked by the very large Issue#286. v0.3 is not the end of it; the story will continue. (By the way, if you find the RFC too dry, you can also refer to Nine Turns of Large Intestines to understand what a Procedure is.)
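As a rough illustration of how a procedure-style framework can make a multi-step operation eventually complete, consider the Python sketch below. It is not GreptimeDB's API (the real framework is written in Rust). The dictionary used as durable storage, the step names, and the run_procedure function are all assumptions made for this example. The idea shown is that progress is persisted after every step, so a restart resumes from the first unfinished step instead of leaving the operation half done.

```python
class CrashError(Exception):
    """Simulates the process dying in the middle of a procedure."""

def run_procedure(proc_id, steps, store, crash_before=None):
    """Run `steps` in order, persisting progress in `store` after each one.

    `store` is a dict standing in for durable storage (an assumption of this
    sketch). On restart, execution resumes from the first unfinished step.
    """
    next_step = store.get(proc_id, 0)
    while next_step < len(steps):
        if crash_before == next_step:
            raise CrashError(f"crashed before step {next_step}")
        steps[next_step]()            # steps should be idempotent
        next_step += 1
        store[proc_id] = next_step    # persist progress
    return "done"

log = []
steps = [  # hypothetical steps of a distributed "create table"
    lambda: log.append("allocate regions"),
    lambda: log.append("create table on datanodes"),
    lambda: log.append("update metadata"),
]
store = {}

try:
    run_procedure("create-table-1", steps, store, crash_before=2)
except CrashError:
    pass                              # the node "restarts" here

run_procedure("create-table-1", steps, store)  # resumes at step 2
print(log)
```

Steps are expected to be idempotent, because a crash between running a step and persisting its completion would cause that step to run again after the restart.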

Adding more metrics to improve the observability of the system

As a reliable storage solution for observability systems, GreptimeDB must itself be highly observable. v0.3 adds more metrics for monitoring how the system is running. These cover every component; see the PR list for details.

Optimization of other details:

  • Support for querying external data and for importing and exporting CSV/JSON/Parquet files
  • Support for the TQL EXPLAIN and TQL ANALYZE clauses to analyze PromQL query performance
  • Improved PromQL compatibility
  • Support enabling Tokio console in cluster mode
  • etc.

Overall, v0.3 is a distributed version ready for initial trials. It offers Region-level service high availability (high data reliability will be completed in subsequent versions), and in key scenarios, with a focus on PromQL queries, its distributed query and write performance has reached or slightly exceeded the level of mainstream comparable databases.

Upgrade Notes

If you are upgrading from 0.2, you need to pay special attention to the following points:

  • If you use local storage, you need to update your configuration: the data_dir option has been deprecated. If you originally set data_dir = "/greptimedb/data", change it to data_home = "/greptimedb". The data_home option now specifies the data root directory.
  • It is recommended to back up your data with the COPY command before upgrading.
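For installations with many configuration files, the data_dir to data_home change can be scripted. The Python sketch below is a hypothetical helper, not an official migration tool. It assumes, based only on the single example above, that data_home is the parent directory of the old data_dir, and that the option appears on its own line in key = "value" form. Review the result by hand before using it.

```python
import posixpath
import re

# Matches a line such as: data_dir = "/greptimedb/data"
_DATA_DIR = re.compile(r'^(\s*)data_dir\s*=\s*"([^"]*)"\s*$')

def migrate_data_dir(config_text):
    """Rewrite `data_dir = "<root>/data"` lines to `data_home = "<root>"`."""
    out = []
    for line in config_text.splitlines():
        m = _DATA_DIR.match(line)
        if m:
            indent, old_dir = m.groups()
            # Assumption: data_home is the parent directory of the old data_dir.
            data_home = posixpath.dirname(old_dir.rstrip("/"))
            line = f'{indent}data_home = "{data_home}"'
        out.append(line)
    return "\n".join(out)

print(migrate_data_dir('data_dir = "/greptimedb/data"'))
# data_home = "/greptimedb"
```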

GreptimeDB v0.4 plan

Starting from 0.3, our development focus has shifted to distributed capabilities, and we have drawn up medium- and long-term plans with monthly iterations. The theme of 0.4 is performance and the improvement of existing features, with a focus on the following:

  • Distributed DDL statements such as CREATE TABLE and DROP TABLE will be integrated with the Procedure framework to ensure the correct execution of distributed multi-step operations.
  • Asynchronous compression and indexing to improve query performance
  • In version 0.3, we supported the pushdown of most operators and predicates in PromQL. Version 0.4 will focus on the pushdown of common operators in SQL to improve SQL query performance.
  • Focus on optimizing the table engine and storage engine to improve read and write performance and reduce resource consumption

Thanks to the community

Thank you to our community and to all contributors. Every suggestion, bug fix, and code contribution from you helps this project keep growing and reach new heights.


About Greptime

Greptime was founded in 2022 and is currently building and refining two products: the time series database GreptimeDB and GreptimeCloud.

GreptimeDB is a time series database written in Rust. It is distributed, open source, cloud native, and highly compatible. It helps enterprises read, write, process, and analyze time series data in real time while reducing the cost of long-term storage.

Based on the open source GreptimeDB, GreptimeCloud provides users with a fully managed DBaaS, as well as application products for observability, the Internet of Things, and other fields. Delivering software and services through the cloud enables rapid self-service provisioning and delivery, standardized operation and maintenance support, and better resource flexibility. GreptimeCloud has officially opened for internal testing. Follow our official account or visit the official website for the latest developments!

Official website: https://greptime.com/

Public account: GreptimeDB

GitHub: https://github.com/GreptimeTeam/greptimedb

Documentation: https://docs.greptime.com/

Twitter: https://twitter.com/Greptime

Slack: https://greptime.com/slack

LinkedIn: https://www.linkedin.com/company/greptime/


Origin my.oschina.net/u/6839317/blog/10090384