Quick overview
TPC-H
The TPC-H benchmark is a standard for evaluating decision support systems, focusing on complex queries and data maintenance. In this analysis, we compare Databend Cloud and Snowflake on the TPC-H SF100 dataset (SF1 = 6 million rows), which contains 100 GB of data and approximately 600 million rows, using the benchmark's 22 queries.
Disclaimer
TPC Benchmark™ and TPC-H™ are trademarks of the Transaction Processing Performance Council (TPC). Our benchmarks, although inspired by TPC-H, are not directly comparable to official TPC-H results.
Snowflake and Databend Cloud
- Snowflake: Snowflake is known for advanced features such as decoupled storage and compute, on-demand scalable compute, data sharing, and cloning.
- Databend Cloud: Databend Cloud is a cloud-native data warehouse that offers functionality similar to Snowflake's. It also separates storage from compute and scales compute on demand. It is built on the open-source Databend project and is positioned as a modern, cost-effective alternative to Snowflake, especially for large-scale analytics.
Performance and cost comparison
- Data loading: Databend Cloud's cost is about 67% lower than Snowflake's.
- Query execution: Databend Cloud is approximately 60% more cost-efficient than Snowflake.
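The percentages above can be cross-checked against the total costs reported in the result tables of this document. The snippet below is a minimal sketch of that arithmetic; the helper name `percent_lower` and the dictionary layout are illustrative choices, not part of the benchmark scripts.

```python
def percent_lower(baseline: float, candidate: float) -> float:
    """Return how much lower `candidate` is than `baseline`, in percent."""
    return (baseline - candidate) / baseline * 100.0


# (Snowflake cost, Databend Cloud cost) in USD, copied from the result tables.
reported_costs = {
    "data loading": (0.77, 0.25),
    "queries, cold": (0.23, 0.09),
    "queries, warm": (0.15, 0.07),
}

# Relative saving for each stage of the benchmark.
savings = {
    stage: percent_lower(snowflake, databend)
    for stage, (snowflake, databend) in reported_costs.items()
}

for stage, pct in savings.items():
    print(f"{stage}: Databend Cloud cost is {pct:.1f}% lower")
```

Because the reported costs are rounded to whole cents, the percentages computed this way are approximate, especially for the small query totals.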
Note
No tuning was performed for this benchmark. Results are based on the default settings of Snowflake and Databend Cloud. Don't just take our word for it: we encourage you to run the benchmark yourself and verify these results.
Data loading benchmark
Table | Snowflake load time in seconds (695s total, cost $0.77) | Databend Cloud load time in seconds (446s total, cost $0.25) | Rows |
---|---|---|---|
customer | 18.137 | 13.436 | 15,000,000 |
lineitem | 477.740 | 305.812 | 600,037,902 |
nation | 1.347 | 0.708 | 25 |
orders | 103.088 | 64.323 | 150,000,000 |
part | 19.908 | 12.192 | 20,000,000 |
partsupp | 67.410 | 45.346 | 80,000,000 |
region | 0.743 | 0.725 | 5 |
supplier | 3.000 | 3.687 | 10,000,000 |
total time | 695s | 446s | |
total cost | $0.77 | $0.25 | |
Storage size | 20.8GB | 24.5GB | |
Query Benchmark: Cold Start
Query | Snowflake time in seconds (207s total, cost $0.23) | Databend Cloud time in seconds (166s total, cost $0.09) |
---|---|---|
TPC-H 1 | 11.703 | 8.036 |
TPC-H 2 | 4.524 | 3.786 |
TPC-H 3 | 8.908 | 6.040 |
TPC-H 4 | 8.108 | 4.462 |
TPC-H 5 | 9.202 | 7.014 |
TPC-H 6 | 1.237 | 3.234 |
TPC-H 7 | 9.082 | 7.345 |
TPC-H 8 | 10.886 | 8.976 |
TPC-H 9 | 18.152 | 13.340 |
TPC-H 10 | 13.525 | 12.891 |
TPC-H 11 | 2.582 | 2.183 |
TPC-H 12 | 10.099 | 8.839 |
TPC-H 13 | 13.458 | 7.206 |
TPC-H 14 | 8.001 | 4.612 |
TPC-H 15 | 8.737 | 4.621 |
TPC-H 16 | 4.864 | 1.645 |
TPC-H 17 | 5.363 | 14.315 |
TPC-H 18 | 19.971 | 12.058 |
TPC-H 19 | 9.893 | 12.579 |
TPC-H 20 | 8.538 | 8.836 |
TPC-H 21 | 16.439 | 12.270 |
TPC-H 22 | 3.744 | 1.926 |
total time | 207s | 166s |
total cost | $0.23 | $0.09 |
Query Benchmark: Warm Start
Query | Snowflake time in seconds (138s total, cost $0.15) | Databend Cloud time in seconds (124s total, cost $0.07) |
---|---|---|
TPC-H 1 | 8.934 | 7.568 |
TPC-H 2 | 3.018 | 3.125 |
TPC-H 3 | 6.089 | 5.234 |
TPC-H 4 | 4.914 | 3.392 |
TPC-H 5 | 5.800 | 4.857 |
TPC-H 6 | 0.891 | 2.142 |
TPC-H 7 | 5.381 | 4.389 |
TPC-H 8 | 5.724 | 5.887 |
TPC-H 9 | 10.283 | 9.621 |
TPC-H 10 | 10.368 | 8.524 |
TPC-H 11 | 1.165 | 1.364 |
TPC-H 12 | 7.052 | 5.352 |
TPC-H 13 | 12.829 | 6.180 |
TPC-H 14 | 3.288 | 2.725 |
TPC-H 15 | 3.475 | 2.748 |
TPC-H 16 | 4.094 | 1.124 |
TPC-H 17 | 4.203 | 13.757 |
TPC-H 18 | 18.583 | 11.630 |
TPC-H 19 | 3.888 | 7.881 |
TPC-H 20 | 6.379 | 5.797 |
TPC-H 21 | 10.287 | 9.806 |
TPC-H 22 | 1.573 | 1.122 |
total time | 138s | 124s |
total cost | $0.15 | $0.07 |
Reproducing benchmarks
You can reproduce the benchmark by following the steps below.
Benchmark environment
Both Snowflake and Databend Cloud were tested under similar conditions:
Parameter | Snowflake | Databend Cloud |
---|---|---|
Warehouse size | Small | Small |
vCPU | 16 | 16 |
Price | $4/hour | $2/hour |
AWS Region | us-east-2 | us-east-2 |
Storage | AWS S3 | AWS S3 |
- The TPC-H SF100 dataset, sourced from Amazon Redshift, was loaded into both Databend Cloud and Snowflake without any specific tuning.
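The hourly prices in the table explain the cost figures reported alongside the timings. The sketch below assumes that cost is simply elapsed time multiplied by the hourly warehouse price, prorated per second. That billing model is an assumption made for illustration, and `run_cost` is a hypothetical helper. Under that assumption, the reported totals are reproduced to the cent.

```python
SECONDS_PER_HOUR = 3600


def run_cost(elapsed_seconds: float, price_per_hour: float) -> float:
    """Cost in USD of running a warehouse for `elapsed_seconds`.

    Assumes per-second proration of the hourly price (an illustrative
    assumption, not a statement about either vendor's billing rules).
    """
    return elapsed_seconds * price_per_hour / SECONDS_PER_HOUR


# Hourly prices from the environment table.
PRICE_PER_HOUR = {"Snowflake": 4.0, "Databend Cloud": 2.0}

# Total elapsed seconds from the result tables.
TOTAL_SECONDS = {
    "data loading": {"Snowflake": 695, "Databend Cloud": 446},
    "queries, cold": {"Snowflake": 207, "Databend Cloud": 166},
    "queries, warm": {"Snowflake": 138, "Databend Cloud": 124},
}

# Cost per stage and platform, rounded to whole cents.
costs = {
    stage: {
        platform: round(run_cost(seconds, PRICE_PER_HOUR[platform]), 2)
        for platform, seconds in per_platform.items()
    }
    for stage, per_platform in TOTAL_SECONDS.items()
}

for stage, per_platform in costs.items():
    print(stage, per_platform)
```

The same helper can be used to estimate the cost of your own runs by substituting your measured times and the price of the warehouse size you choose.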
Benchmarking method
We ran each query in two rounds, cold and warm:
- Cold run: The warehouse is suspended and resumed before executing the query.
- Warm run (hot run): The warehouse is not suspended, so the query can use the local disk cache.
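The two rounds can be sketched as a small timing harness. This is an illustration only, not the script used for the published numbers: `run_query` and `before_each` are hypothetical callables that you would wire to your own client, and the demo below substitutes stand-ins so the snippet runs without a warehouse. Calling the hook before every query is one reading of the cold run; calling it once before the round is another.

```python
import time
from typing import Callable, Dict, Optional


def time_queries(
    run_query: Callable[[str], object],
    queries: Dict[str, str],
    before_each: Optional[Callable[[], None]] = None,
) -> Dict[str, float]:
    """Run each query once and return its wall-clock time in seconds.

    `before_each` is called before every query, outside the timed region.
    For a cold run it would suspend and resume the warehouse; for a warm
    run it is left as None so caches stay populated.
    """
    timings: Dict[str, float] = {}
    for name, sql in queries.items():
        if before_each is not None:
            before_each()
        start = time.perf_counter()
        run_query(sql)
        timings[name] = time.perf_counter() - start
    return timings


# Stand-ins for demonstration; replace with real client calls.
def fake_run_query(sql: str) -> None:
    time.sleep(0.01)  # pretend the query takes about 10 ms


restarts = []


def fake_restart() -> None:
    restarts.append(1)  # record that a suspend/resume would happen here


demo_queries = {"TPC-H 1": "SELECT 1", "TPC-H 2": "SELECT 2"}

cold = time_queries(fake_run_query, demo_queries, before_each=fake_restart)
warm = time_queries(fake_run_query, demo_queries)
print("cold:", cold)
print("warm:", warm)
```

Keeping the restart hook outside the timed region means resume latency is not counted as query time; whether the published cold figures include it is not stated here.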
Prerequisites
- Have a Snowflake account.
- Create a Databend Cloud account.
Data loading
- Snowflake data loading:
  - Log in to your Snowflake account.
  - Create the tables corresponding to the TPC-H schema. SQL script.
  - Use `COPY INTO` commands to load the data from AWS S3. SQL script.
- Databend Cloud data loading:
  - Log in to your Databend Cloud account.
  - Create the necessary tables, consistent with the TPC-H schema. SQL script.
  - Load the data from AWS S3 in the same way as for Snowflake. SQL script.
TPC-H queries
- Snowflake queries:
  - Log in to your Snowflake account.
  - Run the TPC-H queries. SQL script.
- Databend Cloud queries:
  - Log in to your Databend Cloud account.
  - Run the TPC-H queries. SQL script.