TPC-H Benchmark: Databend Cloud vs. Snowflake

Quick overview

TPC-H

The TPC-H benchmark is a standard for evaluating decision support systems, focusing on complex queries and data maintenance. In this analysis, we compared Databend Cloud and Snowflake on the TPC-H SF100 dataset (SF1 = 6 million rows), which contains 100GB of data and approximately 600 million rows, and ran all 22 TPC-H queries against it.

Disclaimer

TPC Benchmark™ and TPC-H™ are trademarks of the Transaction Processing Performance Council (TPC). Our benchmarks, although inspired by TPC-H, are not directly comparable to official TPC-H results.

Snowflake and Databend Cloud

  • Snowflake: Snowflake is known for advanced features such as decoupled storage and compute, on-demand scalable compute, data sharing, and cloning.
  • Databend Cloud: Databend Cloud is a cloud-native data warehouse that offers functionality similar to Snowflake's. It also separates storage from compute and scales compute on demand. It was developed from the open source Databend project and is positioned as a modern, cost-effective alternative to Snowflake, especially for large-scale analytics.

Performance and cost comparison

  • Data loading: Databend Cloud's cost is about 67% lower than Snowflake's ($0.25 vs. $0.77).
  • Query execution: Databend Cloud's cost is about 60% lower than Snowflake's in the cold run ($0.09 vs. $0.23) and about 53% lower in the warm run ($0.07 vs. $0.15).

Notice

No tuning was performed in the benchmark. Results are based on Snowflake and Databend Cloud's default settings. Remember, don’t just take our word for it – we encourage you to run it yourself and verify these results.

Data loading benchmark

Table          Snowflake (s)    Databend Cloud (s)    Rows
customer       18.137           13.436                15,000,000
lineitem       477.740          305.812               600,037,902
nation         1.347            0.708                 25
orders         103.088          64.323                150,000,000
part           19.908           12.192                20,000,000
partsupp       67.410           45.346                80,000,000
region         0.743            0.725                 5
supplier       3.000            3.687                 10,000,000
Total time     695s             446s
Total cost     $0.77            $0.25
Storage size   20.8GB           24.5GB

Query Benchmark: Cold Start

Query          Snowflake (s)    Databend Cloud (s)
TPC-H 1        11.703           8.036
TPC-H 2        4.524            3.786
TPC-H 3        8.908            6.040
TPC-H 4        8.108            4.462
TPC-H 5        9.202            7.014
TPC-H 6        1.237            3.234
TPC-H 7        9.082            7.345
TPC-H 8        10.886           8.976
TPC-H 9        18.152           13.340
TPC-H 10       13.525           12.891
TPC-H 11       2.582            2.183
TPC-H 12       10.099           8.839
TPC-H 13       13.458           7.206
TPC-H 14       8.001            4.612
TPC-H 15       8.737            4.621
TPC-H 16       4.864            1.645
TPC-H 17       5.363            14.315
TPC-H 18       19.971           12.058
TPC-H 19       9.893            12.579
TPC-H 20       8.538            8.836
TPC-H 21       16.439           12.270
TPC-H 22       3.744            1.926
Total time     207s             166s
Total cost     $0.23            $0.09

Query Benchmark: Warm Start

Query          Snowflake (s)    Databend Cloud (s)
TPC-H 1        8.934            7.568
TPC-H 2        3.018            3.125
TPC-H 3        6.089            5.234
TPC-H 4        4.914            3.392
TPC-H 5        5.800            4.857
TPC-H 6        0.891            2.142
TPC-H 7        5.381            4.389
TPC-H 8        5.724            5.887
TPC-H 9        10.283           9.621
TPC-H 10       10.368           8.524
TPC-H 11       1.165            1.364
TPC-H 12       7.052            5.352
TPC-H 13       12.829           6.180
TPC-H 14       3.288            2.725
TPC-H 15       3.475            2.748
TPC-H 16       4.094            1.124
TPC-H 17       4.203            13.757
TPC-H 18       18.583           11.630
TPC-H 19       3.888            7.881
TPC-H 20       6.379            5.797
TPC-H 21       10.287           9.806
TPC-H 22       1.573            1.122
Total time     138s             124s
Total cost     $0.15            $0.07

Reproducing benchmarks

You can reproduce the benchmark by following the steps below.

Benchmark environment

Both Snowflake and Databend Cloud were tested under similar conditions:

Parameter        Snowflake    Databend Cloud
Warehouse size   Small        Small
vCPU             16           16
Price            $4/hour      $2/hour
AWS Region       us-east-2    us-east-2
Storage          AWS S3       AWS S3

  • The TPC-H SF100 dataset, sourced from Amazon Redshift, was loaded into Databend Cloud and Snowflake without any specific tuning.
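
The cost figures in the result tables appear to follow directly from the total runtimes and the hourly prices listed above. The sketch below shows that arithmetic. It assumes billing is proportional to elapsed seconds with no minimum charge, which this article does not state; the function names `run_cost` and `relative_saving` are illustrative only.

```python
def run_cost(seconds: float, price_per_hour: float) -> float:
    """Cost of a run, assuming billing proportional to elapsed time."""
    return seconds / 3600 * price_per_hour


def relative_saving(baseline: float, candidate: float) -> float:
    """Fraction by which `candidate` is cheaper than `baseline`."""
    return (baseline - candidate) / baseline


# Total runtimes in seconds, taken from the result tables in this article.
RUNS = {
    "load": {"snowflake": 695, "databend": 446},
    "cold": {"snowflake": 207, "databend": 166},
    "warm": {"snowflake": 138, "databend": 124},
}
# Hourly prices in USD, taken from the environment table.
PRICE = {"snowflake": 4.0, "databend": 2.0}

if __name__ == "__main__":
    for phase, seconds in RUNS.items():
        sf = run_cost(seconds["snowflake"], PRICE["snowflake"])
        db = run_cost(seconds["databend"], PRICE["databend"])
        print(f"{phase}: Snowflake ${sf:.2f}, Databend Cloud ${db:.2f}, "
              f"saving {relative_saving(sf, db):.0%}")
```

Rounded to cents, these values match the total cost rows in the tables. The savings computed from unrounded costs can differ by about a percentage point from savings computed from the rounded dollar amounts.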

Benchmarking method

We ran a cold round and a warm round of query execution:

  1. Cold run: The warehouse is suspended and resumed before executing the query, so no cached data is available.
  2. Warm run: The warehouse is not suspended, so the query can use the local disk cache.
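
The two rounds above can be sketched as a small timing harness. This is a minimal illustration, not the script used for the published numbers: `run_query` and `before_each` are hypothetical callables you would implement with your own database client, and calling the suspend/resume hook before every query (rather than once per round) is an assumption.

```python
import time
from typing import Callable, Dict, Optional


def time_queries(
    queries: Dict[str, str],
    run_query: Callable[[str], None],
    before_each: Optional[Callable[[], None]] = None,
) -> Dict[str, float]:
    """Run each query once and return wall-clock seconds per query name.

    `before_each` is an optional hook called before every query. For a
    cold run it would suspend and resume the warehouse; for a warm run
    it is left as None so the cache stays populated.
    """
    timings: Dict[str, float] = {}
    for name, sql in queries.items():
        if before_each is not None:
            before_each()  # hook time is deliberately excluded from the measurement
        start = time.perf_counter()
        run_query(sql)
        timings[name] = time.perf_counter() - start
    return timings
```

A cold round would then be `time_queries(queries, run_query, before_each=suspend_and_resume)` and a warm round `time_queries(queries, run_query)`, where `suspend_and_resume` is again a placeholder for whatever mechanism your warehouse provides.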

Prerequisites

Data loading

  1. Snowflake data loading :

  2. Databend Cloud data loading :

TPC-H query

  1. Snowflake query :

  2. Databend Cloud query :


Origin my.oschina.net/u/5489811/blog/11044358