What You Should Know About NewSQL: CockroachDB Validation Documentation

Option 8: CockroachDB

CockroachDB is a distributed database that supports transactions, standard SQL operations, and a key-value storage mode. Its three founders all came from Google, and its architecture is inspired by Google's Spanner and F1; the project is open source. Its main features are:

  • A standard SQL interface over the PostgreSQL wire protocol, compatible with the relational SQL ecosystem;
  • Strong scalability and high concurrency, with support for an MPP-like parallel query framework;
  • Elastic capacity: nodes can be added on demand, with automatic load rebalancing;
  • Strongly consistent multi-replica storage, using the Raft consensus algorithm;
  • A highly available, decentralized service with no single point of failure (SPOF);
  • Distributed transactions built on MVCC, supporting the SI and SSI isolation levels.

Research

Table creation

DROP TABLE IF EXISTS "tracks";
CREATE TABLE IF NOT EXISTS "tracks" (
    "id" SERIAL PRIMARY KEY,
    "third_tracks_id" varchar(32) NOT NULL DEFAULT '',
    "tracks_title" varchar(255) NOT NULL DEFAULT '',
    "tracks_title_other" varchar(255) NOT NULL DEFAULT '',
    "tracks_title_py" varchar(64) NOT NULL DEFAULT '',
    "data_source" bigint NOT NULL DEFAULT 1,
    "tags" varchar(255) NOT NULL DEFAULT '',
    "duration" bigint NOT NULL DEFAULT 0,
    "status" int NOT NULL DEFAULT 0,
    "pa" int NOT NULL DEFAULT 0,
    "announcer_name" varchar(255) NOT NULL DEFAULT '',
    "anchor_name" varchar(255) NOT NULL DEFAULT '',
    "play_count" bigint NOT NULL DEFAULT 0,
    "own_count" bigint NOT NULL DEFAULT 0,
    "paid" int NOT NULL DEFAULT 0,
    "info" text NOT NULL,
    "created_at" timestamp NOT NULL,
    "updated_at" timestamp NOT NULL,
    "data_updated" bigint NOT NULL,
    "created" timestamp NOT NULL,
    "updated" timestamp NOT NULL,
    "announcer_id" varchar(256) NOT NULL DEFAULT '',
    "anchor_id" varchar(256) NOT NULL DEFAULT '',
    UNIQUE INDEX "idx_thirdTrackId_dataSource" (third_tracks_id ASC, data_source ASC),
    INDEX "idx_announcerid_status_paid_playcount" (announcer_id ASC, status ASC, paid ASC, play_count DESC)
);

Important notes

1. Comments cannot be added to table columns.
2. Deleting a large amount of data in a single statement is not supported:

> DELETE FROM tracks where id <100000;
pq: kv/txn_coord_sender.go:428: transaction is too large to commit: 189948 intents

Because strong consistency must be maintained, deleting a large amount of data at once increases cluster latency. If you need to delete large volumes of data, delete it in segments:

alter table tracks rename to tracks_0907;
-- pseudocode: delete in batches of about 2000 rows so each transaction stays small
for (i = 2000; i <= max_id; i += 2000) {
    DELETE FROM tracks_0907 WHERE id <= i;
}
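The segmented deletion above can be sketched as a small Go helper that emits one bounded DELETE per batch, so no single transaction accumulates too many intents. How the statements are executed (e.g. through `database/sql` with a PostgreSQL driver) is left out; the table name simply follows the example above:

```go
package main

import "fmt"

// batchedDeletes generates one DELETE statement per batch. maxID is the
// largest primary key in the table; batch is the number of ids covered
// by each statement, keeping every transaction small.
func batchedDeletes(table string, maxID, batch int) []string {
	if maxID <= 0 || batch <= 0 {
		return nil
	}
	var stmts []string
	for upper := batch; ; upper += batch {
		if upper > maxID {
			upper = maxID
		}
		stmts = append(stmts, fmt.Sprintf(
			"DELETE FROM %s WHERE id <= %d;", table, upper))
		if upper == maxID {
			break
		}
	}
	return stmts
}

func main() {
	// Emits three statements: bounds 2000, 4000, then the final 5000.
	for _, s := range batchedDeletes("tracks_0907", 5000, 2000) {
		fmt.Println(s)
	}
}
```

Each emitted statement only touches rows the previous batches left behind, so the number of intents per commit stays near the batch size.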

Performance stress test

Stress test tool: a load test script implemented in Go.

Stress test method: three machines run the script simultaneously; each test run lasts 5-10 minutes, and each record is about 1.6 KB. The test cluster consists of three machines with the following specifications:

M02-XI3
Chassis SN: 216486580
CPU: [Intel Xeon E5-2650 v4, 12 cores, 2.2 GHz] ×2
Memory: [Langchao PC4-19200 16 GB] ×8
Disk: [Langchao SATA 3 TB 7.2K RPM] ×4
Flash: [Langchao NVMe SSD 800 GB] ×1
NIC: [Langchao Intel 82599] ×1
Accelerator card: (none listed)
RAID: no hardware RAID card

The tests simulate migrating online data from MySQL to NewSQL as closely as possible. Published performance reports for MySQL and other databases often show QPS in the tens of thousands, but on closer inspection the test records are frequently only 50 bytes per row, which does not match real online data.

The three nodes support roughly 1,000 connections in total, so the tests were capped at 1,000 concurrent clients.

Stress test results

| # | SQL | Data volume | Concurrency | QPS | p99 latency | p90 latency | SQL byte traffic | Remark |
|---|-----|-------------|-------------|-----|-------------|-------------|------------------|--------|
| 1 | insert | 15M | 100 | 666 | 152ms | 117ms | 832KB | 18 indexes |
| 2 | insert | 15M | 300 | 687 | 352ms | 187ms | 872KB | 18 indexes |
| 3 | insert | 15M | 900 | 778 | 700ms | 1500ms | 1.2MB | 18 indexes |
| 4 | insert | 15M | 1000 | 807 | 1500ms | 1200ms | 1.3MB | 18 indexes |
| 5 | insert | 15M | 100 | 1051 | 42ms | 12ms | 1.2MB | 1 index |
| 6 | insert | 15M | 300 | 2254 | 92ms | 32ms | 3.2MB | 1 index |
| 7 | insert | 15M | 600 | 4021 | 130ms | 56ms | 6.1MB | 1 index |
| 8 | insert | 15M | 900 | 5938 | 250ms | 148ms | 8.4MB | 1 index |
| 9 | insert | 15M | 1000 | 6125 | 270ms | 171ms | 8.7MB | 1 index |
| 10 | select * from tracks WHERE id = <random id> AND status = 0 | 15M | 300 | 5625 | 30ms | 10ms | 7.3MB | primary key index |
| 11 | select * from tracks WHERE id = <random id> AND status = 0 | 15M | 600 | 8713 | 45ms | 8ms | 11.7MB | primary key index |
| 12 | select * from tracks WHERE id = <random id> AND status = 0 | 15M | 1000 | 12320 | 160ms | 130ms | 16.2MB | primary key index |
| 13 | select * from tracks WHERE id IN (<20 random ids>) AND status = 0 | 15M | 300 | 2134 | 200ms | 140ms | 29.7MB | primary key index |
| 14 | select * from tracks WHERE id IN (<20 random ids>) AND status = 0 | 15M | 600 | 2526 | 420ms | 350ms | 34.1MB | primary key index |
| 15 | select * from tracks WHERE id IN (<20 random ids>) AND status = 0 | 15M | 1000 | 2650 | 771ms | 640ms | 36.1MB | primary key index |
| 16 | select * from tracks WHERE id IN (<50 random ids>) AND status = 0 | 15M | 300 | 714 | 670ms | 540ms | 23.2MB | primary key index |
| 17 | select * from tracks WHERE id IN (<50 random ids>) AND status = 0 | 15M | 600 | 672 | 1700ms | 1300ms | 21.4MB | primary key index |
| 18 | select * from tracks WHERE id IN (<50 random ids>) AND status = 0 | 15M | 600 | 757 | 3000ms | 2490ms | 24.7MB | primary key index |
| 19 | SELECT * FROM "tracks" WHERE "third_tracks_id" IN (<1 random id>) AND "data_source" = <random company> AND "status" = 0 | 15M | 300 | 5553 | 40ms | 5ms | 4.1MB | (third_tracks_id, data_source) index |
| 20 | SELECT * FROM "tracks" WHERE "third_tracks_id" IN (<1 random id>) AND "data_source" = <random company> AND "status" = 0 | 15M | 600 | 8624 | 120ms | 18ms | 6.0MB | (third_tracks_id, data_source) index |
| 21 | SELECT * FROM "tracks" WHERE "third_tracks_id" IN (<1 random id>) AND "data_source" = <random company> AND "status" = 0 | 15M | 1000 | 1145 | 310ms | 90ms | 8.7MB | (third_tracks_id, data_source) index |
| 22 | SELECT * FROM "tracks" WHERE "announcer_id" IN (<1 random id>) AND "paid" = 0 AND "status" = 0 ORDER BY play_count DESC | 15M | 300 | 5493 | 160ms | 3ms | 5.1MB | (announcer_id, status, paid, play_count DESC) index |
| 23 | SELECT * FROM "tracks" WHERE "announcer_id" IN (<1 random id>) AND "paid" = 0 AND "status" = 0 ORDER BY play_count DESC | 15M | 600 | 7825 | 283ms | 23ms | 7.5MB | (announcer_id, status, paid, play_count DESC) index |
| 24 | SELECT * FROM "tracks" WHERE "announcer_id" IN (<1 random id>) AND "paid" = 0 AND "status" = 0 ORDER BY play_count DESC | 15M | 1000 | 11171 | 310ms | 68ms | 10.5MB | (announcer_id, status, paid, play_count DESC) index |
| 25 | select * from tracks WHERE id = <random id> AND status = 0 | 30M | 300 | 5614 | 123ms | 3ms | 8.3MB | primary key index |
| 26 | select * from tracks WHERE id = <random id> AND status = 0 | 30M | 600 | 8723 | 130ms | 8ms | 12.4MB | primary key index |
| 27 | select * from tracks WHERE id = <random id> AND status = 0 | 30M | 1000 | 10764 | 320ms | 30ms | 16.2MB | primary key index |
| 28 | select * from tracks WHERE id = <random id> AND status = 0 | 50M | 300 | 5136 | 159ms | 8ms | 7.8MB | primary key index |
| 29 | select * from tracks WHERE id = <random id> AND status = 0 | 50M | 600 | 8463 | 180ms | 13ms | 11.8MB | primary key index |
| 30 | select * from tracks WHERE id = <random id> AND status = 0 | 50M | 1000 | 10848 | 220ms | 26ms | 16.3MB | primary key index |

Data analysis

  1. As with MySQL and other databases, inserts become much slower when indexes are present, but in CockroachDB the effect is even more pronounced. From the chart (figure not included) we can conclude:
  • A large number of indexes has a huge impact on inserts: write QPS drops sharply, and raising concurrency does not significantly improve it;
  • With only the primary key index, insert QPS rises with concurrency, but beyond roughly 6,000 QPS further increases in concurrency yield little gain.
  2. Stress tests fetching different numbers of rows by primary key id (figure not included) show:
  • The more ids in a query, the more ranges must be queried and the lower the QPS;
  • The more ids, the more data is returned and the larger the network I/O, which also lowers performance.
  3. Comparing the query efficiency of the primary key index and the composite index (figure not included):
  • Indexed queries are much faster, and primary key, unique, and composite indexes perform roughly the same;
  • Primary key lookups are the most efficient.
  4. Comparing query efficiency at different table sizes (figure not included):
  • The amount of data in the table does not greatly affect QPS, at least at the level of tens of millions of rows;
  • The smaller the table, the faster the queries.
