Kudu usage standards

1. This Kudu design and usage specification provides a basic design reference for data developers and data modelers.

2. Schema design

Kudu tables, like relational database tables, have a structured data model. Schema design is critical for optimal performance and operational stability, and no single schema design fits every table.

Creating a Kudu table involves column design, primary key design, and partition design. For users coming from traditional non-distributed relational databases, only partitioning is a new concept.

3. The ideal schema

An ideal schema should meet the following conditions:

Reads and writes are evenly distributed across tablet servers. This is mainly affected by partitioning.

Tablets grow at an even, predictable rate, and the load across tablets stays steady over time. This is mainly affected by partitioning.

Scans read the minimum amount of data needed. This is mainly affected by primary key design, but partitioning also plays an important role.

A good schema design depends on the characteristics of the data, the operations performed on it, and the topology of the cluster. Schema design is the single most important factor in maximizing the performance of a Kudu cluster.

4. Column design

4.1. Data type
A Kudu table contains multiple columns. Each column has a type, and non-primary-key columns may be nullable. The supported column types are:

Boolean
8-bit signed integer
16-bit signed integer
32-bit signed integer
64-bit signed integer
unixtime_micros (64-bit microseconds since the Unix epoch)
single-precision (32-bit) IEEE-754 floating-point
double-precision (64-bit) IEEE-754 floating-point
decimal (see Decimal type for details)
UTF-8 encoded string (up to 64KB uncompressed)
binary (up to 64KB uncompressed)

Kudu uses strongly typed columns and a columnar on-disk storage format to provide efficient encoding and serialization. To take full advantage of these features, declare each column with the appropriate type instead of simulating a "schemaless" table with string or binary columns. In addition to encoding, Kudu allows compression to be specified on a per-column basis.

Note: unlike HBase, Kudu does not provide version or timestamp columns for tracking changes to a row. If you need this information, add an explicit column for it to the schema.

4.2. Decimal type
This type has a fixed scale and precision and is suitable for financial and other arithmetic calculations, where the imprecise representation and rounding behavior of float and double make those types impractical. The decimal type is also useful for integers larger than int64 and for fractional values in a primary key.

Precision: the total number of digits the column can represent, regardless of the position of the decimal point. This value must be between 1 and 38, and there is no default. For example, a precision of 4 can represent integer values up to 9999, or values up to 99.99 with two decimal places. Negative values can be represented without any change in precision. For example, the range -9999 to 9999 still requires only a precision of 4.

Scale: the number of digits after the decimal point. The value must be between 0 and the precision. A scale of 0 produces integer values with no fractional part. If the precision and scale are equal, all digits are after the decimal point. For example, a decimal with precision and scale both equal to 3 can represent values between -0.999 and 0.999.

Performance considerations: Kudu stores each value in as few bytes as possible, depending on the precision specified for the decimal column. For this reason, using the highest possible precision merely for convenience is not recommended, because it can negatively affect performance, memory, and storage.

Before encoding and compression:

Decimal values with a precision of 9 or less are stored in 4 bytes.

Decimal values with a precision of 10 to 18 are stored in 8 bytes.

Decimal values with a precision greater than 18 are stored in 16 bytes.

The precision and scale of an existing column cannot be changed with an alter command.
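
As a minimal sketch of the precision and storage rules above, the following Impala DDL declares one decimal column in each storage class. The table and column names are hypothetical, and the statement assumes Kudu and Impala versions that support the decimal type.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE IF NOT EXISTS payment_example (
  payment_id BIGINT,
  amount DECIMAL(9, 2),           -- precision <= 9: 4 bytes per value before encoding
  balance DECIMAL(18, 4),         -- precision 10 to 18: 8 bytes per value
  lifetime_total DECIMAL(38, 6),  -- precision > 18: 16 bytes per value
  PRIMARY KEY (payment_id)
)
PARTITION BY HASH (payment_id) PARTITIONS 4
STORED AS KUDU;
```

Because precision and scale cannot be altered later, it is worth choosing them from the real value range of the data rather than defaulting to DECIMAL(38, s).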

4.3. Column encoding
Each column can be encoded according to its type.

Column type | Encoding | Default
int8, int16, int32 | plain, bitshuffle, run length | bitshuffle
int64, unixtime_micros | plain, bitshuffle, run length | bitshuffle
float, double, decimal | plain, bitshuffle | bitshuffle
bool | plain, run length | run length
string, binary | plain, prefix, dictionary | dictionary
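
When tables are created through Impala, the encoding can be set per column with the ENCODING attribute. The sketch below uses a hypothetical table; the keywords PLAIN_ENCODING, BIT_SHUFFLE, RLE, DICT_ENCODING, PREFIX_ENCODING, and AUTO_ENCODING are the Impala spellings of the encodings in the table above.

```sql
-- Hypothetical table; column names are illustrative only.
CREATE TABLE IF NOT EXISTS encoding_example (
  id BIGINT ENCODING BIT_SHUFFLE,        -- default for 64-bit integers
  region STRING ENCODING DICT_ENCODING,  -- few distinct values
  url STRING ENCODING PREFIX_ENCODING,   -- values share long common prefixes
  is_active BOOLEAN ENCODING RLE,        -- run length encoding
  score DOUBLE ENCODING PLAIN_ENCODING,  -- stored in its natural format
  note STRING ENCODING AUTO_ENCODING,    -- let Kudu pick the default for the type
  PRIMARY KEY (id)
)
PARTITION BY HASH (id) PARTITIONS 4
STORED AS KUDU;
```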

4.3.1. Plain encoding
Data is stored in its natural format. For example, int32 values are stored as fixed-size 32-bit little-endian integers.

4.3.2. Bitshuffle encoding
A block of values is rearranged to store the most significant bit of every value, then the second most significant bit of every value, and so on. The result is then LZ4-compressed. Bitshuffle encoding is a good choice for columns that have many repeated values, or values that change by small amounts when sorted by primary key.

4.3.3. Run length encoding
Runs of consecutive repeated values are compressed by storing only the value and its count. This encoding is effective for columns with many consecutive repeated values when sorted by primary key.

4.3.4. Dictionary encoding
A dictionary of unique values is built, and each column value is encoded as its index in the dictionary. This encoding is effective for columns with a small number of distinct values. If a row set contains too many distinct values to compress well, Kudu transparently falls back to plain encoding for that row set. This is evaluated during flush.

4.3.5. Prefix encoding
Common prefixes in consecutive column values are compressed. This encoding can be effective for values that share common prefixes, or for the first column of the primary key, because rows within a tablet are stored sorted by primary key.

4.3.6. Column compression
Kudu allows per-column compression using the LZ4, Snappy, or zlib codecs. By default, no compression is performed. Consider using compression if reducing storage space is more important than raw scan performance.

Every data set compresses differently, but in general LZ4 is the fastest codec, while zlib compresses to the smallest size. Bitshuffle-encoded columns are automatically compressed with LZ4, so applying additional compression on top of this encoding is not recommended.
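
A minimal sketch of per-column compression in Impala DDL follows. The table is hypothetical; COMPRESSION accepts LZ4, SNAPPY, or ZLIB and can be combined with an ENCODING attribute on the same column.

```sql
-- Hypothetical table; column names are illustrative only.
CREATE TABLE IF NOT EXISTS compression_example (
  doc_id BIGINT,
  title STRING COMPRESSION LZ4,                            -- favors scan speed
  body STRING COMPRESSION ZLIB,                            -- favors smaller storage
  tags STRING ENCODING DICT_ENCODING COMPRESSION SNAPPY,   -- encoding plus codec
  view_count BIGINT ENCODING BIT_SHUFFLE,                  -- already LZ4-compressed by the encoding
  PRIMARY KEY (doc_id)
)
PARTITION BY HASH (doc_id) PARTITIONS 4
STORED AS KUDU;
```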

5. Primary key design

Each Kudu table must declare a primary key consisting of one or more columns. Like RDBMS primary keys, Kudu primary keys enforce uniqueness constraints. Attempting to insert a row with the same primary key value as an existing row will result in a duplicate key error.

Primary key columns must be non-nullable and cannot be of type boolean, float, or double.

Once a table is created, the set of columns in its primary key cannot be changed.

Unlike an RDBMS, Kudu does not provide auto-incrementing columns, so the application must always supply the complete primary key.

The complete primary key must be specified when deleting or updating a row. Kudu does not natively support range deletes or range updates; every such operation is carried out by primary key.

The primary key value of a row cannot be modified. As a workaround, you can delete the row and reinsert it with the new key value.
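
The delete-and-reinsert workaround could look like the following sketch in Impala. The table user_profile and its columns are hypothetical. The two statements are not executed atomically, so readers may briefly see both rows (or neither, if the order is reversed).

```sql
-- Hypothetical table: user_profile (id BIGINT PRIMARY KEY, name STRING, city STRING)

-- Step 1: copy the row under the new key value.
INSERT INTO user_profile
SELECT CAST(2002 AS BIGINT) AS id, name, city
FROM user_profile
WHERE id = 1001;

-- Step 2: remove the row stored under the old key value.
DELETE FROM user_profile WHERE id = 1001;
```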

5.1. Primary key index
As in many traditional relational databases, Kudu's primary key is a clustered index. All rows within a tablet are sorted by primary key. When scanning Kudu rows, equality or range predicates on primary key columns let Kudu locate the matching rows efficiently.

5.2. Considerations for backfill insertion
This section considers the case where the primary key is a timestamp, or the first column of the primary key is a timestamp. For every insert, Kudu looks up the primary key in the primary key index storage to check whether it already exists, and returns a duplicate key error if it does. When data is inserted as it is generated, with the current time in the key, only a small range of primary keys is hot, so the existence check is fast: it hits the in-memory cache and does not need to go to disk.

When loading historical data offline (backfilling), each insert is likely to hit a cold region of the primary key index that is not in memory, which requires one or more disk seeks. Under normal circumstances Kudu can sustain millions of inserts per second, but a backfill may sustain only a few thousand inserts per second.

Ways to improve backfill performance:

Make the primary keys more compressible.
Use solid-state disks for storage.
Change the primary key structure so that the backfilled keys fall in a contiguous range.

6. Partition design

Tables in Kudu are divided into many tablets, which are distributed across multiple tablet servers. Each row belongs to exactly one tablet. Which tablet a row is assigned to is determined by the partitioning, which is set when the table is created.

For write-heavy workloads, spread writes evenly across all tablets to reduce the pressure on any single tablet. For small scans, performance improves when all the scanned data resides on a single tablet.

Kudu does not provide a default partitioning strategy when a table is created. For tables with heavy read and write traffic, it is recommended to set the number of partitions equal to the number of tablet servers.

Kudu provides two types of partitioning: range partitioning and hash partitioning. Tables can also have multi-level partitioning, which combines range and hash partitioning or multiple levels of hash partitioning.

6.1. Range partition
Kudu allows range partitions to be added and dropped dynamically at runtime without affecting the availability of other partitions. Dropping a partition deletes the data it contains, and subsequent inserts into the dropped range fail. New partitions can be added, but they must not overlap any existing range partition. Kudu allows any number of range partitions to be dropped and added in a single transactional alter-table operation.

Dynamically adding and dropping range partitions is particularly useful for time series. As time goes on, range partitions can be added to cover upcoming time ranges. For example, a table that stores an event log can add a month-wide partition just before the start of each month to hold upcoming events. Old range partitions can be dropped to efficiently remove historical data as needed.
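
In Impala, the monthly add-and-drop cycle described above could be sketched as follows. The table event_log is hypothetical and is assumed to be range-partitioned on a STRING column holding dates in yyyyMMdd form.

```sql
-- Hypothetical table event_log, range-partitioned on a yyyyMMdd STRING column.

-- Before the month starts, add the partition that will receive its events.
-- The new range must not overlap any existing range partition.
ALTER TABLE event_log ADD RANGE PARTITION '20220101' <= VALUES < '20220201';

-- Drop an expired partition. This also deletes every row stored in it,
-- and later inserts into this range will fail.
ALTER TABLE event_log DROP RANGE PARTITION '20201201' <= VALUES < '20210101';
```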

6.2. Hash partition
Hash partitioning assigns each row to one of many buckets according to its hash value. In a table with a single level of hash partitioning, each bucket corresponds to exactly one tablet. The number of buckets is set when the table is created. Normally the primary key columns are used as the hashed columns, but as with range partitioning, any subset of the primary key columns can be used.

Hash partitioning is an effective strategy when ordered access to the table is not required. It spreads writes randomly across tablets, which helps mitigate hot spots and uneven tablet sizes.

6.3. Multi-level partitioning
Kudu allows a single table to combine multiple levels of partitioning. Zero or more hash partition levels can be combined with an optional range partition level. Beyond the constraints of each individual partition type, the only additional constraint is that multiple hash levels must not hash the same columns.

Used correctly, multi-level partitioning retains the benefits of each partition type while reducing its downsides. The total number of tablets in a multi-level partitioned table is the product of the number of partitions at each level.
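
As a small illustration of the product rule, the hypothetical table below uses two hash levels on different columns, as the constraint requires.

```sql
-- Hypothetical table; names are illustrative only.
CREATE TABLE IF NOT EXISTS metrics_example (
  host STRING,
  metric STRING,
  ts BIGINT,
  reading DOUBLE,
  PRIMARY KEY (host, metric, ts)
)
PARTITION BY HASH (host) PARTITIONS 4,
             HASH (metric) PARTITIONS 3   -- a second hash level must use different columns
STORED AS KUDU;
-- Total tablets: 4 x 3 = 12 (before replication).
```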

6.4. Pruning partitions
When the scan predicates make it possible to determine that a partition can be entirely filtered out, Kudu automatically skips scanning that partition. To prune hash partitions, the scan must include an equality predicate on every hashed column. Scans on multi-level partitioned tables can take advantage of partition pruning at each level independently.
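
The queries below sketch which predicates allow pruning. The table host_events is hypothetical, and the comments describe the expected behavior under the rules above rather than measured results; an EXPLAIN or a query profile would be needed to confirm them on a real cluster.

```sql
-- Hypothetical table: hash on host (4 buckets), range on ds (one partition per month).
CREATE TABLE IF NOT EXISTS host_events (
  host STRING,
  ds STRING,
  payload STRING,
  PRIMARY KEY (host, ds)
)
PARTITION BY HASH (host) PARTITIONS 4,
RANGE (ds) (
  PARTITION '20210101' <= VALUES < '20210201',
  PARTITION '20210201' <= VALUES < '20210301'
)
STORED AS KUDU;

-- Equality on the hashed column: only one hash bucket should be scanned.
SELECT COUNT(*) FROM host_events WHERE host = 'web01';

-- Range predicate on the range column: only the January partitions should be scanned.
SELECT COUNT(*) FROM host_events WHERE ds >= '20210101' AND ds < '20210201';

-- Both levels pruned independently: a single tablet should be scanned.
SELECT payload FROM host_events WHERE host = 'web01' AND ds = '20210115';

-- A range predicate on the hashed column does not identify a bucket,
-- so no hash partitions can be pruned here.
SELECT COUNT(*) FROM host_events WHERE host > 'web';
```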

7. Schema modification

You can change the table structure in the following ways:

Rename the table.
Rename primary key columns.
Rename, add, or delete non-primary-key columns.
Add and delete range partitions.

Multiple alteration steps can be combined in a single transactional operation.
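
Through Impala, these changes map to ALTER TABLE statements such as the ones sketched below. The table and column names are hypothetical, and each Impala statement is issued as a separate alter operation.

```sql
-- Hypothetical table and column names.

-- Rename the table. Depending on the Impala version, this may rename only
-- the Impala mapping rather than the underlying Kudu table.
ALTER TABLE sales_fact RENAME TO sales_fact_v2;

-- Rename a non-primary-key column; the type must stay the same.
ALTER TABLE sales_fact_v2 CHANGE memo note STRING;

-- Add, then delete, a non-primary-key column.
ALTER TABLE sales_fact_v2 ADD COLUMNS (channel STRING);
ALTER TABLE sales_fact_v2 DROP COLUMN channel;
```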

8. Known limitations

Kudu currently has some known limitations that may affect schema design.

Number of columns
By default, Kudu does not allow creating tables with more than 300 columns. Schema designs with fewer columns are recommended for best performance.

Cell size
Before encoding or compression, a single cell cannot be larger than 64KB. The cells that make up a composite key are limited to a total of 16KB after Kudu's internal composite key encoding. Inserting rows that do not meet these limits returns an error to the client.

Row size
Although a single cell may be as large as 64KB and Kudu supports up to 300 columns, it is recommended that a single row be no larger than a few hundred KB.

Identifiers
Identifiers such as table names and column names must be valid UTF-8 sequences of no more than 256 bytes.

Immutable primary key values
Kudu does not allow updating the primary key columns of a row.

Unchangeable primary key
Kudu does not allow you to change the primary key columns after the table is created.

Unchangeable partitioning
Apart from adding or deleting range partitions, Kudu does not allow you to change how a table is partitioned after creation.

Unchangeable column types
Kudu does not allow changing the type of a column.

Partition splitting
After a table is created, its partitions cannot be split or merged.

9. Storage restrictions

A single Kudu tablet server should host no more than 1500 tablets (partitions). If tablet servers exceed this number, the cluster needs to be evaluated and expanded.

The limits for a cluster with 5 masters and 22 tablet servers are:

Maximum data stored per tablet server: 24TB, after replication and compression.
Maximum tablets managed by each tablet server: 1500, including tablet replicas.
Maximum tablets per table on each tablet server at table creation: 60.

Based on the above limits, the following can be inferred:

Recommended total data stored in Kudu: number of tablet servers × data per tablet server = 22 × 24TB = 528TB. Dividing by 3 replicas gives 176TB of unique data.
Data per tablet: data per tablet server / tablets per tablet server = 24TB / 1500 ≈ 16GB.
Kudu supports the LZ4, Snappy, and zlib compression codecs. Given that the compression ratio of these algorithms generally does not exceed 50%, it is recommended that the data in each tablet be less than 8GB before compression.
Maximum number of rows stored in a single Kudu table: 800 million.
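
The capacity figures above can be rechecked with a plain SELECT. This is only a restatement of the arithmetic, assuming 22 tablet servers, 3 replicas, and 1TB = 1024GB; the column aliases are illustrative.

```sql
-- Impala's "/" operator returns a DOUBLE for integer operands.
SELECT
  22 * 24          AS replicated_capacity_tb,  -- 528
  22 * 24 / 3      AS unique_capacity_tb,      -- 176
  24 * 1024 / 1500 AS gb_per_tablet;           -- 16.384, about 16GB
```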

10. Life Cycle Specification

Regular tables
SUBS layer: all system data must enter the SUBS layer. In principle the data is stored permanently, but this can be adjusted according to the actual situation, for example when the data volume is very large (more than 100 GB per day). Kudu table data is retained for 12 months (adjustable to the actual situation), and data older than 12 months can be dumped to Hive.

Data stored in Kudu is retained according to actual business needs, following the definitions below.

ODS, DW, DM, ADS layers:

When the maximum access span within 3 months is less than or equal to 4 days, it is recommended to set the retention to 7 days.
When the maximum access span within 3 months is less than or equal to 12 days, it is recommended to set the retention to 15 days.
When the maximum access span within 3 months is less than or equal to 30 days, it is recommended to set the retention to 33 days.
When the maximum access span within 3 months is less than or equal to 90 days, it is recommended to set the retention to 93 days.
When the maximum access span within 3 months is less than or equal to 180 days, it is recommended to set the retention to 183 days.
When the maximum access span within 3 months is less than or equal to 365 days, it is recommended to set the retention to 368 days.
The life cycle definitions above are common in the industry and can be adjusted to actual business conditions.

Store data according to the requirements of each data domain, and support a corresponding solution for separating hot and cold data.

For time-partitioned tables, periodically delete the partitions that have exceeded the life cycle of their data.

Intermediate tables
Set the retention period according to the actual situation. In principle it should not be longer than that of the result table.

Backup tables
Set the retention period according to actual backup needs.

Temporary tables
The maximum life cycle of a temporary table is 10 days. Detection tools and procedures should be provided for temporary table life cycles, and a notice should be issued before a table is deleted.

11. Table creation specifications

1. A table must not have more than 60 hash partitions.

2. Example of creating a Kudu table with hash partitioning:

CREATE TABLE IF NOT EXISTS [db_name.]table_name
(
  id BIGINT COMMENT 'comment',
  agent STRING COMMENT 'comment',
  ...
  -- the primary key must reference columns declared above
  PRIMARY KEY (id)
)
PARTITION BY HASH (id) PARTITIONS 4
COMMENT 'comment'
STORED AS KUDU;

Example of a Kudu table with mixed (hash and range) partitioning:

CREATE TABLE cust_behavior_1 (
  id BIGINT,
  sku STRING,
  salary STRING,
  edu_level INT,
  usergender STRING,
  `group` STRING,  -- reserved word, so it must be quoted with backticks
  city STRING,
  postcode STRING,
  last_purchase_price FLOAT,
  last_purchase_date BIGINT,
  category STRING,
  rating INT,
  fulfilled_date BIGINT,
  PRIMARY KEY (id, sku)
)
PARTITION BY
  HASH (id) PARTITIONS 4,
  RANGE (sku)
  (
    PARTITION VALUES < 'g',
    PARTITION 'g' <= VALUES < 'o',
    PARTITION 'o' <= VALUES < 'u',
    PARTITION 'u' <= VALUES
  )
STORED AS KUDU
TBLPROPERTIES (
  'kudu.table_name' = 'cust_behavior_1',
  'kudu.master_addresses' = 'hadoop5:7051'
);

Time partition example:

CREATE TABLE IF NOT EXISTS k_ods_collect_dw_yarn_apps_resource_ds (
  app_id STRING COMMENT 'task ID',
  use_type STRING COMMENT 'resource usage category',
  name STRING COMMENT 'resource name',
  ds STRING COMMENT 'daily summary date',
  resource_type STRING COMMENT 'resource type',
  maximum_allocation BIGINT COMMENT 'maximum allocation',
  minimum_allocation BIGINT COMMENT 'minimum allocation',
  shorthand_representation STRING COMMENT 'shorthand description',
  units STRING COMMENT 'units',
  value BIGINT COMMENT 'value',
  collect_time BIGINT COMMENT 'collection time',
  PRIMARY KEY (app_id, use_type, name, ds)
)
PARTITION BY HASH (app_id, use_type, name, ds) PARTITIONS 4,
RANGE (ds) (
  PARTITION '20201112' <= VALUES < '20210101',
  PARTITION '20210101' <= VALUES < '20210201',
  PARTITION '20210201' <= VALUES < '20210301',
  PARTITION '20210301' <= VALUES < '20210401',
  PARTITION '20210401' <= VALUES < '20210501',
  PARTITION '20210501' <= VALUES < '20210601',
  PARTITION '20210601' <= VALUES < '20210701',
  PARTITION '20210701' <= VALUES < '20210801',
  PARTITION '20210801' <= VALUES < '20210901',
  PARTITION '20210901' <= VALUES < '20211001',
  PARTITION '20211001' <= VALUES < '20211101',
  PARTITION '20211101' <= VALUES < '20211201',
  PARTITION '20211201' <= VALUES < '20220101'
)
COMMENT 'k_ods_collect_dw_yarn_apps_resource_ds'
STORED AS KUDU;

Origin blog.51cto.com/chenhao6/2562130