034. TiDB feature: AUTO_RANDOM

Usage scenario

AUTO_RANDOM addresses the write hotspot that arises when large batches of data are written into a TiDB table whose primary key is an auto-incrementing integer column. In short, writes should not all land on a single Region. By generating primary key values randomly, AUTO_RANDOM spreads the data across different Regions. AUTO_RANDOM is a column attribute: it automatically fills the column with randomly distributed default values to avoid hotspots.

AUTO_RANDOM implementation principle

  • The value is an 8-byte BIGINT integer
  • The highest bit is the sign bit
  • By default, bits 63 to 59 (the 5 bits directly below the sign bit) are random bits (shard bits); each time a row is inserted, they take one of 32 possible random values
  • To use a different number of random bits, change the number in the parentheses after AUTO_RANDOM
  • The values are produced by a shard-based random generator
  • last_insert_id(): as with AUTO_INCREMENT, returns the value assigned by the last insert operation
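
The bit layout described in the list above can be sketched in Python. This is only an illustration of the layout, not TiDB's actual generator: the function name compose_auto_random_id and its parameters are hypothetical, and the shard value is passed in by the caller instead of being generated randomly.

```python
TOTAL_BITS = 64  # an 8-byte BIGINT
SIGN_BITS = 1    # the highest bit is the sign bit


def compose_auto_random_id(shard: int, counter: int, shard_bits: int = 5) -> int:
    """Place `shard` in the bits just below the sign bit and `counter` in the rest.

    Hypothetical helper for illustration; TiDB's real allocator is not shown here.
    """
    incremental_bits = TOTAL_BITS - SIGN_BITS - shard_bits
    if not 0 <= shard < (1 << shard_bits):
        raise ValueError("shard does not fit in the shard bits")
    if not 0 <= counter < (1 << incremental_bits):
        raise ValueError("counter does not fit in the incremental bits")
    return (shard << incremental_bits) | counter


# Shard 3 with counter 1, using the default of 5 shard bits.
row_id = compose_auto_random_id(shard=3, counter=1)
print(row_id)
print(format(row_id, "064b"))  # sign bit, then 5 shard bits, then 58 counter bits
```

Because the sign bit is always left at 0 in this sketch, every composed value stays a positive BIGINT.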

Usage restrictions

  • An AUTO_RANDOM column can only be of type BIGINT
  • When the primary key is NONCLUSTERED, AUTO_RANDOM is not supported, even if the primary key is an integer column
  • Using ALTER TABLE to add or remove the AUTO_RANDOM attribute is not supported, with one exception: changing the AUTO_INCREMENT attribute to the AUTO_RANDOM attribute is supported
  • Modifying the column type of a primary key column that has the AUTO_RANDOM attribute is not supported
  • AUTO_RANDOM cannot be specified on the same column as AUTO_INCREMENT
  • AUTO_RANDOM cannot be specified on the same column as a DEFAULT value
  • Data in AUTO_RANDOM columns is difficult to migrate to AUTO_INCREMENT columns, because the values automatically assigned to AUTO_RANDOM columns are usually very large
  • When inserting data, explicitly specifying the value of the AUTO_RANDOM column is not recommended

Related parameters

  • PK_AUTO_RANDOM_BITS
    PK_AUTO_RANDOM_BITS=5: the number of shard bits, which determines how many shards (regions) the values are spread over. 5 means 2^5 = 32 shards.
mysql> select TIDB_ROW_ID_SHARDING_INFO, TIDB_PK_TYPE
    -> from information_schema.tables
    -> where table_schema='test'
    -> and table_name='auto_random_t1';
+---------------------------+--------------+
| TIDB_ROW_ID_SHARDING_INFO | TIDB_PK_TYPE |
+---------------------------+--------------+
| PK_AUTO_RANDOM_BITS=5     | CLUSTERED    |
+---------------------------+--------------+
1 row in set (0.17 sec)
  • PRE_SPLIT_REGIONS
create table t (a int, b int) PRE_SPLIT_REGIONS=3;
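Once data starts being written to table t, it is written into the 8 (2^3) pre-split Regions. This avoids the write hotspot that would otherwise occur right after table creation, when only one Region exists. Note that the value of PRE_SPLIT_REGIONS must be less than or equal to that of SHARD_ROW_ID_BITS or AUTO_RANDOM, so it is normally used together with one of them.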

How is data then spread across these 8 Regions? The primary key (id) does this. With AUTO_RANDOM, the generated IDs are scattered across the Regions; with AUTO_INCREMENT, the consecutive IDs cannot be scattered.

  • allow_auto_random_explicit_insert
    To insert values into an AUTO_RANDOM column explicitly, set the system variable @@allow_auto_random_explicit_insert to 1 (the default is 0).
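
Both PK_AUTO_RANDOM_BITS and PRE_SPLIT_REGIONS above are exponents of two. A minimal Python sketch of that arithmetic follows; the helper names parse_sharding_info and count_from_bits are made up for this illustration and are not part of TiDB.

```python
def parse_sharding_info(info: str) -> int:
    """Extract x from a string such as 'PK_AUTO_RANDOM_BITS=5'.

    Only this one pattern is handled; other forms of the
    TIDB_ROW_ID_SHARDING_INFO column are outside this sketch.
    """
    key, _, value = info.partition("=")
    if key.strip() != "PK_AUTO_RANDOM_BITS":
        raise ValueError(f"unexpected sharding info: {info!r}")
    return int(value)


def count_from_bits(bits: int) -> int:
    """n bits can address 2^n shards (or pre-split Regions)."""
    return 1 << bits


shard_bits = parse_sharding_info("PK_AUTO_RANDOM_BITS=5")
print(count_from_bits(shard_bits))  # 32 shards for PK_AUTO_RANDOM_BITS=5
print(count_from_bits(3))           # 8 Regions for PRE_SPLIT_REGIONS=3
```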

Table creation

  • auto_increment
    Take the table created by the following statement as an example:
CREATE TABLE t (a bigint PRIMARY KEY AUTO_INCREMENT, b varchar(255))

Then execute a large number of INSERT statements that do not specify the primary key value on this table, for example:

INSERT INTO t(b) VALUES ('a'), ('b'), ('c')

Because the statement above does not specify a value for the primary key column (column a), TiDB uses continuously increasing values as the row IDs. This can cause a write hotspot on a single TiKV node and degrade the performance of the service. To avoid this write hotspot, specify the AUTO_RANDOM attribute instead of the AUTO_INCREMENT attribute for column a when creating the table.

  • AUTO_RANDOM

Create the table:

CREATE TABLE t (a bigint PRIMARY KEY AUTO_RANDOM, b varchar(255))
or
CREATE TABLE t (a bigint AUTO_RANDOM, b varchar(255), PRIMARY KEY (a))

Then execute INSERT statements such as INSERT INTO t(b) VALUES ...

  • Implicit assignment: If the INSERT statement does not specify a value for the integer primary key column (column a), or specifies NULL, TiDB automatically assigns a value to the column. The value is not guaranteed to be incremental or continuous, only unique, which avoids the hotspot caused by continuous row IDs.
  • Explicit insertion: If the INSERT statement explicitly specifies a value for the integer primary key column, TiDB saves that value, as it does for AUTO_INCREMENT (this requires @@allow_auto_random_explicit_insert to be set to 1). Note that if NO_AUTO_VALUE_ON_ZERO is not set in the system variable @@sql_mode, TiDB still assigns a value automatically when the column is explicitly given the value 0.

Notice

  • How the automatically assigned value is calculated:
    In the binary form of the row value, the five highest bits excluding the sign bit (called shard bits) are determined by the start time of the current transaction, and the remaining bits are assigned sequentially in increasing order.

  • Number of shards:
    To use a different number of shard bits, add a pair of parentheses after AUTO_RANDOM and specify the desired number of shard bits inside them. For example:

CREATE TABLE t (a bigint PRIMARY KEY AUTO_RANDOM(3), b varchar(255))

In the statement above, the number of shard bits is 3. The number of shard bits must be in the range [1, 16), that is, from 1 to 15.

After the table is created, use SHOW WARNINGS to view the maximum number of implicit allocations supported by the current table:

SHOW WARNINGS;
+-------+------+----------------------------------------------------------+
| Level | Code | Message                                                  |
+-------+------+----------------------------------------------------------+
| Note  | 1105 | Available implicit allocation times: 1152921504606846976 |
+-------+------+----------------------------------------------------------+
  • Note
    To ensure the maximum number of implicit allocations, the AUTO_RANDOM column type can only be BIGINT.
    In addition, to see the number of shard bits of a table with the AUTO_RANDOM attribute, look for the pattern PK_AUTO_RANDOM_BITS=x in the TIDB_ROW_ID_SHARDING_INFO column of the system table information_schema.tables, where x is the number of shard bits.
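
The number reported by SHOW WARNINGS for AUTO_RANDOM(3) equals 2^60, which is what remains of a 64-bit value after removing the sign bit and 3 shard bits. The Python sketch below states that relationship as a formula. The formula is inferred from the bit layout described in this article, and the function name max_implicit_allocations is hypothetical.

```python
def max_implicit_allocations(shard_bits: int, total_bits: int = 64) -> int:
    """Number of distinct counter values left after the sign and shard bits.

    Assumption: 1 sign bit + `shard_bits` shard bits, remaining bits count up.
    """
    if not 1 <= shard_bits < 16:
        raise ValueError("shard bits must be in the range [1, 16)")
    return 1 << (total_bits - 1 - shard_bits)


print(max_implicit_allocations(3))  # 1152921504606846976, as in SHOW WARNINGS
print(max_implicit_allocations(5))  # the default of 5 shard bits leaves 2^58
```

Under the same assumption, a narrower integer type would leave far fewer counter bits, which is consistent with the BIGINT-only restriction in the note above.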

last_insert_id()

Values implicitly assigned to an AUTO_RANDOM column affect last_insert_id().
Use SELECT last_insert_id() to get the ID that TiDB most recently assigned implicitly, for example:

INSERT INTO t (b) VALUES ("b");
SELECT * FROM t;
SELECT last_insert_id();

The possible results are as follows:

+------------+---+
| a          | b |
+------------+---+
| 1073741825 | b |
+------------+---+
+------------------+
| last_insert_id() |
+------------------+
| 1073741825       |
+------------------+

Compatibility

TiDB supports parsing version comment syntax. Examples are as follows:

CREATE TABLE t (a bigint PRIMARY KEY /*T![auto_rand] auto_random */)
CREATE TABLE t (a bigint PRIMARY KEY AUTO_RANDOM)

The two statements above have the same meaning.
In the output of SHOW CREATE TABLE, the AUTO_RANDOM attribute is wrapped in a comment. The comment carries a feature identifier, for example /*T![auto_rand] auto_random */, where auto_rand is the feature identifier of AUTO_RANDOM. Only TiDB versions that implement the feature corresponding to this identifier parse the SQL fragment inside the comment.
This mechanism provides forward compatibility, that is, compatibility when downgrading. A TiDB version that does not implement the feature ignores the AUTO_RANDOM attribute (wrapped in the comment above), so a table with this attribute can still be used there.

Example

  • Create the table
# Create the table
DROP TABLE IF EXISTS test.auto_random_t1;
CREATE TABLE test.auto_random_t1 (
 id bigint PRIMARY KEY AUTO_RANDOM(3),
 name char(255));
  • Insert data
# Insert data
/* Populate Seed */
INSERT INTO test.auto_random_t1 (name) VALUES ('A');
INSERT INTO test.auto_random_t1 (name) VALUES ('B');
INSERT INTO test.auto_random_t1 (name) VALUES ('C');
INSERT INTO test.auto_random_t1 (name) VALUES ('D');
INSERT INTO test.auto_random_t1 (name) VALUES ('E');
INSERT INTO test.auto_random_t1 (name) VALUES ('F');
INSERT INTO test.auto_random_t1 (name) VALUES ('G');
INSERT INTO test.auto_random_t1 (name) VALUES ('H');
INSERT INTO test.auto_random_t1 (name) VALUES ('I');
INSERT INTO test.auto_random_t1 (name) VALUES ('J');
INSERT INTO test.auto_random_t1 (name) VALUES ('K');
INSERT INTO test.auto_random_t1 (name) VALUES ('L');
INSERT INTO test.auto_random_t1 (name) VALUES ('M');
INSERT INTO test.auto_random_t1 (name) VALUES ('N');
INSERT INTO test.auto_random_t1 (name) VALUES ('O');
INSERT INTO test.auto_random_t1 (name) VALUES ('P');
INSERT INTO test.auto_random_t1 (name) VALUES ('Q');
INSERT INTO test.auto_random_t1 (name) VALUES ('R');
INSERT INTO test.auto_random_t1 (name) VALUES ('S');
INSERT INTO test.auto_random_t1 (name) VALUES ('T');
INSERT INTO test.auto_random_t1 (name) VALUES ('U');
INSERT INTO test.auto_random_t1 (name) VALUES ('V');
INSERT INTO test.auto_random_t1 (name) VALUES ('W');
INSERT INTO test.auto_random_t1 (name) VALUES ('X');
INSERT INTO test.auto_random_t1 (name) VALUES ('Y');
INSERT INTO test.auto_random_t1 (name) VALUES ('Z');
INSERT INTO test.auto_random_t1 (name) VALUES ('a');
INSERT INTO test.auto_random_t1 (name) VALUES ('b');
INSERT INTO test.auto_random_t1 (name) VALUES ('c');
INSERT INTO test.auto_random_t1 (name) VALUES ('d');
INSERT INTO test.auto_random_t1 (name) VALUES ('e');
INSERT INTO test.auto_random_t1 (name) VALUES ('f');
INSERT INTO test.auto_random_t1 (name) VALUES ('g');
INSERT INTO test.auto_random_t1 (name) VALUES ('h');
INSERT INTO test.auto_random_t1 (name) VALUES ('i');
INSERT INTO test.auto_random_t1 (name) VALUES ('j');
INSERT INTO test.auto_random_t1 (name) VALUES ('k');
INSERT INTO test.auto_random_t1 (name) VALUES ('l');
INSERT INTO test.auto_random_t1 (name) VALUES ('m');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
INSERT INTO test.auto_random_t1 (name) VALUES ('x1');
  • View last_insert_id(), the currently assigned value
/* Check last_insert_id() */
SELECT LAST_INSERT_ID();
  • View the number of shard bits of this table
/* Greetings to CBO */
ANALYZE TABLE test.auto_random_t1;
/* select 'test.auto_random_t1' as Title; */
/* desc test.auto_random_t1; */
select TIDB_ROW_ID_SHARDING_INFO, TIDB_PK_TYPE
from information_schema.tables
where table_schema='test'
and table_name='auto_random_t1';

mysql> select TIDB_ROW_ID_SHARDING_INFO, TIDB_PK_TYPE
    -> from information_schema.tables
    -> where table_schema='test'
    -> and table_name='auto_random_t1';
+---------------------------+--------------+
| TIDB_ROW_ID_SHARDING_INFO | TIDB_PK_TYPE |
+---------------------------+--------------+
| PK_AUTO_RANDOM_BITS=3     | CLUSTERED    |
+---------------------------+--------------+
1 row in set (0.17 sec)
  • View the aggregated row count of each shard
/* check value */
# Although the IDs are random, the IDs within each region still follow a pattern.
SELECT substr(cast(id as CHAR),1,2) as id_prefix, count(*) as
approx_rows_in_shard
FROM test.auto_random_t1
GROUP BY id_prefix
HAVING approx_rows_in_shard > 1
ORDER BY id_prefix;
/*SHOW TABLE test.auto_random_t1 REGIONS\G*/

# Aggregated row count per ID prefix (with 3 shard bits there are 2^3 = 8 shards)
mysql> SELECT substr(cast(id as CHAR),1,2) as id_prefix, count(*) as
    -> approx_rows_in_shard
    -> FROM test.auto_random_t1
    -> GROUP BY id_prefix
    -> HAVING approx_rows_in_shard > 1
    -> ORDER BY id_prefix;
+-----------+----------------------+
| id_prefix | approx_rows_in_shard |
+-----------+----------------------+
| 10        |                    2 |
| 11        |                   29 |
| 15        |                    3 |
| 16        |                    4 |
| 19        |                    2 |
| 23        |                   31 |
| 34        |                   32 |
| 46        |                   18 |
| 57        |                   26 |
| 69        |                   14 |
| 80        |                   22 |
+-----------+----------------------+
11 rows in set (0.28 sec)
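
The two-digit decimal prefix used in the query above is only an approximation of the shard. The Python sketch below shows why it works here, assuming the layout described earlier (1 sign bit, then 3 shard bits, then 60 counter bits); the function names and the sample IDs are constructed for illustration and are not taken from the table.

```python
SHARD_BITS = 3
INCREMENTAL_BITS = 64 - 1 - SHARD_BITS  # 60 counter bits below the shard bits


def shard_of(row_id: int) -> int:
    """Read the shard bits back out of an ID (hypothetical helper)."""
    return (row_id >> INCREMENTAL_BITS) & ((1 << SHARD_BITS) - 1)


def decimal_prefix(row_id: int) -> str:
    """Same idea as substr(cast(id as CHAR), 1, 2) in the query."""
    return str(row_id)[:2]


# One constructed sample ID per non-zero shard, each with counter value 1.
sample_ids = [(shard << INCREMENTAL_BITS) | 1 for shard in range(1, 8)]

print([shard_of(i) for i in sample_ids])        # [1, 2, 3, 4, 5, 6, 7]
print([decimal_prefix(i) for i in sample_ids])  # ['11', '23', '34', '46', '57', '69', '80']
```

These prefixes match the larger groups in the output above. IDs in shard 0 have no high bits set, so they are small numbers whose prefix depends only on the counter, which may explain the small groups such as 10, 15, 16 and 19.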

Origin blog.csdn.net/wangzhicheng987/article/details/130782100