kingshard best practice (1)

1. Introduction to kingshard

kingshard is a MySQL middleware written in Go. It provides read-write separation, database and table sharding, connection pooling, and related features.

1.1 kingshard workflow

kingshard is written in Go and makes full use of the language's concurrency features. Go's built-in concurrency support greatly simplifies the development of kingshard. The overall workflow is as follows:

  • Read the configuration file and start up, listening for client requests on the port set in the configuration file.
  • On receiving a client connection, start a goroutine to handle that connection separately.
  • Perform login verification first. The verification process is fully compatible with the MySQL authentication protocol. Because the user name and password are set in the configuration file, they can be used to check whether the connection request is legitimate. If the user name and password are correct, continue with the following steps; otherwise return an error message to the client.
  • Once authentication has passed, the client sends SQL statements.
  • kingshard performs lexical and semantic analysis on each SQL statement, identifies its type, and generates a routing plan. If necessary, the SQL is rewritten and then forwarded to the corresponding DB. A statement can also be forwarded directly to the backend DB without lexical and semantic analysis. If the statement targets a sharded table and spans multiple DBs, kingshard starts one goroutine per DB to send the SQL and receive the results returned by that DB.
  • Receive and merge the results, then forward them to the client.

The overall workflow and architecture of kingshard are shown in the following figure.

[Figure: overall architecture of kingshard]

1.2 Kingshard Application Scenario

Many Internet companies still use MySQL to store various kinds of relational data. As traffic and data volume grow, developers have to deal with several new MySQL-related problems:

  • Read-write separation. As front-end traffic grows, a single MySQL instance can no longer handle all the writes and queries of the system. At that point, time-consuming queries have to be distributed across multiple slaves.
  • Single-table capacity. If table sharding was not considered when the system was designed, a single table keeps growing with the data volume. The author has seen a single table with 500 million records, where a simple delete operation showed up in the slow log and could cause a sudden surge in MySQL IO. Many people would suggest adding an index on the queried field, but at that data volume an index does not help much. Ultimately the single table holds too much data, so even when MySQL locates rows through the index it still has to scan many records.
  • Database operation and maintenance. If the hosts of the master and the slaves are configured in the application code, the system will run fine, but the operational burden increases considerably. For example, when IO pressure on MySQL stays high because of increased traffic, the DBA needs to add a slave, and the code then has to be modified, repackaged, and redeployed. There are many similar real-world examples, which are not listed here.
  • Connection pool. Front-end applications connect to MySQL frequently, and the overhead this imposes on MySQL cannot be ignored. With a connection pool, a certain number of MySQL connections are cached for each DB. When an application needs to talk to the back-end MySQL, an established connection is taken from the pool to send the SQL request, which speeds up queries considerably and reduces the load on MySQL.
  • SQL log. When a program misbehaves, we want SQL logs that show, for example, which SQL was sent to which DB at what time. Such logs help locate the problem quickly.

Faced with these problems, we could solve them one by one in the client code, but that makes the client heavier and less flexible. The author has long worked on database-related development, and kingshard was designed and implemented precisely to address these pain points. kingshard offers a suitable solution to each of the five problems above.

2. Installation and configuration of kingshard

2.1 Installation of kingshard

Installing kingshard is relatively simple. The steps are as follows:

1. Install the Go environment (Go 1.6 or later); search online for the detailed steps.
2. git clone https://github.com/flike/kingshard.git $GOPATH/src/github.com/flike/kingshard
3. cd $GOPATH/src/github.com/flike/kingshard
4. source ./dev.sh
5. make
6. Set up the configuration file.
7. Run kingshard:
./bin/kingshard -config=etc/ks.yaml

Note:

  • kingshard parses its configuration file as YAML. YAML does not allow tab characters for indentation, and a colon must be followed by a space. After writing the configuration file, you can check it for formatting errors on the yaml lint website.
  • kingshard responds to the three signals SIGINT, SIGTERM, and SIGQUIT by exiting gracefully. Avoid sending these signals on the machine where kingshard is deployed, so that kingshard does not exit unexpectedly. It is recommended to run kingshard in the background under the supervisor tool.

2.2 Configuration of kingshard

Detailed explanation of the kingshard configuration file:

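# Address and port that kingshard listens on
addr : 0.0.0.0:9696

# User name and password for connecting to kingshard
user :  kingshard
password : kingshard
# Address and port of kingshard's web API
web_addr : 0.0.0.0:9797
# User name and password for calling the API
web_user : admin
web_password : admin

# Log level, [debug|info|warn|error]; the default is error
log_level : debug
# Set to on to enable the SQL log, or to off to disable it
log_sql : on
# If set, only SQL whose execution time exceeds slow_log_time (in milliseconds) is logged; if not set, all SQL is logged
slow_log_time : 100
# Log file path; if not configured, logs are written to the terminal
log_path : /Users/flike/log
# Path of the SQL blacklist file
# kingshard refuses to forward any SQL listed in this file
#blacklist_sql_file: /Users/flike/blacklist
# Only the IPs listed below may connect to kingshard; if not configured, client IPs are not restricted
allow_ips: 127.0.0.1,10.0.0.8,10.0.0.9
# Character set used by kingshard; if not set, kingshard uses utf8 by default
#proxy_charset: utf8mb4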

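# A node represents one data shard of the MySQL cluster, consisting of one master and multiple slaves (slaves are optional)
nodes :
-
    # Name of the node
    name : node1

    # Maximum number of idle connections in the connection pool, i.e. at most max_conns_limit connections are established with the backend DB
    max_conns_limit : 16

    # User name and password kingshard uses to connect to the MySQL servers in this node; the master and slaves must share the same user name and password
    user :  kingshard
    password : kingshard

    # Address and port of the master
    master : 127.0.0.1:3306

    # Address, port, and read weight of each slave; the number after @ is the slave's read weight. Slaves are optional
    #slave : 192.168.0.12:3306@2,192.168.0.13:3306@3
    # If kingshard cannot connect to a MySQL server for 300 seconds, it takes that server offline
    down_after_noalive : 300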
-
    name : node2
    max_conns_limit : 16
    user :  kingshard
    password : kingshard

    master : 192.168.59.103:3307
    slave :
    down_after_noalive: 100

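# Sharding rules
schema :
    # Names of the nodes that the sharded tables are distributed over
    nodes: [node1,node2]
    # All SQL on non-sharded tables is sent to the default node
    default: node1
    shard:
    -
        # Database used by the sharded table
        db : kingshard
        # Name of the sharded table
        table: test_shard_hash
        # Sharding key
        key: id
        # Nodes the sub-tables are distributed over
        nodes: [node1, node2]
        # Sharding type
        type: hash
        # Distribution of sub-tables: node1 has 4 sub-tables,
        # node2 has 4 sub-tables
        locations: [4,4]

    -
        # Database used by the sharded table
        db : kingshard
        # Name of the sharded table
        table: test_shard_range
        # Sharding key
        key: id
        # Sharding type
        type: range
        # Nodes the sub-tables are distributed over
        nodes: [node1, node2]
        # Distribution of sub-tables: node1 has 4 sub-tables,
        # node2 has 4 sub-tables
        locations: [4,4]
        # Maximum number of records per sub-table, i.e. each sub-table
        # holds at most 10000 records: sub-table 1 covers ids [0,10000), sub-table 2 covers [10000,20000), and so on
        table_row_limit: 10000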

Here we focus on the configuration rules for sharded tables:

  • kingshard supports two types of sharding rules: hash and range.
  • The sub-tables of a sharded table must be created manually by the user in each DB. Their names follow the format table_name_%04d, that is, the sub-table suffix consists of 4 digits, for example table_name_0000 or table_name_0102.
  • All SQL statements that operate on non-sharded tables are sent to the default node.
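
To make the rules concrete, here is a minimal Go sketch of how a sharding-key value could be mapped to a sub-table name and a node, using the fields from the configuration above. The shardRule type and its route method are hypothetical names for illustration. The sketch assumes that hash sharding is a plain modulo over the total number of sub-tables; kingshard's real hash function may differ. The range rule follows the table_row_limit comment in the configuration.

```go
package main

import (
	"errors"
	"fmt"
)

// shardRule mirrors one entry under "shard" in the configuration.
// The type itself is hypothetical and only serves this example.
type shardRule struct {
	Table         string
	Type          string   // "hash" or "range"
	Nodes         []string // e.g. [node1, node2]
	Locations     []int    // number of sub-tables on each node, e.g. [4,4]
	TableRowLimit uint64   // only used by range rules
}

// route returns the sub-table name and the node for a sharding-key value.
// Assumption: hash sharding is key modulo the total sub-table count.
func (r shardRule) route(key uint64) (table, node string, err error) {
	if len(r.Nodes) != len(r.Locations) {
		return "", "", errors.New("nodes and locations must have the same length")
	}
	total := 0
	for _, n := range r.Locations {
		total += n
	}
	if total <= 0 {
		return "", "", errors.New("rule has no sub-tables")
	}

	var idx uint64
	switch r.Type {
	case "hash":
		idx = key % uint64(total)
	case "range":
		if r.TableRowLimit == 0 {
			return "", "", errors.New("range rule needs table_row_limit")
		}
		idx = key / r.TableRowLimit
	default:
		return "", "", fmt.Errorf("unknown shard type %q", r.Type)
	}
	if idx >= uint64(total) {
		return "", "", fmt.Errorf("key %d falls outside the %d configured sub-tables", key, total)
	}

	// Walk the locations list to find the node that owns sub-table idx.
	upper := 0
	for i, n := range r.Locations {
		upper += n
		if idx < uint64(upper) {
			node = r.Nodes[i]
			break
		}
	}
	// Sub-table names carry a zero-padded 4-digit suffix.
	return fmt.Sprintf("%s_%04d", r.Table, idx), node, nil
}

func main() {
	hash := shardRule{Table: "test_shard_hash", Type: "hash",
		Nodes: []string{"node1", "node2"}, Locations: []int{4, 4}}
	table, node, err := hash.route(15)
	fmt.Println(table, node, err)
}
```

Under these assumptions, id 15 with 8 sub-tables gives index 7, which is the table test_shard_hash_0007 on node2, because locations [4,4] places sub-tables 0 to 3 on node1 and 4 to 7 on node2.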

3. Kingshard test

3.1 Environment introduction

mysql     IP          server_id
master    10.0.0.6    63307
slave     10.0.0.7    73307
slave     10.0.0.8    83307

3.2 Test of read-write separation

mysql> select @@server_id;
+-------------+
| @@server_id |
+-------------+
|      73307 |
+-------------+
1 row in set (0.00 sec)

mysql> select @@server_id;
+-------------+
| @@server_id |
+-------------+
|      83307 |
+-------------+
1 row in set (0.00 sec)

Conclusion: read-write separation works as expected. When no weights are set, reads are distributed across the slaves 1:1 in round-robin order.

3.3 Test of read load balancing

A major reason for using a MySQL proxy is to reduce the load on a single DB by spreading read pressure across multiple DBs. kingshard supports multiple slaves, and each slave can be configured with its own read weight; the larger the weight, the larger its share of read requests. kingshard balances read requests with a weighted round-robin scheduling algorithm.

Most systems that use this algorithm compute the index of the selected DB dynamically each time an SQL statement is forwarded, and then send the read request to that DB. On closer analysis this is unnecessary. Because DB weights are relatively fixed and rarely change, a fixed polling sequence can be computed once and stored in an array, so that no dynamic calculation is needed and each request simply reads from the array. For example, suppose the slave option of a node is configured as slave: 10.0.0.7@1,10.0.0.8@3. When kingshard reads the configuration and initializes, it generates the polling array [0,1,1,1], in which each slave's index appears as many times as its weight, and then shuffles the array. This avoids computing DB indexes dynamically, which helps performance.

mysql> select @@server_id;
+-------------+
| @@server_id |
+-------------+
|      73307 |
+-------------+
1 row in set (0.00 sec)

mysql> select @@server_id;
+-------------+
| @@server_id |
+-------------+
|      83307 |
+-------------+
1 row in set (0.00 sec)

mysql> select @@server_id;
+-------------+
| @@server_id |
+-------------+
|      83307 |
+-------------+
1 row in set (0.00 sec)

mysql> select @@server_id;
+-------------+
| @@server_id |
+-------------+
|      83307 |
+-------------+
1 row in set (0.00 sec)

mysql> select @@server_id;
+-------------+
| @@server_id |
+-------------+
|      73307 |
+-------------+
1 row in set (0.00 sec)

Conclusion: the read weights work as configured.

3.4 Test of forcing reads to the master

Sometimes read requests need up-to-date data, and some read traffic can be forced to go to the master (this is not very convenient; it would be nicer if it could be configured directly with a parameter).

In kingshard, because of read-write separation, select statements are sent to a slave of the corresponding node by default. To send a select statement to the master, just add the comment /*master*/ to it. When connecting with the mysql client, add the -c option so that the client does not strip comments.
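
As a rough illustration of how a proxy could recognize such a hint, the Go sketch below extracts the first /* ... */ comment from a statement and checks whether it says master. The function names are hypothetical, and kingshard's own parser handles comments in its tokenizer, so this is only an approximation; for instance, it would be fooled by a "/*" inside a string literal.

```go
package main

import (
	"fmt"
	"strings"
)

// firstHint returns the text of the first /* ... */ comment in sql,
// or "" if the statement has no complete comment.
func firstHint(sql string) string {
	start := strings.Index(sql, "/*")
	if start < 0 {
		return ""
	}
	rest := sql[start+2:]
	end := strings.Index(rest, "*/")
	if end < 0 {
		return ""
	}
	return strings.TrimSpace(rest[:end])
}

// wantsMaster reports whether a statement carries the /*master*/ hint.
func wantsMaster(sql string) bool {
	return strings.EqualFold(firstHint(sql), "master")
}

func main() {
	for _, sql := range []string{"select @@server_id", "select/*master*/ @@server_id"} {
		fmt.Println(sql, "-> master:", wantsMaster(sql))
	}
}
```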

mysql> select @@server_id;
+-------------+
| @@server_id |
+-------------+
|      73307 |
+-------------+
1 row in set (0.00 sec)

mysql> select @@server_id;
+-------------+
| @@server_id |
+-------------+
|      83307 |
+-------------+
1 row in set (0.00 sec)

mysql> select/*master*/ @@server_id;
+-------------+
| @@server_id |
+-------------+
|      63307 |
+-------------+
1 row in set (0.00 sec)

Conclusion: forcing reads to the master works as expected.

4. Other functions of kingshard

4.1 Backend DB liveness detection

Each kingshard node starts a goroutine that checks the status of its backend master and slaves. If the goroutine fails to ping a backend DB for a continuous period (set by the down_after_noalive parameter in the configuration file), the DB is marked as down, and kingshard stops sending SQL statements to it.

4.2 Whitelist function

For security reasons, users sometimes want only a few servers to be able to connect to kingshard. The allow_ips parameter in the kingshard configuration file implements this client whitelist. When the administrator sets it, only the IPs listed in allow_ips can connect to kingshard, and connections from other IPs are rejected. If the parameter is not set, clients connecting to kingshard are not restricted.
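
A minimal Go sketch of such a whitelist check is shown below. The function names parseAllowIPs and allowed are hypothetical and the code is not taken from kingshard; it only illustrates the behavior described above, including the rule that an empty list means no restriction.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// parseAllowIPs turns the comma-separated allow_ips value into a list of IPs.
func parseAllowIPs(value string) ([]net.IP, error) {
	var ips []net.IP
	for _, field := range strings.Split(value, ",") {
		field = strings.TrimSpace(field)
		if field == "" {
			continue
		}
		ip := net.ParseIP(field)
		if ip == nil {
			return nil, fmt.Errorf("invalid IP %q in allow_ips", field)
		}
		ips = append(ips, ip)
	}
	return ips, nil
}

// allowed reports whether a client at remoteAddr ("host:port") may connect.
// An empty whitelist means clients are not restricted.
func allowed(whitelist []net.IP, remoteAddr string) bool {
	if len(whitelist) == 0 {
		return true
	}
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return false
	}
	client := net.ParseIP(host)
	if client == nil {
		return false
	}
	for _, ip := range whitelist {
		if ip.Equal(client) {
			return true
		}
	}
	return false
}

func main() {
	whitelist, err := parseAllowIPs("127.0.0.1,10.0.0.8,10.0.0.9")
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	for _, addr := range []string{"10.0.0.8:51234", "10.0.0.7:51234"} {
		fmt.Println(addr, "allowed:", allowed(whitelist, addr))
	}
}
```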

4.3 SQL blacklist function

4.3.1 Application scenarios of blacklist function:

  • The DBA defines dangerous SQL statements and puts them in the SQL blacklist file, which protects the database from harmful SQL sent by front-end applications. Such SQL may have been written carelessly by a developer, or it may be the result of SQL injection. For example, delete from table has no where condition and deletes every row in the table.
  • After a project using kingshard goes live, the logs may show that large numbers of a particular SQL statement are putting heavy pressure on the DB. That SQL can then be added to the blacklist dynamically to block its execution and relieve the database. For example, select count(*) from table where xxxx can easily drive system IO too high if it is not properly optimized.

4.3.2 Function introduction

To use the SQL blacklist function in kingshard, you only need to add the following to the configuration:

blacklist_sql_file: /Users/flike/blacklist

Then define the blacklisted SQL in that file; kingshard will refuse to forward any SQL that matches the blacklist.

Blacklist SQL is defined in the form of regular expressions, with ? or ?+ standing in for values in the SQL. To make sure the blacklist is effective, it is best to verify manually that kingshard really intercepts the blacklisted SQL. The following examples show the definition rules (in each pair, the first line is the original SQL and the second is its blacklist form):

SELECT c FROM t WHERE id=1
select c from t where id=?

SELECT * FROM prices.rt_5min where id=1
select * from prices.rt_5min where id=?

select null, 5.001, 5001. from foo
select ?, ?, ? from foo

select 'hello', '\nhello\n', \"hello\", '\\'' from foo
select ?, ?, ?, ? from foo

select 'hello'\n
select ?

select * from t where (base.nid IN  ('1412', '1410', '1411'))
select * from t where (base.nid in(?+))

select * from foo where a in (5) and b in (5, 8,9 ,9 , 10)
select * from foo where a in(?+) and b in(?+)

select * from foo limit 5
select * from foo limit ?

select * from foo limit 5, 10
select * from foo limit ?, ?

select * from foo limit 5 offset 10
select * from foo limit ? offset ?

INSERT INTO t (ts) VALUES (NOW())
insert into t (ts) values(?+)

insert into foo(a, b, c) values(2, 4, 5)
insert into foo(a, b, c) values(?+)

CALL foo(1, 2, 3)
call foo

LOAD DATA INFILE '/tmp/foo.txt' INTO db.tbl
load data infile ? into db.tbl

administrator command: Init DB
administrator command: Init DB

use `foo`
use ?

4.3.3 Function Demonstration

Add the following SQL to the blacklist:

use ?
delete from ?
update ? set ?

Connect to kingshard and execute SQL as shown below:

Switch to the database:
mysql> use test
Database changed

Create the table:
CREATE TABLE `test_shard_hash_0000` (
  `id` bigint(64) unsigned NOT NULL,
  `str` varchar(256) DEFAULT NULL,
  `f` double DEFAULT NULL,
  `e` enum('test1','test2') DEFAULT NULL,
  `u` tinyint(3) unsigned DEFAULT NULL,
  `i` tinyint(4) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Run the inserts:
mysql> insert into test_shard_hash_0000(id,str,f,e,u,i) values(15,"flike",3.14,'test2',2,3);
Query OK, 1 row affected (0.00 sec)

mysql> insert into test_shard_hash_0000(id,str,f,e,u,i) values(7,"chen",2.1,'test1',32,3);
Query OK, 1 row affected (0.00 sec)

mysql> insert into test_shard_hash_0000(id,str,f,e,u,i) values(17,"github",2.5,'test1',32,3);
Query OK, 1 row affected (0.00 sec)

Run the update:
mysql> select * from test_shard_hash_0000;
+----+--------+------+-------+------+------+
| id | str    | f    | e     | u    | i    |
+----+--------+------+-------+------+------+
|  7 | chen   |  2.1 | test1 |   32 |    3 |
| 15 | flike  | 3.14 | test2 |    2 |    3 |
| 17 | github |  2.5 | test1 |   32 |    3 |
+----+--------+------+-------+------+------+
3 rows in set (0.00 sec)

mysql> update test_shard_hash_0000 set i=5;
Query OK, 3 rows affected (0.00 sec)

mysql> select * from test_shard_hash_0000;
+----+--------+------+-------+------+------+
| id | str    | f    | e     | u    | i    |
+----+--------+------+-------+------+------+
|  7 | chen   |  2.1 | test1 |   32 |    5 |
| 15 | flike  | 3.14 | test2 |    2 |    5 |
| 17 | github |  2.5 | test1 |   32 |    5 |
+----+--------+------+-------+------+------+
3 rows in set (0.00 sec)

Run the delete:
mysql> select * from test_shard_hash_0000;
+----+--------+------+-------+------+------+
| id | str    | f    | e     | u    | i    |
+----+--------+------+-------+------+------+
|  7 | chen   |  2.1 | test1 |   32 |    5 |
| 15 | flike  | 3.14 | test2 |    2 |    5 |
| 17 | github |  2.5 | test1 |   32 |    5 |
+----+--------+------+-------+------+------+
3 rows in set (0.00 sec)

mysql> delete from test_shard_hash_0000;
Query OK, 3 rows affected (0.00 sec)

mysql> select * from test_shard_hash_0000;
Empty set (0.00 sec)

Verification failed: none of the blacklisted statements (use, update, delete) were blocked.

Change the blacklist so that it contains the following SQL:

use ?
delete from ?
update ? set ?
select ?
select * from test_shard_hash_0000

Connect to kingshard and execute SQL as shown below:

mysql> select 'time';
ERROR 1105 (HY000): sql in blacklist.

mysql> select @@port;
+--------+
| @@port |
+--------+
|   3307 |
+--------+
1 row in set (0.00 sec)

mysql> select * from test_shard_hash_0000;
ERROR 1105 (HY000): sql in blacklist.
mysql> select * from test_shard_hash_0000 where id=17;
+----+--------+------+-------+------+------+
| id | str    | f    | e     | u    | i    |
+----+--------+------+-------+------+------+
| 17 | github |  2.5 | test1 |   32 |    3 |
+----+--------+------+-------+------+------+
1 row in set (0.00 sec)

Conclusion: the kingshard SQL blacklist matches strictly, and many kinds of SQL are not handled well (such as the statements in the first test). Overall, this function is of limited practical value and is not recommended.

4.4 The range of SQL that kingshard supports

4.4.1 Brief description

kingshard supports most MySQL syntax and protocol features for non-sharded tables, including SHOW DATABASES, SHOW TABLES, and the various DML and DDL statements. For sharded tables, only a limited set of DML statements is currently supported, mainly five SQL operations: SELECT, UPDATE, INSERT, REPLACE, and DELETE, plus a limited set of kingshard-specific admin commands. Automatic creation of sub-tables is not supported. The following are not supported for either sharded or non-sharded tables:

  • User-defined data types and user-defined functions are not currently supported.
  • Views, stored procedures, triggers, and cursors are not currently supported.
  • Compound statements such as BEGIN...END, LOOP...END LOOP, REPEAT...UNTIL...END REPEAT, and WHILE...DO...END WHILE are not currently supported.
  • Flow control statements such as IF and WHILE are not currently supported.

The following two sections describe kingshard's SQL support: first for non-sharded tables, then for sharded tables.

4.4.2 SQL support for non-sharded tables

Database DDL syntax

CREATE TABLE Syntax
CREATE INDEX Syntax
DROP TABLE Syntax
DROP INDEX Syntax
ALTER TABLE Syntax
TRUNCATE TABLE Syntax

Database DML syntax

INSERT Syntax
INSERT DELAYED Syntax (not currently supported)
REPLACE Syntax
UPDATE Syntax
DELETE Syntax
Subquery Syntax
Scalar Subquery
Comparisons Subquery
Subqueries with ANY, IN, or SOME
Subqueries with ALL
Row Subqueries
Subqueries with EXISTS or NOT EXISTS
Subqueries in the FROM Clause
SELECT Syntax
SELECT INTO OUTFILE/INTO DUMPFILE/INTO var_name (not currently supported)
Last_insert_id feature

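Transaction support

START TRANSACTION, COMMIT, and ROLLBACK Syntax
transaction_characteristic definitions are not currently supported
Syntax for savepoint nested transactions is not currently supported
Syntax for XA transactions is not currently supported
Transactions can be controlled with set autocommit=0/1
Transactions can be controlled with begin/commit
Transactions can be controlled with start transaction
SET TRANSACTION Syntax
Adjusting the global transaction isolation level is not currently supported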

Prepared statement support

Prepared Statements: the MySQL prepare syntax of the SDKs for mainstream languages (Java, PHP, Python, C/C++, Go) is supported.

Database management syntax support

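SET Syntax: only character set and set autocommit related syntax is supported; other set syntax has not been tested.
SHOW Syntax: by default, show operations are forwarded to the default DB; to view the contents of another DB, add a comment to the SQL.
KILL Syntax: KILL QUERY processlist_id is not currently supported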
DESCRIBE Syntax
EXPLAIN Syntax
USE Syntax

Database system functions are supported by default (untested)

4.4.3 SQL support for sharded tables

Database DDL syntax

CREATE TABLE Syntax
CREATE INDEX Syntax
DROP TABLE Syntax
DROP INDEX Syntax
ALTER TABLE Syntax
TRUNCATE TABLE Syntax

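These statements are supported for sharded tables, but a comment must be added to the SQL, for example: /*node1*/create table stu_0000(id int, name char(20)); kingshard then forwards the SQL to the master of node1.

Note: if truncate is issued without a node comment, all sub-tables are emptied, for example: truncate stu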

Database DML syntax

INSERT Syntax
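INSERT DELAYED Syntax (not supported)
INSERT INTO SELECT (not supported)
REPLACE Syntax
UPDATE Syntax (the sharding key cannot be an updated column, regardless of the sharding type)
UPDATE SET xx=REPLACE(xx,'a','b') Syntax (not supported)
DELETE Syntax
Subquery Syntax
SELECT Syntax
For UPDATE, DELETE, and SELECT, the conditions after WHERE cannot contain subqueries, functions, and so on; only column names are allowed.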

Database management syntax support

DESCRIBE Syntax: supported through an SQL hint, for example: /*node2*/describe table_name
EXPLAIN Syntax: supported through an SQL hint, for example: /*node2*/explain select * from xxxx
USE Syntax

Aggregate function support for sharded tables

sum function
max function
count function
min function
Aggregation after distinct is not supported, for example: select count(distinct id) from xxxx

Sub-table group by, order by, limit support

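Other notes

Distributed transactions are not supported; data on multiple nodes can be updated in a non-transactional manner.
Prepared statements are not supported.
Database management syntax is not supported.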

Reference address: https://github.com/flike/kingshard/

