Heisenberg: MySQL database and table sharding middleware

Code address:

https://github.com/brucexx/heisenberg

Its advantages:
Sharding is decoupled from the application: sharded databases and tables are used as if they were a single database and a single table.
The pressure on the number of DB connections is reduced.
The configuration can be hot-restarted.
It can be scaled horizontally.
It follows the native MySQL protocol, so there is no language restriction: the mysql client, C, Java, and so on can all use the Heisenberg server.
Management commands let you view and adjust the number of connections, thread pools, nodes, and so on.
Sharding rules are written as Velocity scripts, so the sharding scheme can be customized quite flexibly.

 

I gave a short talk on this within my team before. This write-up is a simpler version; I am sharing it first to see whether anyone has better ideas for improving this area.

 

 Let's start with an introduction to heisenberg

 

1. Heisenberg overall architecture

      First, the overall structure:

      [Figure: Heisenberg overall architecture]
 
 

The application acts as a MySQL client of the Heisenberg cluster.

Heisenberg implements the native MySQL protocol, so to the application it looks like a data source with a single database and a single table.

Any client, whether the mysql client, C, a JDBC driver, or something else, can access the Heisenberg server, and the server does the database and table sharding.

 

Access to the Heisenberg cluster can be balanced with load-balancing software or devices such as LVS or F5.

In fact, a single Heisenberg instance performs quite well. In my stress test at 2320 TPS the system load was only about 0.1-0.3 (8-core CPU, 16 GB RAM). I could not get a physical MySQL machine, so this was as far as I could test.

 

Server internal structure:

[Figure: Heisenberg server internal structure]
 

FrontConnectionFactory manages the application-facing connections. ManagerConnectionFactory manages the connections used for internal administration of the Heisenberg server, such as hot-restarting after a configuration change or closing a connection.

The MySQL protocol runs all the way from the application to the MySQL server, and traffic is ultimately parsed into the corresponding MySQL packets: data packets, authorization packets, registration packets, and so on.

 

When the Heisenberg server receives an SQL statement, it parses it into an AST to determine its type (DML, DCL, or DDL) and the values of the relevant columns. The statement then passes through the ServerRouter layer, which applies the database and table sharding rules, and the rewritten statement is sent to the corresponding data node for execution.

 

Sharding rules are written in two syntaxes, Velocity and Groovy, to cover a wide range of cases flexibly. Groovy initializes the tables, the databases, and the mapping between them, and runs only once at load time. Velocity renders the database and table sharding rules for each statement.

 

Now that the principle is clear, let's look at how to set up database and table sharding.

 

2. Heisenberg development

Building requires Maven and a JDK.

 https://github.com/brucexx/heisenberg 

After cloning the repository locally, build it:

mvn package

 

This generates heisenberg-server-1.0.0.zip in the local target directory.

Unzip it:

unzip heisenberg-server-1.0.0.zip

Enter the conf directory, which contains the following files:

     conf

      ---log4j.xml

      ---rule.xml

      ---schema.xml

      ---server.xml

log4j.xml needs no introduction. Two of the logs are worth noting:

  sql_route.log records the time spent routing a statement to its databases and tables

  sql_execute.log records the total execution time of each SQL statement

 

server.xml 

 

    "serverPort">8166

    "managerPort">8266

    "initExecutor">16

    "timerExecutor">4

    "managerExecutor">4

    "processors">4

    "processorHandler">8

    "processorExecutor">8

    "clusterHeartbeatUser">_HEARTBEAT_USER_

    "clusterHeartbeatPass">_HEARTBEAT_PASS_

 

 

serverPort is the service port, the one that upper-layer applications connect to.

managerPort is the management port, used to operate on the server, for example to change its configuration.

initExecutor is the number of initialization threads.

timerExecutor is the number of heartbeat threads.

managerExecutor is the number of management threads.

processors is the number of processors (cores) that receive application requests.

processorHandler is the number of handlers for requests received from applications.

processorExecutor is the number of executor threads for requests received from applications.

 

clusterHeartbeatUser and clusterHeartbeatPass are used for cluster authentication and do not need to be changed.

 

 "brucexx">

    "password">st0078

    "schemas">trans_shard

 

brucexx is the user name that the application logs in with, and st0078 is its password.

schemas lists the schemas this user may access; see schema.xml for details.

 

Multiple schemas can be listed, separated by commas.

 

Whitelist restrictions:

 

  

   

      test

   

 

 

                

schema.xml configuration

MySQL data source

 

    "transDS" type="mysql">

        "location">

            10.58.49.14:8701/db$0-9

   

        "user">root

        "password">st0078

        "sqlMode">STRICT_TRANS_TABLES

   

 

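This specifies the MySQL data source. The trailing $0-9 is a custom shorthand.

You can also define multiple location entries inside the property, for example:

    <property name="location">
        <location>10.58.49.14:8701/db0</location>
        <location>10.58.49.14:8701/db1</location>
        <location>10.58.49.14:8701/db2</location>
    </property>

The effect is the same.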
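To make the $0-9 shorthand concrete, here is a minimal Python sketch of how such a range could be expanded into individual locations. It is not Heisenberg's own code: the function name expand_shorthand is hypothetical, and it assumes the range is inclusive at both ends, which matches the db0, db1, db2 listing above.

```python
import re


def expand_shorthand(spec: str) -> list[str]:
    """Expand a trailing `$start-end` range into one entry per index.

    Hypothetical helper: assumes the range is inclusive at both ends.
    A spec without the shorthand is returned unchanged as a single entry.
    """
    match = re.fullmatch(r"(.*)\$(\d+)-(\d+)", spec)
    if match is None:
        return [spec]
    prefix = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    return [f"{prefix}{index}" for index in range(start, end + 1)]


locations = expand_shorthand("10.58.49.14:8701/db$0-9")
print(len(locations), locations[0], locations[-1])
```

Under that assumption, the single shorthand entry and ten explicit location entries describe the same set of databases.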

 

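Shard node configuration

A shard node is a logical node that is exposed to the relevant schema. Each node maps to a master, a standby, and a disaster-recovery data source: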

"transDN">

        "dataSource">

           

            transDS$0-9

           

            transSlaveDS$0-9

           

            transSlaveDS$0-9

           

           

       

        "rwRule">m:0,s:1

        "poolSize">256

        "heartbeatSQL">select user()

   

 

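In the dataSource property, the first entry is the master database, the second is the standby, and the third is the disaster-recovery database. Configure as many as you need.

 

In the read/write splitting rule rwRule, m and s are the read ratios for the master and the slave. m:0,s:1 means the master serves no reads and the slave serves all of them, which gives complete read/write splitting. With m:1,s:1, reads are split evenly between the two.

 

poolSize is the number of connections to the MySQL database, and heartbeatSQL is the heartbeat statement. Leave both unchanged unless you have special requirements.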
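The ratio semantics of rwRule can be illustrated with a small weighted-choice sketch in Python. This only illustrates the ratios described above and is not how Heisenberg implements them; parse_rw_rule and pick_read_target are hypothetical names.

```python
import random


def parse_rw_rule(rule: str) -> dict[str, int]:
    """Parse a rule such as 'm:0,s:1' into {'m': 0, 's': 1}."""
    weights = {}
    for part in rule.split(","):
        role, weight = part.split(":")
        weights[role.strip()] = int(weight)
    return weights


def pick_read_target(weights: dict[str, int]) -> str:
    """Pick 'm' (master) or 's' (slave) for one read, in proportion to the weights."""
    roles = [role for role, weight in weights.items() if weight > 0]
    if not roles:
        raise ValueError("at least one role needs a positive read weight")
    return random.choices(roles, weights=[weights[r] for r in roles], k=1)[0]


# With m:0,s:1 every read goes to the slave.
print(pick_read_target(parse_rw_rule("m:0,s:1")))
```

With m:1,s:1 the same function returns m or s with equal probability, which corresponds to the even split mentioned above.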

 

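Schema configuration

    <schema name="trans_shard">
        <table name="trans_online, trans_content, trans_tb" dataNode="transDN$0-9" rule="rule1"/>
    </schema>

trans_shard is the schema exposed to applications; it matches the name used in server.xml.

Under it you list the tables that need to be sharded, for example:

    <table name="trans_online" dataNode="transDN$0-9" rule="rule1"/>

Every table that is sharded must be listed here. Tables that are not sharded can be listed as well:

    <table name="tbxxx" dataNode="transDN0" ruleRequired="false"/>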

 

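rule.xml 

This file configures the sharding rules. Column names inside columns, dbRuleList, and tbRuleList must be written in uppercase.

 

Here is a complete configuration first: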

 

    "rule1">

        TRANS_ID

   

            #set($start=$TRANS_ID.length() - 2)##

            #set($end=$TRANS_ID.length() - 1)##

            $stringUtil.substring($TRANS_ID,$start,$end)

           

       

       

            #set($start=$TRANS_ID.length() - 2)##

             $stringUtil.substring($TRANS_ID,$start)

       

       

       

                 

                        def map = [:];

                        for (int i=0; i<10; i++) {

                           def list = [];

                            for (int j=0; j<10; j++) {

                                list.add(i+""+j);

                            }

                             map.put(i,list);

                        };

                        return map;

               

       

   

 

 

dbRuleList holds the database-sharding rules:

 

 

            #set($start=$TRANS_ID.length() - 2)##

            #set($end=$TRANS_ID.length() - 1)##

            $stringUtil.substring($TRANS_ID,$start,$end)

           

       

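dbRuleList can contain several dbRule entries; when the first one does not apply, the second is tried. This is inefficient, though, so if the cases can be told apart, it is better to write a separate rule.

The value a dbRule renders is the leading part of the table suffix.

For example, with databases named db0 through db9, rendering this dbRule takes the TRANS_ID value, and the script extracts its second-to-last character as the database suffix.

[Figure: example of the resulting database routing]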


 
 

Table-sharding rule configuration

 

            #set($start=$TRANS_ID.length() - 2)##

             $stringUtil.substring($TRANS_ID,$start)

       

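This works like the database rule above: the last two characters of TRANS_ID become the table suffix.

[Figure: example of the resulting table routing]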
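The two Velocity rules above are plain substring operations, so they can be mirrored in a few lines of Python. This is a sketch for checking the arithmetic, not Heisenberg's implementation. It assumes $stringUtil.substring follows Java's String.substring semantics (end index exclusive), that the physical names are db plus suffix and trans_tb plus suffix as in the examples in this article, and the sample TRANS_ID is made up.

```python
def db_suffix(trans_id: str) -> str:
    """Mirror of the dbRule: the second-to-last character of TRANS_ID."""
    if len(trans_id) < 2:
        raise ValueError("TRANS_ID needs at least two characters")
    start = len(trans_id) - 2
    end = len(trans_id) - 1
    return trans_id[start:end]


def table_suffix(trans_id: str) -> str:
    """Mirror of the tbRule: the last two characters of TRANS_ID."""
    if len(trans_id) < 2:
        raise ValueError("TRANS_ID needs at least two characters")
    return trans_id[len(trans_id) - 2:]


def physical_target(trans_id: str, table: str = "trans_tb") -> tuple[str, str]:
    """Return (database, table) names under the naming assumed in the lead-in."""
    return f"db{db_suffix(trans_id)}", f"{table}{table_suffix(trans_id)}"


# Hypothetical TRANS_ID ending in 57: database db5, table trans_tb57.
print(physical_target("2014032700057"))
```

Because the database suffix is the first character of the table suffix, a row always lands in a table whose suffix starts with its own database index.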


 
 

 

There is one implicit rule:

table names must be globally unique.

For example, if db0 has a table named trans_tb00, db1 must not have a table named trans_tb00.

 

Table initialization

       

       

                 

                        def map = [:];

                        for (int i=0; i<10; i++) {

                           def list = [];

                            for (int j=0; j<10; j++) {

                                list.add(i+""+j);

                            }

                             map.put(i,list);

                        };

                        return map;

               

       

 

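The tables are initialized with a map. Each key is a database index (for example, db0 has index 0),

and each list holds the table suffixes in that database.

 

 

The purpose is to define these databases and tables at initialization time.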
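For readers less familiar with Groovy, the following Python sketch builds the same map as the initialization script above and checks the global-uniqueness rule for table suffixes. The function name build_table_map is hypothetical; the 10 by 10 layout is taken from the script.

```python
def build_table_map(db_count: int = 10, tables_per_db: int = 10) -> dict[int, list[str]]:
    """Map each database index to its table suffixes, e.g. 3 -> ['30', ..., '39']."""
    return {
        db_index: [f"{db_index}{table_index}" for table_index in range(tables_per_db)]
        for db_index in range(db_count)
    }


table_map = build_table_map()

# Table names must be globally unique, so no suffix may appear in two databases.
all_suffixes = [suffix for suffixes in table_map.values() for suffix in suffixes]
if len(all_suffixes) != len(set(all_suffixes)):
    raise ValueError("duplicate table suffix across databases")

print(table_map[3])
```

The check is worth keeping if the layout ever changes, since simple concatenation of the two indexes can produce the same suffix twice once either index has more than one digit.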

 

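How do you use it?

Connect from the command line with the mysql client.

There is not much to explain here. wms_shard is the logical sharded schema configured in server.xml; the application only needs to access that schema.

show tables; also lists your tables.

OK.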

 

mysql> select * from t_user_id_map;

+-----------+---------------------------+-----------+------------+---------------------+---------------------+

| F_uid     | F_uname                   | F_enabled | F_user_id  | F_create_time       | F_modify_time       |

+-----------+---------------------------+-----------+------------+---------------------+---------------------+

| 105001050 | @8230762802717b6a723fe9cd |         1 | 1287824017 | 2014-03-10 15:38:44 | 2014-03-10 15:38:44 |

|     62000 |                           |         1 |  533885000 | 2014-03-26 23:02:31 | 2014-03-26 23:02:31 |

|     86000 |                           |         1 |  237406000 | 2014-03-27 01:04:23 | 2014-03-27 01:04:23 |

|     96000 |                           |         1 |  767684000 | 2014-03-27 00:30:32 | 2014-03-27 00:30:32 |

|    130000 |                           |         1 |  506552000 | 2014-03-27 15:57:31 | 2014-03-27 15:57:31 |

|    149000 |                           |         1 |  868483000 | 2014-03-27 15:50:09 | 2014-03-27 15:50:09 |

|    179000 |                           |         1 |  245626000 | 2014-03-26 21:33:46 | 2014-03-26 21:33:46 |

When a query does not specify the sharding column, every table is scanned. We can see this with explain:

mysql> explain select * from t_user_id_map;

+-----------+-----------------------------------

| DATA_NODE | SQL

+-----------+-----------------------------------

| wmsDN[0]  |  select * from t_user_id_map_00_0

| wmsDN[0]  |  select * from t_user_id_map_00_1

| wmsDN[0]  |  select * from t_user_id_map_00_2

| wmsDN[0]  |  select * from t_user_id_map_00_3

| wmsDN[0]  |  select * from t_user_id_map_00_4

| wmsDN[0]  |  select * from t_user_id_map_00_5

| wmsDN[0]  |  select * from t_user_id_map_00_6

| wmsDN[0]  |  select * from t_user_id_map_00_7

| wmsDN[0]  |  select * from t_user_id_map_00_8

| wmsDN[0]  |  select * from t_user_id_map_00_9

| wmsDN[1]  |  select * from t_user_id_map_01_0

| wmsDN[1]  |  select * from t_user_id_map_01_1

| wmsDN[1]  |  select * from t_user_id_map_01_2

| wmsDN[1]  |  select * from t_user_id_map_01_3

| wmsDN[1]  |  select * from t_user_id_map_01_4

| wmsDN[1]  |  select * from t_user_id_map_01_5

| wmsDN[1]  |  select * from t_user_id_map_01_6

| wmsDN[1]  |  select * from t_user_id_map_01_7

| wmsDN[1]  |  select * from t_user_id_map_01_8

| wmsDN[1]  |  select * from t_user_id_map_01_9

| wmsDN[2]  |  select * from t_user_id_map_02_0

....

There are many tables here; DATA_NODE is the corresponding node from our configuration.
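The fan-out shown by explain can be reproduced with a short Python sketch that generates one statement per physical table. It only imitates the naming visible in the output above (two-digit node index, then a one-digit table index); the function name fan_out is hypothetical, and the node count is a parameter because the full listing is elided.

```python
def fan_out(table: str, node_count: int, tables_per_node: int = 10) -> list[tuple[str, str]]:
    """Return (data node, SQL) pairs for a query that carries no sharding column."""
    plan = []
    for node in range(node_count):
        for table_index in range(tables_per_node):
            physical = f"{table}_{node:02d}_{table_index}"
            plan.append((f"wmsDN[{node}]", f"select * from {physical}"))
    return plan


# Three nodes are enough to show the pattern.
for data_node, sql in fan_out("t_user_id_map", node_count=3)[:12]:
    print(data_node, sql)
```

The cost of such a query grows with the number of nodes times the number of tables per node, which is why queries should include the sharding column whenever possible.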

 

mysql> select * from t_user_id_map where f_uid=196606999;

+-----------+---------+-----------+-----------+---------------------+---------------------+

| F_uid     | F_uname | F_enabled | F_user_id | F_create_time       | F_modify_time       |

+-----------+---------+-----------+-----------+---------------------+---------------------+

| 196606999 |         |         1 | 749331999 | 2014-04-04 14:46:58 | 2014-04-04 14:46:58 |

+-----------+---------+-----------+-----------+---------------------+---------------------+

1 row in set (0.04 sec)

This configuration shards by the last three digits of F_uid: dbRuleList uses the second- and third-to-last digits to pick the database,

and tbRuleList uses the last digit to pick the table.

 

Let's see how it is routed

 

mysql> explain select * from t_user_id_map where f_uid=196606999;

+-----------+---------------------------------------------------------+

| DATA_NODE | SQL                                                     |

+-----------+---------------------------------------------------------+

| wmsDN[99] |  select * from t_user_id_map_99_9 where f_uid=196606999 |

+-----------+---------------------------------------------------------+

1 row in set (0.03 sec)

 

You can see that DATA_NODE is wmsDN[99], the database shard,

and the corresponding table is t_user_id_map_99_9.
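The routing of this query can be checked by hand with a small Python sketch of the rule described above. It is an illustration, not Heisenberg's code: the function name route is hypothetical, and left-padding short ids with zeros is an assumption made so that ids with fewer than three digits still route.

```python
def route(f_uid: int) -> tuple[str, str]:
    """Return (data node, physical table) for a given F_uid.

    The second- and third-to-last digits select the node and the last
    digit selects the table, as described in the text above.
    """
    digits = str(f_uid).zfill(3)  # assumption: pad short ids with zeros
    db_part = digits[-3:-1]
    table_part = digits[-1]
    return f"wmsDN[{int(db_part)}]", f"t_user_id_map_{db_part}_{table_part}"


print(route(196606999))
```

For f_uid=196606999 the last three digits are 999, so the node part is 99 and the table part is 9, which agrees with the explain output.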

 

 

http://blog.sina.com.cn/s/blog_56d988430102vdfo.html
