Data sharding (Mycat)

1. The concept of data sharding:

1.1. Database and table sharding

  • What database and table sharding is:
  • The data held on one database server is split according to a specific method (an algorithm implemented in the program) and stored across multiple database servers, so that the load of a single server is spread out.

1.2. Horizontal split

  • Horizontal split:
  • The rows of a table are divided according to a sharding rule applied to a specified column, and the resulting groups of rows are stored in multiple databases.

1.3. Vertical split

  • Vertical split:
  • The tables of a single database are grouped by business type and stored in different databases.
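
To make the contrast between 1.2 and 1.3 concrete, here is a minimal Python sketch. It is only an illustration: the table names, the database names and the id ranges are assumptions invented for this example and have nothing to do with Mycat's own configuration.

```python
from collections import defaultdict

# Vertical split: whole tables are assigned to databases by business type.
# (Hypothetical table and database names.)
TABLE_TO_DATABASE = {
    "users": "account_db",
    "orders": "trade_db",
    "order_items": "trade_db",
}


def database_for_table(table):
    """Vertical split: every row of a table lives in the same database."""
    return TABLE_TO_DATABASE[table]


def database_for_row(row):
    """Horizontal split: rows of ONE table are spread by a rule on a column.

    Assumed rule: id below 1000 -> db1, below 2000 -> db2, otherwise db3.
    """
    if row["id"] < 1000:
        return "db1"
    if row["id"] < 2000:
        return "db2"
    return "db3"


def split_rows(rows):
    """Group the rows of one table by the database chosen for each row."""
    placed = defaultdict(list)
    for row in rows:
        placed[database_for_row(row)].append(row)
    return dict(placed)


rows = [{"id": 5}, {"id": 1500}, {"id": 2500}, {"id": 7}]
print(database_for_table("orders"))  # trade_db
print(split_rows(rows))
```

In the vertical case the lookup key is the table name; in the horizontal case it is a column value inside each row.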

2. Mycat software introduction:

  • Mycat is Java-based middleware for distributed databases; it provides a distributed storage solution for high-concurrency environments.
  • It suits workloads that write large amounts of data, and is less suited to workloads dominated by large queries.
  • It supports MySQL, Oracle, SQL Server, MongoDB, and others.
  • It provides read/write splitting and data sharding services.
  • It is open source software developed from Alibaba Cobar.

2.1. Sharding rules

  • Mycat supports 10 sharding rules:
  1. Enumeration (sharding-by-intfile)
  2. Fixed sharding (rule1)
  3. Range (auto-sharding-long)
  4. Modulo (mod-long)
  5. Date column partitioning (sharding-by-date)
  6. Wildcard modulo (sharding-by-pattern)
  7. ASCII-code modulo wildcard (sharding-by-prefixpattern)
  8. Programmatic specification (sharding-by-substring)
  9. String-split hash (sharding-by-stringhash)
  10. Consistent hash (sharding-by-murmur)

2.2. Workflow

  • When Mycat receives a SQL command:
  1. It parses the tables involved in the SQL command.
  2. It looks up each table's configuration; if the table has a sharding rule, it takes the value of the sharding column from the SQL command and applies the sharding function to obtain the list of shards.
  3. It sends the SQL command to the corresponding shard servers for execution.
  4. It collects and processes the results from all shards and returns them to the client.
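
The four steps above can be sketched roughly as follows. This is not Mycat's code: the regular expression only understands simple single-row INSERT statements, TABLE_RULES is a hand-written stand-in for schema.xml and rule.xml, and route, execute, DEFAULT_NODES and the send callback are hypothetical names chosen for this example.

```python
import re

# Assumed stand-in for the table configuration (schema.xml + rule.xml).
TABLE_RULES = {
    "hotnews": {
        "column": "mod_id",
        "nodes": ["dn1", "dn2", "dn3"],
        "function": lambda value, count: int(value) % count,
    },
}
DEFAULT_NODES = ["dn1"]  # assumption: where tables without a rule are sent

INSERT_RE = re.compile(
    r"\s*insert\s+into\s+(\w+)\s*\(([^)]*)\)\s*values\s*\(([^)]*)\)",
    re.IGNORECASE,
)


def route(sql):
    """Steps 1 and 2: find the table, then compute the list of shards."""
    match = INSERT_RE.match(sql)
    if match is None:
        raise ValueError("this sketch only understands simple INSERT statements")
    table = match.group(1).lower()
    columns = [c.strip() for c in match.group(2).split(",")]
    values = [v.strip().strip("'") for v in match.group(3).split(",")]

    rule = TABLE_RULES.get(table)
    if rule is None:
        return list(DEFAULT_NODES)
    value = values[columns.index(rule["column"])]
    index = rule["function"](value, len(rule["nodes"]))
    return [rule["nodes"][index]]


def execute(sql, send):
    """Steps 3 and 4: run the SQL on each shard and merge the results.

    send(node, sql) is supplied by the caller and returns a list of results.
    """
    results = []
    for node in route(sql):
        results.extend(send(node, sql))
    return results


sql = "insert into hotnews(mod_id,title,worker) values(1,'system','bob')"
print(route(sql))  # ['dn2']
```

Keeping routing (route) separate from dispatch (execute) mirrors the split between steps 1-2 and steps 3-4, and lets the routing be tested without any database connection.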

3. Configure the data sharding service

  • Data sharding topology
  • Note that deploying this architecture requires at least three database servers.
  • IP planning
Host name   Role              Database   IP address
mysql10     client            none       192.168.2.10
mycat20     shard server      none       192.168.2.20
mysql30     database server   db1        192.168.2.30
mysql40     database server   db2        192.168.2.40
mysql50     database server   db3        192.168.2.50
  • Installing MySQL is not covered here; refer to a MySQL installation guide.
  • Set the host name of every server according to the IP planning table.
## Edit the hosts file on every host
[root@mysql10 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.2.10   mysql10
192.168.2.20   mycat20
192.168.2.30   mysql30
192.168.2.40   mysql40
192.168.2.50   mysql50

3.1. Install mycat

## Install the JDK; Mycat is a Java program and needs a Java runtime.
[root@mycat20 ~]# yum -y install java-1.8.0-openjdk
## Check the Java version
[root@mycat20 ~]# java -version
openjdk version "1.8.0_362"
OpenJDK Runtime Environment (build 1.8.0_362-b08)
OpenJDK 64-Bit Server VM (build 25.362-b08, mixed mode)
## Download the Mycat binary package
[root@mycat20 ~]# wget http://dl.mycat.org.cn/1.6-RELEASE/Mycat-server-1.6-RELEASE-20161028204710-linux.tar.gz
## Extract it into the target directory
[root@mycat20 ~]# tar -xf Mycat-server-1.6-RELEASE-20161028204710-linux.tar.gz -C /usr/local/
## Check that the extraction succeeded
[root@mycat20 ~]# cd /usr/local/
[root@mycat20 local]# ls mycat/
bin  catlet  conf  lib  logs  version.txt
## Add Mycat to the PATH environment variable (single quotes keep $PATH unexpanded in the file)
[root@localhost ~]# echo 'export PATH=/usr/local/mycat/bin:$PATH' >/etc/profile.d/mycat.sh
[root@localhost ~]# . /etc/profile.d/mycat.sh 
[root@localhost ~]# echo $PATH
/usr/local/mycat/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin

3.2. Directory structure

  • ls /usr/local/mycat
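- bin         ---- Mycat commands
- catlet      ---- extensions
- conf        ---- configuration files
- lib         ---- jar packages used by Mycat
- logs        ---- Mycat startup and runtime logs
  - wrapper.log ---- Mycat service startup log
  - mycat.log   ---- errors reported when SQL statements are executed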

3.3. Modify the main configuration file

  • Key configuration files
- server.xml      --- connection accounts and logical databases
- schema.xml      --- tables that are stored as shards
- rule.xml        --- sharding rules
- other files     --- data files used by the sharding rules

3.3.1. Create connection user (server.xml)

## Back up server.xml
[root@mycat20 ~]# cd /usr/local/mycat/conf/
[root@mycat20 conf]# cp -r server.xml{,.bak}
## Create the connection users
## In .xml files, <!--     --> marks a comment.
[root@mycat20 conf]# vim server.xml
....
## Change the following content
<user name="root">       <!-- user name for connecting to the Mycat service -->
                <property name="password">1234</property>   <!-- user password -->
                <property name="schemas">mycatdb</property>    <!-- logical database -->

                <!-- table-level DML privilege settings -->
                <!--
                <privileges check="false">
                        <schema name="TESTDB" dml="0110" >
                                <table name="tb01" dml="0000"></table>
                                <table name="tb02" dml="1111"></table>
                        </schema>
                </privileges>
                 -->
        </user>

        <user name="user">
                <property name="password">1234</property>
                <property name="schemas">mycatdb</property>
                <property name="readOnly">true</property>   <!-- read-only access -->
        </user>

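3.3.2. Define the sharded tables (schema.xml)

## Define the sharded tables

<schema> ...... </schema>       -- defines the sharding information: which tables are sharded
<table> ...... </table>         -- defines a table to shard
name                            -- logical database name or logical table name
dataNode                        -- data node names
rule                            -- sharding rule to use
type=global                     -- the data is not sharded

## Define the data nodes

<dataNode option="value" ... />  -- defines a data node
name                            -- data node name
dataHost                        -- database server host name
database                        -- database name

## Define the IP address and port of each database server

<dataHost option="value" ...> ... </dataHost>   -- database server host
name                            -- host name (the name referenced by dataHost in dataNode)
host                            -- host name (the name that goes with the IP address)
url                             -- IP address and port of the database server
user                            -- authorized user on the database server
password                        -- password of the authorized user

## Back up schema.xml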
[root@mycat20 conf]# cp -r schema.xml{,.bak}
[root@mycat20 conf]# vim schema.xml
<?xml version="1.0"?>
<!DOCTYPE mycat:schema SYSTEM "schema.dtd">
<mycat:schema xmlns:mycat="http://io.mycat/">

        <schema name="mycatdb" checkSQLschema="false" sqlMaxLimit="100">
                <table name="travelrecord" dataNode="dn1,dn2,dn3" rule="auto-sharding-long" />

                <table name="company" primaryKey="ID" type="global" dataNode="dn1,dn2,dn3" />
                <table name="goods" primaryKey="ID" type="global" dataNode="dn1,dn2,dn3" />
                <table name="hotnews"  dataNode="dn1,dn2,dn3"
                           rule="mod-long" />
                <table name="employee" primaryKey="ID" dataNode="dn1,dn2,dn3"
                           rule="sharding-by-intfile" />
                <table name="customer" primaryKey="ID" dataNode="dn1,dn2,dn3"
                           rule="sharding-by-intfile">
                        <childTable name="orders" primaryKey="ID" joinKey="customer_id"
                                                parentKey="id">
                                <childTable name="order_items" joinKey="order_id"
                                                        parentKey="id" />
                        </childTable>
                        <childTable name="customer_addr" primaryKey="ID" joinKey="customer_id"
                                                parentKey="id" />
                </table>
        </schema>
        <dataNode name="dn1" dataHost="mysql30" database="db1" />
        <dataNode name="dn2" dataHost="mysql40" database="db2" />
        <dataNode name="dn3" dataHost="mysql50" database="db3" />
        <dataHost name="mysql30" maxCon="1000" minCon="10" balance="0"
                          writeType="0" dbType="mysql" dbDriver="native" switchType="1"  slaveThreshold="100">
                <heartbeat>select user()</heartbeat>
                <writeHost host="hostM1" url="192.168.2.30:3306" user="mycat"
                                   password="1234">
                </writeHost>
        </dataHost>
        <dataHost name="mysql40" maxCon="1000" minCon="10" balance="0"
                          writeType="0" dbType="mysql" dbDriver="native" switchType="1"  slaveThreshold="100">
                <heartbeat>select user()</heartbeat>
                <writeHost host="hostM2" url="192.168.2.40:3306" user="mycat"
                                   password="1234">
                </writeHost>
        </dataHost>
        <dataHost name="mysql50" maxCon="1000" minCon="10" balance="0"
                          writeType="0" dbType="mysql" dbDriver="native" switchType="1"  slaveThreshold="100">
                <heartbeat>select user()</heartbeat>
                <writeHost host="hostM3" url="192.168.2.50:3306" user="mycat"
                                   password="1234">
                </writeHost>
        </dataHost>
</mycat:schema>	

4. Configure the database server

  • Configure each database server to match the shard configuration:
  1. Add the authorized user
  2. Create the database
## Create the authorized user (run on mysql30, mysql40 and mysql50)
mysql> grant all on *.* to mycat@'%' identified by '1234';
## Create the databases
mysql> create database db1 default charset=utf8;   -- run on mysql30
mysql> create database db2 default charset=utf8;   -- run on mysql40
mysql> create database db3 default charset=utf8;   -- run on mysql50
  • The above operations are based on the contents of the Mycat configuration file schema.xml.

5. Start the mycat service

[root@mycat20 conf]# mycat start
Starting Mycat-server...
[root@mycat20 conf]# netstat -nltp | grep 8066
tcp6       0      0 :::8066                 :::*                    LISTEN      9719/java

6. The client connects to the shard server

[root@mysql10 ~]# mysql -uroot -p1234 -h192.168.2.20 -P8066
mysql> show databases;
+----------+
| DATABASE |
+----------+
| mycatdb  |     -- logical database
+----------+
mysql> use mycatdb
mysql> show tables;
+-------------------+
| Tables in mycatdb |
+-------------------+
| company           |    -- these are all logical tables
| customer          |
| customer_addr     |
| employee          |
| goods             |
| hotnews           |
| orders            |
| order_items       |
| travelrecord      |
+-------------------+
9 rows in set (0.00 sec)

  • The username and password used to log in to the shard server are the ones defined in server.xml.

  • The logical database and logical tables on the shard server are the ones defined in schema.xml.

7. Sharding rules

7.1. Enumeration method (sharding-by-intfile)

  • The value of the sharding column must be one of the values defined in the rule's data file.
[root@mycat20 conf]# vim schema.xml
<table name="employee" primaryKey="ID" dataNode="dn1,dn2,dn3"
                           rule="sharding-by-intfile" />
## The employee table must have an ID column as its primary key, and its sharding rule is the enumeration method (sharding-by-intfile).
## Look at the details of the enumeration method
[root@mycat20 conf]# vim rule.xml
<tableRule name="sharding-by-intfile">
                <rule>
                        <columns>sharding_id</columns>    <!-- the employee table must have a column named sharding_id -->
                        <algorithm>hash-int</algorithm>        <!-- algorithm -->
                </rule>
....
## Definition of the algorithm
 <function name="hash-int"
                class="io.mycat.route.function.PartitionByFileMap">  <!-- algorithm class -->
                <property name="mapFile">partition-hash-int.txt</property>    <!-- data file that defines the allowed values -->
        </function>
## Look at the allowed values of the sharding_id column
[root@mycat20 conf]# vim partition-hash-int.txt
10000=0     --- maps to dn1
10010=1     --- maps to dn2
10020=2     --- maps to dn3
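
As a rough illustration of how such a map file drives routing, the following Python sketch parses the three lines shown above and looks up a data node. It is an assumption-laden model, not the PartitionByFileMap implementation: load_map and node_for are hypothetical names, and rejecting unknown values with an error is a simplification chosen here.

```python
# Contents of partition-hash-int.txt as shown in this article.
MAP_FILE_TEXT = """\
10000=0
10010=1
10020=2
"""

# Index 0, 1, 2 in the map file correspond to the dataNode list dn1,dn2,dn3.
DATA_NODES = ["dn1", "dn2", "dn3"]


def load_map(text):
    """Parse 'value=index' lines into a dict, skipping blank lines."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        value, _, index = line.partition("=")
        mapping[int(value)] = int(index)
    return mapping


def node_for(sharding_id, mapping, nodes=DATA_NODES):
    """Return the data node for an enumerated sharding_id value."""
    if sharding_id not in mapping:
        # Simplification for this sketch: unknown values are rejected.
        raise ValueError(f"sharding_id {sharding_id} is not listed in the map file")
    return nodes[mapping[sharding_id]]


mapping = load_map(MAP_FILE_TEXT)
print(node_for(10010, mapping))  # dn2
```

The lookup is a plain dictionary access, which is why the column may only take values that appear in the file.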

7.1.1. Restart mycat service

[root@mycat20 conf]# mycat stop;mycat start
Stopping Mycat-server...
Stopped Mycat-server.
Starting Mycat-server...

7.1.2. Create employee table

## Log in to the shard server and create the employee table
[root@mysql10 ~]# mysql -uroot -p1234 -h192.168.2.20 -P8066
mysql> create table employee(
    -> ID int primary key auto_increment,   -- column named in schema.xml
    -> sharding_id int not null,     -- column required by the sharding rule
    -> name char(20) not null,
    -> sex enum('boy','girl') not null,
    -> age int unsigned not null,
    -> homedir char(50) not null);
Query OK, 0 rows affected (0.08 sec)
mysql> desc employee;
+-------------+--------------------+------+-----+---------+----------------+
| Field       | Type               | Null | Key | Default | Extra          |
+-------------+--------------------+------+-----+---------+----------------+
| ID          | int(11)            | NO   | PRI | NULL    | auto_increment |
| sharding_id | int(11)            | NO   |     | NULL    |                |
| name        | char(20)           | NO   |     | NULL    |                |
| sex         | enum('boy','girl') | NO   |     | NULL    |                |
| age         | int(10) unsigned   | NO   |     | NULL    |                |
| homedir     | char(50)           | NO   |     | NULL    |                |
+-------------+--------------------+------+-----+---------+----------------+
6 rows in set (0.09 sec)
## After the employee table is created on the Mycat shard server, mysql30, mysql40 and mysql50 should each have the table; this is not shown here.

7.1.3. Verify the enumeration method

## Insert a row into employee; sharding_id is set to 10010, so the row is stored on mysql40.
mysql> insert into employee(sharding_id,name,sex,age,homedir) values(10010,'bob','boy',29,'china');
## Log in to mysql40 and check that the row is there.
mysql> select * from employee;
+----+-------------+------+-----+-----+---------+
| ID | sharding_id | name | sex | age | homedir |
+----+-------------+------+-----+-----+---------+
|  1 |       10010 | bob  | boy |  29 | china   |
+----+-------------+------+-----+-----+---------+
1 row in set (0.01 sec)
## You can also check the employee table on mysql30 and mysql50 (not shown here); neither should contain this row.

7.2. Modulo method (mod-long)

  • Rows are stored according to the remainder obtained when the value of the sharding column is divided by a configured number (the modulo operation).
## The hotnews table definition: its sharding rule is the modulo method.
[root@mycat20 conf]# vim schema.xml
<table name="hotnews"  dataNode="dn1,dn2,dn3"
                           rule="mod-long" />

## Look at the column name and the divisor defined for the modulo method
[root@mycat20 conf]# vim rule.xml
 <tableRule name="mod-long">
                <rule>
                        <columns>mod_id</columns>
                        <algorithm>mod-long</algorithm>
                </rule>
        </tableRule>
......
<function name="mod-long" class="io.mycat.route.function.PartitionByMod">
                <!-- how many data nodes -->
                <property name="count">3</property>   <!-- mod_id modulo 3: remainder 0 goes to dn1, 1 to dn2, 2 to dn3 -->
        </function>
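
The remainder-to-node mapping described in the comment above can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only, assuming non-negative integer mod_id values; node_for is a hypothetical helper, not part of Mycat.

```python
DATA_NODES = ["dn1", "dn2", "dn3"]  # same order as dataNode="dn1,dn2,dn3"


def node_for(mod_id, count=3):
    """Pick the data node whose index equals mod_id modulo count."""
    return DATA_NODES[mod_id % count]


for mod_id in range(6):
    print(mod_id, "->", node_for(mod_id))
# 0 -> dn1
# 1 -> dn2
# 2 -> dn3
# 3 -> dn1
# 4 -> dn2
# 5 -> dn3
```

Consecutive values rotate through the nodes, so evenly distributed ids give an even spread of rows. Changing count later changes the node for most existing values, which would require moving data.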

7.2.1. Restart mycat service

[root@mycat20 conf]# mycat stop; mycat start
Stopping Mycat-server...
Stopped Mycat-server.
Starting Mycat-server...

7.2.2. Create hotnews table

mysql> create  table hotnews( mod_id int  not null, title char(20) not null, worker char(15) not null);

7.2.3. Verifying the modulo method

## Insert a row into hotnews
mysql> insert into hotnews(mod_id,title,worker) values(1,'system','bob');
## mod_id is 1, and 1 divided by 3 leaves remainder 1, so the row is stored on dn2 (mysql40)
## Check the data on mysql40
mysql> select * from hotnews;
+--------+--------+--------+
| mod_id | title  | worker |
+--------+--------+--------+
|      1 | system | bob    |
+--------+--------+--------+
1 row in set (0.00 sec)
## The hotnews tables on mysql30 and mysql50 should contain no rows; this is not shown here.

8. Global tables without sharding (type=global)

## type=global means the table is not sharded: every row is stored on dn1, dn2 and dn3.
 <table name="goods" primaryKey="ID" type="global" dataNode="dn1,dn2,dn3" />
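
The effect of type=global can be pictured as a write that is repeated on every data node. The Python model below is a toy illustration under that assumption; GlobalTable is a hypothetical class that keeps rows in memory and does not talk to any database.

```python
class GlobalTable:
    """Toy model of a type=global table: every node keeps a full copy."""

    def __init__(self, nodes):
        self.copies = {node: [] for node in nodes}

    def insert(self, row):
        # Broadcast the write: each data node stores its own copy of the row.
        for rows in self.copies.values():
            rows.append(dict(row))


goods = GlobalTable(["dn1", "dn2", "dn3"])
goods.insert({"name": "bob", "sex": "boy", "age": 25})
goods.insert({"name": "andy", "sex": "boy", "age": 19})

# Every node holds the same rows.
print(all(rows == goods.copies["dn1"] for rows in goods.copies.values()))  # True
```

This is the opposite of the sharded tables in section 7, where each row lands on exactly one node.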

8.1. Create table goods and insert data

mysql> create table goods(
    -> ID int primary key auto_increment,
    -> name char(15)not null,
    -> sex enum('boy','girl')not null,
    -> age int unsigned not null);
Query OK, 0 rows affected (0.02 sec)
## Insert data
mysql> insert into goods(name,sex,age) values('bob','boy',25),('andy','boy',19),('lucy','girl',18);

8.2. Verification:

[root@mysql10 ~]# mysql -umycat -p1234 -h192.168.2.30 -e "select * from db1.goods"
mysql: [Warning] Using a password on the command line interface can be insecure.
+----+------+------+-----+
| ID | name | sex  | age |
+----+------+------+-----+
|  1 | bob  | boy  |  25 |
|  2 | andy | boy  |  19 |
|  3 | lucy | girl |  18 |
+----+------+------+-----+
[root@mysql10 ~]# mysql -umycat -p1234 -h192.168.2.40 -e "select * from db2.goods"
mysql: [Warning] Using a password on the command line interface can be insecure.
+----+------+------+-----+
| ID | name | sex  | age |
+----+------+------+-----+
|  1 | bob  | boy  |  25 |
|  2 | andy | boy  |  19 |
|  3 | lucy | girl |  18 |
+----+------+------+-----+
[root@mysql10 ~]# mysql -umycat -p1234 -h192.168.2.50 -e "select * from db3.goods"
mysql: [Warning] Using a password on the command line interface can be insecure.
+----+------+------+-----+
| ID | name | sex  | age |
+----+------+------+-----+
|  1 | bob  | boy  |  25 |
|  2 | andy | boy  |  19 |
|  3 | lucy | girl |  18 |
+----+------+------+-----+

Origin blog.csdn.net/weixin_45625174/article/details/129120837