Database sharding and table sharding middleware - MyCat configuration and use

1. What is MyCat

What is MyCat? In short, MyCat is database middleware. Applications connect to MyCat instead of connecting to the database directly, and MyCat routes each request to the back-end database cluster according to its configured rules. If you only care about using it and not about the architecture, you can simply treat it as a database.
The following is an excerpt from the official documentation.
By definition and classification, it is an open source distributed database system and a server that implements the MySQL protocol. Front-end users can treat it as a database proxy and access it with MySQL client tools and the command line, while its back end can communicate with multiple MySQL servers using the native MySQL protocol, or with most mainstream database servers (including Oracle, DB2, SQL Server, MongoDB, etc.) over JDBC. Its core function is database and table sharding: a large table is split horizontally into N smaller tables, which are stored on back-end MySQL servers or other databases.

For DBAs, Mycat can be understood like this:

Mycat is a MySQL Server, and the MySQL servers connected behind Mycat act like MySQL storage engines such as InnoDB or MyISAM. Mycat itself therefore stores no data; the data lives on the back-end MySQL servers, so data reliability and transactions are guaranteed by MySQL. Simply put, Mycat is MySQL's best companion, and to a certain extent it gives MySQL the ability to compete with Oracle.

For software engineers, Mycat can be understood like this:

Mycat is a database server that is roughly equivalent to MySQL. You can connect to Mycat the same way you connect to MySQL (the only difference is the port: Mycat defaults to 8066 rather than MySQL's 3306, so the connection string must include the port). In most cases you can use Mycat with the object mapping framework you already know, but for sharded tables basic SQL statements are recommended, because they give the best performance, especially with tens of millions or even tens of billions of records.
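
As a rough illustration of the port difference, the sketch below builds the settings a MySQL client would need in order to reach Mycat. The helper name mycat_connect_args is made up for this example, the host, user, password, and schema values are sample values, and the commented-out pymysql call assumes that driver is installed.

```python
# Hypothetical helper: builds keyword arguments for a MySQL driver's connect() call.
MYCAT_DEFAULT_PORT = 8066  # Mycat's default service port (MySQL itself uses 3306)


def mycat_connect_args(host, user, password, schema, port=MYCAT_DEFAULT_PORT):
    """Return connection settings that point a MySQL client at Mycat."""
    return {
        "host": host,
        "port": port,        # the only difference from a plain MySQL connection
        "user": user,
        "password": password,
        "database": schema,  # a Mycat logical schema, not a physical database
    }


# Sample values only; replace them with your own Mycat user and schema.
args = mycat_connect_args("127.0.0.1", "root", "123456", "TESTDB")
print(args["host"], args["port"], args["database"])

# With a driver such as PyMySQL installed (an assumption), this could be used as:
#   conn = pymysql.connect(**args)
```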

For architects, Mycat can be understood like this:

Mycat is a powerful database middleware. It can be used not only for read/write splitting, database and table sharding, and disaster recovery backup, but also for multi-tenant application development and cloud platform infrastructure, which gives your architecture strong adaptability and flexibility. With the upcoming Mycat intelligent optimization module, the system's data access bottlenecks and hot spots become visible at a glance. Based on these statistics you can automatically or manually adjust the back-end storage and map different tables to different storage engines, without changing a single line of application code.

For an in-depth and comprehensive understanding of MyCat, read the documentation on the official website: http://www.mycat.io/document/Mycat_V1.6.0.pdf

2. Under what circumstances do you need to use MyCat

If you are looking into MyCat on purpose, you probably already know what it does, or you have a very large amount of data and are worried about scaling storage performance. In other words, you want to split your data across databases and tables.

Data sharding can be divided into two modes according to the type of sharding rule. One splits different tables (or schemas) into different databases (hosts); this is called vertical splitting. The other splits the data of a single table into multiple databases (hosts) according to some condition based on the logical relationships in the data; this is called horizontal splitting.
The biggest advantage of vertical splitting is that the rules are simple and it is easy to implement. It is especially suitable for systems in which coupling between services is very low, the services affect each other little, and the business logic is very clear. In such a system it is easy to move the tables used by different business modules into different databases. Splitting by table has less impact on the application, and the splitting rules are simpler and clearer.
Horizontal splitting is somewhat more complicated. Because different rows of the same table must go to different databases, the splitting rule is more complex for the application than splitting by table name, and later data maintenance is more complex as well.
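
The difference between the two modes can be sketched in a few lines. This is only an illustration, not how MyCat is implemented: the table names, database names, and the modulo rule are all assumptions chosen for the example.

```python
# Hypothetical table-to-database map for vertical splitting:
# each whole table lives in exactly one database.
VERTICAL_MAP = {
    "orders": "order_db",
    "products": "catalog_db",
}

# Hypothetical shard list for horizontal splitting:
# the rows of one table are spread over several databases.
USER_SHARDS = ["db01", "db02", "db03"]


def route_vertical(table):
    """Vertical split: the table name alone decides the target database."""
    return VERTICAL_MAP[table]


def route_horizontal(row_id, shards=USER_SHARDS):
    """Horizontal split: a value inside the row decides (here, id modulo shard count)."""
    return shards[row_id % len(shards)]


print(route_vertical("orders"))  # order_db
print(route_horizontal(1))       # db02
print(route_horizontal(3))       # db01
```

With vertical splitting the application only needs to know which database holds a table; with horizontal splitting every query must also carry (or be expanded over) the sharding value, which is where the extra complexity comes from.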

MyCat is middleware that helps with database splitting, especially horizontal splitting.

That concludes the introduction. The following sections cover installing, configuring, and using MyCat.

3. Installation and configuration

First, prepare the environment: JDK 1.7 or above and MySQL 5.5 or above. Setting these up is not covered here; please prepare them yourself.
My installation environment is Linux. On Windows, download the Windows package instead. Both are simply unpacked, and the configuration files are the same.

1. Installation

Download from the official address:
http://dl.mycat.io/1.6-RELEASE/
For Linux, choose Mycat-server-1.6-RELEASE-20161028204710-linux.tar.gz, upload it to the Linux server, and extract it with tar -xvf Mycat-server-1.6-RELEASE-20161028204710-linux.tar.gz.

[mysql@localhost ~]$ ls
mycat  Mycat-server-1.6-RELEASE-20161028204710-linux.tar.gz
[mysql@localhost ~]$ cd mycat/
[mysql@localhost mycat]$ ls
bin  catlet  conf  lib  logs  version.txt

The mycat directory contains the following:

Directory  Description
bin        Program directory, for example ./mycat console. The mycat script supports the commands {console, start, stop, restart, status, dump}.
conf       Configuration directory. server.xml holds MyCat server tuning parameters and user authorization; schema.xml defines the logical databases, tables, and shards; rule.xml defines the sharding rules, and the parameters of some sharding rules are stored in separate files in this directory. After changing a configuration file, restart MyCat or reload through port 9066.
lib        Dependency jar files.
logs       Log directory. Logs are written to logs/mycat.log, one file per day. Logging is configured in conf/log4j.xml; you can set the output level to debug to get more information, which helps with troubleshooting.

Note: MySQL on Linux treats table names as case-sensitive by default. Set lower_case_table_names=1 in /etc/my.cnf so that MySQL ignores table-name case on Linux; otherwise MyCAT will report that a table cannot be found.
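
Before starting MyCat it can be handy to confirm that this setting is really present. The sketch below checks a my.cnf-style text for it using Python's standard configparser module. The sample file content is made up, and the check only handles simple files that spell the option with underscores under a [mysqld] section.

```python
import configparser

# Sample my.cnf content, made up for this sketch.
SAMPLE_MY_CNF = """
[mysqld]
datadir=/var/lib/mysql
lower_case_table_names=1
"""


def table_names_case_insensitive(cnf_text):
    """Return True if the [mysqld] section sets lower_case_table_names=1."""
    parser = configparser.ConfigParser(
        allow_no_value=True,  # my.cnf allows bare flags without a value
        strict=False,         # tolerate repeated sections or options
        interpolation=None,   # treat '%' in values literally
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return False
    return parser.get("mysqld", "lower_case_table_names", fallback="0") == "1"


print(table_names_case_insensitive(SAMPLE_MY_CNF))
```

To check a real file, read it first, for example with open("/etc/my.cnf").read(), and pass the text to the function.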

2. Environment configuration

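When deploying and starting MyCAT on Linux, first set MYCAT_HOME in the system environment variables, as follows:
1) Run vi /etc/profile and add MYCAT_HOME=<MyCat installation directory> to the system environment file
2) Run source /etc/profile so that the environment variable takes effect

With this in place, go to the mycat/bin directory and run:
./mycat start
This starts the mycat service.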

[root@localhost mycat]# cd bin
[root@localhost bin]# ./mycat start
Starting Mycat-server...
[root@localhost bin]#

4. Simple use

1. Test data preparation

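Suppose we want to split the users table horizontally into three parts, locating each row by its id modulo 3.
We create three databases in the same MySQL instance and create the same table in each of them. The statements are as follows: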

create database db01;  
create database db02;  
create database db03;  
-- Create the users table in each of the three databases above
CREATE TABLE users (  
    id INT NOT NULL,  
    name varchar(50) NOT NULL default '',  
    indate DATETIME NOT NULL default '0000-00-00 00:00:00',  
    PRIMARY KEY (id)  
)AUTO_INCREMENT= 1 ENGINE=InnoDB DEFAULT CHARSET=utf8;  

Once this is done, each of db01, db02, and db03 should contain an empty users table.

2. MyCat configuration

server.xml

server.xml holds the system information MyCat needs. Here we only need to change the user name, password, and schema used for access.

    <user name="root">
        <property name="password">123456</property>
        <property name="schemas">TESTDB</property>      
    </user>

    <user name="user">
        <property name="password">user</property>
        <property name="schemas">TESTDB</property>
        <property name="readOnly">true</property>
    </user>

These are the credentials your database client uses to connect.

schema.xml

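schema.xml is one of the most important MyCat configuration files. It manages MyCat's logical databases, tables, sharding rules, DataNodes, and DataSources. Here we only show a simple application-level setup.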

    <schema name="TESTDB" checkSQLschema="false" sqlMaxLimit="100">  
        <table name="users" primaryKey="id" dataNode="node_db01,node_db02,node_db03" rule="idMod"/>  
    </schema>  

    <!-- Map each dataNode to a database and to the dataHost that MyCat connects to -->
    <dataNode name="node_db01" dataHost="dataHost01" database="db01" />  
    <dataNode name="node_db02" dataHost="dataHost01" database="db02" />  
    <dataNode name="node_db03" dataHost="dataHost01" database="db03" />  

    <!-- Physical host behind the logical dataHost, including the MySQL login information -->
    <dataHost name="dataHost01" maxCon="1000" minCon="10" balance="0" writeType="0" dbType="mysql" dbDriver="native">  
            <heartbeat>select user()</heartbeat>  
            <writeHost host="server1" url="127.0.0.1:3306" user="root" password="rootroot"/>  
    </dataHost>  

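The schema name must match the one configured in server.xml, here TESTDB. Each table element declares a table to be sharded; its dataNode attribute must list the dataNode names defined below it, one per shard; rule is the routing rule and must match a rule name in rule.xml. The writeHost element at the bottom holds the address, port, user name, and password of the real database.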
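
Because these names must line up across several elements, a typo is easy to make. The following sketch looks for references that point nowhere. It is not a MyCat tool: find_dangling_references is a made-up helper, and the XML is a trimmed copy of the fragments above wrapped in a placeholder root element so that it parses on its own.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the schema.xml fragments, wrapped in a placeholder <root> element.
SCHEMA_XML = """
<root>
  <schema name="TESTDB" checkSQLschema="false" sqlMaxLimit="100">
    <table name="users" primaryKey="id" dataNode="node_db01,node_db02,node_db03" rule="idMod"/>
  </schema>
  <dataNode name="node_db01" dataHost="dataHost01" database="db01"/>
  <dataNode name="node_db02" dataHost="dataHost01" database="db02"/>
  <dataNode name="node_db03" dataHost="dataHost01" database="db03"/>
  <dataHost name="dataHost01" dbType="mysql" dbDriver="native"/>
</root>
"""


def find_dangling_references(xml_text):
    """List table->dataNode and dataNode->dataHost references that resolve to nothing."""
    root = ET.fromstring(xml_text)
    hosts = {h.get("name") for h in root.iter("dataHost")}
    nodes = {n.get("name"): n.get("dataHost") for n in root.iter("dataNode")}
    problems = []
    for table in root.iter("table"):
        for node in table.get("dataNode", "").split(","):
            if node.strip() not in nodes:
                problems.append(f"table {table.get('name')}: unknown dataNode {node.strip()}")
    for name, host in nodes.items():
        if host not in hosts:
            problems.append(f"dataNode {name}: unknown dataHost {host}")
    return problems


# An empty list means every reference resolves.
print(find_dangling_references(SCHEMA_XML))
```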

rule.xml

rule.xml defines the rules used to split tables. Different tables can use different sharding algorithms, or the same algorithm with different parameters. The file mainly contains two kinds of tags, tableRule and function, and you can add more of either as needed.

    <tableRule name="idMod">
        <rule>
            <columns>id</columns>
            <algorithm>mod-long</algorithm>
        </rule>
    </tableRule>

The name attribute is a unique name that identifies the table rule.
The nested rule tag specifies which column of the physical table to split on and which routing algorithm to use.
columns gives the name of the column to split on.
algorithm refers to the name attribute of a function tag, which links the table rule to a specific routing algorithm. Several table rules can of course point to the same routing algorithm. The rule name is then referenced from a table tag so that the logical table is sharded with it.

    <function name="mod-long" class="io.mycat.route.function.PartitionByMod">
        <!-- how many data nodes -->
        <property name="count">3</property>
    </function>

name is the name of the algorithm (for mod-long, the count property is the number of databases to shard across).
class is the fully qualified class name of the routing algorithm (you can plug in your own routing algorithm here).
property holds the parameters that the specific algorithm needs.
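
To see how the pieces link together, the sketch below follows the chain tableRule -> algorithm -> function -> count and then applies a plain modulo to the sharding column. It assumes mod-long simply computes id modulo count, as the configuration suggests; it does not run MyCat's own PartitionByMod class, and the placeholder root element is added only so that the two fragments parse together.

```python
import xml.etree.ElementTree as ET

# The two rule.xml fragments, wrapped in a placeholder <root> element.
RULE_XML = """
<root>
  <tableRule name="idMod">
    <rule>
      <columns>id</columns>
      <algorithm>mod-long</algorithm>
    </rule>
  </tableRule>
  <function name="mod-long" class="io.mycat.route.function.PartitionByMod">
    <property name="count">3</property>
  </function>
</root>
"""


def shard_index(rule_name, row, xml_text=RULE_XML):
    """Resolve the rule's column and count, then return row[column] modulo count."""
    root = ET.fromstring(xml_text)
    rule = root.find(f"tableRule[@name='{rule_name}']/rule")
    column = rule.findtext("columns")
    algorithm = rule.findtext("algorithm")
    function = root.find(f"function[@name='{algorithm}']")
    count = int(function.findtext("property[@name='count']"))
    return row[column] % count


for user_id in (1, 2, 3):
    print(user_id, "-> shard index", shard_index("idMod", {"id": user_id}))
```

Under this assumption ids 1, 2, and 3 map to shard indexes 1, 2, and 0. Note that count should equal the number of dataNodes listed for the table in schema.xml.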

3. Verify

With everything configured, restart MyCat.
Then connect to MyCat directly with the mysql command-line client; the default port is 8066.

[root@localhost bin]# ./mycat start
Starting Mycat-server...

[root@localhost bin]# mysql -uroot -p123456 -h127.0.0.1 -P8066 -DTESTDB
mysql> show databases;

+----------+
| DATABASE |
+----------+
| TESTDB   |
+----------+
1 row in set (0.01 sec)

mysql> show tables;

+------------------+
| Tables in TESTDB |
+------------------+
| users            |
+------------------+
1 row in set (0.00 sec)

Next, insert some rows and query them back.

mysql> insert into users(id,name,indate) values(1,'lvbu',now()); 
mysql> insert into users(id,name,indate) values(2,'zhaoyun',now()); 
mysql> insert into users(id,name,indate) values(3,'dianwei',now()); 

mysql> select * from users order by id;

+----+---------+---------------------+
| id | name    | indate              |
+----+---------+---------------------+
|  1 | lvbu    | 2018-04-20 03:30:56 |
|  2 | zhaoyun | 2018-04-20 03:31:05 |
|  3 | dianwei | 2018-04-20 03:31:15 |
+----+---------+---------------------+

Then connect directly to the real back-end MySQL instance to check how the data is distributed.

[root@localhost bin]# mysql -uroot -prootroot
mysql> select * from db01.users;

+----+---------+---------------------+
| id | name    | indate              |
+----+---------+---------------------+
|  3 | dianwei | 2018-04-20 03:31:15 |
+----+---------+---------------------+

mysql> select * from db02.users;

+----+------+---------------------+
| id | name | indate              |
+----+------+---------------------+
|  1 | lvbu | 2018-04-20 03:30:56 |
+----+------+---------------------+

mysql> select * from db03.users;

+----+---------+---------------------+
| id | name    | indate              |
+----+---------+---------------------+
|  2 | zhaoyun | 2018-04-20 03:31:05 |
+----+---------+---------------------+

The rows are spread evenly across the three databases, which shows that the sharding rule is working.

5. Summary

In large-scale distributed systems, distributed databases become the general trend as data volumes grow. MyCat is only a convenient tool; what matters more is the sharding strategy, the routing rules, and planning ahead. Thanks for reading.
