First experience with PolarDB-X distributed database

1. What is a distributed database?

A distributed database is defined in contrast to a single-node database.
Simply put, it is a database implemented on a distributed architecture.

Current distributed databases can be divided into three technical directions:

  • The first is sharding, represented by DRDS, TDSQL, and others. Its biggest advantage is that it inherits years of accumulated engineering from MySQL storage;

  • The second is NewSQL, represented by CockroachDB/YugabyteDB/TiDB. Its biggest advantage is a fully self-developed technology stack that provides horizontal scaling and native distributed capabilities, focusing on Raft/Paxos-based high availability and strongly consistent distributed transactions. This direction typically targets users who require data consistency in a distributed setting;

  • The third is cloud-native databases, represented by PolarDB/Aurora, characterized by cloud-based virtualization technology and the ability to pool resources.

We won't go into how impressive PolarDB-X is here; if you are interested, you can look it up yourself.

Next, this article mainly introduces the PolarDB-X architecture and installation method.

2. What does PolarDB-X look like?

2.1 Product Architecture

(Figure: PolarDB-X product architecture)

The core architecture of PolarDB-X consists of four components:

  • CN (full name: Compute Node, codenamed GalaxySQL) provides the distributed SQL engine, covering distributed transaction coordination, the query optimizer, the executor, and so on.
  • DN (full name: Data Node, codenamed GalaxyEngine) provides the data storage engines, such as InnoDB and self-developed engines (X-Engine and a mysterious columnar store). It is responsible for data consistency and durability, offers computation push-down to meet distributed requirements, and supports both local disks and shared storage.
  • GMS (full name: Global Meta Service) provides distributed metadata and global timing services, such as TSO and table metadata.
  • CDC (full name: Change Data Capture, codenamed GalaxyCDC) is responsible for generating, distributing, and serving subscriptions to the global incremental log. Through GalaxyCDC, PolarDB-X can expose incremental logs fully compatible with the MySQL binlog format and protocol, enabling seamless integration with downstream MySQL binlog ecosystem tools.

2.2 Physical topology

(Figure: PolarDB-X physical topology)

PolarDB-X provides a management backend and a user console for interaction, and exposes an OpenAPI so that users can build their own management and control integrations on top of it.

PolarDB-X is delivered as database instances, with specifications such as 8c32g and 32c128g.

Physically, an instance consists of four types of resources:

  • Three replicas of GMS
  • A set of CN nodes (compute nodes)
  • A set of DN nodes (storage nodes)
  • A set of CDC nodes (providing the global binlog)

The components exchange metadata and RPC requests with one another.

The biggest difference between instance specifications is the number of CN/DN nodes; the specification scales linearly with the node count.

Externally, a PolarDB-X instance is accessed through the VIP/DNS of its endpoint. To the user it looks and feels like a single MySQL instance, accessible with the MySQL command line, GUI clients, and other standard tools.

3. Deploy PolarDB-X

PolarDB-X can be deployed via the PXD tool, via K8S, or by compiling from source, among other methods.

Next, installation and usage are demonstrated using the PXD deployment method.

Deploying a PolarDB-X database with the PXD tool requires Python3 and Docker to be installed first.

Install Python3: on macOS you can run `brew install python`; more detailed installation instructions are easy to find online.
To install Docker Desktop for Mac, refer to the documentation: https://docs.docker.com/desktop/mac/install/

3.1 Install PXD

python3 -m venv venv
source venv/bin/activate

# It is recommended to upgrade pip before installing
pip install --upgrade pip

pip install pxd

3.2 Deploy PolarDB-X

Running the `pxd tryout` command directly creates a PolarDB-X database of the latest version, including one node each of GMS, CN, DN, and CDC:

pxd tryout

You can also specify the number and version of CN, DN, and CDC nodes. The command is as follows:

pxd tryout -cn_replica 1 -cn_version latest -dn_replica 1 -dn_version latest -cdc_replica 1 -cdc_version latest

The GMS and DN created in tryout mode use single-replica mode by default. To create a Paxos-based three-replica cluster instead, use the following command:

pxd tryout -leader_only false
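The three-replica mode relies on Paxos-style majority consensus: a group of n replicas can commit writes and stay available as long as a majority of them are alive. A minimal sketch of that arithmetic (a conceptual illustration only, not PolarDB-X internals):

```python
# Majority-quorum arithmetic behind a Paxos-based replica group
# (conceptual sketch, not PolarDB-X code).

def majority(n: int) -> int:
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1

def tolerated_failures(n: int) -> int:
    """Replicas that can fail while a majority remains reachable."""
    return n - majority(n)

# A 3-replica GMS/DN group needs 2 replicas to commit a write
# and keeps serving with 1 replica down.
print(majority(3), tolerated_failures(3))  # 2 1
# The single-replica tryout default has no failure tolerance.
print(majority(1), tolerated_failures(1))  # 1 0
```

This is why the tryout default (`leader_only: True`, one GMS/DN replica) is fine for a local test drive but offers no high availability.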

Sample output:

pxd tryout
/Users/lanyangyang/workspace/venv/lib/python3.9/site-packages/deployer
Start creating PolarDB-X cluster pxc-tryout on your local machine
PolarDB-X Cluster params:
 * cn count: 1, version: latest
 * dn count: 1, version: latest
 * cdc count: 1, version: latest
 * gms count: 1, version: latest
 * leader_only: True
Processing  [------------------------------------]    0%    pre check
Processing  [##----------------------------------]    7%    generate topology
Processing  [#####-------------------------------]   15%    check docker engine version
Processing  [########----------------------------]   23%    pull images
Pull image: polardbx/galaxysql:latest at 127.0.0.1


latest:Pulling from polardbx/galaxysql
... ...

Status: Downloaded newer image for polardbx/galaxycdc:latest
Processing  [###########-------------------------]   30%    create gms node



Processing  [#############-----------------------]   38%    create gms db and tables
Processing  [################--------------------]   46%    create PolarDB-X root account
Processing  [###################-----------------]   53%    create dn
Processing  [######################--------------]   61%    register dn to gms
Processing  [########################------------]   69%    create cn
Processing  [###########################---------]   76%    wait cn ready
Processing  [##############################------]   84%    create cdc containers
Processing  [#################################---]   92%    wait PolarDB-X ready
Processing  [####################################]  100%


PolarDB-X cluster create successfully, you can try it out now.
Connect PolarDB-X using the following command:

    mysql -h127.0.0.1 -P62450 -upolardbx_root -pxxxx

After the PolarDB-X database is created, the corresponding connection information is printed.

Connect to the PolarDB-X instance with the MySQL command line and run the following SQL statements to get a first look at its distributed features.

Check GMS

> select * from information_schema.schemata;
+--------------+--------------------+----------------------------+------------------------+----------+--------------------+
| CATALOG_NAME | SCHEMA_NAME        | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME | SQL_PATH | DEFAULT_ENCRYPTION |
+--------------+--------------------+----------------------------+------------------------+----------+--------------------+
| def          | information_schema | utf8                       | UTF8_GENERAL_CI        | NULL     | NO                 |
+--------------+--------------------+----------------------------+------------------------+----------+--------------------+
1 row in set (1.59 sec)

Create a partitioned table

create database polarx_example partition_mode='partitioning';

use polarx_example;

create table example (
  `id` bigint(11) auto_increment NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `score` bigint(11) DEFAULT NULL,
  primary key (`id`)
) engine=InnoDB default charset=utf8 
partition by hash(id) 
partitions 8;
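Conceptually, `partition by hash(id) partitions 8` routes each row to one of 8 partitions by hashing the partition key, which is what spreads the table across DN storage. A rough Python sketch of the idea (the actual hash function PolarDB-X applies may differ; this only illustrates the routing concept):

```python
# Conceptual sketch of hash partitioning: route each row to one of
# 8 partitions based on its partition key. A simple modulo stands in
# for the real hash function, which PolarDB-X does not expose here.

PARTITIONS = 8

def route(id_value: int) -> int:
    """Return the partition index (0..PARTITIONS-1) for a given id."""
    return id_value % PARTITIONS  # stand-in for the real hash

rows = [(1, "lily", 375), (2, "lisa", 400), (3, "ljh", 500)]
for id_value, name, score in rows:
    # Partition names in the topology output are 1-based (p1..p8).
    print(f"id={id_value} -> partition p{route(id_value) + 1}")
```

With 8 partitions on a single DN (as in this tryout cluster), all partitions land on `pxc-tryout-dn-0`; on a multi-DN instance they would be spread across storage nodes, as `show topology` below makes visible.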

Insert data

insert into example values(null,'lily',375),(null,'lisa',400),(null,'ljh',500);
mysql> select * from example;
+----+------+-------+
| id | name | score |
+----+------+-------+
|  3 | ljh  |   500 |
|  1 | lily |   375 |
|  2 | lisa |   400 |
+----+------+-------+
3 rows in set (0.09 sec)
mysql> show topology from example;
+------+-----------------------------+---------------+----------------+-----------------------+-----------------+
| ID   | GROUP_NAME                  | TABLE_NAME    | PARTITION_NAME | PHY_DB_NAME           | DN_ID           |
+------+-----------------------------+---------------+----------------+-----------------------+-----------------+
|    0 | POLARX_EXAMPLE_P00000_GROUP | example_00000 | p1             | polarx_example_p00000 | pxc-tryout-dn-0 |
|    1 | POLARX_EXAMPLE_P00000_GROUP | example_00001 | p2             | polarx_example_p00000 | pxc-tryout-dn-0 |
|    2 | POLARX_EXAMPLE_P00000_GROUP | example_00002 | p3             | polarx_example_p00000 | pxc-tryout-dn-0 |
|    3 | POLARX_EXAMPLE_P00000_GROUP | example_00003 | p4             | polarx_example_p00000 | pxc-tryout-dn-0 |
|    4 | POLARX_EXAMPLE_P00000_GROUP | example_00004 | p5             | polarx_example_p00000 | pxc-tryout-dn-0 |
|    5 | POLARX_EXAMPLE_P00000_GROUP | example_00005 | p6             | polarx_example_p00000 | pxc-tryout-dn-0 |
|    6 | POLARX_EXAMPLE_P00000_GROUP | example_00006 | p7             | polarx_example_p00000 | pxc-tryout-dn-0 |
|    7 | POLARX_EXAMPLE_P00000_GROUP | example_00007 | p8             | polarx_example_p00000 | pxc-tryout-dn-0 |
+------+-----------------------------+---------------+----------------+-----------------------+-----------------+
8 rows in set (0.01 sec)

Check CDC

mysql> show master status ;
+---------------+----------+--------------+------------------+-------------------+
| FILE          | POSITION | BINLOG_DO_DB | BINLOG_IGNORE_DB | EXECUTED_GTID_SET |
+---------------+----------+--------------+------------------+-------------------+
| binlog.000001 |        4 |              |                  |                   |
+---------------+----------+--------------+------------------+-------------------+
1 row in set (0.78 sec) 

show binlog events in 'binlog.000001' from 4;

Check DN

mysql> show storage;
+-----------------+------------------+------------+-----------+----------+-------------+--------+-----------+-------+--------+
| STORAGE_INST_ID | LEADER_NODE      | IS_HEALTHY | INST_KIND | DB_COUNT | GROUP_COUNT | STATUS | DELETABLE | DELAY | ACTIVE |
+-----------------+------------------+------------+-----------+----------+-------------+--------+-----------+-------+--------+
| pxc-tryout-dn-0 | 172.17.0.3:16689 | true       | MASTER    | 1        | 1           | 0      | false     | null  | null   |
| pxc-tryout-gms  | 172.17.0.2:17415 | true       | META_DB   | 2        | 2           | 0      | false     | null  | null   |
+-----------------+------------------+------------+-----------+----------+-------------+--------+-----------+-------+--------+
2 rows in set (0.02 sec)

Check CN

mysql> show mpp;
+------------+------------------+------+--------+
| ID         | NODE             | ROLE | LEADER |
+------------+------------------+------+--------+
| pxc-tryout | 172.17.0.4:62452 | W    | Y      |
+------------+------------------+------+--------+
1 row in set (0.01 sec)

Check PolarDB-X status

Execute the following command to view the PolarDB-X list of the current environment:

(venv)  ~/workspace> pxd list
/Users/lanyangyang/workspace/venv/lib/python3.9/site-packages/deployer
NAME                          CN        DN        CDC       STATUS
pxc-tryout                    1         1         1         running

To clean up PolarDB-X, execute the following command to remove all PolarDB-X clusters from the local environment:

(venv)  ~/workspace > pxd cleanup
/Users/lanyangyang/workspace/venv/lib/python3.9/site-packages/deployer
Prepare to delete all PolarDB-X clusters
All PolarDB-X clusters will be deleted, do you want to continue? [y/N]: y
Prepare to delete PolarDB-X cluster: pxc-tryout
stop and remove container: pxc-tryout-cn-DmUZ-62450, id: e04d9b3925 at 127.0.0.1
stop and remove container: pxc-tryout-cdc-Ssvv, id: f3bfd66e1f at 127.0.0.1
stop and remove container: pxc-tryout-gms-Cand-17415, id: b26d81089d at 127.0.0.1
stop and remove container: pxc-tryout-dn-0-Cand-16689, id: ea2f00e4ea at 127.0.0.1

4. References

GitHub repository

PolarDB-X official Zhihu account

PolarDB-X open-source course series

Documentation

Origin blog.csdn.net/lanyang123456/article/details/128124566