I. Introduction
In the earlier chapters of this series:
Spring Boot integrates ShardingSphere to implement data sharding (1) | Spring Cloud 40
Spring Boot integrates ShardingSphere to implement data sharding (2) | Spring Cloud 41
Spring Boot integrates ShardingSphere to implement data sharding (3) | Spring Cloud 42
Spring Boot integrates ShardingSphere to implement read-write splitting | Spring Cloud 43
ShardingSphere 5.3 Series Spring Configuration Upgrade Guide | Spring Cloud 47
we took a detailed look at ShardingSphere's data sharding, the use of each sharding algorithm, read-write splitting, data encryption, data masking, and the latest version upgrade. Today we continue with another ShardingSphere product: ShardingSphere-Proxy.
II. ShardingSphere-Proxy
ShardingSphere-Proxy is positioned as a transparent database proxy. It implements the database binary protocol in order to support heterogeneous languages. It currently provides the MySQL and PostgreSQL protocols, so the database can be operated transparently, which is friendlier to DBAs.
- It is completely transparent to applications and can be used directly as if it were MySQL/PostgreSQL;
- It is compatible with databases based on the MySQL protocol, such as MariaDB, and with databases based on the PostgreSQL protocol, such as openGauss;
- It works with any client compatible with the MySQL/PostgreSQL protocol, such as the MySQL Command Client, MySQL Workbench, and Navicat.
2.1 Product features
Feature | Definition |
---|---|
Data sharding | Data sharding is an effective way to handle massive data storage and computing. ShardingSphere provides a distributed database solution built on the underlying databases that can scale computing and storage horizontally. |
Distributed transaction | Transactions are the key technology for guaranteeing database integrity and safety, and a core database technology. With a hybrid transaction engine based on XA and BASE, ShardingSphere provides distributed transactions on top of independent databases, ensuring data safety across data sources. |
Read-write splitting | Read-write splitting is a way to cope with high-pressure business access. Based on its understanding of SQL semantics and its awareness of the underlying database topology, ShardingSphere provides flexible read-write traffic splitting and load balancing of read traffic. |
High availability | High availability is a basic requirement for data storage and computing platforms. ShardingSphere provides distributed high availability based on database clusters in native or Kubernetes environments. |
Data migration | Data migration is a key capability for connecting data ecosystems. ShardingSphere provides data migration across data sources and supports re-sharding for scale-out. |
Federated query | Federated query is an effective way to use data in a complex data environment. ShardingSphere provides complex query and analysis across data sources, enabling cross-source joins and aggregation. |
Data encryption | Data encryption is a basic means of ensuring data security. ShardingSphere provides a complete, transparent, secure, and low-cost data encryption solution. |
Shadow database | In full-link stress testing scenarios, ShardingSphere isolates data under different workloads so that test data does not pollute the production environment. |
2.2 Comparison with ShardingSphere-JDBC
Dimension | ShardingSphere-JDBC | ShardingSphere-Proxy |
---|---|---|
Database | Any | MySQL/PostgreSQL |
Connection consumption | High | Low |
Heterogeneous languages | Java only | Any |
Performance | Low overhead | Slightly higher overhead |
Decentralized | Yes | No |
Static entry | No | Yes |
III. Deployment and use
This article deploys ShardingSphere-Proxy with Docker, which requires no additional dependencies. The deployed version is 5.3.2. If ShardingSphere-Proxy is deployed from the binary distribution package instead, the environment must have Java JRE 8 or a higher version.
3.1 Operation steps
3.1.1 Get the docker image
docker pull apache/shardingsphere-proxy:5.3.2
3.1.2 Configure conf/server.yaml and conf/config-*.yaml
The configuration file templates can be obtained from the Docker container and copied to any directory on the host:
docker run -d --name tmp --entrypoint=bash apache/shardingsphere-proxy:5.3.2
docker cp tmp:/opt/shardingsphere-proxy/conf /root/apps/shardingsphere-proxy
docker rm tmp
Since the network environment inside the container may differ from that of the host machine, if errors such as failure to connect to the database are reported at startup, make sure that the database IP specified in the conf/config-*.yaml configuration files is reachable from inside the Docker container.
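The reachability requirement above can be checked before starting the proxy. The following is a minimal Python sketch, not part of ShardingSphere: `can_reach` is a hypothetical helper, and it assumes Python 3 is available on a host or container attached to the same Docker network as the proxy (the proxy image itself is not assumed to ship Python).

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


# Example usage with the backend addresses from config-sharding.yaml in this article:
# for host, port in [("192.168.0.35", 3306), ("192.168.0.46", 3306)]:
#     print(host, port, "reachable" if can_reach(host, port) else "NOT reachable")
```

A successful TCP connection only shows that the port is open; credentials and database names still have to be verified with a real client.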
3.1.3 Introduce database driver (optional)
If the backend connects to a MySQL database, download the driver, create an ext-lib directory anywhere on the host, put the driver in it, and mount the directory when starting the container.
3.1.4 Configure conf/server.yaml
The running mode of ShardingSphere-Proxy is configured in server.yaml, and the configuration format is consistent with that of ShardingSphere-JDBC; please refer to Mode Configuration.
I configured standalone mode and set up users and privileges:
mode:
  type: Standalone
  repository:
    type: JDBC

authority:
  users:
    - user: root@%
      password: root
    - user: sharding
      password: sharding
  privilege:
    type: ALL_PERMITTED
For other configuration items, please refer to:
3.1.5 Configure conf/config-*.yaml
In the conf directory on the host, edit the files whose names start with the config- prefix, such as conf/config-sharding.yaml, and configure the sharding rules.
The * part of a config-*.yaml file name can be chosen freely. ShardingSphere-Proxy supports multiple logical data sources, and each YAML configuration file named with the config- prefix is one logical data source.
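To see which logical data sources a conf directory would define under this naming rule, the files can simply be listed. This is a small illustrative Python sketch; `logical_config_files` is a hypothetical helper and not how the proxy itself loads configuration. Note that the logical database name clients see comes from the databaseName key inside each file, not from the file name.

```python
from pathlib import Path
from typing import Dict


def logical_config_files(conf_dir: str) -> Dict[str, Path]:
    """Map the free-form '*' part of each config-*.yaml name to its path."""
    result = {}
    for path in sorted(Path(conf_dir).glob("config-*.yaml")):
        # "config-sharding.yaml" has the stem "config-sharding" -> "sharding"
        suffix = path.stem[len("config-"):]
        result[suffix] = path
    return result


# Example usage, with the host directory used in this article:
# for name, path in logical_config_files("/root/apps/shardingsphere-proxy/conf").items():
#     print(name, "->", path)
```

Files such as server.yaml do not match the pattern and are therefore not treated as logical data sources by this sketch.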
Example conf/config-sharding.yaml file:
databaseName: sharding_db

dataSources:
  ds1:
    url: jdbc:mysql://192.168.0.35:3306/db1?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true&allowMultiQueries=true&serverTimezone=Asia/Shanghai
    username: root
    password: '1qaz@WSX'
    connectionTimeoutMilliseconds: 30000
    idleTimeoutMilliseconds: 60000
    maxLifetimeMilliseconds: 1800000
    maxPoolSize: 50
    minPoolSize: 1
  ds2:
    url: jdbc:mysql://192.168.0.46:3306/db2?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true&allowMultiQueries=true&serverTimezone=Asia/Shanghai
    username: root
    password: '1qaz@WSX'
    connectionTimeoutMilliseconds: 30000
    idleTimeoutMilliseconds: 60000
    maxLifetimeMilliseconds: 1800000
    maxPoolSize: 50
    minPoolSize: 1

rules:
- !SHARDING
  autoTables:
    # Modulo
    t_auto_order_mod:
      actualDataSources: ds$->{1..2}
      shardingStrategy:
        standard:
          shardingColumn: order_id
          shardingAlgorithmName: auto_order_mod
      # Distributed key generation strategy
      keyGenerateStrategy:
        # Key column name; if omitted, no key generator is used
        column: order_id
        # Key generator name
        keyGeneratorName: snowflake
    # Hash modulo
    t_auto_order_hash_mod:
      actualDataSources: ds1
      shardingStrategy:
        standard:
          shardingColumn: order_id
          shardingAlgorithmName: auto_order_hash_mod
      # Distributed key generation strategy
      keyGenerateStrategy:
        # Key column name; if omitted, no key generator is used
        column: order_id
        # Key generator name
        keyGeneratorName: snowflake
    # Volume-based range
    t_auto_order_volume_range:
      actualDataSources: ds$->{1..2}
      shardingStrategy:
        standard:
          shardingColumn: price
          shardingAlgorithmName: auto_order_volume_range
      # Distributed key generation strategy
      keyGenerateStrategy:
        # Key column name; if omitted, no key generator is used
        column: order_id
        # Key generator name
        keyGeneratorName: snowflake
    # Boundary-based range
    t_auto_order_boundary_range:
      actualDataSources: ds$->{1..2}
      shardingStrategy:
        standard:
          shardingColumn: price
          shardingAlgorithmName: auto_order_boundary_range
      # Distributed key generation strategy
      keyGenerateStrategy:
        # Key column name; if omitted, no key generator is used
        column: order_id
        # Key generator name
        keyGeneratorName: snowflake
    # Automatic time interval
    t_auto_order_auto_interval:
      actualDataSources: ds$->{1..2}
      shardingStrategy:
        standard:
          shardingColumn: create_time
          shardingAlgorithmName: auto_order_auto_interval
      # Distributed key generation strategy
      keyGenerateStrategy:
        # Key column name; if omitted, no key generator is used
        column: order_id
        # Key generator name
        keyGeneratorName: snowflake
  # Sharding algorithm configuration
  shardingAlgorithms:
    # Modulo
    auto_order_mod:
      type: MOD
      props:
        sharding-count: 6
    # Hash modulo
    auto_order_hash_mod:
      type: HASH_MOD
      props:
        sharding-count: 6
    # Volume-based range
    auto_order_volume_range:
      type: VOLUME_RANGE
      props:
        range-lower: 0
        range-upper: 20000
        sharding-volume: 10000
    # Boundary-based range
    auto_order_boundary_range:
      type: BOUNDARY_RANGE
      props:
        sharding-ranges: 10,15,100,12000,16000
    # Automatic time interval
    auto_order_auto_interval:
      type: AUTO_INTERVAL
      props:
        datetime-lower: "2023-05-07 00:00:00"
        datetime-upper: "2023-05-10 00:00:00"
        sharding-seconds: 86400
  # Key generator configuration (when keys are generated automatically, do not pass the id
  # in the INSERT statement; NULL is not accepted either, so leave the key column out of
  # the column list entirely)
  keyGenerators:
    # Key generator name
    snowflake:
      # Key generator type
      type: SNOWFLAKE
The configuration here uses the automatic sharding algorithms. For details, please refer to: Spring Boot Integrates ShardingSphere Sharding Tool AutoTable (2) - Example of Automatic Sharding Algorithm | Spring Cloud 46
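To make the effect of the parameters above more concrete, here is a minimal Python sketch of how three of the configured algorithms would map a sharding value to a shard index. It is my own illustration of the range semantics, not ShardingSphere's code: the function names are hypothetical, sharding values are assumed to be non-negative, and the round-robin placement of shards over ds1 and ds2 in `data_source_for` is an assumption about how auto tables are distributed.

```python
import bisect
import math


def mod_shard(order_id: int, sharding_count: int = 6) -> int:
    """MOD: the shard index is the remainder of the sharding value."""
    return order_id % sharding_count


def volume_range_shard(price: float, lower: float = 0, upper: float = 20000,
                       volume: float = 10000) -> int:
    """VOLUME_RANGE: shard 0 holds values below `lower`, the last shard holds
    values at or above `upper`, and the span in between is cut into
    `volume`-sized, left-closed slices."""
    slices = math.ceil((upper - lower) / volume)
    if price < lower:
        return 0
    if price >= upper:
        return slices + 1
    return 1 + int((price - lower) // volume)


def boundary_range_shard(price: float,
                         boundaries=(10, 15, 100, 12000, 16000)) -> int:
    """BOUNDARY_RANGE: each boundary starts a new left-closed range."""
    return bisect.bisect_right(boundaries, price)


def data_source_for(shard_index: int, data_sources=("ds1", "ds2")) -> str:
    """Assumed round-robin placement of shards over the data sources."""
    return data_sources[shard_index % len(data_sources)]


# Example usage: which shard and data source would order_id 7 land on?
# index = mod_shard(7)
# print(f"t_auto_order_mod_{index} on {data_source_for(index)}")
```

Under these assumptions, sharding-count: 6 yields six physical tables for t_auto_order_mod, the VOLUME_RANGE settings yield four ranges, and the five BOUNDARY_RANGE boundaries yield six ranges.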
3.1.6 Start the ShardingSphere-Proxy container
version: "3.8"

# Common logging settings
x-logging: &default-logging
  # Log file size and number of files
  options:
    max-size: "100m"
    max-file: "3"
  # Log driver
  driver: json-file

services:
  shardingsphere-proxy:
    image: apache/shardingsphere-proxy:5.3.2
    container_name: shardingsphere-proxy
    environment:
      - PORT=3308
      - JAVA_OPTS=-XX:InitialRAMPercentage=80.0 -XX:MaxRAMPercentage=80.0 -XX:MinRAMPercentage=80.0
    volumes:
      - /usr/share/zoneinfo/Asia/Shanghai:/etc/localtime # Set the system time zone
      - /root/apps/shardingsphere-proxy/conf:/opt/shardingsphere-proxy/conf
      - /root/apps/shardingsphere-proxy/ext-lib:/opt/shardingsphere-proxy/ext-lib
    ports:
      - "13308:3308"
    restart: always
    logging: *default-logging
In the Docker Compose file above, the ext-lib mount is optional; mount it only if needed. ShardingSphere-Proxy listens on port 3307 by default, which can be changed through the PORT environment variable (-e PORT). Custom JVM parameters can be set through the JVM_OPTS environment variable.
3.2 Connection example
3.2.1 Use the client to connect to ShardingSphere-Proxy
The MySQL/PostgreSQL/openGauss client commands can be run directly to operate ShardingSphere-Proxy.
Connect to ShardingSphere-Proxy using the MySQL client:
mysql -h${proxy_host} -P${proxy_port} -u${proxy_username} -p${proxy_password}
Connect to ShardingSphere-Proxy using the PostgreSQL client:
psql -h ${proxy_host} -p ${proxy_port} -U ${proxy_username}
Connect to ShardingSphere-Proxy using the openGauss client:
gsql -r -h ${proxy_host} -p ${proxy_port} -U ${proxy_username} -W ${proxy_password}
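For the deployment described in this article, the placeholders map to concrete values: the container's port 3308 is published as 13308 on the host, the users come from server.yaml, and the logical database is sharding_db. The following Python sketch only assembles the MySQL client command line from those values; `mysql_command` is a hypothetical helper, and the host address in the usage comment is an assumption.

```python
import shlex


def mysql_command(host: str, port: int, user: str, database: str = "") -> str:
    """Build a mysql CLI command line for connecting to the proxy.

    A bare -p makes the client prompt for the password, which keeps the
    password out of the shell history.
    """
    args = ["mysql", f"-h{host}", f"-P{port}", f"-u{user}", "-p"]
    if database:
        args.append(database)
    return " ".join(shlex.quote(arg) for arg in args)


# Example usage (127.0.0.1 assumes the client runs on the Docker host):
# print(mysql_command("127.0.0.1", 13308, "sharding", "sharding_db"))
```

From the client's point of view the proxy behaves like a MySQL server, so the same host, port, and logical database name would also go into an application's connection settings.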
3.2.2 Use Navicat to connect to ShardingSphere-Proxy
On the connection's Databases tab, check the corresponding logical database. Here sharding_db is the name of the logical database, as defined in config-sharding.yaml.
3.3 General use
Normal CRUD operations can be performed on the logical tables defined in the sharding rules above.
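When inserting into these logical tables, the order_id column is left out of the INSERT statement and filled in by the SNOWFLAKE key generator configured earlier. The sketch below shows the commonly described Snowflake layout (41-bit timestamp, 10-bit worker id, 12-bit sequence) in Python. It is an illustration of the bit layout only: the function names are hypothetical, and the epoch and worker-id assignment used by ShardingSphere are not modeled.

```python
TIMESTAMP_BITS = 41  # milliseconds since a chosen epoch
WORKER_BITS = 10     # identifies the generating process
SEQUENCE_BITS = 12   # counter within the same millisecond


def compose_id(ms_since_epoch: int, worker_id: int, sequence: int) -> int:
    """Pack the three fields into one 64-bit-compatible integer."""
    return ((ms_since_epoch << (WORKER_BITS + SEQUENCE_BITS))
            | (worker_id << SEQUENCE_BITS)
            | sequence)


def decompose_id(snowflake_id: int):
    """Split an id back into (ms_since_epoch, worker_id, sequence)."""
    sequence = snowflake_id & ((1 << SEQUENCE_BITS) - 1)
    worker_id = (snowflake_id >> SEQUENCE_BITS) & ((1 << WORKER_BITS) - 1)
    ms_since_epoch = snowflake_id >> (WORKER_BITS + SEQUENCE_BITS)
    return ms_since_epoch, worker_id, sequence
```

Because the timestamp occupies the highest bits, ids generated later are numerically larger, which is why such keys sort roughly by creation time.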
3.4 DistSQL
3.4.1 Definition
DistSQL (Distributed SQL) is the operating language specific to Apache ShardingSphere. It is used in exactly the same way as standard SQL and provides SQL-level operations for ShardingSphere's incremental features.
Flexible rule configuration and resource management and control are among the defining features of Apache ShardingSphere.
In 4.x and earlier versions, developers could operate on data as if using a native database, but they had to configure resources and rules through local files or a registry center. That change in operating habits was unfriendly to operations and maintenance engineers.
Starting from version 5.x, DistSQL lets users operate Apache ShardingSphere like a database, transforming it from a framework and middleware for developers into a database product for operations and maintenance personnel.
3.4.2 Related concepts
DistSQL is subdivided into four types: RDL, RQL, RAL, and RUL:
- RDL (Resource & Rule Definition Language): responsible for creating, modifying, and deleting resources and rules.
- RQL (Resource & Rule Query Language): responsible for querying and displaying resources and rules.
- RAL (Resource & Rule Administration Language): responsible for administrative functions such as forced routing, circuit breaking, configuration import and export, and data migration control.
- RUL (Resource & Rule Utility Language): responsible for functions such as SQL parsing, SQL formatting, and execution plan preview.
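As a rough orientation, the snippet below pairs each of the four types with one example statement, held as plain Python strings so they can be printed or sent through any MySQL client connected to the proxy. The statement texts follow the 5.3.x syntax as I understand it and should be checked against the official syntax reference; the password value is a placeholder, and `example_for` is a hypothetical helper.

```python
# One illustrative statement per DistSQL type (syntax assumed for 5.3.x).
DISTSQL_EXAMPLES = {
    "RDL": 'REGISTER STORAGE UNIT ds1 ('
           'URL="jdbc:mysql://192.168.0.35:3306/db1", '
           'USER="root", PASSWORD="your_password")',
    "RQL": "SHOW STORAGE UNITS",
    "RAL": "SHOW DIST VARIABLES",
    "RUL": "PREVIEW SELECT * FROM t_auto_order_mod",
}


def example_for(distsql_type: str) -> str:
    """Return the example statement for a type, terminated with a semicolon."""
    return DISTSQL_EXAMPLES[distsql_type.upper()] + ";"


# Example usage:
# for kind in ("RDL", "RQL", "RAL", "RUL"):
#     print(kind, "->", example_for(kind))
```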
3.4.3 Impact on the system
- Before
  Before DistSQL, users manipulated data with SQL statements while managing ShardingSphere configuration through YAML files, as shown in the following figure:
  At that time, users had to face the following problems:
  - Different types of clients were needed to operate on data and to manage ShardingSphere rules;
  - Multiple logical databases required multiple YAML files;
  - Modifying YAML required edit permission on the files;
  - ShardingSphere had to be restarted after YAML was modified.
- After
  With the advent of DistSQL, the way ShardingSphere is operated has changed:
  Now the user experience is greatly improved:
  - The same client is used to manage both data and ShardingSphere configuration;
  - No additional YAML files are created; logical databases are managed through DistSQL;
  - File editing permissions are no longer required; configuration is managed through DistSQL;
  - Configuration changes take effect in real time without restarting ShardingSphere.
Restrictions on use:
DistSQL can currently be used only with ShardingSphere-Proxy; it is not yet available for ShardingSphere-JDBC.
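To illustrate what managing rules through DistSQL instead of YAML could look like, the following Python sketch renders a CREATE SHARDING TABLE RULE statement from the same values used for t_auto_order_mod in config-sharding.yaml. The statement shape follows the 5.3.x syntax as I understand it; `create_sharding_table_rule` is a hypothetical helper, and the storage units are assumed to be registered in the logical database already.

```python
def create_sharding_table_rule(table, storage_units, sharding_column,
                               algorithm, props, key_column, key_generator):
    """Render an auto-table sharding rule as a DistSQL statement."""
    units = ",".join(storage_units)
    # Property values are rendered as quoted strings, e.g. "sharding-count"="6".
    prop_list = ",".join(f'"{key}"="{value}"' for key, value in props.items())
    return (
        f"CREATE SHARDING TABLE RULE {table} (\n"
        f"STORAGE_UNITS({units}),\n"
        f"SHARDING_COLUMN={sharding_column},\n"
        f'TYPE(NAME="{algorithm}",PROPERTIES({prop_list})),\n'
        f'KEY_GENERATE_STRATEGY(COLUMN={key_column},TYPE(NAME="{key_generator}"))\n'
        ");"
    )


# Example usage, mirroring the t_auto_order_mod entry of config-sharding.yaml:
# print(create_sharding_table_rule(
#     "t_auto_order_mod", ["ds1", "ds2"], "order_id",
#     "mod", {"sharding-count": 6}, "order_id", "snowflake"))
```

The generated text would be executed through an ordinary client session against the proxy, which is the point of DistSQL: the rule change needs neither file access nor a restart.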
3.4.4 DistSQL syntax rules
Please refer to the official website description: https://shardingsphere.apache.org/document/current/en/user-manual/shardingsphere-proxy/distsql/syntax/