MGR overall structure and characteristics
single-master
Only one node accepts writes; every node can serve reads
multi-master
Every node accepts both writes and reads
Concepts involved:
group communication system (GCS)
writeset
membership
certification info
flow control stats
paxos
MGR read consistency enhancements
group_replication_consistency (introduced in 8.0.14)
EVENTUAL: the default
BEFORE: the transaction waits until all transactions ahead of it in the queue have been applied before it executes
BEFORE_ON_PRIMARY_FAILOVER: after a failover, new transactions on the newly elected primary wait until it has applied its backlog queue
AFTER: the transaction waits until its changes have been applied on all other nodes
BEFORE_AND_AFTER: combines the BEFORE and AFTER guarantees
MGR restrictions
Only InnoDB is supported, and every table must have a primary key
Binlog format: ROW; binlog checksum must be turned off
GTID must be enabled
Transaction isolation level: READ COMMITTED (no gap locks)
Large transaction limit: group_replication_transaction_size_limit
Multi-master mode: avoid concurrent DDL / DML against the same table on different nodes
Maximum number of nodes in a cluster: 9 (an odd number of nodes is recommended)
How MGR synchronizes data
MGR data replication -> certification service
Certification service
Conflict detection
certification_info key: xxhash64(index name + DB name + DB name length + table name + table name length + value and length of each column in the unique index); value: the transaction's gtid_executed
GTID assignment for transactions
group_replication_gtid_assignment_block_size
Group commit
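The effect of group_replication_gtid_assignment_block_size can be pictured with a small model: each member reserves a block of consecutive GTID numbers and hands them out locally until the block is used up. This is only a sketch of the idea; the class and method names are hypothetical and the real allocator inside the server is more involved.

```python
class GtidBlockAllocator:
    """Toy model: each member reserves `block_size` consecutive GTID numbers."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.next_free = 1          # first GTID number not yet reserved by anyone
        self.blocks = {}            # member -> [next number to hand out, last number in block]

    def assign(self, member):
        block = self.blocks.get(member)
        if block is None or block[0] > block[1]:
            # Member has no block, or its block is exhausted: reserve a new one.
            block = [self.next_free, self.next_free + self.block_size - 1]
            self.next_free += self.block_size
            self.blocks[member] = block
        gno = block[0]
        block[0] += 1
        return gno


alloc = GtidBlockAllocator(block_size=3)
numbers = [alloc.assign("node_a"), alloc.assign("node_a"), alloc.assign("node_b")]
```

With a small block size, members that write alternately produce interleaved GTID ranges, which is why gtid_executed on an MGR cluster can show several intervals.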
MGR data replication conflict resolution
Problem:
As the system keeps writing, will certification_info grow ever larger and performance keep getting worse?
Approach:
A clean-up mechanism was introduced for certification_info
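The conflict check and the clean-up of certification_info can be modeled in a few lines. This is a simplified sketch, not the server's implementation: snapshot versions are represented as sets of sequence numbers instead of GTID sets, blake2b stands in for xxhash64 (which is not in the Python standard library), and the names ToyCertifier, certify and garbage_collect are hypothetical.

```python
import hashlib


def writeset_key(index, db, table, values):
    """Hash the fields listed in the notes into a 64-bit key (blake2b stands in for xxhash64)."""
    parts = [index, db, str(len(db)), table, str(len(table))]
    for value in values:
        parts += [str(value), str(len(str(value)))]
    return hashlib.blake2b("\x00".join(parts).encode(), digest_size=8).hexdigest()


class ToyCertifier:
    def __init__(self):
        self.certification_info = {}   # key -> version (frozenset) of the last certified writer
        self.next_seqno = 1

    def certify(self, snapshot, keys):
        """Return a new sequence number, or None if the transaction conflicts."""
        for key in keys:
            last = self.certification_info.get(key)
            # Conflict: the row was changed by a transaction this one had not seen.
            if last is not None and not last <= snapshot:
                return None
        seqno = self.next_seqno
        self.next_seqno += 1
        version = frozenset(snapshot) | {seqno}
        for key in keys:
            self.certification_info[key] = version
        return seqno

    def garbage_collect(self, stable):
        """Drop entries whose version is already applied on every member."""
        self.certification_info = {
            k: v for k, v in self.certification_info.items() if not v <= stable
        }
```

Two transactions that start from the same snapshot and touch the same key illustrate the rule: the first one certified wins and the second is rolled back. Without garbage_collect the dictionary would grow with every new row written, which is the problem raised above.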
MGR replication flow control
Flow Control
Purpose of flow control
Keeps cluster-wide delay under control (read-only transactions are outside the scope of flow control)
Reasons for flow control
Nodes differ in performance
Weakest-link effect (the shortest stave of the barrel limits the whole cluster)
Parameters
group_replication_flow_control_mode: default QUOTA (flow control enabled)
group_replication_flow_control_period: how often flow control statistics are collected, in seconds
group_replication_flow_control_applier_threshold & group_replication_flow_control_certifier_threshold: number of transactions waiting in the applier / certification queue; flow control is triggered on a node only once its backlog exceeds the threshold
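The threshold rule above can be sketched as a simple check over per-member queue sizes. This is an illustration only: the function name and the shape of the stats dictionary are hypothetical, the default of 25000 is what I understand both thresholds to default to, and the real QUOTA mode goes on to compute a per-period write quota, which is not modeled here.

```python
def members_over_threshold(stats, applier_threshold=25000, certifier_threshold=25000):
    """Return the members whose backlog would trigger flow control.

    stats maps a member name to (certifier_queue_size, applier_queue_size).
    """
    return sorted(
        member
        for member, (certifier_queue, applier_queue) in stats.items()
        if certifier_queue > certifier_threshold or applier_queue > applier_threshold
    )


slow_members = members_over_threshold(
    {"node_a": (10, 120), "node_b": (30000, 5), "node_c": (0, 25001)}
)
```

A single slow member is enough to appear in the result, which matches the weakest-link reasoning: the cluster is throttled to the pace of its slowest node.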
MGR monitoring points
Is the current node online?
select member_state from performance_schema.replication_group_members;
Is there replication delay?
Received: select received_transaction_set from performance_schema.replication_connection_status;
Executed: select @@gtid_executed;
Is there a backlog in the queue?
select count_transactions_in_queue from performance_schema.replication_group_member_stats where member_id=@@server_uuid;
Is the current node writable?
select * from performance_schema.global_variables where variable_name in ('read_only','super_read_only');
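To turn the delay check into a number, the received and executed GTID sets returned by the queries above can be subtracted from each other. The sketch below does this on plain strings; the function names are hypothetical, it assumes the classic uuid:interval[:interval] text format (no tagged GTIDs), and it expands intervals into Python sets, so it is only suitable for small gaps.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:8,uuid2:1-3' into {uuid: {gno, ...}}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        gnos = result.setdefault(uuid, set())
        for interval in intervals:
            low, _, high = interval.partition("-")
            gnos.update(range(int(low), int(high or low) + 1))
    return result


def pending_transactions(received, executed):
    """Count transactions that were received but not yet executed."""
    rec, exe = parse_gtid_set(received), parse_gtid_set(executed)
    return sum(len(gnos - exe.get(uuid, set())) for uuid, gnos in rec.items())


backlog = pending_transactions("aaa:1-10", "aaa:1-7")
```

On the server itself the same subtraction can be done with the built-in GTID_SUBTRACT() function; the Python version is only meant for a monitoring script that already has both strings.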
MGR optimization directions
Operations and maintenance
Given MGR's architecture, all data replication is still logical replication, so the tuning points are the usual replication tuning points.
Changes:
slave_parallel_type -> LOGICAL_CLOCK
Increase the number of SQL threads:
slave_parallel_workers -> 2-8
If the CPU is the bottleneck and the network is fine, reduce the CPU spent on compression:
group_replication_compression_threshold = 1000000 -> 2000000
Raise the threshold from 1 MB to 2 MB so that only larger messages are compressed (mainly an optimization for transmitting large transactions)
For write-heavy environments
Use single-master mode
Table design: reduce the number of indexes and prefer composite indexes
Kernel
Already tried: static const int BROADCAST_GTID_EXECUTED_PERIOD = 60 -> 30; // seconds
Important parameters:
group_replication_member_expel_timeout (8.0.13+)
After (5 + X) seconds, the node is expelled from the group membership
Network anomaly -> 5 seconds -> suspected lost -> X seconds in UNREACHABLE -> expelled
During those X seconds, the group cannot add nodes, remove nodes, or run a primary election
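The (5 + X) timeline can be written down as a small state function. This is only a sketch of the sequence described above: the function name and the "EXPELLED" label are hypothetical, and the 5-second detection window is taken from these notes rather than read from the server.

```python
FAILURE_DETECTION_SECONDS = 5  # fixed detection window, as stated in the notes


def member_status(seconds_since_last_contact, expel_timeout):
    """Return how the rest of the group sees a silent member at a given moment."""
    if seconds_since_last_contact < FAILURE_DETECTION_SECONDS:
        return "ONLINE"        # not yet suspected
    if seconds_since_last_contact < FAILURE_DETECTION_SECONDS + expel_timeout:
        return "UNREACHABLE"   # suspected; membership changes are blocked
    return "EXPELLED"


status_after_12s = member_status(12, expel_timeout=10)
```

The trade-off is visible in the middle branch: a larger group_replication_member_expel_timeout tolerates longer network glitches, but it also lengthens the window in which the group cannot change its membership.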
group_replication_unreachable_majority_timeout
After a network partition, minority members that fail to restore the connection to the majority within X seconds enter the ERROR state
group_replication_exit_state_action (8.0.12+, 5.7.24+)
ABORT_SERVER / READ_ONLY
Triggered by: applier execution error / loss of contact with the majority / being expelled from the group
group_replication_recovery_complete_at
TRANSACTIONS_CERTIFIED / TRANSACTIONS_APPLIED
group_replication_member_weight
Useful in single-primary mode, where nodes play unequal roles
With equal group_replication_member_weight, the choice depends on server_uuid
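The ordering described above can be sketched as a sort key: highest weight first, then the lowest server_uuid as the tie-breaker. The function name and the tuple layout are hypothetical, and the sketch deliberately leaves out any other criteria the server may apply before weight (such as member version).

```python
def pick_primary(members):
    """members is a list of (server_uuid, member_weight) tuples."""
    # Highest weight wins; among equal weights the lexically lowest server_uuid wins.
    return min(members, key=lambda member: (-member[1], member[0]))[0]


new_primary = pick_primary(
    [
        ("c3d4-uuid", 50),
        ("a1b2-uuid", 50),
        ("f9e8-uuid", 80),
    ]
)
```

Giving the preferred failover target a higher weight makes the outcome predictable; leaving all weights equal effectively hands the decision to the server_uuid ordering.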
group_replication_transaction_size_limit (5.7.19+)
Limits the maximum number of bytes in a single transaction; helps control network overhead, memory allocation, and the probability of conflicts
group_replication_compression_threshold
Messages larger than X bytes are sent with LZ4 compression; default 1 MB
MGR deployment architecture recommendations
MySQL Router + MGR
Router: two ports (read, write)
On the MySQL side you need to be familiar with the X Protocol and JS-related operations
Recommended alternative: ProxySQL
For performance: single-master
For ease of use: multi-master (write to a single node, read from multiple nodes)