1. Build a master-slave architecture
The concurrency a single Redis node can handle has an upper limit. To raise Redis concurrency further, build a master-slave cluster and separate reads from writes.
2. Master-slave data synchronization principle
2.1. Full synchronization
When a slave connects to the master for the first time, a full synchronization is performed: the master node copies all of its data to the slave node. The process is as follows:
This raises a question: how does the master know that a slave is connecting for the first time?
There are several concepts that can be used as the basis for judgment:
- Replication Id: replid for short. It marks a data set, and identical ids mean the same data set. Each master has a unique replid, and a slave inherits the replid of its master node.
- offset: The offset, which grows gradually as more data is recorded in repl_backlog. When a slave finishes synchronizing, it also records its current offset. If the slave's offset is less than the master's offset, the slave's data is behind the master's and needs to be updated.
Therefore, to synchronize data, a slave must declare its own replication id and offset to the master, and the master can then determine which data needs to be synchronized.
Before it became a slave, the node was itself a master with its own replid and offset. So when it becomes a slave and establishes a connection with the master for the first time, the replid and offset it sends are its own.
The master checks and finds that the replid sent by the slave is inconsistent with its own, which means this is a brand new slave, so the master knows that a full synchronization is needed.
The master then sends its own replid and offset to this slave, and the slave saves this information. From then on, the slave's replid is consistent with the master's.
Therefore, the master decides whether a node is synchronizing for the first time by checking whether the replid is consistent.
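The replid check described above can be sketched in a few lines of Python. This is only an illustration of the rule, not Redis source code: the ReplState class, the function names and the sample replid strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReplState:
    """Replication state a node reports: its replid and its offset."""
    replid: str
    offset: int


def needs_full_sync(master: ReplState, slave: ReplState) -> bool:
    # A different replid means the slave holds a different data set,
    # so the master treats it as a brand new slave.
    return slave.replid != master.replid


def adopt_master_state(master: ReplState) -> ReplState:
    # After a full synchronization the slave saves the master's
    # replid and offset as its own.
    return ReplState(master.replid, master.offset)


master = ReplState(replid="master-replid", offset=1200)
new_slave = ReplState(replid="slave-own-replid", offset=40)

print(needs_full_sync(master, new_slave))   # True: replid differs
synced_slave = adopt_master_state(master)
print(needs_full_sync(master, synced_slave))  # False: replid now matches
```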
As shown in the picture:
Full process description:
- The slave node requests incremental synchronization;
- The master node checks the replid, finds it inconsistent, and rejects incremental synchronization;
- The master generates an RDB from its complete in-memory data and sends the RDB to the slave;
- The slave clears its local data and loads the master's RDB;
- The master records the commands executed during the RDB period in repl_backlog and keeps sending the commands in the log to the slave;
- The slave executes the received commands, staying synchronized with the master.
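The steps above can be walked through with a small in-memory simulation. It is a rough sketch under simplifying assumptions: the Master and Slave classes are hypothetical, the "RDB" is just a copy of a dict, and the offset counts commands, whereas a real offset measures the replication stream.

```python
import copy


class Master:
    def __init__(self):
        self.replid = "master-replid"
        self.data = {}
        self.backlog = []  # stand-in for repl_backlog: every write command

    def write(self, key, value):
        self.data[key] = value
        self.backlog.append((key, value))

    def snapshot(self):
        # Stand-in for generating an RDB: a copy of the data plus the
        # backlog position at the moment the snapshot was taken.
        return copy.deepcopy(self.data), len(self.backlog)


class Slave:
    def __init__(self):
        self.replid = "slave-own-replid"
        self.data = {"stale": "value"}
        self.offset = 0

    def load_snapshot(self, master_replid, snapshot, offset):
        self.data = dict(snapshot)  # clear local data, load the RDB
        self.replid = master_replid  # inherit the master's replid
        self.offset = offset

    def apply(self, commands):
        for key, value in commands:
            self.data[key] = value
            self.offset += 1


def full_sync(master, slave, writes_during_rdb):
    """Return True if a full synchronization was performed."""
    if slave.replid == master.replid:
        return False  # same data set: incremental sync would be used
    snapshot, position = master.snapshot()
    for key, value in writes_during_rdb:  # writes arriving meanwhile
        master.write(key, value)
    slave.load_snapshot(master.replid, snapshot, position)
    slave.apply(master.backlog[position:])  # replay the recorded commands
    return True


master = Master()
master.write("a", 1)
slave = Slave()
did_full_sync = full_sync(master, slave, writes_during_rdb=[("b", 2)])
print(did_full_sync, slave.data, slave.offset)
```

After the call, the slave holds the same data as the master and reports the master's replid, so a second call to full_sync returns False.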
2.2. Incremental synchronization
Full synchronization has to generate an RDB first and then transfer the RDB file to the slave over the network, which is too costly. Therefore, except for the first full synchronization, the slave and the master perform incremental synchronization most of the time.
What is incremental synchronization? It updates only the part of the data where the slave differs from the master.
As shown in the picture:
So how does the master know where the slave's data differs from its own?
2.3. repl_backlog principle
How does the master know where the slave's data differs from its own?
The answer lies in the repl_backlog file used during full synchronization.
This file is a fixed-size array, but the array is circular: once the index reaches the end of the array, reading and writing start again from 0, so the data at the head of the array is overwritten.
repl_backlog records the log of commands Redis has processed together with the offsets, including the master's current offset and the offset up to which the slave has copied:
The difference between the slave's offset and the master's offset is the data that needs to be copied incrementally.
As data keeps being written, the master's offset gradually grows, and the slave keeps copying as well, catching up with the master's offset:
until the array is filled:
At this point, if new data is written, the old data in the array is overwritten. However, as long as the old data is green, the slave has already synchronized it, so overwriting it has no effect. Only the red part has not been synchronized.
However, if the slave suffers network congestion, the master's offset ends up far ahead of the slave's offset:
If the master keeps writing new data, its offset overwrites the old data, until the data at the slave's current offset is overwritten as well:
The red part in the brown box is data that has not been synchronized but has already been overwritten. If the slave recovers at this point and needs to synchronize, it finds that the data at its own offset is gone, so incremental synchronization cannot be completed. Only a full synchronization is possible.
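The circular array and the overwrite case can be illustrated with a short Python sketch. It is a simplified model, not the Redis implementation: the ReplBacklog class and its methods are hypothetical, and each slot holds one command, so the offset counts commands.

```python
class ReplBacklog:
    """Fixed-size circular array of commands, indexed by a growing offset."""

    def __init__(self, size):
        self.size = size
        self.buf = [None] * size
        self.master_offset = 0  # total number of commands ever written

    def append(self, command):
        # After reaching the end of the array, writing wraps around to 0
        # and overwrites the oldest entry.
        self.buf[self.master_offset % self.size] = command
        self.master_offset += 1

    def oldest_offset(self):
        # Smallest offset whose data is still present in the array.
        return max(0, self.master_offset - self.size)

    def read_from(self, slave_offset):
        """Commands the slave is missing, or None if they were overwritten."""
        if slave_offset < self.oldest_offset() or slave_offset > self.master_offset:
            return None  # incremental sync impossible: full sync required
        return [self.buf[i % self.size]
                for i in range(slave_offset, self.master_offset)]


backlog = ReplBacklog(size=4)
for n in range(6):
    backlog.append(f"cmd{n}")

print(backlog.read_from(4))  # ['cmd4', 'cmd5']: slave slightly behind
print(backlog.read_from(1))  # None: cmd1 was overwritten
```

With a size of 4 and six commands written, offsets 0 and 1 have been overwritten, so a slave that stopped at offset 1 can only be recovered through full synchronization.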
3. Master-slave synchronization optimization
Master-slave synchronization can ensure the consistency of master-slave data, which is very important.
A Redis master-slave cluster can be optimized in the following ways:
- Enable diskless replication by setting repl-diskless-sync yes in the master's configuration, to avoid disk IO during full synchronization.
- Keep the memory usage of a single Redis node from growing too large, to reduce the excessive disk IO caused by RDB.
- Increase the size of repl_backlog appropriately, recover a slave as soon as its downtime is discovered, and avoid full synchronization as much as possible.
- Limit the number of slave nodes on one master. If there are too many slaves, use a master-slave-slave chain structure to reduce the pressure on the master.
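As a rough way to reason about how large the backlog should be, the arithmetic below multiplies the write rate by the longest disconnection you want to survive. This is a rule of thumb added for illustration and does not come from the article: the function name, the safety factor and the sample numbers are all assumptions. The result is the kind of value one would set for repl-backlog-size.

```python
def min_backlog_bytes(write_bytes_per_sec, max_disconnect_sec, safety_factor=2.0):
    """Estimate a backlog size that still covers a slave's offset
    after max_disconnect_sec of disconnection (illustrative rule of thumb)."""
    return int(write_bytes_per_sec * max_disconnect_sec * safety_factor)


# Assumed workload: about 200 KB of write commands per second,
# slaves expected to reconnect within 60 seconds.
estimate = min_backlog_bytes(write_bytes_per_sec=200_000, max_disconnect_sec=60)
print(estimate)  # 24000000 bytes, roughly 24 MB
```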
Master-slave architecture diagram:
4. Summary
What is the difference between full synchronization and incremental synchronization?
- Full synchronization: the master generates an RDB from its complete in-memory data and sends the RDB to the slave. Subsequent commands are recorded in repl_backlog and sent to the slave one by one.
- Incremental synchronization: the slave submits its own offset to the master, and the master sends the slave the commands in repl_backlog that follow that offset.
When is a full synchronization performed?
- When a slave node connects to the master node for the first time.
- When a slave node has been disconnected for too long and its offset in repl_backlog has been overwritten.
When is an incremental synchronization performed?
- When a slave node goes down and comes back up, and its offset can still be found in repl_backlog.