Reference article:
https://blog.51cto.com/liubao0312/1677586
https://github.com/wsgzao/sersync
1. Introduction
1.1 Introduction to Sersync
(1) Sersync is built on inotify, similar to inotify-tools.
(2) Sersync records the name of each specific file or directory that changes (is added, deleted, or modified) in the monitored directory, so that rsync synchronizes only the changed files or directories.
1.2 Differences between the rsync+inotify-tools and rsync+sersync architectures
(1) rsync+inotify-tools
a. In this setup, an inotify event is used only as a signal that something in the monitored directory changed (was added, deleted, or modified); the specific file or directory that changed is not recorded.
b. Because rsync is not told which file or directory changed, it synchronizes the entire directory every time. When the amount of data is large, this is very time-consuming (rsync has to traverse the whole directory to find and compare files), so efficiency is very low.
(2) rsync+sersync
a. Sersync records the name of each specific file or directory that changes (is added, deleted, or modified) in the monitored directory.
b. During synchronization, rsync handles only the files or directories that changed. Each batch of changes is small compared with the whole synchronized directory, so rsync's traversal and comparison are fast and efficiency is very high.
Summary:
When the synchronized directory holds a modest amount of data, rsync+inotify-tools is recommended.
When the synchronized directory holds a large amount of data (several hundred GB, or more than 1 TB), rsync+sersync is recommended.
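The difference between the two approaches can be sketched by printing the kind of rsync command each one would issue. This is an illustration only: the watch directory, user, address, and module name are placeholders, the per-path command is an assumed shape rather than sersync's exact invocation, and nothing is actually transferred.

```shell
#!/bin/sh
# Illustration only: print (do not run) the rsync command each approach issues.
# WATCH_DIR and REMOTE are placeholder values chosen for this sketch.
WATCH_DIR="/home/hadoop/hub"
REMOTE="hadoop@192.168.43.97::data"

# Whole-directory approach: every event makes rsync walk the full tree.
full_sync_cmd() {
    printf 'cd %s && rsync -artuz -R --delete ./ %s\n' "$WATCH_DIR" "$REMOTE"
}

# Per-path approach: only the path named by the event is handed to rsync.
# $1 = changed path relative to WATCH_DIR (hypothetical input).
path_sync_cmd() {
    printf 'cd %s && rsync -artuz -R "./%s" %s\n' "$WATCH_DIR" "$1" "$REMOTE"
}

full_sync_cmd
path_sync_cmd "logs/app.log"
```

With a large tree, the first command forces rsync to compare every file on each event, while the second limits the work to one path.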
2. Principle description
Principle steps:
- Start the sersync service on the master server (Master). Sersync monitors real-time changes to files under the configured path (users creating, modifying, and deleting files on this server).
- On the Master, sersync calls the rsync command to push the updated files to the target servers (Slave1 and Slave2).
- Sersync needs to be configured on the main server, and rsync on the target servers.
The principle is shown in the figure:
3. Configuration
Environment description:
System: Ubuntu 18.04
Master: 192.168.43.166
Slave: 192.168.43.97
3.1 Configure rsync on the target server (Slave)
3.1.1 Start the rsync service at boot
Ubuntu 18.04 installs rsync by default, but the rsync service is not enabled by default. Edit /etc/default/rsync:
sudo vi /etc/default/rsync
RSYNC_ENABLE=true # change false to true
3.1.2 Modify the configuration file
First copy the sample configuration file to the /etc directory so it is easy to modify:
sudo cp /usr/share/doc/rsync/examples/rsyncd.conf /etc
Then edit it:
sudo vi /etc/rsyncd.conf
Amend to the following information:
Note: If the directory given by path (the synchronization directory) does not exist, you need to create it yourself.
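A minimal sketch of what the edited /etc/rsyncd.conf could contain, assuming the module name data and the user hadoop that appear elsewhere in this article. The path value, the uid/gid, and the log and pid file locations are assumptions for illustration, not values from the original setup. The script only prints the sample; nothing under /etc is changed.

```shell
#!/bin/sh
# Print a sample rsyncd.conf to stdout (assumed values are marked below).
sample_rsyncd_conf() {
    cat <<'EOF'
# Global settings
# ASSUMPTION: transfers run as root; a dedicated account would be tighter.
uid = root
gid = root
use chroot = no
# 0 means no connection limit
max connections = 0
# ASSUMPTION: log and pid file locations
log file = /var/log/rsyncd.log
pid file = /var/run/rsyncd.pid

# Module addressed as ::data by the client
[data]
# ASSUMPTION: hypothetical synchronization directory on the Slave
path = /home/hadoop/data
read only = no
auth users = hadoop
secrets file = /etc/rsyncd.secrets
EOF
}

sample_rsyncd_conf
```

After reviewing the output, it can be redirected into place, for example by piping it to sudo tee /etc/rsyncd.conf.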
3.1.3 Create a password file
Create the password file:
sudo vi /etc/rsyncd.secrets
The content is as follows (the format must be username:password):
hadoop:123456
Note: The Master must present exactly this user name and password when it connects, so make sure they match.
Assign 0600 permissions to the password file:
sudo chmod 0600 /etc/rsyncd.secrets
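The two steps above (write the file, then restrict its permissions) can also be done non-interactively. The sketch below is one possible way, not part of the original procedure; the helper name write_secrets is hypothetical, and the demo writes to a scratch file rather than /etc/rsyncd.secrets.

```shell
#!/bin/sh
# Create an rsync secrets file with 0600 permissions in one step.
# $1 = target file, $2 = user name, $3 = password
write_secrets() {
    (
        umask 077                          # a new file is created as rw-------
        printf '%s:%s\n' "$2" "$3" > "$1"
    )
    chmod 600 "$1"                         # also tightens a pre-existing file
}

# Demo against a scratch file. On the Slave the real target would be
# /etc/rsyncd.secrets, and the script would have to run as root.
demo="$(mktemp)"
write_secrets "$demo" hadoop 123456
ls -l "$demo"
rm -f "$demo"
```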
3.1.4 Start rsync
sudo /etc/init.d/rsync start
At this point, the Slave's rsync configuration is complete.
3.1.5 Test
Test the Slave's rsync from the Master.
Create a password file on the Master:
sudo vi /etc/rsyncd.secrets
The content is as follows (on the client, only the password is written):
123456
Assign 0600 permissions to the password file:
sudo chmod 0600 /etc/rsyncd.secrets
cd ~ # go to the home directory
sudo vi hello.txt # create hello.txt and write anything in it
rsync -avzP hello.txt hadoop@192.168.43.97::data --password-file=/etc/rsyncd.secrets
Explanation: This uploads the local hello.txt to the path specified by the Slave's data module. Because a password file is used, the upload needs no interactive password entry, similar to git push.
Special attention: This test must succeed; otherwise the configuration that follows will not work.
3.2 Configure Sersync on the master server (Master)
3.2.1 Download Sersync
Download link: https://github.com/wsgzao/sersync
Unzip:
sudo tar -zxf sersync2.5.4_64bit_binary_stable_final.tar.gz -C /usr/local/
cd /usr/local/
sudo mv GNU-Linux-x86 sersync
3.2.2 Configure Sersync
cd /usr/local/sersync
sudo cp confxml.xml confxml.xml-bak
sudo vi confxml.xml
Amend as follows:
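A sketch of the elements that would need to change in confxml.xml, using values that match the environment description and the sersync startup output in section 3.2.3 (watch path /home/hadoop/hub, Slave 192.168.43.97, module data, user hadoop, password file /etc/rsyncd.secrets, timeout 100). That these were the exact edits made is an assumption; the rest of the file is taken to stay at its defaults. The script only prints the fragment.

```shell
#!/bin/sh
# Print the <localpath> and <rsync> parts of the sersync section as they
# might look after editing. ASSUMPTION: values inferred from this article.
sample_sersync_fragment() {
    cat <<'EOF'
<localpath watch="/home/hadoop/hub">
    <remote ip="192.168.43.97" name="data"/>
</localpath>
<rsync>
    <commonParams params="-artuz"/>
    <auth start="true" users="hadoop" passwordfile="/etc/rsyncd.secrets"/>
    <userDefinedPort start="false" port="874"/>
    <timeout start="true" time="100"/>
    <ssh start="false"/>
</rsync>
EOF
}

sample_sersync_fragment
```

Here watch is the local directory to monitor, name is the rsync module on the Slave, and auth must be switched on for the password file to be used.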
3.2.3 Start the sersync daemon to synchronize data
/usr/local/sersync/sersync2 -d -r -o /usr/local/sersync/confxml.xml
Output like the following indicates success:
set the system param
execute:echo 50000000 > /proc/sys/fs/inotify/max_user_watches
sh: 1: cannot create /proc/sys/fs/inotify/max_user_watches: Permission denied
execute:echo 327679 > /proc/sys/fs/inotify/max_queued_events
sh: 1: cannot create /proc/sys/fs/inotify/max_queued_events: Permission denied
parse the command param
option: -d run as a daemon
option: -r rsync all the local files to the remote servers before the sersync work
option: -o config xml name: /usr/local/sersync/confxml.xml
daemon thread num: 10
parse xml config file
host ip : localhost host port: 8008
daemon start,sersync run behind the console
use rsync password-file :
user is hadoop
passwordfile is /etc/rsyncd.secrets
config xml parse success
please set /etc/rsyncd.conf max connections=0 Manually
sersync working thread 12 = 1(primary thread) + 1(fail retry thread) + 10(daemon sub threads)
Max threads numbers is: 22 = 12(Thread pool nums) + 10(Sub threads)
please according your cpu ,use -n param to adjust the cpu rate
------------------------------------------
rsync the directory recursivly to the remote servers once
working please wait...
execute command: cd /home/hadoop/hub && rsync -artuz -R --delete ./ --timeout=100 hadoop@192.168.43.97::data --password-file=/etc/rsyncd.secrets >/dev/null 2>&1
run the sersync:
watch path is: /home/hadoop/hub
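The two "Permission denied" lines in the output above show that sersync, started without root privileges, could not raise the inotify limits it wanted. A minimal sketch for checking this, assuming a Linux system with /proc mounted: it compares the current limits with the values from the log and prints, without running them, the sysctl commands that would apply them. The helper names are hypothetical.

```shell
#!/bin/sh
# Print a sysctl command when the current value is below the wanted one.
# $1 = sysctl key, $2 = wanted value, $3 = current value
check_limit() {
    if [ "$3" -lt "$2" ]; then
        printf 'sudo sysctl -w %s=%s\n' "$1" "$2"
    fi
}

# Read one inotify limit from /proc; print 0 if it cannot be read.
current() {
    cat "/proc/sys/fs/inotify/$1" 2>/dev/null || echo 0
}

# Wanted values are the ones sersync tried to set in the log above.
check_limit fs.inotify.max_user_watches 50000000 "$(current max_user_watches)"
check_limit fs.inotify.max_queued_events 327679 "$(current max_queued_events)"
```

If the script prints nothing, both limits are already at or above those values.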
At this point, sersync+rsync is configured. Whenever you create, delete, or modify files in the synchronization directory, the changes are synchronized to the Slave in real time.
You can also add the command /usr/local/sersync/sersync2 -d -r -o /usr/local/sersync/confxml.xml to rc.local so that the Master starts file synchronization at boot.
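A sketch of what such an rc.local could look like. On Ubuntu 18.04 the file /etc/rc.local usually does not exist by default, so this assumes you create it and make it executable; the script below only prints a sample and does not touch /etc.

```shell
#!/bin/sh
# Print a sample /etc/rc.local that starts sersync at boot.
sample_rc_local() {
    cat <<'EOF'
#!/bin/sh -e
# Full sync first (-r), then keep running as a daemon (-d).
/usr/local/sersync/sersync2 -d -r -o /usr/local/sersync/confxml.xml
exit 0
EOF
}

sample_rc_local
```

After reviewing the output, it could be written with sudo tee /etc/rc.local and made executable with sudo chmod +x /etc/rc.local.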
Appendix:
1. Parameter description
Parameter | Description |
---|---|
./sersync -r | Before real-time monitoring starts, perform one full synchronization from the main server's directory to the remote target machine's directory. |
./sersync -o xx.xml | Without -o, sersync uses the default configuration file confxml.xml in the directory of the sersync executable. With -o, you can name a different configuration file, which lets several sersync processes (instances) synchronize data side by side. |
./sersync -n num | Set the total number of threads in the default thread pool. For example, ./sersync -n 5 sets the total to 5. If not specified, the pool starts 10 threads. Lower this value if CPU usage is too high; raise it on a powerful machine to improve synchronization efficiency. |
./sersync -d | Run as a background service (daemon). Typically -r is used to synchronize local data to the remote as a whole, and -d then starts real-time synchronization in the background; for the first full synchronization, -d and -r are often combined. |
./sersync -m | Do not synchronize; only run the plug-in. For example, with ./sersync -m command, a monitored event does not trigger synchronization to the remote target server, and the command plug-in is run directly instead. |
2. Configuration file description
The default configuration file is as follows:
<?xml version="1.0" encoding="ISO-8859-1"?>
<head version="2.5">
    <host hostip="localhost" port="8008"></host>
    <debug start="false"/>
    <fileSystem xfs="false"/>
    <filter start="false">
        <exclude expression="(.*)\.svn"></exclude>
        <exclude expression="(.*)\.gz"></exclude>
        <exclude expression="^info/*"></exclude>
        <exclude expression="^static/*"></exclude>
    </filter>
    <inotify>
        <delete start="true"/>
        <createFolder start="true"/>
        <createFile start="false"/>
        <closeWrite start="true"/>
        <moveFrom start="true"/>
        <moveTo start="true"/>
        <attrib start="false"/>
        <modify start="false"/>
    </inotify>

    <sersync>
        <localpath watch="/opt/tongbu">
            <remote ip="127.0.0.1" name="tongbu1"/>
            <!--<remote ip="192.168.8.39" name="tongbu"/>-->
            <!--<remote ip="192.168.8.40" name="tongbu"/>-->
        </localpath>
        <rsync>
            <commonParams params="-artuz"/>
            <auth start="false" users="root" passwordfile="/etc/rsync.pas"/>
            <userDefinedPort start="false" port="874"/><!-- port=874 -->
            <timeout start="false" time="100"/><!-- timeout=100 -->
            <ssh start="false"/>
        </rsync>
        <failLog path="/tmp/rsync_fail_log.sh" timeToExecute="60"/><!--default every 60mins execute once-->
        <crontab start="false" schedule="600"><!--600mins-->
            <crontabfilter start="false">
                <exclude expression="*.php"></exclude>
                <exclude expression="info/*"></exclude>
            </crontabfilter>
        </crontab>
        <plugin start="false" name="command"/>
    </sersync>

    <plugin name="command">
        <param prefix="/bin/sh" suffix="" ignoreError="true"/> <!--prefix /opt/tongbu/mmm.sh suffix-->
        <filter start="false">
            <include expression="(.*)\.php"/>
            <include expression="(.*)\.sh"/>
        </filter>
    </plugin>

    <plugin name="socket">
        <localpath watch="/opt/tongbu">
            <deshost ip="192.168.138.20" port="8009"/>
        </localpath>
    </plugin>
    <plugin name="refreshCDN">
        <localpath watch="/data0/htdocs/cms.xoyo.com/site/">
            <cdninfo domainname="ccms.chinacache.com" port="80" username="xxxx" passwd="xxxx"/>
            <sendurl base="http://pic.xoyo.com/cms"/>
            <regexurl regex="false" match="cms.xoyo.com/site([/a-zA-Z0-9]*).xoyo.com/images"/>
        </localpath>
    </plugin>
</head>
2.1 Enabling debug
<debug start="false"/>
When set to true, debug mode is turned on, and the inotify events and rsync synchronization commands are printed on the console where sersync is running.
2.2 XFS file system switch
<fileSystem xfs="false"/>
Users of the xfs file system need to enable this option for sersync to work normally.
2.3 File filtering (filter)
<filter start="false">
    <exclude expression="(.*)\.svn"></exclude>
    <exclude expression="(.*)\.gz"></exclude>
    <exclude expression="^info/*"></exclude>
    <exclude expression="^static/*"></exclude>
</filter>
For most applications, you can try setting createFile (one of the file-event options in the inotify section) to false to improve performance and reduce rsync traffic. Copying a file into the monitored directory generates both a create event and a close_write event, so with the create event turned off, only the close_write at the end of the copy is monitored, and the file is still synchronized completely.
Note: createFolder must be kept true. If createFolder is set to false, newly created directories are not monitored, and neither are the files and subdirectories under them, so leave it enabled unless you have special needs. By default, creation events and deletion events for files (directories) are both monitored. If files (directories) on the remote target server never need to be deleted in your project, set delete to false, and delete events will no longer be monitored.
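Since a disabled createFolder silently stops new directories from being watched, a quick check after editing the configuration can help. The sketch below is a hypothetical helper, not part of sersync; it simply greps the configuration file for the createFolder switch and assumes the default install path used in this article.

```shell
#!/bin/sh
# Report whether createFolder is enabled in a sersync config file.
# $1 = path to confxml.xml
check_create_folder() {
    if grep -q '<createFolder[[:space:]]*start="true"' "$1"; then
        echo "ok: createFolder is enabled"
    else
        echo "warning: createFolder is not enabled; new directories will not be watched"
    fi
}

# ASSUMPTION: default location used in this article; pass another path as $1.
conf="${1:-/usr/local/sersync/confxml.xml}"
if [ -f "$conf" ]; then
    check_create_folder "$conf"
fi
```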