Real-time file synchronization on Linux with Sersync + Rsync

Reference article:
https://blog.51cto.com/liubao0312/1677586
https://github.com/wsgzao/sersync


1. Introduction

1.1 Introduction to Sersync

(1) Sersync is built on inotify, similar to inotify-tools.

(2) Sersync records the name of each specific file or directory that has changed (added, deleted, or modified) in the monitored directory, so that rsync then synchronizes only the changed files or directories.


1.2, the difference between the rsync+inotify-tools and rsync+sersync architectures

(1) rsync+inotify-tools

a. In this setup, inotify is used only to detect that something in the monitored directory has changed (addition, deletion, modification); the specific file or directory that changed is not passed on to rsync;

b. Because rsync is not told which file or directory has changed, it synchronizes the entire directory every time. When the amount of data is large, this is very time-consuming (rsync has to traverse the whole directory to find and compare files), so the efficiency is very low.



(2) rsync+sersync

a. Sersync records the name of each specific file or directory that has changed (addition, deletion, modification) in the monitored directory;

b. During synchronization, rsync only handles the files or directories that have changed. The data that changes each time is small compared with the whole synchronization directory, so rsync has little to traverse and compare, and the efficiency is very high.


Summary:

    When the synchronized directory holds little data, rsync+inotify is recommended

    When the synchronized directory holds a lot of data (several hundred GB, or more than 1 TB), rsync+sersync is recommended
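To make the difference concrete, here is a rough sketch of the rsync command each approach ends up running. It is an illustration only, not sersync's actual implementation: the helper names full_sync_cmd and changed_sync_cmd are made up, the rsync flags are an assumption, and the destination assumes the user hadoop and the data module on the Slave from the environment used later in this article.

```shell
# Sketch only: these helpers just print the rsync command each approach would run.

# inotify-tools style: any event triggers a sync of the whole watched directory.
# $1 = watched directory, $2 = rsync destination
full_sync_cmd() {
    printf 'rsync -az --delete %s/ %s\n' "$1" "$2"
}

# sersync style: only the path that changed is handed to rsync.
# $1 = watched directory, $2 = changed path inside it, $3 = rsync destination
changed_sync_cmd() {
    rel=${2#"$1"/}    # path of the changed file relative to the watched directory
    printf 'cd %s && rsync -azR ./%s %s\n' "$1" "$rel" "$3"
}

full_sync_cmd /home/hadoop/hub hadoop@192.168.43.97::data
changed_sync_cmd /home/hadoop/hub /home/hadoop/hub/docs/a.txt hadoop@192.168.43.97::data
```

In the first case rsync has to walk everything under the watched directory; in the second it is handed a single relative path, which is why the amount of data in the directory matters so much for the choice.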


2. How it works

Steps:

  1. Start the sersync service on the master server (Master). sersync monitors the configured path for real-time file changes (users creating, modifying, and deleting files on this server);

  2. The Master calls the rsync command to push the updated files to the target servers (Slave1 and Slave2);

  3. sersync therefore needs to be configured on the master server, and rsync on the target servers.


The principle is shown in the figure:
[Figure: sersync on the Master pushing changes through rsync to Slave1 and Slave2]


3. Configuration

Environment description:
System: Ubuntu 18.04
Master: 192.168.43.166
Slave: 192.168.43.97


3.1, configure rsync on the target server (Slave)

3.1.1, start rsync service at boot

Ubuntu 18.04 installs rsync by default, but the rsync service is not started by default. Edit the file with sudo vi /etc/default/rsync and set:

RSYNC_ENABLE=true   # change false to true

3.1.2, modify the configuration file

First copy the sample configuration file to /etc so that it is easy to modify

sudo cp /usr/share/doc/rsync/examples/rsyncd.conf /etc

Modify the configuration file

sudo vi /etc/rsyncd.conf

Change it to the following:
[Figure: screenshot of the modified /etc/rsyncd.conf]

Note: If the directory given by path (the synchronization directory) does not exist, you need to create it yourself
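As a rough idea of what such a configuration could look like, the sketch below writes a minimal rsyncd.conf with a data module to a scratch file. The module name data, the user hadoop, the secrets file and the Master address come from this article; the uid/gid, the log and pid file locations and especially the path /home/hadoop/backup are assumptions, so adjust them to your own setup. Comments sit on their own lines because rsyncd.conf only treats lines that begin with # as comments.

```shell
# Write a sketch of rsyncd.conf to a scratch file for review.
# Nothing under /etc is touched; copy it over with sudo once it looks right.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Sketch only; values marked "assumption" are not from the original article.
# assumption: account that owns the synchronization directory
uid = hadoop
gid = hadoop
use chroot = no
# 0 means no connection limit
max connections = 0
# assumption: pid and log file locations
pid file = /var/run/rsyncd.pid
log file = /var/log/rsyncd.log

[data]
# assumption: synchronization directory on the Slave; create it if missing
path = /home/hadoop/backup
comment = sersync target
# the Master pushes files, so the module must be writable
read only = no
auth users = hadoop
secrets file = /etc/rsyncd.secrets
# only accept connections from the Master
hosts allow = 192.168.43.166
EOF
echo "review $conf, then: sudo cp $conf /etc/rsyncd.conf"
```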


3.1.3, create a password file

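Create a password file with sudo vi /etc/rsyncd.secrets. It contains a single line in the form username:password. Do not put anything else on that line, since rsync reads everything after the colon as the password:

hadoop:123456

Note: This is the user name and password that the Master uses to authenticate against the Slave's rsync service. They must match what is configured on the Master, so make no mistake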

Set the permissions of the password file to 0600: sudo chmod 0600 /etc/rsyncd.secrets
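rsync checks the permissions of password files and can refuse one that other users are able to read, so it is worth confirming the mode before going on. The following is a minimal sketch of such a check; the function name secrets_mode_ok is made up for illustration, and stat -c assumes GNU coreutils, as shipped with Ubuntu.

```shell
# Return 0 only if the password file has mode 600 (read/write for the owner alone).
# $1 = path of the password file
secrets_mode_ok() {
    mode=$(stat -c '%a' "$1" 2>/dev/null) || return 2
    [ "$mode" = "600" ]
}

# Example: check the file created above.
if secrets_mode_ok /etc/rsyncd.secrets; then
    echo "permissions ok"
else
    echo "fix with: sudo chmod 0600 /etc/rsyncd.secrets"
fi
```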


3.1.4, start rsync

sudo /etc/init.d/rsync start

At this point, the Slave's rsync configuration is complete


3.1.5, test

Now test the Slave's rsync from the Master.

Create a password file on the Master with sudo vi /etc/rsyncd.secrets. It contains only the client password, with nothing else on the line:

123456

Set the permissions of the password file to 0600: sudo chmod 0600 /etc/rsyncd.secrets

cd ~                  # go to the home directory
sudo vi hello.txt     # create hello.txt with any content
rsync -avzP hello.txt hadoop@192.168.43.97::data --password-file=/etc/rsyncd.secrets

Explanation: This uploads the local hello.txt file to the path specified by the data module on the Slave. Because a password file is used, the upload needs no interactive password prompt, much like git push


Important: This test must succeed, otherwise the following configuration will not work


3.2, configure Sersync on the master server (Master)

3.2.1, download Sersync

Download link: https://github.com/wsgzao/sersync



Unzip:

tar -zxf sersync2.5.4_64bit_binary_stable_final.tar.gz -C /usr/local/
cd /usr/local/
mv GNU-Linux-x86 sersync

3.2.2, configure Sersync

cd /usr/local/sersync
cp confxml.xml confxml.xml-bak
sudo vi confxml.xml

Change it as follows:
[Figure: screenshot of the modified confxml.xml]
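Judging from the sersync output shown in the next step (watch path /home/hadoop/hub, user hadoop, password file /etc/rsyncd.secrets, timeout 100) and the Slave address from the environment description, the changed parts of confxml.xml presumably look roughly like the fragment below. This is a reconstruction under those assumptions, not a copy of the author's file; the element and attribute names follow the default configuration file listed in the appendix, and the rest of the file is left as shipped.

```shell
# Write the parts of confxml.xml that presumably changed to a scratch file,
# so they can be compared with the real file by eye.
fragment=$(mktemp)
cat > "$fragment" <<'EOF'
<sersync>
    <!-- directory watched on the Master -->
    <localpath watch="/home/hadoop/hub">
        <!-- Slave address and the rsync module name -->
        <remote ip="192.168.43.97" name="data"/>
    </localpath>
    <rsync>
        <commonParams params="-artuz"/>
        <!-- rsync user and the password file created on the Master -->
        <auth start="true" users="hadoop" passwordfile="/etc/rsyncd.secrets"/>
        <userDefinedPort start="false" port="874"/>
        <timeout start="true" time="100"/>
        <ssh start="false"/>
    </rsync>
</sersync>
EOF
cat "$fragment"
```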

3.2.3, start the sersync daemon to synchronize data

/usr/local/sersync/sersync2  -d -r -o /usr/local/sersync/confxml.xml

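If output like the following is returned, the start was successful (the "Permission denied" lines appear because sersync was not started as root here, so it could not raise the inotify limits):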

set the system param
execute:echo 50000000 > /proc/sys/fs/inotify/max_user_watches
sh: 1: cannot create /proc/sys/fs/inotify/max_user_watches: Permission denied
execute:echo 327679 > /proc/sys/fs/inotify/max_queued_events
sh: 1: cannot create /proc/sys/fs/inotify/max_queued_events: Permission denied
parse the command param
option: -d 	run as a daemon
option: -r 	rsync all the local files to the remote servers before the sersync work
option: -o 	config xml name:  /usr/local/sersync/confxml.xml
daemon thread num: 10
parse xml config file
host ip : localhost	host port: 8008
daemon start,sersync run behind the console 
use rsync password-file :
user is	hadoop
passwordfile is 	/etc/rsyncd.secrets
config xml parse success
please set /etc/rsyncd.conf max connections=0 Manually
sersync working thread 12  = 1(primary thread) + 1(fail retry thread) + 10(daemon sub threads) 
Max threads numbers is: 22 = 12(Thread pool nums) + 10(Sub threads)
please according your cpu ,use -n param to adjust the cpu rate
------------------------------------------
rsync the directory recursivly to the remote servers once
working please wait...
execute command: cd /home/hadoop/hub && rsync -artuz -R --delete ./  --timeout=100 hadoop@192.168.43.97::data --password-file=/etc/rsyncd.secrets >/dev/null 2>&1 
run the sersync: 
watch path is: /home/hadoop/hub
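The two "Permission denied" lines in this output mean that the inotify limits sersync tried to set were not applied. A minimal sketch for checking this is shown below: it compares the current limits with the values from the output and prints the sysctl commands that would raise them. The function name inotify_limit_cmds is made up for illustration, and whether limits this high are needed depends on how many files are watched.

```shell
# Print the sysctl commands needed to reach the limits sersync asks for.
# $1 = directory holding the inotify limits (defaults to /proc/sys/fs/inotify)
inotify_limit_cmds() {
    dir=${1:-/proc/sys/fs/inotify}
    for pair in max_user_watches=50000000 max_queued_events=327679; do
        name=${pair%%=*}    # part before "="
        want=${pair#*=}     # part after "="
        have=$(cat "$dir/$name" 2>/dev/null || echo 0)
        if [ "$have" -lt "$want" ]; then
            echo "sudo sysctl -w fs.inotify.$name=$want"
        fi
    done
}

inotify_limit_cmds
```

Values set with sysctl -w last until the next reboot; to keep them, the same keys can also be added to /etc/sysctl.conf.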


At this point, sersync+rsync is configured. Any file you create, delete, or modify in the synchronization directory is synchronized to the Slave in real time.

You can also add /usr/local/sersync/sersync2 -d -r -o /usr/local/sersync/confxml.xml to rc.local, so that the Master starts file synchronization at boot
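One way to do this is sketched below, under the assumption that rc.local is used as described; the function name add_sersync_to_rc_local is made up for illustration. On Ubuntu 18.04 the file /etc/rc.local usually does not exist by default and must be executable to be run at boot, so the sketch creates it when missing. If an existing rc.local ends with exit 0, move the appended line above it by hand.

```shell
# Add the sersync start command to an rc.local file, once.
# $1 = path of the rc.local file (normally /etc/rc.local, which needs root)
add_sersync_to_rc_local() {
    rc=$1
    line='/usr/local/sersync/sersync2 -d -r -o /usr/local/sersync/confxml.xml'
    # create the file with a shebang if it does not exist yet
    [ -f "$rc" ] || printf '#!/bin/sh\n' > "$rc"
    # append the command only if it is not already present
    grep -qxF -- "$line" "$rc" || printf '%s\n' "$line" >> "$rc"
    # rc.local is only run at boot if it is executable
    chmod +x "$rc"
}

# Try it on a scratch file first.
scratch=$(mktemp -d)/rc.local
add_sersync_to_rc_local "$scratch"
cat "$scratch"
```

Calling the function a second time leaves the file unchanged, so it is safe to rerun.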



Appendix:
1. Parameter description

./sersync -r
    Before enabling real-time monitoring, perform one overall synchronization between the directory on the main server and the directory on the remote target machine.

./sersync -o xx.xml
    Without the -o parameter, sersync uses the default configuration file confxml.xml in the directory of the sersync executable. With the -o parameter, you can specify different configuration files, which allows several sersync processes (instances) to synchronize data side by side.

./sersync -n num
    Specify the total number of threads in the default thread pool. For example, ./sersync -n 5 sets the total number of threads to 5. If not specified, the thread pool starts 10 threads by default. If CPU usage is too high, lower this value; on a powerful machine, raise it to improve synchronization efficiency.

./sersync -d
    Run as a background service (daemon). Usually the -r parameter is used first to synchronize the local directory to the remote as a whole, and the daemon then performs real-time synchronization in the background. For the first overall synchronization, -d and -r are often combined.

./sersync -m
    Do not synchronize, only run the plug-in. For example, with ./sersync -m command, after an event is detected the remote target server is not synchronized, and the command plug-in is run directly.

2. Configuration file description

The default configuration file is as follows:

<?xml version="1.0" encoding="ISO-8859-1"?>
<head version="2.5">
    <host hostip="localhost" port="8008"></host>
    <debug start="false"/>
    <fileSystem xfs="false"/>
    <filter start="false">
        <exclude expression="(.*)\.svn"></exclude>
        <exclude expression="(.*)\.gz"></exclude>
        <exclude expression="^info/*"></exclude>
        <exclude expression="^static/*"></exclude>
    </filter>
    <inotify>
        <delete start="true"/>
        <createFolder start="true"/>
        <createFile start="false"/>
        <closeWrite start="true"/>
        <moveFrom start="true"/>
        <moveTo start="true"/>
        <attrib start="false"/>
        <modify start="false"/>
    </inotify>

    <sersync>
        <localpath watch="/opt/tongbu">
            <remote ip="127.0.0.1" name="tongbu1"/>
            <!--<remote ip="192.168.8.39" name="tongbu"/>-->
            <!--<remote ip="192.168.8.40" name="tongbu"/>-->
        </localpath>
        <rsync>
            <commonParams params="-artuz"/>
            <auth start="false" users="root" passwordfile="/etc/rsync.pas"/>
            <userDefinedPort start="false" port="874"/><!-- port=874 -->
            <timeout start="false" time="100"/><!-- timeout=100 -->
            <ssh start="false"/>
        </rsync>
        <failLog path="/tmp/rsync_fail_log.sh" timeToExecute="60"/><!--default every 60mins execute once-->
        <crontab start="false" schedule="600"><!--600mins-->
            <crontabfilter start="false">
                <exclude expression="*.php"></exclude>
                <exclude expression="info/*"></exclude>
            </crontabfilter>
        </crontab>
        <plugin start="false" name="command"/>
    </sersync>

    <plugin name="command">
        <param prefix="/bin/sh" suffix="" ignoreError="true"/> <!--prefix /opt/tongbu/mmm.sh suffix-->
        <filter start="false">
            <include expression="(.*)\.php"/>
            <include expression="(.*)\.sh"/>
        </filter>
    </plugin>

    <plugin name="socket">
        <localpath watch="/opt/tongbu">
            <deshost ip="192.168.138.20" port="8009"/>
        </localpath>
    </plugin>
    <plugin name="refreshCDN">
        <localpath watch="/data0/htdocs/cms.xoyo.com/site/">
            <cdninfo domainname="ccms.chinacache.com" port="80" username="xxxx" passwd="xxxx"/>
            <sendurl base="http://pic.xoyo.com/cms"/>
            <regexurl regex="false" match="cms.xoyo.com/site([/a-zA-Z0-9]*).xoyo.com/images"/>
        </localpath>
    </plugin>
</head>

2.1, Debug switch

    <debug start="false"/>

Set this to true to turn on debug mode; the inotify events and the rsync synchronization commands are then printed on the console where sersync is running;


2.2, XFS file system switch

    <fileSystem xfs="false"/>

If you use the xfs file system, you need to enable this option for sersync to work normally;


2.3, file filtering (filter)

    <filter start="false">
        <exclude expression="(.*)\.svn"></exclude>
        <exclude expression="(.*)\.gz"></exclude>
        <exclude expression="^info/*"></exclude>
        <exclude expression="^static/*"></exclude>
    </filter>

Set start to true to enable filtering; files and directories whose paths match one of the exclude expressions are then not synchronized.

In the inotify section, for most applications you can set createFile (the option for monitoring file-creation events) to false to improve performance and reduce rsync traffic. Copying a file into the monitored directory generates both a create event and a close_write event, so with the create event turned off only the close_write at the end of the copy is monitored, and the file is still synchronized completely;

Note: createFolder must be kept as true. If createFolder is set to false, newly created directories are not monitored, and neither are the files and subdirectories under them; so unless you have special needs, leave it enabled. By default, both creation and deletion events for files (directories) are monitored. If files (directories) on the remote target server do not need to be deleted in your project, set the delete parameter to false, and delete events are then not monitored;

Origin blog.csdn.net/lendsomething/article/details/109134734