DAY 69 rsync remote synchronization

Introduction to rsync

rsync (Remote Sync) is an open-source, fast backup tool that can mirror and synchronize entire directory trees between hosts. It supports incremental backup, preserves links and permissions, and uses an optimized synchronization algorithm that compresses data before transmission, which makes it well suited to remote backup, mirror servers, and similar applications.

The official rsync website is rsync.samba.org/ , and at the time of writing the latest version is 3.1.3, maintained by Wayne Davison. As one of the most commonly used file backup tools, rsync is often installed by default as a basic component on Linux and UNIX systems.

rsync features

Supports copying special files, such as link files and device files.

Can exclude specified files or directories from synchronization, similar to the exclude function of the tar packaging command.

Can preserve the permissions, timestamps, soft and hard links, owner, group, and other attributes of the original file or directory (for example, -p preserves permissions).

Supports incremental synchronization: only changed data is transferred, so transmission is very efficient (compare tar -N).

Can transfer files over rcp, rsh, ssh, and similar channels (rsync itself does not encrypt data).

Can transfer files and data between server and client through a socket (daemon mode).

Supports anonymous or authenticated (no system user required) daemon-mode transfers, which allows convenient and secure data backup and mirroring.

rsync sync source server

In a remote synchronization task, the client responsible for initiating the rsync synchronization operation is called the initiator, and the server responsible for responding to the rsync synchronization operation from the client is called the synchronization source.

  • In downlink synchronization (download), the synchronization source provides the original location of the documents, and the initiator needs read access to that location.

  • In uplink synchronization (upload), the synchronization source provides the target location of the documents, and the initiator needs write access to that location.

Configure rsync downlink synchronization (scheduled synchronization)

Source server: 192.168.137.10

Client (initiator): 192.168.137.20

Configure source server


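 systemctl stop firewalld
 setenforce 0

 rpm -q rsync      # rsync is usually installed by default

 # Create the /etc/rsyncd.conf configuration file
 vim /etc/rsyncd.conf           # add the following settings
 # rsyncd.conf only treats lines that begin with # as comments,
 # so each explanation sits on its own line above the setting.
 uid = root
 gid = root
 # Confine the daemon to the source directory
 use chroot = yes
 # Listening address
 address = 192.168.137.10
 # Listening port, tcp/udp 873; check with: cat /etc/services | grep rsync
 port = 873
 # Log file location
 log file = /var/log/rsyncd.log
 # File that stores the process ID
 pid file = /var/run/rsyncd.pid
 # Client addresses allowed to connect; separate multiple addresses with spaces
 hosts allow = 192.168.137.0/24
 # File types that are not compressed again during synchronization
 dont compress = *.gz *.bz2 *.tgz *.zip *.rar *.z

 # Shared module name
 [wwwroot]
 # Actual path of the source directory
 path = /var/www/html
 # Comment
 comment = Document Root of www.cxk.com
 # Read-only: yes means clients can only read (downlink), not write (uplink)
 read only = yes
 # Authorized accounts allowed to read; separate multiple accounts with spaces
 auth users = backuper
 # Data file that stores the authorized account information
 secrets file = /etc/rsyncd_users.db
 # For anonymous access, simply remove the "auth users" and "secrets file" settings.

 # Create the data file for the backup account.
 vim /etc/rsyncd_users.db
 # No system user of the same name is needed. backuper is the user name, abc123 is the password.
 backuper:abc123

 chmod 600 /etc/rsyncd_users.db

 # Make sure all users have read permission on the source directory /var/www/html
 chmod +r /var/www/html/
 ls -ld /var/www/html/
 # Start the rsync service
 rsync --daemon     # run rsync as a standalone listening service (daemon)
 netstat -anpt | grep rsync

 # Stop the rsync service
 kill $(cat /var/run/rsyncd.pid)
 rm -rf /var/run/rsyncd.pid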
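
The account file only works if it has the right mode and one user:password pair per line, so it can help to check it before starting the daemon. The following is a minimal sketch, not part of rsync: check_secrets_file is a hypothetical helper name, GNU stat is assumed, and a temporary file stands in for /etc/rsyncd_users.db.

```shell
#!/bin/bash
# Sketch: validate an rsync secrets/password file.
# check_secrets_file is a hypothetical helper, not an rsync command.
check_secrets_file() {
    local file=$1 mode
    [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
    # GNU stat: print the permission bits in octal, for example 600.
    mode=$(stat -c '%a' "$file")
    if [ "$mode" != "600" ]; then
        echo "bad mode $mode on $file (expected 600)" >&2
        return 1
    fi
    # Drop blank and comment lines, then look for any line that is not user:password.
    if grep -Ev '^(#.*|[[:space:]]*)$' "$file" | grep -Evq '^[^:[:space:]]+:.+$'; then
        echo "malformed line in $file (expected user:password)" >&2
        return 1
    fi
    return 0
}

# Demo on a temporary file instead of the real /etc/rsyncd_users.db.
demo=$(mktemp)
echo 'backuper:abc123' > "$demo"
chmod 600 "$demo"
check_secrets_file "$demo" && echo "secrets file OK"
```

On the real server the call would be check_secrets_file /etc/rsyncd_users.db.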


Initiator configuration

Basic format:

 rsync [options] SOURCE DESTINATION

Common options:

Option      Function
-r          Recursive mode: include all files in the directory and its subdirectories.
-l          Copy symbolic links as symbolic links.
-v          Display verbose information about the synchronization process.
-z          Compress file data during the transfer.
-a          Archive mode: preserve permissions, attributes, and related information; equivalent to the combined options "-rlptgoD".
-p          Preserve file permissions.
-t          Preserve file timestamps.
-g          Preserve the file's group (fully effective only for the superuser).
-o          Preserve the file's owner (superuser only).
-H          Preserve hard links.
-A          Preserve ACL attribute information.
-D          Preserve device files and other special files.
--delete    Delete files that exist in the target location but not in the original location.
--checksum  Decide which files to skip by checksum instead of by file size and modification time.

Configuration:

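 # Download the specified resource to the local /opt directory as a backup. Password: abc123

 # Format 1: username@host::module
 rsync -avz backuper@192.168.137.10::wwwroot /opt/  # wwwroot is the shared module name; password abc123
 # backuper is the user identity used for the synchronization
 # wwwroot is the module; the module defines the default path and other properties, so only the module name is needed
 # /opt/ is the local directory to synchronize into

 # Format 2: rsync://username@host/module
 rsync -avz rsync://backuper@192.168.137.10/wwwroot /opt/


 # Non-interactive configuration:
 echo "abc123" > /etc/server.pass

 chmod 600 /etc/server.pass    # the password file must have mode 600, so that nobody but the owner can read it

 rsync -avz --password-file=/etc/server.pass backuper@192.168.137.10::wwwroot /opt/     # sync without a password prompt


 # Scheduled synchronization
 crontab -e
 30 22 * * * /usr/bin/rsync -az --delete --password-file=/etc/server.pass backuper@192.168.137.10::wwwroot /opt/
 # To avoid typing the password during synchronization, create a password file that stores the backuper user's password, such as /etc/server.pass, and pass it to rsync with the option "--password-file=/etc/server.pass".

 systemctl restart crond
 systemctl enable crond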
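
A crontab entry consists of five time fields followed by the command. As a small illustration, the sketch below splits the entry used in this setup into its fields, which shows that it runs at 22:30 every day. The variable names are illustrative only.

```shell
#!/bin/bash
# Sketch: split a crontab line into its five time fields and the command.
entry='30 22 * * * /usr/bin/rsync -az --delete --password-file=/etc/server.pass backuper@192.168.137.10::wwwroot /opt/'

# read assigns the first five words to the time fields and the rest to "command".
read -r minute hour dom month dow command <<< "$entry"

echo "minute=$minute hour=$hour day-of-month=$dom month=$month day-of-week=$dow"
echo "command=$command"
```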

rsync real-time synchronization (uplink synchronization)

Shortcomings of scheduled synchronization

  • Backups run at fixed times, so the delay is noticeable and real-time performance is poor
  • When the synchronization source does not change for a long time, frequent periodic tasks are wasted work

The inotify mechanism of the Linux kernel

  • Available since kernel version 2.6.13
  • Monitors changes in the file system and sends notifications in response
  • Auxiliary software: inotify-tools

Initiator configuration: rsync + inotify

  • The inotify notification interface can monitor various changes in the file system, such as file access, deletion, movement, and modification. This mechanism makes it easy to implement file change alarms and incremental backups, and to respond promptly to changes in directories or files.

  • Combining the inotify mechanism with the rsync tool gives triggered backup (real-time synchronization): as soon as a document in the original location changes, an incremental backup starts immediately; otherwise the system waits silently.

  • Because the inotify notification mechanism is provided by the Linux kernel, it is mainly used for local monitoring, so in triggered backup it is better suited to uplink synchronization.

Modify the rsync source server configuration file


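 vim /etc/rsyncd.conf
 # Turn off read-only; uplink synchronization needs write access
 read only = no

 # Then restart the service
 kill $(cat /var/run/rsyncd.pid)
 rm -rf /var/run/rsyncd.pid
 rsync --daemon
 netstat -anpt | grep rsync

 # Create a synchronization directory and change its permissions
 mkdir /data
 chmod 777 /data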
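
The same change can be made without opening an editor. The following is a minimal sketch that assumes GNU or BSD sed and uses a temporary copy in place of /etc/rsyncd.conf, so it is safe to run as a demo. Note that the expression switches every "read only = yes" line in the file, not just the one in a particular module.

```shell
#!/bin/bash
# Sketch: switch "read only" from yes to no with sed.
# A temporary file stands in for /etc/rsyncd.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
[wwwroot]
path = /var/www/html
read only = yes
EOF

# Edit in place and keep the original as a .bak copy.
sed -i.bak -E 's/^([[:space:]]*read only[[:space:]]*=[[:space:]]*)yes[[:space:]]*$/\1no/' "$conf"

grep 'read only' "$conf"
```

After editing the real file, the daemon still has to be restarted as shown above for the change to take effect.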

Initiator, adjust inotify kernel parameters

In the Linux kernel, the inotify mechanism provides three control parameters:

  • max_queued_events (size of the monitoring event queue, default 16384),
  • max_user_instances (maximum number of inotify instances per user, default 128),
  • max_user_watches (maximum number of watches per user, default 8192).

When the number of directories and files to be monitored is large, or they change frequently, it is recommended to increase these three values.

 cat /proc/sys/fs/inotify/max_queued_events
 cat /proc/sys/fs/inotify/max_user_instances
 cat /proc/sys/fs/inotify/max_user_watches
 ​
 vim /etc/sysctl.conf    # kernel parameters are changed in this file
 fs.inotify.max_queued_events = 16384
 fs.inotify.max_user_instances = 1024
 fs.inotify.max_user_watches = 1048576
 ​
 sysctl -p
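
To judge whether the limits are large enough, it helps to compare them with the size of the tree to be monitored. A recursive inotifywait places one watch on every directory, so the directory count is a lower bound for max_user_watches. The sketch below assumes this; show_inotify_limits and count_dirs are hypothetical helper names, and a temporary tree stands in for /data.

```shell
#!/bin/bash
# Sketch: show the inotify limits and count the directories that need a watch.
# show_inotify_limits and count_dirs are hypothetical helper names.

# Print each limit found under the given directory (default: the kernel's).
show_inotify_limits() {
    local dir=${1:-/proc/sys/fs/inotify} name
    for name in max_queued_events max_user_instances max_user_watches; do
        if [ -r "$dir/$name" ]; then
            echo "$name = $(cat "$dir/$name")"
        else
            echo "$name = (not available)"
        fi
    done
}

# Count all directories in a tree, including the top-level one.
count_dirs() {
    find "$1" -type d | wc -l
}

show_inotify_limits
tree=$(mktemp -d)                 # stand-in for /data
mkdir -p "$tree/a/b" "$tree/c"
needed=$(count_dirs "$tree")
echo "directories to watch: $needed"
```

If the count for the real directory approaches max_user_watches, raise the value in /etc/sysctl.conf as shown above.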

Initiator, install inotify-tools

To use the inotify mechanism, install inotify-tools, which provides the inotifywait and inotifywatch helper programs for monitoring changes and summarizing them.

  • inotifywait: monitors events such as modify, create, move, delete, and attrib (attribute change), and prints the result immediately when a change occurs.

  • inotifywatch: collects file system changes and prints a summary after it finishes running.

 tar zxvf inotify-tools-3.14.tar.gz -C /opt/
 ​
 cd /opt/inotify-tools-3.14
 ./configure
 make && make install
 ​
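 # Run the inotifywait command first, then open a second terminal, add and move files under /data, and follow the output in the first terminal.
 inotifywait -mrq -e modify,create,move,delete /data

 # Option -e: specifies which events to monitor
 # Option -m: monitor continuously
 # Option -r: recurse through the whole directory
 # Option -q: reduce the output to the essentials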
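
By default inotifywait prints one line per event in the form "watched-directory EVENT[,EVENT...] file-name", which a while/read loop can split into three fields. The sketch below feeds a few hypothetical sample lines through such a loop instead of running inotifywait, so it works without inotify-tools installed; handle_events is an illustrative name.

```shell
#!/bin/bash
# Sketch: split inotifywait's default output into directory, event, and file.
# The sample lines are hypothetical; real ones would come from
# "inotifywait -mrq -e modify,create,move,delete /data".
handle_events() {
    local directory event file
    while read -r directory event file; do
        echo "dir=$directory event=$event file=$file"
    done
}

sample='/data/ CREATE a.txt
/data/ MODIFY a.txt
/data/sub/ CREATE,ISDIR logs'

result=$(handle_events <<< "$sample")
echo "$result"
```

Because file is the last variable, it receives the rest of the line, so file names that contain spaces stay intact.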


Initiator, write trigger synchronization script

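Write the trigger synchronization script in another terminal. Note that the script name must not contain the string rsync: the script checks for running transfers with pgrep rsync, which would also match the script itself, so the synchronization might never start.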


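 vim /opt/inotify.sh
 #!/bin/bash

 # Variable holding the inotifywait command that monitors file events in /data. attrib means attribute change.
 INOTIFY_CMD="inotifywait -mrq -e modify,create,attrib,move,delete /data"

 # Variable holding the rsync uplink synchronization command. --delete keeps both sides identical and is optional.
 RSYNC_CMD="rsync -azH --delete --password-file=/etc/server.pass /data backuper@192.168.137.10::backupdir/"

 # Use while and read to consume the monitoring output continuously; each record read can be checked further if needed
 $INOTIFY_CMD | while read DIRECTORY EVENT FILE
 do
    # Start rsync immediately if it is not already running
    if [ $(pgrep rsync | wc -l) -le 0 ]; then
         $RSYNC_CMD
    fi
 done

 chmod +x /opt/inotify.sh

 chmod +x /etc/rc.d/rc.local     # boot-time startup script
 echo '/opt/inotify.sh' >> /etc/rc.d/rc.local  # run automatically at boot

 # Then run the script (in the background)
 cd /opt/
 ./inotify.sh &

 # Then create files on the initiator and check whether they appear on the source server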

If the files to be synchronized are large and synchronization is slow, changes that arrive while rsync is still running are skipped and never synchronized. In that case, make the script wait (a simple form of queue or buffer) until the running rsync has finished:

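 #!/bin/bash
 # Variable holding the inotifywait command that monitors file events in the directory
 INOTIFY_CMD="inotifywait -mrq -e modify,create,attrib,move,delete /data/"
 # Variable holding the rsync uplink synchronization command
 RSYNC_CMD="rsync -azH --delete --password-file=/etc/server.pass /data/ backuper@192.168.137.10::backupdir/"
 # Use while and read to consume the monitoring output continuously; each record read can be checked further if needed
 $INOTIFY_CMD | while read DIRECTORY EVENT FILE
 do
       # Wait until no rsync process is running, then synchronize
       until [ $(pgrep rsync | wc -l) -le 0 ]
       do
          sleep 1
       done
       $RSYNC_CMD
 done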
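
The waiting loop still starts one rsync run per event, so a burst of changes triggers many runs in a row. A different technique, not used in the script above, is to coalesce (debounce) a burst: after the first event, keep reading until the stream stays quiet for a moment, then synchronize once. The following is only a sketch of that idea; run_sync is a hypothetical stand-in for "$RSYNC_CMD" that merely counts calls, and the event stream is simulated with printf.

```shell
#!/bin/bash
# Sketch: collapse a burst of events into a single sync run.
# run_sync is a hypothetical stand-in for "$RSYNC_CMD"; here it only counts calls.
SYNC_COUNT=0
run_sync() { SYNC_COUNT=$((SYNC_COUNT + 1)); }

# After the first event, keep draining input until it stays quiet for
# $1 seconds (or the stream ends), then sync once for the whole burst.
coalesce_events() {
    local quiet=$1 line
    while read -r line; do
        while read -r -t "$quiet" line; do
            :   # discard follow-up events that belong to the same burst
        done
        run_sync
    done
}

# Simulated burst of five events. A real script would read from inotifywait:
#   coalesce_events 2 < <(inotifywait -mrq -e modify,create,attrib,move,delete /data/)
coalesce_events 1 < <(printf '/data/ CREATE f%d\n' 1 2 3 4 5)
echo "events: 5, sync runs: $SYNC_COUNT"
```

Since rsync compares the whole directory on every run, one run after the burst picks up all of the changes that the discarded events announced.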


Verify sync effect

The above script detects changes in the local /data directory. As soon as something is updated, it triggers an rsync synchronization that uploads the backup to the shared module on the server 192.168.137.10. The module named in RSYNC_CMD (backupdir in the scripts above) must be defined in /etc/rsyncd.conf on that server.

The verification process for triggered uplink synchronization is as follows:

(1) Run the /opt/inotify.sh script on the local machine.

(2) Switch to the local /data/ directory and add, delete, and modify some files.

(3) Check the changes in the shared module's directory on the remote server.

Quickly delete large numbers of files with rsync

If you need to delete a very large number of files under Linux, say 1 million or 10 million, such as the nginx cache in /usr/local/nginx/proxy_temp, then rm -rf * may not be practical, because it takes a long time.

In this case rsync can handle the job neatly.

The trick works by replacement: rsync synchronizes an empty directory onto the target with deletion enabled, so everything in the target is removed.

 # First create an empty directory:
 mkdir /home/blank

 # Use rsync to empty the target directory:
 rsync --delete-before -a -H -v --progress --stats /home/blank/ /usr/local/nginx/proxy_temp

 # The target directory is emptied quickly this way

Option description:

Option           Effect
--delete-before  The receiver deletes files before the transfer begins
-a               Archive mode: transfer recursively and preserve file attributes
-H               Preserve hard links
-v               Verbose output
--progress       Show progress during the transfer
--stats          Print file-transfer statistics
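
Because this command deletes everything in the target, a mistyped path is costly. The following is a minimal sketch of a guard that could run first; build_empty_cmd is a hypothetical helper that validates the target and only prints the command instead of executing it, and temporary directories stand in for /home/blank and the cache directory.

```shell
#!/bin/bash
# Sketch: refuse obviously dangerous targets before emptying a directory.
# build_empty_cmd is a hypothetical helper; it only prints the command.
build_empty_cmd() {
    local blank=$1 target=$2
    # The target must be an existing directory.
    [ -n "$target" ] && [ -d "$target" ] || { echo "not a directory: $target" >&2; return 1; }
    # Resolve the path and refuse to operate on the root directory.
    [ "$(cd "$target" && pwd -P)" != "/" ] || { echo "refusing to empty /" >&2; return 1; }
    # The trailing slash on the source makes rsync sync its (empty) contents.
    echo "rsync --delete-before -a -H -v --progress --stats ${blank%/}/ ${target%/}/"
}

blank=$(mktemp -d)     # stand-in for /home/blank
target=$(mktemp -d)    # stand-in for the directory to empty
cmd=$(build_empty_cmd "$blank" "$target") && echo "$cmd"
```

Reviewing the printed command before running it keeps the destructive step explicit.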

Origin blog.csdn.net/weixin_57560240/article/details/130914826