Original link: https://www.ruanyifeng.com/blog/2020/08/rsync.html
1. Introduction
rsync is a commonly used Linux application for file synchronization.
It can synchronize files between a local computer and a remote computer, or between two local directories (synchronization between two remote computers is not supported). It can also be used as a file copy tool, replacing the cp and mv commands.
The r in its name stands for remote; rsync means "remote sync". Unlike other file transfer tools (such as FTP or scp), the defining feature of rsync is that it checks the files already present on the sender and the receiver and transfers only the parts that have changed (by default, a file counts as changed if its size or modification time differs).
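The "only transfer what changed" behavior can be observed locally. The following is a minimal sketch, assuming rsync is installed; the scratch directory and the names src, dst, a.txt and b.txt are made up for illustration. It uses the -a option (covered in section 3.2) and --out-format='%n', which prints the name of each item rsync updates.

```shell
# Demo: a second rsync run skips files whose size and mtime are unchanged.
work=$(mktemp -d)                 # hypothetical scratch directory
trap 'rm -rf "$work"' EXIT        # clean up when the script exits
mkdir "$work/src"
echo "first"  > "$work/src/a.txt"
echo "second" > "$work/src/b.txt"

# Initial sync: both files are new, so both are transferred.
rsync -a "$work/src/" "$work/dst/"

# Change the size of b.txt only, then sync again and record what was updated.
echo "more text" >> "$work/src/b.txt"
changed=$(rsync -a --out-format='%n' "$work/src/" "$work/dst/")
echo "updated on second run: $changed"
```

On the second run only b.txt should be listed, because a.txt still has the same size and modification time on both sides.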
2. Installation
If rsync is not installed on the local or remote computer, you can use the following command to install it.
# Debian
$ sudo apt-get install rsync

# Red Hat
$ sudo yum install rsync

# Arch Linux
$ sudo pacman -S rsync
Note that rsync must be installed on both sides of the transfer.
3. Basic usage
3.1 The -r parameter
When used on the local machine, the rsync command can serve as an alternative to the cp and mv commands, synchronizing the source directory to the target directory.
$ rsync -r source destination
In the above command, -r means recursive, that is, subdirectories are included. Note that -r is required when the source is a directory; otherwise rsync skips the directory and copies nothing. source is the source directory, and destination is the target directory.
If there are multiple files or directories that need to be synchronized, it can be written as follows.
$ rsync -r source1 source2 destination
In the above command, source1 and source2 are both synchronized into the destination directory.
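The multi-source form can be tried safely in a throwaway directory. This is a small sketch, assuming rsync is installed; the scratch directory and the files a.txt and b.txt are invented for the demonstration, while source1, source2 and destination match the names used in the text.

```shell
# Demo: copy two source directories into one destination with -r.
work=$(mktemp -d)                 # hypothetical scratch directory
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/source1/sub" "$work/source2"
echo "one" > "$work/source1/sub/a.txt"
echo "two" > "$work/source2/b.txt"

# Run inside the scratch directory so the relative names match the text.
(cd "$work" && rsync -r source1 source2 destination)

# Both source directories, including subdirectories, now sit under destination/.
(cd "$work" && find destination -type f | sort)
```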
3.2 The -a parameter
The -a parameter can replace -r. In addition to synchronizing recursively, it also synchronizes meta information (such as modification time and permissions). Since rsync by default uses file size and modification time to decide whether a file needs to be updated, -a is more useful than -r.
The following is the usual form.
$ rsync -a source destination
If the target directory destination does not exist, rsync creates it automatically. After the above command is executed, the source directory source is copied in its entirety into the target directory destination, forming the directory structure destination/source.
If you only want to synchronize the contents of the source directory source into the target directory destination, you need to add a slash after the source directory.
$ rsync -a source/ destination
After the above command is executed, the contents of the source directory are copied into the destination directory, and no source subdirectory is created under destination.
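The effect of the trailing slash is easy to get wrong, so it can help to compare both forms side by side. The sketch below assumes rsync is installed; the scratch directory and the target names with_dir and contents_only are hypothetical.

```shell
# Demo: trailing slash on the source decides whether the directory itself is copied.
work=$(mktemp -d)                 # hypothetical scratch directory
trap 'rm -rf "$work"' EXIT
mkdir "$work/source"
echo "data" > "$work/source/file.txt"

# No trailing slash: the directory itself is copied -> with_dir/source/file.txt
rsync -a "$work/source"  "$work/with_dir"

# Trailing slash: only the contents are copied      -> contents_only/file.txt
rsync -a "$work/source/" "$work/contents_only"

(cd "$work" && find with_dir contents_only -type f | sort)
```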
3.3 The -n parameter
If you are not sure what the result of running rsync will be, you can first use the -n or --dry-run parameter to simulate the execution.
$ rsync -anv source/ destination
In the above command, the -n parameter simulates the result of the command without actually executing it. The -v parameter prints the result to the terminal, so that you can see what would be synchronized.
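To confirm that a dry run really leaves the target alone, one can check the target before and after. A minimal sketch, assuming rsync is installed; the scratch directory, file.txt and the dry_state variable are invented for this example.

```shell
# Demo: -n reports what would happen but does not touch the destination.
work=$(mktemp -d)                 # hypothetical scratch directory
trap 'rm -rf "$work"' EXIT
mkdir "$work/source" "$work/destination"
echo "data" > "$work/source/file.txt"

# Dry run: the file is listed, but nothing is written.
rsync -anv "$work/source/" "$work/destination"
if [ -e "$work/destination/file.txt" ]; then
  dry_state="modified"
else
  dry_state="untouched"
fi
echo "destination after dry run: $dry_state"

# The same command without -n performs the copy.
rsync -av "$work/source/" "$work/destination"
```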
3.4 The --delete parameter
By default, rsync only ensures that all contents of the source directory (except files explicitly excluded) are copied to the target directory. It does not make the two directories identical, and it does not delete files. To make the target directory a mirror of the source directory, you must use the --delete parameter, which deletes files that exist only in the target directory and not in the source directory.
$ rsync -av --delete source/ destination
In the above command, the --delete parameter makes destination a mirror of source.
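The difference between a plain sync and a mirror can be shown with one stale file. This is a sketch under the assumption that rsync is installed; the scratch directory and the files keep.txt and stale.txt are hypothetical.

```shell
# Demo: --delete removes files that exist only in the destination.
work=$(mktemp -d)                 # hypothetical scratch directory
trap 'rm -rf "$work"' EXIT
mkdir "$work/source" "$work/destination"
echo "keep"  > "$work/source/keep.txt"
echo "stale" > "$work/destination/stale.txt"   # exists only in destination

# Without --delete the extra file survives the sync.
rsync -a "$work/source/" "$work/destination"
before=$(ls "$work/destination")
echo "without --delete:"; echo "$before"

# With --delete it is removed, so destination mirrors source.
rsync -a --delete "$work/source/" "$work/destination"
after=$(ls "$work/destination")
echo "with --delete:"; echo "$after"
```

Because --delete is destructive, combining it with -n for a first trial run is a reasonable precaution.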
4. Exclude files
4.1 The --exclude parameter
Sometimes we want to exclude certain files or directories from synchronization. In that case, the --exclude parameter specifies the exclusion pattern.
$ rsync -av --exclude='*.txt' source/ destination
# or
$ rsync -av --exclude '*.txt' source/ destination
The above command excludes all TXT files.
Note that rsync synchronizes hidden files (those whose names start with a dot). To exclude hidden files, you can write --exclude=".*".
If you want to exclude all files in a certain directory, but do not want to exclude the directory itself, you can write it as follows.
$ rsync -av --exclude 'dir1/*' source/ destination
Multiple exclusion patterns can be given with multiple --exclude parameters.
$ rsync -av --exclude 'file1.txt' --exclude 'dir1/*' source/ destination
Multiple exclusion patterns can also be written with a single --exclude argument by taking advantage of Bash's brace expansion. Note that there must be no space inside the braces, otherwise Bash does not expand them.
$ rsync -av --exclude={'file1.txt','dir1/*'} source/ destination
If there are many exclusion patterns, you can write them to a file, one pattern per line, and specify this file with the --exclude-from parameter.
$ rsync -av --exclude-from='exclude-file.txt' source/ destination
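The patterns discussed in this section can be combined in one exclude file and checked against a small test tree. The following sketch assumes rsync is installed; the scratch directory and the files notes.md, inner.md and .hidden are made up, while file1.txt, dir1 and exclude-file.txt follow the names used in the text.

```shell
# Demo: exclude TXT files, the contents of dir1, and hidden files via --exclude-from.
work=$(mktemp -d)                 # hypothetical scratch directory
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/source/dir1" "$work/destination"
echo a > "$work/source/file1.txt"
echo b > "$work/source/notes.md"
echo c > "$work/source/dir1/inner.md"
echo d > "$work/source/.hidden"

# One pattern per line, equivalent to repeating --exclude on the command line.
cat > "$work/exclude-file.txt" <<'EOF'
*.txt
dir1/*
.*
EOF

rsync -a --exclude-from="$work/exclude-file.txt" "$work/source/" "$work/destination"

# What remains: notes.md and the now-empty dir1 directory.
(cd "$work/destination" && find . | sort)
```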
4.2 The --include parameter
The --include parameter specifies file patterns that must be synchronized, and is often used in conjunction with --exclude.
$ rsync -av --include="*.txt" --exclude='*' source/ destination
The above command excludes all files during synchronization except TXT files.
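One detail worth checking when using this form: as far as I understand rsync's filter rules, the '*' pattern also matches directories, so TXT files inside subdirectories are not reached. A commonly used variant adds --include='*/' so that rsync descends into directories. The sketch below compares the two, assuming rsync is installed; the scratch directory, the file names and the targets flat and deep are hypothetical.

```shell
# Demo: include only TXT files, with and without descending into subdirectories.
work=$(mktemp -d)                 # hypothetical scratch directory
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/source/sub"
echo a > "$work/source/top.txt"
echo b > "$work/source/top.log"
echo c > "$work/source/sub/deep.txt"

# Form from the text: top-level TXT files only, because '*' also excludes sub/.
rsync -a --include='*.txt' --exclude='*' "$work/source/" "$work/flat"

# Variant: '*/' lets rsync enter directories; -m skips directories left empty.
rsync -am --include='*/' --include='*.txt' --exclude='*' "$work/source/" "$work/deep"

(cd "$work" && find flat deep -type f | sort)
```

The order of the options matters here: the include patterns have to come before the catch-all exclude.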
5. Remote synchronization
5.1 SSH protocol
In addition to supporting synchronization between two local directories, rsync also supports remote synchronization. It can synchronize local content to a remote server.
$ rsync -av source/ username@remote_host:destination
It is also possible to synchronize remote content to the local machine.
$ rsync -av username@remote_host:source/ destination
rsync uses SSH by default for remote login and data transfer.
In its early days rsync did not use SSH by default, so the protocol had to be specified with the -e parameter; this was changed later. Therefore, the -e ssh in the following command can be omitted.
$ rsync -av -e ssh source/ user@remote_host:/destination
However, if the ssh command needs additional parameters, the -e parameter must be used to specify the SSH command to be executed.
$ rsync -av -e 'ssh -p 2234' source/ user@remote_host:/destination
In the above command, the -e parameter specifies that SSH uses port 2234.
5.2 rsync protocol
In addition to SSH, if the other server has the rsync daemon installed and running, the rsync:// protocol (port 873 by default) can also be used for transfers. The syntax is to use a double colon :: to separate the server from the target directory.
$ rsync -av source/ 192.168.122.32::module/destination
Note that module in the above address is not an actual path name, but a resource name defined by the rsync daemon and assigned by the administrator.
If you want to know the list of all modules allocated by the rsync daemon, you can execute the following command.
$ rsync rsync://192.168.122.32
Instead of the double colon, the address can also be specified directly as an rsync:// URL.
$ rsync -av source/ rsync://192.168.122.32/module/destination
6. Incremental backup
The defining feature of rsync is that it can perform incremental backups, that is, by default only the changed files are copied.
In addition to comparing the source directory directly with the target directory, rsync also supports a base directory: the parts that have changed between the source directory and the base directory are synchronized to the target directory.
Specifically, the first synchronization is a full backup, with all files synchronized into the base directory. Every subsequent synchronization is an incremental backup: only the parts that have changed between the source directory and the base directory are synchronized, and they are saved in a new target directory. This new target directory also appears to contain all files, but only the changed files are actually stored there; the unchanged files are hard links to the files in the base directory.
The --link-dest parameter specifies the base directory used when synchronizing.
$ rsync -a --delete --link-dest /compare/path /source/path /target/path
In the above command, the --link-dest parameter specifies the base directory /compare/path. The source directory /source/path is then compared with the base directory to find the changed files, which are copied to the target directory /target/path. Files that have not changed are created as hard links. The first backup made with this command is a full backup; the following ones are incremental.
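The hard-link behavior can be verified on a small local example. This is a sketch, assuming rsync is installed and the scratch directory lives on a file system that supports hard links; the names backup1, backup2, same.txt and edit.txt are invented. It relies on the shell's -ef test, which is true when two paths refer to the same file.

```shell
# Demo: with --link-dest, unchanged files are hard links into the previous backup.
work=$(mktemp -d)                 # hypothetical scratch directory (absolute path)
trap 'rm -rf "$work"' EXIT
mkdir "$work/source"
echo "stable"   > "$work/source/same.txt"
echo "version1" > "$work/source/edit.txt"

# First run: a full backup, there is nothing to link against yet.
rsync -a --delete "$work/source/" "$work/backup1"

# Change one file (different size, so the default check notices it).
echo "version2 with more text" > "$work/source/edit.txt"

# Second run: backup1 is the base directory, backup2 the new target.
rsync -a --delete --link-dest "$work/backup1" "$work/source/" "$work/backup2"

for f in same.txt edit.txt; do
  if [ "$work/backup1/$f" -ef "$work/backup2/$f" ]; then
    echo "$f: hard-linked to the previous backup"
  else
    echo "$f: stored as a new copy"
  fi
done
```

An absolute path is used for --link-dest on purpose: a relative path would be interpreted relative to the target directory, not the current directory.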
Below is an example script that backs up a user's home directory.
#!/bin/bash

# A script to perform incremental backups using rsync

# Stop on errors, on unset variables, and on failures inside pipelines
set -o errexit
set -o nounset
set -o pipefail

readonly SOURCE_DIR="${HOME}"
readonly BACKUP_DIR="/mnt/data/backups"
readonly DATETIME="$(date '+%Y-%m-%d_%H:%M:%S')"
readonly BACKUP_PATH="${BACKUP_DIR}/${DATETIME}"
readonly LATEST_LINK="${BACKUP_DIR}/latest"

mkdir -p "${BACKUP_DIR}"

# Back up the home directory, hard-linking unchanged files to the latest backup
rsync -av --delete \
  "${SOURCE_DIR}/" \
  --link-dest "${LATEST_LINK}" \
  --exclude=".cache" \
  "${BACKUP_PATH}"

# Point the "latest" soft link at the backup that was just created
rm -rf "${LATEST_LINK}"
ln -s "${BACKUP_PATH}" "${LATEST_LINK}"
In the above script, each synchronization generates a new directory ${BACKUP_DIR}/${DATETIME} and points the soft link ${BACKUP_DIR}/latest to it. At the next backup, ${BACKUP_DIR}/latest is used as the base directory to generate a new backup directory, and finally the soft link ${BACKUP_DIR}/latest is pointed to the new backup directory.
7. Configuration items
The -a, --archive parameter indicates archive mode: it preserves all metadata, such as modification time, permissions, and owner, and soft links are synchronized as well.
The --append parameter specifies that a file transfer continues from where it left off last time.
The --append-verify parameter is similar to --append, but the file is verified after the transfer completes. If verification fails, the entire file is resent.
The -b, --backup parameter specifies that when an existing file in the target directory is deleted or updated, it is renamed and kept as a backup; the default behavior is to delete it. The renaming rule is to append the suffix specified by the --suffix parameter, which defaults to ~.
The --backup-dir parameter specifies the directory in which backed-up files are stored, e.g. --backup-dir=/path/to/backups.
The --bwlimit parameter specifies a bandwidth limit; the default unit is KB/s, e.g. --bwlimit=100.
The -c, --checksum parameter changes how rsync checks files. By default, rsync only checks whether the file size and last modification time have changed and, if so, retransmits the file; with this parameter, it decides whether to retransmit by comparing checksums of the file contents.
The --delete parameter deletes files that exist only in the target directory and not in the source directory, ensuring that the target directory is a mirror of the source directory.
The -e parameter specifies the remote shell command used to transfer data, typically ssh together with any options it needs.
The --exclude parameter specifies files to be excluded from synchronization, e.g. --exclude="*.iso".
The --exclude-from parameter specifies a local file containing the file patterns to be excluded, one pattern per line.
The --existing, --ignore-non-existing parameter indicates that files and directories that do not already exist in the target directory are not synchronized.
The -h parameter produces output in a human-readable format.
The --help parameter returns help information; -h has the same effect only when it is the sole option on the command line.
The -i parameter outputs details of the differences between files in the source directory and the target directory.
The --ignore-existing parameter indicates that files which already exist in the target directory are skipped and not synchronized again.
The --include parameter specifies the files to be included when synchronizing, and is generally used in conjunction with --exclude.
The --link-dest parameter specifies the base directory for incremental backups.
The -m parameter specifies that empty directories are not synchronized.
The --max-size parameter sets the maximum size of files to be transferred, for example no more than 200KB (--max-size='200k').
The --min-size parameter sets the minimum size of files to be transferred, for example no less than 10KB (--min-size=10k).
The -n or --dry-run parameter simulates the actions that would be performed without actually performing them. Used together with -v, it shows what would be synchronized.
The -P parameter is a combination of the two parameters --progress and --partial.
The --partial parameter allows interrupted transfers to be resumed. Without this parameter, rsync deletes a file whose transfer was interrupted halfway; with it, the partially transferred file is kept in the target directory and the transfer is resumed at the next synchronization. It generally needs to be used together with --append or --append-verify.
The --partial-dir parameter specifies that partially transferred files are saved to a temporary directory, e.g. --partial-dir=.rsync-partial. It generally needs to be used together with --append or --append-verify.
The --progress parameter displays transfer progress.
The -r parameter indicates recursion, that is, subdirectories are included.
The --remove-source-files parameter indicates that after a successful transfer, the files are deleted on the sending side.
The --size-only parameter indicates that only files whose size has changed are synchronized, regardless of differences in modification time.
The --suffix parameter specifies the suffix appended to a file name when the file is backed up; the default is ~.
The -u, --update parameter indicates that files whose modification time in the target directory is newer than in the source are skipped during synchronization.
The -v parameter outputs details. -vv outputs more detailed information, and -vvv outputs the most detailed information.
The --version parameter returns the version of rsync.
The -z parameter specifies that data is compressed during synchronization.
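The difference between the default check and -c described in the list above can be demonstrated locally. The sketch below assumes rsync is installed; the scratch directory, f.txt and the variables after_default and after_checksum are hypothetical. It deliberately alters the copy without changing its size and then resets its modification time with touch -r, which is exactly the case the default check cannot see.

```shell
# Demo: the default size+mtime check versus the content check of -c.
work=$(mktemp -d)                 # hypothetical scratch directory
trap 'rm -rf "$work"' EXIT
mkdir "$work/source"
echo "hello" > "$work/source/f.txt"
rsync -a "$work/source/" "$work/destination"

# Alter the copy without changing its size, then give it the source's mtime.
echo "jello" > "$work/destination/f.txt"
touch -r "$work/source/f.txt" "$work/destination/f.txt"

# Default check: size and mtime match, so the file is skipped.
rsync -a "$work/source/" "$work/destination"
after_default=$(cat "$work/destination/f.txt")

# -c compares checksums of the contents and re-sends the file.
rsync -ac "$work/source/" "$work/destination"
after_checksum=$(cat "$work/destination/f.txt")

echo "after default run:  $after_default"
echo "after checksum run: $after_checksum"
```

The first line should still show the altered content and the second the restored original. The price of -c is that every file has to be read in full on both sides, so it is noticeably slower on large trees.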
8. Reference links
- How To Use Rsync to Sync Local and Remote Directories on a VPS, Justin Ellingwood
- Mirror Your Web Site With rsync, Falko Timme
- Examples on how to use Rsync, Egidio Docile
- How to create incremental backups using rsync on Linux, Egidio Docile