Recently, the company needed to back up its data. The data is generated by users and consists mostly of files and pictures. The backup packs the data into tar packages, so the design scheme is as follows:
"User Data Backup Program"
1. Develop the data backup script in bash shell
2. Store the backups in a new directory, dcp/backup
3. Pack all files to be backed up into tar packages named by date, such as:
2020-04-07.tar.gz
2020-04-07_all.tar.gz
4. Backup rules:
- Daily level: every day at 00:00, back up the new data generated yesterday
- Weekly level: every Monday at 00:00, back up the full data set
5. Delete rules:
- Every day at 00:00, delete the daily-level backup from seven days earlier
- Every Monday at 00:00, delete last week's weekly-level backup
6. Timed execution:
- Use crontab to run the tasks on schedule
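The naming rule in item 3 can be sketched as a small helper. The function name archive_name and its arguments are my own illustration and not part of the final script; it only shows how the daily and weekly package names differ.

```shell
#!/bin/bash
# archive_name KIND DATE  (hypothetical helper, not in the original script)
# KIND is "day" or "week"; DATE is a string such as 2020-04-07.
archive_name() {
    local kind=$1 date_str=$2
    case "$kind" in
        day)  printf '%s.tar.gz\n' "$date_str" ;;
        week) printf '%s_all.tar.gz\n' "$date_str" ;;
        *)    echo "unknown kind: $kind" >&2; return 1 ;;
    esac
}

archive_name day  2020-04-07   # prints 2020-04-07.tar.gz
archive_name week 2020-04-07   # prints 2020-04-07_all.tar.gz
```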
With the scheme designed, development could start. However, I ran into quite a few small pitfalls along the way.
Pitfall 1: the date command
Since I develop on a Mac, I ran into problems when doing time arithmetic. On Linux, the date option used for adding and subtracting time is -d, but it reports an error on Mac. This is because on Mac the option is -v instead.
Reference materials:
https://blog.csdn.net/weixin_37696997
https://www.cnblogs.com/alsodzy/p/8403870.html
https://blog.csdn.net/guddqs/article/details/80745464
To stay compatible with both Mac and Linux distributions, the script needs to determine what the current system is:
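#!/bin/bash
# Detect the current system from the output of uname -a.
# Note: uname -a always contains the kernel name ("Darwin" or "Linux"), but a
# distribution name shows up only if it is part of the hostname or the
# kernel version string, so the centos/ubuntu branches may not match.
if [[ $(uname -a) =~ "Darwin" ]]; then
    echo "Mac"
elif [[ $(uname -a) =~ "centos" ]]; then
    echo "Centos"
elif [[ $(uname -a) =~ "ubuntu" ]]; then
    echo "Ubuntu"
else
    echo "Other"
fi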
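The =~ operator tests whether the string on the left matches the regular expression on the right. Inside [[ ]] the exit status is 0 when it matches and 1 when it does not. Because the right side is quoted here, it is matched as a literal substring.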
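As an alternative to matching on uname, the date arithmetic itself can be wrapped so that the caller never sees the difference. The sketch below is only an illustration: the helper name days_ago is hypothetical, and it assumes that a date accepting --version is GNU date and that anything else is the BSD date found on Mac.

```shell
#!/bin/bash
# days_ago N  (hypothetical helper): print the date N days before today
# as YYYY-MM-DD, using whichever date syntax the local system accepts.
days_ago() {
    local n=$1
    if date --version >/dev/null 2>&1; then
        # Assumption: only GNU date (most Linux distributions) accepts --version.
        date -d "-${n} day" +%Y-%m-%d
    else
        # Otherwise assume BSD date (Mac), which uses -v for adjustments.
        date -v "-${n}d" +%Y-%m-%d
    fi
}

yesterday=$(days_ago 1)
last_week=$(days_ago 7)
echo "yesterday=${yesterday} last_week=${last_week}"
```

Probing the command instead of the system name also covers the case where GNU date has been installed on a Mac.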
Pitfall 2: packing with absolute paths in tar
Normally, tar packing is done with a relative path:
tar cvzf 2020_04-08_all.tar.gz files/
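In a script, however, we usually use absolute paths so that the path is always correct. Using an absolute path directly sometimes makes tar complain (GNU tar strips the leading / from member names and prints a message about it), so you need to add the P option. Remember to put it before f, because f must be followed immediately by the package name: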
tar cvzPf 2020_04-08_all.tar.gz /data/dcp/www/files/
Pitfall 3: specifying the directory that tar unpacks to
If you pack directly with an absolute path, the paths inside the package start from the root directory when it is unpacked. If you want them to start from a particular directory instead, add the -C option to specify its parent directory. After decompression, the top-level directory is then files:
tar cvzPf 2020_04-08_all.tar.gz -C /data/dcp/www files
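To see what -C does to the names stored in the package, here is a small demo in a throwaway directory. The temporary paths and the file a.txt are made up for illustration; only the tar options come from the text above.

```shell
#!/bin/bash
# Demo in a throwaway directory; all paths here are made up for illustration.
work=$(mktemp -d)
mkdir -p "$work/www/files"
echo "hello" > "$work/www/files/a.txt"

# -C switches to the parent directory first, so member names start at "files".
tar -czf "$work/demo.tar.gz" -C "$work/www" files

# List the member names stored in the package.
members=$(tar -tzf "$work/demo.tar.gz")
echo "$members"

rm -rf "$work"
```

None of the listed names starts with /, so the package can be unpacked anywhere and always produces a files directory.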
Pitfall 4: using find together with tar
When packing, sometimes not all of the data is wanted, so the data has to be filtered before it is packed. The find command does the filtering. For reference:
Find command format and detailed explanation of find command
find: find files by file modification time
find and tar can be combined in many ways; see the following link in particular:
How to find and tar files into a tar ball
Here we use the most common form, a pipeline:
find /data/dcp/www/files -mtime 0 | xargs tar cvzf 2020_04-08.tar.gz
Or use the -exec option of find:
find /data/dcp/www/files -mtime 0 -exec tar cvzf 2020_04-08.tar.gz {} +
Pitfall 5: data redundancy
The commands above look fine, but the resulting package contains redundant data. Why is this?
The reason is that a directory also carries time information. When a directory matches, find passes it to tar, and tar packs everything inside it, on top of the files that matched individually. To ignore directories, just add the -type f option:
find /data/dcp/www/files -type f -mtime 0 | xargs tar cvzf 2020_04-08.tar.gz
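The difference is easy to check by looking at what find prints before it reaches tar. This is a minimal demo in a throwaway directory; the file name a.txt is made up for illustration.

```shell
#!/bin/bash
# Demo in a throwaway directory; the file names are made up for illustration.
work=$(mktemp -d)
mkdir -p "$work/files"
echo "hello" > "$work/files/a.txt"
cd "$work" || exit 1

# Without -type f the directory itself is listed too; tar would then pack
# the whole directory and the individually matched file again.
all_matches=$(find files -mtime 0)
# With -type f only regular files are listed.
file_matches=$(find files -type f -mtime 0)

echo "without -type f:"; echo "$all_matches"
echo "with -type f:";    echo "$file_matches"
```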
Pitfall 6: specifying the unpacked directory when using find and tar
Packing with find requires a pipeline, and the command format changes with it. To control which directory ends up in the package, we have to go about it another way: change into the parent directory first and run find with a relative path:
cd /data/dcp/www
find files -mtime 0 | xargs tar cvzf 2020_04-08.tar.gz
This way, a package created with find and tar also unpacks into the files directory.
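One caveat with the xargs form: xargs splits its input on whitespace and may run tar more than once for a long list, and each run would overwrite the package. A possible variation, not what the final script below uses, is to pass null-delimited names straight to tar. The sketch assumes a tar that supports --null and -T (GNU tar does); the helper name pack_recent and the demo paths are hypothetical.

```shell
#!/bin/bash
# pack_recent PARENT DIR ARCHIVE  (hypothetical helper)
# Pack the files under PARENT/DIR modified within the last 24 hours into
# ARCHIVE, storing names relative to PARENT. ARCHIVE must be an absolute path.
pack_recent() {
    local parent=$1 dir=$2 archive=$3
    # The subshell keeps the caller's working directory unchanged.
    (
        cd "$parent" || exit 1
        # -print0 and --null pass names separated by NUL bytes, so names
        # containing spaces survive; -T - makes tar read the list from stdin.
        find "$dir" -type f -mtime 0 -print0 | tar --null -T - -czf "$archive"
    )
}

# Demo in a throwaway directory with a file name that contains a space.
demo=$(mktemp -d)
mkdir -p "$demo/www/files"
echo "hello" > "$demo/www/files/b c.txt"
pack_recent "$demo/www" files "$demo/out.tar.gz"
tar -tzf "$demo/out.tar.gz"
```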
Finally, here is the source code of the user data backup script:
#!/bin/bash
# today
dt=$(date +"%Y-%m-%d")

# src and dest file path
src="/data/dcp/www"
dest="/data/dcp/backup"

# Daily task: delete the daily backup from a week ago, then pack the files
# modified within the last 24 hours.
day() {
    find "${dest}" -type f -name "${last_week}.tar.gz" | xargs rm -rf
    cd "${src}" || exit 1
    find files -type f -mtime 0 | xargs tar cvzf "${dest}/${yesterday}.tar.gz"
}

# Weekly task: delete last week's full backup, then pack everything.
week() {
    find "${dest}" -type f -name "${last_week}_all.tar.gz" | xargs rm -rf
    cd "${src}" || exit 1
    tar cvzf "${dest}/${dt}_all.tar.gz" files/
}

# Judge Mac or Linux: date on Mac uses -v, date on Linux uses -d
if [[ $(uname -a) =~ "Darwin" ]]; then
    yesterday=$(date -v -1d +"%Y-%m-%d")
    last_week=$(date -v -1w +"%Y-%m-%d")
else
    yesterday=$(date -d '-1 day' +%Y-%m-%d)
    last_week=$(date -d '-1 week' +%Y-%m-%d)
fi

# day or week type
if [ "$1" = "day" ]; then
    day
elif [ "$1" = "week" ]; then
    week
else
    echo "--------- *Please input task type* ----------"
    echo "bash $0 day [OR] bash $0 week"
    exit 1
fi
Finally, add the scheduled tasks with crontab:
0 0 * * * bash /data/dcp/script/dcp_user_data_backup.sh day
0 0 * * 1 bash /data/dcp/script/dcp_user_data_backup.sh week