"User Data Backup Plan": design, development, and pitfalls

Recently, the company needed to back up some data. The data is generated by users and consists mostly of files and pictures. The backups are packed into tar packages, so the design scheme is as follows:

"User Data Backup Plan"
1. Develop the data backup script with bash shell
2. Create a new backup directory: dcp/backup
3. All backups are tar packages named by date, for example: 2020-04-07.tar.gz, 2020-04-07_all.tar.gz
4. Backup rules:
  • Day level: every day at 00:00, back up the new data generated yesterday

  • Week level: every Monday at 00:00, back up all data in full

5. Delete rules:
  • Every day at 00:00, delete the day-level backup data from seven days ago

  • Every Monday at 00:00, delete last week's week-level backup data

6. Scheduled execution:
  • Use crontab to run the tasks on schedule

Once the scheme was designed, development could begin. However, many small pits were encountered along the way.

Pit 1: the date command

Since I develop on a Mac, I hit a pitfall with time handling. On Linux, date adds and subtracts time with the -d parameter, but on Mac that reports an error, because there the parameter is -v.
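As a minimal sketch (assuming GNU date on Linux and BSD date on macOS), the same "yesterday" can be computed either way:

```shell
#!/bin/bash
# Compute yesterday's date with whichever flag the local date supports
if [[ $(uname -s) == "Darwin" ]]; then
    date -v -1d +%Y-%m-%d       # BSD date (macOS)
else
    date -d '-1 day' +%Y-%m-%d  # GNU date (Linux)
fi
```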

Reference materials:
https://blog.csdn.net/weixin_37696997
https://www.cnblogs.com/alsodzy/p/8403870.html
https://blog.csdn.net/guddqs/article/details/80745464

To be compatible with both Mac and Linux distributions, we first need to determine which system we are running on:

#!/bin/bash

if [[ `uname -a` =~ "Darwin" ]]
then
    echo "Mac"

elif [[ `uname -a` =~ "centos" ]]
then
    echo "Centos"

elif [[ `uname -a` =~ "ubuntu" ]]
then
    echo "Ubuntu"

else
    echo "Other"
fi

The =~ operator tests whether the string on the left matches the regular expression on the right; inside [[ ]], the test returns exit status 0 on a match and 1 on no match.
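A quick sketch of that exit-status behaviour:

```shell
#!/bin/bash
# [[ string =~ pattern ]] succeeds (status 0) on a match, fails (status 1) otherwise
[[ "Darwin Kernel Version" =~ "Darwin" ]]
echo $?   # 0: matched
[[ "Linux localhost" =~ "Darwin" ]]
echo $?   # 1: no match
```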

Pit 2: tar and absolute paths

tar packaging is normally done with a relative path:

tar cvzf 2020_04-08_all.tar.gz files/

But in a script, to make the paths reliable, we normally use absolute paths. If you use an absolute path directly, tar sometimes complains; you need to add the P parameter, and remember to put it before the f parameter:

tar cvzPf 2020_04-08_all.tar.gz /data/dcp/www/files/
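To see the difference in a sandbox (a throwaway sketch with temporary paths, not the backup paths from this article), check the member names kept with P:

```shell
#!/bin/bash
# Sketch: with P, tar keeps absolute member names; without it, the leading / is stripped
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/files"
echo hi > "$tmp/files/a.txt"
tar czPf "$tmp/abs.tar.gz" "$tmp/files"   # keeps the absolute member names
tar tzf "$tmp/abs.tar.gz" | head -n 1      # first member starts with /
rm -rf "$tmp"
```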
Pit 3: specifying tar's unpacked directory

If you pack with an absolute path directly, the unpacked contents start from the root directory. If you want the archive rooted at a specified parent directory instead, add the -C parameter to switch to that directory first; the top-level directory after decompression will then be files:

tar cvzPf 2020_04-08_all.tar.gz -C /data/dcp/www files
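A disposable sketch (temporary paths standing in for the article's /data tree) showing that -C makes the members relative:

```shell
#!/bin/bash
# Sketch: -C changes into the parent before archiving, so members start at files/
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/www/files"
echo hi > "$tmp/www/files/a.txt"
tar czf "$tmp/backup.tar.gz" -C "$tmp/www" files
tar tzf "$tmp/backup.tar.gz"   # members: files/ and files/a.txt
rm -rf "$tmp"
```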
Pit 4: using find together with tar

When packing, sometimes not all of the data should be included, so we need to filter out the part to pack; filtering is exactly what the find command is for.

Find command format and detailed explanation of the find command

find: finding files by file modification time

find and tar can be combined in many forms; see in particular the following link:

How to find and tar files into a tar ball

Here we use the most common form, a pipeline:

find /data/dcp/www/files -mtime 0 | xargs tar cvzf 2020_04-08.tar.gz

Or use the -exec action:

find /data/dcp/www/files -mtime 0 -exec tar cvzf 2020_04-08.tar.gz {} +
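Either form hands the matched paths to tar. A self-contained sketch with throwaway files (note that for very large file sets both xargs and -exec … + may split the list across several tar invocations, each overwriting the last):

```shell
#!/bin/bash
# Sketch: -exec ... + collects the matches and calls tar once (for small sets)
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/files"
touch "$tmp/files/a.txt" "$tmp/files/b.txt"
find "$tmp/files" -type f -mtime 0 -exec tar czf "$tmp/out.tar.gz" {} +
tar tzf "$tmp/out.tar.gz" | wc -l   # 2 members
rm -rf "$tmp"
```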
Pit 5: data redundancy

The command above looks fine, but the resulting package contains redundant data. Why is this?

The reason is that directories also carry modification-time information, so find matches the directory itself as well as the files inside it, and tar then packs those files a second time. To match files only and ignore directories, just add the -type f parameter:

find /data/dcp/www/files -type f -mtime 0 | xargs tar cvzf 2020_04-08.tar.gz
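The effect is easy to verify with throwaway files: without -type f the directory itself is matched too, and tar would then recurse into it a second time.

```shell
#!/bin/bash
# Sketch: count what find matches with and without -type f
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/files"
touch "$tmp/files/a.txt" "$tmp/files/b.txt"
find "$tmp/files" -mtime 0 | wc -l           # 3: the directory + 2 files
find "$tmp/files" -type f -mtime 0 | wc -l   # 2: files only
rm -rf "$tmp"
```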
Pit 6: specifying the directory when combining find and tar

When using find, we pack through a pipe, and the command format changes, so to control where the archive unpacks we need to work another way: cd into the parent directory first and let find use a relative path:

cd /data/dcp/www
find files -mtime 0 | xargs tar cvzf 2020_04-08.tar.gz

This way the archive produced by find plus tar also unpacks into the files directory.
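The same trick in a sandbox (temporary paths in place of /data/dcp/www):

```shell
#!/bin/bash
# Sketch: cd into the parent first so find emits relative paths for tar
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/www/files"
touch "$tmp/www/files/a.txt"
cd "$tmp/www"
find files -type f -mtime 0 | xargs tar czf "$tmp/out.tar.gz"
tar tzf "$tmp/out.tar.gz"   # files/a.txt
rm -rf "$tmp"
```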

Finally, here is the full source of the user data backup script:

#!/bin/bash

# today
dt=`date +"%Y-%m-%d"`

# src and dest file path
src="/data/dcp/www"
dest="/data/dcp/backup"

day() {
	# delete the day-level backup from seven days ago
	find ${dest} -type f -name "${last_week}.tar.gz" -delete
	cd $src || exit 1
	# pack only files (not directories) modified within the last day
	find files -type f -mtime 0 | xargs tar cvzf ${dest}/${yesterday}.tar.gz
}

week() {
	# delete last week's full backup
	find ${dest} -type f -name "${last_week}_all.tar.gz" -delete
	cd $src || exit 1
	tar cvzf ${dest}/${dt}_all.tar.gz files/
}

# Judge Mac or Linux 
if [[ `uname -a` =~ "Darwin" ]]
then
	yesterday=`date -v -1d +"%Y-%m-%d"`
	last_week=`date -v -1w +"%Y-%m-%d"`
else
	yesterday=`date -d '-1 day' +%Y-%m-%d`
	last_week=`date -d '-1 week' +%Y-%m-%d`
fi

# day or week type
if [ "$1" = "day" ]
then
	day
elif [ "$1" = "week" ]
then
	week
else
	echo "--------- *Please input task type* ----------"
	echo "bash $0 day [OR] bash $0 week"
fi

Finally, add the scheduled tasks to crontab:

0 0 * * * bash /data/dcp/script/dcp_user_data_backup.sh day
0 0 * * 1 bash /data/dcp/script/dcp_user_data_backup.sh week

Origin blog.csdn.net/yilovexing/article/details/105388262