1. Back up the installed Python packages
Enter the current project directory:
pip freeze > requirements.txt
2. Restore the Python packages
pip install -r requirements.txt
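For example, a minimal session (the project path ~/myproject is a hypothetical name):
cd ~/myproject                      # hypothetical project directory
pip freeze > requirements.txt       # record the exact installed versions
pip install -r requirements.txt     # reinstall them on another machine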
3. Install a virtual environment on Linux
pip install virtualenv
Create a virtual environment
virtualenv py_venv
Activate the virtual environment
source py_venv/bin/activate
Exit the virtual environment
deactivate
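A full round trip as a sketch, using the py_venv name from above (requests is just an example package):
pip install virtualenv
virtualenv py_venv                  # create the environment
source py_venv/bin/activate         # activate it; the prompt gains a (py_venv) prefix
pip install requests                # installs into py_venv only
deactivate                          # leave the environment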
4. Install virtualenvwrapper
sudo pip3 install virtualenvwrapper
Create a directory to hold the virtual environments
mkdir $HOME/.virtualenvs
Edit the ~/.bashrc file and add:
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
export WORKON_HOME=$HOME/.virtualenvs
source /usr/local/bin/virtualenvwrapper.sh
Reload the file:
source ~/.bashrc
Create a virtual environment
mkvirtualenv [env_name]
Switch to an existing virtual environment
workon [env_name]
Delete a virtual environment
rmvirtualenv [env_name]
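Example session (the env name spider_env is hypothetical):
mkvirtualenv spider_env             # create and immediately activate spider_env
deactivate                          # leave it
workon spider_env                   # re-enter it later, from any directory
rmvirtualenv spider_env             # delete it (deactivate first)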
5. Check which Python interpreter is in use
which python3
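Inside an activated virtualenv the path points into the environment; outside it points at the system interpreter (paths below are illustrative):
$ which python3
/usr/bin/python3                    # system interpreter
$ source py_venv/bin/activate
(py_venv) $ which python3
/home/ken/py_venv/bin/python3       # the virtualenv's interpreter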
Release Engineering (deploying the project from PyCharm over SFTP)
- SFTP server configuration
  Tools -> Deployment -> add an SFTP connection:
  Connection: Host, User name, Password, Root path
  Mappings: Deployment path
- Upload the project
  Right-click the project -> Deployment -> Upload to ...
Run the Python script via a shell script
- Write a run.sh file:
#!/bin/sh
cd "$(dirname "$0")" || exit 1        # run from the script's own directory
python3 hello.py >> hello.log 2>&1    # append stdout and stderr to hello.log
- Add execute permission to run.sh and run it
~$ chmod +x run.sh
~$ ./run.sh
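hello.py itself is not shown in these notes; a minimal placeholder so run.sh has something to execute:
# hello.py -- hypothetical stand-in for the real script
import datetime
# one timestamped line per run; run.sh appends it to hello.log
print("hello at", datetime.datetime.now().isoformat())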
crontab scheduled tasks
- Use the cron service to run programs on a schedule
- Commands:
  - crontab -e opens the crontab file for writing scheduled tasks
  - crontab -l shows the contents of the crontab file
- Format
  minute hour day month weekday command
  0-59   0-23 1-31 1-12 0-6     ...
- For example:
45 8 15 * * ll          # run the ll command at 8:45 on the 15th of every month
*/30 * * * * ls         # run ls every 30 minutes
0 */6 * * * ls          # run ls every 6 hours
- Run run.sh once per minute, appending the output to run.log:
* * * * * /home/ken/run.sh >> /home/ken/run.log 2>&1
- When the task is no longer needed, remove its line from the crontab file.
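The entry can also be installed without opening an editor; a sketch that appends to whatever crontab already exists:
( crontab -l 2>/dev/null; echo '* * * * * /home/ken/run.sh >> /home/ken/run.log 2>&1' ) | crontab -
crontab -l                          # verify the new line is present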
scrapyd deployment
- Installation
pip install scrapyd                 # on the cloud server
pip install scrapyd-client          # locally
- Start the scrapyd service (cloud server)
linux:
Create the directory /etc/scrapyd, then the file /etc/scrapyd/scrapyd.conf
windows:
Create the directory scrapyd on the C drive, with the file scrapyd.conf inside
Edit scrapyd.conf:
[scrapyd]
bind_address = <server public IP>
In a terminal on the server, run:
scrapyd
In a browser on the client, open:
http://<cloud-server-ip>:6800
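A minimal scrapyd.conf sketch; bind_address = 0.0.0.0 listens on all interfaces (an alternative to hard-coding the public IP), and 6800 is scrapyd's default port:
[scrapyd]
bind_address = 0.0.0.0
http_port    = 6800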
- Deploy
Open the scrapy.cfg file:
[deploy]
url = http://<cloud-server-ip>:6800/
In a terminal on the client, run:
scrapyd-deploy                      # submit the project to the server
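scrapyd-client also supports named deploy targets; a scrapy.cfg sketch (the target name cloud and project myproject are hypothetical):
[settings]
default = myproject.settings

[deploy:cloud]
url = http://<cloud-server-ip>:6800/
project = myproject

scrapyd-deploy cloud -p myproject   # deploy to the cloud target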
- Start the spider
curl http://<cloud-server-ip>:6800/schedule.json -d project=<project_name> -d spider=<spider_name>
Open the scrapyd web page from the client; the Jobs page shows the spider running.
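schedule.json replies with a JSON object, roughly (the jobid value is illustrative):
{"status": "ok", "jobid": "6487ec79947edab326d6db28a2d86511e8247444"}
Save the jobid: cancel.json needs it.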
- Stop the spider
curl http://<cloud-server-ip>:6800/cancel.json -d project=<project_name> -d job=<job_id>
If prevstate is null, the spider had already finished.
If prevstate is running, the spider was still running.
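cancel.json's reply carries the prevstate field described above, e.g.:
{"status": "ok", "prevstate": "running"}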