Scrapy installation
Official website https://scrapy.org/
Installation method
On any operating system, you can install Scrapy with pip:
$ pip install scrapy
To confirm that Scrapy has been installed successfully, first test whether the Scrapy module can be imported in Python:
>>> import scrapy
>>> scrapy.version_info
(1, 8, 0)
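The same check can be scripted instead of typed into the interpreter. The following is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata); the helper name installed_version is made up for illustration and is not part of Scrapy.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("scrapy")
    if version is None:
        print("Scrapy is not installed; try: pip install scrapy")
    else:
        print(f"Scrapy {version} is installed")
```

Unlike import scrapy, this only reads the package metadata, so it does not load Scrapy itself.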
Then, test whether the Scrapy command can be executed in the shell:
(base) λ scrapy
Scrapy 1.8.0 - no active project
Usage:
scrapy <command> [options] [args]
Available commands:
bench Run quick benchmark test
fetch Fetch a URL using the Scrapy downloader
genspider Generate new spider using pre-defined templates
runspider Run a self-contained spider (without creating a project)
settings Get settings values
shell Interactive scraping console
startproject Create new project
version Print Scrapy version
view Open URL in browser, as seen by Scrapy
[ more ] More commands available when run from project directory
Use "scrapy <command> -h" to see more info about a command
Passing both tests indicates that Scrapy was installed successfully. As shown above, the installed version is 1.8.0, the latest at the time of writing.
Note:
- While installing Scrapy you may hit errors such as a missing VC++ dependency; in that case, install the missing modules from offline packages.
- Seeing the output above when running scrapy in CMD does not by itself prove that the installation works. Run scrapy bench as a further check; if it completes without errors, the installation succeeded.
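The scrapy bench check can also be automated. The sketch below is one possible approach, not an official procedure: the helper name bench_succeeds and the 120-second timeout are assumptions, and treating a zero exit code as "no error" is a simplification, so it is still worth reading the log output yourself.

```python
import shutil
import subprocess


def bench_succeeds(executable: str = "scrapy", timeout: float = 120.0) -> bool:
    """Run `<executable> bench` and report whether it exited without error."""
    path = shutil.which(executable)  # None if the command is not on PATH
    if path is None:
        return False
    try:
        result = subprocess.run(
            [path, "bench"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    # Assumption: a zero exit code means the benchmark ran without errors.
    return result.returncode == 0


if __name__ == "__main__":
    print("scrapy bench OK" if bench_succeeds() else "scrapy bench failed or scrapy not found")
```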
For the detailed installation process on each platform, see: http://doc.scrapy.org/en/latest/intro/install.html#intro-install-platform-notes
Global command
$ scrapy
Scrapy 1.7.3 - no active project
Usage:
scrapy <command> [options] [args]
Available commands:
bench Run quick benchmark test
## Tests the computer's performance.
fetch Fetch a URL using the Scrapy downloader
## Downloads the page source and displays it
genspider Generate new spider using pre-defined templates
## Creates a new spider file
runspider Run a self-contained spider (without creating a project)
## Unlike starting a spider with crawl, this takes the spider file name: scrapy runspider <spider file name>
settings Get settings values
## Gets the current configuration
shell Interactive scraping console
## Enters Scrapy's interactive mode
startproject Create new project
## Creates a crawler project.
version Print Scrapy version
view Open URL in browser, as seen by Scrapy
## Downloads the page document and displays it in the browser
[ more ] More commands available when run from project directory
Use "scrapy <command> -h" to see more info about a command
Project command
- scrapy startproject projectname
creates a project.
- scrapy genspider spidername domain
creates a crawler (spider). After creating a crawler project, you still need to create a crawler in it.
- scrapy crawl spidername
runs a crawler. Pay attention to the directory you run this command from: it must be run inside the project directory.
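The order of these commands, and the directory each one is run from, can be sketched in Python. This is only an illustration: project_steps and run_steps are hypothetical helpers, and myproject, example and example.com are placeholder names.

```python
import subprocess
from pathlib import Path
from typing import List, Tuple

Step = Tuple[List[str], Path]  # (command, directory to run it in)


def project_steps(base_dir: Path, project: str, spider: str, domain: str) -> List[Step]:
    """List the three project commands with the directory each one runs in."""
    project_dir = base_dir / project
    return [
        # Creates base_dir/<project>
        (["scrapy", "startproject", project], base_dir),
        # Adds a spider to the project
        (["scrapy", "genspider", spider, domain], project_dir),
        # crawl is only available inside a project directory
        (["scrapy", "crawl", spider], project_dir),
    ]


def run_steps(steps: List[Step]) -> None:
    """Run each command in its directory, stopping at the first failure."""
    for command, cwd in steps:
        subprocess.run(command, cwd=cwd, check=True)


if __name__ == "__main__":
    # Print the plan only; call run_steps(...) to actually execute it.
    for command, cwd in project_steps(Path.cwd(), "myproject", "example", "example.com"):
        print(f"(in {cwd}) $ {' '.join(command)}")
```

Running the file only prints the planned commands. Passing the same list to run_steps would execute them, which requires Scrapy to be installed.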