The first step is to download and configure Scrapy. Install it from the command line:
pip install scrapy -i https://pypi.douban.com/simple
I pointed pip at a domestic (Douban) mirror here; if you can reach foreign servers, you can install from the default PyPI index instead.
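If you don't want to pass -i every time, you can persist the mirror in pip's configuration file. A minimal sketch (the file lives at ~/.pip/pip.conf on Linux/macOS, or %APPDATA%\pip\pip.ini on Windows; the Douban URL is the one from the install command above):

```ini
[global]
index-url = https://pypi.douban.com/simple
```

After this, a plain `pip install scrapy` will use the mirror automatically.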
After installation, press Win+R and type cmd to open a command-line window, then verify the install by running scrapy (or scrapy version). If it prints the version and command list like mine did, you're all set ---
Then go to the folder where you want the project to live and create it:
scrapy startproject <project name>
Then cd into the newly created folder (the startproject output prints a hint telling you to do this) and generate a spider:
scrapy genspider <spider name> <allowed domain>
Once all this is done, you'll have a complete project skeleton. I have set up a demo here. Everything except the spider file is auto-generated; the spider itself simply crawls the page and saves its content:
import scrapy


class DemoSpider(scrapy.Spider):
    name = "demo"                          # used with `scrapy crawl demo`
    allowed_domains = ["xxxx.com"]
    start_urls = ["http://xxxxxx.com"]

    def parse(self, response):
        # Save the raw HTML of the crawled page to a local file
        content = response.text
        with open('eee.html', 'w', encoding='utf-8') as fp:
            fp.write(content)
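The parse method above boils down to writing response.text to disk. As a standalone sketch (the save_page helper and the sample HTML string are mine, not part of the generated project), you can exercise that logic without running a real crawl:

```python
from pathlib import Path

def save_page(text, path="eee.html"):
    # Same logic as the parse method above: dump the raw HTML to a file.
    Path(path).write_text(text, encoding="utf-8")

save_page("<html><title>demo</title></html>")
print(Path("eee.html").read_text(encoding="utf-8"))
```

In a real run you would instead execute `scrapy crawl demo` from the project root, and eee.html is written to the directory you launched the command from.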
I'll update this later. I'm also a Python beginner, so let's learn together ~ keep going ~