Scrapy project architecture
-project                  # project root folder, named after the project
    -project              # package with the same name as the project
        -spiders          # spiders generated by `scrapy genspider` go here
            -__init__.py
            -chouti.py    # chouti spider
            -cnblogs.py   # cnblogs spider
        -items.py         # like models.py in Django: item (model) classes are written here
        -middlewares.py   # middleware (spider middleware and downloader middleware) goes here
        -pipelines.py     # persistence code (to a local file, MySQL, Redis, MongoDB)
        -settings.py      # project settings
    -scrapy.cfg           # configuration file used when deploying Scrapy
Scrapy configuration file
settings.py
# Whether to obey the robots.txt protocol; set to False to crawl regardless
ROBOTSTXT_OBEY = False

# USER_AGENT request header
USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36'

# With this setting, the program only logs error messages
LOG_LEVEL = 'ERROR'
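Besides these project-wide options in settings.py, an individual spider can override settings through a `custom_settings` class attribute. A minimal sketch, assuming a spider named `cnblogs`; the `Spider` class below is a stand-in stub so the snippet runs without Scrapy installed (in a real project you would subclass `scrapy.Spider`):

```python
class Spider:
    # Stub standing in for scrapy.Spider, so this sketch is self-contained.
    pass

class CnblogsSpider(Spider):
    name = 'cnblogs'
    # These keys mirror the settings.py options above; values here win
    # over the project-wide ones, but only for this spider.
    custom_settings = {
        'ROBOTSTXT_OBEY': False,
        'LOG_LEVEL': 'ERROR',
    }

print(CnblogsSpider.custom_settings['LOG_LEVEL'])  # ERROR
```

This is handy when one spider needs a different log level or politeness policy than the rest of the project.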
Crawler program file
import scrapy

class ChoutiSpider(scrapy.Spider):
    name = 'chouti'  # unique name of this spider, used to tell spiders apart
    allowed_domains = ['dig.chouti.com']  # domains the spider is allowed to crawl
    start_urls = ['https://dig.chouti.com/']  # starting URLs; the spider sends requests here first

    def parse(self, response):
        # Default callback: executed automatically when a response comes back.
        # Do the parsing in this method.
        print('---------------------------', response)
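The contract above can be sketched without Scrapy installed: the engine downloads each URL in `start_urls` and hands the result to `parse(response)`, which yields items. Below, `StubResponse` is an illustrative stand-in for Scrapy's real `Response` object, and the regex replaces the `response.css()`/`response.xpath()` selectors you would use in practice:

```python
import re

class StubResponse:
    """Stand-in for scrapy.http.Response: just a url and its body text."""
    def __init__(self, url, text):
        self.url = url
        self.text = text

class ChoutiSpider:
    name = 'chouti'
    start_urls = ['https://dig.chouti.com/']

    def parse(self, response):
        # Real code would use response.css() / response.xpath();
        # a plain regex keeps this sketch dependency-free.
        for title in re.findall(r'<a[^>]*>(.*?)</a>', response.text):
            yield {'title': title}  # yielded dicts flow on to the pipelines

html = '<a href="/1">first</a><a href="/2">second</a>'
spider = ChoutiSpider()
items = list(spider.parse(StubResponse('https://dig.chouti.com/', html)))
print(items)  # [{'title': 'first'}, {'title': 'second'}]
```

Because `parse` is a generator, Scrapy can consume scraped items one at a time and pass each through the pipelines defined in pipelines.py.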