Python for rapid development of a distributed search engine, Scrapy in brief: CSS selectors

CSS selectors

::attr(name) gets the value of an attribute of the selected element, for example a::attr(href)

::text gets the text inside the selected tag


Two methods read the data out of a selection:

extract_first('') returns the first match as a string. Its argument is the default value, returned when nothing matches; we usually set it to an empty string.

extract() returns all matches as a list of strings.

For example:


# -*- coding: utf-8 -*-
import scrapy

class PachSpider(scrapy.Spider):
    name = 'pach'
    allowed_domains = ['blog.jobbole.com']
    start_urls = ['http://blog.jobbole.com/all-posts/']

    def parse(self, response):

        asd = response.css('.archive-title::text').extract()  # extract_first('') can also be used here to get a single string
        # print(asd)

        for i in asd:
            print(i)


Origin blog.51cto.com/14510224/2435250