Python - Power Tools Series: Scraper API, a Proxy API for Web Scraping

Proxy API for Web Scraping

Scraper API handles proxies, browsers, and CAPTCHAs, so you can get the HTML from any web page with a simple API call!

Reference: https://www.scraperapi.com/pricing

You only need to send the request; Scraper API takes care of fetching the page content.
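As a rough illustration of what "sending the request" can amount to without the SDK, the sketch below builds the kind of URL such a proxy API is called with. It assumes the service is reachable at `http://api.scraperapi.com/` and accepts `api_key` and `url` query parameters; the helper name `build_scraperapi_url` is hypothetical and not part of any library. Check the official documentation before relying on the exact endpoint.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the official Scraper API documentation.
API_ENDPOINT = "http://api.scraperapi.com/"


def build_scraperapi_url(api_key: str, target_url: str) -> str:
    """Return an API URL asking the service to fetch target_url (hypothetical helper)."""
    # urlencode percent-encodes the target URL so it survives as a single query value
    query = urlencode({"api_key": api_key, "url": target_url})
    return f"{API_ENDPOINT}?{query}"


if __name__ == "__main__":
    print(build_scraperapi_url("YOURAPIKEY", "http://httpbin.org/ip"))
```

Any HTTP client could then issue a plain GET against the resulting URL; the target page is percent-encoded so its own query string, if any, is not confused with the API's parameters.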

Python example:

# Remember to install the library first: pip install scraperapi-sdk

from scraper_api import ScraperAPIClient

client = ScraperAPIClient('YOURAPIKEY')
# Fetch the page through Scraper API and read the response body as text
result = client.get(url='http://httpbin.org/ip').text
print(result)

# Scrapy users can simply wrap the URLs in their start_urls and parse function
# with client.scrapyGet().
# Note for Scrapy: do not use DOWNLOAD_DELAY or RANDOMIZE_DOWNLOAD_DELAY;
# they lower your concurrency and are not needed with the API.

# ...other scrapy setup code
start_urls = [client.scrapyGet(url='http://httpbin.org/ip')]

def parse(self, response):
    # ...your parsing logic here
    yield scrapy.Request(client.scrapyGet(url='http://httpbin.org/ip'), self.parse)
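
The Scrapy note in the example above advises against DOWNLOAD_DELAY and RANDOMIZE_DOWNLOAD_DELAY. One way this might look in a project's settings.py is sketched below. The setting names are standard Scrapy settings; the CONCURRENT_REQUESTS value is only a placeholder assumption and should be matched to whatever concurrency your plan allows.

```python
# settings.py fragment (a sketch, not a complete Scrapy configuration)

# No artificial delay between requests, as the note recommends
DOWNLOAD_DELAY = 0
RANDOMIZE_DOWNLOAD_DELAY = False

# Placeholder value (assumption): set this to the concurrency limit of your plan
CONCURRENT_REQUESTS = 5
```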


Reposted from blog.csdn.net/helunqu2017/article/details/114004814