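This is a brief list of the tools to install before writing crawlers. Most of the Python packages can be installed with pip; the browser drivers, databases, and desktop tools are installed separately.
- python3: installing Anaconda is recommended. It bundles Python 3, so both are set up in one step, which saves trouble later.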
- Request libraries: requests, selenium, chromedriver, geckodriver, phantomjs, aiohttp
- Parsing libraries: lxml, beautiful soup, pyquery, tesserocr
- Databases: mysql, mongodb, redis
- Storage libraries (database clients): pymysql, pymongo, redis-py, redisdump
- Web libraries: flask, tornado
- App crawling tools: Charles, mitmproxy, appium
- Crawler frameworks: pyspider, scrapy, scrapy-splash, scrapy-redis
- Deployment tools: docker, scrapyd, scrapyd-client, scrapyd api, scrapyrt, gerapy
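
After installing, it can be handy to check which of the pip-installable packages are actually available in the current environment. The following is a minimal sketch, not a definitive checker. The `IMPORT_NAMES` mapping and the `check_installed` helper are hypothetical names introduced here, and the import names are assumptions about how each package is imported (for example, Beautiful Soup is imported as `bs4` and redis-py as `redis`). Only a subset of the list is covered.

```python
from importlib.util import find_spec

# Assumed mapping from the names in the list above to their import names.
# These are illustrative; adjust them to match your environment.
IMPORT_NAMES = {
    "requests": "requests",
    "selenium": "selenium",
    "aiohttp": "aiohttp",
    "lxml": "lxml",
    "beautiful soup": "bs4",
    "pyquery": "pyquery",
    "pymysql": "pymysql",
    "pymongo": "pymongo",
    "redis-py": "redis",
    "flask": "flask",
    "tornado": "tornado",
    "scrapy": "scrapy",
}


def check_installed(import_names=IMPORT_NAMES):
    """Return {package: bool}, True when the top-level module can be located."""
    return {pkg: find_spec(mod) is not None for pkg, mod in import_names.items()}


if __name__ == "__main__":
    # Print one line per package showing whether it was found.
    for pkg, ok in check_installed().items():
        print(f"{pkg:15} {'ok' if ok else 'MISSING'}")
```

This only checks whether a module can be found, not whether it works. Tools such as chromedriver, geckodriver, the database servers, Charles, and docker are not Python modules and need to be verified separately.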