Building a lightweight visual data crawling tool - Bodhi
https://mp.weixin.qq.com/s/TBYcWxT6MSAgI6Y4g53TNA
Scrapy is a very good open-source framework, but it requires coding and has a high technical threshold, which conflicts with our original intention;
Portia was billed as the first open-source visual web crawling tool. The idea is very good, but it only supports static pages and has no support for dynamic pages; since almost all web pages today are dynamic, it clearly cannot handle most extraction tasks;
Octopus is one of the most widely used commercial data crawling tools and ships as a desktop client, but its free version cannot do large-scale, 24/7 continuous crawling, so it cannot meet industrial needs;
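The static-versus-dynamic limitation noted for Portia can be illustrated with a minimal sketch. It uses only Python's standard `html.parser`; the two sample pages and the names `ListItemExtractor` and `extract_items` are hypothetical, made up for illustration, and are not taken from Portia or Bodhi. A parser that only reads the HTML sent by the server sees the list items in the static page, but finds nothing in the dynamic page, where a script would create them in the browser.

```python
from html.parser import HTMLParser

# Hypothetical static page: the list items are already in the HTML.
STATIC_PAGE = """
<html><body>
  <ul id="items"><li>alpha</li><li>beta</li></ul>
</body></html>
"""

# Hypothetical dynamic page: the server sends an empty list,
# and a script would fill it in only when run by a browser.
DYNAMIC_PAGE = """
<html><body>
  <ul id="items"></ul>
  <script>
    const names = ["alpha", "beta"];
    for (const n of names) {
      const li = document.createElement("li");
      li.textContent = n;
      document.getElementById("items").appendChild(li);
    }
  </script>
</body></html>
"""


class ListItemExtractor(HTMLParser):
    """Collects the text of every <li> element present in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.in_li = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.in_li = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_li = False

    def handle_data(self, data):
        # Script bodies arrive here too, but never while inside an <li>.
        if self.in_li:
            self.items[-1] += data


def extract_items(html):
    """Return the <li> texts found without executing any JavaScript."""
    parser = ListItemExtractor()
    parser.feed(html)
    parser.close()
    return [item.strip() for item in parser.items]


if __name__ == "__main__":
    print(extract_items(STATIC_PAGE))
    print(extract_items(DYNAMIC_PAGE))
```

Extracting from dynamic pages generally requires rendering the page in a real or headless browser first, which is the gap a tool limited to static HTML cannot close.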
Reference https://blog.csdn.net/Tencent_TEG/article/details/103707723
No usable entry point is available
Houyi Collector http://www.houyicaiji.com
Built by a team of former Google engineers and based on artificial intelligence technology; simply enter a URL and it automatically identifies the content to collect
- It looks pretty good, but it charges a fee ~
- Sure enough, good things are not free