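- Introduction: so far we have only selected the titles on the current page. How do we move on to the following pages so that every title is counted?
- Look for the pattern in the page URLs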
① Home page (which is also page 1): https://www.zhihu.com/people/gao-leng-leng-61/posts
② Page 1: https://www.zhihu.com/people/gao-leng-leng-61/posts?page=1
③ Page 2: https://www.zhihu.com/people/gao-leng-leng-61/posts?page=2
④ …
⑤ Page N: https://www.zhihu.com/people/gao-leng-leng-61/posts?page=N
⑥ Pattern: only the value after page= changes
https://www.zhihu.com/people/gao-leng-leng-61/posts?page=(the part that changes)
⑦ The posts run to 8 pages in total
⑧ So use:
https://www.zhihu.com/people/gao-leng-leng-61/posts?page=[1-8]
- Create the sitemap
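- Create the Selector
- Run Scrape. A popup window opens and automatically loads pages 1 to 8 in turn to scrape them; afterwards, click refresh to view the results
- Export the CSV file. Drawback: the rows come out in no particular order
- Page patterns using [1-8]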
① Zhihu: https://www.zhihu.com/people/gao-leng-leng-61/posts?page=[1-8]
② Ganji: http://bj.ganji.com/hezu/pn[1-8]/
③ Lianjia: https://bj.lianjia.com/ershoufang/pg[1-8]/
- Special case: Douban, [0-180:20]
① Page 1: https://movie.douban.com/review/best/?start=0
② Page 2: https://movie.douban.com/review/best/?start=20
…
③ Pattern (start begins at 0 and increases by 20 per page, so the range must start at 0, not 1): https://movie.douban.com/review/best/?start=[0-180:20]
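
To see exactly which pages a range such as [1-8] or a stepped range covers, the expansion can be reproduced outside the browser. The following is only a minimal Python sketch of that idea, not Web Scraper's own implementation: the function name `expand_range_url` is hypothetical, it handles a single `[start-end]` or `[start-end:step]` placeholder, and it ignores zero-padded ranges.

```python
import re
from typing import List

# Matches one range placeholder such as [1-8] or [0-180:20].
RANGE_PATTERN = re.compile(r"\[(\d+)-(\d+)(?::(\d+))?\]")


def expand_range_url(url_template: str) -> List[str]:
    """Expand a single [start-end] or [start-end:step] placeholder into URLs.

    Hypothetical helper for illustration; a template without a placeholder
    is returned unchanged as a one-element list.
    """
    match = RANGE_PATTERN.search(url_template)
    if match is None:
        return [url_template]

    start = int(match.group(1))
    end = int(match.group(2))
    step = int(match.group(3)) if match.group(3) else 1

    prefix = url_template[:match.start()]
    suffix = url_template[match.end():]

    # The end value is inclusive, hence end + 1.
    return [f"{prefix}{n}{suffix}" for n in range(start, end + 1, step)]


if __name__ == "__main__":
    zhihu = "https://www.zhihu.com/people/gao-leng-leng-61/posts?page=[1-8]"
    # Douban review pages are addressed by start=0, 20, 40, ...
    douban = "https://movie.douban.com/review/best/?start=[0-180:20]"

    for url in expand_range_url(zhihu) + expand_range_url(douban):
        print(url)
```

Printing the expanded list before running a scrape is a quick way to check that the first and last pages are the ones intended, which matters most for stepped ranges like Douban's.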
Web Scraper: Counting All Article Titles of a Zhihu Big-V Account_2.6
Reposted from blog.csdn.net/qq_42907800/article/details/105268584