1. Review of the Meituan crawl: obtaining the id is the core problem to solve!
2. Review Highlights
(1) Simulated login:
- Sometimes the data to crawl belongs to the current user and is only available after signing in.
- Implementation process:
- Use a packet-capture tool to capture the POST request (URL and parameters, including any dynamic parameters) that is sent when the login button is clicked.
- Carry the cookie from the login response when requesting the other sub-pages.
Note: cookies are not used only for login; some sites require a cookie on ordinary requests as well, for example Snowball Network (Xueqiu).
-- Cookie handling:
Manual handling: copy the cookie into the request headers by hand. Not recommended.
Automatic handling: use a session object, which can send get and post requests just as requests does.
- Question: once a session is in use, should every request be sent through it?
A session object is heavier than a plain requests call and consumes more resources. Use a session only when cookies are involved; otherwise use requests directly.
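The login flow and the automatic cookie handling described above can be sketched as follows. This is a minimal illustration, assuming the requests library; the URLs, the form field names, and the helper name login_and_fetch are hypothetical placeholders, and a real site will need the URL and parameters observed in the packet capture.

```python
import requests

# Hypothetical endpoints: replace with the URLs seen in the packet capture.
LOGIN_URL = "https://example.com/login"
PROFILE_URL = "https://example.com/profile"

# A browser-like User-Agent; the exact string is only an example.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def login_and_fetch(login_url, page_url, form_data):
    """Log in with a POST, then request a sub-page through the same session."""
    with requests.Session() as session:
        session.headers.update(HEADERS)
        # Cookies set by the login response are stored on the session.
        login_resp = session.post(login_url, data=form_data, timeout=10)
        login_resp.raise_for_status()
        # The stored cookies are sent automatically with this request.
        page_resp = session.get(page_url, timeout=10)
        page_resp.raise_for_status()
        return page_resp.text
```

A call might look like login_and_fetch(LOGIN_URL, PROFILE_URL, {"username": "...", "password": "..."}), where the field names are whatever the captured login request actually uses.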
- Proxy: a proxy server forwards requests on the client's behalf; using a proxy means sending the request through such a server.
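Sending a request through a proxy might look like the sketch below, assuming the requests library and its proxies argument. The helper names build_proxies and get_via_proxy and the proxy address in the usage note are hypothetical.

```python
import requests


def build_proxies(proxy_url):
    """Map both URL schemes to the same proxy server."""
    return {"http": proxy_url, "https": proxy_url}


def get_via_proxy(url, proxy_url, timeout=10):
    """Send a GET request through the given proxy server."""
    return requests.get(url, proxies=build_proxies(proxy_url), timeout=timeout)
```

For example, get_via_proxy("https://example.com", "http://10.0.0.1:8080") would route the request through the proxy at that placeholder address.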
- Anti-crawling mechanisms (usually about six kinds):
robots
UA detection
Verification code (CAPTCHA)
Cookie
IP ban
Dynamic request parameters
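As an illustration of the first item in this list, the sketch below checks a URL against robots rules using the standard library's urllib.robotparser. The rules text, the crawler name, and the helper name is_allowed are made-up examples, and the rules are parsed from a string so that no network access is needed.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if the robots rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from text rather than downloading them.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules: everything under /private/ is off limits to all crawlers.
EXAMPLE_RULES = "User-agent: *\nDisallow: /private/\n"
```

With these example rules, a path under /private/ is refused and any other path is allowed. robots rules are advisory, so a site may still enforce the other mechanisms in the list.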