4. A short crawler review

1. Review of the Meituan exercise: fetching the id is the core issue to be addressed!

2. Review Highlights

(1) Simulated login:

  - Sometimes we need to crawl information that belongs to the current user (available only after signing in)

  - implementation process:

    - Use a packet-capture tool to capture the post request (url, parameters, including dynamic parameters) that is sent when the login button is clicked

    - Carry the cookie when requesting other sub-pages

    Note: cookies are not needed only for login; some sites also require a cookie for ordinary requests, such as Xueqiu (Snowball Network)

  - cookie handling:

    Manual handling: not recommended

    Automatic handling: use a session object (like requests, it can send both get and post requests)
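The login flow and automatic cookie handling described above can be sketched roughly as follows. This is a minimal sketch, assuming the `requests` library; the URLs, the form field names and the `token` field (standing in for a dynamic parameter) are hypothetical placeholders, not taken from any real site.

```python
import requests

# Hypothetical endpoints, for illustration only.
LOGIN_URL = "https://example.com/login"
PROFILE_URL = "https://example.com/user/profile"


def login_and_fetch(session, username, password, token):
    """Send the captured login post, then request a sub-page with the same session."""
    login_data = {
        "username": username,  # hypothetical field names
        "password": password,
        "token": token,        # stands in for a dynamic parameter
    }
    # Cookies set by the login response are stored on the session automatically.
    session.post(LOGIN_URL, data=login_data, timeout=10)
    # The same session attaches those cookies to later requests.
    return session.get(PROFILE_URL, timeout=10)


def preview_cookie_header(session, url):
    """Return the Cookie header the session would attach to a get request (no network)."""
    prepared = session.prepare_request(requests.Request("GET", url))
    return prepared.headers.get("Cookie", "")
```

`preview_cookie_header` sends nothing over the network, so it is a convenient way to check which cookies a session would carry to a sub-page before actually requesting it.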

  - Question: should every request from now on be sent with a session?

    No. A session is heavier than plain requests and consumes more resources. Use a session when cookies are involved; otherwise use requests.

  - Proxy: a proxy is a proxy server; using a proxy means the request is sent through the proxy server
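As a rough illustration of sending a request through a proxy server, the sketch below assumes the `requests` library and its `proxies` argument. The helper names and the proxy address used in the usage note are hypothetical.

```python
import requests


def make_proxies(host, port, scheme="http"):
    """Build the mapping that requests accepts through its `proxies` argument."""
    address = f"{scheme}://{host}:{port}"
    # Route both http and https traffic through the same proxy server.
    return {"http": address, "https": address}


def fetch_via_proxy(url, proxies):
    """Send a get request through the given proxy server."""
    return requests.get(url, proxies=proxies, timeout=10)
```

For example, `make_proxies("127.0.0.1", 8888)` builds the mapping for a proxy assumed to be listening locally on port 8888; passing it to `fetch_via_proxy` only works if such a proxy is actually running.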

  - Anti-crawling mechanisms (usually about six kinds):

    robots protocol (robots.txt)

    UA detection

    Verification code (CAPTCHA)

    cookie

    IP ban

    Dynamic request parameters
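The first two mechanisms in the list can be illustrated with the standard library alone. This is a minimal sketch: the robots.txt content, the helper names and the crawler name are made up for the example, and the remaining mechanisms (verification codes, cookies, IP bans, dynamic parameters) are not covered here.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_text, user_agent, url):
    """Check a url against robots.txt rules supplied as text (no network)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


def browser_headers(user_agent):
    """Build request headers that carry a browser-like User-Agent for UA detection."""
    return {"User-Agent": user_agent}
```

With the sample rules, `is_allowed(SAMPLE_ROBOTS, "MyCrawler", "https://example.com/private/page")` is refused while a url under `/public/` is allowed. The headers from `browser_headers` would be passed along with each request so that the server sees a browser-like User-Agent.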

 


Origin www.cnblogs.com/studybrother/p/10951092.html