Common anti-crawler measures and how to deal with them

1. robots protocol: Scrapy checks robots.txt when ROBOTSTXT_OBEY is enabled; set it to False in settings.py to turn the check off.
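For item 1, a minimal sketch of the relevant line, assuming the default Scrapy project layout where settings live in settings.py. ROBOTSTXT_OBEY is Scrapy's own setting name; nothing else here comes from the original post.

```python
# settings.py of a Scrapy project (default project layout assumed)

# When True, Scrapy fetches robots.txt and skips the URLs it disallows.
# Setting it to False turns that check off.
ROBOTSTXT_OBEY = False
```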

2. UA detection: the server checks the User-Agent request header. Carrying a User-Agent parameter in the request headers is enough to get through.
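For item 2, a rough sketch of attaching a User-Agent header, using only the standard library so that it runs without extra packages. The UA string and the URL are placeholders chosen for illustration, and `build_request` is a hypothetical helper name.

```python
from urllib.request import Request

# Example browser-like User-Agent string (an assumption; use any current one).
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/120.0.0.0 Safari/537.36"
    )
}


def build_request(url: str) -> Request:
    """Return a request object that carries the User-Agent header."""
    # Nothing is sent here; pass the result to urllib.request.urlopen to fetch.
    return Request(url, headers=HEADERS)


req = build_request("https://example.com/")  # placeholder URL
```

The same `HEADERS` dict can be passed as `headers=HEADERS` to `requests.get` if you use requests instead.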

3. Captcha on simulated login: this one is a little troublesome. Sometimes the captcha is tied to a Set-Cookie header, so the cookie changes on every refresh and you cannot simply download the image again. In that case, use save_screenshot to capture the whole page, find where the captcha image sits, and use its position coordinates to crop the captcha out of the screenshot with PIL. The same coordinates can be used to simulate clicks. A captcha-solving platform can then parse the content of the image; look for one on your own.
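The cropping step in item 3 might look roughly like the sketch below. It assumes Selenium, where `driver.save_screenshot(path)` writes the screenshot and an element's `location` and `size` are dicts with `x`/`y` and `width`/`height`. The helper names and the `scale` parameter (for screens whose screenshots are larger than CSS pixels) are assumptions added for illustration. The coordinates only line up if the captcha is inside the captured viewport.

```python
def crop_box(location: dict, size: dict, scale: float = 1.0) -> tuple:
    """Turn an element's location and size into a PIL crop box.

    The box is (left, upper, right, lower) in screenshot pixels.
    """
    left = int(location["x"] * scale)
    upper = int(location["y"] * scale)
    right = int((location["x"] + size["width"]) * scale)
    lower = int((location["y"] + size["height"]) * scale)
    return (left, upper, right, lower)


def crop_captcha(screenshot_path, location, size, out_path, scale=1.0):
    """Cut the captcha region out of a full-page screenshot file."""
    from PIL import Image  # Pillow; imported here so crop_box works without it

    with Image.open(screenshot_path) as page:
        page.crop(crop_box(location, size, scale)).save(out_path)


# Typical call order with Selenium (not executed here):
#   driver.save_screenshot("page.png")
#   crop_captcha("page.png", element.location, element.size, "captcha.png")
```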

4. cookie: handle it manually by adding a cookie parameter to the headers, or use requests.Session() to access the page, which obtains the cookie for you and sends it on later requests.
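For the manual route in item 4, one possible helper turns a Cookie string copied from the browser into a dict. The function name and the sample cookie values are made up for illustration.

```python
def parse_cookie_header(raw: str) -> dict:
    """Split a 'name=value; name2=value2' Cookie string into a dict."""
    cookies = {}
    for pair in raw.split(";"):
        if "=" not in pair:
            continue  # skip empty or malformed fragments
        # Split on the first '=' only, since values may contain '=' themselves.
        name, value = pair.strip().split("=", 1)
        cookies[name.strip()] = value.strip()
    return cookies


RAW_COOKIE = "sessionid=abc123; csrftoken=xyz=="  # made-up sample values
cookies = parse_cookie_header(RAW_COOKIE)
```

With requests, the resulting dict can be passed as `requests.get(url, cookies=cookies)` or loaded into a session with `session.cookies.update(cookies)`.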

5. IP: the server restricts access by IP, for example by banning an address that makes too many requests in a short period. Using an IP pool solves this, but you have to build the pool yourself or buy one.
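A minimal sketch of rotating through an IP pool as described in item 5. The proxy addresses are placeholders from a documentation-only address range, and `ProxyRotator` is a hypothetical name; a real pool would also need health checks and removal of dead proxies.

```python
from itertools import cycle

# Placeholder proxies; replace with addresses from your own or a purchased pool.
PROXY_POOL = [
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
    "http://203.0.113.12:8080",
]


class ProxyRotator:
    """Hand out proxies from the pool in round-robin order."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("proxy pool is empty")
        self._cycle = cycle(proxies)

    def next_proxies(self) -> dict:
        """Return a mapping in the shape requests expects for `proxies=`."""
        proxy = next(self._cycle)
        return {"http": proxy, "https": proxy}


rotator = ProxyRotator(PROXY_POOL)
```

Each request would then take a fresh mapping, for example `requests.get(url, proxies=rotator.next_proxies())`.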

6. Dynamically loaded data: the data is requested only when you scroll the mouse or click "load more". The URL of that request usually carries fixed parameters, so find it in the browser's network panel and specify the same parameters in your own request. Practice this on a real site.
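Item 6 can be sketched as building the same URL the page requests, changing only the paging parameter. The endpoint and every parameter name below are hypothetical; the real ones have to be read from the site's own requests.

```python
from urllib.parse import urlencode

BASE_URL = "https://example.com/api/list"  # hypothetical endpoint
FIXED_PARAMS = {"type": "news", "size": 20}  # hypothetical fixed parameters


def page_url(page: int) -> str:
    """Build the URL for one page of dynamically loaded data."""
    params = dict(FIXED_PARAMS, page=page)  # only the page number varies
    return f"{BASE_URL}?{urlencode(params)}"


# URLs for the first three pages, ready to be fetched one by one.
urls = [page_url(p) for p in range(1, 4)]
```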

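7. Image lazy loading: this is done with JS. When the page is accessed, only the first part is returned with real image addresses; the rest of the images get a pseudo attribute, and when loading is triggered the pseudo attribute is turned into the real attribute. When scraping, read whichever of the two attributes holds the image address, the pseudo attribute or the real one.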
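A small standard-library sketch of item 7: for each `img` tag, take the pseudo attribute if present and fall back to the real `src`. The attribute names in `LAZY_ATTRS` are common examples and an assumption here; check the page source for the name a given site actually uses.

```python
from html.parser import HTMLParser

# Pseudo attribute names seen on lazy-loading pages (assumed examples).
LAZY_ATTRS = ("src2", "data-src", "data-original")


class ImageUrlParser(HTMLParser):
    """Collect one image URL per <img> tag, preferring pseudo attributes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # Try the pseudo attributes first, then the real src attribute.
        for name in LAZY_ATTRS + ("src",):
            if attr_map.get(name):
                self.urls.append(attr_map[name])
                break


def extract_image_urls(html: str) -> list:
    parser = ImageUrlParser()
    parser.feed(html)
    return parser.urls
```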

8. JS obfuscation: the JS code that comes back looks like a jumbled mess. Pasting it directly into a deobfuscation site is enough.

9. JS reverse engineering: many sites now bind events to page content in front-end JS. A click, a scroll, or a page turn sends a request that returns the specified content, and some sites encrypt the request parameters on the front end. One solution is to download the JS to your machine and run its functions with execjs, passing the parameters as a JSON-formatted string, so that you simulate executing the JS code and get back the data it returns.
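Item 9 might be sketched as follows, assuming the PyExecJS package (imported as `execjs`) and a JavaScript runtime such as Node.js are installed. `build_payload` and `call_js` are hypothetical helper names, and the file and function names in the closing comment stand in for whatever the target site's script actually defines.

```python
import json


def build_payload(params: dict) -> str:
    """Serialize the request parameters to a compact JSON string."""
    return json.dumps(params, separators=(",", ":"))


def call_js(js_source: str, func_name: str, payload: str):
    """Run one function from downloaded JS code and return its result."""
    import execjs  # PyExecJS; imported here so build_payload works without it

    ctx = execjs.compile(js_source)  # load the downloaded script
    return ctx.call(func_name, payload)  # call the function with the JSON string


payload = build_payload({"page": 1, "keyword": "python"})

# Typical use (names are placeholders, not executed here):
#   with open("encrypt.js", encoding="utf-8") as f:
#       signed = call_js(f.read(), "encrypt", payload)
```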

Origin www.cnblogs.com/blackball9/p/11923179.html