Cat Brother Teaches You to Write Crawlers 048 -- Crawlers and Anti-Crawlers

Let's talk about anti-crawlers. Almost all engineers who work on anti-crawling agree on one point: an anti-crawler system can never eliminate crawlers completely. What it can do is keep crawler traffic within an acceptable range so that it does not get out of hand.

The reason is simple: a crawler is ultimately just code, and the requests it sends over the network look the same as real visits. The server has no sure way to tell a person from a crawler. If a site tried to ban crawlers completely, normal users would be locked out as well. So the site can only limit crawlers, not prohibit them.

So let's first look at the common anti-crawler techniques, and then think about how to deal with each of them.

Some sites check the request headers (Request Headers). In that case we fill in user-agent to declare our identity, and sometimes also origin and referer to declare where the request comes from.
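As a minimal sketch of filling in these headers, the snippet below builds a request with only the standard library. The function name `build_request`, the example.com URLs, and the browser string are assumptions made for illustration, not something from this series.

```python
import urllib.request


def build_request(url, referer=None, origin=None):
    """Build a request whose headers resemble those of a normal browser."""
    headers = {
        # Example browser string (an assumption); copy the one your own browser sends.
        "User-Agent": (
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
            "AppleWebKit/537.36 (KHTML, like Gecko) "
            "Chrome/120.0.0.0 Safari/537.36"
        ),
    }
    # Referer and Origin are only added when the caller supplies them.
    if referer:
        headers["Referer"] = referer
    if origin:
        headers["Origin"] = origin
    return urllib.request.Request(url, headers=headers)


request = build_request(
    "https://example.com/api/list",  # hypothetical URL
    referer="https://example.com/",
    origin="https://example.com",
)
# Sending it would be: urllib.request.urlopen(request, timeout=10)
print(request.header_items())
```

If you use requests as in the earlier lessons, the same dictionary can be passed as `requests.get(url, headers=headers)`.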

**Some sites require login and will not give you access unless you are logged in.** In that case we use what we learned about cookies and session to simulate a login.
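The idea of keeping cookies across requests can be sketched with the standard library as follows. The helper names `make_session` and `login`, and the form field names `username` and `password`, are hypothetical; a real site defines its own login URL and fields.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return an opener that remembers cookies, plus the jar that stores them."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def login(opener, login_url, username, password):
    """POST the login form; cookies set by the response land in the opener's jar."""
    form = urllib.parse.urlencode(
        {"username": username, "password": password}  # hypothetical field names
    ).encode("utf-8")
    with opener.open(login_url, data=form, timeout=10) as response:
        return response.status


opener, jar = make_session()
# After a successful login(opener, ...), later opener.open(...) calls
# send the stored cookies back automatically.
```

With requests, a `requests.Session()` object plays the same role: it stores the cookies from the login response and reuses them on later requests.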

Some sites add more complex interactions, such as a verification code (CAPTCHA) that blocks automated login. This is harder to deal with, and there are two general approaches: use Selenium to open a real browser and enter the code by hand, or recognize the code automatically with OCR and image-processing libraries (tesserocr / pytesseract / pillow).

Some sites restrict access by IP. What does that mean? Whenever we go online, we carry an IP address. An IP address is like a phone number: with someone's phone number you can call them, and likewise, with a device's IP address you can communicate with that device.

Search for "IP" in a search engine and you can see your own IP address.

If one IP address crawls a website too frequently, the server will temporarily block requests from that IP address.

There are two solutions: use time.sleep() to slow the crawler down, or build a pool of proxy IPs (you can search online for available proxies) and switch to another IP when one stops working.
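Both ideas can be combined in one small sketch, shown here with the standard library. `ProxyPool`, `fetch_with_pool`, and the proxy addresses are hypothetical names and values chosen for illustration; real proxy addresses have to come from a source you trust.

```python
import random
import time
import urllib.request


class ProxyPool:
    """A tiny in-memory pool of proxy addresses (hypothetical helper)."""

    def __init__(self, proxies):
        self.proxies = list(proxies)

    def pick(self):
        """Return a random proxy, or raise if none are left."""
        if not self.proxies:
            raise RuntimeError("proxy pool is empty")
        return random.choice(self.proxies)

    def discard(self, proxy):
        """Drop a proxy that no longer works."""
        if proxy in self.proxies:
            self.proxies.remove(proxy)


def fetch_with_pool(url, pool, delay_seconds=1.0, max_tries=3):
    """Fetch url through a proxy from the pool, pausing after every attempt."""
    for _ in range(max_tries):
        proxy = pool.pick()
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({"http": proxy, "https": proxy})
        )
        try:
            with opener.open(url, timeout=10) as response:
                return response.read()
        except OSError:
            # URLError and timeouts are OSError subclasses: treat the proxy as dead.
            pool.discard(proxy)
        finally:
            # Slow down regardless of success or failure.
            time.sleep(delay_seconds)
    return None


pool = ProxyPool(["http://10.0.0.1:8080", "http://10.0.0.2:8080"])  # made-up addresses
# html = fetch_with_pool("https://example.com/", pool, delay_seconds=2.0)
```

A fixed delay is the simplest choice. Whether a failed proxy should be discarded right away or retried later is a design decision that depends on how reliable your proxies are.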

These are the most common anti-crawler strategies you will meet, together with the corresponding ways to cope with them. You will find that nothing can really stop you. This confirms the earlier point: an anti-crawler system can never eliminate crawlers completely. It can only keep crawler traffic within an acceptable range so that it does not get out of hand.

Reproduced from: https://juejin.im/post/5cfc4adef265da1ba328b780

Origin blog.csdn.net/weixin_34326558/article/details/91449214