Preventing a Python crawler's IP address from being blocked by anti-crawler technology: an explanation

Today I would like to share an explanation of how to keep a Python crawler's IP address from being blocked by anti-crawler measures. I think the content is useful and has good reference value, so readers who need it can follow along.
When a crawler scrapes data from another site, crawling too fast (or some other reason) may lead the site to identify it as a crawler, and your IP address then risks being blocked. Once the IP is blocked, the crawler can no longer fetch any data.

What are the common ways for a crawler to change its IP?

1. Use a dynamic IP dial-up server.

A dynamic IP dial-up server is a server whose IP address can be changed dynamically. There is nothing fancy about it; on the contrary, it is usually a low-spec machine. We use it not for its computing power but because it can change its IP within seconds.

The key feature of a dynamic IP dial-up server is that every dial assigns a new IP address. It works like the ADSL dial-up connections many households used years ago: disconnect from the network, dial again, and the external IP is replaced with another one.

The IP pool is generally large and may span several A and B segments, so in practice the addresses will not run out. For a crawler this is a powerful weapon that easily overcomes IP-based blocking.

Dynamic IP dial-up servers are a paid service.
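
The redial step described above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the commands pppoe-stop and pppoe-start are placeholders (the actual dial commands depend on the provider's server image), and get_ip is a hypothetical callable you supply that returns the current external IP.

```python
import subprocess
import time
from typing import Callable, Sequence

# Assumed command names: placeholders only, check what your provider ships.
STOP_CMD = ("pppoe-stop",)
START_CMD = ("pppoe-start",)


def run_command(cmd: Sequence[str]) -> None:
    """Run one dial command and raise if it exits with a non-zero status."""
    subprocess.run(list(cmd), check=True)


def redial(get_ip: Callable[[], str],
           run: Callable[[Sequence[str]], None] = run_command,
           max_attempts: int = 3,
           wait_seconds: float = 2.0) -> str:
    """Hang up and dial again until the external IP differs from the old one.

    get_ip is a hypothetical callable that returns the current external IP.
    """
    old_ip = get_ip()
    for _ in range(max_attempts):
        run(STOP_CMD)             # disconnect from the network
        time.sleep(wait_seconds)
        run(START_CMD)            # dial again, which should assign a new IP
        time.sleep(wait_seconds)
        new_ip = get_ip()
        if new_ip != old_ip:
            return new_ip
    raise RuntimeError("external IP did not change after redialing")
```

A crawler could call redial() whenever it detects that the current IP has been blocked, then continue with the new address. The run parameter is injectable so the loop can be tried out without touching the real connection.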

2. Use a Tor proxy server.

Tor (The Onion Router) is one of the most effective privacy-protection tools on the Internet. Without Tor, a network request is sent directly to the target server.

In contrast, when a request is sent through Tor, the client chooses a random path to the server. That path passes through several Tor nodes and uses onion routing with layered encryption, so no single node can read the encrypted data, and the route of the request is hard to trace back to its starting point.
Therefore, we can use Tor's ability to change the IP address a request appears to come from as a last-resort way to keep a crawler's IP from being blocked.
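
Sending a request through Tor might look roughly like this. It is a sketch, not a definitive setup: it assumes a Tor client is already running locally with its SOCKS port on 127.0.0.1:9050, and that the third-party requests library is installed with SOCKS support (pip install "requests[socks]"). The function names tor_proxies and fetch_via_tor are made up for illustration.

```python
from typing import Dict

# Assumption: a local Tor client is listening on its usual SOCKS port.
TOR_SOCKS_HOST = "127.0.0.1"
TOR_SOCKS_PORT = 9050


def tor_proxies(host: str = TOR_SOCKS_HOST,
                port: int = TOR_SOCKS_PORT) -> Dict[str, str]:
    """Build a proxies mapping that sends HTTP and HTTPS traffic through Tor.

    The socks5h scheme asks the proxy to resolve hostnames too, so DNS
    lookups also go through Tor rather than the local resolver.
    """
    proxy_url = f"socks5h://{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


def fetch_via_tor(url: str, timeout: float = 30.0) -> str:
    """Fetch a page through the local Tor SOCKS proxy and return its text."""
    # Third-party dependency, imported here so tor_proxies() works without it.
    import requests

    response = requests.get(url, proxies=tor_proxies(), timeout=timeout)
    response.raise_for_status()
    return response.text
```

Note that routing through Tor alone does not rotate the exit IP on every request. To get a different exit node, the Tor client has to be asked for a new circuit (for example with the NEWNYM signal on its control port), which is not shown here.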

While recently studying "Python web crawler from entry to practice", I learned about some anti-crawler topics and wrote up these study notes.

Summary

That's all for this article. I hope it has some reference value for your study or work.


Origin blog.csdn.net/haoxun09/article/details/104600598