Python web crawlers: "even thieves have a code"

Recently, news broke that, because of a piece of crawler code, more than 200 people at a single company were terminated. So which crawlers are illegal?
If a crawler collects personal information and that information is put to illegal use, this constitutes the illegal act of unlawfully obtaining citizens' personal information.
Key point: in the following situations, crawling may be illegal, and in serious cases even criminal.

  1. The crawler circumvents the anti-crawling measures set by the website operator, or breaks the server's protections against crawling, and illegally obtains information. In serious cases this may constitute the "crime of illegally obtaining computer information system data."
  2. The crawler interferes with the normal operation of the website or system being visited. If the consequences are serious, this violates criminal law and constitutes the "crime of damaging a computer information system."
  3. The information the crawler collects is citizens' personal information, which may constitute the illegal act of unlawfully obtaining citizens' personal information. If the circumstances are serious, it may constitute the "crime of infringing on citizens' personal information."
  1. Problems caused by crawlers
    Performance "harassment" (extra load on the server), legal risk, privacy leakage
  2. Restrictions on web crawlers
  • Source review: restricting access by User-Agent
    Check the User-Agent field of the HTTP request header and respond only to browsers or friendly crawlers.
  • Announcement: the Robots protocol
    Tell all crawlers the site's crawling policy and require them to comply. The policy is published
    in the robots.txt file in the site's root directory, for example, Jingdong: http://www.jd.com/robots.txt.
  3. In principle, access that resembles human browsing need not consult the Robots protocol (robots.txt). Commercial crawlers must follow it, otherwise they face legal risk.
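
The two restrictions above can be sketched in Python using only the standard library. This is a minimal illustration, not how any particular site does it: the robots.txt content, the allow-list of User-Agent tokens, the crawler names (FriendlyBot, BadSpider) and the example.com URLs are all made up for the example. Only urllib.robotparser is a real standard-library module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for illustration (not copied from a real site).
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: BadSpider
Disallow: /
"""

# Hypothetical allow-list of User-Agent substrings a site operator might accept.
ALLOWED_AGENT_TOKENS = ("Mozilla", "FriendlyBot")


def passes_source_review(user_agent: str) -> bool:
    """Server side: respond only to browsers or known friendly crawlers."""
    return any(token in user_agent for token in ALLOWED_AGENT_TOKENS)


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Crawler side: check a URL against robots.txt rules before fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A crawler would call allowed_by_robots before each request and skip any URL for which it returns False. To check a live site, RobotFileParser also offers set_url() and read() for downloading the file, which this sketch avoids so that it runs without network access.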

Origin blog.csdn.net/qq_42692319/article/details/102633216