Another data company investigated: what did crawlers actually do wrong?

On the afternoon of September 6, many industry insiders reported that Scorpion Data Technology Co., a well-known big data service company in Hangzhou, had allegedly been placed under the control of law enforcement officers, and that a core executive surnamed Zhou had been taken away by the police.

This news spread through tech circles a few days ago. With yet another data company under investigation, many data practitioners and crawler developers let out a "sigh": "Use crawlers well, XX comes early; play with data slickly, XX eat your fill."

Scorpion Technology is a data services company. Back in 2017, an article titled "Ferocious crawlers: crawling Alipay, crawling WeChat, stealing data for cash-loan lending" accused it of using crawlers maliciously.

As for why Scorpion Technology was investigated, that awaits the findings of law enforcement, and we will not speculate without grounds here.

What I want to discuss today is the legality of crawlers. Through a number of cases, I want to explore how a crawler developer can stay clear of the red line.

As a computer technology, crawling is technology-neutral, and the law has never banned it. Its history goes back some 20 years: search engines, aggregated navigation, data analysis, artificial intelligence and other businesses all rely on crawler technology.

But crawling is a technical means of data acquisition, and some data is sensitive. If you cannot tell which data may be crawled and which touches the red line, the protagonist of the next news story may be you.

No express provision defines when crawling is legal, but after reading many articles, incidents, talks and judicial cases, I have summarized three key points: the collection channel, the collection behavior, and the purpose of use.

Data collection channel

The channel through which data is crawled deserves the most attention. In general, obtaining data that is unpublished, unauthorized, or that contains sensitive information is illegal no matter which channel is used.

So before collecting this kind of sensitive data, especially users' personal information or data held on other commercial platforms, it is best to check the relevant laws and regulations first and find a suitable channel.

Personal data

Collecting and analyzing personal information is probably something every Internet company does today, but most personal data is non-public and must be obtained through legal channels. See Article 41 of the Cybersecurity Law:

Network operators collecting and using personal information shall follow the principles of legality, propriety and necessity, make their collection and use rules public, expressly state the purpose, method and scope of the collection and use, and obtain the consent of the persons whose information is collected ...

That is, users must be told in advance how, how much and why their data is collected, and the data may be collected and used only after the user authorizes or consents. This is the section on information collection that we commonly see in the user agreements of websites and apps.

A related negative example:

On August 20, Surging News (The Paper) learned from the Shaoxing City Public Security Bureau that the bureau had recently cracked an extraordinarily serious traffic hijacking case. Beijing Ruizhi Hua Sheng Technology Co., Ltd., a company listed on the New Third Board, is suspected of illegally stealing 3 billion items of users' personal information, involving the products of 96 Internet companies nationwide, including Baidu, Tencent, Alibaba and JD.com. Police have arrested six suspects from the company and its affiliates.
......
While cooperating with legitimate telecom operators, Beijing Ruizhi Hua Sheng and its affiliates inserted illegal software that was used to filter the traffic and capture users' cookies.

Excerpted from Surging News: "New Third Board listed company involved in theft of 3 billion items of personal information, with illegal profits exceeding ten million yuan."

Public data

Data that is collected from legitimate public channels, and whose collection does not clearly go against the wishes of the personal information subject, is basically not a problem. But if you obtain data through cracking, intrusion or other "hacker" means, there are laws waiting for you.

Criminal Law, Article 285, paragraph 2:

Whoever, in violation of state regulations, intrudes into a computer information system other than those specified in the preceding paragraph, or uses other technical means to obtain data stored, processed or transmitted in such a system, or exercises illegal control over such a system, shall, if the circumstances are serious, be sentenced to fixed-term imprisonment of not more than three years or criminal detention, and/or a fine; if the circumstances are especially serious, to fixed-term imprisonment of not less than three years but not more than seven years, and a fine.

Violating the Robots protocol

Although no legislation enforces compliance with the Robots protocol, it is an industry convention, and following it will give you support in a legal dispute.

Because the Robots protocol serves as guidance, if a platform explicitly marks pages as Disallow to protect their data, think carefully before crawling them.
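As a rough illustration of checking Disallow rules before crawling, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example-crawler` user agent, the URLs and the helper names are all made up for this example; a real crawler would fetch the site's actual robots.txt first.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch needs no network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set we can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if the rules do not disallow this URL for this agent."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
for url in ("https://example.com/private/data", "https://example.com/articles/1"):
    verdict = "allowed" if may_fetch(parser, "example-crawler", url) else "disallowed"
    print(url, "->", verdict)
```

Passing this check only means the page is not marked Disallow; it says nothing about whether the data itself is sensitive or how it may be used.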

Data collection behavior

Use technology with restraint. Before doing anything likely to interfere with, or even damage, the other side's servers and business, assess how much load they can bear. After all, not every company operates at BAT scale.

High-concurrency pressure

Engineers habitually focus on optimization, and crawler development is no exception: we try every way to raise concurrency and request throughput. But high concurrency can produce request volumes that look almost like a DDoS attack, and we should be wary of putting pressure on the other side's servers and affecting their normal business.

If this leads to serious consequences, see Article 286 of the Criminal Law:

Whoever, in violation of state regulations, deletes, modifies, adds to or interferes with the functions of a computer information system, so that the system cannot operate normally, commits a crime if the consequences are serious.

So please crawl slowly. Even if the target has no anti-crawling limits, do not recklessly turn on high concurrency; first weigh the strength of the other side's servers.
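One simple way to "crawl slowly" is to enforce a minimum gap between consecutive requests. The sketch below is only an illustration: the `PoliteThrottle` class and its parameters are hypothetical names, and a suitable interval depends entirely on the target site.

```python
import time


class PoliteThrottle:
    """Enforce a minimum delay between consecutive requests (hypothetical helper)."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock      # injectable so the logic can be tested without waiting
        self._sleep = sleep
        self._last = None        # time of the previous request, if any

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        waited = 0.0
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
                now = self._clock()
        self._last = now
        return waited


# Usage sketch (fetch() is a placeholder for your own request code):
#   throttle = PoliteThrottle(min_interval_s=2.0)
#   for url in urls:
#       throttle.wait()
#       fetch(url)
```

A fixed delay like this is deliberately conservative; it trades crawl speed for a lower risk of disturbing the other side's service.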

Affecting normal business

Besides high-concurrency requests, other behaviors can also affect the business. A common one is automated order grabbing, which harms the experience of normal users.

Purpose of data use

The purpose of use is equally critical. Even if you collect data by legal means, using it improperly is still unlawful.

Use beyond the agreed purpose

One case is data that is collected openly but is not used for the purpose the user was told about in advance. For example, the user agreement says the data is only used to analyze user behavior and improve the product experience, yet the company turns around and sells user profile data.

Another case involves works protected by intellectual property or copyright. You may be allowed to download or cite them, but the permitted scope is clearly marked: for example, no reproduction, no commercial use, no misappropriation. These rights are expressly protected by law, so use such works with care.

Other cases are not listed here.

Sale of personal information

As for selling personal information, just do not do it; the law expressly prohibits it. See:

According to Article 5 of the "Interpretation of the Supreme People's Court and the Supreme People's Procuratorate on Several Issues Concerning the Application of Law in Handling Criminal Cases of Infringing on Citizens' Personal Information", "serious circumstances" include:
(1) illegally obtaining, selling or providing more than 50 items of whereabouts and track information, communication content, credit information or property information;
(2) illegally obtaining, selling or providing more than 500 items of accommodation information, communication records, health and physiological information, transaction information or other personal information of citizens that may affect personal or property safety;
(3) illegally obtaining, selling or providing more than 5,000 items of citizens' personal information other than that specified in items 3 and 4 of that article.
Any of these meets the "serious circumstances" requirement of the "crime of infringing on citizens' personal information".
In addition, providing lawfully collected personal information of citizens to others without the consent of the persons it was collected from also counts as "providing citizens' personal information" under Article 253-1 of the Criminal Law, and may constitute a crime.

Unfair business practices

If you take a competitor's data and use it for your own company's commercial purposes, this may constitute unfair competition or an infringement of intellectual property rights.

This is currently the most common situation in commercial litigation involving crawlers. A well-known case from two years ago: the "Chelaile" (车来了) app crawled bus data from its competitor "Kumike" (酷米客) and displayed it in its own product:

Although, for buses as public transport, information such as the actual route and running times is merely objective fact, once this information has been manually collected, analyzed, edited and integrated, combined with precise GPS positioning, and used as the back-end data of a public information query app, it has practical value and can bring its rights holder actual or potential, present or future economic benefits; it already has the attributes of intangible property. Yuanguang's use of web crawler technology to obtain in bulk, and use without payment, the real-time bus information data of Gumi's "Kumike" software is in fact an act of "reaping without sowing" and "fattening oneself at others' expense", and constitutes unfair competition.

Excerpted from Shenzhen Intermediate People's Court civil judgment (2017) Yue 03 Min Chu No. 822.

A "crawler law" is on the way

The good news is that relevant regulations are already on the way.

At 0:00 on May 28, the State Internet Information Office (Cyberspace Administration of China) released the draft "Data Security Management Measures" for public comment.

I have read through this draft. It sets out a number of rules on data collection, storage, transmission and use, including several on crawling behavior (it is still at the comment stage and may change later).

For example, Chapter II, Article 16:

Where network operators use automated means to access and collect website data, they shall not impede the normal operation of the website. Where such behavior seriously affects the operation of the website, for example where automated access and collection traffic exceeds one third of the website's average daily traffic, and the website requests that the automated access and collection stop, it shall stop.
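To make the one-third figure concrete, the arithmetic can be sketched as below. This is only an illustration of the ratio in the draft text, not a compliance tool: the function name and the request counts are hypothetical, and in practice only the site operator is likely to know its own average daily traffic.

```python
def exceeds_one_third(crawler_requests: int, site_daily_requests: int) -> bool:
    """Return True if automated traffic is more than 1/3 of the site's daily traffic.

    Both counts are hypothetical inputs, e.g. taken from the site's access logs.
    Integer arithmetic avoids floating-point rounding right at the threshold.
    """
    if site_daily_requests <= 0:
        raise ValueError("site_daily_requests must be positive")
    if crawler_requests < 0:
        raise ValueError("crawler_requests must not be negative")
    return crawler_requests * 3 > site_daily_requests


# 400 of 1,000 daily requests is more than one third; 300 is not.
print(exceeds_one_third(400, 1000))
print(exceeds_one_third(300, 1000))
```

Under the draft wording, crossing this ratio is given as an example of seriously affecting the site, and the obligation to stop applies when the website asks for the collection to stop.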

Chapter III, Article 27:

Before providing personal information to others, network operators shall assess the potential security risks and obtain the consent of the personal information subject, except in the following cases:
(a) the information is lawfully collected from public channels and providing it does not clearly go against the wishes of the personal information subject;
(b) the personal information subject has voluntarily made it public;
(c) it has been anonymized;
(d) it is necessary for law enforcement agencies to perform their duties in accordance with the law;
(e) it is necessary to safeguard national security, the public interest, or the life and safety of the personal information subject.

Excerpted from the "Data Security Management Measures (Draft for Comment)".

Epilogue

That concludes my look at the legality of crawlers. Many cases went unmentioned because of space and angle, and some of my opinions and conclusions may be wrong.

But I hope this gives crawler developers, and other developers too, some food for thought: technology is neutral, but its use can be good or evil, so use it reasonably, compliantly and with strict caution.

This article is original content, first published on the WeChat official account " for Life program ". To reprint it, please leave a message in the official account's backend.


Origin www.cnblogs.com/zkqiang/p/11515941.html