Insiders and hackers peddle personal information as "annual output value" soars past 100 billion yuan

This article is reproduced from Tencent. Have you ever wondered about this: you have just chatted with friends about everyday topics such as financial management, beauty, buying a house or loans, so why do Douyin, Tencent News and even some video sites then push you ads related to what you were just talking about?

People have never been as anxious about personal privacy as they are now. This year's "March 15 Gala" exposed that, because of lax management at Zhaopin, 51job, Liepin and others, large numbers of personal resumes were leaked and resold, forming a black-market industry. In addition, apps such as Memory Optimizer, Super Cleaner and Phone Manager Pro, in the name of cleaning up memory, used technical means to continuously harvest information from the phone, including application lists, location information and address books.

Recently, a Securities Times reporter went deep into data-trading QQ groups with thousands of members and found, shockingly, that users' private data from every industry was being sold openly. From time to time someone in a group would shout an offer: "First-hand GM (stock investor), WD (online loan) and BJ (health product) information, first-hand online shopping data from Pinduoduo, Taobao and Jingdong; contact me if you need data..." The data is divided by industry and clearly priced. There was even a demo of a personal-information collection system that claimed to be able to gather the personal contact details of business owners across the country. There is also all manner of data-crawling software that "crawls" websites, "embeds" itself in apps and "shovels" up the data.

Across the whole data-trading process, insiders, hackers, crawler developers, data cleaners, processors, data suppliers, buyers and others all feed off this trade, giving rise to a data black market with an "annual output value" in the hundred-billion-yuan range.

The flood of app permission requests

The 2020 Netflix documentary "The Social Dilemma" (released in Chinese as "Surveillance Capitalism: Smart Trap") vividly presents this scene: three "staff members" in the back end of a social app intently analyze the young man in front of them, tracking how long he lingers on a picture, which emotions resonate most, and what kind of advertisement will get him to click. One of the three is the engagement goal, which picks the next piece of content based on how long you linger so that you keep scrolling; one is the growth goal, which gets you to invite as many friends as possible to deepen your social dependence; and one is the advertising goal, which makes sure that when you are interested in something, a purchase link is delivered to you precisely.

Behind all of this behavior is the so-called algorithmic model, and behind a precise algorithm lies massive data that turns people into numbers.

So, where does this data come from?

Obtaining permissions is the first step by which businesses large and small collect users' private data through apps or mini programs. When you install an app and a user agreement tens of thousands of words long appears on your palm-sized phone screen, do you read it word for word or quickly tap "Agree"? Tapping "Disagree" will most likely make the app exit and become unusable.

It is an indisputable fact that apps overreach when requesting permissions. Take Meitu Xiuxiu: it is hard to imagine that a photo-editing app needs so much information about a person, including search history, browsing history, and even calendar and geographic location. A careful reading of Meitu Xiuxiu's personal information protection policy shows that when content from Meitu Xiuxiu is shared to a third-party platform, the user's list of installed applications is also read. Meitu Xiuxiu will also provide ID number information to game partners, and even share user payment information with partners.

The terms also state that, given the interoperability of modern mobile Internet products, the product may be connected to other products or functions launched by Meitu affiliates or external partners. For example, when the wallet function is used, Meitu may obtain from third parties the user's mobile phone number, credit limit, repayment amount, loan approval status, overdue status and so on.

This means that once a user uses Meitu software and grants authorization, Meitu Xiuxiu can obtain user information not only from its own app but also, in more detail, from third-party platforms.

"This kind of behavior is actually very common. Domestic users may not have a strong awareness of personal information protection, which gives companies a lot of room to maneuver. The industry calls it 'staking out a spot'. Some data is not needed now, but that does not mean it will not be needed in the future. Of course, once user authorization has been obtained, the more user information that can be captured, the better," said Xiao Qiang, a big data risk control architect at a financial technology company.

Securities Times reporters tallied the permissions of 25 apps covering clothing, food, housing, transportation, social networking, entertainment and financial management, and found that address book access, which is closely tied to a user's social circle, has become a standard permission request. In addition, through certain specific features these apps also read mailing addresses, phone storage and photos, and even record facial recognition data, calendars and call logs. Mobile app permission requests have reached flood levels.

Somewhat reassuringly, regulation of apps that over-request permissions to collect data is being tightened.

On March 22, the Cyberspace Administration of China, the Ministry of Industry and Information Technology, the Ministry of Public Security, and the State Administration for Market Regulation jointly issued the "Regulations on the Scope of Necessary Personal Information for Common Types of Mobile Internet Applications". The regulations define the scope of necessary personal information for 39 common categories of apps, such as map navigation, instant messaging and online shopping, and require that operators not deny users access to an app's basic functions and services because they decline to provide non-essential personal information.
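The rule described above can be read as a simple set comparison between what an app requests and what its category actually needs. The following is a minimal sketch of that idea; the category names, field names and the contents of `NECESSARY_INFO` are illustrative placeholders chosen for this example, not the regulation's actual lists, and both function names are hypothetical.

```python
# Illustrative placeholders only: these sets are NOT the regulation's actual lists.
NECESSARY_INFO = {
    "map_navigation": {"location", "origin", "destination"},
    "instant_messaging": {"phone_number", "contact_accounts"},
    "online_shopping": {"phone_number", "delivery_address", "payment_info"},
}


def non_essential(category: str, requested: set[str]) -> set[str]:
    """Return the requested fields that fall outside the necessary scope."""
    return requested - NECESSARY_INFO[category]


def may_refuse_basic_service(category: str, declined: set[str]) -> bool:
    """An operator may refuse basic service only if the user declined
    something that is actually necessary for that category."""
    return bool(declined & NECESSARY_INFO[category])


requested = {"location", "address_book", "call_log"}
# Fields a hypothetical map app asks for beyond what its category needs.
extra = non_essential("map_navigation", requested)
# Declining only those extra fields must not block the basic service.
blocked = may_refuse_basic_service("map_navigation", extra)
```

Under these assumptions, a user who declines only the address book and call log could not be locked out of navigation, while declining location itself could still justify refusing the service.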

However, Xiao Qiang told reporters, "Everyone may know that apps collect private personal data, but beyond that, user data may also be collected by third-party SDKs (software development kits) hidden inside the app."

How detailed can the user information collected by an SDK be? Han Honghui, a data security expert at the Beijing Internet Loan Association, said, "Once an SDK is embedded, if you register and log in to the app and authorize by default, all behavioral data can be recorded, and it can silently crawl the phone's chat history, bank account passwords, SMS messages, address book, location information and so on."

As a result, users authorize an app to collect personal information but often do not know when or how that information is shared with third-party SDKs. In the sharing clauses of many app "privacy policies", the most common wording is "may share the user's personal information with a third party". Yet almost no app lists those "third parties" in detail in its privacy policy.

Concern about personal information security reflects users' increasingly sensitive nerves, and also shows how little right to know and how little control users have over their personal data. For users, an SDK is like a hidden "time bomb", and the danger is self-evident.

The leaking and misuse of user information by SDK providers is highly concealed and has become one of the sources of user privacy leaks.

Who stole user privacy?

A sales manager at Shuteng Technology told reporters that the company has its own special channels for obtaining data, the most important of which is third-party SDKs.

"Data obtained through this channel is more accurate. It works like a funnel model, with the data filtered according to need. Take user data in the online loan industry: a user logs in to XX Puhui, and using that app requires authorization. Once the SDK is authorized, all of that user's login traces are collected. If other consumer finance companies use the same SDK, they can share the data too."

When the reporter further asked which SDK partner he was cooperating with, the manager refused to disclose it on the grounds of "sensitive information".

What cannot be ignored is that the reselling of users' personal information over the Internet is rampant. Recently, reporters slipped into several QQ groups with thousands of members each and found people in the groups regularly hawking the personal information of citizens from all walks of life.

Posing as a buyer, the reporter contacted a seller named "Kongcheng" on QQ and asked for samples of investors' personal information on the pretext of testing the authenticity of the data.

To prove where the data came from, "Kongcheng" sent the reporter a screenshot of the data sources. The investors' personal information had been collected from the apps of major securities firms; GF Securities, China Investment Securities, Guotai Junan and others were all affected.

As "Kongcheng" claimed, there are indeed people in the QQ groups who openly sell data under the banner of "internal company information". Insiders, the so-called "inner ghosts" who steal what they are entrusted to guard, are one of the main channels through which personal information flows into the black market. Jobs with access to large amounts of personal information do not have a high entry threshold and do not require a senior position, so leaks can originate at any level.

In 2020, public security organs cracked down on the theft and leaking of citizens' personal information by people exploiting their jobs. People from a range of industries were implicated, and more than 500 insiders in key industries were caught. This is just the tip of the iceberg.

In addition to "inner ghosts" leaking secrets, there are also various technical means to steal citizens' privacy.

During the investigation, reporters found the black-market data trade to be very active, with all kinds of data collection software on offer. One was an app called Huirongke, billed as "the most comprehensive big data customer acquisition software on the whole Internet". The sales manager told reporters, "Our software collects fully automatically. Just search by keyword and you can find the customer resources and groups you want across major websites, the three major map services and the three major telecom carriers. Alongside the customer acquisition function, we can also provide marketing materials, delivery videos and so on. Each function has its own price."

When the reporter asked which three map services the company works with, the sales manager said they were mainly Tencent Maps, Gaode Maps (AutoNavi) and Baidu Maps, that the company is authorized to use their data interfaces, and sent the reporter contract agreements bearing what were said to be the seals of the three map operators.

The reporter then asked Baidu, Tencent and AutoNavi whether they had authorized Huirongke to use platform user data. All three said they did not know the company and would not authorize access to their APIs (data interfaces) casually. A person inside Tencent told reporters that the seal is fake and the font is wrong.

To prove the software's data-crawling ability, the sales manager offered to register a trial account for the reporter in the back end. The reporter then downloaded the app and found that it can search by geographic location, industry, customer type and so on, export the corresponding user data, and add contacts on WeChat with one click.

"Because this is only a trial, you won't see customers' mobile phone numbers. That is also how our company protects the rights and interests of our other members. We cooperate with some third-party SDKs and also connect to the API data interfaces of some large Internet companies. We have strategic partnerships and deep resource integration with Tencent, Baidu, Huawei, Alibaba, Douyin, Kuaishou, Meituan and Ele.me," said the sales manager.

The reporter found that the data sources displayed in the Huirongke software are mainly map data, business registration data, and Internet giants such as Douyin, Kuaishou, Alibaba, Meituan, Ele.me and JD.

Regarding the data sources listed in the software, a Securities Times reporter checked with Tencent, Alibaba, Meituan, JD and others, and most stated that they had not shared API data interfaces with a third party named Huirongke. Only Kuaishou declined to respond. Alibaba's public relations team further stated that the group could not possibly have allowed that company to crawl Ant's user information through API calls, and that it is investigating the matter in depth.

"Being able to crawl user data from these websites certainly involves some technology. In fact, crawling is not mysterious: you simply 'crawl' the web page, 'shovel' up the data, and then process and clean it. There is a lot of software like this. Most of it crawls customer data indiscriminately across the whole Internet and then classifies it precisely through processing. This has also given rise to people who specialize in cleaning data and tagging it," A Qiang, who specializes in writing crawler code, told reporters.

Besides insiders and technical tools, hackers are another major source of large-scale personal information theft. From the earlier JD user password leak to the Home Inns user data leak, websites and hackers have been locked in a protracted battle of attack and defense over user data.

It is not difficult for hackers to steal citizens' personal information by breaking into websites. An intrusion takes anywhere from a few days to a month and is rarely discovered by administrators. There is a tacit understanding in hacker circles: after hacking websites to obtain access and information, hackers trade data with one another, so the databases of stolen personal information keep growing larger and more complete.

In 2020, public security organs nationwide investigated and handled 1,782 cases of hacking and new-technology crime in the "Clean Net 2020" special operation and arrested a total of 2,952 hackers. In reality, many more hackers are still lurking underground.

Personal information flows into the data black market through insiders, crawling technology, hackers and other channels, and ends up in the hands of agents of every tier, large and small.

Personal information carries clear price tags

Data suppliers, that is, the middlemen who connect data sources with data buyers, play a very important role in the underground data market. Through them, personal data circulates on the black market at different prices. Suppliers even recruit their own sub-agents. The higher a supplier's tier, the more data sources it has and the more complete its data.

The sales manager mentioned earlier is one such industry supplier. He told reporters that for records containing only ordinary personal information such as phone number, WeChat and QQ number, his average purchase cost is about 0.4 yuan per record and his selling price about 0.7 to 0.8 yuan, so he earns about 0.3 to 0.4 yuan on each record. "My monthly sales turnover is about 400,000 to 500,000 yuan. I work in finance, education, medical aesthetics and other industries, where demand for this is relatively high."

In contacting and interviewing multiple suppliers, the reporter learned that this sales manager is not a first-tier supplier. A first-tier supplier's purchase cost is about 0.15 yuan per record; a second-tier supplier like Manager Zhu pays 0.4 yuan per record; third-tier suppliers pay 0.7 to 0.8 yuan per record; and the average selling price to end buyers is 1.2 to 1.5 yuan per record.
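The tier prices quoted above imply a markup at each hand-off. The sketch below works through that arithmetic under one assumption that the article does not state outright: each tier's selling price equals the next tier's purchase cost. Prices are held as integer fen (1 yuan = 100 fen) so the subtraction stays exact, and the names `CHAIN` and `margins` are made up for this example.

```python
# (low, high) purchase cost per record at each level, in fen (1 yuan = 100 fen).
# Assumption: what one tier pays is what the tier above it charges.
CHAIN = [
    ("first-tier supplier", (15, 15)),
    ("second-tier supplier", (40, 40)),
    ("third-tier supplier", (70, 80)),
    ("end buyer", (120, 150)),
]


def margins(chain):
    """Return {seller: (min_margin, max_margin)} in fen per record."""
    result = {}
    for (seller, (buy_lo, buy_hi)), (_, (sell_lo, sell_hi)) in zip(chain, chain[1:]):
        # Worst case: bought at the high price, sold at the low price.
        # Best case: bought at the low price, sold at the high price.
        result[seller] = (sell_lo - buy_hi, sell_hi - buy_lo)
    return result


per_tier = margins(CHAIN)
first_cost = CHAIN[0][1][0]
end_lo, end_hi = CHAIN[-1][1]
# How many times the first-tier cost the end buyer pays (low and high end).
total_markup = (end_lo // first_cost, end_hi // first_cost)
```

On these figures the second-tier margin comes out at 30 to 40 fen per record, and a record bought at 0.15 yuan reaches the end buyer at 8 to 10 times that price.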

The above is only the price of ordinary private data on the black market. There are also suppliers who specialize in trading "penetration data". So-called "penetration data" means that everything about a person can be obtained: in addition to basic information such as phone number and WeChat, it includes the user's ID number, travel records, hotel check-in records, call records, family members, job, marital status, place of household registration and so on.

Some suppliers even post prices for "penetration data" directly in the QQ groups: a basic lookup costs 15 yuan per record and covers name, gender and mobile phone number; intermediate information costs 50 yuan per record and adds household registration address, ID card number and photo; advanced information costs 100 yuan per record and, on top of the intermediate information, includes current residential address, hotel check-in records and vehicle information; VIP customers pay 600 yuan per record.

"At normal market prices, call records alone go for about 1,500 yuan, hotel check-in records for about 2,200 to 2,500 yuan, and family member information for about 300 yuan," said a data supplier who goes by the online name "Feng".

According to incomplete statistics, about 5.53 billion records of personal information have been leaked in China, an average of 4 leaked records per person. Information on vehicles, real estate, addresses, occupations, ages, telephone numbers, ID cards and so on circulates frequently on the black market.

The well-known domestic information security team "Rain Attack Group" released a report in October last year stating that over a year and a half, as many as 860 million records of personal information were put up for sale with price tags, leaving personal data essentially "running naked", that is, unprotected.

The gray industry chain is huge

"Buying stock investor and wealth management data, no limit on quantity. If you have the goods, contact me!" After a buyer posted this message in a QQ group, several data suppliers soon pitched their data resources to him by private chat.

After talking to sellers and comparing prices, the buyer told reporters that he had obtained 10,000 records of wealth management customers' personal information, including name, phone number and WeChat, from one supplier at 1 yuan per record. When the reporter asked what the data was mainly for, the buyer said it was only to promote wealth management products.

Interviews with multiple parties indicate that the biggest buyers of personal information are people who need to push advertisements, sell fake invoices, send spam or collect online loans. Among them, real estate firms, wealth management companies, insurance companies, the maternal-and-infant and health product industries, and education and training institutions are the core groups hungry for personal information.

Plenty of stolen personal information is also used for fraud. For example, user information from the health product sector mainly concerns the elderly and is used specifically to defraud them.

In contact with buyers, the reporter found that most of them know that data trading is a black-market business but go ahead anyway. An important reason is that advertising through formal channels, such as Baidu's paid search ranking, costs roughly 60 to 80 yuan per customer acquired, whereas buying user data on the underground black market cuts that cost dramatically.

From information collection to sales to use, every link in the trade is interlocked, and the scale of the resulting "gray industry chain" is hard to estimate. According to a report by Liepin.com, more than 400,000 people currently work in cybercrime in China, and at least 1.6 million rely on it to commit online fraud, with an "annual output value" of more than 100 billion yuan.

Pain points of compliant data trading

There are no accurate statistics on the scale of the underground market for personal information, but the public security organs' special crackdowns offer a glimpse.

In 2020, public security organs across the country pressed ahead with the "Clean Net 2020" special operation. Over the year, a total of 56,000 cybercrime cases were investigated and more than 80,000 criminal suspects were arrested. Of these, 6,524 cases involved infringement of citizens' personal information, with 13,000 suspects arrested.

But obviously, this is not the whole picture of the black market. Manager Chen, the business manager of the Guiyang Big Data Exchange, told reporters, "Currently, there are not many data transactions through formal channels, and more data may still be traded on the black market."

Guiyang Big Data Exchange is the first big data exchange in China. It officially began operating in April 2015 and proclaimed that its daily trading volume would exceed 10 billion yuan within 3 to 5 years. Six years after the exchange was established, Manager Chen told reporters that its current daily trading volume is far from the target set at that time.

Luo Hao, CEO of big data service provider Julixin, and Manager Chen both noted that data rights confirmation, data traceability, security, legality and privacy protection in the trading process have so far not been well solved. Data rights confirmation is especially difficult: data collection, processing, adoption and trading may each involve multiple participants, and there is still no consensus in practice on which participants can obtain rights to the data and under what circumstances.

The red lines visible at present are whether the source is legal and whether the traded data has been desensitized (that is, whether sensitive information has been de-identified and privacy-processed). The problem is that once data is circulating, illegally sourced and non-desensitized data is actually hard to detect.
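To make "desensitized" concrete, here is a minimal masking sketch. It is only an illustration of the general idea: the field names, the number of characters kept, and the helper names `mask` and `desensitize` are assumptions made for this example, not a compliance standard, and the sample values are made up.

```python
def mask(value: str, keep_head: int, keep_tail: int) -> str:
    """Replace the middle of a string with asterisks, keeping both ends."""
    hidden = len(value) - keep_head - keep_tail
    if hidden <= 0:
        # Too short to keep anything safely: hide the whole value.
        return "*" * len(value)
    return value[:keep_head] + "*" * hidden + value[len(value) - keep_tail:]


def desensitize(record: dict) -> dict:
    """Return a copy of the record with sensitive fields masked.
    The rules below are illustrative assumptions, not a legal standard."""
    rules = {"name": (1, 0), "phone": (3, 4), "id_number": (3, 4)}
    cleaned = dict(record)
    for field, (head, tail) in rules.items():
        if field in cleaned:
            cleaned[field] = mask(str(cleaned[field]), head, tail)
    return cleaned


# Made-up sample record, not real personal data.
sample = {
    "name": "Zhang San",
    "phone": "13800000000",
    "id_number": "110101199001011234",
    "city": "Beijing",
}
safe = desensitize(sample)
```

Masking of this kind only hides parts of individual fields. Whether a dataset counts as adequately desensitized for trading depends on rules the article does not spell out.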

In addition, data is far from open enough, so the types and volume of data legally circulating on the market are limited, which makes it hard for participants to put their capabilities to full use.

Internet giants such as Tencent and Alibaba hold massive amounts of data and can run big data and cloud computing as a closed loop. They prefer to package data into products and services for sale, which is worth more than simply buying and selling data and also avoids legal risk. These players have little appetite for sharing data, as shown by the fact that neither Tencent nor Alibaba renewed its contract with the Guiyang Big Data Exchange after it expired.

From a technical point of view, though, technology already exists that can enable compliant B2B data transactions. Zhang Junxue, CTO of big data service provider Nebula Clustar, told reporters that the company uses a set of "federated learning" algorithms. Put simply, the two parties jointly build a coordinate system from the data each already holds; building this coordinate system is what is called modeling. Once modeling is complete, it becomes possible to judge more accurately whether a customer sits at a safe point or a dangerous point in the coordinate system. During modeling, however, neither party sees the other's user information, so there is no need to worry about user privacy being copied or leaked.
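The core idea, that parties train a shared model while their raw records stay put, can be sketched with plain federated averaging. This is a simplified, unencrypted variant chosen for illustration and is not a description of Clustar's algorithm, which the article does not detail; the toy data, the linear model and all function names are assumptions.

```python
def local_update(w, b, data, lr=0.1, epochs=5):
    """One party refines the shared model y = w*x + b on its own private data."""
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def federated_average(updates, sizes):
    """Coordinator combines parameters, weighted by each party's data size.
    It sees only (w, b) pairs and record counts, never the records."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return w, b


def train(parties, rounds=200):
    w, b = 0.0, 0.0
    for _ in range(rounds):
        updates = [local_update(w, b, data) for data in parties]
        w, b = federated_average(updates, [len(d) for d in parties])
    return w, b


# Toy private datasets; both happen to follow y = 2x + 1.
party_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
party_b = [(3.0, 7.0), (4.0, 9.0)]
w, b = train([party_a, party_b])
```

In this sketch only model parameters cross the boundary between parties. Real deployments typically add encryption or secure aggregation on top, since parameters alone can still leak information about the underlying data.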

According to Zhang Junxue, this federated learning approach currently addresses only compliant B2B data transactions and is used mainly for data exchange between banks and financial institutions. It is also costly and has not yet been applied at scale.

Xiao Sa, a lawyer at Dacheng Law Firm, told reporters that the compliant use of personal information in China currently depends to a large extent on companies' self-discipline. Whether major operators have fulfilled their responsibility to protect user privacy, and how to strike a balance between protecting the public's privacy and sustaining business models, so that personal data is used in a standardized, safe and orderly way while personal rights are protected and the dividends of big data are released, deserve further study.

Origin blog.csdn.net/weixin_39787242/article/details/115211977