How to deal with the black industry's traversal of CAPTCHA image resources

In this first issue, the attack-and-defense point we share is CAPTCHA image resource traversal.

"Traversal" means that the black industry exhaustively collects the answers to every CAPTCHA picture so that it can ignore the CAPTCHA entirely from then on. Because a CAPTCHA relies mainly on the semantic answer of its picture to tell humans from machines, the most effective way to break this layer of defense is to traverse the CAPTCHA image library and crack it once and for all.

In this article we start with the background of this attack-and-defense point, then analyze it in depth from three perspectives: the attacker (the black industry), the attacked party (the customer), and the defender (Geetest).

1. Offensive and defensive points

Why does the first issue start with CAPTCHA image resource traversal? Because it is one of the attack methods the black industry uses most often. The data shows that more than 68% of black-industry attack cases involve traversal of CAPTCHA image resources.

1. The "efficiency" of the black industry

Let's first look at this as a cost game: why is the black industry so keen on CAPTCHA image resource traversal?

The core goal of the black industry is always to make money, and its main way of doing so is to grab the resources that companies provide to normal users, in less time and at lower cost, and then cash them out. Whether the black industry can make money depends on whether its "efficiency" in obtaining resources is far higher than that of normal users.

According to Geetest statistics, most black-industry operations obtain resources at least 800 times more efficiently than normal users. "High efficiency" is the black industry's core capability, so defending against it comes down to "limiting efficiency". Once the black industry's "efficiency" is held to the level of normal users, it can no longer plunder extra resources to cash out and loses its profit margin, which defends against it at the root. Every black-industry attack-and-defense point therefore revolves around "efficiency" and "profit margin".

When the attacked party (the customer) discovers the attacker (the black industry), how can it limit the attacker's "efficiency" without affecting access for normal users?

CAPTCHAs are currently the most proven solution. Today's mainstream picture CAPTCHA is the form of human-machine verification that best balances security and user experience. Whether by changing the verification form or by blurring semantics and pictures, it can effectively limit the black industry's "efficiency".

2. How the black industry improves "efficiency"

Meanwhile, once restricted, the attacker (the black industry) starts studying how to raise its "efficiency" further and widen its "profit margin". This includes the attack-and-defense point of this issue: CAPTCHA image resource traversal.

A CAPTCHA essentially carries a layer of answer semantics, which is the natural place to distinguish humans from automated programs. The black industry attacks by traversing the CAPTCHA gallery, exhaustively identifying picture answers in bulk to improve its "efficiency" at passing. Whether in the era of character CAPTCHAs in early 2012 or in today's era of intelligent, frictionless fourth-generation adaptive verification, picture elements keep changing but never disappear. As long as a CAPTCHA contains a picture element, the black industry has to get past the picture's answer, so this attack-and-defense point will always exist.

2. Perspective 1: How the black industry profits from image resource traversal

Any CAPTCHA gallery is finite. As long as the CAPTCHA contains picture elements, the black industry can crawl every CAPTCHA picture by traversing the image resources and obtain each picture's answer through manual solving (paid human labelers, rendered elsewhere in this article as "manual coding") or algorithms.

So, as long as the CAPTCHA picture library is not updated, how does the black industry use exhaustive enumeration to crack every CAPTCHA that uses those pictures?

1. Attack method

Here is the full process of CAPTCHA image resource traversal from the black industry's perspective:

Step 1: Launch an attack on the SMS login scenario of an e-commerce company.

Step 2: Obtain the addresses of the CAPTCHA pictures by sending frequent requests to the page interface.

Step 3: After obtaining the addresses of 300,000 CAPTCHA pictures, download and store them in batches. At a download speed of 10 pictures per second, this takes 8.33 hours in total.

The e-commerce company's gallery of 300,000 pictures (excerpt)

Step 4: Obtain the answers to the CAPTCHA pictures through low-cost manual solving. At about 1.4 fen (0.014 yuan) per picture and 2.5 s per picture, this takes 208.33 hours in total and costs 4,200 yuan.

The black industry sends batch requests to the solving platform

Answers returned by the solving platform (excerpt)

Click order and coordinate answers are marked 1, 2, 3

Step 5: Build all the picture answers returned by the solving platform into a picture answer database that records the picture name, answer coordinates, picture version and type, storage address, and download time.

The black industry's picture answer database (excerpt)

Step 6: When cracking a CAPTCHA later, look up the current picture in the answer database and submit the corresponding answer. Verification then passes, and the attack on the e-commerce company's SMS login scenario is complete.

Although traversing the pictures is time-consuming and requires some up-front investment of time and money, once it is done it completely removes the restrictions the CAPTCHA places on the black industry. From the black industry's point of view, image resource traversal needs a small early investment and then works once and for all, which is why it is one of its most commonly used methods.
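The time and cost figures quoted in Steps 3 and 4 follow from simple arithmetic, which helps a defender judge how cheap a full traversal really is. The sketch below only reproduces that arithmetic; the function name `traversal_budget` and its parameter names are illustrative and do not come from the article.

```python
# Back-of-the-envelope check of the figures quoted in Steps 3 and 4.
# All inputs are the article's numbers; the function itself is hypothetical.

def traversal_budget(num_pictures, download_rate_per_s, solve_cost_yuan, solve_time_s):
    """Return (download_hours, solving_hours, solving_cost_yuan) for one full traversal."""
    download_hours = num_pictures / download_rate_per_s / 3600
    solving_hours = num_pictures * solve_time_s / 3600
    solving_cost = num_pictures * solve_cost_yuan
    return download_hours, solving_hours, solving_cost


if __name__ == "__main__":
    dl_h, solve_h, cost = traversal_budget(
        num_pictures=300_000,
        download_rate_per_s=10,
        solve_cost_yuan=0.014,  # 1.4 fen per picture
        solve_time_s=2.5,
    )
    print(f"download: {dl_h:.2f} h, solving: {solve_h:.2f} h, cost: {cost:.0f} yuan")
```

With the article's inputs this gives roughly 8.33 hours of downloading, 208.33 hours of solving, and 4,200 yuan, matching the numbers in the steps above.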

2. Profit-making methods

Black-industry profit = black-industry income - black-industry cost

At present, the average update interval of CAPTCHA vendors on the market is more than one month, and each update adds only a few hundred to a few thousand pictures. The black industry, meanwhile, uses low-cost manual solving at only 1.4 fen per picture. Taking a gallery update of 6,000 pictures per month as an example, building the answer database for that batch costs the black industry well under 10,000 yuan (at 0.014 yuan per picture, 6,000 pictures come to about 84 yuan). In other words, once the black industry has traversed a batch of CAPTCHA answers, it can crack that batch immediately for the next month or so. Even when the vendor updates the library a month later, manual solving still costs the black industry less than 0.03 yuan per picture.

Manual solving prices of one CAPTCHA-cracking vendor

Therefore, by spending an average of only about 466 yuan per day on manual solving, the black industry can work through an e-commerce company's entire gallery of 300,000 pictures in 9 days and build its own picture answer database. If the gallery is then not updated, it has unimpeded access for a month.

Meanwhile, the income the black industry earns that month through promotion abuse ("wool gathering") far exceeds the cost. From input to output this is a very profitable business, which is why it traverses CAPTCHA image resources again and again.

3. Perspective 2: How enterprises cope with black-industry image resource traversal

When the black industry traverses CAPTCHA image resources at scale, what problems does the attacked customer run into? Let's switch to the customer's perspective and see how the attack unfolded.

1. Problem occurs

Company H is an e-commerce company founded just two years ago. It often suffers malicious consumption of SMS messages in its user login scenario. Ahead of this year's "618" promotion, Company H's security staff deployed a CAPTCHA from a third-party vendor to keep the black industry out. For about half a month, from May 19 to June 3, all metrics were stable and no anomalies were found, which suggested that the CAPTCHA was blocking black-industry attacks effectively.

On the evening of June 3, however, the numbers of verification requests, verification interactions, and verification passes all rose sharply. The CAPTCHA had become useless: large numbers of junk SMS messages were sent and the SMS balance drained quickly, with losses peaking at more than 20,000 yuan per hour. Company H's head of security urgently contacted the third-party vendor, which switched the form of some CAPTCHA pictures and raised the difficulty. From June 4 to June 7 the number of passes fell slightly, but the effect was minimal. On June 7 the numbers surged again. By June 13 the attack had lasted 10 days, and the data still had not returned to normal.

Backend data from the third-party CAPTCHA vendor, May 19 to June 13

2. Problem location

In desperation, Company H's head of security turned to Geetest's sales team on the evening of June 13. Once Geetest's service team stepped in, we analyzed the problem step by step from the customer's perspective and gained a deeper understanding of the attack. After analyzing its methods and characteristics, Geetest security experts quickly determined that Company H's sudden data anomaly was caused by black-industry traversal of CAPTCHA image resources:

First, Geetest security experts know the black industry's business process well. From an economic standpoint, an answer database built by traversing resources can be reused again and again, and obtaining picture answers is very cheap, so most black-industry operators are willing to attack this way;

Second, comparing data before and after a resource update, Geetest found that after pictures of the same material and style were updated, the black industry's existing answer database became invalid and the share of black-industry traffic dropped sharply;

Third, by analyzing CAPTCHA logs, Geetest found that a large number of different clients sent exactly the same answer coordinates for the same picture. Real people would show some variation, so these coordinates came from an answer database built by the black industry;

Fourth, Geetest also posed as a black-industry participant and talked with real black-industry operators, who confirmed that they passed the CAPTCHA by traversing pictures to build an answer database.
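The third signal above, many different clients submitting pixel-identical answers for the same picture, lends itself to a simple log check. The following is a minimal sketch under assumptions: the log record shape (`image_id`, `client_id`, `coords`) and the threshold `min_clients` are hypothetical and are not Geetest's actual log format or detection rule.

```python
from collections import defaultdict


def flag_replayed_answers(log_entries, min_clients=3):
    """Return {image_id: [answer, ...]} for answers that were submitted,
    coordinate for coordinate, by at least `min_clients` distinct clients.

    Each log entry is assumed to be a dict with keys:
      image_id  - identifier of the CAPTCHA picture
      client_id - identifier of the submitting client
      coords    - sequence of (x, y) click coordinates
    """
    clients_by_answer = defaultdict(set)
    for entry in log_entries:
        answer = tuple(tuple(point) for point in entry["coords"])
        clients_by_answer[(entry["image_id"], answer)].add(entry["client_id"])

    flagged = defaultdict(list)
    for (image_id, answer), clients in clients_by_answer.items():
        if len(clients) >= min_clients:
            flagged[image_id].append(answer)
    return dict(flagged)


if __name__ == "__main__":
    logs = [
        {"image_id": "a", "client_id": f"bot{i}", "coords": [(10, 20), (30, 40)]}
        for i in range(5)
    ] + [
        {"image_id": "b", "client_id": "h1", "coords": [(11, 19)]},
        {"image_id": "b", "client_id": "h2", "coords": [(12, 21)]},
    ]
    print(flag_replayed_answers(logs))
```

Human clicks on the same target scatter by a few pixels, so exact matches across many clients are a reasonable indicator of a shared answer database; a real system would likely combine this with other signals.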

3. Analysis of Difficulties

Since the problem had been identified as black-industry traversal of CAPTCHA image resources, why couldn't the third-party vendor solve it in ten days? The key is that the update frequency of CAPTCHA pictures and black-industry attacks form a dynamic game. To make the CAPTCHA effective again, the pictures have to be updated again, and raising the update frequency is very hard.

(1) The pictures must be unique and correct, which requires efficient generation and review algorithms; otherwise defective pictures slip through. For example:

① Icons overlap

② An icon is missing from the picture

③ The picture is out of range

④ The prompt box is partially missing

(2) The match between picture elements must be reviewed. To avoid verification failures caused by a mismatch between what the human eye sees and what the system has stored, after each picture is generated the target element is cropped from the picture using the answer coordinates in its metadata, and a similarity recognition model then matches the cropped element against its prompt label. Only pictures that meet the preset threshold go online.

(3) After pictures are generated and reviewed, the platform architecture must support distributing the updated gallery to every node. Pictures and metadata are uploaded to the global static resource servers and the service servers respectively, and the upload status is checked so that every server can find usable pictures. If an upload task fails, someone has to re-upload manually or roll back to the previous batch in the management console, to avoid pictures being unavailable because of the failed upload.
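The defects listed under (1) and the threshold check under (2) can be expressed as an automated review gate. This is a sketch under stated assumptions: icons are modeled as axis-aligned boxes, the similarity scores are taken as given (the article does not describe the similarity model), and the names `Box`, `review_picture`, and the 0.9 threshold are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one icon, in pixels."""
    x: int
    y: int
    w: int
    h: int


def inside(box, width, height):
    """True if the box lies entirely within a width x height picture."""
    return box.x >= 0 and box.y >= 0 and box.x + box.w <= width and box.y + box.h <= height


def overlaps(a, b):
    """True if two boxes share any interior area."""
    return a.x < b.x + b.w and b.x < a.x + a.w and a.y < b.y + b.h and b.y < a.y + a.h


def review_picture(icons, expected_count, width, height, similarity_scores, threshold=0.9):
    """Return the list of defects found; an empty list means the picture may go online.

    `similarity_scores` stands in for the output of a similarity model that
    compares each cropped target element with its prompt label.
    """
    defects = []
    if len(icons) != expected_count:
        defects.append("icon_missing")
    if any(not inside(box, width, height) for box in icons):
        defects.append("out_of_range")
    if any(overlaps(a, b) for a, b in combinations(icons, 2)):
        defects.append("icon_overlap")
    if any(score < threshold for score in similarity_scores):
        defects.append("low_similarity")
    return defects


if __name__ == "__main__":
    good = [Box(0, 0, 10, 10), Box(20, 20, 10, 10)]
    print(review_picture(good, 2, 100, 100, [0.95, 0.97]))
```

Running such a gate on every generated picture is what makes frequent updates safe: a faulty picture is rejected before it can cause verification failures for real users.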

Beyond these technical difficulties, a gallery update also has to meet a defensive standard that can hold its own against the black industry. The update rate of CAPTCHA pictures and the size of the gallery determine whether the black industry can profit; when its income is less than its cost, it abandons the attack on its own. So how frequent must updates be, and how large must each updated gallery be, for the black industry's income to fall below its cost?

Black-industry cost = gallery size × manual solving fee per picture

At break-even, black-industry income = black-industry cost = gallery size × manual solving fee per picture

Assuming that the black industry's current income is 1,000 yuan and the manual solving fee is 0.014 yuan per picture, then

Gallery size = black-industry income ÷ manual solving fee per picture

That is, 1,000 ÷ 0.014 ≈ 71,428 pictures. When income equals cost, the entire 1,000 yuan of income goes into cost, which pays for solving exactly 71,428 pictures. The black industry must finish solving the whole gallery before the gallery is updated, or the work is wasted. Assuming that solving one picture takes 10 s, then

Gallery update interval = gallery size × solving time per picture

That is, 71,428 × 10 = 714,280 s ≈ 198 h. We therefore conclude that, with black-industry income of 1,000 yuan and a solving time of 10 s per picture, each update needs to contain at least 71,428 pictures and the gallery needs to be updated once every 198 hours (about 8 days) for the black industry's income and cost to balance. At that point the black industry makes no profit, and the attacked party can just hold its own against the attacker.
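The break-even calculation above is easy to rerun with other inputs, which is useful when the solving fee or the attacker's expected income changes. The sketch below reproduces the article's arithmetic only; the function name `break_even` is illustrative, and it assumes, as the article does, that pictures are solved one after another.

```python
def break_even(attacker_income_yuan, solve_cost_yuan, solve_time_s):
    """Return (gallery_size, traversal_hours) at which attacker income equals cost.

    gallery_size    - pictures the attacker can afford to have solved
    traversal_hours - time needed to solve that many pictures sequentially,
                      i.e. the longest update interval that still denies profit
    """
    gallery_size = int(attacker_income_yuan / solve_cost_yuan)  # whole pictures only
    traversal_hours = gallery_size * solve_time_s / 3600
    return gallery_size, traversal_hours


if __name__ == "__main__":
    size, hours = break_even(attacker_income_yuan=1000, solve_cost_yuan=0.014, solve_time_s=10)
    print(f"gallery size: {size}, update interval: {hours:.0f} h ({hours / 24:.1f} days)")
```

With the article's inputs this yields 71,428 pictures and about 198 hours, roughly 8 days. A larger attacker income or a cheaper solving fee pushes the required gallery size up proportionally.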

However, because updating the gallery is so difficult, generating a new one usually takes time. According to the survey, CAPTCHA vendors on the market update about once a month on average, far slower than the black industry's traversal (about 8 days per pass), so the CAPTCHA cannot be made effective again. Moreover, each update averages only a few hundred to a few thousand pictures, which the black industry can break within an hour for a few hundred yuan.

Because the third-party vendor could not update its gallery quickly, even though it defended successfully for a while before June 3, the CAPTCHA became useless once the black industry had built a picture answer database by traversal.

4. Perspective 3: How Geetest deals with black-industry image resource traversal

So how does Geetest, the business security provider with the largest market share among top-tier customers, deal with black-industry traversal of image resources?

1. Effective Defense

After a morning of deployment, Company H found that dynamic updating of Geetest's CAPTCHA gallery effectively solved the black-industry attack.


Company H's Geetest backend data on June 14

On the afternoon of June 14, Geetest helped the customer trigger a dynamic gallery update at 16:28. By 16:31 the new gallery was live on all nodes, and the number of verification failures rose rapidly. We can infer that the data the black industry had traversed became invalid at that moment, while the volume of verification requests barely changed, meaning the attack script was still running. At 17:06 the black industry noticed that its script no longer worked and suspended it. At 17:10 the script was launched again, but it still failed verification. The number of successful verifications stayed stable after the update, so we can infer that all of them came from normal users. After 17:16 the black industry gave up completely and the data began to return to normal.

Throughout the defense, Company H received no problem reports from users. Automatic gallery updates defend effectively against the black industry without disturbing normal users. Whereas the previous third-party vendor updated once a month and failed to defend, Geetest updates the CAPTCHA gallery automatically once an hour, helping the customer recover tens of thousands of yuan in losses. One month versus one hour looks like a mere difference in numbers, but it decides whether CAPTCHA image resource traversal can be stopped.

2. Defense ideas and core indicators

Over 11 years of contest with black-industry crackers, Geetest has found that defending by restricting the black industry's access to CAPTCHA picture answers lowers customers' operating costs, reduces disruption to customers, and keeps the initiative in the attack-and-defense game in our hands. The core of restricting access to picture answers is the update rate and size of the gallery. At present the industry updates an average of hundreds to thousands of pictures per month, and CAPTCHA image resources are often updated only after the black industry has cracked them, which is too passive. The black industry crawls each batch of pictures in bulk, obtains the answers, and builds an answer database from them. As long as the defender does not update, that database fully covers the batch.

Unlike typical CAPTCHA vendors on the market, when Company H approached us on June 13 our fourth-generation adaptive CAPTCHA already had the ability to update the gallery frequently and efficiently. Geetest's fourth-generation adaptive verification is the first to include an automatic gallery update system. By working from templates it can update 300,000 pictures per hour, refreshing CAPTCHA resources so often that the black industry can no longer crack the CAPTCHA by brute-force enumeration. The system can apply different scheduling strategies to different attack scenarios. From creating a gallery update task to serving the new pictures on global service nodes takes only a few minutes, enabling automated updates that generate 50,000 pictures in 200 categories per hour. For customers under active attack, automatic updates can run at 10,000 pictures in 50 categories every 10 minutes. Reaching this target greatly raised the black industry's attack cost. To build a working answer database now, the black industry still has to pay 0.03 yuan per picture for manual solving, but the database becomes invalid after an hour and has to be rebuilt through manual solving, so actual income falls sharply. Dynamically updating pictures therefore addresses the core problem of CAPTCHA image resource traversal.
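To see why an hourly refresh changes the economics, it helps to estimate what an attacker would need in order to re-solve a whole gallery before it expires. The sketch below is only an illustration: it combines the 300,000 pictures per hour and 0.03 yuan per picture from this section with the 2.5 s solving time quoted earlier in the article, and the function name `rebuild_load` is hypothetical.

```python
def rebuild_load(pictures_per_update, update_interval_h, solve_time_s, solve_cost_yuan):
    """Return (parallel_solvers, cost_per_update_yuan) needed to re-solve one
    full gallery update within a single update interval."""
    total_solving_s = pictures_per_update * solve_time_s
    parallel_solvers = total_solving_s / (update_interval_h * 3600)
    cost_per_update = pictures_per_update * solve_cost_yuan
    return parallel_solvers, cost_per_update


if __name__ == "__main__":
    solvers, cost = rebuild_load(
        pictures_per_update=300_000,
        update_interval_h=1,
        solve_time_s=2.5,      # solving time taken from Step 4 earlier in the article
        solve_cost_yuan=0.03,  # per-picture fee quoted in this section
    )
    print(f"about {solvers:.0f} solvers working in parallel, {cost:.0f} yuan per update")
```

Under these assumptions the attacker would need about 208 solvers working in parallel and would spend about 9,000 yuan per hour, and even then the finished database would expire as soon as the next update goes live.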

3. Break through the technical difficulties of core indicators

To update CAPTCHA pictures dynamically and reach the target of 300,000 pictures per hour, we mainly had to overcome the following technical difficulties:

(1) Generation and review of pictures

Geetest's fourth-generation adaptive verification has efficient generation and review algorithms. It includes the industry's first automatic gallery update system, which works from templates to update CAPTCHA resources efficiently and frequently.

Example of image generation effect

At the same time, we took the lead in building and applying AIGC technology in the CAPTCHA field. Based on the ray.serve and stable-diffusion frameworks, we built a set of gallery-related service interfaces for managing the prompt lexicon and generating images automatically in a pipeline. Using a text-to-image large model further speeds up image updates while keeping the images accurate, controllable, and scalable.

Automatically generating the image gallery with AIGC technology

(2) Guarantee the global consistency of atlas resource synchronization

① Picture resource production: uploading the image binary to oss and storing the metadata in pg are handled as one atomic operation, so that an image is always uploaded before it is recorded in the database and every image access path in the db file has a matching resource on the static resource server;

② The xxl-job task responsible for synchronization is idempotent and can safely be repeated. If synchronization fails it can be triggered manually, and if the xxl-job service itself fails, a full-synchronization function can still copy and synchronize the resources;

③ Before the gallery's db file is built, the xxl-job synchronization result is checked to confirm that the gallery resources have reached all global nodes; only then is the db file created, uploaded, and distributed globally;

④ In our gtmaanger backend, scheduled update tasks run in parallel with manual operations. If a scheduled update fails, the manual operation platform can still produce and synchronize gallery resources.
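The ordering and idempotency guarantees in the list above can be illustrated with in-memory stand-ins. This is a minimal sketch, not Geetest's implementation: `InMemoryStore` is a hypothetical placeholder for the object store, the metadata table, and the per-node replicas, and the function names are invented for illustration.

```python
class InMemoryStore:
    """Hypothetical stand-in for an object store, metadata table, or node replica."""

    def __init__(self):
        self.items = {}

    def put(self, key, value):
        self.items[key] = value  # writing the same value twice is harmless

    def has(self, key):
        return key in self.items


def publish_picture(image_id, image_bytes, metadata, object_store, metadata_db):
    """Upload the binary first; record metadata only once the upload is confirmed,
    so a metadata row never points at a missing image."""
    object_store.put(image_id, image_bytes)
    if not object_store.has(image_id):
        raise RuntimeError(f"upload of {image_id} could not be confirmed")
    metadata_db.put(image_id, metadata)


def sync_to_node(source, node):
    """Idempotent sync: copy only entries that are missing or differ.
    Returns the number of entries copied, so a repeated run returns 0."""
    copied = 0
    for key, value in source.items.items():
        if node.items.get(key) != value:
            node.put(key, value)
            copied += 1
    return copied


def all_nodes_synced(source, nodes):
    """Gate for building the db file: every node must hold every source entry."""
    return all(node.items == source.items for node in nodes)


if __name__ == "__main__":
    oss, pg = InMemoryStore(), InMemoryStore()
    publish_picture("img-1", b"\x89PNG", {"answer": [(10, 20)]}, oss, pg)
    nodes = [InMemoryStore(), InMemoryStore()]
    print([sync_to_node(oss, n) for n in nodes], all_nodes_synced(oss, nodes))
```

Because `sync_to_node` only copies what is missing, a manual retry after a failed scheduled run cannot corrupt a node, which is the property the list above relies on.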


(3) Solve the problems of image resource reuse, loading, and resource competition

① Global synchronization of picture resource production: gallery resources are produced and hot-updated in one place, then synchronized within seconds to object storage in China and overseas, so resources are reused;

② Picture resource loading: cache pre-warming, multi-site active-active deployment, multi-node global static resource services, and fetching pictures from the nearest origin significantly improve loading speed;

③ Picture metadata is stored in an embedded database and loaded at the process level, giving microsecond-level responses; the decentralized database design handles high concurrency;

④ Picture metadata uses an inode detection mechanism to swap resource information within milliseconds, which resolves resource contention.
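One common way to combine an embedded metadata file with inode-based change detection is to publish each new file by atomic rename and have readers reload when the inode behind the path changes. The article does not describe Geetest's mechanism in detail, so the sketch below is an assumption about how such a scheme could work; `publish_atomically` and `MetadataReader` are hypothetical names.

```python
import os
import tempfile


def publish_atomically(path, data):
    """Write `data` to a temp file in the same directory, then rename it over `path`.
    Readers see either the old file or the new one, never a partial write."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


class MetadataReader:
    """Caches the file content and reloads it only when the inode changes."""

    def __init__(self, path):
        self.path = path
        self._inode = None
        self._data = None

    def get(self):
        # A cheap stat call per request; a fuller version might also compare
        # mtime or size in case the filesystem reuses an inode number.
        if os.stat(self.path).st_ino != self._inode:
            with open(self.path, "rb") as f:
                self._inode = os.fstat(f.fileno()).st_ino
                self._data = f.read()
        return self._data


if __name__ == "__main__":
    folder = tempfile.mkdtemp()
    db_path = os.path.join(folder, "gallery.db")
    publish_atomically(db_path, b"batch-1")
    reader = MetadataReader(db_path)
    print(reader.get())
    publish_atomically(db_path, b"batch-2")
    print(reader.get())
```

Because the rename replaces the file in one step, serving processes never contend with the writer for a half-written file; they simply notice a new inode and switch over.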

The data shows that on the afternoon of June 14, after Company H deployed Geetest's fourth-generation adaptive verification and the gallery was dynamically updated, the numbers of verification requests, verification interactions, and successful verifications steadily returned to normal. The "618" promotion went ahead as planned, and the black industry's activity was effectively contained.

Company H's data finally returned to normal

4. Subsidiary value brought by breakthroughs

With Geetest's fourth-generation adaptive verification, we are the first in the industry to generate CAPTCHA pictures quickly and accurately using simpler and more efficient techniques, to check every generated picture against predefined standards with a self-inspection module, and, relying on a stable internal operations platform, to roll out updates globally within seconds for a single customer or for all customers. It also brings customers the following additional value:

(1) The verification effect can be demonstrated. In the past, the biggest problem with CAPTCHA products on the market was that their security effect could not be shown objectively: before an attack, the backend data was stable and nothing abnormal was noticed. Now that our CAPTCHA pictures can be updated dynamically within seconds, the reports show a clear difference between our rapid changes and the black industry's failure to keep up. From this comparison we can see immediately that the dynamic update strategy works against the black industry, which makes the verification effect demonstrable.

Real-time, second-level data changes

"Geetest's human-machine behavior verification has effectively helped us reduce business security risks in the short term and protect the user experience while strengthening our business security." -- VIPKID, Security Leader

(2) After an update the black industry loses money, so it is countered at its economic root. In the past, the usual defense was to update CAPTCHA image resources after they had been cracked, so that the black industry's limited sample set could not recognize the new pictures. That approach is too passive, and each CAPTCHA stays effective only briefly. Now, by dynamically updating the CAPTCHA gallery, Geetest effectively stops the black industry from obtaining picture answers and, for the first time, holds the initiative in the attack-and-defense game. When the black industry's attack stops working and it pays the cost without getting the expected return, it abandons the attack on its own.

After failing twice, the black industry abandoned the attack completely and the data returned to normal

"After we integrated Geetest into the Weibo user entry flow, abnormal requests were intercepted effectively, the threshold for secondary verification of real users was lowered, and the login experience and login efficiency improved greatly." -- Weibo, Product Manager

(3) Personalized services are updated faster. For customers with special security needs, Geetest is the first in the industry to offer a complete private deployment solution. For customers with overseas business, for example, Geetest supports overseas private deployment, overseas data compliance, customized data collection, customized overseas features, UI skins, brand customization, module and risk-control integration customization, multilingual support, voice verification, and other demanding customizations. For customers with urgent campaigns, Geetest provides a customization interface and quickly builds high-difficulty, stylized galleries, meeting urgent needs with highly personalized service.

Personalized customization does not affect other customers

"Geetest provides very good international support and completes tasks with high quality within the agreed time, and the customized high-difficulty stylized gallery helps us strengthen Distil's overall security capabilities." -- Distil Networks

5. Conclusion

Black-industry offense and defense are like the spear and the shield: no spear never rusts, and no shield can never be broken. Only by constantly honing our skills in the course of attack and defense and seeking innovation and breakthroughs can we stay one step ahead of the opponent in the game of "spear" and "shield". Geetest remains committed to innovation in order to find the essential difference between human and machine in the dynamic game with the black industry.

For the attack-and-defense point of CAPTCHA image resource traversal, we are still improving update efficiency. In 2024 Geetest expects to further optimize its operations platform and service platform architectures so that the operations platform can have each service node worldwide produce and update its gallery locally, with no network transfer needed for synchronization. This avoids update failures caused by network problems, which would affect service stability.

In the fight against the black industry, the key to moving from "impossible to prevent" to "defensible" lies in Geetest's pursuit of technical detail, which is rarely visible in product introductions. Geetest follows the motto "thinking for the world and using it for the world". Because it consistently keeps the black industry out, its service is now recognized by 360,000 companies worldwide, and its market share among leading customers ranks first in the industry. We keep pursuing details and technical breakthroughs in order to help customers defend better against the black industry.

Geetest patent wall

In the next issue, we will cover the attack-and-defense point of image recognition confrontation: how the black industry builds trained cracking models to break CAPTCHAs, and how defenders should respond.

"The Way of Offense and Defense Against the Black Industry" looks at the attack-and-defense game with the black industry from a bird's-eye view. If there are other black-industry attack-and-defense points you are interested in, please leave us a message.


Origin blog.csdn.net/geek_wh2016/article/details/131792221