With 4.8 million products, how do you architect a product governance platform?

Foreword

Recently, a friend interviewed for architect positions at first-tier Internet companies such as JD.com, NetEase, Weibo, Alibaba, Autohome, J&T Express, Youzan, SHEIN, Baidu, and Didi, and encountered some very important interview questions:

  • How to evolve the technical architecture of a system? What is your methodology?
  • How to divide service boundaries in a system? What is your methodology?

Here, Nien, a 40-year-old veteran architect, draws on how JD.com builds its product governance platform to offer a more comprehensive reference answer, so that you can demonstrate solid technical strength and impress your interviewer.

This question and its reference answer are also included in version V113 of the "Nien Java Interview Guide PDF", as a reference to help readers improve at three levels: architecture, design, and development.

With 4.8 million products, how does JD.com build a product governance platform?

Note that the author of this article is not Nien. The author is Ke Xianming, JD Daojia technical team / Dada Group technology.

This article is just an important learning material for architects for your reference.

At the same time, I hope this article can also give more publicity to JD Daojia’s products, and I recommend that everyone use more of JD Daojia’s services and products.

Background

As an instant retail e-commerce platform, JD Daojia is committed to delivering all kinds of high-quality products to consumers within one hour, while also working hard to improve the value of the products and the satisfaction of the platform.

The main responsibilities of the JD Daojia product governance system are as follows:

Intervene in and review the entire process of product creation, modification, and display, in order to discover and resolve product information that does not meet platform specifications and quality requirements (sensitive words, false advertising, erroneous information, and so on), and to ensure that products are consistent with the physical goods and that the information is accurate.

System architecture introduction

Each business line of JD Daojia adopts a standardized microservice architecture design. During the iteration process, each system only needs to apply for the corresponding components as needed.

The following are the technical components used by the governance system:

  • Log service: provides log collection and query services.
  • RPC calls: JD.com's JSF platform provides service registration, inter-service invocation, and service management, and supports automatically cutting off requests that time out.
  • Service monitoring: a unified monitoring and alarm platform provides second-level monitoring, multi-directional monitoring, service alarms, and full-link tracing.
  • Distributed scheduling engine: a service built on the TBSchedule distributed scheduling framework, responsible for dispatching and executing scheduled tasks.
  • High-performance storage: Redis clusters, MySQL clusters, etc.
  • Message middleware: JD.com's MQ middleware is used to decouple business logic.

The system architecture is as follows

system structure

Early governance systems

The first requirement was similar to that of most business systems: build a sensitive word management module based on create, read, update, and delete (CRUD) operations, and provide a sensitive word verification capability to the main product system.

The second requirement was to provide the operations team with a report of verification results. The main logic was to upload an Excel file, parse it internally, call the interface to obtain the corresponding results, store them in MySQL, and then provide query and display functions for operators.

However, due to a lack of design and long-term planning, the governance system at that time was tightly coupled with the main product system. The business architecture of the early governance system is as follows:

Early governance system business architecture

As the platform's rules on product information compliance became increasingly stringent, governance requirements for product category, net weight, images, and so on also emerged.

However, from the design in the figure above, we can clearly see that the governance system builds external interfaces based on specific businesses.

Therefore, as business needs continue to expand, the number of interfaces interacting between the two systems will also increase dramatically, which we do not want to see.

In addition, the ultimate goal of governance is to expect product problems to be solved, not just discovered, so exposing problems to operations or merchants is necessary. However, there are currently two problems:

  1. The product system relies too heavily on the governance verification function in its main processes, and this dependence will keep growing as the business expands.
  2. The product system can only inform merchants of the results of pre-interception checks, so business coverage is insufficient.

In addition, many issues are weak-compliance issues (they do not require mandatory interception but still need to be resolved). Therefore, the core of the product governance business needs to shift from the product system to the governance system.

To make product governance efficient, the design and positioning of the governance system were adjusted and two basic principles were proposed:

  • The governance system needs to complete the closed loop of the entire governance business and serve as the general entrance and exit for the discovery and resolution of product problems.
  • The governance system needs to be highly scalable and able to respond quickly when specific governance requirements are added.

Governance system architecture upgrade

Abstract thinking shows its power

After clarifying the ideas for upgrading the business architecture of the governance system, the first question we need to determine is: What are the most basic atomic capabilities of the governance system?

Take the main systems as examples:

The most basic atomic capabilities of the product system are product creation, modification, and query.

The most basic atomic capability of the inventory system is maintaining and querying product inventory information.

Following the development plan of the governance business, we also settled on the atomic capabilities of the governance system: discovering product compliance issues, exposing them for external query, and assisting in resolving them.

We define a compliance issue as follows: a problem, such as sensitive words or false advertising, that does not conform to the e-commerce platform's product display specifications.

For example, if a product name contains sensitive words, it is regarded as a sensitive word issue. Note that at the coding stage, one quantifiable, specific rule corresponds to one compliance issue, and the same product may have several different compliance issues at the same time.

The current compliance issues involved in the Daojia governance system include:

Major category | External description | Problem details
--- | --- | ---
Product gross weight problem | The gross weight of the product is inaccurate | The gross weight does not match the actual product, the gross weight exceeds the maximum transport capacity limit, etc.
Product information is incorrect | The product information is incorrect, please check the specific content | The product name contains sensitive words, the product category is inconsistent with the actual product, false advertising, etc.
Merchant business scope problem | The product being sold is beyond the merchant's business scope | The product currently being sold is beyond the merchant's business scope, etc.
Image information problem | There is a problem with the product image information | The product has no main image, the main image is the default image, the main image has a black background, etc.

Future plans: product price issues, product profile issues, ...

To make this easier to understand, we can regard each compliance issue as a strategy, and the top-level strategy interface defines four core methods:

  • Verification method: the specific verification logic implemented from the business rules
  • Custom filtering: reduces useless processing based on business characteristics
  • Fields associated with the issue: each issue must be associated with the specific fields that influence it or are affected by it
  • Associated enumeration mapping: each issue must be associated with a specific problem cause
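
The four core methods above could be expressed as a Java interface roughly like the following. This is a minimal sketch, not JD Daojia's actual code: the names `ComplianceStrategy`, `Product`, `IssueReason`, and `SensitiveWordStrategy`, as well as the sample word list, are all hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class ComplianceStrategySketch {

    /** Hypothetical product snapshot; the real model is not shown in the article. */
    record Product(long skuId, String name, double grossWeightKg) {}

    /** Hypothetical enumeration of problem causes that a strategy maps to. */
    enum IssueReason { SENSITIVE_WORD, GROSS_WEIGHT_DEVIATION }

    /** Top-level strategy interface with the four core methods. */
    interface ComplianceStrategy {
        /** Custom filtering: skip products this rule does not apply to. */
        boolean applies(Product product);

        /** Verification: returns a detail message when the product violates the rule. */
        Optional<String> verify(Product product);

        /** Fields that influence, or are affected by, this issue. */
        List<String> relatedFields();

        /** The problem cause this strategy is associated with. */
        IssueReason reason();
    }

    /** Example strategy: the product name must not contain a sensitive word. */
    static final class SensitiveWordStrategy implements ComplianceStrategy {
        private final Set<String> sensitiveWords;

        SensitiveWordStrategy(Set<String> sensitiveWords) {
            this.sensitiveWords = Set.copyOf(sensitiveWords);
        }

        @Override public boolean applies(Product product) {
            return product.name() != null && !product.name().isBlank();
        }

        @Override public Optional<String> verify(Product product) {
            return sensitiveWords.stream()
                    .filter(product.name()::contains)
                    .findFirst()
                    .map(word -> "name contains sensitive word: " + word);
        }

        @Override public List<String> relatedFields() { return List.of("name"); }

        @Override public IssueReason reason() { return IssueReason.SENSITIVE_WORD; }
    }

    public static void main(String[] args) {
        ComplianceStrategy strategy = new SensitiveWordStrategy(Set.of("miracle cure"));
        Product product = new Product(1L, "Herbal tea, miracle cure", 0.2);
        if (strategy.applies(product)) {
            System.out.println(strategy.verify(product).orElse("no issue"));
        }
    }
}
```

With this shape, adding a new governance requirement means adding one more implementation of the interface rather than one more interface between the two systems.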

The specific implementation logic is shown in the figure below:

Taking the incorrect entry of product gross weight information as an example, the following figure shows the display results before and after processing:

For the gross weight issue, we link the issue to its associated enumeration and message copy mapping: when the product's gross weight deviates (problem type), the recommended gross weight is XXX (copy mapping). Its associated fields include the product's weight and name. By combining some filtering logic with a verification algorithm, we complete the abstraction of the gross weight issue, and we can follow the same practice when new governance issues arise.

Readers who are familiar with design patterns may have discovered that this design solution is actually a mixture of the Strategy pattern and the Template Method pattern. In the coding stage, we will also use the factory pattern. The overall changes at the coding level are as shown in the figure below:
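
One way the three patterns might fit together is sketched below, using the gross weight issue as the example. Everything here is an assumption made for illustration: the class names, the 20% tolerance, and the idea of looking up a recommended weight per SKU from a map are not taken from the article.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class GovernanceTemplateSketch {

    /** Hypothetical product snapshot. */
    record Product(long skuId, String name, double grossWeightKg) {}

    enum IssueType { GROSS_WEIGHT }

    /** A detected issue plus the merchant-facing copy mapped from it. */
    record Issue(long skuId, IssueType type, String message) {}

    /** Template Method: check() fixes the skeleton; subclasses fill in the rule-specific steps. */
    abstract static class AbstractComplianceCheck {
        final Optional<Issue> check(Product p) {
            if (!shouldCheck(p) || !violates(p)) {   // custom filtering, then verification
                return Optional.empty();
            }
            return Optional.of(new Issue(p.skuId(), type(), message(p)));
        }

        protected abstract boolean shouldCheck(Product p);
        protected abstract boolean violates(Product p);
        protected abstract IssueType type();
        protected abstract String message(Product p);
    }

    /** Strategy for the gross weight issue; the 20% tolerance is an invented value. */
    static final class GrossWeightCheck extends AbstractComplianceCheck {
        private static final double TOLERANCE = 0.20;
        private final Map<Long, Double> recommendedKgBySku;

        GrossWeightCheck(Map<Long, Double> recommendedKgBySku) {
            this.recommendedKgBySku = Map.copyOf(recommendedKgBySku);
        }

        @Override protected boolean shouldCheck(Product p) {
            return recommendedKgBySku.containsKey(p.skuId());
        }

        @Override protected boolean violates(Product p) {
            double recommended = recommendedKgBySku.get(p.skuId());
            return Math.abs(p.grossWeightKg() - recommended) / recommended > TOLERANCE;
        }

        @Override protected IssueType type() { return IssueType.GROSS_WEIGHT; }

        @Override protected String message(Product p) {
            return String.format(Locale.ROOT,
                    "Gross weight deviates; the recommended gross weight is %.2f kg",
                    recommendedKgBySku.get(p.skuId()));
        }
    }

    /** Simple factory: returns the check registered for an issue type. */
    static final class CheckFactory {
        private final Map<IssueType, AbstractComplianceCheck> checks = new EnumMap<>(IssueType.class);

        void register(IssueType type, AbstractComplianceCheck check) { checks.put(type, check); }

        AbstractComplianceCheck forType(IssueType type) {
            AbstractComplianceCheck check = checks.get(type);
            if (check == null) {
                throw new IllegalArgumentException("No check registered for " + type);
            }
            return check;
        }
    }

    public static void main(String[] args) {
        CheckFactory factory = new CheckFactory();
        factory.register(IssueType.GROSS_WEIGHT, new GrossWeightCheck(Map.of(1L, 1.0)));
        Product p = new Product(1L, "Bottled water 1L", 5.0);
        factory.forType(IssueType.GROSS_WEIGHT).check(p)
                .ifPresent(issue -> System.out.println(issue.message()));
    }
}
```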

After this plan was implemented, the product and R&D teams reached a basic consensus on the future development of the governance business, and requirements became easier to implement. We no longer need to pay attention to the logic of other systems and can focus on implementing the business rules for compliance issues.

Business departments and product teams can determine future governance priorities and demand planning through data analysis, and R&D personnel have also solved the problems of coupling between systems and duplication of business code in an elegant way.

Difficult problems solved skillfully

After we initially set up the business architecture design of the governance system, we encountered two more difficult problems during the subsequent iteration process, one was a business issue and the other was a technical issue.

Difficult business issues

The business department requires that the main product image displayed on the APP cannot be the same as the default image (such as a blank image, a brand trademark image, and other images that cannot reflect product information). However, the verification logic of product images has always been undertaken by the image verification system.

This raises a question: Does the governance system need to integrate image verification logic? If not, how to incorporate image violation issues into the governance system?

Experienced developers may suggest using a message queue (MQ): the image verification system sends its verification results to the governance system.

In fact, we do the same thing, just more thoroughly.

In design patterns, a series of similar services is often unified behind one public interface that exposes their capabilities externally. This is the Facade pattern.

In response to the above-mentioned similar problems, we have found a general solution, that is, using the governance system as a facade and other systems as components. Each system can actively provide the governance system with content that needs to be governed.

After the plan was settled, various difficult business scenarios became simpler, and the move also expanded the boundaries of the governance system. For example, for issues such as products without images or products with a high negative-review rate, the corresponding system only needs to send the relevant data or results to the governance system through the message queue (MQ), and the governance system then binds them to specific compliance issues.
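
The facade idea can be sketched as a single message handler that binds upstream findings to compliance issues. This is only an illustration: the message fields, the finding codes, and the issue type names are hypothetical, and an in-memory queue stands in for JD.com's MQ middleware, whose client API is not shown in the article.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

final class GovernanceFacadeSketch {

    /** Hypothetical message published by an upstream system (image verification, reviews, ...). */
    record GovernanceMessage(String sourceSystem, long skuId, String findingCode, String detail) {}

    /** Compliance issue as recorded by the governance system. */
    record ComplianceIssue(long skuId, String issueType, String detail) {}

    /** The governance system acts as the facade: one entry point for all upstream findings. */
    static final class GovernanceFacade {
        // Hypothetical mapping from upstream finding codes to governance issue types.
        private final Map<String, String> issueTypeByFinding;
        private final List<ComplianceIssue> store = new ArrayList<>();

        GovernanceFacade(Map<String, String> issueTypeByFinding) {
            this.issueTypeByFinding = Map.copyOf(issueTypeByFinding);
        }

        /** Message handler: binds an upstream finding to a compliance issue and stores it. */
        void onMessage(GovernanceMessage msg) {
            String issueType = issueTypeByFinding.get(msg.findingCode());
            if (issueType == null) {
                return; // unknown finding: ignored in this sketch
            }
            store.add(new ComplianceIssue(msg.skuId(), issueType, msg.detail()));
        }

        List<ComplianceIssue> issues() { return List.copyOf(store); }
    }

    public static void main(String[] args) {
        Queue<GovernanceMessage> mq = new ArrayDeque<>(); // in-memory stand-in for the real MQ
        mq.add(new GovernanceMessage("image-verification", 1001L,
                "DEFAULT_MAIN_IMAGE", "main image is the default image"));
        mq.add(new GovernanceMessage("review-system", 1002L,
                "HIGH_NEGATIVE_REVIEW_RATE", "negative review rate above threshold"));

        GovernanceFacade facade = new GovernanceFacade(Map.of(
                "DEFAULT_MAIN_IMAGE", "IMAGE_INFO_PROBLEM",
                "HIGH_NEGATIVE_REVIEW_RATE", "REVIEW_PROBLEM"));

        GovernanceMessage msg;
        while ((msg = mq.poll()) != null) {
            facade.onMessage(msg);
        }
        facade.issues().forEach(System.out::println);
    }
}
```

The upstream systems keep their own verification logic; the governance system only needs to know how a finding maps to an issue.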

At the coding level, we use the simplest message queue (MQ) decoupling method to implement it. The schematic diagram is as follows:

During governance iterations there was a series of requirements for governing platform product images. Take the damaged-image logic as an example.

In the initial investigation, we checked the available data and found that the occasional damaged images on the platform were caused by the stream being interrupted before the image had finished downloading, after which the upload was still triggered.

Therefore, in the first version of the logic, we made the following check: after the image download completes and before the upload is triggered, compare the declared ContentLength with the actual byte size of the image. This initially solved the problem.
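
A first-version check of this kind might look like the sketch below. The method name and the decision to let a missing length pass are assumptions; the article does not show how the length was read or what happened when it was absent.

```java
import java.util.OptionalLong;

final class ContentLengthCheckSketch {

    /**
     * Returns true when the number of bytes received equals the declared length.
     * An absent length cannot prove truncation, so this sketch lets it pass.
     */
    static boolean isDownloadComplete(OptionalLong declaredLength, byte[] body) {
        if (declaredLength.isEmpty()) {
            return true;
        }
        return declaredLength.getAsLong() == body.length;
    }

    public static void main(String[] args) {
        byte[] received = new byte[1_000];
        // The server declared 4,096 bytes but the stream was cut after 1,000: do not upload.
        System.out.println(isDownloadComplete(OptionalLong.of(4_096), received)); // false
        System.out.println(isDownloadComplete(OptionalLong.of(1_000), received)); // true
    }
}
```

A check like this depends on the server reporting a reliable length. Servers that omit the header, or that report the size of a compressed body rather than of the image itself, would need different handling.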

Technical difficulties

However, soon after, the problem broke out again, and we discovered that things were not as simple as imagined.

Since our platform is connected to many merchant systems, and each system's image servers and back-end logic differ, we cannot validate all images by comparing file sizes.

Therefore, we investigated again and implemented a verification capability for damaged images.

That is, the downloaded image content is processed and analyzed, and an algorithm identifies the characteristics of the target problem, which basically solved the issue.

Based on the same idea, we also derived processing methods for black-background images and default images, taking image governance a step further.
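
The article does not say which algorithms were used, so the following is only a sketch of what content-based checks could look like. It assumes two simple heuristics chosen for illustration: a JPEG is treated as complete when it starts with the SOI marker (FF D8) and ends with the EOI marker (FF D9), and an image is treated as black-background when almost all of its border pixels are near black. The thresholds are arbitrary.

```java
import java.awt.image.BufferedImage;

final class ImageContentCheckSketch {

    /** Heuristic: a complete JPEG starts with FF D8 and ends with FF D9. */
    static boolean looksLikeCompleteJpeg(byte[] data) {
        int n = data.length;
        return n >= 4
                && (data[0] & 0xFF) == 0xFF && (data[1] & 0xFF) == 0xD8
                && (data[n - 2] & 0xFF) == 0xFF && (data[n - 1] & 0xFF) == 0xD9;
    }

    /** Fraction of border pixels whose red, green and blue values are all below darkThreshold. */
    static double darkBorderRatio(BufferedImage img, int darkThreshold) {
        int w = img.getWidth();
        int h = img.getHeight();
        int dark = 0;
        int total = 0;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                boolean border = x == 0 || y == 0 || x == w - 1 || y == h - 1;
                if (!border) {
                    continue;
                }
                int rgb = img.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                if (r < darkThreshold && g < darkThreshold && b < darkThreshold) {
                    dark++;
                }
                total++;
            }
        }
        return total == 0 ? 0.0 : (double) dark / total;
    }

    /** Both numbers below are illustration values, not tuned thresholds. */
    static boolean hasBlackBackground(BufferedImage img) {
        return darkBorderRatio(img, 30) >= 0.95;
    }

    public static void main(String[] args) {
        // A new TYPE_INT_RGB image is all black until pixels are written.
        BufferedImage black = new BufferedImage(64, 64, BufferedImage.TYPE_INT_RGB);
        System.out.println(hasBlackBackground(black));

        byte[] truncated = {(byte) 0xFF, (byte) 0xD8, 0x12, 0x34, 0x56};
        System.out.println(looksLikeCompleteJpeg(truncated));
    }
}
```

Marker checks are cheap but not conclusive (some valid files carry trailing bytes after the end marker), so a production check would likely combine several signals.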

Governance finally comes to fruition

Based on the above plans and designs, the problem discovery process of the governance system became increasingly complete.

Next, the product team put forward new requirements: handle some issues automatically and notify merchants.

Machine learning models can be roughly divided into two types: discriminative models and generative models.

Setting aside their specific meanings, we can borrow this idea and divide the governance system into two parts: discovery and resolution.

The business abstraction, technical problems, and business problems described above all concern problem discovery. To bring problem resolution into the governance system, we only need to extend the existing architecture moderately.

Taking the incorrect entry of product gross weight as an example, we only need to add two methods to the abstraction above:

  • Whether automatic processing is required: the gross weight issue needs to be processed automatically
  • The rule for automatic processing: when the actual gross weight exceeds a certain threshold, the product is taken off the shelves (relying on the product system's external interface)

Before the verification result is stored, the specific execution logic and the returned data determine whether the issue is handled manually or by the system.
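
The two added methods, and the manual-or-automatic decision made before storage, might be sketched as follows. The names `AutoHandler`, `ProductGateway`, and `takeOffShelf`, and the 50 kg threshold, are hypothetical; the product system's real external interface is not shown in the article.

```java
final class AutoHandlingSketch {

    record Product(long skuId, double grossWeightKg) {}

    enum Handling { MANUAL, AUTOMATIC }

    /** What gets stored after verification, including how the issue was handled. */
    record StoredResult(long skuId, String issueType, Handling handling) {}

    /** Hypothetical gateway to the product system's external interface. */
    interface ProductGateway {
        void takeOffShelf(long skuId);
    }

    /** The two methods added to the abstraction for the resolution side. */
    interface AutoHandler {
        boolean requiresAutoHandling(Product p);
        void handle(Product p, ProductGateway gateway);
    }

    /** Gross weight rule: above the threshold, the product is taken off the shelves. */
    static final class GrossWeightAutoHandler implements AutoHandler {
        private final double maxKg;

        GrossWeightAutoHandler(double maxKg) { this.maxKg = maxKg; }

        @Override public boolean requiresAutoHandling(Product p) {
            return p.grossWeightKg() > maxKg;
        }

        @Override public void handle(Product p, ProductGateway gateway) {
            gateway.takeOffShelf(p.skuId());
        }
    }

    /** Decides manual versus automatic handling before the result is stored. */
    static StoredResult resolve(Product p, String issueType,
                                AutoHandler handler, ProductGateway gateway) {
        if (handler.requiresAutoHandling(p)) {
            handler.handle(p, gateway);
            return new StoredResult(p.skuId(), issueType, Handling.AUTOMATIC);
        }
        return new StoredResult(p.skuId(), issueType, Handling.MANUAL);
    }

    public static void main(String[] args) {
        AutoHandler handler = new GrossWeightAutoHandler(50.0); // invented threshold
        ProductGateway gateway = skuId -> System.out.println("take off shelf: " + skuId);
        System.out.println(resolve(new Product(1L, 80.0), "GROSS_WEIGHT", handler, gateway));
        System.out.println(resolve(new Product(2L, 3.0), "GROSS_WEIGHT", handler, gateway));
    }
}
```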

For merchant notification, the implementation is simpler: from the start we defined specific governance issues as the basic unit of communication for the governance business, so we only need to expose the stored data through interfaces or message queues.

At this point, the governance system is complete at the coding level, and its core logic lies in three steps:

  1. Product change MQ messages, or governance-content notifications from other systems, trigger verification of specific compliance issues.
  2. The verification result determines the handling: manual processing or automatic system processing (the latter relies on the product system's external interface).
  3. The verification results are exposed externally.

The following figure shows the current overall business structure of the governance system:

Overall architecture diagram of the governance system

Governance business panorama

Since the business framework upgrade of the governance platform, it has been running stably for more than nine months.

During this period, we have successfully managed more than 4.8 million platform products and built 8 identification capabilities, 3 processing methods and 2 access methods.

At the same time, we rely on the product and standard product system to provide strong guarantees for rapid product creation, basic information construction, and governance review.

The following is a panoramic view of Daojia governance:

Governance business panorama

Future plans

The current governance system is mainly designed and built around the core links of the product system, and its scope of influence is relatively small.

In fact, we can extend the results of product governance to other systems beyond the product system.

For example, the business scenarios in the figure below:

Taking search recommendations as an example, we can define deduction rules for the various compliance issues. When the search side builds its data, it includes each product's compliance score, and products that match the search conditions are sorted by that score.
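
The deduction idea could be sketched as below. The base score of 100, the per-issue deduction values, and all names are invented for illustration; a real ranking would presumably combine the compliance score with relevance rather than sort on it alone.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class ComplianceScoreSketch {

    /** A product that already matches the search conditions, with its open compliance issues. */
    record SearchCandidate(long skuId, List<String> openIssues) {}

    static final int BASE_SCORE = 100; // hypothetical starting score

    /** Subtracts the configured deduction for each open issue, never going below zero. */
    static int complianceScore(SearchCandidate c, Map<String, Integer> deductionByIssue) {
        int deduction = c.openIssues().stream()
                .mapToInt(issue -> deductionByIssue.getOrDefault(issue, 0))
                .sum();
        return Math.max(0, BASE_SCORE - deduction);
    }

    /** Sorts candidates by compliance score, highest first. */
    static List<SearchCandidate> rank(List<SearchCandidate> candidates,
                                      Map<String, Integer> deductionByIssue) {
        return candidates.stream()
                .sorted(Comparator.comparingInt(
                        (SearchCandidate c) -> complianceScore(c, deductionByIssue)).reversed())
                .toList();
    }

    public static void main(String[] args) {
        Map<String, Integer> deductions = Map.of("SENSITIVE_WORD", 40, "GROSS_WEIGHT", 10);
        List<SearchCandidate> ranked = rank(List.of(
                new SearchCandidate(1L, List.of("SENSITIVE_WORD")),
                new SearchCandidate(2L, List.of()),
                new SearchCandidate(3L, List.of("GROSS_WEIGHT"))), deductions);
        ranked.forEach(c -> System.out.println(c.skuId() + " -> " + complianceScore(c, deductions)));
    }
}
```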

At the same time, we also need to incorporate some problems that cannot be identified by the algorithm into the governance system, such as: high rate of negative product reviews, high return rate, etc.

Summary

As the business continues to develop, the requirements for product information quality will keep rising. The Daojia governance system needs to be closely linked with upstream and downstream systems to provide more fine-grained product governance and control capabilities.

We expect our governance capabilities to keep improving, so that we can provide users with more authentic and accurate product data and better services.

Origin blog.csdn.net/crazymakercircle/article/details/133356563