Exploration and Practice of the Recommendation System in the vivo App Store

This article introduces how the vivo App Store recommendation system efficiently supports personalized recommendation requirements.

I. Introduction

The store's application data mainly comes from channels such as operations scheduling, CPD, games, and algorithms. These data sources did not change when the recommendation project was established; what changed is that the recommendation system is now responsible for interfacing with them, so the store server only needs to integrate with the application recommendation system.

If readers think we simply copied the store server's code over to the recommendation system, that is far too naive: there is no way we would migrate a system without optimizing and upgrading it. Below I will describe how we designed and planned the application recommendation system.

II. Challenges

In the author's view, beyond high performance, high availability, and monitoring of core metrics, the other core capability of the store's application recommendation system is efficiently supporting the store's traffic scenarios in accessing personalized recommendations.

How do we define "efficient support"?

  • It can support at least three or four requirements being developed in parallel.
  • The development cycle for a single requirement should not exceed two days.
  • Bugs should be rare: no more than two per scenario on average.
  • Ordinary requests from product managers can generally be supported quickly.

Here is a planning case from one of our application recommendation requirements:

In scenario xx:

If the main application A belongs to the application category:

  • First, fetch queue Q1 from data source x1.
  • Then fetch queue Q2 from data source x2.
  • Use Q2 to truncate Q1 (take their intersection), then apply same-developer filtering and first-level category filtering to the intersection.
  • If the intersection is empty, fall back to Q2; then take the elements at positions n1 and n2 of the resulting queue as the return queue.
  • If there is still no data, go to the big-data table xxx and, based on the click probability of applications under the main application, take n items from the category with the highest click-through rate. These items also need same-developer filtering within the queue.

If the main application A belongs to the game category:

  • Xxxx
  • Perform second-level category filtering.
  • If the quantity is insufficient, take data from x(n) and process it.
  • If there are fewer than three items, take apps of the same category level from the weekly list, ordered by download ranking.
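To make the first branch above more concrete, here is a minimal sketch of the application-category logic, assuming hypothetical helper methods (the Q1/Q2 fetchers, the filters, and the big-data fallback) that stand in for the real data sources; all names and signatures are illustrative only.

```java
// Illustrative sketch of the "application category" branch above.
// fetchQ1/fetchQ2, the filters and the big-data fallback are hypothetical stand-ins.
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class AppCategoryBranchSketch {

    List<AppItem> recommend(AppItem mainApp, int n1, int n2, int n) {
        List<AppItem> q1 = fetchQ1(mainApp);             // queue from data source x1
        List<AppItem> q2 = fetchQ2(mainApp);             // queue from data source x2

        // Truncate Q1 by Q2 (intersection), then filter same developer / first-level category.
        List<AppItem> intersection = q1.stream()
                .filter(q2::contains)
                .filter(item -> !sameDeveloper(item, mainApp))
                .filter(item -> !sameFirstCategory(item, mainApp))
                .collect(Collectors.toList());

        // Empty intersection: fall back to Q2.
        List<AppItem> candidates = intersection.isEmpty() ? q2 : intersection;

        List<AppItem> result = new ArrayList<>();
        if (candidates.size() > Math.max(n1, n2)) {
            result.add(candidates.get(n1));               // elements at positions n1 and n2
            result.add(candidates.get(n2));
        }

        // Still no data: take top-n of the highest-CTR category from the big-data table,
        // again filtering out the main app's developer.
        if (result.isEmpty()) {
            result = topNByClickProbability(mainApp, n).stream()
                    .filter(item -> !sameDeveloper(item, mainApp))
                    .collect(Collectors.toList());
        }
        return result;
    }

    // --- hypothetical helpers, not the real store services ---
    List<AppItem> fetchQ1(AppItem mainApp) { return new ArrayList<>(); }
    List<AppItem> fetchQ2(AppItem mainApp) { return new ArrayList<>(); }
    boolean sameDeveloper(AppItem a, AppItem b) { return false; }
    boolean sameFirstCategory(AppItem a, AppItem b) { return false; }
    List<AppItem> topNByClickProbability(AppItem mainApp, int n) { return new ArrayList<>(); }

    static class AppItem { /* id, developer, category ... */ }
}
```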

Yes, you read that right, and to avoid confusing readers we deliberately picked a relatively simple requirement. Implementing one such function is no big deal, but when there are dozens of such personalized recommendation requirements, and more may keep coming, would you panic? Let's take a brief look at some of our personalized recommendation requirements, as shown in Figure (1):

Figure (1)

With the store server's previous case-by-case development approach, there was no way to support the store's scenarios in accessing personalized recommendations as efficiently as described above. Next, let's look at how we optimized this.

III. How We Solved It

To better explain the solution, we will follow our actual thought process and walk through the problem-solving steps one by one.

3.1 Business process abstraction

Looking purely at the requirements, in each scenario we need to do at least the things shown in Figure (2):

Figure (2)

  • Get the recommendation lists: obtain the recommendation queue from each data source (note that different scenarios call different interfaces, and the fields and structures returned may differ).
  • Queue fusion: perform operations such as intersection or union, as mentioned in step 1.
  • Data filtering (within a queue / between queues): apply various filters to the queues; filtering mainly serves to improve relevance.
  • Data fallback: when a queue does not contain enough data, fall back to list data, such as same-category entries from the weekly ranking.
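The four steps above can be expressed roughly as the following sketch, assuming illustrative type and method names (none of them come from the actual codebase): the framework fixes the order of the steps, and each step remains replaceable.

```java
// Minimal sketch of the abstracted flow in Figure (2); all names are illustrative.
import java.util.List;
import java.util.Map;

public abstract class RecommendFlow {

    // The framework fixes the order of the steps (template-method style).
    public final List<AppItem> execute(RecommendContext ctx) {
        Map<String, List<AppItem>> queues = fetchQueues(ctx);   // 1. get recommendation lists
        List<AppItem> merged = fuse(queues, ctx);               // 2. queue fusion (intersection/union)
        List<AppItem> filtered = filter(merged, ctx);           // 3. in-queue / between-queue filtering
        return supplement(filtered, ctx);                       // 4. fallback when data is insufficient
    }

    protected abstract Map<String, List<AppItem>> fetchQueues(RecommendContext ctx);
    protected abstract List<AppItem> fuse(Map<String, List<AppItem>> queues, RecommendContext ctx);
    protected abstract List<AppItem> filter(List<AppItem> merged, RecommendContext ctx);
    protected abstract List<AppItem> supplement(List<AppItem> filtered, RecommendContext ctx);
}

class RecommendContext { /* scene id, module id, main app, request params ... */ }
class AppItem { /* id, developer, category ... */ }
```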

The author further adjusted this model for ease of development; the adjusted version is shown in Figure (3).

Figure (3)

The step of obtaining a queue and the steps of filtering it, both between queues and within a queue (for example, main-application and same-developer filtering), can be merged into one. The main reasons are as follows:

  • It makes it convenient to define a filtering strategy per data source; in real requirements, different queues do use different filtering strategies.
  • It fits the template method pattern well and keeps the process of obtaining the recommendation list consistent and stable.

3.2 Abstract process extension

Looking at Figure (3), readers will notice that we still have not addressed the differentiated processing in the various recommendation scenarios mentioned earlier.

In fact, after handling a few requirements, we found it is almost impossible to absorb such large differences in a single body of code, and even if we managed it, the code would become extremely complicated. Rather than doing that, we chose to face the differences head-on: let them be implemented in scenario plug-ins, while we spend our energy taking care of the backbone.

So, to give scenarios flexible extension capabilities, the author added four steps on top of Figure (3):

  • Queue results shared within a thread: implemented with ThreadLocal (a minimal sketch follows this list). The results of each recommendation queue are stored mainly so that a queue can later be reused for fallback filling, and also to avoid repeatedly requesting the third-party data interfaces.
  • Plug-in queue fallback: when the quantity is insufficient after filtering, a specified queue is used to fill the gap. A scenario plug-in can also implement its own filling logic as needed to supplement the queue content.
  • Plug-in interface callback: this step personalizes the queue produced so far, for example by intervening in it. It is kept separate from the plug-in queue fallback mainly because the fallback can be driven purely by configuration.
  • Weekly ranking list: provides a general weekly-ranking query capability supporting multiple dimensions; this data serves as the final fallback for the queue.
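A minimal sketch of the thread-level sharing idea, assuming an illustrative holder class rather than the actual implementation: each queue fetched during a request is cached in a ThreadLocal map so that later steps (fallback filling, plug-in callbacks) can reuse it instead of calling the third-party interface again.

```java
// Illustrative ThreadLocal holder for per-request queue results; names are assumptions.
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public final class QueueResultHolder {

    // One map per request-handling thread: data source key -> fetched queue.
    private static final ThreadLocal<Map<String, List<AppItem>>> RESULTS =
            ThreadLocal.withInitial(HashMap::new);

    public static void put(String sourceKey, List<AppItem> queue) {
        RESULTS.get().put(sourceKey, queue);
    }

    public static List<AppItem> get(String sourceKey) {
        return RESULTS.get().get(sourceKey);
    }

    // Must be called when the request finishes, otherwise pooled threads keep stale data.
    public static void clear() {
        RESULTS.remove();
    }
}

class AppItem { /* id, developer, category ... */ }
```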

The expanded flowchart is shown in Figure (4).

Figure (4)

3.3 Overall logic block diagram

From the above analysis, we know that we should push the personalized, scenario-specific content into the plug-in layer as much as possible, while the framework layer is responsible for loading, per scenario, the scenario plug-in that carries the specific personalized recommendation logic.

From top to bottom, the system is divided into the plug-in layer, framework layer, protocol adaptation layer, data source service layer, atomic service layer, and basic service layer. Upper layers depend on the services (interfaces) of lower layers through SDKs. The responsibilities of each layer are:

  • Plug-in layer: one plug-in per scenario (see the interface sketch after this list). The framework layer provides default implementations of the plug-in callback and extension interfaces, and the plug-in layer overrides specific logic on demand.
  • Framework layer: defines the core process and execution logic for recommendation data, and calls back into the plug-in layer through the extension and callback interfaces.
  • Protocol adaptation layer: finds the data source service corresponding to a scenario, encapsulates the conversion protocol, and performs data conversion.
  • Data source service layer: the encapsulation layer for the RPC services provided by each queue provider.
  • Atomic service layer: filtering-related services, mainly depending on the store's RPC services; built with the composite pattern, so services can be combined.
  • Basic service layer: supports relevance judgment and filtering along dimensions such as developer, first-level category, second-level category, and application type. Like the atomic service layer, these services are at atomic granularity and support combined control.
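As a sketch of how the plug-in layer and framework layer interact, the interface below declares callback and extension points with default implementations, and a scenario plug-in overrides only what it needs; the interface, method names, and helper types are illustrative, not the real ones.

```java
// Illustrative plug-in extension interface with default implementations; names are assumptions.
import java.util.List;

public interface ScenePlugin {

    // Which scenario this plug-in serves; the framework uses it to route requests.
    String sceneId();

    // Callback after queues are fetched and fused; default: leave the queue unchanged.
    default List<AppItem> afterFuse(List<AppItem> queue, RecommendContext ctx) {
        return queue;
    }

    // Scenario-specific fallback filling; default: no extra filling.
    default List<AppItem> supplement(List<AppItem> queue, RecommendContext ctx) {
        return queue;
    }
}

class AppItem { /* id, developer, category ... */ }
class RecommendContext { /* scene id, module id, main app, request params ... */ }
```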


At this point, I believe everyone can see that for personalized recommendations, our development work ultimately focuses on writing scenario plug-ins; we no longer need to develop the whole business process for each new scenario.

Application recommendation system architecture


3.4 Key implementation

After completing the overall logic block diagram in the previous step, we investigated the definition of scenario parameters, service design principles, the use of design patterns, and hot swapping of scenarios, and finally implemented the plan.

3.4.1 Scenario service parameter definition

To make recommendation scenarios generic enough, we map the capabilities of the data source layer, the atomic service layer, and the basic service layer into a service configuration; by defining the corresponding configuration items, services can be mapped and combined, while the genuinely different content is implemented in the plug-in layer. The main configuration items are:

  • sourceMap: the scenario's services are defined as a map in order to support multiple modules or experiment groups within one scenario; the key is the module ID, which the store server must carry when requesting recommendations.
  • cpdRequest, algorithmRequest, gameRequest: define the request parameters for the corresponding RPC calls.
  • filterRequest: defines in-queue filtering, such as same-developer filtering against the main application.
  • unionStrategy: defines how queues are merged and the fusion rules between them.
  • supplement: the fallback strategy.
  • sourceList: the data sources to use. For example, if two data sources are defined, data must be obtained from both in this scenario, after which the queues are merged and post-processed.

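As a rough illustration only, the following Java sketch shows what such a scenario configuration object might look like, built purely from the items listed above; the class shape and the parameter types are assumptions, not the real configuration.

```java
// Rough sketch of a scenario configuration object, derived from the items above; illustrative only.
import java.util.List;
import java.util.Map;

public class SceneConfig {

    // moduleId -> per-module (or per-experiment-group) service configuration
    private Map<String, ModuleConfig> sourceMap;

    public static class ModuleConfig {
        private List<String> sourceList;            // data sources to call, e.g. ["cpd", "algorithm"]
        private CpdRequest cpdRequest;              // RPC request params for the CPD source
        private AlgorithmRequest algorithmRequest;  // RPC request params for the algorithm source
        private GameRequest gameRequest;            // RPC request params for the game source
        private FilterRequest filterRequest;        // in-queue filtering (same developer, etc.)
        private UnionStrategy unionStrategy;        // how queues are merged/fused
        private Supplement supplement;              // fallback strategy
    }

    // Placeholder parameter types; the real ones are defined by each RPC/service SDK.
    public static class CpdRequest { }
    public static class AlgorithmRequest { }
    public static class GameRequest { }
    public static class FilterRequest { }
    public static class UnionStrategy { }
    public static class Supplement { }
}
```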

3.4.2 Service atomization and uniqueness

Achieving service atomization and service uniqueness is very important for this system. During implementation, the following two points are strictly observed:

The third-party RPC services that application recommendation relies on, together with some internal filtering logic, are encapsulated into fine-grained atomic-service (method) SDKs. These SDKs contain no business capabilities specific to personalized recommendation scenarios; they focus on basic functional building blocks. Business content must be implemented in scenario plug-ins, and services of the same type support composition as much as possible.

Service uniqueness is essential for system convergence and a controllable code size, and we keep working towards it. Each service layer exposes its functions as an SDK, and the uniqueness of the service call entry point is enforced inside the SDK.
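As an illustration of a unique call entry (the class and method here are assumptions, not the real SDK): each atomic capability is exposed through exactly one SDK entry point that wraps the underlying RPC, so callers never invoke the RPC client directly.

```java
// Illustrative atomic-service SDK with a single call entry; names are assumptions.
public final class DeveloperFilterSdk {

    private static final DeveloperFilterSdk INSTANCE = new DeveloperFilterSdk();

    private DeveloperFilterSdk() { }   // no other way to construct or reach the service

    public static DeveloperFilterSdk getInstance() {
        return INSTANCE;
    }

    // The only entry point: is the candidate published by the same developer as the main app?
    public boolean sameDeveloper(long mainAppId, long candidateAppId) {
        // In the real system this would delegate to the store's RPC service; here it is a stub.
        return false;
    }
}
```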

3.4.3 Reasonable use of design patterns

Many design patterns are used in the system to optimize the overall architecture. The following focuses on the template method, strategy, and composite patterns:

The template method pattern and the strategy pattern are used in the process of obtaining the original recommendation queues.

The benefit of the template method pattern is obvious: it makes this part of the processing flow easy to standardize and keep consistent.

Different data sources require different services and methods; the advantage of the strategy pattern is that it makes it easy to define which interfaces are called in which scenarios.

Atomic services and methods of the same type support the composite pattern as much as possible, which greatly eases later extension.

To illustrate with the actual implementation: when we define filters, we accept multiple filter types as input, and upper-level services pass in whichever they need. This use of the composite pattern plays a huge role in improving extensibility.
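A sketch of how the strategy and composite ideas above might look in code; the interface, enum values, and method names are illustrative: one fetch strategy per data source, and a filter entry that accepts any combination of filter types.

```java
// Illustrative strategy (per data source) and composite filter usage; names are assumptions.
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Strategy: one implementation per data source (CPD, algorithm, game, ...).
interface QueueFetchStrategy {
    List<AppItem> fetch(RecommendContext ctx);
}

// Composite-style filtering: callers pass in whichever filter types they need.
enum FilterType { SAME_DEVELOPER, FIRST_CATEGORY, SECOND_CATEGORY, APP_TYPE }

class CompositeFilter {
    List<AppItem> filter(List<AppItem> queue, AppItem mainApp, Set<FilterType> types) {
        return queue.stream()
                .filter(item -> !(types.contains(FilterType.SAME_DEVELOPER) && sameDeveloper(item, mainApp)))
                .filter(item -> !(types.contains(FilterType.FIRST_CATEGORY) && sameFirstCategory(item, mainApp)))
                .collect(Collectors.toList());
    }

    private boolean sameDeveloper(AppItem a, AppItem b) { return false; }     // stub
    private boolean sameFirstCategory(AppItem a, AppItem b) { return false; } // stub
}

class AppItem { /* id, developer, category ... */ }
class RecommendContext { /* scene id, module id, main app ... */ }
```

An upper-level service would then call, for example, filter(queue, mainApp, EnumSet.of(FilterType.SAME_DEVELOPER, FilterType.FIRST_CATEGORY)), passing in only the filters it needs.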

3.4.4 Hot swapping of scenarios

To keep scenarios isolated and non-interfering within the system, the author uses the Java SPI mechanism: scenario interfaces are defined at the framework layer, and the implementation classes live in separate JARs, one per scenario. This approach minimizes the plug-in code's intrusion into the framework layer and the basic service layer.
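A minimal sketch of this SPI wiring, assuming made-up class names and scene ids (only the META-INF/services mechanism itself is standard Java): the framework discovers plug-ins with ServiceLoader and picks one by scene id.

```java
// Minimal Java SPI wiring sketch; class names, package layout and ids are illustrative.
// In a real setup each piece lives in its own file/JAR and the types are public.
import java.util.ServiceLoader;

// Framework layer: the scenario interface (a simplified version of the ScenePlugin sketched earlier).
interface ScenePlugin {
    String sceneId();
}

// Scenario JAR: one implementation class, registered through the standard provider file
//   META-INF/services/<fully qualified name of ScenePlugin>
// whose single line is the fully qualified name of this class.
class FeaturedPagePlugin implements ScenePlugin {
    @Override
    public String sceneId() {
        return "featured_page";   // made-up scene id
    }
}

// Framework layer: discover every plug-in on the classpath and route by scene id.
class ScenePluginRegistry {
    static ScenePlugin find(String sceneId) {
        for (ScenePlugin plugin : ServiceLoader.load(ScenePlugin.class)) {
            if (plugin.sceneId().equals(sceneId)) {
                return plugin;
            }
        }
        return null;   // no matching plug-in: the framework falls back to its default flow
    }
}
```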

IV. The Changes Brought About

In the past, the store server wrote the complete logic for obtaining, fusing, assembling, and filtering recommendation queues in the service layer of every interface, with a great deal of duplication. As versions iterated, processing logic from many different versions became tangled together, making changes and upgrades difficult: touching one part could affect everything else. The current application recommendation system brings major improvements in two respects:

  1. The process framework logic is fully abstracted and independent. Each business scenario only needs a small amount of plug-in callback logic, written on demand. (Scenarios that are not particularly special need no plug-in callback extensions at all; the corresponding scenario rules can simply be configured, requiring no development. Currently about 30% of scenarios need no development.)
  2. Scenarios are completely isolated and independent; complex functional upgrades can be rolled out incrementally against the corresponding scene ID or module ID, without affecting existing logic.

V. Final Remarks

By implementing the solutions above, we reduced the development workload for each recommendation scenario by roughly 75%, and the bug rate has dropped significantly as well.

Author: vivo-Huang Xiaoqun

Origin: blog.51cto.com/14291117/2668022