CIKM Paper Walkthrough | Multi-Scenario Global Representation in Taobao Content Recommendation: Thinking and Application

Starting from concrete problems in Taobao Guangguang's recommendation scenarios, we have carried out a series of explorations and applications in content recommendation from the perspective of multi-scenario global representation. The work covers three dimensions: the scope of global representation, the way information is migrated, and the application of the model framework. It has so far produced stage-by-stage optimization experience and business gains.

Background

Since its launch at the end of 2020, Taobao Guangguang has been the main venue for content community building and for product seeding ("zhongcao") and sharing within Taobao's content recommendation system. Unlike product recommendation, and in order to deliver on its value proposition of "good-looking, interesting, authentic, and good finds", Taobao Guangguang shows "multi-" characteristics along several dimensions of product form and distribution system, specifically:

  1. Multiple display forms: double-column content feed (video + image-text), immersive video TAB without a refer item, full-screen swipe-up/down page with a refer item (video, image-text), etc.

  2. Multiple content subjects: video, image-text, and attached products

  3. Multiple user mindsets: content seeding mindset, product shopping-guide mindset

672a0add7dd4bcc008be075864598358.jpeg

Figure 1. Diversified content recommendation scenarios in Taobao Guangguang

Such rich and diverse content recommendation scenarios meet the needs of user consumption and creator sharing, but they also pose great challenges for algorithm modeling. On the one hand, the model needs to adapt to the data distribution of each scenario; on the other hand, no scenario should become an "information island", so we must think about how to make good use of the data of every scenario from a global perspective. Therefore, from the early days of Taobao Guangguang, we set up a multi-scenario global representation project to address these challenges.

Multi-scenario recommendation is a common challenge for today's recommendation systems, spanning advertising, marketing, product shopping guides, content, and other fields. Over the past three years, many algorithm teams in the industry have recognized the problem and argued for multi-scenario modeling from the perspectives of business needs, data distribution, and feasibility. Our own preliminary research and data analysis reached a similar conclusion. Beyond this general necessity, we focus on the advantages of multi-scenario global representation in the context of Taobao Guangguang content recommendation:

  1. Efficiency advantage: Compared with the rich behavioral data in Taobao's commodity domain, data in the Guangguang content domain is relatively sparse, and the multi-scenario nature of Guangguang fragments the data further. In this context, multi-scenario global representation helps break down the barriers between scenario data, alleviates data sparsity in content recommendation scenarios, and thus improves business efficiency.

  2. Cost advantage: The Guangguang domain contains many recommendation scenarios, and designing and maintaining a separate set of models for each one carries high labor and resource costs. Multi-scenario global representation aims to serve multiple scenarios with one unified framework, which yields meaningful cost savings.

  3. Migration advantage: As the business develops, Guangguang keeps adding or connecting new scenarios, such as the video TAB. At the same time, besides efficiency metrics, content recommendation must also optimize user experience, content ecology, and creator growth. How to onboard new scenarios quickly, and how to migrate existing solutions quickly to other objectives, tests our solution design. Multi-scenario global representation has an inherent advantage here.

Therefore, we are committed to exploring an efficient, unified, and transferable multi-scenario global representation framework to support the business needs of Taobao's content-based recommendation scenarios, and hope that our thinking and exploration in this direction can bring some experience and inspiration.

Problems and Challenges

Based on the above background and business demands, multi-scenario global representation faces the following challenges:

  1. In terms of breadth, how to expand the scope of data utilization

  2. In terms of depth, how to improve data utilization and make information migration more fine-grained

  3. In terms of speed, how to adapt data migration quickly to multiple scenarios and multiple objectives

Technical Iteration Route

▐ Three-dimensional evaluation of multi-scenario global representation

We focus on the ability to evaluate and think about multi-scenario global representation schemes from the following three dimensions:

  1. Scope of global representation (gradually expanding the data used): labeled samples of a single content-domain scenario → labeled samples of multiple content-domain scenarios → full-space samples (labeled + unlabeled) of multiple content-domain scenarios → full-space samples of the multi-scenario Taobao domain (commodity + content)

  2. Way of information migration (information sharing and differentiated migration between scenarios): sharing of sample data → differentiated model representations → adaptive fine-grained migration through network structure

  3. Extension of the model framework (ability to extend to new scenarios and new objectives): improving the efficiency of the main multi-scenario task → transferable application of the framework to other scenarios and tasks

▐ Scheme selection and thinking

There is already much excellent multi-scenario modeling work in the industry; for reasons of space we cannot list it all. We build on this prior work to explore our own optimization scheme and its fit with the business. Based on the three evaluation dimensions described in the "Three-dimensional evaluation" section above, the following table lists the differences between our design and some representative works:

| Multi-scenario solution | Scope of global representation | Way of information migration | Extension of the model framework |
| --- | --- | --- | --- |
| STAR | Labeled samples | Network parameter matrix mapping (partially implicit) | Unverified |
| SAR-Net | Labeled samples | MMoE-like structure (partially implicit) | Unverified |
| HMoE | Labeled samples | MMoE-like structure (partially implicit) | Unverified |
| AdaSparse | Labeled samples | Dynamic weight network (partially implicit) | Unverified |
| Guangguang multi-scenario solution (this work) | Labeled samples + unlabeled samples | Adaptive fine-grained migration through structure (explicit modeling) | Verified |

▐ Iterative ideas and progress

Figure 2 shows how our ideas and progress on multi-scenario global representation in Taobao Guangguang have evolved over more than a year. So far we have completed the design and online rollout of the first three stages. This article focuses on the work of the third stage; the fourth stage is left to future work.

6fd26b0410f78509b713704019fde45a.png

Figure 2. Iteration process of multi-scenario global representation in Taobao Guangguang

We distilled the core optimization points of the third-stage solution into a paper, which was accepted at CIKM 2022.
Title of the paper: Scenario-Adaptive and Self-Supervised Model for Multi-Scenario Personalized Recommendation
Paper address: https://arxiv.org/abs/2208.11457

Because our project team is currently mainly responsible for optimizing the recall stage of the Taobao Guangguang recommendation pipeline, the overall idea was first implemented in the recall stage of all Guangguang scenarios, where the model has been fully launched (the experimental evaluation is detailed in the "Experimental Analysis and Online Effects" section). The rest of this article therefore focuses on the recall model and the recall task. We believe the ideas proposed here also apply to other stages such as ranking, and to other recommendation fields such as product recommendation.

In addition, recall, as the first stage of the recommendation pipeline, differs from ranking in sample space, scoring method, and optimization objective, and has its own challenges. We adapted the design and implementation of the multi-scenario model accordingly. Our preliminary research also found that the industry's multi-scenario modeling work mainly targets ranking models, so our solution is among the earliest in the industry to apply multi-scenario modeling in the recall stage.

Core Model Scheme

Next, we will introduce our multi-scenario global representation model: Scenario-Adaptive and Self-Supervised Model for Multi-Scenario Personalized Recommendation (SASS for short). The SASS model focuses on three core optimization points:

  1. Refined migration: implicit differentiated modeling such as network parameter matrix mapping → refined migration through an adaptive gating network

  2. Global sample extension: global labeled sample set → global labeled and unlabeled sample set

  3. Multi-scenario recall model: single-scenario two-tower structure → multi-scenario two-tower structure

▐ Model framework: two-stage training mode

2cda69c73a836daf5b1662d448a18e16.jpeg

Figure 3. SASS model framework

Figure 3 shows the SASS model framework, which consists of two stages:

| Stage | Task type | Sample space | Role |
| --- | --- | --- | --- |
| Stage 1: pre-training task | Unsupervised scenario-contrastive task | Multi-scenario unlabeled samples | a. Introduces unlabeled data to expand the usable space of global samples; b. Performs pattern matching and alignment between scenarios to capture scenario relationships; c. Provides pre-trained embeddings and initial network parameters for the second-stage task |
| Stage 2: fine-tuning task | Supervised two-tower metric task | Multi-scenario labeled samples | a. Aligns with the recall objective through joint training on full-scenario and single-scenario samples; b. The fine-tuned model serves multi-scenario recall online |

In a two-tower recall model the user side and the item side are modeled independently and have similar structures. In our model, apart from the underlying input features, the user side and the item side use exactly the same network structure and loss form, and the underlying embedding layer is shared globally. For brevity, the following sections use the user side as the example and do not describe the item-side structure in detail.

▐ Refinement Migration: Multi-layer Scene Adaptive Migration Representation Network

1d648a359cb084f1f379187ad59f3c03.jpeg

Figure 4. Multi-Layer Scenario Adaptive Transfer Module (ML-SAT)

Figure 4 shows the core representation network of the whole multi-scenario framework, the Multi-Layer Scenario Adaptive Transfer Module (ML-SAT for short). A sample from a given scenario passes through ML-SAT and comes out as a vector representation that integrates global information while remaining scenario-specific. The core modules are introduced in turn below.

  • Full-scenario shared network & single-scenario unique network

As shown in Figure 4(a), the features of each sample will go through two network structures at the same time after passing through the shared embedding layer and performing vector splicing:

  1. Global shared network ed1d545f9a08fe49049b4855b827f3d7.png: the blue part of the network in the figure. Its structure is shared by all scenarios, and samples from every scenario are trained through it, so the global shared network models the representation of global information.

    fa46ef499942c0919d0f07c0a4a0fbb3.png

  2. Scenario specific network 7666c2b11d62fc2a62227f962a859f45.png: the gray boxed part of the network in the figure. Each scenario has its own independent network parameters, which model the differentiated representation of that scenario. During training and inference, each sample is represented and trained only through the specific network of its own scenario.

e974904a3dbc3bee9258863c29681b8c.png
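The routing described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the layer sizes, the scenario ids, the `tanh` activation, and the helper names (`make_layer`, `dense`, `forward`) are all assumptions made for the example.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Create a dense layer as (weights, biases) with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(layer, x):
    """Apply one dense layer followed by tanh (activation chosen for illustration)."""
    weights, biases = layer
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

rng = random.Random(0)
EMB_DIM, HIDDEN = 4, 3
SCENARIOS = ["video_feed", "video_tab"]  # hypothetical scenario ids

shared_net = make_layer(EMB_DIM, HIDDEN, rng)  # one set of parameters for all scenarios
specific_nets = {s: make_layer(EMB_DIM, HIDDEN, rng) for s in SCENARIOS}  # one per scenario

def forward(x, scenario_id):
    """Return (global representation, scenario-specific representation)."""
    g = dense(shared_net, x)                   # every sample passes through the shared network
    s = dense(specific_nets[scenario_id], x)   # only this scenario's parameters are used
    return g, s

x = [0.5, -0.2, 0.1, 0.9]  # concatenated embedding of one sample
g_a, s_a = forward(x, "video_feed")
g_b, s_b = forward(x, "video_tab")
```

The same input yields an identical global representation in both scenarios, while the scenario-specific outputs differ because each scenario owns separate parameters.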

  • Scene Context Bias Network

To keep training unified, we use a single feature system for all scenarios (a common feature schema; features that a scenario does not have are filled with null values). The input features fall into two broad categories:

  1. Common features c9592659b3923864866759922d396692.png: such as user-side profile features, user behavior sequences, etc.

  2. Scenario-specific features 86091b68c345383e9d9009c2b1b1045d.png: such as scenario_id and other features that clearly identify the scenario, as well as features unique to a scenario (such as the refer feature in the second-hop scenario)

Scenario-specific features discriminate strongly between scenarios, so these features 12392daca5f804ef98edffdf9f506da7.png are aggregated separately and then modeled through an auxiliary network:

bd82c4121a5288cc0dbb6c6b4a686a67.png

The auxiliary network output 11792932e86e0509a4c9b9e36092e9e2.png is used as scenario bias information both for full-scenario information migration (introduced in the "Scene adaptive gating unit" section) and for the final scenario fusion representation (introduced in the "Scene Bias Fusion Module" section).
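As a small illustration of the unified schema and the feature split described above, the sketch below fills missing features with `None` and separates common features from scenario-specific ones. The feature names and helper functions are hypothetical; the article names only `scenario_id` and the refer feature.

```python
# Hypothetical unified schema; apart from scenario_id the feature names are illustrative only.
COMMON_FEATURES = ["user_age_bucket", "user_gender", "behavior_seq"]
SCENARIO_FEATURES = ["scenario_id", "refer_item_id"]
SCHEMA = COMMON_FEATURES + SCENARIO_FEATURES

def to_unified_sample(raw):
    """Fill features a scenario does not produce with None so every sample shares one schema."""
    return {name: raw.get(name) for name in SCHEMA}

def split_features(sample):
    """Separate common features (main networks) from scenario-specific ones (auxiliary network)."""
    common = {k: sample[k] for k in COMMON_FEATURES}
    scenario = {k: sample[k] for k in SCENARIO_FEATURES}
    return common, scenario

# A sample from a scenario that has no refer item.
raw = {"user_age_bucket": 3, "user_gender": "f", "behavior_seq": [11, 42], "scenario_id": "video_tab"}
sample = to_unified_sample(raw)
common, scenario = split_features(sample)
```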

  • Scene adaptive gating unit

Multi-scenario modeling faces a tension: single-scenario data follows the scenario's own distribution but is sparse, while mixed multi-scenario data is rich in information but noisy. The focus of our design is therefore how to combine the strengths of both, obtaining useful migrated information from the rich global representation in a fine-grained way without seriously damaging the data distribution of the original scenario. To this end, we build a bridge for information migration between the global shared network and the scenario specific network, and design a scenario adaptive gate unit that adaptively and finely controls both the amount of information migrated from the full-scenario representation to the single-scenario representation and the way the migrated information is fused with the original information. The gating structure is shown in Figure 4(b).

b9ca66344bc4f873726c013db76575a2.png

The scenario adaptive gate unit mainly includes two gate structures: an adaptive gate 263a3bb7650d4ad697f1a85f9b237ec8.png and an update gate 3b385efcd16f2ef4e5eeb320780f45e9.png. Specifically, d42707f19ecb18a43a0886bd34478802.png controls how much information can be migrated from full-scenario modeling to the single scenario, and 5088ab6c78ba5eac1900e8fc6aa9101c.png controls how the migrated information is fused with the scenario's original information. We stack multiple layers of this scenario adaptive gate unit, so that information is migrated and fused layer by layer in a fine-grained way:

299c3c7cb66c7738110faa86bcf8a387.png
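To make the two-gate idea concrete, here is a minimal GRU-style sketch. It is one plausible form of such a unit, not the formula from the paper (which appears above only as an image): the parameter names, the sigmoid gates, and the convex-combination fusion are assumptions, and the per-layer scenario-specific transformation is omitted for brevity.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def linear(weights, x):
    """Matrix-vector product for a weight matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def rand_matrix(n, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n)]

def make_gate_params(n, rng):
    """Parameters of one gate unit; all names here are illustrative."""
    return {k: rand_matrix(n, rng) for k in ("Wa_g", "Wa_s", "Wu_g", "Wu_s")}

def gate_unit(p, g, s):
    """Fuse global representation g into scenario representation s.

    adaptive gate a: how much of g may be transferred
    update gate u:   how transferred and original information are mixed
    """
    a = [sigmoid(x + y) for x, y in zip(linear(p["Wa_g"], g), linear(p["Wa_s"], s))]
    u = [sigmoid(x + y) for x, y in zip(linear(p["Wu_g"], g), linear(p["Wu_s"], s))]
    transferred = [ai * gi for ai, gi in zip(a, g)]
    return [ui * ti + (1.0 - ui) * si for ui, ti, si in zip(u, transferred, s)]

rng = random.Random(0)
DIM, NUM_LAYERS = 3, 2
layers = [make_gate_params(DIM, rng) for _ in range(NUM_LAYERS)]

g_layers = [[0.2, -0.4, 0.7], [0.1, 0.3, -0.6]]  # per-layer outputs of the global shared network
s = [0.5, 0.5, -0.1]                              # scenario-specific representation
for params, g in zip(layers, g_layers):
    s = gate_unit(params, g, s)                   # layer-by-layer transfer and fusion
```

Because both gates lie strictly between 0 and 1 in this sketch, each fused value is a convex combination of the gated global signal and the scenario's own signal, so neither side can be amplified beyond its input range.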

In addition, after the paper was published, we further optimized the structure. The core idea is to drive a directional flow of information through a random sum operation 1d6d1edf8ae51afc902482e3ff9b1870.png over the information from the two directions:

d42fde82a1ddcf971956a25403355020.png

On the one hand, this design prevents the whole migration structure from becoming inactive. On the other hand, randomly dropping full-scenario information makes training harder and keeps full-scenario information from dominating a single scenario. The optimized structure of the new scenario adaptive gate unit is shown in Figure 5.

7207bc696ec766870212eec308e34723.png

Figure 5. Scene adaptive gating unit (new version)

  • Scene Bias Fusion Module

As shown in Figure 4(c), the representation vector 0d95b05bf93dd507ce2469a3e37d2195.png produced by the multi-layer scenario adaptive gate units is fused in the upper layer with the scenario bias representation 028364249daab59b8a5c34239f32722b.png, which finally yields the vector representation 29fe1e50a42b6a5f0776ef36315adc96.png of the sample in the corresponding scenario:

80ef8628fd0c640a49f08f6ee495e704.png

▐ Global Sample Expansion: Unsupervised Pre-training for Scene Comparison Learning

After introducing the core representation network ML-SAT of the model, the first stage of pre-training tasks based on ML-SAT will be introduced next. As mentioned in the "Iterative Ideas and Progress" section, the next stage of planning and development of multi-scene global representation is to expand from labeled global samples to unlabeled global samples. Inspired by the application of contrastive learning in the field of deep learning, we draw a conceptual analogy between the contrastive learning training paradigm and the modeling of associations between scenarios under multi-scenario problems, as shown in Figure 6.

From the user's perspective, information migration in a recommendation system rests on the premise that a user's interests and mindset are similar or related across scenarios. For example, if a beauty lover clicks on a lipstick product in the homepage "Guess You Like" feed, she is also likely to be interested in lipstick review content in Guangguang; likewise, a user may be interested in Guangguang videos on skateboarding instruction or even Frisbee and other outdoor sports topics. However, users' behavior patterns clearly differ between scenarios. This need to capture the similarity or relevance of the same user's interests as seen from different scenarios coincides with the modeling idea of contrastive learning. We therefore treat a user's behavior in different scenarios (the behavioral feature data of each scenario) as a form of data augmentation in the contrastive learning sense, obtain representations of the same user in two different scenarios, and then align the two vectors with a contrastive metric loss, achieving pattern matching and alignment between scenarios.

9a9a8a8fdccb8894e9b55c9e742a0084.jpeg

Figure 6. Analogy of the contrastive learning paradigm on scene contrastive modeling

Specifically, assuming that a user has visited several scenarios, we obtain the user's scenario-pair samples c8aec40863736d8476fffa47ae96a5b7.png by pairwise combination (as shown in Figure 7). bc54a89fba1e6229fcca640334128d58.png We then extract, scenario by scenario, the user's sample features in the corresponding scenario, including:

  1. User profile characteristics f3b9ccf37bd3ebe6c199f4cb6cec1d5f.png: gender, age, occupation, etc.

  2. User interaction characteristics in the scene 6e3004e48b1b3a133d99d2d03633134d.png: the behavior sequence in the corresponding scene and the cross-scene statistical features such as category preference and account preference in the scene

  3. The attribute characteristics corresponding to the scene 8ea260f1da906002effe4c2b2f9b3a87.png, such as scenario_id, etc.

546e6d84deab8f8bbfef45f811671753.jpeg

Figure 7. Scene comparison learning sample splitting method (number of scenes>2)
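The pairwise splitting in Figure 7 can be sketched with `itertools.combinations`. The function name, the tuple layout, and the scenario ids below are assumptions for illustration; the article does not specify a sample format.

```python
from itertools import combinations

def scene_pairs(user_id, visited_scenes):
    """Return every unordered pair of scenarios a user visited as (user, scene_i, scene_j)."""
    unique = sorted(set(visited_scenes))  # de-duplicate and fix an order for reproducibility
    return [(user_id, a, b) for a, b in combinations(unique, 2)]

# A user who visited three scenarios yields three scenario pairs.
pairs = scene_pairs("u1", ["video_tab", "feed", "fullscreen"])
```

A user seen in only one scenario produces no pair and therefore contributes nothing to the contrastive pre-training set in this sketch.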

235012c0ddf725cc4c121942d274be10.png As shown in Figure 4(a), after obtaining the user's features in the corresponding scenario, we obtain the user's representation vector in that scenario through the representation network introduced in the section "Refinement Migration: Multi-layer Scene Adaptive Migration Representation Network". Finally, the task is trained with a contrastive learning loss. Assuming that a user's scenario sample pair is represented by the vectors ebca3b213f5d4e4b6cb2e75d301e5bf4.png, the corresponding loss is as follows:

ff1768a5b715296a6372bccc5a3924e6.png

Then the training loss in a batch (batch_size= 4fbf16712e374012b7b0c977dfd3c15b.png) is:

b0f6b7e033d91a763357ed45ef9f7043.png
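The exact loss appears above only as an image. As a hedged illustration, the sketch below uses a standard InfoNCE-style contrastive loss with in-batch negatives, which is a common choice for this kind of alignment. The cosine similarity, the temperature value, and the symmetric averaging are assumptions rather than details confirmed by the article.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def info_nce(anchor, candidates, positive_index, temperature=0.1):
    """Negative log softmax probability of the positive among all candidates."""
    logits = [cosine(anchor, c) / temperature for c in candidates]
    m = max(logits)  # subtract the max for numerical stability
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[positive_index]

def batch_loss(views_a, views_b, temperature=0.1):
    """Average loss over a batch; row i of views_a and views_b belongs to the same user."""
    n = len(views_a)
    total = 0.0
    for i in range(n):
        total += info_nce(views_a[i], views_b, i, temperature)  # scenario A -> scenario B
        total += info_nce(views_b[i], views_a, i, temperature)  # scenario B -> scenario A
    return total / (2 * n)

views_a = [[1.0, 0.0], [0.0, 1.0]]
aligned = batch_loss(views_a, [[0.9, 0.1], [0.1, 0.9]])     # same user's views are close
misaligned = batch_loss(views_a, [[0.1, 0.9], [0.9, 0.1]])  # same user's views are far apart
```

The loss is low when the two scenario views of the same user are closer to each other than to other users' views in the batch, and high otherwise.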

The modeling ideas and training methods on the item side are similar to those on the user side and are not repeated here. It should be emphasized that the first-stage pre-training task models the contrast relationship between scenarios, and the whole task is trained on an unlabeled sample set. In this way, the coverage of global samples is extended to the unlabeled data space.

▐ Multi-scene recall: joint fine-tuning training of global and single-scene samples

a8295ecdf2ea9c62bd846e02e00223ef.jpeg

Figure 8. Fine-tuning tasks and output vectors of the second stage of the SASS model

The second-stage fine-tuning task is a two-tower model aligned with the main recall task, as shown in Figure 8. Its core network module is the same as in the first stage, and it restores the embedding layer and upper network parameters of the first-stage model, which is how parameters are migrated between the two stages.

Training for fine-tuning tasks: We take the content clicked by the user as a positive sample and randomly sample from the candidate space to construct negative samples, obtaining triplets of the form 6787b4a30475c64af1b10053aa87d2a6.png. This is therefore a supervised training task. The sample construction is similar to that of a traditional two-tower recall model and is not expanded here; we focus instead on the training method.

After the user-side features and item-side features of a single sample pass through the ML-SAT representation network, the representation vector corresponding to the scene can be obtained. On this basis, pairwise loss is used for training:

26871452012ea11b570bff1820cce2ab.png

The above loss uses samples from a single scenario, so it fits the distribution of that scenario well. At the same time, we take the output of the last layer of the global shared network in the user-side and item-side ML-SAT networks shown in the figure as the representation learned from global samples, and train on this output as an auxiliary task:

60ddfb3b2f2f85f88080b6faaab06e05.png

The final task of the second stage is the joint training of the dual-tower recall task of the single scene itself and the dual-tower recall task of the global sample. The training loss is expressed as:

163efdff630246ca796483863b011638.png
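The joint objective can be illustrated as below. The article's formulas are available only as images, so this sketch assumes a logistic (BPR-style) pairwise loss and a weighting coefficient `aux_weight` on the auxiliary global loss; both are assumptions, as are the toy vectors.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pairwise_loss(user, pos_item, neg_item):
    """Logistic pairwise loss: small when the positive item scores above the negative one."""
    margin = dot(user, pos_item) - dot(user, neg_item)
    return math.log1p(math.exp(-margin))

def joint_loss(scenario_vecs, global_vecs, aux_weight=0.5):
    """Scenario-specific loss plus a weighted auxiliary loss on the global shared outputs."""
    return pairwise_loss(*scenario_vecs) + aux_weight * pairwise_loss(*global_vecs)

# (user, positive item, negative item) vectors from the two branches of the towers
scenario_vecs = ([1.0, 0.0], [0.8, 0.1], [-0.5, 0.3])  # outputs of the scenario specific networks
global_vecs = ([0.6, 0.2], [0.7, 0.0], [0.1, 0.1])     # last-layer outputs of the global shared network
loss = joint_loss(scenario_vecs, global_vecs)
```

Only the scenario-specific branch is used for serving, so in this sketch the auxiliary term acts purely as a training-time regularizer that keeps the shared network trained on global samples.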

▐ Model deployment and online estimation

Model deployment and inference reflect our goal of serving all scenarios with a single model. The deployment method is shown in Figure 9. We take the model from the second-stage fine-tuning task and deploy it online on the group's BE O2O retrieval architecture. Each scenario's traffic uses the output of the corresponding scenario specific network for online inference (the output of the global shared network is used only for model training). On the item side, the full candidate set is passed through the item-side scenario specific network to produce representation vectors, which are then indexed; on the user side, the vector is computed online and used for ANN retrieval, and the top-k results are returned to downstream stages.

fea16b5020745fa4e287ba0e70ca3711.png

Figure 9. SASS model deployment and online estimation
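The serving flow can be sketched as follows. Exact brute-force scoring stands in for the ANN index here, and the function names, scenario ids, and vectors are hypothetical; the production system described above relies on the group's retrieval engine rather than this code.

```python
import heapq

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def build_index(item_vectors_by_scenario):
    """Offline step: keep one vector table per scenario (a real system would build an ANN index)."""
    return {scenario: dict(items) for scenario, items in item_vectors_by_scenario.items()}

def retrieve(index, scenario_id, user_vector, k):
    """Online step: score the scenario's items against the user vector and return the top-k ids."""
    items = index[scenario_id]
    return heapq.nlargest(k, items, key=lambda item_id: dot(user_vector, items[item_id]))

# Item vectors as produced by the item-side scenario specific network of each scenario.
index = build_index({
    "video_tab": {"v1": [0.9, 0.1], "v2": [0.1, 0.9], "v3": [0.7, 0.7]},
    "feed": {"p1": [0.2, 0.3]},
})
top = retrieve(index, "video_tab", [1.0, 0.0], k=2)  # user vector computed online
```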

Experimental Analysis and Online Effects

▐ Method comparison

We verified the effect offline on a public dataset and on the production dataset of Taobao Guangguang, comparing against the mainstream multi-scenario model solutions in the industry. The evaluation results are shown in the following table:

6be4cc137478b3686d5fa52610533a01.jpeg

▐ Ablation experiment

We also ran a series of ablation experiments on the model, focusing on the following questions:

  1. How does the scene adaptive gate compare to other migration structures?

  2. On the item side, how do scenario-specific vector representations compare with a single shared vector representation?

  3. How does the scene comparison pre-training task compare to other pre-training tasks?

  4. What is the impact of substructures such as scenario bias network and global sample joint fine-tuning auxiliary tasks on the effect?

0c15f7c7203e1fe691f6b749d2dffda0.jpeg

▐ Online effect

The multi-scenario global representation recall model based on SASS has been fully launched in all 5 core scenarios of Taobao Guangguang (video dual-column feed, video full-screen page, video TAB, image-text dual-column feed, image-text full-screen page) and has achieved good returns in every scenario. Since its launch, it has been one of the most important recall models in each scenario, and in practice it has achieved the goal of serving all scenarios online with one model.

Model extension and application

Let us return to the three evaluation dimensions in the "Three-dimensional evaluation of multi-scenario global representation" section. In terms of the framework's ability to migrate and extend, we expect the multi-scenario global representation framework to solve more than the multi-scenario modeling problem for a specific objective in a specific field. We want to explore whether this multi-dimensional information migration idea can address a broad class of multi-domain distribution modeling problems with "similar feature systems but different data distributions". Therefore, after rolling out and optimizing the multi-scenario global representation recall model on the core objective of the main pipeline, we further explored how the framework extends to other fields and objectives, and whether it can be combined more deeply with the characteristics of Guangguang content recommendation to solve other business problems. The core idea lies in how a "scenario" is understood. Below we list some extended applications that we have already delivered in Guangguang or are still exploring.

▐ Short video consumption time optimization

Watch time of short videos is one of the core optimization goals in content recommendation scenarios. Traditional approaches include modeling duration as a separate objective in the ranking model, or modeling sub-tasks such as video completion and swipe-away to improve final watch time; at the sample level, loss weighting and sampling-strategy adjustment are often used to reflect the importance of a video sample in terms of watch time. One difficulty is how to remove the bias introduced by the length of the video itself while optimizing watch time. In the past year, many solutions have tried to eliminate video-length bias by bucketing the duration and predicting duration quantiles.

63544dca892a964f3ad6e220e7e579ef.png

Figure 10. Multi-scenario duration debiasing and modeling ideas

We approach this problem from another angle by combining it with multi-scenario modeling: we divide the samples into several datasets by video length, treat each dataset as a scenario (same feature system, overlapping information, different distributions), and then model them with the multi-scenario global representation framework. The video-length bias within a single scenario is then small, and finer-grained differences between scenarios can be modeled. At the same time, joint modeling and global information migration alleviate the data sparsity caused by partitioning the data. In addition, since the underlying embeddings of the whole model are shared, the added parameters are concentrated in the shallow upper networks, so the extra resource and performance overhead is small.
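A minimal sketch of treating video-length buckets as scenarios is shown below. The bucket boundaries and the `dur_*` ids are hypothetical; the article does not state how the durations are actually split.

```python
from bisect import bisect_right

# Hypothetical bucket boundaries in seconds; the source does not give the real split.
DURATION_BOUNDS = [15, 60, 180]

def duration_scene(video_seconds):
    """Map a video's length to a pseudo-scenario id from 'dur_0' to 'dur_3'."""
    return f"dur_{bisect_right(DURATION_BOUNDS, video_seconds)}"

samples = [
    {"video_id": "a", "seconds": 8},
    {"video_id": "b", "seconds": 60},
    {"video_id": "c", "seconds": 600},
]
for sample in samples:
    # The bucket id plays the role of scenario_id in the multi-scenario model.
    sample["scenario_id"] = duration_scene(sample["seconds"])
```

Each bucket then gets its own scenario specific network, while all buckets continue to share the embedding layer and the global shared network.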

▐ Cross-domain modeling of commodity domain and content domain

As one of Taobao's core content distribution scenarios, Taobao Guangguang carries the important task of connecting the user side and the content supply side around product seeding and the content community. From a technical perspective, the essence is to establish information associations, and to describe the differences, between two data domains: the commodity domain and the content domain. How to migrate the rich commodity-domain information into the content domain has therefore always been one of our core lines of work. The difficulty is that, compared with multi-scenario migration within the Guangguang content domain, the two domains differ greatly in data coverage (user coverage, supply coverage), user mindset (commodity consumption vs. content entertainment), and feature system (commodity features vs. content features). Drawing on the multi-scenario global representation model and viewing the problem as multi-scenario migration, we carry out cross-domain modeling of the Taobao commodity domain and the Guangguang content domain in two steps:

  • Scene Sample Migration

We cooperated with the short video team of Taobao's homepage "Guess You Like" feed to introduce its short video content data, which leans more toward shopping guidance. We clean and filter these samples with strategies such as keeping only content in Guangguang's distributable candidate pool and keeping only candidates that pass from recall into pre-ranking in the Guangguang recommendation pipeline. We then treat the "Guess You Like" short video data as a new scenario and add it to the multi-scenario global representation model for joint modeling, which improves the representations in the Guangguang domain.

  • Representation Migration of Commodity Information and Content Information

Taobao's video content has an important property: much of it has products attached, which provides a natural basis for relating products and content. To address the heterogeneity of product and content features, we further upgraded the original model framework, dividing user features and content features into product-related features and content-related features and modeling them separately. We then combine two ideas to achieve cross-domain information migration: a. Multi-scenario global migration between products and attached products is essentially information migration within the product feature system itself; content is then associated through the attachment relationship, giving a product → product → content migration path. b. For the same user or the same content, we align the product representation and the content representation in vector space through an auxiliary contrastive learning task, directly achieving domain adaptation between products and content. Figure 11 shows the modeling idea on the content side; the structure on the user side is similar. Both ideas are embedded in the multi-scenario global representation model and trained end to end, realizing cross-domain modeling of the commodity domain and the content domain.

f3dc5875423a5b01bada664b71385873.png

Figure 11. Representation transfer and contrastive alignment between products and content
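To make the contrastive alignment idea concrete, here is a minimal sketch of one common form of contrastive objective (an InfoNCE-style loss with in-batch negatives). The article only states that a contrastive learning auxiliary task is used; the specific loss, the `temperature` value, and the function names below are assumptions for illustration.

```python
import math
from typing import List, Sequence


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def _l2_normalize(v: Sequence[float]) -> List[float]:
    norm = math.sqrt(_dot(v, v)) or 1.0  # guard against zero vectors
    return [x / norm for x in v]


def contrastive_alignment_loss(
    product_reprs: Sequence[Sequence[float]],
    content_reprs: Sequence[Sequence[float]],
    temperature: float = 0.1,
) -> float:
    """InfoNCE-style loss over a batch (an assumed form, not the article's).

    Row i of product_reprs and row i of content_reprs belong to the same
    user (or the same content) and form the positive pair; every other row
    in the batch serves as a negative.
    """
    products = [_l2_normalize(v) for v in product_reprs]
    contents = [_l2_normalize(v) for v in content_reprs]
    total = 0.0
    for i, p in enumerate(products):
        logits = [_dot(p, c) / temperature for c in contents]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(products)
```

In this sketch the loss is small when each product-side representation is closest to its own content-side representation and large when it is closer to another row's, which is what pulls the two representation spaces into alignment. In an end-to-end setup it would be added to the main ranking or recall loss with some weight.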

In addition, we are planning and practicing a more direct use of Taobao product data for cross-domain modeling. So far we have completed the migration at the sequence-feature level, by introducing users' site-wide product behavior sequences and modeling their cross-behavior with content behavior sequences. Directly introducing commodity samples, however, is difficult because the commodity sample volume is very large and joint training is costly; solutions for this are planned and in progress. This is also the optimization focus of the fourth stage of our multi-scenario global representation learning mentioned in Section 3.3. For reasons of space, this article does not expand on it further.


  • Content cold start problem

Unlike commodities, short video content is highly time-sensitive and turns over quickly. As a result, Taobao content recommendation faces a more serious cold start problem, and a more urgent need to ramp up exposure, than commodity recommendation. This is mainly reflected in:

  1. Sparser samples: the exposure and penetration rate of Taobao content still lag far behind those of commodities

  2. Creator incentives: whether published content can gain exposure quickly affects creators' enthusiasm for creating

  3. Strong timeliness: videos tied to events such as festivals place higher demands on how quickly they are distributed and take off

In response to these difficulties, we applied the multi-scenario global representation model to the cold-start boosting pipeline of Taobao Shopping to improve its ability to recommend new content. Specifically, we extract the samples corresponding to content released within 3 days as a new scene, keep the other main scenes unchanged, and then train the multi-scene global representation model to realize information migration from old content to new content. Online, recall uses the output of the network corresponding to this scene. Compared with using only samples from the cold-start pipeline, or directly mixing cold-start and main-pipeline samples for training, the multi-scenario global representation model not only realizes information sharing and migration, but also alleviates the problem of the model being dominated by old content.
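The construction of the cold-start scene can be sketched as below. The dictionary keys, the scene name `cold_start`, measuring content age at exposure time, and copying (rather than moving) fresh samples into the new scene are all assumptions made for this illustration; the article only specifies the 3-day window and that the main scenes stay unchanged.

```python
from datetime import datetime, timedelta
from typing import Dict, List

COLD_START_WINDOW = timedelta(days=3)  # "released within 3 days" per the text


def build_training_scenes(
    samples: List[dict],
    window: timedelta = COLD_START_WINDOW,
    cold_scene: str = "cold_start",
) -> Dict[str, List[dict]]:
    """Group samples by scene and add an extra scene for fresh content.

    Each sample is a dict with hypothetical keys: "content_id", "scene",
    "published_at" and "exposed_at" (both datetime objects).
    """
    scenes: Dict[str, List[dict]] = {}
    for sample in samples:
        # Main scenes are left unchanged: every sample stays where it was.
        scenes.setdefault(sample["scene"], []).append(sample)
        age = sample["exposed_at"] - sample["published_at"]
        if timedelta(0) <= age <= window:
            # Assumption: fresh-content samples are copied into the new scene.
            scenes.setdefault(cold_scene, []).append({**sample, "scene": cold_scene})
    return scenes
```

Because the cold-start scene has its own scene-specific part of the network in a multi-scene model, old-content samples in the main scenes can share information with it without dominating its output.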

  • Interactive multi-target and account growth

Taobao Shopping is not only a content distribution venue but also a place where creators grow and develop. Modeling and distribution therefore need to take account tier and conversion target into account, which inevitably involves optimizing for content of a specific tier or for a specific interaction target (avatar clicks, follows, etc.). We therefore extended the multi-scenario global representation model to interactive multi-objective and account growth goals, focusing on the following two applications:

  • Growth and expansion of high-quality account content

We stratify and select creators' accounts by quality and potential, take the samples covering the content of the selected high-quality and high-potential accounts as a new scene, and leave the other scenes unchanged. Training and online recall are then performed with the multi-scenario global representation model to fully capture the distribution characteristics of these accounts.

  • Account settings and avatar follow

We further extend the understanding of "scene". The scenes described above are mainly defined and divided by product form or business criteria. Going further, can we combine "multi-scene" and "multi-target", treat the sample set corresponding to a specific target as a new scene, and perform multi-scenario global representation modeling on that basis to improve a specific target (such as avatar clicks, follows, and other interaction goals)? Of course, this analogy is rather loose: multi-scenario and multi-target modeling still differ in some essential ways. The idea of multi-scenario migration can nevertheless be borrowed. We therefore targeted the avatar click rate among the interactive objectives: we treat large sample sets such as the click and play-completion sample sets, together with the avatar-click sample set, as different scenes, and use a framework similar to the multi-scenario global representation model, with adaptations, to raise the avatar click rate. Other interaction goals can be modeled in the same way.

9c24c0d7f0ba37babb0dbab2e5829dc6.png

Figure 12. Avatar-click model scheme based on scene migration in Taobao Shopping
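One way to picture "a target's sample set as a scene" is the sketch below, which turns each logged event into one sample per observable target. The event layout, the target names, and the function name `split_by_target` are hypothetical; the article does not specify how the per-target sample sets are built.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence

# Hypothetical target names; the dense targets act as source scenes and the
# sparse avatar-click target is the scene we want to improve.
TARGETS = ("click", "play_complete", "avatar_click")


def split_by_target(
    events: Iterable[dict],
    targets: Sequence[str] = TARGETS,
) -> Dict[str, List[dict]]:
    """Turn each target's observed labels into its own "scene" of samples.

    Each event is a dict with hypothetical keys "user_id", "content_id" and
    "labels", where "labels" maps a target name to True/False only when that
    target was observable for the event.
    """
    scenes: Dict[str, List[dict]] = defaultdict(list)
    for event in events:
        for target in targets:
            if target not in event["labels"]:
                continue  # this target was not observable for the event
            scenes[target].append({
                "user_id": event["user_id"],
                "content_id": event["content_id"],
                "scene": target,
                "label": int(event["labels"][target]),
            })
    return dict(scenes)
```

The resulting per-target sample sets can then be fed to a multi-scene style model in the same way as product-form scenes, with the caveat from the text that multi-target and multi-scenario modeling are not fully equivalent.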


Summary and Outlook

Starting from the specific problems of the Taobao Shopping recommendation scene, we have carried out a series of explorations and applications in content recommendation from the perspective of multi-scenario global representation, and extended our research along the dimensions of global representation, information migration methods, and model framework applications. This optimization work has yielded stage-wise experience and business results. However, multi-scenario global representation is a fairly large technical system, and our practice faces more challenges as Taobao's content recommendation requirements expand and its optimization goals deepen. Our ongoing optimization work will mainly focus on the following aspects:

  1. Vertical, breakthroughs in technical depth : Using Taobao's diverse scenarios and rich data, we will keep improving the basic capabilities of the multi-scenario global representation model. On the one hand, the next stage will focus on exploring cross-domain representation between the commodity domain and the content domain, to achieve a broader global representation; on the other hand, we will continue to emphasize refined modeling and interpretability of information transfer, so that the model is more closely tied to the actual business.

  2. Horizontal, expansion of application scenarios and application goals : By improving the basic model and adapting this framework to actual business problems, our solution can onboard new scenarios more quickly in the future and be extended to other multi-objective problems such as interaction goals.


team introduction

We are the Taobao Shopping algorithm team. Taobao Shopping is an important content scene of Taobao. The team's strengths are:

  1. Large business space and solid infrastructure: the scenarios generate massive feedback, and with the support of the engineering team, algorithm engineers can easily launch large-scale models, update them within minutes, and focus on the algorithm itself.

  2. Good team atmosphere and deep integration of research and practice: the team not only solves business algorithm problems but also keeps up with progress in academia. Students interested in internships are welcome to join. Senior colleagues will define business problems according to each student's strengths and interests and guide their research, giving every student ample room to grow.

Who we are looking for: candidates with some understanding of machine learning and deep learning who are interested in content distribution and content understanding. You can send an email to [email protected] or [email protected]


Origin blog.csdn.net/Taobaojishu/article/details/130437142