Some simple understandings of recommendation systems from my own work and study (to be continued)

I am currently working on short-video recommendation and took some time to write up a summary.

1. Multi-objective models

At present, the neural network models used in both rank and recall generally follow a wide & deep structure with multiple objectives. The objectives share parameters at the bottom of the network: in some models the sharing is limited to the bottom embedding layer, while others also share some of the MLP layers.

Parameter sharing lets the prediction of sparse interaction behaviors benefit from the user's click data. If left untreated, however, the model update will be dominated by the gradient of the main objective, which has far more positive samples. In practice, the labels with fewer positives are therefore given heavier weights, e.g. increasing the weight of positive samples of user interactions. This re-weighting is in fact equivalent to oversampling.
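A minimal sketch of such per-sample re-weighting inside the loss, assuming PyTorch, two binary objectives (click and interaction), and a made-up `interaction_pos_weight` knob that is not from the original post:

```python
import torch
import torch.nn.functional as F

def multi_task_loss(click_logits, click_labels,
                    interact_logits, interact_labels,
                    interaction_pos_weight=20.0):
    click_loss = F.binary_cross_entropy_with_logits(click_logits, click_labels)
    # Up-weight the scarce interaction positives so the shared parameters
    # are not updated almost entirely by the click objective's gradient.
    # Weighting positives like this is equivalent to oversampling them.
    weights = torch.where(interact_labels > 0.5,
                          torch.full_like(interact_labels, interaction_pos_weight),
                          torch.ones_like(interact_labels))
    interact_loss = F.binary_cross_entropy_with_logits(
        interact_logits, interact_labels, weight=weights)
    return click_loss + interact_loss
```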

Sample weighting also makes it easier, on the other hand, to tune the per-objective weights when the multiple objectives are combined. Each behavior is estimated by a binary classification model that predicts the probability of that behavior occurring. Without any treatment, the predicted value will be close to the true CTR (in practice a little smaller). The click prediction is then usually much larger than the interaction prediction, and the combination is generally a weighted sum: for example, if the click prediction is 0.2 and the interaction prediction is 0.01, the ranges over which the click weight and the interaction weight must be tuned differ greatly. After up-weighting the interaction positives, the predictions sit on comparable scales, which makes the fusion weights convenient to adjust.
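A toy calculation with the numbers from that example (the fusion weights themselves are assumptions for illustration):

```python
# Weighted-sum fusion with the predictions quoted in the text.
p_click, p_interact = 0.2, 0.01
w_click, w_interact = 1.0, 5.0  # hypothetical fusion weights

score = w_click * p_click + w_interact * p_interact  # 0.25
# With p_interact two orders of magnitude below p_click, w_interact must be
# searched over a very different range than w_click; inflating p_interact
# via positive-sample weighting puts both weights on comparable scales.
```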

In the multi-objective DNN structure, the bottom embedding is generally shared. In the rank model, the sparse and dense features of the user and the item are mixed together, passed through some fully connected layers, and then fed into the multiple objective heads, as in the sketch below.
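A minimal sketch of this shared-bottom structure, assuming PyTorch and made-up sizes (one sparse ID field plus one dense block; real models would have many fields):

```python
import torch
import torch.nn as nn

class SharedBottomRank(nn.Module):
    def __init__(self, n_ids=10000, emb_dim=16, dense_dim=8, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(n_ids, emb_dim)      # shared embedding layer
        self.shared_mlp = nn.Sequential(             # shared fully connected layers
            nn.Linear(emb_dim + dense_dim, hidden), nn.ReLU())
        self.click_head = nn.Linear(hidden, 1)       # one head per objective
        self.interact_head = nn.Linear(hidden, 1)

    def forward(self, sparse_ids, dense_feats):
        # Mix sparse embeddings with dense features, then branch per objective.
        x = torch.cat([self.emb(sparse_ids), dense_feats], dim=-1)
        h = self.shared_mlp(x)
        return self.click_head(h), self.interact_head(h)  # logits per objective
```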

The recall model uses a two-tower structure: the user and the item each go through their own MLP, and the final score is the cosine of the two output vectors. A sketch follows.
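A minimal two-tower sketch under the same assumptions (PyTorch, illustrative dimensions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoTower(nn.Module):
    def __init__(self, user_dim=32, item_dim=32, out_dim=16):
        super().__init__()
        self.user_mlp = nn.Sequential(nn.Linear(user_dim, out_dim), nn.ReLU(),
                                      nn.Linear(out_dim, out_dim))
        self.item_mlp = nn.Sequential(nn.Linear(item_dim, out_dim), nn.ReLU(),
                                      nn.Linear(out_dim, out_dim))

    def forward(self, user_feats, item_feats):
        # L2-normalize both towers so the dot product equals cosine similarity.
        u = F.normalize(self.user_mlp(user_feats), dim=-1)
        v = F.normalize(self.item_mlp(item_feats), dim=-1)
        return (u * v).sum(dim=-1)  # cosine score per (user, item) pair
```

Because the two towers only meet at the cosine, item-side vectors can typically be precomputed offline and indexed, so serving reduces to an approximate nearest-neighbor lookup on the user vector.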


2. Feature selection optimization

For dense features, generally look at the variance and the proportion of zero values. For sparse features, the mutual information between the feature and the label can be analyzed, but normalization needs to be considered when computing mutual information. A sketch of these checks follows.
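A minimal sketch of these screening statistics on stand-in data, using numpy and scikit-learn's `normalized_mutual_info_score` as one normalized-MI option:

```python
import numpy as np
from sklearn.metrics import normalized_mutual_info_score

# Stand-in arrays; in practice these come from the training log.
dense = np.random.randn(100000)
sparse = np.random.randint(0, 50, size=100000)  # categorical feature IDs
label = np.random.randint(0, 2, size=100000)    # binary behavior label

variance = dense.var()            # near-zero variance -> candidate for removal
zero_ratio = (dense == 0).mean()  # mostly-zero dense features carry little signal

# Raw mutual information grows with a feature's cardinality, so a normalized
# variant keeps features of different cardinalities comparable.
nmi = normalized_mutual_info_score(sparse, label)
```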

3. Multi-objective weights


4. Online learning models


5. Recall-related


Source: www.cnblogs.com/earendil/p/11878459.html