Matching Models in Search and Recommendation (Part 1): Traditional Methods

Part 0: Overview of matching in search and recommendation

The main inspiration for this review is the SIGIR 2018 tutorial slides "Deep Learning for Matching in Search and Recommendation", a very solid survey of deep matching methods for search and recommendation; for the feature-based deep learning approaches in particular, I have added some recent related papers. Search and recommendation are probably the most widely deployed and most readily productionized applications of machine learning and deep learning in industry. Whether it is search or recommendation, the essence is matching: search matches a given query against docs, and recommendation matches a given user against items. This article focuses on the matching problem in recommendation systems, covering both traditional matching models and deep learning models.

Although deep learning is all the rage, the ideas embodied in matrix factorization, collaborative filtering, and so on still run through it: for example, svd++ embodies the ideas of both userCF and itemCF, and most of the models below can be obtained as degenerate cases of the FM model. Summarizing these methods helps build a deeper understanding of the connections between the different models.

Figure 1: The essence of both recommendation and search is a matching process

Part 1: Collaborative filtering-based methods

CF model

The most classic model in recommendation systems is the famous collaborative filtering (CF). Collaborative filtering rests on a fundamental assumption: a user's behavior can be predicted from the behavior of similar users.

The basic idea of collaborative filtering is to use all <user, item> interactions, applying collective wisdom to make recommendations. CF can be divided into three types: user-based CF, item-based CF, and model-based CF.

(1) User-based CF: analyze the items each user likes; if users a and b like almost the same items, then a and b are similar. Just as with like-minded friends, items that b likes but a has not yet seen can be recommended to a.

(2) Item-based CF: if the people who like item A and the people who like item B are almost the same, then item A and item B are similar. If a user likes item A, then item B can be recommended to that user with a high probability of being liked. For example, a user who reads this article on recommendation systems is also likely to prefer other similar articles on machine learning and recommendation systems.

(3) Model-based CF: also called the learning-based approach. It defines a parametric model describing the relationships between users and items, users and users, and items and items, and then optimizes the parameters against the existing user-item rating matrix. Examples include matrix factorization and latent factor models such as LFM.

Expressed in terms of data, the problem collaborative filtering solves is: how to fill in a partially known matrix (Matrix Completion). As shown in Figure 2.1, the known values are the items the user has interacted with; filling in the remaining unknown values of the matrix based on these known values, i.e. predicting the items the user has not interacted with, is the matrix completion problem to be solved.

Figure 2.1: On the left, the movies the user has rated; on the right, the same data expressed as a matrix to be filled

Matrix completion can be solved with the classical SVD (Singular Value Decomposition), as shown in Figure 2.2.

Figure 2.2: SVD matrix decomposition

On the left, M is the m * n user rating matrix, where the number of rows m is the number of users and the number of columns n is the number of items. In most recommendation systems m and n are rather large, so we want to decompose M into the low-rank form on the right. SVD solving can generally be divided into three steps:

(1) fill the missing entries of matrix M with 0

(2) solve the SVD problem to obtain the matrices U and V

(3) use the matrices U and V to form a k-dimensional low-rank estimate of the matrix
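The three steps above can be sketched with numpy; this is a minimal illustration on a toy rating matrix whose values are made up:

```python
import numpy as np

# Toy user-item rating matrix; np.nan marks missing (unrated) entries
M = np.array([
    [5.0, 3.0, np.nan, 1.0],
    [4.0, np.nan, np.nan, 1.0],
    [1.0, 1.0, np.nan, 5.0],
    [1.0, np.nan, 5.0, 4.0],
])

# Step (1): fill missing entries with 0
M_filled = np.nan_to_num(M, nan=0.0)

# Step (2): solve the SVD problem
U, s, Vt = np.linalg.svd(M_filled, full_matrices=False)

# Step (3): keep only the top-k singular values for a rank-k estimate
k = 2
M_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
```

`M_hat` has the same shape as `M` but rank at most k; its entries at previously missing positions serve as the predicted scores.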

Solving the SVD problem in the second step is equivalent to the following optimization problem:

Here y_ij is user i's true score for item j, i.e. the label, and U and V are the model estimates. Solving for the matrices U and V is the process of minimizing the error between the true user rating matrix and the predicted matrix.
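The original equation image is missing here; reconstructed from the description above (with u_i, v_j the rows of U and V), the objective has the standard least-squares form:

```latex
\min_{U, V} \; \sum_{i=1}^{m} \sum_{j=1}^{n} \left( y_{ij} - u_i^{\top} v_j \right)^2
```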

This way of solving SVD has problems:

(1) missing data (more than 99% of the dataset) and observed data are given the same weight

(2) the minimization has no regularization (only minimum squared error), so it is prone to overfitting

Therefore, there are many improvements on the original SVD method.


MF model (matrix factorization)

To solve the overfitting described above, the matrix factorization (MF) model makes the following improvements.

The core idea of the MF model can be divided into two steps:

(1) decompose user u's score for item i into a latent user vector v_u and a latent item vector v_i

(2) take the dot product (inner product) of user u's vector and item i's vector to obtain a value representing user u's degree of preference for item i; the higher the score, the greater the probability that the item is recommended to the user

At the same time, the MF model introduces l2 regularization to combat overfitting.

Of course, besides l2 regularization, other regularization methods such as l1 regularization or cross-entropy regularization are also possible.
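The two steps above, with l2 regularization, can be sketched as SGD over the observed entries only; the toy data and hyperparameters below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy observed (user, item, score) triples -- illustrative only
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 1.0), (2, 2, 5.0)]
n_users, n_items, k = 3, 3, 4
lam, lr = 0.01, 0.05                           # l2 strength and learning rate

P = 0.1 * rng.standard_normal((n_users, k))    # latent user vectors v_u
Q = 0.1 * rng.standard_normal((n_items, k))    # latent item vectors v_i

for _ in range(500):                           # SGD over observed entries only
    for u, i, y in ratings:
        pu, qi = P[u].copy(), Q[i].copy()
        err = y - pu @ qi                      # prediction is the inner product
        P[u] += lr * (err * qi - lam * pu)     # gradient step with l2 shrinkage
        Q[i] += lr * (err * pu - lam * qi)
```

After training, `P[u] @ Q[i]` approximates the observed scores, and the same inner product scores unobserved user-item pairs.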


FISM model

Both the CF methods and the MF model above simply use user-item interaction information; the representation of a user is just the userid itself. KDD 2013 presented a method that expresses user information more richly, the Factored Item Similarity Model (FISM): as the name suggests, it uses the items a user has liked as the user's representation. It is described by the following formula:

Note that the latent user vector is no longer independent, but is expressed as the sum of the latent vectors of all items the user has liked; the item itself is expressed by another set of latent vectors v_i, and the final score is again the inner product of the two vectors.
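The formula image is missing in the original; the standard FISM form from the KDD 2013 paper (notation mine) matches the description above:

```latex
\hat{y}_{ui} \;=\; b_u + b_i \;+\;
\frac{1}{|\mathcal{R}_u \setminus \{i\}|^{\alpha}}
\Big( \sum_{j \in \mathcal{R}_u \setminus \{i\}} p_j \Big)^{\!\top} q_i
```

Here R_u is the set of items user u has interacted with, alpha is a normalization exponent, and p_j and q_i are the two sets of item latent vectors.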


SVD++ model

The MF model can be seen as a user-based CF model, mapping the user id directly into a latent vector, while the FISM model can be seen as an item-based CF model, mapping the set of items the user has interacted with into latent vectors. One carries the information of the userid itself, the other the information of the items the user has interacted with in the past. How can the advantages of user-based and item-based be combined?

The SVD++ method is exactly the combination of the two; its mathematical expression is as follows:

Here each user's representation is split into two parts: on the left, v_u is the latent vector mapped from the user id (the user-based CF idea); on the right is the sum over the set of items the user has interacted with (the item-based CF idea). The similarity between user and item is then expressed by the dot product of the two vectors.
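The missing expression here is, in the standard SVD++ form with bias terms (notation follows Koren's paper rather than the slides):

```latex
\hat{y}_{ui} \;=\; b + b_u + b_i \;+\;
q_i^{\top} \Big( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \Big)
```

Here p_u is the latent vector of the user id (the user-based part) and the normalized sum over N(u), the user's interacted items, is the item-based part.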

This fusion can be seen as an early fusion model, and it was the best-performing model in the million-dollar Netflix Prize competition, which ran for three years.


Part 2: Generic feature-based methods

All the methods above, whether CF, MF, SVD, SVD++, or FISM, only use the interaction information between users and items (rating data), and a large amount of side information goes unused. For example: the user's own side information, such as age, gender, occupation; the item's own side information, such as category, description, image information; and context information, such as location, time, weather, and so on. The second part of the traditional models is therefore about how to use these features to construct feature-based models.

Figure 2.3: The three feature modules: user information, item information, interaction information


FM model

First up is the famous FM model. As shown in Figure 2.4, the FM model can be regarded as composed of two parts: the linear LR model in blue, and the second-order feature combinations in red. For each input feature, the model needs to learn a low-dimensional latent vector v, i.e. what is called an embedding in the various NN network representations.

Figure 2.4: The one-hot sparse input of the FM model

The mathematical expression of the FM model is shown in Figure 2.5.

Figure 2.5: Decomposition of the FM model's mathematical expression

Note that the red part represents the pairwise second-order feature combinations (features are not crossed with themselves), where each cross is represented by the inner product of two latent vectors. The FM model is a paradigm of feature-based models; the several models introduced next can all be seen as special cases of the FM model.
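A minimal sketch of FM prediction; the pairwise term is computed with the well-known O(kn) reformulation from Rendle's paper, and the toy values and names below are mine:

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """FM score: bias + linear part (LR) + second-order feature crosses.

    sum_{i<j} <v_i, v_j> x_i x_j is computed in O(k*n) as
    0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2 ],
    which excludes crossing a feature with itself.
    """
    linear = w0 + w @ x
    s = V.T @ x                       # per-factor weighted sums, shape (k,)
    s_sq = (V ** 2).T @ (x ** 2)      # per-factor sums of squares
    return float(linear + 0.5 * np.sum(s ** 2 - s_sq))

# Toy check against the naive double loop over feature pairs
rng = np.random.default_rng(1)
n, k = 4, 2
V = rng.standard_normal((n, k))
w = rng.standard_normal(n)
x = np.array([1.0, 0.0, 1.0, 0.0])   # e.g. a userid + itemid one-hot input
naive = 0.3 + w @ x + sum(
    (V[i] @ V[j]) * x[i] * x[j] for i in range(n) for j in range(i + 1, n)
)
assert abs(fm_predict(x, 0.3, w, V) - naive) < 1e-9
```

The reformulation matters in practice because n (the one-hot feature dimension) is huge while k is small.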


Relation between the FM and MF models

If the input contains only the userid and itemid, we find that FM degenerates into an MF model with biases, as shown in Figure 2.6.

Figure 2.6: The FM model can degenerate into an MF model with biases

The mathematical expression is as follows:
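The expression image is missing in the original; with only the two one-hot fields for userid u and itemid i active, the FM equation reduces to (my reconstruction):

```latex
\hat{y}(x) \;=\; w_0 + w_u + w_i + v_u^{\top} v_i
```

The linear weights w_u and w_i play the role of the user and item biases, and v_u^T v_i is exactly the MF inner product.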


Relation between the FM and FISM models

If the input contains two kinds of variables, 1) the set of items the user has interacted with and 2) the itemid itself, then the FM model in turn degenerates into a FISM model with biases, as shown in Figure 2.7: the blue boxes are the items the user has interacted with in the past (rated movies), and the orange box on the right represents the one-hot feature of the itemid itself.

Figure 2.7: The FM model can degenerate into a FISM model with biases

In this case, the FM model's mathematical expression is as follows:
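The expression image is missing; with the interacted-item set R_u (each active feature weighted 1/|R_u|) and the target itemid i as input, and ignoring the pairwise terms among the history items themselves (which FISM does not have), my reconstruction is:

```latex
\hat{y} \;=\; w_0 + w_i + \frac{1}{|\mathcal{R}_u|} \sum_{j \in \mathcal{R}_u} w_j
\;+\; \frac{1}{|\mathcal{R}_u|} \Big( \sum_{j \in \mathcal{R}_u} v_j \Big)^{\!\top} v_i
```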

Similarly, if the latent vector for the userid is added, the FM model degenerates into the SVD++ model. So MF, FISM, and SVD++ are all actually special cases of FM.


Part 3: Summary

Micro perspective

The models presented above all solve the ranking problem of a recommendation system via rating prediction, which in many cases is not optimal, for the following reasons:

(1) the gap between the RMSE metric on predicted scores and the recommendation system's actual ranking metrics

Fitting predicted scores with RMSE minimizes squared error (with regularization), while what the system actually faces is a ranking problem.

(2) the natural bias of the observations

Users generally tend to rate the items they like, but items a user has not rated are not necessarily disliked. For ranking in a recommendation system, RMSE can generally be replaced with pairwise ranking.

In pairwise ranking, instead of directly fitting each user's score on individual items, we fit in the form of pairs: in general, a user's high-scored item should rank above the same user's low-scored item, and an item the user has interacted with should rank above an item the user has not interacted with (which is not necessarily disliked).
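One common instantiation of this pairwise idea is the BPR objective (Rendle et al.; notation mine): for triples of a user u, a preferred item i+ and a less-preferred or non-interacted item i-, maximize

```latex
\sum_{(u, i^{+}, i^{-})} \ln \sigma\!\left( \hat{y}_{u i^{+}} - \hat{y}_{u i^{-}} \right)
\;-\; \lambda \, \lVert \Theta \rVert^{2}
```

where sigma is the sigmoid and Theta the model parameters.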


Macro perspective

In fact, the essence of both search and recommendation is matching: the former matches queries with docs, the latter matches users with items. The matching methods fall into two categories, traditional models and deep models; chapter two covered the traditional models, and chapters three and four will cover the deep models.

The traditional models divide into collaborative-filtering-based models and feature-based models; the biggest difference between the two is whether side information is used. Collaborative-filtering-based models such as CF, MF, FISM, and SVD++ only use user-item interaction information, such as the userid, the itemid, and the set of items the user has interacted with as the user's own representation. Feature-based models, with FM as the main example, additionally introduce side information beyond the user-item interactions. Many other models, such as MF, SVD++, and FISM, are special cases of the FM model.

This review is mainly based on the original slides, with some of the papers skimmed and some read intensively; I learned a lot from them. Throughout the text I have tried to string the various methods together around the question of how to do matching for recommendation, since the same ideas run behind them. There are surely mistakes; criticism and corrections are welcome.


Part 4: References

(1) https://www.comp.nus.edu.sg/~xiangnan/sigir18-deep.pdf

(2) Xiangnan He, Hanwang Zhang, Min-Yen Kan, and Tat-Seng Chua. Fast matrix factorization for online recommendation with implicit feedback. In SIGIR 2016.

(3) Yehuda Koren, and Robert Bell. Advances in collaborative filtering. Recommender systems handbook. Springer, Boston, MA, 2015. 77-118.

(4) Santosh Kabbur, Xia Ning, and George Karypis. FISM: factored item similarity models for top-n recommender systems. In KDD 2013.

(5) Yehuda Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In KDD 2008.

(6) Steffen Rendle. Factorization machines. In ICDM 2010.


Origin blog.csdn.net/hellozhxy/article/details/103979806