Cross-domain social recommendation: How to "guess what you like" through user social information?

Motivation

Online platforms generally fall into two types. One is information-oriented, such as e-commerce websites, which emphasize user-item interactions; the other is social-oriented, such as Twitter, which provides social networking services with rich user-user links. Although the two domains are heterogeneous, they share some users, called bridge users. Through these users we can make cross-domain social recommendations, that is, recommend relevant items from the information domain to potential users in the social network.

Most current cross-domain recommendation methods target homogeneous domains. For the task in this paper, the difficulties are as follows:

- The dataset lacks bridge users.
- The information domain has rich attributes, but little attention has been paid to exploiting these attributes to improve recommendation results for users in social networks.

This paper proposes a method called Neural Social Collaborative Ranking (NSCR) that exploits user-item connections in the information domain and user-user connections in the social domain. In the information domain, attributes are used to enhance the user and item embeddings; in the social domain, the embeddings of bridge users are propagated to non-bridge users through the social network.

Problem Description

In the information domain, U1 is the user set, I is the item set, the matrix Y holds the users' ratings of items, and Gu and Gi denote the attributes of users and items respectively. In the social domain, U2 is the user set and S is the set of social relationships. The bridge users of the two domains are U = U1∩U2.

Input: information domain {U1,I,Y,Gu,Gi}, social domain {S,U2}, U1∩U2 is not empty.

Output: Determine a ranking function for items for each user u' in the social domain.

NSCR Solution

Matrix factorization (MF) is an important model in recommender systems. We first introduce one point of view: a collaborative filtering (CF) model such as MF can be regarded as a shallow neural network model.

As shown in the figure below, the one-hot representation of the user/item ID is fed in and mapped to an embedding layer, and the two embedding vectors are multiplied element by element to obtain the vector h. If h is directly mapped to a score value, the model is the MF model.

[Figure: MF viewed as a shallow neural network (one-hot ID input, embedding layer, element-wise product h, score)]
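The shallow-network view of MF can be sketched in a few lines of plain Python. This is only an illustration: the function names and the toy embedding tables are made up for the example, and the output layer is assumed to be a fixed all-ones weight vector, which is what reduces the network to an inner product.

```python
def one_hot(index, size):
    """Return a one-hot list of length `size` with 1.0 at position `index`."""
    return [1.0 if k == index else 0.0 for k in range(size)]


def embed(one_hot_vec, table):
    """Embedding layer: multiply the one-hot vector by the table (one row per ID)."""
    dim = len(table[0])
    return [sum(one_hot_vec[r] * table[r][c] for r in range(len(table)))
            for c in range(dim)]


def mf_score(user_id, item_id, user_table, item_table):
    """Embed both IDs, multiply element-wise, then sum with unit weights."""
    p_u = embed(one_hot(user_id, len(user_table)), user_table)
    q_i = embed(one_hot(item_id, len(item_table)), item_table)
    h = [a * b for a, b in zip(p_u, q_i)]  # element-wise product vector h
    return sum(h)                          # all-ones output layer = inner product


# Toy tables with made-up values: 2 users, 3 items, 2 latent factors.
USER_TABLE = [[1.0, 2.0], [0.5, -1.0]]
ITEM_TABLE = [[3.0, 0.0], [1.0, 1.0], [-2.0, 4.0]]

score = mf_score(0, 2, USER_TABLE, ITEM_TABLE)  # 1*(-2) + 2*4
```

Replacing the fixed sum with learned layers on top of h is what turns this shallow model into a deeper one.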

This paper argues that the performance of MF is limited by its use of the inner product to capture user-item interactions. Similarly, in the conventional use of attributes, simply adding the user embedding and the attribute embeddings is not enough to capture the connections among users, items, and attributes.

Since the task is to make cross-domain recommendations for users in social networks, this paper uses a representation learning (embedding) approach and holds that the key to the problem is how to map items and social-network users into the same embedding space.

Since the users of the two domains overlap only slightly, the solution given in this paper is to learn the embeddings of the two domains separately while forcing the two learning processes to share the embeddings of the bridge users. The optimization goal is

$$\mathcal{L} = \mathcal{L}_I(\Theta_I) + \mathcal{L}_S(\Theta_S)$$

where the two terms on the right side of the equals sign are the respective objective functions of the information domain and the social domain, and $\Theta_I$ and $\Theta_S$ are the parameters of each domain's model.

1. Learning of Information Domain

There are two kinds of objective functions for learning the parameters of a CF model: point-wise and pair-wise. The former minimizes the loss between the predicted score and the true score. The latter learns from the relative order of item pairs, typically with negative sampling; it suits the implicit feedback used in this paper and directly addresses the task of producing a personalized item ranking for each user.

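First take a triplet (u, i, j), where u is a user, i is an item the user has rated ($y_{ui} = 1$), and j is an item the user has not rated ($y_{uj} = 0$). The objective function aims to learn the correct order of (i, j):

$$\mathcal{L}_I = \sum_{(u,i,j) \in \mathcal{O}} (y_{uij} - \hat{y}_{uij})^2 = \sum_{(u,i,j) \in \mathcal{O}} (\hat{y}_{ui} - \hat{y}_{uj} - 1)^2$$

where $\mathcal{O}$ is the set of training triplets, $y_{uij} = y_{ui} - y_{uj}$, $\hat{y}_{uij} = \hat{y}_{ui} - \hat{y}_{uj}$, and $\hat{y}_{ui}$ is the predicted score.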
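The pair-wise objective can be illustrated with a small sketch. It assumes a squared (regression-style) loss on the difference between the true and predicted pair scores, and it draws one unrated item j per observed item i; the sampling scheme, the function names, and the toy data are assumptions made for illustration only.

```python
import random


def sample_triplets(ratings, num_items, rng):
    """For each observed (u, i), draw one item j that user u has not rated."""
    triplets = []
    for u, items in ratings.items():
        unobserved = [j for j in range(num_items) if j not in items]
        for i in sorted(items):
            if unobserved:
                triplets.append((u, i, rng.choice(unobserved)))
    return triplets


def pairwise_squared_loss(triplets, predict):
    """Sum of (y_uij - yhat_uij)^2, where y_uij = y_ui - y_uj = 1 - 0 = 1."""
    loss = 0.0
    for u, i, j in triplets:
        y_hat_uij = predict(u, i) - predict(u, j)
        loss += (1.0 - y_hat_uij) ** 2
    return loss


# Toy implicit feedback: user 0 rated items 0 and 1, user 1 rated item 2.
RATINGS = {0: {0, 1}, 1: {2}}
triplets = sample_triplets(RATINGS, num_items=4, rng=random.Random(0))


def perfect(u, i):
    """Hypothetical predictor that scores rated items 1 and all others 0."""
    return 1.0 if i in RATINGS[u] else 0.0


def uninformed(u, i):
    """Hypothetical predictor that scores every item 0."""
    return 0.0


loss_perfect = pairwise_squared_loss(triplets, perfect)
loss_uninformed = pairwise_squared_loss(triplets, uninformed)
```

A predictor that separates rated from unrated items by a margin of 1 incurs zero loss, while a predictor that cannot distinguish them pays 1 per triplet.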

With the objective function determined, let's see how the model obtains the predicted value $\hat{y}_{ui}$. Building on the Neural Collaborative Filtering model, this paper further adds attribute information. The structure is shown in the following figure:

[Figure: structure of the attribute-aware prediction model in the information domain (input layer, embedding layer, pooling layer, MLP)]

Input layer: takes the IDs of four kinds of information (user, item, user attributes, item attributes), each represented by a one-hot vector.

Embedding: Embed the four kinds of information separately.

Pooling layer: Since the number of attributes varies, the size of the set of embedding vectors varies too. To give the subsequent neural network a fixed-length input, a pooling operation is performed.

Since max/average pooling cannot capture the interactions between users and attributes, a pairwise pooling method is designed:

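$$p_u = \varphi_{pairwise}(u, \{g_t^u\}) = \sum_{t=1}^{V_u} u \odot g_t^u + \sum_{t=1}^{V_u} \sum_{t'=t+1}^{V_u} g_t^u \odot g_{t'}^u$$

where $u$ is the user embedding, $g_t^u$ is the embedding of the user's $t$-th attribute, $V_u$ is the number of attributes of the user, and $\odot$ is the element-wise product.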

Items are processed in the same way to obtain qi. Finally, pu⊙qi is used as the input of the subsequent MLP, which outputs the prediction result.
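As a rough illustration of pairwise pooling, the sketch below sums the element-wise products of every distinct pair drawn from the entity embedding and its attribute embeddings, which yields a fixed-length vector regardless of how many attributes there are. The exact form is an assumption made for this example, and the helper names and toy vectors are hypothetical. The second function computes the same quantity in linear time using the identity that the sum over pairs equals half of (square of the sum minus sum of the squares).

```python
def hadamard(a, b):
    """Element-wise product of two equal-length vectors."""
    return [x * y for x, y in zip(a, b)]


def vec_add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def pairwise_pooling(entity, attrs):
    """Sum of element-wise products over all distinct pairs of input vectors."""
    vectors = [entity] + list(attrs)
    pooled = [0.0] * len(entity)
    for a in range(len(vectors)):
        for b in range(a + 1, len(vectors)):
            pooled = vec_add(pooled, hadamard(vectors[a], vectors[b]))
    return pooled


def pairwise_pooling_fast(entity, attrs):
    """Same result in linear time: 0.5 * ((sum v)^2 - sum v^2), element-wise."""
    vectors = [entity] + list(attrs)
    totals = [sum(col) for col in zip(*vectors)]
    squares = [sum(x * x for x in col) for col in zip(*vectors)]
    return [0.5 * (t * t - s) for t, s in zip(totals, squares)]


# Toy user embedding with two attribute embeddings (made-up values).
user = [1.0, 2.0]
user_attrs = [[3.0, 0.0], [-1.0, 1.0]]

p_u = pairwise_pooling(user, user_attrs)
p_u_fast = pairwise_pooling_fast(user, user_attrs)
```

Unlike max or average pooling, every output coordinate here depends on products between the user and each attribute as well as between attributes.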

2. Learning of Social Domain

In the social domain, this paper uses a semi-supervised learning method to propagate the user embeddings learned in the information domain from bridge users to non-bridge users. This is based on the assumption that if two users have a strong social relationship, they may have similar preferences and thus similar feature representations in the latent space.

The learning consists of two parts:

Smoothness constraint: defines the loss of structural consistency, which encourages the representations of adjacent users to be similar. Here su'u'' is the strength of the social relationship between users u' and u'', and du' is the out-degree of node u'. It is called a smoothness constraint because each user's feature representation is divided by the square root of the out-degree, which smooths it. Without this normalization, active users with many social connections would have a disproportionate effect on the propagation.

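$$\theta(U_2) = \frac{1}{2} \sum_{u', u'' \in U_2} s_{u'u''} \left\| \frac{p_{u'}}{\sqrt{d_{u'}}} - \frac{p_{u''}}{\sqrt{d_{u''}}} \right\|^2$$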
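A minimal sketch of the smoothness loss follows. It assumes the loss is half the strength-weighted squared distance between degree-normalized representations of connected users, and that each social link is listed once in a dictionary keyed by user pair; the data layout, names, and toy values are illustrative only.

```python
import math


def smoothness_loss(social_edges, out_degree, embeddings):
    """0.5 * sum over links of s * || p_a / sqrt(d_a) - p_b / sqrt(d_b) ||^2."""
    loss = 0.0
    for (a, b), strength in social_edges.items():
        norm_a = [x / math.sqrt(out_degree[a]) for x in embeddings[a]]
        norm_b = [x / math.sqrt(out_degree[b]) for x in embeddings[b]]
        loss += 0.5 * strength * sum((x - y) ** 2 for x, y in zip(norm_a, norm_b))
    return loss


# Toy social graph: one link of strength 1; "alice" is the more active user.
EDGES = {("alice", "bob"): 1.0}
OUT_DEGREE = {"alice": 4, "bob": 1}

# After dividing by sqrt(out-degree), both users map to [1, 2]: no penalty.
aligned = smoothness_loss(
    EDGES, OUT_DEGREE, {"alice": [2.0, 4.0], "bob": [1.0, 2.0]})

# Here the normalized vectors are [1, 2] and [1, 0]: penalty 0.5 * 1 * 4.
misaligned = smoothness_loss(
    EDGES, OUT_DEGREE, {"alice": [2.0, 4.0], "bob": [1.0, 0.0]})
```

The first case shows the effect of the normalization: the raw vectors differ, but the loss compares them only after scaling by the square root of each user's out-degree.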

Fitting constraint: To keep the latent spaces of the two domains consistent, the two representations of each bridge user are forced to be close, which gives the fitting loss:

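$$\theta(U) = \frac{1}{2} \sum_{u' \in U} \left\| p_{u'} - p_{u'}^{(0)} \right\|^2$$

where $p_{u'}^{(0)}$ is the representation of bridge user u' learned in the information domain.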
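To make the interplay of the two constraints concrete, the sketch below computes a fitting loss and then runs plain gradient descent on the sum of the smoothness term and a weighted fitting term. Everything about the optimization is an assumption for illustration: the weight `mu`, the learning rate, the step count, and the use of simple gradient descent are hypothetical choices, not settings taken from the paper.

```python
import math


def fitting_loss(bridge_users, social_emb, info_emb):
    """0.5 * sum over bridge users of || p_u (social) - p_u (information) ||^2."""
    return sum(
        0.5 * sum((x - y) ** 2 for x, y in zip(social_emb[u], info_emb[u]))
        for u in bridge_users
    )


def propagate(edges, out_degree, info_emb, social_emb, mu=1.0, lr=0.1, steps=500):
    """Gradient descent on smoothness + mu * fitting (hypothetical settings)."""
    emb = {u: list(v) for u, v in social_emb.items()}
    for _ in range(steps):
        grad = {u: [0.0] * len(v) for u, v in emb.items()}
        for (a, b), s in edges.items():  # smoothness term, one entry per link
            root_a, root_b = math.sqrt(out_degree[a]), math.sqrt(out_degree[b])
            for k in range(len(emb[a])):
                diff = emb[a][k] / root_a - emb[b][k] / root_b
                grad[a][k] += s * diff / root_a
                grad[b][k] -= s * diff / root_b
        for u, target in info_emb.items():  # fitting term, bridge users only
            for k in range(len(target)):
                grad[u][k] += mu * (emb[u][k] - target[k])
        for u in emb:
            emb[u] = [x - lr * g for x, g in zip(emb[u], grad[u])]
    return emb


# "alice" is a bridge user with a known information-domain embedding;
# "bob" exists only in the social domain and starts from zeros.
INFO_EMB = {"alice": [1.0, -2.0]}
START = {"alice": [0.0, 0.0], "bob": [0.0, 0.0]}
EDGES = {("alice", "bob"): 1.0}
OUT_DEGREE = {"alice": 1, "bob": 1}

initial_fit = fitting_loss(["alice"], START, INFO_EMB)
learned = propagate(EDGES, OUT_DEGREE, INFO_EMB, START)
```

In this toy graph the fitting term pins the bridge user to her information-domain embedding, and the smoothness term pulls her non-bridge neighbor toward the same point, which is the propagation effect the two constraints are meant to produce.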

After training, pu' is fed into the prediction model of the information domain to obtain the predicted item ranking.
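The prediction step can be sketched as follows: the social user's representation is multiplied element-wise with each item representation, the result is passed through an MLP, and items are sorted by score. The single ReLU hidden layer and all weights below are placeholders chosen for the example, not the trained network from the paper.

```python
def relu(vec):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, x) for x in vec]


def dense(vec, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def predict_score(p_user, q_item, hidden_w, hidden_b, out_w):
    """Score = out_w . relu(W (p_user * q_item) + b), with * element-wise."""
    h = [a * b for a, b in zip(p_user, q_item)]
    hidden = relu(dense(h, hidden_w, hidden_b))
    return sum(w * x for w, x in zip(out_w, hidden))


def rank_items(p_user, item_reps, hidden_w, hidden_b, out_w):
    """Return item IDs sorted from highest to lowest predicted score."""
    scores = {item: predict_score(p_user, q, hidden_w, hidden_b, out_w)
              for item, q in item_reps.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Placeholder weights: identity hidden layer and a unit output layer.
HIDDEN_W = [[1.0, 0.0], [0.0, 1.0]]
HIDDEN_B = [0.0, 0.0]
OUT_W = [1.0, 1.0]

p_social_user = [1.0, 2.0]  # representation learned in the social domain
ITEM_REPS = {"x": [1.0, 0.0], "y": [0.0, 1.0], "z": [-1.0, -1.0]}

ranking = rank_items(p_social_user, ITEM_REPS, HIDDEN_W, HIDDEN_B, OUT_W)
```

Because the social-domain representation lives in the same latent space as the information-domain one, the same predictor can be reused unchanged for users who never appeared in the information domain.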

Experimental results

The information-domain dataset comes from trip.com, and the Facebook and Twitter information of some of these users was also collected. The evaluation metrics are AUC and Recall@k, which measure the quality of personalized ranking.
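For readers unfamiliar with the two metrics, here is a minimal per-user sketch. It uses the common definitions: AUC as the fraction of (relevant, irrelevant) item pairs that are ordered correctly, with ties counted as half, and Recall@k as the share of relevant items that appear in the top k. The exact evaluation protocol of the paper may differ, and the toy scores are made up.

```python
def auc(scores, positives):
    """Fraction of (positive, negative) item pairs ranked in the right order."""
    pos = [s for item, s in scores.items() if item in positives]
    neg = [s for item, s in scores.items() if item not in positives]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative item")
    correct = sum(
        1.0 if p > n else (0.5 if p == n else 0.0)
        for p in pos for n in neg
    )
    return correct / (len(pos) * len(neg))


def recall_at_k(scores, positives, k):
    """Share of the user's relevant items that appear among the top-k items."""
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    return len(set(top_k) & set(positives)) / len(positives)


# Toy predicted scores for one test user; items "a" and "c" are relevant.
SCORES = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.1}
POSITIVES = {"a", "c"}

user_auc = auc(SCORES, POSITIVES)                    # 3 of 4 pairs are correct
user_recall = recall_at_k(SCORES, POSITIVES, k=2)    # only "a" is in the top 2
```

Averaging these per-user values over all test users gives the dataset-level figures.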

Since non-bridge users have no rating information, their predictions cannot be verified, so a portion of the bridge users is used as the test set. The prediction results are better than those of state-of-the-art methods.

[Figure: performance comparison between NSCR and the baseline methods]

Evaluation

Unlike traditional recommendation methods that treat social information as side information about users and predict user preferences within the same domain, this paper learns user preferences in the information domain and then propagates them along the social network, so that preference information can also be learned for users who were not originally in the information domain. The angle is novel.

Heterogeneous-domain recommendation is a very interesting task and worth following.


The original release time is: 2018-05-9
Author of this article: Huang Ruozi
This article is from "PaperWeekly", a partner of the Yunqi community. For related information, you can follow "PaperWeekly".
