Recommendation system and knowledge graph

As an important means of information filtering, the personalized recommendation system is currently one of the most effective ways to address information overload, and it is a core technology of user-facing Internet products.

Recommendation system tasks and difficulties

Depending on what is being predicted, recommendation systems can generally be divided into two categories. The first is rating prediction: in a movie application, for example, the system predicts the rating a user would give a movie and then pushes the movies the user is likely to enjoy. In this scenario the user's feedback expresses the degree of preference, so it is called explicit feedback. The second is click-through rate (CTR) prediction: in a news application, for example, the system predicts the probability that a user clicks on a given news item in order to optimize what it recommends. Feedback in this scenario captures only the user's behavior (click / no click) and does not reflect how much the user likes the item, so it is called implicit feedback.

A traditional recommendation system uses only the historical interactions between users and items (explicit or implicit feedback) as input, which causes two problems. First, in real scenarios the interaction data are often very sparse. For example, a movie app may contain tens of thousands of movies, while a user has rated only a few dozen of them on average. Using so few observations to predict a large amount of unknown information greatly increases the risk of overfitting. Second, for newly added users or items the system has no historical interactions, so it cannot model them or make recommendations accurately; this is known as the cold start problem.
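
To make the sparsity argument concrete, here is a small back-of-the-envelope calculation. The specific numbers are assumptions chosen only to match the orders of magnitude mentioned above ("tens of thousands of movies", "dozens" of ratings per user); they are not measurements from any real system.

```python
def matrix_density(num_users: int, num_items: int, interactions_per_user: int) -> float:
    """Fraction of the user-item matrix that is actually observed."""
    observed = num_users * interactions_per_user
    possible = num_users * num_items
    return observed / possible


# Hypothetical figures, for illustration only.
num_users = 100_000          # assumed number of users
num_movies = 20_000          # "tens of thousands of movies"
ratings_per_user = 50        # "dozens" of rated movies per user

density = matrix_density(num_users, num_movies, ratings_per_user)
print(f"observed fraction of the rating matrix: {density:.4%}")
```

Under these assumptions only about a quarter of one percent of the matrix is observed, and the model has to infer everything else.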

 

A common way to address the sparsity and cold start problems is to feed side information (auxiliary information) into the recommendation algorithm. Auxiliary information enriches the description of users and items and strengthens the algorithm's ability to mine patterns, which effectively compensates for sparse or missing interaction data. Common auxiliary information includes:

  • Social networks: if a user is interested in an item, their friends may also be interested in it;

  • User / item attributes: users with the same attributes may be interested in the same type of items;

  • Multimedia information such as images / video / audio / text: for example, product pictures, movie trailers, music, news titles, etc.;

  • Context: the time, place, and current session information of the user-item interaction;

  • ……

How to integrate the various kinds of auxiliary information into the recommendation algorithm effectively, according to the characteristics of the specific recommendation scenario, has long been a hot and difficult research topic in recommendation systems. How to extract effective features from the various kinds of auxiliary information is likewise a core problem in recommendation system engineering.

Knowledge graph

Among all kinds of auxiliary information, the knowledge graph is an emerging type that has gradually attracted researchers' attention in recent years. A knowledge graph is a semantic network in which nodes represent entities or concepts and edges represent the semantic relationships between them. A knowledge graph is composed of triples (h, r, t), where h and t are the head node and tail node of a relationship and r is the relationship itself.

For example, the triple (Chen Kaige, director, Farewell My Concubine) expresses the fact that "Chen Kaige directed Farewell My Concubine", where h = Chen Kaige, r = director, and t = Farewell My Concubine.
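
As a minimal sketch, triples of this form can be stored as plain tuples and indexed by head entity. The `Triple` type and the `build_index` helper are hypothetical names introduced for illustration, and the second triple (with an assumed relation name `actor`) is added only to give the index more than one entry.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str


triples = [
    Triple("Chen Kaige", "director", "Farewell My Concubine"),
    # Illustrative extra fact; the relation name "actor" is an assumption.
    Triple("Leslie Cheung", "actor", "Farewell My Concubine"),
]


def build_index(triples):
    """Map each head entity to its outgoing (relation, tail) pairs."""
    index = defaultdict(list)
    for h, r, t in triples:
        index[h].append((r, t))
    return index


index = build_index(triples)
print(index["Chen Kaige"])
```

Real knowledge graphs are far larger and are usually kept in a dedicated store, but the (h, r, t) shape of each record is the same.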

A knowledge graph contains rich semantic associations between entities and therefore provides a potential source of auxiliary information for recommendation systems. It can be applied in many recommendation scenarios, such as movies, news, attractions, restaurants, and shopping. Compared with other kinds of auxiliary information, introducing a knowledge graph can give the recommendation results the following characteristics:

  • Precision. The knowledge graph introduces more semantic relationships for items, which helps uncover user interests in greater depth;

  • Diversity. The knowledge graph provides different types of relations, which helps the recommendation results diverge and keeps them from being confined to a single type;

  • Explainability. The knowledge graph can connect the user's history with the recommendation results, which improves the user's satisfaction with and acceptance of the results and strengthens the user's trust in the recommendation system.

The difference between a knowledge graph and item attributes is worth mentioning here. Item attributes can be regarded as the 1-hop nodes directly connected to an item in the knowledge graph, that is, a weakened version of the knowledge graph. A complete knowledge graph can provide deeper and longer-range associations between items, for example "Farewell My Concubine" - Leslie Cheung - Hong Kong - Liang Chaowei (Tony Leung) - "Infernal Affairs". Because the knowledge graph has higher dimensionality and richer semantic relationships, it is also more complicated and difficult to process than item attributes.
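
The following sketch illustrates the difference between 1-hop attributes and multi-hop associations using the chain from the paragraph above. The relation names (`starring`, `born_in`) are assumptions added for illustration, and the graph is traversed as if it were undirected; a breadth-first search is just one simple way to find such a path.

```python
from collections import defaultdict, deque

# Tiny illustrative graph; relation names are assumptions.
triples = [
    ("Farewell My Concubine", "starring", "Leslie Cheung"),
    ("Leslie Cheung", "born_in", "Hong Kong"),
    ("Liang Chaowei", "born_in", "Hong Kong"),
    ("Infernal Affairs", "starring", "Liang Chaowei"),
]


def neighbors_index(triples):
    """Undirected adjacency: entity -> set of directly connected entities."""
    adj = defaultdict(set)
    for h, _r, t in triples:
        adj[h].add(t)
        adj[t].add(h)
    return adj


def shortest_path(adj, start, goal):
    """Breadth-first search; returns the list of entities on a shortest path."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in sorted(adj[node]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


adj = neighbors_index(triples)
one_hop = adj["Farewell My Concubine"]   # what "item attributes" would capture
path = shortest_path(adj, "Farewell My Concubine", "Infernal Affairs")
print(one_hop)
print(" - ".join(path))
```

The 1-hop neighborhood of "Farewell My Concubine" does not contain "Infernal Affairs"; the connection only appears after following four edges, which is exactly the information that flat item attributes lose.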

 

Generally speaking, existing work that can introduce a knowledge graph into a recommendation system falls into two categories:

  • Generic feature-based methods, represented by LibFM [1]. Such methods take the attributes of users and items uniformly as input to the recommendation algorithm. For example, LibFM denotes all attributes of a user and an item as x, and models the interaction strength y(x) between the user and the item as a function of all first-order and second-order terms of the attributes. In the standard second-order factorization machine form this is:

        y(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle x_i x_j

        where n is the number of attributes, and w_0, w_i, and the latent vectors v_i are learned parameters. Because this type of method is generic, we can weaken the knowledge graph into item attributes and then apply it. The shortcoming is also obvious: the method is not designed specifically for knowledge graphs, so it cannot use all the information in the knowledge graph efficiently. For example, it is difficult for such a method to exploit multi-hop knowledge or to incorporate relation information.

  • Path-based methods, represented by PER [2] and MetaGraph [3]. Such methods treat the knowledge graph as a heterogeneous information network and then construct meta-path-based or meta-graph-based features between items. Simply put, a meta-path is a specific type of path connecting two entities. For example, the meta-path "actor -> movie -> director -> movie -> actor" connects two actors, so it can be regarded as one way of mining potential relationships between actors. The advantage of this type of method is that it uses the network structure of the knowledge graph fully and intuitively. The disadvantage is that meta-paths or meta-graphs must be designed by hand, which is hard to do optimally in practice. In addition, this type of method cannot be applied in scenarios where the entities do not belong to a single domain (such as news recommendation), because meta-paths or meta-graphs cannot be predefined for such scenarios.
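
To illustrate what following a meta-path means, the sketch below enumerates the instances of "actor -> movie -> director -> movie -> actor" in a toy typed graph. It is not the PER or MetaGraph algorithm, only the path-matching idea those methods build features on; all entity names, the `entity_type` table, and the `meta_path_instances` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical typed entities and undirected edges, for illustration only.
entity_type = {
    "actor_a": "actor", "actor_b": "actor", "actor_c": "actor",
    "movie_1": "movie", "movie_2": "movie", "movie_3": "movie",
    "director_x": "director", "director_y": "director",
}
edges = [
    ("actor_a", "movie_1"), ("director_x", "movie_1"),
    ("director_x", "movie_2"), ("actor_b", "movie_2"),
    ("actor_c", "movie_3"), ("director_y", "movie_3"),
]

adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)


def meta_path_instances(start, meta_path):
    """Enumerate paths from `start` whose node types follow `meta_path`."""
    if entity_type[start] != meta_path[0]:
        return []
    paths = [[start]]
    for wanted in meta_path[1:]:
        paths = [
            p + [n]
            for p in paths
            for n in sorted(adj[p[-1]])
            if entity_type[n] == wanted
        ]
    return paths


meta_path = ["actor", "movie", "director", "movie", "actor"]
instances = meta_path_instances("actor_a", meta_path)
related_actors = {p[-1] for p in instances} - {"actor_a"}
print(related_actors)
```

Here actor_a and actor_b are linked because they worked with the same director, while actor_c is not reached. The count of such instances between two entities is one example of a feature a path-based method could use, and the need to choose `meta_path` by hand is the design burden described above.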

Knowledge graph feature learning

Knowledge graph feature learning (knowledge graph embedding) learns a low-dimensional vector for each entity and relation in the knowledge graph while preserving the original structure and semantic information of the graph. It is in fact a subfield of network embedding; because a knowledge graph carries specific semantic information, knowledge graph feature learning requires more careful and targeted model design than general network embedding. Generally speaking, knowledge graph feature learning models fall into two categories:

  • Distance-based translational models. This type of model uses a distance-based scoring function to evaluate the plausibility of a triple, treating the tail node as the result of translating the head node by the relation. Representative methods include TransE, TransH, and TransR;

  • Semantic matching models. This type of model uses a similarity-based scoring function to evaluate the plausibility of a triple, mapping entities and relations into a latent semantic space in which similarity is measured. Representative methods include SME, NTN, MLP, and NAM.
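
As a minimal sketch of the translational idea, the TransE scoring function treats a triple as plausible when h + r lies close to t in the embedding space. The two-dimensional vectors below are hand-picked for illustration rather than learned, and the relation name follows the "director" example used earlier; training (for instance with a margin-based ranking loss over corrupted triples) is omitted.

```python
import math


def transe_score(h, r, t):
    """Negative L2 distance between h + r and t (higher means more plausible)."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


# Hand-picked toy embeddings, not the output of any trained model.
entity_emb = {
    "Chen Kaige": [0.1, 0.2],
    "Farewell My Concubine": [0.6, 0.9],
    "Infernal Affairs": [-0.8, 0.3],
}
relation_emb = {"director": [0.5, 0.7]}

true_score = transe_score(
    entity_emb["Chen Kaige"], relation_emb["director"], entity_emb["Farewell My Concubine"]
)
corrupted_score = transe_score(
    entity_emb["Chen Kaige"], relation_emb["director"], entity_emb["Infernal Affairs"]
)
print(true_score > corrupted_score)
```

With these vectors the true triple scores close to zero and the corrupted triple scores noticeably lower. Semantic matching models replace the distance with a learned similarity function but are used in the same way.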

Because knowledge graph feature learning produces a low-dimensional vector for each entity and relation while preserving the structure and semantic information of the original graph, a good set of entity vectors can represent the relationships among entities fully. And because most machine learning algorithms handle low-dimensional vector inputs easily, knowledge graph feature learning makes it straightforward to introduce the knowledge graph into a wide range of recommendation algorithms. In a nutshell, knowledge graph feature learning can:

 

  • Reduce the high dimensionality and heterogeneity of the knowledge graph;

  • Enhance the flexibility of knowledge graph application;

  • Reduce the workload of feature engineering;

  • Reduce the extra computational burden caused by the introduction of knowledge graph.
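
The sketch below shows one very simple way that entity vectors could be plugged into a recommender: the user is represented by the mean embedding of the items in their history, and candidates are ranked by dot product. This is an illustrative assumption, not a method described in this article; the item names and vectors are hypothetical stand-ins for embeddings produced by knowledge graph feature learning.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical entity embeddings for four items.
item_emb = {
    "movie_a": [0.9, 0.1],
    "movie_b": [0.8, 0.2],
    "movie_c": [0.1, 0.9],
    "movie_d": [0.7, 0.3],
}

history = ["movie_a", "movie_b"]                      # items the user interacted with
user_vec = mean_vector([item_emb[m] for m in history])

candidates = [m for m in item_emb if m not in history]
ranked = sorted(candidates, key=lambda m: dot(user_vec, item_emb[m]), reverse=True)
print(ranked)
```

Because the downstream model only sees fixed-length vectors, it needs no knowledge of the graph's node types or relation types, which is the flexibility the list above refers to.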

     

In this article, we introduced recommendation systems, knowledge graphs, and the value of applying knowledge graphs in recommendation systems. Used as auxiliary information for the recommendation algorithm, a knowledge graph can improve the accuracy, diversity, and explainability of the recommendation system. In next week's article, we will detail the various ideas for and implementations of introducing the knowledge graph into the recommendation system, so stay tuned!

 

References

[1] Factorization machines with libFM

[2] Personalized entity recommendation: A heterogeneous information network approach

[3] Meta-graph based recommendation fusion over heterogeneous information networks

[4] Knowledge graph embedding: A survey of approaches and applications


Origin blog.csdn.net/chaishen10000/article/details/102703527