TKDE 2020 | Review: Recommendation System Based on Knowledge Graph

A Survey on Knowledge Graph-Based Recommender Systems

Institute of Computing Technology, Chinese Academy of Sciences, Baidu, Hong Kong University of Science and Technology, University of Science and Technology of China, Microsoft

Qingyu Guo, Fuzhen Zhuang, Chuan Qin, Hengshu Zhu, Xing Xie, Hui Xiong, Qing He. A Survey on Knowledge Graph-Based Recommender Systems. In IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE), 2020. doi: 10.1109/TKDE.2020.3028705.

This article introduces a survey published in TKDE in 2020 by researchers from the Institute of Computing Technology of the Chinese Academy of Sciences, Baidu, the Hong Kong University of Science and Technology, the University of Science and Technology of China, and Microsoft [1]. The survey builds on a Chinese-language review that the team published in "Science in China: Information Science" [2]. It summarizes recent work on knowledge graph-based recommendation systems more comprehensively and expands the introduction and comparative analysis of the core algorithms. We also summarize the knowledge graph data involved, the application scenarios of different knowledge graph-based recommendation systems, and the benchmark datasets available for each scenario. Finally, we discuss future directions for the field, which we hope will offer some guidance for later research.

1. Introduction

With the rapid development of the Internet, we live in an era of information explosion. While enjoying the convenience the Internet brings, we also face information overload: it is difficult to quickly extract the information we need from massive amounts of data. Recommendation systems emerged to address this problem and have been deployed in many scenarios, including music, movies, and shopping, to improve the user experience.

Recently, knowledge graph-based recommendation systems have attracted wide attention from researchers. The basic idea is to introduce the knowledge graph into the recommendation system as side information, which can both improve recommendation accuracy and provide explanations for the recommendation results. On the one hand, a knowledge graph is a heterogeneous directed information network in which nodes represent entities and directed edges represent relations between entities. It contains a large amount of background information about the items in the recommendation system and can express many kinds of relations among items. It can also be combined with the user-item interaction data to uncover hidden connections between users and items, so that user preferences are modeled more accurately and recommendation quality improves. The figure below shows an example of knowledge graph-based recommendation. It contains entities such as users, movies, actors, directors, and genres, as well as the complex relations among them. The movies "Avatar" and "Blood Diamond" are connected to the user Bob through hidden relations in the knowledge graph, which helps the system make accurate recommendations. On the other hand, the knowledge graph makes recommendation results traceable. For example, following the relation sequence in the figure, one reason for recommending "Avatar" to Bob is that "Avatar" and "Interstellar", which Bob has watched, are both science fiction films.

  Figure 1 An example of recommendation based on knowledge graph  
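The traceability described above can be illustrated with a minimal Python sketch. It stores a toy knowledge graph as (head, relation, tail) triples and searches for the shortest relation path between a user and a candidate movie. The triples, the relation names, and the functions `build_adjacency` and `explain` are hypothetical illustrations and are not taken from the survey.

```python
from collections import deque

# Hypothetical toy triples (head, relation, tail); for illustration only.
KG_TRIPLES = [
    ("Bob", "watched", "Interstellar"),
    ("Interstellar", "has_genre", "Science Fiction"),
    ("Avatar", "has_genre", "Science Fiction"),
    ("Bob", "watched", "Inception"),
    ("Inception", "starred_by", "Leonardo DiCaprio"),
    ("Blood Diamond", "starred_by", "Leonardo DiCaprio"),
]


def build_adjacency(triples):
    """Index each triple in both directions so a path may follow an edge backwards."""
    adj = {}
    for head, rel, tail in triples:
        adj.setdefault(head, []).append((rel, tail))
        adj.setdefault(tail, []).append((rel + "^-1", head))
    return adj


def explain(adj, source, target):
    """Breadth-first search; returns the shortest path as a list of (node, relation, next_node)."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for rel, nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None  # no connection between source and target


ADJ = build_adjacency(KG_TRIPLES)
path = explain(ADJ, "Bob", "Avatar")
```

In this toy graph, `path` runs from Bob through "Interstellar" and the "Science Fiction" genre to "Avatar", which is the kind of relation sequence that can be read back to the user as a reason for the recommendation.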

The purpose of this review is to summarize and explain the current state of research on using knowledge graphs for recommendation. It overlaps with earlier work, such as reviews of graph-based recommendation systems and reviews of knowledge graph applications. Compared with that work, our treatment of the methods is more in-depth and provides a more detailed hierarchical taxonomy. We first divide the papers in this field into three categories: embedding-based methods, connection-based methods, and propagation-based methods. Each category is then subdivided according to its characteristics. The second contribution is that we describe how knowledge graphs provide interpretability for recommendation results and summarize the different techniques used. We also classify the work by application scenario and summarize the datasets available for each scenario. Finally, based on our understanding of the field, we propose some directions for future development.

2. Method summary 

We divide the existing work into three categories according to how the knowledge graph is used: embedding-based methods, connection-based methods, and propagation-based methods. Each category is further subdivided, and representative work is introduced for each. The table below summarizes the surveyed papers. It categorizes each method according to our taxonomy and lists, for the convenience of readers, how each work constructs its knowledge graph, which graph embedding method it uses, and which main problem it addresses.

  Table 1. Method summary   

2.1 Embedding-based methods

Embedding-based methods mainly use the rich semantic relations in the knowledge graph to enrich the representations of items and users. Such methods contain two parts: a graph embedding module, which uses a graph embedding method to learn representations of the entities and relations in the knowledge graph, and a recommendation module, which models user preferences for items. According to how these two modules are combined, work in this direction falls into three categories. The first is sequential learning: the graph embedding module is first trained on its own with a graph embedding algorithm, the pre-trained knowledge graph vectors are then introduced into the recommendation system to enrich the semantic representations of users and items, and finally the recommendation module is trained. Representative work includes DKN, KSR, and KTGAN. The second is joint learning, which combines the objective functions of the graph embedding module and the recommendation module to train end to end. Representative work includes CKE and CFKG. The third introduces a multi-task learning framework: the graph embedding module is cast as a separate task related to recommendation, such as knowledge graph completion or edge prediction, and is used to supervise the training of the recommendation module. Related work includes MKR, KTUP, and RCF.
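As a rough illustration of joint learning, the sketch below adds a recommendation loss and a knowledge graph loss into one objective. It assumes a TransE-style translation distance for the graph embedding module and a dot-product scorer for the recommendation module; both are common choices but are assumptions here, not details given in this article. The function names and the weight `kg_weight` are hypothetical.

```python
import math


def transe_score(h, r, t):
    """TransE-style distance ||h + r - t||; smaller means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def kg_margin_loss(pos, neg, margin=1.0):
    """Hinge loss pushing a true triple's distance below a corrupted triple's."""
    return max(0.0, margin + transe_score(*pos) - transe_score(*neg))


def rec_loss(user, item, label):
    """Binary cross-entropy on the sigmoid of the user-item dot product."""
    logit = sum(u * v for u, v in zip(user, item))
    prob = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


def joint_loss(user, item, label, pos, neg, kg_weight=0.5):
    """Joint learning: a single objective summing the recommendation and KG terms."""
    return rec_loss(user, item, label) + kg_weight * kg_margin_loss(pos, neg)


# Hypothetical 2-dimensional embeddings; the item vector doubles as the head entity.
user = [1.0, 0.0]
item = [1.0, 0.0]
relation = [0.0, 1.0]
true_tail = [1.0, 1.0]
corrupt_tail = [-1.0, -1.0]

pos = (item, relation, true_tail)
neg = (item, relation, corrupt_tail)
loss = joint_loss(user, item, 1, pos, neg)
```

Under the same assumptions, sequential learning would first minimise `kg_margin_loss` alone, freeze the resulting entity vectors, and only then minimise `rec_loss`; in the joint variant both terms update the shared item vector at the same time.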

2.2 Connection-based methods

Connection-based methods mainly use the connection patterns between entities in the graph to make recommendations. Most of them combine a knowledge graph containing item attributes with the user-item interaction matrix to build a user-attribute-item graph, and then mine the various connections between users and items. There are two basic ideas in this direction. The first uses the connectivity similarity between entities: basic structural features of the graph, such as meta-paths, are defined, and the similarity between entities under different paths is computed and used as a constraint on the user and item representations. Representative work includes Hete-CF and FMG. The second mines the semantic paths between users and items, learns explicit representations of the connection paths between entities, and introduces them into the recommendation framework to model user-item connections directly. Representative work includes MCRec and RKGE.
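To make the meta-path idea concrete, the following sketch counts the instances of the meta-path user -> item -> attribute <- item between a user and a candidate item, once per attribute type. Raw path counts stand in for the path-based similarity measures used in the literature, which is a simplification. The data, attribute types, and function names are hypothetical.

```python
# Hypothetical toy data; all names are illustrative.
INTERACTIONS = {"Bob": {"Interstellar", "Inception"}}
ITEM_ATTRS = {
    "Interstellar": {("genre", "Science Fiction"), ("director", "Christopher Nolan")},
    "Inception": {
        ("genre", "Science Fiction"),
        ("director", "Christopher Nolan"),
        ("actor", "Leonardo DiCaprio"),
    },
    "Avatar": {("genre", "Science Fiction")},
    "Blood Diamond": {("actor", "Leonardo DiCaprio")},
}


def metapath_count(user, candidate, attr_type):
    """Count instances of the meta-path user -> item -> attribute(attr_type) <- candidate."""
    candidate_attrs = ITEM_ATTRS.get(candidate, set())
    count = 0
    for seen_item in INTERACTIONS.get(user, ()):
        for kind, value in ITEM_ATTRS.get(seen_item, ()):
            if kind == attr_type and (kind, value) in candidate_attrs:
                count += 1
    return count


def metapath_features(user, candidate, attr_types=("genre", "actor", "director")):
    """One similarity feature per meta-path, usable as input to a downstream recommender."""
    return {a: metapath_count(user, candidate, a) for a in attr_types}


avatar_features = metapath_features("Bob", "Avatar")
blood_diamond_features = metapath_features("Bob", "Blood Diamond")
```

Here "Avatar" is reached twice through the genre meta-path and "Blood Diamond" once through the actor meta-path, so each meta-path yields its own user-item similarity signal. The non-zero meta-path also names the reason for the connection, which is where the interpretability of this family of methods comes from.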

2.3 Propagation-based methods

Although both of the above types of methods improve recommendation accuracy, neither uses all the information contained in the graph. Embedding-based methods focus on learning the semantic representations in the knowledge graph, while connection-based methods focus on the connection information between entities. Propagation-based methods combine the two ideas. The basic idea is to propagate the semantic representations of entities along the connection paths between them in the knowledge graph and to model the high-order relations between entities directly, so that the information contained in the knowledge graph is mined more fully. Propagation-based methods follow three approaches. The first starts from the user's historical behavior: it enriches the user representation by aggregating the multi-hop neighbors of the items the user has interacted with, so that the user's historical interests propagate outward through the knowledge graph. Representative work includes RippleNet and AKUPM. The second aggregates the target item with its multi-hop neighbors to update the item representation. During aggregation, the weight of each entity representation is determined jointly by the user and the target entity, which brings the user's preferences into the update of the entity representation. Representative work includes KGCN. The third combines the user-item interaction matrix with a knowledge graph containing attribute information, so that users and items are unified in one graph, and each is aggregated with the representations of its multi-hop neighbors to enrich the user and item representations. Representative work includes KGAT.
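The first two approaches can be sketched in a few lines of Python. `ripple_sets` collects, hop by hop, the triples reachable from the items in a user's history, loosely following the multi-hop expansion used by methods such as RippleNet. `aggregate_neighbors` weights each neighbor by a softmax over user-relation dot products, a simplified form of user-specific aggregation in the spirit of KGCN. The dot-product scoring, the data, and the function names are assumptions made for illustration.

```python
import math

# Hypothetical directed toy triples (head, relation, tail).
TOY_TRIPLES = [
    ("Interstellar", "has_genre", "Science Fiction"),
    ("Science Fiction", "genre_of", "Avatar"),
    ("Inception", "starred_by", "Leonardo DiCaprio"),
    ("Leonardo DiCaprio", "acted_in", "Blood Diamond"),
]


def ripple_sets(triples, seed_items, n_hops):
    """Per hop, the triples whose head lies in the previous hop's set of tail entities."""
    frontier = set(seed_items)
    hops = []
    for _ in range(n_hops):
        hop = [(h, r, t) for h, r, t in triples if h in frontier]
        hops.append(hop)
        frontier = {t for _, _, t in hop}
    return hops


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_neighbors(user_vec, neighbors):
    """neighbors: list of (relation_vec, entity_vec); weights come from user . relation."""
    scores = [sum(u * r for u, r in zip(user_vec, rel)) for rel, _ in neighbors]
    weights = softmax(scores)
    dim = len(neighbors[0][1])
    agg = [sum(w * ent[i] for w, (_, ent) in zip(weights, neighbors)) for i in range(dim)]
    return weights, agg


hops = ripple_sets(TOY_TRIPLES, {"Interstellar"}, 2)
weights, neighborhood = aggregate_neighbors(
    [1.0, 0.0],
    [([1.0, 0.0], [1.0, 1.0]), ([0.0, 1.0], [3.0, 5.0])],
)
```

In the toy call, the first relation is aligned with the user vector, so its neighbor receives the larger weight. Weights of this kind are also what propagation-based methods can expose when explaining a recommendation.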

2.4 Chapter summary

Embedding-based methods use graph embedding algorithms to learn representations of the entities and relations in the knowledge graph and integrate them into the recommendation framework. Their advantage is that they are flexible and easy to put into practice, but they ignore the high-order relations between entities and are often not interpretable. Connection-based methods focus on mining the multiple connections between users and items in the knowledge graph. Their advantage is that the connection patterns often bring interpretability, but they are often unsuitable for scenarios where user-item interaction data is sparse, and decomposing the complex relations between users and items into several connected units loses part of the information. Propagation-based methods rely on a propagation mechanism over the graph and combine the characteristics of embedding-based and connection-based methods to mine the information in the knowledge graph more fully. A significant disadvantage is that training requires more computing resources, so scalability must be considered in large-scale business scenarios.

We also briefly summarize the main techniques that use knowledge graphs to make recommendation results interpretable: 1) applying an attention mechanism to the relation embeddings of the knowledge graph; 2) defining basic structural units such as meta-paths; 3) applying an attention mechanism to the representations of connection paths; 4) using reinforcement learning on a knowledge graph that incorporates user-item interaction information; and 5) extracting the entity aggregation weights in propagation-based methods.

3. Datasets

Knowledge graph-based recommendation systems not only improve recommendation quality and bring interpretability, but can also be combined easily with a variety of recommendation frameworks and applied to many practical scenarios. We divide the surveyed work into seven categories by application scenario: movie recommendation, book recommendation, music recommendation, news recommendation, product recommendation, POI recommendation, and social recommendation. For each scenario we summarize the datasets used and the external knowledge graphs adopted, and we group the works according to how they construct the knowledge graph. We also describe the characteristics of each application scenario and introduce the basic information of the corresponding datasets. For the convenience of readers, the content of this section is summarized in the table below.

  Table 2. Summary of datasets  

4. Future Outlook

We also discuss several directions for future work:

1) Dynamic recommendation: Current knowledge graph-based recommendation algorithms often take a long time to train and are costly, so they are suited to static scenarios in which users' interests remain stable over long periods. In actual business scenarios, however, users' interests often change rapidly. How to update recommendation strategies dynamically from real-time feedback data, so that recommendations stay timely, is one of the future research trends.

2) Multi-task learning: Problems in the knowledge graph itself can also become a bottleneck for the recommendation system. For example, the facts in a knowledge graph are incomplete, so some relations between entities are missing and some of the user's preferences may be overlooked. Tasks related to the knowledge graph, such as knowledge graph completion, can therefore be designed and trained jointly with the recommendation system to improve recommendation quality.

3) Cross-domain recommendation: In actual business scenarios, users often choose products from different domains, such as books and movies. Interaction data from different domains can be combined naturally through a knowledge graph, and recommendation systems often follow similar patterns across scenarios. Techniques such as transfer learning can therefore alleviate data sparsity in the target domain by sharing interaction features from a source domain with relatively rich interaction data, leading to better recommendations in multiple domains.

4) Knowledge-enhanced text representation: In scenarios such as news recommendation, understanding text is essential. Introducing the rich information of an external knowledge graph into language model training yields better text representations. For example, knowledge-enhanced text representation methods such as ERNIE and STCKA can be used in text-based applications such as news recommendation to make more accurate recommendations.

5. Summary

This paper surveys work on knowledge graph-based recommendation systems and systematically summarizes the latest progress in the field. We explain the technical characteristics of the different research methods and propose a taxonomy for them. We also explain how knowledge graphs bring interpretability to recommendation results, introduce the datasets available in different application scenarios, and offer practical suggestions for getting started. Finally, we outline the development trends of this research direction, hoping to promote progress in the field. Knowledge graph-based recommendation is a rapidly growing area: the rich information contained in the knowledge graph can effectively improve recommendation quality and bring interpretability. We hope this article helps readers understand the work in this field.

Paper download link

[1] https://ieeexplore.ieee.org/document/9216015

[2] http://scis.scichina.com/cn/2020/SSI-2019-0274.pdf
