Recommendation systems: CB and CF

First, content-based recommendation (CB, Content-based Recommendations):

    Content-based recommendation (CB) was probably the earliest recommendation method to come into use. It recommends to a user products similar to the ones that user liked in the past (products are collectively referred to here as items). For example, a restaurant recommendation system can recommend barbecue restaurants to a user because that user previously liked many barbecue restaurants. CB was first applied mainly in information retrieval systems, so many information retrieval and information filtering methods can be used in CB.

  The CB process generally consists of the following three steps:

    (1) Item Representation: extract features from each item (i.e., the item's content) to represent that item;

    (2) Profile Learning: use the feature data of the items a user liked (and disliked) in the past to learn that user's preference features (the profile);

    (3) Recommendation Generation: compare the user profile obtained in the previous step with the features of the candidate items, and recommend to the user the set of most relevant items.
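
  The three steps above can be sketched in a few lines of Python. This is only a minimal illustration under assumptions that are not in the original text: items are represented by tag counts, the profile is the sum of liked item vectors minus disliked ones, and relevance is measured with cosine similarity. The restaurant names, tags, and function names are all hypothetical.

```python
import math
from collections import Counter


def item_vector(tags):
    """Step 1 (Item Representation): represent an item by its tag counts."""
    return Counter(tags)


def learn_profile(liked, disliked=()):
    """Step 2 (Profile Learning): add liked item vectors, subtract disliked ones."""
    profile = Counter()
    for vec in liked:
        for feature, value in vec.items():
            profile[feature] += value
    for vec in disliked:
        for feature, value in vec.items():
            profile[feature] -= value
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(feature, 0) for feature, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(profile, candidates, top_n=2):
    """Step 3 (Recommendation Generation): rank candidates against the profile."""
    scored = [(name, cosine(profile, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:top_n]]


# Hypothetical toy data: candidate restaurants described by tags.
restaurants = {
    "Smoky Grill": item_vector(["barbecue", "grill", "meat"]),
    "Green Bowl": item_vector(["salad", "vegan"]),
    "Charcoal House": item_vector(["barbecue", "meat", "beer"]),
    "Noodle Bar": item_vector(["noodles", "soup"]),
}
# Items the user liked in the past (also hypothetical).
liked = [item_vector(["barbecue", "grill"]), item_vector(["barbecue", "meat"])]

profile = learn_profile(liked)
top = recommend(profile, restaurants, top_n=2)
print(top)
```

  In a real system the item representation would more likely be TF-IDF or learned embeddings, and the profile could be any classifier or regressor trained on the user's history; the sketch only shows how the three steps connect.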

Second, collaborative filtering (CF, Collaborative Filtering Recommendations):

  Common collaborative filtering methods fall into three categories: user-based, item-based, and model-based.

  1. User-based collaborative filtering (User CF, User-based Collaborative Filtering):

  User-based collaborative filtering uses statistical techniques to find "neighbors" who share the target user's preferences, and then makes recommendations to the target user based on the behavior of those neighbors. The basic principle is to use the similarity of users' access behavior to recommend resources that each user may be interested in. A typical application computes the "k nearest neighbors" and makes recommendations for the current user based on the historical preference information of those K neighbors.

  UserCF and ItemCF are usually described as above, but that description really covers only the first step. In practice, the problem is generally split into three steps:

  (1) compute the similarity between users;

  (2) fill in the User-Item rating matrix based on the similarity between users and on users' historical behavior;

  (3) make recommendations from the rating matrix (select the highest-scoring items).
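
  A minimal sketch of these three steps for implicit feedback follows. It assumes Jaccard similarity between users and scores an unseen item by summing the similarities of the neighbors who liked it; both choices, as well as the toy data and the function names, are assumptions made for illustration and are not taken from the original post.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of liked items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def user_cf_scores(likes, target, k=2):
    """Score the items the target user has not seen, using the k nearest users."""
    # Step 1: similarity between the target user and every other user.
    sims = {
        user: jaccard(likes[target], items)
        for user, items in likes.items()
        if user != target
    }
    neighbors = sorted(sims, key=sims.get, reverse=True)[:k]

    # Step 2: fill in the target user's row of the User-Item matrix.
    scores = {}
    for user in neighbors:
        for item in likes[user] - likes[target]:
            scores[item] = scores.get(item, 0.0) + sims[user]
    return scores


def recommend(likes, target, k=2, top_n=1):
    """Step 3: pick the highest-scoring unseen items."""
    scores = user_cf_scores(likes, target, k)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical implicit-feedback data: user -> set of liked items.
likes = {
    "A": {"a", "b", "d"},
    "B": {"a", "b", "c"},
    "C": {"b", "e"},
    "D": {"c", "d", "e"},
}

print(user_cf_scores(likes, "A"))
print(recommend(likes, "A"))
```

  With explicit ratings, step 2 would typically weight each neighbor's rating by the similarity instead of adding the bare similarity.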

  For a concrete example, see ItemCF below; UserCF only changes what the symbols mean. For instance, in Equation (1), N(i) and N(j) become the sets of items that users i and j have interacted with.

  Similarity can be computed with the Jaccard formula, cosine similarity, the Pearson correlation coefficient, or Euclidean distance.
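
  For reference, the four measures just mentioned can be written as below. This is a small sketch using only the standard library; the function names are my own, and turning the Euclidean distance into a similarity with 1 / (1 + d) is one common convention assumed here, not something the original text specifies.

```python
import math


def jaccard(a, b):
    """Jaccard similarity of two sets: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(x, y):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
    return dot / norm if norm else 0.0


def pearson(x, y):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([p - mean_x for p in x], [q - mean_y for q in y])


def euclidean_distance(x, y):
    """Euclidean distance (smaller means more similar)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def euclidean_similarity(x, y):
    """Assumed convention: map a distance d to a similarity 1 / (1 + d)."""
    return 1.0 / (1.0 + euclidean_distance(x, y))


# Hypothetical rating vectors of two users over the same three items.
u = [5, 3, 1]
v = [4, 2, 1]
print(cosine(u, v), pearson(u, v), euclidean_similarity(u, v))
print(jaccard({"a", "b", "d"}, {"a", "b", "c"}))
```

  Jaccard suits implicit feedback (sets of interactions), while cosine and Pearson suit rating vectors; Pearson additionally removes each user's average rating level.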

  2. Item-based collaborative filtering (Item CF, Item-based Collaborative Filtering):

  Core idea of the algorithm: recommend to users items similar to the ones they liked before.

  Doesn't this core idea look very similar to content-based recommendation? (What is the difference between ItemCF and CB?)

  They are indeed alike, but also very different. For example, if user A previously bought "Introduction to Data Mining", the algorithm would recommend "Machine Learning" to that user based on this behavior. However, ItemCF does not use the content attributes of items to compute the similarity between them. Instead, it computes item similarity by analyzing users' behavior records, combines that similarity with the user's existing ratings to estimate scores for items the user has not yet interacted with, and then recommends from those. CB, by contrast, uses the content attributes of items to compute the similarity between items, learns the user's profile from training data, and then recommends the items most similar to that profile.

  The item-based collaborative filtering algorithm consists mainly of three steps:

  (1) compute the similarity between items;

  (2) fill in the User-Item rating matrix based on the similarity between items and on users' historical behavior;

  (3) make recommendations from the rating matrix (select the highest-scoring items).

  The details are as follows:

  $$ w_{ij} = \frac{|N(i) \cap N(j)|}{\sqrt{|N(i)|\,|N(j)|}} \qquad (1) $$

  First, use Equation (1) to compute the similarity between items. Here |N(i)| is the number of users who like item i, |N(j)| is the number of users who like item j, and |N(i) ∩ N(j)| is the number of users who like both item i and item j. When items a and b match exactly (that is, they are the same item), N(a) and N(b) are exactly equal and w_ab is 1. In other words, the closer w is to 1, the more similar a and b are. When w is 0, a and b are dissimilar from the point of view of user interactions.
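
  The similarity computation can be sketched as follows, assuming Equation (1) is the common co-occurrence form w_ij = |N(i) ∩ N(j)| / sqrt(|N(i)| |N(j)|), which uses exactly the three quantities defined above. The user data and the function name `item_similarity` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def item_similarity(user_items):
    """Return w[i][j] for every pair of items liked by at least one common user."""
    # Invert the data: N(i) is the set of users who like item i.
    liked_by = defaultdict(set)
    for user, items in user_items.items():
        for item in items:
            liked_by[item].add(user)

    w = defaultdict(dict)
    for i, j in combinations(sorted(liked_by), 2):
        common = len(liked_by[i] & liked_by[j])  # |N(i) ∩ N(j)|
        if common:
            sim = common / math.sqrt(len(liked_by[i]) * len(liked_by[j]))
            w[i][j] = sim
            w[j][i] = sim
    return dict(w)


# Hypothetical implicit-feedback data: user -> set of liked items.
user_items = {
    "A": {"a", "b", "d"},
    "B": {"a", "c"},
    "C": {"b", "e"},
    "D": {"c", "d", "e"},
}

w = item_similarity(user_items)
print(w["a"])
```

  Items a and b are each liked by two users and share one of them, so the sketch gives 1 / sqrt(2 × 2) = 0.5 for that pair; items with no common user, such as a and e, get no entry at all, which corresponds to w = 0.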

              

  In the figure, user A is interested in the three items a, b, and d, and so on.

  The similarity between items a and b is then:

  (2) Next, predict the user's ratings for items the user has not interacted with, using the formula:

  $$ p_{uj} = \sum_{i} w_{ij}\, r_{ui} \qquad (2) $$

  Here p_uj is user u's interest in item j (a possible score), and r_ui is the rating that user u has already given to item i (for explicit feedback this is a score, such as 0-5; for implicit feedback it is 1 if the user interacted with the item and 0 otherwise). The index i may range over the whole item set, or only over the several items most similar to j (selected with a similarity threshold).
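
  The prediction step can be sketched as below, assuming Equation (2) is a similarity-weighted sum p_uj = Σ_i w_ij r_ui, with i restricted to the k items most similar to j among those the user has already rated. The similarity table, the ratings, and the function name are hypothetical, and the unnormalized sum is an assumption; some variants divide by the sum of the weights.

```python
def predict_scores(ratings_u, w, k=2):
    """Predict p_uj for every item j that the user has not rated yet."""
    scores = {}
    for j in w:
        if j in ratings_u:
            continue  # the user already interacted with j
        # Rated items that have a known similarity to j, most similar first.
        rated_neighbors = [i for i in ratings_u if i in w[j]]
        rated_neighbors.sort(key=lambda i: w[j][i], reverse=True)
        score = sum(w[j][i] * ratings_u[i] for i in rated_neighbors[:k])
        if score > 0:
            scores[j] = score
    return scores


# Hypothetical item-item similarities (symmetric) and one user's explicit ratings.
w = {
    "data_mining": {"machine_learning": 0.75, "statistics": 0.5},
    "machine_learning": {"data_mining": 0.75, "statistics": 0.5},
    "statistics": {"data_mining": 0.5, "machine_learning": 0.5, "cooking": 0.125},
    "cooking": {"statistics": 0.125},
}
ratings_u = {"data_mining": 5, "statistics": 3}

scores = predict_scores(ratings_u, w)
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranked)
```

  For implicit feedback, every r_ui in `ratings_u` would simply be 1, so the score of j reduces to the sum of its similarities to the items the user interacted with.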

  Here is an example with books to help understand the ItemCF process:

          

  This is not limited to ItemCF. In UserCF and CB as well, the second step, which computes the scores of candidate items (filling in the User-Item matrix), in fact uses the method shown in the figure.

  3. Model-based collaborative filtering

  ItemCF and UserCF can both be classed as memory-based methods, i.e., they rely on a simple similarity measure (such as cosine similarity or the Pearson correlation coefficient) to match up similar users or items. Given a matrix in which each row is a user and each column is an item, a memory-based method applies a similarity measure to the rows or columns of that matrix to obtain similarity values, and recommends on that basis.

  The counterpart of memory-based CF is model-based collaborative filtering.

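  Model-based collaborative filtering is currently the most mainstream type of collaborative filtering, and its algorithms could fill a book; here we mainly classify and summarize the ideas behind it. The problem is as follows: we have data on m items and n users, and rating data exists only between some of the users and some of the items, while the rest of the ratings are blank. We want to use the existing sparse data to predict the ratings for the blank user-item pairs, and then recommend the highest-rated items to each user.

  Modeling this problem with machine learning, the mainstream approaches include association algorithms, clustering algorithms, classification algorithms, regression algorithms, matrix factorization, neural networks, graph models, and latent factor models. We will not go into the details here.

  Model-based and memory-based collaborative filtering share the same idea: both try to fill in the missing entries of the UI (user-item) matrix and then make recommendations from it.

  Memory-based CF fills in the matrix using simple similarity measures and linear weighted combinations. Model-based CF fills in the UI matrix using various more complex models or methods, for example the latent-vector idea of matrix factorization (latent factor models), SVD, and SVD++.

  The latent factor models and matrix factorization that we often come across are all model-based collaborative filtering algorithms.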

Origin www.cnblogs.com/chen8023miss/p/11224508.html