Collaborative Filtering Algorithm in Recommender System



Overview

Collaborative filtering is a recommendation algorithm. The problem is usually modeled with $m$ users and $n$ items, where ratings exist for only some user-item pairs and all other entries are blank. The task is to use the existing sparse ratings to predict the blank entries, then recommend to each user the items with the highest predicted ratings.

There are generally three types of collaborative filtering:

  • User-based: considers the similarity between users and predicts the target user's rating of an item from the preferences of similar users (this may surprise the user with unexpected items);
  • Item-based: considers the similarity between items and, from the target user's ratings of certain items, predicts their ratings of other highly similar items;
  • Model-based: solves the problem with various machine learning algorithms; this is currently the most mainstream type of collaborative filtering.
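The user-based idea can be illustrated with a minimal sketch. The toy `ratings` dictionary, the choice of cosine similarity, and the helper names `cosine_similarity` and `predict_user_based` are all assumptions made for this example; they are one common way to realize the idea, not the only one.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}. Missing items are unrated.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 3, "C": 5, "D": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
}

def cosine_similarity(u, v):
    """Cosine similarity computed over the items both users have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    norm_u = sqrt(sum(ratings[u][i] ** 2 for i in common))
    norm_v = sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict_user_based(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        sim = cosine_similarity(user, other)
        num += sim * their_ratings[item]
        den += abs(sim)
    return num / den if den else None

# Alice has not rated item "D"; estimate it from Bob and Carol.
prediction = predict_user_based("alice", "D")
```

Here the prediction for Alice falls between Bob's and Carol's ratings of item "D", weighted toward Bob because his ratings are more similar to hers. An item-based variant would swap the roles of users and items.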

Model-Based Collaborative Filtering

[1] Association algorithms: Mine all historical records of items purchased by users to find sets of items that frequently occur together, that is, frequent itemsets

  • Common algorithms include Apriori, FP Tree, and PrefixSpan

[2] Clustering algorithms: Cluster users into different target groups according to some distance measure, or cluster items and recommend items similar to those the user likes

  • Common algorithms include K-Means, BIRCH (a hierarchical clustering method), DBSCAN, and spectral clustering

[3] Classification algorithms: Divide user ratings into several segments and learn them with a classification model

  • Common algorithms include logistic regression, naive Bayes, and support vector machines

[4] Regression algorithms: Predict the user's rating directly, learning it with a regression model

  • Common algorithms include linear regression, regression trees, and support vector regression

[5] Matrix factorization: Decompose the sparse rating matrix into the form $P^\top Q$, which is then used for recommendation

  • Common algorithms include FunkSVD, BiasSVD, SVD++, Factorization Machines, and Tensor Factorization

[6] Graph models: Model the similarity between users within a graph

  • Common algorithms include the SimRank family of algorithms and Markov model algorithms

[7] Neural networks: Use a neural network model to perform the regression task
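To make the frequent itemset idea from [1] concrete, here is a small level-wise search in the spirit of Apriori. The `baskets` data, the `min_support` threshold, and the function name `frequent_itemsets` are hypothetical, and the sketch favors readability over the optimizations a real implementation would use.

```python
from itertools import combinations

# Hypothetical purchase histories, one set of items per user.
baskets = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "butter", "eggs"},
]

def frequent_itemsets(baskets, min_support):
    """Return {itemset: support} for itemsets found in >= min_support baskets."""
    def support(itemset):
        return sum(1 for b in baskets if itemset <= b)

    items = {item for b in baskets for item in b}
    # Level 1: frequent single items.
    level = {frozenset([i]) for i in items
             if support(frozenset([i])) >= min_support}
    result = {}
    size = 1
    while level:
        for itemset in level:
            result[itemset] = support(itemset)
        size += 1
        # Join step: merge frequent sets into candidates one item larger.
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: keep a candidate only if all of its (size - 1)-subsets
        # are frequent and the candidate itself meets the support threshold.
        level = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, size - 1))
            and support(c) >= min_support
        }
    return result

frequent = frequent_itemsets(baskets, min_support=3)
```

With this toy data, pairs such as milk and bread that co-occur in at least three baskets are kept, while milk and butter (two baskets) are dropped. A recommender could then suggest the remaining item of a frequent itemset to a user who already bought the others.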


Collaborative Filtering Based on Matrix Factorization

Taking the FunkSVD algorithm as an example, the rating matrix $M$ is expected to decompose as follows:
$$M_{m \times n}=P_{k \times m}^\top Q_{k \times n},$$

where $m_{ij}$ denotes the rating of the $i$-th user for the $j$-th item, and $p_i$ and $q_j$ are the $i$-th column of $P$ and the $j$-th column of $Q$. Once the matrices $P$ and $Q$ are obtained, any blank position $m_{ij}$ of $M$ can be calculated as $p_i^\top q_j$. $P$ and $Q$ can then be obtained by solving the following optimization problem:
$$\mathop{\arg \min }\limits_{P,Q} \sum_{i, j}\left(m_{i j}-p_i^\top q_j\right)^2+\lambda\left(\left\|p_i\right\|_2^2+\left\|q_j\right\|_2^2\right),$$

where $\lambda$ is the regularization coefficient and the sum runs over the $(i, j)$ pairs with observed ratings. The above optimization problem can be solved by gradient descent. Many improved algorithms were later built on FunkSVD, such as BiasSVD and SVD++. Their overall decomposition form is largely the same, with slightly different optimization objectives, so this article does not cover them in detail.
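As a rough illustration of solving this problem by gradient descent, the sketch below uses the stochastic variant, updating $p_i$ and $q_j$ after each observed rating. The function names, the toy `observed` ratings, and the hyperparameters (`k`, `lam`, `lr`, `epochs`) are assumptions chosen for the example, not values from the article.

```python
import random

def funk_svd(ratings, num_users, num_items, k=2, lam=0.02, lr=0.01,
             epochs=500, seed=0):
    """Fit factor vectors with stochastic gradient descent.

    ratings: list of (i, j, m_ij) triples for the observed entries only.
    Returns P and Q, where P[i] plays the role of p_i and Q[j] of q_j.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 1.0) for _ in range(k)] for _ in range(num_users)]
    Q = [[rng.uniform(0.0, 1.0) for _ in range(k)] for _ in range(num_items)]
    for _ in range(epochs):
        for i, j, m_ij in ratings:
            err = m_ij - sum(p * q for p, q in zip(P[i], Q[j]))
            for f in range(k):
                p_f, q_f = P[i][f], Q[j][f]
                # Negative gradient of the regularized squared error;
                # the constant factor 2 is absorbed into the learning rate.
                P[i][f] += lr * (err * q_f - lam * p_f)
                Q[j][f] += lr * (err * p_f - lam * q_f)
    return P, Q

def predict(P, Q, i, j):
    """Predicted rating p_i^T q_j."""
    return sum(p * q for p, q in zip(P[i], Q[j]))

def squared_error(ratings, P, Q):
    """Sum of squared errors over the observed ratings."""
    return sum((m_ij - predict(P, Q, i, j)) ** 2 for i, j, m_ij in ratings)

# Hypothetical 4 x 4 rating matrix given as (user, item, rating) triples.
observed = [
    (0, 0, 5), (0, 1, 3), (0, 3, 1),
    (1, 0, 4), (1, 3, 1),
    (2, 0, 1), (2, 1, 1), (2, 3, 5),
    (3, 0, 1), (3, 2, 5), (3, 3, 4),
]

P0, Q0 = funk_svd(observed, 4, 4, epochs=0)   # random initialization only
P, Q = funk_svd(observed, 4, 4)               # trained factors
error_before = squared_error(observed, P0, Q0)
error_after = squared_error(observed, P, Q)
filled = predict(P, Q, 0, 2)                  # user 0 never rated item 2
```

Comparing `error_before` with `error_after` shows how far training reduces the error on the observed ratings, and `filled` is the estimate for a blank entry of $M$. In practice the hyperparameters would be tuned on held-out ratings.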



Origin blog.csdn.net/qq_41552508/article/details/129145541