Matrix Factorization Model (MF)

Collaborative filtering algorithms handle sparse matrices relatively poorly. To improve generalization, the matrix factorization (MF) model, also known as the latent factor model, was derived from collaborative filtering.

The latent semantic model was first proposed in the text-mining field to uncover the hidden semantics of documents. It was applied to recommendation in 2006; the core idea is to connect user interests and items through hidden features, discovering potential topics and categories from user behavior.
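As a toy illustration of the hidden-feature idea (the user names, item names, and factor values below are made up for illustration only), each user and each item gets a small latent vector, and their dot product gives a predicted preference:

```python
# Hypothetical latent vectors (F = 2), invented for illustration
P = {"alice": [0.9, 0.1], "bob": [0.2, 0.8]}       # user -> latent factors
Q = {"sci-fi": [1.0, 0.0], "romance": [0.0, 1.0]}  # item -> latent factors

def score(user, item):
    # predicted preference = dot product of the two latent vectors
    return sum(pu * qi for pu, qi in zip(P[user], Q[item]))

print(score("alice", "sci-fi"))   # alice leans toward the sci-fi "topic"
print(score("bob", "romance"))    # bob leans toward the romance "topic"
```

The factorization step described next is about learning such vectors from ratings rather than writing them down by hand.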

Classical ways to solve a matrix factorization: Eigenvalue Decomposition (EVD) or Singular Value Decomposition (SVD).
Simon Funk published a matrix factorization algorithm called Funk-SVD, which was later named the Latent Factor Model (LFM) by Koren of the Netflix Prize-winning team. The idea of Funk-SVD is very simple: convert the problem of solving for the two matrices into an optimization problem, and learn the user matrix and the item matrix by minimizing the error on the observed ratings in the training set.
The objective is

    min over P, Q of  sum over observed (u,i) of  ( r_ui - p_u . q_i )^2  +  lambda * ( ||p_u||^2 + ||q_i||^2 )

where r_ui is the observed rating, p_u and q_i are the user and item latent vectors, and lambda is the regularization coefficient.

Taking partial derivatives gives the same gradients ordinary least squares would use, but instead of solving for the coefficients directly, the new parameter values are obtained by stepping along the gradient with a learning rate alpha. With prediction error e_ui = r_ui - p_u . q_i, the updates are

    p_uk <- p_uk + alpha * ( e_ui * q_ik - lambda * p_uk )
    q_ik <- q_ik + alpha * ( e_ui * p_uk - lambda * q_ik )

Here the value on the right replaces the value on the left; strictly, the mathematical formula would carry an iteration subscript increased by 1 on the left-hand side.

Consider adding some influencing factors (bias terms) to improve the preference function, giving

    rhat_ui = mu + b_u + b_i + p_u . q_i

where mu is the global mean rating, b_u the user bias, and b_i the item bias. The biases are updated the same way:

    b_u <- b_u + alpha * ( e_ui - lambda * b_u )
    b_i <- b_i + alpha * ( e_ui - lambda * b_i )
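As a sanity check on the update rules, here is one hand-worked SGD step with made-up numbers (a single latent factor, alpha = 0.1, lambda = 0.1); after the update, the prediction moves toward the observed rating:

```python
import math

# One SGD step with invented numbers: F = 1, alpha = 0.1, lmbda = 0.1
alpha, lmbda = 0.1, 0.1
mu, bu, bi = 3.5, 0.0, 0.0   # global mean and (initially zero) biases
p, q = 0.5, 0.4              # one latent factor for a single user/item pair
r_ui = 5.0                   # observed rating

rhat = mu + bu + bi + p * q  # predicted rating: 3.7
e = r_ui - rhat              # prediction error: 1.3

# Bias and latent-factor updates (old values on the right-hand side)
bu_new = bu + alpha * (e - lmbda * bu)    # 0.13
bi_new = bi + alpha * (e - lmbda * bi)    # 0.13
p_new = p + alpha * (e * q - lmbda * p)   # 0.5 + 0.1 * (0.52 - 0.05) = 0.547
q_new = q + alpha * (e * p - lmbda * q)   # 0.4 + 0.1 * (0.65 - 0.04) = 0.461

rhat_new = mu + bu_new + bi_new + p_new * q_new
print(rhat, rhat_new)  # the new prediction is closer to the rating of 5.0
```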
Programming implementation:

import math
import random

class SVD():
    def __init__(self, rating_data, F=5, alpha=0.1, lmbda=0.1, max_iter=100):
        self.F = F                      # dimension of the latent vectors
        self.P = dict()                 # user matrix P, size [users_num, F]
        self.Q = dict()                 # item matrix Q, size [items_num, F]
        self.bu = dict()                # user bias
        self.bi = dict()                # item bias
        self.mu = 0.0                   # global mean rating
        self.alpha = alpha              # learning rate
        self.lmbda = lmbda              # regularization coefficient
        self.max_iter = max_iter        # maximum number of iterations
        self.rating_data = rating_data  # rating data (dict of dicts)

There are many ways to initialize the matrices P and Q. Generally they are filled with random numbers, but the scale of those numbers matters: empirically, they should be proportional to 1/sqrt(F). Continuing inside __init__:
        cnt = 0  # count the ratings so mu can be initialized with the global mean
        for user, items in self.rating_data.items():
            self.P[user] = [random.random() / math.sqrt(self.F) for x in range(0, F)]
            self.bu[user] = 0
            cnt += len(items)
            for item, rating in items.items():
                self.mu += rating
                if item not in self.Q:
                    self.Q[item] = [random.random() / math.sqrt(self.F) for x in range(0, F)]
                    self.bi[item] = 0
        self.mu /= cnt

With the matrices initialized, training can proceed; the parameters P and Q are trained using stochastic gradient descent.
    def train(self):
        for step in range(self.max_iter):
            for user, items in self.rating_data.items():
                for item, rui in items.items():
                    rhat_ui = self.predict(user, item)  # get the predicted rating
                    e_ui = rui - rhat_ui                # calculate the error
                    # update the bias terms
                    self.bu[user] += self.alpha * (e_ui - self.lmbda * self.bu[user])
                    self.bi[item] += self.alpha * (e_ui - self.lmbda * self.bi[item])
                    # stochastic gradient descent update of P and Q
                    for k in range(0, self.F):
                        self.P[user][k] += self.alpha * (e_ui * self.Q[item][k] - self.lmbda * self.P[user][k])
                        self.Q[item][k] += self.alpha * (e_ui * self.P[user][k] - self.lmbda * self.Q[item][k])
            self.alpha *= 0.1  # the step size should be gradually reduced each iteration

    # Predict the user's rating of the item (the vectorized form is not used here)
    def predict(self, user, item):
        return sum(self.P[user][f] * self.Q[item][f] for f in range(0, self.F)) + \
               self.bu[user] + self.bi[item] + self.mu

In many cases the rating matrix is very sparse. Loading it with pandas would produce many NaN values, which are awkward to handle, so a dictionary is used to store the data.
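To see the whole procedure end to end, here is a condensed, self-contained sketch of the same training loop on a made-up two-user dataset in the dict-of-dicts format described above (user/item names, ratings, and the 20-epoch budget are all invented for illustration; the per-epoch learning-rate decay is omitted for brevity):

```python
import math
import random

random.seed(42)

# Toy ratings in the same dict-of-dicts format (made-up data)
rating_data = {
    "u1": {"i1": 5.0, "i2": 3.0},
    "u2": {"i1": 4.0, "i3": 1.0},
}

F, alpha, lmbda = 3, 0.1, 0.1
P = {u: [random.random() / math.sqrt(F) for _ in range(F)] for u in rating_data}
item_ids = {i for d in rating_data.values() for i in d}
Q = {i: [random.random() / math.sqrt(F) for _ in range(F)] for i in item_ids}
bu = {u: 0.0 for u in rating_data}
bi = {i: 0.0 for i in item_ids}
all_ratings = [r for d in rating_data.values() for r in d.values()]
mu = sum(all_ratings) / len(all_ratings)  # global mean rating

def predict(u, i):
    return mu + bu[u] + bi[i] + sum(P[u][f] * Q[i][f] for f in range(F))

def rmse():
    se = [(r - predict(u, i)) ** 2 for u, d in rating_data.items() for i, r in d.items()]
    return math.sqrt(sum(se) / len(se))

before = rmse()
for step in range(20):
    for u, d in rating_data.items():
        for i, r in d.items():
            e = r - predict(u, i)
            bu[u] += alpha * (e - lmbda * bu[u])
            bi[i] += alpha * (e - lmbda * bi[i])
            for k in range(F):
                P[u][k] += alpha * (e * Q[i][k] - lmbda * P[u][k])
                Q[i][k] += alpha * (e * P[u][k] - lmbda * Q[i][k])
print(before, rmse())  # training error should drop after the SGD epochs
```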


Origin: blog.csdn.net/m0_49978528/article/details/109276356