Andrew Ng machine learning (16) - recommendation system

Foreword

Currently in our lives with Internet products will be recommended will be involved in the system, such as recommending the system will record the user's preferences when browsing the merchandise when shopping Taobao, and then recommend the same type of goods or feel that you are interested in you; according to browse the news your browsing history with the type of content recommendation content of news for you, this is the recommended systems have the common sense.

A content-based recommendation system

1, description of the problem
if we are a film supplier, we have five movies and four users, we require users to film scoring. The first three films are love stories, the latter two are action films, we can see that Alice and Bob seem more inclined to romance, while Carol and Dave seem more inclined to action films. And a user does not have to play all the movies too. We want to build an algorithm to predict how many points each one of them could have seen the movie they did not fight, and as a recommended basis.
Introducing some of the following markers:
the number of nu represents the user
nm represents the number of film
r (i, j) if the user j to the film i rated by the r (i, j) =. 1
Y ^ (i, j) ^ behalf of a user j to the i movie score of
the total number mj j assessment on behalf of the user too much of the movie
Here Insert Picture Description

2, recommendation system algorithm
as shown here we introduce two variables x1 represents the degree of romantic movies (romance), x2 represents the degree of motion movie (action movie)
Here Insert Picture Description

Now we want to build on these features to a recommendation system algorithm. Suppose we adopt a linear regression model, we have trained a linear regression model for each user, such as θ (1) is the first user parameter model. So, first we have the following formula (plus regularization parameter) Of course, this formula is the model for the first user, and if we expect all users of the model, then we have the following second formula. Thus, according to the second equation we are solving the partial derivatives of θ (n) in our model can be updated using a gradient descent method.
Here Insert Picture Description

Second, collaborative filtering

The above model is that we know the type of movie gives characteristic values x1 and x2, if we come to the contrary do not know the value of these two parameters but know that each user's permissions value θ (i) it?
Here Insert Picture Description
We can also use our cost function to solve the model according to the user's θ (i), a method known as collaborative filtering we
collaborative filtering algorithm using the following steps:

Initial x (1), x (2 ),... X (m), θ (1), θ (2),... Θ (m)) for a number of random small value
using a gradient descent algorithm to minimize the cost function
after completion of the training algorithm, we predict (θ (j)) Tx ( i) for the score to the film i, j user
Here Insert Picture Description

Third, the low-rank matrix factorization and mean normalization

1, low-rank matrix factorization
we have five movies, and four user, then the Y matrix is a matrix of five rows and four columns, user data will score the films are present in the matrix can be introduced score on the right matrix.
Here Insert Picture Description
When users watch a movie i, and if you are looking for 5 with the film is very similar to the movie, in order to give users the recommended five new movies, you need to do is find out the movie j, and in these different movies we are looking for from the movie i minimum, so you can give your users recommend several different movies.
Here Insert Picture Description

2, mean normalization

Mean normalization (mean Normalization) before we return time learning algorithms school about this model, due to the difference between different characteristic value is too large, we often need to be scaled eigenvalues, one of which is to mean normalization, formula is: (x - u) / ( max - min)
we can solve for the following unknown variables in this way, the results of matrix Y mean normalization process, each user a portion of a film subtracting the average score of all users of the movie score, for the unknown variable, our new model would think she gave each movie score is the average score of the movie.
Here Insert Picture Description
References "Andrew Ng machine learning" recommendation system 16
16. The recommendation system Recommender System
Andrew Ng machine learning notes (16) - Recommendation System
Andrew Ng machine learning - Recommendation System

Published 80 original articles · won praise 140 · views 640 000 +

Guess you like

Origin blog.csdn.net/linjpg/article/details/104401361