Week 9 (Recommender Systems)

Andrew Ng Machine Learning notes --- by Orangestar


1. Problem Formulation

This section is only a brief introduction to a few applications and examples of recommender systems. It can be skimmed; you just need to know how rated and unrated entries are represented.

2. Content Based Recommendations

This lesson uses movie ratings as the running example: we want to predict the ratings a user has not yet given, based on the ratings that users have already provided.

In this lesson we learn about "content-based recommendations".

We first use x_1 and x_2 to represent the degree to which a movie is a romance movie or an action movie; these are its features.

Then each movie can be represented by a feature vector, and we can predict each viewer's rating as a stand-alone linear regression problem. Specifically, for each user j we learn a parameter vector \(\theta^{(j)}\), here a three-dimensional vector; in general it is an (n + 1)-dimensional vector.
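A minimal sketch of that prediction (the feature and parameter values below are invented, not from the notes): with a movie feature vector x = [1, x_1, x_2] (intercept plus romance/action degrees) and a learned user vector \(\theta^{(j)}\), the predicted rating is just the inner product \(\theta^{(j)\top} x\):

```python
import numpy as np

# Hypothetical movie features: [intercept, romance degree, action degree]
x = np.array([1.0, 0.9, 0.1])          # mostly a romance movie

# Hypothetical learned parameters for user j, who strongly likes romance
theta_j = np.array([0.0, 5.0, 0.0])

# Predicted rating for this (user, movie) pair: theta^(j)^T x
predicted_rating = theta_j @ x
print(predicted_rating)                # 4.5 (out of 5 stars)
```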

As the picture shows:

To sum up:

So, the question is: how do we compute \(\theta\)?

Computing \(\theta\) is essentially plain linear regression (least squares).

The constant factor in front can be dropped here.

To repeat:

So we can use the same optimization method as in linear regression: gradient descent. :)

The only difference from linear regression: there is no \(1/m\) term!
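The optimization described above can be sketched in code (a hedged NumPy sketch with made-up data; the notes themselves contain no code). The cost is the regularized squared error over rated entries only, with no \(1/m\) factor, and the intercept column \(\theta_0\) is not regularized:

```python
import numpy as np

def content_cost_grad(Theta, X, Y, R, lam):
    """Cost and Theta-gradient for content-based recommendations.

    X:     (num_movies, n+1) movie features, column 0 is the intercept
    Theta: (num_users,  n+1) per-user linear-regression parameters
    Y:     (num_movies, num_users) ratings
    R:     (num_movies, num_users) 1 where a rating exists, else 0
    Note the absence of a 1/m factor, as in the lecture.
    """
    err = (X @ Theta.T - Y) * R                      # ignore unrated entries
    cost = 0.5 * np.sum(err ** 2) + 0.5 * lam * np.sum(Theta[:, 1:] ** 2)
    grad = err.T @ X                                 # shape: (num_users, n+1)
    grad[:, 1:] += lam * Theta[:, 1:]                # do not regularize theta_0
    return cost, grad

# Tiny synthetic problem: 4 movies, 3 users, 2 features (+ intercept)
rng = np.random.default_rng(0)
X = np.hstack([np.ones((4, 1)), rng.random((4, 2))])
Theta = rng.random((3, 3))
Y = rng.integers(0, 6, size=(4, 3)).astype(float)
R = (rng.random((4, 3)) > 0.3).astype(float)

costs = []
for _ in range(500):                                 # plain gradient descent
    cost, grad = content_cost_grad(Theta, X, Y, R, lam=0.1)
    costs.append(cost)
    Theta -= 0.01 * grad
```

Running the loop, the cost decreases steadily; each user's \(\theta^{(j)}\) is effectively fit by its own regularized linear regression.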

Well, that is the content-based recommender system.
Next, we look at recommender systems for the case where we do not know the content of the items.

Note:
From here on, these are excerpts from someone else's notes:

3. Collaborative Filtering

This is essentially similarity-based recommendation. For example, after you buy a book, the site shows: users who bought this book also bought these other books.
Our current situation:

Then, from the users' ratings of different genres of movies, we obtain:

Let's write it more formally.
For a single movie:

And, of course, when we handle all of them at once:

At that point, the gradient descent rule we need is:

Summary: this derives x from \(\theta\),
whereas previously we derived \(\theta\) from x.
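The reverse direction can be sketched the same way (a hedged NumPy sketch with invented shapes and data; for simplicity it drops the intercept column, as the combined algorithm in the next section does): given fixed user parameters Theta, each movie's feature vector x is found by its own regularized least-squares problem:

```python
import numpy as np

def features_cost_grad(X, Theta, Y, R, lam):
    """Given fixed user parameters Theta, cost and gradient w.r.t. X.

    Symmetric to learning Theta from X: for each movie i we minimize
    the sum over users j with r(i,j)=1 of (theta^(j)^T x^(i) - y(i,j))^2,
    plus L2 regularization on x^(i).
    """
    err = (X @ Theta.T - Y) * R
    cost = 0.5 * np.sum(err ** 2) + 0.5 * lam * np.sum(X ** 2)
    grad = err @ Theta + lam * X        # shape: (num_movies, n)
    return cost, grad

# Made-up problem: 5 movies, 4 users, 3 features
rng = np.random.default_rng(1)
Theta = rng.random((4, 3))              # pretend these are already known
Y = rng.integers(1, 6, size=(5, 4)).astype(float)
R = (rng.random((5, 4)) > 0.3).astype(float)
X = rng.random((5, 3))

costs = []
for _ in range(500):                    # gradient descent on X only
    cost, grad = features_cost_grad(X, Theta, Y, R, lam=0.1)
    costs.append(cost)
    X -= 0.01 * grad
```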

4. Collaborative Filtering Algorithm (an improved collaborative filtering algorithm)

In the previous section, we introduced two algorithms:

So, how do we compute \(\theta\) and x simultaneously?

We can see that these two terms are essentially the same:

So, to compute \(\theta\) and x simultaneously, we can optimize this function:

Previously it was a chicken-and-egg problem; now both are learned together.
Note, however, that in the new algorithm
we drop the convention x_0 = 1, because
x and \(\theta\) are now n-dimensional.

Summary:

Note: when using the new algorithm, neither parameter has been computed or obtained at the start, so both must be randomly initialized!!!
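The combined algorithm can be sketched as follows (hedged: synthetic data, and plain gradient descent in place of a fancier optimizer). Both X and Theta appear in one cost function, neither has an intercept column, and both are randomly initialized:

```python
import numpy as np

def cofi_cost_grad(X, Theta, Y, R, lam):
    """Joint collaborative-filtering cost and gradients (no x_0 = 1)."""
    err = (X @ Theta.T - Y) * R
    cost = (0.5 * np.sum(err ** 2)
            + 0.5 * lam * (np.sum(X ** 2) + np.sum(Theta ** 2)))
    X_grad = err @ Theta + lam * X          # gradient w.r.t. movie features
    Theta_grad = err.T @ X + lam * Theta    # gradient w.r.t. user parameters
    return cost, X_grad, Theta_grad

# Random initialization: neither X nor Theta is known at the start
rng = np.random.default_rng(42)
num_movies, num_users, n = 6, 4, 3
Y = rng.integers(1, 6, size=(num_movies, num_users)).astype(float)
R = (rng.random((num_movies, num_users)) > 0.3).astype(float)
X = rng.standard_normal((num_movies, n)) * 0.1
Theta = rng.standard_normal((num_users, n)) * 0.1

costs = []
for _ in range(1000):                       # descend on both at once
    cost, X_grad, Theta_grad = cofi_cost_grad(X, Theta, Y, R, lam=0.1)
    costs.append(cost)
    X -= 0.02 * X_grad
    Theta -= 0.02 * Theta_grad
```

The random initialization also breaks symmetry, so the learned features for different movies end up different from one another.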

5. Vectorization: Low Rank Matrix Factorization (vectorizing the algorithm, with an example)

First, an example:

So at this point Y contains all of these ratings.

Then, as shown in the figure, this matrix can be factorized:

This method is also called low-rank matrix factorization.

How do we apply it?
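Both ideas can be sketched together (hedged: the factor values below are invented): all predictions at once are the single matrix product X Theta^T, and to find movies related to movie i we look for other movies j with a small distance \(\|x^{(i)} - x^{(j)}\|\) between their learned feature vectors:

```python
import numpy as np

# Invented learned factors: Y is approximated by X @ Theta.T,
# which is why this is called low-rank matrix factorization.
X = np.array([[1.0, 0.1],      # movie features (e.g. romance, action)
              [0.9, 0.0],
              [0.1, 1.0]])
Theta = np.array([[5.0, 0.0],  # user parameters
                  [0.0, 5.0]])

predictions = X @ Theta.T      # every (movie, user) prediction at once
print(predictions.shape)       # (3, 2): movies x users

# Application: movies with similar features count as "related".
# Find the movie closest to movie 0 by ||x^(i) - x^(j)||.
dists = np.linalg.norm(X - X[0], axis=1)
dists[0] = np.inf              # exclude the movie itself
print(np.argmin(dists))        # 1: movie 1 is the nearest neighbour
```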

6. Implementational Detail: Mean Normalization


Example: suppose one user has not rated any movie at all.
In that case we need mean normalization.

Then, apply mean normalization:

After mean normalization, we operate on this matrix just as before, running the collaborative filtering algorithm,
as shown in the figure:
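The mean-normalization step can be sketched as follows (hedged: the small ratings matrix is invented). Compute each movie's mean over its observed ratings only, subtract it from the rated entries, learn on the normalized matrix, and add \(\mu_i\) back at prediction time; a user with no ratings is then predicted each movie's mean rather than zero:

```python
import numpy as np

# Invented ratings matrix; the last user has rated nothing
Y = np.array([[5.0, 4.0, 0.0],
              [4.0, 2.0, 0.0],
              [0.0, 1.0, 0.0]])
R = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 1, 0]])

# Per-movie mean over *observed* ratings only
mu = np.array([Y[i, R[i] == 1].mean() if R[i].sum() > 0 else 0.0
               for i in range(Y.shape[0])])           # [4.5, 3.0, 1.0]

# Subtract the mean from rated entries; unrated entries stay 0
Ynorm = (Y - mu[:, None]) * R

# Run collaborative filtering on Ynorm; at prediction time use
#   prediction(i, j) = theta^(j)^T x^(i) + mu[i]
# For the new user, theta^(j) shrinks toward 0, so the prediction
# is simply mu[i], the movie's average rating.
```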



Finally, I feel I did not learn this week's material very well, especially the recommendation-algorithm sections; the concepts are still fuzzy, probably because I had not fully understood the earlier linear regression material.

I must get this clear when doing the programming assignment!!!


Origin www.cnblogs.com/orangestar/p/11263356.html