Andrew Ng, Coursera Machine Learning Quiz, Chapter 16: Recommender Systems

2. In which of the following situations will a collaborative filtering system be the most appropriate learning algorithm (compared to linear or logistic regression)?

You've written a piece of software that has downloaded news articles from many news websites. In your system, you also keep track of which articles you personally like vs. dislike, and the system also stores away features of these articles (e.g., word counts, name of author). Using this information, you want to build a system to try to find additional new articles that you personally will like.

You manage an online bookstore and you have the book ratings from many users. For each user, you want to recommend other books she will enjoy, based on her own ratings and the ratings of other users.

You manage an online bookstore and you have the book ratings from many users. You want to learn to predict the expected sales volume (number of books sold) as a function of the average rating of a book.

You run an online news aggregator, and for every user, you know some subset of articles that the user likes and some different subset that the user dislikes. You'd want to use this to find other articles that the user likes.

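Answer: B and D. Collaborative filtering typically involves a large number of features and a large amount of data (ratings from many users).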

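A: You've written software that downloads news articles from many websites. The system tracks which articles you personally like or dislike and stores features of those articles (e.g., word counts, author name). You want to use this information to find additional new articles that you personally will like.

I got this one wrong at first because I thought it was a correct choice. I don't fully understand it yet and will update this post once I do. Readers with ideas are welcome to leave a comment, thanks.

Option A focuses on a single person, so a classification algorithm is a better fit: there is only one user, and the article features are already given, so there are no other users' ratings to learn from. Recommender algorithms are most useful for companies with many users, such as Amazon, which is mentioned in the lecture videos.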

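B: You manage an online bookstore and have book ratings from many users. For each user, you want to recommend other books she will enjoy, based on her own ratings and the ratings of other users. This is like the movie recommendation example from the lectures, so it is a recommender system.

C: You manage an online bookstore and have book ratings from many users. You want to predict the expected sales volume (number of books sold) from a book's average rating. This prediction is clearly better handled by linear regression or a similar algorithm. Collaborative filtering is mainly used in recommender systems.

D: You run an online news aggregator, and for every user you know some subset of articles that the user likes and a different subset that the user dislikes. You want to use this to find other articles that the user likes. This recommender system is similar to B.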

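You own a clothing store that sells many styles and brands of jeans. You have collected reviews of the different styles and brands from frequent shoppers, and you want to use these reviews to offer those shoppers discounts on the jeans they are most likely to buy. This is a recommendation problem.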

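You're an artist who hand-paints portraits for clients. Each client gets a different portrait (of themselves) and gives you 1-5 star rating feedback, and each client buys at most one portrait. You'd like to predict the rating your next customer will give you. This looks similar to the movie-rating setup at first, but every portrait is rated by exactly one client and no client rates more than one portrait, so there are no overlapping ratings for collaborative filtering to learn from.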

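You run an online bookstore and collect ratings from many users. You want to use them to identify which books are "similar" to each other (e.g., if a user likes a certain book, what other books might she also like?). This should also count as a recommendation-type problem.

You manage an online bookstore and have book ratings from many users. You want to learn to predict the expected sales volume (number of books sold) as a function of a book's average rating. Linear regression or a similar algorithm should be the better choice here.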
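For the "similar books" scenario, one common approach is to compare the feature vectors that collaborative filtering learns for each book and treat a small distance as similarity. The sketch below assumes such vectors already exist. The titles, the matrix `X`, and the helper `most_similar` are hypothetical and made up purely for illustration.

```python
import numpy as np

# Hypothetical learned feature vectors, one row per book (values are made up).
book_titles = ["Book A", "Book B", "Book C", "Book D"]
X = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.7],
])

def most_similar(X, i, k=1):
    """Return indices of the k books closest to book i in feature space."""
    dists = np.linalg.norm(X - X[i], axis=1)  # Euclidean distance to every book
    dists[i] = np.inf                         # exclude the book itself
    return np.argsort(dists)[:k]

nearest = most_similar(X, 0, k=1)
print(book_titles[nearest[0]])
```

With these made-up vectors, "Book B" is the closest neighbour of "Book A" because their feature vectors nearly coincide.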

3. You run a movie empire, and want to build a movie recommendation system based on collaborative filtering. There were three popular review websites (which we'll call A, B and C) that users go to in order to rate movies, and you have just acquired all three companies that run these websites. You'd like to merge the three companies' data sets together to build a single/unified system. On website A, users rank a movie as having 1 through 5 stars. On website B, users rank on a scale of 1-10, and decimal values (e.g., 7.5) are allowed. On website C, the ratings are from 1 to 100. You also have enough information to identify users/movies on one website with users/movies on a different website. Which of the following statements is true?

It is not possible to combine these websites' data. You must build three separate recommendation systems.

You can combine all three training sets into one without any modification and expect high performance from a recommendation system.

You can merge the three datasets into one, but you should first normalize each dataset's ratings (say, rescale each dataset's ratings to a 1-100 range).

Assuming that there is at least one movie/user in one database that doesn't also appear in a second database, there is no sound way to merge the datasets, because of the missing data.

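In short, ratings on three different scales (1-5 stars on A, 1-10 with decimals such as 7.5 on B, and 1-100 on C) are to be merged into one collaborative filtering system, and users and movies can be matched across the sites. The options, labeled for reference:

A: It is not possible to combine these websites' data; you must build three separate systems.

B: You can combine all three training sets into one without any modification and expect high performance from the recommendation system.

C: You can merge the three datasets into one, but you should first normalize each dataset's ratings (say, rescale each dataset's ratings to a 1-100 range).

D: If at least one movie/user in one database does not also appear in a second database, there is no sound way to merge the datasets because of the missing data.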

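Answer: C. The ratings on sites A, B and C are on different scales, so their means all differ. To pool the three datasets into one, each dataset's ratings must first be rescaled, much like the feature scaling discussed earlier for linear and logistic regression. The movie ratings in the lectures needed no such scaling because all of them were already comparable (e.g., 1-5 stars) and therefore on a similar scale. Note how that differs from this question.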
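The rescaling step can be sketched as a simple linear (min-max) mapping of each site's scale onto 1-100. This is only one reasonable way to normalize, not necessarily what the quiz has in mind. The helper `rescale` and the sample ratings are assumptions made up for illustration.

```python
import numpy as np

def rescale(ratings, old_min, old_max, new_min=1.0, new_max=100.0):
    """Linearly map ratings from [old_min, old_max] to [new_min, new_max]."""
    ratings = np.asarray(ratings, dtype=float)
    fraction = (ratings - old_min) / (old_max - old_min)
    return new_min + fraction * (new_max - new_min)

# Made-up sample ratings from each site.
site_a = rescale([1, 3, 5], old_min=1, old_max=5)        # 1-5 stars
site_b = rescale([1, 7.5, 10], old_min=1, old_max=10)    # 1-10, decimals allowed
site_c = rescale([1, 50, 100], old_min=1, old_max=100)   # already on 1-100

print(site_a)
print(site_b)
print(site_c)
```

After this mapping, the lowest and highest ratings of every site land on 1 and 100, so the merged ratings matrix uses one comparable scale.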

Which of the following are true of collaborative filtering systems? Check all that apply.

Suppose you are writing a recommender system to predict a user's book preferences. In order to build such a system, you need that user to rate all the other books in your training set.

For collaborative filtering, the optimization algorithm you should use is gradient descent. In particular, you cannot use more advanced optimization algorithms (L-BFGS/conjugate gradient/etc.) for collaborative filtering, since you have to solve for both the x(i)'s and θ(j)'s simultaneously.

For collaborative filtering, it is possible to use one of the advanced optimization algorithms (L-BFGS/conjugate gradient/etc.) to solve for both the x(i)'s and θ(j)'s simultaneously.

Even if each user has rated only a small fraction of all of your products (so r(i,j)=0 for the vast majority of (i,j) pairs), you can still build a recommender system by using collaborative filtering.

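Answer: C and D.

A: False. The user does not need to rate all the other books in the training set.

B: False. Gradient descent is not the only option. More advanced algorithms (L-BFGS, conjugate gradient, etc.) can also be used to solve for the x(i)'s and θ(j)'s simultaneously, as mentioned in the lecture slides.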
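To make options B through D concrete, here is a minimal NumPy sketch of the regularized collaborative filtering cost and its gradient, written over one packed parameter vector that holds both the x(i)'s and θ(j)'s. The function name `cofi_cost`, the ratings in `Y`, the learning rate, and the regularization value are all assumptions for illustration. Plain gradient descent is used only to keep the example dependency-free.

```python
import numpy as np

def cofi_cost(params, Y, R, num_features, lam):
    """Return (cost, gradient) for regularized collaborative filtering.

    params packs X (items x features) followed by Theta (users x features).
    Only entries with R == 1 contribute to the squared error.
    """
    num_items, num_users = Y.shape
    split = num_items * num_features
    X = params[:split].reshape(num_items, num_features)
    Theta = params[split:].reshape(num_users, num_features)

    err = (X @ Theta.T - Y) * R   # element-wise mask drops unrated pairs
    cost = 0.5 * np.sum(err ** 2) + 0.5 * lam * (np.sum(X ** 2) + np.sum(Theta ** 2))

    X_grad = err @ Theta + lam * X
    Theta_grad = err.T @ X + lam * Theta
    return cost, np.concatenate([X_grad.ravel(), Theta_grad.ravel()])

# Made-up ratings: rows are items, columns are users, 0 marks "not rated".
Y = np.array([
    [5, 4, 0],
    [0, 5, 1],
    [1, 0, 5],
    [0, 1, 4],
], dtype=float)
R = (Y > 0).astype(float)         # r(i,j) = 1 only where a rating exists
num_features = 2

rng = np.random.default_rng(0)
params = 0.1 * rng.standard_normal((Y.shape[0] + Y.shape[1]) * num_features)

initial_cost, _ = cofi_cost(params, Y, R, num_features, lam=0.1)
for _ in range(500):              # update X and Theta together in each step
    _, grad = cofi_cost(params, Y, R, num_features, lam=0.1)
    params = params - 0.01 * grad
final_cost, _ = cofi_cost(params, Y, R, num_features, lam=0.1)
print(initial_cost, final_cost)
```

Because the function returns the cost and gradient for a single flat vector, the same function could in principle be passed to an off-the-shelf optimizer such as an L-BFGS routine instead of the hand-written loop. Note also that a third of the entries in `Y` are unrated, yet the fit still proceeds, which is the point of option D.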

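Answer: A and D.

The point here is to sum the elements of C at the positions where the corresponding element of R is 1. This requires element-wise multiplication: both matrices are 5x5, and an ordinary matrix product would just give another 5x5 matrix that mixes entries together, so it cannot pick out elements at individual positions. I'm not quite sure what this question is meant to test.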
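The difference between the two products can be checked with a small NumPy sketch. The matrices below are random made-up values, not the ones from the quiz. In NumPy, `*` is the element-wise product (Octave's `.*`) and `@` is the ordinary matrix product (Octave's `*`).

```python
import numpy as np

rng = np.random.default_rng(0)
C = rng.integers(1, 10, size=(5, 5))   # made-up 5x5 values
R = rng.integers(0, 2, size=(5, 5))    # 0/1 mask of the same shape

# Element-wise product zeroes out entries where R is 0, then sum what is left.
total_masked = np.sum(C * R)

# Reference: explicit loop over the positions where R(i, j) == 1.
total_loop = 0
for i in range(5):
    for j in range(5):
        if R[i, j] == 1:
            total_loop += C[i, j]

# The ordinary matrix product mixes rows and columns, so its sum is
# generally a different number.
total_matmul = np.sum(C @ R)
print(total_masked, total_loop, total_matmul)
```

The first two totals always agree, which is why the element-wise form is the right vectorized replacement for the loop.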
