Collaborative Filtering Recommender System

  II. Application Design with ALS

  1. Input data

  (1) Ratings file (rating.dat)

  This file has four fields in the format UserID::MovieID::Rating::Timestamp: the user ID, the movie ID, the rating, and the timestamp of the rating.

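  User IDs range from 1 to 6040 and movie IDs from 1 to 3952. Ratings are whole stars on a 5-star scale (1 to 5), and timestamps are in seconds. In addition, every user has at least 20 ratings.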

1::720::3::978300760
1::1270::5::978300055
1::527::5::978824195
1::2340::3::978300103
1::48::5::978824351
1::1097::4::978301953
1::1721::4::978300055
1::1545::4::978824139
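
  To make the record layout concrete, here is a minimal Python sketch that splits one line of the ratings file into typed fields. It is not the Scala program used in this post; the names `Rating` and `parse_rating` are hypothetical and introduced only for illustration.

```python
from typing import NamedTuple


class Rating(NamedTuple):
    """One record of the ratings file (hypothetical helper type)."""
    user_id: int
    movie_id: int
    rating: float
    timestamp: int  # seconds


def parse_rating(line: str) -> Rating:
    """Split a 'UserID::MovieID::Rating::Timestamp' line into typed fields."""
    user_id, movie_id, rating, timestamp = line.strip().split("::")
    return Rating(int(user_id), int(movie_id), float(rating), int(timestamp))


# One of the sample lines shown above.
print(parse_rating("1::1270::5::978300055"))
```

  Unpacking into exactly four names makes a malformed line fail loudly with a ValueError instead of silently producing a short record.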

  (2) User information file (users.dat)

  This file has five fields in the format UserID::Gender::Age::Occupation::Zip-code: the user ID, gender, age, occupation, and zip code.

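  User IDs range from 1 to 6040; gender is M for male or F for female; age is recorded as an age-bracket code rather than an exact number of years (the sample below contains the codes 1, 18, 25, 35, 45 and 50, where 1 stands for under 18); occupation is one of 21 occupation categories; the last field is the zip code.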

12::M::25::12::32793
13::M::45::1::93304
14::M::35::0::60126
15::M::25::7::22903
16::F::35::0::20670
17::M::50::1::95350
18::F::18::3::95825
19::M::1::10::48073
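
  The user records can be read the same way. The following is a small Python sketch, not the code used in this post; `User` and `parse_user` are hypothetical names. It keeps the zip code as a string, on the assumption that some zip codes may start with a zero or contain non-digit characters.

```python
from typing import NamedTuple


class User(NamedTuple):
    """One record of users.dat (hypothetical helper type)."""
    user_id: int
    gender: str      # "M" or "F"
    age: int         # value of the Age field, kept as an integer code
    occupation: int  # occupation category code
    zip_code: str    # kept as text so leading zeros are not lost


def parse_user(line: str) -> User:
    """Split a 'UserID::Gender::Age::Occupation::Zip-code' line into fields."""
    user_id, gender, age, occupation, zip_code = line.strip().split("::")
    if gender not in ("M", "F"):
        raise ValueError(f"unexpected gender value: {gender!r}")
    return User(int(user_id), gender, int(age), int(occupation), zip_code)


# One of the sample lines shown above.
print(parse_user("18::F::18::3::95825"))
```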

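  (3) Movie information file (movies.dat)

  This file has three fields in the format MovieID::Title::Genres: the movie ID, the title, and the genres.

  Movie IDs range from 1 to 3952; titles are the standard titles provided by IMDB and include the release year; genres are given by their actual names.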

305::Ready to Wear (Pret-A-Porter) (1994)::Comedy
306::Three Colors: Red (1994)::Drama
307::Three Colors: Blue (1993)::Drama
308::Three Colors: White (1994)::Drama
309::Red Firecracker, Green Firecracker (1994)::Drama
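
  Titles can contain a single colon (for example "Three Colors: Red"), but splitting on the two-character separator "::" still yields exactly three fields. The Python sketch below illustrates this and also pulls the release year out of the title. It is not the code used in this post: `Movie` and `parse_movie` are hypothetical names, and splitting multiple genres on "|" is an assumption, since the sample lines above each list only one genre.

```python
import re
from typing import List, NamedTuple, Optional


class Movie(NamedTuple):
    """One record of movies.dat (hypothetical helper type)."""
    movie_id: int
    title: str
    year: Optional[int]
    genres: List[str]


# The release year is the last parenthesised 4-digit group in the title.
_YEAR_AT_END = re.compile(r"\((\d{4})\)\s*$")


def parse_movie(line: str) -> Movie:
    """Split a 'MovieID::Title::Genres' line into fields."""
    movie_id, title, genres = line.rstrip("\n").split("::")
    match = _YEAR_AT_END.search(title)
    year = int(match.group(1)) if match else None
    # Assumption: several genres, if present, are separated by "|".
    return Movie(int(movie_id), title, year, genres.split("|"))


print(parse_movie("306::Three Colors: Red (1994)::Drama"))
```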

  2. Running the program

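  (1) Start IDEA and create a new Scala project -- configure the Project SDK and Scala SDK -- create a package -- import the Spark dependency jars (File--+Java--select all files in the jars folder under the Spark installation directory) -- create a Scala Class -- paste the code into the editor -- Edit Configuration -- Application (Name, Main Class, Program arguments (the directory containing the input data files)) -- Run movieALS. (This program also produces very long logs, so use the same method as before to show only Error-level logs.)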

  (2) The console prompts for input

Got 1000209 ratings from 6040 users on 3706 movies.
Please rate the following movie (1-5 (best), or 0 if not seen):
Raiders of the Lost Ark (1981): 2
Fargo (1996): 1
Sixth Sense, The (1999): 5
Princess Bride, The (1987): 4
Terminator, The (1984): 3
Toy Story (1995): 1
Gladiator (2000): 0
Blade Runner (1982): 5
Who Framed Roger Rabbit? (1988): 2
One Flew Over the Cuckoo's Nest (1975): 2
Abyss, The (1989): 3
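
  The prompt says that 0 means "not seen", which suggests such answers should not count as ratings. How the actual program treats them is not shown here, so the following Python sketch is only an assumption about that step; `parse_answer` and `seen_ratings` are hypothetical names.

```python
from typing import Dict, Iterable, Tuple


def parse_answer(line: str) -> Tuple[str, int]:
    """Split a 'Title (Year): score' console line into (title, score)."""
    title, _, score_text = line.strip().rpartition(": ")
    score = int(score_text)
    if not 0 <= score <= 5:
        raise ValueError(f"score out of range: {score}")
    return title, score


def seen_ratings(lines: Iterable[str]) -> Dict[str, int]:
    """Keep only movies the user actually rated (score 1 to 5)."""
    parsed = (parse_answer(line) for line in lines)
    return {title: score for title, score in parsed if score > 0}


answers = [
    "Sixth Sense, The (1999): 5",
    "Gladiator (2000): 0",
    "Blade Runner (1982): 5",
]
print(seen_ratings(answers))
```

  Splitting from the right with `rpartition` keeps titles that themselves contain a colon intact.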

  (3) Final output

Training: 602251, validation: 198919, test: 199049
RMSE (validation) = 0.8800459390646345 for the model trained with rank = 8, lambda = 0.1, and numIter = 10.
RMSE (validation) = 0.8721775968513282 for the model trained with rank = 8, lambda = 0.1, and numIter = 20.
RMSE (validation) = 3.7558695311242833 for the model trained with rank = 8, lambda = 10.0, and numIter = 10.
RMSE (validation) = 3.7558695311242833 for the model trained with rank = 8, lambda = 10.0, and numIter = 20.
RMSE (validation) = 0.8775399600881826 for the model trained with rank = 12, lambda = 0.1, and numIter = 10.
RMSE (validation) = 0.8712666782228532 for the model trained with rank = 12, lambda = 0.1, and numIter = 20.
RMSE (validation) = 3.7558695311242833 for the model trained with rank = 12, lambda = 10.0, and numIter = 10.
RMSE (validation) = 3.7558695311242833 for the model trained with rank = 12, lambda = 10.0, and numIter = 20.
The best model was trained with rank = 12 and lambda = 0.1, and numIter = 20, and its RMSE on the test set is 0.8688556104046699.
The best model improves the baseline by 21.97%.
Movies recommended for you:
 1: Anatomy (Anatomie) (2000)
 2: Bandits (1997)
 3: Welcome to Woop-Woop (1997)
 4: Across the Sea of Time (1995)
 5: Down to You (2000)
 6: Window to Paris (1994)
 7: In the Mouth of Madness (1995)
 8: Fall (1997)
 9: Zachariah (1971)
10: Six-String Samurai (1998)
11: If Lucy Fell (1996)
12: Fifth Element, The (1997)
13: Faraway, So Close (In Weiter Ferne, So Nah!) (1993)
14: Steal Big, Steal Little (1995)
15: What Happened Was... (1994)
16: Rosencrantz and Guildenstern Are Dead (1990)
17: Ayn Rand: A Sense of Life (1997)
18: Guantanamera (1994)
19: Big Blue, The (Le Grand Bleu) (1988)
20: Coldblooded (1995)
21: Eighth Day, The (Le Huitième jour ) (1996)
22: Mother Night (1996)
23: Matrix, The (1999)
24: Loss of Sexual Innocence, The (1999)
25: Chambermaid on the Titanic, The (1998)
26: Wisdom (1986)
27: Beautiful Thing (1996)
28: Fight Club (1999)
29: I Am Cuba (Soy Cuba/Ya Kuba) (1964)
30: Once Upon a Time... When We Were Colored (1995)
31: Dune (1984)
32: Julien Donkey-Boy (1999)
33: Postman, The (1997)
34: Total Eclipse (1995)
35: Gladiator (2000)
36: Leather Jacket Love Story (1997)
37: Lost Highway (1997)
38: Bewegte Mann, Der (1994)
39: Splendor (1999)
40: Babyfever (1994)
41: Love Serenade (1996)
42: Hamlet (1996)
43: Ghost in the Shell (Kokaku kidotai) (1995)
44: After Life (1998)
45: But I'm a Cheerleader (1999)
46: Committed (2000)
47: Blue in the Face (1995)
48: Taffin (1988)
49: Perfect Blue (1997)
50: Cross of Iron (1977)

Process finished with exit code 0
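
  The output above reflects a small grid search: each combination of rank, lambda and numIter is scored by RMSE on the validation set, the combination with the lowest RMSE is kept, and its test RMSE is compared with a baseline. The Python sketch below reproduces the selection step from the logged values. It is not the Scala program itself; the function names are hypothetical, and because the log does not say how the baseline is defined, the improvement formula (relative reduction in RMSE) is an assumption.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean squared error between two equally long sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


# (rank, lambda, numIter) -> validation RMSE, copied from the log above.
validation_rmse = {
    (8, 0.1, 10): 0.8800459390646345,
    (8, 0.1, 20): 0.8721775968513282,
    (8, 10.0, 10): 3.7558695311242833,
    (8, 10.0, 20): 3.7558695311242833,
    (12, 0.1, 10): 0.8775399600881826,
    (12, 0.1, 20): 0.8712666782228532,
    (12, 10.0, 10): 3.7558695311242833,
    (12, 10.0, 20): 3.7558695311242833,
}

# The best model is the one with the lowest validation RMSE.
best_params = min(validation_rmse, key=validation_rmse.get)
print("best (rank, lambda, numIter):", best_params)


def improvement_percent(baseline_rmse: float, model_rmse: float) -> float:
    """Assumed definition: relative RMSE reduction over a baseline, in percent."""
    return (baseline_rmse - model_rmse) / baseline_rmse * 100.0
```

  Note that all four runs with lambda = 10.0 report the same RMSE regardless of rank or iteration count. One plausible reading is that the regularization is strong enough to push every prediction toward zero, so the model no longer depends on the other parameters.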

  3. Code analysis

Reposted from www.cnblogs.com/BigJunOba/p/9362969.html