Big Data Course K19 - Spark’s Movie Recommendation Case && Cold Start Problem of Recommendation System

Email of the author of the article: [email protected] Address: Huizhou, Guangdong

 ▲ This chapter’s program

⚪ Mastering Spark cases - movie recommendations;

⚪ Master Spark’s model storage;

⚪ Master Spark model loading;

⚪ Master the cold start problem of Spark’s recommendation system;

1. Case – Movie Recommendation

1. User-based recommendations

1. Description

We are now going to process the ml-100k data set. The u.data file contains 100,000 pieces of data, mainly user ratings of movies.

2. Code

import org.apache.spark._

import org.apache.spark.mllib.recommendation.{ALS,Rating}

object Demo11{

def main(args:Array[String]):Unit={

val conf=new SparkConf().setMaster("local").setAppName("ml-100k")

val sc=new SparkContext(conf)

val rawData=sc.textFile("d://ml-100k/u.data")

val rawRatings=rawData.map(_.split("\t").take(3))

val ratings=rawRatings.map{

case Array(user,movie,rating)=>

Rating(user.toInt,movie.toInt,rating.toDouble)

}

val model=ALS.train(ratings,50,10,0.01)

val rs1=model.predict(789,123)//Predict user No. 789’s rating for movie No. 123

val rs2=model.recommendProducts(789,10)//Recommend 10 movies (top10) for the user numbered 789

}

}

2. Check recommended content

1. Description

To visually test the effectiveness of recommendations, you can simply compare the titles of movies rated by users with the titles of recommended movies.

code

val movies=sc.textFile("d://ml-100k/u.item")

val titles=movies.map(line=>line.split("\\|").take(2))

.map(array=>(array(0).toInt,array(1))).collectAsMap()

println(titles(123))//View the movie title number 123

2. Description

For user 789, we can find the movies he has been exposed to, the top 10 movies with the highest ratings, and their names. For specific implementation, you can first use Spark's keyBy function to create a key-value pair RDD from the ratings RDD. Its primary key is user ID. Then use the lookup function to return only those rating data corresponding to the given key value (that is, a specific user ID) to the driver.

code

val movieForUser789=ratings.keyBy(_.user).lookup(789)

println(movieForUser789.size)//Check how many movies user 789 has rated

3. Description

Next, we need to get the top 10 movies with the highest ratings from this user. The specific method is to use the rating attribute of the Rating object to sort the moviesForUser collection and select the top 10 ratings (including the corresponding movie ID). Then use it as input and use titles to map it into the form of "(movie name, specific rating)". Then print out the name and specific rating:

Code 1

movieForUser789.sortBy(-_.rating).take(10)

.map(rating=>(titles(rating.product),rating.rating))

.foreach(println)

Result 1

(Godfather, The (1972),5.0)

(Trainspotting (1996),5.0)

(Dead Man Walking (1995),5.0)

(Star Wars (1977),5.0)

(Swingers (1996),5.0)

(Leaving Las Vegas (1995),5.0)

(Bound (1996),5.0)

(Fargo (1996),5.0)

(Last Supper, The (1995),5.0)

(Private Parts (1997),4.0)

Code 2

Guess you like

Origin blog.csdn.net/u013955758/article/details/132567577