spark.mllib

import org.apache.spark.mllib.recommendation.{ALS,MatrixFactorizationModel,Rating}

----------------------------------------
ALS: alternating least squares
An optimization method for solving the matrix factorization problem.
In effect, it fills in the missing entries of the two-dimensional user-item rating matrix.

Input: explicit feedback data (use train) or implicit feedback data (use trainImplicit)
Output: a MatrixFactorizationModel (the matrix factorization model)

Training parameters:
• numBlocks is the number of blocks used to parallelize the computation (set to -1 to configure it automatically).
• ratings: RDD[Rating], the training data.
• rank is the number of latent factors in the model.
• iterations is the number of iterations to run.
• lambda is the ALS regularization parameter.
• implicitPrefs selects between the explicit-feedback version of ALS and the version adapted to implicit-feedback data.
• alpha applies only to the implicit-feedback version of ALS; it governs the baseline confidence in the observed preferences.
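
The parameters above correspond to setter methods on the ALS class. The following is a minimal sketch, assuming an RDD[Rating] already exists; the helper name trainWithSettings and every value passed to a setter are illustrative choices, not tuned recommendations.

```scala
import org.apache.spark.mllib.recommendation.{ALS, MatrixFactorizationModel, Rating}
import org.apache.spark.rdd.RDD

object AlsSettingsSketch {
  // Hypothetical helper: configures ALS through its setters, then trains a model.
  def trainWithSettings(ratings: RDD[Rating], implicitFeedback: Boolean): MatrixFactorizationModel =
    new ALS()
      .setBlocks(-1)                      // numBlocks: -1 means configure automatically
      .setRank(10)                        // number of latent factors (illustrative value)
      .setIterations(10)                  // number of ALS iterations (illustrative value)
      .setLambda(0.01)                    // regularization parameter (illustrative value)
      .setImplicitPrefs(implicitFeedback) // false: explicit feedback, true: implicit feedback
      .setAlpha(1.0)                      // only relevant when implicitPrefs is true
      .run(ratings)
}
```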

Training a model
ALS.train(ratings, rank, numIterations, lambda)
ALS.trainImplicit(ratings, rank, numIterations, lambda, alpha)
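
As a rough end-to-end sketch of training, the program below runs both variants on a local SparkContext. The toy ratings, the object name AlsTrainSketch, the helper trainBoth, and the parameter values are all made up for illustration; alpha is passed only to trainImplicit.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.recommendation.{ALS, MatrixFactorizationModel, Rating}

object AlsTrainSketch {
  // Toy data invented for this sketch: Rating(user id, product id, rating).
  val toyRatings: Seq[Rating] = Seq(
    Rating(1, 10, 5.0), Rating(1, 11, 1.0),
    Rating(2, 10, 4.0), Rating(2, 12, 2.0),
    Rating(3, 11, 1.0), Rating(3, 12, 5.0)
  )

  // Hypothetical helper: trains an explicit-feedback and an implicit-feedback model.
  def trainBoth(sc: SparkContext): (MatrixFactorizationModel, MatrixFactorizationModel) = {
    val ratings = sc.parallelize(toyRatings)
    val rank = 5            // number of latent factors
    val numIterations = 10  // number of ALS iterations
    val lambda = 0.01       // regularization parameter
    val alpha = 1.0         // confidence parameter, implicit variant only

    val explicitModel = ALS.train(ratings, rank, numIterations, lambda)
    val implicitModel = ALS.trainImplicit(ratings, rank, numIterations, lambda, alpha)
    (explicitModel, implicitModel)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("AlsTrainSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val (explicitModel, implicitModel) = trainBoth(sc)
      println(s"explicit rank = ${explicitModel.rank}, implicit rank = ${implicitModel.rank}")
    } finally {
      sc.stop() // always release the local Spark context
    }
  }
}
```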

------------------------------------------
MatrixFactorizationModel: the matrix factorization model
The user factors and the item factors are each stored as an RDD of (id, factor) pairs.
They are named userFeatures and productFeatures respectively. Each factor is an Array[Double].
The model can be saved to a distributed file system.
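
A small sketch of reading one factor vector and of saving and reloading a model. The helper names userFactor and saveAndReload are hypothetical, and the path is left as a parameter because the notes do not name a storage location.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.recommendation.MatrixFactorizationModel

object ModelStorageSketch {
  // Hypothetical helper: looks up one user's factor vector in userFeatures.
  // Returns None when the model holds no factor for that user id.
  def userFactor(model: MatrixFactorizationModel, userId: Int): Option[Array[Double]] =
    model.userFeatures.lookup(userId).headOption

  // Hypothetical helper: writes the model to `path`, then reads it back.
  // `path` can point at a local directory or a distributed file system.
  def saveAndReload(sc: SparkContext, model: MatrixFactorizationModel, path: String): MatrixFactorizationModel = {
    model.save(sc, path)
    MatrixFactorizationModel.load(sc, path)
  }
}
```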

Methods:
predict(userId, productId): returns the predicted rating.
recommendProducts(userId, numProducts): returns the top numProducts products recommended for the given user.
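
The two methods might be called as in the sketch below, which assumes a model has already been trained. The wrapper names predictedRating and topProductIds are hypothetical.

```scala
import org.apache.spark.mllib.recommendation.{MatrixFactorizationModel, Rating}

object ModelQuerySketch {
  // Predicted rating of one product for one user.
  def predictedRating(model: MatrixFactorizationModel, userId: Int, productId: Int): Double =
    model.predict(userId, productId)

  // Ids of the top `numProducts` products recommended to the user.
  // recommendProducts returns Rating objects; only the product ids are kept here.
  def topProductIds(model: MatrixFactorizationModel, userId: Int, numProducts: Int): Seq[Int] = {
    val recommendations: Array[Rating] = model.recommendProducts(userId, numProducts)
    recommendations.map(_.product).toSeq
  }
}
```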

To compute the predicted rating of an item for a given user: take the user's row from the user factor matrix and the item's column from the item factor matrix, then compute their dot product.
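
The dot product itself can be sketched in plain Scala without Spark. The function name predictRating and the two factor vectors are made up for illustration; real vectors would come from userFeatures and productFeatures.

```scala
object DotProductSketch {
  // Predicted rating = sum over k of userFactor(k) * itemFactor(k).
  def predictRating(userFactor: Array[Double], itemFactor: Array[Double]): Double = {
    require(userFactor.length == itemFactor.length, "factor vectors must have the same rank")
    userFactor.zip(itemFactor).map { case (u, v) => u * v }.sum
  }

  def main(args: Array[String]): Unit = {
    val userFactor = Array(1.0, 2.0, 0.5) // made-up user factor, rank 3
    val itemFactor = Array(2.0, 1.0, 4.0) // made-up item factor, rank 3
    // 1.0*2.0 + 2.0*1.0 + 0.5*4.0 = 6.0
    println(predictRating(userFactor, itemFactor))
  }
}
```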
------------------------------------------
Rating: the rating class
Each object contains a user id, a product id and a rating.
Requirement: each id must be a 32-bit integer.
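
A short sketch of building Rating objects from text. The comma-separated "userId,productId,rating" line format and the helper name parseRating are assumptions made for this example, not something the notes specify.

```scala
import org.apache.spark.mllib.recommendation.Rating

object RatingParseSketch {
  // Assumed input format: "userId,productId,rating", for example "1,10,4.5".
  def parseRating(line: String): Rating = {
    val Array(user, product, rating) = line.split(",").map(_.trim)
    // toInt throws NumberFormatException if an id does not fit in a 32-bit integer.
    Rating(user.toInt, product.toInt, rating.toDouble)
  }
}
```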

 


Origin www.cnblogs.com/xl717/p/11612338.html