Returning to my earlier notes on the SVD, I take this opportunity to review the topic.
Singular Value Decomposition
Eigendecomposition:
\(Ax = λx\), where A is an n × n matrix, λ is an eigenvalue, and x is the eigenvector corresponding to λ.
Suppose A has n eigenvalues \(λ_1 ≤ λ_2 ≤ \dots ≤ λ_n\) with corresponding eigenvectors \(\{x_1, x_2, \dots, x_n\}\).
Then A can be decomposed as \(A = WΣW^{-1}\), where W is the n × n matrix whose columns are the n eigenvectors, and Σ is the n × n diagonal matrix with the n eigenvalues on its diagonal.
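The decomposition \(A = WΣW^{-1}\) can be checked numerically. A minimal sketch with numpy (the library and the toy matrix are my choices, not prescribed by the notes); the matrix is symmetric so the eigendecomposition is well behaved:

```python
import numpy as np

# A made-up symmetric 3x3 matrix for illustration.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# Columns of W are the eigenvectors x_i; lam holds the eigenvalues.
lam, W = np.linalg.eig(A)
Sigma = np.diag(lam)

# Reconstruct A = W Σ W^{-1}
A_rec = W @ Sigma @ np.linalg.inv(W)
print(np.allclose(A, A_rec))  # True
```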
Singular Value Decomposition:
Unlike eigendecomposition, SVD does not require the matrix to be square.
Suppose A is an m × n matrix; then we define the SVD of A as:
\[A = UΣV^T\]
where U is an m × m matrix, V is an n × n matrix, and Σ is an m × n matrix whose entries are all 0 except those on the main diagonal. U and V are orthogonal, satisfying:
\[U^TU = I, \quad V^TV = I\]
The columns of V are the n eigenvectors of the square matrix \(A^TA\), spanning an n × n matrix; these eigenvectors are called the right singular vectors of A.
The columns of U are the m eigenvectors of the square matrix \(AA^T\), spanning an m × m matrix; these eigenvectors are called the left singular vectors of A.
The matrix \(Σ\) is zero everywhere except for the singular values on its diagonal.
Each singular value satisfies \(Av_i = σ_iu_i\), where \(v_i\) and \(u_i\) are the i-th right and left singular vectors of A, respectively.
Equivalently, the singular values can be obtained as \(σ_i = \sqrt{λ_i}\), where \(λ_i\) are the eigenvalues of \(A^TA\) (and of \(AA^T\); the two matrices share the same nonzero eigenvalues).
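Both facts above, \(A = UΣV^T\) and \(σ_i = \sqrt{λ_i}\), can be verified with a quick numpy sketch (the random 4 × 3 matrix is a made-up example):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))  # an arbitrary m x n matrix, m=4, n=3

# full_matrices=True gives U (m x m) and Vt (n x n); s holds the singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the m x n Sigma with the singular values on its diagonal.
Sigma = np.zeros((4, 3))
Sigma[:3, :3] = np.diag(s)
print(np.allclose(A, U @ Sigma @ Vt))   # True: A = U Σ V^T

# σ_i = sqrt(λ_i), with λ_i the eigenvalues of A^T A (sorted descending).
lam = np.linalg.eigvalsh(A.T @ A)[::-1]
print(np.allclose(s, np.sqrt(lam)))     # True
```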
Properties of the SVD:
We can approximately describe the matrix using only the k largest singular values and their corresponding left and right singular vectors:
\[A_{m × n} = U_{m × m}Σ_{m × n}V^T_{n × n} ≈ U_{m × k}Σ_{k × k}V^T_{k × n}\]
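The truncation above can be sketched in numpy (matrix and k are made-up for illustration). The check at the end uses the Eckart–Young result that the Frobenius error of the best rank-k approximation equals the root of the sum of the squared discarded singular values:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 5))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2  # keep only the k largest singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The approximation error equals sqrt(sum of squared discarded singular values).
err = np.linalg.norm(A - A_k)
print(np.isclose(err, np.sqrt(np.sum(s[k:] ** 2))))  # True
```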
Applying SVD in recommender systems
User features: a vector of reals \((x_a, x_b, x_c, \dots)\) describing how much the user prefers movie attributes a, b, c, …
Movie features: a vector of reals \((x_a, x_b, x_c, \dots)\) describing how well the movie matches attributes a, b, c, …
The predicted rating of a movie is the inner product of the two vectors [1].
Viewing the ratings of m users for n items as an m × n rating matrix M [3], take the truncated SVD of M:
\[M_{m × n} ≈ U_{m × k}Σ_{k × k}V^T_{k × n}\]
To predict the rating \(m_{ij}\) of the i-th user for the j-th item, we only need to compute \(u^T_iΣv_j\).
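A small numpy sketch of this prediction step (the 4 × 5 rating matrix and the choice of user/item indices are made-up; the matrix is dense here purely for illustration, which glosses over the sparsity problem discussed next):

```python
import numpy as np

# A made-up 4-user x 5-item rating matrix.
M = np.array([[5., 3., 0., 1., 4.],
              [4., 0., 0., 1., 3.],
              [1., 1., 0., 5., 4.],
              [0., 1., 5., 4., 0.]])

k = 2
U, s, Vt = np.linalg.svd(M, full_matrices=False)
U_k, S_k, V_k = U[:, :k], np.diag(s[:k]), Vt[:k, :]

# Predicted rating of user i for item j: u_i^T Σ v_j
i, j = 1, 2
pred = U_k[i, :] @ S_k @ V_k[:, j]
print(pred)
```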
Problems:
- SVD requires the matrix M to be dense, but real-world rating matrices M are sparse
- Real-world rating matrices M are very large, so the decomposition is very time-consuming
FunkSVD
FunkSVD was proposed to address the efficiency problems of conventional SVD. It seeks a decomposition of the form:
\[M_{m × n} = P^T_{m × k}Q_{k × n}\]
For a rating \(m_{ij}\), the FunkSVD factorization represents it as \(q^T_jp_i\). Using the mean squared error as the loss function, we want \((m_{ij} - q^T_jp_i)^2\) to be as small as possible; considering all observed user-item pairs, we want to minimize:
\[\sum\limits_{i,j}(m_{ij} - q^T_jp_i)^2\]
To prevent overfitting, an L2 regularization term is added, giving the final objective:
\[J(P, Q) = \underbrace{\arg\;\min}_{p_i, q_j}\;\sum\limits_{i,j}(m_{ij} - q_j^Tp_i)^2 + \lambda(||p_i||_2^2 + ||q_j||_2^2)\]
where λ is the regularization coefficient, a hyperparameter that needs to be tuned. \(\arg\min f(x)\) denotes the set of arguments x at which f(x) attains its minimum.
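This objective is typically minimized by stochastic gradient descent over the observed ratings only, which is what makes FunkSVD work on sparse matrices. A minimal sketch, assuming made-up hyperparameters (learning rate, epochs, regularization) and a toy rating matrix where 0 marks a missing entry:

```python
import numpy as np

def funk_svd(M, k=2, lr=0.02, reg=0.02, epochs=1000, seed=0):
    """Fit the observed entries of M (0 = missing) as q_j^T p_i with SGD,
    minimizing (m_ij - q_j^T p_i)^2 + reg * (||p_i||^2 + ||q_j||^2)."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    P = 0.1 * rng.standard_normal((k, m))  # user factors, p_i = P[:, i]
    Q = 0.1 * rng.standard_normal((k, n))  # item factors, q_j = Q[:, j]
    observed = [(i, j) for i in range(m) for j in range(n) if M[i, j] > 0]
    for _ in range(epochs):
        for i, j in observed:
            p = P[:, i].copy()             # cache p_i before updating it
            q = Q[:, j]
            e = M[i, j] - q @ p            # prediction error on this rating
            P[:, i] += lr * (e * q - reg * p)
            Q[:, j] += lr * (e * p - reg * q)
    return P, Q

# A made-up 4-user x 4-item rating matrix; 0 marks an unrated entry.
M = np.array([[5., 3., 0., 1.],
              [4., 0., 0., 1.],
              [1., 1., 0., 5.],
              [0., 1., 5., 4.]])
P, Q = funk_svd(M)
pred = P.T @ Q  # full predicted rating matrix, filling in the missing entries
```

Note that the loop skips the zero entries entirely, so neither density nor a full decomposition of M is ever needed.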
BigChaos
Elimination of Global Effect
Similarity matrix decomposition
Reference material
[1] Wang Yuantao. Collaborative filtering algorithms on the Netflix dataset [D]. Tsinghua University, 2009.