Current recommender systems fall into three categories:
1. Non-personalized recommendation system
Features: Based on statistical analysis, such as product sales rankings, editor's picks, or average ratings, so all users see the same recommendations.
2. Semi-personalized recommendation system
Features: Recommendations are generated from the user's current browsing behavior or the contents of the user's current shopping cart.
3. Fully personalized recommendation system
Features: Based on the user's history combined with the user's current behavior, the system generates fully personalized recommendations for that user.
The input to a recommender system falls into several types:
1) Implicit browsing input; 2) Explicit browsing input; 3) Keyword / product attribute input; 4) User rating input; 5) User text review input; 6) Editor's recommendation input; 7) User purchase history input.
The output takes the following forms:
a) Related product output; b) Individual text review output; c) Individual rating output; d) Average rating output; e) Email output; f) Editor's recommendation output.
E-commerce recommendation algorithms include:
1. Memory-based recommendation algorithms:
User_based collaborative filtering, Item_based collaborative filtering, and collaborative filtering based on the Horting graph technique.
2. Model-based recommendation algorithms:
Cluster_based collaborative filtering, collaborative filtering based on dimensionality reduction, recommendation based on association rules, and recommendation based on Bayesian networks.
Shortcoming of memory-based recommendation algorithms:
When the user database is very large, it is difficult to guarantee real-time recommendations.
Shortcoming of model-based recommendation algorithms:
The model lags behind the original user data, so it must be updated periodically to remain valid.
The following describes several typical algorithms.
1. User_based collaborative filtering
Underlying assumption: if users rate some items similarly, their ratings of other items will also be similar.
User_based collaborative filtering proceeds in three stages:
Data representation ------> nearest-neighbor query (user similarity measurement) ------> recommendation generation
Methods for measuring the similarity between users:
1) Cosine similarity
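Let the ratings of user u and user v over the n-dimensional item space be represented as the vectors $\vec{u}$ and $\vec{v}$. The similarity between user u and user v is
$$sim(u,v) = \cos(\vec{u},\vec{v}) = \frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|\,\|\vec{v}\|}$$
The numerator is the inner product of the two rating vectors, and the denominator is the product of the two vectors' norms.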
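2) Correlation similarity
Let $I_{uv}$ denote the set of items rated by both user u and user v.
The similarity is measured with the Pearson correlation coefficient:
$$sim(u,v) = \frac{\sum_{c\in I_{uv}}(R_{u,c}-\bar{R}_u)(R_{v,c}-\bar{R}_v)}{\sqrt{\sum_{c\in I_{uv}}(R_{u,c}-\bar{R}_u)^2}\,\sqrt{\sum_{c\in I_{uv}}(R_{v,c}-\bar{R}_v)^2}}$$
Here $R_{u,c}$ denotes user u's rating of item c, and $\bar{R}_u$ and $\bar{R}_v$ denote the average ratings of user u and user v.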
3) Adjusted cosine similarity
The cosine similarity measure does not account for users rating on different scales. The adjusted cosine similarity addresses this shortcoming by subtracting each user's average rating from that user's ratings:
Let $I_{uv}$ denote the set of items rated by both user u and user v, and let $I_u$ and $I_v$ denote the sets of items rated by user u and user v respectively.
$$sim(u,v) = \frac{\sum_{c\in I_{uv}}(R_{u,c}-\bar{R}_u)(R_{v,c}-\bar{R}_v)}{\sqrt{\sum_{c\in I_u}(R_{u,c}-\bar{R}_u)^2}\,\sqrt{\sum_{c\in I_v}(R_{v,c}-\bar{R}_v)^2}}$$
Here $R_{u,c}$ denotes user u's rating of item c, and $\bar{R}_u$ and $\bar{R}_v$ denote the average ratings of user u and user v.
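The three user similarity measures can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes each user's ratings are stored as a dict mapping item to rating, that an unrated item counts as 0 in the cosine measure, and that a user's average is taken over all items that user has rated. The function names and the sample users are hypothetical.

```python
import math

def mean_rating(r):
    """Average of one user's ratings (over the items that user rated)."""
    return sum(r.values()) / len(r)

def cosine_sim(ru, rv):
    """Cosine of the two rating vectors; unrated items are treated as 0."""
    dot = sum(ru[c] * rv[c] for c in ru.keys() & rv.keys())
    norm_u = math.sqrt(sum(x * x for x in ru.values()))
    norm_v = math.sqrt(sum(x * x for x in rv.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def pearson_sim(ru, rv):
    """Pearson correlation; all sums run over the co-rated items only."""
    common = ru.keys() & rv.keys()
    mu, mv = mean_rating(ru), mean_rating(rv)
    num = sum((ru[c] - mu) * (rv[c] - mv) for c in common)
    den = (math.sqrt(sum((ru[c] - mu) ** 2 for c in common))
           * math.sqrt(sum((rv[c] - mv) ** 2 for c in common)))
    return num / den if den else 0.0

def adjusted_cosine_sim(ru, rv):
    """Numerator over co-rated items; each norm over that user's own items."""
    common = ru.keys() & rv.keys()
    mu, mv = mean_rating(ru), mean_rating(rv)
    num = sum((ru[c] - mu) * (rv[c] - mv) for c in common)
    den = (math.sqrt(sum((x - mu) ** 2 for x in ru.values()))
           * math.sqrt(sum((x - mv) ** 2 for x in rv.values())))
    return num / den if den else 0.0

# Hypothetical data: bob rates every item exactly one point lower than alice.
alice = {"a": 5, "b": 3, "c": 4}
bob = {"a": 4, "b": 2, "c": 3}
print(cosine_sim(alice, bob), pearson_sim(alice, bob), adjusted_cosine_sim(alice, bob))
```

In this example the two mean-centered measures treat the users as perfectly similar, while the plain cosine is slightly below 1 because it is sensitive to the shift in rating scale.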
Generating recommendations:
Let $NN_u$ denote the nearest-neighbor set of user u. The predicted rating $P_{u,i}$ of user u for item i is obtained from the ratings that the users in $NN_u$ gave to item i:
$$P_{u,i} = \bar{R}_u + \frac{\sum_{v\in NN_u} sim(u,v)\,(R_{v,i}-\bar{R}_v)}{\sum_{v\in NN_u} |sim(u,v)|}$$
Here $sim(u,v)$ denotes the similarity between user u and user v, $R_{v,i}$ denotes user v's rating of item i, and $\bar{R}_u$ and $\bar{R}_v$ denote the average item ratings of user u and user v.
Predict the user's rating for every unrated item with the method above, then return the N items with the highest predicted ratings as the recommendation for the current user.
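The prediction and top-N steps can be sketched as below. This is a sketch under stated assumptions: ratings are a dict of dicts (user to item to rating), Pearson correlation is used as the similarity, and the neighbor set is restricted to the k most similar users who have actually rated the target item. The names `predict`, `recommend`, and the sample data are hypothetical.

```python
import math

def mean_rating(r):
    return sum(r.values()) / len(r)

def pearson_sim(ru, rv):
    """Pearson correlation over co-rated items (one possible similarity)."""
    common = ru.keys() & rv.keys()
    mu, mv = mean_rating(ru), mean_rating(rv)
    num = sum((ru[c] - mu) * (rv[c] - mv) for c in common)
    den = (math.sqrt(sum((ru[c] - mu) ** 2 for c in common))
           * math.sqrt(sum((rv[c] - mv) ** 2 for c in common)))
    return num / den if den else 0.0

def predict(ratings, user, item, k=2):
    """User's mean rating plus the similarity-weighted mean-centered neighbor ratings."""
    ru = ratings[user]
    # Assumption: only users who rated `item` can serve as neighbors.
    scored = [(pearson_sim(ru, rv), v)
              for v, rv in ratings.items() if v != user and item in rv]
    neighbors = sorted(scored, reverse=True)[:k]
    num = sum(s * (ratings[v][item] - mean_rating(ratings[v])) for s, v in neighbors)
    den = sum(abs(s) for s, _ in neighbors)
    base = mean_rating(ru)
    return base + num / den if den else base

def recommend(ratings, user, n=1, k=2):
    """Return the n unrated items with the highest predicted rating."""
    all_items = {c for r in ratings.values() for c in r}
    unrated = all_items - set(ratings[user])
    ranked = sorted(unrated, key=lambda c: predict(ratings, user, c, k), reverse=True)
    return ranked[:n]

# Hypothetical data: u1 has not rated item "d".
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 2, "c": 3, "d": 5},
    "u3": {"a": 1, "b": 5, "c": 2, "d": 1},
}
print(predict(ratings, "u1", "d"), recommend(ratings, "u1"))
```

Note that the formula does not clip its result, so a prediction can fall outside the rating scale; a real system would likely clamp it.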
When transaction data is used as input, ratings cannot be predicted, so the following methods are used instead:
1) Most-frequent item recommendation: scan the purchase records of each of the current user's nearest neighbors, count how often each item was purchased, and recommend the N most frequently purchased items that the current user has not yet bought.
2) Association rule recommendation
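Method 1, most-frequent item recommendation, can be sketched as follows. The sketch assumes purchases are stored as a dict mapping each user to a set of purchased items and that the neighbor list has already been computed; `most_frequent_items` and the sample data are hypothetical.

```python
from collections import Counter

def most_frequent_items(purchases, user, neighbors, n=2):
    """Count neighbors' purchases and return the n most common items
    that `user` has not bought yet."""
    owned = purchases[user]
    counts = Counter(
        item
        for v in neighbors
        for item in purchases[v]
        if item not in owned
    )
    return [item for item, _ in counts.most_common(n)]

# Hypothetical purchase records.
purchases = {
    "u": {"milk"},
    "n1": {"milk", "bread", "eggs"},
    "n2": {"bread", "tea"},
    "n3": {"bread", "eggs"},
}
print(most_frequent_items(purchases, "u", ["n1", "n2", "n3"]))
```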
Item_based collaborative filtering algorithm
Underlying assumption: if most users rate certain items similarly, the current user's ratings of those items will also be similar.
Implementation stage: measuring the similarity between items.
The core step is the nearest-neighbor query over items, using one of the following similarity measures:
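1) Cosine similarity
Treat each item's ratings as a vector in the m-dimensional user space; if a user has not rated an item, that user's rating of the item is set to 0. Let the ratings of item i and item j over the m-dimensional user space be represented as the vectors $\vec{i}$ and $\vec{j}$. Then
$$sim(i,j) = \cos(\vec{i},\vec{j}) = \frac{\vec{i}\cdot\vec{j}}{\|\vec{i}\|\,\|\vec{j}\|}$$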
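2) Correlation similarity
Let $U_{ij}$ denote the set of users who have rated both item i and item j. Then
$$sim(i,j) = \frac{\sum_{c\in U_{ij}}(R_{c,i}-\bar{R}_i)(R_{c,j}-\bar{R}_j)}{\sqrt{\sum_{c\in U_{ij}}(R_{c,i}-\bar{R}_i)^2}\,\sqrt{\sum_{c\in U_{ij}}(R_{c,j}-\bar{R}_j)^2}}$$
Here $R_{c,i}$ denotes user c's rating of item i, and $\bar{R}_i$ and $\bar{R}_j$ denote the average ratings of item i and item j.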
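3) Adjusted cosine similarity
Let $U_{ij}$ denote the set of users who have rated both item i and item j, and let $U_i$ and $U_j$ denote the sets of users who have rated item i and item j respectively.
$$sim(i,j) = \frac{\sum_{c\in U_{ij}}(R_{c,i}-\bar{R}_c)(R_{c,j}-\bar{R}_c)}{\sqrt{\sum_{c\in U_i}(R_{c,i}-\bar{R}_c)^2}\,\sqrt{\sum_{c\in U_j}(R_{c,j}-\bar{R}_c)^2}}$$
Here $R_{c,i}$ denotes user c's rating of item i, and $\bar{R}_c$ denotes user c's average rating of items.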
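For items, the cosine and adjusted cosine measures can be sketched as follows. This is a minimal sketch, assuming ratings are stored as a dict of dicts (user to item to rating) and that a missing rating counts as 0 in the plain cosine, as the text states; the function names and sample data are hypothetical.

```python
import math

def item_cosine(ratings, i, j):
    """Cosine between two item columns in user space; missing ratings are 0."""
    users = sorted(ratings)
    a = [ratings[u].get(i, 0.0) for u in users]
    b = [ratings[u].get(j, 0.0) for u in users]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def item_adjusted_cosine(ratings, i, j):
    """Adjusted cosine: subtract each user's own average before comparing items."""
    user_mean = {u: sum(r.values()) / len(r) for u, r in ratings.items()}
    users_i = [u for u in ratings if i in ratings[u]]
    users_j = [u for u in ratings if j in ratings[u]]
    common = [u for u in users_i if j in ratings[u]]
    num = sum((ratings[u][i] - user_mean[u]) * (ratings[u][j] - user_mean[u])
              for u in common)
    den = (math.sqrt(sum((ratings[u][i] - user_mean[u]) ** 2 for u in users_i))
           * math.sqrt(sum((ratings[u][j] - user_mean[u]) ** 2 for u in users_j)))
    return num / den if den else 0.0

# Hypothetical data: items "a" and "b" are liked by the same users, "c" by the others.
ratings = {
    "u1": {"a": 5, "b": 4, "c": 1},
    "u2": {"a": 4, "b": 5, "c": 2},
    "u3": {"a": 1, "b": 2, "c": 5},
}
print(item_cosine(ratings, "a", "b"), item_cosine(ratings, "a", "c"))
print(item_adjusted_cosine(ratings, "a", "b"), item_adjusted_cosine(ratings, "a", "c"))
```

With this data the plain cosine is positive for both pairs, whereas the adjusted cosine is positive for ("a", "b") and negative for ("a", "c"), which reflects the opposite preferences more clearly.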
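Conditional-probability similarity: the similarity of item i to item j can also be defined as the conditional probability of purchasing item j given that item i has been purchased:
$$sim(i,j) = P(j \mid i) = \frac{Freq(ij)}{Freq(i)}$$
where $Freq(X)$ is the number of users who have purchased all the items in X.
There is a problem: two items may have no real similarity, yet the similarity comes out very high simply because item j is purchased very frequently.
Solution: normalize each row of the user-item matrix R to the same length.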
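The conditional-probability similarity, and the frequent-item problem it suffers from, can be illustrated with a short sketch. It assumes each user's purchases are given as a set of items; `conditional_similarity` and the sample baskets are hypothetical, and the normalization step is not shown.

```python
def conditional_similarity(baskets, i, j):
    """Estimate P(j | i): the share of buyers of item i who also bought item j."""
    freq_i = sum(1 for b in baskets if i in b)
    freq_ij = sum(1 for b in baskets if i in b and j in b)
    return freq_ij / freq_i if freq_i else 0.0

# Hypothetical baskets: every user bought a "bag", whatever else they bought.
baskets = [
    {"pen", "bag"},
    {"pen", "ink", "bag"},
    {"milk", "bag"},
    {"milk", "bread", "bag"},
]
print(conditional_similarity(baskets, "pen", "bag"))  # 1.0
print(conditional_similarity(baskets, "pen", "ink"))  # 0.5
print(conditional_similarity(baskets, "bag", "pen"))  # 0.5
```

The universally purchased "bag" scores higher against "pen" than the genuinely related "ink" does, which is the problem described above. The measure is also asymmetric: sim(pen, bag) differs from sim(bag, pen).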
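Generating recommendations:
Let $NN_i$ denote the nearest-neighbor set of the target item i. By analogy with the user-based case, the predicted rating of user u for item i is
$$P_{u,i} = \bar{R}_i + \frac{\sum_{j\in NN_i} sim(i,j)\,(R_{u,j}-\bar{R}_j)}{\sum_{j\in NN_i} |sim(i,j)|}$$
where $R_{u,j}$ denotes user u's rating of item j, and $\bar{R}_i$ and $\bar{R}_j$ denote the average ratings of item i and item j.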
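Item-based prediction can be sketched by analogy with the user-based case: start from the target item's average rating and add the similarity-weighted deviations of the user's ratings on neighboring items. This is an assumed form rather than a confirmed reproduction of the original formula. The sketch uses cosine similarity between item columns with missing ratings as 0, and restricts neighbors to the k most similar items the user has rated; all names and data are hypothetical.

```python
import math

def item_mean(ratings, item):
    """Average rating an item received (0.0 if nobody rated it)."""
    vals = [r[item] for r in ratings.values() if item in r]
    return sum(vals) / len(vals) if vals else 0.0

def item_cosine(ratings, i, j):
    users = sorted(ratings)
    a = [ratings[u].get(i, 0.0) for u in users]
    b = [ratings[u].get(j, 0.0) for u in users]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def predict_item_based(ratings, user, target, k=2):
    """Target item's mean plus weighted deviations on the k nearest rated items."""
    rated = [j for j in ratings[user] if j != target]
    neighbors = sorted(((item_cosine(ratings, target, j), j) for j in rated),
                       reverse=True)[:k]
    num = sum(s * (ratings[user][j] - item_mean(ratings, j)) for s, j in neighbors)
    den = sum(abs(s) for s, _ in neighbors)
    base = item_mean(ratings, target)
    return base + num / den if den else base

# Hypothetical data: u4 has not rated item "d".
ratings = {
    "u1": {"a": 5, "b": 4, "c": 1, "d": 5},
    "u2": {"a": 4, "b": 5, "c": 2, "d": 4},
    "u3": {"a": 1, "b": 2, "c": 5, "d": 1},
    "u4": {"a": 5, "b": 5, "c": 1},
}
print(predict_item_based(ratings, "u4", "d"))
```

Because u4 rates the items most similar to "d" above their averages, the prediction lands above the average rating of "d".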
Collaborative filtering based on dimensionality reduction
Disadvantage: recommendation accuracy decreases.
Advantages: it alleviates the data sparsity problem and reduces computational overhead.
Cluster_based collaborative filtering algorithm
The entire user space is divided into a number of clusters according to users' purchasing habits and rating characteristics, so that users within a cluster rate items as similarly as possible and users in different clusters rate items as differently as possible.
The main steps for clustering the user space with the K-means algorithm are:
1) Randomly select K users as seeds, and use those K users' item ratings as the initial cluster centers.
2) For each remaining user, compute the similarity to each of the K cluster centers and assign the user to the cluster with the highest similarity.
3) For each newly formed cluster, compute the average rating of each item over all users in the cluster to produce the new cluster center.
4) Repeat steps 2-3 until the clusters no longer change.
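The four steps above can be sketched in plain Python. This is an illustrative sketch only: it assumes each user is a dense list of item ratings, uses cosine similarity for step 2, and adds a hypothetical `init` parameter so the seed users can be fixed for a reproducible example (by default they are drawn at random, as in step 1).

```python
import math
import random

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def kmeans_users(vectors, k, init=None, seed=0, max_iter=100):
    """Cluster user rating vectors; returns (cluster index per user, centers)."""
    rng = random.Random(seed)
    # Step 1: choose K seed users; their ratings are the initial centers.
    seeds = list(init) if init is not None else rng.sample(range(len(vectors)), k)
    centers = [list(vectors[s]) for s in seeds]
    assignment = None
    for _ in range(max_iter):
        # Step 2: assign every user to the most similar center.
        new_assignment = [
            max(range(k), key=lambda c: cosine(v, centers[c])) for v in vectors
        ]
        # Step 4: stop once the clustering no longer changes.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: each new center is the per-item average over the cluster's users.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centers

# Hypothetical data: the first three users like items 1-2, the last three items 3-4.
users = [
    [5, 5, 1, 1], [4, 5, 1, 2], [5, 4, 2, 1],
    [1, 1, 5, 5], [2, 1, 4, 5], [1, 2, 5, 4],
]
labels, centers = kmeans_users(users, k=2, init=[0, 3])
print(labels)
```

With random seeding the cluster indices, and occasionally the clusters themselves, depend on which users are drawn, which is why the example fixes the seeds.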