1. The purpose of the recommendation system
Information overload
Recommendation system
- A recommendation system is a response to information overload: faced with a flood of data, it quickly recommends the items that match a user's characteristics. It helps people who suffer from "choice paralysis" and people who have no clear need in mind.
- It addresses how users find the information they are interested in among a large volume of information.
- It addresses how producers make their own information stand out and reach a wide audience.
Purpose:
- Let users find content that meets their needs faster and better
- Push content faster and better to the users who will like it
- Let the website (platform) retain its users more effectively
2. The basic idea of recommendation systems
Know what you want, push precisely
- Use the feature information of users and items to recommend items that have the features the user likes.
Birds of a feather flock together
- Use the items a user has liked to recommend other items that are similar to them.
People fall into groups
- Use other users who are similar to the user to recommend the items that those similar users have liked.
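As a rough illustration of the third idea (recommending through similar users), the sketch below uses a tiny made-up set of liked items. The data, the function names and the choice of Jaccard similarity are all assumptions made for illustration; the notes do not prescribe a particular similarity measure.

```python
# Hypothetical toy data: user -> set of liked items (made up for illustration).
likes = {
    "alice": {"book_a", "book_b", "book_c"},
    "bob": {"book_a", "book_b", "book_d"},
    "carol": {"book_e"},
}

def jaccard(a, b):
    """Similarity of two sets: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_from_similar_users(user, likes, k=1):
    """Recommend items liked by the k most similar users but not yet by `user`."""
    others = [u for u in likes if u != user]
    # Rank the other users by similarity to `user`, most similar first.
    neighbours = sorted(
        others, key=lambda u: jaccard(likes[user], likes[u]), reverse=True
    )[:k]
    candidates = set()
    for u in neighbours:
        candidates |= likes[u] - likes[user]
    return sorted(candidates)

print(recommend_from_similar_users("alice", likes))  # prints ['book_d']
```

Here "bob" shares two of four distinct items with "alice", so he is her nearest neighbour and his remaining item is recommended.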
3. Data analysis for recommendation systems
- Metadata of the items or content to be recommended, such as keywords, category labels, and other "gene" descriptions;
- Basic information about the system's users, such as gender, age, and interest tags.
- User behavior data, which can be converted into the user's preference for items. Depending on the application, it may include the user's ratings of items, records of the items the user viewed, and the user's purchase records. This preference information falls into two categories:
① Explicit user feedback: information the user provides deliberately, beyond simply browsing or using the site, such as rating an item or commenting on it.
② Implicit user feedback: data generated while the user uses the site, which implicitly reflects the user's preference for items, such as buying an item or viewing information about an item.
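The conversion of behavior data into preferences might be sketched as follows. The event names, the record layout and the weights for implicit events are hypothetical; real values depend on the application itself.

```python
from collections import defaultdict

# Hypothetical weights for implicit events; real values depend on the application.
IMPLICIT_WEIGHTS = {"view": 1.0, "purchase": 3.0}

def build_preferences(events):
    """Split (user, item, event_type, value) records into two preference tables.

    Explicit feedback ("rating") keeps the value the user gave.
    Implicit feedback is accumulated with the weights above.
    """
    explicit = {}
    implicit = defaultdict(float)
    for user, item, event_type, value in events:
        if event_type == "rating":
            explicit[(user, item)] = float(value)
        elif event_type in IMPLICIT_WEIGHTS:
            implicit[(user, item)] += IMPLICIT_WEIGHTS[event_type]
    return explicit, dict(implicit)

events = [
    ("u1", "item_a", "view", None),
    ("u1", "item_a", "purchase", None),
    ("u1", "item_b", "rating", 5),
    ("u1", "item_b", "view", None),
]
explicit, implicit = build_preferences(events)
print(explicit)
print(implicit)
```

The two tables are kept apart in this sketch because a rating and an accumulated implicit score live on different scales; how to merge them is left to the application.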
4. Classification of recommendation systems
- By real-time behavior
Offline recommendation
Real-time recommendation
- By recommendation principle
Similarity-based recommendation
Knowledge-based recommendation
Model-based recommendation
- By whether the recommendations are personalized
Statistics-based recommendation
Personalized recommendation
- By data source
Demographic-based recommendation
Content-based recommendation
Collaborative-filtering-based recommendation
Hybrid recommendation
Real-world recommendation systems rarely rely on a single recommendation mechanism or strategy; they usually combine several methods to achieve better results. The more popular ways of combining them are:
Weighted hybrid
- Combine the results of several recommenders with a linear formula, giving each one a weight. The specific weights have to be tuned through repeated experiments on a test data set to achieve the best recommendation results.
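A minimal sketch of the weighted hybrid, assuming each recommender returns a dictionary of item scores on a comparable scale. The function name, the sample scores and the equal weights are illustrative only; in practice the weights would come from the experiments described above.

```python
def weighted_hybrid(score_lists, weights):
    """Combine per-recommender item scores with a linear formula.

    score_lists: list of dicts mapping item -> score, one dict per recommender.
    weights: one weight per recommender, in the same order.
    """
    if len(score_lists) != len(weights):
        raise ValueError("need exactly one weight per recommender")
    combined = {}
    for scores, weight in zip(score_lists, weights):
        for item, score in scores.items():
            combined[item] = combined.get(item, 0.0) + weight * score
    # Highest combined score first.
    return sorted(combined.items(), key=lambda pair: pair[1], reverse=True)

# Made-up scores from two hypothetical recommenders.
content_scores = {"item_a": 0.9, "item_b": 0.2}
cf_scores = {"item_b": 0.8, "item_c": 0.5}
ranking = weighted_hybrid([content_scores, cf_scores], weights=[0.5, 0.5])
print([item for item, _ in ranking])  # prints ['item_b', 'item_a', 'item_c']
```

An item missing from one recommender simply contributes nothing from that recommender, which is one possible convention among several.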
Switching hybrid
- A switching hybrid selects the most suitable recommendation mechanism for the situation at hand (data volume, system operating conditions, number of users and items, and so on) and uses it to compute the recommendations.
Partitioned hybrid
- Use several recommendation mechanisms and show their different results to the user in different sections.
Layered hybrid
- Use several recommendation mechanisms and feed the result of one mechanism into another as its input, combining the strengths and offsetting the weaknesses of each to produce more accurate recommendations.
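One possible reading of the layered hybrid is a two-stage pipeline: a first recommender produces candidates and a second one re-ranks them. The sketch below assumes that reading; both stage functions and their scores are hypothetical stand-ins for real recommenders.

```python
def layered_hybrid(first_stage, second_stage, user, n_candidates=3, n_final=2):
    """Feed the output of one recommender into another.

    first_stage(user) returns a ranked list of candidate items.
    second_stage(user, items) returns a score for each candidate item.
    """
    candidates = first_stage(user)[:n_candidates]
    scores = second_stage(user, candidates)
    reranked = sorted(candidates, key=lambda item: scores[item], reverse=True)
    return reranked[:n_final]

# Hypothetical stand-ins for two real recommenders.
def popular_items(user):
    return ["item_a", "item_b", "item_c", "item_d"]

def personal_scores(user, items):
    made_up = {"item_a": 0.1, "item_b": 0.7, "item_c": 0.4, "item_d": 0.9}
    return {item: made_up[item] for item in items}

result = layered_hybrid(popular_items, personal_scores, "u1")
print(result)  # prints ['item_b', 'item_c']
```

Note that "item_d" never reaches the second stage because the first stage cut it off, which shows how the quality of the earlier layer limits the later one.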
5. Evaluating a recommendation system
- Users find content that meets their needs faster and better
- Content is pushed faster and better to the users who will like it
- The website (platform) retains its users more effectively
6. Recommendation system experiments
Offline experiments
- Obtain user behavior data from the logging system and generate a standard data set in a fixed format
- Split the data set into a training set and a test set according to some rule
- Train a user interest model on the training set and make predictions on the test set
- Evaluate the algorithm's predictions on the test set with predefined offline metrics
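The four offline steps can be sketched end to end as below. Everything here is a placeholder chosen for brevity: the records are generated rather than logged, the split rule is a seeded random shuffle, the "interest model" is just each user's mean rating, and the metric is the mean absolute error.

```python
import random

def split_dataset(records, test_ratio=0.2, seed=42):
    """Shuffle the records and split them into a training set and a test set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

def train_mean_model(train):
    """A deliberately trivial 'interest model': each user's mean rating."""
    totals, counts = {}, {}
    for user, _item, rating in train:
        totals[user] = totals.get(user, 0.0) + rating
        counts[user] = counts.get(user, 0) + 1
    global_mean = sum(totals.values()) / sum(counts.values())
    user_means = {u: totals[u] / counts[u] for u in totals}
    # Users unseen in training fall back to the global mean.
    return lambda user: user_means.get(user, global_mean)

def mean_absolute_error(model, test):
    """Offline metric: average absolute gap between predicted and true rating."""
    return sum(abs(model(u) - r) for u, _i, r in test) / len(test)

# Hypothetical (user, item, rating) records standing in for logged data.
records = [("u%d" % (i % 3), "item_%d" % i, 1 + i % 5) for i in range(20)]
train, test = split_dataset(records)
model = train_mean_model(train)
print(len(train), len(test), mean_absolute_error(model, test))
```

A random split is only one possible rule; splitting by time is another common choice when the behavior data carries timestamps.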
User survey
- A user survey needs some real users who complete tasks on the recommendation system under test. Their behavior is recorded while they do so, they are asked to answer some questions, and finally the results are analyzed.
Online experiments
- A/B testing
7. Evaluation metrics for recommendation systems
Prediction accuracy, user satisfaction, coverage, diversity, serendipity, trust, real-time performance, robustness, business goals
Evaluating recommendation accuracy
Rating prediction
- Many sites let users rate items. If you know a user's rating history, you can learn an interest model from it and predict the ratings the user would give to new items
- Rating prediction accuracy is generally measured by root mean squared error (RMSE) or mean absolute error (MAE)
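Both error measures are short enough to write out directly. The sketch below uses the standard definitions, with made-up predicted and actual ratings; the function names are chosen for illustration.

```python
import math

def _check(predicted, actual):
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")

def rmse(predicted, actual):
    """Root mean squared error: square root of the mean squared difference."""
    _check(predicted, actual)
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))

def mae(predicted, actual):
    """Mean absolute error: mean of the absolute differences."""
    _check(predicted, actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Made-up ratings on a 1-5 scale.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 4.0]
print(round(rmse(predicted, actual), 4))  # prints 1.2247
print(mae(predicted, actual))             # prints 1.0
```

RMSE comes out higher than MAE here because squaring penalizes the single error of 2 more heavily than the two errors of 1.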
Top-N recommendation
- When a site provides a recommendation service, it usually gives the user a personalized list of recommendations; this is called Top-N recommendation
- Top-N prediction accuracy is generally measured by precision and recall
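For a single user, precision and recall of a Top-N list can be computed as in the sketch below. The item names are made up, and "relevant" is assumed to mean the items the user actually liked in the test set.

```python
def precision_recall_at_n(recommended, relevant):
    """Precision and recall of one user's Top-N list.

    recommended: the N items shown to the user.
    relevant: the items the user actually liked in the test set.
    """
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

recommended = ["item_a", "item_b", "item_c", "item_d"]
relevant = {"item_b", "item_d", "item_e"}
precision, recall = precision_recall_at_n(recommended, relevant)
print(precision)  # prints 0.5
```

Two of the four recommended items are relevant (precision 2/4), and two of the three relevant items were recommended (recall 2/3).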
Accuracy, precision and recall
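Suppose a class has 80 boys and 20 girls, 100 people in total, and the goal is to find all the girls. Someone picks out 50 people: 20 of them are girls, and the other 30 are boys mistakenly picked as girls. How should this work be assessed?
The result of the selection can be summarized in a matrix with four categories, TP, FN, FP and TN:
- TP (true positives): 20 girls who were picked
- FN (false negatives): 0 girls who were missed
- FP (false positives): 30 boys who were picked by mistake
- TN (true negatives): 50 boys who were correctly left out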
Accuracy
- The ratio of correctly classified items to the total number of items:
- A = (20 + 50) / 100 = 70%
Precision
- Of all the items retrieved, the proportion that should have been retrieved:
- P = 20 / (20 + 30) = 40%
Recall
- Of all the items that should have been retrieved, the proportion that actually were retrieved:
- R = 20 / (20 + 0) = 100%
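The three figures can be checked with a few lines of arithmetic. The counts below are taken from the class example; the variable names follow the TP, FN, FP, TN categories.

```python
# Counts from the class example: 20 girls, 80 boys, 50 people picked.
tp = 20  # girls picked (correctly retrieved)
fp = 30  # boys picked by mistake
fn = 0   # girls missed
tn = 50  # boys correctly left out

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(f"accuracy={accuracy:.0%} precision={precision:.0%} recall={recall:.0%}")
# prints: accuracy=70% precision=40% recall=100%
```

The example shows why the three measures should be read together: every girl was found (recall 100%), yet more than half of the people picked were wrong (precision 40%).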