Chapter 15 Application of Big Data in the Internet Field
1. Recommendation system
1.1 Introduction to Recommendation System
The recommendation system is a typical application of big data in the Internet field. It learns a user's preferences by analyzing the user's historical records, and then proactively recommends information the user is likely to be interested in, meeting the need for personalized recommendations.
A recommendation system is a tool that automatically connects users with items. Unlike a search engine, which relies on the user to supply an explicit query, a recommendation system performs personalized computation by studying the user's interests and preferences. It can discover users' points of interest and help them uncover their potential needs within massive amounts of information.
Role: The recommendation system can create new business and economic models and help sell long-tail products. Sales can be increased by discovering long-tail items and recommending them to interested users, which requires personalized recommendation.
1.2 Recommendation methods
The essence of the recommendation system is to establish the connection between users and items. The recommendation methods include the following categories:
Expert recommendation: manual recommendation, in which experienced professionals screen and recommend items; it incurs high labor costs
Statistics-based recommendation: recommendations based on statistical information (such as popularity rankings); easy to implement, but weak at capturing users' personalized preferences
Content-based recommendation: uses machine learning to describe the features of the content and finds similar content based on those features
Collaborative filtering recommendation: one of the earliest and most successful recommendation methods; it uses the existing ratings given by users similar to the target user to predict the target user's preference for a specific item
Hybrid recommendation: combines multiple recommendation algorithms to improve recommendation quality
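As an illustration of the statistics-based method listed above, the following is a minimal sketch of a popularity recommender. The function name recommend_popular, the (user, item) pair format, and the sample data are assumptions made for this example, not part of the original notes.

```python
from collections import Counter


def recommend_popular(interactions, user, k=2):
    """Return the k most popular items that `user` has not interacted with.

    `interactions` is a list of (user, item) pairs (hypothetical format).
    """
    # Popularity = number of interactions recorded for each item.
    counts = Counter(item for _, item in interactions)
    # Items the target user already knows are excluded.
    seen = {item for u, item in interactions if u == user}
    # Sort by descending count; break ties by item name for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked if item not in seen][:k]


interactions = [
    ("alice", "book"), ("bob", "book"), ("carol", "book"),
    ("bob", "lamp"), ("carol", "lamp"),
    ("carol", "pen"),
]
print(recommend_popular(interactions, "alice"))  # ['lamp', 'pen']
```

Every user who has seen the same items receives the same list, which shows why this method is weak at capturing personalized preferences.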
1.3 Recommendation system model
A complete recommendation system usually includes three modules: a user modeling module, a recommendation object modeling module, and a recommendation algorithm module
User modeling module: analyzes user interests and needs based on user behavior data and attribute data
Recommendation object modeling module: models the recommendation objects based on object data
Recommendation algorithm module: based on user features and item features, computes the objects the user may be interested in, adjusts the recommendation results according to the scenario, and presents them to the user
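The three modules above can be sketched as three small functions. This is only an illustrative toy under stated assumptions: users and items are both described by tags, and the score is a simple tag overlap. The function names, the tag representation, and the sample data are all hypothetical.

```python
from collections import Counter


def build_user_profile(history, item_tags):
    """User modeling: count the tags of the items in the user's history."""
    profile = Counter()
    for item in history:
        profile.update(item_tags.get(item, ()))
    return profile


def build_item_profile(tags):
    """Recommendation object modeling: represent an item as a set of tags."""
    return set(tags)


def recommend(profile, item_profiles, exclude, k=2):
    """Recommendation algorithm: rank unseen items by tag overlap."""
    scores = {
        item: sum(profile[tag] for tag in tags)  # missing tags count as 0
        for item, tags in item_profiles.items()
        if item not in exclude
    }
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked if score > 0][:k]


item_tags = {
    "m1": ["sci-fi", "action"],
    "m2": ["sci-fi"],
    "m3": ["romance"],
    "m4": ["action", "comedy"],
}
history = ["m1"]

profile = build_user_profile(history, item_tags)
item_profiles = {item: build_item_profile(tags) for item, tags in item_tags.items()}
print(recommend(profile, item_profiles, exclude=set(history)))  # ['m2', 'm4']
```

A real system would replace each function with a far richer component; the point here is only the division of responsibilities among the three modules.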
2. Collaborative filtering
As the earliest and best-known recommendation algorithm, collaborative filtering has been studied in depth in academia and remains widely used in industry. It can be divided into user-based collaborative filtering and item-based collaborative filtering
2.1 User-based collaborative filtering
The user-based collaborative filtering algorithm (UserCF for short) rests on the idea of "similar tastes": users with similar interests tend to prefer the same items. When the target user needs personalized recommendations, first find a group of users whose interests are similar to the target user's, then recommend to the target user the items that this group likes and that the target user has not yet heard of
The implementation of the UserCF algorithm mainly includes two steps:
Step 1: Find a collection of users with similar interests to the target user
Step 2: Find items that the users in the collection like and that the target user has not heard of, and recommend them to the target user
Key: The key step in implementing the UserCF algorithm is to calculate the interest similarity between users.
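The two steps above can be sketched as follows. The notes do not fix a similarity measure, so this example assumes cosine similarity over sets of liked items, w(u, v) = |N(u) ∩ N(v)| / sqrt(|N(u)| · |N(v)|), and it builds an item-to-user index so that only users who share at least one item are compared. The function names, the parameters k and n, and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def user_similarity(likes):
    """Cosine similarity between users; `likes` maps user -> set of items."""
    # Item -> users index, so only co-occurring user pairs are counted.
    item_users = defaultdict(set)
    for user, items in likes.items():
        for item in items:
            item_users[item].add(user)
    # overlap[u][v] = number of items liked by both u and v.
    overlap = defaultdict(lambda: defaultdict(int))
    for users in item_users.values():
        for u in users:
            for v in users:
                if u != v:
                    overlap[u][v] += 1
    return {
        u: {v: c / math.sqrt(len(likes[u]) * len(likes[v]))
            for v, c in neighbours.items()}
        for u, neighbours in overlap.items()
    }


def user_cf(likes, target, k=2, n=3):
    """Recommend up to n items liked by the k users most similar to target."""
    sims = user_similarity(likes).get(target, {})
    # Step 1: the k most similar users (ties broken by user name).
    neighbours = sorted(sims.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    # Step 2: score unseen items by the summed similarity of their fans.
    scores = defaultdict(float)
    for v, w in neighbours:
        for item in likes[v]:
            if item not in likes[target]:
                scores[item] += w
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]


likes = {
    "A": {"a", "b"},
    "B": {"a", "b", "c"},
    "C": {"a", "c", "d"},
    "D": {"b", "c"},
}
print(user_cf(likes, "A", k=2))  # ['c']
print(user_cf(likes, "A", k=3))  # ['c', 'd']
```

With k=2 only users B and D are consulted, so item d, liked only by the less similar user C, appears once k is raised to 3.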
2.2 Item-based collaborative filtering
The item-based collaborative filtering algorithm (ItemCF for short) is currently one of the most widely used algorithms in industry.
ItemCF recommends to the target user items that are similar to the items the user liked before. It computes the similarity between items mainly by analyzing users' behavior records
Like UserCF, ItemCF proceeds in two steps:
Step 1: Calculate the similarity between items
Step 2: Generate a recommendation list for the user based on item similarity and the user's historical behavior
Key: ItemCF calculates item similarity by building a user-to-item inverted list (for each user, the list of items that user likes)
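The two ItemCF steps can be sketched in the same style. This example assumes a co-occurrence-based cosine similarity, w(i, j) = |N(i) ∩ N(j)| / sqrt(|N(i)| · |N(j)|), where N(i) is the set of users who like item i; the notes do not specify the formula. The function names, the parameter n, and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def item_similarity(likes):
    """Item-item similarity from the user -> items list in `likes`."""
    count = defaultdict(int)                    # users who like each item
    co = defaultdict(lambda: defaultdict(int))  # users who like both i and j
    for items in likes.values():                # walk each user's item list
        for i in items:
            count[i] += 1
            for j in items:
                if i != j:
                    co[i][j] += 1
    return {
        i: {j: c / math.sqrt(count[i] * count[j]) for j, c in related.items()}
        for i, related in co.items()
    }


def item_cf(likes, target, n=3):
    """Recommend up to n unseen items similar to what target already likes."""
    sims = item_similarity(likes)               # Step 1
    scores = defaultdict(float)
    for i in likes[target]:                     # Step 2
        for j, w in sims.get(i, {}).items():
            if j not in likes[target]:
                scores[j] += w
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]


likes = {
    "A": {"a", "b"},
    "B": {"a", "b", "c"},
    "C": {"a", "c", "d"},
    "D": {"b", "c"},
}
print(item_cf(likes, "A"))  # ['c', 'd']
```

Item c ranks first because it co-occurs with both items that user A already likes, while d co-occurs only with a.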
2.3 Comparison between UserCF algorithm and ItemCF algorithm
- UserCF recommendations are more social, while ItemCF recommendations are more personalized
- UserCF: as the number of users grows, computing user similarity becomes increasingly expensive. In addition, the relevance of UserCF recommendation results is relatively weak, the results are hard to explain, and they are easily swayed by mass behavior, so popular items tend to be recommended
- ItemCF tends to recommend items similar to those the user has already purchased, so it often suffers from insufficient diversity and low novelty