A Brief Analysis of Website User Behavior and Recommendation


I. Mining background

With e-commerce now widespread, more and more businesses offer online services and online trading platforms. For these businesses, the more visits a site receives, the more data it accumulates, and the information that large numbers of users leave on the platform adds up to a massive data set. How to filter valuable information out of this data, study users' interests and preferences, analyze their needs and behavior, guide them toward what they need, and recommend services accurately so that the service is better targeted has become a central concern for businesses.

Taking an educational website as an example, this article briefly describes how to predict user behavior from user preference information, help users discover what they need, and make recommendations.


II. Data extraction and analysis

During data extraction, select as much data as possible; this reduces the randomness of the recommendation results, improves accuracy, and makes it easier to discover the items users are interested in. Using the user's access time as the selection condition, the access data from the most recent 3 months is taken as the original data set. To avoid preference differences between users in different regions, this example extracts only the access data of users in one southern province for analysis. The data set contains 968,435 records with the following fields: user account, access time, source site, visited page, topic, source page, category, and keyword.
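The selection rule described above can be sketched in a few lines of standard-library Python. This is only an illustration under assumptions: the field names (`user_id`, `access_time`, `province`, `page`), the timestamp format, the region label, and the sample rows are all hypothetical and are not taken from the site's real logs.

```python
from datetime import datetime, timedelta

# Hypothetical log records; field names and values are assumptions.
records = [
    {"user_id": "u1", "access_time": "2016-03-10 09:15:00", "province": "south", "page": "/course/1.html"},
    {"user_id": "u2", "access_time": "2015-11-01 20:05:00", "province": "south", "page": "/course/2.html"},
    {"user_id": "u3", "access_time": "2016-02-20 13:40:00", "province": "north", "page": "/course/1.html"},
    {"user_id": "u1", "access_time": "2016-01-15 08:00:00", "province": "south", "page": "/ask/7.html"},
]

def select_window(records, end, days=90, province="south"):
    """Keep records from one province that fall within `days` before `end`."""
    start = end - timedelta(days=days)
    selected = []
    for r in records:
        t = datetime.strptime(r["access_time"], "%Y-%m-%d %H:%M:%S")
        if r["province"] == province and start <= t <= end:
            selected.append(r)
    return selected

# Roughly 3 months (90 days) of data ending on an assumed cut-off date.
window = select_window(records, end=datetime(2016, 4, 1))
```

In a real pipeline the same filter would more likely be pushed down into the database query rather than applied in memory.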

For the original data, analyze the distribution of each dimension, such as page type, click count, and page ranking, to uncover its internal patterns. On this basis, clean and transform the raw data, apply attribute reduction, and extract the attributes the model requires.
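As a rough illustration of this step, the sketch below counts clicks per page type and per page, drops incomplete rows, and reduces each row to the (user, page) pair a collaborative filtering model would need. The row layout, the page types, and the cleaning rule (discard rows without a user account) are assumptions made for the example.

```python
from collections import Counter

# Hypothetical rows: (user_id, page_url, page_type). All names are assumptions.
rows = [
    ("u1", "/course/101.html", "course"),
    ("u2", "/course/101.html", "course"),
    ("u1", "/ask/7.html", "question"),
    ("", "/course/102.html", "course"),      # missing user account: dropped in cleaning
    ("u3", "/index.html", "home"),
    ("u1", "/course/101.html", "course"),    # repeat visit by the same user
]

# Cleaning: discard rows that cannot be tied to a user.
clean = [r for r in rows if r[0]]

# Distribution analysis: clicks per page type and per page.
clicks_by_type = Counter(page_type for _, _, page_type in clean)
clicks_by_page = Counter(url for _, url, _ in clean)

# Attribute reduction: keep only the distinct (user, page) pairs.
user_page_pairs = sorted({(user, url) for user, url, _ in clean})
```

The counters give a quick view of which page types dominate the traffic, which helps decide which pages are worth keeping as recommendation candidates.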


III. The model

An e-commerce recommendation system relies mainly on statistics and data mining techniques. Based on how users access the site, it proactively offers them recommendations, which improves the user experience and promotes consumption. Different business needs call for different forms of recommendation, such as product recommendation, category recommendation, and tag recommendation. The commonly used recommendation models are rule-based models, collaborative filtering models, and content-based models, and each uses different algorithms: rule-based models commonly use Apriori, for example, while collaborative filtering models involve the K-nearest-neighbor algorithm and factor models. In practice, a single recommendation method is rarely used on its own. To achieve the desired effect, the results of several recommendation methods are usually combined to produce the final recommendation.
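One simple way to combine the results of several methods is a weighted sum of their scores. The sketch below assumes that each recommender returns a dictionary of item scores on a comparable scale; the function name `combine`, the weights, and the scores are hypothetical and only illustrate the idea, not a method described in this article.

```python
def combine(score_lists, weights, top_n=3):
    """Blend item scores from several recommenders by a weighted sum."""
    total = {}
    for scores, w in zip(score_lists, weights):
        for item, s in scores.items():
            total[item] = total.get(item, 0.0) + w * s
    # Highest blended score first; ties broken by item name for stable output.
    ranked = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical outputs of a rule-based, a collaborative filtering,
# and a content-based recommender for one user.
rule_scores = {"A": 0.9, "B": 0.2}
cf_scores = {"B": 0.8, "C": 0.6}
content_scores = {"C": 0.5, "D": 0.4}

blended = combine([rule_scores, cf_scores, content_scores], weights=[0.2, 0.5, 0.3])
```

The weights would normally be tuned against business metrics rather than fixed by hand.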

Given the specific business scenario and actual conditions of this example, the target of analysis has the following characteristics: users have a strong demand for personalization, recommendation results change in real time, there are many long-tail pages, and the number of pages is smaller than the number of users. This example therefore relies mainly on a collaborative filtering algorithm to give users personalized recommendations. Collaborative filtering is one of the more successful techniques in recommendation systems and has been applied in many successful systems.

The general procedure of an item-based collaborative filtering system is as follows: first, analyze the data set of users and items; second, use users' preferences for items to find similar items; finally, recommend similar items to the target user according to that user's historical preferences. Accordingly, the item-based collaborative filtering algorithm consists of two main steps: (1) compute the similarity between items; (2) combine the item similarities with the user's historical behavior to generate a recommendation list for the target user. Item similarity can be computed with cosine similarity, the Jaccard similarity coefficient, or the correlation coefficient; the principles and formulas are omitted here.
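For readers who want to see the three similarity measures in concrete form, here is a minimal NumPy sketch using their standard textbook definitions. It assumes each item is represented as a vector with one entry per user (1 if the user visited the page, 0 otherwise); the function names and sample vectors are hypothetical, and "correlation coefficient" is taken here to mean the Pearson coefficient.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine of the angle between two item vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def jaccard_sim(a, b):
    """Users who visited both items divided by users who visited either."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 0.0

def pearson_sim(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    a_c, b_c = a - a.mean(), b - b.mean()
    denom = np.linalg.norm(a_c) * np.linalg.norm(b_c)
    return float(a_c @ b_c / denom) if denom else 0.0

# Two hypothetical items, each described by the visits of four users.
item_x = [1, 1, 0, 1]
item_y = [1, 0, 0, 1]
```

For implicit 0/1 visit data, cosine and Jaccard are common choices; the correlation coefficient tends to be more informative when explicit ratings are available.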

Once the similarity between every pair of items has been computed, an item-to-item similarity matrix can be generated. With Python's NumPy it is not difficult to implement the collaborative filtering algorithm; part of the code is shown below:

Partial sample code
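The original listing is not reproduced in this copy of the article. As a stand-in, the following is a minimal sketch of item-based collaborative filtering with NumPy; it is not the author's code. It assumes a small 0/1 user-item matrix `R` (rows are users, columns are pages), cosine similarity between item columns, and a score for each unseen item equal to the sum of its similarities to the pages the user has already visited. The matrix and the function names are hypothetical.

```python
import numpy as np

# Hypothetical user-item matrix: rows are users, columns are pages; 1 = visited.
R = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
], dtype=float)

def item_similarity(R):
    """Cosine similarity between item columns, with the diagonal zeroed."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0            # avoid division by zero for unvisited items
    sim = (R.T @ R) / np.outer(norms, norms)
    np.fill_diagonal(sim, 0.0)         # an item should not recommend itself
    return sim

def recommend(R, sim, user, top_n=2):
    """Rank unseen items by summed similarity to the user's visited items."""
    scores = R[user] @ sim
    scores[R[user] > 0] = -np.inf      # never recommend already-visited items
    ranked = np.argsort(-scores)
    # Items with no similarity evidence (score 0) are skipped.
    return [int(i) for i in ranked[:top_n] if scores[i] > 0]

sim = item_similarity(R)
recs_user0 = recommend(R, sim, user=0)
recs_user3 = recommend(R, sim, user=3)
```

With data on the scale described earlier, a dense matrix like this would likely be replaced by a sparse representation, but the two steps (similarity matrix, then scoring against the user's history) stay the same.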

This example models the data with the most basic collaborative filtering algorithm, so the resulting model and its results are only preliminary. In practical applications, the model needs to be analyzed together with the business and further adjusted to meet business needs.


IV. Summary

Besides the collaborative filtering algorithm described above, recommendation systems use a number of other common algorithms. The purpose of a recommendation system is to predict user behavior from user preference information and help users discover items they may be interested in but would not easily find on their own. At the same time, recommendation models face several important problems: the feature extraction problem, that is, how to obtain the important features of an item from its tags, categories, and attributes; the new-user problem, that is, how to improve recommendation quality when little user behavior is available; the new-item problem, that is, how to give new items more chances to be recommended and displayed; and the sparsity problem, in which user feedback is very sparse relative to the large volume of user and item data. To address these problems in practice, the strengths of several algorithms should be combined according to the business scenario, and a hybrid recommendation algorithm designed to improve recommendation quality.
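To make the sparsity problem concrete, one can measure what fraction of the user-item matrix actually contains feedback. The sketch below is a small illustration on a hypothetical matrix; the function name `density` and the data are assumptions.

```python
import numpy as np

def density(R):
    """Fraction of user-item cells that contain any feedback."""
    R = np.asarray(R)
    return np.count_nonzero(R) / R.size

# Hypothetical matrix: 3 users, 4 items, only 2 recorded interactions.
R_sparse = np.array([
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
])

sparsity = 1.0 - density(R_sparse)
```

In real systems the density is often far below one percent, which is one reason similarity estimates between rarely visited items can be unreliable.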

Source: Pinlue Library http://www.pinlue.com/