Common recommendation algorithms on the Internet

Original link: Common recommendation algorithms on the Internet, Jinji

When we shop online, read novels, or buy movie tickets, we run into all kinds of recommendations: products of the same type as ones we have favorited or bought, or novels on the same theme as ones we have already read. How are these recommendations produced?

Today we are going to talk about these "boring" algorithms.

In Internet applications, the common recommendation algorithms are: collaborative filtering recommendation (Collaborative Filtering Recommendation), content-based recommendation (Content-based Recommendation), similarity recommendation (Similarity Recommendation), and association rule recommendation (Association Rule Based Recommendation). Different algorithms suit different scenarios, and applying them sensibly can bring us more economic benefit.

Collaborative filtering recommendation algorithm (Collaborative Filtering Recommendation)

Collaborative filtering is arguably the hottest algorithm in e-commerce; many e-commerce platforms use it for product recommendations on their own sites.

What is collaborative filtering?

In short, it finds a group of users who share the same interests and recommends to the user other things that group is interested in.

We use a simple example to illustrate this algorithm:

I like reading web novels. What is the most annoying thing about reading web novels? Running out of books to read. After finishing one book, what should I read next? Try a few chapters of one book after another? That wastes time. Read the reviews? They are written either by trolls or by shills.

This time, collaborative filtering comes in handy.

I am user A, and I like "Need retainers", "broken sky", and "Zhu Xian". These are my interests. How is an interest defined? It could be a book I have read more than 100 chapters of, a book I have favorited, or a book I have rated highly. In short, we must first define a dimension.

Other users will share some of my interests. When another user's interests match mine, the other books that user is interested in can be recommended to me.

Specific implementation steps are as follows:

We need to build a large table: the X axis lists all of our novels (obtainable from the database) and the Y axis lists all of our users (also obtainable from the database).

Then, for each user, we mark the cells where that user intersects with the novels they are interested in (obtainable from the logs). [Figure: the users-by-novels table with each user's interests marked]

Once everything is marked, we can see that my interests (I am Uncle Ge in the table) are "Need retainers", "broken sky", and "Zhu Xian", and that Grandpa Zhang and Uncle Li share the same interests. Grandpa Zhang and Uncle Li have also read some novels that I have not. These extra novels, surfaced through collaborative filtering, can be recommended to me.

The next time I open the novel-reading app, the system can recommend those two novels to me, and there is a good chance I will open them.
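
The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not how a production system would do it: the table is a plain dictionary, "Novel X", "Novel Y", "Novel Z", and the user "Aunt Wang" are made-up placeholders, and "the same interests" is assumed to mean that the other user is interested in everything I am interested in.

```python
# Hypothetical interest table: user -> set of novels the user is interested in.
# "Novel X", "Novel Y", "Novel Z" and "Aunt Wang" are placeholders for this sketch.
interests = {
    "Uncle Ge": {"Need retainers", "broken sky", "Zhu Xian"},
    "Grandpa Zhang": {"Need retainers", "broken sky", "Zhu Xian", "Novel X"},
    "Uncle Li": {"Need retainers", "broken sky", "Zhu Xian", "Novel X", "Novel Y"},
    "Aunt Wang": {"Novel Z"},
}


def recommend(target, interests):
    """Recommend novels liked by users who share all of the target's interests."""
    mine = interests[target]
    scores = {}
    for user, theirs in interests.items():
        if user == target or not mine <= theirs:
            continue  # skip myself and users who do not share my interests
        for novel in theirs - mine:  # the "extra" novels I have not read
            scores[novel] = scores.get(novel, 0) + 1
    # Novels read by more like-minded users come first; ties sort by title.
    return sorted(scores, key=lambda novel: (-scores[novel], novel))


recommendations = recommend("Uncle Ge", interests)
print(recommendations)
```

Real systems usually relax the strict "shares all of my interests" test into a similarity score between users, but the idea is the same: the extra novels of like-minded users become my recommendations.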

Content-based recommendation algorithm (Content-based Recommendation)

What is content-based recommendation?

Content-based recommendation analyzes a user's historical data, abstracts what those records have in common, and then makes recommendations based on that common content.

Again, we use an example to illustrate:

I still like reading novels, and I often search for them with several filters, for example: completed, fantasy, more than two million words. My search behavior becomes my history log. The system abstracts what these log records have in common, searches for novels using the abstracted conditions, and recommends the matching novels to me.

Specific implementation steps are as follows:

Suppose I ran three searches and picked a book to read from each. The three searches were:

Fantasy, male-oriented, completed, more than 2 million words

Fantasy, male-oriented, completed, 1 to 2 million words

Fantasy, male-oriented, serializing, no word-count limit

Abstracting from the searches above, we get a final result: fantasy, male-oriented, any status, any word count. We can search on this result, sort the result set by latest update or by popularity, and recommend it to the user, so that the results match what the user wants.

Of course, if the user has a very large number of query records and this hurts the accuracy of the result, we can improve accuracy in other ways, for example by using only the query records from the last three days, or only the three most recent query records.
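
As a rough sketch of the abstraction step, the Python below keeps a filter only when every search used the same value and otherwise relaxes it to "any". The field names (genre, channel, status, words), the catalog, and the "heat" popularity score are all assumptions made for illustration; the article does not specify a data model.

```python
# The three searches from the example. Field names and values are assumptions.
searches = [
    {"genre": "fantasy", "channel": "male", "status": "completed", "words": "2M+"},
    {"genre": "fantasy", "channel": "male", "status": "completed", "words": "1M-2M"},
    {"genre": "fantasy", "channel": "male", "status": "serializing", "words": "any"},
]


def common_profile(searches, last_n=None):
    """Keep a filter only if every search used the same value; otherwise 'any'."""
    recent = searches[-last_n:] if last_n else searches
    profile = {}
    for field in recent[0]:
        values = {search[field] for search in recent}
        profile[field] = values.pop() if len(values) == 1 else "any"
    return profile


def matches(novel, profile):
    """A novel matches when it satisfies every filter that is not 'any'."""
    return all(value == "any" or novel[field] == value
               for field, value in profile.items())


# Hypothetical catalog; "heat" stands in for popularity.
catalog = [
    {"title": "Novel A", "genre": "fantasy", "channel": "male",
     "status": "completed", "words": "2M+", "heat": 70},
    {"title": "Novel B", "genre": "fantasy", "channel": "male",
     "status": "serializing", "words": "1M-2M", "heat": 95},
    {"title": "Novel C", "genre": "romance", "channel": "female",
     "status": "completed", "words": "2M+", "heat": 99},
]

profile = common_profile(searches)
hits = sorted((novel for novel in catalog if matches(novel, profile)),
              key=lambda novel: novel["heat"], reverse=True)
print(profile)
print([novel["title"] for novel in hits])
```

The `last_n` parameter is one way to apply the "only the most recent query records" idea; filtering by date would work the same way if each record carried a timestamp.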

Similarity recommendation algorithm (Similarity Recommendation)

Both collaborative filtering and content-based recommendation require the user to have a certain amount of historical data, which is then analyzed to make recommendations.

So what do we do for new users? Similarity recommendation can solve this problem.

What is the similarity recommendation algorithm?

Similarity recommendation analyzes the characteristics of items and then recommends items that are similar to one another.

Similarity algorithms rely on the concept of distance (Distance): the smaller the distance between two items, the higher their similarity; the greater the distance, the lower their similarity. Using this concept, when I click on a novel, the system can rank other novels by how similar they are to it and recommend suitable ones to me.

There are many ways to calculate this distance, and the calculation is the core of a similarity algorithm. They include Euclidean distance (Euclidean Distance), Manhattan distance (Manhattan Distance), Minkowski distance (Minkowski Distance), cosine similarity (Cosine Similarity), and many others. I will not go into the specific algorithms here (I could not explain them well anyway, ha ha; if you are interested, you can look them up on Baidu).
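
For readers who want a concrete picture of the measures named above, here is a small Python sketch using their standard textbook definitions on numeric feature vectors. The function names are my own, and the vectors in the usage lines are arbitrary examples rather than anything from the article.

```python
import math


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def manhattan(a, b):
    """Manhattan distance is the Minkowski distance with p = 1."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Euclidean distance is the Minkowski distance with p = 2."""
    return minkowski(a, b, 2)


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (1 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


print(manhattan((0, 0), (3, 4)))
print(euclidean((0, 0), (3, 4)))
print(cosine_similarity((1, 0), (0, 1)))
```

Note that the three distances grow as items become less alike, while cosine similarity works the other way round: a larger value means more similar.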

Specific implementation steps are as follows:

I still like reading novels. I open the book "Zhu Xian". What attributes does this book have?

We define eight attributes for a novel, which amounts to eight dimensions. Taking all of the attributes of "Zhu Xian" as the origin, we use a distance algorithm to compute the distance for each attribute and then add them up. The novel with the smallest total distance is the one to recommend first.

Assume the distance for an attribute is 0 if the two values are the same and 1 otherwise.

We then get: distance = f(theme) + f(author) + f(status) + ...... + f(type)

We go through all the novels and compute the distance for each one. We find the book "pre-Zhu Xian Biography: wild line". Six of its attributes are the same as those of "Zhu Xian" and two are different, so its distance is 2. It is the closest book, so it is the one recommended.

Here we gave every attribute the same weight, but in practice different attributes carry different weights. Also, because this recommendation is not based on the user's historical data, every user gets the same recommendation result.
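
The distance calculation above can be sketched as follows. The article names only some of the eight attributes (theme, author, status, type), so the other four attribute names, every attribute value, and "Some other novel" are placeholders invented for this sketch. The optional `weights` argument shows one possible way to give attributes different weights.

```python
# Eight attributes; only theme, author, status and type are named in the text,
# the remaining four are hypothetical.
ATTRIBUTES = ["theme", "author", "status", "type",
              "length", "channel", "era", "tone"]

# Placeholder attribute values, not real data about the novels.
zhu_xian = {attr: f"{attr}-1" for attr in ATTRIBUTES}
catalog = {
    # Six attributes equal to "Zhu Xian", two different.
    "pre-Zhu Xian Biography: wild line": {
        **zhu_xian, "status": "status-2", "length": "length-2"},
    # Three attributes equal, five different.
    "Some other novel": {
        **zhu_xian, "theme": "theme-2", "author": "author-2",
        "status": "status-2", "type": "type-2", "tone": "tone-2"},
}


def distance(a, b, weights=None):
    """Sum f(attribute) over all attributes: 0 if equal, the weight (default 1) if not."""
    weights = weights or {}
    return sum(weights.get(attr, 1) for attr in ATTRIBUTES if a[attr] != b[attr])


def nearest(origin, catalog, weights=None):
    """Return the title whose attributes are closest to the origin novel."""
    return min(catalog, key=lambda title: distance(origin, catalog[title], weights))


print(distance(zhu_xian, catalog["pre-Zhu Xian Biography: wild line"]))
print(nearest(zhu_xian, catalog))
```

Because nothing in `nearest` depends on who is asking, every user who opens "Zhu Xian" gets the same answer, which matches the limitation described above.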

Association rule recommendation algorithm (Association Rule Based Recommendation)

Association rule recommendation is widely used in e-commerce. The most classic case is that placing beer next to diapers can increase beer sales.

What is association rule recommendation?

To understand association rule recommendation, we must first understand association rules. Association rules are found by mining and analyzing data to discover correlations between objects. Association rule recommendation is recommendation that relies on these correlations between objects.

We will again use my novel reading as the example.

When I find a book I like, say "Zhu Xian", I add it to my book list. At that moment the system recommends several other books to me ("God tomb", "dragon", etc.), and I can add them to my bookshelf at the same time. This kind of recommendation is usually association rule recommendation.

How is association rule recommendation implemented? Let us look at the specific steps:

First, we need to find the data to analyze.

We pull out all the book lists and list every book in each of them.

Then we begin computing the support of the association rules.

What is support (Support)? Support is the proportion of all book lists that contain a given book or combination of books. For example, "Zhu Xian" appears in every book list, so its support is 100%; "God tomb" appears in only two book lists, so its support is 40%.

After computing the support of each individual book, we need to compute the support of combinations of books. We combine the books in pairs and compute the support of each pair; a pair is counted only when both books appear in the same book list. We have six books here, so there are 15 pairs (5 + 4 + 3 + 2 + 1 = 15).

At this point we can already make recommendations: we can recommend the books or combinations with high support to users, since those books are more likely to be accepted.
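
The support calculation can be sketched in Python as below. The five book lists are hypothetical data, chosen only so that the numbers agree with the example ("Zhu Xian" in every list, "God tomb" in two of five); "Book E" and "Book F" are placeholder titles that bring the total to six books.

```python
from itertools import combinations

# Hypothetical book lists, arranged to reproduce the percentages in the example.
book_lists = [
    {"Zhu Xian", "God tomb", "Tomb"},
    {"Zhu Xian", "God tomb", "dragon"},
    {"Zhu Xian", "dragon", "Book E"},
    {"Zhu Xian", "Tomb", "Book F"},
    {"Zhu Xian", "Book E"},
]


def support(items, book_lists):
    """Fraction of book lists that contain every book in `items`."""
    items = set(items)
    return sum(items <= book_list for book_list in book_lists) / len(book_lists)


books = sorted(set().union(*book_lists))   # the six distinct books
pairs = list(combinations(books, 2))       # every unordered pair of books
pair_support = {pair: support(pair, book_lists) for pair in pairs}

print(len(pairs))
print(support({"Zhu Xian"}, book_lists), support({"God tomb"}, book_lists))
# Pairs with the highest support are the strongest candidates to recommend.
print(sorted(pair_support, key=pair_support.get, reverse=True)[:3])
```

`itertools.combinations` generates each unordered pair exactly once, which is where the count of 15 comes from for six books.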

Next, we compute the confidence of the association rules.

What is confidence (Confidence)? When a user adds "God tomb" to a book list, how likely is that user to also add "Tomb"? That probability is the confidence of "God tomb" -> "Tomb".

The support of "God tomb" (S[God]) is 40%, and the support of "God tomb" and "Tomb" together (S[God, Tomb]) is 20%, so the confidence of "God tomb" -> "Tomb" is 50% (S[God, Tomb] / S[God]).

Finally, we analyze the lift of the association rules.

From the calculations above, we found that 100% of the people who favorited "dragon" also favorited "Zhu Xian". Does that mean that when a user favorites "dragon", recommending "Zhu Xian" is the best choice?

No, it does not. Why do we recommend "Zhu Xian"? Because we want to increase its readership. But analyzing the data shows that although 100% of users favorite "Zhu Xian" when it is recommended to users who favorited "dragon", 100% of users also favorite "Zhu Xian" when it is recommended on its own. So this association rule recommendation does not bring "Zhu Xian" any extra readership, and favoriting "Zhu Xian" is not directly related to favoriting "dragon".

How do we judge the effect of an association rule recommendation? With lift.

We compare the confidence of book A -> book B with the support of book B. It is calculated as: lift(A -> B) = confidence(A -> B) / support(B)

When the ratio is greater than 1, recommending book B to users who favorited book A is effective;

When the ratio is equal to 1, recommending book B to users who favorited book A is meaningless, because the two are unrelated;

When the ratio is less than 1, recommending book B to users who favorited book A is counterproductive, and it is better to recommend B directly.
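
Confidence and lift can be sketched together in Python. As before, the five book lists are hypothetical data arranged to reproduce the article's numbers (40% support for "God tomb", 50% confidence for "God tomb" -> "Tomb", and everyone who has "dragon" also having "Zhu Xian"); "Book E" and "Book F" are placeholder titles.

```python
# Hypothetical book lists, arranged to reproduce the numbers in the example.
book_lists = [
    {"Zhu Xian", "God tomb", "Tomb"},
    {"Zhu Xian", "God tomb", "dragon"},
    {"Zhu Xian", "dragon", "Book E"},
    {"Zhu Xian", "Tomb", "Book F"},
    {"Zhu Xian", "Book E"},
]


def count(items):
    """Number of book lists that contain every book in `items`."""
    items = set(items)
    return sum(items <= book_list for book_list in book_lists)


def support(items):
    """Fraction of book lists that contain every book in `items`."""
    return count(items) / len(book_lists)


def confidence(a, b):
    """Of the book lists containing a, the fraction that also contain b."""
    return count({a, b}) / count({a})


def lift(a, b):
    """Confidence of a -> b divided by the support of b."""
    return confidence(a, b) / support({b})


print(confidence("God tomb", "Tomb"))   # half of the "God tomb" lists also have "Tomb"
print(lift("dragon", "Zhu Xian"))       # ratio of 1: the rule adds nothing
print(lift("God tomb", "Tomb"))         # above 1: the rule helps
print(lift("God tomb", "Book E"))       # below 1: better to recommend directly
```

With this made-up data, "dragon" -> "Zhu Xian" has 100% confidence but a lift of exactly 1, which is the situation described above: "Zhu Xian" is in every book list anyway, so the rule tells us nothing new.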

Summary

Here I have only briefly covered the ideas behind these algorithms. There is much more to them in practice, and I am still studying them.

Origin www.cnblogs.com/-wenli/p/12631488.html