How do you build a recommendation system when you have no user data? Here are three options

Even without user data, it is possible to build an effective recommendation system that shows users more high-quality content and keeps them engaged.

Too long; didn't read version:

The first step is to build a content-based recommendation system. Such a system recommends items similar to the ones a user has viewed, without relying on data from other users. Its features (that is, the mathematical representation of the different aspects of a content item that the recommendation algorithm operates on) come from the content items themselves, not from user behavior. For written text, semantic techniques can be used to extract features from the text.

With this recommendation system as a baseline, we can introduce further features, such as metadata extracted from the text, to optimize the system as far as possible. Even without explicit user identities, we can still use proxies for user accounts to personalize recommendations. Assuming that users view multiple items on each visit, we can also track trends within a session in real time and build a local, session-based recommendation system.

Full text version:

"How do you build a recommendation system when you have no user data?" We have run into this question many times, and today I will try to answer it.

This article draws on some basic knowledge of how recommendation systems work and, in the important parts, uses some jargon. Where technical questions come up, I will explain the specific technical context.

In general, there are three possible ways to build a recommendation system when there is no user data. I list them below in order of increasing complexity, assuming that we make use of all the data we have on hand. Each of these methods would work better with user data such as unique user identifiers, but in practice not all of that data is available.

Building a content-based recommendation system
First of all, we can build a baseline that uses tags or other content metadata as features. We can score these features with TF-IDF. In this model, the tags form a precomputed dictionary of terms (here "dictionary" simply means a data structure that holds the set of all terms found in the text).

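Specifically, suppose we use all the tags and other features to build this dictionary. The dictionary then lets us construct so-called "feature vectors". Using these feature vectors, we compare content items with one another and build the recommendation system. At this point, a basic content-based recommendation system is complete, and in my research experience it makes quite good recommendations. All we are doing so far is recommending items similar to those the user has already viewed, where "similar" means that the recommended items share tags and features with the historical ones.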
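The idea above can be sketched in a few lines of Python. This is only a minimal illustration, not a production implementation: the catalogue `ITEMS`, the function names, and the smoothed IDF formula are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical catalogue: item id -> list of tags (illustrative data only).
ITEMS = {
    "a": ["python", "tutorial", "beginner"],
    "b": ["python", "tutorial", "advanced"],
    "c": ["cooking", "pasta", "beginner"],
}

def tfidf_vectors(items):
    """Build one sparse TF-IDF feature vector (dict: tag -> weight) per item."""
    n = len(items)
    # Document frequency: the number of items in which each tag appears.
    df = Counter(tag for tags in items.values() for tag in set(tags))
    vectors = {}
    for item_id, tags in items.items():
        tf = Counter(tags)
        vectors[item_id] = {
            # Smoothed IDF (an assumption), so common tags keep a positive weight.
            tag: (count / len(tags)) * (math.log((1 + n) / (1 + df[tag])) + 1)
            for tag, count in tf.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def recommend(item_id, items, k=2):
    """Return the k items whose feature vectors are most similar to item_id."""
    vectors = tfidf_vectors(items)
    scores = {
        other: cosine(vectors[item_id], vec)
        for other, vec in vectors.items()
        if other != item_id
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Calling `recommend("a", ITEMS)` ranks item "b" ahead of item "c", because "a" and "b" share two tags while "a" and "c" share only one.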

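If we want a more accurate recommendation system, the first thing to do is iterate on this basic system and keep optimizing it. Next, I will introduce other methods.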

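Optimizing the content-based recommendation system
The steps above use a single dictionary containing the existing tags and other features. The next step toward better accuracy is to build two or more dictionaries, one for each category of metadata. With multiple dictionaries, we can apply TF-IDF to each of them and compute a weighted combination of the resulting scores for every content item. The parameters (such as the weight of each score) can be tuned through subjective evaluation, choosing whichever weights give the best recommendations.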
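A weighted combination over several dictionaries might look like the sketch below. The two metadata categories (`tags` and `author`), the sample items, and the weights 0.7 and 0.3 are hypothetical. In practice the weights would be tuned by inspecting the recommendations, as described above.

```python
import math
from collections import Counter

# Hypothetical items with two metadata categories (illustrative data only).
ITEMS = {
    "a": {"tags": ["python", "tutorial"], "author": ["alice"]},
    "b": {"tags": ["python", "tutorial"], "author": ["bob"]},
    "c": {"tags": ["cooking"], "author": ["alice"]},
}
# Hypothetical weights, one per metadata category.
WEIGHTS = {"tags": 0.7, "author": 0.3}

def tfidf(docs):
    """Sparse TF-IDF vectors for one metadata category (dict: key -> {term: weight})."""
    n = len(docs)
    df = Counter(t for terms in docs.values() for t in set(terms))
    return {
        key: {
            t: (c / len(terms)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in Counter(terms).items()
        }
        for key, terms in docs.items()
    }

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def combined_scores(item_id, items, weights):
    """Weighted sum of per-category similarities between item_id and every other item."""
    scores = Counter()
    for field, weight in weights.items():
        # One dictionary (and one set of vectors) per metadata category.
        vectors = tfidf({key: meta[field] for key, meta in items.items()})
        for other, vec in vectors.items():
            if other != item_id:
                scores[other] += weight * cosine(vectors[item_id], vec)
    return dict(scores)
```

With these weights, an item that matches on tags outranks one that only shares an author. Swapping the weights reverses the order, which is the kind of effect the subjective evaluation is meant to judge.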

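If a category of metadata cannot be weighted with TF-IDF, for example because the data in it is unrelated, I recommend splitting that data into distinct classes. After this split, we obtain another set of tags (each class has its own tag). As long as this process does not introduce a large number of additional features, it does not make the overall work any harder.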

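Next, we can introduce filtering into the system, such as filtering on a particular tag, to refine the recommendations further. Filtering is not part of the core algorithm, but if we want to let users define their own recommendation criteria, it provides the supporting structure for doing so.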
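As a rough sketch, such a filter can be a post-processing step applied to an already ranked list. The function name and its `required_tags` and `excluded_tags` parameters are hypothetical, chosen only for this example.

```python
def filter_recommendations(ranked_ids, item_tags, required_tags=(), excluded_tags=()):
    """Keep ranked items that carry all required tags and none of the excluded ones.

    The ranking order produced by the core algorithm is preserved.
    """
    required, excluded = set(required_tags), set(excluded_tags)
    kept = []
    for item_id in ranked_ids:
        tags = set(item_tags.get(item_id, ()))
        if required <= tags and not (excluded & tags):
            kept.append(item_id)
    return kept

# Illustrative data: a ranked list from the recommender plus each item's tags.
ranked = ["b", "c", "d"]
tags = {"b": ["python", "video"], "c": ["python"], "d": ["cooking"]}
python_only = filter_recommendations(ranked, tags, required_tags=["python"])
```

Because the filter runs after scoring, user-defined criteria can change without touching the similarity computation.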

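Building a recommendation system that uses user proxies
The next step toward better accuracy is to look at data features that can serve as proxies for a user. Although we have no user accounts, we may have IP addresses, browser information, user sessions, and other such information.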

At this point, we can construct an abstract user. Such a user account cannot be verified, but it works as a rudimentary form of fingerprinting. Once we have identified these "abstract" users, we can generate personalized recommendations for them, specifically by using various collaborative filtering techniques. In my opinion this is not complicated: there are many open source projects to draw on (such as high-level Python packages). The key is that the existing proxy information lets us construct a user account.
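One simple way to derive such an abstract user is to hash the available proxy signals into a stable identifier. The sketch below assumes that only an IP address and a browser user-agent string are available. The function name and the 16-character truncation are arbitrary choices for the example.

```python
import hashlib

def proxy_user_id(ip_address, user_agent):
    """Derive a stable pseudo-user id from proxy signals (not a verified identity)."""
    raw = f"{ip_address}|{user_agent}".encode("utf-8")
    # Hashing avoids storing the raw IP address alongside click data.
    return hashlib.sha256(raw).hexdigest()[:16]

# Illustrative values only.
uid = proxy_user_id("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64)")
```

The same IP address and browser always map to the same id, so clicks from repeat visits can be grouped. Several people behind one shared IP address and browser configuration will collapse into a single abstract user, which is one reason this is only an approximation of a real account.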

In addition, we need users' click interaction data. We need to know which items a user has clicked; otherwise there is no way to follow the user's preferences and optimize further. Once we have click data and abstract user accounts, we can build a personalized recommendation system based on the combination of IP address and browser information. This is not true personalization, but it is not far from it.
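To illustrate how click data keyed by abstract users can feed collaborative filtering, here is a small item-based sketch. It is a simple stand-in for the open source packages mentioned above, not a replacement for them. The click log and the pseudo-user ids `u1` to `u3` are invented for the example.

```python
import math
from collections import defaultdict

# Hypothetical click log: (pseudo_user_id, item_id) pairs.
CLICKS = [
    ("u1", "a"), ("u1", "b"),
    ("u2", "a"), ("u2", "b"), ("u2", "c"),
    ("u3", "c"), ("u3", "d"),
]

def item_similarities(clicks):
    """Cosine similarity between items, based on which pseudo-users clicked them."""
    users_by_item = defaultdict(set)
    for user, item in clicks:
        users_by_item[item].add(user)
    sims = defaultdict(dict)
    for i, users_i in users_by_item.items():
        for j, users_j in users_by_item.items():
            if i != j:
                overlap = len(users_i & users_j)
                if overlap:
                    sims[i][j] = overlap / math.sqrt(len(users_i) * len(users_j))
    return sims

def recommend_for_user(user, clicks, k=3):
    """Score unseen items by their similarity to the items this pseudo-user clicked."""
    seen = {item for u, item in clicks if u == user}
    sims = item_similarities(clicks)
    scores = defaultdict(float)
    for item in seen:
        for other, sim in sims[item].items():
            if other not in seen:
                scores[other] += sim
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

For pseudo-user `u1`, who clicked "a" and "b", the sketch recommends "c", because another pseudo-user who clicked the same two items also clicked "c".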

Building a session-based recommendation system
The last method is to build a session-based recommendation system. It is similar to the previous method, but this time we focus on the data within a particular session. Even if we cannot obtain user information, we can usually still get user session data. Treating the session as the account gives us a highly localized equivalent of a "user account".

There are a number of session-based recommendation systems. Some of them are built on recurrent neural networks (RNNs) and achieve very high accuracy, such as those in the research by Hidasi and Karatzoglou. The recommendations these systems produce are quite satisfying.

A session-based recommendation system assumes that the user will stay in the system for some time. If the user does stay and clicks enough, the system's recommendations improve and become more attractive to the user.
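The RNN-based models are beyond the scope of a short example, so the sketch below uses a much simpler stand-in: first-order transition counts between consecutive clicks within past sessions. It only illustrates the idea of recommending from session data alone. The sample sessions and function names are assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical past sessions: each is the ordered list of items clicked in one visit.
SESSIONS = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["b", "c"],
]

def build_transitions(sessions):
    """Count how often each item was clicked immediately after another item."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            transitions[current][following] += 1
    return transitions

def recommend_next(current_session, transitions, k=2):
    """Recommend the items that most often followed the last click in this session."""
    if not current_session:
        return []
    seen = set(current_session)
    candidates = transitions.get(current_session[-1], Counter())
    # Skip items already viewed in the current session.
    return [item for item, _ in candidates.most_common() if item not in seen][:k]

transitions = build_transitions(SESSIONS)
next_items = recommend_next(["a", "b"], transitions)
```

The more clicks a session contains, the more context is available. A sequence model such as an RNN can use the whole click history of the session rather than only the last click, which is where the accuracy gains reported for those systems come from.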

Original title: What Are the Three Ways to Build a Recommender System When You Don't Have Audience Data?

The content above is from Quora, compiled and published by Fourth Paradigm Xianjian.

Origin blog.51cto.com/13945147/2420729