Portrait of user in recommendation system

User portrait construction method

  • 1. The first category, basic information, directly uses the raw data the user supplied at registration, such as demographic information, together with behavioral facts that never change once they occur, such as the time of first registration or the first time the user viewed content. This is also called static information. Collecting it is little more than looking up a record and involves no real technique, but it is very useful in user cold-start scenarios;
  • 2. The second category, behavior information, comes from continuously accumulating and aggregating the user's historical behavior data. It is one of the most common kinds of user portrait data and can be thought of as accumulated data. For finer granularity, behavior information can be split into two subcategories:
    • Basic behavior: behavior information obtained from a single statistic, such as the number of logins or the number of payments;
    • Derived behavior: behavior information that requires a second round of calculation on top of the basic statistics, such as login frequency or purchase frequency over the past month;
  • 3. The third category, model labels, is effectively a black box. Machine learning or deep learning methods learn dense embedding vectors that humans cannot readily interpret. Non-technical staff tend to undervalue them, but they play a very large role. Model labels also fall into two categories:
    • Intuitively understandable: when labeled data is available, machine learning is used to stratify or group users or items, and the resulting tiers or groups can be understood and used directly by people;
    • Not intuitively understandable: for example, using shallow semantic models to model a user's reading interests, using matrix factorization to obtain latent factors, or using deep learning models to learn a user's embedding vector. Portrait data of this kind is usually not interpretable and cannot be understood directly by people.
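
The difference between basic and derived behavior can be sketched in a few lines of Python. This is only a minimal illustration: the event log `EVENTS`, the event names, and the helper functions `basic_counts` and `recent_frequency` are all hypothetical and not from the original article.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical event log: (user_id, event_type, timestamp).
EVENTS = [
    ("u1", "login", datetime(2024, 1, 2)),
    ("u1", "login", datetime(2024, 1, 20)),
    ("u1", "payment", datetime(2024, 1, 21)),
    ("u1", "login", datetime(2023, 11, 5)),
    ("u2", "login", datetime(2024, 1, 15)),
]


def basic_counts(events, user_id):
    """Basic behavior: a single counting pass per event type."""
    return Counter(etype for uid, etype, _ in events if uid == user_id)


def recent_frequency(events, user_id, event_type, now, window_days=30):
    """Derived behavior: events per day inside the trailing time window."""
    start = now - timedelta(days=window_days)
    n = sum(
        1
        for uid, etype, ts in events
        if uid == user_id and etype == event_type and start <= ts <= now
    )
    return n / window_days


now = datetime(2024, 1, 31)
counts = basic_counts(EVENTS, "u1")                        # lifetime totals
login_freq = recent_frequency(EVENTS, "u1", "login", now)  # last 30 days only
```

The basic counts cover the user's whole history, while the derived frequency needs a second calculation (a time window and a division) on top of the same raw events.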

The user portrait is not the goal of a recommendation system; it is a by-product of a key step in building one.

Key factors for user portraits

  • Dimensions
    • 1. Whether the name of each dimension is understandable
    • 2. How many dimensions there are
    • 3. Which dimensions to include
  • Quantification. In a production system, quantifying each dimension of the user portrait should be left to the machine and should be goal-oriented: it only makes sense to optimize the user portrait backwards from the recommendation results.
  • Effect. Do not build user portraits for their own sake. The portrait is only a by-product of the recommendation system, so its quantification should be guided by downstream results (ranking metrics, recall coverage, and similar indicators).

--------------------- Gorgeous dividing line ----------------

In essence, a user portrait is formed from two things: the user's basic attributes, and item information transferred to the user through the user's behavior. This is the core, practical takeaway. Specifically:

  • 1 The user's basic attributes, including gender, age, hobbies, and so on.
  • 2 The user acts on items by clicking, reading, sharing, and so on. The tags attached to those items (text information mined in an earlier step, such as TF-IDF keywords, TextRank keywords, or LDA keywords) can be transferred to the user with a simple weighted average. If this produces too many tags and you want fewer tags for user grouping (so that the number of group combinations stays small), you can use the chi-square test or information gain to select a few of the most important tags to form the user portrait.
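
The weighted-average transfer of item tags to a user might look like the following sketch. The item tags, the per-action weights in `ACTION_WEIGHT`, and the function `build_user_tags` are assumptions made for illustration. The `top_k` cut simply keeps the heaviest tags; it is a stand-in for, not an implementation of, chi-square or information-gain selection, both of which need a grouping label to score against.

```python
from collections import defaultdict

# Hypothetical per-item tag weights (e.g. TF-IDF keyword scores).
ITEM_TAGS = {
    "a1": {"python": 0.8, "ml": 0.2},
    "a2": {"ml": 0.6, "faiss": 0.4},
}
# Hypothetical weights for how strongly each action signals interest.
ACTION_WEIGHT = {"click": 1.0, "read": 2.0, "share": 3.0}


def build_user_tags(actions, item_tags, action_weight, top_k=None):
    """Weighted average of item tag weights over a user's actions."""
    totals = defaultdict(float)
    weight_sum = 0.0
    for item_id, action in actions:
        w = action_weight[action]
        weight_sum += w
        for tag, score in item_tags[item_id].items():
            totals[tag] += w * score
    if weight_sum == 0:
        return {}
    profile = {tag: s / weight_sum for tag, s in totals.items()}
    # Keep only the heaviest tags; top_k=None keeps all of them.
    ranked = sorted(profile.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:top_k])


user_actions = [("a1", "click"), ("a2", "share")]
profile = build_user_tags(user_actions, ITEM_TAGS, ACTION_WEIGHT, top_k=2)
```

With these made-up numbers, the shared article dominates the profile, so "ml" and "faiss" outrank "python".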

Push the corresponding article information according to the user portrait

Simple text retrieval can be done with Elasticsearch. If you convert the tags into vectors, you can use faiss to run similarity search over the resulting matrix.
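
To make the vector route concrete, here is a brute-force sketch of matching a user portrait against article tag vectors by cosine similarity. It deliberately avoids any faiss or Elasticsearch calls; the vocabulary, the articles, and the helper names are hypothetical. At scale, a vector index such as faiss would presumably replace the exhaustive loop.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def to_vector(tag_weights, vocab):
    """Lay tag weights out in a fixed vocabulary order (missing tags are 0)."""
    return [tag_weights.get(t, 0.0) for t in vocab]


def top_k_articles(user_tags, article_tags, vocab, k=2):
    """Score every article against the user vector and return the best k ids."""
    q = to_vector(user_tags, vocab)
    scored = [
        (cosine(q, to_vector(tags, vocab)), aid)
        for aid, tags in article_tags.items()
    ]
    scored.sort(reverse=True)
    return [aid for _, aid in scored[:k]]


VOCAB = ["python", "ml", "faiss"]
ARTICLES = {
    "a1": {"python": 1.0},
    "a2": {"ml": 0.7, "faiss": 0.3},
    "a3": {"faiss": 1.0},
}
user = {"ml": 0.5, "faiss": 0.3, "python": 0.2}
result = top_k_articles(user, ARTICLES, VOCAB, k=2)
```

Stacking the article vectors row by row gives the matrix the text refers to; the user vector is then the query against that matrix.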

Origin blog.csdn.net/zlb872551601/article/details/103751354