How to verify the correctness of a user portrait?

Foreword

Recently, one point has confused me in my user portrait work: in the end, how accurate are the labels we assign to users based on their behavior data? I previously wrote a blog post giving a preliminary introduction to user portraits, which mainly covered evaluating portraits through offline and online model validation. In actual business practice, however, verification mostly relies on business staff or analysts looking at the results, and many verification methods do not necessarily fit real business scenarios. So I searched online, found some ideas about verification, and refined them. I want to share them here, and I hope to exchange more ideas in this direction.

General process of building a user portrait

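Preliminary exploration of the user portrait
Data collation, analysis and label design for the user portrait
User portrait prototype design
User portrait development
User portrait launch
User portrait update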

The general process of building a user portrait is shown above. The details of each step depend largely on the actual business. The verification discussed below focuses mainly on two stages: portrait development and portrait update.

User portrait classification

The following is only a rough classification.
[Figure: user portrait classification; image not available]

User portrait verification

1. Verification during portrait development

(1) Model validation
This method is mostly used for basic-information portraits and for portraits derived from user behavior. For labels such as gender and age, where real results (ground truth) are available for comparison, the usual validation metrics apply: AUC, KS, the ROC curve, the confusion matrix, and so on.
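As a minimal sketch of these metrics, the following pure-Python code computes ROC points, AUC, KS, and a confusion matrix for a binary label. The function names, the 0.5 threshold, and the assumption that the model outputs a score per user are illustrative and not taken from any particular portrait system.

```python
from collections import Counter


def roc_points(y_true, scores):
    """Return (fpr, tpr) lists with one point per distinct score threshold.

    y_true holds 0/1 ground-truth labels, scores holds model scores.
    Assumes at least one positive and one negative example.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    pairs = sorted(zip(scores, y_true), key=lambda p: -p[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


def ks(fpr, tpr):
    """KS statistic: largest gap between TPR and FPR over all thresholds."""
    return max(abs(t - f) for f, t in zip(fpr, tpr))


def confusion_matrix(y_true, scores, threshold=0.5):
    """Counts of tp/fp/fn/tn when predicting 1 for score >= threshold."""
    counts = Counter((t, int(s >= threshold)) for t, s in zip(y_true, scores))
    return {
        "tp": counts[(1, 1)],
        "fp": counts[(0, 1)],
        "fn": counts[(1, 0)],
        "tn": counts[(0, 0)],
    }
```

In practice a library such as scikit-learn would normally be used for this; the hand-written version is only meant to show what each metric measures.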

(2) Sampling verification
When the number of users is large, verification can be done on a sample drawn by stratified sampling or simple random sampling.
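A minimal sketch of stratified sampling for manual review is shown below. The `stratum_of` callback, the sampling fraction, and the rule of keeping at least one user per stratum are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(users, stratum_of, fraction, seed=0, min_per_stratum=1):
    """Draw roughly `fraction` of the users from every stratum.

    users           -- list of user records
    stratum_of      -- function mapping a user record to its stratum key
    fraction        -- share of each stratum to sample, between 0 and 1
    min_per_stratum -- small strata still contribute this many users
    """
    rng = random.Random(seed)  # fixed seed keeps the review sample reproducible
    strata = defaultdict(list)
    for user in users:
        strata[stratum_of(user)].append(user)

    sample = []
    for members in strata.values():
        k = max(min_per_stratum, round(len(members) * fraction))
        k = min(k, len(members))  # never ask for more users than exist
        sample.extend(rng.sample(members, k))
    return sample
```

For example, `stratified_sample(users, lambda u: u["tier"], 0.1)` would take about 10% of each hypothetical `tier` group, so that small but important groups are not lost the way they can be under simple random sampling.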

(3) Cross validation
Cross validation here covers two complementary checks: cross-validation between the portrait's own indicators, and cross-validation against external data, for example third-party data.
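One way to quantify cross validation against an external source is to measure how often the two sources agree on the users they share. The sketch below reports the raw agreement rate and Cohen's kappa, which discounts agreement expected by chance. The dictionary layout (user id mapped to label) is an assumption, and neither source is treated as ground truth.

```python
from collections import Counter


def agreement_and_kappa(internal, external):
    """Compare two label sources on the users they have in common.

    internal, external -- dicts mapping user id to a label
    Returns (observed agreement, Cohen's kappa, number of shared users).
    """
    shared = [user for user in internal if user in external]
    n = len(shared)
    if n == 0:
        raise ValueError("the two sources share no users")

    observed = sum(internal[u] == external[u] for u in shared) / n

    # Chance agreement from each source's own label distribution.
    internal_counts = Counter(internal[u] for u in shared)
    external_counts = Counter(external[u] for u in shared)
    expected = sum(
        internal_counts[label] * external_counts[label]
        for label in internal_counts
    ) / (n * n)

    # If both sources always emit the same single label, kappa is undefined;
    # report 1.0 for that degenerate case.
    kappa = 1.0 if expected == 1 else (observed - expected) / (1 - expected)
    return observed, kappa, n
```

A high agreement rate with a low kappa usually means one label dominates both sources, so the raw rate alone would overstate how well the portrait matches the external data.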

2. Verification after the portrait goes online

(1) Real data validation
As the business develops, real information for some portrait attributes will gradually accumulate, starting from nothing. Verifying category-type portrait labels against this real data is undoubtedly the most accurate approach.
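As a rough sketch, once some real values have accumulated, each predicted label can be checked on the subset of users for whom a real value exists. The dictionary layout and the age-band labels in the usage note are hypothetical.

```python
from collections import defaultdict


def per_label_precision(predicted, real):
    """For each predicted label, the share of verified users it is right for.

    predicted -- dict mapping user id to the portrait's label
    real      -- dict mapping user id to the real value, for the users
                 whose real information has been collected so far
    Users without real data are skipped, not counted as errors.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for user, label in predicted.items():
        if user not in real:
            continue
        totals[label] += 1
        hits[label] += int(real[user] == label)
    return {label: hits[label] / totals[label] for label in totals}
```

With `predicted = {"u1": "18-24", "u2": "18-24", "u3": "25-34"}` and `real = {"u1": "18-24", "u2": "25-34", "u3": "25-34"}`, the result is 0.5 for "18-24" and 1.0 for "25-34". Because real data usually accumulates unevenly, the verified users may not be representative of the whole user base.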

(2) A/B test
A/B testing is the most commonly used verification method on the Internet. Typically, a strategy built on the user portrait is put through a rigorous controlled comparison online in order to test the accuracy of the portrait.
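A minimal sketch of how such a comparison might be evaluated: a two-sided two-proportion z-test on conversion counts, where group A is the control and group B receives the portrait-based strategy. The choice of conversion rate as the metric and of this particular test are assumptions; the source does not specify either.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    conv_a, n_a -- conversions and users in the control group
    conv_b, n_b -- conversions and users in the portrait-strategy group
    Returns (z statistic, p-value). A positive z means B converts better.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A significant lift suggests the strategy works, which is only indirect evidence about the labels themselves, since the strategy design also affects the outcome.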

Some thoughts

In actual business scenarios, user labels are still hard to verify. Rather than blindly pursuing the correctness of each individual label, the focus should be on assessing the effect the labels have on the actual business after they go online. I personally think that judging labels by their business effect may be more suitable, and it also reflects, to some extent, whether a label is effective and whether different algorithms make a difference.

References: https://www.zhihu.com/question/36422121/answer/207069948


Origin blog.csdn.net/Totoro1745/article/details/103947797