In data analysis, tagging requires the ability to abstract

Foreword

The Internet iterates very quickly, and we have now entered its second half. If the first half was the era when the Internet was still relatively new, then in the second half most Internet companies have long since become part of our lives. Business is no longer as rough-and-ready as it was in the first half: we now have a lot of data and a lot of users, and what we need is refined operations.

The second half will be led by "big data" and "enabling" technologies: for example, using big data to advise governments on smart traffic management and urban planning, using consumption data to tell companies which products to launch and when to best meet users' needs, and using dining data to tell restaurants how to choose a location. The user is therefore the fundamental starting point of data analysis. So how do we analyze users and tag them?

Guidelines for user profiling

How do we analyze users and draw their portrait? We can analyze user data by following this logic: where does the user come from, who is the user (what are their characteristics), and where is the user going?
That gives the following three steps:

  • Unification: design a unique identifier for each user, to establish where the user comes from
  • Tagging: tag the user and analyze the user's characteristics
  • Operationalization: based on the user's characteristics, work out what business value we can bring
  1. Unification
    Unification means giving each user a unique identifier. This is the core step: like a national ID number, it lets us lock onto exactly one person. The unique identifier can be chosen from the following items:

User name, registered phone number, email address, device number, etc.
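
The choice among the identifiers listed above can be sketched in code. This is a minimal illustration, not a prescribed method: the field names (`phone`, `email`, `username`, `device_id`), the priority order, and the use of a truncated SHA-256 digest are all assumptions made for the example.

```python
import hashlib
from typing import Optional

# Hypothetical priority order: the identifier assumed to be most stable comes first.
ID_PRIORITY = ("phone", "email", "username", "device_id")


def unified_user_id(record: dict) -> Optional[str]:
    """Derive a stable ID from the first non-empty identifier in the record."""
    for field in ID_PRIORITY:
        value = record.get(field)
        if value:
            # Normalize so that trivial formatting differences map to the same ID.
            normalized = str(value).strip().lower()
            digest = hashlib.sha256(f"{field}:{normalized}".encode("utf-8")).hexdigest()
            return digest[:16]
    return None  # no usable identifier in this record


a = unified_user_id({"phone": " 13800000000 ", "device_id": "abc"})
b = unified_user_id({"phone": "13800000000"})
print(a == b)  # both records resolve through the phone number
```

Note that a user who appears once with only a phone number and once with only an email address would get two different IDs under this sketch; linking such records would need an additional mapping table.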

  2. Tagging
    Once the user is uniquely identified, how do we analyze the user's characteristics? It comes down to one phrase: user consumption behavior analysis. We can tag users along the following four dimensions.
  • User Tags: gender, age, region, income, education, occupation, etc. These are the user's basic attributes.
  • Consumer Tags: consumption habits, purchase intent, sensitivity to promotions. These are statistics on the user's spending habits.
  • Behavior Tags: time of day, frequency, duration, access path. These describe the user's behavior and reveal their habits in using the app.
  • Content Analysis: analyze the content the user usually browses, especially content they stay on for a long time or view many times, to work out what they are interested in, for example finance, entertainment, sports, fashion, or science and technology.
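
As a rough sketch of how the four dimensions above might be held in one record, the example below uses a Python dataclass. The class name `UserProfile`, its field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """One user's tags, grouped by the four dimensions (names are illustrative)."""
    user_id: str
    user_tags: dict = field(default_factory=dict)         # basic attributes
    consumption_tags: dict = field(default_factory=dict)  # spending habits
    behavior_tags: dict = field(default_factory=dict)     # usage behavior
    content_tags: dict = field(default_factory=dict)      # interests from browsed content

    def all_tags(self) -> dict:
        """Flatten the four dimensions into 'dimension.tag' keys."""
        flat = {}
        for dimension in ("user_tags", "consumption_tags", "behavior_tags", "content_tags"):
            for name, value in getattr(self, dimension).items():
                flat[f"{dimension}.{name}"] = value
        return flat


profile = UserProfile(
    user_id="u001",
    user_tags={"gender": "F", "age": 28, "region": "Beijing"},
    consumption_tags={"promotion_sensitive": True},
    behavior_tags={"visits_per_week": 3},
    content_tags={"interests": ["finance", "sports"]},
)
print(profile.all_tags())
```

Keeping the dimensions separate makes it easy to see which kind of tag a downstream rule or model depends on.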
  3. Operationalization
    With the four dimensions above we can analyze the user's characteristics. What value does that bring to the business? The business value can be divided along three stages of the user lifecycle:
  • Acquisition: how to bring in new users and reach customers more precisely through targeted marketing
  • Engagement: personalized recommendations, search ranking, scenario-based operations, etc.
  • Retention: churn prediction, and analysis of the key points at which to reduce the churn rate

If we divide the user profile modeling process by data-flow stage, it splits into a data layer, an algorithm layer, and a business layer. Each layer calls for a different kind of tag.
[Figure: user profile modeling divided into data, algorithm, and business layers]

  • The data layer refers to the tags on the user's consumption behavior. Here we assign "fact tags", which are objective records of the data.
  • The algorithm layer refers to what is computed from these behaviors through modeling. Here we assign "model tags", which are the classification labels of the user profile.
  • The business layer refers to the means of acquiring, engaging, and retaining customers. Here we assign "prediction tags", which are the results tied to the business.

So the tagging process is: take the "fact tags" from the data layer, run them through the algorithm layer to produce "model tags" as classification results, and finally use those to guide the business layer and arrive at "prediction tags".
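
The flow from fact tags to model tags to prediction tags can be sketched as three small functions. This is only an illustration: the order records are made up, and the thresholds stand in for the algorithm layer, which in practice would more likely be a trained model than hand-written rules.

```python
from datetime import date

# Hypothetical raw order records for one user: (order date, amount spent).
orders = [
    (date(2020, 1, 3), 45.0),
    (date(2020, 1, 10), 60.0),
    (date(2020, 1, 24), 30.0),
]
today = date(2020, 2, 20)


def fact_tags(orders, today):
    """Data layer: objective statistics computed directly from the records."""
    last_order = max(order_date for order_date, _ in orders)
    return {
        "order_count": len(orders),
        "total_spent": sum(amount for _, amount in orders),
        "days_since_last_order": (today - last_order).days,
    }


def model_tags(facts):
    """Algorithm layer: classify the user from fact tags (thresholds are illustrative)."""
    average_spend = facts["total_spent"] / facts["order_count"]
    return {
        "spending_level": "high" if average_spend >= 50 else "low",
        "activity": "active" if facts["days_since_last_order"] <= 14 else "inactive",
    }


def prediction_tags(models):
    """Business layer: turn the classification into a business-facing prediction."""
    return {"churn_risk": models["activity"] == "inactive"}


facts = fact_tags(orders, today)
models = model_tags(facts)
predictions = prediction_tags(models)
print(facts, models, predictions)
```

Each layer reads only the output of the layer before it, which mirrors the data, algorithm, and business split described above.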

Look at an example

The principles and logic of tagging users are now clear, so how do we put them to use? The last reference link contains an example of how to design user profiles for Meituan Waimai users. Here I would like to work through a different problem of my own.

Suppose you work at a lamb skewer shop, and the boss asks you to build a user profile model to empower the restaurant's business. How would you do it?

If you tell the boss, "Boss, we sell lamb skewers. What do we need data mining for?", then you probably will not be eating lamb skewers tonight, because you will have been fired. Do not think so narrowly, ha ha. Let's start tagging. With the guidelines above it is not difficult, so let's go step by step.
First, we need to answer three questions: where does the user come from (unification), who is the user (tagging), and where is the user going (operationalization)?

  • For the first question, uniquely identifying the user: a lamb skewer shop can identify users through payment information, which is usually WeChat Pay or Alipay, with a few customers paying cash. The phone number bound to a WeChat or Alipay account can be found behind the payment, but the business should weigh the cost of obtaining that data before choosing a unique user ID.
    As for where users come from, we should also analyze why they came: a group dinner? A late-night snack on their own? A random choice? A special trip to this shop? And so on (this is specific to the lamb skewer shop and depends on the actual situation).
  • For the second question, tagging, we again consider the four dimensions:
    • User Tags: gender, age, personality traits (group dinners give some hint of these), region (how far they travel, whether they work nearby, etc.), income, occupation, etc.
    • Consumer Tags: taste preferences (spicy or not), average spend per visit, spending habits, sensitivity to promotions
    • Behavior Tags: preferred time of day, number of visits per week, reason for coming (group dinner, late-night snack, random choice, etc.), how many lamb skewers they usually buy, how long they usually take to eat
    • Content Analysis: an overall analysis based on the dishes they usually order, their tastes, their sensitivity to promotions, and so on.
  • For the third question: what value does this bring to selling lamb skewers? From the user modeling in the second question, we know the tag information of a particular user A: which dishes and flavors A likes, A's spending level, whether A cares about promotions, and so on. We can then recommend dishes to this specific user. For example, when a new lamb dish or some other dish goes on the menu, and its category and price point match A's preferences, we can recommend it to A to try. Isn't that one more dish sold? We can also cluster users: if customers A and B are close in age, taste, spending level, and favorite dishes, we can group them together and consider recommending dishes A has ordered to B. This is the engagement angle. For acquisition, we can analyze user data to advertise precisely, for example promoting dishes according to the spending level and demographics of our area. For retention, we can analyze churned customers' information and the reasons they left, and make improvements accordingly.
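
The idea of grouping similar customers and recommending one customer's dishes to another can be sketched as follows. This is a simplified stand-in, not a full clustering algorithm: it compares tag sets with Jaccard similarity, and the customers, tags, dishes, and the 0.5 threshold are all invented for illustration.

```python
# Hypothetical tag sets and order histories for three customers.
tags = {
    "A": {"age:20s", "spicy", "mid_price", "promo_sensitive"},
    "B": {"age:20s", "spicy", "mid_price"},
    "C": {"age:40s", "mild", "high_price"},
}
ordered = {
    "A": {"lamb skewers", "grilled eggplant", "lamb chops"},
    "B": {"lamb skewers"},
    "C": {"lamb soup"},
}


def jaccard(x: set, y: set) -> float:
    """Share of tags two customers have in common, out of all tags either has."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def recommend(target: str, threshold: float = 0.5) -> list:
    """Suggest dishes ordered by similar customers that the target has not ordered."""
    suggestions = set()
    for other, other_tags in tags.items():
        if other != target and jaccard(tags[target], other_tags) >= threshold:
            suggestions |= ordered[other] - ordered[target]
    return sorted(suggestions)


print(jaccard(tags["A"], tags["B"]))  # 0.75
print(recommend("B"))                 # ['grilled eggplant', 'lamb chops']
```

Customer C shares no tags with B, so C's orders do not influence what is suggested to B.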

That is the lamb skewer example.

Tagging data requires a certain ability to abstract. It takes practice in everyday life: accumulate experience and keep broadening your thinking. Here is another problem that is close to all of us: try analyzing your WeChat friends and building a model of them.
Here are two ways to approach this question:
[Figure: first approach]
[Figure: second approach]

Summary

In data analysis, tagging requires the ability to abstract. Moving from data toward reality still has its difficulties. More detail is not always better: if the division is too fine, the data becomes redundant and harder to process. Finding the right level of granularity also takes practice.
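
A small sketch can make the granularity point concrete. Assuming a made-up list of customer ages, it compares tagging by exact year with tagging by decade; the function name `age_band` and the band widths are illustrative choices.

```python
from collections import Counter

ages = [19, 22, 23, 25, 31, 34, 38, 41, 47, 52]  # hypothetical customer ages


def age_band(age: int, width: int) -> str:
    """Return the age band of the given width that contains this age."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


fine = Counter(age_band(a, 1) for a in ages)     # one tag per year of age
coarse = Counter(age_band(a, 10) for a in ages)  # one tag per decade

print(len(fine), len(coarse))  # 10 distinct tags versus 5
print(coarse["20-29"])         # 3 customers share this tag
```

With one tag per year, every customer in this sample lands in a group of one, so the tags say little about shared characteristics. The coarser bands produce groups that can actually be compared.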

The following are references for further study:

Origin blog.csdn.net/wuzhongqiang/article/details/104105131