[Data analysis] Thinking about traffic data analysis while eating dumplings!

Reprinted Source: https://mp.weixin.qq.com/s/6KdKczONYngcFg8qZuV-Kg

 

Prologue

It's the festival, and festivals mean dumplings. Suddenly craving some? Then go buy some! Now let's change the scene: suppose you are a data analyst at a food website, and a dumpling tycoon wants to throw money at your boss (to buy advertising).

Boss: "Jewels, come here."

Jewels: "Yes, boss."

Boss: "There's a dumpling tycoon who wants to hit us with his money."

Jewels: "Who dares to hit you, boss? Wait while I round up some people!"

Boss: "I'm asking how we should charge him!"

Jewels: "Relax, boss. I'll give you a proposal and a report next week."

(End of the background banter.)

As a data analyst you might immediately think of keywords such as CPC (cost per click) and CPS (cost per sale). But should you actually charge by CPC or by some other model? How should billing work? Which channels should carry the promotion? Which users are a good fit for the dumpling tycoon? Answering these questions requires collecting your company's website traffic data comprehensively and analyzing it in depth.
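To make the CPC versus CPS question concrete, here is a minimal Python sketch comparing what the site would earn under each model. Every number and function name is hypothetical and only illustrates the arithmetic: CPC bills a fixed price per click, while CPS bills a commission on each sale.

```python
def cpc_revenue(clicks: int, price_per_click: float) -> float:
    """Revenue when the advertiser pays a fixed price for every ad click."""
    return clicks * price_per_click


def cps_revenue(orders: int, avg_order_value: float, commission_rate: float) -> float:
    """Revenue when the advertiser pays a commission on each completed sale."""
    return orders * avg_order_value * commission_rate


# Hypothetical campaign: 10,000 clicks, 2% of which end in an order.
clicks = 10_000
orders = clicks * 2 // 100

cpc = cpc_revenue(clicks, price_per_click=0.5)
cps = cps_revenue(orders, avg_order_value=80.0, commission_rate=0.1)
print(f"CPC revenue: {cpc:.2f}, CPS revenue: {cps:.2f}")
```

Which model wins depends entirely on the click-to-order rate and order value, which is exactly why the traffic data has to be understood first.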

 

Contents Overview

Based on my own understanding from study and from business analysis work, this article follows a single thread, from data collection through to user analysis (KPI performance indicators). The main contents are as follows:

 

Traffic Data Analysis

Data collection

What is "event tracking"?

Put plainly, it is about collecting data. The first thing that comes to mind may be a web crawler, but think it through: inside your company's own product line, can a crawler capture a behavioral event such as "Jewels searched for ××, clicked an element, and opened a box"? Obviously not.

How, then, is user behavior data like this collected? The answer is event tracking (in Chinese, literally "burying points"). Event tracking refers to the technologies and implementation process used to capture user behavior events and send them on.

For example: if Jingdong's internal operations staff want to see how the "Dumpling Love, Dragon Boat Festival" campaign in the figure below is performing, developers can place a tracking point at the spot marked by the red arrow in the figure. When a user clicks that area, the backend is triggered and a record of the click is reported.
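A tracking point can be sketched in Python as a function that builds a behavior event and reports it. This is only an illustration: the event field names, the `track_event` helper, and the in-memory `REPORTED_EVENTS` buffer are all assumptions, not Jingdong's actual implementation.

```python
import json
import time
import uuid

# In-memory stand-in for the backend that receives reported events (assumption).
REPORTED_EVENTS = []


def track_event(event_name: str, user_id: str, device_id: str, properties: dict) -> dict:
    """Build one behavior event and 'report' it by appending it to the buffer."""
    event = {
        "event_id": uuid.uuid4().hex,          # unique id so duplicates can be dropped
        "event": event_name,                   # what happened, e.g. a banner click
        "user_id": user_id,                    # empty string if not logged in
        "device_id": device_id,
        "timestamp_ms": int(time.time() * 1000),
        "properties": properties,              # event-specific details
    }
    REPORTED_EVENTS.append(json.dumps(event))  # a real client would send this over the network
    return event


# A user clicks the campaign banner that carries the tracking point.
clicked = track_event(
    "banner_click",
    user_id="u_001",
    device_id="d_abc",
    properties={"campaign": "dragon_boat_festival", "position": "home_top"},
)
print(clicked["event"], len(REPORTED_EVENTS))
```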

 

What is event tracking for?

  1. Traffic monitoring (online status, PV and UV indicator analysis, etc.)

  2. Making it easier to build user behavior paths (from the chain of behavior data that tracking points capture)

  3. Judging product and campaign effectiveness and future direction by analyzing selling points and other data

  4. Monitoring how the application is running, so problems are easier to locate and trace

  5. Providing data to support marketing decisions

  6. Implementing A/B testing

Data to collect and fields in the underlying traffic table

Tracking points exist to collect data, but not all data needs to be collected. First you have to know what the business needs. For example, if the boss wants to see the DAU trend over the past seven days, the analyst must work out how DAU is calculated and then discuss with the product team which tracking point can supply the "field" needed to calculate it (for example, a tracking point that reports an identifier each time a user launches the app).

In practice, the following kinds of data may be collected:
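As a minimal sketch of that DAU calculation, assume the launch tracking point reports one `(day, device_id)` pair per app launch. The sample events and the `daily_active_users` helper are hypothetical; the point is only that DAU counts distinct identifiers per day, not launches.

```python
from collections import defaultdict
from datetime import date

# Hypothetical app-launch events reported by a tracking point: (day, device_id).
launch_events = [
    (date(2020, 6, 1), "d1"),
    (date(2020, 6, 1), "d1"),  # the same device launching twice counts once
    (date(2020, 6, 1), "d2"),
    (date(2020, 6, 2), "d1"),
    (date(2020, 6, 3), "d2"),
    (date(2020, 6, 3), "d3"),
    (date(2020, 6, 3), "d4"),
]


def daily_active_users(events):
    """Return {day: number of distinct devices that launched the app that day}."""
    devices_by_day = defaultdict(set)
    for day, device_id in events:
        devices_by_day[day].add(device_id)
    return {day: len(devices) for day, devices in sorted(devices_by_day.items())}


dau = daily_active_users(launch_events)
for day, count in dau.items():
    print(day.isoformat(), count)
```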

  1. User system attributes

    1. Operating system

    2. Browser

    3. Domain name

    4. Access speed

    5. Network status (2G, 3G, 4G, etc.)

    6. Other

  2. User access characteristics

    1. Access start time

    2. Access end time

    3. First visit, last visit

    4. Clicked URL

  3. User source characteristics

    1. Network content type

    2. Content category

    3. Visit URL

  4. Product characteristics

    1. Product code

    2. Product category

    3. Product color

    4. Product price

    5. Quantity, etc.

In summary, the underlying traffic collection table might look like the table below (a simple illustration only; think through the rest yourself):
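One way to picture such a table is as a record type. In the Python sketch below, `dt`, `deviceid`, `user_id`, `session_id` and `ip` are the column names used in this article's HQL query. Every other field name is an assumption derived from the list above, not the author's actual schema.

```python
from dataclasses import dataclass, fields


@dataclass
class TrafficRecord:
    """One row of a hypothetical traffic collection table."""
    dt: str            # partition date, e.g. "2020-06-01"
    deviceid: str      # device identifier
    user_id: str       # empty when the visitor is not logged in
    session_id: str
    ip: str
    os: str            # operating system
    browser: str
    network: str       # 2G, 3G, 4G, ...
    visit_start: str   # access start time
    visit_end: str     # access end time
    click_url: str
    product_code: str
    product_price: float
    quantity: int


row = TrafficRecord(
    dt="2020-06-01", deviceid="d1", user_id="", session_id="s1", ip="10.0.0.1",
    os="Android", browser="Chrome", network="4G",
    visit_start="2020-06-01 10:00:00", visit_end="2020-06-01 10:05:00",
    click_url="/item/123", product_code="123", product_price=19.9, quantity=2,
)
print([f.name for f in fields(TrafficRecord)])
```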

 

Data processing (ETL)

Objective: based on the needs of downstream indicator statistics, filter and separate the base data for each topic (each analysis path), creating different intermediate tables.

Method: usually, extract dimension and metric data directly with HQL. (Extracting straight from the raw collection table can be difficult, and data in the business systems' source tables also has to be extracted, cleaned, transformed, and loaded into the data warehouse through ETL.) Afterwards, the base indicators can be fixed into scripts according to business needs and pushed to the BI platform for internal report display.

For example, calculate UV, logged-in users, IP count, and other indicators for the last 7 days:

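SELECT
  dt,
  COUNT(DISTINCT deviceid) AS uv,  -- unique visitors, counted by device
  COUNT(DISTINCT CASE WHEN length(trim(user_id)) > 0 THEN user_id ELSE NULL END) AS login_users,  -- a non-empty user_id means logged in
  COUNT(DISTINCT ip) AS ip_num,
  COUNT(session_id) AS session_num  -- counts rows that carry a session_id, not distinct sessions
FROM dwd_caiji_table
-- sysdate(n) is not a built-in Hive function and is presumably a platform UDF;
-- in stock Hive, date_sub(current_date, 7) and current_date play the same role (assuming dt is stored as yyyy-MM-dd).
WHERE dt BETWEEN sysdate(-7) AND sysdate()
GROUP BY dt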
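To try the logic of this query without a Hive cluster, it can be run against a small in-memory SQLite table from Python. This is only a sketch under assumptions: the sample rows are invented, the column types are guessed, and SQLite's `date()` stands in for the `sysdate()` calls with "today" fixed at 2020-06-03 so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dwd_caiji_table (dt TEXT, deviceid TEXT, user_id TEXT, ip TEXT, session_id TEXT)"
)
# Hypothetical sample rows: (dt, deviceid, user_id, ip, session_id).
rows = [
    ("2020-06-01", "d1", "u1", "1.1.1.1", "s1"),
    ("2020-06-01", "d1", "u1", "1.1.1.1", "s2"),
    ("2020-06-01", "d2", "  ", "1.1.1.2", "s3"),   # blank user_id: not logged in
    ("2020-06-02", "d3", "u2", "1.1.1.3", "s4"),
    ("2020-05-01", "d9", "u9", "9.9.9.9", "s9"),   # outside the date window
]
conn.executemany("INSERT INTO dwd_caiji_table VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT dt,
       COUNT(DISTINCT deviceid) AS uv,
       COUNT(DISTINCT CASE WHEN length(trim(user_id)) > 0 THEN user_id END) AS login_users,
       COUNT(DISTINCT ip) AS ip_num,
       COUNT(session_id) AS session_num
FROM dwd_caiji_table
WHERE dt BETWEEN date(:today, '-7 day') AND :today
GROUP BY dt
ORDER BY dt
"""
result = conn.execute(query, {"today": "2020-06-03"}).fetchall()
for line in result:
    print(line)
```

The blank `user_id` row is counted in `uv` but not in `login_users`, which is the effect of the `CASE WHEN length(trim(user_id)) > 0` guard.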

A question:

Many readers may ask why the extraction above uses HQL rather than SQL. In fact, Hive SQL was designed so that people who know SQL but cannot write MapReduce programs can still process data on Hadoop (after all, a company's real data volume is at the TB or PB scale, or even larger).

Popular big data computing frameworks can handle massive data and computation mainly because they rely on distributed computing (such as MapReduce). In distributed computing, a cluster shares the computing task. Ideally each node takes on a similar share of the data and the work, but in reality a badly unbalanced data distribution can lead to data skew.

So data skew has to be considered when doing ETL. There is plenty of material on the topic; look it up yourself.
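Data skew can be simulated in a few lines of Python. The sketch below hash-partitions keys across four imaginary nodes and then applies key salting, one common mitigation. The data, the `partition_sizes` and `salted` helpers, and the choice of CRC32 as the partitioner are all assumptions for illustration, not how Hive or MapReduce distribute keys internally.

```python
import zlib
from collections import Counter


def partition_sizes(keys, num_partitions):
    """Count how many records each partition receives under hash partitioning."""
    sizes = Counter()
    for key in keys:
        sizes[zlib.crc32(key.encode()) % num_partitions] += 1
    return [sizes.get(p, 0) for p in range(num_partitions)]


def salted(keys, num_salts):
    """Append a rotating salt suffix so one hot key spreads over several partitions."""
    return [f"{key}#{i % num_salts}" for i, key in enumerate(keys)]


# Hypothetical data: a single hot key dominates the table.
keys = ["hot"] * 900 + [f"user{i}" for i in range(100)]

plain = partition_sizes(keys, 4)
spread = partition_sizes(salted(keys, 16), 4)
print("without salting:", plain)
print("with salting:   ", spread)
```

Without salting, every "hot" record lands on the same partition. With salting the load spreads out, at the cost of a second aggregation step to merge the salted partial results back per original key.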

 

Indicator statistics and user analysis

Note: due to space limitations, items 3 and 4 of the contents are covered together.

Making a product data-driven is very worthwhile. The advantages are:

  1. Visualization: user behavior is made visible, giving a clear understanding of it

  2. Traceability: product problems can be located

  3. Verifiability: decisions can be supported and verified with data

  4. Predictability: changes in the data can be used to predict later trends

Being data-driven presupposes indicators to measure with. Here Jewels divides indicators into site traffic indicators and user behavior indicators. Some of them are near-universal analysis indicators, and some are defined for particular business scenarios.

If you do not understand what a particular indicator means, look it up yourself. What matters is understanding each indicator's definition and role.

For example, DAU:

Definition: Daily Active Users

Role: measures how active a product's users are (for example, the Jingdong app); it can be used to understand trends in user growth and decline.

Now for the main point: some of Jewels' views on user analysis (KPI performance indicators). Personally, I divide user analysis into two categories: basic analysis and model/strategy analysis. Put simply, basic analysis uses indicators to adjust operating strategy, while model/strategy analysis builds a user model system according to different business needs.

 

1. Basic analysis

Basic analysis indicators fall into two groups, one for new users and one for existing users. New users correspond to acquisition and conversion. Existing users can be divided into active, retained, churned, and repurchasing. For example:

Acquisition (channels):

Jingdong typically has channels such as the app, mobile web, WeChat, and PC, and the emphasis differs by business and traffic. As phones and mobile devices become smarter and their screens larger, e-commerce data from the 618 shopping festival shows that, under normal circumstances, non-PC users are the largest group of consumers. Businesses now pay more attention to marketing on non-PC channels (WeChat, app, mobile web). Understanding the devices and channels that users rely on helps operations maximize profit.

Conversion:

Conversion rate is the ratio of the number of users who complete a target action to the total number of visits. The target action can be any of a range of user behaviors such as logging in, registering, subscribing, downloading, or purchasing, so website conversion rate is a broad concept. In short, it measures visitors to the site becoming resident users of the site, which can also be understood as visitors converting into users.
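The definition above can be written as a small Python helper. The visit and action counts are hypothetical, and the `conversion_rate` name is introduced here only for illustration.

```python
def conversion_rate(converted: int, visits: int) -> float:
    """Share of visits that ended in the target action; 0.0 when there are no visits."""
    if visits == 0:
        return 0.0
    return converted / visits


# Hypothetical counts for one day of traffic.
visits = 2_000
actions = {"register": 300, "purchase": 50}

rates = {name: conversion_rate(count, visits) for name, count in actions.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```

Because the target action varies, the same site has several conversion rates at once, one per action.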

 

 

For existing users, the same reasoning applies to activity, retention, churn, and repurchase; search the relevant keywords and read up on them yourself.
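As one illustration for existing users, here is a Python sketch of day-n retention. The definition used (the share of users first seen on a given day who are active again n days later) is a common one but is an assumption here, since the article does not define it. The activity data and the `retention` helper are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical activity log: user_id -> set of days on which the user was active.
activity = {
    "u1": {date(2020, 6, 1), date(2020, 6, 2)},
    "u2": {date(2020, 6, 1)},
    "u3": {date(2020, 6, 1), date(2020, 6, 2), date(2020, 6, 3)},
    "u4": {date(2020, 6, 2), date(2020, 6, 3)},
}


def retention(activity, cohort_day, n):
    """Day-n retention: share of users first seen on cohort_day who are active n days later."""
    cohort = [u for u, days in activity.items() if min(days) == cohort_day]
    if not cohort:
        return 0.0
    target = cohort_day + timedelta(days=n)
    returned = sum(1 for u in cohort if target in activity[u])
    return returned / len(cohort)


day1 = retention(activity, date(2020, 6, 1), 1)
print(f"next-day retention: {day1:.0%}")
```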

 

2. Model and strategy analysis

I believe it is better to teach someone to fish than to give them a fish. This is a very important part of the content, and it cannot be explained clearly in a single post. Here are some reference links for each part; beyond that, you will need to build your own understanding and insight:

User behavior event model:

http://www.woshipm.com/data-analysis/686576.html

User behavior path analysis:

http://www.woshipm.com/data-analysis/704261.html

User experience analysis:

http://www.woshipm.com/discuss/53005.html

https://www.jianshu.com/p/f10f706d3ddd?from=groupmessage

User profile analysis:

https://mp.weixin.qq.com/s/ZBdRn8nBLQk9Qp049c_tqQ

Customer value scoring and precision marketing:

https://wenku.baidu.com/view/7e156f087275a417866fb84ae45c3b3567ecdd18.html

Funnel model:

http://www.woshipm.com/data-analysis/697156.html

Traffic monetization:

https://baike.baidu.com/item/%E6%B5%81%E9%87%8F%E8%B4%A7%E5%B8%81%E5%8C%96/17219976

 

Reference articles:

https://www.cnblogs.com/yjd_hycf_space/p/7772722.html

https://www.cnblogs.com/shujuxiong/p/10218727.html

https://blog.csdn.net/haoyuexihuai/article/details/53453100

https://blog.csdn.net/wuxintdrh/article/details/81990385

https://www.admin5.com/article/20180629/862661.shtml


Origin blog.csdn.net/YYIverson/article/details/105078696