Hands-on big data project built on Hadoop: e-commerce log analysis system

Project Introduction

The big data e-commerce log platform project is modeled on the real business data structures of an e-commerce website. It covers the full closed loop from data collection to data use, spanning front-end applications, back-end programs, data analysis, and platform deployment, and forms a complete e-commerce log analysis project suited to teaching.

The bf_dataapi project has two main goals: to provide a REST API that returns JSON data, and to provide a demo page that displays the results. bf_dataapi uses Spring + MyBatis + MySQL to build the framework behind the REST API, and uses Highcharts to build the demo page. All APIs in this project are highly aggregated: in the end we expose only two APIs, each of which performs different operations depending on the parameters it receives.
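To illustrate the "few endpoints, many parameters" idea, here is a minimal sketch of dispatching one request to different operations based on a parameter. It is written in Python for brevity and is not the project's Spring code; the parameter name `bucket` and the handler names are hypothetical.

```python
from typing import Callable, Dict

# Hypothetical handlers; in the real project these would query MySQL via MyBatis.
def user_stats(params: Dict[str, str]) -> dict:
    return {"module": "user", "date": params.get("date")}

def browser_stats(params: Dict[str, str]) -> dict:
    return {"module": "browser", "date": params.get("date")}

# One entry point, many behaviours: the "bucket" parameter (an assumed name)
# selects which analysis to run.
HANDLERS: Dict[str, Callable[[Dict[str, str]], dict]] = {
    "user": user_stats,
    "browser": browser_stats,
}

def handle_request(params: Dict[str, str]) -> dict:
    handler = HANDLERS.get(params.get("bucket", ""))
    if handler is None:
        return {"error": "unknown bucket"}
    return handler(params)
```

Adding a new analysis then means registering one more handler rather than exposing one more endpoint.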

Project structure

Project Requirements Analysis

JS SDK execution workflow:

The JS SDK divides the collected data into different events, such as the pageview event. The execution flow of the JS SDK is as follows:


PC event analysis

Each of the final analysis modules needs different data, so we work out the required events module by module.

Basic user information analysis looks at users' browsing behavior, so it needs only the pageview event. Browser information analysis and regional information analysis add a browser dimension and a region dimension on top of basic user information analysis. Browser information can be obtained from the browser's window.navigator.userAgent, and regional information can be derived from the user's IP address as collected by the nginx server, so the pageview event also covers these two modules.

For external link data analysis and user browsing depth analysis, we add the URL of the current page and the URL of the previous page to the pageview event, so the pageview event covers these two modules as well.

Order information analysis requires the PC side to send an event when an order is generated, so this module needs a new event, chargeRequest. Event analysis likewise requires the PC side to send a new kind of event data, which we define as event. In addition, we need a launch event to record a new user's visit.

The URL format used by the PC side to send the various events is as follows, where the parameters after the ? are the data we collected: http://bjsxt.com/bjsxt.gif?requestdata
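As a rough sketch of how collected data could be attached to that URL and read back on the analysis side, the Python below encodes an event as a query string and decodes it again. The key `en` for the event name and the other field names are assumptions for illustration, not the SDK's actual parameter names.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "http://bjsxt.com/bjsxt.gif"  # endpoint taken from the text above

def build_event_url(event_name: str, data: dict) -> str:
    # "en" as the event-name key is an assumption made for this sketch.
    params = {"en": event_name, **data}
    # urlencode escapes characters such as "?" and "&" inside values.
    return f"{BASE_URL}?{urlencode(params)}"

def parse_event_url(url: str) -> dict:
    # Reverse step: turn the query string back into a flat dict,
    # keeping the first value for each key.
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}
```

Encoding matters here because values such as the current page URL contain their own `?` and `&` characters.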

Backend event analysis

In this project, the backend triggers only the chargeSuccess event. Its main purpose is to send order-success information to the nginx server. The sending format is the same as on the PC side, and the data is likewise transmitted by requesting a URL. The format is:

http://bjsxt.com/bjsxt.jpg?requestdata

chargeSuccess event

This event is triggered when a member's payment finally succeeds. The program must call it explicitly.

chargeRefund event

This event is triggered when a member requests a refund. The program must call it explicitly.

Integration

Import the Java SDK directly into the project, or add it to the classpath.

In this project we analyze the data from seven angles: the basic user information analysis module, browser information analysis module, regional information analysis module, user browsing depth analysis module, external link data analysis module, order analysis module, and event analysis module. The following sections describe what each module ultimately presents.

Note a few concepts:

User/Visitor: A user as represented by a single browser. A uuid uniquely identifies the user.

Member: A regular registered member of the website.

Session: All consecutive operations within a period of time belong to one session.

PV: The number of page views.

In this project, all counts are deduplicated. For example, the number of active users/visitors is the number of distinct uuids.
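A minimal sketch of such a deduplicated count, assuming each log record is a dict with a `uuid` field (the record layout is an assumption for illustration):

```python
def count_active_users(events):
    # A set keeps each uuid once, so repeated events from the
    # same visitor are counted a single time.
    return len({e["uuid"] for e in events if e.get("uuid")})
```

Records without a uuid are skipped rather than counted as one anonymous visitor.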

Basic user information analysis module

The basic user information analysis module analyzes browsing-related information from two main perspectives, users/visitors and members. It includes, but is not limited to, new users, active users, total users, new members, active members, total members, and session analysis. The different angles are described below:

User analysis

This analysis covers new users, active users, and total users.

New visitors : returning visitors (among active visitors) = 1:7~10
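The three user metrics can be sketched as follows. This assumes one day's events as dicts with `uuid` and `event` fields, that a launch event marks a new user, and that the running total is the previous total plus the day's new users. All of these are assumptions for illustration, not the project's actual MapReduce jobs.

```python
def daily_user_stats(events, previous_total=0):
    # New users: distinct uuids that sent a launch event today.
    new_users = {e["uuid"] for e in events if e["event"] == "launch"}
    # Active users: distinct uuids that sent any event today.
    active_users = {e["uuid"] for e in events}
    return {
        "new": len(new_users),
        "active": len(active_users),
        # Total users accumulate day over day (assumed definition).
        "total": previous_total + len(new_users),
    }
```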

Member analysis

This analysis covers new members, active members, and total members.

Session analysis

This analysis covers the number of sessions, session length, and average session length.
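A minimal sketch of these three session metrics, assuming each event carries a `session_id` and a numeric `timestamp` in seconds (hypothetical field names), and taking session length as the time between a session's first and last event:

```python
from collections import defaultdict

def session_stats(events):
    # Collect every timestamp seen for each session.
    times = defaultdict(list)
    for e in events:
        times[e["session_id"]].append(e["timestamp"])
    # Session length = last event time minus first event time.
    lengths = {sid: max(ts) - min(ts) for sid, ts in times.items()}
    count = len(lengths)
    total = sum(lengths.values())
    return {
        "sessions": count,
        "total_length": total,
        "average_length": total / count if count else 0.0,
    }
```

Under this definition a session with a single event has length zero.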

Hourly Analysis

This analysis covers users, number of sessions, and session length for each hour of the day.
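Hourly bucketing for the user count might look like the sketch below. It assumes `timestamp` is in Unix seconds and uses UTC hours for simplicity. Both the field names and the time zone are assumptions, and a real deployment would likely use the site's local time zone.

```python
from collections import defaultdict
from datetime import datetime, timezone

def hourly_active_users(events):
    # Map hour of day (0-23) to the set of uuids seen in that hour.
    buckets = defaultdict(set)
    for e in events:
        hour = datetime.fromtimestamp(e["timestamp"], tz=timezone.utc).hour
        buckets[hour].add(e["uuid"])
    # Deduplicated user count per hour, ordered by hour.
    return {hour: len(uuids) for hour, uuids in sorted(buckets.items())}
```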

Browser information analysis module

This module adds a browser dimension on top of basic user information analysis.

Browser user analysis

Same as user analysis.

Browser member analysis

Same as member analysis.

Browser session analysis

Same as session analysis.
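Adding the browser dimension amounts to grouping the same counts by browser. The sketch below uses a deliberately crude substring match on the user agent string. The marker list, the field names, and the `Other` fallback are assumptions, and a production system would use a proper user agent parser.

```python
from collections import defaultdict

# Order matters: Edge user agents also contain "Chrome/", and
# Chrome user agents also contain "Safari/".
_BROWSER_MARKERS = [
    ("Edg/", "Edge"),
    ("Firefox/", "Firefox"),
    ("Chrome/", "Chrome"),
    ("Safari/", "Safari"),
]

def browser_name(user_agent):
    for marker, name in _BROWSER_MARKERS:
        if marker in user_agent:
            return name
    return "Other"

def active_users_by_browser(events):
    # Same deduplicated user count as before, keyed by browser.
    users = defaultdict(set)
    for e in events:
        users[browser_name(e["user_agent"])].add(e["uuid"])
    return {name: len(uuids) for name, uuids in users.items()}
```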

Regional Information Analysis Module

Mainly analyzes users and members across different provinces.

Geographic analysis of active visitors

Analyze the number of active visitors across different geographies.

User Access Depth Analysis Module

This module mainly analyzes the depth of users' access records.
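One possible reading of access depth is the number of distinct pages each visitor viewed. The sketch below computes how many visitors fall at each depth, assuming pageview records with `uuid` and `url` fields. The definition of depth and the field names are assumptions.

```python
from collections import Counter, defaultdict

def depth_distribution(pageviews):
    # Distinct pages seen per visitor.
    pages = defaultdict(set)
    for pv in pageviews:
        pages[pv["uuid"]].add(pv["url"])
    # Map depth (number of distinct pages) to number of visitors.
    return dict(Counter(len(urls) for urls in pages.values()))
```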

External link data analysis module

Mainly analyze the user traffic data brought by different external links.

External link preference analysis

Analyze the number of active visitors brought by each external link.
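As a sketch of this count, the code below groups deduplicated visitors by the host of the previous-page URL and skips visits that came from the site itself. The field names `referrer` and `uuid` and the `own_host` parameter are assumptions for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def active_users_by_referrer(pageviews, own_host):
    users = defaultdict(set)
    for pv in pageviews:
        # hostname is None for empty or relative referrers.
        host = urlsplit(pv.get("referrer") or "").hostname
        if host and host != own_host:
            users[host].add(pv["uuid"])
    # Deduplicated visitor count per external host.
    return {host: len(uuids) for host, uuids in users.items()}
```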

Order data analysis module

Mainly analyzes order-related data.


Origin blog.csdn.net/lxianshengde/article/details/124795124