Date: 2019.11.13
Blog period: 115
Wednesday
Result file data description:
ip: 106.39.41.166 (to be converted to a city)
date: 10/Nov/2016:00:01:02 +0800 (date)
day: 10 (day of the month)
traffic: 54 (traffic volume)
type: video (type: video or article)
id: 8701 (id of the video or article)
Testing requirements:
1. Data cleaning: clean the data as required and import the cleaned data into the Hive database.
Two-stage data cleaning:
(1) First stage: extract the required information from the original log
ip: 199.30.25.88
time: 10/Nov/2016:00:01:03 +0800
traffic: 62
article: article/11325
video: video/3235
(2) Second stage: refine the information extracted in the first stage
ip ---> city (looked up from the IP)
date ---> time: 2016-11-10 00:01:03
day: 10
traffic:62
type:article/video
id:11325
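The two smaller second-stage steps, deriving `day` from the timestamp and splitting a value such as `article/11325` into `type` and `id`, could be sketched as follows. This is only an illustration: the class `SecondStageSketch` and its method names are hypothetical and not part of the project, and it assumes the first-stage values have exactly the shapes shown above.

```java
/** Hypothetical helper illustrating part of the second cleaning stage. */
class SecondStageSketch {

    /** Splits a first-stage value such as "article/11325" into {type, id}. */
    static String[] splitTypeAndId(String resource) {
        // Tolerate optional spaces around the slash, e.g. "video / 3235".
        String[] parts = resource.trim().split("\\s*/\\s*", 2);
        if (parts.length != 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            throw new IllegalArgumentException("Expected type/id, got: " + resource);
        }
        return parts;
    }

    /** Extracts the day of month from a log timestamp such as "10/Nov/2016:00:01:03 +0800". */
    static String dayOf(String logTime) {
        // Assumes the timestamp starts with "dd/"; anything else fails with an exception.
        return logTime.substring(0, logTime.indexOf('/'));
    }

    public static void main(String[] args) {
        String[] typeAndId = splitTypeAndId("article/11325");
        System.out.println("day: " + dayOf("10/Nov/2016:00:01:03 +0800"));
        System.out.println("type: " + typeAndId[0]);
        System.out.println("id: " + typeAndId[1]);
    }
}
```

The IP-to-city lookup and the time conversion are handled by the IPUtil and TimeUtil classes used further down, so they are left out here.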
(3) Hive table structure:
create table data( ip string, time string , day string, traffic bigint,type string, id string )
2. Data processing:
· Top 10 most popular videos/articles by number of visits (video/article)
· Top 10 most popular courses by city (ip)
· Top 10 most popular courses by traffic (traffic)
3. Data visualization: import the statistical results into a MySQL database and display them graphically.
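To make the three statistics concrete, here is a minimal in-memory sketch in plain Java. It is not the Hive query the assignment expects; it only shows that all three are the same "group, aggregate, sort descending, keep the first N" operation with a different grouping key or weight. The class `TopNSketch`, its `Row` type, and the sample rows are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.ToLongFunction;
import java.util.stream.Collectors;

class TopNSketch {

    /** Minimal stand-in for one cleaned row (hypothetical; the project itself uses Bean). */
    static final class Row {
        final String city;
        final String type;
        final String id;
        final long traffic;

        Row(String city, String type, String id, long traffic) {
            this.city = city;
            this.type = type;
            this.id = id;
            this.traffic = traffic;
        }
    }

    /** Groups rows by key, sums the weight per group, and keeps the n largest groups. */
    static Map<String, Long> top(List<Row> rows, Function<Row, String> key,
                                 ToLongFunction<Row> weight, int n) {
        Map<String, Long> totals = rows.stream()
                .collect(Collectors.groupingBy(key, Collectors.summingLong(weight)));
        return totals.entrySet().stream()
                // Largest total first; ties are broken by key so the order is deterministic.
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(n)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("Beijing", "video", "8701", 54),
                new Row("Beijing", "article", "11325", 100),
                new Row("Shanghai", "video", "8701", 10));

        // By visits: every row counts once.
        System.out.println(top(rows, r -> r.type + "/" + r.id, r -> 1L, 10));
        // By city: group on the city instead of the course.
        System.out.println(top(rows, r -> r.city, r -> 1L, 10));
        // By traffic: weight each row by its traffic value.
        System.out.println(top(rows, r -> r.type + "/" + r.id, r -> r.traffic, 10));
    }
}
```

In Hive the same shape would be a GROUP BY with an aggregate, ordered descending and limited to 10 rows.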
Implementation:
A. Bean class (basic data class)
package com.hive.basic;

import com.hive.format.IPUtil;
import com.hive.format.TimeUtil;

/** One log record, with the same fields as the columns of the Hive table. */
public class Bean {
    protected String ip;
    protected String time;
    protected String day;
    protected int traffic;
    protected String type;
    protected String id;

    public Bean() {
        super();
    }

    public Bean(String ip, String time, String day, int traffic, String type, String id) {
        super();
        this.ip = ip;
        this.time = time;
        this.day = day;
        this.traffic = traffic;
        this.type = type;
        this.id = id;
    }

    public String getIp() {
        return ip;
    }
    public void setIp(String ip) {
        this.ip = ip;
    }
    public String getTime() {
        return time;
    }
    public void setTime(String time) {
        this.time = time;
    }
    public String getDay() {
        return day;
    }
    public void setDay(String day) {
        this.day = day;
    }
    public int getTraffic() {
        return traffic;
    }
    public void setTraffic(int traffic) {
        this.traffic = traffic;
    }
    public String getType() {
        return type;
    }
    public void setType(String type) {
        this.type = type;
    }
    public String getId() {
        return id;
    }
    public void setId(String id) {
        this.id = id;
    }

    /* Format conversion */
    public void format() {
        // Replace the IP with its city: take the fourth "|"-separated field of the
        // lookup result and drop the trailing "市" (city) suffix.
        this.ip = IPUtil.getCityInfo(this.ip).split("\\|")[3].replace("市", "");
        // Convert the log timestamp to the yyyy-MM-dd HH:mm:ss form.
        this.time = TimeUtil.deal(this.time);
    }

    /* Print the record as one comma-separated line */
    public void display() {
        System.out.println(ip + "," + time + "," + day + "," + traffic + "," + type + "," + id);
    }
}
B. Date format conversion class
package com.hive.format;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class TimeUtil {

    /** Converts a log timestamp such as "10/Nov/2016:00:01:02 +0800" to "yyyy-MM-dd HH:mm:ss". */
    public static String deal(String time) {
        SimpleDateFormat sdf = new SimpleDateFormat("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);
        Date dd;
        try {
            dd = sdf.parse(time); // convert the string to a Date
        } catch (ParseException e) {
            e.printStackTrace();
            return time; // return unparseable input unchanged instead of formatting a null Date
        }
        // The result is rendered in the JVM's default time zone.
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(dd);
    }

    public static void main(String[] args) {
        String dateString = "10/Nov/2016:00:01:02 +0800";
        System.out.println(deal(dateString));
    }
}
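As an alternative, the same conversion can be sketched with the java.time API (Java 8 and later). This is a swapped-in technique, not the author's code: the class name `TimeUtilSketch` is hypothetical, and it differs from the SimpleDateFormat version in that the result keeps the offset written in the log line (+0800) rather than being shifted to the JVM's default time zone.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

/** Hypothetical java.time variant of TimeUtil.deal. */
class TimeUtilSketch {

    // DateTimeFormatter instances are immutable and thread-safe, so they can be shared.
    private static final DateTimeFormatter LOG_FORMAT =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);
    private static final DateTimeFormatter TABLE_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Converts "10/Nov/2016:00:01:03 +0800" to "2016-11-10 00:01:03". */
    static String deal(String time) {
        // Throws DateTimeParseException if the input does not match LOG_FORMAT.
        return OffsetDateTime.parse(time, LOG_FORMAT).format(TABLE_FORMAT);
    }

    public static void main(String[] args) {
        System.out.println(deal("10/Nov/2016:00:01:03 +0800")); // prints 2016-11-10 00:01:03
    }
}
```

Because the local date and time are taken from the log line as written, the output is the same on every machine, which matches the target value given in the second cleaning stage.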