hIve—timestamp timestamp problem

Check the table first

timestamp can be converted to standard time (accurate to seconds); https://tool.lu/timestamp/

This time format is useful in many ways:

   Multiple times can be switched using the function.

  When each user generates behavior, use timestamp to distinguish the order relationship, and record which products have been seen when;

  Compare the size, such as the earliest order. .

   select max(`timestampss`) as max_tm,min(`timestampss`) as min_tm from user_data; --`` is the symbol before 1,

  

 

We use the nearest time as the time reference point:

  hive> select ((cast(893286638 as bigint)-cast(`timestampss` as bigint))/(60*60*24)) as days from user_data limit 10;

  

 hive>select id,collect_list(cast (days as bigint)) as days_list from (select id,
((cast(893286638 as bigint)-cast(`timestampss` as bigint))/(60*60*24)) as days from user_data)t group by id limit 10;

 To view the user's behavior time point, you can use this data to make a data cleaning rule
#collect_list, and put the data in the form of an array. The usefulness of the data obtained in this situation: At the same time, if there are multiple comments, there may be suspicion of fraudulent orders. The meaning of excavation.

Time decay requirements:

  A user has multiple behaviors. When doing behavior analysis, the more effective (better) the most recent behavior is. What is returned after sum is a list. So need to aggregate

hive> select user_id,collect_list(cast(days as int)) as day_list    
  from
  (
  select user_id,exp(-(cast(893286638 as bigint)-cast(`timestampss` as bigint))/(60*60*24)/2)*rating as days from user_data) t
  group by user_id
  limit 10;
  

  exp is normally distributed, and rating is equivalent to the score.

  

 

  

 

Guess you like

Origin http://43.154.161.224:23101/article/api/json?id=325663971&siteId=291194637