Learning Hive: "cannot recognize input near 'user' '.' 'user_id' in selection target" caused by a bad table alias

I ran into a problem while running HQL today.

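Hive kept reporting an error saying it could not recognize user.user_id. I suspected that the alias user was the problem, most likely because USER is a reserved keyword in recent Hive versions and the parser does not treat it as an ordinary identifier. After I changed the table alias to a name that was not otherwise in use, the query ran successfully. The failing session looked like this: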

hive (hive)> select user.user_id,user.date_dt,user.low_carbon
           > from 
           >     user_low_carbon user
           > join    
           > (select user_id,date_dt
           > from
           >    (select user_id,date_dt,
           >                datediff(date_dt,lag2) lag2_diff,
           >                datediff(date_dt,lag1) lag1_diff,
           >                datediff(date_dt,lead1) lead1_diff,
           >                datediff(date_dt,lead2) lead2_diff
           > from
           > (select user_id ,date_dt,
           >        lag(date_dt,2,'1970-01-01') over(partition by user_id order by date_dt) lag2,
           >        lag(date_dt,1,'1970-01-01') over(partition by user_id order by date_dt) lag1,
           >        lead(date_dt,1,'1970-01-01') over(partition by user_id order by date_dt) lead1,
           >        lead(date_dt,2,'1970-01-01') over(partition by user_id order by date_dt) lead2
           >        from
           >        (select user_id,date_format(regexp_replace(date_dt,'/','-'),'yyyy-MM-dd') date_dt,sum(low_carbon) sum_low_carbon
           > from user_low_carbon
           > where
           >   substring(date_dt,1,4)='2017'
           > group by user_id,date_dt
           > having
           >   sum_low_carbon>=100)t1)t2)t3
           > where
           > (lag2_diff=2 and lag1_diff=1)
           > or
           > (lag1_diff=1 and lead1_diff=-1)
           > or
           > (lead1_diff=-1 and lead2_diff=-2))t4    
           > on
           >     t4.user_id=user.user_id and t4.date_dt=date_format(regexp_replace(user.date_dt,'/','-'),'yyyy-MM-dd');

Correct code:

select a.user_id,a.date_dt,a.low_carbon
from 
    user_low_carbon a
join    
(select user_id,date_dt
from
   (select user_id,date_dt,
               datediff(date_dt,lag2) lag2_diff,
               datediff(date_dt,lag1) lag1_diff,
               datediff(date_dt,lead1) lead1_diff,
               datediff(date_dt,lead2) lead2_diff
from
(select user_id ,date_dt,
       lag(date_dt,2,'1970-01-01') over(partition by user_id order by date_dt) lag2,
       lag(date_dt,1,'1970-01-01') over(partition by user_id order by date_dt) lag1,
       lead(date_dt,1,'1970-01-01') over(partition by user_id order by date_dt) lead1,
       lead(date_dt,2,'1970-01-01') over(partition by user_id order by date_dt) lead2
       from
       (select user_id,date_format(regexp_replace(date_dt,'/','-'),'yyyy-MM-dd') date_dt,sum(low_carbon) sum_low_carbon
from user_low_carbon
where
  substring(date_dt,1,4)='2017'
group by user_id,date_dt
having
  sum_low_carbon>=100)t1)t2)t3
where
(lag2_diff=2 and lag1_diff=1)
or
(lag1_diff=1 and lead1_diff=-1)
or
(lead1_diff=-1 and lead2_diff=-2)) t4    
on
    t4.user_id=a.user_id and t4.date_dt=date_format(regexp_replace(a.date_dt,'/','-'),'yyyy-MM-dd');
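
Renaming the alias is the simplest fix. As an alternative, and assuming the parse error really does come from user being a reserved word, the alias can usually be kept by quoting it with backticks. The reduced query below is only a sketch of that idea against the same user_low_carbon table; it has not been verified on the Hive version used above, and it relies on the default quoted-identifier setting being in effect.

```sql
-- Sketch: quote the alias with backticks so the parser reads it as an
-- identifier instead of the reserved word USER.
-- Assumes hive.support.quoted.identifiers is left at its default value.
select `user`.user_id, `user`.date_dt, `user`.low_carbon
from user_low_carbon `user`
where substring(`user`.date_dt, 1, 4) = '2017';
```

Even when quoting works, a neutral alias such as a avoids having to repeat the backticks at every reference, which is why the corrected query above simply renames it.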

This multi-level nested HQL statement is complex, and it runs a total of three MapReduce jobs!

Origin blog.csdn.net/weixin_45813351/article/details/120810967