1. Using a Hive window function to pick the latest record before a given time under a condition (one window function effectively covers two windows)
In the medical business, one visit can have multiple prescriptions, and the prescriptions may be settled at different times. There will also be multiple AI-assistant medication recommendations, and therefore multiple recommendation logs. The recommendation log time and the prescription settlement time do not match, so the logs can only be associated with settlements at the visit level. For each prescription, we need to find the last recommendation record made before that prescription was settled. A window partitioned by visit alone gives only one time window per visit and so yields only one record, but a visit may have two prescriptions, and each needs its own latest preceding recommendation. The solution is to add a condition inside Hive's window function, so that a single window function picks out both records.
select
    t1.*
    -- use_flag = 1 marks the latest recommendation log made before the settlement
    ,case when substring(t1.gmt_created,1,19)=substring(t1.gmt_created_max,1,19) then 1 else 0 end as use_flag
from (
    select
        t1.*
        -- latest '2-2' log time that is not later than the settlement time,
        -- computed separately for each (visit, settlement time) pair
        ,max(
            case
                when t1.log_type='2-2' and substring(t1.gmt_created, 1, 19) <= substring(t2.expense_date, 1, 19) then substring(t1.gmt_created, 1, 19)
            end
        ) over(partition by t1.visitCode, t2.expense_date) as gmt_created_max
    from wedw_dw.unfold_chdisease_gpt_opt_log_df t1
    -- one row per distinct settlement time of a visit
    left join (select
            visit_no
            ,mi_card_no
            ,expense_date
        from wedw_dw.doris_yyf_styy_txynhis_record_settle_bill_detail_df_tmp
        group by
            visit_no
            ,mi_card_no
            ,expense_date
    ) t2 on t1.visitCode=t2.visit_no and t1.patientIdNo=t2.mi_card_no
) t1
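
To see how the conditional max behaves, here is a minimal sketch on made-up data. The CTE names logs and bills, their columns, and all timestamps are hypothetical and only stand in for the real tables above; the log_type filter is left out for brevity. It assumes a Hive version that supports CTEs and select without a from clause.

```sql
-- Hypothetical sample data: one visit with three recommendation logs and two settlements.
with logs as (
    select 'v1' as visit_code, '2024-01-01 09:00:00' as gmt_created
    union all
    select 'v1' as visit_code, '2024-01-01 09:30:00' as gmt_created
    union all
    select 'v1' as visit_code, '2024-01-01 10:30:00' as gmt_created
),
bills as (
    select 'v1' as visit_no, '2024-01-01 10:00:00' as expense_date
    union all
    select 'v1' as visit_no, '2024-01-01 11:00:00' as expense_date
)
select
    t.visit_code
    ,t.expense_date
    ,t.gmt_created
    -- 1 only on the latest log that is not later than this settlement
    ,case when t.gmt_created = t.gmt_created_max then 1 else 0 end as use_flag
from (
    select
        l.visit_code
        ,b.expense_date
        ,l.gmt_created
        -- logs later than the settlement become NULL and are ignored by max()
        ,max(case when l.gmt_created <= b.expense_date then l.gmt_created end)
            over (partition by l.visit_code, b.expense_date) as gmt_created_max
    from logs l
    join bills b on l.visit_code = b.visit_no
) t
order by t.expense_date, t.gmt_created;
```

The join produces six rows (three logs for each of the two settlements). For the 10:00:00 settlement only the 09:30:00 log gets use_flag = 1, and for the 11:00:00 settlement only the 10:30:00 log does, so one window function returns one flagged record per prescription.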