TDH: analysis of distinct versus group by

Why does group by compute the same result an order of magnitude faster than distinct?


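-- Query 1: deduplicate with group by, then count the groups
select count(1)
from
(
    select cust_isn
    from
        database.table
    group by
        cust_isn
) t;

-- Query 2: count the distinct values directly
select count(distinct cust_isn)
from
    database.table;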
    

The distinct query takes 24 s; the group by query takes 1 s.
DAG of the group by query:
two shuffle stages
the last node takes 1 s

DAG of the distinct query:
one shuffle stage
the final stage takes 24 s
Summary of the reason:
Although group by performs one more shuffle than distinct, the group by plan includes a pre-computation (pre-aggregation) task, so group by ends up faster overall.
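
To make the summary concrete, here is a toy Python model of the two plans. It is only a sketch under assumptions, not TDH's actual execution: it assumes the distinct plan sends every row to a single final task, and that the group by plan deduplicates inside each task before shuffling. The function names, the reducer count and the sample data are all hypothetical.

```python
from collections import defaultdict


def count_distinct_single_reducer(partitions):
    """Toy model of count(distinct x): one shuffle, all rows reach one task."""
    # Assumption: every row is sent to a single final task.
    shuffled = [value for part in partitions for value in part]
    final_task_input = len(shuffled)
    return len(set(shuffled)), final_task_input


def count_via_group_by(partitions, num_reducers=4):
    """Toy model of select count(1) from (select x ... group by x)."""
    # Pre-computation: each task deduplicates its own partition first.
    partial = [set(part) for part in partitions]

    # Shuffle 1: route keys by hash so equal keys meet on the same reducer.
    buckets = defaultdict(set)
    for keys in partial:
        for key in keys:
            buckets[hash(key) % num_reducers].add(key)

    # Each reducer counts its own groups; these counts run in parallel.
    partial_counts = [len(buckets[r]) for r in range(num_reducers)]

    # Shuffle 2: only one small number per reducer reaches the final task.
    final_task_input = len(partial_counts)
    return sum(partial_counts), final_task_input


if __name__ == "__main__":
    # Hypothetical data: 100,000 rows over 4 partitions, 1,000 distinct keys.
    parts = [[i % 1000 for i in range(p, 100000, 4)] for p in range(4)]
    print(count_distinct_single_reducer(parts))
    print(count_via_group_by(parts))
```

In this model both functions return the same count, but the final task of the distinct plan receives every row, while the final task of the group by plan receives one partial count per reducer. That is consistent with the measured DAGs, where the last stage takes 24 s for distinct and 1 s for group by.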


Origin blog.csdn.net/wj1298250240/article/details/103962715