Usage of concat(), concat_ws(), collect_set() and collect_list() in SQL

1. concat() and concat_ws() handle NULL very differently when concatenating

1.1 Differences

concat(): Concatenates its arguments. If any argument is NULL, the result is NULL.

Code:
select concat('a','b',null);

Result:
NULL

concat_ws(): Concatenates with a separator, which must be given as the first argument. NULL values among the remaining arguments are skipped instead of turning the whole result into NULL.

Code 1:
hive> select concat_ws('-','a','b');
Result:
a-b

Code 2:
hive> select concat_ws('-','a','b',null);
Result:
a-b

Code 3:
hive> select concat_ws('','a','b',null);
Result:
ab
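
To put the two behaviours side by side, here is a small sketch that runs both functions on the same arguments. The column aliases are made up for illustration, and the coalesce() wrapper is only one possible way to keep concat() from returning NULL; it is not something the examples above rely on.

```sql
-- Same arguments, one of them NULL.
select
  concat('a', 'b', null)                               as with_concat,     -- NULL
  concat_ws('-', 'a', 'b', null)                       as with_concat_ws,  -- a-b
  -- Assumed workaround: replace NULL with an empty string before concat().
  concat('a', 'b', coalesce(cast(null as string), '')) as with_coalesce;   -- ab
```

With real columns, the same pattern would be concat(col1, coalesce(col2, '')), where col1 and col2 are placeholder names.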

2. The difference between collect_set() and collect_list()

Reference link: SQL small knowledge point series-3-collect_list/collect_set (column transfer) - Zhihu

2.1 Differences:

Both functions aggregate the values of a column within a group into an array and return it.

The difference is that collect_list() keeps duplicates, while collect_set() removes them.
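
A minimal sketch of that difference, using three inline rows instead of a real table. The id value 'x', the aliases, and the inline subquery are assumptions for illustration; a select without a from clause needs a reasonably recent Hive or Spark SQL. Only the array sizes are compared, because the element order is not something to rely on.

```sql
select
  t.id,
  size(collect_list(t.class)) as list_size,  -- 3: the duplicate 1 is kept
  size(collect_set(t.class))  as set_size    -- 2: the duplicate 1 is removed
from (
  select 'x' as id, 1 as class
  union all
  select 'x' as id, 1 as class
  union all
  select 'x' as id, 2 as class
) t
group by
  t.id;
```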

2.2 collect_list(): no deduplication, elements stay in the order the rows are read

2.3 collect_set(): deduplication, unordered

After grouping by a field, the collect_list() function merges the values of each group into one array. When the array is displayed, its elements are separated by ',', for example:

a b c
1 1 "1","2"
1 2 "1","2"
1 2 "1","2","2"

2.4 Examples

Raw data in table temp

id class
loongshaw 1
loongshaw 2
loongshaw 3
loongshaw 4

Expected result

id class
loongshaw 1,2,3,4

Query:

select
  t.id,
  concat_ws(',', collect_set(t.class))
from
  temp t
group by
  t.id

Result: the classes are not in order after merging

id class
loongshaw 1,3,2,4

Solution:
Replace the unordered collect_set() with collect_list(), or wrap it in sort_array() to sort the elements.

concat_ws(',', sort_array(collect_set(t.class)))

sort_array(e: Column, asc: Boolean) sorts the elements of the array in their natural order. The default is ascending; passing false as the second argument sorts in descending order, which would give 4,3,2,1 here.

or:

concat_ws(',',collect_list(t.class))

Result: the classes are merged in order

id class
loongshaw 1,2,3,4
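
Putting the sort_array() variant into the full query might look like the sketch below. The cast to string and the alias class are assumptions: concat_ws() works on strings, so the cast matters only if class is stored as a number. Note that after the cast the sort is lexicographic, so a value such as 10 would sort before 2.

```sql
select
  t.id,
  -- Deduplicate, sort ascending, then join with commas.
  concat_ws(',', sort_array(collect_set(cast(t.class as string)))) as class
from
  temp t
group by
  t.id;
```

For the sample data above this returns loongshaw 1,2,3,4 regardless of the order in which the rows are read.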


Origin blog.csdn.net/weixin_48272780/article/details/128243152