Oracle 12c New Feature --- APPROX_COUNT_DISTINCT

1 Overview

The new and optimized SQL function, APPROX_COUNT_DISTINCT(), provides approximate count distinct aggregation. Processing of large volumes of data is significantly faster than the exact aggregation, especially for data sets with a large number of distinct values, with negligible deviation from the exact result.

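In short: on large data sets, particularly ones with many distinct values, it runs much faster than the exact aggregation while deviating only slightly from the exact result.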

The need to count distinct values is a common operation in today’s data analysis. Optimizing the processing time and resource consumption by orders of magnitude while providing almost exact results speeds up any existing processing and enables new levels of analytical insight.

This function is very useful in modern data analysis: it returns nearly exact results and, most importantly, it is very fast.

Note: the argument can be a column of any scalar data type except BFILE, BLOB, CLOB, LONG, LONG RAW, or NCLOB. Rows in which the expression is NULL are ignored.
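To illustrate the NULL handling described above, here is a minimal sketch that runs against an inline data set built from DUAL. The CTE name t and the column name v are made up for this illustration and are not part of the original example.

```sql
-- Hypothetical inline data: two 'A' rows, one 'B' row, one NULL row.
WITH t AS (
  SELECT 'A' AS v FROM dual UNION ALL
  SELECT 'A'      FROM dual UNION ALL
  SELECT 'B'      FROM dual UNION ALL
  SELECT CAST(NULL AS VARCHAR2(1)) FROM dual
)
SELECT COUNT(*)                 AS total_rows,      -- counts every row, including the NULL one
       COUNT(v)                 AS non_null_rows,   -- skips the NULL row
       APPROX_COUNT_DISTINCT(v) AS approx_distinct  -- also skips the NULL row
FROM   t;
```

COUNT(*) counts all four rows, whereas the approximate distinct count should come out at or near 2, because the NULL row does not contribute a value.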

2 Example

-- Exact count of distinct object_name values

15:39:31 SQL> select count(distinct(object_name)) from cndba_t;

COUNT(DISTINCT(OBJECT_NAME))
---------------------------
       67455

Elapsed: 00:00:00.64

-- Using APPROX_COUNT_DISTINCT()

15:39:36 SQL> select APPROX_COUNT_DISTINCT(object_name) from cndba_t;

APPROX_COUNT_DISTINCT(OBJECT_NAME)
----------------------------------
     67063

Elapsed: 00:00:00.14

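The elapsed times differ noticeably: 0.64 s for the exact count versus 0.14 s for the approximate one, roughly 4.6 times faster, while the two results differ by 392 (about 0.6%). The larger the data set and the more distinct values it contains, the wider the gap becomes.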
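To check the deviation on your own data, both aggregates can be computed in a single statement. The sketch below reuses the cndba_t table and object_name column from the session above; the aliases exact_cnt, approx_cnt and pct_error are illustrative names only.

```sql
-- Sketch: exact vs. approximate distinct count, plus the relative deviation.
SELECT exact_cnt,
       approx_cnt,
       -- NULLIF avoids a divide-by-zero error if the table has no non-NULL values
       ROUND(100 * ABS(exact_cnt - approx_cnt) / NULLIF(exact_cnt, 0), 2) AS pct_error
FROM  (SELECT COUNT(DISTINCT object_name)        AS exact_cnt,
              APPROX_COUNT_DISTINCT(object_name) AS approx_cnt
       FROM   cndba_t);
```

For the figures in the session above (67455 exact, 67063 approximate), the deviation works out to 392 / 67455, or about 0.58%. Because this statement still runs the exact aggregation, it is useful only for validating accuracy, not for measuring the speedup.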

Reference:
http://docs.oracle.com/database/121/SQLRF/functions013.htm#SQLRF56900

Reposted from blog.csdn.net/qianglei6077/article/details/92979346