ODPS Functions


Common Functions


Built-in functions

coalesce(): returns the first non-NULL value in the list; if all values in the list are NULL, NULL is returned.
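As a rough illustration of the rule above, the following Python sketch mimics coalesce, with None standing in for NULL. It only models the semantics described here and is not ODPS code.

```python
def coalesce(*values):
    """Return the first value that is not None, or None if all are None."""
    for value in values:
        if value is not None:
            return value
    return None


print(coalesce(None, None, "fallback"))  # first non-None value
print(coalesce(None, 0, 5))              # 0 is a real value, not NULL
print(coalesce(None, None))              # every input is None
```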

concat(): string concatenation function.

least(): returns the smallest of the input parameters.

greatest(var1, var2, ...): returns the largest of the input parameters. The arguments may be bigint, double, datetime, or string. NULL is returned only if all values are NULL.

Return value: the maximum of the input parameters. When no implicit conversion takes place, the return type is the same as the input type. NULL is treated as the minimum. When the input types differ, double, bigint, and string values are compared after conversion to double; string and datetime values are compared after conversion to datetime. No other implicit conversions are allowed.
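The NULL handling of greatest can be sketched in Python as follows, with None standing in for NULL. This is a minimal sketch that assumes all inputs already share one type; the implicit conversions described above are not modeled.

```python
def greatest(*values):
    """Largest non-None value; None only when every input is None."""
    non_null = [v for v in values if v is not None]
    # NULL counts as the minimum, so it can only win when nothing else is present.
    return max(non_null) if non_null else None


print(greatest(3, None, 7))
print(greatest(None, None))
```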

decode(): implements branch selection.

eg:select decode(customer_id,

            1, 'Taobao',

            2, 'Alipay',

            3, 'Aliyun',

            NULL, 'N/A',

            'Others') as result

    from sale_detail;

The decode call above implements the following if-then-else logic:

    if customer_id = 1 then

        result := 'Taobao';

    elsif customer_id = 2 then

        result := 'Alipay';

    elsif customer_id = 3 then

        result := 'Aliyun';

    ...

    else

        result := 'Others';

    end if;
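The same branching can be sketched in Python. The helper below is a hypothetical stand-in for decode, not ODPS code; it assumes that a NULL expression matches a NULL search value, as the NULL, 'N/A' pair in the query above suggests, and that a trailing unpaired argument is the default.

```python
def decode(expression, *args):
    """args is search1, result1, search2, result2, ..., optionally followed by a default."""
    pairs, default = args, None
    if len(args) % 2 == 1:
        # An odd number of arguments means the last one is the default result.
        pairs, default = args[:-1], args[-1]
    for i in range(0, len(pairs), 2):
        if expression == pairs[i]:  # None == None is True, modelling NULL matching NULL
            return pairs[i + 1]
    return default


mapping = (1, "Taobao", 2, "Alipay", 3, "Aliyun", None, "N/A", "Others")
for customer_id in (1, 3, None, 42):
    print(customer_id, decode(customer_id, *mapping))
```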

if(condition, column1, column2): returns column1 when the logical condition is satisfied; otherwise returns column2.

eg:if(cap_direction not in('0','1'),null, cast(cap_direction as bigint));

substr(str, start_position[, length]): returns the substring of str that starts at start_position and is length characters long; if length is omitted, the substring runs to the end of str.

eg: substr("abc", 2) = "bc"; substr("abc", 2, 1) = "b";
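The examples imply that positions are counted from 1. A small Python sketch of that indexing follows; it assumes start_position is at least 1 and does not model other cases.

```python
def substr(s, start_position, length=None):
    """1-based substring: start at start_position, take length characters (or the rest)."""
    begin = start_position - 1  # convert the 1-based position to a 0-based index
    if length is None:
        return s[begin:]
    return s[begin:begin + length]


print(substr("abc", 2))
print(substr("abc", 2, 1))
```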

to_char(): converts a boolean, bigint, decimal, or double value to its string representation.

eg:to_char(123) = '123';to_char(true) = 'TRUE';to_char(1.23) = '1.23';to_char(null) = NULL;

to_char(date, format): converts a datetime value to a string in the given format. A string input is implicitly converted to datetime before the conversion; any other input type raises an exception.

eg:to_char(getdate(),'yyyymmdd')

concat(column1, ',', column2): string concatenation function

Matching coordinates to two decimal places:

concat(substr(to_char(lng),1,6),',',substr(to_char(lat),1,5)) like '120.08,30.28'; 
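The expression above builds a "lng,lat" key by cutting the text form of each coordinate to a fixed width. A Python sketch of the same idea is shown below; coordinate_key is a hypothetical name, and the sketch assumes to_char renders the numbers the way Python's str does. Note that the values are truncated, not rounded.

```python
def coordinate_key(lng, lat):
    """Keep the first 6 characters of lng and the first 5 of lat, joined by a comma."""
    return str(lng)[:6] + "," + str(lat)[:5]


print(coordinate_key(120.0812345, 30.28456))
print(coordinate_key(120.0899, 30.2899))  # truncation: still falls in the same cell
```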

regexp_extract(column, pattern, number): extracts the part of the string matched by capture group number of the pattern.

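For example, for an intersection name of the form 'A Road and B intersection':

regexp_extract(inter_name, '(.*?)(Road)', 1) returns the text before the first 'Road', i.e. the first road name

regexp_extract(inter_name, 'and(.*?)(intersection)', 1) returns the text between 'and' and 'intersection', i.e. the second road name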
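The capture-group idea can be tried out with Python's re module. This is only a sketch: the intersection name is made up for illustration, the patterns add \s* to absorb spaces, and returning None on no match is an assumption rather than documented ODPS behaviour.

```python
import re


def regexp_extract(source, pattern, group=1):
    """Return the text captured by the given group of the first match, or None."""
    match = re.search(pattern, source)
    return match.group(group) if match else None


name = "Wenyi Road and Gudun Road intersection"  # hypothetical sample value
first_road = regexp_extract(name, r"(.*?)\s*(Road)", 1)
second_road = regexp_extract(name, r"and\s*(.*?)\s*(intersection)", 1)
print(first_road)
print(second_road)
```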

regexp_replace: string replacement function

regexp_replace(round_name, '-', '', 1) replaces the hyphen (-) with an empty string

split_part(str, separator, n): splits the string by the separator and returns the n-th part

split_part('North Central-Bridge dense', '-', 2) = 'Bridge dense'
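A Python sketch of the split-and-pick behaviour follows. Parts are numbered from 1; returning an empty string for an out-of-range part number is an assumption made for this sketch.

```python
def split_part(s, separator, n):
    """Split s on separator and return the n-th part, counting from 1."""
    parts = s.split(separator)
    return parts[n - 1] if 1 <= n <= len(parts) else ""


print(split_part("a-b-c", "-", 2))
```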

instr(str1, str2[, start_position[, nth_appearance]]): returns the position of the substring str2 within the string str1

  instr('Tech on the net', 'e') = 2;instr('Tech on the net', 'e', 1, 1) = 2
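The two calls above can be reproduced with a short Python sketch. Positions are 1-based, the third argument is the position to start searching from, and the fourth selects which occurrence to report. Returning 0 when the substring is not found is an assumption of this sketch.

```python
def instr(str1, str2, start_position=1, nth_appearance=1):
    """1-based position of the nth occurrence of str2 in str1, searching from start_position."""
    index = start_position - 1  # 0-based index where the next search begins
    for _ in range(nth_appearance):
        index = str1.find(str2, index)
        if index == -1:
            return 0
        index += 1  # step past this hit; also converts the last hit to a 1-based position
    return index


print(instr("Tech on the net", "e"))
print(instr("Tech on the net", "e", 1, 2))
```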

cast(expr as type): converts a value to the given type, e.g. cast(cap_direction as bigint)

coors_convert(lng, lat, 1): converts Google coordinates to Gaode (AMap) coordinates, e.g. coors_convert(120.2334214, 30.21829241, 1)

WHERE judge_location(split_part(coors_convert(a.lng,a.lat,1),',',1),split_part(coors_convert(a.lng,a.lat,1),',',2))=1

Window Functions

Statistics: count, sum, avg, max/min, median, stddev, stddev_samp

Ranking: row_number, rank, dense_rank, percent_rank

Other categories: lag, lead, cluster_sample

--------------------

Basic usage: the data is divided into groups according to some condition, and each group is called a window.

The partition by clause specifies the columns used to form the windows.

Rows with the same values in the partitioning columns belong to the same window.

The order by clause specifies how the data within a window is sorted.

Limitations: window functions can only appear in the select clause.

Window functions cannot contain nested window functions or aggregate functions.

Window functions cannot be used together with aggregate functions at the same level.

A single ODPS SQL statement can use at most 5 window functions.

When partitioning, a single window can contain at most 100 million rows.

When a window is specified with rows, x and y must be integer constants greater than or equal to 0, in the range 0 to 10000; a value of 0 denotes the current row.

order by must be specified before the window range can be set with rows.

Not all window functions accept a rows window; those that do are avg, count, max, min, stddev, and sum.
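To show what a rows window does, here is a sketch that runs a moving sum in SQLite through Python's sqlite3 module. SQLite stands in for ODPS purely for illustration (it needs SQLite 3.25 or later, and its frame syntax may differ in detail from ODPS); the readings table and its columns are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table readings (t integer, v integer)")
conn.executemany(
    "insert into readings values (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40)],
)

# Each row's window is the previous row plus the current row, in order of t.
moving = conn.execute(
    """
    select t, v,
           sum(v) over (order by t rows between 1 preceding and current row) as moving_sum
    from readings
    order by t
    """
).fetchall()

for row in moving:
    print(row)
```

The order by inside over(...) is what makes "previous row" meaningful, which matches the rule that a rows window requires order by.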

----------------------

For example:

select *,rank() over(partition by monitor_id order by distance) as mindistance_monitor_id from()
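A runnable sketch of the same rank() pattern is given below, again using SQLite through Python's sqlite3 module as a stand-in for ODPS (SQLite 3.25 or later). The candidates table, the road_id column, and the sample rows are hypothetical; only monitor_id and distance come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table candidates (monitor_id integer, road_id text, distance real)")
conn.executemany(
    "insert into candidates values (?, ?, ?)",
    [(1, "a", 30.0), (1, "b", 10.0), (2, "c", 5.0), (2, "d", 5.0), (2, "e", 9.0)],
)

# Rank the rows of each monitor_id by distance; the closest row gets rank 1.
ranked = conn.execute(
    """
    select monitor_id, road_id, distance,
           rank() over (partition by monitor_id order by distance) as rk
    from candidates
    order by monitor_id, rk, road_id
    """
).fetchall()

for row in ranked:
    print(row)
```

Rows that tie on distance share a rank and the next rank is skipped, so filtering on rk = 1 in an outer query can return more than one row per monitor_id.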

Custom Functions


Creating custom functions on Alibaba Cloud ODPS

Note: because the ODPS version used in this example is too old, the package was not created by following the Aliyun Maven example step by step; a self-built package was used instead. The jar produced by following the step-by-step example was faulty: it contained no class files, only resource and configuration files.

Glossary:

UDF (user defined scalar function): a user-defined scalar function with a one-to-one input/output relationship; it reads one row of data (which may have several parameters) and writes one output value.

UDTF (user defined table valued function): a user-defined table-valued function, used when one function call must output multiple rows; it is the only kind of custom function that can return multiple fields.

UDAF (user defined aggregation function): a user-defined aggregate function with a many-to-one input/output relationship; it aggregates multiple input records into one output value (and can be used together with a group by clause).
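The three shapes can be contrasted with plain Python classes. This is a sketch of the input/output relationships only: the class and method names are illustrative, nothing here uses the ODPS SDK, and a real function would additionally need the platform's base classes, packaging, and registration, none of which is shown.

```python
class AddOne:
    """UDF shape: one row in, one value out."""

    def evaluate(self, x):
        return None if x is None else x + 1


class SplitWords:
    """UDTF shape: one row in, zero or more rows out, each with several fields."""

    def process(self, line):
        for word in line.split():
            yield (word, len(word))


class Average:
    """UDAF shape: many rows in, one value out, via a mergeable buffer."""

    def new_buffer(self):
        return [0.0, 0]  # running sum, running count

    def iterate(self, buffer, value):
        if value is not None:
            buffer[0] += value
            buffer[1] += 1

    def merge(self, buffer, partial):
        buffer[0] += partial[0]
        buffer[1] += partial[1]

    def terminate(self, buffer):
        return buffer[0] / buffer[1] if buffer[1] else None


print(AddOne().evaluate(1))
print(list(SplitWords().process("user defined function")))

avg = Average()
buf = avg.new_buffer()
for v in (1, 2, 3):
    avg.iterate(buf, v)
print(avg.terminate(buf))
```

The buffer with a merge step reflects how an aggregate is typically computed in parallel: each worker fills its own buffer, and the partial buffers are combined before the final value is produced.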



Origin blog.51cto.com/13184837/2402099