Good Programmers Big Data Learning Route: Hive Built-in Functions

This installment of the Good Programmers Big Data learning route covers Hive's built-in functions. We keep updating the series in the hope that it helps everyone who is learning big data.
1, random number function: rand
Syntax: rand(), rand(int seed)
Return value: double
Description: Returns a random number in the range 0 to 1. If a seed is specified, the function produces a stable (repeatable) sequence of random numbers.
SELECT rand();
SELECT rand(10);
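A common follow-up is turning the double returned by rand() into an integer in a chosen range. The queries below are a minimal sketch of that idea; the ranges and the seed value 42 are arbitrary choices for illustration, not something the article prescribes.

```sql
-- Random integer from 0 to 99: scale the double, then drop the fractional part.
SELECT cast(floor(rand() * 100) AS int);

-- Random integer from 1 to 6, like a die roll.
SELECT cast(floor(rand() * 6) AS int) + 1;

-- With a fixed seed, rerunning the same query should give the same value.
SELECT rand(42);
```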
2, string splitting function: split
Syntax: split(string str, string pat)
Return value: array
Description: Splits str around matches of pat and returns the pieces as an array of strings. pat is a regular expression, so delimiters that are regex metacharacters (such as ".") must be escaped.
SELECT split(5.0, "\\.")[0];
SELECT split(rand(10) * 100, "\\.")[0];
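Because the second argument of split is a regular expression, whether a delimiter needs escaping depends on the character. The sketch below uses made-up literal strings to show a plain delimiter next to two that are regex metacharacters.

```sql
-- A comma has no special meaning in a regex, so no escaping is needed.
SELECT split('a,b,c', ',');              -- ["a","b","c"]

-- A dot matches any character in a regex, so it is escaped as \\.
SELECT split('192.168.0.1', '\\.')[3];   -- 1

-- A pipe means "or" in a regex, so it is escaped as well.
SELECT split('a|b|c', '\\|')[1];         -- b
```

Array indexes start at 0, so `[3]` picks the fourth piece.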
3, substring functions: substr, substring
Syntax: substr(string A, int start), substring(string A, int start)
Return value: string
Description: Returns the part of string A from position start to the end of the string.
Syntax: substr(string A, int start, int len), substring(string A, int start, int len)
Return value: string
Description: Returns the part of string A that begins at position start and is len characters long.
SELECT substr(rand() * 100, 0, 2);
SELECT substring(rand() * 100, 0, 2);
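The position argument is easier to follow on a fixed string than on a random number. The examples below are an illustrative sketch using the made-up literal 'hive_function'; positions are 1-based, and a negative start counts from the end of the string.

```sql
-- From position 6 to the end of the string.
SELECT substr('hive_function', 6);      -- function

-- Four characters starting at position 1.
SELECT substr('hive_function', 1, 4);   -- hive

-- A negative start counts backwards from the end: the last 8 characters.
SELECT substr('hive_function', -8);     -- function
```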
4, conditional function: if
Syntax: if(boolean testCondition, T valueTrue, T valueFalseOrNull)
Return value: T
Description: Returns valueTrue when testCondition is TRUE; otherwise returns valueFalseOrNull.
SELECT if(100 > 10, "this is true", "this is false");
SELECT if(2 = 1, "male", "female");
SELECT if(1 = 1, "male", (if(1 = 2, "female", "unknown")));
SELECT if(3 = 1, "male", (if(3 = 2, "female", "unknown")));
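One typical use of if is substituting a default when a value is missing. The sketch below uses literal values only; the labels 'missing' and 'present' are invented for the example.

```sql
-- The condition can be any boolean expression, including a NULL check.
SELECT if(NULL IS NULL, 'missing', 'present');   -- missing

-- Supplying a default for a NULL value.
SELECT if(NULL IS NULL, 0, 1) + 10;              -- 10
```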
5, conditional expression: CASE
First format:
Syntax: CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END
Return value: T
Description: If a is TRUE, returns b; if c is TRUE, returns d; otherwise returns e.
Second format:
Syntax: CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END
Return value: T
Description: If a equals b, returns c; if a equals d, returns e; otherwise returns f.
select
case 6
when 1 then "100"
when 2 then "200"
when 3 then "300"
when 4 then "400"
else "others"
end
;
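The CASE WHEN form without an operand is the one to reach for when the branches are ranges rather than exact matches. The following is a small sketch with a literal score of 85 and made-up thresholds; branches are tested top to bottom and the first TRUE condition wins.

```sql
-- Range-based branching: the first condition that is TRUE decides the result.
SELECT
  CASE
    WHEN 85 >= 90 THEN 'excellent'
    WHEN 85 >= 60 THEN 'pass'
    ELSE 'fail'
  END;   -- pass

-- Without an ELSE branch, an unmatched CASE returns NULL.
SELECT CASE 6 WHEN 1 THEN '100' END;   -- NULL
```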
-- Create the table
create table if not exists cw(
flag int
)
;
load data local inpath '/home/flag' into table cw;
-- Form with an operand: CASE a WHEN b THEN c ...
select
case c.flag
when 1 then "100"
when 2 then "200"
when 3 then "300"
when 4 then "400"
else "others"
end
from cw c
;
-- Form without an operand: CASE WHEN a THEN b ...
select
case
when 1=c.flag then "100"
when 2=c.flag then "200"
when 3=c.flag then "300"
when 4=c.flag then "400"
else "others"
end
from cw c
;
6, regular expression replacement function: regexp_replace
Syntax: regexp_replace(string A, string B, string C)
Return value: string
Description: Replaces every part of string A that matches the Java regular expression B with C. Some characters need to be escaped. The function is similar to Oracle's regexp_replace.
SELECT regexp_replace("1.jsp", ".jsp", ".html");
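The note about escaping matters because an unescaped dot in the pattern matches any character, not just a literal ".". The sketch below uses made-up file names to contrast an escaped and an unescaped pattern.

```sql
-- Escaped dot plus an end-of-string anchor: only a real ".jsp" suffix is replaced.
SELECT regexp_replace('1.jsp', '\\.jsp$', '.html');   -- 1.html

-- Unescaped dot: "xjsp" also matches, which is probably not what was intended.
SELECT regexp_replace('1xjsp', '.jsp', '.html');      -- 1.html

-- Character classes work as in Java: replace each run of digits with "#".
SELECT regexp_replace('a1b22c333', '[0-9]+', '#');    -- a#b#c#
```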
7, type conversion function: cast
Syntax: cast(expr as type)
Return value: the target type
Description: Returns expr converted to the given data type.
SELECT 1;
SELECT cast(1 AS double);
SELECT cast("12" AS int);
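It helps to see what cast does when the conversion loses information or cannot succeed. The literals below are chosen only for illustration.

```sql
-- A numeric string converts cleanly.
SELECT cast('3.7' AS double);   -- 3.7

-- Converting a fractional number to int drops the fractional part.
SELECT cast(3.7 AS int);        -- 3

-- A string that is not a number does not raise an error; the result is NULL.
SELECT cast('abc' AS int);      -- NULL
```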
8, string concatenation function: concat; concatenation with a separator: concat_ws
Syntax: concat(string A, string B...)
Return value: string
Description: Returns the input strings joined together. Any number of input strings is supported.
Syntax: concat_ws(string SEP, string A, string B...)
Return value: string
Description: Returns the input strings joined together, with SEP as the separator between them.
SELECT "Shiqianfeng" + 1603 + "class";
SELECT concat("Shiqianfeng", 1603, "class");
SELECT concat_ws("|", "
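The two functions differ most visibly in how they treat NULL arguments. The following sketch uses invented literals to show both functions side by side.

```sql
-- Plain concatenation.
SELECT concat('a', 'b', 'c');                -- abc

-- The separator is placed between every pair of arguments.
SELECT concat_ws('-', '2024', '01', '15');   -- 2024-01-15

-- concat returns NULL as soon as any argument is NULL.
SELECT concat('a', NULL, 'c');               -- NULL

-- concat_ws skips NULL arguments instead.
SELECT concat_ws('|', 'a', NULL, 'c');       -- a|c
```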

9, ranking functions: row_number, rank, dense_rank
row_number(): no tied ranks; every row gets its own number
rank(): tied rows share a rank, and the ranks that follow are skipped (gaps)
dense_rank(): tied rows share a rank, with no gaps
Sample data:
id class score
1 1 90
2 1 85
3 1 87
4 1 60
5 2 82
6 2 70
7 2 67
8 2 88
9 2 93

Expected result, the top three in each class (id, class, score, rank):
1 1 90 1
3 1 87 2
2 1 85 3
9 2 93 1
8 2 88 2
5 2 82 3

create table if not exists uscore(
uid int,
classid int,
score double
)
row format delimited fields terminated by '\t'
;
load data local inpath '/home/uscore' into table uscore;
select
u.uid,
u.classid,
u.score
from uscore u
group by u.classid,u.uid,u.score
limit 3
;
select
u.uid,
u.classid,
u.score,
row_number() over(distribute by u.classid sort by u.score desc) rn
from uscore u
;
Take the top three in each class:
select
t.uid,
t.classid,
t.score
from
(
select
u.uid,
u.classid,
u.score,
row_number() over(distribute by u.classid sort by u.score desc) rn
from uscore u
) t
where t.rn < 4
;
Compare the three ranking functions:
select
u.uid,
u.classid,
u.score,
row_number() over(distribute by u.classid sort by u.score desc) rn,
rank() over(distribute by u.classid sort by u.score desc) rank,
dense_rank() over(distribute by u.classid sort by u.score desc) dr
from uscore u
;
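The same comparison can also be written with the PARTITION BY / ORDER BY window syntax, which Hive accepts as well. The query below is a sketch against the uscore table from this article; the aliases rn, rk and dr are arbitrary.

```sql
-- One window per class, ordered from the highest score to the lowest.
SELECT
  u.uid,
  u.classid,
  u.score,
  row_number() OVER (PARTITION BY u.classid ORDER BY u.score DESC) AS rn,
  rank()       OVER (PARTITION BY u.classid ORDER BY u.score DESC) AS rk,
  dense_rank() OVER (PARTITION BY u.classid ORDER BY u.score DESC) AS dr
FROM uscore u;
```

The three columns only differ when scores tie. As a hypothetical case, if one class had scores 90, 90 and 85, row_number would give 1, 2, 3, rank would give 1, 1, 3, and dense_rank would give 1, 1, 2.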
10, aggregate functions:
min() max() count() count(distinct) sum() avg()
count(1): adds 1 for every row, whether or not the row holds any values.
count(*): adds 1 for every row, including rows whose columns are all NULL, so it matches count(1).
count(col): adds 1 for each row where col is not NULL.
count(distinct col): adds 1 for each distinct non-NULL value of col.
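Aggregate functions are usually combined with GROUP BY. The query below is a minimal sketch against the uscore table from the ranking section; the column aliases are made up for readability.

```sql
-- One output row per class, summarising that class's scores.
SELECT
  u.classid,
  count(*)     AS students,
  max(u.score) AS best,
  min(u.score) AS worst,
  avg(u.score) AS average
FROM uscore u
GROUP BY u.classid;
```

Assuming the nine sample rows listed in the ranking section, class 1 would come out as 4 students with best 90, worst 60 and average 80.5, and class 2 as 5 students with best 93, worst 67 and average 80.0.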
11, NULL operations
Almost any arithmetic operation involving NULL returns NULL.
SELECT 1 + null;
SELECT 1 / 0;
SELECT 2 % null;
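Since NULL spreads through arithmetic, a default value is often substituted before calculating. The sketch below uses Hive's nvl and coalesce functions with literal values chosen for illustration.

```sql
-- nvl returns the second argument when the first is NULL.
SELECT nvl(NULL, 0) + 1;              -- 1

-- coalesce returns the first non-NULL argument.
SELECT coalesce(NULL, NULL, 5) * 2;   -- 10
```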
12, equality comparison
SELECT null = null; -- NULL
SELECT null <=> null; -- true
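The difference between the two operators also shows up when only one side is NULL. The following literal-only comparisons are a short sketch of that behaviour.

```sql
-- The ordinary = operator yields NULL whenever either side is NULL.
SELECT 1 = NULL;     -- NULL

-- The NULL-safe operator always yields true or false.
SELECT 1 <=> NULL;   -- false
SELECT 1 <=> 1;      -- true
```

This makes the NULL-safe operator a reasonable choice in join or filter conditions where NULL values on both sides should count as a match.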
