Week08_day01 (Using the Hive window function row_number(): finding the top two salaries in each department)

Data preparation:

7369,SMITH,CLERK,7902,1980-12-17,800,null,20
7499,ALLEN,SALESMAN,7698,1981-02-20,1600,300,30
7521,WARD,SALESMAN,7698,1981-02-22,1250,500,30
7566,JONES,MANAGER,7839,1981-04-02,2975,null,20
7654,MARTIN,SALESMAN,7698,1981-09-28,1250,1400,30
7698,BLAKE,MANAGER,7839,1981-05-01,2850,null,30
7782,CLARK,MANAGER,7839,1981-06-09,2450,null,10
7788,SCOTT,ANALYST,7566,1987-04-19,3000,null,20
7839,KING,PRESIDENT,null,1981-11-17,5000,null,10
7844,TURNER,SALESMAN,7698,1981-09-08,1500,0,30
7876,ADAMS,CLERK,7788,1987-05-23,1100,null,20
7900,JAMES,CLERK,7698,1981-12-03,950,null,30
7902,FORD,ANALYST,7566,1981-12-03,3000,null,20
7934,MILLER,CLERK,7782,1982-01-23,1300,null,10

 

Create a table in Hive (note that what was shown here is the list of fields, not the CREATE TABLE statement itself).
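Since the CREATE TABLE statement itself is not given, here is a minimal sketch of what it might look like. Only the table name emp2 and the columns deptno and sal come from the queries in this post; the other column names and all the types are assumptions based on the layout of the sample rows, so adjust them to your own table.

```sql
-- Assumed column names (only deptno and sal appear in the queries of this post).
-- The comma delimiter matches the sample data file above.
create table if not exists emp2 (
  empno    int,
  ename    string,
  job      string,
  mgr      int,
  hiredate string,
  sal      int,
  comm     int,
  deptno   int
)
row format delimited
fields terminated by ',';
```

With this layout, the text null in the mgr and comm positions cannot be parsed as a number, so Hive should read those fields as NULL.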

 

 

Load the data with the local load command:

load data local inpath 'file absolute path' into table emp2;

 

View the loaded data:
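A simple way to check the load, as a sketch (the exact statements used in the original are not shown):

```sql
-- Show every row that was loaded
select * from emp2;

-- The sample data above has 14 lines, so this should return 14
select count(*) from emp2;
```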

 

 

Now the requirement: find the top two salaries in every department.

The first step: use the window function row_number() to number the rows within each group (one group per department), ordering by salary in descending order (DESC).

select deptno,sal,row_number() over(partition by deptno order by sal desc) from emp2;

 

This gives the following result:
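Worked by hand from the sample rows above, the output should look roughly like the rows in the comments below. This is a sketch: the alias rn is added here only for readability, and Hive does not guarantee the order between departments or between rows with equal salaries.

```sql
select deptno,
       sal,
       row_number() over (partition by deptno order by sal desc) as rn
from emp2;

-- deptno  sal   rn
-- 10      5000  1
-- 10      2450  2
-- 10      1300  3
-- 20      3000  1
-- 20      3000  2
-- 20      2975  3
-- 20      1100  4
-- 20      800   5
-- 30      2850  1
-- 30      1600  2
-- 30      1500  3
-- 30      1250  4
-- 30      1250  5
-- 30      950   6
```

The numbering restarts at 1 for each deptno, which is what makes the filter in the next step possible.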

 

Then wrap that result in a subquery and keep only the rows whose number is less than 3:

select w.deptno,w.sal from (select deptno,sal,row_number() over(partition by deptno order by sal desc) as rn from emp2) w where w.rn<3;
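For readability, the same query can be laid out over several lines. Worked by hand from the sample rows, it should return the six rows listed in the comments (again, the order between departments may vary).

```sql
select w.deptno, w.sal
from (
  select deptno,
         sal,
         row_number() over (partition by deptno order by sal desc) as rn
  from emp2
) w
where w.rn < 3;

-- deptno  sal
-- 10      5000
-- 10      2450
-- 20      3000
-- 20      3000
-- 30      2850
-- 30      1600
```

Department 20 has two employees on 3000, and row_number() still gives them the numbers 1 and 2, so 2975 is left out. If tied salaries should share a number, dense_rank() is a possible replacement for row_number() here; that is a variation, not what the query above does.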

Origin www.cnblogs.com/wyh-study/p/12088320.html