Basic Pig operations

Raw data (score.txt; columns: student, course, teacher, score)

hdj,network,tigle,100
md,database,tigle,99
wqy,pde,yao,94
zx,network,tigle,98
mmd,pde,yao,98
zx,pde,yao,100

1: Count how many distinct teachers have taught each student

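-- load the comma-separated records; only score is typed explicitly
A = load 'score.txt'
using PigStorage(',')
as (student, course, teacher, score:int);
describe A;
-- keep only (student, teacher) pairs and drop duplicates
B = foreach A generate student, teacher;
C = distinct B;
-- count the remaining distinct teachers per student
D = foreach (group C by student) generate group as student, COUNT(C);
dump D;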
###Output###
(md,1)
(zx,2)
(hdj,1)
(mmd,1)
(wqy,1)
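
The same query written with a nested foreach block, which deduplicates the teachers inside each group:

A = load 'score.txt'
using PigStorage(',')
as (student, course, teacher, score:int);
describe A;
B = foreach A generate student, teacher;
E = group B by student;
F = foreach E
{
    -- project the teacher column of the group's bag, then deduplicate it
    T = B.teacher;
    uniq = distinct T;
    generate group as student, COUNT(uniq) as cnt;
}
dump F;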
###Output###
(md,1)
(zx,2)
(hdj,1)
(mmd,1)
(wqy,1)
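
To sanity-check the expected output without a Hadoop cluster, the same logic can be sketched in plain Python. This is only an illustrative cross-check, not Pig: the `ROWS` constant and the `teachers_per_student` helper are names made up for this sketch, and the data is copied from the raw data above.

```python
# Plain-Python cross-check of query 1 (hypothetical helper, not part of Pig).
ROWS = """hdj,network,tigle,100
md,database,tigle,99
wqy,pde,yao,94
zx,network,tigle,98
mmd,pde,yao,98
zx,pde,yao,100"""


def teachers_per_student(text):
    """Return {student: number of distinct teachers}."""
    teachers = {}
    for line in text.splitlines():
        student, course, teacher, score = line.split(",")
        # A set plays the role of Pig's DISTINCT on (student, teacher).
        teachers.setdefault(student, set()).add(teacher)
    return {student: len(names) for student, names in teachers.items()}


counts = teachers_per_student(ROWS)
for student in sorted(counts):
    print((student, counts[student]))
```

Only zx appears with two different teachers (tigle and yao), which matches the Pig output. The row order differs because Pig does not guarantee an order for dump.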

2: Find the top two students in each course

A = load 'score.txt'
using PigStorage(',')
as (student, course, teacher, score:int);
B = foreach A generate student, course, score;
C = group B by course;
describe C;
-- within each course, sort by score (highest first) and keep two rows
D = foreach C
{
    sorted = order B by score DESC;
    top = LIMIT sorted 2;
    generate group as course, top as top;
}
dump D;
-- flatten the bag so that each top student becomes its own row
E = foreach D generate course, flatten(top);
dump E;
####Output####
(pde,zx,pde,100)
(pde,mmd,pde,98)
(network,hdj,network,100)
(network,zx,network,98)
(database,md,database,99)
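
The top-two result can be cross-checked the same way in plain Python. Again this is a sketch under assumptions, not Pig: `ROWS` and `top_n_per_course` are hypothetical names, and the data is copied from the raw data above.

```python
# Plain-Python cross-check of query 2 (hypothetical helper, not part of Pig).
ROWS = """hdj,network,tigle,100
md,database,tigle,99
wqy,pde,yao,94
zx,network,tigle,98
mmd,pde,yao,98
zx,pde,yao,100"""


def top_n_per_course(text, n=2):
    """Return {course: the n highest-scoring (student, course, score) rows}."""
    by_course = {}
    for line in text.splitlines():
        student, course, teacher, score = line.split(",")
        by_course.setdefault(course, []).append((student, course, int(score)))
    # Mirrors "order B by score DESC" followed by "LIMIT sorted 2".
    return {
        course: sorted(rows, key=lambda row: row[2], reverse=True)[:n]
        for course, rows in by_course.items()
    }


top2 = top_n_per_course(ROWS)
for course, rows in top2.items():
    for row in rows:
        print((course,) + row)
```

The course name appears twice in each printed row, once as the group key and once as a field of the flattened tuple, just as in the Pig output. The database course has only one student, so it yields a single row.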

Error encountered while running:

[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias passwd. Backend error : javadoop/192.168.0.2 to master.hadoop:10020 failed on connection exception: java.net.ConnectException: 拒绝连接; For more deta  
Details at logfile: /usr/local/pig/pig_1433189043690.log  

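Cause: the connection was refused (拒绝连接) because the service on port 10020, the MapReduce JobHistory server, was not running. Start it with: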

mr-jobhistory-daemon.sh start historyserver  

Reposted from linhexiao.iteye.com/blog/2422728