Introduction to Pig and Installation

 

Pig vs. Hive

Apache Pig | Hive
Apache Pig uses a language called Pig Latin. It was originally created at Yahoo. | Hive uses a language called HiveQL. It was originally created at Facebook.
Pig Latin is a data flow language. | HiveQL is a query processing language.
Pig Latin is a procedural language and fits the pipeline paradigm. | HiveQL is a declarative language.
Apache Pig can handle structured, unstructured, and semi-structured data. | Hive is mostly for structured data.

 

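Pig Execution Modes

-----------------------------------

1. local

Pig reads and writes files on the local file system. Intended for testing.

2. mapreduce

The data is stored on HDFS and jobs run on a Hadoop cluster. This is the default mode.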
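
The execution mode is chosen when Pig starts, using the -x flag (pig -x local or pig -x mapreduce). As a minimal sketch, the helper below only builds the command line for a given mode so it can be inspected before it is run. The name pig_cmd and the script name job.pig are made up for this example and are not part of Pig.

```shell
#!/bin/sh
# pig_cmd MODE [SCRIPT]: print the Pig command line for an execution mode.
# Hypothetical helper for illustration; it does not invoke Pig itself.
pig_cmd() {
    mode=$1
    script=${2:-}
    case "$mode" in
        local|mapreduce) ;;   # the two modes described above
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    if [ -n "$script" ]; then
        echo "pig -x $mode $script"
    else
        echo "pig -x $mode"
    fi
}

pig_cmd local              # prints: pig -x local
pig_cmd mapreduce job.pig  # prints: pig -x mapreduce job.pig
```

Without a script argument the resulting command opens the interactive Grunt shell in that mode; with one, it runs the script as a batch job.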

 

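Pig Run Modes

1. Interactive mode (Grunt shell)

Type a statement, Pig executes it, and the output is shown.

2. Batch mode

Write a Pig script in a file with the .pig extension and run the file.

3. Embedded mode

Write UDFs (user-defined functions) and use them in scripts.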
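
To make batch mode concrete, here is a small sketch that writes a word-count script to a .pig file and runs it in local mode. The file names wordcount.pig and input.txt and the sample input are assumptions made for this example. The Pig call is guarded so the snippet also works on a machine where Pig is not installed yet.

```shell
#!/bin/sh
# Write a small word-count script. The file names wordcount.pig and
# input.txt are placeholders chosen for this example.
cat > wordcount.pig <<'EOF'
-- Load each line of the input file as a single chararray field.
lines   = LOAD 'input.txt' AS (line:chararray);
-- Split every line into words, one word per record.
words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
-- Group identical words and count each group.
grouped = GROUP words BY word;
counts  = FOREACH grouped GENERATE group AS word, COUNT(words) AS n;
DUMP counts;
EOF

# Tiny sample input so the script has something to read.
printf 'hello pig\nhello hive\n' > input.txt

# Run in local mode only if Pig is installed; otherwise just report it.
if command -v pig >/dev/null 2>&1; then
    pig -x local wordcount.pig
else
    echo "pig not found on PATH; wordcount.pig was written but not run"
fi
```

The same statements can be typed one at a time at the grunt> prompt in interactive mode. Batch mode simply reads them from the file.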

 

 

 

 

Installing Pig

1. Download Pig (the paths below assume the commands are run in /usr/local)

wget https://mirrors.tuna.tsinghua.edu.cn/apache/pig/latest/pig-0.16.0.tar.gz

tar -zxvf pig-0.16.0.tar.gz

ln -s pig-0.16.0 pig

 

2. Configure ~/.bashrc

vi ~/.bashrc

export PIG_HOME=/usr/local/pig

export PATH=$PATH:$PIG_HOME/bin

source ~/.bashrc

 

3. Verify

pig -version
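
If pig -version fails, the usual causes are a missing PIG_HOME or a symlink that does not point at the extracted directory. The sketch below checks both. The function name check_pig_home is made up for this example, and the only assumption about the install layout is that the launcher lives at $PIG_HOME/bin/pig.

```shell
#!/bin/sh
# check_pig_home DIR: succeed if DIR looks like a Pig install, i.e. it
# contains an executable bin/pig. Hypothetical helper for illustration.
check_pig_home() {
    dir=$1
    if [ -z "$dir" ]; then
        echo "PIG_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$dir/bin/pig" ]; then
        echo "no executable found at $dir/bin/pig" >&2
        return 1
    fi
    echo "ok: $dir/bin/pig"
}

# Report on the current environment without aborting the shell.
check_pig_home "${PIG_HOME:-}" || echo "check ~/.bashrc and the symlink"
```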

 

 

References:

http://pig.apache.org/docs/r0.16.0/

https://www.tutorialspoint.com/apache_pig/index.htm


Reposted from oracle-api.iteye.com/blog/2375755