CDH Hadoop includes a corresponding Pig component

1. The CDH distribution of Hadoop includes a corresponding Pig component, but its version is old, so I gave up on it and directly downloaded
the latest version, Apache Pig 0.15 (it supports Tez and is easier to integrate than Hive).
Download address: http://archive.apache.org/dist/pig/pig-0.15.0/pig-0.15.0.tar.gz
Download the binary package directly.
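The download step can be sketched as a pair of shell functions. This is only a sketch: the function names `pig_url` and `fetch_pig` are made up for illustration, `curl` is assumed to be installed, and the archive is assumed to unpack into a `pig-<version>` directory.

```shell
# Build the archive URL for a Pig release, following the pattern of the link above.
pig_url() {
  local version="$1"
  printf 'http://archive.apache.org/dist/pig/pig-%s/pig-%s.tar.gz\n' "$version" "$version"
}

# Download the binary tarball into a directory and unpack it there.
# Hypothetical helper: nothing is downloaded until it is called explicitly.
fetch_pig() {
  local version="$1" dest="$2"
  mkdir -p "$dest" || return 1
  curl -fL -o "$dest/pig-$version.tar.gz" "$(pig_url "$version")" || return 1
  tar -xzf "$dest/pig-$version.tar.gz" -C "$dest"
}
```

For example, `fetch_pig 0.15.0 /ROOT/server` would fetch and unpack the release; the unpacked directory can then be renamed or symlinked to the path used for PIG_HOME.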

2. Configure the environment variables for Pig as follows:
# Pig
export PIG_HOME=/ROOT/server/pig
export PIG_CLASSPATH=$HADOOP_HOME/etc/hadoop
export PATH=$PIG_HOME/bin:$PATH
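Before starting Pig, it may help to check that these variables point at real locations. The helper below is a hypothetical sketch (the name `check_pig_env` is not part of Pig); it only tests that the `bin/pig` launcher is executable and that the Hadoop configuration directory exists.

```shell
# Report whether a Pig install directory and a Hadoop conf directory look usable.
# Prints "ok" on success; prints a message to stderr and returns 1 otherwise.
check_pig_env() {
  local pig_home="$1" hadoop_conf="$2"
  if [ ! -x "$pig_home/bin/pig" ]; then
    echo "missing executable: $pig_home/bin/pig" >&2
    return 1
  fi
  if [ ! -d "$hadoop_conf" ]; then
    echo "missing Hadoop conf dir: $hadoop_conf" >&2
    return 1
  fi
  echo "ok"
}
```

With the variables above exported, it could be called as `check_pig_env "$PIG_HOME" "$PIG_CLASSPATH"`.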
3. Running the pig command directly to start the program reports the following exception:
[main] ERROR org.apache.pig.Main - ERROR 2998: Unhandled internal error. Found interface jline.Terminal, but class was expected

The reason is that the jline package bundled with Pig is inconsistent with the jline package under Hadoop's yarn/lib.
For an explanation, refer to: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started

Solution:

Delete the higher-version jline package under Hadoop's yarn/lib, copy the jline-1.0.jar package from pig/lib into yarn/lib, and then
re-execute the pig command. It now starts normally.
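The jar swap described above can be sketched as a small function. Two things here are assumptions rather than part of the original steps: the function name `swap_jline` is hypothetical, and instead of deleting, it moves every jline jar other than jline-1.0.jar into a backup directory so the change can be undone. The exact location of yarn/lib depends on the Hadoop layout, so it is passed in as an argument.

```shell
# Replace the jline jar(s) in Hadoop's yarn/lib with Pig's jline-1.0.jar.
# Arguments: <yarn_lib_dir> <pig_lib_dir> <backup_dir>
swap_jline() {
  local yarn_lib="$1" pig_lib="$2" backup="$3" jar
  if [ ! -f "$pig_lib/jline-1.0.jar" ]; then
    echo "jline-1.0.jar not found in $pig_lib" >&2
    return 1
  fi
  mkdir -p "$backup" || return 1
  for jar in "$yarn_lib"/jline-*.jar; do
    [ -e "$jar" ] || continue                               # glob matched nothing
    [ "$(basename "$jar")" = "jline-1.0.jar" ] && continue  # already the wanted jar
    mv "$jar" "$backup"/ || return 1                        # keep a copy instead of deleting
  done
  cp "$pig_lib/jline-1.0.jar" "$yarn_lib"/
}
```

Moving the old jar aside rather than deleting it makes it easy to restore if another component, such as Hive, turns out to need the newer jline.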





Next, I ran a MapReduce job written as a Pig script. An exception was reported even though the MR job itself ran successfully.
The reason is that Hadoop's JobHistory server process had not been started.
Solution:
Run sbin/mr-jobhistory-daemon.sh start historyserver to start the JobHistory daemon,
and then run the Pig job again. Everything is normal!
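A minimal sketch of checking for the history server before starting it is shown below. The function names are hypothetical, and the sketch assumes that the JDK's `jps` tool is on the PATH and that the process appears in its output as `JobHistoryServer`.

```shell
# Succeeds if jps-style output on stdin lists a JobHistoryServer process.
historyserver_listed() {
  grep -q 'JobHistoryServer'
}

# Start the JobHistory server only if it is not already running.
# Argument: the Hadoop install directory (for example "$HADOOP_HOME").
ensure_historyserver() {
  local hadoop_home="$1"
  if jps | historyserver_listed; then
    echo "JobHistoryServer already running"
  else
    "$hadoop_home/sbin/mr-jobhistory-daemon.sh" start historyserver
  fi
}
```

Calling `ensure_historyserver "$HADOOP_HOME"` before submitting the Pig script avoids starting a second daemon by accident.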
