LZO is a widely used compression format on the Hadoop platform, but it has to be installed separately. These notes record the procedure.
The environment used here is CentOS 7.3 with CDH 6.0.1.
Check which native compression libraries Hadoop currently supports:
hadoop checknative
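To read the result at a glance, the output can be filtered down to the libraries reported as available. This is only a sketch: the helper name native_available is made up for illustration, and it assumes each result line has the form "<name>: true|false <path>", which may differ between Hadoop versions.

```shell
# native_available: read `hadoop checknative` output on stdin and print the
# names of the libraries reported as available.
# Assumption: result lines look like "<name>: true|false <path>".
native_available() {
  awk '$2 == "true" { sub(/:$/, "", $1); print $1 }'
}

# Only call hadoop when it is actually on the PATH of this host.
if command -v hadoop >/dev/null 2>&1; then
  hadoop checknative 2>/dev/null | native_available
fi
```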
Install the LZO Parcel online
Download URLs (replace 6.x.y with the matching version):
CDH6:https://archive.cloudera.com/gplextras6/6.x.y/parcels/
CDH5:https://archive.cloudera.com/gplextras5/parcels/5.x.y/
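The two URL patterns above differ in where the version number goes, so a small helper can avoid typos when filling it in. This is a minimal sketch; the function name gplextras_url is hypothetical, and it only reproduces the two patterns listed here without checking that the resulting URL exists.

```shell
# gplextras_url VERSION: print the GPL Extras parcel URL for a CDH version,
# following the two patterns listed above (CDH6 and CDH5 only).
gplextras_url() {
  version="$1"
  major="${version%%.*}"   # text before the first dot, e.g. "6" for 6.0.1
  case "$major" in
    6) printf 'https://archive.cloudera.com/gplextras6/%s/parcels/\n' "$version" ;;
    5) printf 'https://archive.cloudera.com/gplextras5/parcels/%s/\n' "$version" ;;
    *) printf 'unsupported CDH version: %s\n' "$version" >&2; return 1 ;;
  esac
}

gplextras_url 6.0.1
```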
1. In the CDH Parcel configuration, under "Remote Parcel Repository URLs", click the "+" sign and add the address:
CDH6:https://archive.cloudera.com/gplextras6/6.0.1/parcels/
CDH5:http://archive.cloudera.com/gplextras/parcels/latest/
Offline alternatives:
Download the parcel and place it in the /opt/cloudera/parcel-repo directory,
or
set up an httpd server to host the parcel, point the Parcel URL at that server, and then install it the same way as from a remote repository.
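The first offline method might look like the sketch below. The function name stage_parcel is hypothetical. The checksum handling is an assumption: it supposes that the archive ships a "<parcel>.sha1" file next to the parcel and that Cloudera Manager looks for it under the name "<parcel>.sha"; verify this against your Cloudera Manager version.

```shell
# stage_parcel PARCEL [REPO_DIR]: copy a downloaded parcel into the local
# parcel repository (defaults to /opt/cloudera/parcel-repo).
stage_parcel() {
  parcel="$1"
  repo="${2:-/opt/cloudera/parcel-repo}"
  [ -f "$parcel" ] || { echo "no such parcel: $parcel" >&2; return 1; }
  [ -d "$repo" ] || { echo "no such directory: $repo" >&2; return 1; }
  cp "$parcel" "$repo/" || return 1
  # Assumption: a sibling "<parcel>.sha1" exists and is expected as "<parcel>.sha".
  if [ -f "$parcel.sha1" ]; then
    cp "$parcel.sha1" "$repo/$(basename "$parcel").sha" || return 1
  fi
}
```

It would be called as stage_parcel /path/to/downloaded.parcel on the Cloudera Manager host, usually with enough privileges to write to the repository directory.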
2. Return to the Parcel list. After a delay of a few seconds, GPLEXTRAS (CDH6) or HADOOP_LZO (CDH5) appears in the list.
Download, distribute, and activate it.
3. After installing LZO, open the HDFS configuration, find "compression codec", click the "+" sign,
and add:
com.hadoop.compression.lzo.LzoCodec
com.hadoop.compression.lzo.LzopCodec
4. In the YARN configuration, find "MR application Classpath" (mapreduce.application.classpath)
and add:
/opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/*
5. Restart the affected services and deploy the stale (expired) configuration.
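After the restart, a quick check on a host with the client configuration deployed can confirm that both settings took effect. This is a sketch under a few assumptions: the helper names has_codec and list_lzo_jars are made up for illustration, the codec list is read from the standard Hadoop property io.compression.codecs, and the LZO jar is assumed to have a name starting with "hadoop-lzo".

```shell
# has_codec CODEC_LIST CLASS: succeed if CLASS is an entry in the
# comma-separated CODEC_LIST (whitespace around entries is ignored).
has_codec() {
  list="$(printf '%s' "$1" | tr -d '[:space:]')"
  case ",$list," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# list_lzo_jars DIR: print jars in DIR whose name starts with "hadoop-lzo"
# (assumed naming; prints nothing if none are found).
list_lzo_jars() {
  for jar in "$1"/hadoop-lzo*.jar; do
    [ -f "$jar" ] && printf '%s\n' "$jar"
  done
  return 0
}

# Only query the cluster configuration when the hdfs client is installed.
if command -v hdfs >/dev/null 2>&1; then
  codecs="$(hdfs getconf -confKey io.compression.codecs)"
  for class in com.hadoop.compression.lzo.LzoCodec com.hadoop.compression.lzo.LzopCodec; do
    if has_codec "$codecs" "$class"; then
      echo "$class: configured"
    else
      echo "$class: missing"
    fi
  done
  list_lzo_jars /opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib
fi
```

If either codec is reported as missing, the client configuration on that host has probably not been redeployed yet.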