Using the command line to compile and run a small HDFS example program

Preface

This article uses a small HDFS example, checking whether a file exists on a Hadoop 2.7.7 distributed cluster, to show how to compile and package HDFS programs from the command line in Hadoop 2.x.

Add the Hadoop classpath information to the CLASSPATH variable

In Hadoop 2.x, the jars are no longer bundled into a single hadoop-core-*.jar but are split across many jars. For example, running the WordCount example on Hadoop 2.7.7 requires at least the following three jars (an illustrative compile command using them follows the list):

  • $HADOOP_HOME/share/hadoop/common/hadoop-common-2.7.7.jar
  • $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.7.jar
  • $HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar
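
To see why collecting the classpath matters, here is a sketch of what compiling against Hadoop looks like when the jars are passed by hand (the paths assume the default $HADOOP_HOME layout, and WordCount.java is just a stand-in source file):

javac -cp $HADOOP_HOME/share/hadoop/common/hadoop-common-2.7.7.jar:$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.7.jar:$HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar WordCount.java

Maintaining such a list by hand quickly becomes tedious, which is where the CLASSPATH variable below helps.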

In fact, the command hadoop classpath prints all the classpath entries needed to run a Hadoop program.

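For example (the exact entries depend on your installation; the output below is illustrative for HADOOP_HOME=/usr/local/hadoop and is truncated):

hadoop classpath
/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:...
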
We add this Hadoop classpath information to the CLASSPATH variable by appending the following lines to ~/.bashrc:

export HADOOP_HOME=/usr/local/hadoop
export CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath):$CLASSPATH

Do not forget to execute source ~/.bashrc for the change to take effect.
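
To confirm the variable was picked up (purely a sanity check), print it:

echo $CLASSPATH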

Compile, package and execute HDFS programs

Write the HDFS program. Here is a small example that checks whether a specified file exists:

vi FileExist.java

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FileExist {
    public static void main(String[] args) {
        try {
            // Path to check; a relative path like "test" is resolved against the user's HDFS home directory
            String fileName = "test";
            Configuration conf = new Configuration();
            // Point the client at the cluster's NameNode
            conf.set("fs.defaultFS", "hdfs://Master:9000");
            conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
            FileSystem fs = FileSystem.get(conf);
            if (fs.exists(new Path(fileName))) {
                System.out.println("File exists");
            } else {
                System.out.println("File does not exist");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
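
As an optional aside (my own sketch, not part of the original article), a small variant takes the HDFS path from the command line instead of hardcoding it, so the same jar can check any file; the class name FileExistArg is hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FileExistArg {
    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("Usage: FileExistArg <hdfs-path>");
            System.exit(1);
        }
        Configuration conf = new Configuration();
        // Same NameNode address as in FileExist above
        conf.set("fs.defaultFS", "hdfs://Master:9000");
        conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
        FileSystem fs = FileSystem.get(conf);
        // args[0] is the HDFS path to check, e.g. /user/hadoop/test
        System.out.println(fs.exists(new Path(args[0])) ? "File exists" : "File does not exist");
    }
}

It is compiled and packaged exactly the same way as FileExist below.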

Compile FileExist.java with the javac command:

javac FileExist.java 

After compilation, you can see that a .class file is generated


Then package the .class file into a jar to run in Hadoop

jar -cvf FileExist.jar ./FileExist*.class

After packaging, you can find that a FileExist.jar package has been generated
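
If you want to double-check what went into the jar (an optional step), the jar tool can also list its contents:

jar -tvf FileExist.jar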

Next we can run the jar package

hadoop jar FileExist.jar FileExist

FileExist.jar is the jar package we built, and FileExist is the class that contains the main method.
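
If you also compiled the argument-taking variant sketched above, the FileExist*.class wildcard used when packaging would have picked up its class file as well, and you could run it by appending an HDFS path (the path /user/hadoop/test is just an example):

hadoop jar FileExist.jar FileExistArg /user/hadoop/test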

Run result

(Screenshot: the console output shows whether the file exists)
That's it for compiling, packaging, and running a small HDFS program from the command line. You can compile, package, and run MapReduce programs in much the same way; for details, see the article on compiling, packaging, and running your own MapReduce program from the command line.


Origin blog.csdn.net/atuo200/article/details/105629703