Building a Win7 + Eclipse + Hadoop 2.6.4 development environment

Set the system environment variables JAVA_HOME, ANT_HOME, and HADOOP_HOME, and add the bin subdirectory of each one to the PATH variable.
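To confirm that the variables above are visible to a Java process, a small check like the following can help. It is only a sketch: the class name EnvCheck and the loose path comparison (slashes unified, case ignored) are assumptions made for illustration, not part of Hadoop or Eclipse.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class EnvCheck {
    // The three variables named in the text above.
    static final List<String> REQUIRED = Arrays.asList("JAVA_HOME", "ANT_HOME", "HADOOP_HOME");

    /** Returns one message per problem found; an empty list means the setup looks fine. */
    static List<String> problems(Map<String, String> env, String pathSeparator) {
        List<String> problems = new ArrayList<>();
        String path = env.getOrDefault("PATH", env.getOrDefault("Path", ""));
        List<String> pathEntries = new ArrayList<>();
        for (String entry : path.split(Pattern.quote(pathSeparator))) {
            pathEntries.add(normalize(entry));
        }
        for (String name : REQUIRED) {
            String home = env.get(name);
            if (home == null || home.trim().isEmpty()) {
                problems.add(name + " is not set");
                continue;
            }
            // Each home directory should contribute its bin subdirectory to PATH.
            if (!pathEntries.contains(normalize(home) + "/bin")) {
                problems.add(name + "/bin is not on PATH");
            }
        }
        return problems;
    }

    // Loose comparison (an assumption): unify slashes, drop trailing slashes, ignore case.
    static String normalize(String p) {
        String s = p.trim().replace('\\', '/');
        while (s.endsWith("/")) {
            s = s.substring(0, s.length() - 1);
        }
        return s.toLowerCase();
    }

    public static void main(String[] args) {
        List<String> found = problems(System.getenv(), File.pathSeparator);
        if (found.isEmpty()) {
            System.out.println("Environment looks fine");
        } else {
            found.forEach(System.out::println);
        }
    }
}
```

Run it from a newly opened command prompt, since changes to system environment variables are not picked up by windows that were already open.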

Copy hadoop.dll and winutils.exe from the hadoop2.6(x64)V0.2 package into the HADOOP_HOME/bin directory.
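A quick way to verify the copy step is to look for the two files under HADOOP_HOME/bin. The sketch below does only that; the class name NativeFilesCheck is hypothetical, and the check does not tell you whether the files match your Hadoop version.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class NativeFilesCheck {
    // The two Windows helper files named in the text.
    static final String[] REQUIRED_FILES = {"hadoop.dll", "winutils.exe"};

    /** Returns the names of the required files that are absent from hadoopHome/bin. */
    static List<String> missingFrom(File hadoopHome) {
        File bin = new File(hadoopHome, "bin");
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED_FILES) {
            if (!new File(bin, name).isFile()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String home = System.getenv("HADOOP_HOME");
        if (home == null) {
            System.out.println("HADOOP_HOME is not set");
            return;
        }
        System.out.println("Missing from " + home + File.separator + "bin: "
                + missingFrom(new File(home)));
    }
}
```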

3. Configure Eclipse

Copy hadoop-eclipse-plugin-2.6.0.jar to the plugins directory of Eclipse.

Start Eclipse and set up the workspace. If the plugin is installed successfully, you can see the following after startup:

4. Configure Hadoop

Open "Window" - "Preferences" - "Hadoop Map/Reduce" and set it to the HADOOP_HOME directory.

Open "Window" - "Show View" - "MapReduce Tools" - "Map/Reduce Locations", create a Location, and configure it as follows.

Item 1 is the name of the configuration; any name will do.

Item 2 is the value of mapreduce.jobhistory.address in the mapred-site.xml file.

Item 3 is the value of fs.default.name in the core-site.xml file.
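The values for items 2 and 3 come from the cluster's site files, which store settings as property elements with a name and a value. As a rough illustration, the following sketch reads one property from such a file with the JDK's built-in XML parser. The class name SiteXmlReader and the command-line usage are assumptions for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class SiteXmlReader {
    /** Returns the value of the named property, or null if the file does not define it. */
    static String property(InputStream xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                if (name.equals(text(p, "name"))) {
                    return text(p, "value");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse site XML", e);
        }
    }

    // Text of the first child element with the given tag, trimmed; null if absent.
    static String text(Element parent, String tag) {
        NodeList n = parent.getElementsByTagName(tag);
        return n.getLength() == 0 ? null : n.item(0).getTextContent().trim();
    }

    /** Convenience overload for XML held in a string. */
    static String propertyFromString(String xml, String name) {
        return property(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), name);
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to a local copy of core-site.xml
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(property(in, "fs.default.name"));
        }
    }
}
```

Passing a copy of mapred-site.xml and the name mapreduce.jobhistory.address to property() would retrieve the value for item 2 in the same way.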

After configuring the above information, you can see the following content in the Project Explorer, which means the configuration is successful.

The figure above shows that the configured HDFS information has been read. There are 3 folders in total (input, output, and output1), and the input directory contains 3 files.

Note: The content above was created in my own environment, so what you see may differ.

The content can be created by running the following on hadoop.master:

hadoop fs -mkdir input    # create the folder

hadoop fs -put $localFilePath input    # upload local files to the input directory on HDFS

III. Create a sample program

1. Create a WordCount class

Open Eclipse, create a Map/Reduce Project, and create an org.apache.hadoop.examples.WordCount class.

Copy the contents of the WordCount.java file, found in hadoop-2.6.4-src.tar.gz under hadoop-mapreduce-project\hadoop-mapreduce-examples\src\main\java\org\apache\hadoop\examples, into the newly created class.
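For readers who want to see what the copied program computes before running it on a cluster, here is a plain-Java sketch of the same idea: split each line on whitespace, emit a count of 1 per word, then sum the counts per word. It is not the Hadoop source and uses no Hadoop classes; the class name WordCountSketch and its method names are made up for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

class WordCountSketch {
    /** Map step: split one input line on whitespace and emit (word, 1) pairs. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            out.add(new AbstractMap.SimpleEntry<>(tokens.nextToken(), 1));
        }
        return out;
    }

    /** Reduce step: sum the counts per word. A TreeMap keeps the words in sorted order. */
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    /** Runs the map step over every line, then reduces all emitted pairs. */
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (String line : lines) {
            all.addAll(map(line));
        }
        return reduce(all);
    }

    public static void main(String[] args) {
        // prints {eclipse=1, hadoop=1, hello=2}
        System.out.println(countWords(Arrays.asList("hello hadoop", "hello eclipse")));
    }
}
```

In the real job the framework distributes the map calls, groups the pairs by key, and writes the reducer output to HDFS; the sketch only mirrors the per-record logic.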

2. Configure log4j

In the src directory, create a log4j.properties file:

log4j.rootLogger=debug,stdout,R
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%5p - %m%n
log4j.appender.R=org.apache.log4j.RollingFileAppender
log4j.appender.R.File=mapreduce_test.log
log4j.appender.R.MaxFileSize=1MB
log4j.appender.R.MaxBackupIndex=1
log4j.appender.R.layout=org.apache.log4j.PatternLayout
log4j.appender.R.layout.ConversionPattern=%p %t %c - %m%n
log4j.logger.com.codefutures=DEBUG
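A common slip in a file like this is naming an appender in log4j.rootLogger without defining it. The sketch below checks for that using only java.util.Properties. The class name Log4jConfigCheck is hypothetical, and the check covers just this one mistake, not log4j's full configuration rules.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class Log4jConfigCheck {
    /** Returns appender names listed in log4j.rootLogger that have no log4j.appender.NAME entry. */
    static List<String> undefinedAppenders(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalArgumentException("cannot read properties", e);
        }
        List<String> undefined = new ArrayList<>();
        String[] parts = props.getProperty("log4j.rootLogger", "").split(",");
        // parts[0] is the level (for example "debug"); the remaining parts are appender names.
        for (int i = 1; i < parts.length; i++) {
            String name = parts[i].trim();
            if (!name.isEmpty() && props.getProperty("log4j.appender." + name) == null) {
                undefined.add(name);
            }
        }
        return undefined;
    }

    public static void main(String[] args) {
        String text = "log4j.rootLogger=debug,stdout,R\n"
                + "log4j.appender.stdout=org.apache.log4j.ConsoleAppender\n";
        // R is referenced but not defined in this shortened example.
        System.out.println("Undefined appenders: " + undefinedAppenders(text));
    }
}
```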

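3. Configure the run arguments

Select "Run" - "Run Configurations" and, under "Arguments", add "hdfs://hadoop.master:9000/user/hadoop/input hdfs://hadoop.master:9000/user/hadoop/output1".

The format is "input path output path", separated by a space. The output path must not already exist, otherwise the job fails with an error.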
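To make the structure of the two arguments concrete, this sketch splits each one into scheme, host, port, and path with java.net.URI. The class name RunArgs is an assumption for illustration. The sketch only parses the strings and does not contact HDFS.

```java
import java.net.URI;

class RunArgs {
    /** Validates the two program arguments and returns {input, output} as URIs. */
    static URI[] parse(String[] args) {
        if (args.length != 2) {
            throw new IllegalArgumentException("expected two arguments: input path and output path");
        }
        return new URI[] {URI.create(args[0]), URI.create(args[1])};
    }

    public static void main(String[] args) {
        // Example values taken from the run configuration described above.
        String[] example = {
            "hdfs://hadoop.master:9000/user/hadoop/input",
            "hdfs://hadoop.master:9000/user/hadoop/output1"
        };
        for (URI u : parse(args.length == 2 ? args : example)) {
            System.out.println(u.getScheme() + " | " + u.getHost() + " | "
                    + u.getPort() + " | " + u.getPath());
        }
    }
}
```

The host and port should match the file system address configured for the Location in Eclipse, so comparing the two is a useful first step when the job cannot find its input.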

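As shown in the figure below:

Note: If "WordCount" is not listed under "Java Application", right-click and choose New to create one.

4. Run and check the result

Once configured, run the program. If the console prints the following, the run succeeded: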

INFO - Job job_local1914346901_0001 completed successfully

INFO - Counters: 38
    File System Counters
        FILE: Number of bytes read=4109
        FILE: Number of bytes written=1029438
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=134
        HDFS: Number of bytes written=40
        HDFS: Number of read operations=37
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=6
    Map-Reduce Framework
        Map input records=3
        Map output records=7
        Map output bytes=70
        Map output materialized bytes=102
        Input split bytes=354
        Combine input records=7
        Combine output records=7
        Reduce input groups=5
        Reduce shuffle bytes=102
        Reduce input records=7
        Reduce output records=5
        Spilled Records=14
        Shuffled Maps =3
        Failed Shuffles=0
        Merged Map outputs=3
        GC time elapsed (ms)=21
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=1556611072
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=42
    File Output Format Counters
        Bytes Written=40
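The counters in this log can be cross-checked: Map output records=7 equals Combine input records=7, and Reduce output records=5 equals Reduce input groups=5, one output line per distinct word. If you save the console output to a file, a small parser such as the sketch below makes those comparisons scriptable. The class name CounterParser is hypothetical, and the parser assumes the plain "name=value" layout shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CounterParser {
    /** Parses "name=value" lines into a map; lines without a numeric value are skipped. */
    static Map<String, Long> parse(String log) {
        Map<String, Long> counters = new LinkedHashMap<>();
        for (String raw : log.split("\\r?\\n")) {
            String line = raw.trim();
            int eq = line.lastIndexOf('=');
            if (eq <= 0) {
                continue; // group headers such as "Map-Reduce Framework"
            }
            try {
                long value = Long.parseLong(line.substring(eq + 1).trim());
                counters.put(line.substring(0, eq).trim(), value);
            } catch (NumberFormatException ignored) {
                // not a counter line, for example "Job ... completed successfully"
            }
        }
        return counters;
    }

    public static void main(String[] args) {
        String sample = "    Map-Reduce Framework\n"
                + "        Map output records=7\n"
                + "        Combine input records=7\n";
        Map<String, Long> c = parse(sample);
        System.out.println(c.get("Map output records").equals(c.get("Combine input records")));
    }
}
```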

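Under "DFS Locations", refresh the "hadoop" location you just created and check whether the output directory of this job contains output files.

IV. FAQ

1. Problem 1: NativeCrc32.nativeComputeChunkedSumsByteArray error

[Problem description] When the sample program starts, it throws a nativeComputeChunkedSumsByteArray exception. The console log shows: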

Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(II[BI[BIILjava/lang/String;JZ)V

at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(Native Method)

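[Cause analysis] The hadoop.dll file is the wrong version. Hadoop versions before 2.4 and those after it require different hadoop.dll files, so choose the file that matches your Hadoop version (and your operating system version), and replace it in both Hadoop/bin and C:\windows\system32.

[Solution] Download the matching file and replace the existing one. http://download.csdn.net/detail/myamor/8393459 (2.6.X_64bit)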
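When several copies of hadoop.dll exist, it helps to know which directories on the JVM's native library search path contain one, and in what order. The sketch below lists them and prints the JVM architecture. The class name DllLocator is an assumption for illustration, and finding a file says nothing about whether its version is the right one.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class DllLocator {
    /** Splits a search path into its non-empty directory entries, preserving order. */
    static List<String> entries(String searchPath, String separator) {
        List<String> dirs = new ArrayList<>();
        for (String e : searchPath.split(Pattern.quote(separator))) {
            if (!e.trim().isEmpty()) {
                dirs.add(e.trim());
            }
        }
        return dirs;
    }

    /** Returns every directory on the search path that contains the named file, in search order. */
    static List<String> locate(String searchPath, String separator, String fileName) {
        List<String> hits = new ArrayList<>();
        for (String dir : entries(searchPath, separator)) {
            if (new File(dir, fileName).isFile()) {
                hits.add(dir);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // os.arch reports the architecture of the running JVM.
        System.out.println("JVM architecture: " + System.getProperty("os.arch"));
        System.out.println("hadoop.dll found in: "
                + locate(System.getProperty("java.library.path"), File.pathSeparator, "hadoop.dll"));
    }
}
```

If the list shows an old copy ahead of the one you just replaced, the stale file is a likely cause of the UnsatisfiedLinkError.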
