Working with HDFS:
Note: The NameNode must be formatted before HDFS and YARN are started; otherwise the NameNode will not start.
The commands are:
hadoop namenode -format   # Format first; if prompted to confirm partway through, answer y
start-yarn.sh             # start-all.sh is deprecated, but it can still be used
start-dfs.sh
Run jps to check that the NameNode and DataNode are both running
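The jps check can also be scripted. The following is a minimal sketch, not part of the original setup: missing_daemons is a hypothetical helper name, and it assumes jps prints one "PID Name" pair per line.

```shell
# Hypothetical helper: print every expected daemon that is absent from
# jps-style output. Reads "PID Name" lines on stdin; the names to look
# for are passed as arguments.
# The body runs in a subshell ( ... ), so its variables stay local.
missing_daemons() (
    listing=$(cat)    # capture stdin once so it can be scanned per name
    for name in "$@"; do
        # Compare the second field exactly, so "SecondaryNameNode"
        # is not mistaken for "NameNode".
        if ! printf '%s\n' "$listing" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "$name"
        fi
    done
)

# Usage on a machine where Hadoop is running
# (prints nothing when both daemons are up):
#   jps | missing_daemons NameNode DataNode
```

Matching on the whole second field rather than using a plain grep avoids a false positive from the SecondaryNameNode process.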
hadoop fs -mkdir /input                 # Create the input directory on HDFS
echo "hello adu hello world" > file     # Create a local file
hadoop fs -put file /input              # Upload it to HDFS
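To confirm that the upload worked, the HDFS copy can be compared with the local file. This is a rough sketch: same_as_hdfs is a hypothetical helper name, and it relies only on the standard hadoop fs -cat command.

```shell
# Hypothetical helper: succeed only if a local file and its HDFS copy
# have identical contents.
# $1 = local path, $2 = HDFS path
same_as_hdfs() {
    [ "$(cat "$1")" = "$(hadoop fs -cat "$2")" ]
}

# Usage (needs the running cluster from the steps above):
#   hadoop fs -ls /input
#   same_as_hdfs file /input/file && echo "upload OK"
```

Comparing through command substitution ignores trailing newlines, which is good enough for a small text file like this one.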
You can view the status of HDFS at localhost:50070 (to browse the file system, open the Utilities menu at the upper right of the page)
The content of pom.xml in IntelliJ IDEA:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>hadoop</groupId>
    <artifactId>com.adu</artifactId>
    <version>1.0-SNAPSHOT</version>

    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.7.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.7.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.7.2</version>
        </dependency>
    </dependencies>

    <repositories>
        <repository>
            <id>apache</id>
            <url>http://maven.apache.org</url>
        </repository>
    </repositories>

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-dependency-plugin</artifactId>
                <configuration>
                    <excludeTransitive>false</excludeTransitive>
                    <stripVersion>true</stripVersion>
                    <outputDirectory>./lib</outputDirectory>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
The map and reduce functions are the same as in the previous chapters
Click Run -> Edit Configurations
Change program arguments to: hdfs://localhost:9000/input/file hdfs://localhost:9000/output
Everything else is the same as in the previous chapter. The two paths above are separated by a space. The output directory does not need to be created beforehand, since the job creates it; file is the file uploaded earlier.
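One practical consequence of the job creating the output directory itself is that a second run fails if /output already exists on HDFS. The sketch below removes a stale directory before rerunning; clean_output is a hypothetical helper name, and the part-r-00000 file name in the comment assumes the default naming for a job with a reducer.

```shell
# Hypothetical helper: delete an HDFS output directory if it exists,
# so the job can be run again.
# $1 = HDFS path of the output directory
clean_output() {
    # "hadoop fs -test -d" exits with status 0 when the directory exists
    if hadoop fs -test -d "$1"; then
        hadoop fs -rm -r "$1"
    fi
}

# Usage (needs the running cluster):
#   clean_output /output
# After the job finishes, the result can be printed with, for example:
#   hadoop fs -cat /output/part-r-00000
```

Testing for the directory first keeps the helper quiet on the first run, when there is nothing to delete.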