[Hadoop] Accessing a Virtual Machine's HDFS from the Host (Java)

This article shows how to access HDFS running in a virtual machine (Linux + Hadoop) from the host machine (Win10 + IDEA + Java).

0 Preparation

Turn off the firewall: https://blog.csdn.net/Tiezhu_Wang/article/details/113861262
Set a fixed IP: https://blog.csdn.net/Tiezhu_Wang/article/details/113822362
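
Before continuing, it helps to confirm that the host can reach the virtual machine (192.168.2.2 is the VM address used throughout this article; substitute your own):

ping 192.168.2.2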

1 Modify the configuration file

1.1 core-site.xml

gedit /usr/local/hadoop-3.2.1/etc/hadoop/core-site.xml

Set the value of fs.defaultFS to the virtual machine's IP and port:

<property>
	<name>fs.defaultFS</name>
	<value>hdfs://192.168.2.2:9000</value>
</property>
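
If HDFS was already running on the virtual machine, restart it so that the NameNode binds to the new address:

stop-dfs.sh
start-dfs.sh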

1.2 hdfs-site.xml

gedit /usr/local/hadoop-3.2.1/etc/hadoop/hdfs-site.xml

Add or modify the following property:

<property>
	<name>dfs.permissions.enabled</name>
	<value>false</value>
</property>
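
This disables HDFS permission checking so that the Windows user can write to HDFS; it is only appropriate for a local test setup. You can confirm the effective value with hdfs getconf, which prints the configuration the daemons actually see:

hdfs getconf -confKey dfs.permissions.enabled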

2 Dependencies

Create a new Maven project on the host and add the following dependencies to pom.xml:

<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>3.2.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>3.2.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-hdfs -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>3.2.1</version>
</dependency>
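
Note: hadoop-client already pulls in hadoop-common transitively, so keep the versions of all three artifacts identical to avoid classpath conflicts.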

3 Prepare a test file

On the virtual machine, create a new file ~/myfile/words.txt and write a few words in it:

hadoop
spark
java
spring
hive
hbase
python
C

Create a new directory myfile in HDFS and upload the file to it (relative HDFS paths resolve under /user/<current user>, here /user/hadoop):

start-dfs.sh
hdfs dfs -mkdir -p myfile
hdfs dfs -put ~/myfile/words.txt myfile

View the uploaded file:

hdfs dfs -cat myfile/words.txt


4 Java code

  • When configuring the connection, specify the virtual machine's IP as the address (here 192.168.2.2; it can be checked on the VM with the ip addr command)
  • Create a new directory named hello
  • Read the previously uploaded words.txt and print its contents to the console
  • Upload the project's pom.xml to the hello directory

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.*;

public class MainApp {

	public static void main(String[] args) throws Exception {

		// Configure the connection: point fs.defaultFS at the VM's NameNode
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://192.168.2.2:9000");

		FileSystem fs = FileSystem.get(conf);

		// 1. Create a new directory
		Path path = new Path("/user/hadoop/hello");
		if (!fs.exists(path)) {
			fs.mkdirs(path);
		} else {
			System.out.println("Path already exists");
		}

		// 2. Read the previously uploaded file and print it to the console
		InputStream in = fs.open(new Path("/user/hadoop/myfile/words.txt"));
		BufferedReader br = new BufferedReader(new InputStreamReader(in));
		String str;
		while ((str = br.readLine()) != null) {
			System.out.println(str);
		}
		br.close();

		// 3. Upload the project's pom.xml to the hello directory
		fs.copyFromLocalFile(new Path("D:\\Program Files (x86)\\WorkspaceIDEA\\HadoopStu\\pom.xml"),
				new Path("/user/hadoop/hello/pom.xml"));

		fs.close();
		System.out.println("Finished.");
	}
}
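
Run MainApp from IDEA. On Windows, the client may log a warning that the native-hadoop library could not be loaded; it falls back to pure-Java implementations, and the warning is harmless for this example.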

5 Run the test

The console prints the contents of words.txt, then "Finished.", showing that the file in HDFS was read successfully. To confirm the upload, enter the virtual machine and list the hello directory:

hdfs dfs -ls hello

The listing shows the uploaded pom.xml.

6 Exceptions and solutions

Exception 1: ConnectException

Exception in thread "main" java.net.ConnectException: 
	Call From XXXX to 192.168.2.2:9000 failed on connection exception

Solution: as described in 1.1, change "localhost" in core-site.xml to the virtual machine's IP.
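
If the error persists, check from the host that the NameNode port is actually reachable (this assumes the Windows telnet client is enabled; any port checker works):

telnet 192.168.2.2 9000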
Exception 2: AccessControlException or SafeModeException

Exception in thread "main" org.apache.hadoop.security.AccessControlException: 
	Permission denied: user=Dell, access=WRITE, inode="/user/hadoop":hadoop:supergroup:drwxr-xr-x

or

Exception in thread "main" org.apache.hadoop.hdfs.server.namenode.SafeModeException: 
	Cannot create file/user/hadoop/hello/pom.xml. Name node is in safe mode.

Solution: modify hdfs-site.xml as described in 1.2. In the second case, a pom.xml may already exist in the hello directory from an earlier run; deleting that file in HDFS and rerunning is enough.
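
If the NameNode really is stuck in safe mode, it can be told to leave it manually on the virtual machine (a standard admin command):

hdfs dfsadmin -safemode leave

As an alternative to disabling permission checking in 1.2, the connection can be opened as the HDFS user. A minimal sketch, assuming the Linux user running Hadoop is named hadoop (add import java.net.URI to the example above):

FileSystem fs = FileSystem.get(new URI("hdfs://192.168.2.2:9000"), conf, "hadoop");

This overload of FileSystem.get performs all operations as the given user, so permission checking can stay enabled.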

Origin: blog.csdn.net/Tiezhu_Wang/article/details/113918261