This article shows how to access HDFS running in a virtual machine (Linux + Hadoop) from the host machine (Win10 + IDEA + Java).
0 Preparation
Turn off the firewall: https://blog.csdn.net/Tiezhu_Wang/article/details/113861262
Set a fixed IP: https://blog.csdn.net/Tiezhu_Wang/article/details/113822362
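Before continuing, it can be worth verifying from the host that the virtual machine is reachable on the NameNode port. A minimal sketch using a plain TCP socket; the IP 192.168.2.2 and port 9000 match the configuration in section 1.1 below:
import java.net.InetSocketAddress;
import java.net.Socket;
public class PortCheck {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket()) {
            // Fails fast if the firewall still blocks the port or the IP is wrong
            socket.connect(new InetSocketAddress("192.168.2.2", 9000), 3000);
            System.out.println("NameNode port is reachable");
        }
    }
}
If this throws a ConnectException, fix the firewall or IP settings first; the same error would otherwise surface later as exception 1 in section 6.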
1 Modify the configuration files
1.1 core-site.xml
gedit /usr/local/hadoop-3.2.1/etc/hadoop/core-site.xml
Change the value of fs.defaultFS from localhost to the virtual machine's IP and port:
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.2.2:9000</value>
</property>
1.2 hdfs-site.xml
gedit /usr/local/hadoop-3.2.1/etc/hadoop/hdfs-site.xml
Add or modify the following property, which disables HDFS permission checking (acceptable for a development environment):
<property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
</property>
If HDFS is already running, restart it (stop-dfs.sh, then start-dfs.sh) so the changes take effect.
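If you prefer to leave permission checking enabled, an alternative is to tell the client which remote user to act as: FileSystem.get accepts the user name as a third argument. A minimal sketch, assuming the Hadoop user on the virtual machine is named hadoop:
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class ConnectAsUser {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumption: the HDFS user on the virtual machine is named "hadoop".
        // Acting as that user lets writes under /user/hadoop pass the
        // default permission checks without disabling them.
        FileSystem fs = FileSystem.get(
                URI.create("hdfs://192.168.2.2:9000"), conf, "hadoop");
        System.out.println(fs.exists(new Path("/user/hadoop")));
        fs.close();
    }
}
With this approach, dfs.permissions.enabled can stay at its default of true.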
2 Dependencies
Create a new Maven project on the host and add the following dependencies to its pom.xml:
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>3.2.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>3.2.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-hdfs -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>3.2.1</version>
</dependency>
3 Prepare a test file
Create a new file ~/myfile/words.txt and write a few words in it:
hadoop
spark
java
spring
hive
hbase
python
C
Create a new directory myfile in HDFS and upload the file to it (relative paths in HDFS commands resolve to the current user's home directory, here /user/hadoop):
start-dfs.sh
hdfs dfs -mkdir myfile
hdfs dfs -put ~/myfile/words.txt myfile
View uploaded files:
hdfs dfs -cat myfile/words.txt
4 Java code
- When configuring the connection, specify the virtual machine's IP address (192.168.2.2 here; it can be checked with the ip addr command)
- Create a new directory named hello
- Read the previously uploaded file words.txt and output to the console
- Upload the project's own pom.xml to the hello directory
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.*;
public class MainApp {
    public static void main(String[] args) throws Exception {
        // Configure the connection to the virtual machine's HDFS
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://192.168.2.2:9000");
        FileSystem fs = FileSystem.get(conf);
        // 1. Create the directory
        Path path = new Path("/user/hadoop/hello");
        if (!fs.exists(path)) {
            fs.mkdirs(path);
        } else {
            System.out.println("Path already exists");
        }
        // 2. Read the previously uploaded file and print it to the console
        InputStream in = fs.open(new Path("/user/hadoop/myfile/words.txt"));
        BufferedReader br = new BufferedReader(new InputStreamReader(in));
        String str;
        while ((str = br.readLine()) != null) {
            System.out.println(str);
        }
        br.close();
        // 3. Upload the local pom.xml to the hello directory
        fs.copyFromLocalFile(new Path("D:\\Program Files (x86)\\WorkspaceIDEA\\HadoopStu\\pom.xml"),
                new Path("/user/hadoop/hello/pom.xml"));
        fs.close();
        System.out.println("Finished.");
    }
}
5 Run the test
The console prints the contents of words.txt followed by "Finished.", which shows that the file uploaded earlier was read successfully. You can enter the virtual machine to confirm that pom.xml was uploaded to the hello directory:
hdfs dfs -ls hello
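The same check can also be done from the host. A minimal sketch using FileSystem.listStatus, with the same connection settings as in section 4:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class ListHello {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://192.168.2.2:9000");
        FileSystem fs = FileSystem.get(conf);
        // Print every entry under hello, like `hdfs dfs -ls hello`
        for (FileStatus status : fs.listStatus(new Path("/user/hadoop/hello"))) {
            System.out.println(status.getPath() + "\t" + status.getLen() + " bytes");
        }
        fs.close();
    }
}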
6 Exceptions and solutions
Exception 1: ConnectException
Exception in thread "main" java.net.ConnectException:
Call From XXXX to 192.168.2.2:9000 failed on connection exception
Solution: change the "localhost" in core-site.xml to the virtual machine's IP, as described in section 1.1.
Exception 2: AccessControlException or SafeModeException
Exception in thread "main" org.apache.hadoop.security.AccessControlException:
Permission denied: user=Dell, access=WRITE, inode="/user/hadoop":hadoop:supergroup:drwxr-xr-x
or
Exception in thread "main" org.apache.hadoop.hdfs.server.namenode.SafeModeException:
Cannot create file/user/hadoop/hello/pom.xml. Name node is in safe mode.
Solution: modify hdfs-site.xml as described in section 1.2. For the SafeModeException, the NameNode may still be in safe mode shortly after startup; it can be taken out manually with hdfs dfsadmin -safemode leave. It is also possible that pom.xml already exists in the hello directory, in which case delete the file in HDFS (hdfs dfs -rm hello/pom.xml) and run the program again.
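For the already-exists case, the deletion can also be done from the Java side before uploading. A minimal sketch reusing the paths from section 4:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class ReplaceFile {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://192.168.2.2:9000");
        FileSystem fs = FileSystem.get(conf);
        Path target = new Path("/user/hadoop/hello/pom.xml");
        // Remove the old copy if present (false = non-recursive delete)
        if (fs.exists(target)) {
            fs.delete(target, false);
        }
        fs.copyFromLocalFile(new Path("D:\\Program Files (x86)\\WorkspaceIDEA\\HadoopStu\\pom.xml"), target);
        fs.close();
    }
}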