Copyright notice: this is an original article by the blogger, released under the CC 4.0 BY-SA license. Please include a link to the original source and this notice when reposting.
Better to hold to simplicity for a lifetime than to misuse cleverness;
better to spend the whole day reading than to make friends indiscriminately.
Related links
HDFS basics
Connecting to a Hadoop cluster
WordCount example
HDFS Java API
Code download
MyHadoop.java download (extraction code: z458)
Detailed walkthrough
Note: complete the Eclipse or IntelliJ IDEA connection to the Hadoop cluster before attempting any of the steps below.
- The test class is named MyHadoop; it holds a FileSystem field fs and a Configuration field conf.
- An HDFSUtil() method initializes those two fields.
- Add System.setProperty("HADOOP_USER_NAME", "root"); to the main method to avoid the error org.apache.hadoop.security.AccessControlException: Permission denied: user=...
package neu.software;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import java.io.IOException;

public class MyHadoop {
    private FileSystem fs;
    private Configuration conf;

    public static void main(String[] args) throws IOException {
        // Impersonate root so HDFS does not reject the local OS user
        System.setProperty("HADOOP_USER_NAME", "root");
        MyHadoop myHadoop = new MyHadoop();
        myHadoop.HDFSUtil();
    }

    // Initialize conf and fs; the configuration is read from the classpath
    public void HDFSUtil() throws IOException {
        conf = new Configuration();
        fs = FileSystem.get(conf);
    }
}
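FileSystem.get(conf) picks up the target cluster from the fs.defaultFS entry of the core-site.xml on the classpath. A minimal sketch of that file, assuming the hdfs://master:9000 address that appears in the listings of step 3:

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
```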
1. Create the directory /data/test in HDFS
Method definition
public boolean mkdir(String path) throws IOException {
    Path srcPath = new Path(path);
    return fs.mkdirs(srcPath);  // creates missing parent directories as well
}
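fs.mkdirs behaves much like `mkdir -p`: missing parents are created and calling it again on an existing directory is harmless. A local-filesystem sketch of the same contract using java.nio (the MkdirDemo class is illustrative, not part of the HDFS API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MkdirDemo {
    // Local analogy for fs.mkdirs: create all missing parents;
    // safe to call repeatedly on an existing directory
    public static boolean mkdirs(Path p) throws IOException {
        Files.createDirectories(p);
        return Files.isDirectory(p);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("demo").resolve("data/test");
        System.out.println(mkdirs(dir));  // true
        System.out.println(mkdirs(dir));  // true again: the call is idempotent
    }
}
```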
Method test
public static void main(String[] args) throws IOException {
    System.setProperty("HADOOP_USER_NAME", "root");
    MyHadoop myHadoop = new MyHadoop();
    myHadoop.HDFSUtil();
    myHadoop.mkdir("/data/test");
}
Verification (XShell terminal)
[root@master ~]# hadoop fs -ls /data
Found 1 items
drwxr-xr-x - root supergroup 0 2019-10-15 11:14 /data/test
2. Upload the local folder mytest to the HDFS directory /data/test via the Java API
Method definition
public void put(String src, String dst, boolean delSrc, boolean overwrite) throws IOException {
    Path srcPath = new Path(src);
    Path dstPath = new Path(dst);
    // delSrc: delete the local source after copying; overwrite: replace an existing target
    fs.copyFromLocalFile(delSrc, overwrite, srcPath, dstPath);
}
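The overwrite flag of copyFromLocalFile works much like REPLACE_EXISTING in java.nio: without it, copying onto an existing target fails. A local-to-local sketch of that behavior (the CopyDemo class and the temp paths are made up for illustration):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CopyDemo {
    // Local analogy for copyFromLocalFile's overwrite flag: with
    // REPLACE_EXISTING the copy succeeds even if the target exists;
    // without it, a second copy throws FileAlreadyExistsException
    public static void copy(Path src, Path dst, boolean overwrite) throws IOException {
        if (overwrite) {
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        } else {
            Files.copy(src, dst);
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.writeString(Files.createTempFile("src", ".txt"), "Hello World");
        Path dst = Files.createTempDirectory("dst").resolve("data1.txt");
        copy(src, dst, false);  // target absent: plain copy succeeds
        copy(src, dst, true);   // target present: overwrite=true succeeds
        System.out.println(Files.readString(dst));  // Hello World
    }
}
```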
Method test
public static void main(String[] args) throws IOException {
    System.setProperty("HADOOP_USER_NAME", "root");
    MyHadoop myHadoop = new MyHadoop();
    myHadoop.HDFSUtil();
    //myHadoop.mkdir("/data/test");  // step 1, do not run again
    myHadoop.put("C:\\Users\\Lenovo\\Desktop\\localfile\\mytest", "/data/test/", false, true);
}
Note: the mytest folder contains the following 4 files
- data1.txt
Hello World
- data2.txt
Hello Hadoop
- data3.txt
Hello Java
- data4.txt
Hello HDFS
Verification (XShell terminal)
[root@master ~]# hadoop fs -ls /data/test/mytest
Found 4 items
-rw-r--r-- 3 root supergroup 13 2019-10-15 11:30 /data/test/mytest/data1.txt
-rw-r--r-- 3 root supergroup 14 2019-10-15 11:30 /data/test/mytest/data2.txt
-rw-r--r-- 3 root supergroup 12 2019-10-15 11:30 /data/test/mytest/data3.txt
-rw-r--r-- 3 root supergroup 12 2019-10-15 11:30 /data/test/mytest/data4.txt
3. List the files under the /data/test/mytest directory
Method definition
// Recursively list files under filePath, keeping only names that end
// with ext (pass "" to match every file)
public List<String> ls(String filePath, String ext) throws IOException {
    List<String> listDir = new ArrayList<String>();
    Path path = new Path(filePath);
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(path, true);  // true = recursive
    while (it.hasNext()) {
        String name = it.next().getPath().toString();
        if (name.endsWith(ext)) {
            listDir.add(name);
        }
    }
    return listDir;
}
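The ext filter is just String.endsWith, and every string ends with the empty string, which is why passing "" lists all four files. A Hadoop-free sketch of that rule (FilterDemo is an illustrative name):

```java
import java.util.ArrayList;
import java.util.List;

public class FilterDemo {
    // Same filter rule as ls(): keep names ending with ext; "" keeps everything
    public static List<String> filter(List<String> names, String ext) {
        List<String> out = new ArrayList<>();
        for (String n : names) {
            if (n.endsWith(ext)) out.add(n);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = List.of("data1.txt", "data2.txt", "notes.md");
        System.out.println(filter(names, ".txt"));  // [data1.txt, data2.txt]
        System.out.println(filter(names, ""));      // [data1.txt, data2.txt, notes.md]
    }
}
```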
Method test
public static void main(String[] args) throws IOException {
    System.setProperty("HADOOP_USER_NAME", "root");
    MyHadoop myHadoop = new MyHadoop();
    myHadoop.HDFSUtil();
    //myHadoop.mkdir("/data/test");  // step 1, do not run again
    //myHadoop.put("C:\\Users\\Lenovo\\Desktop\\localfile\\mytest", "/data/test/", false, true);
    List<String> list = myHadoop.ls("/data/test/mytest", "");
    for (String file : list) {
        System.out.println(file);
    }
}
Verification (IDE console output)
…
hdfs://master:9000/data/test/mytest/data1.txt
hdfs://master:9000/data/test/mytest/data2.txt
hdfs://master:9000/data/test/mytest/data3.txt
hdfs://master:9000/data/test/mytest/data4.txt
…
4. Count the number of files under /data/test/mytest and their space usage
Method definition
public String count(String filePath) throws IOException {
    Path path = new Path(filePath);
    ContentSummary contentSummary = fs.getContentSummary(path);
    return contentSummary.toString();  // quota and usage columns
}
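ContentSummary.toString() follows the seven-column layout of hadoop fs -count -q: name quota, remaining name quota, space quota, remaining space quota, directory count, file count, and content size. So `none inf none inf 1 4 51` reads as no quotas set, 1 directory, 4 files, 51 bytes (13 + 14 + 12 + 12). A small parser sketch over that sample line (SummaryDemo is an illustrative name):

```java
public class SummaryDemo {
    // Pull directory count, file count, and byte count out of the
    // seven-column ContentSummary line (the four quota columns come first)
    public static long[] parse(String line) {
        String[] cols = line.trim().split("\\s+");
        return new long[] {
            Long.parseLong(cols[4]),  // directories
            Long.parseLong(cols[5]),  // files
            Long.parseLong(cols[6])   // bytes
        };
    }

    public static void main(String[] args) {
        long[] c = parse("none inf none inf 1 4 51");
        System.out.println(c[0] + " dir(s), " + c[1] + " file(s), " + c[2] + " bytes");
        // 1 dir(s), 4 file(s), 51 bytes
    }
}
```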
Method test
public static void main(String[] args) throws IOException {
    System.setProperty("HADOOP_USER_NAME", "root");
    MyHadoop myHadoop = new MyHadoop();
    myHadoop.HDFSUtil();
    //myHadoop.mkdir("/data/test");  // step 1, do not run again
    //myHadoop.put("C:\\Users\\Lenovo\\Desktop\\localfile\\mytest", "/data/test/", false, true);
    //List<String> list = myHadoop.ls("/data/test/mytest", "");
    //for (String file : list) {
    //    System.out.println(file);
    //}
    System.out.println(myHadoop.count("/data/test/mytest"));
}
Verification (IDE console output)
…
DEBUG - Call: getContentSummary took 163ms
none inf none inf 1 4 51
…
5. Recursively change the owner of the files under /data/test/mytest to admin
Method definition
public void chown(String filePath, String username, String groupname) throws IOException {
    Path path = new Path(filePath);
    // listFiles(path, true) iterates files recursively, so only the files
    // (not the directories themselves) change owner
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(path, true);
    while (it.hasNext()) {
        fs.setOwner(it.next().getPath(), username, groupname);
    }
}
Method test
public static void main(String[] args) throws IOException {
    System.setProperty("HADOOP_USER_NAME", "root");
    MyHadoop myHadoop = new MyHadoop();
    myHadoop.HDFSUtil();
    //myHadoop.mkdir("/data/test");  // step 1, do not run again
    //myHadoop.put("C:\\Users\\Lenovo\\Desktop\\localfile\\mytest", "/data/test/", false, true);
    //List<String> list = myHadoop.ls("/data/test/mytest", "");
    //for (String file : list) {
    //    System.out.println(file);
    //}
    //System.out.println(myHadoop.count("/data/test/mytest"));
    myHadoop.chown("/data/test/mytest/", "admin", "supergroup");
}
Verification (XShell terminal)
[root@master ~]# hadoop fs -ls /data/test/mytest
Found 4 items
-rw-r--r-- 3 admin supergroup 13 2019-10-15 11:30 /data/test/mytest/data1.txt
-rw-r--r-- 3 admin supergroup 14 2019-10-15 11:30 /data/test/mytest/data2.txt
-rw-r--r-- 3 admin supergroup 12 2019-10-15 11:30 /data/test/mytest/data3.txt
-rw-r--r-- 3 admin supergroup 12 2019-10-15 11:30 /data/test/mytest/data4.txt
6. Recursively change the permissions of the files under /data/test/mytest so that only the owner can read, write, and execute, while group and others can only read
Method definition
public void chmod(Path src, String mode) throws IOException {
    FsPermission fp = new FsPermission(mode);  // "744" is parsed as an octal mode string
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(src, true);
    while (it.hasNext()) {
        fs.setPermission(it.next().getPath(), fp);
    }
}
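The mode string "744" is read digit by digit as octal: 7 = rwx for the owner, 4 = r-- for both group and others, which matches the -rwxr--r-- entries in the listing below. A Hadoop-free sketch of that digit-to-triple mapping (PermDemo is an illustrative name):

```java
public class PermDemo {
    // Map one octal digit (0-7) to its rwx triple, as shown in ls output
    static String triple(int d) {
        return ((d & 4) != 0 ? "r" : "-")
             + ((d & 2) != 0 ? "w" : "-")
             + ((d & 1) != 0 ? "x" : "-");
    }

    // Expand a three-digit octal mode like "744" into "rwxr--r--"
    public static String symbolic(String octal) {
        StringBuilder sb = new StringBuilder();
        for (char c : octal.toCharArray()) {
            sb.append(triple(c - '0'));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(symbolic("744"));  // rwxr--r--
    }
}
```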
Method test
public static void main(String[] args) throws IOException {
    System.setProperty("HADOOP_USER_NAME", "root");
    MyHadoop myHadoop = new MyHadoop();
    myHadoop.HDFSUtil();
    //myHadoop.mkdir("/data/test");  // step 1, do not run again
    //myHadoop.put("C:\\Users\\Lenovo\\Desktop\\localfile\\mytest", "/data/test/", false, true);
    //List<String> list = myHadoop.ls("/data/test/mytest", "");
    //for (String file : list) {
    //    System.out.println(file);
    //}
    //System.out.println(myHadoop.count("/data/test/mytest"));
    //myHadoop.chown("/data/test/mytest/", "admin", "supergroup");
    Path path = new Path("/data/test/mytest/");
    myHadoop.chmod(path, "744");
}
Verification (XShell terminal)
[root@master ~]# hadoop fs -ls /data/test/mytest
Found 4 items
-rwxr--r-- 3 admin supergroup 13 2019-10-15 11:30 /data/test/mytest/data1.txt
-rwxr--r-- 3 admin supergroup 14 2019-10-15 11:30 /data/test/mytest/data2.txt
-rwxr--r-- 3 admin supergroup 12 2019-10-15 11:30 /data/test/mytest/data3.txt
-rwxr--r-- 3 admin supergroup 12 2019-10-15 11:30 /data/test/mytest/data4.txt
7. Create an empty file empty.txt under the /data/test/mytest directory
Method definition
public void touchz(String filePath, String fileName) throws IOException {
    Path path = new Path(filePath, fileName);
    // create() returns an output stream; close it so the empty file is finalized
    fs.create(path).close();
}
Method test
public static void main(String[] args) throws IOException {
    System.setProperty("HADOOP_USER_NAME", "root");
    MyHadoop myHadoop = new MyHadoop();
    myHadoop.HDFSUtil();
    //myHadoop.mkdir("/data/test");  // step 1, do not run again
    //myHadoop.put("C:\\Users\\Lenovo\\Desktop\\localfile\\mytest", "/data/test/", false, true);
    //List<String> list = myHadoop.ls("/data/test/mytest", "");
    //for (String file : list) {
    //    System.out.println(file);
    //}
    //System.out.println(myHadoop.count("/data/test/mytest"));
    //myHadoop.chown("/data/test/mytest/", "admin", "supergroup");
    //Path path = new Path("/data/test/mytest/");
    //myHadoop.chmod(path, "744");
    myHadoop.touchz("/data/test/mytest/", "empty.txt");
}
Verification (XShell terminal)
[root@master ~]# hadoop fs -ls /data/test/mytest
Found 5 items
-rwxr--r-- 3 admin supergroup 13 2019-10-15 11:30 /data/test/mytest/data1.txt
-rwxr--r-- 3 admin supergroup 14 2019-10-15 11:30 /data/test/mytest/data2.txt
-rwxr--r-- 3 admin supergroup 12 2019-10-15 11:30 /data/test/mytest/data3.txt
-rwxr--r-- 3 admin supergroup 12 2019-10-15 11:30 /data/test/mytest/data4.txt
-rw-r--r-- 3 root supergroup 0 2019-10-15 12:05 /data/test/mytest/empty.txt
8. Append the contents of another file to /data/test/mytest/empty.txt
Method definition
public boolean appendToFile(InputStream in, String filePath) throws IOException {
    conf.setBoolean("dfs.support.append", true);  // append must also be enabled cluster-side
    if (!check(filePath)) {
        fs.createNewFile(new Path(filePath));
    }
    OutputStream out = fs.append(new Path(filePath));
    // the final true tells copyBytes to close both streams when it finishes
    IOUtils.copyBytes(in, out, 10, true);
    fs.close();  // note: fs is unusable afterwards, so call appendToFile last
    return true;
}

private boolean check(String filePath) throws IOException {
    Path path = new Path(filePath);
    return fs.exists(path);
}
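IOUtils.copyBytes(in, out, 10, true) pumps data through a 10-byte buffer and, because the last argument is true, closes both streams itself. A minimal stand-in built on plain java.io streams (illustrative, not Hadoop's actual implementation):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CopyBytesDemo {
    // Minimal stand-in for IOUtils.copyBytes(in, out, bufSize, close):
    // read into a fixed-size buffer until EOF, then optionally close both ends
    public static void copyBytes(InputStream in, OutputStream out,
                                 int bufSize, boolean close) throws IOException {
        byte[] buf = new byte[bufSize];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        if (close) {
            in.close();
            out.close();
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayInputStream in = new ByteArrayInputStream("Hello World".getBytes());
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyBytes(in, out, 10, true);  // 10-byte buffer, close both streams
        System.out.println(out.toString());  // Hello World
    }
}
```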
Method test
public static void main(String[] args) throws IOException {
    System.setProperty("HADOOP_USER_NAME", "root");
    MyHadoop myHadoop = new MyHadoop();
    myHadoop.HDFSUtil();
    //myHadoop.mkdir("/data/test");  // step 1, do not run again
    //myHadoop.put("C:\\Users\\Lenovo\\Desktop\\localfile\\mytest", "/data/test/", false, true);
    //List<String> list = myHadoop.ls("/data/test/mytest", "");
    //for (String file : list) {
    //    System.out.println(file);
    //}
    //System.out.println(myHadoop.count("/data/test/mytest"));
    //myHadoop.chown("/data/test/mytest/", "admin", "supergroup");
    //Path path = new Path("/data/test/mytest/");
    //myHadoop.chmod(path, "744");
    //myHadoop.touchz("/data/test/mytest/", "empty.txt");
    File file = new File("C:\\Users\\Lenovo\\Desktop\\localfile\\mytest\\data1.txt");
    FileInputStream fileInputStream = new FileInputStream(file);
    myHadoop.appendToFile(fileInputStream, "/data/test/mytest/empty.txt");
    fileInputStream.close();
}
Verification (XShell terminal)
[root@master ~]# hadoop fs -cat /data/test/mytest/empty.txt
Hello World
9. View the contents of /data/test/mytest/empty.txt
Method definition
public void cat(String filePath) throws IOException {
    Path path = new Path(filePath);
    if (!check(filePath)) {
        fs.createNewFile(path);
    }
    FSDataInputStream fsDataInputStream = fs.open(path);
    // false: do not let copyBytes close System.out after copying
    IOUtils.copyBytes(fsDataInputStream, System.out, 10, false);
    fsDataInputStream.close();
}
Method test
public static void main(String[] args) throws IOException {
    System.setProperty("HADOOP_USER_NAME", "root");
    MyHadoop myHadoop = new MyHadoop();
    myHadoop.HDFSUtil();
    //myHadoop.mkdir("/data/test");  // step 1, do not run again
    //myHadoop.put("C:\\Users\\Lenovo\\Desktop\\localfile\\mytest", "/data/test/", false, true);
    //List<String> list = myHadoop.ls("/data/test/mytest", "");
    //for (String file : list) {
    //    System.out.println(file);
    //}
    //System.out.println(myHadoop.count("/data/test/mytest"));
    //myHadoop.chown("/data/test/mytest/", "admin", "supergroup");
    //Path path = new Path("/data/test/mytest/");
    //myHadoop.chmod(path, "744");
    //myHadoop.touchz("/data/test/mytest/", "empty.txt");
    //File file = new File("C:\\Users\\Lenovo\\Desktop\\localfile\\mytest\\data1.txt");
    //FileInputStream fileInputStream = new FileInputStream(file);
    //myHadoop.appendToFile(fileInputStream, "/data/test/mytest/empty.txt");
    //fileInputStream.close();
    myHadoop.cat("/data/test/mytest/empty.txt");
}
Verification (IDE console output)
DEBUG - SASL client skipping handshake in unsecured configuration for addr = /172.16.29.95, datanodeId = DatanodeInfoWithStorage[172.16.29.95:50010,DS-c67f1790-f7ea-4a0c-b564-f91a70d347e4,DISK]
Hello World
If you have questions, leave a comment below or message me and I'll reply as soon as I can.
Pointers and discussion from veterans and newcomers alike are welcome!
Follows, likes, and bookmarks are much appreciated!