HDFS Distributed File System—Java API Operation

1. Introduction to HDFS Java API

1. Hadoop is written in Java, so the Java API can be used to operate the Hadoop file system: a client object is built to create, delete, modify, and query files on HDFS.
2. Configuration: this class encapsulates the configuration of a client or server.
3. FileSystem: this class represents a file system object and provides common methods for operating on files:

method name                             method description
copyFromLocalFile(Path src, Path dst)   Copy a file from the local disk to HDFS
copyToLocalFile(Path src, Path dst)     Copy a file from HDFS to the local disk
mkdirs(Path f)                          Create a directory (including parents)
rename(Path src, Path dst)              Rename a file or folder
delete(Path f, boolean recursive)       Delete the specified file or directory
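The methods in the table can be tried without a running cluster: with a default Configuration, FileSystem.get returns the local file system (fs.defaultFS defaults to file:///), so the same calls operate on the local disk. A minimal sketch, assuming hadoop-common (e.g. 2.7.4 as in the pom below) is on the classpath; the class name and paths are illustrative:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FileSystemSketch {
    public static void main(String[] args) throws Exception {
        // Default Configuration -> fs.defaultFS is "file:///", i.e. the local file system
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        String tmp = System.getProperty("java.io.tmpdir");

        // mkdirs(Path f): create a directory tree
        Path dir = new Path(tmp, "hdfs-demo/a/b");
        fs.mkdirs(dir);

        // rename(Path src, Path dst): rename a file or directory
        Path renamed = new Path(tmp, "hdfs-demo/a/b2");
        fs.rename(dir, renamed);
        System.out.println(fs.exists(renamed)); // prints "true"

        // delete(Path f, boolean recursive): recursive=true removes non-empty directories
        fs.delete(new Path(tmp, "hdfs-demo"), true);

        fs.close();
    }
}
```

The exact same calls work against a real cluster once fs.defaultFS points at the NameNode, which is what the HDFS_CRUD class below does.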

Hadoop API official documentation: http://hadoop.apache.org/docs/stable/api/index.html

2. Java API operation

1. Build the project environment

Create a Maven project in IDEA and add the following dependencies to the pom.xml file:

    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.7.4</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.7.4</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.7.4</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
        </dependency>
    </dependencies>

2. Common operations with the Java API

package com.itcast.hdfsdemo;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import org.junit.Before;
import org.junit.Test;

import java.io.FileNotFoundException;
import java.io.IOException;

public class HDFS_CRUD {

    FileSystem fs = null;

    // 1. Initialize the client object
    @Before
    public void init() throws Exception {
        // Build a configuration object and set the URL of the HDFS cluster to access
        Configuration conf = new Configuration();
        // Specify that the HDFS file system is used
        conf.set("fs.defaultFS", "hdfs://hadoop01:9000");
        // Set the client identity as follows
        System.setProperty("HADOOP_USER_NAME", "root");
        // Obtain a file system client object through FileSystem's static factory method
        fs = FileSystem.get(conf);
    }

    // 2. Upload a file to HDFS
    @Test
    public void testAddFileToHdfs() throws IOException {
        // Local path of the file to upload
        Path src = new Path("D:/test.txt");
        // Target path on HDFS
        Path dst = new Path("/testFile");
        // Upload the file
        fs.copyFromLocalFile(src, dst);
        // Release resources
        fs.close();
    }

    // 3. Download a file from HDFS to the local file system
    @Test
    public void testDownloadFileToLocal() throws IllegalArgumentException, IOException {
        // Download the file
        fs.copyToLocalFile(new Path("/testFile"), new Path("D:/test.txt"));
        fs.close();
    }

    // 4. Directory operations: create, rename, and delete directories
    @Test
    public void testMkdirAndDeleteAndRename() throws Exception {
        // Create directories
        fs.mkdirs(new Path("/a/b/c"));
        fs.mkdirs(new Path("/a1/b1/c1"));
        // Rename a file or folder
        fs.rename(new Path("/a1"), new Path("/a2"));
        // Delete a folder; if it is non-empty, the second argument must be true
        fs.delete(new Path("/a2"), true);
    }
    // 5. List file information in a directory (shows files only, not directories)
    @Test
    public void testListFiles() throws FileNotFoundException,
            IllegalArgumentException, IOException {
        // Get an iterator over the file list; the first argument is the path to list,
        // the second indicates whether to recurse into subdirectories
        RemoteIterator<LocatedFileStatus> listFiles = fs.listFiles(new Path("/"), true);
        while (listFiles.hasNext()) {
            LocatedFileStatus fileStatus = listFiles.next();
            // Print the file name
            System.out.println(fileStatus.getPath().getName());
            // Print the block size of the file
            System.out.println(fileStatus.getBlockSize());
            // Print the file permissions
            System.out.println(fileStatus.getPermission());
            // Print the length of the file content
            System.out.println(fileStatus.getLen());
            // Get the block information (length, offset, and datanode hosts)
            BlockLocation[] blockLocations = fileStatus.getBlockLocations();
            for (BlockLocation bl : blockLocations) {
                System.out.println("block-length:" + bl.getLength() + "--" + "block-offset:" + bl.getOffset());
                String[] hosts = bl.getHosts();
                for (String host : hosts) {
                    System.out.println(host);
                }
            }
            System.out.println("----------------");
        }
    }
}
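Besides copyFromLocalFile and copyToLocalFile, file contents can also be written and read directly through streams with FileSystem.create and FileSystem.open. A minimal sketch, run here against the local file system (default Configuration) so it needs no cluster; the class name and path are illustrative:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class StreamDemo {
    public static void main(String[] args) throws Exception {
        // Default Configuration -> local file system; point fs.defaultFS at a
        // NameNode (as in HDFS_CRUD.init) to run the same code against HDFS
        FileSystem fs = FileSystem.get(new Configuration());
        Path p = new Path(System.getProperty("java.io.tmpdir"), "stream-demo.txt");

        // create(Path f, boolean overwrite): open an output stream to write a file
        try (FSDataOutputStream out = fs.create(p, true)) {
            out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
        }

        // open(Path f): open an input stream to read the file back
        try (FSDataInputStream in = fs.open(p);
             BufferedReader r = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            System.out.println(r.readLine()); // prints "hello hdfs"
        }

        // Clean up (second argument false: not a recursive delete)
        fs.delete(p, false);
        fs.close();
    }
}
```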

Origin blog.csdn.net/tang5615/article/details/125678042