The Big Data Era: Operating HDFS from Java

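The pseudo-distributed and fully distributed Hadoop environments were set up in an earlier post; see "The Big Data Era: Setting Up a Hadoop Cluster".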

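I had planned to jump straight into the Java code, but decided to briefly review command-line access to HDFS first. In my view, working with HDFS feels much like working with files on a Linux system, except that each command is prefixed with hdfs dfs - or hadoop fs -, for example: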

# create a directory
hdfs dfs -mkdir <path>
or
hadoop fs -mkdir /test

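This naturally raises the question of how hadoop fs and hdfs dfs differ. In short, hadoop fs is the generic shell and works against any file system Hadoop supports (the local file system as well as HDFS), hdfs dfs targets HDFS specifically, and hadoop dfs is deprecated in favor of hdfs dfs.
Reference: "Hadoop: the difference between the hadoop fs, hadoop dfs and hdfs dfs commands"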

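Back to the main topic: how to operate the HDFS file system from Java.
pom.xml: keep the Hadoop dependency version as close as possible to the Hadoop version running on the server.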

<properties>
   <hadoop-version>2.6.5</hadoop-version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${hadoop-version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop-version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>${hadoop-version}</version>
    </dependency>

</dependencies>
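
One way to act on the version advice above is to compare the version declared in pom.xml with what the cluster reports. The sketch below is a hypothetical helper: HadoopVersionCheck, parseVersion and matchesPom are names invented here, not Hadoop APIs. It assumes you capture the output of the hadoop version command yourself and that its first line has the form "Hadoop 2.6.5".

```java
import java.util.Optional;

/** Hypothetical helper (not part of Hadoop): compares the pom version with the cluster version. */
final class HadoopVersionCheck {

    /**
     * Extracts the version token from the first line of `hadoop version` output,
     * e.g. "Hadoop 2.6.5" -> "2.6.5". Returns empty if the line has another shape.
     */
    static Optional<String> parseVersion(String hadoopVersionOutput) {
        String firstLine = hadoopVersionOutput.trim().split("\\R", 2)[0].trim();
        String prefix = "Hadoop ";
        if (!firstLine.startsWith(prefix)) {
            return Optional.empty();
        }
        return Optional.of(firstLine.substring(prefix.length()).trim());
    }

    /** True only when the cluster reports exactly the version declared in pom.xml. */
    static boolean matchesPom(String hadoopVersionOutput, String pomVersion) {
        return parseVersion(hadoopVersionOutput).map(pomVersion::equals).orElse(false);
    }

    public static void main(String[] args) {
        // Made-up sample text standing in for real command output
        String sample = "Hadoop 2.6.5\n(remaining lines omitted)";
        System.out.println(matchesPom(sample, "2.6.5"));
        System.out.println(matchesPom(sample, "2.7.3"));
    }
}
```

The comparison is deliberately strict: a vendor build that reports a suffixed version string will not match a plain 2.6.5, which is a reasonable prompt to check which artifacts the vendor recommends.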

HdfsUtil.java

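import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsUtil {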

    private Configuration configuration;

    private FileSystem fileSystem;

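    /**
     * Creates an HdfsUtil connected to the given HDFS cluster.
     * @param url  HDFS URI, e.g. hdfs://host:port
     * @param user user name to access HDFS as
     * @return an initialized HdfsUtil
     * @throws IOException if the file system cannot be obtained
     * @throws InterruptedException if the thread is interrupted while connecting
     */
    public static HdfsUtil getUtil(String url, String user) throws IOException, InterruptedException {
        HdfsUtil util = new HdfsUtil();
        util.configuration = new Configuration();
        util.fileSystem = FileSystem.get(URI.create(url), util.configuration, user);
        return util;
    }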

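    /**
     * Creates a directory, including any missing parent directories.
     * Note: this method closes the FileSystem when it finishes, so obtain a
     * fresh HdfsUtil via getUtil before the next operation.
     * @param filePath directory path to create
     * @return true if mkdirs reports success
     */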
    public boolean createPath(String filePath) {
        boolean b = false;
        Path path = new Path(filePath);
        try {
            b = this.fileSystem.mkdirs(path);
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                this.fileSystem.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        return b;
    }
    /**
     * Checks whether the path exists and whether it is a file or a directory.
     * Note: closes the FileSystem when it finishes.
     * Author: lwl, 2018/12/25 15:08
     * @param filePath path to check
     * @return 0: does not exist, 1: file, 2: directory
     */
    public int checkFile(String filePath) {
        Path path = new Path(filePath);
        int result = 0;
        try {
            if(this.fileSystem.exists(path)){
                if(this.fileSystem.isDirectory(path)){
                    result = 2;
                }else{
                    result = 1;
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                this.fileSystem.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        return result;
    }

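    /**
     * Uploads a local file to HDFS.
     * Note: closes the FileSystem when it finishes.
     * @param sourcePath path of the local file
     * @param savePath   destination path on HDFS
     */
    public void uploadFile(String sourcePath, String savePath){
        Path source = new Path(sourcePath);
        Path dest = new Path(savePath);
        try {
            this.fileSystem.copyFromLocalFile(source, dest);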
        } catch (IOException e) {
            e.printStackTrace();
        }finally {
            try {
                this.fileSystem.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

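    /**
     * Uploads the contents of an input stream to HDFS.
     * The FileSystem and the input stream are left open for the caller to close.
     * @param input    stream to read from
     * @param savePath destination path on HDFS
     * @throws IOException if the file cannot be written
     */
    public void uploadFile(InputStream input, String savePath) throws IOException {
        Path outFile = new Path(savePath);
        // create() makes the file itself (overwriting an existing one), so createNewFile is not needed
        try (FSDataOutputStream output = this.fileSystem.create(outFile)) {
            IOUtils.copyBytes(input, output, 1024 * 1024 * 64, false);
        }
    }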

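    /**
     * Downloads a file from HDFS into the given output stream.
     * The FileSystem and the output stream are left open for the caller to close.
     * @param sourcePath path of the file on HDFS
     * @param out        stream to write the file contents to
     * @throws IOException if the file does not exist or cannot be read
     */
    public void downloadFile(String sourcePath, OutputStream out) throws IOException {
        Path inFile = new Path(sourcePath);
        // open() fails if the source is missing; nothing is created on the HDFS side
        try (FSDataInputStream input = this.fileSystem.open(inFile)) {
            IOUtils.copyBytes(input, out, 1024 * 1024 * 64, false);
        }
    }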

    public static void main(String[] args) throws IOException, InterruptedException {
        // The port (9000 here) must match the fs.defaultFS setting in core-site.xml
        String url = "hdfs://my-cdh-master:9000";
        HdfsUtil util = HdfsUtil.getUtil(url,"root");
        util.uploadFile("H:\\VMachines\\my_cdh_slave1\\vmware.log","/test/vmware.log");
    }
}
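
Because the URL passed to getUtil has to agree with fs.defaultFS, it can help to sanity-check it before connecting. The following is only a sketch using the JDK's java.net.URI; HdfsUrlCheck and problemWith are hypothetical names, and the expected port is assumed to be something you read from core-site.xml yourself.

```java
import java.net.URI;
import java.util.Optional;

/** Hypothetical pre-flight check (not part of HdfsUtil or Hadoop). */
final class HdfsUrlCheck {

    /** Returns a description of the first problem found, or empty if the URL looks usable. */
    static Optional<String> problemWith(String url, int expectedPort) {
        URI uri = URI.create(url);
        if (!"hdfs".equals(uri.getScheme())) {
            return Optional.of("scheme should be hdfs but was " + uri.getScheme());
        }
        if (uri.getHost() == null) {
            // java.net.URI yields a null host when it cannot parse the authority as host:port
            return Optional.of("no host could be parsed from " + url);
        }
        if (uri.getPort() != expectedPort) {
            // getPort() is -1 when the URL carries no port at all
            return Optional.of("port " + uri.getPort() + " does not match expected " + expectedPort);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // 9000 mirrors the value used in main above; 8020 is just a contrasting example
        System.out.println(problemWith("hdfs://my-cdh-master:9000", 9000));
        System.out.println(problemWith("hdfs://my-cdh-master:8020", 9000));
    }
}
```

A check like this only catches mismatches on the client side; it does not confirm that a NameNode is actually listening on that address.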

For other HDFS operations, consult the API as your needs require.
That wraps up operating HDFS from Java.

Reposted from blog.csdn.net/qq1049545450/article/details/103294637