Big Data Hadoop Work Notes 002 --- Connecting Spring Boot to Hadoop HDFS: Creating Directories, Uploading Files, Deleting Files, and Downloading Files

Copyright notice: this is an original post by blogger credreamer; reproduction without permission is prohibited. https://blog.csdn.net/lidew521/article/details/88012964


First, set up the Hadoop environment. So far I have only set up a non-clustered (single-node) installation; see my other post for that.

Development in IDEA

1. Create the project. Under Web, check Web; under SQL, check JDBC, MySQL, and MyBatis.

2. Configuration files

E:\IdeaWkSpace\hadoopdemo\src\main\resources\application.properties

spring.application.name=hadoopdemo
spring.profiles.active=dev

E:\IdeaWkSpace\hadoopdemo\src\main\resources\application-dev.properties

server.port=8040

spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://172.19.128.38:3306/test_lidw?useUnicode=true&characterEncoding=utf-8
spring.datasource.username=root
spring.datasource.password=123456

hdfs.path=hdfs://192.168.136.110:8020
hdfs.username=root

3.E:\IdeaWkSpace\hadoopdemo\pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.1.3.RELEASE</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.crehadoop</groupId>
    <artifactId>demo</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>demo</name>
    <description>Demo project for Spring Boot</description>

    <properties>
        <java.version>1.8</java.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-jdbc</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.mybatis.spring.boot</groupId>
            <artifactId>mybatis-spring-boot-starter</artifactId>
            <version>2.0.0</version>
        </dependency>

        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>

        <!-- Hadoop dependencies -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>3.1.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>3.1.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>3.1.1</version>
        </dependency>

        <!-- Alibaba FastJson dependency -->
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.44</version>
        </dependency>

    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

4.E:\IdeaWkSpace\hadoopdemo\src\main\java\com\crehadoop\demo\util\HadoopUtil.java 

package com.crehadoop.demo.util;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

import javax.annotation.PostConstruct;
import java.net.URI;

/**
 * Hadoop utility class
 * @date: 2018-11-28 13:59
 */
@Component
public class HadoopUtil {
    @Value("${hdfs.path}")
    private String path;
    @Value("${hdfs.username}")
    private String username;

    private static String hdfsPath;
    private static String hdfsName;

    /**
     * Get the HDFS configuration
     * @return
     */
    private static Configuration getConfiguration() {

        Configuration configuration = new Configuration();
        configuration.set("fs.defaultFS", hdfsPath);
        return configuration;
    }

    /**
     * Get the HDFS FileSystem object
     * @return
     * @throws Exception
     */
    public static FileSystem getFileSystem() throws Exception {
        // A client operating on HDFS carries a user identity. By default the HDFS client API
        // reads it from a JVM parameter, e.g. -DHADOOP_USER_NAME=hadoop
//        FileSystem hdfs = FileSystem.get(getHdfsConfig()); // default lookup
//        Alternatively, the user can be passed in when constructing the client FileSystem object:
        FileSystem fileSystem = FileSystem.get(new URI(hdfsPath), getConfiguration(), hdfsName);
        return fileSystem;
    }

    @PostConstruct
    public void getPath() {
        hdfsPath = this.path;
    }
    @PostConstruct
    public void getName() {
        hdfsName = this.username;
    }

    public static String getHdfsPath() {
        return hdfsPath;
    }

    public String getUsername() {
        return username;
    }
}

5.E:\IdeaWkSpace\hadoopdemo\src\main\java\com\crehadoop\demo\core\Result.java 

package com.crehadoop.demo.core;

import com.alibaba.fastjson.JSON;

/**
 * Unified wrapper for API responses
 */
public class Result {
    private int code;
    private String message;
    private boolean success = true;
    private Object data;

    public Result setCode(ResultCode resultCode) {
        this.code = resultCode.code();
        return this;
    }

    public Result setCode(String resultCode) {
        this.code = Integer.valueOf(resultCode);
        return this;
    }

    public boolean isSuccess() {
        return success;
    }

    public Result setSuccess(boolean success) {
        this.success = success;
        return this;
    }

    public int getCode() {

        return code;
    }

    public String getMessage() {

        return message;
    }

    public Result setMessage(String message) {
        this.message = message;
        return this;
    }

    public Object getData() {
        return data;
    }

    public Result setData(Object data) {
        this.data = data;
        return this;
    }

    @Override
    public String toString() {
        //SerializerFeature.WRITE_MAP_NULL_FEATURES
        return JSON.toJSONString(this);
    }
}

6.E:\IdeaWkSpace\hadoopdemo\src\main\java\com\crehadoop\demo\core\ResultCode.java 

package com.crehadoop.demo.core;

/**
 * Response code enum, loosely following HTTP status code semantics
 */
public enum ResultCode {
    SUCCESS(200), // success
    FAIL(400), // failure
    UNAUTHORIZED(401), // request timed out
    AUTHFAILED(402), // the user is not allowed to log into this system
    NOTHISUSER(403), // the user does not exist
    NOT_FOUND(404), // the endpoint does not exist
    ERRSYSTEM(405), // wrong system
    ERRPWD(406), // wrong password
    INTERNAL_SERVER_ERROR(500); // internal server error

    private final int code;

    ResultCode(int code) {
        this.code = code;
    }

    public int code() {
        return code;
    }
}

7.E:\IdeaWkSpace\hadoopdemo\src\main\java\com\crehadoop\demo\core\ResultGenerator.java 

package com.crehadoop.demo.core;

/**
 * Helper for generating response results
 */
public class ResultGenerator {
    private static final String DEFAULT_SUCCESS_MESSAGE = "SUCCESS";

    public static Result genSuccessResult() {
        return new Result()
                .setCode(ResultCode.SUCCESS)
                .setMessage(DEFAULT_SUCCESS_MESSAGE);
    }

    public static Result genSuccessResult(Object data) {
        return new Result()
                .setCode(ResultCode.SUCCESS)
                .setMessage(DEFAULT_SUCCESS_MESSAGE)
                .setData(data);
    }

    public static Result genFailResult(String message) {
        return new Result()
                .setCode(ResultCode.FAIL)
                .setMessage(message);
    }
}

8.E:\IdeaWkSpace\hadoopdemo\src\main\java\com\crehadoop\demo\controller\HadoopController.java

package com.crehadoop.demo.controller;

import com.crehadoop.demo.core.Result;
import com.crehadoop.demo.util.HadoopUtil;
import org.apache.commons.lang3.StringUtils;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

/**
 * REST controller for HDFS operations
 * @date: 2018-11-28 13:51
 */
@RestController
@RequestMapping("/hadoop")
public class HadoopController {

    /**
     * Create a directory
     * @param path
     * @return
     * @throws Exception
     */
    @PostMapping("/mkdir")
    public Result mkdir(@RequestParam("path") String path) throws Exception {
        if (StringUtils.isEmpty(path)) {
            Result result=new Result();
            result.setCode("500");
            result.setMessage("请求参数为空");
            result.setData("请求参数为空");
            result.setSuccess(false);
            return result;
        }
        // file system handle
        FileSystem fs = HadoopUtil.getFileSystem();
        // target path
        Path newPath = new Path(path);
        // create the (empty) directory
        boolean isOk = fs.mkdirs(newPath);
        fs.close();
        if (isOk) {
            Result result=new Result();
            result.setMessage("create dir success");
            result.setCode("200");
            result.setSuccess(true);
            return result;
        } else {
            Result result=new Result();
            result.setMessage("create dir fail");
            result.setCode("500");
            result.setSuccess(false);
            return result;
        }
    }
}

9. That's it; start the application from IDEA.

Create a directory:

In Postman, POST to localhost:8040/hadoop/mkdir

with a body parameter path set to /demo.

Send the request (an equivalent curl command is shown below).
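
The same request can also be sent from the command line with curl (a sketch; adjust the host and port to your own setup):

curl -X POST "http://localhost:8040/hadoop/mkdir" -d "path=/demo"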

10. Then open http://192.168.136.110:50070/explorer.html#/ in a browser

and you will see the newly created directory.

11. Two problems you may run into:

java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems
    at org.apache.hadoop.util.Shell.fileNotFoundException(Shell.java:549) ~[hadoop-common-3.1.1.jar:na]
 

Caused by: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset.
    at org.apache.hadoop.util.Shell.checkHadoopHomeInner(Shell.java:469) ~[hadoop-common-3.1.1.jar:na]
    at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:440) ~[hadoop-common-3.1.1.jar:na]
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:517) ~[hadoop-common-3.1.1.jar:na]
    ... 67 common frames omitted

org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=DrWho, access=WRITE, inode="hadoop":hadoop:supergroup:rwxr-xr-x
 

The errors above can be resolved as follows:

[root@localhost hadoop]# pwd
/usr/hadoop/hadoop-3.1.2/etc/hadoop
[root@localhost hadoop]# 
In that directory, edit hdfs-site.xml:

[root@localhost hadoop]# cat hdfs-site.xml 

and add the following property:

  <property>
    <name>dfs.permissions</name>
    <value>false</value>
    <description>
    If "true", enable permission checking in HDFS.
    If "false", permission checking is turned off,
    but all other behavior is unchanged.
    Switching from one parameter value to the other does not change the mode,
    owner or group of files or directories.
    </description>
  </property>

The complete file:

[root@localhost hadoop]# cat hdfs-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <property>
        <name>dfs.name.dir</name>
        <value>/usr/hadoop/hdfs/name</value>
        <description>Where the NameNode stores the HDFS namespace metadata</description>
    </property>

    <property>
        <name>dfs.data.dir</name>
        <value>/usr/hadoop/hdfs/data</value>
        <description>Physical storage location of data blocks on the DataNode</description>
    </property>

    <!-- HDFS replication factor -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>

    <property>
       <name>dfs.http.address</name>
       <value>0.0.0.0:50070</value>
    </property>

   <property>
	<name>dfs.permissions</name>
	<value>false</value>
	<description>
	If "true", enable permission checking in HDFS.
	If "false", permission checking is turned off,
	but all other behavior is unchanged.
	Switching from one parameter value to the other does not change the mode,
	owner or group of files or directories.
	</description>
  </property>

</configuration>
[root@localhost hadoop]# pwd
/usr/hadoop/hadoop-3.1.2/etc/hadoop

After that, the next error appears:

java.net.ConnectException: Call From A28150121040057/192.168.136.1 to 192.168.136.110:8020 failed on connection exception: java.net.ConnectException: Connection refused: no further information; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

Fix it as follows:

Edit this configuration file:

[root@localhost hadoop]# cat core-site.xml 

Change localhost in fs.defaultFS to the actual IP address and port: 192.168.136.110:8020

The complete file:

[root@localhost hadoop]# cat core-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- RPC address of the NameNode (the HDFS master); the default port is 8020 -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://192.168.136.110:8020</value>
    </property>
    <!-- Where Hadoop stores files it generates at runtime -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/hadoop/tmp</value>
    </property>
</configuration>
[root@localhost hadoop]# 

Also remember to disable the firewall on the Hadoop host (example commands below).
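
For example, on CentOS 7 with firewalld (adjust to your distribution):

systemctl stop firewalld
systemctl disable firewalld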

12. Moving on: the previous step created a directory.

Next, upload a file:

    /**
     * Create (upload) a file
     * @param path
     * @param file
     * @return
     * @throws Exception
     */
    @PostMapping("/createFile")
    public Result createFile(@RequestParam("path") String path, @RequestParam("file") MultipartFile file) throws Exception {
        if (StringUtils.isEmpty(path) || null == file.getBytes()) {
            Result result=new Result();
            result.setMessage("请求参数为空");
            result.setCode("500");
            result.setSuccess(false);
            return result;
        }
        String fileName = file.getOriginalFilename();
        FileSystem fs = HadoopUtil.getFileSystem();
        // the file goes under the given directory; the file name is appended to the path
        Path newPath = new Path(path + "/" + fileName);
        // open an output stream
        FSDataOutputStream outputStream = fs.create(newPath);
        outputStream.write(file.getBytes());
        outputStream.close();
        fs.close();
        Result result=new Result();
        result.setMessage("create file success");
        result.setCode("200");
        result.setSuccess(true);
        return result;
    }

Then send the request with Postman:

localhost:8040/hadoop/createFile

body:

[{"key":"path","value":"/test","description":"","type":"text","enabled":true}] 

[{"key":"file","value":{"0":{}},"description":"","type":"file","enabled":true}]

In other words, two form fields are sent: path, with the value /test,

and file, whose type in Postman must be set to File; choose any file and click Send to upload it (an equivalent curl command is sketched below).
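
A sketch of the same multipart request with curl, assuming a local file named test1.log in the current directory:

curl -X POST "http://localhost:8040/hadoop/createFile" -F "path=/test" -F "file=@test1.log"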

Then open

http://192.168.136.110:50070/explorer.html#/test

in a browser to see the uploaded file.

13. Read a file

  /**
     * Read the contents of an HDFS file
     * @param path
     * @return
     * @throws Exception
     */
    @PostMapping("/readFile")
    public Result readFile(@RequestParam("path") String path) throws Exception {
        FileSystem fs = HadoopUtil.getFileSystem();
        Path newPath = new Path(path);
        InputStream in = null;
        try {
            in = fs.open(newPath);
            IOUtils.copyBytes(in, System.out, 4096);

        } finally {
            IOUtils.closeStream(in);
            fs.close();
        }

        Result result=new Result();
        result.setMessage("读取成功");
        result.setCode("200");
        result.setSuccess(true);
        result.setData(System.out.toString());
        return result;
    }

Add this method, then call:

localhost:8040/hadoop/readFile

with path set to /test/test1.log.

Click Send. The call succeeds, but the actual contents are not visible in the response: IOUtils.copyBytes writes to System.out (a PrintStream on the server), and System.out.toString() does not capture that output, so some conversion is needed here.

In my run nothing seemed to appear on the console either. A sketch of a variant that returns the contents in the response body follows.
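
A minimal sketch of such a variant, assuming the file is small enough to buffer in memory and contains UTF-8 text. The /readFileContent endpoint and method name are made up for illustration (not from the original article), and it needs imports for java.io.ByteArrayOutputStream and java.io.InputStream:

    @PostMapping("/readFileContent")
    public Result readFileContent(@RequestParam("path") String path) throws Exception {
        FileSystem fs = HadoopUtil.getFileSystem();
        InputStream in = null;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            in = fs.open(new Path(path));
            // copy the HDFS stream into an in-memory buffer instead of System.out
            IOUtils.copyBytes(in, out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
            fs.close();
        }
        Result result = new Result();
        result.setMessage("read success");
        result.setCode("200");
        result.setSuccess(true);
        // assumes UTF-8 text; adjust the charset if needed
        result.setData(out.toString("UTF-8"));
        return result;
    }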

Note that the article I based this on did not use the Result class above when returning data; it used the two classes below, which I found online.

I paste them here in case they are useful:

    import com.sunvalley.hadoop.VO.ApiMsgEnum;

    import com.sunvalley.hadoop.VO.BaseReturnVO;

    E:\IdeaWkSpace\hadoopdemo\src\main\java\com\crehadoop\demo\core\BaseReturnVO.java

    package com.crehadoop.demo.core;
    
    import java.io.Serializable;
    
    public class BaseReturnVO implements Serializable {
        protected int resCode;
        protected String resDes;
        protected Object data;
    
        public int getResCode() {
            return this.resCode;
        }
    
        public void setResCode(int resCode) {
            this.resCode = resCode;
        }
    
        public String getResDes() {
            return this.resDes;
        }
    
        public void setResDes(String resDes) {
            this.resDes = resDes;
        }
    
        public Object getData() {
            return this.data;
        }
    
        public void setData(Object data) {
            this.data = data;
        }
    
        public BaseReturnVO() {
        }
    
        public BaseReturnVO(int code, String msg) {
            this.resCode = code;
            this.resDes = msg;
            this.data = "";
        }
    
        public BaseReturnVO(int code, Exception e) {
            this.resCode = code;
            this.resDes = e.getMessage();
            this.data = "";
        }
    
        public BaseReturnVO(int code, String msg, Exception e) {
            this.resCode = code;
            this.resDes = msg;
            this.data = "";
        }
    
        public BaseReturnVO(Object data) {
            this.resCode = ApiMsgEnum.OK.getResCode();
            this.resDes = ApiMsgEnum.OK.getResDes();
            this.data = data;
        }
    
        public BaseReturnVO(Exception exp) {
            this.resCode = 500;
            this.resDes = exp.getMessage();
            this.data = "";
        }
    
        public BaseReturnVO(ApiMsgEnum msgEnum) {
            this.resCode = msgEnum.getResCode();
            this.resDes = msgEnum.getResDes();
            this.data = "";
        }
    }
    

    E:\IdeaWkSpace\hadoopdemo\src\main\java\com\crehadoop\demo\core\ApiMsgEnum.java 

    package com.crehadoop.demo.core;
    
    import java.util.HashMap;
    import java.util.LinkedHashMap;
    import java.util.Map;
    
    public enum ApiMsgEnum {
        OK(200, "OK"),
        BAD_REQUEST(400, "Bad Request"),
        NOT_FOUND(404, "Not Found"),
        UNAUTHORIZED(401, "Unauthorized"),
        FORBIDDEN(403, "Forbidden"),
        INTERNAL_SERVER_ERROR(500, "Internal Server Error"),
        BAD_GATEWAY(502, "Bad Gateway"),
        SERVICE_UNAVAILABLE(503, "Service Unavailable"),
        GATEWAY_TIMEOUT(504, "Gateway Timeout"),
        COMMON_SERVER_ERROR(10000, "COMMON_SERVER_ERROR"),
        USER_SERVER_ERROR(11000, "USER_SERVER_ERROR"),
        USER_REDIS_ERROR(11001, "USER_REDIS_ERROR"),
        PRODUCT_SERVER_ERROR(12000, "PRODUCT_SERVER_ERROR"),
        PRODUCT_PARAMETER_ERROR(12001, "PRODUCT_PARAMETER_ERROR"),
        PRODUCT_BOM_ERROR(12002, "PRODUCT_SERVER_ERROR"),
        ORDER_SERVER_ERROR(13000, "ORDER_SERVER_ERROR"),
        CUSTOMER_SERVER_ERROR(13000, "CUSTOMER_SERVER_ERROR");
    
        private int resCode;
        private String resDes;
        public static Map<Integer, String> apiMsgMap = new HashMap();
    
        private ApiMsgEnum(int code, String msg) {
            this.resCode = code;
            this.resDes = msg;
        }
    
        private static Map<Integer, String> getAll() {
            Map<Integer, String> retMap = new LinkedHashMap<>();
            for (ApiMsgEnum aEnum : values()) {
                retMap.put(aEnum.getResCode(), aEnum.getResDes());
            }
            return retMap;
        }
    
        public int getResCode() {
            return this.resCode;
        }
    
        public void setResCode(int resCode) {
            this.resCode = resCode;
        }
    
        public String getResDes() {
            return this.resDes;
        }
    
        public void setResDes(String resDes) {
            this.resDes = resDes;
        }
    
        static {
            apiMsgMap = getAll();
        }
    }
    

    14. Read directory information

    Create a readPathInfo method that reads the entries under a directory (for example /demo):

        /**
         * Read HDFS directory information
         * @param path
         * @return
         * @throws Exception
         */
        @PostMapping("/readPathInfo")
        public BaseReturnVO readPathInfo(@RequestParam("path") String path) throws Exception {
            FileSystem fs = HadoopUtil.getFileSystem();
            Path newPath = new Path(path);
            FileStatus[] statusList = fs.listStatus(newPath);
            List<Map<String, Object>> list = new ArrayList<>();
            if (null != statusList && statusList.length > 0) {
                for (FileStatus fileStatus : statusList) {
                    Map<String, Object> map = new HashMap<>();
                    map.put("filePath", fileStatus.getPath());
                    map.put("fileStatus", fileStatus.toString());
                    list.add(map);
                }
                return new BaseReturnVO(list);
            } else {
                return new BaseReturnVO("目录内容为空");
            }
        }

    Test with Postman:

    localhost:8040/hadoop/readPathInfo

    path /test

    Result: it works:

    {
        "resCode": 200,
        "resDes": "OK",
        "data": [
            {
                "fileStatus": "HdfsNamedFileStatus{path=hdfs://192.168.136.110:8020/test/test1.log; isDirectory=false; length=1583; replication=3; blocksize=134217728; modification_time=1551349849105; access_time=1551353462724; owner=root; group=supergroup; permission=rw-r--r--; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false}",
                "filePath": {
                    "name": "test1.log",
                    "parent": {
                        "name": "test",
                        "parent": {
                            "name": "",
                            "parent": null,
                            "absolute": true,
                            "root": true,
                            "absoluteAndSchemeAuthorityNull": false,
                            "uriPathAbsolute": true
                        },
                        "absolute": true,
                        "root": false,
                        "absoluteAndSchemeAuthorityNull": false,
                        "uriPathAbsolute": true
                    },
                    "absolute": true,
                    "root": false,
                    "absoluteAndSchemeAuthorityNull": false,
                    "uriPathAbsolute": true
                }
            },
            {
                "fileStatus": "HdfsNamedFileStatus{path=hdfs://192.168.136.110:8020/test/test2.log; isDirectory=false; length=1583; replication=3; blocksize=134217728; modification_time=1551349922655; access_time=1551349922230; owner=root; group=supergroup; permission=rw-r--r--; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false}",
                "filePath": {
                    "name": "test2.log",
                    "parent": {
                        "name": "test",
                        "parent": {
                            "name": "",
                            "parent": null,
                            "absolute": true,
                            "root": true,
                            "absoluteAndSchemeAuthorityNull": false,
                            "uriPathAbsolute": true
                        },
                        "absolute": true,
                        "root": false,
                        "absoluteAndSchemeAuthorityNull": false,
                        "uriPathAbsolute": true
                    },
                    "absolute": true,
                    "root": false,
                    "absoluteAndSchemeAuthorityNull": false,
                    "uriPathAbsolute": true
                }
            },
            {
                "fileStatus": "HdfsNamedFileStatus{path=hdfs://192.168.136.110:8020/test/test3.log; isDirectory=false; length=1583; replication=3; blocksize=134217728; modification_time=1551349933463; access_time=1551349933035; owner=root; group=supergroup; permission=rw-r--r--; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false}",
                "filePath": {
                    "name": "test3.log",
                    "parent": {
                        "name": "test",
                        "parent": {
                            "name": "",
                            "parent": null,
                            "absolute": true,
                            "root": true,
                            "absoluteAndSchemeAuthorityNull": false,
                            "uriPathAbsolute": true
                        },
                        "absolute": true,
                        "root": false,
                        "absoluteAndSchemeAuthorityNull": false,
                        "uriPathAbsolute": true
                    },
                    "absolute": true,
                    "root": false,
                    "absoluteAndSchemeAuthorityNull": false,
                    "uriPathAbsolute": true
                }
            }
        ]
    }

    15. List files

    Create a listFile method that lists all the files under a directory (for example /demo):

        /**
         * List files
         * @param path
         * @return
         * @throws Exception
         */
        @PostMapping("/listFile")
        public BaseReturnVO listFile(@RequestParam("path") String path) throws Exception {
            if (StringUtils.isEmpty(path)) {
                return new BaseReturnVO("请求参数为空");
            }
            FileSystem fs = HadoopUtil.getFileSystem();
            Path newPath = new Path(path);
            // recursively list all files under the path
            RemoteIterator<LocatedFileStatus> filesList = fs.listFiles(newPath, true);
            List<Map<String, String>> returnList = new ArrayList<>();
            while (filesList.hasNext()) {
                LocatedFileStatus next = filesList.next();
                String fileName = next.getPath().getName();
                Path filePath = next.getPath();
                Map<String, String> map = new HashMap<>();
                map.put("fileName", fileName);
                map.put("filePath", filePath.toString());
                returnList.add(map);
            }
            fs.close();
            return new BaseReturnVO(returnList);
        }

    Test with Postman:

    localhost:8040/hadoop/listFile

    path /test

    It works:

    {
        "resCode": 200,
        "resDes": "OK",
        "data": [
            {
                "fileName": "test1.log",
                "filePath": "hdfs://192.168.136.110:8020/test/test1.log"
            },
            {
                "fileName": "test2.log",
                "filePath": "hdfs://192.168.136.110:8020/test/test2.log"
            },
            {
                "fileName": "test3.log",
                "filePath": "hdfs://192.168.136.110:8020/test/test3.log"
            }
        ]
    }

    16. Rename a file

    Create a renameFile method:

        /**
         * Rename a file
         * @param oldName
         * @param newName
         * @return
         * @throws Exception
         */
        @PostMapping("/renameFile")
        public BaseReturnVO renameFile(@RequestParam("oldName") String oldName, @RequestParam("newName") String newName) throws Exception {
            if (StringUtils.isEmpty(oldName) || StringUtils.isEmpty(newName)) {
                return new BaseReturnVO("请求参数为空");
            }
            FileSystem fs = HadoopUtil.getFileSystem();
            Path oldPath = new Path(oldName);
            Path newPath = new Path(newName);
            boolean isOk = fs.rename(oldPath, newPath);
            fs.close();
            if (isOk) {
                return new BaseReturnVO("rename file success");
            } else {
                return new BaseReturnVO("rename file fail");
            }
        }

    Test with Postman:

    localhost:8040/hadoop/renameFile

    oldName /test/test1.log

    newName /test/test01.log

    It works; check the result at:

    http://192.168.136.110:50070/explorer.html#/test

    17. Delete a file

    Create a deleteFile method:

        /**
         * Delete a file
         *
         * @param path
         * @return
         * @throws Exception
         */
        @PostMapping("/deleteFile")
        public BaseReturnVO deleteFile(@RequestParam("path") String path) throws Exception {
            FileSystem fs = HadoopUtil.getFileSystem();
            Path newPath = new Path(path);
            // deleteOnExit only marks the path for deletion; the actual delete happens when fs.close() is called below.
            // fs.delete(newPath, true) would delete it immediately (and recursively).
            boolean isOk = fs.deleteOnExit(newPath);
            fs.close();
            if (isOk) {
                return new BaseReturnVO("delete file success");
            } else {
                return new BaseReturnVO("delete file fail");
            }
        }
     

    Test with Postman:

    localhost:8040/hadoop/deleteFile

    path /test/test01.log

    It works; confirm at http://192.168.136.110:50070/explorer.html#/test

    18. Upload a local file to HDFS

    Create an uploadFile method to upload a file from the local disk (for example D:\hello.txt):

        /**
         * Upload a file
         *
         * @param path
         * @param uploadPath
         * @return
         * @throws Exception
         */
        @PostMapping("/uploadFile")
        public BaseReturnVO uploadFile(@RequestParam("path") String path, @RequestParam("uploadPath") String uploadPath) throws Exception {
            FileSystem fs = HadoopUtil.getFileSystem();
            // local source path
            Path clientPath = new Path(path);
            // destination path on HDFS
            Path serverPath = new Path(uploadPath);

            // copy from the local file system to HDFS; the first argument controls whether the source file is deleted (true deletes it, default is false)
            fs.copyFromLocalFile(false, clientPath, serverPath);
            fs.close();
            return new BaseReturnVO("upload file success");
        }

    Test with Postman:

    localhost:8040/hadoop/uploadFile

    path E:/test1.log

    uploadPath /test

    It works; view the uploaded file at http://192.168.136.110:50070/explorer.html#/test

19. Download an HDFS file to the local machine

Create a downloadFile method that downloads a file from HDFS into a local folder (for example D:\hdfs):

/**
     * Download a file
     * @param path
     * @param downloadPath
     * @return
     * @throws Exception
     */
    @PostMapping("/downloadFile")
    public BaseReturnVO downloadFile(@RequestParam("path") String path, @RequestParam("downloadPath") String downloadPath) throws Exception {
        FileSystem fs = HadoopUtil.getFileSystem();
        // source path on HDFS
        Path clientPath = new Path(path);
        // local destination path
        Path serverPath = new Path(downloadPath);

        // copy from HDFS to the local file system; the first argument controls whether the source file is deleted (true deletes it, default is false)
        fs.copyToLocalFile(false, clientPath, serverPath);
        fs.close();
        return new BaseReturnVO("download file success");
    }

Test:

localhost:8040/hadoop/downloadFile

path  /test/test1.log

downloadPath  D:\\

This time it throws an error:

java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset.
    at org.apache.hadoop.util.Shell.checkHadoopHomeInner(Shell.java:469) ~[hadoop-common-3.1.1.jar:na]
    at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:440) ~[hadoop-common-3.1.1.jar:na]
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:517) ~[hadoop-common-3.1.1.jar:na]
 

Here is a solution I found online:

When I ran this Java program, it failed with "HADOOP_HOME and hadoop.home.dir are unset", and I was baffled.

Steps to fix it

A side note: I had always assumed my local Eclipse merely acted as a client calling the remote Hadoop, so there was no need to install Hadoop on the local Windows machine. So when I saw HADOOP_HOME I could not understand it: do I really need to install Hadoop locally?

The answer is: yes. My problem was exactly that I had no Hadoop on the local machine. It does not need to be installed or started, and you do not need Cygwin as some posts suggest (which, as an aside, just simulates a Linux-like environment on Windows and is of no help here). Simply take a Hadoop binary package (around 400 MB on the official site) and unpack it somewhere on your Windows machine.

The key point: unpacking the package alone is not enough. You also need hadoop.dll and winutils.exe together with their companions (hadoop.exp, hadoop.lib, hadoop.pdb, libwinutils.lib, winutils.pdb). Missing them causes errors; I had already put them in place before my first run, so I do not know exactly which error you get without them.

Back to the local Hadoop package: unpack it, set the HADOOP_HOME system environment variable to the root of the unpacked package (on my machine, F:\work\software\linux\hadoop-2.8.0), and append %HADOOP_HOME%\bin to the Path system variable (it must point down to the bin level).

You can also add %HADOOP_HOME%\sbin. After that you should be able to run Hadoop programs (such as WordCount) on Windows; if it still fails, try rebooting.

For some people (me included), the configuration alone is not enough and a reboot is required.

After that there may still be some missing-jar errors; find the jars online, add them, and everything is fine.

Thoughts

The local Hadoop really does no work at all; I never even started it. Yet a series of directories appeared at the root of the drive where the local Hadoop lives. I am not sure why, but my guess is that they mainly serve as a temporary cache.

Creating that cache path is presumably driven by Hadoop's own settings, or more precisely by MapReduce's, which require a cache path. So the point of setting HADOOP_HOME on Windows seems to be to serve that cache.

Likewise, when a WordCount jar is executed on a Linux server, a similar cache path should be created on the server.

After finishing the configuration above, reboot and try again. (An alternative, programmatic workaround is sketched below.)
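
If you prefer not to touch the system environment variables, a workaround that is often suggested (a sketch, not from the original article; the path below is a placeholder for wherever you unpacked Hadoop, and bin\winutils.exe must exist under it) is to set the hadoop.home.dir system property before the first FileSystem call, for example in a static block of the Spring Boot main class:

    static {
        // hypothetical local path; the directory must contain bin\winutils.exe
        System.setProperty("hadoop.home.dir", "D:\\hadoop-3.1.1");
    }

For the download case specifically, the four-argument copyToLocalFile overload with useRawLocalFileSystem set to true writes through the raw local file system (no .crc checksum files), which in many setups avoids the winutils dependency:

        // last argument: useRawLocalFileSystem
        fs.copyToLocalFile(false, clientPath, serverPath, true);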

20. Copy a file within HDFS

Create a copyFile method to copy, for example, /java/test.txt to /demo/test.txt:

/**
     * Copy a file within HDFS
     * @param sourcePath
     * @param targetPath
     * @return
     * @throws Exception
     */
    @PostMapping("/copyFile")
    public BaseReturnVO copyFile(@RequestParam("sourcePath") String sourcePath, @RequestParam("targetPath") String targetPath) throws Exception {
        FileSystem fs = HadoopUtil.getFileSystem();
        // source file path
        Path oldPath = new Path(sourcePath);
        // target path
        Path newPath = new Path(targetPath);

        FSDataInputStream inputStream = null;
        FSDataOutputStream outputStream = null;
        try {
            inputStream = fs.open(oldPath);
            outputStream = fs.create(newPath);

            IOUtils.copyBytes(inputStream, outputStream, 1024*1024*64,false);
            return new BaseReturnVO("copy file success");
        } finally {
            // closeStream is null-safe, unlike calling close() directly (fs.open may have failed before the streams were assigned)
            IOUtils.closeStream(inputStream);
            IOUtils.closeStream(outputStream);
            fs.close();
        }
    }
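
As an aside, the same HDFS-to-HDFS copy can also be done with Hadoop's FileUtil helper instead of copying the streams by hand (a sketch, not from the original article; deleteSource=false keeps the source file):

        boolean copied = org.apache.hadoop.fs.FileUtil.copy(fs, oldPath, fs, newPath, false, fs.getConf());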

Test:

localhost:8040/hadoop/copyFile

sourcePath /test/test1.log

targetPath /demo/test1.log

Check http://192.168.136.110:50070/explorer.html#/demo; the copy is there.


 
