Big Data Operations: HDFS

HDFS questions:

  1. Recursively create the directory "1daoyun/file" under the root of the HDFS filesystem, upload the attached BigDataSkills.txt file into the 1daoyun/file directory, and use the appropriate command to list the files in the 1daoyun/file directory.

[root@master ~]# su hdfs
[hdfs@master ~]$ hadoop fs -mkdir -p /1daoyun/file
[hdfs@master ~]$ hadoop fs -put /opt/BigDataSkills.txt /1daoyun/file
[hdfs@master ~]$ hadoop fs -ls /1daoyun/file/

Found 1 items
-rw-r--r-- 3 hdfs hdfs 144 2019-05-03 13:44 /1daoyun/file/BigDataSkills.txt
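
Note that hadoop fs and hdfs dfs are interchangeable entry points for these filesystem operations (hdfs dfs is the newer form). The same steps, as a sketch:

[hdfs@master ~]$ hdfs dfs -mkdir -p /1daoyun/file
[hdfs@master ~]$ hdfs dfs -put /opt/BigDataSkills.txt /1daoyun/file
[hdfs@master ~]$ hdfs dfs -ls /1daoyun/file/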

  2. Recursively create the directory "1daoyun/file" under the root of the HDFS filesystem, upload the attached BigDataSkills.txt file into the 1daoyun/file directory, and use the HDFS filesystem check utility to verify whether the file is damaged.
    [root@master ~]# su hdfs
    [hdfs@master ~]$ hadoop fs -mkdir -p /1daoyun/file
    [hdfs@master ~]$ hadoop fs -put /opt/BigDataSkills.txt /1daoyun/file
    [hdfs@master ~]$ hadoop fsck /1daoyun/file/BigDataSkills.txt
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.

Connecting to namenode via http://master.hadoop:50070/fsck?ugi=hdfs&path=%2F1daoyun%2Ffile%2FBigDataSkills.txt
FSCK started by hdfs (auth:SIMPLE) from /10.0.0.103 for path /1daoyun/file/BigDataSkills.txt at Fri May 03 14:29:36 UTC 2019
.
/1daoyun/file/BigDataSkills.txt: Under replicated BP-1109077204-10.0.6.135-1524734809311:blk_1073742063_1239. Target Replicas is 3 but found 2 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
Status: HEALTHY
Total size: 144 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 1 (avg. block size 144 B)
Minimally replicated blocks: 1 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 1 (100.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 1 (33.333332 %)
Number of data-nodes: 2
Number of racks: 1
FSCK ended at Fri May 03 14:29:36 UTC 2019 in 0 milliseconds

The filesystem under path '/1daoyun/file/BigDataSkills.txt' is HEALTHY
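
fsck reports the block as under-replicated here because the target replication factor is 3 while the cluster has only 2 DataNodes, so one replica cannot be placed; this is expected on a two-node cluster, and the file itself is healthy. To inspect where each replica actually lives, fsck accepts additional flags, e.g. (hdfs fsck is the non-deprecated form of hadoop fsck):

[hdfs@master ~]$ hdfs fsck /1daoyun/file/BigDataSkills.txt -files -blocks -locations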

  3. Recursively create the directory "1daoyun/file" under the root of the HDFS filesystem and upload the attached BigDataSkills.txt file into the 1daoyun/file directory, specifying a replication factor of 2 for BigDataSkills.txt during the upload, then use the fsck utility to check the number of replicas of the stored block.
    [root@master ~]# su hdfs
    [hdfs@master ~]$ hadoop fs -mkdir -p /1daoyun/file
    [hdfs@master ~]$ hadoop fs -D dfs.replication=2 -put /opt/BigDataSkills.txt /1daoyun/file
    [hdfs@master ~]$ hadoop fsck /1daoyun/file/BigDataSkills.txt
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.

Connecting to namenode via http://master.hadoop:50070/fsck?ugi=hdfs&path=%2F1daoyun%2Ffile%2FBigDataSkills.txt
FSCK started by hdfs (auth:SIMPLE) from /10.0.0.103 for path /1daoyun/file/BigDataSkills.txt at Sat May 04 10:04:11 UTC 2019
.Status: HEALTHY
Total size: 144 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 1 (avg. block size 144 B)
Minimally replicated blocks: 1 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 2
Number of racks: 1
FSCK ended at Sat May 04 10:04:11 UTC 2019 in 1 milliseconds

The filesystem under path '/1daoyun/file/BigDataSkills.txt' is HEALTHY
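
With the upload above, the replication factor can also be read without fsck: for a file, the second column of an ls listing is its replication factor, which should now show 2 instead of the default 3:

[hdfs@master ~]$ hadoop fs -ls /1daoyun/file/BigDataSkills.txt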

  4. A directory /apps exists under the root of the HDFS filesystem. Enable snapshot creation on this directory, create a snapshot of it named apps_1daoyun, and use the appropriate command to list the snapshot.
    [hdfs@master ~]$ hadoop dfsadmin -allowSnapshot /apps
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.

Allowing snaphot on /apps succeeded
[hdfs@master ~]$ hadoop fs -createSnapshot /apps apps_1daoyun
Created snapshot /apps/.snapshot/apps_1daoyun
[hdfs@master ~]$ hadoop fs -ls /apps/.snapshot
Found 1 items
drwxrwxrwx - hdfs hdfs 0 2019-05-04 10:16 /apps/.snapshot/apps_1daoyun
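
Two related snapshot commands, shown as a sketch: hdfs lsSnapshottableDir lists every directory on which snapshots have been enabled, and -deleteSnapshot removes a named snapshot once it is no longer needed:

[hdfs@master ~]$ hdfs lsSnapshottableDir
[hdfs@master ~]$ hdfs dfs -deleteSnapshot /apps apps_1daoyun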

  5. When a Hadoop cluster starts up, it first enters safe mode, which it leaves after 30 seconds by default (controlled by dfs.namenode.safemode.extension). While the system is in safe mode, the HDFS filesystem can only be read; writes, modifications, and deletions are not possible. Suppose the cluster now needs maintenance: put it into safe mode and check its status.
    [hdfs@master ~]$ hadoop dfsadmin -safemode enter
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.

Safe mode is ON
[hdfs@master ~]$ hadoop dfsadmin -safemode get
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Safe mode is ON
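
Once maintenance is finished, safe mode is released with the leave option (wait, by contrast, blocks until the NameNode leaves safe mode on its own). Using the non-deprecated hdfs form:

[hdfs@master ~]$ hdfs dfsadmin -safemode leave
[hdfs@master ~]$ hdfs dfsadmin -safemode get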

  6. To guard against accidental deletions by operators, HDFS provides a trash (recycle bin) feature, but too many trash files can occupy a large amount of storage. In the web UI of the XianDian (先电) big data platform, set the interval after which files in the HDFS trash are permanently deleted to 7 days. fs.trash.interval is specified in minutes, so 7 days = 7 × 24 × 60 = 10080 minutes.

    Advanced core-site
    fs.trash.interval
    10080
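
For reference, this web UI field maps directly to the fs.trash.interval property in core-site.xml, so the same 7-day setting expressed as a hand-edited property (the form used in the next question) would be:

    <property>
    <name>fs.trash.interval</name>
    <value>10080</value>
    </property>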

  7. To guard against accidental deletions by operators, HDFS provides a trash feature, but too many trash files can occupy a large amount of storage. In a Linux shell, use the "vi" command to edit the relevant configuration file and parameter so as to disable the trash feature. When finished, restart the affected services.
    [root@master ~]# vi /etc/hadoop/2.6.1.0-129/0/core-site.xml
    <property>
    <name>fs.trash.interval</name>
    <value>0</value>
    </property>

[root@master ~]# su hdfs
[hdfs@master ~]$ /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh stop namenode
stopping namenode
[hdfs@master ~]$ /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-master.out
[hdfs@master ~]$ /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh stop datanode
stopping datanode
[hdfs@master ~]$ /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-datanode-master.out
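
After the restart, the effective value can be confirmed from the client side with hdfs getconf, which prints the value a configuration key resolves to and should now report 0:

[hdfs@master ~]$ hdfs getconf -confKey fs.trash.interval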

  8. Hosts in a Hadoop cluster may crash or suffer system damage, and when that happens data files in the HDFS filesystem can end up corrupted or lost. To keep the HDFS filesystem reliable, set the cluster's replication factor to 5 in the web UI of the XianDian (先电) big data platform.

    General
    Block replication 5

  9. Hosts in a Hadoop cluster may crash or suffer system damage, and when that happens data files in the HDFS filesystem can end up corrupted or lost. To keep the HDFS filesystem reliable, the cluster's replication factor needs to be changed to 5: in a Linux shell, use the "vi" command to edit the relevant configuration file and parameter, then restart the affected services when finished.
    [root@master ~]# vi /etc/hadoop/2.6.1.0-129/0/hdfs-site.xml
    <property>
    <name>dfs.replication</name>
    <value>5</value>
    </property>
    [hdfs@master ~]$ /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh stop namenode
    stopping namenode
    [hdfs@master ~]$ /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start namenode
    starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-master.out
    [hdfs@master ~]$ /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh stop datanode
    stopping datanode
    [hdfs@master ~]$ /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start datanode
    starting datanode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-datanode-master.out
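
Note that dfs.replication is only a default applied to newly written files; files that already exist keep their previous factor. As a sketch, raising existing files to the new factor uses setrep, which recurses when given a directory (adding -w would wait for re-replication to complete, and will block while the cluster has fewer than 5 DataNodes):

    [hdfs@master ~]$ hadoop fs -setrep 5 /1daoyun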

  10. Use a command to display the number of directories, the number of files, and the total size of files under the /tmp directory of the HDFS filesystem.
[hdfs@master ~]$ hadoop fs -count /tmp
13 1 2073 /tmp
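
The four output columns of -count are DIR_COUNT, FILE_COUNT, CONTENT_SIZE, and PATHNAME, so /tmp here holds 13 directories and 1 file totalling 2073 bytes. On larger trees, the -h flag prints the size column in human-readable units:

[hdfs@master ~]$ hadoop fs -count -h /tmp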

Reposted from blog.csdn.net/mn525520/article/details/93773021