Hadoop Self-Study Diary 3: Hadoop HDFS Operations

Build environment

I use a plain Windows 7 laptop and VirtualBox to create the CentOS virtual machines on which Hadoop is installed.

VirtualBox:6.0.8 r130520 (Qt5.6.2)
CentOS:CentOS Linux release 7.6.1810 (Core)
jdk:1.8.0_202
hadoop:2.6.5

Cluster environment

  • One master node, serving as the NameNode in HDFS and the ResourceManager in YARN.
  • Three data nodes (data1, data2, and data3), each serving as a DataNode in HDFS and a NodeManager in YARN.

name     ip              hdfs      yarn
master   192.168.37.200  NameNode  ResourceManager
data1    192.168.37.201  DataNode  NodeManager
data2    192.168.37.202  DataNode  NodeManager
data3    192.168.37.203  DataNode  NodeManager

Basic commands

There are two command forms for operating HDFS. One is hadoop fs, which works not only on HDFS but on other file systems as well; the other is hdfs dfs, which is specific to the HDFS distributed file system.

Here I use hadoop fs, the more broadly applicable form, in the examples below.
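
For example, when the target file system is HDFS, the two forms are interchangeable; a quick illustration using the root path:

[root@master ~]# hadoop fs -ls /
[root@master ~]# hdfs dfs -ls /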

1. View a directory

Use the -ls command to view a directory:

[root@master ~]# hadoop fs -ls /

2. Create a directory

Use the -mkdir command to create a directory:

[root@master ~]# hadoop fs -mkdir /rawdata

View the files in the directory:

[root@master ~]# hadoop fs -ls /
Found 1 items
drwxr-xr-x   - root supergroup          0 2019-07-25 10:43 /rawdata

Create a multi-level directory:

[root@master ~]# hadoop fs -mkdir -p /a/b/c

View all subdirectories:

[root@master ~]# hadoop fs -ls -R /
drwxr-xr-x   - root supergroup          0 2019-07-25 11:02 /a
drwxr-xr-x   - root supergroup          0 2019-07-25 11:02 /a/b
drwxr-xr-x   - root supergroup          0 2019-07-25 11:02 /a/b/c
drwxr-xr-x   - root supergroup          0 2019-07-25 10:43 /rawdata

3. Upload local files

There are two commands for this: -put and -copyFromLocal.
The difference: with -put, if the target file already exists, no warning is shown and the existing file is simply overwritten; with -copyFromLocal, if the target file already exists, the command reports that it exists and the upload fails. In addition, -put can read from standard input. The following uses -put as an example:

  • Upload a local file to a specified HDFS directory:
[root@master ~]# hadoop fs -put /software/hadoop-2.6.5.tar.gz /a/b/c/
[root@master ~]# hadoop fs -ls /a/b/c/
Found 1 items
-rw-r--r--   3 root supergroup  199635269 2019-07-25 11:35 /a/b/c/hadoop-2.6.5.tar.gz
  • Upload multiple files at once:
[root@master ~]# hadoop fs -put /home/a /home/b /home/c /home/d /a/b/c
[root@master ~]# hadoop fs -ls /a/b/c
Found 5 items
-rw-r--r--   3 root supergroup          0 2019-07-25 11:38 /a/b/c/a
-rw-r--r--   3 root supergroup          0 2019-07-25 11:38 /a/b/c/b
-rw-r--r--   3 root supergroup          0 2019-07-25 11:38 /a/b/c/c
-rw-r--r--   3 root supergroup          0 2019-07-25 11:38 /a/b/c/d
-rw-r--r--   3 root supergroup  199635269 2019-07-25 11:35 /a/b/c/hadoop-2.6.5.tar.gz
  • Standard input:

The output of the ls command can be piped directly into an HDFS file:

[root@master ~]# ls /software/hadoop-2.6.5/ |hadoop fs -put - /a/b/c/h.txt

Use -cat to view the file content:

[root@master ~]# hadoop fs -cat /a/b/c/h.txt
bin
etc
hadoop_data
hdfs
include
lib
libexec
LICENSE.txt
logs
NOTICE.txt
README.txt
sbin
share
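
For comparison, a minimal sketch of the -copyFromLocal behavior described above, reusing the /home/a example file from before:

[root@master ~]# hadoop fs -copyFromLocal /home/a /a/b/c/
# /a/b/c/a already exists (uploaded earlier with -put), so the command
# reports that the file exists and the upload fails.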

4. Download HDFS files

There are two commands for this: -get and -copyToLocal.

The following uses -get as an example.

Download a file from HDFS to the local file system:

[root@master ~]# hadoop fs -get /a/b/c/a ./
[root@master ~]# ll
total 4
-rw-r--r--. 1 root root    0 Jul 25 14:39 a
-rw-------. 1 root root 1204 Jul 17 14:49 anaconda-ks.cfg
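
-copyToLocal works the same way in this direction; a minimal equivalent sketch:

[root@master ~]# hadoop fs -copyToLocal /a/b/c/h.txt ./
# Copies the HDFS file /a/b/c/h.txt into the current local directory.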

5. Copy and delete HDFS files

Use the -cp command to copy HDFS files and the -rm command to delete them.

The -cp command can copy the contents of an entire folder:

[root@master ~]# hadoop fs -cp /a/b/c /rawdata
[root@master ~]# hadoop fs -ls -R /rawdata
drwxr-xr-x   - root supergroup          0 2019-07-25 14:45 /rawdata/c
-rw-r--r--   3 root supergroup          0 2019-07-25 14:44 /rawdata/c/a
-rw-r--r--   3 root supergroup          0 2019-07-25 14:44 /rawdata/c/b
-rw-r--r--   3 root supergroup          0 2019-07-25 14:44 /rawdata/c/c
-rw-r--r--   3 root supergroup          0 2019-07-25 14:44 /rawdata/c/d
-rw-r--r--   3 root supergroup         95 2019-07-25 14:44 /rawdata/c/h.txt
-rw-r--r--   3 root supergroup  199635269 2019-07-25 14:45 /rawdata/c/hadoop-2.6.5.tar.gz

The -rm command deletes HDFS files. To delete a folder, you must add -R:

[root@master ~]# hadoop fs -rm -R /rawdata/*
19/07/25 14:49:08 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /rawdata/c
[root@master ~]# hadoop fs -ls -R /rawdata
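
The INFO line above shows that the deletion went through HDFS's trash policy (here with a retention interval of 0 minutes, i.e. immediate deletion). As a small sketch, -rm also accepts a -skipTrash option to bypass the trash entirely:

[root@master ~]# hadoop fs -rm -R -skipTrash /a/b/c
# Deletes /a/b/c permanently without moving it to a trash directory.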

Using the web interface to operate HDFS

The HDFS management UI runs on port 50070 of the NameNode and can be opened in a web browser; in this cluster:
http://192.168.37.200:50070/

After the home page opens, select "Browse the file system". Enter a directory path in the browse box to see the files under that path. Click a file name link to view the file's detailed status; a download link is provided there as well.
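
The same NameNode port also exposes the WebHDFS REST API, so listings can be scripted without a browser. A minimal sketch, assuming WebHDFS is enabled (the dfs.webhdfs.enabled setting, which defaults to true in this Hadoop line):

[root@master ~]# curl -i "http://192.168.37.200:50070/webhdfs/v1/a/b/c?op=LISTSTATUS"
# Returns the listing of /a/b/c as JSON.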

Origin blog.csdn.net/sunbocong/article/details/97150585