Hadoop for beginners: using the HDFS command-line client

1. Introduction to HDFS and its basic concepts

HDFS (Hadoop Distributed File System) is the storage component of the Hadoop ecosystem. It is the most fundamental part of Hadoop, because MapReduce and the other computation models all depend on data stored in HDFS. HDFS is a distributed file system designed to store very large files with a streaming data access pattern; it splits files into blocks and stores those blocks on different machines in a cluster of commodity hardware.

2. Experimental environment

Operating system: Ubuntu (64-bit)
Hadoop version: Hadoop 2.7.1
JDK version: jdk-8u241-linux-x64

3. Practice content

Note: Hadoop provides three forms of shell command for file system operations:

  1. hadoop fs    # works with any supported file system, such as the local file system and HDFS
  2. hadoop dfs   # works only with HDFS (deprecated in Hadoop 2.x in favor of hdfs dfs)
  3. hdfs dfs     # same function as hadoop dfs; works only with HDFS
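
Which of these forms is usable depends on what is on the PATH. The following is a minimal sketch, not part of the original tutorial, that reports which client is available. The function name pick_hdfs_client is made up for illustration, and the hint about /usr/local/hadoop/bin assumes the install location used later in this article.

```shell
# Print the preferred client invocation available on this machine.
# Prefers "hdfs dfs", falls back to "hadoop fs", and prints "none"
# if neither executable is on PATH. (Hypothetical helper.)
pick_hdfs_client() {
    if command -v hdfs >/dev/null 2>&1; then
        echo "hdfs dfs"
    elif command -v hadoop >/dev/null 2>&1; then
        echo "hadoop fs"
    else
        echo "none"
    fi
}

client=$(pick_hdfs_client)
if [ "$client" = "none" ]; then
    # Assumption: Hadoop is unpacked under /usr/local/hadoop.
    echo "No Hadoop client found; try adding /usr/local/hadoop/bin to PATH."
else
    echo "Using: $client"
fi
```

The sketch only reports the client; it does not contact HDFS, so it is safe to run before Hadoop is started.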

1. Before running HDFS commands, start Hadoop

command:

cd /usr/local/hadoop
./sbin/start-dfs.sh    # start Hadoop (the HDFS daemons)
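
A common way to confirm that the daemons came up is to look at the output of jps, the JDK tool that lists running JVMs as "PID Name" pairs. The sketch below checks such output for the daemons that start-dfs.sh launches. The helper name has_daemon and the sample PIDs are made up so that the example runs without Hadoop.

```shell
# Succeeds if the given jps output lists a JVM whose name is exactly $2.
# (Hypothetical helper; jps prints one "PID Name" pair per line.)
has_daemon() {
    printf '%s\n' "$1" | awk -v name="$2" '$2 == name { found = 1 } END { exit !found }'
}

# Sample output with made-up PIDs. On a real machine, use instead:
#   jps_output=$(jps)
jps_output='4821 NameNode
4990 DataNode
5203 SecondaryNameNode
5377 Jps'

for daemon in NameNode DataNode SecondaryNameNode; do
    if has_daemon "$jps_output" "$daemon"; then
        echo "$daemon is running"
    else
        echo "$daemon is NOT running"
    fi
done
```

The comparison is an exact match on the second column, so NameNode is not confused with SecondaryNameNode.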

2. Hadoop supports many shell commands, of which fs is the one most commonly used with HDFS. With fs you can view the directory structure of the HDFS file system, upload and download data, and create files.

command:

hadoop fs    # list all the commands that fs supports

3. View what a specific command does

command:

hadoop fs -help put    # show what the put command does

4. Use Shell commands to interact with HDFS

(1) Directory operation

① When using HDFS for the first time, you need to create a user directory in HDFS first.

command:

cd /usr/local/hadoop
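hdfs dfs -mkdir -p /user/hadoop    # create the "/user/hadoop" directory in HDFS; "-mkdir" creates a directory, and "-p" creates parent and child directories together for a multi-level path. "/user/hadoop" is a multi-level path, so use "-p"; without it the command fails if the parent "/user" does not exist yet.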

② List the contents of the HDFS user directory of the current user, hadoop

command:

hdfs dfs -ls .    # "-ls" lists everything in an HDFS directory; "." is the current user's directory in HDFS, that is, "/user/hadoop"

Equivalent command:

hdfs dfs -ls /user/hadoop

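③ List directories on HDFS

command:

hdfs dfs -ls    # with no path, this lists the current user's directory ("/user/hadoop"); to list every directory on HDFS, use: hdfs dfs -ls -R /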

④ Create an input directory

command:

hdfs dfs -mkdir input    # input is a relative path here; once created, its full path in HDFS is "/user/hadoop/input"

hdfs dfs -mkdir /input    # create a directory named input under the HDFS root directory
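
The difference between the relative path input and the absolute path /input can be sketched as a small path-resolution function. This is only an illustration of the rule described in this step (relative paths land under /user/<user name>); the function name hdfs_abs_path is hypothetical and the sketch does not talk to HDFS.

```shell
# Resolve a path the way this tutorial describes: absolute paths are
# kept, relative ones are placed under /user/<user>.
# $1 = path, $2 = HDFS user name (defaults to the current login name).
hdfs_abs_path() {
    hpath=$1
    huser=${2:-$(id -un)}
    case "$hpath" in
        /*) echo "$hpath" ;;
        .)  echo "/user/$huser" ;;
        *)  echo "/user/$huser/$hpath" ;;
    esac
}

hdfs_abs_path input hadoop     # prints /user/hadoop/input
hdfs_abs_path /input hadoop    # prints /input
hdfs_abs_path . hadoop         # prints /user/hadoop
```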

⑤ Use the rm command to delete a directory

command:

hdfs dfs -rm -r /input    # delete the "/input" directory just created in HDFS (not "/user/hadoop/input"); "-r" deletes "/input" and everything under it. Deleting a directory requires "-r"; without it the command fails.
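
Because -rm -r removes a whole directory tree, it can help to check the path before running the command. The following is a dry-run sketch under that assumption: the hypothetical helper hdfs_rm_dir_cmd only prints the command it would run and refuses empty, relative, and root paths. It does not handle paths that contain spaces.

```shell
# Print the delete command for an HDFS directory, refusing paths that
# are empty, relative, or the root itself. Printing instead of
# executing keeps this a dry run. (Hypothetical helper.)
hdfs_rm_dir_cmd() {
    case "$1" in
        ""|/)  echo "refusing to delete '$1'" >&2; return 1 ;;
        /*)    echo "hdfs dfs -rm -r $1" ;;
        *)     echo "refusing relative path '$1'" >&2; return 1 ;;
    esac
}

hdfs_rm_dir_cmd /input    # prints: hdfs dfs -rm -r /input
hdfs_rm_dir_cmd / || echo "root is protected"
```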

(2) File operation

① Create a local file

command:

cd /usr/local/hadoop
touch myLocalFile.txt

② Edit the file

command:

vim myLocalFile.txt

③ Upload a file from the local file system to the current user's directory in HDFS

command:

hdfs dfs -put /usr/local/hadoop/myLocalFile.txt input    # upload the local file "/usr/local/hadoop/myLocalFile.txt" to the input directory under the current user's directory in HDFS, that is, to "/user/hadoop/input/"
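
The local path given to -put must match the file's name exactly, including letter case, because Linux file names are case-sensitive. A minimal pre-upload check might look like the sketch below. The helper name require_local_file is made up, and the demonstration uses a throwaway directory so that it runs without Hadoop.

```shell
# Succeeds only if $1 names an existing regular file.
# (Hypothetical helper; prints a message to stderr on failure.)
require_local_file() {
    if [ -f "$1" ]; then
        return 0
    fi
    echo "local file not found: $1" >&2
    return 1
}

# Demonstration in a temporary directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/myLocalFile.txt"

require_local_file "$demo_dir/myLocalFile.txt" && echo "ok to upload"
# On a case-sensitive file system this second name does not exist.
require_local_file "$demo_dir/mylocalfile.txt" || echo "name differs in case, nothing to upload"

rm -rf "$demo_dir"
```

In practice the check would be chained in front of the upload, for example: require_local_file /usr/local/hadoop/myLocalFile.txt && hdfs dfs -put /usr/local/hadoop/myLocalFile.txt input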

④ View the contents of a file in HDFS

command:

hdfs dfs -cat input/myLocalFile.txt

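⑤ Download a file from HDFS to the local file system

command:

hdfs dfs -get input/myLocalFile.txt /home/hadoop/下载    # download myLocalFile.txt from HDFS to the local directory "/home/hadoop/下载/" ("下载" is the Downloads folder on a Chinese-locale desktop)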

Go to the local file system to view the downloaded file myLocalFile.txt

command:

cd ~
cd 下载
ls
cat myLocalFile.txt
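
Beyond reading the file with cat, the upload and download round trip can be checked by comparing the original and the downloaded copy byte for byte. The sketch below uses the standard cmp utility; the helper name same_content and the temporary files are illustrative only.

```shell
# Succeeds if the two local files have byte-identical content.
# (Hypothetical helper built on the standard cmp utility.)
same_content() {
    cmp -s "$1" "$2"
}

# On the tutorial's paths (requires the earlier -put and -get):
#   same_content /usr/local/hadoop/myLocalFile.txt ~/下载/myLocalFile.txt \
#       && echo "round trip OK"

# Self-contained demonstration with temporary files.
tmp=$(mktemp -d)
printf 'hello hdfs\n' > "$tmp/original.txt"
cp "$tmp/original.txt" "$tmp/downloaded.txt"
same_content "$tmp/original.txt" "$tmp/downloaded.txt" && echo "round trip OK"
rm -rf "$tmp"
```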

⑥ Copy files from one directory in HDFS to another directory in HDFS

command:

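hdfs dfs -cp input/myLocalFile.txt /input    # copy "/user/hadoop/input/myLocalFile.txt" to another HDFS directory, "/input" (note that this input directory is under the HDFS root). If "/input" was deleted earlier, recreate it first with hdfs dfs -mkdir /input; otherwise the copy is written as a file named "/input".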

(3) Other commonly used shell commands

① Print the help text for a command

command:

hdfs dfs -help command_name

Examples:

hdfs dfs -help cat

② Upload files

command:

hdfs dfs -put file /hdfsfile    # file: local file path; /hdfsfile: HDFS file path

Examples:

hdfs dfs -put /usr/local/hadoop/myLocalFile.txt input

③ Cut a local file (move it into HDFS)

command:

hdfs dfs -moveFromLocal file /hdfsfile    # file: local file path; /hdfsfile: HDFS file path

Examples:

hdfs dfs -moveFromLocal /usr/local/hadoop/myLocalFile.txt input

④ Download a file to the local file system

command:

hdfs dfs -get /hdfsfile file    # /hdfsfile: HDFS file path; file: local path

Examples:

hdfs dfs -get input/myLocalFile.txt /usr/local/hadoop

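⑤ Merge download

command:

hdfs dfs -getmerge /hdfsdir file    # /hdfsdir: HDFS directory whose files are merged; file: single local output file

Examples:

hdfs dfs -getmerge input /usr/local/hadoop/merged.txt    # merged.txt is an illustrative output name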
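
To make the effect of a merge download concrete, here is a local stand-in that concatenates every file in a directory into one output file. It is only an analogy for -getmerge, not its implementation: the helper name merge_dir and the part-file names are made up, and the sorted-by-name order is an assumption about how the merged output is arranged.

```shell
# Local stand-in for a merge download: concatenate every regular file
# in directory $1 into the single file $2, in sorted glob order.
# The target $2 must not be inside $1. (Hypothetical helper.)
merge_dir() {
    : > "$2"                        # create or truncate the target
    for part in "$1"/*; do
        [ -f "$part" ] && cat "$part" >> "$2"
    done
    return 0
}

src=$(mktemp -d)
out=$(mktemp)
printf 'line from part 0\n' > "$src/part-00000"
printf 'line from part 1\n' > "$src/part-00001"
merge_dir "$src" "$out"
cat "$out"
rm -rf "$src" "$out"
```

The final cat shows both lines in one file, which is the shape of result that -getmerge produces locally from the files of an HDFS directory.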

⑥Create folder

command:

hdfs dfs -mkdir /dirname

To create a multi-level directory together with any missing parent directories:

hdfs dfs -mkdir -p /dirname/dirname

⑦ Move or rename

command:

hdfs dfs -mv /hdfsfile /hdfsdir/hdfsfile
# -mv a.txt b.txt    // rename
# -mv a.txt /        // move

⑧Copy

command:

hdfs dfs -cp /hdfsfile /hdfsdir

⑨Delete

command:

hdfs dfs -rm /hdfsfile

Delete folder command:

hdfs dfs -rm -r /hdfsdir

⑩View

command:

hdfs dfs -cat /hdfsfile
hdfs dfs -tail -f /hdfsfile

See how many files are in the folder

command:

hdfs dfs -count /hdfsdir
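
The -count output is a single line with four columns: DIR_COUNT, FILE_COUNT, CONTENT_SIZE, and PATHNAME. The sketch below labels those columns for readability. The helper name describe_count and the sample line are made up, and paths that contain spaces are not handled.

```shell
# Turn one line of -count output into a labelled summary.
# Columns: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
# (Hypothetical helper.)
describe_count() {
    printf '%s\n' "$1" | awk '{ printf "%s: %d dirs, %d files, %d bytes\n", $4, $1, $2, $3 }'
}

# Made-up sample line. On a real cluster, use instead:
#   describe_count "$(hdfs dfs -count /user/hadoop/input)"
describe_count '           1            1                 27 /user/hadoop/input'
# prints: /user/hadoop/input: 1 dirs, 1 files, 27 bytes
```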

View the total space of HDFS

command:

hdfs dfs -df /
hdfs dfs -df -h /    # human-readable sizes
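
Without -h the sizes are reported in bytes, which makes them easy to post-process. The sketch below computes the used percentage from such output. It assumes a header row followed by rows with the columns Filesystem, Size, Used, Available, and Use%; the helper name df_used_percent, the URI hdfs://localhost:9000, and the numbers are all made up for illustration.

```shell
# Print "<filesystem> <percent>% used" for each data row of -df output.
# Assumed columns: Filesystem Size Used Available Use%  (sizes in bytes)
# (Hypothetical helper; skips the header row.)
df_used_percent() {
    printf '%s\n' "$1" | awk 'NR > 1 && $2 > 0 { printf "%s %.1f%% used\n", $1, $3 * 100 / $2 }'
}

# Made-up sample. On a real cluster, use instead:
#   df_output=$(hdfs dfs -df /)
df_output='Filesystem                    Size        Used    Available  Use%
hdfs://localhost:9000  20000000000  5000000000  12000000000   25%'

df_used_percent "$df_output"    # prints: hdfs://localhost:9000 25.0% used
```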

Origin blog.csdn.net/qq_45154565/article/details/109180824