Hadoop HDFS Operation Commands (Shell and Python)

The environment used here is CentOS 7, with the Hadoop cluster set up and the commands tested on the primary (master) node:

  • CentOS 7
  • Python 3.6.8
  • hadoop-2.7.1

1. Hadoop shell commands

(1) View HDFS file structure

hadoop fs -lsr /

(-lsr is deprecated; hadoop fs -ls -R / is the current equivalent.)

(2) Create a directory

hadoop fs -mkdir /test_xz/input

(3) Upload a file to HDFS

hadoop fs -put /home/bailang/test.txt /test_xz/input

(4) Download an HDFS file to a local directory

hadoop fs -get /test_xz/input/test.txt /home/bailang/ 

(5) List the contents of an HDFS directory

hadoop fs -ls /test_xz  

(6) View a file on HDFS

hadoop fs -cat /test_xz/input/test.txt  

(7) Delete a file on HDFS

hadoop fs -rm /test_xz/input/test.txt  

(8) Delete a directory on HDFS

hadoop fs -rmr /test_xz/input/

(-rmr is deprecated; hadoop fs -rm -r /test_xz/input/ is the current equivalent.)

(9) View HDFS cluster status

hadoop dfsadmin -report

(In Hadoop 2.x, hdfs dfsadmin -report is the preferred form.)

(10) Enter safe mode

hadoop dfsadmin -safemode enter

(11) Leave safe mode

hadoop dfsadmin -safemode leave

2. Operating HDFS with Python

(1) Install the hdfs package

pip install hdfs

(2) Read the contents of an HDFS file

from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
file_path = "/test_xz/input/test.txt"
lines = []
with client.read(file_path, encoding='utf-8', delimiter='\n') as reader:
    for line in reader:
        lines.append(line.strip())
print(lines)
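
Since the loop above only needs an iterable of lines, the same collection logic can be sanity-checked locally with an in-memory stream, no cluster required; io.StringIO stands in here for the reader that client.read() yields:

```python
from io import StringIO

def collect_lines(reader):
    # Mirror the HDFS read loop: strip each line and accumulate the results.
    return [line.strip() for line in reader]

# StringIO stands in for the reader object client.read() would provide.
sample = StringIO("hello hdfs\nsecond line\n")
print(collect_lines(sample))  # -> ['hello hdfs', 'second line']
```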

(3) Create a directory on HDFS

from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
hdfs_path = "/test_xz/input"
client.makedirs(hdfs_path)

(4) List the files in a given HDFS directory

from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
hdfs_path = "/test_xz/input"
print(client.list(hdfs_path, status=False))
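
With status=False, client.list() returns plain entry names, so ordinary list operations apply to the result. A small sketch of filtering a listing; the names below are hypothetical, not taken from a real cluster:

```python
def only_txt(names):
    # Keep only the .txt entries from a directory listing.
    return [n for n in names if n.endswith(".txt")]

# A hypothetical return value of client.list(hdfs_path, status=False).
listing = ["test.txt", "data.csv", "subdir"]
print(only_txt(listing))  # -> ['test.txt']
```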

(5) Move or rename a file

from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
source_path = "/test_xz/input/test.txt"
dst_path = "/test_xz/input/test_new.txt"
client.rename(source_path, dst_path)
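
HDFS paths always use forward slashes, so when building a rename target it is safer to use posixpath than os.path (which would produce backslashes on Windows). The directory and file names below are only illustrative:

```python
import posixpath

def build_dest(dir_path, filename):
    # Join an HDFS directory and file name with POSIX separators.
    return posixpath.join(dir_path, filename)

print(build_dest("/test_xz/output", "test.txt"))  # -> /test_xz/output/test.txt
```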

(6) Upload a file to HDFS

from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
hdfs_path = "/test_xz/input"
local_path = "/home/bailang/test.txt"
client.upload(hdfs_path, local_path, cleanup=True)

(7) Download an HDFS file to the local filesystem

from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
hdfs_path = "/test_xz/input/test.txt"
local_path = "/home/bailang"
client.download(hdfs_path, local_path, overwrite=False)

(8) Write data to an HDFS file in append mode

from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
hdfs_path = "/test_xz/input/test.txt"
data = "appended line\n"
client.write(hdfs_path, data, overwrite=False, append=True, encoding='utf-8')

(9) Write data to an HDFS file in overwrite mode

from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
hdfs_path = "/test_xz/input/test.txt"
data = "new contents\n"
client.write(hdfs_path, data, overwrite=True, append=False, encoding='utf-8')
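
A single write() call is cheaper than one call per line, so it can help to join the lines into one payload first. A minimal local sketch of that preparation step (the payload contents are made up):

```python
def make_payload(lines):
    # Join lines into one newline-terminated string for a single write() call.
    return "\n".join(lines) + "\n"

payload = make_payload(["first line", "second line"])
print(repr(payload))  # -> 'first line\nsecond line\n'
```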

(10) Delete an HDFS file

from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
hdfs_path = "/test_xz/input/test.txt"
client.delete(hdfs_path)

Origin blog.csdn.net/qq_32599479/article/details/101509612