Hadoop HDFS operation commands (shell and Python)
The examples below are based on a CentOS 7 system environment; the Hadoop cluster was set up and every command was tested on the master node:
- CentOS 7
- Python 3.6.8
- hadoop-2.7.1
1. Hadoop shell commands
(1) View the HDFS file tree recursively
hadoop fs -ls -R /
(the older -lsr form still works in Hadoop 2.x but is deprecated)
(2) Create a directory
hadoop fs -mkdir -p /test_xz/input
(-p also creates any missing parent directories)
(3) Upload a file to HDFS
hadoop fs -put /home/bailang/test.txt /test_xz/input
(4) Download HDFS files to a local directory
hadoop fs -get /test_xz/input/test.txt /home/bailang/
(5) List the contents of an HDFS directory
hadoop fs -ls /test_xz
(6) View a file on HDFS
hadoop fs -cat /test_xz/input/test.txt
(7) Delete a file on HDFS
hadoop fs -rm /test_xz/input/test.txt
(8) Delete a directory on HDFS
hadoop fs -rm -r /test_xz/input/
(the older -rmr form still works in Hadoop 2.x but is deprecated)
(9) View the HDFS cluster status
hadoop dfsadmin -report
(in Hadoop 2.x, hdfs dfsadmin -report is the preferred form)
(10) Enter safe mode
hadoop dfsadmin -safemode enter
(11) Leave safe mode
hadoop dfsadmin -safemode leave
(the current state can be checked with hadoop dfsadmin -safemode get)
2. Operating HDFS from Python
(1) Install the hdfs package
pip install hdfs
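The hdfs package is a thin client for the NameNode's WebHDFS REST interface, which is the service listening on port 50070 in the examples below. As background, here is a minimal sketch of how a WebHDFS request URL is formed; the helper name `webhdfs_url` is made up for illustration and is not part of the hdfs API:

```python
def webhdfs_url(base_url, hdfs_path, op):
    """Build a WebHDFS REST URL: <base>/webhdfs/v1<path>?op=<OP>."""
    return "{}/webhdfs/v1{}?op={}".format(base_url, hdfs_path, op)

# Reading a file, for example, maps to the OPEN operation:
url = webhdfs_url("http://172.30.11.101:50070", "/test_xz/input/test.txt", "OPEN")
print(url)
```

Every call shown below (read, makedirs, upload, delete, …) ultimately issues HTTP requests of this shape against the NameNode.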
(2) Read the contents of an HDFS file
from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
file_path = "/test_xz/input/test.txt"
lines = []
with client.read(file_path, encoding='utf-8', delimiter='\n') as reader:
    for line in reader:
        lines.append(line.strip())
print(lines)
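With delimiter='\n', the reader yields the file's content line by line. The consumption pattern can be tried without a cluster; here io.StringIO stands in for the HDFS reader (purely a local illustration, not part of the hdfs API):

```python
import io

# Stand-in for the HDFS reader: any iterable of text lines works the same way.
reader = io.StringIO("hello\nworld\n")

lines = []
for line in reader:
    lines.append(line.strip())

print(lines)  # ['hello', 'world']
```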
(3) Create a directory on HDFS
from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
hdfs_path = "/test_xz/input"
client.makedirs(hdfs_path)  # creates intermediate directories as needed
(4) List the files in an HDFS directory
from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
hdfs_path = "/test_xz/input"
# status=False returns only the entry names; status=True also returns each entry's FileStatus
print(client.list(hdfs_path, status=False))
(5) Move or rename a file
from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
source_path = "/test_xz/input/test.txt"
dst_path = "/test_xz/input/test_new.txt"  # destination path (example)
client.rename(source_path, dst_path)
(6) Upload a local file to HDFS
from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
hdfs_path = "/test_xz/input"
local_path = "/home/bailang/test.txt"
# cleanup=True removes any partially uploaded files if the upload fails
client.upload(hdfs_path, local_path, cleanup=True)
(7) Download an HDFS file to a local directory
from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
hdfs_path = "/test_xz/input/test.txt"
local_path = "/home/bailang"
# overwrite=False raises an error if the local file already exists
client.download(hdfs_path, local_path, overwrite=False)
(8) Append data to an HDFS file
from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
hdfs_path = "/test_xz/input/test.txt"
data = "appended line\n"  # example content
client.write(hdfs_path, data, encoding='utf-8', overwrite=False, append=True)
(9) Overwrite an HDFS file with new data
from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
hdfs_path = "/test_xz/input/test.txt"
data = "new content\n"  # example content
client.write(hdfs_path, data, encoding='utf-8', overwrite=True, append=False)
(10) Delete a file on HDFS
from hdfs.client import Client
client = Client("http://172.30.11.101:50070")
hdfs_path = "/test_xz/input/test.txt"
# returns True if the path was deleted, False if it did not exist;
# pass recursive=True to delete a non-empty directory
client.delete(hdfs_path)