Hadoop series (3): HDFS operations

Common HDFS command-line client commands

0. List an HDFS directory
hadoop fs -ls <hdfs path>

1. Upload a file to HDFS
hadoop fs -put <local file> /aaa
hadoop fs -copyFromLocal <local file> <hdfs path>    ## copyFromLocal is equivalent to put

hadoop fs -moveFromLocal <local file> <hdfs path>    ## unlike copyFromLocal, the file is moved from the local disk to HDFS

2. Download a file to the client's local disk
hadoop fs -get <hdfs path> <local disk directory>
hadoop fs -copyToLocal <hdfs path> <local disk path>    ## equivalent to get
hadoop fs -moveToLocal <hdfs path> <local path>    ## moves the file from HDFS to the local disk

3. Create a folder in HDFS
hadoop fs -mkdir -p /aaa/xxx

4. Move (rename) a file within HDFS
hadoop fs -mv <hdfs path> <another hdfs path>

5. Delete a file or folder in HDFS
hadoop fs -rm -r /aaa

6. Modify file permissions
hadoop fs -chown user:group /aaa
hadoop fs -chmod 700 /aaa

7. Append content to an existing file
hadoop fs -appendToFile <local file> <hdfs file>

8. Display the contents of a text file
hadoop fs -cat <hdfs file>
hadoop fs -tail <hdfs file>

Supplement: full list of HDFS command-line client commands
Usage: hadoop fs [generic options]
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
[-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] <path> ...]
[-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst>]
[-createSnapshot <snapshotDir> [<snapshotName>]]
[-deleteSnapshot <snapshotDir> <snapshotName>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] [-x] <path> ...]
[-expunge]
[-find <path> ... <expression> ...]
[-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
[-help [cmd ...]]
[-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touchz <path> ...]
[-truncate [-w] <length> <path> ...]
[-usage [cmd ...]]
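To see the detailed usage of any single command from the list above, the -help option can be used, for example:
hadoop fs -help ls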

Using the Hadoop HDFS API

Parameter mechanism of the Configuration object: when a Configuration object is constructed, the default configuration files (xx-default.xml) bundled in the Hadoop jars are loaded first.

The user configuration files (xx-site.xml) are loaded next and override the defaults. After construction, conf.set("p", "v") can still be called to override the values from the user configuration files once more.

When you create a Configuration object, its constructor loads Hadoop's configuration files, including hdfs-site.xml and core-site.xml. These files provide the parameters the client needs to reach HDFS, most importantly fs.default.name, which specifies the HDFS address; with this address the HDFS client knows where to connect. The Configuration object can therefore be understood as holding Hadoop's configuration information: through this class you can set parameters, and those settings change how the client accesses HDFS.
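A minimal sketch of this precedence (not from the original post; it assumes the Hadoop client jars, i.e. org.apache.hadoop.conf.Configuration, are on the classpath):

// 1. The constructor loads the *-default.xml resources, then any *-site.xml found on the classpath
Configuration conf = new Configuration();
// fs.defaultFS comes from core-default.xml ("file:///") unless a core-site.xml on the classpath overrides it
System.out.println(conf.get("fs.defaultFS"));

// 2. A value set in code overrides whatever the configuration files provided
conf.set("fs.defaultFS", "hdfs://hdp-01:9000");
System.out.println(conf.get("fs.defaultFS"));    // now prints hdfs://hdp-01:9000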

// Loads core-default.xml / hdfs-default.xml (and the matching *-site.xml files) from the project classpath
Configuration conf = new Configuration();

// Number of replicas to keep when this client uploads a file to HDFS: 2
conf.set("dfs.replication", "2");

// Block size to use when this client uploads a file to HDFS: 64 MB
conf.set("dfs.blocksize", "64m");

// Build a client object for the target HDFS cluster:
FileSystem fs = FileSystem.get(new URI("hdfs://hdp-01:9000/"), conf, "root");
// arg 1: URI of the HDFS cluster; arg 2: client-side parameters; arg 3: the client's identity (user name)
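For reference, a self-contained version of the setup above, as a hedged sketch (the class name HdfsClientDemo and the /demo path are illustrative, not from the original post):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsClientDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("dfs.replication", "2");
        conf.set("dfs.blocksize", "64m");

        // namenode URI, client-side parameters, and the user to act as
        FileSystem fs = FileSystem.get(new URI("hdfs://hdp-01:9000/"), conf, "root");

        fs.mkdirs(new Path("/demo"));    // any of the API calls shown below can go here
        fs.close();
    }
}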

//------------------------------------------ Common APIs -----------------------------------------------

// Upload a file to HDFS
fs.copyFromLocalFile(new Path("D:/install-pkgs/hbase-1.2.1-bin.tar.gz"), new Path("/aaa/"));

// Download a file from HDFS to the client's local disk
fs.copyToLocalFile(new Path("/hdp20-05.txt"), new Path("f:/"));

// Move a file within HDFS (rename it)
fs.rename(new Path("/install.log"), new Path("/aaa/in.log"));

// Create a folder in HDFS
fs.mkdirs(new Path("/xx/yy/zz"));

// Delete a file or folder in HDFS (true = recursive)
fs.delete(new Path("/aaa"), true);


// List the files under a given HDFS directory (recursively)
RemoteIterator<LocatedFileStatus> iter = fs.listFiles(new Path("/"), true);
// only file entries are returned, not directories
while (iter.hasNext()) {
	LocatedFileStatus status = iter.next();
	System.out.println("Full path: " + status.getPath());
	System.out.println("Block size: " + status.getBlockSize());
	System.out.println("File length: " + status.getLen());
	System.out.println("Replica count: " + status.getReplication());
	System.out.println("Block locations: " + Arrays.toString(status.getBlockLocations()));

	System.out.println("--------------------------------");
}
fs.close();


// List both the files and the folders under a given HDFS directory
public void testLs2() throws Exception {
	FileStatus[] listStatus = fs.listStatus(new Path("/"));

	for (FileStatus status : listStatus) {
		System.out.println("Full path: " + status.getPath());
		System.out.println(status.isDirectory() ? "this is a directory" : "this is a file");
		System.out.println("Block size: " + status.getBlockSize());
		System.out.println("File length: " + status.getLen());
		System.out.println("Replica count: " + status.getReplication());

		System.out.println("--------------------------------");
	}
	fs.close();
}


// Read the contents of a file in HDFS
	FSDataInputStream in = fs.open(new Path("/test.txt"));
	BufferedReader br = new BufferedReader(new InputStreamReader(in, "utf-8"));
	String line = null;
	while ((line = br.readLine()) != null) {
		System.out.println(line);
	}

	br.close();
	in.close();
	fs.close();



// Write content to a file in HDFS

		FSDataOutputStream out = fs.create(new Path("/zz.jpg"), false);

	// D:\images\006l0mbogy1fhehjb6ikoj30ku0ku76b.jpg

	FileInputStream in = new FileInputStream("D:/images/006l0mbogy1fhehjb6ikoj30ku0ku76b.jpg");

	byte[] buf = new byte[1024];
	int read = 0;
	while ((read = in.read(buf)) != -1) {
		out.write(buf,0,read);
	}
	
	in.close();
	out.close();
	fs.close();
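As an alternative to the manual buffer loop above, Hadoop's org.apache.hadoop.io.IOUtils can copy the stream in one call. A hedged sketch, reusing the fs object and the same local image path as the snippet above:

FSDataOutputStream out = fs.create(new Path("/zz.jpg"), false);
FileInputStream in = new FileInputStream("D:/images/006l0mbogy1fhehjb6ikoj30ku0ku76b.jpg");
// copyBytes copies the whole input stream to the output; the final "true" closes both streams
IOUtils.copyBytes(in, out, 4096, true);
fs.close();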

Source: blog.csdn.net/heartless_killer/article/details/100717646