1. Verify the cluster processes
-
1.jps #list the running Hadoop daemon processes
2.Use the web UIs
hdfs: http://hadoop01:50070 #hadoop01 is the host mapping name of the master (NameNode) node
yarn: http://hadoop02:8088 #hadoop02 is the host mapping name of the node configured as the YARN ResourceManager
3.Run a test job
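The jps check above can be wrapped in a small script. This is a hypothetical health check (the daemon list assumes the two-node layout used in these notes); it degrades gracefully when jps is not on PATH, so it can be read and tested on any machine.

```shell
# Hypothetical health check built on jps: report whether each expected
# daemon is running. The daemon list is an assumption based on the
# two-node layout described in these notes.
check_daemon() {
    if command -v jps >/dev/null 2>&1 && jps | grep -q "$1"; then
        echo "$1: up"
    else
        echo "$1: down (or jps not on PATH)"
    fi
}
for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    check_daemon "$d"
done
```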
2. Start/stop the cluster
-
1.Start each daemon individually
hadoop-daemon.sh start(stop) namenode/datanode/secondarynamenode
yarn-daemon.sh start(stop) resourcemanager/nodemanager
2.Start a whole service (involves inter-node communication, so passwordless ssh login is required)
start-dfs.sh(stop-dfs.sh) #run on the master (NameNode) node
start-yarn.sh(stop-yarn.sh) #must be run on the ResourceManager node (here, on hadoop02)
3.Start everything
start-all.sh(stop-all.sh)
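The start-up order above can be sketched as a script. This is only a sketch, assuming the node layout from these notes (hadoop01 = NameNode, hadoop02 = ResourceManager); the run() helper echoes the command as a dry run when the script is not on PATH, so the sequence can be read anywhere without a cluster.

```shell
# run(): execute the command if it exists, otherwise echo it as a dry run.
run() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "DRY-RUN: $*"
    fi
}
run start-dfs.sh    # run this on hadoop01 (the NameNode host)
run start-yarn.sh   # run this on hadoop02 (the ResourceManager host)
run jps             # afterwards, verify the daemons actually came up
```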
3. Shell operations
-
hadoop fs #run a generic filesystem command
-
hdfs dfs #run a filesystem command; equivalent to hadoop fs
-
[hadoop@hadoop02 ~]$ hadoop
Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]
  CLASSNAME            run the class named CLASSNAME
 or
  where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar <jar>            run a jar file
                       note: please use "yarn jar" to launch
                             YARN applications, not this command.
  checknative [-a|-h]  check native hadoop and compression libraries availability
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest>
                       create a hadoop archive
  classpath            prints the class path needed to get the
                       Hadoop jar and the required libraries
  credential           interact with credential providers
  daemonlog            get/set the log level for each daemon
  trace                view and modify Hadoop tracing settings

Most commands print help when invoked w/o parameters.
Note: when you are not sure what a command can do, just type the first word, e.g. hdfs, press Enter, and follow the help output step by step. For example: find dfs ("run a filesystem command on the file systems supported in Hadoop"), then run hdfs dfs to see the next level of options and what each one means.
-
[hadoop@hadoop02 ~]$ hdfs
Usage: hdfs [--config confdir] [--loglevel loglevel] COMMAND
       where COMMAND is one of:
  dfs                  run a filesystem command on the file systems supported in Hadoop.
  classpath            prints the classpath
  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  journalnode          run the DFS journalnode
  zkfc                 run the ZK Failover Controller daemon
  datanode             run a DFS datanode
  dfsadmin             run a DFS admin client
  haadmin              run a DFS HA admin client
  fsck                 run a DFS filesystem checking utility
  balancer             run a cluster balancing utility
  jmxget               get JMX exported values from NameNode or DataNode.
  mover                run a utility to move block replicas across storage types
  oiv                  apply the offline fsimage viewer to an fsimage
  oiv_legacy           apply the offline fsimage viewer to an legacy fsimage
  oev                  apply the offline edits viewer to an edits file
  fetchdt              fetch a delegation token from the NameNode
  getconf              get config values from configuration
  groups               get the groups which users belong to
  snapshotDiff         diff two snapshots of a directory or diff the
                       current directory contents with a snapshot
  lsSnapshottableDir   list all snapshottable dirs owned by the current user
                       Use -help to see options
  portmap              run a portmap service
  nfs3                 run an NFS version 3 gateway
  cacheadmin           configure the HDFS cache
  crypto               configure HDFS encryption zones
  storagepolicies      list/get/set block storage policies
  version              print the version

Most commands print help when invoked w/o parameters.
[hadoop@hadoop02 ~]$ hdfs dfs
Usage: hadoop fs [generic options]
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]
[-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-count [-q] [-h] <path> ...]
[-cp [-f] [-p | -p[topax]] <src> ... <dst>]
[-createSnapshot <snapshotDir> [<snapshotName>]]
[-deleteSnapshot <snapshotDir> <snapshotName>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] <path> ...]
[-expunge]
[-find <path> ... <expression> ...]
[-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] <src> <localdst>]
[-help [cmd ...]]
[-ls [-d] [-h] [-R] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put [-f] [-p] [-l] <localsrc> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touchz <path> ...]
[-truncate [-w] <length> <path> ...]
[-usage [cmd ...]]
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
Common commands:
-help #print the usage manual for a command
[hadoop@hadoop02 ~]$ hadoop
[hadoop@hadoop02 ~]$ hadoop -help
[hadoop@hadoop02 ~]$ hadoop fs -help
[hadoop@hadoop02 ~]$ hadoop fs -help ls
hdfs dfsadmin -report #report the status of the whole cluster
hdfs dfs -setrep 2 /output/part-r-00000 #set the replication factor of the file part-r-00000 to 2
hadoop fs -count /output/ #count the directories, files and bytes under a given path
hdfs getconf -namenodes #list the hosts where the namenodes run
hdfs getconf -confkey [name] #get the value of the configuration key "name"
---- hdfs getconf -confkey fs.defaultFS #get the entry address of the hdfs cluster (the namenode handles client requests and responses)
---- hdfs getconf -confkey hadoop.tmp.dir #get the storage directory for temporary files
---- hdfs getconf -confkey dfs.replication #get the replication factor
---- hdfs getconf -confkey dfs.blocksize #get the size of each block
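The getconf keys above are easy to script. Below is a hypothetical wrapper (the getconf_or name is made up for this sketch): it reads a key from the cluster when hdfs is on PATH and otherwise falls back to the stock Hadoop 2.x defaults, so the sketch runs even without a cluster.

```shell
# getconf_or KEY DEFAULT: read a config key via hdfs getconf, or fall back
# to the supplied default (taken from the stock *-default.xml files) when
# no hdfs binary is available.
getconf_or() {
    if command -v hdfs >/dev/null 2>&1; then
        hdfs getconf -confkey "$1"
    else
        echo "$2"
    fi
}
FS_URI=$(getconf_or fs.defaultFS "file:///")
REPL=$(getconf_or dfs.replication 3)
BLOCK=$(getconf_or dfs.blocksize 134217728)   # 128 MB, the Hadoop 2.x default
echo "entry=$FS_URI replication=$REPL blocksize=$BLOCK"
```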
hdfs dfs -ls / #list the files under the hdfs root; equivalent to: hdfs dfs -ls hdfs://hadoop01:9000/
hdfs dfs -ls -R / #list all files recursively; equivalent to: hdfs dfs -ls -R hdfs://hadoop01:9000/
-mkdir #create a directory on hdfs (-p creates parent directories as needed)
    hdfs dfs -mkdir -p /aa/bb/cc/dd
-cp #copy from one hdfs path to another hdfs path
    hdfs dfs -cp /aaa/jdk.tar.gz /bbb/jdk.tar.gz.2
-mv #move a file within hdfs
    hdfs dfs -mv /aaa/jdk.tar.gz /
-rm #delete a file or directory
    hdfs dfs -rm -r /aaa/bbb/
    hdfs dfs -rmdir /aaa/bbb/ccc #remove an empty directory
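These hdfs dfs verbs intentionally mirror their POSIX counterparts, so the same sequence can be rehearsed against a local scratch directory (the paths here are illustrative, not from the notes):

```shell
# Local analogy of the hdfs dfs sequence above.
demo=/tmp/hdfs_demo; rm -rf "$demo"
mkdir -p "$demo/aa/bb"                 # like: hdfs dfs -mkdir -p /aa/bb
echo data > "$demo/aa/jdk.txt"
cp "$demo/aa/jdk.txt" "$demo/aa/bb/"   # like: hdfs dfs -cp
mv "$demo/aa/bb/jdk.txt" "$demo/"      # like: hdfs dfs -mv
rm -r "$demo/aa"                       # like: hdfs dfs -rm -r
ls "$demo"                             # only jdk.txt remains
```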
-put and -copyFromLocal #upload a local file to hdfs
    hdfs dfs -put /aaa/jdk.tar.gz /home/hadoop/bbb/jdk.tar.gz
-get and -copyToLocal #download a file from hdfs to the local filesystem
    hdfs dfs -get /aaa/jdk.tar.gz /home/hadoop/aaa/jdk.tar.gz
-getmerge #merge-download multiple files, e.g. the hdfs directory /aaa/ contains log.1, log.2, log.3, ...
    hdfs dfs -getmerge /aaa/log.* ./log.sum
-appendToFile <localsrc> ... <dst> #append a local file to the end of an existing hdfs file
    hdfs dfs -appendToFile ./hello.txt /hello.txt
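What -getmerge produces can be sketched locally: the matching source files are concatenated in order into one local file. The file names below are illustrative; with a real cluster the last line would be hdfs dfs -getmerge /aaa/log.* ./log.sum.

```shell
# Local sketch of -getmerge semantics: concatenate the parts in order.
src=/tmp/getmerge_demo; rm -rf "$src"; mkdir -p "$src"
printf 'from log.1\n' > "$src/log.1"
printf 'from log.2\n' > "$src/log.2"
cat "$src"/log.* > /tmp/log.sum   # same result shape as: hdfs dfs -getmerge /aaa/log.* ./log.sum
cat /tmp/log.sum
```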
-moveFromLocal #move (cut) a local file to hdfs
    hdfs dfs -moveFromLocal /home/hadoop/a.txt /aa/bb/cc/dd
    hdfs dfs -moveToLocal /aa/bb/cc/dd /home/hadoop/a.txt #the reverse direction (note: not implemented in Hadoop 2.x; it only prints an error)
-copyFromLocal #copy a file from the local filesystem to hdfs
    hdfs dfs -copyFromLocal ./jdk.tar.gz /aaa/
    hdfs dfs -copyToLocal /aaa/jdk.tar.gz . #copy back to the local current directory
-text #print a file's content as text, similar in use to -cat and -tail
    hdfs dfs -text /output/part-r-00000