Required software: JDK, SSH
Hadoop 2.8.3
Install the JDK and configure the environment variables.
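The "configure the environment variables" step can be sketched as the snippet below. The JDK path matches the JAVA_HOME used later in hadoop-env.sh; adjust it to wherever you actually unpacked the JDK.

```shell
# Append the JDK variables to the shell profile (the path is an assumption;
# it matches the JAVA_HOME used later in hadoop-env.sh).
cat >> ~/.bashrc <<'EOF'
export JAVA_HOME=/usr/local/jdk1.8.0_151
export PATH=$JAVA_HOME/bin:$PATH
EOF
# Reload so the current shell picks up the new variables.
. ~/.bashrc
java -version 2>/dev/null || echo "java not on PATH yet - check JAVA_HOME"
```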
Install ssh and rsync; the main goal is to set up passwordless login:
sudo apt-get install ssh
sudo apt-get install rsync
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ssh localhost   # verify that login no longer asks for a password
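A quick way to confirm the key setup worked — a sketch that assumes a local sshd is running:

```shell
# BatchMode forces a failure instead of a password prompt, so this cleanly
# reports whether passwordless login works.
if ssh -o BatchMode=yes -o StrictHostKeyChecking=no localhost true 2>/dev/null; then
  echo "passwordless ssh: OK"
else
  echo "passwordless ssh: FAILED - check ~/.ssh (700) and authorized_keys (600) permissions"
fi
```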
Install Hadoop
1) Configure hadoop-env.sh
export JAVA_HOME=/usr/local/jdk1.8.0_151
2) Configure yarn-env.sh
export JAVA_HOME=/usr/local/jdk1.8.0_151
3) Configure core-site.xml
Add the following:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
    <description>HDFS URI: filesystem://namenode-host:port</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/tmp</value>
    <description>Local temporary directory for Hadoop on the namenode</description>
  </property>
</configuration>
4) Configure hdfs-site.xml
Add the following:
<configuration>
  <!-- hdfs-site.xml -->
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/hadoop/hdfs/name</value>
    <description>Where the namenode stores the HDFS namespace metadata</description>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/hadoop/hdfs/data</value>
    <description>Physical storage location of data blocks on the datanode</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Replication factor; the default is 3, and it should not exceed the number of datanodes</description>
  </property>
</configuration>
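The directories referenced in core-site.xml and hdfs-site.xml should exist before formatting the namenode. A sketch (run with sufficient privileges; "hadoop" below is an assumed service user name, adjust to whoever runs the daemons):

```shell
# Create the tmp, namenode, and datanode directories from the configs above.
mkdir -p /usr/hadoop/tmp /usr/hadoop/hdfs/name /usr/hadoop/hdfs/data
# Hand ownership to the user that will run Hadoop ("hadoop" is an assumption):
# chown -R hadoop:hadoop /usr/hadoop
ls /usr/hadoop/hdfs
```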
5) Configure mapred-site.xml
Add the following:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
6) Configure yarn-site.xml
Add the following:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>192.168.241.128:8099</value>
  </property>
</configuration>
4. Start Hadoop
1) Format the namenode
$ bin/hdfs namenode -format
2) Start the NameNode and DataNode daemons
$ sbin/start-dfs.sh
3) Start the ResourceManager and NodeManager daemons
$ sbin/start-yarn.sh
If passwordless login was not set up earlier, do it now (an rsa-key variant of the steps in the SSH section):
- $ cd ~/.ssh/                        # if the directory does not exist, run ssh localhost once first
- $ ssh-keygen -t rsa                 # press Enter at every prompt
- $ cat id_rsa.pub >> authorized_keys # authorize the key
5. Verify the startup
1) Run the jps command; if the following processes appear, Hadoop started correctly. (The Hadoop daemons are NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager; the Master and Worker entries in this listing are Spark daemons, not part of Hadoop.)
# jps
6097 NodeManager
11044 Jps
7497 -- process information unavailable
8256 Worker
5999 ResourceManager
5122 SecondaryNameNode
8106 Master
4836 NameNode
4957 DataNode
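Beyond jps, a small HDFS smoke test confirms the filesystem actually accepts writes. A sketch: /smoke-test is an arbitrary example path, and `hdfs` must be on PATH (e.g. via $HADOOP_HOME/bin).

```shell
if command -v hdfs >/dev/null 2>&1; then
  # Create a scratch directory and list the root to prove HDFS is writable.
  hdfs dfs -mkdir -p /smoke-test
  hdfs dfs -ls /
else
  echo "hdfs not on PATH - add \$HADOOP_HOME/bin to PATH first"
fi
```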