Linux (CentOS 7): Full Workflow for a Fully Distributed Hadoop Configuration

Configuring Hadoop on CentOS

1. Edit the network configuration and assign a static IP: vi /etc/sysconfig/network-scripts/ifcfg-ens33

TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
NAME=ens33
UUID=af99d9a2-66ad-4f73-a71a-063a1badb02d
DEVICE=ens33
ONBOOT=yes
IPADDR=192.168.47.10
GATEWAY=192.168.47.2
NETMASK=255.255.255.0
DNS1=192.168.47.2
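After saving the file, restart networking so the static address takes effect. A minimal check, assuming the interface name ens33 and the addresses configured above:
# restart networking so the new static IP is applied (CentOS 7)
systemctl restart network
# confirm the address on ens33
ip addr show ens33
# check that the gateway is reachable
ping -c 3 192.168.47.2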
  • Command to extract the archive: tar -zxvf /root/mzr/hadoop-2.8.4.tar.gz -C /usr/local

2. Change the VM hostname and the IP mappings

  • Change the hostname: vi /etc/hostname; after editing, restart the VM with reboot (see the note after the mapping below)
  • Edit the IP-to-hostname mappings: vi /etc/hosts
192.168.245.111 6110master
192.168.245.112 6110slave0
192.168.245.113 6110slave1
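On CentOS 7 the hostname can also be changed without a reboot; a small alternative sketch, assuming the master's name from the mapping above:
# set the hostname immediately (alternative to editing /etc/hostname and rebooting)
hostnamectl set-hostname 6110master
# confirm the change
hostname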

3. Configure the JDK

  • Extract the JDK to /usr/local, edit the profile with vi /etc/profile, then reload the environment with source /etc/profile
# JDK environment variables (java env)
export JAVA_HOME=/usr/local/jdk1.8.0_151
export JRE_HOME=/usr/local/jdk1.8.0_151/jre
export PATH=$PATH:/usr/local/jdk1.8.0_151/bin
export CLASSPATH=./:/usr/local/jdk1.8.0_151/lib:/usr/local/jdk1.8.0_151/jre/lib
  • Verify the configuration with the commands java, java -version, and javac
# Output indicating a successful configuration
java version "1.8.0_281"
Java(TM) SE Runtime Environment (build 1.8.0_281-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.281-b09, mixed mode)

4. Configure Hadoop

  • Extract the archive into /usr/local
  • Configure the environment variables with vi /etc/profile
# Hadoop environment variables (Hadoop env)
export HADOOP_HOME=/usr/local/hadoop-2.8.4/
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
  • Reload the environment with source /etc/profile
  • Verify the Hadoop installation with which hadoop and hadoop version
# Output indicating success
Hadoop 3.3.1
Source code repository https://github.com/apache/hadoop.git -r a3b9c37a397ad4188041dd80621bdeefc46885f2
Compiled by ubuntu on 2021-06-15T05:13Z
Compiled with protoc 3.7.1
From source with checksum 88a4ddb2299aca054416d6b7f81ca55
This command was run using /usr/local/hadoop-3.3.1/share/hadoop/common/hadoop-common-3.3.1.jar

5. Hadoop Distributed Configuration

1. The files to configure live in the Hadoop configuration directory; enter it with cd /usr/local/hadoop-3.3.1/etc/hadoop/
# Files in the directory
[root@6274master hadoop]# ls
capacity-scheduler.xml      hadoop-policy.xml                 kms-acls.xml          mapred-queues.xml.template     workers
configuration.xsl           hadoop-user-functions.sh.example  kms-env.sh            mapred-site.xml                yarn-env.cmd
container-executor.cfg      hdfs-rbf-site.xml                 kms-log4j.properties  shellprofile.d                 yarn-env.sh
core-site.xml               hdfs-site.xml                     kms-site.xml          slaves                         yarnservice-log4j.properties
hadoop-env.cmd              httpfs-env.sh                     log4j.properties      ssl-client.xml.example         yarn-site.xml
hadoop-env.sh               httpfs-log4j.properties           mapred-env.cmd        ssl-server.xml.example
hadoop-metrics2.properties  httpfs-site.xml                   mapred-env.sh         user_ec_policies.xml.template
Workflow:
1. Prepare three CentOS 7 64-bit virtual machines, installing one and cloning the others;
2. Install the JDK;
3. Install and configure passwordless SSH (see the sketch after this list);
4. Configure hadoop-env.sh;
5. Configure core-site.xml;
6. Configure hdfs-site.xml;
7. Configure mapred-site.xml;
8. Configure yarn-site.xml;
9. Configure the slaves file;
10. Distribute the installation to all virtual machines;
11. Test and run a program.
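Step 3 (passwordless SSH) lets the master start daemons on the slaves and copy files with scp without being prompted for a password. A minimal sketch, assuming the root account and the hostnames used in /etc/hosts above:
# on the master: generate an RSA key pair (accept the defaults)
ssh-keygen -t rsa
# copy the public key to every node, including the master itself
ssh-copy-id root@6110master
ssh-copy-id root@6110slave0
ssh-copy-id root@6110slave1
# a login should no longer ask for a password
ssh 6110slave0 hostname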
2. Configure hadoop-env.sh (vi hadoop-env.sh)
export JAVA_HOME=/usr/local/jdk-15.0.2
export HADOOP_CONF_DIR=/usr/local/hadoop-3.2.2/etc/hadoop/
3. Configure core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://6274master:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/bigdata/tmp</value>
  </property>
</configuration>
4. Configure hdfs-site.xml
<configuration>
	<property>
		<name>dfs.replication</name>
		<value>3</value>
	</property>
	<property>
		<name>dfs.block.size</name>
		<value>134217728</value>
	</property>
	<property>
		<name>dfs.namenode.name.dir</name>
		<value>file:///home/hadoopdata/dfs/name</value>
	</property>
	<property>
		<name>dfs.datanode.data.dir</name>
		<value>/home/hadoopdata/dfs/data</value>
	</property>
	<property>
		<name>fs.checkpoint.dir</name>
		<value>/home/hadoopdata/checkpoint/dfs/6274name</value>
	</property>
	<property>
		<name>dfs.http.address</name>
		<value>6274master:50070</value>
	</property>
	<property>
		<name>dfs.secondary.http.address</name>
		<value>6274master:50090</value>
	</property>
	<property>
		<name>dfs.webhdfs.enabled</name>
		<value>true</value>
	</property>
	<property>
		<name>dfs.permissions</name>
		<value>false</value>
	</property>
</configuration>
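The two files above reference several local directories (hadoop.tmp.dir in core-site.xml and the name/data/checkpoint directories here). Creating them ahead of time on every node avoids permission surprises; a minimal sketch using the paths from the configuration above:
# create the data directories referenced in core-site.xml and hdfs-site.xml
mkdir -p /home/bigdata/tmp
mkdir -p /home/hadoopdata/dfs/name /home/hadoopdata/dfs/data
mkdir -p /home/hadoopdata/checkpoint/dfs/6274name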
5. Configure mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <final>true</final>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>6274master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>6274master:19888</value>
  </property>
</configuration>
6. Configure yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>6274master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>6274master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>6274master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>6274master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>6274master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>6274master:8088</value>
  </property>
<!-- Site specific YARN configuration properties -->
</configuration>
7. Configure the worker list: vi slaves (in Hadoop 3.x this file is named workers instead)
6274master
6274slave0
6274slave1
8. Distribute the installation
vi /etc/hosts
192.168.245.111 6110master
192.168.245.112 6110slave0
192.168.245.113 6110slave1
  • Remove any existing Hadoop directory on the two slaves
6110slave0: rm -rf /usr/local/hadoop-2.8.4/
6110slave1: rm -rf /usr/local/hadoop-2.8.4/
  • Perform the distribution. First ping each VM by hostname; if the names cannot be pinged, the distribution cannot proceed:
    6110master: scp -r /usr/local/hadoop-2.8.4/ 6110slave0:/usr/local/
    6110master: scp -r /usr/local/hadoop-2.8.4/ 6110slave1:/usr/local/
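A quick check, assuming passwordless SSH is already in place, to confirm the copy landed on both slaves:
# the Hadoop directory should now exist on each slave
ssh 6110slave0 ls /usr/local/hadoop-2.8.4/
ssh 6110slave1 ls /usr/local/hadoop-2.8.4/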
9. Before the first start, format HDFS on the NameNode server (this only needs to be done once).
hadoop namenode -format
Three ways to start the cluster:
  • Start everything:
start-all.sh
  • Start by module:
start-dfs.sh
start-yarn.sh
  • Start individual daemons:
hadoop-daemon.sh start namenode
hadoop-daemons.sh start datanode
yarn-daemon.sh start resourcemanager
yarn-daemons.sh start nodemanager
mr-jobhistory-daemon.sh start historyserver
  • 1. Check that the processes are running with jps, then open each module's web UI:
http://192.168.47.10:50070
http://192.168.47.10:8088
  • 2. Upload and download files
hdfs dfs -ls /
hdfs dfs -put ./***  /
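For example, with a hypothetical local text file named words.txt (any text file works as wordcount input):
# words.txt is only an example name
hdfs dfs -put ./words.txt /
hdfs dfs -ls /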
  • 3. Run a sample program
yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount  /*** /out/01
hdfs dfs -ls /out/01
hdfs dfs -cat /out/01/****
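The wordcount job writes a _SUCCESS marker plus result files named part-r-00000 (and higher numbers if there are several reducers) into the output directory, so a typical check is:
# view the word counts produced by the job
hdfs dfs -cat /out/01/part-r-00000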
