Hadoop 3.0.0 installation and configuration

Environment description
As required, deploy a basic three-node hadoop-3.0.0 cluster; the operating system is CentOS 7 x64;
create three virtual machines on OpenStack, then begin the deployment;
IP address    host name
10.10.204.31  master
10.10.204.32  node1
10.10.204.33  node2

Node role planning
master: NameNode, DataNode, HQuorumPeer, ResourceManager, HMaster
node1:  DataNode, NodeManager, SecondaryNameNode
node2:  DataNode, NodeManager

Perform initialization on all three nodes;
1. Update the system environment;
yum clean all && yum makecache fast && yum update -y && yum install -y net-tools wget vim ftp git zip unzip
2. Modify the host name according to the plan;
hostnamectl set-hostname master
hostnamectl set-hostname node1
hostnamectl set-hostname node2
3. Add hosts entries for name resolution;
vim /etc/hosts
10.10.204.31 master
10.10.204.32 node1
10.10.204.33 node2
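Editing /etc/hosts by hand on three machines invites drift between nodes; the entries can also be appended idempotently with a small script. A sketch: the `add_host` helper is an illustration (not part of hadoop), and it writes to a scratch file here so it can be tried safely; point HOSTS_FILE at /etc/hosts on a real node after reviewing the result.

```shell
#!/bin/sh
# Build the cluster host entries in a scratch file; swap in /etc/hosts
# on a real node once the output looks right.
HOSTS_FILE="$(mktemp)"

add_host() {
    # append "ip hostname" only if the hostname is not already listed
    grep -qw "$2" "$HOSTS_FILE" || printf '%s %s\n' "$1" "$2" >> "$HOSTS_FILE"
}

add_host 10.10.204.31 master
add_host 10.10.204.32 node1
add_host 10.10.204.33 node2
add_host 10.10.204.32 node1   # rerunning is safe: duplicates are skipped

cat "$HOSTS_FILE"
```

Because duplicates are skipped, the script can be rerun whenever a node is added to the list.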
4. Ping each host by name from all three hosts to confirm name resolution works;
ping master
ping node1
ping node2
5. Download and install the JDK;
# hadoop 3.0 requires JDK 8;
cd /opt/
# normally you must log in to the Oracle website, register an account, and accept the license before downloading; with the link below, wget can fetch it directly;
wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com; oraclelicense=accept-securebackup-cookie" "https://download.oracle.com/otn-pub/java/jdk/8u202-b08/1961070e4c9b4e26a04e7f5a083f551e/jdk-8u202-linux-x64.tar.gz"
# create the installation paths for hadoop and the JDK
mkdir /opt/modules
cp /opt/jdk-8u202-linux-x64.tar.gz /opt/modules
cd /opt/modules
tar zxvf jdk-8u202-linux-x64.tar.gz
# configure the environment variables
export JAVA_HOME="/opt/modules/jdk1.8.0_202"
export PATH=$JAVA_HOME/bin:$PATH
source /etc/profile
# make the configuration permanent
vim /etc/bashrc
# add the following lines
export JAVA_HOME="/opt/modules/jdk1.8.0_202"
export PATH=$JAVA_HOME/bin:$PATH
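Before moving on, it is worth checking that JAVA_HOME really points at a JDK. A sketch of such a check; the `check_java_home` helper is an illustration, and it is exercised here against a throwaway directory that stands in for /opt/modules/jdk1.8.0_202 (on a real node, pass "$JAVA_HOME" instead).

```shell
#!/bin/sh
# check_java_home: succeed only if the directory has an executable bin/java
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "OK: $1 contains an executable bin/java"
    else
        echo "BAD: $1 has no executable bin/java"
        return 1
    fi
}

# fake JDK layout for demonstration only
JH="$(mktemp -d)"
mkdir -p "$JH/bin"
touch "$JH/bin/java" && chmod +x "$JH/bin/java"

check_java_home "$JH"
check_java_home /nonexistent || echo "(detected the broken path, as expected)"
```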
6. Download and extract the hadoop-3.0.0 package;
cd /opt/
wget http://archive.apache.org/dist/hadoop/core/hadoop-3.0.0/hadoop-3.0.0.tar.gz
cp /opt/hadoop-3.0.0.tar.gz /opt/modules/
cd /opt/modules
tar zxvf hadoop-3.0.0.tar.gz
7. Disable selinux and the firewalld firewall;
systemctl disable firewalld
vim /etc/sysconfig/selinux
SELINUX=disabled
8. Restart the server;
reboot

Operations on the master node;
Note:
in this test environment, hadoop is installed and run entirely under the root account;
1. Set up passwordless ssh login;
cd
ssh-keygen
# press Enter three times
# copy the key to master/node1/node2
ssh-copy-id master
ssh-copy-id node1
ssh-copy-id node2
2. Test that passwordless login works;
ssh master
ssh node1
ssh node2
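A quick way to confirm all three logins are truly passwordless is to force non-interactive mode: with BatchMode=yes, ssh fails instead of prompting for a password. A sketch using this guide's hostnames; on a machine that cannot reach these hosts, every line simply reports FAILED.

```shell
#!/bin/sh
# Probe each node non-interactively; a password prompt becomes a failure.
for host in master node1 node2; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
        echo "ok: $host"
    else
        echo "FAILED: $host"
    fi
done
```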
3. Modify the hadoop configuration files;
The following configuration files need to be modified:
hadoop-env.sh
yarn-env.sh
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml
workers

cd /opt/modules/hadoop-3.0.0/etc/hadoop
vim hadoop-env.sh
export JAVA_HOME=/opt/modules/jdk1.8.0_202
vim yarn-env.sh
export JAVA_HOME=/opt/modules/jdk1.8.0_202

Configuration file parsing:
https://blog.csdn.net/m290345792/article/details/79141336

vim core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/data/tmp</value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.hosts</name>
  <value></value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.groups</name>
  <value></value>
</property>
</configuration>
# io.file.buffer.size: read/write buffer size used by sequence files

vim hdfs-site.xml
<configuration>
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>node1:50090</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>3</value>
  <description>number of replicas; 3 is the default, and it should not exceed the number of DataNode machines</description>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/data/tmp</value>
</property>
</configuration>
### NameNode configuration
# dfs.namenode.name.dir: path on the local filesystem where the NameNode persistently stores the namespace and transaction logs. If this is a comma-separated list of directories, the name table is replicated in all of them, for redundancy.
# dfs.hosts / dfs.hosts.exclude: lists of permitted/excluded DataNodes.
# dfs.namenode.handler.count: number of NameNode server threads that handle RPCs from a large number of DataNodes.
### DataNode configuration
# dfs.datanode.data.dir: comma-separated list of paths on the local filesystem where a DataNode stores its blocks. If this is a comma-separated list of directories, data is stored in all of them, typically on different devices.

vim mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>
/opt/modules/hadoop-3.0.0/etc/hadoop,
/opt/modules/hadoop-3.0.0/share/hadoop/common/*,
/opt/modules/hadoop-3.0.0/share/hadoop/common/lib/*,
/opt/modules/hadoop-3.0.0/share/hadoop/hdfs/*,
/opt/modules/hadoop-3.0.0/share/hadoop/hdfs/lib/*,
/opt/modules/hadoop-3.0.0/share/hadoop/mapreduce/*,
/opt/modules/hadoop-3.0.0/share/hadoop/mapreduce/lib/*,
/opt/modules/hadoop-3.0.0/share/hadoop/yarn/*,
/opt/modules/hadoop-3.0.0/share/hadoop/yarn/lib/*
</value>
</property>
</configuration>

vim yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8040</value>
</property>
</configuration>

### ResourceManager and NodeManager configuration
# yarn.acl.enable: enable ACLs; the default is false.
# yarn.admin.acl: admin ACLs for the cluster. ACLs take the form "comma-separated-users space comma-separated-groups". The default of * means anyone; a value of just a space means no one is permitted.
# yarn.log-aggregation-enable: enable or disable log aggregation.
### ResourceManager configuration
# yarn.resourcemanager.address: ResourceManager host:port for clients to submit jobs. If the host:port is set, it overrides the hostname set in yarn.resourcemanager.hostname.
# yarn.resourcemanager.scheduler.address: ResourceManager host:port for ApplicationMasters to talk to the Scheduler to obtain resources. If set, it overrides the hostname set in yarn.resourcemanager.hostname.
# yarn.resourcemanager.resource-tracker.address: ResourceManager host:port for NodeManagers. If set, it overrides the hostname set in yarn.resourcemanager.hostname.
# yarn.resourcemanager.admin.address: ResourceManager host:port for administrative commands. If set, it overrides the hostname set in yarn.resourcemanager.hostname.
# yarn.resourcemanager.webapp.address: ResourceManager web UI host:port. If set, it overrides the hostname set in yarn.resourcemanager.hostname.
# yarn.resourcemanager.hostname: the ResourceManager host; can be set instead of all the yarn.resourcemanager*address settings above, in which case the default ports are used for the ResourceManager components.
# yarn.resourcemanager.scheduler.class: the ResourceManager scheduler class: CapacityScheduler (recommended), FairScheduler (also recommended), or FifoScheduler. Use the fully qualified class name, e.g. org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.
# yarn.scheduler.minimum-allocation-mb: minimum memory allocated at the ResourceManager for each container request.
# yarn.scheduler.maximum-allocation-mb: maximum memory allocated at the ResourceManager for each container request.
# yarn.resourcemanager.nodes.include-path / yarn.resourcemanager.nodes.exclude-path: lists of permitted/excluded NodeManagers. If necessary, use these files to control the list of allowable NodeManagers.
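After hand-editing this many XML files, a quick grep-based sanity check catches a property that never made it in. A sketch: the `require_prop` helper is an illustration, and the sample core-site.xml is generated into a scratch directory (deliberately missing hadoop.tmp.dir, to show the failure case); on a real node, point CONF_DIR at /opt/modules/hadoop-3.0.0/etc/hadoop instead.

```shell
#!/bin/sh
# Generate a sample config in a scratch dir for demonstration purposes.
CONF_DIR="$(mktemp -d)"
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
EOF

require_prop() {
    # report whether a <name> element for the property exists in the file
    if grep -q "<name>$2</name>" "$1"; then
        echo "found: $2"
    else
        echo "MISSING: $2"
    fi
}

require_prop "$CONF_DIR/core-site.xml" fs.defaultFS
require_prop "$CONF_DIR/core-site.xml" hadoop.tmp.dir
```

This only checks for a property's presence, not the validity of its value, but it is enough to catch a file that was saved before an edit finished.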

vim workers
master
node1
node2
4. Modify the startup scripts
# because this test environment starts the hadoop services as root, the user variables below must be added to the startup scripts;
cd /opt/modules/hadoop-3.0.0/sbin
vim start-dfs.sh
#add lines
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
HDFS_ZKFC_USER=root
HDFS_JOURNALNODE_USER=root

vim stop-dfs.sh
#add lines
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
HDFS_ZKFC_USER=root
HDFS_JOURNALNODE_USER=root

vim start-yarn.sh
#add lines
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root

vim stop-yarn.sh
#add lines
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
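An alternative to patching each sbin script is to collect these user variables in etc/hadoop/hadoop-env.sh, which Hadoop 3 sources before its start/stop scripts run; that leaves the stock scripts untouched across upgrades. A sketch, appending to a scratch file that stands in for hadoop-env.sh:

```shell
#!/bin/sh
# ENV_FILE stands in for /opt/modules/hadoop-3.0.0/etc/hadoop/hadoop-env.sh
ENV_FILE="$(mktemp)"
cat >> "$ENV_FILE" <<'EOF'
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
EOF
grep -c '_USER=root' "$ENV_FILE"   # prints 5
```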

5. Push the hadoop configuration files;
cd /opt/modules/hadoop-3.0.0/etc/hadoop
scp ./* root@node1:/opt/modules/hadoop-3.0.0/etc/hadoop/
scp ./* root@node2:/opt/modules/hadoop-3.0.0/etc/hadoop/
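With more worker nodes, per-node scp lines scale badly; a loop over a node list is easier to keep in sync with the cluster. A sketch: `push_cmd` is an illustrative helper that only builds the command string, and the echo wrapper makes this a dry run (drop it to actually copy).

```shell
#!/bin/sh
CONF_DIR=/opt/modules/hadoop-3.0.0/etc/hadoop

# push_cmd: build the scp command for one node (illustrative helper)
push_cmd() {
    echo "scp -r $CONF_DIR/. root@$1:$CONF_DIR/"
}

for node in node1 node2; do
    echo "DRY RUN: $(push_cmd "$node")"   # drop the echo wrapper to run it
done
```

Printing the commands first is also a convenient way to eyeball the target paths before overwriting configs on live nodes.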
6. Format hdfs;
# the hdfs storage path specified in the configuration files is /data/tmp/
/opt/modules/hadoop-3.0.0/bin/hdfs namenode -format
7. Start the hadoop services;
# on all three nodes
cd /opt/modules/zookeeper-3.4.13
./bin/zkServer.sh start
cd /opt/modules/kafka_2.12-2.1.1
./bin/kafka-server-start.sh ./config/server.properties &

/opt/modules/hadoop-3.0.0/bin/hdfs journalnode &

# master node
/opt/modules/hadoop-3.0.0/bin/hdfs namenode -format
/opt/modules/hadoop-3.0.0/bin/hdfs zkfc -formatZK
/opt/modules/hadoop-3.0.0/bin/hdfs namenode &

# node1
/opt/modules/hadoop-3.0.0/bin/hdfs namenode -bootstrapStandby
/opt/modules/hadoop-3.0.0/bin/hdfs namenode &
/opt/modules/hadoop-3.0.0/bin/yarn resourcemanager &
/opt/modules/hadoop-3.0.0/bin/yarn nodemanager &

# node2
/opt/modules/hadoop-3.0.0/bin/hdfs namenode -bootstrapStandby
/opt/modules/hadoop-3.0.0/bin/hdfs namenode &
/opt/modules/hadoop-3.0.0/bin/yarn resourcemanager &
/opt/modules/hadoop-3.0.0/bin/yarn nodemanager &

# on all three nodes
/opt/modules/hadoop-3.0.0/bin/hdfs zkfc &

#master node
cd /opt/modules/hadoop-3.0.0/
./sbin/start-all.sh
cd /opt/modules/hadoop-3.0.0/hbase-2.0.4
./bin/start-hbase.sh

8. Check on each node that the hadoop services started properly;
jps
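Rather than eyeballing the jps output on each node, it can be compared against the daemon set planned for that node. A sketch: JPS_OUT below is a hard-coded sample of what the master should show per the role planning above (on a live node, use JPS_OUT="$(jps)" instead).

```shell
#!/bin/sh
# Sample jps output for the master node (pids are illustrative)
JPS_OUT="12001 NameNode
12002 DataNode
12003 ResourceManager
12004 HQuorumPeer
12005 HMaster
12006 Jps"

for daemon in NameNode DataNode ResourceManager HQuorumPeer HMaster; do
    if echo "$JPS_OUT" | grep -qw "$daemon"; then
        echo "up: $daemon"
    else
        echo "DOWN: $daemon"
    fi
done
```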


9. Run a test;
cd /opt/modules/hadoop-3.0.0
# create a test directory on hdfs
./bin/hdfs dfs -mkdir /testdir1
# create a test file
cd /opt
touch wc.input
vim wc.input
hadoop hive mapreduce
spark storm hbase
sqoop hadoop hive
spark hadoop
# upload wc.input to hdfs
cd /opt/modules/hadoop-3.0.0
./bin/hdfs dfs -put /opt/wc.input /testdir1/wc.input
# run the wordcount demo that ships with hadoop
./bin/yarn jar /opt/modules/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount /testdir1/wc.input /output
# view the output files
./bin/hdfs dfs -ls /output
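The expected wordcount result can be computed locally with standard tools, which gives something concrete to compare against the part-r-* files under /output. A sketch using a local copy of the same wc.input (written to /tmp for illustration):

```shell
#!/bin/sh
# Recreate the test input locally
cat > /tmp/wc.input <<'EOF'
hadoop hive mapreduce
spark storm hbase
sqoop hadoop hive
spark hadoop
EOF

# One word per line, then count: the same word/count pairs the wordcount
# job emits (tab-separated, sorted by word)
tr -s ' ' '\n' < /tmp/wc.input | sort | uniq -c | awk '{print $2 "\t" $1}'
```

For this input, hadoop should come out with a count of 3, and hive and spark with 2 each.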

10. Status screenshot

Screenshot after all services start properly:
ZooKeeper + Kafka + JournalNode + NameNode + HBase


Leave a like as you pass by, and may your skills level up. Keep going! ↖(^ω^)↗


Origin blog.51cto.com/driver2ice/2432011