Hadoop environment configuration under Linux


First, the system environment

Four virtual machines are configured as follows

Modify the hostname

hostnamectl set-hostname node01

Modify the IP address

Editing this in the terminal is a lot of trouble; you can configure it directly through the desktop interface instead

If you want to use the terminal, see: https://jingyan.baidu.com/article/e4d08ffdd417660fd3f60d70.html

Second, preparatory work

1. Install Xshell and get familiar with a few Xshell commands

https://jingyan.baidu.com/article/e75057f210c6dcebc91a89dd.html

Log in to a host

ssh user@hostname

Example: ssh root@xu

Enter the host password

2. Install Xftp to transfer files to and from the CentOS system conveniently

Third, the specific steps

1. Install the JDK ---------- Hadoop on CentOS needs JDK support, so installing the JDK correctly is critical

(1) Upload the JDK

Use Xftp within Xshell to transfer the JDK package to CentOS (as long as it gets there, any transfer method is fine)

(2) Extract the JDK --- decompress according to the package type

Decompression
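A minimal sketch of the extraction, assuming the package is a .tar.gz archive (the file name is a placeholder for the actual version downloaded, and /usr/local is just one possible target directory):

tar -zxvf jdk-8uXXX-linux-x64.tar.gz -C /usr/local/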

The JDK installation path

 

(3) Configure the environment variables (analogous to configuring environment variables on Windows; here the work is done in the terminal, but the essence is unchanged)

vi /etc/profile
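A minimal sketch of the lines to append at the end of /etc/profile (the JDK path is a placeholder; use the actual installation path noted above), then reload the file so the variables take effect:

export JAVA_HOME=/usr/local/jdk1.8.0_XXX
export PATH=$PATH:$JAVA_HOME/bin

source /etc/profile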

(4) Test that the JDK is installed correctly; output similar to the figure below means the JDK installation succeeded
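For example, checking the version should print the installed JDK:

java -version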

2. Install Hadoop

(1) Upload and extract the Hadoop package

(2) Configure environment variables

Environment Variables
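A minimal sketch of the corresponding additions to /etc/profile (the Hadoop path is a placeholder for wherever the package was extracted; sbin is included so that scripts like start-dfs.sh are on the PATH):

export HADOOP_HOME=/opt/hadoop-2.x.x
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

source /etc/profile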

(3) Configure the Hadoop files

            

First, go into the Hadoop configuration folder
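For example (assuming HADOOP_HOME points at the Hadoop installation directory):

cd $HADOOP_HOME/etc/hadoop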

Configure the Java path

vi hadoop-env.sh

 

vi mapred-env.sh

vi yarn-env.sh
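In each of these three scripts, set JAVA_HOME to the absolute JDK path (the same placeholder path as before):

export JAVA_HOME=/usr/local/jdk1.8.0_XXX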

Configure core-site.xml

Specify the NameNode address (fs.defaultFS) and the local data directory (hadoop.tmp.dir):

vi core-site.xml

 

    <property>

        <name>fs.defaultFS</name>

        <value>hdfs://node01:9000</value>

    </property>

    <property>

        <name>hadoop.tmp.dir</name>

        <value>/var/sxt/hadoop/local</value>

    </property>

Configure hdfs-site.xml

Specify the replication factor and the SecondaryNameNode address:

 

 

    <property>

        <name>dfs.replication</name>

        <value>1</value>

    </property>

    <property>

        <name>dfs.namenode.secondary.http-address</name>

        <value>node01:50090</value>

    </property>

Configure the slaves file

List the DataNode hosts:

vi slaves          (add the DataNode hostname, here node01)

(4) Format + start

Format HDFS

hdfs namenode -format   (only format once; do not run this again when restarting the cluster)

 

Start the cluster

start-dfs.sh
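To check that the daemons came up, jps (a JDK tool that lists running Java processes) should show NameNode, DataNode, and SecondaryNameNode on node01:

jps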

 

 

If an error like the following appears

Explanation ------- root cause: the NameNode's ID and the DataNode's ID do not match, so the DataNode kills itself (this ID is used to mark whether a NameNode and a DataNode belong to the same cluster)

Surface cause: the directory under hadoop.tmp.dir configured in core-site.xml is not clean; delete it, then reformat and start again. For details see: https://blog.csdn.net/love666666shen/article/details/74350358
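A minimal sketch of the cleanup, using the hadoop.tmp.dir value configured above:

stop-dfs.sh
rm -rf /var/sxt/hadoop/local
hdfs namenode -format
start-dfs.sh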

   

Fourth, optimization -------- passwordless login

1. In the hosts file on each of the four CentOS machines and on the host machine, configure aliases for one another (use the alias instead of the raw hostname/IP so each host is easier to reach)

Configure on the host machine

Configure on CentOS
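A minimal sketch of the /etc/hosts entries, assuming the addresses follow the 192.168.9.1X pattern used in the known_hosts script below:

192.168.9.11 node01
192.168.9.12 node02
192.168.9.13 node03
192.168.9.14 node04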

 

On the local CentOS machine

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

 

Distribute the public key

scp ~/.ssh/id_dsa.pub root@node02:`pwd`/node01.pub           (the backquoted `pwd` expands to the current directory, so the key lands at the same path on node02)

On the target host, append the public key that was sent over to that host's authorized_keys:

cat node01.pub >> ~/.ssh/authorized_keys
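Afterwards, logging in from node01 should no longer prompt for a password:

ssh root@node02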

 

A convenient method: generate known_hosts entries for all four nodes from a single template line (here k is a file containing the node01 entry copied from known_hosts)

for i in `seq 4`; do
  cat k | sed "s/\(node0\)1\(,192\.168\.9\.1\)1\(.*\)/\1$i\2$i\3/gi" >> known_hosts
done

 

 

 

 
