Big data tools: installing Hadoop (multiple servers) with Hive and HBase

1. Environment versions

The installation packages have been uploaded to Baidu Cloud; grab them yourself: Link: https://pan.baidu.com/s/1evVp5Zk0_X7VdjKlHGkYCw Extraction code: ypti

(The Apache release of Hadoop 2.6.4 was installed first, but Hive reported an error on startup, so everything was switched to the CDH releases.)

2. Configuration work before installation

2.1 Install the JDK

(1) Download the JDK

(2) Unzip it, then add environment variables to /etc/profile:

export JAVA_HOME=/home/jdk1.8.0_131
export PATH=${JAVA_HOME}/bin:${PATH}
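
A quick sanity check that the variables took effect (paths as configured above):

source /etc/profile   # reload the environment
java -version         # should report 1.8.0_131
which java            # should resolve to /home/jdk1.8.0_131/bin/java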

2.2 SSH password-free login

ssh-keygen

Adjust the following to match where the key file actually lives:

cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys

Test with the command:

ssh localhost
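
On a multi-server setup, the master also needs passwordless access to the data node; a sketch, assuming the hostnames from section 2.4:

# run on master; prompts for the data node's password once
ssh-copy-id root@data
# verify: should log in without a password
ssh data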

2.3 Install MySQL (needed for the Hive environment)

You can refer to the Runoob tutorial: https://www.runoob.com/linux/mysql-install-setup.html

My database lives on a remote host, so MySQL must be configured to accept remote connections.
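
A minimal sketch of enabling remote access (MySQL 5.x syntax, using the root/root credentials that appear in hive-site.xml below; use stronger credentials in practice):

# on the MySQL host: allow root to connect from other machines
mysql -u root -p -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'root'; FLUSH PRIVILEGES;"
# from another machine: confirm the connection works
mysql -h IP-address -u root -p -e "SELECT 1;"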

2.4 Configure the IP/hostname mapping

Edit /etc/hosts; both servers need the change. Mine are two machines, one master and one data node. The parenthesized labels below are annotations only and are not written into the file.

IP-address hostname (master)
IP-address hostname (data)
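
For illustration only, with made-up addresses (substitute your own), /etc/hosts on both machines would read:

192.168.1.10 master
192.168.1.11 data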

3. Install Hadoop

(1) Download the files

(2) Decompress them on each server and set the environment variables

Environment variable configuration:

export HADOOP_HOME=/home/hadoop-2.6.0-cdh5.15.1
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
#export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar

Remember to run source /etc/profile for the changes to take effect!

(3) Configuration files

  • Configure the master server

Edit etc/hadoop/core-site.xml under the Hadoop installation directory:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>

Edit etc/hadoop/hdfs-site.xml under the Hadoop installation directory:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop-2.6.0-cdh5.15.1/hadoop_data/hdfs/namenode</value>
  </property> -->
</configuration>

Edit etc/hadoop/mapred-site.xml under the Hadoop installation directory:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
  </property>
</configuration>

Edit etc/hadoop/yarn-site.xml under the Hadoop installation directory:

<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8050</value>
  </property>
</configuration>

Create a masters file under etc/hadoop/ in the Hadoop directory, containing the single line master.

Create a slaves file in the same directory, containing data (with multiple data servers, list them one per line, e.g. data1, data2, data3). Both files can be created in one step, as sketched below.
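
A sketch, assuming the single data node from section 2.4:

cd /home/hadoop-2.6.0-cdh5.15.1/etc/hadoop
echo master > masters   # the master host
echo data > slaves      # one data-node hostname per line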

  • Configure the data server

Edit etc/hadoop/core-site.xml under the Hadoop installation directory:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>

Edit etc/hadoop/hdfs-site.xml under the Hadoop installation directory:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop-2.6.0-cdh5.15.1/hadoop_data/hdfs/datanode</value>
  </property>
</configuration>

Edit etc/hadoop/mapred-site.xml under the Hadoop installation directory:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
  </property>
</configuration>

Edit etc/hadoop/yarn-site.xml under the Hadoop installation directory:

<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8050</value>
  </property>
</configuration>
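
Since core-site.xml, mapred-site.xml, and yarn-site.xml are identical on both machines, one shortcut (a sketch, assuming the passwordless SSH from section 2.2) is to push the whole config directory from master and then adjust only hdfs-site.xml on the data node:

# on master
scp -r /home/hadoop-2.6.0-cdh5.15.1/etc/hadoop root@data:/home/hadoop-2.6.0-cdh5.15.1/etc/
# then, on data, edit hdfs-site.xml to use dfs.datanode.data.dir as shown above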

(4) Start

Go to sbin/ under the Hadoop directory and run start-all.sh, or run start-dfs.sh and start-yarn.sh separately (see the first-run note below).
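
Note that before the very first start, the NameNode has to be formatted on the master, or HDFS will not come up; a minimal first-run sketch:

# on master, first run only: initialize HDFS metadata
hdfs namenode -format
# then start HDFS and YARN (the scripts are on PATH via the sbin entry above)
start-dfs.sh
start-yarn.sh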

(5) Verify

  • The master server should be running the NameNode process

  • The data server should be running the DataNode process (see the jps check below)
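
A quick check on each machine with the JDK's jps tool (the expected process lists assume the config above):

jps
# expected on master: NameNode, SecondaryNameNode, ResourceManager (plus Jps itself)
# expected on data:   DataNode, NodeManager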

4. Install HBase

(1) Download and decompress HBase

(2) Configure environment variables

export HBASE_HOME=/home/hbase-1.2.0-cdh5.15.1
export PATH=$PATH:$HBASE_HOME/bin

(3) Configuration files

Edit conf/hbase-env.sh under the HBase installation directory (typically setting JAVA_HOME to the JDK path from section 2.1).

Edit conf/hbase-site.xml under the HBase installation directory:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:/home/hbase-1.2.0-cdh5.15.1/hbase_data</value>
  </property>
</configuration>

(4) Start

Type hbase shell (see the smoke test below)
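
With hbase.rootdir on the local filesystem as above, HBase runs in standalone mode and must itself be started before the shell is useful; a minimal smoke test:

start-hbase.sh   # launches the HMaster process
hbase shell
# inside the shell, try:
#   status
#   create 't1', 'cf'
#   list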

5. Install Hive

(1) Download and decompress Hive

(2) Configure environment variables

export HIVE_HOME=/home/hive-1.1.0-cdh5.15.1
export PATH=$JAVA_HOME/bin:$MAVEN_HOME/bin:$FINDBUGS_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SQOOP_HOME/bin:$HIVE_HOME/bin:$PATH

(3) Configuration files

Edit conf/hive-env.sh under the Hive installation directory and set:

export HADOOP_HOME=/home/hadoop-2.6.0-cdh5.15.1/
export HBASE_HOME=/home/hbase-1.2.0-cdh5.15.1

Edit conf/hive-site.xml under the Hive installation directory:

<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://IP-address:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=utf8&amp;useSSL=false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
  </property>
</configuration>

This configures the remote connection to MySQL. The metastore database is the one named in the JDBC URL (hive here) and needs to be created in MySQL in advance; see the sketch below.
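
Two preparatory steps, sketched with the root/root credentials above; the connector jar version is illustrative, so match it to your MySQL install:

# 1) create the metastore database on the MySQL host
mysql -u root -p -e "CREATE DATABASE hive DEFAULT CHARACTER SET utf8;"
# 2) put the MySQL JDBC driver on Hive's classpath (jar name is an example)
cp mysql-connector-java-5.1.47.jar /home/hive-1.1.0-cdh5.15.1/lib/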

(4) Start

Type hive to start it

If the terminal reports a JLine error on startup, the jline jar under share/hadoop/yarn/lib/ in the Hadoop directory must be the same version as the jline jar in the Hive installation directory's lib/!
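
One common way to reconcile them, with illustrative jar versions (check what the two lib directories actually contain):

# drop the old jline that ships with YARN and use Hive's newer one
rm /home/hadoop-2.6.0-cdh5.15.1/share/hadoop/yarn/lib/jline-0.9.94.jar
cp /home/hive-1.1.0-cdh5.15.1/lib/jline-2.12.jar /home/hadoop-2.6.0-cdh5.15.1/share/hadoop/yarn/lib/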

The installation ends here for now; the rest will follow later!

 


Origin blog.csdn.net/qq_28409193/article/details/112987718