Building an Impala 4.1.2 + Kudu 1.15.0 test environment on CentOS 7

Install dependencies

This part is kept brief; if you already have a working environment, you can use it directly.

Java

The Java installation package must be downloaded from Oracle, which requires a login, so please download it yourself.

cd /mnt
tar zxvf jdk-8u202-linux-x64.tar.gz

Add the environment variables to /etc/bashrc and run source /etc/bashrc. The Hadoop and Hive environment variables are also configured here.

export JAVA_HOME=/mnt/jdk1.8.0_202
export PATH=$JAVA_HOME/bin:$HIVE_HOME/bin:$HADOOP_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

export HIVE_HOME=/mnt/hive-3.1.2
export HADOOP_HOME=/mnt/hadoop-3.3.2
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export HDFS_NAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
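
As a quick sanity check (a sketch; the paths follow the layout used in this article), confirm that the JDK's bin directory ends up first on PATH, so its java shadows any system JDK:

```shell
# reproduce the PATH setup above and confirm the JDK bin dir comes first
export JAVA_HOME=/mnt/jdk1.8.0_202
export PATH=$JAVA_HOME/bin:$PATH
echo "${PATH%%:*}"   # first PATH entry: /mnt/jdk1.8.0_202/bin
```

On the live machine, java -version should then report 1.8.0_202.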

Hadoop

Download Hadoop 3.3.2

wget https://archive.apache.org/dist/hadoop/core/hadoop-3.3.2/hadoop-3.3.2.tar.gz
tar zxvf hadoop-3.3.2.tar.gz

Set up passwordless SSH to localhost

ssh-keygen -t rsa
cd ~/.ssh/
cat id_rsa.pub >> authorized_keys

Modify the configuration files

core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.proxyuser.work.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.work.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/mnt/hadoop-3.3.2/tmp</value>
    </property>
</configuration>

hdfs-site.xml

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/mnt/hadoop-3.3.2/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/mnt/hadoop-3.3.2/hdfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

Format the NameNode

/mnt/hadoop-3.3.2/bin/hdfs namenode -format

Start HDFS

/mnt/hadoop-3.3.2/sbin/start-dfs.sh
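
Once start-dfs.sh returns, a quick way to confirm the daemons are listening is to probe their ports (a sketch using bash's /dev/tcp; 9000 matches fs.defaultFS above, and 9870 is the Hadoop 3 NameNode web UI default):

```shell
# probe a local TCP port; prints "open" or "closed"
port_open() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null && { exec 3>&-; echo open; } || echo closed
}
port_open 9000   # NameNode RPC (fs.defaultFS)
port_open 9870   # NameNode web UI
```

jps should also list NameNode, DataNode, and SecondaryNameNode at this point.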

MySQL

Install MySQL 8 via yum

yum install -y ca-certificates
wget https://dev.mysql.com/get/mysql80-community-release-el7-2.noarch.rpm
yum -y install mysql80-community-release-el7-2.noarch.rpm
yum -y install mysql-community-server --nogpgcheck
# start MySQL
systemctl start mysqld

Change the password

# view the initial MySQL password
grep "password" /var/log/mysqld.log
# log in to MySQL, then change the root password
ALTER USER 'root'@'localhost' IDENTIFIED BY 'AAAaaa111~';
# relax the MySQL password policy and minimum length
set global validate_password.policy=0;
set global validate_password.length=4;

Create the database required by Hive

CREATE USER 'hive'@'%' IDENTIFIED BY 'hive';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'%';
DELETE FROM mysql.user WHERE user='';
flush privileges;
CREATE DATABASE hive charset=utf8;

Hive

Download Hive 3.1.2

wget https://downloads.apache.org/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
tar zxvf apache-hive-3.1.2-bin.tar.gz
mv apache-hive-3.1.2-bin hive-3.1.2

Modify the configuration file

hive-site.xml

<configuration>
    <property>
        <name>hive.metastore.dml.events</name>
        <value>true</value>
    </property>  
    <property>
      <name>hive.exec.scratchdir</name>
      <value>/mnt/hive-3.1.2/scratchdir</value>
    </property>
    <property>
      <name>hive.metastore.warehouse.dir</name>
      <value>/mnt/hive-3.1.2/warehouse</value>
    </property>
    <property>
      <name>hive.metastore.uris</name>
      <value>thrift://localhost:9083</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.cj.jdbc.Driver</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false&amp;allowPublicKeyRetrieval=true&amp;serverTimezone=UTC</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>root</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>AAAaaa111~</value>
    </property>
    <property>
      <name>hive.metastore.event.db.notification.api.auth</name>
      <value>false</value>
    </property> 
    <property>
      <name>hive.server2.active.passive.ha.enable</name>
      <value>true</value>
    </property>
</configuration>

Initialize the schema. Note that Hive does not bundle the MySQL JDBC driver, so the MySQL Connector/J jar (matching the com.mysql.cj.jdbc.Driver class configured above) must be placed in /mnt/hive-3.1.2/lib first.

/mnt/hive-3.1.2/bin/schematool -dbType mysql -initSchema

Start the metastore

/mnt/hive-3.1.2/bin/hive --service metastore &

Install core components

Kudu

Download Kudu 1.15.0

wget https://github.com/MartinWeindel/kudu-rpm/releases/download/v1.15.0-1/kudu-1.15.0-1.x86_64.rpm
yum -y localinstall kudu-1.15.0-1.x86_64.rpm

Install the NTP service

yum install -y ntp
systemctl start ntpd
systemctl enable ntpd

Modify the configuration files (optional)

master.gflagfile

--log_dir=/mnt/kudu
--fs_wal_dir=/mnt/kudu/master
--fs_data_dirs=/mnt/kudu/master

tserver.gflagfile

--tserver_master_addrs=127.0.0.1:7051

--log_dir=/mnt/kudu
--fs_wal_dir=/mnt/kudu/tserver
--fs_data_dirs=/mnt/kudu/tserver

Start Kudu

kudu-master --flagfile /etc/kudu/conf/master.gflagfile &
kudu-tserver --flagfile /etc/kudu/conf/tserver.gflagfile &
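
If the kudu CLI installed by the RPM is on PATH, kudu cluster ksck gives a consolidated health report for the cluster. The snippet below guards the call so it degrades gracefully when the CLI is missing (the master address matches the gflagfiles above):

```shell
# run a Kudu cluster health check when the kudu CLI is available
if command -v kudu >/dev/null 2>&1; then
  kudu cluster ksck 127.0.0.1:7051
else
  echo "kudu CLI not on PATH"
fi
```

kudu tserver list 127.0.0.1:7051 should likewise show the single registered tablet server.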

Impala

Impala 4.1.2 is compiled from source. Note that you must set export USE_APACHE_HIVE=true before compiling so that the resulting build is compatible with Hive 3.1.2. Otherwise, creating a database will fail with:

ERROR: ImpalaRuntimeException: Error making 'createDatabase' RPC to Hive Metastore:
CAUSED BY: TApplicationException: Invalid method name: 'get_database_req'

After compilation completes, package and install the RPM yourself; you can refer to impala-rpm and adapt it as needed.

Modify the configuration file

Create soft links to hive-site.xml and core-site.xml under /etc/impala/conf/:

cd /etc/impala/conf
ln -s /mnt/hive-3.1.2/conf/hive-site.xml hive-site.xml
ln -s /mnt/hadoop-3.3.2/etc/hadoop/core-site.xml core-site.xml

impala-conf.xml

<configuration>
  <property>
    <name>catalog_service_enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>catalog_topic_mode</name>
    <value>minimal</value>
  </property>
  <property>
    <name>kudu_master_hosts</name>
    <value>localhost:7051</value>
  </property>
  <property>
    <name>default_storage_engine</name>
    <value>kudu</value>
  </property>
</configuration>

Start Impala. statestored must be up first, because catalogd and impalad register with it:

statestored &
catalogd &
impalad &
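
Since catalogd and impalad register with the statestore, it can help to gate each start on the previous daemon's web port coming up (a sketch; the ports are the defaults listed later in this article):

```shell
# wait until a local TCP port answers, up to RETRIES seconds
wait_port() {  # usage: wait_port PORT [RETRIES]
  port=$1; tries=${2:-30}
  while [ "$tries" -gt 0 ]; do
    (exec 3<>"/dev/tcp/127.0.0.1/$port") 2>/dev/null && { exec 3>&-; return 0; }
    tries=$((tries - 1)); sleep 1
  done
  return 1
}

statestored &
wait_port 25010 && catalogd &
wait_port 25020 && impalad &
```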

Verify

[root@bogon ~] impala-shell
Starting Impala Shell with no authentication using Python 2.7.5
Opened TCP connection to localhost.localdomain:21050
Connected to localhost.localdomain:21050
Server version: impalad version 4.1.2-RELEASE RELEASE (build 1d7b63102ebc8974e8133c964917ea8052148088)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v4.1.2-RELEASE (1d7b631) built on Thu Jul  6 05:44:12 UTC 2023)

To see live updates on a query's progress, run 'set LIVE_SUMMARY=1;'.
***********************************************************************************
[localhost.localdomain:21050] default> CREATE TABLE test
                                     > (
                                     >   id BIGINT,
                                     >   name STRING,
                                     >   PRIMARY KEY(id)
                                     > )
                                     > PARTITION BY HASH PARTITIONS 16
                                     > STORED AS KUDU
                                     > TBLPROPERTIES (
                                     >   'kudu.master_addresses' = 'localhost:7051',
                                     >   'kudu.num_tablet_replicas' = '1'
                                     > );
+-------------------------+
| summary                 |
+-------------------------+
| Table has been created. |
+-------------------------+
Fetched 1 row(s) in 9.89s
[localhost.localdomain:21050] default> insert into test values (1, 'xiedeyantu');
Query: insert into test values (1, 'xiedeyantu')
Query submitted at: 2023-07-07 03:50:41 (Coordinator: http://bogon:25000)
Query progress can be monitored at: http://bogon:25000/query_plan?query_id=b94595ef56094a6e:05654dec00000000
Modified 1 row(s), 0 row error(s) in 0.22s
[localhost.localdomain:21050] default> select * from test;
Query: select * from test
Query submitted at: 2023-07-07 03:50:44 (Coordinator: http://bogon:25000)
Query progress can be monitored at: http://bogon:25000/query_plan?query_id=a74db79af051b646:81c486ed00000000
+----+------------+
| id | name       |
+----+------------+
| 1  | xiedeyantu |
+----+------------+
Fetched 1 row(s) in 0.15s

Check Kudu through its web page at http://127.0.0.1:8051. For convenience, you can also browse it from the terminal with w3m: w3m http://127.0.0.1:8051.

Check Impala through its web pages; the ports are:

component     web port
statestored   25010
catalogd      25020
impalad       25000

Open: http://127.0.0.1:25020/catalog

Open: http://127.0.0.1:25000/backends

Open: http://127.0.0.1:25010/metrics

At this point, installation and verification are complete.

Reprinted from: blog.csdn.net/weixin_39992480/article/details/131599591