Troubleshooting Ambari service component errors

Contents

Hive component issues

HBase component issues

Oozie component issues

Atlas component issues

Druid component issues

Hive component issues

Issue 1: HiveServer2 reports the following error:

Could not open connection to the HS2 server. Please check the server URI and if the URI is correct, then ask the administrator to check the server status.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://ambari-hadoop2:10000/;transportMode=binary: java.net.ConnectException: 拒绝连接 (Connection refused) (state=08S01,code=0)

First try restarting the service. In this case the restart succeeded.
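A "Connection refused" error usually means nothing is listening on the port yet, so it can help to confirm the port is open before retrying the JDBC client. The following is a minimal sketch of such a check. The helper name `port_is_open` is my own, and the host and port in the comment are simply the ones from the JDBC URI in the error above.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


# Example call (values taken from the JDBC URI in the error message):
# port_is_open("ambari-hadoop2", 10000)
```

If this returns False after the restart, HiveServer2 is probably still starting or has exited again, and its log is the next place to look.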

HBase component issues

The HBase Master failed to start. A second restart attempt succeeded.

Oozie component issues

The Oozie server failed to start with the following error:

Error: IO_ERROR : java.io.IOException: Error while connecting Oozie server. No of retries = 5. Exception = Error while authenticating with endpoint: http://ambari-hadoop3:11000/oozie/versions

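I tried restarting it manually.

The service appeared to start, but the Oozie web UI could not be reached.

After waiting a while and refreshing the page, it loaded again, but now showed the message "Oozie web console is disabled."

The Oozie UI depends on Ext JS at startup. For licensing reasons, HDP releases after 2.6 no longer bundle Ext JS, so it has to be installed manually before the Oozie UI can be used. Try downloading Ext JS: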

wget http://public-repo-1.hortonworks.com/HDP-UTILS-GPL-1.1.0.22/repos/centos7-ppc/extjs/extjs-2.2-1.noarch.rpm

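That returned 403 Forbidden, so I tried again with a User-Agent set:

wget -U 'your-own-user-agent' -O 'extjs-2.2-1.noarch.rpm' http://public-repo-1.hortonworks.com/HDP-UTILS-GPL-1.1.0.22/repos/centos7-ppc/extjs/extjs-2.2-1.noarch.rpm

A file was created this time, but its size was wrong: the download was incomplete, so the installation failed.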
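Before running the installer on a suspicious download, a quick sanity check is to look at the first bytes of the file. RPM packages begin with the four magic bytes ED AB EE DB, while an HTML error page or a log file does not. The sketch below assumes the file name used in the wget command above; `looks_like_rpm` is a hypothetical helper name.

```python
RPM_MAGIC = b"\xed\xab\xee\xdb"  # first four bytes of an RPM package file


def looks_like_rpm(path):
    """Return True if the file at path starts with the RPM magic number."""
    with open(path, "rb") as f:
        return f.read(4) == RPM_MAGIC


# Example call (file name taken from the wget command above):
# looks_like_rpm("extjs-2.2-1.noarch.rpm")
```

This only checks the header, so a truncated package would still pass. It is meant to catch the common case where the server returned an error page instead of the package.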

Next, I followed the steps in this article:

"Oozie web console is disabled. To enable Oozie web console install the Ext JS library" solution (CSDN blog)

That resolved the problem.

Atlas component issues

Starting the Atlas Metadata Server fails with the following error:

stderr: 
2023-11-16 15:12:06,479 - Connection failed to Ranger Admin. Reason - [Errno 111] Connection refused.
2023-11-16 15:02:13,533 - Ranger KMS is not ssl enabled, skipping ssl-client for hdfs audits.
2023-11-16 15:02:13,535 - call['ambari-python-wrap /usr/bin/hdp-select status atlas-server'] {'timeout': 20}
2023-11-16 15:02:13,632 - call returned (0, 'atlas-server - 3.1.5.0-152')
2023-11-16 15:02:13,635 - RangeradminV2: Skip ranger admin if it's down !
2023-11-16 15:02:13,684 - Will retry 74 time(s), caught exception: Connection failed to Ranger Admin. Reason - [Errno 111] Connection refused.. Sleeping for 8 sec(s)
2023-11-16 15:02:21,689 - Will retry 73 time(s), caught exception: Connection failed to Ranger Admin. Reason - [Errno 111] Connection refused.. Sleeping for 8 sec(s)
....................
2023-11-16 15:11:58,469 - Will retry 1 time(s), caught exception: Connection failed to Ranger Admin. Reason - [Errno 111] Connection refused.. Sleeping for 8 sec(s)
2023-11-16 15:12:06,479 - Connection failed to Ranger Admin. Reason - [Errno 111] Connection refused.

This happens because the Ranger Admin service is not running or failed to start. Restarting Ranger Admin resolves it.
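The retry lines in the log also explain why the failed start takes about ten minutes to give up: each line states how many retries remain and how long the agent sleeps between them. As a small worked example, the sketch below multiplies the two numbers from one such line. The regular expression and the function name are my own and only match the line format shown in the log above.

```python
import re

# Matches the retry lines shown in the log above.
RETRY_RE = re.compile(r"Will retry (\d+) time\(s\).*Sleeping for (\d+) sec\(s\)")


def remaining_wait_seconds(line):
    """Return retries_left * sleep_seconds for a retry line, or None if no match."""
    match = RETRY_RE.search(line)
    if match is None:
        return None
    retries_left, sleep_seconds = int(match.group(1)), int(match.group(2))
    return retries_left * sleep_seconds


line = (
    "2023-11-16 15:02:13,684 - Will retry 74 time(s), caught exception: "
    "Connection failed to Ranger Admin. Reason - [Errno 111] Connection refused.. "
    "Sleeping for 8 sec(s)"
)
print(remaining_wait_seconds(line))  # 592
```

592 seconds is just under ten minutes, which agrees with the timestamps in the log: the first retry is logged at 15:02:13 and the final failure at 15:12:06.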

Unfortunately, that led to a new error:

ERROR Java::OrgApacheHadoopHbaseIpc::CallTimeoutException: Call id=15, waitTime=90015, rpcTimeout=90000
stderr: 
Python script has been killed due to timeout after waiting 1200 secs
 stdout:
2023-11-16 15:53:07,114 - Stack Feature Version Info: Cluster Stack=3.1, Command Stack=None, Command Version=3.1.5.0-152 -> 3.1.5.0-152
2023-11-16 15:53:07,146 - Using hadoop conf dir: /usr/hdp/3.1.5.0-152/hadoop/conf
2023-11-16 15:53:07,495 - Stack Feature Version Info: Cluster Stack=3.1, Command Stack=None, Command Version=3.1.5.0-152 -> 3.1.5.0-152
2023-11-16 15:53:07,502 - Using hadoop conf dir: /usr/hdp/3.1.5.0-152/hadoop/conf
2023-11-16 15:53:07,505 - Group['kms'] {}
2023-11-16 15:53:07,507 - Group['livy'] {}
2023-11-16 15:53:07,507 - Group['spark'] {}
2023-11-16 15:53:07,507 - Group['ranger'] {}
2023-11-16 15:53:07,508 - Group['hdfs'] {}
2023-11-16 15:53:07,508 - Group['hadoop'] {}
2023-11-16 15:53:07,509 - Group['users'] {}
.......
2023-11-16 15:53:40,331 - Stack Feature Version Info: Cluster Stack=3.1, Command Stack=None, Command Version=3.1.5.0-152 -> 3.1.5.0-152
2023-11-16 15:53:40,336 - Execute['cat /var/lib/ambari-agent/tmp/atlas_hbase_setup.rb | hbase shell -n'] {'tries': 5, 'user': 'hbase', 'try_sleep': 10}
2023-11-16 16:06:36,052 - Retrying after 10 seconds. Reason: Execution of 'cat /var/lib/ambari-agent/tmp/atlas_hbase_setup.rb | hbase shell -n' returned 1. SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.5.0-152/phoenix/phoenix-5.0.0.3.1.5.0-152-server.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.5.0-152/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.5.0-152/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
atlas_janus
ATLAS_ENTITY_AUDIT_EVENTS
atlas
TABLE
Took 730.7334 secondsjava exception
ERROR Java::OrgApacheHadoopHbaseIpc::CallTimeoutException: Call id=15, waitTime=90015, rpcTimeout=90000

Command failed after 1 tries

Checking the HBase status showed that every RegionServer was down, so I started them again manually.

Once the RegionServers were up, I started Atlas again. This time Atlas started successfully and its web page loaded normally.
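Since Atlas depends on HBase here, it can be worth checking the RegionServer states before starting Atlas. The sketch below is only an illustration under assumptions: it expects a dictionary shaped like the JSON that Ambari's REST API returns for a host_components query (an "items" list whose entries carry a "HostRoles" object with "component_name", "host_name" and "state"). The sample payload and the host assignments in it are made up for the example, not taken from this cluster.

```python
def stopped_hosts(payload, component="HBASE_REGIONSERVER"):
    """List hosts whose given component is not in the STARTED state."""
    hosts = []
    for item in payload.get("items", []):
        roles = item.get("HostRoles", {})
        if roles.get("component_name") == component and roles.get("state") != "STARTED":
            hosts.append(roles.get("host_name"))
    return hosts


# Hypothetical payload: which hosts run RegionServers is an assumption.
sample = {
    "items": [
        {"HostRoles": {"component_name": "HBASE_REGIONSERVER",
                       "host_name": "ambari-hadoop1", "state": "STARTED"}},
        {"HostRoles": {"component_name": "HBASE_REGIONSERVER",
                       "host_name": "ambari-hadoop2", "state": "INSTALLED"}},
        {"HostRoles": {"component_name": "HBASE_REGIONSERVER",
                       "host_name": "ambari-hadoop3", "state": "INSTALLED"}},
    ]
}
print(stopped_hosts(sample))  # ['ambari-hadoop2', 'ambari-hadoop3']
```

An empty list would suggest that all RegionServers report STARTED, at which point retrying the Atlas start is more likely to get past the HBase table setup step.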

Issues with other components will be added here as they come up.

Druid component issues

The alert message was:

Command aborted. Reason: 'Server considered task failed and automatically aborted it'

The key error in the log was:

2023-11-15 18:27:17,273 - Retrying after 10 seconds. Reason: Execution of '/usr/java/jdk1.8.0_371/bin/java -cp /usr/lib/ambari-agent/DBConnectionVerification.jar:/usr/hdp/current/druid-broker/extensions/mysql-metadata-storage/mysql-connector-java.jar org.apache.ambari.server.DBConnectionVerification 'jdbc:mysql://ambari-hadoop1:3306/druid?createDatabaseIfNotExist=true' druid [PROTECTED] com.mysql.jdbc.Driver' returned 1. Wed Nov 15 18:27:16 CST 2023 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
ERROR: Unable to connect to the DB. Please check DB connection properties.
java.sql.SQLException: Access denied for user 'druid'@'ambari-hadoop1' (using password: YES)
Command aborted. Reason: 'Server considered task failed and automatically aborted it'
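SSL verification had already been disabled earlier, so the next suspect was user permissions. It turned out that MySQL had no user named druid (and therefore no password for it), so I created the database and the user and then retried: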
mysql> use mysql;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> select user from user;
+---------------+
| user          |
+---------------+
| ambari        |
| oozie         |
| rangeradmin   |
| rangerkms     |
| root          |
| rangeradmin   |
| rangerkms     |
| ambari        |
| mysql.session |
| mysql.sys     |
| rangeradmin   |
| rangerkms     |
+---------------+
12 rows in set (0.01 sec)

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| ambari             |
| mysql              |
| oozie              |
| performance_schema |
| ranger             |
| rangerkms          |
| sys                |
+--------------------+
8 rows in set (0.00 sec)

mysql> CREATE DATABASE druid  character set utf8 collate utf8_general_ci;
Query OK, 1 row affected (0.05 sec)

mysql> CREATE USER 'druid'@'%' IDENTIFIED BY '123456';
Query OK, 0 rows affected (0.02 sec)

mysql> GRANT ALL PRIVILEGES ON druid.* TO 'druid'@'%';
Query OK, 0 rows affected (0.00 sec)

mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.03 sec)

mysql> 
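The "Access denied" line in the Druid log names the exact account MySQL tried to authenticate, which tells you which user and host pair to look for. Here is a minimal sketch that pulls those fields out of the message. The regular expression and the function name are my own and assume the message format shown in the log above.

```python
import re

# Matches the MySQL error text shown in the Druid log above.
ACCESS_DENIED_RE = re.compile(
    r"Access denied for user '([^']*)'@'([^']*)' \(using password: (YES|NO)\)"
)


def parse_access_denied(message):
    """Return (user, host, used_password) from an access-denied message, or None."""
    match = ACCESS_DENIED_RE.search(message)
    if match is None:
        return None
    user, host, used_password = match.groups()
    return user, host, used_password == "YES"


msg = ("java.sql.SQLException: Access denied for user 'druid'@'ambari-hadoop1' "
       "(using password: YES)")
print(parse_access_denied(msg))  # ('druid', 'ambari-hadoop1', True)
```

With the user and host in hand, querying both columns (for example `select user, host from user;`) makes it easier to see whether a matching account exists. The listing above selects only `user`, which is why the same names appear several times.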


Reposted from blog.csdn.net/qq_44540985/article/details/134437544