Cloudera Manager 5 installation: problems encountered and solutions

The following problems came up during installation, several of them triggered by network interruptions:

Problem 1: Installation hangs at acquiring the install lock
/tmp/scm_prepare_node.tYlmPfrT 
using SSH_CLIENT to get the SCM hostname: 172.16.77.20 33950 22 
opening logging file descriptor 

Starting install script...Acquiring install lock... BEGIN flock 4 

The installer stayed at this step for about half an hour. Solution: disable SELinux (set it to disabled).

Problem 2: Cannot select the host

After a failed installation, the host can no longer be selected.

 


Figure 1
Solution: clean up the files left behind by the failed installation.
See "Uninstall Cloudera Manager 5.1.x and related software" [translated from the official website: high availability]






Problem 3: DNS reverse resolution (PTR) returns localhost

Description:

DNS reverse resolution fails: the Cloudera Manager Server hostname cannot be resolved correctly.
Log:
Detecting Cloudera Manager Server...
Detecting Cloudera Manager Server...
BEGIN host -t PTR 192.168.1.198
198.1.168.192.in-addr.arpa domain name pointer localhost.
END (0)
using localhost as scm server hostname
BEGIN which python
/usr/bin/python
END (0)
BEGIN python -c 'import socket; import sys; s = socket.socket(socket.AF_INET); s.settimeout(5.0); s.connect((sys.argv[1], int(sys.argv[2]))); s.close();' localhost 7182
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 1, in connect
socket.error: [Errno 111] Connection refused
END (1)
could not contact scm server at localhost:7182, giving up
waiting for rollback request

Solution:
On the machine that cannot connect, remove /usr/bin/host by renaming it:
  sudo mv /usr/bin/host /usr/bin/host.bak


Explanation:
I don't understand why Cloudera designed it this way: the installer already has the IP of the Cloudera Manager Server, yet it insists on resolving that IP to a hostname before connecting.
Because DNS reverse resolution is not configured properly, looking up the hostname for the Cloudera Manager Server's IP returns localhost, so the subsequent connection attempt goes to localhost:7182 and fails.
With /usr/bin/host out of the way the lookup cannot run, so Cloudera Manager connects using the IP directly, and that works.
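
The reverse lookup can also be checked by hand before running the installer. Below is a minimal sketch, assuming Python is available on the host (the log shows /usr/bin/python). The function names is_loopback_name and check_reverse_lookup are invented for this illustration and are not part of Cloudera Manager. It uses the standard library's socket.gethostbyaddr, which consults /etc/hosts as well as DNS, so its answer may differ from what `host -t PTR` prints.

```python
import socket


def is_loopback_name(name):
    """Return True if a reverse-lookup result is a localhost-style name."""
    # PTR answers usually carry a trailing dot, e.g. "localhost."
    cleaned = name.strip().rstrip(".").lower()
    return cleaned in ("localhost", "localhost.localdomain")


def check_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Reverse-resolve ip and report whether the answer looks usable.

    resolver is a parameter so the check can be exercised without DNS.
    Returns a (hostname or None, ok) tuple.
    """
    try:
        hostname = resolver(ip)[0]
    except (socket.herror, socket.gaierror):
        # No reverse record at all for this address.
        return None, False
    return hostname, not is_loopback_name(hostname)
```

Calling check_reverse_lookup("192.168.1.198") on the affected machine would be expected to return a localhost-style name with ok set to False while the PTR record is wrong, assuming the system resolver sees the same record as the host command.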






Problem 4: NTP


Problem Description: 
Bad Health --Clock Offset
The host's NTP service did not respond to a request for the clock offset.
Solution:
Configure the NTP service.
Reference steps:

CentOS configure NTP Server:

http://www.hailiangchen.com/centos-ntp/

Commonly used NTP server addresses and IPs in China:

http://www.douban.com/note/171309770/

Modify the configuration file:
[root@work03 ~]# vim /etc/ntp.conf

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool ( http://www.pool.ntp.org/join.html).
server s1a.time.edu.cn prefer
server s1b.time.edu.cn
server s1c.time.edu.cn

restrict 172.16.1.0 mask 255.255.255.0 nomodify   <=== allow clients from the LAN

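Start NTP:
#service ntpd restart    <=== start the NTP service
Sync the time on the clients (work02, work03):
ntpdate work01
Note: the NTP service needs about five minutes after starting before it can serve time. If a client tries to sync before then, it fails with "no server suitable for synchronization found".
Scheduled sync:
On work02 and work03, configure crontab to sync the time on a schedule: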

crontab -e
00 12 * * * /usr/sbin/ntpdate 192.168.56.121 >> /root/ntpdate.log 2>&1
Problem 2.2
Description:
     Clock Offset

  • Ensure that the host's hostname is configured properly.
  • Ensure that port 7182 is accessible on the Cloudera Manager Server (check firewall rules).
  • Ensure that ports 9000 and 9001 are free on the host being added.
  • Check agent logs in /var/log/cloudera-scm-agent/ on the host being added (some of the logs can be found in the installation details).
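
The port checks in this list can be scripted. The following is a rough sketch only: port_open and port_free are names made up for this example, and the port numbers are the ones from the list above. port_open mirrors the connect test the installer itself runs in the Problem 3 log. port_free tries to bind the port locally, which may also fail for reasons other than a running service (for example, insufficient privileges on low ports).

```python
import socket


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


def port_free(port, bind_addr="0.0.0.0"):
    """Return True if the local TCP port can be bound, i.e. nothing holds it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((bind_addr, port))
    except socket.error:
        return False
    finally:
        sock.close()
    return True
```

On the host being added, port_open("work01", 7182) should be True once the firewall allows it (assuming work01 is the Cloudera Manager Server, as the NTP setup suggests), and port_free(9000) and port_free(9001) should both be True.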
Diagnosis:

Run 'ntpdc -c loopinfo' on the affected hosts (work02, work03):
[root@work03 work]# ntpdc -c loopinfo
ntpdc: read: Connection refused
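Solution:

Enable the NTP service:
Set ntpd to start at boot on all three machines:
chkconfig ntpd on
chkconfig only registers the service for the next boot; if ntpd is not running yet, start it with service ntpd start.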






Problem 5: heartbeat

Error message:
Installation failed. Failed to receive heartbeat from agent.
Solution: turn off the firewall






Problem 6: Unknown Health

Unknown Health
After a reboot: Request to the Host Monitor failed.
service --status-all | grep clo
Checking the scm-agent status on the machine shows: cloudera-scm-agent dead but pid file exists
Solution: restart the services
service cloudera-scm-agent restart
service cloudera-scm-server restart






Problem 7: hostname and canonical name are inconsistent

Bad Health
The hostname and canonical name for this host are not consistent when checked from a Java process.
canonical name:
4092 Monitor-HostMonitor throttling_logger WARNING  (29 skipped) hostname work02 differs from the canonical name work02.xinzhitang.com
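Solution: edit /etc/hosts so that the FQDN and the hostname are the same.
P.S. This fixed the problem, but I don't understand why the hostname and the host alias have to be identical.
/etc/hosts: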
192.168.1.185 work01 work01
192.168.1.141 work02 work02
192.168.1.198 work03 work03
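
A hosts file can be checked for this condition before restarting anything. The sketch below assumes that the canonical name reported for an address is the first name on its /etc/hosts line, which is the usual behaviour for a files-based lookup. How the Java check in Cloudera Manager resolves names may differ. parse_hosts and canonical_matches are hypothetical helper names.

```python
def parse_hosts(text):
    """Map each IP address in hosts-file text to its list of names."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an address with no name is not a usable entry
        entries.setdefault(fields[0], []).extend(fields[1:])
    return entries


def canonical_matches(text, ip, hostname):
    """Return True if the first name listed for ip equals hostname."""
    names = parse_hosts(text).get(ip, [])
    return bool(names) and names[0] == hostname
```

With the file above, canonical_matches(open("/etc/hosts").read(), "192.168.1.141", "work02") is True. On a live host, comparing socket.gethostname() with socket.getfqdn() gives a rough view of the same two values.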





Problem 8: Concerning Health

Concerning Health Issue
--  Network Interface Speed --
Description: The host has 2 network interface(s) that appear to be operating at less than full speed. Warning threshold: any.
Details:
This is a host health test that checks for network interfaces that appear to be operating at less than full speed.
A failure of this health test may indicate that network interface(s) may be configured incorrectly and may be causing performance problems. Use the ethtool command to check and configure the host's network interfaces to use the fastest available link speed and duplex mode.
Solution:
For this test I changed the Cloudera Manager configuration instead, which probably does not count as a real fix.
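
To see which interfaces are actually negotiating a lower speed, the link speed the kernel reports can be read from sysfs. This is a sketch under assumptions: it relies on the Linux /sys/class/net/<interface>/speed files (values in Mb/s), the function names are made up for this example, and the expected speed is a parameter because "full speed" depends on the hardware. It shows what the kernel reports, which is not necessarily what the Cloudera Manager health test evaluates.

```python
import os


def read_link_speeds(sysfs_root="/sys/class/net"):
    """Return {interface: speed in Mb/s, or None if no speed is reported}."""
    speeds = {}
    for iface in sorted(os.listdir(sysfs_root)):
        path = os.path.join(sysfs_root, iface, "speed")
        try:
            with open(path) as handle:
                value = int(handle.read().strip())
        except (IOError, OSError, ValueError):
            value = None  # virtual interface, link down, or no speed file
        # Some drivers report -1 when the speed is unknown.
        speeds[iface] = value if value is not None and value > 0 else None
    return speeds


def slower_than(speeds, expected_mbps):
    """List interfaces whose reported speed is below expected_mbps."""
    return [name for name, mbps in sorted(speeds.items())
            if mbps is not None and mbps < expected_mbps]
```

For example, slower_than(read_link_speeds(), 1000) lists the interfaces running below gigabit, which can then be examined with ethtool as the health test description suggests.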
