Cloudera SCM Server fails to start: root-cause analysis of "pam_unix(sshd:session): session closed for user root"

Yesterday we installed CDH Hadoop in a customer environment. The installation itself went fairly smoothly, but when we tried to start the services, both the Cloudera SCM Server and the Agent failed to start.

[root@YXnode01 ~]# service cloudera-scm-server restart
Restarting cloudera-scm-server (via systemctl):  Job for cloudera-scm-server.service failed because the control process exited with error code. See "systemctl status cloudera-scm-server.service" and "journalctl -xe" for details.
                                                           [FAILED]

Following the hint in the message, we ran "systemctl status cloudera-scm-server.service" to see the detailed error information:

[root@YXnode01 ~]# systemctl status cloudera-scm-server.service
● cloudera-scm-server.service - LSB: Cloudera SCM Server
   Loaded: loaded (/etc/rc.d/init.d/cloudera-scm-server; bad; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2019-11-05 09:25:49 CST; 3min 32s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 15982 ExecStart=/etc/rc.d/init.d/cloudera-scm-server start (code=exited, status=1/FAILURE)

Nov 05 09:25:44 YXnode01.esgyn.cn systemd[1]: Starting LSB: Cloudera SCM Server...
Nov 05 09:25:44 YXnode01.esgyn.cn su[16015]: pam_unix(su:auth): auth could not identify password for [cloudera-scm]
Nov 05 09:25:44 YXnode01.esgyn.cn su[16015]: pam_succeed_if(su:auth): requirement "uid >= 1000" not met by user "cloudera-scm"
Nov 05 09:25:46 YXnode01.esgyn.cn su[16015]: FAILED SU (to cloudera-scm) root on none
Nov 05 09:25:49 YXnode01.esgyn.cn cloudera-scm-server[15982]: Starting cloudera-scm-server: [FAILED]
Nov 05 09:25:49 YXnode01.esgyn.cn systemd[1]: cloudera-scm-server.service: control process exited, code=exited status=1
Nov 05 09:25:49 YXnode01.esgyn.cn systemd[1]: Failed to start LSB: Cloudera SCM Server.
Nov 05 09:25:49 YXnode01.esgyn.cn systemd[1]: Unit cloudera-scm-server.service entered failed state.
Nov 05 09:25:49 YXnode01.esgyn.cn systemd[1]: cloudera-scm-server.service failed.

We also looked at the Cloudera SCM Server log:

[root@YXnode01 ~]# tail -10f /var/log/cloudera-scm-server/cloudera-scm-server.out 
Password: su: Error in service module

We checked SELinux, the firewall, ssh, and so on across the Hadoop nodes, and all of them were normal. Given the specific error above, pam_succeed_if(su:auth): requirement "uid >= 1000" not met by user "cloudera-scm", we suspected that some special security policy had been applied to this Linux system. An online search turned up an Alibaba Cloud article: https://help.aliyun.com/knowledge_detail/41491.html?spm=a2c6h.13066369.0.0.2edd1479fTjQLg
Following that article, we searched the files under /etc/pam.d for 'uid >= 1000' to find the configuration files involved.

[root@YXnode01 pam.d]# grep 'uid >= 1000' *
password-auth:auth        requisite     pam_succeed_if.so uid >= 1000 quiet_success
password-auth-ac:auth        requisite     pam_succeed_if.so uid >= 1000 quiet_success
system-auth:auth        requisite     pam_succeed_if.so uid >= 1000 quiet_success
system-auth-ac:auth        requisite     pam_succeed_if.so uid >= 1000 quiet_success
[root@YXnode01 pam.d]# pwd
/etc/pam.d
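
The rule found above rejects any account whose uid is below 1000, and service accounts created by packages typically fall below that value. As a minimal sketch (the function name uid_below_threshold is hypothetical, and the default of 1000 is simply copied from the rule above), the uid of an account can be compared against the threshold like this:

```shell
# Return 0 if USER's numeric uid is below MIN (default 1000), 1 if it is not,
# and 2 if the user cannot be looked up. The default mirrors the rule above.
uid_below_threshold() {
    chk_user=$1
    chk_min=${2:-1000}
    chk_uid=$(id -u "$chk_user" 2>/dev/null) || return 2
    [ "$chk_uid" -lt "$chk_min" ]
}

# Example (account name taken from the log above):
#   uid_below_threshold cloudera-scm && echo "would be rejected by uid >= 1000"
```

If the function returns 0 for cloudera-scm, the pam_succeed_if message in the journal is expected for that account.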

So we commented out the lines found above and tried to start the SCM Server service again. It still failed to start, but the error output was slightly different: the earlier message pam_succeed_if(su:auth): requirement "uid >= 1000" not met by user "cloudera-scm" was gone, and only FAILED SU (to cloudera-scm) root on none remained.
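
The post does not show how the lines were commented out. One possible way, as a sketch only: the helper below (comment_out_rule is a hypothetical name) writes a backup to FILE.bak and then prefixes matching active lines with '#'. It writes the result back with 'cat >' rather than replacing the file, on the assumption that system-auth and password-auth may be symlinks to the -ac files and should stay that way.

```shell
# Comment out every active line in FILE that matches the regex PATTERN.
# A backup is written to FILE.bak first. The result is written back with
# 'cat >' so that a symlinked file keeps pointing at its target.
comment_out_rule() {
    cfg_file=$1
    cfg_pattern=$2
    cfg_tmp=$(mktemp) || return 1
    cp -p "$cfg_file" "$cfg_file.bak" || { rm -f "$cfg_tmp"; return 1; }
    cfg_status=0
    # Lines that are already comments are left untouched.
    awk -v pat="$cfg_pattern" \
        '$0 ~ pat && $0 !~ /^[ \t]*#/ { print "#" $0; next } { print }' \
        "$cfg_file" > "$cfg_tmp" && cat "$cfg_tmp" > "$cfg_file" || cfg_status=1
    rm -f "$cfg_tmp"
    return $cfg_status
}

# Example (run as root, one file at a time):
#   comment_out_rule /etc/pam.d/system-auth 'pam_succeed_if.so uid >= 1000'
```

Keeping the .bak copy makes it easy to restore the original rule, which matters here because this change alone did not fix the startup failure.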

[root@YXnode01 ~]# systemctl status cloudera-scm-server.service
● cloudera-scm-server.service - LSB: Cloudera SCM Server
   Loaded: loaded (/etc/rc.d/init.d/cloudera-scm-server; bad; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2019-11-05 09:59:37 CST; 17s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 17469 ExecStart=/etc/rc.d/init.d/cloudera-scm-server start (code=exited, status=1/FAILURE)

Nov 05 09:59:32 YXnode01.esgyn.cn systemd[1]: Starting LSB: Cloudera SCM Server...
Nov 05 09:59:32 YXnode01.esgyn.cn su[17502]: pam_unix(su:auth): auth could not identify password for [cloudera-scm]
Nov 05 09:59:34 YXnode01.esgyn.cn su[17502]: FAILED SU (to cloudera-scm) root on none
Nov 05 09:59:37 YXnode01.esgyn.cn cloudera-scm-server[17469]: Starting cloudera-scm-server: [FAILED]
Nov 05 09:59:37 YXnode01.esgyn.cn systemd[1]: cloudera-scm-server.service: control process exited, code=exited status=1
Nov 05 09:59:37 YXnode01.esgyn.cn systemd[1]: Failed to start LSB: Cloudera SCM Server.
Nov 05 09:59:37 YXnode01.esgyn.cn systemd[1]: Unit cloudera-scm-server.service entered failed state.
Nov 05 09:59:37 YXnode01.esgyn.cn systemd[1]: cloudera-scm-server.service failed.

It turns out that when root runs service cloudera-scm-server start, the init script first switches to the cloudera-scm user internally, that is, it runs su cloudera-scm to start the server.
So we tested switching from root to the cloudera-scm user, and ran the same test in another, working environment. We found that in this environment running su cloudera-scm as root prompts for a password, whereas in the working environment it does not.

[root@YXnode01 ~]# su cloudera-scm
Password: 

Based on this, we needed to take a closer look at /etc/pam.d/su. We compared that file between the working environment and this one, and they differ as shown below.
[Figure: comparison of /etc/pam.d/su in the two environments]
The file in this environment contains one extra line. We commented that line out to match the configuration of the working environment and then restarted the SCM Server service, which now starts normally.
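
The comparison can also be done on the command line. The following is a sketch under the assumption that a copy of /etc/pam.d/su from the working host is available locally; the function names pam_active_lines and pam_diff are hypothetical. Comments and blank lines are dropped and runs of spaces or tabs are squeezed, so only real rule differences show up.

```shell
# Print the active lines of a PAM file: drop comments and blank lines,
# and squeeze runs of blanks so that spacing differences are ignored.
pam_active_lines() {
    grep -v '^[[:space:]]*#' "$1" | grep -v '^[[:space:]]*$' | tr -s '[:blank:]' ' '
}

# Show the active lines that differ between two copies of a PAM file.
# Returns diff's exit status: 0 if identical, 1 if they differ.
pam_diff() {
    pam_a=$(mktemp) || return 2
    pam_b=$(mktemp) || { rm -f "$pam_a"; return 2; }
    pam_active_lines "$1" > "$pam_a"
    pam_active_lines "$2" > "$pam_b"
    pam_rc=0
    diff "$pam_a" "$pam_b" || pam_rc=$?
    rm -f "$pam_a" "$pam_b"
    return $pam_rc
}

# Example (the second path is a placeholder for the copy from the working host):
#   pam_diff /path/to/su.from-working-host /etc/pam.d/su
```

Lines prefixed with ">" in the output exist only in the second file, which is how an extra rule in the failing environment would appear.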

[root@YXnode01 ~]# service cloudera-scm-server status
● cloudera-scm-server.service - LSB: Cloudera SCM Server
   Loaded: loaded (/etc/rc.d/init.d/cloudera-scm-server; bad; vendor preset: disabled)
   Active: active (exited) since Tue 2019-11-05 11:29:54 CST; 15s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 19790 ExecStart=/etc/rc.d/init.d/cloudera-scm-server start (code=exited, status=0/SUCCESS)

Nov 05 11:29:49 YXnode01.esgyn.cn systemd[1]: Starting LSB: Cloudera SCM Server...
Nov 05 11:29:49 YXnode01.esgyn.cn su[19823]: (to cloudera-scm) root on none
Nov 05 11:29:54 YXnode01.esgyn.cn cloudera-scm-server[19790]: Starting cloudera-scm-server: [  OK  ]
Nov 05 11:29:54 YXnode01.esgyn.cn systemd[1]: Started LSB: Cloudera SCM Server.

Looking at the extra line again, it was:
auth required pam_wheel.so group=wheel
This line prohibits users who are not in the wheel group from switching to root.
To harden a Linux system further, it is common to set up an administrators group so that only members of that group can run "su -" to log in as root. Users in any other group cannot log in as root with "su -", even if they enter the correct root password. On UNIX and Linux this group is usually named "wheel", and the restriction is configured in the /etc/pam.d/su file. So once this line is added to the su file, the cloudera-scm user cannot su to root unless it is added to the wheel group. In this case the same line also broke the su from root to cloudera-scm that the init script performs, which is why commenting it out allowed the service to start.
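
To see whether an account is a member of the wheel group, its group list can be checked. This is a small sketch; in_group is a hypothetical helper name, and it relies only on id -nG, which prints a user's group names separated by spaces.

```shell
# Return 0 if USER is a member of GROUP, 1 otherwise
# (including when the user cannot be looked up).
in_group() {
    for grp_name in $(id -nG "$1" 2>/dev/null); do
        [ "$grp_name" = "$2" ] && return 0
    done
    return 1
}

# Example (account and group names taken from the text above):
#   in_group cloudera-scm wheel || echo "cloudera-scm is not in wheel"
```

The alternative mentioned above, adding the account to the group, would correspond to usermod -aG wheel cloudera-scm. The post only verifies the fix of commenting out the pam_wheel line, so treat the group change as untested here.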

Origin blog.csdn.net/Post_Yuan/article/details/102914847