Troubleshooting record of a listener problem involving TNS-12537, TNS-12560 and TNS-00507

Copyright notice: this is an original article by the blogger; reposting is welcome! https://blog.csdn.net/qq_40687433/article/details/83027830

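In the morning the application team reported that they could not connect to the database. After logging in to the servers I found that connections to node 2 failed, while connections to node 1 worked.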
[oracle@xsdbd32 ~]$ sqlplus testconn/[email protected]:1521/ngjkdb1

SQL*Plus: Release 11.2.0.4.0 Production on Fri Oct 12 11:07:19 2018

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

ERROR:
ORA-12537: TNS:connection closed

[oracle@xsdbd31 ~]$ sqlplus system/[email protected]:1521/ngjkdb1

SQL*Plus: Release 11.2.0.4.0 Production on Fri Oct 12 09:27:26 2018

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

ERROR:
ORA-01017: invalid username/password; logon denied

Checked the listener status on both nodes; both looked normal.
[oracle@xsdbd31 ~]$ lsnrctl status

LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 12-OCT-2018 09:26:12

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date                12-OCT-2018 09:25:49
Uptime                    0 days 0 hr. 0 min. 22 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /grid/app/11.2.0/grid/network/admin/listener.ora
Listener Log File         /grid/app/grid/diag/tnslsnr/xsdbd31/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.158)(PORT=1521)))
Services Summary...
Service "lzldb1" has 1 instance(s).
  Instance "lzldb11", status READY, has 1 handler(s) for this service...
Service "lzldb1XDB" has 1 instance(s).
  Instance "lzldb11", status READY, has 1 handler(s) for this service...
The command completed successfully

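Testing connectivity
Node 1 connects normally (the ORA-01017 below only means the password was wrong; the request did get through the listener to the instance)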
[oracle@xsdbd31 ~]$ sqlplus system/[email protected]:1521/lzldb1

SQL*Plus: Release 11.2.0.4.0 Production on Fri Oct 12 09:27:26 2018

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

ERROR:
ORA-01017: invalid username/password; logon denied

Node 2 is abnormal
SQL> create user testconn identified by oracle;

User created.

SQL> grant connect to testconn;

Grant succeeded.

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
[oracle@xsdbd32 ~]$  sqlplus testconn/[email protected]:1521/lzldb1

SQL*Plus: Release 11.2.0.4.0 Production on Fri Oct 12 09:58:57 2018

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

ERROR:
ORA-12537: TNS:connection closed


Enter user-name: testconn
Enter password: 

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> 
The connection to the database succeeded, but I did not know which node I was on.


SQL> grant dba to testconn;
[oracle@xsdbd32 ~]$ sqlplus testconn/[email protected]:1521/lzldb1

SQL*Plus: Release 11.2.0.4.0 Production on Fri Oct 12 11:07:19 2018

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

ERROR:
ORA-12537: TNS:connection closed


Enter user-name: testconn
Enter password: 

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> show parameter name

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cell_offloadgroup_name               string
db_file_name_convert                 string
db_name                              string      lzldb1
db_unique_name                       string      lzldb1
global_names                         boolean     FALSE
instance_name                        string      lzldb12
lock_name_space                      string
log_file_name_convert                string
processor_group_name                 string
service_names                        string      lzldb1

So I really was connected to node 2.

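From this I could infer that my sqlplus connection request was being accepted by the database; otherwise I would not have been able to log in at all.

The initial request received the TNS error, and entering the user name and password again at the prompt got me in. (The retry at the prompt carries no connect identifier, so it is presumably a local connection that does not go through the listener, which would explain why only the first attempt fails.)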
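
The distinction used above (ORA-12537 means the connection was dropped before logon, ORA-01017 means the instance itself rejected the credentials) can be sketched as a small shell helper. This is only an illustration and not part of the original troubleshooting: the function name `classify_sqlplus_output` and its result labels are hypothetical, and the function only inspects text that sqlplus has already printed.

```shell
# Hypothetical helper: classify the text printed by one sqlplus connection attempt.
# The order of the patterns matters: a transcript that shows ORA-12537 first and
# "Connected to:" after a retry is still reported as a closed connection.
classify_sqlplus_output() {
  case "$1" in
    *ORA-12537*|*ORA-12518*)
      # The connection was closed or not handed off before logon.
      echo "tns_closed" ;;
    *ORA-01017*)
      # The database itself rejected the credentials, so the request reached it.
      echo "reached_instance" ;;
    *"Connected to:"*)
      echo "connected" ;;
    *)
      echo "unknown" ;;
  esac
}
```

A possible call would be `classify_sqlplus_output "$(sqlplus -L user/password@//host:1521/service </dev/null 2>&1)"`, where the credentials, host and service name are placeholders; `-L` makes sqlplus attempt the logon only once instead of prompting again.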


Checked the alert log and found nothing abnormal; CPU, memory and swap were all nearly idle.

Listener log on node 2:
12-OCT-2018 09:35:20 * (CONNECT_DATA=(CID=(PROGRAM=JDBC Thin Client)(HOST=__jdbc__)(USER=user))(SERVICE_NAME=lzldb1)(CID=(PROGRAM=JDBC Thin Client)(HOST=__jdbc__)(USER=user)8
TNS-12518: TNS:listener could not hand off client connection
 TNS-12547: TNS:lost contact
  TNS-12560: TNS:protocol adapter error
   TNS-00517: Lost contact
    Linux Error: 32: Broken pipe
12-OCT-2018 09:35:20 * (CONNECT_DATA=(CID=(PROGRAM=JDBC Thin Client)(HOST=__jdbc__)(USER=user))(SERVICE_NAME=lzldb1)(CID=(PROGRAM=JDBC Thin Client)(HOST=__jdbc__)(USER=user)8
TNS-12518: TNS:listener could not hand off client connection
 TNS-12547: TNS:lost contact
  TNS-12560: TNS:protocol adapter error
   TNS-00517: Lost contact
    Linux Error: 32: Broken pipe
12-OCT-2018 09:35:24 * (CONNECT_DATA=(CID=(PROGRAM=JDBC Thin Client)(HOST=__jdbc__)(USER=user))(SERVICE_NAME=lzldb1)(CID=(PROGRAM=JDBC Thin Client)(HOST=__jdbc__)(USER=user)8
TNS-12518: TNS:listener could not hand off client connection
 TNS-12547: TNS:lost contact
  TNS-12560: TNS:protocol adapter error
   TNS-00517: Lost contact
    Linux Error: 32: Broken pipe
    

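Restarting the listener did not help. After restarting the cluster, the listener log still reported the same errors and node 2 still could not be reached through the listener.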
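
When the listener log is long, it can help to see which TNS codes dominate before reading individual entries. The following is a minimal sketch, assuming a plain-text listener log is fed on standard input; the function name `tally_tns_codes` is hypothetical.

```shell
# Count how often each TNS-nnnnn code appears in listener log text read from stdin.
# Output: one "CODE COUNT" pair per line, sorted by code.
tally_tns_codes() {
  grep -o 'TNS-[0-9]\{5\}' | sort | uniq -c | awk '{print $2, $1}'
}
```

It would be used as `tally_tns_codes < listener.log`, where `listener.log` stands for wherever the text version of the listener log lives on the system. In the excerpt above every entry carries the same four codes, ending in `Linux Error: 32: Broken pipe`, which suggests the failure happens at the hand-off to the server process rather than at the network layer.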

Checking the TCP layer

[root@xsdbd31 ~]# netstat -anop|grep tnslsnr
tcp        0      0 10.10.10.160:1521      0.0.0.0:*               LISTEN      34242/tnslsnr        off (0.00/0/0)
tcp        0      0 10.10.10.158:1521      0.0.0.0:*               LISTEN      33996/tnslsnr        off (0.00/0/0)
tcp        0      0 10.10.10.10:1521       0.0.0.0:*               LISTEN      33996/tnslsnr        off (0.00/0/0)
tcp        0      0 10.10.10.160:1521      10.10.10.11:15896      ESTABLISHED 34242/tnslsnr        keepalive (2746.94/0/0)
tcp        0      0 10.10.10.158:1521      10.10.10.10:25658      ESTABLISHED 33996/tnslsnr        keepalive (1337.91/0/0)
tcp        0      0 10.10.10.158:1521      10.10.10.10:25676      ESTABLISHED 33996/tnslsnr        keepalive (1370.68/0/0)
tcp        0      0 10.10.10.160:1521      10.10.10.10:19531      ESTABLISHED 34242/tnslsnr        keepalive (1370.68/0/0)
tcp6       0      0 ::1:52981               ::1:6100                ESTABLISHED 34242/tnslsnr        keepalive (1370.68/0/0)
tcp6       0      0 ::1:52967               ::1:6100                ESTABLISHED 33996/tnslsnr        keepalive (1337.91/0/0)
unix  2      [ ACC ]     STREAM     LISTENING     3357737141 33996/tnslsnr        /var/tmp/.oracle/s#33996.1
unix  2      [ ACC ]     STREAM     LISTENING     3357737142 33996/tnslsnr        /var/tmp/.oracle/s#33996.2
unix  2      [ ACC ]     STREAM     LISTENING     3357737140 33996/tnslsnr        /var/tmp/.oracle/sLISTENER
unix  2      [ ACC ]     STREAM     LISTENING     3357740672 34242/tnslsnr        /var/tmp/.oracle/sLISTENER_SCAN1
unix  2      [ ACC ]     STREAM     LISTENING     3357740673 34242/tnslsnr        /var/tmp/.oracle/s#34242.1
unix  2      [ ACC ]     STREAM     LISTENING     3357740674 34242/tnslsnr        /var/tmp/.oracle/s#34242.2
unix  3      [ ]         STREAM     CONNECTED     3357741362 34242/tnslsnr        /var/tmp/.oracle/sLISTENER_SCAN1
unix  3      [ ]         STREAM     CONNECTED     3357730674 33996/tnslsnr        /var/tmp/.oracle/sLISTENER


Traced the tnslsnr listener process with strace while logging in to the database with sqlplus
 strace -o/tmp/tnslsnr1.log  -p 33996

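Looked through the generated trace file but could not see anything useful. (The command above does not pass -f, so any child processes forked by the listener were not followed, which may be why nothing showed up.)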

Checked the owner of the listener processes and the permissions of the listener executable: no problem.
[root@xsdbd31 ~]# ps -ef|grep tnslsnr
grid      33996      1  0 09:25 ?        00:00:01 /grid/app/11.2.0/grid/bin/tnslsnr LISTENER -inherit
grid      34242      1  0 09:26 ?        00:00:00 /grid/app/11.2.0/grid/bin/tnslsnr LISTENER_SCAN1 -inherit
[grid@xsdbd32 ~]$ ls -lrt $ORACLE_HOME/bin/tnslsnr
-rwxr-x--x. 1 grid oinstall 974016 Jan 26  2018 /grid/app/11.2.0/grid/bin/tnslsnr
[root@xsdbd32 ~]#  su - oracle
Last login: Fri Oct 12 12:13:38 CST 2018 on pts/0
[oracle@xsdbd32 ~]$  ls -lrt $ORACLE_HOME/bin/tnslsnr
-rwxr-x--x. 1 oracle oinstall 974016 Jan 25  2018 /oracle/app/oracle/product/11.2.0/db_1/bin/tnslsnr

Checking the permissions of the oracle executable

[oracle@xsdbd32 trace]$ cd $ORACLE_HOME/bin
[oracle@xsdbd32 bin]$ ls -l oracle
-rwxr-s--x. 1 oracle asmadmin 239889136 Oct 11 23:27 oracle

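An s is missing here: the owner execute slot shows x instead of s, i.e. the setuid bit is not set. The correct permissions are 6751:  -rwsr-s--x    oracle  asmadmin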

chmod 6751   oracle
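
Because the missing s is easy to overlook by eye, the comparison can be left to a script. The sketch below is an illustration only: the function name `check_oracle_mode` is hypothetical, and it simply compares an `ls -l` mode string with the `-rwsr-s--x` layout named above.

```shell
# Report which special bits are missing from an `ls -l` mode string,
# compared with the -rwsr-s--x (6751) layout this article names as correct.
check_oracle_mode() {
  local mode="$1" missing=""
  # Character 4 is the owner-execute slot: "s" means setuid plus execute.
  case "$(printf '%s' "$mode" | cut -c4)" in
    s) ;;
    *) missing="setuid" ;;
  esac
  # Character 7 is the group-execute slot: "s" means setgid plus execute.
  case "$(printf '%s' "$mode" | cut -c7)" in
    s) ;;
    *) missing="${missing:+$missing }setgid" ;;
  esac
  if [ -z "$missing" ]; then
    echo "ok"
  else
    echo "missing: $missing"
  fi
}
```

For example, `check_oracle_mode "$(ls -l "$ORACLE_HOME/bin/oracle" | cut -d' ' -f1)"` would have printed `missing: setuid` for the `-rwxr-s--x.` line shown above. A plausible reason this bit matters here: the listener runs as grid while the binary is owned by oracle, so without setuid a server process started via the listener would not run as the oracle owner.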

After restarting the database instance and the listener on node 2, instance 2 could finally be logged in to normally.

[oracle@xsdbd32 ~]$ sqlplus testconn/[email protected]/lzldb1

SQL*Plus: Release 11.2.0.4.0 Production on Fri Oct 12 12:24:48 2018

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> 

However, the listener logs on nodes 1 and 2 were still reporting errors:
12-OCT-2018 12:17:34 * <unknown connect data> * 12537
TNS-12537: TNS:connection closed
 TNS-12560: TNS:protocol adapter error
  TNS-00507: Connection closed
   Linux Error: 115: Operation now in progress
Fri Oct 12 12:17:39 2018
12-OCT-2018 12:17:39 * <unknown connect data> * 12537
TNS-12537: TNS:connection closed
 TNS-12560: TNS:protocol adapter error
  TNS-00507: Connection closed
   Linux Error: 115: Operation now in progress
12-OCT-2018 12:17:41 * service_update * lzldb11 * 0
12-OCT-2018 12:17:44 * <unknown connect data> * 12537
TNS-12537: TNS:connection closed
 TNS-12560: TNS:protocol adapter error
  TNS-00507: Connection closed
  
The application is now working normally, and the database can be reached through both the SCAN IP and the VIP.

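"unknown connect data" means that the client did not send valid connect data when it contacted the listener. In other words, something connected to the listener port but did not supply legitimate connection information.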

Enabled listener tracing to follow up on this (for listener tracing, see my article https://blog.csdn.net/qq_40687433/article/details/83089218):

LSNRCTL> set trc_level 16 

LSNRCTL> show trc_file 

Read the trace file directly, without formatting it, and look for the corresponding IP in the trace:

2018-10-15 17:56:13.800070 : nttvlser:valid node check on incoming node 10.xxx.xxx.xx4
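
To avoid scanning the raw trace by hand, the client addresses can be pulled out of lines like the one above. This is a minimal sketch, assuming the trace lines keep the exact `nttvlser:valid node check on incoming node <ip>` shape shown here; the function name `count_incoming_nodes` is hypothetical.

```shell
# Read listener trace text from stdin and count connections per incoming IP.
# Output: one "IP COUNT" pair per line, most frequent first.
count_incoming_nodes() {
  grep 'nttvlser:valid node check on incoming node' \
    | awk '{print $NF}' \
    | sort | uniq -c | sort -rn \
    | awk '{print $2, $1}'
}
```

It would be run as `count_incoming_nodes < tracefile`, with `tracefile` being the file reported by `show trc_file`. An address that shows up every few seconds, matching the interval of the TNS-12537 entries, is a likely candidate for the client that connects without sending connect data.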

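Summary:

1. After applying a patch or making any other major change to the database, test connectivity by hand. A listener that looks healthy does not guarantee that clients can actually connect to the database.

2. The permissions on $ORACLE_HOME/bin/oracle are easily changed by patching and have to be set back manually.

In fact, after applying the GI patch the database would not start. Changing the permissions on $ORACLE_HOME/bin/oracle was not enough by itself; the database only came up after I restarted CRS and started it again.

$ORACLE_HOME/bin/oracle also affects connections through the listener, as this incident shows: every resource looked normal, yet connections failed.

3. Being a DBA is meticulous work. I had already tripped over $ORACLE_HOME/bin/oracle many times and did make a point of checking the permissions, but I still skimmed too quickly and did not notice the missing s.

Problems like this are hard to trace back to the root cause. A lot of effort went into the investigation, and in the end the fix was nothing more than a permission change.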
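
One way to make point 2 less dependent on a careful eye is to record the file modes before patching and compare them afterwards. The sketch below assumes GNU `stat` (as found on Linux); the function name `snapshot_modes` and the `/tmp/modes.before` path are hypothetical.

```shell
# Print "octal-mode owner group path" for each file given as an argument.
# Requires GNU stat (the -c format option).
snapshot_modes() {
  local f
  for f in "$@"; do
    stat -c '%a %U %G %n' "$f"
  done
}
```

Before patching one could run `snapshot_modes "$ORACLE_HOME/bin/oracle" > /tmp/modes.before`, and afterwards `snapshot_modes "$ORACLE_HOME/bin/oracle" | diff /tmp/modes.before -`. Any output from `diff` means the mode, owner or group has changed and should be reviewed.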
