GI ocssd.log rotation fails with error LFI-00142 and logfile grows to huge size (Doc ID 1900986.1)

APPLIES TO:

Oracle Database - Enterprise Edition - Version 11.2.0.2 to 11.2.0.4 [Release 11.2]
Information in this document applies to any platform.

SYMPTOMS

GI logfile "ocssd.log" grows beyond the default file size.

The logfile rotation fails with error LFI-00142: Unable to delete an existing file [ocssd][l10] not owned by Oracle.

This problem can cause file system space issues!


Example: ocssd.log, located in $GRID_HOME/log/<hostname>/cssd/, has grown to more than 5 GB in the listing below

-rw-r--r-- 1 grid oinstall 5517323953 Jun  6 07:57 ocssd.log                ---->  Here!
-rw-rw-r-- 1 grid oinstall     483269 Jun  5 15:37 cssdOUT.log
-rw------- 1 grid oinstall   74092544 Jan 31 10:36 core.9352
-rw-r--r-- 1 grid oinstall  158399110 Jan 29 14:35 ocssd.l01
-rw-r--r-- 1 grid oinstall  158423139 Jan 24 17:00 ocssd.l02
-rw-r--r-- 1 grid oinstall  158422898 Jan 18 06:26 ocssd.l03
-rw-r--r-- 1 grid oinstall  158402809 Jan 11 20:54 ocssd.l04
-rw-r--r-- 1 grid oinstall  158413241 Jan  5 13:07 ocssd.l05
-rw-r--r-- 1 grid oinstall  158413772 Dec 30 17:44 ocssd.l06
-rw-r--r-- 1 grid oinstall  158404163 Dec 24 15:57 ocssd.l07
-rw-r--r-- 1 grid oinstall  158391594 Dec 18 16:22 ocssd.l08
-rw-r--r-- 1 grid oinstall  158406347 Dec 12 22:09 ocssd.l09
-rw-r--r-- 1 grid oinstall  158420893 Dec  6 12:09 ocssd.l10

In the above example, the expected file size is about 150 MB (157286400 bytes). Query CSS as follows to confirm the logfile size limit:

% crsctl get css logfilesize
CRS-4676: Successful get logfilesize 157286400 for Cluster Synchronization Services.
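
As a rough cross-check, the sketch below compares a file's size in bytes against the 157286400 limit reported above. The function name log_exceeds_limit is hypothetical and not part of Clusterware; the sketch assumes only a POSIX shell and wc.

```shell
# Hypothetical helper (not an Oracle tool): report whether a log file
# is larger than a byte limit. The default limit is the value returned
# by "crsctl get css logfilesize" in the example above.
log_exceeds_limit() {
    file=$1
    limit=${2:-157286400}
    # wc -c prints the size in bytes; tr removes padding that some
    # platforms add around the number
    size=$(wc -c < "$file" | tr -d ' ')
    if [ "$size" -gt "$limit" ]; then
        echo "OVER $size $limit"
        return 0
    fi
    echo "OK $size $limit"
    return 1
}
```

Called with the path of ocssd.log as the first argument, it prints OVER followed by the size and the limit when the file has outgrown the configured limit.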

  

CHANGES

 None

CAUSE

This issue is caused by unpublished Bug 18700935 - CLOUD:ACLDX0085 OCSSD LOG IS NOT ROTATED

At some point in time, the Clusterware alert log reports an attempted logfile rotation failure.

As a result, the last logfile 'ocssd.l10' is never deleted. This may be due to the logfile being open during the delete, or to a permissions issue on the file itself.

The ocssd.bin thread that performs log file rotation, 'clsd_logThread', encounters the delete failure, after which the logfile is never deleted or rotated again. The result is that ocssd.log grows continually.


Extract of the error in $GRID_HOME/log/<hostname>/alert<hostname>.log

[cssd(29355)]CRS-1713:CSSD daemon is started in clustered mode
2014-06-05 15:37:44.512:
[cssd(29355)]CRS-0009:log file "/u01/app/11.2.0.3/grid/log/ed28db01/cssd/ocssd.log" reopened
2014-06-05 15:37:44.512:
[cssd(29355)]CRS-0019:file rotation terminated. log file: "/u01/app/11.2.0.3/grid/log/ed28db01/cssd/ocssd.log"
[cssd(29355)]CRS-0014:An error occurred while attempting to delete file "/u01/app/11.2.0.3/grid/log/ed28db01/cssd/ocssd.l10" during log file rotation. Additional diagnostics: LFI-00142: Unable to delete an existing file [ocssd][l10] not owned by Oracle.
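
To check whether a node has hit this failure, the alert log can be searched for the message pair shown in the extract. This is a minimal sketch; find_rotation_failures is a hypothetical helper name, and the alert log path is passed in by the caller.

```shell
# Hypothetical helper (not an Oracle tool): print alert log lines that
# record a failed delete during log rotation, i.e. lines containing
# both CRS-0014 and LFI-00142.
find_rotation_failures() {
    alert_log=$1
    grep 'CRS-0014' "$alert_log" | grep 'LFI-00142'
}
```

The function returns a non-zero status when no matching line exists, so it can be used directly in an if condition.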

SOLUTION

The CSSD thread that encountered the LFI-00142 error needs to be restarted to ensure log rotation works again.

Manually deleting the logfile will not resolve the log rotation problem.


1).  Shut down CRS on the node reporting the problem.

# crsctl stop crs

2).  Once CRS is down, manually delete the 'ocssd.l10' file. If you need to keep a backup, copy the logfile to another location first.

# rm  $GRID_HOME/log/<hostname>/cssd/ocssd.l10

3).  Start up Clusterware again.

# crsctl start crs

If you are NOT able to schedule downtime and file size growth in the GRID Home is causing a space issue, copy the logs to another location and then do the following:

% echo 0 > ocssd.l10

  
Please note this does not resolve the log rotation problem but only allows you to free up some space.


4). Bug 18700935 has been fixed in the 11.2.0.4.5 PSU for Unix/Linux platforms and in the 11.2.0.4.12 Bundle for Windows platforms. Please apply the patch if required.
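
Steps 1 to 3 above could be scripted roughly as follows. This is only a sketch under stated assumptions: the run and rotate_fix functions and the NODE_NAME, BACKUP_DIR and DRY_RUN variables are hypothetical names introduced here, and only the crsctl, cp and rm commands come from the steps themselves. DRY_RUN defaults to 1, so the commands are printed rather than executed.

```shell
# Print the command when DRY_RUN is 1 (the default), otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Sketch of steps 1 to 3. Assumes GRID_HOME is set and NODE_NAME
# (hypothetical variable) holds the <hostname> directory name.
rotate_fix() {
    : "${GRID_HOME:?GRID_HOME must be set}"
    : "${NODE_NAME:?NODE_NAME must be set}"
    stale="$GRID_HOME/log/$NODE_NAME/cssd/ocssd.l10"

    # Step 1: stop CRS on this node; give up if that fails
    run crsctl stop crs || return 1

    # Step 2: optionally keep a backup, then delete the stale file
    if [ -n "${BACKUP_DIR:-}" ]; then
        run cp -p "$stale" "$BACKUP_DIR/" || return 1
    fi
    run rm "$stale"

    # Step 3: start Clusterware again
    run crsctl start crs
}
```

Reviewing the dry-run output first and then setting DRY_RUN=0 keeps the destructive commands explicit. The # prompt in the steps suggests the commands are run as root.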

Reposted from blog.csdn.net/j_ychen/article/details/82883020