Cannot Allocate New Log

Error message

Thread 1 cannot allocate new log, sequence 2594

Checkpoint not complete

Symptoms

The redo logs switch frequently and database DML performance degrades. The additional log archiving also puts load on I/O.

Causes and solutions

In most cases this error is caused by redo logs switching so frequently that a log group cannot be archived in time before it is needed again. The same error is reported if the checkpoint covering the log group to be reused has not completed by the time of the log switch, which is the case indicated by the "Checkpoint not complete" message. Checkpoint behavior can be tuned by adjusting the fast_start_mttr_target parameter.
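To see which of the two causes dominates on a given instance, one option is to look at the cumulative log switch wait events. This is a minimal sketch, assuming the standard wait event names shown in the query; the counters are cumulative since instance startup, so they are best compared between two points in time.

```sql
-- Cumulative waits caused by log switches since instance startup.
-- time_waited is reported in centiseconds.
SELECT
        event
      , total_waits
      , time_waited
   FROM v$system_event
  WHERE event IN ( 'log file switch (checkpoint incomplete)'
                 , 'log file switch (archiving needed)' )
  ORDER BY time_waited DESC;
```

If the checkpoint-incomplete event accounts for most of the wait time, the checkpoint is the likely bottleneck; if the archiving event does, the archiver is falling behind.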

The redo log switch frequency can be checked with the following script, which counts switches per hour for each day:

 SELECT
        TO_CHAR(first_time, 'YYYY-MON-DD') DAY
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '00', 1, 0)), '99') "00"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '01', 1, 0)), '99') "01"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '02', 1, 0)), '99') "02"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '03', 1, 0)), '99') "03"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '04', 1, 0)), '99') "04"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '05', 1, 0)), '99') "05"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '06', 1, 0)), '99') "06"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '07', 1, 0)), '99') "07"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '08', 1, 0)), '99') "08"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '09', 1, 0)), '99') "09"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '10', 1, 0)), '99') "10"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '11', 1, 0)), '99') "11"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '12', 1, 0)), '99') "12"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '13', 1, 0)), '99') "13"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '14', 1, 0)), '99') "14"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '15', 1, 0)), '99') "15"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '16', 1, 0)), '99') "16"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '17', 1, 0)), '99') "17"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '18', 1, 0)), '99') "18"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '19', 1, 0)), '99') "19"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '20', 1, 0)), '99') "20"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '21', 1, 0)), '99') "21"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '22', 1, 0)), '99') "22"
      , TO_CHAR(SUM(DECODE(TO_CHAR(first_time, 'HH24'), '23', 1, 0)), '99') "23"
   FROM v$log_history
GROUP BY TO_CHAR(first_time, 'YYYY-MON-DD')
ORDER BY MIN(first_time);

 

As a rule of thumb, redo logs should switch no more than 10 times per hour during peak periods, there should be about 5 redo log file groups, and the members of each log file group should be placed on different file systems or storage devices. The actual redo log size and number of groups must be determined for each system.
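The 10-switches-per-hour guideline can also be checked directly rather than read off the pivoted report. The following is a sketch only: the 7-day window is an arbitrary assumption, and on a multi-thread (RAC) database the count would also need to be grouped by thread#.

```sql
-- List every hour in the last 7 days with more than 10 log switches.
SELECT
        TRUNC(first_time, 'HH24') switch_hour
      , COUNT(*)                  switches
   FROM v$log_history
  WHERE first_time >= SYSDATE - 7
  GROUP BY TRUNC(first_time, 'HH24')
 HAVING COUNT(*) > 10
  ORDER BY switch_hour;
```

An empty result means no hour in the window exceeded the guideline.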

Query the current redo log size, log status, log file groups, and the number of members in each group:

SELECT
        group#
      , thread#
      , bytes / 1024 / 1024 mb
      , members
      , status
   FROM v$log;

 

In this example the output shows a total of 9 log file groups, each with 1 member, and the fourth group is the one currently in use. A group with status ACTIVE is still needed for instance recovery. It cannot be dropped until its checkpoint has completed and, in ARCHIVELOG mode, it has been archived.
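To see where each member file lives alongside its group status, v$log can be joined to v$logfile. This is an illustrative sketch; the column aliases are my own choice.

```sql
-- One row per redo log member, with the status of its group.
SELECT
        l.group#
      , l.status   group_status
      , l.archived
      , f.member
   FROM v$log l
   JOIN v$logfile f
     ON f.group# = l.group#
  ORDER BY l.group#, f.member;
```

This also makes it easy to confirm whether the members of a group sit on different file systems.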

Redo log adjustment

Calculate redo log size

30 (maximum number of switches per hour) * 5000 MB (current log size) / 10 (target switches per hour) = 15000 MB
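The same calculation can be derived from the instance itself. This is a rough sketch under several assumptions: a 7-day observation window, a target of 10 switches per hour taken from the rule of thumb above, a single redo thread, and all groups having the same size. The names peak, cur, and suggested_mb are illustrative.

```sql
-- suggested size = peak switches per hour * current log size / target switches per hour
WITH peak AS (
    SELECT MAX(cnt) max_switches
      FROM ( SELECT COUNT(*) cnt
               FROM v$log_history
              WHERE first_time >= SYSDATE - 7
              GROUP BY TRUNC(first_time, 'HH24') )
),
cur AS (
    SELECT MAX(bytes) / 1024 / 1024 current_mb
      FROM v$log
)
SELECT
        p.max_switches
      , c.current_mb
      , CEIL(p.max_switches * c.current_mb / 10) suggested_mb
   FROM peak p
  CROSS JOIN cur c;
```

With 30 peak switches and 5000 MB logs this yields the 15000 MB figure above. The result is only a starting point and still has to be weighed against recovery considerations.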

According to this calculation, the log file groups should in theory be expanded to 15000 MB. However, DML activity on the ODS database is unevenly distributed over time, and with an excessively large redo log more data is lost if the current log file group is lost and the database has to be recovered. Weighing both factors, the log files are expanded to 10 GB, which still allows the system to archive and switch redo logs when it is not busy. The script is as follows. Adjust the log file paths and log file size to your environment.

-- Note: do not drop all of the old log file groups at once. Keep 3 log file groups in place to ensure system stability, and drop them only after the new groups have been added.

sqlplus / as sysdba
host mkdir -p $ORACLE_BASE/standby/redo
host mkdir -p /mnt/EMC1/redo/

alter database drop logfile group 1;
alter database drop logfile group 2;
alter database drop logfile group 3;
alter database drop logfile group 7;
alter database drop logfile group 8;
alter database drop logfile group 9;
alter database add logfile group 1 ('/mnt/EMC1/redo/group1redo1.log', '/opt/app/oracle/standby/redo/group1redo2.log') size 10G;
alter database add logfile group 2 ('/mnt/EMC1/redo/group2redo1.log', '/opt/app/oracle/standby/redo/group2redo2.log') size 10G;
alter database add logfile group 3 ('/mnt/EMC1/redo/group3redo1.log', '/opt/app/oracle/standby/redo/group3redo2.log') size 10G;

-- Switch the current log file group
alter system archive log current;
alter system archive log current;
alter system checkpoint;
alter database drop logfile group 4;
alter database drop logfile group 5;
alter database drop logfile group 6;
alter database add logfile group 4 ('/mnt/EMC1/redo/group4redo1.log', '/opt/app/oracle/standby/redo/group4redo2.log') size 10G;
alter database add logfile group 5 ('/mnt/EMC1/redo/group5redo1.log', '/opt/app/oracle/standby/redo/group5redo2.log') size 10G;

-- Confirm the current status of the log file groups
select * from v$logfile order by member;
select * from v$log;

-- Delete the obsolete log files at the operating system level (run these from an OS shell, or prefix them with host in SQL*Plus)
rm -f /mnt/EMC1/redo1.log
rm -f /mnt/EMC1/redo10.log
rm -f /mnt/EMC1/redo11.log
rm -f /mnt/EMC1/redo12.log
rm -f /mnt/EMC1/redo2.log
rm -f /mnt/EMC1/redo3.log
rm -f /mnt/EMC1/redo4.log
rm -f /mnt/EMC1/redo5.log
rm -f /mnt/EMC1/redo6.log
rm -f /mnt/EMC1/redo7.log
rm -f /mnt/EMC1/redo8.log
rm -f /mnt/EMC1/redo9.log
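A drop logfile group statement in the script above fails if the group is still CURRENT or ACTIVE. As a hedged sketch, the following check can be run before each batch of drops to list the groups that are not yet safe to drop:

```sql
-- Groups returned here cannot be dropped yet.
SELECT
        group#
      , status
      , archived
   FROM v$log
  WHERE status IN ('CURRENT', 'ACTIVE')
  ORDER BY group#;
```

A log switch followed by alter system checkpoint, as used in the script, normally moves such groups to INACTIVE, after which they can be dropped.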

 


Adjust FAST_START_MTTR_TARGET

FAST_START_MTTR_TARGET is the parameter that bounds instance recovery time. Choose a reasonable value, in seconds, that is acceptable for the business the database serves. For example, if it is set to 60 and that value is reasonable for the system, the instance should be able to recover within about 60 seconds after a crash. Reasonable means neither too large nor too small: if the value is too large, instance recovery takes a long time; if it is too small, dirty buffers are written out very frequently, which increases system I/O.

The main factor that affects instance recovery time is the distance from the most recent checkpoint position to the tail of the online redo log. The longer the distance, the longer the time required for cache recovery, undo, and redo. Shortening the distance between the most recent checkpoint position and the tail of the online redo log is the purpose of FAST_START_MTTR_TARGET.

FAST_START_MTTR_TARGET achieves this by triggering checkpoint activity. When the estimated recovery time for the dirty buffers in memory (estimated_mttr) reaches the time specified by FAST_START_MTTR_TARGET, checkpointing is triggered. The DBWn processes then write dirty buffers to the data files in checkpoint queue order, which shortens the distance between the last checkpoint position and the tail of the online redo log and reduces the time required for instance recovery.

alter system set fast_start_mttr_target=30;
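After changing the parameter, its effect can be observed in v$instance_recovery. This is a small sketch rather than a tuning procedure; the values are only meaningful once the instance has run under a representative workload.

```sql
-- target_mttr and estimated_mttr are in seconds.
-- optimal_logfile_size is in MB and is populated only when
-- fast_start_mttr_target is set.
SELECT
        target_mttr
      , estimated_mttr
      , optimal_logfile_size
   FROM v$instance_recovery;
```

If estimated_mttr stays well above target_mttr, the target is probably too aggressive for the I/O capacity available. The optimal_logfile_size column offers a second opinion on the redo log size calculated earlier.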

Related documentation: Can Not Allocate Log (Document ID 1265962.1)

 

This article is original. If you reprint it, please credit the source and author.

If you find any errors, please let me know.
