Oracle Redo Log Buffer: Design of the Redo Log Buffer Mechanism

Recently I have been talking with friends, including developers of domestic databases, and many programmers believe that Oracle is outdated and that the open source or domestic databases they build represent the future of database development. At many exchange meetings, some even compare a single feature of their own product against Oracle and conclude that they are already far ahead.

In fact, the development of a database system is a dynamic process that must adapt to the actual needs of users and application scenarios. This is why most successful general-purpose database systems have gone through many versions and evolution stages over a long period, continuously improving to meet the needs of changing users and application scenarios.

Why do many friends and customers think the internal design of the Oracle database is so complicated? Because if you have never encountered the many application scenarios it covers, you cannot imagine them at all, let alone have a few R&D staff design for them in advance. Oracle's design has been continuously polished against real user needs and application scenarios over more than 40 years of practice.

Although China has made revolutionary breakthroughs in some key scientific and technological fields, the United States still maintains overall and critical advantages.

In fact, the reason large-scale general-purpose databases have become one of the core "chokepoint" technologies is that databases play the central data storage, query, and modification role in China's information technology infrastructure and in every industry. The processing performance, availability, and scalability of the database system have a major impact on the healthy, normal operation of the whole country and society, mainly reflected in:


  1. Data volumes continue to grow: Over time, the volume of data held by governments, businesses, and organizations grows at an accelerating rate. This includes core data of government affairs systems; control-system data for power, civil aviation, high-speed rail, shipping, and vehicle transportation; banking, insurance, securities, and financial transaction data; national security, police, and judicial system data; finance, civil affairs, taxation, and safety supervision system data; core system data of radio and television, satellite, communications, health, disease control, and hospital departments; and enterprise automated production, warehousing, and logistics system data.

    Databases must be able to efficiently manage and store these large data sets. If the database cannot handle additions, deletions, and modifications to so many kinds of large-scale data in a timely and effective manner, it will in less severe cases cause traffic or production accidents and personal injury, and in the worst cases endanger the operation of social infrastructure and national security.

  2. Complex query requirements: Many core business applications need to perform complex database queries, such as joining multiple tables, aggregating data, and performing complex filtering and sorting. If the database query engine is not efficient enough or the database design is unreasonable, query performance will be affected and become a bottleneck.

  3. High concurrent access: Core applications of governments, enterprises, and organizations usually need to support a large number of concurrent users. For example, the 12306 ticket booking website, social security website, tax website, e-commerce website, and social media platform all face the challenge of tens or even hundreds of millions of users accessing the database at the same time during peak periods. The database must be able to handle concurrent access requests efficiently, otherwise performance lags or application crashes will result.

  4. Transaction processing requirements: The database is the core of transactional applications. Applications in financial transactions, traffic control, inventory management, order processing and other fields require databases to ensure the integrity and consistency of transactions. If the database cannot handle transactions efficiently, data errors and inconsistencies can result.

  5. Data Security and Compliance: Data security and compliance require appropriate protection and auditing of the database. Encryption, access control, auditing capabilities, etc. all need to be supported in the database. If your database fails to provide adequate security and compliance, you may face data breaches and legal issues.

  6. Backup and Recovery: Database backup and recovery are critical data management tasks. Without an effective backup and recovery strategy, a database failure may result in data loss or corruption and long application downtime.

  7. Database design and index optimization: Database design and index optimization are critical to performance. Poor database design, lack of indexes, or incorrect index selection can cause performance problems.

  8. Application of new technologies: As new technologies continue to emerge, database systems need to continuously adapt to and integrate them, providing better performance and functionality while maintaining stability and reliability.

  9. Changes in business needs: The business needs and application scenarios of governments, enterprises, and organizations are constantly changing. The database system must be able to adapt flexibly to these new needs and changes, otherwise applications may fail or be unable to support new functionality.

To sum up, there are many reasons why the database has become a "chokepoint" technology: usually the combined impact of complex requirements, large-scale data, high concurrent access, performance limitations, and insufficient database management and optimization strategies. Solving these problems requires comprehensive consideration of database design, hardware and software configuration, performance optimization, fault tolerance, and disaster recovery, to ensure the database can meet the evolving needs of the whole country and society. This is extremely difficult work, and it cannot be accomplished by the strength of a few people alone.

In the current environment, studying the mature Oracle database in depth is a shortcut for the development of domestic databases. The experience Oracle has accumulated through more than 40 years of large-scale deployment across many countries and enterprises is well worth learning from at this stage.

Today we will look at the design of the Oracle Redo Log Buffer mechanism.

Although the Redo Log Buffer is the smallest memory structure in the Oracle SGA (System Global Area), it is a critical component, and its structure and purpose are well worth understanding.

Its main function is to record the changes that SQL statements executed by user processes make to data blocks in the database buffer cache. These changes are called redo log entries.

In the event that database recovery is required, these entries contain sets of information needed to reconstruct the changes made by operations such as INSERT, UPDATE, DELETE, CREATE, DROP, or ALTER.

The redo log buffer is a circular memory buffer. When it is full, new redo log entries are written from the beginning of the buffer, overwriting older entries that have already been written to disk.
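As a quick sanity check, you can see how much SGA memory is actually allocated to this buffer through the standard v$sgastat view (a simple illustration; the exact figures and layout vary by version):

SQL> select pool, name, bytes
  2  from v$sgastat
  3  where name = 'log_buffer';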

In the Oracle database, redo log entries are generated by user processes.

User processes are those that execute SQL statements, for example sessions running DML statements such as INSERT, UPDATE, and DELETE.

When user processes execute these SQL statements, they generate corresponding redo log entries, which record the performed data operations, transaction information, etc.

Redo log entries generated by user processes are used to ensure data consistency and durability.

When a user process generates redo log entries, they are first stored in the redo log buffer; the LGWR (Log Writer) process is then responsible for periodically writing the contents of the redo log buffer to the online redo log groups (Online Redo Log) on disk. This process ensures the durability and consistency of transactional data: even if the database fails, the data can be recovered from the redo log files.
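To see the online redo log groups that LGWR writes to, you can query the v$log view (an illustrative query; the number and size of groups depend on your configuration):

SQL> select group#, sequence#, bytes/1024/1024 as size_mb, members, status
  2  from v$log
  3  order by group#;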

LGWR writes to disk sequentially, while DBWR writes data blocks to disk in a scattered fashion, and scattered writes tend to be much slower than sequential ones. Because LGWR lets user sessions avoid waiting for DBWR to complete its slower writes, this design gives the database more efficient processing performance.

To understand more clearly the series of Redo Log operations the database performs when a user modifies a row of data, let's walk through the process step by step:

(Figure: Redo Log Buffer)

  1. The user issues an UPDATE SQL statement, usually as part of a transaction, and Oracle assigns a unique transaction ID to that transaction.

  2. The server process executes this SQL statement. Before execution, it must read the required data, index, and undo blocks into memory and lock the rows to be updated.

  3. Before performing the update, the server process attempts to obtain a redo copy latch (Redo Copy Latch). The purpose of this latch is to serialize access to the redo log buffer and prevent multiple server processes from modifying it at the same time, which would cause contention and reduce performance. If no free latch is available, other server processes cannot access the redo log buffer until the write to the buffer completes and the latch is released.


You can view the current status of the system's redo copy latches as follows:

SQL> col name for a13;
SQL> select name, gets, misses, immediate_gets, wait_time
  2  from v$latch_children
  3  where name='redo copy';

Querying the redo copy latches

  4. Once the redo copy latch is obtained, the server process then tries to obtain a redo allocation latch (Redo Allocation Latch). This latch is used to reserve space in the redo log buffer into which the redo log entries will be written. Once the process has acquired the redo allocation latch and successfully allocated space in the redo log buffer, it releases that latch.

The "Redo Allocation Latch" (redo allocation latch) is an internal locking mechanism in the Oracle database that is used to coordinate and manage the allocation of redo log buffers. The main function of this lock is to ensure that multiple server processes (user processes) can mutually allocate and use space in the buffer when writing new redo log entries to the redo log buffer to avoid data competition and confusion. .

Specifically, when a server process needs to allocate a certain amount of space in the redo log buffer to store new redo log entries, it will try to obtain a "Redo Allocation Latch" to gain control of the allocation. Once this lock is successfully acquired, the server process can safely write new redo log entries to the buffer without conflicting with the operations of other processes.

Importantly, once the allocation operation is completed, the server process usually releases the "Redo Allocation Latch" immediately so that other processes can also acquire and use the space in the redo log buffer. This mechanism allows multiple concurrent processes to perform write operations in the database while maintaining data consistency and integrity.

The Redo Allocation Latch can be viewed via:

SQL> select count(*) from v$latch_children where name='redo allocation';

Querying the redo allocation latches

  5. Next, the server process uses the redo copy latch it holds to write the redo entries into the redo log buffer. Redo entries include the original values of the updated data, the operation type, the transaction ID, and other information. After completing the write, the server process releases the redo copy latch.

  6. At the same time, the server process writes undo information to the undo (rollback) segment associated with the transaction. This undo segment is used if the user issues a ROLLBACK to undo the change.

  7. Finally, the server process completes the update and writes the modified values into the database buffer cache. These buffers are marked as dirty because the data in memory is now inconsistent with the data on disk.
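To observe this redo generation from your own session, one simple approach is to check the session-level 'redo size' and 'redo entries' statistics before and after an UPDATE (a sketch using the standard v$mystat and v$statname views):

SQL> select n.name, m.value
  2  from v$mystat m, v$statname n
  3  where m.statistic# = n.statistic#
  4  and n.name in ('redo size', 'redo entries');

Running this query immediately before and after an UPDATE shows the redo volume that single statement wrote into the redo log buffer.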

After understanding the working principle of the redo log buffer and the above process, we can further analyze when the LGWR (Log Writer) process writes the redo data in the redo log buffer to the redo log file.

A deep understanding of these operations is important to optimize redo log buffer performance.

The LGWR process flushes buffer data to disk when any of the following situations occurs:

  1. Every 3 seconds;
  2. When a COMMIT or ROLLBACK request occurs;
  3. When LGWR is asked to switch the log file (Redo Log File), e.g. via alter system switch logfile;
  4. When the redo log buffer (LOG_BUFFER) is 1/3 full, or the buffered redo data reaches 1 MB.

For these reasons, making the redo log buffer larger than a few tens of MB is pointless for most systems.

The exception is a large system with many concurrent transactions, where a larger redo log buffer may help: while LGWR, the process responsible for flushing the redo log buffer to disk, is writing log data out, other sessions may simultaneously need to fill the buffer with new data.

Generally speaking, if you have a long-running transaction that generates a large amount of redo, a larger-than-normal log buffer is beneficial, because while LGWR flushes the buffer contents to disk, the long transaction keeps writing new data into the buffer. The larger and longer the transaction, the more significant the benefit of a large log buffer.
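A common heuristic for judging whether the current buffer is adequate is to watch the system-wide redo statistics: if 'redo buffer allocation retries' or 'redo log space requests' keeps climbing, sessions are waiting for LGWR to free space. (A sketch; interpret these numbers relative to your workload rather than as absolute thresholds.)

SQL> select name, value
  2  from v$sysstat
  3  where name in ('redo buffer allocation retries', 'redo log space requests');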

The default size of the redo log buffer is controlled by the LOG_BUFFER parameter and varies significantly with different operating systems, database versions, and other parameter settings.

You can query the current LOG_BUFFER setting with the following statement; in this database it is about 7 MB (with an 8 GB SGA).

SQL> show parameter log_buffer

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
log_buffer			     big integer 7312K

Or query the SGA:

SQL> SHOW SGA

Total System Global Area 8589930576 bytes
Fixed Size		    8945744 bytes
Variable Size		 1476395008 bytes
Database Buffers	 7096762368 bytes
Redo Buffers		    7827456 bytes  		-- current LOG_BUFFER setting

While LGWR (Log Writer) writes redo entries from the redo log buffer to the redo log files, user processes can continue copying new redo entries into the parts of the buffer whose contents have already been written to disk. This is because in Oracle's redo mechanism the redo log buffer is a circular structure: new redo entries may overwrite older entries that have been safely written to disk.

This mechanism keeps the in-memory redo log buffer current with ongoing changes: even while LGWR is writing entries to disk, new redo entries generated by user processes can overwrite old entries that are already on disk.

Description of "Redo Log Buffer"

This design ensures that when the database needs to be restored, the redo log files can be used to re-apply the recent changes of the user process.

One of the main tasks of LGWR (Log Writer) is to ensure that there is always enough space in the redo log buffer to accommodate newly generated redo entries, even if this buffer is accessed frequently. If there is insufficient space in the redo log buffer, LGWR writes redo log entries to disk when needed to free up space for new entries to be written. In this case, LGWR will continuously flush redo records to disk to ensure that the buffer maintains sufficient space.

The database initialization parameter LOG_BUFFER defines the size of the redo log buffer. The default varies by version and platform (in older releases it was four times DB_BLOCK_SIZE; recent versions derive it from the SGA and CPU configuration). Generally, a larger LOG_BUFFER value reduces redo log file I/O, especially for long transactions or high transaction rates.
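To check whether your instance runs with the derived default or an explicit setting, you can query v$parameter (shown purely as an illustration):

SQL> select name, value, isdefault
  2  from v$parameter
  3  where name = 'log_buffer';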

In production databases with high transaction loads, an important optimization strategy is to increase the size of the redo log buffer (set LOG_BUFFER large enough). A sufficiently large redo log buffer is more likely to accommodate new redo records without frequent flushing to disk. This improves performance and reduces redo log file I/O, since writing to disk is typically far more time-consuming than writing to memory.

The redo log groups are used cyclically. When the current redo log file fills up, LGWR switches to the next one; before an old file may be overwritten, if the database is in archive (ARCHIVELOG) mode, the archiver process (ARCn) automatically copies its contents to an archived log file.
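You can verify the archiving mode with a quick query against v$database, or with the SQL*Plus ARCHIVE LOG LIST command (which requires the appropriate privilege):

SQL> select log_mode from v$database;

SQL> archive log list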

To optimize database performance, there are several factors to consider:

1. Set the LOG_BUFFER parameter appropriately to reduce redo log file I/O operations, especially under high transaction load.
2. Ensure that the size and number of redo log files are sufficient to accommodate database activity and avoid frequent redo log switching.
3. Monitor the database's performance indicators, especially redo-related wait events such as "log buffer space", and resolve these problems through appropriate configuration, as illustrated below.
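For point 3, a good starting place is the system-wide wait event view. (An illustrative query; on a healthy system 'log buffer space' waits should be rare compared with 'log file sync'.)

SQL> select event, total_waits, time_waited
  2  from v$system_event
  3  where event in ('log buffer space', 'log file sync', 'log file parallel write');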

Another optimization method is to enable the "Private Redo Parallelism" or "Zero Copy Redo" feature.

The zero-copy redo optimization is enabled or disabled through the Oracle internal (hidden) parameter "_LOG_PRIVATE_PARALLELISM". When this parameter is set to TRUE, the database engine attempts to use zero-copy techniques to improve the performance of redo log operations.
"Zero-copy redo"

The zero-copy redo optimization splits part of the Shared Pool into multiple areas, creating an independent private space for each server process, and in each private space builds redo change vector (Redo Change Vector) data structures that store that server process's redo changes.

The LGWR process can then write these redo change vectors directly to the redo log file on disk, without first copying them into the shared redo log buffer. This optimization can improve performance and reduce unnecessary resource overhead.
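On many recent versions you can see these private redo areas ("private strands") allocated in the shared pool via v$sgastat (a hedged check; the entry name and its presence vary by version and parameter settings):

SQL> select pool, name, bytes
  2  from v$sgastat
  3  where name like '%strand%';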

However, the practical value of the "Zero Copy Redo" feature depends on each database's specific operating environment and actual production needs. Before applying these strategies, carefully evaluate the database workload, hardware resources, and administrator experience, and perform detailed testing and evaluation in non-production environments to ensure they benefit database performance without introducing unnecessary risk.

The redo log buffer plays a key role in the Oracle database for recording and protecting data changes. It also requires careful configuration and monitoring to ensure the high performance and reliability of the database. Understanding its structure and purpose is critical to database management and performance optimization.

Finally, I would like to say that in the research and development of basic core software, you must not isolate yourself and reinvent the wheel behind closed doors. That is often mere self-satisfaction: besides moving yourself, it is more likely that you are heading in the wrong direction, and if the direction is wrong, the harder you work, the further you get from the goal. If you do not actively communicate and cooperate with developers from other countries, your vision will be limited and your thinking will become rigid.

In the database field, humbly seeking advice and learning from the mature software of developed countries is a critical step. In particular, we must learn from the advanced experience of European and American countries in the software field. These countries have rich experience and resources in software research and development, and their methods, technologies, and products are often among the best in the industry. Staying in touch with them, asking for their advice, and learning their best practices can help us keep making progress in core software research and development.

Maintaining cooperative relationships with experts and teams in European and American countries and participating in international open source projects or the formulation of international standards is also an effective way to increase knowledge and improve software development levels. By working with them, you can gain an in-depth understanding of the latest trends and development directions in the international software field, so as to better meet the needs of domestic users and the market.

In short, basic core software development requires open thinking and extensive cooperation. Learning and discussing problems with others, especially drawing on the advanced experience of European and American countries, will help improve our software development level and ultimately solve the "chokepoint" problem.

Origin: blog.csdn.net/GYN_enyaer/article/details/133466415