Oracle DUL working principle and technical realization

The DUL tool is the last resort for rescuing data from an Oracle database. By the time DUL is used, the database usually can no longer be started, and some data files may even be damaged. So how does DUL export data under such extreme conditions? Let's analyze its working principle step by step. If you want to develop a similar tool yourself, this article will also tell you what to do and how to do it.

An Oracle database is, at its core, a collection of data managed by software, and reading data is only a small part of what that software does. The most important data is user data, which is stored in data files in a well-defined format. Interpreting that raw storage as the tables we see requires metadata, which Oracle calls the data dictionary. Let's first take a look at what the data dictionary looks like.

Data Dictionary

Oracle's data dictionary is itself composed of tables, the most important being obj$, tab$, and col$. The obj$ table records each object's name, object ID, data object ID, and owner ID. The tab$ table records a table's attributes; most importantly, it records where the table starts: in which data file and at which block. The col$ table records the table's column attributes, including each column's name, column ID, position within the stored row, data type, and length. With the information in col$, Oracle can interpret the format in which rows are stored inside a data block.
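The relationship between the three tables can be sketched as follows. This is a hypothetical in-memory miniature for illustration: the row layouts and the object ID 54321 are invented, not the real dictionary column lists.

```python
# Hypothetical miniature of the three core dictionary tables, showing how
# obj$, tab$ and col$ relate when a DUL-style tool rebuilds table metadata.
OBJ = {54321: {"name": "EMP", "owner": 10, "dataobj": 54321}}          # obj$: object id -> name/owner
TAB = {54321: {"file": 1, "block": 520}}                               # tab$: object id -> segment header
COL = {54321: [("EMPNO", "NUMBER", 22), ("ENAME", "VARCHAR2", 30)]}    # col$: object id -> column defs

def describe(obj_id):
    """Join the three tables to recover a table's name, location and columns."""
    o, t = OBJ[obj_id], TAB[obj_id]
    cols = ", ".join(f"{n} {ty}({ln})" for n, ty, ln in COL[obj_id])
    return f"{o['name']} @ file {t['file']} block {t['block']}: {cols}"

print(describe(54321))
```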

The location of the table in the data file

As noted above, two fields in tab$ specify a table's starting position: FILE#, which indicates the data file the table is in, and BLOCK#, which indicates the block at which the table starts. This starting block is called the segment header block. It contains a list of extent addresses, called an extent map; an extent is a run of consecutive data blocks. With the extent map, you can read the table's data directly from those blocks. What if a table is so large that the segment header block cannot hold all its extents? In that case Oracle records in this block the address of the next extent-map block, and the chain continues until all extents are listed.
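The walk described above can be sketched like this. The segment-header and extent-map structures here are hypothetical in-memory stand-ins, not the on-disk encoding; the point is the iteration pattern, including the chained "next map block".

```python
# Sketch of walking an extent map, including the chained next-map block
# described above.  Addresses are (file#, block#) tuples for clarity.
SEG_HEADERS = {
    (1, 520): {"extents": [(1, 521, 8), (1, 600, 8)], "next": (1, 900)},
    (1, 900): {"extents": [(2, 10, 16)], "next": None},
}

def table_blocks(file_no, block_no):
    """Yield (file, block) for every data block of the table."""
    addr = (file_no, block_no)
    while addr is not None:
        hdr = SEG_HEADERS[addr]
        for f, start, count in hdr["extents"]:   # extent = run of consecutive blocks
            for b in range(start, start + count):
                yield (f, b)
        addr = hdr["next"]                       # follow the chained extent-map block

blocks = list(table_blocks(1, 520))
```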

With the above knowledge, we can read table data from a data file. But before we start, there seems to be a small problem: how do we read the data dictionary tables themselves? The data dictionary is also made of tables, and a table's segment header position comes from tab$, which we have not yet read; it looks like a chicken-and-egg situation. Don't worry: Oracle faces the same problem at startup. How does it solve it? When Oracle starts, it first builds a table called bootstrap$ in memory. This table stores the creation statements for the core dictionary tables, including obj$, tab$, and col$ mentioned above. The interesting thing is that each creation statement also records the position of that table's segment header block. That makes things easy: go straight to that position, walk the extent map to find all blocks belonging to the table, and parse those blocks to obtain the data dictionary.
If you are still wondering where the starting position of bootstrap$ itself is stored, good, that means you are thinking it through. It is stored in block 1 of data file 1, which holds the file header information. A field there called root dba contains the address of the segment header block of bootstrap$.
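A DBA (data block address) such as the root dba is a 32-bit value that packs the relative file number into the top 10 bits and the block number into the low 22 bits. A minimal decoder, using 0x00400208 as an example value (which decodes to file 1, block 520, a common default location for the bootstrap$ segment header with an 8 KB block size):

```python
# Decode/encode an Oracle DBA: top 10 bits = relative file#, low 22 bits = block#.
def decode_dba(dba):
    return dba >> 22, dba & 0x3FFFFF    # (file#, block#)

def encode_dba(file_no, block_no):
    return (file_no << 22) | block_no

file_no, block_no = decode_dba(0x00400208)   # -> file 1, block 520
```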

Data block

A data block contains the table's rows and has a well-defined structure. It starts with the block header and transaction information, followed by the ITL (interested transaction list). Each ITL entry, called a transaction slot, has a fixed size, and the number of slots in the block is recorded in the transaction information. Next comes the data header, then the table directory, then the row directory, which records the location of each row. Finally comes the row data itself, which is stored from the bottom of the block upward, so there may be free space between the row directory and the actual rows. The block structure is fairly complicated; fortunately, Oracle ships a tool called bbed that can open a data block and knows these structures in detail, field by field, so you can easily inspect how data is stored.
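The "row directory pointing at rows packed from the bottom up" idea can be sketched with a toy block. The offsets (row count at byte 14, directory at byte 16) and sizes here are invented for illustration; real blocks use the structures that bbed documents (kcbh, ktbbh, kdbh and so on).

```python
import struct

# Toy block: a fake header, a row directory of 2-byte offsets, and row data
# packed from the bottom of the block upward, as described in the text.
def build_toy_block(rows, size=256):
    buf = bytearray(size)
    pos = size
    offsets = []
    for r in rows:                      # rows grow downward from the block end
        pos -= len(r)
        buf[pos:pos + len(r)] = r
        offsets.append(pos)
    buf[14:16] = struct.pack("<H", len(rows))          # fake row count
    for i, off in enumerate(offsets):                  # row directory at byte 16
        struct.pack_into("<H", buf, 16 + 2 * i, off)
    return bytes(buf)

def read_rows(block):
    n, = struct.unpack_from("<H", block, 14)
    out, prev = [], len(block)
    for i in range(n):
        off, = struct.unpack_from("<H", block, 16 + 2 * i)
        out.append(block[off:prev])     # rows are contiguous from the bottom
        prev = off
    return out

rows = read_rows(build_toy_block([b"SMITH", b"JONES"]))
```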

LONG data type

LONG data is generally long and easily causes row chaining; a table created with too many columns can cause it as well. Row chaining means one row of data is spread across two or more data blocks. How does Oracle handle this? At the start of each row there is a flag byte called fb, which indicates whether the row continues in another block. If it does, a field called nrid appears, giving the address where the remaining data continues; it encodes a block address and an offset within that block. If that next block still cannot hold the rest of the row, it in turn contains another nrid, and the chain continues until the end of the row.
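Reassembling a chained row is a simple pointer walk. The piece encoding below is hypothetical; only the names (fb, nrid) mirror the text, with nrid modeled as a (file, block, slot) tuple.

```python
# Sketch of chained-row reassembly: follow nrid pointers until the row ends.
PIECES = {
    (1, 100, 0): {"data": b"part1-", "nrid": (1, 101, 0)},
    (1, 101, 0): {"data": b"part2-", "nrid": (1, 102, 3)},
    (1, 102, 3): {"data": b"part3",  "nrid": None},       # last piece: no nrid
}

def read_chained_row(rid):
    out = b""
    while rid is not None:
        piece = PIECES[rid]
        out += piece["data"]
        rid = piece["nrid"]     # None means the row is complete
    return out

row = read_chained_row((1, 100, 0))
```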

LOB data type

LOB is a large-object data type introduced to replace LONG. When the value is small, it is stored inline in the table's own block. When it is large, it is stored in a separate segment called the LOB segment. The location of the data within the LOB segment is specified by a structure called the LOB locator, which is stored in the table's data block, so when a LOB column is read, the data can be found through the locator.

LOB index

In practice, LOB storage is quite involved. By default, the LOB column in the table's data block stores not only the locator but also the addresses of some LOB data blocks, and the data is read directly through those addresses. However, the number of addresses that can be stored is limited by the length of the in-row LOB information; by default the maximum is 12. Beyond that, the locator must be used, and the locator cannot find the LOB segment blocks directly: it is actually a key into the LOB index. Looking up that key in the index yields the series of LOB block addresses through which the data is read.
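The three read paths described above (inline value, in-row list of up to 12 block addresses, LOB index lookup) can be sketched as one dispatch function. All structures here are simplified stand-ins, not the real locator or index formats.

```python
# Sketch of the BasicFile LOB read paths: inline, in-row dba list, or index.
LOB_BLOCKS = {101: b"AAAA", 102: b"BBBB", 103: b"CCCC"}   # LOB segment blocks
LOB_INDEX = {"lobid-7": [101, 102, 103]}                  # locator key -> dba list

def read_lob(locator):
    if locator.get("inline") is not None:     # small LOB stored in the row itself
        return locator["inline"]
    dbas = locator.get("dbas")
    if dbas is None:                          # too many blocks: go via the LOB index
        dbas = LOB_INDEX[locator["key"]]
    return b"".join(LOB_BLOCKS[d] for d in dbas)

small = read_lob({"inline": b"hi"})
medium = read_lob({"inline": None, "dbas": [101, 102]})
large = read_lob({"inline": None, "dbas": None, "key": "lobid-7"})
```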

SecureFile

The LOB storage format described above is called BasicFile LOB. Starting with 11g, Oracle introduced a new format called SecureFile LOB. It largely does away with the LOB index and instead puts the LOB's block addresses directly in the header block of the LOB segment, so the data can be read straight from the addresses there. But what if the LOB is so large that the header block cannot hold all the addresses? Oracle places four address fields in the header block, called dba0, dba1, dba2, and dba3, forming a four-level internal tree. dba0 acts as a leaf node managing many LOB data block addresses. When dba0 is full, dba1 appears as its parent, managing many dba0-like leaf nodes, each of which holds many LOB data block addresses. When dba1 is full, a dba2 node appears, and so on; by the time dba3 is reached, the amount of data the structure can manage far exceeds the maximum LOB size. All LOB data can be read by traversing this tree.
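The dba0..dba3 hierarchy is just a tree where leaves hold data-block addresses and inner nodes hold children, so one recursive walk covers every level. Node shapes below are invented for illustration.

```python
# Sketch of traversing a SecureFile-style dba tree: leaves (dba0-like nodes)
# hold LOB data block addresses, inner nodes (dba1..dba3-like) hold children.
DATA = {1: b"x", 2: b"y", 3: b"z", 4: b"w"}   # LOB data blocks by address

def read_tree(node):
    if node["leaf"]:
        return b"".join(DATA[d] for d in node["dbas"])
    return b"".join(read_tree(child) for child in node["kids"])

tree = {"leaf": False, "kids": [            # dba1-level node
    {"leaf": True, "dbas": [1, 2]},         # dba0-level leaves
    {"leaf": True, "dbas": [3, 4]},
]}
lob = read_tree(tree)
```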

Recycle bin

If you drop a table, starting from 10g it is not really deleted by default: the table is merely renamed, and the original name is recorded in a table known as the recycle bin. If you change your mind, a single command can restore it, which is good news after an accidental drop. Since the renamed table is no different from a normal table, we can recover it using the knowledge above.

Truncate table

If a table is truncated, you really may not be able to access the original data any more, and regret alone will not bring it back. Can the data be retrieved with the methods introduced so far? Dump the table's segment header block and you will find that its extent map has been cleared, so the data can no longer be reached by walking the extent map. But there is still a way: all the data is still sitting in the data files. We can scan every block in the data files, pick out the blocks whose object ID matches this table, and parse the rows out of those blocks. It just takes more time, and care must be taken not to miss any data file. Practice has shown that the data can indeed be read back this way.
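The full-scan recovery can be sketched as follows. Placing a 4-byte object ID at a fixed offset is a simplification for illustration; the real block header layout is the one bbed documents.

```python
import struct

# Full-scan recovery after TRUNCATE: ignore the cleared extent map and test
# every block's object id instead.
def make_block(objd, payload):
    """Build a toy 32-byte block: 4-byte object id, then padded payload."""
    return struct.pack("<I", objd) + payload.ljust(28, b"\0")

def scan_for_table(blocks, objd):
    hits = []
    for b in blocks:
        if struct.unpack_from("<I", b, 0)[0] == objd:   # block belongs to our table
            hits.append(b)
    return hits

datafile = [make_block(54321, b"EMP row 1"), make_block(999, b"other table"),
            make_block(54321, b"EMP row 2")]
found = scan_for_table(datafile, 54321)
```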

Drop table

With the truncate experience above, a dropped table is easy to handle. The change to the segment header block is almost the same as for truncate. The difference is that the table's rows in the data dictionary must be restored first: its records in obj$, tab$, and col$ have all been deleted. How can they be restored? Remember the fb field mentioned earlier at the front of each row: Oracle does not actually erase a deleted row, it merely sets a mark in fb. Clear that mark and the dictionary records come back. Then scan the data files again for the blocks belonging to the table, and the data can be recovered.
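Undeleting those dictionary rows amounts to clearing one flag bit. The 0x20 mask and row shape below are illustrative assumptions, not the documented fb bit layout.

```python
# Sketch of "undeleting" dictionary rows by clearing a delete mark in fb.
DELETED = 0x20   # hypothetical delete-flag bit, for illustration only

def undelete(rows):
    """Clear the delete mark so obj$/tab$/col$ rows become visible again."""
    for r in rows:
        r["fb"] &= ~DELETED
    return rows

dict_rows = [{"fb": 0x2C, "name": "EMP"},    # marked deleted
             {"fb": 0x0C, "name": "DEPT"}]   # live row, unaffected
undelete(dict_rows)
```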

Data dictionary corruption

The most serious situation is when part of the data files has been damaged; then complete recovery cannot be guaranteed. First, try to read the data dictionary. Oracle fixes the segment header positions of the core dictionary tables, so you can find a database of the same version and take the dictionary segment header positions from its bootstrap$ (or from its tab$), then try to export the data dictionary from those locations. If the dictionary can be exported, the rest of the work is the same as before.
The worst case is that the data files of the SYSTEM tablespace are lost or severely damaged and the data dictionary can no longer be exported. What then? The only option is to rebuild a data dictionary from the data itself: scan all the data files and record the position of every segment header block. Each segment header corresponds to a table or a partition, which recovers each table's segment header position. Next, the col$ information must be reconstructed, mainly data types and lengths. Some types have fixed lengths, such as DATE and TIMESTAMP, and are easy to guess. NUMBER also has its own byte-level characteristics and is easy to identify. What remains is character data; anything that cannot be guessed can be treated as character type first. Then export part of the data using the reconstructed dictionary, manually compare it to pin down each column's type, and finally export the data.
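The type-guessing step can be sketched as below. The DATE check reflects the real 7-byte encoding (century and year stored excess-100); the NUMBER check here is a deliberately crude ASCII-digit stand-in rather than Oracle's exponent-byte format, and "fall back to character type" matches the text.

```python
# Guessing column types when col$ is gone: recognise fixed-width DATE values,
# try NUMBER, and treat anything unrecognised as character data first.
def looks_like_date(v):
    # Oracle DATE: 7 bytes, byte0=century+100, byte1=year+100, then M/D/H/M/S.
    return len(v) == 7 and 100 <= v[0] <= 121 and 1 <= v[2] <= 12 and 1 <= v[3] <= 31

def guess_type(values):
    if all(looks_like_date(v) for v in values):
        return "DATE"
    if all(v.isdigit() for v in values):   # crude stand-in for NUMBER detection
        return "NUMBER"
    return "VARCHAR2"                      # safe default per the text

date_val = bytes([120, 121, 12, 31, 1, 1, 1])   # encodes 2021-12-31 00:00:00
col_type = guess_type([date_val])
```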


Source: blog.51cto.com/13641771/2586642