Database Kernel Explained (2): Let's Build a Database System Architecture Together!

To put up a building, you first need a detailed, well-drawn design. Building a system also requires a blueprint. If you keep your head down and just build whatever comes to mind, the system will collapse later on. In fact, this is one of the reasons architects exist.

One day, while moving bricks, you suddenly say to yourself: "No, I can't spend all day doing inserts, deletes, updates, and queries. I want to be a real programmer girl, so let me build a database for other people to insert, delete, update, and query!"

No sooner said than done. You go to a certain "Treasure" shopping site, buy a large hard drive, connect it to your computer, put it in your room, and declare: "This is the warehouse where my database will live!"

Next, to make a database, you must first draw its blueprint (architecture diagram)!
Let's have a little sense of ceremony first and give the blueprint file a name: let's call it dggz (big brother, pay attention)!

Now that we have our own hard disk, the first step is to draw a data warehouse on dggz.
With a data warehouse in place, you run into a problem: the data warehouse needs a disk manager to access files. You have three options.

1. Use the file system provided by the current operating system as the disk manager for your data warehouse.
2. Write a disk manager yourself.
3. Adapt the existing file system so that it suits your data warehouse.

A database mostly works with the data tuples stored in it, and the existing file system's unit of storage, the file, is not a great fit for that. Could you write a disk manager yourself? That would be the best fit for our data warehouse, but the cost in time and labor is too high. That leaves the third, compromise option: modify the existing file system to adapt it to our data warehouse.
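
One way to picture the third option is a thin layer that treats an ordinary OS file as an array of fixed-size blocks. The sketch below is only an illustration under assumptions: the `DiskManager` class, its method names, and the 4096-byte `BLOCK_SIZE` are all made up for this example and are not taken from any particular database.

```python
import os

BLOCK_SIZE = 4096  # assumed block size in bytes


class DiskManager:
    """Reads and writes fixed-size blocks inside one ordinary OS file."""

    def __init__(self, path):
        # Create the file if it does not exist, then open it for read/write.
        if not os.path.exists(path):
            open(path, "wb").close()
        self.file = open(path, "r+b")

    def num_blocks(self):
        self.file.seek(0, os.SEEK_END)
        return self.file.tell() // BLOCK_SIZE

    def allocate_block(self):
        """Append one zero-filled block and return its block number."""
        block_no = self.num_blocks()
        self.write_block(block_no, bytes(BLOCK_SIZE))
        return block_no

    def write_block(self, block_no, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block must be exactly BLOCK_SIZE bytes")
        self.file.seek(block_no * BLOCK_SIZE)
        self.file.write(data)
        self.file.flush()

    def read_block(self, block_no):
        if block_no >= self.num_blocks():
            raise IndexError("no such block")
        self.file.seek(block_no * BLOCK_SIZE)
        return self.file.read(BLOCK_SIZE)

    def close(self):
        self.file.close()
```

The rest of the system then talks in block numbers instead of file offsets, which is the "adaptation" the compromise option is after.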

Just do it. If we want to adapt the existing file system to our warehouse, the warehouse has some problems to solve.

1. How to represent a data item, such as a name or an age. In the warehouse these are stored as bytes of types like varchar and integer, but exactly how to lay them out is something we have to decide ourselves.
2. How to represent a record, which is a collection of data items. For example, if name and age make up the schema of the user table, then one record is (Xiaoming, 18). How should this record be represented in the file?
3. Data is unlikely to be read into memory one record at a time; reading one block at a time is more appropriate. So how should the records be organized within a block?
4. How to modify a record. For example, what if the record no longer fits in its block after the modification?
5. How should a block be represented in the file?
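
To make these questions a little more concrete, here is a minimal sketch of one possible answer: a record encoded as bytes, stored in a "slotted" block whose slot directory grows from the front while record data grows from the back. The byte layout, the function names, and the 4096-byte block size are assumptions chosen for illustration, not the only (or the author's) design.

```python
import struct

BLOCK_SIZE = 4096                # assumed block size in bytes
HEADER = struct.Struct("<HH")    # (slot count, start of the record data area)
SLOT = struct.Struct("<HH")      # (record offset, record length)


def encode_record(name, age):
    """varchar: 2-byte length + UTF-8 bytes; integer: 4 bytes, little-endian."""
    raw = name.encode("utf-8")
    return struct.pack("<H", len(raw)) + raw + struct.pack("<i", age)


def decode_record(data):
    (n,) = struct.unpack_from("<H", data, 0)
    name = data[2:2 + n].decode("utf-8")
    (age,) = struct.unpack_from("<i", data, 2 + n)
    return name, age


def new_block():
    block = bytearray(BLOCK_SIZE)
    HEADER.pack_into(block, 0, 0, BLOCK_SIZE)  # no slots, data area is empty
    return block


def insert_record(block, record):
    """Store a record in the block; return its slot number, or None if full."""
    count, data_start = HEADER.unpack_from(block, 0)
    slot_end = HEADER.size + (count + 1) * SLOT.size
    new_start = data_start - len(record)
    if new_start < slot_end:
        return None  # not enough free space; the caller must use another block
    block[new_start:data_start] = record
    SLOT.pack_into(block, HEADER.size + count * SLOT.size, new_start, len(record))
    HEADER.pack_into(block, 0, count + 1, new_start)
    return count


def read_record(block, slot_no):
    offset, length = SLOT.unpack_from(block, HEADER.size + slot_no * SLOT.size)
    return bytes(block[offset:offset + length])
```

For example, `insert_record(new_block(), encode_record("Xiaoming", 18))` returns slot 0. The `None` return is where question 4 bites: a record that no longer fits has to be moved to another block.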

Finally, after a lot of hard work, we solve the problems above and finish adapting the existing file system. Now we can add a storage manager to dggz.
Next, we have to consider read efficiency. If data is read directly from disk every time, our database will not be very efficient, so we set up a buffer in memory to exchange data with the disk and keep some data around, reducing the number of disk accesses. So we can add a buffer to dggz.
With this buffer, buffer management comes into play. Consider the following questions:
1. How to design the structure of the buffer?
2. How to design the buffer replacement algorithm?
3. How to manage the buffer, that is, how to bring data from the disk into the buffer?
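
As a rough illustration of these three questions, the sketch below keeps a fixed number of blocks in memory and uses least-recently-used (LRU) replacement. LRU is just one common choice picked here for the example; the `BufferPool` class and its `read_block`/`write_block` callbacks are hypothetical names, and real buffer managers also track things like pin counts that are left out.

```python
from collections import OrderedDict


class BufferPool:
    """Caches blocks in memory; evicts the least recently used one when full."""

    def __init__(self, capacity, read_block, write_block):
        self.capacity = capacity
        self.read_block = read_block      # callable: block_no -> bytes
        self.write_block = write_block    # callable: (block_no, bytes) -> None
        self.frames = OrderedDict()       # block_no -> bytearray, oldest first
        self.dirty = set()                # blocks changed since they were loaded

    def get(self, block_no):
        if block_no in self.frames:
            self.frames.move_to_end(block_no)   # mark as most recently used
            return self.frames[block_no]
        if len(self.frames) >= self.capacity:
            self._evict()
        frame = bytearray(self.read_block(block_no))  # miss: go to disk
        self.frames[block_no] = frame
        return frame

    def mark_dirty(self, block_no):
        self.dirty.add(block_no)

    def _evict(self):
        victim, frame = self.frames.popitem(last=False)  # least recently used
        if victim in self.dirty:
            self.write_block(victim, bytes(frame))       # write back first
            self.dirty.discard(victim)
```

The structure is a dictionary of frames, the replacement algorithm is LRU, and the management policy is "load on miss, write back dirty blocks on eviction".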

After solving the above problems, we can draw a buffer manager on dggz.
What's next? We want to read data from the disk, but we can't scan the entire table every time. We definitely need an index, along with a file manager to record and manage the index. Using the index, file, and record information, we read files in the warehouse through the buffer.
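
To show why an index avoids a full table scan, here is a minimal in-memory sketch that keeps keys sorted and finds them by binary search. It is a simplification chosen for this example: the `SortedIndex` class and the (block number, slot number) location format are assumptions, and real systems usually use disk-based structures such as B+ trees instead.

```python
import bisect


class SortedIndex:
    """Maps a key to record locations, e.g. (block number, slot number)."""

    def __init__(self):
        self.keys = []       # kept sorted at all times
        self.locations = []  # locations[i] belongs to keys[i]

    def insert(self, key, location):
        i = bisect.bisect_right(self.keys, key)
        self.keys.insert(i, key)
        self.locations.insert(i, location)

    def lookup(self, key):
        """Return every location stored under key, found by binary search."""
        lo = bisect.bisect_left(self.keys, key)
        hi = bisect.bisect_right(self.keys, key)
        return self.locations[lo:hi]

    def range(self, low, high):
        """Return locations for all keys with low <= key <= high."""
        lo = bisect.bisect_left(self.keys, low)
        hi = bisect.bisect_right(self.keys, high)
        return self.locations[lo:hi]
```

With an index on age, a query for age 18 only needs to fetch the blocks that `lookup(18)` points to, rather than every block of the table.
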
Once we have a complete set of tools for operating on our database, we must consider opening it to the outside world. That requires SQL statements, and to define and parse our SQL statements we need a query compiler.
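
As a toy illustration of what "compiling" a query means, the sketch below turns one very restricted shape of SELECT statement into a small plan dictionary. Everything here is an assumption made for the example: the `compile_query` name, the plan format, and the supported grammar (a single optional WHERE condition using =, < or >). A real query compiler has a proper lexer, parser, and optimizer.

```python
import re

# Matches: SELECT <cols> FROM <table> [WHERE <col> <op> <value>] [;]
PATTERN = re.compile(
    r"^\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<col>\w+)\s*(?P<op>=|<|>)\s*(?P<val>\w+))?\s*;?\s*$",
    re.IGNORECASE,
)


def compile_query(sql):
    """Turn a restricted SELECT statement into a plan dictionary."""
    m = PATTERN.match(sql)
    if m is None:
        raise ValueError("unsupported statement: " + sql)
    columns = [c.strip() for c in m.group("cols").split(",")]
    plan = {"table": m.group("table"), "columns": columns, "filter": None}
    if m.group("col"):
        raw = m.group("val")
        value = int(raw) if raw.isdigit() else raw  # numbers become integers
        plan["filter"] = (m.group("col"), m.group("op"), value)
    return plan
```

For instance, `compile_query("SELECT name FROM user WHERE age > 17;")` yields a plan naming the table `user`, the column list `["name"]`, and the filter `("age", ">", 17)`.
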
At this point, after these statements are parsed, we definitely need something to execute them, and so the execution engine appears.
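
One common way to structure an execution engine is as a chain of small operators that pull rows from one another. The sketch below assumes a hypothetical plan format (a dictionary with `table`, `columns`, and `filter` keys) and in-memory tables made of Python dictionaries; both are stand-ins chosen for illustration, not a real engine's interfaces.

```python
import operator

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def scan(table):
    """Produce every row of the table, one at a time."""
    for row in table:
        yield row


def select(rows, column, op, value):
    """Keep only the rows whose column satisfies the comparison."""
    compare = OPS[op]
    for row in rows:
        if compare(row[column], value):
            yield row


def project(rows, columns):
    """Keep only the requested columns of each row."""
    for row in rows:
        yield {c: row[c] for c in columns}


def execute(plan, tables):
    """Build the operator chain described by the plan and run it."""
    rows = scan(tables[plan["table"]])
    if plan["filter"] is not None:
        rows = select(rows, *plan["filter"])
    if plan["columns"] != ["*"]:
        rows = project(rows, plan["columns"])
    return list(rows)
```

Because each operator is a generator, rows flow through the chain one at a time instead of being materialized at every step.
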
With that, our simple database is complete, but such a database still has many shortcomings in its design and cannot be commercialized. For example, if data is lost, is there any way to restore our database? This is where you need a log and a recovery manager.
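
The idea behind a log can be sketched in a few lines: record every change in an append-only log before applying it, and after a crash rebuild the data by replaying the log. In the sketch below a Python list stands in for a log file on disk, and the `LoggedStore` class and `recover` function are hypothetical names; this is a simplified redo-only illustration, not a full recovery manager.

```python
import json


class LoggedStore:
    """A key-value store that logs every change before applying it."""

    def __init__(self):
        self.log = []    # stands in for an append-only log file on disk
        self.data = {}   # stands in for the actual stored data

    def put(self, key, value):
        # Write-ahead rule: append the log record first, then change the data.
        self.log.append(json.dumps({"key": key, "value": value}))
        self.data[key] = value


def recover(log):
    """Rebuild the data by replaying every log record in order (redo)."""
    data = {}
    for line in log:
        record = json.loads(line)
        data[record["key"]] = record["value"]
    return data
```

If the data is lost but the log survives, `recover(store.log)` reproduces the latest value of every key.
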
With the log in place, we have to consider the next thing. Suppose we operate on the database for a bank transfer: A transfers 1 million to B's account. A's account has already been debited when the banking system crashes and then recovers, and B's account never receives the 1 million, so a lawsuit follows. What should we do? We definitely cannot allow this to happen, so let's combine "subtract 1 million from A's account" and "add 1 million to B's account" into one group of operations that either all happen or none happen. We call this a transaction. It ensures the correctness and safety of our database, so we add a transaction manager to the database. Of course, transactions need to be recorded in the log to be safe.
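
The transfer example can be sketched with a log that records a commit marker: on recovery, only the writes of transactions that reached their commit record are applied. The tuple-based log format and the `transfer` and `recover` functions below are assumptions made for this illustration; the `crash_after_debit` flag merely simulates the crash from the story.

```python
def transfer(log, txn_id, balances, src, dst, amount, crash_after_debit=False):
    """Log a transfer as one transaction: debit, credit, then commit."""
    log.append(("begin", txn_id))
    log.append(("write", txn_id, src, balances[src] - amount))
    if crash_after_debit:
        return  # simulated crash: the credit and the commit are never logged
    log.append(("write", txn_id, dst, balances[dst] + amount))
    log.append(("commit", txn_id))


def recover(initial, log):
    """Apply only the writes of transactions that have a commit record."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    balances = dict(initial)
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, account, new_balance = rec
            balances[account] = new_balance
    return balances
```

If the crash happens after the debit, recovery ignores the half-finished transaction and both balances stay as they were; if the commit record made it into the log, both the debit and the credit are applied.
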
At this point the safety and correctness of the database are taken care of, but a database is used by many people at the same time, and what we have built so far is single-user. If multiple people use it together, things can still go wrong: for example, if two users access the same data at the same time, one reading and one writing, incorrect results can occur. So concurrency control must be added. Of course, mainstream concurrency control today still relies on locks, so we need a lock table.
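
A lock table can be pictured as a dictionary from data items to the transactions currently holding a lock on them. The sketch below assumes two lock modes, shared ("S") for reading and exclusive ("X") for writing; the `LockTable` class and its methods are hypothetical, and waiting queues and deadlock handling are deliberately left out.

```python
class LockTable:
    """Tracks which transactions hold shared (S) or exclusive (X) locks."""

    def __init__(self):
        self.locks = {}  # item -> (mode, set of holder transaction ids)

    def acquire(self, txn, item, mode):
        """Try to lock item in mode 'S' or 'X'; return True if granted."""
        if item not in self.locks:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = self.locks[item]
        if holders == {txn}:
            # The sole holder may re-lock the item or upgrade from S to X.
            new_mode = "X" if "X" in (held_mode, mode) else "S"
            self.locks[item] = (new_mode, holders)
            return True
        if held_mode == "S" and mode == "S":
            holders.add(txn)  # any number of readers can share the item
            return True
        return False  # conflict: the caller must wait or abort

    def release(self, txn, item):
        mode, holders = self.locks.get(item, (None, set()))
        holders.discard(txn)
        if not holders:
            self.locks.pop(item, None)  # nobody holds the item any more
```

Two readers get through together, while a writer is refused until the readers release, which is exactly the "one read and one write" conflict described above.
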
Well, everyone who builds systems knows they need an administrator, so let's bring in a database administrator.

And there you have it: a complete database system, fresh out of the oven!
You look at your bricks with satisfaction and move them even harder.

After digging so many holes, the next chapter finally starts filling them in! We'll talk about memory in the next chapter!
Finally, the same words as always: big brothers and sisters, writing code isn't easy, so please leave a like. I won't mind a tip, either.

Origin blog.csdn.net/qq_34364255/article/details/108680139