"Operating System Restore Truth" reading notes, Chapter 0

0. How software accesses hardware

IO interface

Because the operating system cannot keep up with the driving methods of every kind of hardware, a variety of hardware adapter devices arose; these are the IO interfaces. An interface is a standard: hardware that works according to the standard becomes universally usable.
Hardware is generally divided into serial and parallel input/output devices, and the corresponding interfaces are serial interfaces and parallel interfaces. Serial devices transfer data with one another, or with the CPU, through a serial interface. Parallel devices are accessed in a similar way, except that they go through a parallel interface.

Two ways to access external hardware

  1. Memory mapping: the peripheral's memory is mapped into a range of the address space. When the CPU puts an address in that range on the address bus, the access lands in the peripheral's memory, so the CPU can access the peripheral's memory just as it accesses ordinary physical memory. Some devices work this way, such as the video card. The video card is the display adapter: the CPU does not interact with the monitor directly, it communicates only with the video card. The memory on the video card is called video memory, and it is mapped to 0xB8000 ~ 0xBFFFF in the low 1MB of the host's physical memory. Accessing this memory is accessing the video card, and a byte written to it is content to be printed on the screen.
  2. Port IO: the peripheral communicates with the CPU through an IO interface. When the CPU accesses a peripheral, it is really accessing the IO interface, which passes the information on to the peripheral at the other end. In other words, the CPU never knows these devices exist; it only knows the IO interface it operates on. How is the IO interface accessed? The IO interface has a number of ports, and the processor deals with peripherals through these ports (Port). A port is essentially a register, similar to the registers inside the processor. The only difference is that these registers, called ports, are located in the IO interface circuit.
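
The memory-mapped case can be illustrated with a small simulation. This is only a sketch: Python cannot touch real video memory, so a bytearray stands in for the mapped window. The 80-column layout with one character byte followed by one attribute byte per cell is the usual VGA text-mode convention and is assumed here, not stated in the notes. The names put_char and vram are made up for illustration.

```python
VRAM_BASE = 0xB8000          # start of the text-mode window from the notes
VRAM_END = 0xBFFFF           # last byte of the window
COLUMNS = 80                 # assumed: standard 80-column text mode

# Simulated video memory; on real hardware this is the card's own memory.
vram = bytearray(VRAM_END - VRAM_BASE + 1)

def put_char(row, col, ch, attr=0x07):
    """Store one character cell and return the physical address it maps to."""
    offset = (row * COLUMNS + col) * 2   # two bytes per cell: char, attribute
    vram[offset] = ord(ch)
    vram[offset + 1] = attr              # 0x07: light grey on black
    return VRAM_BASE + offset

addr = put_char(0, 1, "A")
print(hex(addr))   # 0xb8002
```

On real hardware the same effect comes from an ordinary memory write to that physical address; no special instruction is involved, which is the point of memory mapping.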

1. What applications and the operating system are, and how they fit together

Applications and operating systems are both software.
The operating system is something people came up with: a set of management methods created to make managing the computer convenient.
Applications are written in some language, and the language is provided by a compiler. In a sense there is no language, only compilers. The compiler decides how certain keywords and grammar are interpreted. A language is merely an agreement between the compiler and its users: as long as code is written in the agreed way, the compiler translates it into machine instructions. What behavior the code is translated into depends on the compiler, not on the language. Take the printf function in C: nothing says this function must print characters to the screen; that depends on how the compiler handles it.
An application plus the operating system functionality it uses is what makes a complete program. Thanks to the operating system, many ready-made facilities already exist, but they belong to the operating system, not to the application. The applications we usually write are therefore only semi-finished products: to get anything done they must call functions the operating system provides, and such a function is a system call.

2. Why we say "trapping into" the kernel

Applications run at privilege level 3 and the operating system kernel runs at privilege level 0. When an application wants to access system resources (whether hardware or kernel data structures), it must make a system call. The CPU then enters kernel mode, also known as supervisor mode.

User mode and kernel mode

User mode and kernel mode are terms that describe the CPU: they refer to whether the CPU is running in user mode (privilege level 3) or kernel mode (privilege level 0).
A user process "trapping into kernel mode" means the following: because of an internal or external interrupt, the current process is suspended, its context is saved by the kernel's interrupt handling code, and kernel code then starts executing. What runs is kernel code, not user code inside the kernel (how could user code exist in the kernel?), so "user mode and kernel mode" is a statement about the CPU.
Once the application has trapped into the kernel, it is completely unaware of what happens next: its context has been saved on its privilege level 0 stack, and the program now running on the CPU is the kernel. To be clear, the application does not turn into kernel code. The application and the operating system are independent of each other, and a user process never becomes the operating system by entering kernel mode.

3. Segmented memory access and code segments

Segmented memory access

Segmentation was first introduced for program relocation. On the 8086, registers are 16 bits wide, so one segment can address at most 64KB, while memory at the time was 1MB. By changing the segment base address, a program moves from one segment to another; the large memory is thus divided into small blocks that can each be accessed, and all of memory can be reached by switching segments. However, a 16-bit register cannot cover a 20-bit address space (1MB is 2 to the power 20). The CPU designers therefore modified the address processing unit: given an address of the form "segment base : offset", it automatically computes the segment base shifted left by 4 bits plus the offset within the segment, which yields a 20-bit physical address.
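
The address calculation just described can be written out as a short worked example. This is a minimal sketch; the function name real_mode_address is made up, and the wrap to 20 bits models the 8086's 20 address lines.

```python
def real_mode_address(segment, offset):
    """Compute the 20-bit physical address from a 16-bit segment and offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must fit in 16 bits")
    # Shift the segment left by 4 bits, add the offset, keep 20 bits
    # (the 8086 has only 20 address lines, so the sum wraps around).
    return ((segment << 4) + offset) & 0xFFFFF

print(hex(real_mode_address(0xB800, 0x0000)))  # 0xb8000
print(hex(real_mode_address(0x07C0, 0x0000)))  # 0x7c00
```

Note that different segment:offset pairs can name the same physical address, for example 0x07C0:0x0000 and 0x0000:0x7C00.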

Code segment

The CPU is a highly automated chip: once it is given the address of the first instruction, it obtains the address of the next instruction automatically while executing the current one. This works because program instructions are laid out one after another with no gaps between them. For alignment, the compiler may pad the program with zeros, so gaps can appear between pieces of data, but there are no gaps between instructions. The address of the next instruction follows from the address and size of the previous one. This is how the Intel program counter cs:eip obtains the next instruction automatically: the current address in eip plus the size of the current instruction's machine code is the starting address of the next instruction. Even if a gap or some non-instruction data lies between two instructions, that is only a physical break; a jmp instruction can skip over it so that the instructions remain logically continuous. So that execution can proceed from one instruction to the next, all instructions are grouped together into one continuous region, which is the code segment. Instructions consist of opcodes and operands; the operands are the program's data, and the region where data is stored contiguously is called the data segment.
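
The rule "next address = current address + size of the current instruction" can be traced by hand. The sketch below uses three 16-bit x86 encodings chosen for illustration; the start address 0x7C00 and the function name instruction_addresses are assumptions, not something the notes prescribe.

```python
# Each entry is the machine code of one instruction (16-bit x86 encodings).
program = [
    bytes([0xB8, 0x34, 0x12]),  # mov ax, 0x1234  (3 bytes)
    bytes([0x90]),              # nop             (1 byte)
    bytes([0xEB, 0xFE]),        # jmp short $     (2 bytes)
]

def instruction_addresses(start, instructions):
    """Return the address at which each instruction begins."""
    addresses = []
    ip = start
    for code in instructions:
        addresses.append(ip)
        ip += len(code)   # next address = current address + instruction size
    return addresses

print([hex(a) for a in instruction_addresses(0x7C00, program)])
# ['0x7c00', '0x7c03', '0x7c04']
```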

Benefits of separating code and data:

  • They can be given different attributes.
  • It improves the hit rate of the CPU's internal caches.
  • It saves space.

Segment protection attributes:

  • The compiler is responsible for sorting the program's contents into segments according to their attributes. The compiler does not itself give a segment the attribute "code segment"; all it does is group the code together.
  • The operating system builds the segment descriptors in the global descriptor table (GDT), recording each segment's position, size and attributes (including the S field and the TYPE field) in its descriptor.
  • The operating system loads the appropriate selectors into the CPU's segment registers in advance, which determines the segments they point to. When an instruction executes, the CPU checks its behavior against the segment's attributes and raises an exception if the behavior is not allowed.
    In short, the compiler, the operating system and the CPU work together to protect a program and detect illegal instructions.

To summarize, the segments of a program are only a logical division used to group different kinds of data, but the CPU's segment registers can point at them directly, and they are then accessed through the memory segmentation mechanism. The memory segmentation mechanism is the processor's way of accessing memory; program segments are a logical division of memory regions made by software.

4. The differences between physical, logical, effective, linear and virtual addresses

  • Physical address: the real address in physical memory, like the house number of each memory cell; it is unique. In every mode, and whether the address started out as a virtual or a linear address, the CPU ultimately accesses memory through a physical address. The physical address is the end point of every memory access.
  • In real mode, "segment base address + offset" is turned into a physical address by the segmentation unit and output directly.
  • In protected mode, "segment base address + offset" is called a linear address. Here the segment register holds a selector, which is essentially an index: it is used to find the corresponding segment descriptor in the GDT, and the descriptor records the segment base address, size and other information. If paging is not enabled, the linear address is the physical address. If paging is enabled, the linear address gets another name, virtual address, and it must be translated by the paging unit into a physical address.
  • Logical address: in both real mode and protected mode, the offset within the segment is called the effective address, also called the logical address.
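
The protected-mode lookup (selector, then descriptor, then base plus offset) can be sketched as follows. This is a simplified model under stated assumptions: real descriptors pack their fields into 8 bytes, the Descriptor tuple and the example GDT contents are hypothetical, and the privilege and table-indicator bits in the low 3 bits of the selector are ignored. Paging is not modelled, so the result is the linear address.

```python
from collections import namedtuple

# Hypothetical, simplified descriptor: real ones pack these fields into 8 bytes.
Descriptor = namedtuple("Descriptor", ["base", "limit"])

gdt = [
    Descriptor(base=0x00000000, limit=0x00000),   # entry 0: null descriptor
    Descriptor(base=0x00100000, limit=0x0FFFF),   # entry 1: example segment
]

def linear_address(selector, offset):
    """Translate selector:offset to a linear address via the GDT."""
    index = selector >> 3            # bits 3..15 of a selector are the index
    descriptor = gdt[index]
    if offset > descriptor.limit:
        raise ValueError("offset exceeds segment limit")
    return descriptor.base + offset

print(hex(linear_address(0x0008, 0x1234)))   # 0x101234
```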

5. Why Linux applications cannot run on Windows

The formats differ. Executable files on Linux use ELF, the "Executable and Linking Format". Executable files on Windows use PE (Portable Executable).
The APIs differ. On Linux the API is the system call, implemented through the int 0x80 software interrupt. The Windows API lives in dynamic link libraries. A Linux executable therefore obtains system resources in a different way from Windows, so it clearly cannot run on Windows.
Beyond these reasons, the compilers and standard libraries differ as well.
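
The format difference is visible in the very first bytes of a file. The sketch below checks the well-known magic numbers: ELF files begin with 0x7F followed by "ELF", and PE files begin with the "MZ" DOS header. It is a rough illustration only; the function name executable_format is made up, and a thorough PE check would also follow the header pointer to the "PE" signature, which this sketch does not do.

```python
def executable_format(header):
    """Guess the executable format from the first bytes of a file."""
    if header.startswith(b"\x7fELF"):
        return "ELF"
    if header.startswith(b"MZ"):      # DOS header that precedes the PE header
        return "MZ/PE"
    return "unknown"

print(executable_format(b"\x7fELF\x02\x01\x01"))  # ELF
print(executable_format(b"MZ\x90\x00"))           # MZ/PE
```

A loader performs a check of this kind before anything else, which is one reason a file in the wrong format is rejected before any of its code runs.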

6. Why local variables and function parameters are placed on the stack

A local variable, as the name suggests, is local in scope, unlike a global variable. A global variable can be accessed at any time, so it is placed in the data segment. A local variable is used only by its own function, so in order not to waste space it is placed on that function's stack.
The heap is memory allocated dynamically while a program runs. The operating system plans it for each user process, and it belongs to the realm of software.
The stack is memory the processor needs in order to run. It is required by the hardware but provided by software (the operating system). "Stack" here means just the stack; it has nothing to do with the heap.

Reasons function parameters are placed on the stack:

  • They are also local: only this function uses these parameters.
  • Functions are called dynamically during program execution; the compiler cannot predict when a function will be called or how many times.

7. Why assembly is faster than C

High-level languages such as C aim for generality and have more to take into account, so they often add extra code and compile into more assembly code. Much of that code is peripheral and does not contribute directly to the result, whereas hand-written assembly implements only what is directly related to the task. When C is compiled into machine instructions, the generated code naturally includes these extra parts, which amounts to executing a number of seemingly redundant instructions, so it tends to be slower than writing assembly directly.

8. The difference between compiled and interpreted programs

Interpreted languages, also known as scripting languages, include JavaScript, Python, Perl, PHP and shell scripts. The scripts themselves are text files that serve as input to an application, the script interpreter. To the interpreter, script code is no different from a string. Script code is never truly executed on the CPU: the CPU's cs:ip registers never point at it. The CPU sees only the interpreter, which is a process. In essence, the interpreter analyzes the script and performs the corresponding actions dynamically, according to keywords and syntax.
So if a script contains an error, the correct part before the error still executes normally, which is very different from a compiled program.
A compiled program runs as a process in its own right. It is invoked directly by the operating system: after the operating system loads it into memory, the CPU executes it.
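
A toy interpreter makes both points concrete: the script is just a string to the interpreter, and the lines before an error have already run by the time the error is found. The script language here (a single "say" keyword) and the function name run_script are invented for this sketch.

```python
def run_script(text):
    """Interpret a toy script line by line; return what it printed."""
    output = []
    for number, line in enumerate(text.splitlines(), start=1):
        keyword, _, argument = line.partition(" ")
        if keyword == "say":
            output.append(argument)
        else:
            # The error is only discovered when this line is reached.
            output.append(f"line {number}: unknown keyword {keyword!r}")
            break
    return output

script = "say hello\nsay world\nbark loudly\nsay never reached"
print(run_script(script))
# ['hello', 'world', "line 3: unknown keyword 'bark'"]
```

A compiler would reject the whole file because of line 3 and nothing would run at all.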

9. Big-endian and little-endian byte order

Little-endian and big-endian are two opposite arrangements of byte order.

  • Little-endian: the low byte of a value is placed at the low memory address and the high byte at the high memory address.
  • Big-endian: the low byte of a value is placed at the high memory address and the high byte at the low memory address.

Advantages:

  • Little-endian: when a value is cast to another data type, the position of the low bytes does not need to be adjusted.
  • Big-endian: for a signed number, the most significant byte is not only part of the value but also carries the sign. The sign bit is always in the first byte, because the highest byte sits at the lowest address, so the sign can be read directly to tell positive from negative.

Byte order of common CPUs:

  • Little-endian: x86, DEC
  • Big-endian: IBM, Sun, PowerPC

ARM CPUs support both big-endian and little-endian; the byte order actually used is selected by the hardware.
Byte order matters not only when the CPU accesses memory but also in file storage and network transmission. bmp images are little-endian while jpeg images are big-endian; which order to use is something developers must decide when designing a product.
Network byte order is big-endian, so a program on the x86 architecture must convert byte order when it sends data over the network.
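
The two layouts are easy to compare side by side. The sketch below uses Python's int.to_bytes and the struct module; the value 0x12345678 is an arbitrary example.

```python
import struct

value = 0x12345678

little = value.to_bytes(4, "little")   # low byte at the lowest address
big = value.to_bytes(4, "big")         # high byte at the lowest address

print(little.hex())   # 78563412
print(big.hex())      # 12345678

# Network byte order is big-endian; struct's "!" prefix selects it.
assert struct.pack("!I", value) == big
# Casting to a narrower type on little-endian just keeps the first byte(s).
assert little[:1] == (value & 0xFF).to_bytes(1, "little")
```

The last assertion shows the little-endian advantage from the list above: the low byte is already at the start, so a narrowing cast needs no address adjustment.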

10. The differences between BIOS interrupts, DOS interrupts and Linux interrupts

Interrupt

In a computer system, whether in real mode or in protected mode, events will always occur, either internally or externally. If the event comes from inside the CPU, it is called an exception. For example, if the CPU finds during arithmetic that a divisor is 0, it raises a divide-by-zero exception. If the event comes from outside, that is, an external device issues it and notifies the CPU, it is called an interrupt.

BIOS interrupt

BIOS and DOS are both programs that exist in real mode. The interrupt calls they create are installed in the interrupt vector table (Interrupt Vector Table, IVT), and they are invoked as software interrupts: the int instruction followed by an interrupt number.
The main purpose of BIOS interrupt calls is to provide a way to access hardware, which makes operating the hardware easy. Peripherals are read and written with the in/out port instructions, and the BIOS interrupt handlers exist to operate the hardware, so the handlers are necessarily full of in/out instructions.

Reasons the BIOS adds interrupt handling routines:

  • For its own use: the BIOS is itself a program, and to avoid duplicating code it puts frequently executed code into interrupt functions and calls them directly.
  • For later programs, such as a loader or boot loader: when they need hardware resources, they do not have to rewrite the code themselves.

How the BIOS sets up interrupt handling routines:

  • The BIOS routines in turn call functions written by others. Hardware vendors, so that their products can be used, write the calling interfaces in advance; passing parameters to such an interface function yields an output from the hardware.
  • Each peripheral, including the video card, the keyboard, the various controllers and so on, has its own memory, and this memory is read-only memory (ROM). The hardware's own routines and the code that initializes its function calls are stored in this ROM. According to the specification, the first byte of the ROM is 0x55, the second byte is 0xAA, the third byte is the length of the code in units of 512 bytes, and the actual code starts at the fourth byte.

Two ways to access peripherals:

  • Memory mapping: the peripheral's own memory is mapped into a region of the address space through the address bus.
  • Port operations: a peripheral has its own controller, the controller has registers, and these registers are what we call ports. Ports are read and written with the in/out instructions in order to access the hardware's memory.

Part of the physical address range 0xA0000 ~ 0xFFFFF is reserved for such mappings: if the hardware is present, its ROM is mapped somewhere in this range.
While running, the BIOS scans the memory between 0xC0000 and 0xE0000. If the first two bytes of a region are 0x55 and 0xAA, ROM code is present in that region. The BIOS then runs a checksum over the region; if the checksum passes, the code is considered valid and the BIOS enters it at the fourth byte. The hardware's own routine then initializes the hardware, and finally the BIOS fills in the relevant interrupt vector table entries so that they point to the routines that come with the hardware.
Entries 0x00 ~ 0x1F of the interrupt vector table are BIOS interrupts.
The interrupt vector table is supported natively by the CPU, regardless of who is responsible for creating it. What software can do depends on the support the hardware provides. Whenever software executes int with an interrupt vector number, the CPU uses that number as an index into the interrupt vector table, locates the interrupt handler and executes it.
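
How the vector number works as an index can be shown with a simulated table. In real mode each entry is 4 bytes, a 16-bit offset followed by a 16-bit segment, and the table starts at physical address 0, so entry n lives at address n * 4. The sketch keeps the table in a bytearray; the handler address 0xC000:0x0152 and the function names are made up for illustration.

```python
import struct

def write_vector(memory, number, segment, offset):
    """Fill in one interrupt vector table entry."""
    # Each entry is 4 bytes: 16-bit offset, then 16-bit segment, little-endian.
    struct.pack_into("<HH", memory, number * 4, offset, segment)

def read_vector(memory, number):
    """Return (segment, offset) of the handler for an interrupt number."""
    offset, segment = struct.unpack_from("<HH", memory, number * 4)
    return segment, offset

ivt = bytearray(256 * 4)                 # 256 entries starting at address 0
write_vector(ivt, 0x10, 0xC000, 0x0152)  # hypothetical handler address
print([hex(x) for x in read_vector(ivt, 0x10)])   # ['0xc000', '0x152']
```

Filling an entry is what the BIOS does after an option ROM has initialized its hardware; reading it is what the CPU does on int.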

DOS interrupt

DOS runs in real mode, so the interrupt calls it sets up are also installed in the interrupt vector table, but its interrupt vector numbers do not conflict with those of the BIOS.
0x20 ~ 0x27 are DOS interrupts. Because DOS runs in real mode, it can also call BIOS interrupts.
DOS function calls occupy only one interrupt number, 0x21.
To use a DOS interrupt, first write the sub-function number into the ah register, then execute int 0x21. The CPU looks up entry 0x21 of the interrupt vector table, at physical address 0x21 * 4, and the interrupt handler found there calls the sub-function that corresponds to the value in ah.

Linux interrupt

Linux's interrupt routines are established after the kernel has entered protected mode. In protected mode the interrupt vector table no longer exists; it is replaced by the interrupt descriptor table (Interrupt Descriptor Table, IDT).
Linux system calls resemble DOS interrupt calls: after entering a single interrupt routine through the int 0x80 instruction, a different sub-function is called depending on the value of the eax register.
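
The "one entry point, many sub-functions" pattern shared by int 0x21 and int 0x80 is essentially a table lookup. The sketch below models it in plain Python: the handlers are stand-ins rather than kernel code, the process id is invented, and the numbers 4 and 20 are the 32-bit x86 Linux numbers for write and getpid.

```python
# Hypothetical handlers standing in for kernel functions.
def sys_write(text):
    return len(text)

def sys_getpid():
    return 1234          # made-up process id

# Sub-function table indexed by the number placed in eax.
# 4 and 20 are the 32-bit x86 Linux numbers for write and getpid.
syscall_table = {4: sys_write, 20: sys_getpid}

def int_0x80(eax, *args):
    """Single entry point: pick the sub-function from the value in eax."""
    handler = syscall_table.get(eax)
    if handler is None:
        return -38       # -ENOSYS: no such system call
    return handler(*args)

print(int_0x80(4, "hello"))   # 5
print(int_0x80(20))           # 1234
```

Using one vector plus a number in a register means new services can be added without consuming more interrupt vectors.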

11. Instruction sets, architecture and microarchitecture

Complex instruction set (CISC) and reduced instruction set (RISC) are not specific instruction sets but two different instruction systems. They are like schools of martial arts for instruction sets: two philosophies of instruction set design.
An instruction set is a specific encoding of a set of instructions; a microarchitecture is the physical implementation of an instruction set.
The x86 instruction set belongs to the CISC family, but because CISC is inefficient, it is ultimately implemented internally on a RISC core: when a CISC instruction is decoded, it is broken down into several RISC instructions, so its efficiency becomes comparable to RISC.
There are currently five common instruction sets on the market: x86, ARM, MIPS, Power and C6000. Apart from x86, which is a CISC architecture, all of them are RISC.
An instruction set corresponds to a kind of CPU: a CPU recognizes only one instruction set, and CPUs are commonly referred to by the instruction set they support.

12. Library functions as a bridge between user processes and the kernel

The operating system provides system call interfaces, and user processes can call these interfaces directly. An interface is the entry point of a functional module: give the module an input through the interface and it returns an output.
When programming in C under Linux we usually write user-level programs, and to output text we normally put #include <stdio.h> at the top of the file. stdio.h declares library functions, and it is these library functions that make the system call on the code's behalf.
Although system calls are wrapped in library functions, a user program can still make system calls directly; using the library functions is simply more efficient.
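
One plausible reason library functions can be more efficient is buffering: the library collects many small writes and hands them to the kernel in a single call. The notes do not give the reason, so treat this as an assumption. The sketch below models it entirely in user space: CountingRaw is a made-up stand-in for a file descriptor, and each call to its write method represents one write system call.

```python
import io

class CountingRaw(io.RawIOBase):
    """Stand-in for a file descriptor that counts how often it is written."""
    def __init__(self):
        super().__init__()
        self.calls = 0

    def writable(self):
        return True

    def write(self, data):
        self.calls += 1          # each call models one write system call
        return len(data)

# Direct: every small write reaches the "kernel".
direct = CountingRaw()
for _ in range(100):
    direct.write(b"0123456789")

# Through a library buffer: data is collected and written in one go.
raw = CountingRaw()
buffered = io.BufferedWriter(raw, buffer_size=4096)
for _ in range(100):
    buffered.write(b"0123456789")
buffered.flush()

print(direct.calls, raw.calls)   # 100 1
```

Since every real system call involves a switch into kernel mode and back, fewer calls for the same amount of data means less overhead.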

13. MBR, EBR, DBR and OBR

Because the BIOS is a small program with limited space, little code and limited functionality, control has to be passed on like a relay baton, until the processor is finally handed over to the operating system.

MBR

The BIOS performs only simple detection and initialization and then hands the processor over to the MBR. The MBR has to wait at a fixed location, so it sits in the very first sector of the hard disk, that is, head 0, track 0, sector 1. This sector is called the MBR boot sector. By convention, the BIOS loads the program in head 0, track 0, sector 1 to physical address 0x7C00 and then jumps there, which hands the processor from the BIOS to the MBR. A sector is generally 512 bytes, but the size is not fixed.

Contents of the MBR boot sector:

  • 446 bytes of boot code;
  • a 64-byte partition table;
  • a 2-byte end signature, 0x55 0xAA.

Besides the boot program, the MBR boot sector holds 4 partition table entries of 16 bytes each (4 * 16 = 64 bytes), which describe the partitions. Somewhere in these four partitions is the "second-stage boot sector", that is, the operating system loader. The MBR's job at this step is to give control to the operating system loader, which completes the bootstrap and finally hands control to the operating system kernel. So that the MBR knows which partition holds the operating system, that partition is marked as the "active partition". The mark is the first byte of the partition table entry, whose value is either 0 or 0x80 (active). To make it easy for the MBR to find the kernel loader, the loader's entry address is fixed by convention: it is in the first sector of each partition. This first sector stores the operating system's bootstrap program, the kernel loader, and is called the operating system boot sector. The boot program in it is known as the operating system boot record, OBR (OS Boot Record), so the sector is also called the OBR boot sector. By convention, the first three bytes of the OBR sector hold a jump instruction.
In summary, the MBR finds the active partition and jumps to the start of that partition's OBR boot sector; the jump instruction there immediately takes the processor into the operating system boot program, and the handover is complete.
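
The search for the active partition can be sketched against a fake 512-byte sector. The 446 / 64 / 2 byte split and the 0x80 flag come from the notes; the position of the starting sector number inside an entry (bytes 8 to 11, little-endian) follows the conventional MBR entry layout and is an addition of this sketch. The function name parse_mbr and the start sector 2048 are made up.

```python
import struct

def parse_mbr(sector):
    """Return (entry index, start sector) of the active partition, or None."""
    if len(sector) != 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    # The partition table follows the 446 bytes of boot code.
    entries = [sector[446 + i * 16: 446 + (i + 1) * 16] for i in range(4)]
    for index, entry in enumerate(entries):
        if entry[0] == 0x80:                      # active partition flag
            # Bytes 8..11 of an entry hold the starting sector (LBA).
            start_lba = struct.unpack_from("<I", entry, 8)[0]
            return index, start_lba
    return None

# Build a fake sector: empty boot code, second partition marked active.
sector = bytearray(512)
sector[446 + 16] = 0x80
struct.pack_into("<I", sector, 446 + 16 + 8, 2048)
sector[510:512] = b"\x55\xaa"
print(parse_mbr(bytes(sector)))   # (1, 2048)
```

A real MBR would next load the sector at that start address, check its 0x55 0xAA signature and jump to it.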

OBR and DBR

The OBR is a holdover from the DBR, so it is best understood starting from the DBR. DBR stands for DOS Boot Record, the boot record of the DOS operating system. A DBR contains roughly the following:

  • A jump instruction, through which the MBR jumps to the boot code;
  • Vendor information and DOS version information;
  • The BIOS parameter block (BPB, BIOS Parameter Block);
  • The operating system boot program;
  • The end signature 0x55 0xAA.

In the DOS era there were only four partitions and no extended partition, so all four were primary partitions, and the first sector of each primary partition was called the DBR boot sector. Later, when extended partitions appeared, the first sector of every partition, primary or logical, was kept as the DOS boot sector for compatibility. Other operating systems inherited this custom and use the first sector of a partition as their boot sector, storing their own boot program in it. Because there are now so many kinds of operating system and DOS has left the stage of history, the DBR is also called the OBR.

EBR

The EBR is a concept introduced so that extended partitions remain compatible with the MBR, chiefly with the MBR's partition table. Partitions are described by a partition table. An extended partition is divided into logical partitions, so an extended partition needs a partition table too. The sector in an extended partition that stores this partition table is called the EBR (Extended Boot Record). Because its content is compatible with the partition table, it has the same structure as the MBR but a different position: an EBR is located in the first sector of each sub-extended partition (note that the first sector of each primary partition and of each logical partition is the operating system boot sector). In theory there is only one MBR, while there can be any number of EBRs.

Summary

DBR and OBR are in fact the same thing: the operating system boot program located in the first sector of each partition (primary and logical). The number of OBRs equals the number of primary partitions plus the number of logical partitions. Each sub-extended partition contains exactly one logical partition.
The MBR and EBR are created and maintained by partitioning tools and do not belong to the operating system, which must not modify their data freely. The OBR is in the first sector of a partition (primary or logical) and is therefore managed by the operating system.
The DBR, OBR, MBR and EBR all contain a boot program, so they are all called boot sectors: as long as a sector contains an executable program, it is a boot sector. If the sector is the first sector of the whole hard disk and ends with 0x55 and 0xAA, it is considered the MBR boot sector. If the sector is the first sector of a partition and ends with 0x55 and 0xAA, it is considered the OBR boot sector.
The DBR, OBR, MBR and EBR all contain boot code and the end signature 0x55 0xAA. The biggest difference is that only the MBR and EBR contain a partition table.

Origin blog.csdn.net/qq_42457363/article/details/104074813