[Notes] Nanjing University Operating Systems (jyy)

Article preview:

Bilibili video link:
Operating Systems Overview (why study operating systems), Nanjing University 2022 Operating Systems, Jiang Yanyan

Note: the course has not finished updating yet. It emphasizes hands-on practice rather than just slides, and walks you through writing code. For the conceptual side of operating systems, see the graduate-level course instead.

Three Multiprocessor Programming: From Getting Started to Giving Up (Threading Libraries; Modern Processors and Relaxed Memory Models)

1. View CPU execution from the state-machine point of view.

Shared: global variables, library functions such as system calls

Private: thread stack frames, thread_local variables

2. Multithreading changes program execution flow and some default assumptions

Instructions no longer guarantee atomicity, ordering, or memory consistency

3. Modern processors

Instruction execution

(ordering can also be lost to compiler optimization)

A single assembly instruction is not atomic either

The processor decodes each instruction into *μops*

Memory model

4. Some results are impossible. This may be because the C code stays serial when compiled to assembly, but when the assembly is later broken down into smaller units (μops), the serial parts get "optimized" into parallel execution

5. State machine = create + join

join

C thread join

yield

The yield() method of multithreading

syscall

Linux system call principle - syscall

Four Understanding concurrent program execution (Peterson algorithm, model checking and software automation tools)

Operating System: Design and Implementation:

1️⃣Peterson Algorithm✔️

2️⃣ Correctness proof and stress test ✔️

3️⃣ Automatic Proof: Model Checking ✔️

1. Peterson Algorithm ✔️

✨Mutual exclusion: Guarantee that two threads cannot execute a piece of code at the same time.

Insert "mystery code" so that sum.c (or any other code) works

Assume that a memory read/write can be guaranteed sequential and atomic

00:00 Peterson Algorithm

2. Correctness proof and stress test ✔️

✨Cases for entering the critical section

If only one person raises the flag, he can enter directly

If two people raise the flag at the same time, the label on the toilet door determines who enters

The faster hand wins (its label on the door gets covered by the other person's), the slower hand loses and waits

Some specific details

A sees that B does not raise the flag.

B must not be in the critical section

Or B wants to get in but has not yet had time to put "A is in use" on the door

memory ordering

A sees B raising the flag

A must have raised the flag

27:13 Proof of correctness and stress testing

3. Automatic proof: model checking ✔️

Design Dilemmas of Concurrent Algorithms

Don't dare to skip drawing the state diagram: who knows what strange interleavings might happen

Don't dare to draw it carelessly either: one mistake and it's all over

Solving the dilemma

Can the computer draw for us?

With the formal semantics (mathematical definition) of the program, an interpreter can be written to simulate execution

54:59 Automatic Proofs: Model Checking

Five concurrency control: mutual exclusion (spin lock, mutex and futex)

1. About volatile


volatile

Detailed explanation of volatile (the kind that anyone can understand)


[Linux C++] Thread Safety - Atomicity, Visibility, Orderliness

Six concurrency control: synchronization (condition variables, semaphores, producer-consumer and the dining philosophers problem)

1. assert assertion function

Introduction to assert assertion function

2. Semaphores and condition variables

The difference between the use of linux condition variables and semaphores

3. Summary


Seven Concurrent Programming in the Real World (concurrent programming in high-performance computing / data centers / human-computer interaction)

1. The benefits of threads: multiprocessors can be used

For example, a thread may contain many coroutines. When coroutine A blocks waiting for data, the other coroutines cannot run either, which wastes CPU.
Coroutines are cheap to switch, but they suffer from this blocking problem.

Neither threads nor coroutines alone are perfect. The solution (as in Go) is as follows:
each CPU runs one thread, and each thread runs multiple coroutines, which reduces CPU switching overhead.
As long as the Go program is running, some coroutine is always running.
When a coroutine is about to perform a time-consuming operation, the runtime immediately switches to another coroutine.

2. Synchronous and asynchronous

Colloquially, "synchronous" sounds like doing many things at once, but in programming, synchronous execution means tasks run in order: if the previous task has not finished, the next one does not start; we must wait for the previous task to complete.

Asynchronous: multiple things can be in flight at the same time, often implemented with multiple threads.

Main advantages and disadvantages of synchronous vs. asynchronous:
1. Synchronous execution is slower and more time-consuming, but it helps us control the flow and avoid many unpredictable accidents.
2. Asynchronous execution is efficient and saves time, but it uses more resources and makes the flow harder to control.

3. What is the difference between concurrent, parallel, serial, synchronous, and asynchronous?

1. Concurrency (often called multithreaded programming).
    Programs often contain long-running tasks, such as uploading files, downloading files, and serving chat connections that take a long time to establish. A single thread cannot serve many users at once; they would wait on a monopolized resource. The essence of concurrency is multiplexing one physical CPU (or several) among multiple programs, forcing users to share limited physical resources to improve efficiency (think of concurrent ticket booking).
    With multiple threads on a single-CPU system, the threads cannot literally run at the same time. The CPU's running time is divided into slices that are handed out to the threads in turn: while one thread's code runs for a slice, the other threads are suspended. This is what we call concurrency (concurrent).

2. "Parallel" means two or more events or activities happen at the same instant. In a multiprogramming environment, parallelism lets multiple programs execute simultaneously on different CPUs (e.g. a Hadoop cluster computes in parallel).
    When the system has more than one CPU, threads need not contend: one CPU executes one thread while another CPU executes another, and the two run at the same time without fighting over CPU resources. This is parallelism (parallel).

Concurrency and parallelism
    The two concepts are similar but distinct. Parallelism means two or more events occur at the same instant; concurrency means they occur within the same time interval. In a multiprogramming environment, concurrency means several programs are macroscopically "running at the same time" over a period, but on a single-processor system only one program can execute at any instant, so they can only alternate in a time-sharing manner. If the system has multiple processors, the concurrently executable programs can be distributed across them, one processor per program, and thereby truly execute in parallel.

3. Serial and parallel:
    These describe how tasks are executed. Serial means that with multiple tasks, each runs in sequence: the next starts only after the previous finishes. Parallel means multiple tasks execute at the same time; asynchrony is a prerequisite for running multiple tasks in parallel.

4. Synchronous and asynchronous:
    The distinction is whether a new thread of control can be opened: synchronous code cannot start a new thread, asynchronous code can.
    Asynchronous and synchronous are relative. Synchronous means executing in order: after one step finishes, the next must be waited for and coordinated. Asynchronous means proceeding independently: while waiting for an event, keep doing your own work instead of blocking until the event completes. Threads are one way to achieve asynchrony: the main thread that makes a call need not wait for another thread to finish, and can do other things in the meantime.
    Asynchrony and multithreading are not the same thing: asynchrony is the goal, and multithreading is just one means of achieving it. Asynchrony means the caller sends a request and carries on without waiting for the result; it can be implemented with threads or by handing the work to another process.

5. Multithreading
    Multithreading is a programming-level concept: several pieces of code running concurrently within one process. Multithreading enables switching execution between threads.

Eight concurrency bugs and solutions (deadlock/data race/atomicity violation; defensive programming and dynamic analysis)

1. The stack canary mentioned in CSAPP is also a form of defensive programming

2. Concurrency bugs: deadlock and data races

Ten Applications of the state machine model (cellular automata; gdb/rr/perf; code verification tools)

1. Profiler tools

2. A model checker of about 150 lines can check a program's concurrent behavior; it can also check programs that are not concurrent

It is also possible to combine states

3. The state machine model is a very important tool that can be used to view the past and the future

Eleven Processes in the operating system (minimal Linux; fork, execve and exit)

1. Processes are created with fork; fork is a system call

A program is a state machine.
After the operating system finishes booting, it has in effect created a state machine representing init.
After the fork code executes, there are two copies of the same process in the system.

Every byte of memory in the two copies is identical and the registers are identical; only fork's return value differs, i.e. the eax register (0 in the child, the child's pid in the parent).

Once the new state machine is created, the result is a concurrent program.
The operating system is the manager: it commands what runs next.
Virtualization means the operating system can manage multiple state machines.

2. Understand fork

insert image description here
One run outputs 6 "Hello"s, the other outputs 8.

Line buffering: when a '\n' is seen, the buffer's contents are written out to the system.
Full buffering: output is written out only once 4096 bytes have accumulated.

So the 6-"Hello" run flushes each printf on the spot and prints immediately, while the piped run hides the '\n' flush (the pipe is fully buffered), so un-flushed output is duplicated.

In effect the forked processes hold identical copies of the stdio buffer: printf only writes into the buffer, and nothing is printed until a flush happens.

The following example produces 4 processes: the parent's first fork creates child 1 and its second fork creates child 2; child 1 also executes the second fork statement, so it creates one more child.

The following example generates more than 5 child processes: child 1 continues the loop from i = 1 through 4, and the children it creates go on to create child processes of their own.

3. execve: reset a state machine, i.e. reset it to the initial state of a given program


Summary: fork creates a new state machine, and execve replaces the state machine with a new one

4. exit


Linux exit and _exit


Linux's standard library provides a set of functions called advanced I/O: the well-known printf, fopen, fread, and fwrite all belong here, and they are also called buffered I/O. Their defining feature is that every open file has an in-memory buffer. On reads, a few extra records are fetched ahead so that the next read can be served directly from the memory cache. On writes, data goes only into the memory buffer; when a condition is met (a certain amount accumulates, or a specific character such as a newline or the end-of-file marker EOF is encountered), the buffer's contents are written to the file in one go. This greatly speeds up file I/O, but it also creates a trap: data we believe has already been written to the file may in fact still be sitting in the buffer because the flush condition was never met. If we then terminate the program with the _exit function, the data in the buffer is lost. Conversely, to guarantee data integrity, use the exit function.

In the first example, the "ok" string is output because it contains '\n', so it is flushed directly instead of being kept in the buffer; the "good" string stays in the buffer, so when _exit runs, the program quits immediately and "good" is never printed.


Twelve The process address space (pmap; vdso; mmap; game trainers/cheats)

1. Linux command line tool: fish

2. pmap can view the address space of a process

Use man 5 proc to view the manual

3. Various uses of the cat command in Linux

Various uses of the cat command in Linux

4. vim command for Linux learning

vim command for Linux learning

5. If you want to make a system call but don't want to enter the kernel, you can use RTFM


6. vdso

From the official vdso manual: it is a virtual shared library that the kernel dynamically maps into an application at run time, and its symbol resolution is done by the C library's dynamic linker. The manual lists the system calls for which the shared library provides replacement functions (which resolves one doubt: apart from the calls served by the vDSO, strace should be able to capture all other system calls). These replacement functions are characterized by high performance; the main motivation is to make applications run faster.

Realization Mechanism of Virtual Dynamic Shared Library VDSO

Can a system call be served without actually entering the kernel?

7. Dynamic link and static link

Why Dynamic Linking?
Static linking lets different developers build and test their program modules relatively independently, but it makes modules hard to update and wastes memory and disk space.

The basic idea of dynamic linking: the object files that make up the program are not linked until the program runs; that is, the linking process is postponed to run time.
Mainstream operating systems all support dynamic linking today. On Linux, dynamically linked ELF files are called dynamic shared objects (DSO), shared objects for short, and generally end in .so.

The whole process: map the main program into virtual memory, enter the dynamic linker's entry point to perform relocation and other work, then jump to the main program's entry point and start executing.

One point to recognize: with static linking, the whole program ends up as a single executable, an indivisible whole; with dynamic linking, a program is split into several files: the main part (the executable) plus the shared objects it depends on. These parts are often called modules; under dynamic linking, both the executable and the shared objects can be regarded as modules of the program.


8. mmap

mmap helps find and map a new region of memory in the process's address space

9. Process isolation



Origin blog.csdn.net/qq_43050258/article/details/129504157