Interview Summary (7): Common Operating System Interview Questions (3): Other Operating System Topics

Preface

  The first two articles in this series covered the common interview questions about threads and processes in the operating system. In this article we continue with the remaining system-related questions.

Interview questions and reference answers

1. Please talk about the structure alignment and byte alignment in the operating system

  • 1. Reasons:
    1) Platform (portability) reasons: not all hardware platforms can access arbitrary data at arbitrary addresses; some platforms can only fetch certain types of data at certain addresses, and otherwise raise a hardware exception.
    2) Performance reasons: data structures (especially stacks) should be aligned on natural boundaries whenever possible, because accessing unaligned memory may require the processor to perform two memory accesses, while an aligned access requires only one.
  • 2. Rules
    1) Member alignment: for the data members of a structure (struct) or union, the first member is placed at offset 0; each subsequent member is aligned to the smaller of the value specified by #pragma pack and the member's own size.
    2) Overall alignment: after the members are aligned, the struct (or union) itself must also be aligned; its total size is padded to a multiple of the smaller of the #pragma pack value and the size of its largest member.
    3) Struct as a member: if a structure contains another structure as a member, that nested structure is placed at an address that is an integer multiple of the size of its own largest element.
  • 3. Specifying the alignment of a structure:
      You can change the alignment coefficient with the precompiler directive #pragma pack(n), where n = 1, 2, 4, 8, or 16.
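The member-alignment rules above can be observed from Python via the ctypes module, which lays out structures using the platform C ABI. A minimal sketch (the sizes shown assume a typical platform where int is 4 bytes and 4-byte aligned):

```python
import ctypes

# Default alignment: the char is followed by 3 bytes of padding so the
# int lands on a 4-byte boundary, and the total size is padded to 8.
class Padded(ctypes.Structure):
    _fields_ = [("c", ctypes.c_char), ("i", ctypes.c_int)]

# _pack_ = 1 plays the role of #pragma pack(1): no padding at all.
class Packed(ctypes.Structure):
    _pack_ = 1
    _fields_ = [("c", ctypes.c_char), ("i", ctypes.c_int)]

print(ctypes.sizeof(Padded))  # 8 on typical platforms
print(ctypes.sizeof(Packed))  # 5
```

The same effect in C is obtained by wrapping the struct definition in `#pragma pack(1)` / `#pragma pack()`.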

2. Please tell me about the mechanism of mutex and the difference between mutex and read-write lock

  • 1. The difference between a mutex and a read-write lock:
      Mutex (mutex): ensures that at any time only one thread can access the protected object. A thread that fails to acquire the lock is put to sleep and is woken up when the lock is released.
      Read-write lock (rwlock): split into a read lock and a write lock. Multiple threads may hold the read lock at the same time, but only one thread may hold the write lock; threads that fail to acquire it sleep until it is released.
    Note: a write lock blocks both readers and other writers: while a thread holds the write lock, no other thread can take the read lock. With writer preference, once a writer is waiting, subsequent readers must wait, and the writer is woken first. Read-write locks suit workloads where reads are far more frequent than writes.
      Differences: 1) A read-write lock distinguishes readers from writers; a mutex does not.
    2) A mutex allows only one thread to access the object at a time regardless of reads or writes; a read-write lock likewise allows only one writer at a time, but allows multiple readers to read the object concurrently.
  • 2. The four lock mechanisms of Linux:

  Mutex: mutex, guarantees exclusive access as described above; a thread that fails to acquire it sleeps until the lock is released.
  Read-write lock: rwlock, as described above; many concurrent readers, at most one writer at a time.
  Spin lock: spinlock, also allows only one thread to access the object at a time, but a thread that fails to acquire the lock does not sleep; it spins in place until the lock is released. This avoids the cost of putting a thread to sleep and waking it up, which greatly improves efficiency when the lock is held only briefly; if the lock is held for a long time, however, spinning wastes CPU.
  RCU: read-copy-update. To modify data, a writer first reads it, makes a copy, modifies the copy, and then replaces the old data with the new version. Readers need almost no synchronization overhead: they acquire no locks, use no atomic instructions, cause no lock contention, and deadlock is not a concern. Writers pay a larger cost: they must copy the data before modifying it, and a lock mechanism is still needed to serialize concurrent writers. RCU is very efficient when reads are plentiful and writes are rare.
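The read-write-lock semantics above can be sketched in Python on top of a condition variable. This is a minimal reader-preference version for illustration only; pthread rwlocks typically add writer preference, as noted above:

```python
import threading

class RWLock:
    """Minimal reader-preference read-write lock sketch (illustrative only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0        # number of threads holding the read lock
        self._writer = False     # whether a thread holds the write lock

    def acquire_read(self):
        with self._cond:
            while self._writer:              # readers wait only for a writer
                self._cond.wait()
            self._readers += 1               # many readers may proceed at once

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()      # last reader out wakes waiters

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers:  # writer needs exclusivity
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```

A plain mutex would be just `threading.Lock()`; the point of the sketch is that `acquire_read` admits any number of readers while `acquire_write` waits until it is completely alone.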

3. Please answer the difference between soft link and hard link

   To support file sharing, Linux provides soft (symbolic) links and hard links. Besides sharing, they bring benefits such as hiding file paths, improving permission security, and saving storage. If one inode number corresponds to multiple file names, those names are hard links: a hard link is the same file under different names, created with ln. If the content stored in a file's data block is the path of another file, the file is a soft link: an ordinary file with its own independent inode, whose data block holds a path rather than regular data; it is created with ln -s.
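The inode behavior described above can be demonstrated from Python on a POSIX system (os.link and os.symlink are thin wrappers over the same system calls ln uses):

```python
import os
import tempfile

# Sketch: a hard link shares the target's inode, while a soft (symbolic)
# link is a separate file whose data is the target's path.
d = tempfile.mkdtemp()
target = os.path.join(d, "data.txt")
with open(target, "w") as f:
    f.write("hello")

hard = os.path.join(d, "hard.txt")
os.link(target, hard)          # hard link: another name for the same inode
soft = os.path.join(d, "soft.txt")
os.symlink(target, soft)       # soft link: its own inode, stores the path

print(os.stat(target).st_ino == os.stat(hard).st_ino)    # True: same inode
print(os.path.islink(soft))                              # True
print(os.lstat(soft).st_ino == os.stat(target).st_ino)   # False: own inode
```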

4. What are big-endian and little-endian, and how do you determine which one a system uses?

   Big-endian means the low-order byte is stored at the high address; little-endian means the low-order byte is stored at the low address. In C, you can determine the byte order with a union, because all members of a union start at the same (lowest) address: store an int such as 1 in the union and inspect its first byte.
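The union trick translates to Python with the struct module: pack the integer 1 in native byte order and inspect the first byte in memory (sys.byteorder reports the same answer directly):

```python
import struct
import sys

def is_little_endian():
    # Pack the 16-bit integer 1 in native ("=") byte order; on a
    # little-endian machine the low byte 0x01 comes first in memory.
    return struct.pack("=H", 1)[0] == 1

print("little" if is_little_endian() else "big")
```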

5. Please tell us about the process from source code to executable file

  • 1) Preprocessing    mainly handles the directives beginning with "#" in the source files. The processing rules are: 1. Remove all #define directives and expand all macro definitions.
    2. Process all conditional-compilation directives, such as "#if", "#endif", "#ifdef", "#elif" and "#else".
    3. Process "#include" directives, replacing each with the contents of the included file. This is done recursively, since included files may themselves include other files.
    4. Remove all comments, both "//" and "/* */". 5. Keep all #pragma directives, which the compiler still needs; for example, #pragma once
    prevents a header from being included more than once. 6. Add line numbers and file-name markers so the compiler can emit line information for debugging, and report line numbers in compilation errors and warnings.
  • 2) Compilation    takes the xxx.i or xxx.ii file produced by preprocessing through lexical analysis, syntax analysis, semantic analysis and optimization, and generates the corresponding assembly file.
    1. Lexical analysis: using an algorithm similar to a finite state machine, a scanner reads the source text and splits the character sequence into a series of tokens.
    2. Syntax analysis: the parser analyzes the tokens produced by the scanner and builds a syntax tree, a tree whose nodes are expressions.
    3. Semantic analysis: the parser only checks the grammatical level of an expression; the semantic analyzer determines whether the expression is meaningful. What it analyzes is static semantics, the semantics that can be determined at compile time; dynamic semantics, by contrast, can only be determined at run time.
    4. Optimization: an optimization pass over the intermediate code. 5. Target code generation: the code generator converts the intermediate code into target machine code, producing a sequence of instructions in assembly-language form.
    6. Target code optimization: the target-code optimizer improves the machine code: choosing suitable addressing modes, replacing multiplications with shifts, deleting redundant instructions, and so on.
  • 3) Assembly    converts the assembly code into instructions the machine can execute (a machine-code file). The assembler's job is simpler than the compiler's: there is no complicated syntax or semantics and no instruction optimization; it just translates instruction by instruction according to a table mapping assembly instructions to machine instructions. Assembly is performed by the assembler, as. It produces an object file (almost the same format as an executable), xxx.o under Linux or xxx.obj under Windows.
  • 4) Linking    combines the object files generated from different source files into one executable program. Linking is divided into static linking and dynamic linking:
    1. Static linking:   functions and data are compiled into a binary library file. When linking an executable against a static library, the linker copies the needed functions and data out of the library and combines them with the application's other modules to create the final executable.
      Wasted space: every executable carries its own copy of all the object code it needs, so if several running programs depend on the same object file, memory holds multiple copies of it.
      Hard to update: whenever library code changes, every program that uses it must be recompiled and relinked.
      Fast execution: the advantage of static linking is that everything the program needs is already inside the executable, so it runs fast.
    2. Dynamic linking:   the basic idea is to split the program into relatively independent modules and link them together only when the program runs, instead of linking all modules into a single executable in advance as static linking does.
      Shared libraries: even if every program depends on the same library, memory does not hold multiple copies as with static linking; all running programs share a single copy.
      Easy to update: an update only needs to replace the library file, with no need to relink every program; the next time a program runs, the new version is loaded and linked automatically, completing the upgrade.
      Performance cost: because linking is postponed until run time, it happens every time the program executes, so there is some performance loss.

6. Have you used GDB for debugging? What is a conditional breakpoint?

1. GDB debugging
  GDB is one of the software tools of the Free Software Foundation. Its role is to help programmers find errors in their code. Without GDB, the only way to trace a program's execution flow is to add a large number of statements that produce specific output; that approach may itself introduce new bugs, and it is useless for analyzing the code that caused a crash.
  GDB eases this burden: developers can step through their code while the program runs, or temporarily suspend execution with breakpoints. They can also inspect the current state of variables and memory at any time and watch how key data structures affect the program's behavior.
2. Conditional breakpoint
  A conditional breakpoint interrupts the program only when its condition holds. The command is: break line-or-function if expr.
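As a sketch, setting a conditional breakpoint in a GDB session looks like this (the file name, line number, variable and address are hypothetical):

```
(gdb) break worker.c:42 if count == 100
Breakpoint 1 at 0x4005d2: file worker.c, line 42.
(gdb) run
```

GDB then stops at line 42 only on the pass where `count` equals 100, instead of on every iteration.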

7. Please talk about the event loop of asynchronous programming

   An event loop is an endless loop that waits for events to occur, then runs, one after another, all the handlers subscribed to the event, in the order in which they subscribed. Only after every handler for that event has finished does the loop go back to waiting for the next event, and so on. The same picture holds when multiple requests arrive concurrently: within the single thread, handlers execute strictly in sequence. That is, if two handlers are bound to one event, the second does not start until the first finishes, and the loop does not check whether a new event has been triggered until all handlers of the current event are done. In a single thread, everything executes sequentially, one step at a time.
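The ordering guarantee above can be observed with Python's asyncio event loop: two callbacks queued for the same turn of the loop run in the order they were registered, within the single thread:

```python
import asyncio

order = []

def handler(name):
    order.append(name)

async def main():
    loop = asyncio.get_running_loop()
    # Two handlers "subscribed" in sequence; the loop will run them in
    # exactly that order, one after the other.
    loop.call_soon(handler, "first")
    loop.call_soon(handler, "second")
    await asyncio.sleep(0)  # yield control so the loop runs the callbacks

asyncio.run(main())
print(order)  # ['first', 'second']
```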

8. Please answer why there is a page cache, and how does the operating system design the page cache?

   The page cache speeds up reading files from disk: it keeps part of the disk's file data cached in memory. Because reading from disk is slow, a file read first looks in the page cache; on a hit there is no need to touch the disk at all, which greatly speeds up reads. In the Linux kernel, each data block of a file corresponds to at most one page cache entry. The kernel manages these entries with two data structures: a radix tree and a doubly linked list. The radix tree is a search tree that lets the kernel quickly locate a cache entry by its offset within the file.
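The lookup-before-disk idea can be sketched in a few lines of Python. A dict keyed by (file, page index) stands in for the kernel's radix tree; the file names and PAGE_SIZE here are illustrative:

```python
import tempfile

PAGE_SIZE = 4096
cache = {}                      # (path, page index) -> page bytes
stats = {"hits": 0, "misses": 0}

def read_page(path, index):
    """Return page `index` of `path`, going to disk only on a cache miss."""
    key = (path, index)
    if key in cache:
        stats["hits"] += 1      # hit: served from memory, no disk I/O
        return cache[key]
    stats["misses"] += 1        # miss: read the page from "disk"
    with open(path, "rb") as f:
        f.seek(index * PAGE_SIZE)
        data = f.read(PAGE_SIZE)
    cache[key] = data
    return data

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * PAGE_SIZE + b"y" * PAGE_SIZE)
    path = f.name

read_page(path, 1)   # first access: miss, reads from disk
read_page(path, 1)   # second access: hit, served from the cache
print(stats)         # {'hits': 1, 'misses': 1}
```

The real page cache additionally handles write-back, eviction via the doubly linked list, and coherence, which this sketch omits.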

9. What is page storage?

   In paged storage, main memory is divided into equal-sized blocks called frames (also known as physical or real pages). When a user program is loaded into memory, space is allocated in units of pages. The page size is a power of two, 2^n bytes, commonly 1 KB, 2 KB, 4 KB, and so on.
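You can query the system's page size from Python and verify it is a power of two (4096 bytes on most common platforms):

```python
import mmap

# mmap.PAGESIZE exposes the operating system's page size.
page_size = mmap.PAGESIZE
print(page_size)                          # typically 4096

# A power of two has exactly one bit set, so n & (n - 1) == 0.
assert page_size > 0 and page_size & (page_size - 1) == 0
```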

10. Python lock

   Various locks in Python:

  • 1. Global interpreter lock (GIL)
    1. What is the global interpreter lock?
      In CPython, only one thread can execute Python bytecode at any moment; every other thread must wait for the current holder to release the interpreter before it can run. So even if the threads of a process do not interfere with each other directly, only one of them uses the CPU at a time. This mechanism is called the global interpreter lock (GIL). The GIL simplifies the implementation of CPython by making the object model, including key built-in types such as dictionaries, implicitly safe against concurrent access. Locking the whole interpreter makes multithreading support easier to implement, but it sacrifices the parallelism offered by multi-processor machines.
    2. Benefits of the global interpreter lock:
    1) Avoids the cost of a large number of fine-grained lock and unlock operations.
    2) Makes data safer, solving data-integrity and state-synchronization problems between threads.
    3. Drawback of the global interpreter lock:
      it degrades a multi-core processor to, in effect, a single core; Python threads can be concurrent but not parallel.
    4. The role of the GIL:
      with multiple threads there is inevitably competition for resources; the GIL ensures that, at the interpreter level, only one thread at a time uses the shared resource (the CPU).
  • 2. Synchronization lock
    1. What is a synchronization lock?
      At any moment, only one thread of a process can use a CPU. To guarantee that a thread's operations on shared state run to completion without interference, a synchronization lock is needed.
    2. Why use a synchronization lock?
      Because while one thread is running it may hit an I/O operation, at which point the CPU switches to another thread; if the switch happens mid-update, the integrity of the program's results can be broken.
    3. How to use a synchronization lock?
      Simply acquire the lock before operating on shared data and release it afterwards.
    4. What synchronization locks are for:
      ensuring that, at the application level, code written against shared resources is serialized.
  • 3. Deadlock
    1. What is a deadlock?
      A situation in which two or more threads or processes wait on each other forever, caused by competition for resources or an unfortunate order of execution.
    2. What are the necessary conditions for deadlock?
      Mutual exclusion, hold and wait, no preemption, and circular wait.
    3. Basic methods for handling deadlocks?
      Prevent deadlock; avoid deadlock (the banker's algorithm); detect deadlock (via the resource-allocation graph); recover from deadlock by preempting resources or terminating processes.
  • 4. What is a recursive lock?
      To let the same thread request the same resource multiple times, Python provides the reentrant lock, RLock. An RLock internally maintains a Lock and a counter; the counter records the number of acquire calls, so the owning thread can acquire the resource repeatedly. Only when that thread has released every one of its acquires can other threads obtain the resource. Locks are thus divided into recursive (reentrant) and non-recursive locks.
  • 5. What is an optimistic lock?
      Assume no concurrency conflict will occur, and check for violations of data integrity only when committing the operation.
  • 6. What is a pessimistic lock?
      Assume concurrency conflicts will occur, and block in advance every operation that might violate data integrity.
  • 7. Commonly used locking concepts in Python?
      Mutex, reentrant lock, iteration deadlock, mutual-call deadlock, spin lock.
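The difference between a plain Lock and a reentrant RLock from the list above can be shown directly with the threading module:

```python
import threading

lock = threading.Lock()
rlock = threading.RLock()

# A plain Lock is not reentrant: a second acquire by the same thread
# fails (or, with blocking=True, would deadlock the thread).
assert lock.acquire(blocking=False) is True
assert lock.acquire(blocking=False) is False   # already held by this thread
lock.release()

# An RLock tracks an owner and a counter, so the owning thread may
# re-acquire it; it is fully released only after matching releases.
rlock.acquire()
rlock.acquire()          # reentrant: counter goes to 2
rlock.release()
rlock.release()          # counter back to 0; other threads may now acquire
```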

Summary

  Over the previous two articles and this one, we have briefly summarized the operating system questions commonly asked in interviews: threads, processes, system-related issues, and, in this article, other miscellaneous topics. I spent three articles on the operating system partly to aid my own review before future interviews, and partly to give you a direction and a set of ideas for reviewing when interviewing for related positions. These questions call for a fairly deep understanding of the operating system, so lay a solid foundation while preparing; only then can you stand out from the crowd. Besides, for anyone working in the computer industry, basic operating system knowledge is a necessary part of our professional toolkit. Finally, I hope everyone keeps improving and lands a satisfying offer soon. Keep going; the future is promising!


Origin blog.csdn.net/Oliverfly1/article/details/108568677