golang interview: golang concurrency and multithreading (3)


title: golang concurrency and multithreading (3)
author: Russshare
toc: true
date: 2021-07-13 18:57:01
tags: [golang, interview, multithreading and concurrency]
categories: golang interview

  • 3. Concurrency and multithreading

    • 01 Go's concurrency mechanism and the CSP (Communicating Sequential Processes) concurrency model it uses

      • The CSP model was proposed in the 1970s. Unlike traditional multithreaded code, which communicates by sharing memory, CSP emphasizes "sharing memory by communicating". It is a concurrency model that describes independent concurrent entities communicating through a shared channel (pipe). In CSP the channel is a first-class object: the model does not care which entity sends a message, only which channel the message is sent on.

      • A channel in Golang is created independently and can be passed between goroutines. Its communication pattern resembles the boss-worker pattern: one entity sends a message into the channel, and another entity listening on the channel processes it. The two entities are anonymous to each other, which decouples them. By default (unbuffered) a channel delivers messages synchronously: a message sent into the channel must eventually be consumed by another entity. In terms of implementation, a channel behaves much like a blocking message queue.

      • The goroutine is the entity that actually executes concurrently in Golang. Underneath, Go uses coroutines to achieve concurrency. A coroutine is a user-level thread that runs in user mode, similar to a green thread. Go chose coroutines as its foundation because they have the following features:

        • Running in user space avoids the cost of switching between kernel mode and user mode.
        • They can be scheduled by the language runtime or a framework layer.
        • A smaller stack size allows a large number of instances to be created.
      • Features of Goroutine in Golang:

        • The Go scheduler works with three kinds of objects: the P object (processor), which represents the scheduling context (it can loosely be thought of as a CPU); the M object (machine), which represents a worker thread; and the G object (goroutine).

        Under normal circumstances, each P is served by one worker thread (M), which picks up goroutines (G) and executes them. When a goroutine blocks its thread, for example in a system call, a new worker thread is started so that CPU resources stay fully used. This is why there are sometimes many more M objects than P objects.

        • G (Goroutine): what we call a coroutine, a user-level lightweight thread. The sched field in each goroutine object saves its context information.
        • M (Machine): a wrapper around a kernel-level thread, the object that actually does the work. The number of Ms running Go code at any moment is bounded by the number of Ps, although more Ms may exist while some are blocked in system calls.
        • P (Processor): the scheduling object that connects G and M. The number of Ps can be set with runtime.GOMAXPROCS() and defaults to the number of logical CPU cores.
          • GPM scheduling model
          • Golang is a language designed for concurrency, and one of the few languages that supports concurrency at the language level; this feature is a large part of why Go has attracted developers around the world.

        Golang's CSP concurrency model is implemented through Goroutine and Channel.

        The goroutine is the unit of concurrent execution in Go. That sounds abstract, but it is similar to the traditional concept of a "thread" and can be thought of as a lightweight thread. The channel is the communication mechanism between goroutines. A channel is usually described as a "pipe" for communication between goroutines, somewhat similar to a pipe in Linux.

        Channels are also convenient to use: channel <- data sends data, and <-channel receives it.

        In communication, a send (channel <- data) and a receive (<-channel) necessarily come in pairs, because two goroutines communicate only when data is sent on one side and received on the other.

        On an unbuffered channel, both a send and a receive block until another goroutine performs the matching receive or send.
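The paired send and receive described above can be sketched as follows. This is a minimal illustration rather than anything from the original article; the function names square and squareAll are hypothetical.

```go
package main

import "fmt"

// square receives values from in, squares them, and sends the results on out.
// Both channels are unbuffered, so every send blocks until the peer receives.
func square(in <-chan int, out chan<- int) {
	for v := range in {
		out <- v * v
	}
	close(out)
}

// squareAll pushes each input through the worker goroutine and collects the results.
func squareAll(nums []int) []int {
	in := make(chan int)
	out := make(chan int)
	go square(in, out)

	results := make([]int, 0, len(nums))
	for _, n := range nums {
		in <- n                          // blocks until the worker receives n
		results = append(results, <-out) // blocks until the worker sends n*n
	}
	close(in) // lets the worker's range loop end
	return results
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3})) // [1 4 9]
}
```

Each send in squareAll has exactly one matching receive in square, and vice versa, which is the pairing the text refers to.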

    • 02 What is a channel and why can it be thread-safe?

      • Channel is a core type in Go. It can be regarded as a pipe through which concurrently executing units send and receive data in order to communicate. A channel can also be understood as a first-in-first-out queue.

      In Golang, sending a value to a channel and receiving a value from a channel are each atomic with respect to other goroutines, because the runtime synchronizes access to the channel internally. Go's design philosophy is: do not communicate by sharing memory; instead, share memory by communicating. The former is the traditional locking approach, and the latter is the channel. In other words, channels were designed to transfer data between multiple tasks, so they are safe for concurrent use by design.
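As a small sketch of this point, several goroutines can send on the same channel without any explicit lock. The function name sumConcurrently is an assumption made for the illustration.

```go
package main

import "fmt"

// sumConcurrently starts one goroutine per input; each sends its value on a
// shared channel. No mutex is needed because the channel synchronizes the sends.
func sumConcurrently(nums []int) int {
	ch := make(chan int)
	for _, n := range nums {
		go func(v int) { ch <- v }(n)
	}
	total := 0
	for range nums {
		total += <-ch // only this goroutine ever touches total
	}
	return total
}

func main() {
	fmt.Println(sumConcurrently([]int{1, 2, 3, 4})) // 10
}
```

The values may arrive in any order, but each one is delivered exactly once, so the sum is deterministic.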

    • 03 What is the difference between an unbuffered channel and a buffered channel?

      • For an unbuffered channel, the sender blocks until a receiver receives the data from the channel, and the receiver blocks until a sender sends data into the channel.
      • For a buffered channel, the sender blocks only when there are no empty slots (the buffer is full), and the receiver blocks only when the channel is empty.
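The difference can be observed with a non-blocking send. The sketch below uses select with a default branch to report whether a plain send would have blocked; the helper name trySend is hypothetical.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether it succeeded.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false // a plain send would have blocked here
	}
}

func main() {
	unbuffered := make(chan int)
	buffered := make(chan int, 2)

	fmt.Println(trySend(unbuffered, 1)) // false: no receiver is waiting
	fmt.Println(trySend(buffered, 1))   // true: first of two slots
	fmt.Println(trySend(buffered, 2))   // true: second of two slots
	fmt.Println(trySend(buffered, 3))   // false: the buffer is full
	<-buffered                          // free one slot
	fmt.Println(trySend(buffered, 3))   // true
}
```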
    • 04 What is a Goroutine Leak?

      • A goroutine leak means that a goroutine, once created, cannot be released for a long time while new goroutines keep being created, which eventually exhausts memory and crashes the program. Common scenarios that lead to goroutine leaks are as follows:
        • Missing receiver, causing send to block

        • Missing sender, causing receive to block

        • Deadlock

        • While two or more goroutines are running, they block one another by competing for resources or by waiting on each other's communication. The goroutines stay blocked and cannot exit.

        • Infinite loops

        • For example, to cope with network problems, a goroutine retries an HTTP request indefinitely until it obtains the data. If the HTTP service is down, the request never succeeds, the goroutine never exits, and a leak occurs.
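One possible way to avoid the infinite-retry leak is to bound the retry loop with a context. The sketch below is an assumption about how this could be done, not the original author's code: fetchWithRetry and runDemo are hypothetical names, and the fetch argument stands in for a real HTTP call.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// fetchWithRetry retries fetch until it succeeds or ctx is done, so the
// calling goroutine can always exit. fetch is a stand-in for an HTTP request.
func fetchWithRetry(ctx context.Context, fetch func() (string, error)) (string, error) {
	for {
		if data, err := fetch(); err == nil {
			return data, nil
		}
		select {
		case <-ctx.Done():
			return "", ctx.Err() // give up instead of looping forever
		case <-time.After(10 * time.Millisecond): // short pause between attempts
		}
	}
}

// runDemo retries against a service that is always down, with a 50 ms budget.
func runDemo() error {
	alwaysDown := func() (string, error) { return "", errors.New("service unavailable") }

	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()

	_, err := fetchWithRetry(ctx, alwaysDown)
	return err
}

func main() {
	fmt.Println(runDemo()) // context deadline exceeded
}
```

Without the ctx.Done() case, the loop would spin for as long as the service stays down, which is the leak scenario described above.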

    • 05 Can Go limit the number of operating system threads at runtime?

      • You can set the limit with the environment variable GOMAXPROCS or with runtime.GOMAXPROCS(num int), for example:
        runtime.GOMAXPROCS(1) // Limit the number of operating system threads that execute Go code at the same time to 1

      • GOMAXPROCS limits the number of operating system threads that can execute user-level Go code simultaneously; threads blocked in system calls do not count against this limit.
        The default value of GOMAXPROCS equals the number of logical CPU cores. At any moment, each processor (P) can be bound to only one thread, which then runs the goroutines scheduled on it.

        Therefore, for CPU-intensive tasks, setting the value too high, for example to twice the number of logical CPU cores, increases thread-switching overhead and reduces performance.

        For I/O-intensive applications, moderately increasing this value may improve I/O throughput in some cases, although threads blocked in system calls are already excluded from the limit.
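A short sketch of querying and changing the setting at run time follows. The helper name setProcs is hypothetical; runtime.GOMAXPROCS and runtime.NumCPU are the standard library calls.

```go
package main

import (
	"fmt"
	"runtime"
)

// setProcs sets GOMAXPROCS to n and returns the previous and the new setting.
func setProcs(n int) (prev, now int) {
	prev = runtime.GOMAXPROCS(n) // returns the previous value
	now = runtime.GOMAXPROCS(0)  // an argument below 1 only queries the setting
	return prev, now
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	prev, now := setProcs(1)
	fmt.Println("GOMAXPROCS before:", prev, "after:", now)
	runtime.GOMAXPROCS(prev) // restore the original setting
}
```

The printed values depend on the machine, so no fixed output is shown here.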

    • 06 GPM scheduling model

      • see (garbage collection)
    • 07 Concurrency models commonly used in Golang?

      • 1. Concurrency control through channel notification

        • An unbuffered channel is a channel whose capacity is 0. That is, it cannot hold any value before the value is received. It requires the sending goroutine and the receiving goroutine to be ready at the same time in order to complete the send and receive operations.
        • It follows from this definition that the sending goroutine and the receiving goroutine must synchronize. If they are not ready at the same time, whichever operation runs first blocks and waits until the matching operation is ready. For this reason an unbuffered channel is also called a synchronous channel.
        • When the main goroutine reaches <-ch to receive a value from the channel and the channel holds no data, it blocks until a value arrives. This makes simple concurrency control easy to implement.
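A minimal sketch of this notification pattern, assuming a hypothetical helper named runAndWait:

```go
package main

import "fmt"

// runAndWait starts a worker goroutine and blocks until it signals completion.
func runAndWait(work func()) {
	done := make(chan struct{}) // unbuffered: used purely as a signal
	go func() {
		work()
		done <- struct{}{} // blocks until the caller is receiving
	}()
	<-done // blocks until the worker has finished
}

func main() {
	result := 0
	runAndWait(func() { result = 42 })
	fmt.Println(result) // 42
}
```

Because the send completes only when the receive does, the write to result is guaranteed to be visible to main after <-done returns.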
      • 2. Concurrency control through WaitGroup in the sync package

        • Goroutines execute asynchronously. To prevent goroutines from being terminated when the main function returns, the main goroutine sometimes needs to wait for them synchronously, and this is what WaitGroup is for. The sync package provides WaitGroup, which waits until all the goroutine tasks it tracks are done. WaitGroup has three main methods:

          • Add adds delta, which may be negative, to the counter of goroutines to wait for.
          • Done is equivalent to Add(-1).
          • Wait blocks the calling goroutine until the WaitGroup counter drops to 0.
        • In the main goroutine, Add(delta int) sets the number of goroutines to wait for. When each goroutine finishes, it calls Done() to signal completion. Once all goroutines have finished, Wait returns in the main goroutine.

        • The official Golang documentation for WaitGroup states: "A WaitGroup must not be copied after first use."
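The three methods can be seen together in the following sketch. The function name squaresConcurrently is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// squaresConcurrently computes each square in its own goroutine and waits
// for all of them before returning.
func squaresConcurrently(nums []int) []int {
	var wg sync.WaitGroup
	out := make([]int, len(nums))
	for i, n := range nums {
		wg.Add(1) // register the goroutine before starting it
		go func(i, n int) {
			defer wg.Done() // equivalent to wg.Add(-1)
			out[i] = n * n  // each goroutine writes only its own index
		}(i, n)
	}
	wg.Wait() // blocks until the counter reaches 0
	return out
}

func main() {
	fmt.Println(squaresConcurrently([]int{1, 2, 3})) // [1 4 9]
}
```

The closure captures wg by reference, so the WaitGroup is never copied. When a WaitGroup has to be passed to another function, pass a pointer to it.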

      • 3. Concurrency control through Context, introduced in Go 1.7

        • In simple scenarios, channel and WaitGroup are usually enough, but in complex and changeable network concurrency scenarios they fall short. For example, each network request needs a goroutine to do some work, and that goroutine may start further goroutines, such as ones that call databases and RPC services. We therefore need a way to track goroutines so that we can control them, and this is what Go's Context provides. The name "context" is apt: it is the context of a goroutine, covering the program's operating environment, scene, and snapshot. Each running task needs to know its current state, so Go typically encapsulates this state in a Context and passes it to the goroutine that will execute the task.

        • The context package is mainly used to share data between goroutines and to manage multiple goroutines.

          If the main goroutine needs to notify a child goroutine at some point to abandon its task and exit, the child goroutine can listen on a done channel. Once the main goroutine closes the done channel, the child goroutine can exit, which satisfies the requirement that the main goroutine notify the child. This works, but it is limited.

          Suppose we want to attach extra information to the simple notification to control the cancellation: why it was canceled, whether there is a deadline by which the task must finish, or, when there are several cancellation options, which one to choose based on that extra information. A bare done channel cannot express any of this.

          Consider the following situation: the main goroutine has multiple tasks 1, 2, ..., m and applies timeout control to them; task 1 in turn has multiple subtasks 1, 2, ..., n and applies its own timeout control to them. These subtasks therefore need to perceive both the cancellation signal of the main goroutine and the cancellation signal of task 1.

          If we still use done channels, we need to define two of them, and each subtask needs to monitor both at the same time. That is still manageable. But if the hierarchy is deeper and these subtasks have subtasks of their own, the done-channel approach becomes cumbersome and confusing.

          We need an elegant solution that implements the following mechanism:

          When an upper-level task is canceled, all of its lower-level tasks are canceled;
          when a middle-level task is canceled, only the lower-level tasks of that task are canceled, without affecting upper-level tasks or tasks at the same level.
          This is where context comes in handy. Let's first look at the structural design and implementation principles of context.

          • The core of the context package is the Context interface, which declares the following methods:

          • Done() returns a receive-only channel. The channel is closed when the context is canceled or its timeout expires, which serves as the cancellation signal.

          • Err() returns the reason for the cancellation once the Done channel has been closed.

          • Deadline() returns the time at which the context will be canceled, if a deadline is set.

          • Value() lets the Context object carry request-scoped data, which must be safe for concurrent use.

          • A Context object is safe for concurrent use. You can pass one Context to any number of goroutines, and when it is canceled, all of them receive the cancellation signal.
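The four methods can be exercised in a few lines. This is a sketch under assumptions: the key type ctxKey, the value "alice", and the function name inspect are made up for the illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// ctxKey is an unexported key type, which avoids collisions with keys
// defined by other packages.
type ctxKey string

// inspect derives a context and reports what its four methods return.
func inspect() (hasDeadline bool, user string, errBefore, errAfter error) {
	ctx := context.WithValue(context.Background(), ctxKey("user"), "alice")
	ctx, cancel := context.WithTimeout(ctx, time.Hour)
	defer cancel() // calling cancel more than once is harmless

	_, hasDeadline = ctx.Deadline()              // reports the deadline, if any
	user, _ = ctx.Value(ctxKey("user")).(string) // request-scoped data
	errBefore = ctx.Err()                        // nil while the context is live

	cancel()             // cancel explicitly instead of waiting an hour
	<-ctx.Done()         // the channel is closed on cancellation
	errAfter = ctx.Err() // context.Canceled
	return
}

func main() {
	fmt.Println(inspect()) // true alice <nil> context canceled
}
```

Note that the cancel function is returned separately by WithTimeout. It is not a method on the Context itself.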

          • The Context interface has no Cancel method, and the Done channel is receive-only, for the same reason: the function that receives the cancellation signal is usually not the function that sends it. A typical scenario: a parent operation starts a goroutine for a child operation, and the child operation must not be able to cancel the parent.

          • In a Go server, each request is handled in its own goroutine, and handlers often start additional goroutines to improve performance. When the request is canceled or times out, all goroutines working on it should exit to prevent resource leaks. This is where context comes in: it ties together all goroutines working on the request and delivers cancellation signals, timeouts, and so on to them. Contexts form a chain: a new Context derived with a WithXxx function is linked to its parent Context, and when the parent is canceled, all Contexts derived from it are canceled as well.

          • In servers written in Go, requests are usually processed in goroutines. An HTTP request, for example, typically gets a new goroutine, and that goroutine may start many more, for RPC calls, database access, and so on. When the request is canceled or times out, the goroutines on this chain should be canceled or exit promptly to release system resources. These goroutines may also need to share data such as authorization tokens. With context, cancellation, timeouts, and sharing of request-scoped information all become simple.

          • Responsibility and boundary
            The context is only responsible for delivering notifications and information. How the goroutines that receive a notification react to it is up to them.

    • 08 The difference between coroutines, threads, and processes.

      • Process
        A process is a running instance of a program with some independent function operating on a data set, and it is the independent unit by which the system allocates and schedules resources. Each process has its own memory space, and different processes communicate through inter-process communication (IPC). Because a process is relatively heavyweight and owns its memory, context switching between processes (stacks, registers, virtual memory, file handles, and so on) is relatively expensive, but processes are comparatively stable and safe.
      • Thread
        A thread is an entity within a process and the basic unit of CPU scheduling and dispatch. It is smaller than a process and can run independently. A thread owns almost no system resources of its own, only the few that are essential for running (such as a program counter, a set of registers, and a stack), but it shares all resources owned by the process with the other threads of that process. Threads communicate mainly through shared memory; context switching is fast and resource overhead is low, but threads are less stable than processes, and data is more easily lost or corrupted.
      • Coroutine
        A coroutine is a lightweight user-mode thread whose scheduling is controlled entirely in user space. A coroutine has its own register context and stack. When the scheduler switches away from a coroutine, its register context and stack are saved elsewhere; when it switches back, they are restored. Because the stack is manipulated directly, there is essentially no kernel switching overhead, so context switching is very fast. In a purely cooperative, single-threaded coroutine model, global variables can be accessed without locks; this does not hold for goroutines, which can run in parallel on multiple threads.

In the Go 1.5.4 source code, the newosproc function in runtime/os1_linux.go calls Linux's clone system call with flags that include _CLONE_THREAD, so an M is clearly an ordinary thread. A goroutine is really a task in user code that the scheduler runs on an M (on Linux, an OS thread, that is, an ordinary thread). The runtime schedules the tasks (goroutines) of the user program on top of these threads, which implements the M:N scheduling model.

Origin blog.csdn.net/weixin_45264425/article/details/132200028