Understanding High Performance and High Concurrency from the Root (6): Easy to Understand, How High-Performance Servers Are Implemented

The original title of this article was "How to Implement a High-Concurrency, High-Performance Server". Please contact the author for permission to reprint.

1. Introduction to the series

1.1 Purpose of the article

As a developer of instant messaging technology, you have long been surrounded by technical concepts related to high performance and high concurrency: thread pools, zero copy, multiplexing, event-driven programming, epoll, and so on. You may also be familiar with frameworks built around these techniques, such as Java's Netty, PHP's Workerman, and Go's gnet. But when the topic comes up in an interview, or when you hit a problem in practice that you cannot resolve, you realize that what you have is only surface knowledge.

Going back to basics and to the essence: what are the underlying principles behind these techniques? Explaining them in an easy-to-understand, effortless way is exactly what this series, "Understanding High Performance and High Concurrency from the Root", sets out to do.

1.2 Origin of the article

I have compiled many resources and articles on IM, message push, and other instant messaging technologies: from the early open-source IM framework MobileIMSDK, to the online version of the classic network programming book "TCP/IP Illustrated", to the programmatic guide "One Article Is Enough for Beginners: Developing Mobile IM from Scratch", as well as the article series "Introduction to Network Programming for Lazy People", "Brain-Disabled Introduction to Network Programming", "High-Performance Network Programming", and "Little-Known Network Programming".

The deeper you go, the more you realize how little you know about instant messaging technology. So later, to help developers better understand the characteristics of networks (especially mobile networks) from the perspective of basic telecommunications technology, I collected and compiled the cross-disciplinary series "Introduction to Zero-Based Communication Technology for IM Developers". For the average instant messaging developer, that series marks the knowledge boundary of network communication technology; together with the earlier network programming materials, it is basically enough to cover the blind spots in network communication knowledge.

For developing instant messaging systems such as IM, knowledge of network communication is indeed very important. But return to the essence of the technology: for the techniques that implement network communication itself, including the thread pools, zero copy, multiplexing, and event-driven programming mentioned above, what is their nature? What are the underlying principles? Answering that is the purpose of this series of articles, and I hope it is useful to you.

1.3 Article directory

" Understanding high performance and high concurrency from the root (1): Going deep into the bottom of the computer, understanding threads and thread pools "

" Understanding high performance and high concurrency from the root (2): In-depth operating system, understanding I/O and zero copy technology "

" Understanding high performance and high concurrency from the root (3): In-depth operating system, thorough understanding of I/O multiplexing "

" Understanding high performance and high concurrency from the root (4): In-depth operating system, thorough understanding of synchronization and asynchrony "

" Understanding high performance and high concurrency from the root (5): Going deep into the operating system and understanding the coroutines in high concurrency "

" Understanding high performance and high concurrency from the root (6): easy to understand, how high-performance servers are implemented " (* this article)

1.4 Overview of this article

Continuing from the previous article, "Understanding High Performance and High Concurrency from the Root (5): Going Deep into the Operating System, Understanding Coroutines in High Concurrency", this is the sixth and final article of the series.

In it you will see how a typical server combines the individual techniques explained in the first five articles to achieve high performance and high concurrency.


2. The author of this article

At the request of the author, no real names or personal photos are provided.

The author's main technical focus areas are Internet back-ends, high-concurrency high-performance servers, and search engine technology. His screen name is "Code Farmer's Survival on a Deserted Island". Thanks to the author for his selfless sharing.

3. Introduction

As you read this article, have you ever wondered how the server delivered it to you?

Put simply: isn't it just a user request? The server retrieves the article from the database according to the request, then sends it back over the network.

Looked at more closely, it is a bit complicated: how does the server handle thousands of user requests in parallel? What technologies are involved?

This article answers that question for you.

4. Multi-process

Historically, the earliest and simplest way to process multiple requests in parallel is to use multiple processes.

For example, in the Linux world we can use system calls such as fork and exec to create multiple processes: the parent process accepts user connection requests, then creates a child process to handle each of them.

Like this:
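As a rough illustration, here is a minimal sketch in C of this fork-per-connection pattern, assuming a listening socket listen_fd has already been created, bound, and put into the listening state; handle_request is a hypothetical stand-in for the real application logic:

#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* hypothetical stand-in for the real application logic */
static void handle_request(int conn_fd) {
    const char reply[] = "HTTP/1.0 200 OK\r\n\r\nhello\r\n";
    (void)write(conn_fd, reply, sizeof(reply) - 1);
}

void serve_forever(int listen_fd) {
    for (;;) {
        int conn_fd = accept(listen_fd, NULL, NULL);  /* parent waits for a connection */
        if (conn_fd < 0)
            continue;
        pid_t pid = fork();
        if (pid == 0) {              /* child: handle exactly this one request */
            close(listen_fd);        /* the child never accepts new connections */
            handle_request(conn_fd);
            close(conn_fd);
            _exit(0);
        }
        close(conn_fd);              /* parent: the child owns this socket now */
        while (waitpid(-1, NULL, WNOHANG) > 0)
            ;                        /* reap finished children without blocking */
    }
}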

The advantages of this method are:

  • 1) The programming model is simple and easy to understand;
  • 2) Since each process's address space is isolated from the others, a crash in one process does not affect the others;
  • 3) It makes full use of multi-core resources.

The advantages of multi-process parallel processing are obvious, but so are the disadvantages:

  • 1) The address-space isolation between processes, an advantage above, becomes a disadvantage here: it makes communication between processes much harder, and you must resort to an inter-process communication (IPC) mechanism. Think about it: how many IPC mechanisms do you know, and could you implement them in code right now? Clearly, IPC programming is relatively complicated, and its performance is also a big problem;
  • 2) We know that creating a process costs more than creating a thread, and frequently creating and destroying processes undoubtedly adds to the system's burden.

Fortunately, in addition to processes, we also have threads.

5. Multithreading

Isn't creating a process expensive? Isn't communicating between processes difficult? Neither of these is a problem for threads.

What? You don't know what a thread is yet? Then please take a look at "Going Deep into the Bottom of the Computer, Understanding Threads and Thread Pools", which explains in detail where the concept of a thread comes from.

Since threads share the process address space, threads naturally need no special mechanism to communicate with each other: they can simply read and write the same memory directly.

The cost of creating and destroying threads is also lower. A thread is like a hermit crab: the house (the address space) belongs to the process, and the thread is merely a tenant in it. It is therefore very lightweight, and cheap to create and destroy.

We can create one thread for each request. Even if one thread blocks on an I/O operation, such as reading a database, the other threads are unaffected.

Like this:
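Under the same assumptions as before (an already-listening listen_fd), here is a minimal thread-per-connection sketch using POSIX threads; compile with -pthread. The handler is again a hypothetical stand-in:

#include <pthread.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* hypothetical stand-in for the real application logic */
static void *handle_request(void *arg) {
    int conn_fd = (int)(intptr_t)arg;   /* connection fd smuggled through the arg pointer */
    const char reply[] = "HTTP/1.0 200 OK\r\n\r\nhello\r\n";
    (void)write(conn_fd, reply, sizeof(reply) - 1);  /* blocking here stalls only this thread */
    close(conn_fd);
    return NULL;
}

void serve_forever(int listen_fd) {
    for (;;) {
        int conn_fd = accept(listen_fd, NULL, NULL);
        if (conn_fd < 0)
            continue;
        pthread_t tid;
        /* one thread per request, detached so it cleans itself up on exit */
        if (pthread_create(&tid, NULL, handle_request, (void *)(intptr_t)conn_fd) == 0)
            pthread_detach(tid);
        else
            close(conn_fd);          /* could not spawn a thread; drop the connection */
    }
}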

But are threads perfect, a cure for every disease? Obviously, the computer world has never been that simple.

Because threads share the process address space, the very thing that makes communication between threads convenient also brings endless trouble.

Precisely because threads share an address space, a crash in one thread causes the whole process to crash and exit. At the same time, communication between threads is almost too simple: so simple, just reading and writing memory directly, that problems arise extremely easily, such as deadlocks and thread synchronization and mutual exclusion issues. These are exceptionally bug-prone, and a considerable share of countless programmers' precious time goes into solving the endless problems that multithreading brings.

Although threads have shortcomings too, they offer more advantages than multiple processes. Even so, simply using multithreading alone cannot realistically solve the high-concurrency problem.

Because although a thread is cheaper to create than a process, it still has a cost. For a high-concurrency server with tens or hundreds of thousands of connections, creating tens of thousands of threads causes performance problems of its own, including memory usage and the overhead of switching between threads, that is, scheduling overhead.

Therefore, we need to think further.

6. Event-driven: Event Loop

So far, when we mention the word "parallel", we think of processes and threads.

But can parallel programming rely only on these two technologies? Not at all!

There is another parallel technique, widely used in GUI programming and server programming, that has become very popular in recent years: event-driven programming, also known as event-based concurrency.

PS: Programmers who work on IM server development are certainly no strangers to it. What do you think the EventLoop interface in Netty, the famous high-performance Java NIO network programming framework, stands for? (For the high-performance principles of the Netty framework, you can read "Beginner's Guide: The Most Thorough Analysis of Netty's High-Performance Principles and Framework So Far".)

Don't assume this is a hard technique to grasp. In fact, the principle of event-driven programming is very simple.

This technique requires two raw materials:

  • 1) events;
  • 2) functions that handle events, usually called event handlers.

The rest is simple: just wait quietly for events to arrive. When an event arrives, check its type, look up the corresponding event handler by that type, and then call that handler directly.

That's it!

That is the whole of event-driven programming. Simple, isn't it?

From the discussion above we can see that we need to continuously receive and then process events, so we need a loop (a while or for loop will do). This loop is called the event loop.

In pseudocode it looks like this:

while(true) {
    event = getEvent();
    handler(event);
}

What the event loop does is actually very simple: wait for an event to arrive, then call the corresponding event handler.

Note: this code only needs to run in a single thread or process, and this one event loop alone can handle multiple user requests at the same time.
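To make the loop above slightly more concrete, here is a sketch in C with a small set of event types and a dispatch table of handlers. All names here (event_type, get_event, handlers) are illustrative, not a fixed API; get_event stands for the event source discussed in the next section:

#include <stddef.h>

enum event_type { EV_READABLE, EV_WRITABLE, EV_TIMER, EV_TYPE_COUNT };

struct event {
    enum event_type type;   /* what happened */
    int fd;                 /* which connection it happened on */
};

typedef void (*event_handler)(struct event *ev);

/* one handler registered per event type, filled in at startup */
static event_handler handlers[EV_TYPE_COUNT];

/* supplied by the event source; blocks until something happens */
struct event *get_event(void);

void event_loop(void) {
    for (;;) {
        struct event *ev = get_event();          /* 1) wait for an event */
        if (ev != NULL && handlers[ev->type] != NULL)
            handlers[ev->type](ev);              /* 2) dispatch by its type */
    }
}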

Some students may still be puzzled: why can such an event loop handle multiple requests at the same time?

The reason is simple: for a network server, most of the time spent handling a user request actually goes into I/O operations such as database reads and writes, file reads and writes, and network reads and writes. When a request arrives, after some simple processing it will likely need to query the database or perform other I/O. We know that I/O is very slow, so after initiating the I/O we can simply move on to the next user request without waiting for that I/O to complete.

Now you should see it: even though the previous request has not finished, we can already process the next one. This too is parallelism, and it is exactly the parallelism that event-driven programming provides.

It is like a waiter in a restaurant: a waiter does not wait at one table for the customer to order, be served, eat, and pay before attending to the next. What does the waiter do instead? When one customer has finished ordering, the waiter moves straight on to the next, and comes back to settle the bill once a customer has finished eating.

See? The same waiter can serve multiple customers at the same time. That waiter is the equivalent of our event loop: even though the event loop runs in only one thread (process), it can handle multiple user requests simultaneously.

By now you should have a clear picture of event-driven programming, so the next question is: how do we get these events? Where do the events come from?

7. Event source: IO multiplexing

In the article "In- depth Operating System, Thorough Understanding of I/O Multiplexing ", we know that everything in the Linux/Unix world is a file, and our programs perform I/O operations through file descriptors Of course, sockets in network programming are no exception.

So how do we handle multiple file descriptors at the same time?

IO multiplexing solves exactly this problem: with IO multiplexing we can monitor many file descriptors at once, and we are notified whenever one of these "files" (in IM network communication, usually a socket) becomes readable or writable.

In this way, IO multiplexing becomes the raw-material supplier of the event loop, continuously feeding it all kinds of events. With that, the question of where events come from is solved.
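On Linux, epoll is a common way to play this supplier role. Below is a minimal sketch of an epoll-driven event loop; it assumes the descriptors of interest were registered beforehand with epoll_ctl(EPOLL_CTL_ADD), and on_readable/on_writable are hypothetical application handlers:

#include <sys/epoll.h>

#define MAX_EVENTS 64

/* hypothetical application handlers, defined elsewhere */
void on_readable(int fd);
void on_writable(int fd);

void event_loop(int epoll_fd) {
    struct epoll_event events[MAX_EVENTS];
    for (;;) {
        /* one call watches every registered descriptor and blocks until
         * at least one of them becomes readable or writable */
        int n = epoll_wait(epoll_fd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; i++) {
            if (events[i].events & EPOLLIN)
                on_readable(events[i].data.fd);   /* a request can be read */
            if (events[i].events & EPOLLOUT)
                on_writable(events[i].data.fd);   /* a response can be sent */
        }
    }
}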

Of course, for a detailed explanation of IO multiplexing, please refer to "In-Depth Operating System, Thorough Understanding of I/O Multiplexing"; this article is an overview, so we won't repeat it here.

So far, have all the problems of implementing concurrency with event-driven programming been solved? The source of events is settled, and when an event arrives we call the corresponding handler. It looks like we're done.

Think about it: are there any remaining problems?

8. Problem: Blocking IO

Now we can use a single thread (process) to do event-driven parallel programming, without any of the locks, synchronization, mutual exclusion, deadlocks, and other annoyances of multithreading.

But no technology in computer science has ever solved every problem; none does now, and none will in the foreseeable future.

Is there any problem with the above method?

Don't forget: our event loop runs in a single thread (process). Although this avoids the multithreading problems, what happens if handling an event requires an I/O operation?

In the article "In- depth Operating System, Understanding I/O and Zero Copy Technology ", we explained how the most commonly used file reading is implemented at the bottom level. This IO method most commonly used by programmers is called blocking IO.

In other words, when we perform an I/O operation such as reading a file, if the data is not ready yet, our program (thread) is blocked and suspended. In multithreading this is not a problem, because the operating system can still schedule other threads.

But in a single-threaded event loop it is a problem: when the event loop performs a blocking IO operation, the entire thread (the event loop) is suspended, and the operating system has no other thread to schedule, because the system has only this one event loop processing user requests. While the event loop thread is blocked and suspended, no user request can be processed at all. Can you imagine your request being left hanging because the server is stuck reading the database for some other user's request?
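To make the trap concrete, here is a hypothetical handler sketch: if this ran inside the single-threaded loop above, the blocking read() would suspend the only thread we have:

#include <unistd.h>

/* hypothetical handler: the blocking read() below would suspend the one
 * and only event-loop thread, and every other user request would stall
 * along with it */
void on_readable(int fd) {
    char buf[4096];
    ssize_t n = read(fd, buf, sizeof(buf));  /* may block until data arrives */
    (void)n;                                 /* ... process the request ... */
}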

Therefore, event-driven programming comes with one rule of caution: never initiate blocking IO.

Some students may ask: if we cannot initiate blocking IO, how do we perform IO operations at all?

PS: Where there is blocking IO, there is also non-blocking IO. Let's continue.

9. Solution: non-blocking IO

To overcome the problems caused by blocking IO, modern operating systems provide a new way to initiate IO requests: asynchronous IO. Correspondingly, blocking IO is synchronous IO. For the two concepts of synchrony and asynchrony, please refer to "Understanding High Performance and High Concurrency from the Root (4): In-Depth Operating System, Thorough Understanding of Synchronization and Asynchrony".

With asynchronous IO, suppose we call the aio_read function (for the specific asynchronous IO API, refer to your operating system platform), that is, an asynchronous read. The call returns immediately, and we can carry on with other work even though the file may not have been read yet, so the calling thread is not blocked. The operating system also provides ways for the calling thread to check whether the IO operation has completed.
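As a minimal sketch, here is what that looks like with the POSIX AIO interface (aio_read/aio_error/aio_return; link with -lrt on older glibc). For brevity the sketch polls for completion, which a real event-driven server would never do; it would return to its event loop and check back later. The function name is illustrative:

#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* illustrative helper: submit an asynchronous read and fetch its result */
ssize_t read_file_async(const char *path, char *buf, size_t len) {
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct aiocb cb;
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;        /* which file to read */
    cb.aio_buf    = buf;       /* where to put the data */
    cb.aio_nbytes = len;       /* how much to read */
    cb.aio_offset = 0;         /* from the start of the file */

    if (aio_read(&cb) != 0) {  /* submit the read; this returns immediately */
        close(fd);
        return -1;
    }

    /* the calling thread is NOT blocked here; a real server would return to
     * its event loop and do useful work, checking completion later */
    while (aio_error(&cb) == EINPROGRESS)
        ;                      /* polling only to keep the sketch short */

    ssize_t n = aio_return(&cb);   /* bytes read, or -1 on failure */
    close(fd);
    return n;
}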

In this way, with the operating system's help, the problem of blocking IO calls is also solved.

10. Difficulties of event-driven parallel programming

Although asynchronous IO keeps the event loop from blocking, event-based programming is still difficult.

First of all, as mentioned, the event loop runs in a single thread, and one thread obviously cannot make full use of multi-core resources. Some students may say: just create several event loop instances, so there are several event loop threads. But then the multithreading problems reappear.

The other difficulty lies in the programming style itself. In "Understanding High Performance and High Concurrency from the Root (4): In-Depth Operating System, Thorough Understanding of Synchronization and Asynchrony", we mentioned that asynchronous programming must be combined with callback functions (this style splits the processing logic into two parts: one handled by the caller itself, the other handled inside the callback function). This change of programming style increases the burden of understanding on programmers, which makes event-based projects hard to extend and maintain as they grow.

So is there a better way?

To find a better way, we need to get at the essence of the problem. So what is the essence of the problem?

11. A better way

Why do we have to use asynchronous programming in such a hard-to-understand way?

The reason: blocking-style programming is easy to understand, but it causes the thread to be blocked and suspended.

So, clever as you are, you will surely ask: is there a way to keep the easy comprehension of synchronous IO while avoiding threads being blocked by synchronous calls?

The answer is yes: user-level threads, that is, the famous coroutines (for coroutines, please read the fifth article of this series, "Understanding High Performance and High Concurrency from the Root (5): Going Deep into the Operating System, Understanding Coroutines in High Concurrency"; we won't repeat it here).

Although event-based programming has its various shortcomings, it remains very popular on today's high-performance, high-concurrency servers. It is, however, no longer purely single-threaded event-driven; instead, the combination is event loop + multithreading + user-level threads.

That combination deserves an article of its own, and we will discuss it in detail in a follow-up article.

12. Summary of this article

High-concurrency techniques have evolved from the earliest multi-process approach to today's event-driven designs. Computer technology, like living things, keeps evolving, but in any case, understanding the history gives a deeper understanding of the present. I hope this article has helped you understand high-concurrency servers.

Appendix: More high-performance, high-concurrency articles

" High-performance network programming (1): How many concurrent TCP connections can a single server have "

" High-performance network programming (2): The famous C10K concurrent connection problem in the last 10 years "

" High-Performance Network Programming (3): In the next 10 years, it's time to consider C10M concurrency "

" High-performance network programming (4): Theoretical exploration of high-performance network applications from C10K to C10M "

" High-Performance Network Programming (5): Reading the I/O Model in High-Performance Network Programming in One Article "

" High-Performance Network Programming (6): Understanding the Thread Model in High-Performance Network Programming in One Article "

" High-performance network programming (7): What is high concurrency? Understand in one sentence!

" Take the network access layer design of the online game server as an example to understand the technical challenges of real-time communication "

" Knowing the technology sharing: knowing the practice of high-performance long-connection gateway technology with tens of millions of concurrency "

" Taobao Technology Sharing: The Technological Evolution Road of the Mobile Access Layer Gateway of the Hand Taobao Billion Level "

" A set of mobile IM architecture design practice sharing for massive online users (including detailed graphics and text) "

"An Original Distributed Instant Messaging (IM) System Theoretical Architecture Plan "

" WeChat background based on the time series of massive data cold and hot hierarchical architecture design practice "

" WeChat Technical Director Talks about Architecture: The Way of WeChat-Dao Zhi Jian (Full Speech) "

" How to Interpret "WeChat Technical Director Talking about Architecture: The Way of WeChat-The Road to the Simple" "

" Rapid Fission: Witness the evolution of WeChat's powerful back-end architecture from 0 to 1 (1) "

" 17 Years of Practice: Technical Methodology of Tencent's Massive Products "

" Summary of Tencent's Senior Architect Dry Goods: An article to understand all aspects of large-scale distributed system design "

" Take Weibo application scenarios as an example to summarize the architectural design steps of massive social systems "

" Getting Started: A Zero-Basic Understanding of the Evolution History, Technical Principles, and Best Practices of Large-scale Distributed Architectures "

" From novice to architect, one piece is enough: the evolution of architecture from 100 to 10 million high concurrency "

This article has been simultaneously published on the "Instant Messaging Technology Circle" official account.

▲ The synchronous publishing link is: http://www.52im.net/thread-3315-1-1.html

Origin: blog.csdn.net/hellojackjiang2011/article/details/113120090