Node.js: asynchronous I/O and event-based programming

1. Process and thread, concurrency and parallelism, thread safety

    Process and thread:

        Any discussion of threads inevitably invites a comparison with processes, although in Java concurrent programming the concept most often confused with "thread" is, I think, "task" rather than "process". A process is the basic unit of resource allocation. A thread is the smallest unit of execution within a process, that is, the basic unit of processor scheduling. The difference is familiar to most readers and fits in one sentence: a program usually runs as one process, and a process can contain multiple threads.

    Concurrency and Parallelism:

        Concurrency is usually explained by contrast with parallelism. Concurrency means handling multiple things within the same period of time, such as washing the dishes and doing the laundry between 1:00 and 2:00. Parallelism means doing multiple things at the same moment, such as drawing a circle with the left hand and a square with the right hand at exactly 1 o'clock. The key distinction is "a period of time" versus "the same moment". In operating-system terms:

1) Concurrency is the processing of multiple tasks on a single core, where the tasks take turns. (Here "simultaneous" means logically simultaneous.)

2) Parallelism is the processing of multiple tasks on a multi-core processor, where the tasks really run at once. (Here "simultaneous" means physically simultaneous.)
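
To make "logically simultaneous" concrete, here is a minimal sketch of two tasks taking turns on one thread. The task names and the runConcurrently helper are made up for illustration; this is not how an operating system scheduler is implemented, only a picture of interleaving.

```javascript
// Each task is a generator: every `yield` marks a point where the task
// can be paused so that another task gets a turn.
function* task(name, steps) {
  for (let i = 1; i <= steps; i++) {
    yield `${name} step ${i}`;
  }
}

// Hypothetical round-robin scheduler: runs one step of each unfinished
// task in turn, all on a single thread.
function runConcurrently(tasks) {
  const log = [];
  const queue = [...tasks];
  while (queue.length > 0) {
    const current = queue.shift();
    const { value, done } = current.next();
    if (!done) {
      log.push(value);
      queue.push(current); // not finished yet: go to the back of the queue
    }
  }
  return log;
}

const log = runConcurrently([task('dishes', 2), task('laundry', 3)]);
console.log(log.join('\n'));
// dishes step 1
// laundry step 1
// dishes step 2
// laundry step 2
// laundry step 3
```

Both tasks make progress during the same period, but at any given moment only one of them is running. That is concurrency without parallelism.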

        Beginners mostly write single-threaded, structured programs and may never meet the concept of a thread at all. The program simply follows the logic we wrote, step by step, and produces the expected output. As programming skills grow and application scenarios become more complex, however, multi-threaded concurrent programming becomes unavoidable. Beginners in concurrent programming often see unexpected results, and these are usually related to "thread safety".

    Thread safety:
        Thread safety deserves close attention in multi-threaded concurrent programming. If your threads are not "safe" enough, the program may produce unpredictable results that are hard to reproduce. "Java Concurrency in Practice" notes that thread safety is hard to define well. My simple understanding is that a thread-safe program executes according to the logic of your code and always produces the intended result. The definition given in the book: a class is thread-safe if it always behaves correctly when accessed by multiple threads. Specific topics related to thread safety, such as atomicity and visibility, are not covered here and will be introduced in detail when appropriate. Put simply, to get thread safety you lock the shared data while you access it, so that other threads cannot access it at the same time. This statement is not rigorous, but it is a workable first understanding.
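
As a rough illustration in Node.js terms, the sketch below lets several worker threads increment one shared counter. It assumes the worker_threads module and uses Atomics.add, an atomic operation, in place of an explicit lock. The countWithWorkers helper and the numbers are made up for the example.

```javascript
const { Worker } = require('worker_threads');

// Code run inside each worker thread (passed as a string with eval: true).
const workerSource = `
  const { workerData } = require('worker_threads');
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < workerData.increments; i++) {
    Atomics.add(counter, 0, 1); // atomic read-modify-write: no lost updates
  }
`;

// Hypothetical helper: start `workerCount` threads that all increment the
// same shared counter, then resolve with the final value.
function countWithWorkers(workerCount, increments) {
  const shared = new SharedArrayBuffer(4); // memory shared by all threads
  const counter = new Int32Array(shared);
  const finished = [];
  for (let i = 0; i < workerCount; i++) {
    finished.push(new Promise((resolve, reject) => {
      const worker = new Worker(workerSource, {
        eval: true,
        workerData: { shared, increments },
      });
      worker.on('error', reject);
      worker.on('exit', resolve);
    }));
  }
  return Promise.all(finished).then(() => Atomics.load(counter, 0));
}

countWithWorkers(4, 10000).then((total) => {
  console.log(total); // 40000
});
```

A plain `counter[0]++` in the workers would be a separate read and write. Two threads could read the same old value and one update could be lost, which is exactly the kind of hard-to-reproduce result described above.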

2. Blocking and multi-threading, non-blocking and single-threading

What is blocking? If a thread encounters disk reads/writes or network communication (collectively referred to as I/O operations) during execution, the operation usually takes a long time. The operating system then takes the CPU away from the thread, suspends its execution, and gives the resources to other worker threads. This form of thread scheduling is called blocking. When the I/O operation completes, the operating system unblocks the thread, gives it the CPU again, and lets it continue executing. This I/O mode is the usual synchronous I/O, also called blocking I/O.
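
A small sketch of blocking I/O in Node.js, using the synchronous fs.readFileSync call. The temporary file name is made up so that the example is self-contained.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file so the sketch can run on its own.
const file = path.join(os.tmpdir(), `blocking-demo-${process.pid}.txt`);
fs.writeFileSync(file, 'hello');

const order = [];
order.push('before read');
// The thread stops here until the file content is available.
const text = fs.readFileSync(file, 'utf8');
order.push(`read: ${text}`);
order.push('after read');
fs.unlinkSync(file); // clean up the scratch file

console.log(order.join('\n'));
// before read
// read: hello
// after read
```

Nothing else can run on this thread between "before read" and "read: hello", however long the disk takes.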

Correspondingly, asynchronous I/O, also called non-blocking I/O, does not block on any I/O operation. When a thread encounters an I/O operation, it does not wait in a blocked state for the operation to complete or for data to return. It only sends the I/O request to the operating system and continues with the next statement. When the operating system completes the I/O operation, it notifies the thread that issued the request in the form of an event, and the thread processes the event at a specific later time. To handle asynchronous I/O, the thread must have an event loop that constantly checks for unhandled events and processes them in turn.
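
The same read can be sketched in non-blocking style with fs.readFile. The readWithoutBlocking wrapper and the scratch file are assumptions made for the example. The point is only the order in which things happen.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: issues a non-blocking read and records the order
// in which each step runs.
function readWithoutBlocking() {
  const file = path.join(os.tmpdir(), `nonblocking-demo-${process.pid}-${Date.now()}.txt`);
  fs.writeFileSync(file, 'hello');
  const order = [];
  return new Promise((resolve, reject) => {
    order.push('request sent');
    fs.readFile(file, 'utf8', (err, text) => {
      // Runs later, when the event loop delivers the completion event.
      if (err) {
        reject(err);
        return;
      }
      order.push(`read: ${text}`);
      fs.unlinkSync(file); // clean up the scratch file
      resolve(order);
    });
    // Runs immediately: the thread did not wait for the read to finish.
    order.push('next statement');
  });
}

readWithoutBlocking().then((order) => console.log(order.join('\n')));
// request sent
// next statement
// read: hello
```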

In blocking mode, a thread can only process one task at a time, so multi-threading is necessary to improve throughput. In non-blocking mode, a thread is always performing computation. Under sustained load, utilization of the CPU core it runs on stays close to 100%, and I/O completion is reported in the form of events. In blocking mode, multi-threading often improves throughput because other threads keep working while one thread is blocked, which prevents CPU resources from being wasted by blocked threads. In non-blocking mode, the thread is never blocked by I/O and is always using the CPU, so the only remaining benefit of multi-threading is the use of more cores on a multi-core CPU, and Node.js can obtain that benefit while keeping each process single-threaded. This is why Node.js uses a single-threaded, non-blocking, event-based programming model.

The following figures show multi-threaded synchronous I/O and single-threaded asynchronous I/O. Suppose a job can be divided into two computational parts and one I/O part, and the I/O part takes much more time than the computation (which is usually the case). With blocking I/O, we must start multiple threads to achieve high concurrency. With asynchronous I/O, a single thread can do the job.


Figure: multi-threaded synchronous I/O

Figure: single-threaded asynchronous I/O
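
The job described above (two computational parts around one slow I/O part) can be sketched on a single thread as follows. The 50 ms timer stands in for the I/O part, and the job and runJobs names are made up for illustration.

```javascript
// Stand-in for a slow I/O operation (assumption: 50 ms, no real disk or network).
function simulatedIo(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// One job: compute, wait for I/O without blocking the thread, compute again.
async function job(id, log) {
  log.push(`job ${id}: compute part 1`);
  await simulatedIo(50);
  log.push(`job ${id}: compute part 2`);
}

// Start all jobs on the same thread and wait until every one has finished.
async function runJobs(count) {
  const log = [];
  const jobs = Array.from({ length: count }, (_, i) => job(i + 1, log));
  await Promise.all(jobs);
  return log;
}

runJobs(3).then((log) => console.log(log.join('\n')));
```

All three jobs finish their first computational part before any of them reaches the second, because the single thread moves on to the next job while the earlier ones wait for their simulated I/O.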


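What exactly makes single-threaded, event-driven asynchronous I/O better than traditional multi-threaded blocking I/O? In short, asynchronous I/O avoids the overhead of multiple threads. For the operating system, creating a thread is expensive: it must allocate memory for the thread and add it to the scheduler. A thread switch may also involve memory paging, and the CPU cache is flushed, so when the thread is switched back in it has to reload its data from memory, which destroys data locality.

Of course, the drawback of asynchronous programming is that it does not match the way people usually think about programs. It easily makes control flow obscure, which creates considerable difficulty for both coding and debugging. Developers used to the traditional model often feel lost when they first meet a large asynchronous application, although this improves with practice. Asynchronous programming is still comparatively hard, but fortunately there are now quite a few libraries dedicated to solving its problems (such as async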


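Table: Characteristics of synchronous and asynchronous I/O

Synchronous I/O (blocking) | Asynchronous I/O (non-blocking)
---------------------------+--------------------------------
Uses multiple threads to provide throughput | A single thread is enough for high throughput
Uses multi-core CPUs through time slicing and thread scheduling | Uses multi-core CPUs through functional partitioning
Needs the operating system to schedule threads across cores | A single process can be bound to a single core
Hard to make full use of CPU resources | Can make full use of CPU resources
Large memory footprint, weak data locality | Small memory footprint, strong data locality
Matches linear programming thinking | Does not match traditional programming thinking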

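3. Callback functions

        Callbacks are the most direct expression of asynchronous programming in Node.js. Asynchronous programming relies on callbacks, but using a callback does not by itself make a program asynchronous. A callback is called once its task has completed. Node makes heavy use of callbacks, and nearly all of its asynchronous APIs accept one.

        For example, we can read a file while executing other commands. When the read completes, the file content is passed to the callback as an argument. The code therefore never blocks or waits on file I/O, which greatly improves the performance of Node.js and lets it handle a large number of concurrent requests.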
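
The point that a callback alone does not make code asynchronous can be checked with a short sketch. The callbackOrder function is made up for the example. It compares the callback of Array.prototype.forEach, which runs synchronously, with the callback of setTimeout, which runs later.

```javascript
// Hypothetical demo: records when each kind of callback actually runs.
function callbackOrder() {
  const order = [];
  return new Promise((resolve) => {
    // Synchronous callback: forEach calls it right away, before moving on.
    [1, 2].forEach((n) => order.push(`sync callback ${n}`));
    order.push('after forEach');

    // Asynchronous callback: setTimeout only schedules it for a later turn
    // of the event loop.
    setTimeout(() => {
      order.push('async callback');
      resolve(order);
    }, 0);
    order.push('after setTimeout');
  });
}

callbackOrder().then((order) => console.log(order.join('\n')));
// sync callback 1
// sync callback 2
// after forEach
// after setTimeout
// async callback
```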


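4. Event loop

    Node.js is a single-process, single-threaded application, but it supports concurrency through events and callbacks, so its performance is very high.

    Almost every I/O API in Node.js is asynchronous. The JavaScript code itself runs on a single thread, while the I/O work is handed off to the operating system or to a background thread pool, and concurrency is handled through asynchronous function calls.

    Almost all event mechanisms in Node.js are implemented with the observer design pattern.

    The single Node.js thread enters something like a while(true) event loop and exits only when no event observers remain. Each asynchronous event generates an event observer, and when an event occurs its callback function is called.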
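
The loop described above can be pictured with a toy model. This is only a sketch of the idea, not Node's real implementation, and the ToyEventLoop class and event names are invented for illustration.

```javascript
// Toy model of an event loop: a queue of pending observers, each holding a
// callback to run when its (simulated) event occurs.
class ToyEventLoop {
  constructor() {
    this.pending = [];
    this.log = [];
  }

  // Register an observer for a simulated event.
  schedule(name, callback) {
    this.pending.push({ name, callback });
  }

  // Keep looping until no observers remain, then exit.
  run() {
    while (this.pending.length > 0) {
      const { name, callback } = this.pending.shift();
      this.log.push(`event: ${name}`);
      callback(this); // a callback may register further observers
    }
    return this.log;
  }
}

const loop = new ToyEventLoop();
loop.schedule('request', (l) => l.schedule('file read', () => {}));
loop.schedule('timer', () => {});
console.log(loop.run());
// [ 'event: request', 'event: timer', 'event: file read' ]
```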


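    Event-driven programs

            Node.js uses an event-driven model. When the web server receives a request, it hands the request off for processing and moves on to serve the next web request. When the request completes, it is put back into the processing queue, and when it reaches the head of the queue its result is returned to the user.

            This model is very efficient and scales very well, because the web server keeps accepting requests without waiting for any read or write operation. (This is also called non-blocking I/O or event-driven I/O.)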
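
A minimal sketch of such a server with the built-in http module. The 20 ms timer stands in for a slow read or write, and createDemoServer, get and demo are helper names made up for the example.

```javascript
const http = require('http');

// Hypothetical server: the timer stands in for a database or file read.
function createDemoServer() {
  return http.createServer((req, res) => {
    setTimeout(() => {
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      res.end(`done: ${req.url}`);
    }, 20);
    // The handler returns at once, so the server is free to accept the
    // next request while this one is still waiting.
  });
}

// Demo helper: GET a path and resolve with the response body.
function get(port, path) {
  return new Promise((resolve, reject) => {
    http.get({ host: '127.0.0.1', port, path, agent: false }, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve(body));
    }).on('error', reject);
  });
}

// Start the server on a free port, send two requests at once, then stop.
async function demo() {
  const server = createDemoServer();
  await new Promise((resolve) => server.listen(0, '127.0.0.1', resolve));
  const { port } = server.address();
  const bodies = await Promise.all([get(port, '/a'), get(port, '/b')]);
  server.close();
  return bodies;
}

demo().then((bodies) => console.log(bodies));
// [ 'done: /a', 'done: /b' ]
```

Both requests are accepted before either response is sent, which is the "keeps accepting requests without waiting" behaviour described above.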

In the event-driven model, a main loop is created to listen for events, and a callback function is triggered when an event is detected.



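That is the whole event-driven flow, and it is very simple. It resembles the observer pattern: the event is the subject, and all the handler functions registered on the event are the observers.

Node.js has many built-in events. We can bind and listen for events by importing the events module and instantiating the EventEmitter class, as in the following example:

// Import the events module
const events = require('events');
// Create an eventEmitter object
const eventEmitter = new events.EventEmitter();

The following statement binds an event handler:

// Bind the event to its handler
eventEmitter.on('eventName', eventHandler);

We can trigger the event from code:

// Trigger the event
eventEmitter.emit('eventName');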

Example

Create a main.js file with the following code:

// Import the events module
const events = require('events');
// Create an eventEmitter object
const eventEmitter = new events.EventEmitter();

// Create the event handler
const connectHandler = function connected() {
   console.log('Connection succeeded.');

   // Trigger the data_received event
   eventEmitter.emit('data_received');
};

// Bind the connection event to its handler
eventEmitter.on('connection', connectHandler);

// Bind the data_received event to an anonymous function
eventEmitter.on('data_received', function () {
   console.log('Data received successfully.');
});

// Trigger the connection event
eventEmitter.emit('connection');

console.log('Program finished.');

Next, let's run the code above:

$ node main.js
Connection succeeded.
Data received successfully.
Program finished.
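
The last line is printed last because emit() calls the registered listeners synchronously, in registration order, before it returns. The sketch below shows this, together with passing an argument to listeners and registering a one-time listener with once(). The listener labels and values are made up for the example.

```javascript
const { EventEmitter } = require('events');

const emitter = new EventEmitter();
const order = [];

// A regular listener and a one-time listener on the same event.
emitter.on('data_received', (bytes) => order.push(`listener A got ${bytes}`));
emitter.once('data_received', (bytes) => order.push(`listener B (once) got ${bytes}`));

order.push('before emit');
// emit() runs both listeners before returning; it returns true because
// the event had listeners.
const hadListeners = emitter.emit('data_received', 42);
order.push('after emit');

// Listener B was removed after its first call, so only A runs this time.
emitter.emit('data_received', 7);

console.log(hadListeners); // true
console.log(order.join('\n'));
// before emit
// listener A got 42
// listener B (once) got 42
// after emit
// listener A got 7
```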

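How does a Node application work?

In a Node application, a function that performs an asynchronous operation takes a callback function as its last argument, and the callback receives an error object as its first argument.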
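
A minimal sketch of a function written to this convention. divideAsync is a hypothetical function, not part of Node, and setImmediate is used only to make the callback run asynchronously.

```javascript
// Hypothetical asynchronous function following the convention:
// callback is the last parameter, and its first argument is the error.
function divideAsync(a, b, callback) {
  setImmediate(() => {
    if (b === 0) {
      callback(new Error('division by zero')); // failure: error only
      return;
    }
    callback(null, a / b); // success: error is null, result comes second
  });
}

divideAsync(6, 3, (err, result) => {
  if (err) {
    console.log(err.message);
    return;
  }
  console.log(result); // 2
});

divideAsync(1, 0, (err) => {
  console.log(err.message); // division by zero
});
```

Checking err first and returning early keeps the success path free of error handling.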

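Next, let's look again at the earlier example. Create an input.txt file with the following content:

源宝网 address: www.ybao.org

Create a main.js file with the following code:

const fs = require('fs');

// Read the file asynchronously; the callback runs once the read has finished
fs.readFile('input.txt', function (err, data) {
   if (err) {
      console.log(err.stack);
      return;
   }
   console.log(data.toString());
});
console.log('Program finished');

In the program above, fs.readFile() is an asynchronous function that reads a file. If an error occurs while reading, the err object is passed to the callback, which prints the error information.

If no error occurs, the callback skips the err branch and prints the file content.

Running the code above produces the following result:

Program finished
源宝网 address: www.ybao.org

Next, delete input.txt and run the code again. The result is:

Program finished
Error: ENOENT, open 'input.txt'

Because input.txt does not exist, the error information is printed.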
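
For comparison, the same missing-file case can be sketched with the promise-based fs.promises.readFile and async/await. The readOrReport helper and the temporary path are assumptions made for the example. The ENOENT value comes from the code property of the error object.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read a file and report either its content or the
// error code, without throwing.
async function readOrReport(file) {
  try {
    const data = await fs.promises.readFile(file, 'utf8');
    return { ok: true, data };
  } catch (err) {
    return { ok: false, code: err.code };
  }
}

// A path that is assumed not to exist.
const missing = path.join(os.tmpdir(), `no-such-file-${process.pid}-${Date.now()}.txt`);

readOrReport(missing).then((result) => console.log(result));
// { ok: false, code: 'ENOENT' }
```

The control flow reads top to bottom like synchronous code, which eases the readability problem of nested callbacks mentioned earlier, while the read itself is still non-blocking.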






        
