Learning the child_process module in Node.js

Foreword

Before looking at child_process, let's first review a few basic operating-system concepts and how they relate to each other.

  1. CPU: A computer consists of five basic hardware components: the arithmetic unit, the controller, memory, input devices, and output devices. The arithmetic unit and the controller are integrated into the central processing unit (CPU), whose main job is to execute instructions and write back the results.
  2. Process: A process is the basic unit of resource allocation and scheduling in the system. A single CPU core can run only one process at a time, and each process has its own independent memory. Yet a running computer clearly has more than one process. In a modern operating system, processes take turns using the CPU, and because the CPU switches between them so quickly, it feels as if multiple tasks are executing at the same time.
  3. Thread: A thread is an execution unit inside a process. A process can have multiple threads, and every thread belongs to exactly one process. Threads do not own system resources themselves; they share the resources of their process. Because of this sharing, access to shared memory usually has to be synchronized: while one thread is modifying a piece of shared memory, other threads typically have to wait until it is done.

We know that Node.js runs JavaScript on a single thread. When we start a service with node app.js, a node process runs on the server, and our JS code runs in only one of its threads. Node.js is designed to hand time-consuming operations over to the operating system or to other threads. These are typically asynchronous I/O operations such as disk I/O and network I/O, which keeps them off the main thread. Although Node.js does not support creating threads at the language level, we can use the child_process module to create a new process for time-consuming, resource-intensive work, such as running a shell script that uploads or downloads large files, and then send the result back to the main process.

This article introduces the child_process module in Node.js. Before getting to the module itself, let's look at an example.

const http = require('http');

// CPU-bound work that blocks the event loop while it runs
const longComputation = () => {
    let sum = 0;
    for (let i = 0; i < 1e10; i++) {
        sum += i;
    }
    return sum;
};

const server = http.createServer();
server.on('request', (req, res) => {
    if (req.url === '/compute') {
        const sum = longComputation();
        return res.end(`Sum is ${sum}`);
    } else {
        res.end('Ok');
    }
});

server.listen(3000);

You can start the Node.js service with the code above and then open two browser tabs, one for /compute and one for /. You will find that while the service is handling a /compute request, it is busy with a long numerical computation and cannot respond to other requests (such as /).
In Java, this problem can be solved with multithreading, but Node.js executes JavaScript code on a single thread, so how should Node.js handle it? Node.js can create a child process to run CPU-intensive tasks (such as longComputation in the example above), and the child_process module is what creates those child processes.

child_process provides several ways to create child processes:

  • Asynchronous methods: spawn, exec, execFile, fork
  • Synchronous methods: spawnSync, execSync, execFileSync
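
To make the difference between the two groups concrete, here is a minimal sketch that runs the same command both ways. It uses process.execPath (the path of the running node binary) only so that the example does not depend on where node is installed; that choice is an assumption of this sketch, not something taken from the examples in this article.

```javascript
const { execFile, execFileSync } = require('child_process');

// Synchronous: blocks the event loop until the child exits, then returns its stdout.
const syncOutput = execFileSync(process.execPath, ['-v'], { encoding: 'utf8' });
console.log('sync:', syncOutput.trim());

// Asynchronous: returns immediately; the callback runs after the child has finished.
execFile(process.execPath, ['-v'], (error, stdout) => {
  if (error) throw error;
  console.log('async:', stdout.trim());
});

console.log('this line runs before the async callback');
```

The synchronous variants are convenient in short scripts, but in a server they would block every other request while the child runs, so the asynchronous methods are usually the safer default there.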

exec

Syntax: child_process.exec(command[, options][, callback])

The first parameter, command, is the command to run in the shell. options sets parameters related to execution, such as cwd (current working directory), shell (the shell used to run the command), uid, gid, and encoding. callback is called once the command has finished, and the command's output is available through its stdout argument. Both options and callback are optional. For example, to run "ls -l" in the /Users/liu/Desktop directory:

const { exec } = require('child_process');

exec('ls -l', { cwd: '/Users/liu/Desktop' }, (error, stdout, stderr) => {
  if (error) return;
  console.log('stdout:', stdout);
});

// stdout: total 4792
// -rw-r--r--@ 1 liuchongyang  staff   53248 May 24  2022 10起诉状(一审).doc
// -rw-r--r--@ 1 liuchongyang  staff   34816 May 24  2022 11答辩状(一审).doc
// -rw-r--r--@ 1 liuchongyang  staff   33280 May 24  2022 12质证意见(一审).doc
// -rw-r--r--@ 1 liuchongyang  staff   65024 May 24  2022 13代理词(一审营).doc
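
The example above returns silently when error is set. As a rough sketch of what that error object carries, the snippet below wraps exec in a Promise and reports the exit code. The run helper is hypothetical (it is not part of child_process), and the two commands are chosen only because they behave the same in common shells.

```javascript
const { exec } = require('child_process');

// Hypothetical helper: wraps exec in a Promise and reports the exit code
// instead of rejecting, so callers can inspect failures.
function run(command, options = {}) {
  return new Promise((resolve) => {
    exec(command, options, (error, stdout, stderr) => {
      // When the command exits with a non-zero status, error.code holds that status.
      // (If the process was killed or could not be started, error.code is not an exit status.)
      const exitCode = error ? error.code : 0;
      resolve({ exitCode, stdout, stderr });
    });
  });
}

run('echo hello').then((result) => {
  console.log(result.exitCode, result.stdout.trim()); // 0 hello
});

run('exit 3').then((result) => {
  console.log(result.exitCode); // 3
});
```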

execFile

Syntax: child_process.execFile(file[, args][, options][, callback])

execFile, as the name implies, runs an executable file. On Unix-like operating systems, execFile is typically more efficient than exec because it does not spawn a shell. On Windows, .bat and .cmd files cannot be executed on their own without a terminal, so execFile cannot run them; use exec, or spawn (described below), instead.

The file parameter of execFile is required and specifies the file to execute. args is optional and is the list of arguments passed to that file. options and callback are similar to those of exec, so they are not repeated here. For example, to check the node version:

const { execFile } = require('child_process');

execFile('/usr/local/bin/node', ['-v'], (error, stdout, stderr) => {
  if (error) return;
  console.log('stdout:', stdout);
});

// stdout: v12.17.0
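
One practical consequence of execFile not spawning a shell is that its arguments are not interpreted as shell syntax. The following sketch illustrates this under a few assumptions: the child is an inline node script started through process.execPath, and the echoArgument helper and the sample string are made up for the illustration.

```javascript
const { execFile } = require('child_process');

// Looks like shell syntax, but execFile hands it to the program as one plain argument.
const suspicious = 'hello; echo injected';

// The child is a tiny inline node script that prints its last argument unchanged.
const printLastArg = 'console.log(process.argv[process.argv.length - 1])';

// Hypothetical helper: resolves with whatever the child printed.
function echoArgument(arg) {
  return new Promise((resolve, reject) => {
    execFile(process.execPath, ['-e', printLastArg, arg], (error, stdout) => {
      if (error) return reject(error);
      resolve(stdout.trim());
    });
  });
}

echoArgument(suspicious).then((output) => {
  console.log(output); // hello; echo injected
});
```

Passing the same string inside an exec command line would let the shell treat the semicolon as a command separator, which is why execFile tends to be the more cautious choice when arguments come from user input.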

spawn

Syntax: child_process.spawn(command[, args][, options])

spawn is similar to exec in that it executes a command. However, spawn does not deliver stdout through a callback; instead, you obtain standard output by listening for the data event on the stdout stream of the child process object. Because stdout is delivered as a stream, data can be processed as it arrives, which suits large outputs better than exec, which buffers everything and invokes the callback only after the command has finished.

Following the exec example, here is how to run "ls -l" in the /Users/ben directory:

const { spawn } = require('child_process');

const subprocess = spawn('ls', ['-l'], { cwd: '/Users/ben' });

subprocess.stdout.on('data', (data) => {
  console.log(data.toString());
});

// total 8
// drwx------@  7 ben  staff   224  7 28 09:36 Applications
// drwx------@ 25 ben  staff   800  7 31 22:19 Desktop
// drwx------@ 24 ben  staff   768  6 21 17:18 Documents
// drwx------@ 79 ben  staff  2528  7 31 20:25 Downloads
// ...
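
The example above only reads stdout. A slightly fuller sketch, shown below, also collects stderr and waits for the close event to learn the exit code. The runStreaming helper and the inline child script are hypothetical and exist only for this illustration; process.execPath is used so the sketch does not depend on any particular command being installed.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: collects streamed output and resolves when the child closes.
function runStreaming(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    const chunks = [];
    let errorText = '';

    // Output arrives in Buffer chunks, possibly many of them for large outputs.
    child.stdout.on('data', (chunk) => chunks.push(chunk));
    child.stderr.on('data', (chunk) => { errorText += chunk; });

    // 'error' fires if the process could not be started at all.
    child.on('error', reject);

    // 'close' fires after the process has ended and its stdio streams are closed.
    child.on('close', (code) => {
      resolve({ code, stdout: Buffer.concat(chunks).toString(), stderr: errorText });
    });
  });
}

// Inline child script that writes to both streams and exits with status 2.
const script = 'console.log("out"); console.error("err"); process.exitCode = 2;';

runStreaming(process.execPath, ['-e', script]).then((result) => {
  console.log(result.code, result.stdout.trim(), result.stderr.trim()); // 2 out err
});
```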

fork

Syntax: child_process.fork(modulePath[, args][, options])

fork is a special case of spawn: its first parameter is the path of a Node.js module, and the args and options parameters work much the same way as in spawn. Because fork always runs a Node.js module, it provides one extra feature: an IPC channel is established between the parent and child processes, so they can send messages to each other with the send() method. For example:

child.js

// child.js
process.on('message', function (msg) {
  console.log('child receive msg:', msg); // child receive msg: hello world
  process.send(msg);
});

parent.js

// parent.js
const cp = require('child_process');
const child = cp.fork('./child');
child.on('message', function (msg) {
  console.log('parent get msg:', msg); // parent get msg: hello world
});
child.send('hello world');

Because fork sets up a communication channel between the parent and child processes, a synchronous version would make little sense: the parent would be blocked until the child exited and could not exchange messages over the IPC channel. This is why fork has no synchronous counterpart.
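
Returning to the /compute example from the beginning of the article, the sketch below shows one way the long computation could be moved into a forked child. Several things here are assumptions made for illustration: the computeInChild helper, the worker being written to a temporary file (in a real project it would simply be its own module, for example compute.js), and the loop limit being passed in as a message.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical worker module. It is written to a temp file only to keep this
// sketch self-contained; normally it would be a separate file in the project.
const workerSource = `
process.on('message', (limit) => {
  let sum = 0;
  for (let i = 0; i < limit; i++) {
    sum += i;
  }
  process.send(sum);
});
`;
const workerDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cp-demo-'));
const workerPath = path.join(workerDir, 'compute.js');
fs.writeFileSync(workerPath, workerSource);

// Hypothetical helper: runs the loop in a child process and resolves with the sum.
function computeInChild(limit) {
  return new Promise((resolve, reject) => {
    const child = fork(workerPath);
    child.once('message', (sum) => {
      child.kill(); // the open IPC channel would otherwise keep the child alive
      resolve(sum);
    });
    child.once('error', reject);
    child.send(limit);
  });
}

computeInChild(1000).then((sum) => {
  console.log('Sum is', sum); // Sum is 499500
});
```

In the request handler of the first example, the /compute branch could call computeInChild(1e10) and send the response from its then callback, which leaves the main process free to answer requests for / while the child is busy.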

Summary

Of the four methods for creating child processes, spawn and fork are the most commonly used. spawn runs operating system commands; fork runs Node.js modules (and establishes an IPC channel between the parent and child processes for communication); exec runs commands directly in a shell, which makes shell features such as pipes convenient, but its output is delivered all at once in the callback, so it is a poor fit when the output is very large; execFile cannot directly run .bat and .cmd files on Windows.

Origin blog.csdn.net/woyebuzhidao321/article/details/129567942