nodejs stream module

foreword

A "stream" is a way of working with buffered data. The operating system reads data in chunks and places each chunk in a buffer as it arrives. A Node application can consume that buffer in two ways. The first is the traditional way of reading files: wait until all the data has arrived, then read it from the buffer in one go. The second is the "data stream" way: read each chunk as it is received, so processing starts before all the data has arrived.

The first way reads all the data into memory before processing it. The advantage is that it is intuitive and the flow is natural; the disadvantage is that with a large file it takes a long time before processing can even begin. The second way reads only a small piece of data at a time, like "flowing water": every time the system reads a chunk, it emits an event signalling "a new data block". As long as the application listens for this event, it can track the progress of the read and process each chunk as it arrives, which improves the performance of the program.

In nodejs, many I/O objects implement the Stream interface, such as:

  1. file read and write
  2. Reading and writing HTTP requests
  3. TCP connection
  4. standard input and output

The stream interface can be divided into three categories: readable streams, writable streams, and bidirectional (duplex) streams.

readable stream

Readable

var Readable = require('stream').Readable;
var rs = new Readable();
rs.push('beep ');
rs.push('boop\n');
rs.push(null);

rs.pipe(process.stdout); // beep boop

A readable stream has two states: a flowing state and a paused state.
In the flowing state, data is being delivered; in the paused state, delivery is suspended.

Four ways to switch from the paused state to the flowing state

1. Add a 'data' event listener

var fs = require('fs');
var readableStream = fs.createReadStream('url.js');
var data = '';

readableStream.setEncoding('utf8'); // still in the paused state

readableStream.on('data', function(chunk) {
  data += chunk;
}); // switches from paused to flowing; data starts accumulating

readableStream.on('end', function() {
  console.log(data);
});

2. Call the resume method

var fs = require('fs');
var readable = fs.createReadStream('url.js'); // paused state
readable.resume(); // force the switch from paused to flowing
readable.on('end', function(chunk) {
  console.log('Reached the end of the stream; no data was consumed');
});

3. Call the pipe method to send data to a writable data stream

var fs = require("fs");
fs.createReadStream("url.js").pipe(process.stdout);

4. Explicitly call stream.read()

var fs = require('fs');
var readableStream = fs.createReadStream('url.js');
var data = '';
var chunk;

readableStream.setEncoding('utf8');

// The 'readable' event means there is data in the internal buffer; use read() to pull it out.
// read() returns null when there is no data left to read.
readableStream.on('readable', function() {
  while ((chunk = readableStream.read()) !== null) {
    data += chunk;
  }
});

readableStream.on('end', function() {
  console.log(data);
});

Two ways to switch from the flowing state to the paused state

1. When there is no pipe destination, call the pause method

var fs = require('fs');
var readableStream = fs.createReadStream('url.js');
readableStream.on('data', function(chunk) {
  console.log('Read %d bytes of data', chunk.length);
  readableStream.pause(); // now paused
  console.log('No data will be read for the next second');
  setTimeout(function() {
    console.log('Resuming the read');
    readableStream.resume(); // back from paused to flowing
  }, 1000);
});

2. When pipe destinations exist, remove all 'data' event listeners, and call the unpipe method to remove every pipe destination

var fs = require('fs');
var readableStream = fs.createReadStream('url.js');
readableStream.pipe(process.stdout);
setTimeout(function() {
  console.log('Stop piping url.js');
  readableStream.unpipe(process.stdout); // with the pipe removed, no more data flows; the stream is paused
}, 0);

Common methods and properties for readable data streams

readable property
The readable property of a readable stream returns a Boolean: true if it is safe to read from the stream, false otherwise.

var fs = require('fs');
var readableStream = fs.createReadStream('url.js');
console.log(readableStream.readable) // true

read()
The read method reads and returns data from the internal buffer. It returns null if no data is available.

var fs = require("fs");
var readableStream = fs.createReadStream("url.js");
readableStream.on("readable", function () {
  var chunk;
  while (null !== (chunk = readableStream.read())) {
    console.log("got %d bytes of data", chunk.length);
  }
});

_read()
The _read method is how a readable stream produces its own data. It is called whenever the stream wants more data, so it will be invoked repeatedly; if the implementation never pushes null, the stream never ends.

var Readable = require("stream").Readable;
var rs = Readable();
var count = 0;

rs._read = function () {
  if (count < 10) {
    rs.push(count + "dx ");
    count++;
  } else {
    // pushing null signals the end of the stream
    rs.push(null);
  }
};

rs.pipe(process.stdout); // 0dx 1dx 2dx 3dx 4dx 5dx 6dx 7dx 8dx 9dx

setEncoding()
Calling this method makes the stream return strings in the specified encoding instead of raw Buffer objects. For example, setEncoding('utf8') makes the stream return UTF-8 strings, and setEncoding('hex') makes it return hexadecimal strings.
The parameter of setEncoding is the string encoding, such as utf8, ascii, base64, hex, etc.

var Readable = require("stream").Readable;
var rs = Readable();
var count = 0;

rs._read = function () {
  if (count < 2) {
    rs.push(count + "dx ");
    count++;
  } else {
    // pushing null signals the end of the stream
    rs.push(null);
  }
};

// rs.setEncoding('hex')
// rs.pipe(process.stdout); // 3064782031647820

rs.setEncoding('utf8')
rs.pipe(process.stdout); // 0dx 1dx

resume() and pause()
The resume method makes a readable stream resume emitting 'data' events, i.e. switch to the flowing state.
The pause method makes a flowing stream stop emitting 'data' events and enter the paused state. Any data that is already available stays in the internal buffer.

var fs = require('fs');
var readableStream = fs.createReadStream('url.js');
readableStream.on('data', function(chunk) {
  console.log('Read %d bytes of data', chunk.length);
  readableStream.pause(); // now paused
  console.log('No data will be read for the next second');
  setTimeout(function() {
    console.log('Resuming the read');
    readableStream.resume(); // back from paused to flowing
  }, 1000);
});

isPaused()
This method returns a Boolean indicating whether the readable stream has been explicitly paused by client code (that is, whether the pause method was called).

var readable = require("stream").Readable()
console.log(readable.isPaused())  // === false
readable.pause()
console.log(readable.isPaused()) // === true
readable.resume()
console.log(readable.isPaused()) // === false

pipe() and unpipe()
The pipe method is a mechanism for transferring data automatically, like a pipe: it reads all data from the readable stream and writes it out to the specified destination, with the whole process handled automatically.

The unpipe method removes a destination set by the pipe method. With no argument, it removes all pipe destinations; with an argument, it removes only the specified destination, and has no effect if no destination matches.

The url.js file can be copied in the following way

var fs = require('fs');
var readableStream = fs.createReadStream('url.js');
var writableStream = fs.createWriteStream('url1.js');
// pipe must be called on a readable stream, and its argument must be a writable stream.
readableStream.pipe(writableStream);

var fs = require('fs');
var zlib = require('zlib');

fs.createReadStream('url.js')
  .pipe(zlib.createGzip())
  .pipe(fs.createWriteStream('url.js.gz'));

The above code uses chained calls: it reads the file, compresses it, and writes out the result.

var fs = require("fs");
var readable = fs.createReadStream("url.js");
var writable = fs.createWriteStream("url1.js");
readable.pipe(writable);
setTimeout(function () {
  console.log("Stop writing to url1.js");
  readable.unpipe(writable);
  console.log("Manually close the write stream for url1.js");
  writable.end();
}, 1);

Event listening for readable streams

Mainly five events: readable, data, end, close, error.

  1. The readable event fires when the stream has data available to be read.
  2. For streams that have not been explicitly paused, adding a 'data' event listener switches the stream into the flowing state and delivers data as soon as possible.
  3. When there is no more data to read, the end event fires.
  4. The close event fires when the underlying data source is closed. Not all streams emit this event.
  5. If an error occurs while reading data, the error event fires.

The readable, data, and end events have already appeared in the examples above; next, verify close and error.

close

var fs = require("fs");
var readable = fs.createReadStream("url.js");
readable.resume(); // start reading data
readable.on("close", function () {
  console.log("Data source closed"); // fires when the source has been fully read and closed
});

error

var fs = require("fs");
var readable = fs.createReadStream("url2.js"); // url2 does not exist, so this errors
readable.resume(); // start reading data
readable.on("error", function (error) {
  console.log(error); // no such file or directory, open 'D:\Codes\Test\url2.js'....
  console.log("url2 does not exist");
});

writable data stream

  1. HTTP requests on the client
http.request(options, (res) => {
  // ...
});
  2. HTTP responses on the server
var server = http.createServer(function (req, res) {
  // ...
});
  3. fs write streams
var writableStream = fs.createWriteStream('file2.txt');
  4. zlib streams
zlib.createGzip()
  5. crypto streams
const crypto = require('crypto');
const cipher = crypto.createCipher('aes192', 'a password');

let encrypted = '';
cipher.on('readable', () => {
  const data = cipher.read();
  if (data)
    encrypted += data.toString('hex');
});
cipher.on('end', () => {
  console.log(encrypted);
  // Prints: ca981be48e90867604588e75d04feabb63cc007a8f8ad89b10616ed84d815504
});

cipher.write('some clear text data');
cipher.end();
  6. tcp sockets
const net = require('net');
const server = net.createServer((c) => {
  // 'connection' listener
  console.log('client connected');
  c.on('end', () => {
    console.log('client disconnected');
  });
  c.write('hello\r\n');
  c.pipe(c);
});
  7. child process stdin
  8. process.stdout, process.stderr
fs.createReadStream("url.js").pipe(process.stdout)

Methods and properties of writable data streams

writable property
The writable property returns a Boolean value. Returns true if the data stream is still open and writable, otherwise returns false.

var fs = require("fs");
var writableStream = fs.createWriteStream('url1.js');
console.log(writableStream.writable) // true

write()
It accepts up to three arguments: the content to write, which can be a string or a Buffer object (representing binary data); an optional encoding string for string chunks; and an optional callback invoked once the chunk has been flushed.

var fs = require("fs");
var writableStream = fs.createWriteStream("url1.txt");
// writableStream.write("dx ", function () {
//   console.log("dx has been written");
// });
writableStream.write("dx1 ", "ascii");
writableStream.end("yx\n");

cork(), uncork()
The cork method forces subsequent writes to be held in memory.
When the uncork method or the end method is called, the buffered data is flushed out.

var fs = require("fs");
var stream = fs.createWriteStream("url1.txt");
stream.cork();
stream.write('some ');
stream.write('data ');
process.nextTick(() => stream.uncork());

setDefaultEncoding()
The setDefaultEncoding method sets the default encoding used when strings are written to the stream.

var fs = require("fs");
var writableStream = fs.createWriteStream("url1.js");
writableStream.write("dx1 ");
writableStream.setDefaultEncoding("base64url");
writableStream.write("yx"); // yx will be written using the base64url encoding

end()
The end method terminates a writable stream. It accepts three parameters, all optional.
The first is a final chunk to write, which can be a string or a Buffer object; the second
is the encoding of that chunk;
the third is a callback function that is triggered when the finish event fires.

var fs = require("fs");
var writableStream = fs.createWriteStream("url1.js");
writableStream.write("dx1 ");
writableStream.end("yx1", 'ascii', function() {
  console.log('Writing finished');
});

Events for writable streams

drain event
After writable.write(chunk) returns false, once all buffered data has been written and writing can continue, the drain event fires, indicating that the buffer is empty.

var fs = require("fs");
var writableStream = fs.createWriteStream("url1.js");

function writeOneMillionTimes(writer, data, encoding, callback) {
  var i = 1000000;
  write();
  function write() {
    var ok = true;
    do {
      i -= 1;
      if (i === 0) {
        // last chunk: pass the callback
        writer.write(data, encoding, callback);
      } else {
        // keep writing until the buffer is full (write returns false)
        ok = writer.write(data, encoding);
      }
    } while (i > 0 && ok);
    if (i > 0) {
      // the buffer filled up early; resume writing once it drains
      writer.once('drain', function () {
        console.log(i);
        write();
      });
    }
  }
}

writeOneMillionTimes(writableStream, 'write1', 'utf-8', function () {
  console.log('All writes completed');
});

finish event
After the end method is called, once all buffered data has been flushed, the finish event fires.

var fs = require("fs");
var writableStream = fs.createWriteStream("url1.js");
writableStream.write("dx1 ");
writableStream.end("yx1", 'ascii', function() {
  console.log('Writing finished');
});

writableStream.on('finish', function() {
  console.log('end was called; all done');
});

pipe event and unpipe event
These are straightforward: the pipe event fires on the writable stream when pipe() is called with it as the destination, and the unpipe event fires when unpipe() removes it.

var fs = require("fs");
var readable = fs.createReadStream("url.js");
var writable = fs.createWriteStream("url1.js");

writable.on("pipe", function () {
  console.log("pipe fired");
});

writable.on("unpipe", function () {
  console.log("unpipe fired");
});

readable.pipe(writable);
setTimeout(function () {
  console.log("Stop writing to url1.js");
  readable.unpipe(writable);
  console.log("Manually close the write stream for url1.js");
  writable.end();
}, 1);

error event
Fired when an error occurs while writing data or piping. The callback receives a single Error argument.

error handling

This requires the additional package on-finished, installed from the command line:

npm install on-finished

The following not only handles errors from the streams themselves,
but also ensures the source stream is destroyed when the client aborts the download; otherwise the response's close event would leave the stream waiting forever and leak memory.

var onFinished = require("on-finished");
var http = require("http");
var fs = require("fs");
var zlib = require("zlib");

http.createServer(function (req, res) {
  // set the content headers
  var stream = fs.createReadStream("filename.txt");
  stream
    .on("error", onerror)
    .pipe(zlib.createGzip())
    .on("error", onerror)
    .pipe(res);

  function onerror(err) {
    console.error(err.stack);
    // shut down this stream completely
    stream.destroy();
  }

  onFinished(res, function () {
    // make sure the stream is always destroyed
    stream.destroy();
  });
});

Other content related to nodejs

nodejs commonjs introduction
nodejs fs module introduction
nodejs path module introduction
nodejs events module introduction
nodejs http module introduction
nodejs net module introduction
nodejs url module introduction
nodejs process module introduction
nodejs buffer module

Origin blog.csdn.net/glorydx/article/details/129165518