Node.js streams by examples
Streams are a powerful concept, but they are not widely understood by developers.
I have found myself googling and relearning the concept every time I needed to work with streams in Node.js. This article is mostly a reminder for my future self.
This article focuses on applying streams with text content. Code in this article is available in this GitHub repo.
Why streams?
A few cases where streams really shine:
- Processing large files that do not fit into the computer’s memory. Streams let you read a file and process each chunk as the data arrives in your program.
- Not all of the input is available yet. When you write a command-line program for users to interact with, input only becomes available over time.
- Reducing latency and improving user experience. Thanks to streams, your Netflix movies play (almost) immediately. The Netflix app on your PC/TV/browser shows you the content whilst it’s still downloading the rest of the movie.
- Reducing bandwidth. You can show the user the interim download result and terminate the download if it’s not the right one. It would be a waste of bandwidth to download a whole large file only to find out it’s the wrong one.
Four types of streams
- Readable: streams from which data can be read, e.g. fs.createReadStream()
- Writable: streams to which data can be written, e.g. fs.createWriteStream()
- Duplex: streams that are both Readable and Writable, e.g. net.Socket
- Transform: Duplex streams that can modify data as it is written and read, e.g. reading and writing compressed data from/to a file
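To make the four types a bit more concrete, here is a minimal sketch that wires a Readable, a Transform (which is itself a Duplex) and a Writable together. The helper names createUpperCase, createCollector and shout are made up for this illustration; they are not part of Node.js.

```javascript
const { Readable, Writable, Transform, pipeline } = require('node:stream');

// Hypothetical Transform: upper-cases every chunk that passes through it.
function createUpperCase() {
  return new Transform({
    transform(chunk, encoding, callback) {
      callback(null, chunk.toString().toUpperCase());
    },
  });
}

// Hypothetical Writable: collects everything written to it in an array.
function createCollector(chunks) {
  return new Writable({
    write(chunk, encoding, callback) {
      chunks.push(chunk.toString());
      callback();
    },
  });
}

// Readable -> Transform -> Writable, connected with stream.pipeline().
function shout(words, done) {
  const chunks = [];
  pipeline(
    Readable.from(words),
    createUpperCase(),
    createCollector(chunks),
    (err) => done(err, chunks.join(''))
  );
}

// Usage:
// shout(['hello ', 'streams'], (err, text) => console.log(text));
```

pipeline() is used here instead of chaining pipe() calls because it forwards errors from any stage to a single callback.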
Two reading modes
Readable streams operate in one of two modes: flowing and paused.
1. In flowing mode, data is automatically read from the underlying system and provided to the application as quickly as possible using events via the EventEmitter interface (i.e. each chunk is provided via a data event).
- The stream implementor decides how often a data event is emitted, e.g. an HTTP request may emit a data event once a few KBs of data have been read.
- When there is no more data to read (i.e. the end of the stream), the stream emits an end event.
- If there is an error, the stream emits an error event.
2. In paused mode, the stream.read() method must be called explicitly each time to return a chunk of data from the stream.
- A readable event is emitted every time a chunk of data is ready.
- An end event is emitted when the end of the stream is reached.
- stream.read() returns null when there is nothing left in the internal buffer, including once the end of the stream is reached.
Notes:
1. All Readable streams begin in paused mode but can be switched to flowing mode in one of the following ways:
- Adding a data event handler
- Calling the stream.resume() method
- Calling the stream.pipe() method to send data to a Writable stream
2. The Readable stream can switch back to paused mode in one of the following ways:
- If there are no pipe destinations, by calling the stream.pause() method.
- If there are pipe destinations, by removing all of them with the stream.unpipe() method.
- By adding a readable event handler, which takes priority over the data event.
3. A Readable will not generate data until a mechanism for either consuming or ignoring that data is provided. If that mechanism is taken away or disabled, the Readable will attempt to stop generating data.
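The mode switches described in these notes can be observed through the readable.readableFlowing property, which is null before any consumer is attached, true in flowing mode and false once the stream has been paused. This is only a small sketch; the function name observeModes is made up for the illustration.

```javascript
const { Readable } = require('node:stream');

// Record readableFlowing before and after switching modes.
function observeModes() {
  const readable = Readable.from(['a', 'b', 'c']);
  const states = [];

  states.push(readable.readableFlowing); // no consumer attached yet

  readable.resume(); // switch to flowing mode
  states.push(readable.readableFlowing);

  readable.pause(); // no pipe destinations, so pause() switches back
  states.push(readable.readableFlowing);

  readable.destroy(); // we are not interested in the data itself
  return states;
}

console.log(observeModes()); // [ null, true, false ]
```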
Node.js components
The examples will show you how to work with:
- process.stdin as a Readable stream
- process.stdout as a Writable stream
- fs.createReadStream() as a Readable stream
- fs.createWriteStream() as a Writable stream
- the readline module, which provides an interface to work with data from a textual Readable stream
- the HTTP response (an http.IncomingMessage, passed to the http.get() callback) as a Readable stream
Further reading
- This wonderful introduction into streams in Node.js
- Readline module’s official documentation
- Async iterator in Node.js streams
- Node.js streams cheat sheet
Examples
1. Get an input from the user
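A minimal sketch of getting one input from the user, assuming a single chunk from process.stdin is enough (a terminal normally delivers one line per chunk). The function name getInput and the prompt text are made up for this illustration. The input and output streams are parameters so that the function can also be exercised with in-memory streams.

```javascript
// Resolve with the first chunk the user types, without the trailing newline.
function getInput(prompt, input = process.stdin, output = process.stdout) {
  return new Promise((resolve) => {
    output.write(prompt);
    // Attaching a data handler switches the stream to flowing mode.
    input.once('data', (chunk) => {
      input.pause(); // stop reading so that the process is free to exit
      resolve(chunk.toString().trim());
    });
  });
}

// Usage:
// getInput('What is your name? ').then((name) => console.log(`Hello ${name}!`));
```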
2. A simple CLI to interact with the user
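One way to sketch such a CLI is with the readline module, which emits a line event for every line the user enters. The two commands (hello and exit), the prompt and the function name startCli are invented for this example.

```javascript
const readline = require('node:readline');

// Hypothetical command set: "hello <name>" greets, "exit" closes the CLI.
function startCli(input = process.stdin, output = process.stdout) {
  const rl = readline.createInterface({ input, output, prompt: '> ' });
  rl.prompt();

  rl.on('line', (line) => {
    const [command, ...args] = line.trim().split(/\s+/);
    if (command === 'exit') {
      rl.close(); // emits 'close' and stops reading from the input
      return;
    }
    if (command === 'hello') {
      output.write(`Hello ${args.join(' ') || 'stranger'}!\n`);
    } else {
      output.write(`Unknown command: ${command}\n`);
    }
    rl.prompt();
  });

  return rl;
}

// Usage:
// startCli().on('close', () => console.log('Bye!'));
```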
3. Read a CSV file, postcodes.csv, line by line, convert each line to a JSON object and write it to postcodes.txt.
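A sketch of the CSV example, under a few assumptions of mine: the first row of the file holds the column names, fields never contain quoted commas (so a plain split(',') is good enough), and the output is one JSON object per line. The function name csvToJsonLines is made up.

```javascript
const fs = require('node:fs');
const readline = require('node:readline');
const { once } = require('node:events');

async function csvToJsonLines(source, destination) {
  const rl = readline.createInterface({
    input: fs.createReadStream(source),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  const out = fs.createWriteStream(destination);

  let headers = null;
  for await (const line of rl) {
    if (line.trim() === '') continue; // skip blank lines
    const cells = line.split(','); // naive split: no quoted fields
    if (!headers) {
      headers = cells; // assumption: the first row is the header
      continue;
    }
    const record = Object.fromEntries(headers.map((name, i) => [name, cells[i]]));
    // write() returns false when the internal buffer is full: wait for drain.
    if (!out.write(JSON.stringify(record) + '\n')) {
      await once(out, 'drain');
    }
  }

  out.end();
  await once(out, 'finish');
}

// Usage:
// csvToJsonLines('postcodes.csv', 'postcodes.txt').then(() => console.log('done'));
```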
4. Read a file using flowing mode, i.e. by attaching a data event handler. Notice that in this program the events (data and end) are emitted after the last line, i.e. console.log('The fun has just begun'), has been executed.
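A minimal sketch of reading a file in flowing mode. The function name readFlowing and the log parameter are made up; the log parameter only exists so that the order of the messages can be captured.

```javascript
const fs = require('node:fs');

function readFlowing(file, log = console.log) {
  const stream = fs.createReadStream(file, { encoding: 'utf8' });

  // Attaching a data handler switches the stream to flowing mode.
  stream.on('data', (chunk) => log(`data: ${chunk.length} characters`));
  stream.on('end', () => log('end'));
  stream.on('error', (err) => log(`error: ${err.message}`));

  // Logged first: the events above are only emitted on later ticks
  // of the event loop, after this function has returned.
  log('The fun has just begun');
  return stream;
}

// Usage:
// readFlowing('postcodes.csv');
```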
5. Read a stream in paused mode using the readable event and the stream.read() method.
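A minimal sketch of paused mode: the stream stays paused because only a readable handler is attached, and every chunk is pulled explicitly with read(). The function name readPaused and its callback parameters are made up.

```javascript
const fs = require('node:fs');

function readPaused(file, onChunk, onDone) {
  const stream = fs.createReadStream(file);

  stream.on('readable', () => {
    let chunk;
    // Drain the internal buffer; read() returns null once it is empty.
    while ((chunk = stream.read()) !== null) {
      onChunk(chunk);
    }
  });
  stream.on('end', () => onDone(null));
  stream.on('error', (err) => onDone(err));

  return stream;
}

// Usage:
// readPaused('postcodes.csv',
//   (chunk) => console.log(`read ${chunk.length} bytes`),
//   (err) => console.log(err ? err.message : 'end'));
```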
6. Utility to download a file. The HTTP response (an http.IncomingMessage) is a Readable stream and can be read in 3 ways:
- Using flowing mode with the response.pipe() method
- Using flowing mode with the data event
- Using paused mode with the readable event and the response.read() method
Using either flowing mode or paused mode, you can see the size of each data chunk (except the last one) to be 16384 bytes = 16 KB.
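The three ways of reading the response can be sketched as three small consumer functions plugged into one download helper. All function names here (viaPipe, viaDataEvent, viaReadable, download) are made up, and the sketch only handles plain http URLs without redirects; for https you would use the node:https module instead.

```javascript
const fs = require('node:fs');
const http = require('node:http');

// 1. Flowing mode: pipe() moves the data and handles backpressure for us.
function viaPipe(response, file) {
  response.pipe(file);
}

// 2. Flowing mode: forward each chunk by hand (this ignores backpressure).
function viaDataEvent(response, file) {
  response.on('data', (chunk) => file.write(chunk));
  response.on('end', () => file.end());
}

// 3. Paused mode: pull chunks with read() whenever readable fires.
function viaReadable(response, file) {
  response.on('readable', () => {
    let chunk;
    while ((chunk = response.read()) !== null) {
      file.write(chunk);
    }
  });
  response.on('end', () => file.end());
}

// Hypothetical helper: download url to destination with one of the consumers.
function download(url, destination, consume = viaPipe) {
  return new Promise((resolve, reject) => {
    http.get(url, (response) => {
      if (response.statusCode !== 200) {
        response.resume(); // discard the body so the socket is released
        reject(new Error(`Unexpected status code ${response.statusCode}`));
        return;
      }
      const file = fs.createWriteStream(destination);
      file.on('finish', resolve);
      file.on('error', reject);
      response.on('error', reject);
      consume(response, file);
    }).on('error', reject);
  });
}

// Usage (the URL is a placeholder):
// download('http://example.com/', 'example.html', viaReadable);
```

To see the chunk sizes mentioned above, log chunk.length inside viaDataEvent or viaReadable.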
7. Utility to copy a file. Notice that the size of each chunk is 65536 bytes = 64 KB, which is the default highWaterMark of fs.createReadStream().
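A sketch of a copy utility that also records the size of every chunk on its way through. The function name copyFile and the pass-through "meter" Transform are made up for the illustration; pipeline() from node:stream/promises connects the streams and rejects if any of them fails.

```javascript
const fs = require('node:fs');
const { Transform } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Copy source to destination and resolve with the size of each chunk.
async function copyFile(source, destination) {
  const chunkSizes = [];

  // Pass-through Transform that only records chunk sizes.
  const meter = new Transform({
    transform(chunk, encoding, callback) {
      chunkSizes.push(chunk.length);
      callback(null, chunk);
    },
  });

  await pipeline(
    fs.createReadStream(source),
    meter,
    fs.createWriteStream(destination)
  );
  return chunkSizes;
}

// Usage:
// copyFile('postcodes.csv', 'postcodes-copy.csv').then((sizes) => console.log(sizes));
```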
8. The Readable.from() method creates a readable stream from a string or an iterable (both synchronous and asynchronous).
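A small sketch of Readable.from() with a string, an array and an async generator. The helper names countdown and collect are made up; collect simply gathers the chunks delivered through the data event. Note that a string is emitted as one chunk rather than character by character.

```javascript
const { Readable } = require('node:stream');

// Hypothetical async generator used as an asynchronous source.
async function* countdown(from) {
  for (let i = from; i > 0; i--) {
    yield `${i}`;
  }
}

// Gather every chunk emitted via the data event into an array.
function collect(readable) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    readable.on('data', (chunk) => chunks.push(chunk));
    readable.on('end', () => resolve(chunks));
    readable.on('error', reject);
  });
}

// Usage:
// collect(Readable.from('hello')).then(console.log);      // [ 'hello' ]
// collect(Readable.from(['a', 'b'])).then(console.log);   // [ 'a', 'b' ]
// collect(Readable.from(countdown(3))).then(console.log); // [ '3', '2', '1' ]
```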
9. The Readable constructor can be used to create a new readable stream. Notice that readable.push(null) signals the end of the stream.
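A minimal sketch of a custom stream built with the Readable constructor. Here the data is pushed from inside the read() method, which is one of several possible ways to feed a Readable; the function name createLetters is made up.

```javascript
const { Readable } = require('node:stream');

// Emit the given letters one chunk at a time, then end the stream.
function createLetters(letters) {
  const queue = [...letters];
  return new Readable({
    encoding: 'utf8', // deliver strings instead of Buffers
    read() {
      // push(null) signals the end of the stream.
      this.push(queue.length > 0 ? queue.shift() : null);
    },
  });
}

// Usage:
// createLetters('abc').on('data', (chunk) => console.log(chunk));
```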