Eugene Lazutkin edited this page Jun 17, 2018 · 21 revisions

Parser is the workhorse of the package. It is a Transform stream that consumes text and produces a stream of data items corresponding to high-level tokens. It is always the first stream in a pipe chain and is fed directly with text from a file, a socket, the standard input, or any other text stream.

Its Writable part operates in buffer/text mode, while its Readable part operates in objectMode.

Introduction

A simple example (streaming from a file):

const Parser = require("stream-json/Parser");
const parser = new Parser();

const fs = require("fs");

let objectCounter = 0;
parser.on("startObject", () => ++objectCounter);
// Wrap the call in a function so it runs when the stream ends, not immediately.
parser.on("end", () => console.log(`Found ${objectCounter} objects.`));

fs.createReadStream("sample.json").pipe(parser);

API

Being a stream, Parser does not have any special interface. The only thing required is to configure it during construction.

Parser produces a rigid stream of objects whose order is strictly defined. It is impossible to get an item out of sequence. All data items (strings, numbers, even object keys) are streamed in chunks, so each of them can potentially be of any size: gigabytes, terabytes, and so on.

In many real cases, while files are huge, individual data items can fit into memory. It is better to work with them as a whole, so they can be inspected. In that case, Parser can optionally pack items efficiently.

The details of the stream of objects are described later.

constructor(options)

options is an optional object described in detail in the Node.js Stream documentation. Additionally, the following custom flags are recognized, each of which can be truthy or falsy:

  • jsonStreaming controls the parsing algorithm. If truthy, a stream of JSON objects is parsed as described in JSON Streaming as "Concatenated JSON". Technically it will recognize "Line delimited JSON" as well. Otherwise, it will follow the JSON standard and assume a single value. The default: false.
  • Packing options control packing values. They have no default values.
    • packValues serves as the initial value for the other three options described below.
    • packKeys specifies whether keys are packed and sent as a single value.
    • packStrings specifies whether strings are packed and sent as a single value.
    • packNumbers specifies whether numbers are packed and sent as a single value.
    • More details in the section below.
  • Streaming options control sending unpacked values. They have no default values.
    • streamValues serves as the initial value for the other three options described below.
    • streamKeys specifies whether items related to unpacked keys are sent.
    • streamStrings specifies whether items related to unpacked strings are sent.
    • streamNumbers specifies whether items related to unpacked numbers are sent.
    • More details in the section below.
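
The relationship between the general and the specific flags can be sketched as follows. This is only an illustration of the rule stated in the list above, not the library's code: resolveFlags is a hypothetical helper, and the assumption is that a specific flag, when present, overrides the general one.

```javascript
// Hypothetical helper, NOT part of stream-json. It mirrors the rule
// "packValues / streamValues serve as the initial value for the three
// specific options". Assumption: a specific flag, if given, wins.
function resolveFlags(options = {}) {
  const pick = (specific, general) =>
    specific in options ? options[specific] : options[general];
  return {
    packKeys: pick("packKeys", "packValues"),
    packStrings: pick("packStrings", "packValues"),
    packNumbers: pick("packNumbers", "packValues"),
    streamKeys: pick("streamKeys", "streamValues"),
    streamStrings: pick("streamStrings", "streamValues"),
    streamNumbers: pick("streamNumbers", "streamValues")
  };
}

// Pack everything except numbers, and send no chunked items.
const flags = resolveFlags({
  packValues: true,
  packNumbers: false,
  streamValues: false
});
console.log(flags);
```

In this sketch a flag that is set neither directly nor through its general option stays undefined, matching the statement that these options have no default values.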

Stream of objects

This is the list of data objects produced by Parser in the correct order:

// a sequence can have 0 or more items
// a value is one of: object, array, string, number, null, true, false

// a parser produces a sequence of values

// object
{name: "startObject"};
// sequence of object properties: key, then value
{name: "endObject"};

// array
{name: "startArray"};
// sequence of values
{name: "endArray"};

// key
{name: "startKey"};
// sequence of string chunks:
{name: "stringChunk", value: "string value chunk"};
{name: "endKey"};
// when packing:
{name: "keyValue", value: "key value"};

// string
{name: "startString"};
// sequence of string chunks:
{name: "stringChunk", value: "string value chunk"};
{name: "endString"};
// when packing:
{name: "stringValue", value: "string value"};

// number
{name: "startNumber"};
// sequence of number chunks (as strings):
{name: "numberChunk", value: "string value chunk"};
{name: "endNumber"};
// when packing:
{name: "numberValue", value: "string value"};

// null, true, false
{name: "nullValue", value: null};
{name: "trueValue", value: true};
{name: "falseValue", value: false};

All value chunks (stringChunk and numberChunk) must be concatenated in the order they arrive to produce the final value. Empty string values may have no chunks. String chunks may have empty values.
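
As an illustration of the concatenation rule, here is a minimal sketch that joins chunks back into whole values. The token array is hand-written for the JSON text {"a": [10, "hi", "", true]} using the item shapes listed above; it is not captured parser output, and the chunk boundaries are arbitrary. assembleChunks and the kind field are hypothetical names used only for this example.

```javascript
// Hand-written items for {"a": [10, "hi", "", true]} with no packing.
// A real parser may split the chunks differently.
const tokens = [
  {name: "startObject"},
  {name: "startKey"},
  {name: "stringChunk", value: "a"},
  {name: "endKey"},
  {name: "startArray"},
  {name: "startNumber"},
  {name: "numberChunk", value: "1"},
  {name: "numberChunk", value: "0"},
  {name: "endNumber"},
  {name: "startString"},
  {name: "stringChunk", value: "h"},
  {name: "stringChunk", value: ""}, // chunks may be empty
  {name: "stringChunk", value: "i"},
  {name: "endString"},
  {name: "startString"}, // an empty string may have no chunks at all
  {name: "endString"},
  {name: "trueValue", value: true},
  {name: "endArray"},
  {name: "endObject"}
];

// Hypothetical helper: collects every key, string, and number as one string.
function assembleChunks(items) {
  const result = [];
  let buffer = ""; // accumulates chunks of the item being read
  for (const item of items) {
    switch (item.name) {
      case "startKey":
      case "startString":
      case "startNumber":
        buffer = "";
        break;
      case "stringChunk":
      case "numberChunk":
        buffer += item.value;
        break;
      case "endKey":
        result.push({kind: "key", value: buffer});
        break;
      case "endString":
        result.push({kind: "string", value: buffer});
        break;
      case "endNumber":
        result.push({kind: "number", value: buffer}); // still a string
        break;
    }
  }
  return result;
}

console.log(assembleChunks(tokens));
```

The helper relies on the ordering guarantees of the stream: a chunk can only appear between a matching start and end item, so a single buffer is enough.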

Important: the values of number chunks and of numberValue are strings, not numbers. It is up to downstream code to convert them to numbers using parseInt(x), parseFloat(x), or simply x => +x.
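
The three conversions are not interchangeable, which a short sketch makes visible. The sample string is an assumption chosen for illustration; toNumber is a hypothetical name for the x => +x form.

```javascript
// A numberValue as it would arrive downstream: a string, not a number.
const raw = "1.5e3";

const viaParseInt = parseInt(raw, 10); // stops at the first non-digit
const viaParseFloat = parseFloat(raw); // understands fractions and exponents
const toNumber = x => +x;              // unary plus converts the whole string
const viaUnaryPlus = toNumber(raw);

console.log(typeof raw);    // "string"
console.log(viaParseInt);   // 1
console.log(viaParseFloat); // 1500
console.log(viaUnaryPlus);  // 1500
```

parseInt is therefore only a safe choice when the data is known to contain integers without exponents.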

All items follow in the correct order. If something goes wrong, the parser produces an error event. For example, the following guarantees hold:

  • All startXXX are balanced with endXXX.
  • Between startKey and endKey there can be zero or more stringChunk items, and no other items.
  • After startObject, optional key-value pairs are emitted in a strict pattern: a key-related item, then a value. This cycle continues until all key-value pairs are streamed.
    • It is not possible for a key to be missing a value.
  • All endObject are balanced with the corresponding startObject.
  • endObject cannot close startArray.
  • Between startString and endString there can be zero or more stringChunk items, and no other items.
  • endKey can optionally be followed by keyValue; after that a new value starts, never endObject.

In short, the item sequence is always correctly formed. No need to do unnecessary checks.
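
Because start and end items are always balanced, downstream code can keep simple counters instead of validating the structure. The sketch below computes the maximum nesting depth this way. The item list is hand-written for the JSON text {"a": [1, {"b": null}]}, assuming keys and numbers are packed and the chunked items are switched off; maxDepth is a hypothetical helper.

```javascript
// Hand-written items for {"a": [1, {"b": null}]}, not captured parser output.
const items = [
  {name: "startObject"},
  {name: "keyValue", value: "a"},
  {name: "startArray"},
  {name: "numberValue", value: "1"},
  {name: "startObject"},
  {name: "keyValue", value: "b"},
  {name: "nullValue", value: null},
  {name: "endObject"},
  {name: "endArray"},
  {name: "endObject"}
];

// Hypothetical helper: no sanity checks are needed, because every
// startObject/startArray is guaranteed to be closed by its own end item.
function maxDepth(sequence) {
  let depth = 0;
  let max = 0;
  for (const {name} of sequence) {
    if (name === "startObject" || name === "startArray") {
      depth += 1;
      if (depth > max) max = depth;
    } else if (name === "endObject" || name === "endArray") {
      depth -= 1;
    }
  }
  return max;
}

console.log(maxDepth(items)); // 3
```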

Packing options

Parser packs keys, strings, and numbers

Streaming options
