jsonl Parser
(Since 1.6.0) This is a convenience component for parsing large JSONL files. It is a Transform stream, which consumes text and produces a stream of JavaScript objects. It is always the first in a pipe chain, fed directly with text from a file, a socket, the standard input, or any other text stream.
Its Writable part operates in a buffer/text mode, while its Readable part operates in objectMode.
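For instance, a minimal sketch that feeds text to the parser directly (assuming the emitted chunks are {key, value} pairs with a zero-based key, exactly as produced by StreamValues):

```js
const JsonlParser = require('stream-json/jsonl/Parser');

const jsonlParser = new JsonlParser();
// Readable side: objectMode chunks shaped like StreamValues output.
jsonlParser.on('data', ({key, value}) => console.log(key, value));
// Writable side: raw text, one JSON value per line.
jsonlParser.write('{"a": 1}\n{"a": 2}\n');
jsonlParser.end();
```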
Functionally, jsonl/Parser replaces the combination of Parser (with jsonStreaming set to true) immediately followed by StreamValues. The only reason for its existence is improved performance.
A simple example (streaming from a file):

```js
const {parser: jsonlParser} = require('stream-json/jsonl/Parser');
const fs = require('fs');

const pipeline = fs.createReadStream('sample.json').pipe(jsonlParser());
let objectCounter = 0;
// Every data event carries one parsed top-level value as a {key, value} pair.
pipeline.on('data', () => ++objectCounter);
pipeline.on('end', () => console.log(`Found ${objectCounter} objects.`));
```

The alternative example:
```js
const JsonlParser = require('stream-json/jsonl/Parser');
const fs = require('fs');

const jsonlParser = new JsonlParser();
const pipeline = fs.createReadStream('sample.json').pipe(jsonlParser);
let objectCounter = 0;
pipeline.on('data', () => ++objectCounter);
pipeline.on('end', () => console.log(`Found ${objectCounter} objects.`));
```

Both of them are functionally equivalent to:
```js
const {parser} = require('stream-json/Parser');
const {streamValues} = require('stream-json/streamers/StreamValues');
const fs = require('fs');

const pipeline = fs.createReadStream('sample.json')
  .pipe(parser({jsonStreaming: true}))
  .pipe(streamValues());
let objectCounter = 0;
// StreamValues likewise emits one {key, value} pair per top-level value.
pipeline.on('data', () => ++objectCounter);
pipeline.on('end', () => console.log(`Found ${objectCounter} objects.`));
```

The module returns the constructor of jsonl/Parser. Being a stream, jsonl/Parser doesn't have any special interfaces. The only thing required is to configure it during construction.
In many real cases, while files are huge, individual data items can fit into memory, and it is better to work with them as a whole so that they can be inspected. jsonl/Parser leverages the JSONL format and returns a stream of JavaScript objects exactly like StreamValues.
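For example, a sketch of inspecting whole items (the file name logs.jsonl and the fields level and message are hypothetical):

```js
const {parser} = require('stream-json/jsonl/Parser');
const fs = require('fs');

// Hypothetical JSONL log file: one JSON object per line.
const pipeline = fs.createReadStream('logs.jsonl').pipe(parser());
pipeline.on('data', ({value}) => {
  // value is a complete JavaScript object that can be inspected as a whole.
  if (value.level === 'error') console.error(value.message);
});
```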
options is an optional object described in detail in Node.js' Stream documentation. Additionally, the following custom flags are recognized:
- reviver is an optional function, which takes two arguments and returns a value. See JSON.parse() for more details. A sketch of its use appears after this list.
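A minimal sketch of a reviver, assuming a hypothetical timestamp field that holds ISO date strings:

```js
const {parser} = require('stream-json/jsonl/Parser');
const fs = require('fs');

// Same signature as the reviver accepted by JSON.parse().
const reviver = (key, value) => key === 'timestamp' ? new Date(value) : value;

const pipeline = fs.createReadStream('sample.json').pipe(parser({reviver}));
pipeline.on('data', ({value}) => {
  // The hypothetical timestamp field arrives already converted to a Date.
  console.log(value.timestamp instanceof Date);
});
```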
make() and parser() are two aliases of the factory function. It takes the options described above and returns a new instance of jsonl/Parser. parser() helps to reduce boilerplate when creating data processing pipelines:
```js
const {chain} = require('stream-chain');
const {parser} = require('stream-json/jsonl/Parser');
const fs = require('fs');

const pipeline = chain([
  fs.createReadStream('sample.json'),
  parser()
]);
let objectCounter = 0;
pipeline.on('data', () => ++objectCounter);
pipeline.on('end', () => console.log(`Found ${objectCounter} objects.`));
```

The Constructor property of make() (and parser()) is set to jsonl/Parser. It can be used for the indirect creation of parsers or for metaprogramming if needed.
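A small sketch of that indirect creation (assuming make() is exported alongside parser(), as the aliases above suggest):

```js
const {make, parser} = require('stream-json/jsonl/Parser');

// Both aliases expose the class via their Constructor property.
console.log(make.Constructor === parser.Constructor); // true

// Indirect creation: instantiate through the factory's Constructor.
const jsonlParser = new parser.Constructor({});
```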