feat: add jsonl output format (#1159) #3415

Open
ChrisJr404 wants to merge 1 commit into anchore:main from ChrisJr404:add-jsonl-output

Conversation

@ChrisJr404

Closes #1159 — adds a JSON Lines (newline-delimited JSON) output formatter selectable via -o jsonl.

Each line is a single match record, per @kzantow's clarification on the issue ("each line would be each match record"). This shape is what makes the requested pipeline ergonomic:

grype <input> -o jsonl | jq -r .vulnerability.id | xargs -I {} ./cve-search.py {}

ndjson is also accepted as an alias since some communities prefer that spelling.

Why this shape

@kzantow asked on the issue:

If we understand the ask properly, to output JSONL, each line would be each match record — is this what you are expecting?

Reporter @ocervell confirmed yes. So this PR ships exactly that: one Match per line, no surrounding envelope.

Document-level metadata (descriptor, source, distro, ignoredMatches, alertsByPackage) is intentionally omitted — JSON Lines is a flat record stream by design. Consumers that need that metadata should continue to use -o json. The package doc comment calls this out explicitly so it's not read as an oversight.

Empty match sets produce zero bytes of output, which is the standard jsonl convention (an empty file is a valid empty stream).

What's added

| File | Purpose |
| --- | --- |
| grype/presenter/jsonl/presenter.go | New presenter; uses json.Encoder.Encode per match (the encoder appends \n automatically, so output is naturally newline-delimited) |
| grype/presenter/jsonl/presenter_test.go | Image + directory golden-snapshot tests, empty-document test, JSON-per-line validation, line-count assertion |
| grype/presenter/jsonl/testdata/snapshot/*.golden | Snapshot fixtures (regenerable with -update) |
| internal/format/format.go | JSONLinesFormat constant, Parse("jsonl") and Parse("ndjson") recognition, added to AvailableFormats |
| internal/format/format_test.go | Coverage for jsonl / JSONL / ndjson parsing |
| internal/format/presenter.go | Wire JSONLinesFormat to jsonl.NewPresenter |
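For readers unfamiliar with the format-selection step, the alias handling listed for internal/format/format.go might look something like the sketch below. The `Format` type, the constant values, and the lowercase `parse` function are assumptions made for illustration; only the `JSONLinesFormat` name and the accepted spellings (jsonl, JSONL, ndjson) come from the PR description.

```go
package main

import (
	"fmt"
	"strings"
)

// Format is a hypothetical stand-in for the format type in internal/format.
type Format string

const (
	UnknownFormat   Format = "unknown" // assumed value, for illustration
	JSONFormat      Format = "json"    // assumed value, for illustration
	JSONLinesFormat Format = "jsonl"   // name from the PR; value assumed
)

// parse maps user input to a Format. Matching is case-insensitive, and
// "ndjson" is accepted as an alias for "jsonl".
func parse(userInput string) Format {
	switch strings.ToLower(userInput) {
	case "json":
		return JSONFormat
	case "jsonl", "ndjson":
		return JSONLinesFormat
	default:
		return UnknownFormat
	}
}

func main() {
	for _, in := range []string{"jsonl", "JSONL", "ndjson", "yaml"} {
		fmt.Printf("%s -> %s\n", in, parse(in))
	}
}
```

Folding the alias into the same `case` keeps a single canonical format value downstream, so the presenter wiring only has to handle one constant.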

Streaming question (raised by reporter)

Any reason why we can't stream result individually in a more "live" manner?

Streaming would require restructuring how the matcher pipeline produces results — currently grype waits for all matchers to finish before invoking the presenter. That is a much larger change and is not in scope for this PR. JSON Lines as a file format is independently useful (post-pipe consumption with jq/xargs/etc.) even without streaming, which matches @kzantow's comment that "We probably wouldn't stream each result individually, but only output a JSONL file at the end."

Test plan

  • go test ./grype/presenter/... passes (existing presenters + new jsonl)
  • go test ./internal/format/... passes
  • go build ./... clean
  • grype --help lists jsonl in the format options
  • End-to-end smoke: grype dir:. -o jsonl against a directory with no matches produces zero output, as designed
  • DCO signed
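The per-line validity and line-count checks mentioned for the presenter tests could be approximated outside the test suite with a small helper like the one below. This is a sketch under assumptions: `countValidLines` is a hypothetical name, not a function from the PR, and it uses `bufio.Scanner` with its default buffer, so very long records would need a larger buffer via `Scanner.Buffer`.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// countValidLines checks that every line of output is independently valid
// JSON and returns how many lines were seen. Blank lines are rejected,
// because json.Valid reports false for empty input.
func countValidLines(output string) (int, error) {
	scanner := bufio.NewScanner(strings.NewReader(output))
	n := 0
	for scanner.Scan() {
		line := scanner.Bytes()
		if !json.Valid(line) {
			return n, fmt.Errorf("line %d is not valid JSON: %q", n+1, line)
		}
		n++
	}
	return n, scanner.Err()
}

func main() {
	// Sample input is illustrative, not real grype output.
	sample := "{\"vulnerability\":{\"id\":\"CVE-0000-0001\"}}\n" +
		"{\"vulnerability\":{\"id\":\"CVE-0000-0002\"}}\n"
	n, err := countValidLines(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("valid lines:", n)
}
```

Comparing the returned count with the number of matches in the source document gives the line-count assertion; an empty string returns a count of zero with no error, matching the empty-stream case.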

Closes anchore#1159. Adds a JSON Lines (newline-delimited JSON) output formatter
selectable via -o jsonl. Each line is a single match record per
@kzantow's clarification on the issue, suitable for pipelines like:

  grype <input> -o jsonl | jq -r .vulnerability.id | xargs ./cve-search.py

The 'ndjson' alias is also accepted as it is the more common name in
some communities.

Document-level metadata (descriptor, source, distro, ignoredMatches,
alertsByPackage) is intentionally omitted — JSON Lines is a flat record
stream by design. Consumers that need that metadata should continue to
use -o json. The package documentation calls this out explicitly so
future readers don't read it as an oversight.

Empty match sets produce zero bytes of output, which is the standard
jsonl convention (an empty file is a valid empty stream).

Tests:
- presenter golden snapshots for both image and directory sources
- empty-document case asserts zero output
- line-count assertion ties output line count to document Match count
- JSON-validity assertion: every emitted line independently parses

Signed-off-by: Chris (ChrisJr404) <11917633+ChrisJr404@users.noreply.github.com>
@kzantow

kzantow commented May 8, 2026

Contributor

I have a concern that this change might be difficult to reconcile with a potential future state of Grype output. To reduce file size and represent more information, I suspect there could be a change to the structure to use package references instead of including them inline with each match. This isn't a blocking concern, but it is something we should consider when accepting this PR.

@ChrisJr404

Author

Fair concern. The JSONL emitter here just streams the same per-match record shape that the existing JSON output already inlines, so we are inheriting the inline-package structure rather than introducing it. If the JSON output later moves to package-reference dedup, the JSONL stream would naturally follow the same schema change at the same time, since both encoders pull from the same intermediate models.Match.

Two ways I see to handle the forward-compat:

  1. Keep the JSONL output coupled to the JSON schema. Whatever the JSON shape is at any release, JSONL emits the same record per line. A future schema bump (whether ref-based or anything else) lands once and both formats track it together.

  2. Stamp a schema version per record (e.g. a top-level schema field on every JSONL line). Consumers can then pin to a known version and the project is free to evolve the canonical JSON without breaking JSONL streaming.

Happy to add (2) if you want a clearer compat boundary right now. If you would rather hold the PR until the canonical schema discussion lands, I can convert this to draft and let it sit. Either way, let me know which fits.

Development

Successfully merging this pull request may close these issues.

Add JSON lines output

2 participants