head/tail for JSON and YAML, but structure-aware. Get a compact preview that shows both the shape and representative values of your data, all within a strict byte budget. (Just like head/tail, headson can also work with unstructured text files.)
Available as:
- CLI (see Usage)
 - Python library (see Python Bindings)
 
Using Cargo:
cargo install headson
From source:
cargo build --release
target/release/headson --help
- Budgeted output: specify exactly how much you want to see
- Output formats: auto | json | yaml | text
- Styles: strict | default | detailed
  - JSON family: strict → strict JSON, default → human-friendly Pseudo, detailed → JS with inline comments
  - YAML: always YAML; strict has no comments, default uses "# …", detailed uses "# N more …"
  - Text: prints raw lines. In default style, omissions are shown as a single line …; in detailed, as … N more lines …. strict omits array-level summaries.
 - Multiple inputs: preview many files at once with a shared or per‑file budget
 - Fast: processes gigabyte‑scale files in seconds (mostly disk‑bound)
 - Available as a CLI app and as a Python library
 
If you’re comfortable with tools like head and tail, use headson when you want a quick, structured peek into a JSON file without dumping the entire thing.
- head/tail operate on bytes/lines, so their output is not optimized for tree structures
- with jq, you need to craft filters to preview large JSON files
- headson is like head/tail for trees: zero config, but it keeps structure and represents content as much as possible
headson [FLAGS] [INPUT...]
- INPUT (optional, repeatable): file path(s). If omitted, reads from stdin. Multiple input files are supported.
 - Prints the preview to stdout. On parse errors, exits non‑zero and prints an error to stderr.
 
Common flags:
- -c, --bytes <BYTES>: per-file output budget (bytes). For multiple inputs, default total budget is <BYTES> * number_of_inputs.
- -u, --chars <CHARS>: per-file output budget (Unicode code points). Behaves like --bytes but counts characters instead of bytes.
- -C, --global-bytes <BYTES>: total output budget across all inputs. With --bytes, the effective total is the smaller of the two.
- -f, --format <auto|json|yaml|text>: output format (default: auto).
  - Auto: stdin → JSON family; filesets → per-file based on extension (.json → JSON family, .yaml/.yml → YAML, unknown → Text).
- -t, --template <strict|default|detailed>: output style (default: default).
  - JSON family: strict → strict JSON; default → Pseudo; detailed → JS with inline comments.
  - YAML: always YAML; style only affects comments (strict none, default "# …", detailed "# N more …").
- -i, --input-format <json|yaml|text>: ingestion format (default: json). For filesets in auto format, ingestion is chosen by extensions.
- -m, --compact: no indentation, no spaces, no newlines
- --no-newline: single line output
- --no-space: no space after : in objects
- --indent <STR>: indentation unit (default: two spaces)
- --string-cap <N>: max graphemes to consider per string (default: 500)
- --head: prefer the beginning of arrays when truncating (keep first N). Strings are unaffected. Display styles place omission markers accordingly; strict JSON remains unannotated. Mutually exclusive with --tail.
- --tail: prefer the end of arrays when truncating (keep last N). Strings are unaffected. Display styles place omission markers accordingly; strict JSON remains unannotated. Mutually exclusive with --head.
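The relationship between -c/--bytes and -C/--global-bytes can be illustrated with a few lines of Python. This is only a sketch of the arithmetic stated above, not headson's code: the function name effective_total_bytes is hypothetical, and it covers only the case where a per-file budget is given explicitly.

```python
from typing import Optional


def effective_total_bytes(
    per_file: int, n_inputs: int, global_cap: Optional[int] = None
) -> int:
    """Hypothetical helper mirroring the documented rule.

    The per-file budget scales with the number of inputs; if a global
    cap is also given, the smaller of the two totals applies.
    """
    total = per_file * n_inputs
    return total if global_cap is None else min(total, global_cap)


# Three files at 500 bytes each, no global cap.
print(effective_total_bytes(500, 3))        # 1500
# Same inputs, but capped globally at 1200 bytes.
print(effective_total_bytes(500, 3, 1200))  # 1200
```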
Notes:
- Multiple inputs:
  - With newlines enabled, file sections are rendered with human-readable headers. In compact/single-line modes, headers are omitted.
  - In --format auto, each file uses its own best format: JSON family for .json, YAML for .yaml/.yml.
    - Unknown extensions are treated as Text (raw lines), which is safe for logs and .txt files.
  - --global-bytes may truncate or omit entire files to respect the total budget.
  - The tool finds the largest preview that fits the budget; even if extremely tight, you still get a minimal, valid preview.
  - Directories and binary files are ignored; a notice is printed to stderr for each. Stdin reads the stream as-is.
- Head vs Tail sampling: these options bias which part of arrays are kept before rendering. Display styles may still insert internal gap markers to honor very small budgets; strict JSON stays unannotated.
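As an illustration of the per-file choice made in --format auto, the mapping described above can be written as a small Python function. The name auto_format is hypothetical and this is not headson's implementation; the text does not say how letter case in extensions is handled, so the sketch compares suffixes exactly.

```python
from pathlib import Path


def auto_format(path: str) -> str:
    """Hypothetical sketch: pick an output family from the file extension."""
    suffix = Path(path).suffix  # exact match; case handling is an assumption
    if suffix == ".json":
        return "json"  # JSON family
    if suffix in (".yaml", ".yml"):
        return "yaml"
    return "text"  # unknown extensions fall back to raw lines


for name in ["users.json", "config.yml", "logs/app.log", "notes"]:
    print(name, "->", auto_format(name))
```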
 
- Bytes (-c/--bytes, -C/--global-bytes)
  - Measures UTF-8 bytes in the output.
  - Default per-file budget is 500 bytes when neither --lines nor --chars is provided.
  - Multiple inputs: total default budget is <BYTES> * number_of_inputs; --global-bytes caps the total.
- Characters (-u/--chars)
  - Measures Unicode code points (not grapheme clusters).
- Lines (-n/--lines, -N/--global-lines)
  - Caps the number of lines in the output.
  - Incompatible with --no-newline.
  - Multiple inputs: defaults to <LINES> * number_of_inputs; --global-lines caps the total.
- Interactions and precedence
  - All active budgets are enforced simultaneously. The render must satisfy all of: bytes (if set), chars (if set), and lines (if set). The strictest cap wins.
  - When only lines are specified, no implicit byte cap applies. When neither lines nor chars are specified, a 500-byte default applies.
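The difference between the three units, and the rule that every active cap must hold at once, can be sketched in Python. The helper fits_budgets is hypothetical and only approximates the description above; in particular, counting lines with str.splitlines is an assumption about how lines are measured.

```python
from typing import Optional


def fits_budgets(
    rendered: str,
    max_bytes: Optional[int] = None,
    max_chars: Optional[int] = None,
    max_lines: Optional[int] = None,
) -> bool:
    """Hypothetical check: True only if every active cap is satisfied."""
    if max_bytes is not None and len(rendered.encode("utf-8")) > max_bytes:
        return False  # UTF-8 byte count exceeds the byte budget
    if max_chars is not None and len(rendered) > max_chars:
        return False  # len() on a Python str counts Unicode code points
    if max_lines is not None and len(rendered.splitlines()) > max_lines:
        return False
    return True


sample = "na\u00efve \u2026"  # 7 code points, 10 UTF-8 bytes
print(len(sample), len(sample.encode("utf-8")))
print(fits_budgets(sample, max_chars=8))               # True
print(fits_budgets(sample, max_bytes=8))               # False
print(fits_budgets(sample, max_bytes=8, max_chars=8))  # False: strictest cap wins
```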
 
 
Quick one‑liners:
- Peek a big JSON stream (keeps structure):
  zstdcat huge.json.zst | headson -c 800 -f json -t default
- Many files with a fixed overall size:
  headson -C 1200 -f json -t strict logs/*.json
- Glance at a file, JavaScript-style comments for omissions:
  headson -c 400 -f json -t detailed data.json
- YAML with detailed comments:
  headson -c 400 -f yaml -t detailed config.yaml

- Single file (auto):
  headson -c 200 notes.txt
- Force Text ingest/output (useful when mixing with other extensions):
  headson -c 200 -i text -f text notes.txt
- Many text files (fileset):
  headson -c 800 -i text -f text logs/*.txt
- Styles on Text:
  - default: omission as a standalone … line.
  - detailed: omission as … N more lines ….
  - strict: no array-level omission line (individual long lines may still truncate with …).
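To make the three Text styles concrete, here is a rough Python sketch that keeps the first few lines of a text and appends the omission marker each style describes. The function preview_lines is hypothetical: it truncates by a fixed line count and always keeps the head, whereas headson itself works from a budget and may sample differently.

```python
from typing import List


def preview_lines(lines: List[str], keep: int, style: str = "default") -> List[str]:
    """Hypothetical sketch of Text-mode omission markers per style."""
    if style not in ("strict", "default", "detailed"):
        raise ValueError(f"unknown style: {style}")
    if keep < 0:
        raise ValueError("keep must be non-negative")
    kept = lines[:keep]
    omitted = len(lines) - len(kept)
    if omitted == 0 or style == "strict":
        return kept  # strict prints no omission line
    if style == "detailed":
        return kept + [f"… {omitted} more lines …"]
    return kept + ["…"]  # default: a standalone ellipsis line


log = ["start", "load config", "connect", "ready"]
for style in ("default", "detailed", "strict"):
    print(style, preview_lines(log, 2, style))
```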
 
Show help:
headson --help
Note: flags align with head/tail conventions (-c/--bytes, -C/--global-bytes).
Input:
{"users":[{"id":1,"name":"Ana","roles":["admin","dev"]},{"id":2,"name":"Bo"}],"meta":{"count":2,"source":"db"}}
Naive cut (can break mid-token):
jq -c . users.json | head -c 80
# {"users":[{"id":1,"name":"Ana","roles":["admin","dev"]},{"id":2,"name":"Bo"}],"me
Structured preview with headson (JSON family, default style → Pseudo):
headson -c 120 -f json -t default users.json
# {
#   users: [
#     { id: 1, name: "Ana", roles: [ "admin", … ] },
#     …
#   ]
#   meta: { count: 2, … }
# }
Machine-readable preview (JSON family, strict style → strict JSON):
headson -c 120 -f json -t strict users.json
# {"users":[{"id":1,"name":"Ana","roles":["admin"]}],"meta":{"count":2}}

Regenerate locally:
- Place tapes under docs/tapes (e.g., docs/tapes/demo.tape)
 - Run: cargo make tapes
 - Outputs are written to docs/assets/tapes
 
A thin Python extension module is available on PyPI as headson.
- Install: pip install headson (ABI3 wheels for Python 3.10+ on Linux/macOS/Windows).
- API:
  - headson.summarize(text: str, *, format: str = "auto", style: str = "default", input_format: str = "json", byte_budget: int | None = None, skew: str = "balanced") -> str
    - format: "auto" | "json" | "yaml" (auto maps to JSON family for single inputs)
    - style: "strict" | "default" | "detailed"
    - input_format: "json" | "yaml" (ingestion)
    - byte_budget: maximum output size in bytes (default: 500)
    - skew: "balanced" | "head" | "tail" (affects display styles; strict JSON remains unannotated)
 
Examples:
import json
import headson
data = {"foo": [1, 2, 3], "bar": {"x": "y"}}
preview = headson.summarize(json.dumps(data), format="json", style="strict", byte_budget=200)
print(preview)
# Prefer the tail of arrays (annotations show with style="default"/"detailed")
print(
    headson.summarize(
        json.dumps(list(range(100))),
        format="json",
        style="detailed",
        byte_budget=80,
        skew="tail",
    )
)
# YAML support
doc = "root:\n  items: [1,2,3,4,5,6,7,8,9,10]\n"
print(headson.summarize(doc, format="yaml", style="default", input_format="yaml", byte_budget=60))

- [1] Optimized tree representation: An arena-style tree stored in flat, contiguous buffers. Each node records its kind and value plus index ranges into shared child and key arrays. Arrays are ingested in a single pass and may be deterministically pre-sampled: the first element is always kept; additional elements are selected via a fixed per-index inclusion test; for kept elements, original indices are stored and full lengths are counted. This enables accurate omission info and internal gap markers later, while minimizing pointer chasing.
- [2] Priority order: Nodes are scored so previews surface representative structure and values first. Arrays can favor head/mid/tail coverage (default) or strictly the head; tail preference flips head/tail when configured. Object properties are ordered by key, and strings expand by grapheme with early characters prioritized over very deep expansions.
- [3] Choose top N nodes (binary search): Iteratively picks N so that the rendered preview fits within the byte budget, looping between "choose N" and a render attempt to converge quickly.
- [4] Render attempt: Serializes the currently included nodes using the selected template. Omission summaries and per-file section headers appear in display templates (pseudo/js); json remains strict. For arrays, display templates may insert internal gap markers between non-contiguous kept items using original indices.
- [5] Diagram source: The Algorithm diagram is generated from docs/diagrams/algorithm.mmd. Regenerate the SVG with cargo make diagrams before releasing.
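Steps [3] and [4] can be illustrated with a toy Python version of the "choose N, then render" loop. This is a simplified sketch, not headson's implementation: it treats the priority-ordered nodes as a flat list of already-formatted items, uses a hypothetical array renderer, and assumes that including more nodes never makes the rendered output shorter within the range searched.

```python
from typing import Callable, Sequence


def largest_fitting_preview(
    nodes_by_priority: Sequence[str],
    render: Callable[[Sequence[str]], str],
    byte_budget: int,
) -> str:
    """Binary-search the largest N whose rendering fits the byte budget."""
    lo, hi = 0, len(nodes_by_priority)
    best = render(nodes_by_priority[:0])  # minimal preview, returned even if tight
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        candidate = render(nodes_by_priority[:mid])
        if len(candidate.encode("utf-8")) <= byte_budget:
            lo, best = mid, candidate  # fits: try to include more nodes
        else:
            hi = mid - 1  # too big: include fewer nodes
    return best


def make_array_renderer(total: int) -> Callable[[Sequence[str]], str]:
    """Hypothetical renderer: a flat array with a trailing omission marker."""
    def render(kept: Sequence[str]) -> str:
        parts = list(kept)
        if len(kept) < total:
            parts.append("…")
        return "[" + ", ".join(parts) + "]"
    return render


items = [str(i) for i in range(100)]
preview = largest_fitting_preview(items, make_array_renderer(len(items)), 30)
print(preview)  # [0, 1, 2, 3, 4, 5, 6, 7, …]
```

Each probe costs one render, so the number of render attempts grows with the logarithm of the node count rather than linearly.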
MIT
