SpecCompiler: A type system for Markdown on top of Pandoc AST #11493
crisclacerda
started this conversation in
Show and tell
Hi everyone,
I wanted to share an open-source tool we built on top of pandoc.
We wanted to write software specs in plain Markdown, but we needed strict DO-178C traceability for airborne software. At some point, I joked: “we just need to marry pandoc and SQLite.”
That joke initially turned into a dirty amalgamation of 1,500 lines of Python, Lua filters, SQL scripts, and Makefiles. But despite being a Frankenstein's monster, we used it to write specifications for real airborne software. The workflow was so effective, and the results were so good, that we decided we had to formalize it, rewrite it properly, and open-source it.
The result is SpecCompiler (https://github.com/SpecIR/SpecCompiler). It’s alpha, and there are rough edges everywhere, but it compiles its own documentation, so you can clone the repo and start working immediately.
What it does:
SpecCompiler lowers the pandoc AST into a typed, ReqIF-inspired, relational intermediate representation (SpecIR) and executes declarative structural constraints over it. If traceability is broken, attributes are missing, or relations are ill-typed, the build fails.
If it passes, it emits:
How it works:
The Lua filter's primary job is to lower the pandoc AST into a relational intermediate representation (SpecIR) persisted in SQLite. All substantial reasoning (cross-reference resolution, invariant checking, mandatory attribute enforcement, numbering, traceability, and global transformations) happens over the IR. Once the IR is verified, a transformed AST is reconstructed and handed back to pandoc.
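To make the "lower, then check" idea concrete, here is a minimal sketch in Python. It is not SpecCompiler's actual schema or code: the `object` and `relation` tables, the `HLR`/`LLR` type names, the `traces_to` field, and the hand-written `DOC` list (standing in for a parsed document) are all assumptions made up for illustration.

```python
import sqlite3

# Hypothetical stand-in for a parsed document: each entry is one spec object
# with an id, a type, and the ids it claims to trace to.
DOC = [
    {"id": "HLR-001", "type": "HLR", "traces_to": []},
    {"id": "LLR-001", "type": "LLR", "traces_to": ["HLR-001"]},
    {"id": "LLR-002", "type": "LLR", "traces_to": ["HLR-999"]},  # dangling target
]

def lower(doc):
    """Lower the tree-shaped input into relational tables (illustrative schema)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE object (id TEXT PRIMARY KEY, type TEXT NOT NULL);
        CREATE TABLE relation (source TEXT NOT NULL, target TEXT NOT NULL);
    """)
    for node in doc:
        db.execute("INSERT INTO object VALUES (?, ?)", (node["id"], node["type"]))
        for target in node["traces_to"]:
            db.execute("INSERT INTO relation VALUES (?, ?)", (node["id"], target))
    return db

def dangling_relations(db):
    """Return (source, target) pairs whose target object does not exist."""
    return db.execute("""
        SELECT r.source, r.target
        FROM relation r
        LEFT JOIN object o ON o.id = r.target
        WHERE o.id IS NULL
        ORDER BY r.source
    """).fetchall()

if __name__ == "__main__":
    broken = dangling_relations(lower(DOC))
    if broken:
        # A real build would fail here; this sketch only reports.
        for source, target in broken:
            print(f"error: {source} traces to unknown object {target}")
```

The point of the sketch is that a global check such as "every trace link resolves" becomes a single query over the whole document set, rather than a tree walk that has to carry state between nodes.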
In that sense, this is not “another Lua filter,” but an IR-centric design where pandoc provides parsing and rendering, and the middle-end provides semantics and correctness guarantees.
Why this might matter:
Once documents are lowered into a persistent, queryable IR, you can enforce global invariants across large document sets, fail builds deterministically when structure breaks, and reason about documentation as a relational database rather than a tree. For teams pushing pandoc beyond simple conversion, this opens up a different class of guarantees and automation.
If you’re doing non-trivial validation or global transformations on top of the pandoc AST, I’d genuinely appreciate feedback on whether this approach resonates or where it could integrate better with existing workflows.
The type system is completely extensible. If you can express your domain as objects, attributes, and relations, you can type-check it. It turns out that building a relational type system for Markdown goes way beyond safety-critical software.
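As a rough illustration of what "objects, attributes, and relations" can mean in practice, the sketch below declares a tiny domain as plain data and checks instances against it. The `TYPES` and `RELATIONS` tables, the type names, and the attribute names are hypothetical and are not SpecCompiler's real declaration format.

```python
# Hypothetical domain declaration: object types with mandatory attributes,
# and relation names with the (source type, target type) they must connect.
TYPES = {
    "HLR": {"required": {"status"}},
    "LLR": {"required": {"status", "rationale"}},
}
RELATIONS = {"traces_to": ("LLR", "HLR")}

def check(objects, relations):
    """Return a list of error strings; an empty list means the input type-checks."""
    errors = []
    by_id = {o["id"]: o for o in objects}
    # Mandatory attribute enforcement.
    for o in objects:
        missing = TYPES[o["type"]]["required"] - set(o["attrs"])
        for name in sorted(missing):
            errors.append(f"{o['id']}: missing attribute '{name}'")
    # Relation typing: both endpoints must have the declared types.
    for name, src, dst in relations:
        expected = RELATIONS[name]
        actual = (by_id[src]["type"], by_id[dst]["type"])
        if actual != expected:
            errors.append(
                f"{src} -{name}-> {dst}: expected {expected[0]}->{expected[1]}, "
                f"got {actual[0]}->{actual[1]}"
            )
    return errors

if __name__ == "__main__":
    objects = [
        {"id": "HLR-001", "type": "HLR", "attrs": {"status": "approved"}},
        {"id": "LLR-001", "type": "LLR", "attrs": {"status": "draft"}},
    ]
    relations = [("traces_to", "HLR-001", "LLR-001")]  # wrong direction
    for line in check(objects, relations):
        print("error:", line)
```

Swapping in a different domain only means editing the two declaration tables; the checker itself does not change, which is the sense in which a type system like this can be extended.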
Thanks for all the fish!