This document captures key learnings, debugging strategies, and architectural nuances discovered during the development of the @masthead-data/dataform-package.
Dataform executes JavaScript during the compilation phase. To trace what's happening:
- Use
console.error()for debug logs. This ensures logs go tostderrand don't corrupt the JSON output redirected to a file. - Avoid
console.log()inside Dataform definitions if you plan to pipe the output to a JSON parser, as it may inject plain text into the JSON stream.
To see the final state of all actions:
cd test-project
npx @dataform/cli compile --json > compiled.jsonInspect the tables, operations, and assertions arrays in the resulting JSON. Check preOps and queries for the injected SET @@reservation statements.
Use the provided verification script to check invariants:
node scripts/verify_compilation.jsThis script validates that reservations are prepended and that assertions are skipped.
See CONTRIBUTING.md for detailed testing instructions, including matrix testing and fast iteration modes.
CI Integration: GitHub Actions runs matrix tests on every PR.
See CONTRIBUTING.md for details on handling platform-specific bindings and avoiding missing dependency errors in CI.
Dataform makes a distinction between Action Builders (the objects returned by publish(), operate(), etc.) and the final Proto Objects (the serialized state).
- Modification Point: To ensure persistence, modifications should be made to
action.contextablePreOpsoraction.contextableQuerieson the Builder. If you only modifyproto.preOps, Dataform's internal resolution logic might overwrite your changes during the final compilation phase.
In .sqlx files, pre_operations { ... } blocks are internal to Dataform. When monkeypatching, we must ensure our reservation statement is prepended (using .unshift()) so it executes before any user-defined variables or temporary functions.
For operations, the SQL is often set via .queries(["SQL"]). This method can be called multiple times or late in the script. We monkeypatch this method on the builder instance to wrap the user's input, ensuring the reservation is always at the top of the list, regardless of when queries() is called.
Assertions in Dataform are strict. They expect a single SELECT statement. Prepending a SET statement will cause a syntax error in BigQuery because assertions are often wrapped in subqueries or views by Dataform. We explicitly skip assertions in this package.
Operations where DECLARE is the first statement at the outer level are automatically skipped. BigQuery requires DECLARE before any other statements in a script, so prepending SET @@reservation would fail. The package strips leading whitespace and SQL comments (--, #, /* */) to reliably detect this case. DECLARE inside BEGIN...END or EXECUTE IMMEDIATE is not flagged — reservation is applied normally in those cases.
See CONTRIBUTING.md for the full release workflow steps.
Performance: findReservation uses linear scan (acceptable for typical project sizes <1000 actions)