v0.57.0 — GFQL performance: parse caching, single-hop fast path, structured returns
A performance-focused GFQL release. Most changes are transparent (same results, faster); one is a user-visible output improvement (RETURN a).
Highlights
Faster GFQL, same results
- Parse memoization: repeated Cypher queries skip the ~15 ms lark parse (parse_cypher), and repeated row expressions skip their parse and transformer rebuild (parse_expr). This is ~1.3–1.7× faster end-to-end on small/interactive queries; the fixed per-call compile cost of a repeated query drops to roughly parity with the equivalent native chain. Cached ASTs are deeply immutable, so sharing is safe.
- Single-hop fast path (pandas + cuDF): node-only MATCH (n) and 1-hop MATCH (a {f})-[e]->(b) skip the BFS forward/backward/combine machinery. ~100× faster on pandas (node filter 204→2 ms at 10m rows); on cuDF it stays on the resident frame (a couple of semi-joins instead of the BFS + ~31 drop_duplicates calls). Differential-verified equivalent to the full path across 440 random graphs + targeted edge cases.
- dtype-gated detection: numeric/bool columns skip the spurious astype(str) + regex scan in temporal/list detection. where_rows is ~3.1× faster (pandas), ~4.4–13.3× (cuDF).
- Redundant-dedup removal: dropped .unique() passes that only fed .isin() membership (byte-identical; one fewer GPU kernel launch per hop).
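To make the "couple of semi-joins" idea concrete, here is a minimal pandas sketch of a 1-hop match expressed as two membership filters. It is an illustration under stated assumptions, not the library's implementation: the toy graph, the column names (id, kind, src, dst), and the single_hop helper are all hypothetical.

```python
import pandas as pd

# Hypothetical toy graph; the column names are assumptions for this sketch.
nodes = pd.DataFrame({"id": [0, 1, 2, 3, 4], "kind": ["x", "y", "x", "y", "y"]})
edges = pd.DataFrame({"src": [0, 1, 2, 3], "dst": [1, 2, 3, 4]})


def single_hop(nodes, edges, node_filter):
    """One forward hop from filtered source nodes, written as two semi-joins."""
    # Apply the source-node property filter, i.e. the {f} in (a {f}).
    mask = pd.Series(True, index=nodes.index)
    for col, val in node_filter.items():
        mask &= (nodes[col] == val)
    src_ids = nodes.loc[mask, "id"]

    # Semi-join 1: keep edges whose source is a matching node.
    # isin() only tests membership, so no .unique() pass is needed first.
    hop_edges = edges[edges["src"].isin(src_ids)]

    # Semi-join 2: keep nodes that appear as an endpoint of a surviving edge.
    touched = nodes["id"].isin(hop_edges["src"]) | nodes["id"].isin(hop_edges["dst"])
    return nodes[touched], hop_edges


out_nodes, out_edges = single_hop(nodes, edges, {"kind": "x"})
print(out_nodes)
print(out_edges)
```

With this toy data the filter selects nodes 0 and 2, two edges survive, and node 4 is dropped because no surviving edge touches it. The sketch ignores details a real engine has to handle, such as edge filters, dangling edges, and reverse or undirected hops.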
Structured whole-entity returns (#1650) — user-visible
- Terminal Cypher RETURN a (a whole node/edge) now emits structured flattened columns (a.id, a.val, …) instead of a single Cypher display string. The columns already exist before projection, so this is near-free, lossless, directly usable, and serializes to JSON/CSV/Parquet/Arrow. Measured ~2–6.4× faster (pandas) / ~2.7–4.3× (cuDF).
- The human-readable display string is still available on demand via render_entity_text(result, alias).
- Migration: if you previously parsed the rendered ({id: …, …}) string out of a RETURN a column, read the a.* columns directly instead (or call render_entity_text).
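For the migration path, the following pandas sketch shows one way to read the flattened a.* columns rather than parsing a display string. The result frame here is hand-built to mimic the shape described above, and entity_columns is a hypothetical helper, not part of the library.

```python
import json

import pandas as pd

# Hypothetical stand-in for a query result with flattened entity columns.
result = pd.DataFrame({"a.id": [1, 2], "a.val": [10.5, 20.0], "b.id": [7, 8]})


def entity_columns(df, alias):
    """Select the flattened columns for one alias and strip the 'alias.' prefix."""
    prefix = f"{alias}."
    cols = [c for c in df.columns if c.startswith(prefix)]
    return df[cols].rename(columns=lambda c: c[len(prefix):])


a = entity_columns(result, "a")

# The structured columns serialize directly; no string parsing is involved.
records = a.to_dict(orient="records")
as_json = a.to_json(orient="records")
print(records)
print(as_json)
```

Because each property is an ordinary column, the same frame can be passed to to_csv or to_parquet without an intermediate parsing step.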
Fixes
- Single-hop fast path now honors prune_to_endpoints (was returning both endpoints) by falling back to the full path.
- RETURN a, a.val no longer emits a duplicate a.val column.
- Chains on edges-only graphs (no node-id binding) preserve the materialized binding through the full path (fixes a corrupt result / NotImplementedError on Python 3.14).
- The chain fast path no longer bypasses prechain/postchain/postload policy hooks (a policy can deny execution); policy-bearing queries take the full path.
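The fixes above share one pattern: the fast path is only taken when it is known to be safe, and everything else goes through the full path. A rough sketch of such an eligibility gate is shown below. The QueryPlan type, its fields, and fast_path_eligible are hypothetical names invented for illustration; only the hook names and prune_to_endpoints come from these notes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hook names taken from the release notes; how they are registered is assumed.
POLICY_HOOKS = ("prechain", "postchain", "postload")


@dataclass
class QueryPlan:
    """Hypothetical stand-in for a compiled query, not a library type."""
    hops: int
    prune_to_endpoints: bool = False
    policy: Dict[str, Callable] = field(default_factory=dict)


def fast_path_eligible(plan: QueryPlan) -> bool:
    """Return True only when skipping the full path cannot change behavior."""
    if plan.hops > 1:
        return False  # the fast path covers node-only and 1-hop patterns
    if plan.prune_to_endpoints:
        return False  # endpoint pruning is handled by the full path
    if any(hook in plan.policy for hook in POLICY_HOOKS):
        return False  # policy hooks must run, and may deny execution
    return True


print(fast_path_eligible(QueryPlan(hops=1)))
print(fast_path_eligible(QueryPlan(hops=1, prune_to_endpoints=True)))
print(fast_path_eligible(QueryPlan(hops=1, policy={"prechain": lambda ctx: None})))
```

Keeping the gate conservative means an unrecognized option costs only speed, never correctness, since the full path remains the reference behavior.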
Compatibility
No API changes. The one behavior change is the RETURN a output format (above). Default un-indexed traversal behavior is unchanged. pandas/cuDF production paths validated; polars/dask/spark untouched by the fast path.