v0.57.0 — GFQL perf: parse caching, single-hop fast path, structured returns

@lmeyerov lmeyerov released this 29 Jun 02:16
eac2630

A performance-focused GFQL release. Most changes are transparent (same results, faster); one is a user-visible output improvement (RETURN a).

Highlights

Faster GFQL, same results

  • Parse memoization — repeated Cypher queries skip the ~15 ms lark parse (parse_cypher), and repeated row-expressions skip their parse + transformer rebuild (parse_expr). ~1.3–1.7× faster end-to-end on small/interactive queries; the fixed per-call compile cost of a repeated query drops to ≈ parity with the equivalent native chain. Cached ASTs are deeply immutable, so sharing is safe.
  • Single-hop fast path (pandas + cuDF) — node-only MATCH (n) and 1-hop MATCH (a {f})-[e]->(b) skip the BFS forward/backward/combine machinery. ~100× faster on pandas (node filter 204→2 ms @10m rows); on cuDF it stays on the resident frame (a couple of semi-joins instead of the BFS + ~31 drop_duplicates). Differential-verified equivalent to the full path across 440 random graphs + targeted edge cases.
  • dtype-gated detection — numeric/bool columns skip the spurious astype(str) + regex scan in temporal/list detection: where_rows ~3.1× (pandas), ~4.4–13.3× (cuDF).
  • Redundant-dedup removal — dropped .unique() passes that only fed .isin() membership (byte-identical; one fewer GPU kernel launch per hop).
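
To make two of these ideas concrete, here is a minimal Python sketch. It is not the library's implementation: `parse_toy` and `one_hop` are hypothetical stand-ins, and the column names (`id`, `src`, `dst`) are assumptions. The first part shows memoizing a parse so that a repeated query string returns the same immutable object; the second shows a 1-hop match expressed as `isin` semi-joins rather than a general traversal.

```python
from functools import lru_cache

import pandas as pd


@lru_cache(maxsize=256)
def parse_toy(query: str) -> tuple:
    """Hypothetical stand-in for an expensive parse.

    Returns a tuple, which is immutable, so the cached value can be
    shared between callers without defensive copies.
    """
    return tuple(query.split())


def one_hop(nodes, edges, node_filter, node_id="id", src="src", dst="dst"):
    """Sketch of MATCH (a {f})-[e]->(b) as two semi-joins (assumed schema)."""
    # Filter candidate start nodes by exact attribute match.
    mask = pd.Series(True, index=nodes.index)
    for col, val in node_filter.items():
        mask &= nodes[col] == val
    start_ids = nodes.loc[mask, node_id]

    # Semi-join 1: keep edges leaving a start node. isin() only tests
    # membership, so de-duplicating start_ids first would not change the result.
    hit_edges = edges[edges[src].isin(start_ids)]

    # Semi-join 2: keep nodes that are an endpoint of a kept edge.
    endpoint_ids = pd.concat([hit_edges[src], hit_edges[dst]])
    hit_nodes = nodes[nodes[node_id].isin(endpoint_ids)]
    return hit_nodes, hit_edges


nodes = pd.DataFrame({"id": [1, 2, 3, 4, 5], "kind": ["x", "y", "x", "y", "x"]})
edges = pd.DataFrame({"src": [1, 2, 3, 1], "dst": [2, 3, 4, 2]})
hit_nodes, hit_edges = one_hop(nodes, edges, {"kind": "x"})
```

In this toy graph, node 5 matches the filter but has no outgoing edge, so it is absent from `hit_nodes`; the edge from node 2 is dropped because node 2 fails the filter.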

Structured whole-entity returns (#1650) (user-visible)

  • Terminal Cypher RETURN a (a whole node/edge) now emits structured flattened columns (a.id, a.val, …) instead of a single Cypher display string. The columns already exist before projection, so this is near-free, lossless, directly usable, and serializes to JSON/CSV/Parquet/Arrow. Measured speedup: ~2–6.4× (pandas) / ~2.7–4.3× (cuDF).
  • The human-readable display string is still available on demand via render_entity_text(result, alias).
  • Migration: if you previously parsed the rendered ({id: …, …}) string out of a RETURN a column, read the a.* columns directly instead (or call render_entity_text).
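
As a rough illustration of the migration, the sketch below reads the flattened a.* columns from a result frame with plain pandas. The `result` DataFrame is a hand-built stand-in for a query result, and `entity_columns` is a hypothetical helper, not part of the library; it assumes string column names of the form `alias.property`.

```python
import pandas as pd


def entity_columns(df: pd.DataFrame, alias: str) -> pd.DataFrame:
    """Select the flattened `alias.*` columns and strip the alias prefix."""
    prefix = f"{alias}."
    cols = [c for c in df.columns if c.startswith(prefix)]
    return df[cols].rename(columns=lambda c: c[len(prefix):])


# Stand-in for the output of a query ending in RETURN a (assumed shape).
result = pd.DataFrame({"a.id": [1, 2], "a.val": [10.5, 20.0]})

a = entity_columns(result, "a")
records = a.to_dict(orient="records")  # one plain dict per returned entity
```

Because the properties arrive as ordinary typed columns, no string parsing is needed before filtering, joining, or serializing them.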

Fixes

  • Single-hop fast path now honors prune_to_endpoints by falling back to the full path (it previously returned both endpoints).
  • RETURN a, a.val no longer emits a duplicate a.val column.
  • Chains on edges-only graphs (no node-id binding) preserve the materialized binding through the full path (fixes a corrupt result / NotImplementedError on Python 3.14).
  • The chain fast path no longer bypasses prechain/postchain/postload policy hooks (a policy can deny execution) — policy-bearing queries take the full path.

Compatibility

No API changes. The one behavior change is the RETURN a output format (above). Default un-indexed traversal behavior is unchanged. pandas/cuDF production paths validated; polars/dask/spark untouched by the fast path.