feat(s16): SPARQL vs Ad4mModel side-by-side scenario #6
Open
HexaField wants to merge 3 commits into
Side-by-side bench of raw `perspective.querySparql` vs `perspective.modelQuery` on identical Flux-shaped data. It drives the per-site convert-vs-keep decisions in flux's `docs/sparql-to-ad4m-model-migration.md` (PR coasys/flux#605).

Seeds channel → messages → embeddings + SR reifiers + topics, registers SHACL classes inline (Message / Embedding / Topic / SemanticRelationship), and runs five cases at small + medium tiers:

- `sr_by_expression_limit1`: 1-row lookup, WHERE + LIMIT 1
- `sr_by_expression_with_include`: same + `include: { embeddingTag }`
- `sr_all`: scan all SRs
- `embeddings_all`: scan all embeddings
- `topics_all`: scan smaller topic set

Auto-flags whether `include` actually fires (within-5% noise check). `S16_RUNS=N` overrides per-case runs (default 10).

Also extends InstrumentedClient with addSdna / modelQuery / addLinks (bulk) so this and future model-query scenarios can drive the RPC directly instead of falling back to the Apollo SDK.

Baseline against dev: model_query is 14-150x slower than raw SPARQL across all cases at medium scale; ratio scales linearly with corpus size regardless of LIMIT. Full results land in flux PR #605's migration doc.
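The back-to-back timing described in this commit could look roughly like the sketch below. It is an illustration under stated assumptions, not the scenario's actual code: `benchPair`, `timeMs`, `median`, and `parseRuns` are hypothetical names, the two callbacks stand in for the `querySparql` and `modelQuery` calls, and using the median as the summary statistic is an assumption.

```typescript
// Hypothetical sketch of a back-to-back timing loop; all names are illustrative.
type Query = () => Promise<unknown>;

// Median of a non-empty list of samples.
function median(samples: number[]): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Parses a run-count override such as the value of S16_RUNS; falls back to 10.
function parseRuns(raw: string | undefined, fallback = 10): number {
  const n = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

// Wall-clock duration of one call, in milliseconds.
async function timeMs(q: Query): Promise<number> {
  const start = performance.now();
  await q();
  return performance.now() - start;
}

// Times `raw` and `model` alternately so both see the same perspective state.
async function benchPair(raw: Query, model: Query, runs: number) {
  await raw();   // warm-up, not recorded
  await model(); // warm-up, not recorded
  const rawMs: number[] = [];
  const modelMs: number[] = [];
  for (let i = 0; i < runs; i++) {
    rawMs.push(await timeMs(raw));
    modelMs.push(await timeMs(model));
  }
  const rawMedian = median(rawMs);
  const modelMedian = median(modelMs);
  return { rawMedian, modelMedian, ratio: modelMedian / rawMedian };
}
```

Interleaving the two calls inside one loop, rather than running all raw queries first, keeps cache and load conditions comparable for both sides of the ratio.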
The previous SHACL JSON put `@HasOne` relations in a separate top-level
`relations: []` array, which the Rust executor's `SHACLShape`
deserializer (`rust-executor/src/perspectives/shacl_parser.rs`) silently
drops — that field doesn't exist on `SHACLShape`. The shape loader then
never registered `embeddingTag` / `topicTag` in `include_relations`,
so `resolve_includes_recursive` silently no-op'd the include. That made
the v1 of this scenario report "include actually fires: no" as a
finding, when in reality the relation had never been registered.
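For context, a within-5% noise check of the kind that produced this report can be sketched as follows. `includeActuallyFires` is a hypothetical helper name, and the exact comparison the scenario performs is an assumption; only the 5% threshold comes from the scenario description.

```typescript
// Hypothetical helper: decides whether `include` changed latency measurably.
// Assumption: both inputs are summary latencies (for example medians) in ms.
function includeActuallyFires(
  plainMs: number,
  withIncludeMs: number,
  noise = 0.05,
): boolean {
  if (plainMs <= 0) return false; // no meaningful baseline to compare against
  const relativeDelta = Math.abs(withIncludeMs - plainMs) / plainMs;
  return relativeDelta > noise;
}
```

A timing-only heuristic like this reports that the include made no measurable difference; it cannot tell an unregistered relation apart from a cheap one.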
Fix: emit relations inside `properties:` with `relation_kind: "hasOne"`
and `target_class_name: "..."` (the canonical SHACL form the executor
expects — see `SHACLShape.toJSON()` in `@coasys/ad4m`'s
`core/src/shacl/SHACLShape.ts`). Same for `@Flag` properties, which are
now emitted as `has_value` + `min_count: 1` per the round-trip tests
in `model_query/round_trip_tests.rs`.
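The layout change can be illustrated with a small conversion sketch. Only `properties`, `relation_kind`, `target_class_name`, `has_value`, and `min_count` are taken from the text above; the `name`, `path`, and `target` fields, the `Legacy*` types, and both function names are assumptions made for illustration and may not match the real shape schema.

```typescript
// Illustrative only: `name`, `path`, `target`, and the Legacy* types are assumed.
interface PropertyShape {
  name: string;
  path: string;
  relation_kind?: "hasOne";
  target_class_name?: string;
  has_value?: string;
  min_count?: number;
}

interface LegacyRelation {
  name: string;
  path: string;
  target: string;
}

interface LegacyShape {
  properties: PropertyShape[];
  relations?: LegacyRelation[]; // top-level array that the deserializer dropped
}

// Folds legacy top-level relations into `properties` so they survive parsing.
function toCanonicalShape(shape: LegacyShape): { properties: PropertyShape[] } {
  const relationProps = (shape.relations ?? []).map(
    (r): PropertyShape => ({
      name: r.name,
      path: r.path,
      relation_kind: "hasOne",
      target_class_name: r.target,
    }),
  );
  return { properties: [...shape.properties, ...relationProps] };
}

// A flag expressed as a fixed value that must be present at least once.
function flagProperty(name: string, path: string, value: string): PropertyShape {
  return { name, path, has_value: value, min_count: 1 };
}
```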
With the fix:
- `include actually fires: yes` on both tiers
- medium-tier ratios drop from 18-150x to 5-25x:
    sr_by_expression_limit1         56x  → 5.0x
    sr_by_expression_with_include   55x  → 5.5x
    sr_all                          20x  → 9.4x
    embeddings_all                  18x  → 10.6x
    topics_all                      150x → 25.0x
The remaining gap (5-25x) is real and attributable to the
unconditional reifier-metadata join in `build_instance_sparql` plus
the always-fired COUNT query. Per-phase profiling lives in the flux
PR #605 doc.
Four new cases test the orchestrator-level opt-ins landing in coasys/ad4m#846:

| Case | Tests audit item | Expected impact |
|---|---|---|
| embeddings_all_no_metadata | A (withMetadata: false) | scan-all ratio ~3x lower |
| sr_by_expression_limit1_no_count | B (count: false) | -0.1ms per call |
| sr_by_id_single_plan | C (selective WHERE)+A+B | matches a Single-plan baseline |
| sr_all_no_metadata_no_count | A+B combined | scan-all near-parity with raw |

Each case keeps the same raw SPARQL on the left side so the ratio column reads "how much overhead does modelQuery still carry vs the hand-written SPARQL doing comparable work."

Running against `dev` exercises the back-compat default (these opt-in flags are ignored, so we get the same numbers as the pre-existing cases). Running against `refactor/sparql-pushdown-last-write-wins` (PR #846) will show the ratio collapses to roughly raw-SPARQL parity for these flagged-off cases.

Test plan
- npx tsc --noEmit clean
- ./run.sh --branch dev / #846 --scenario s16 (TBC after #846's release executor finishes building)
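As a rough illustration of the opt-in flags in the table above, the sketch below builds the query payloads for the flagged and unflagged cases. Only `withMetadata`, `count`, and `include` are named in this PR; the `where` and `limit` keys, the `ModelQuery` type, both helper names, and the assumption that an omitted flag defaults to the expensive path are all hypothetical.

```typescript
// Hypothetical query shape; `where` and `limit` are assumed for illustration.
interface ModelQuery {
  where?: Record<string, unknown>;
  limit?: number;
  include?: Record<string, unknown>;
  withMetadata?: boolean;
  count?: boolean;
}

// Assumed back-compat behaviour: an omitted flag keeps the existing path on.
function effectiveFlags(q: ModelQuery): { withMetadata: boolean; count: boolean } {
  return { withMetadata: q.withMetadata ?? true, count: q.count ?? true };
}

// Serialises a query for a modelQuery(uuid, className, queryJson)-style call.
function toQueryJson(q: ModelQuery): string {
  return JSON.stringify(q);
}

// sr_all: no flags passed, so the metadata join and COUNT query both stay on.
const srAll: ModelQuery = {};

// sr_all_no_metadata_no_count: audit items A and B combined.
const srAllNoMetadataNoCount: ModelQuery = { withMetadata: false, count: false };
```

Keeping the flags optional is what lets the same scenario run against both branches: a binary that does not know the flags simply ignores them.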
Summary
Adds `s16-sparql-vs-model`, a side-by-side bench of raw `perspective.querySparql` vs `perspective.modelQuery` on identical Flux-shaped data.
Drives the per-site convert-vs-keep decisions in flux's `docs/sparql-to-ad4m-model-migration.md` (flux #605) and serves as the regression gate against the `model_query` orchestrator overhaul in `coasys/ad4m#846`.

What S16 does
Seeds channel → messages (body/author/timestamp) + embeddings + SR reifiers linking each message to an embedding + topics; registers SHACL classes inline (`Message`, `Embedding`, `Topic`, `SemanticRelationship`); then, for each candidate query, times raw `querySparql` against the equivalent `modelQuery` call back-to-back on the same perspective.

9 cases × 2 tiers (`small` = 100 items / 1051 links, `medium` = 1000 / 10151):

- `sr_by_expression_limit1`: `LIMIT 1` short-circuit
- `sr_by_expression_with_include`: `@HasOne` polymorphic-on-same-predicate
- `sr_all`
- `embeddings_all`
- `topics_all`
- `embeddings_all_no_metadata`: `withMetadata: false`
- `sr_by_expression_limit1_no_count`: `count: false`
- `sr_by_id_single_plan`
- `sr_all_no_metadata_no_count`

Auto-flags `include actually fires: yes/no` (within-5% noise check). `S16_RUNS=N` overrides per-case runs (default 10).

Cross-branch results (`dev` HEAD `1f29d0b1` vs `refactor/sparql-pushdown-last-write-wins` HEAD `376d4b1b`)
Both binaries built fresh into the same `CARGO_TARGET_DIR` from the same Rust toolchain. Apple Silicon, 10 runs/case + 1 warm-up.

S16: medium tier (1000 items, 10151 links)
Table rows: `sr_by_expression_limit1`, `sr_by_expression_with_include`, `sr_all`, `embeddings_all`, `topics_all`, `embeddings_all_no_metadata`, `sr_by_expression_limit1_no_count`, `sr_by_id_single_plan`, `sr_all_no_metadata_no_count`.

S16: small tier (100 items, 1051 links)
Table rows: `embeddings_all_no_metadata`, `sr_by_id_single_plan`, `sr_all_no_metadata_no_count`, `sr_by_expression_limit1_no_count`.

Back-compat: cases that don't pass the new opt-in flags are within ±10% noise on #846; the cheaper paths only engage when the caller asks for them.

Cross-scenario impact: S5 + S8 (no S16-style changes, but share `SparqlStore`)
Ran S5 and S8 against both binaries to verify the K (`Solutions → Vec<Value>`) refactor doesn't regress pre-existing query paths.

S5 (`queryLinks` scaling) never touches `model_query`.

S8 (raw `querySparql` on Flux community graph, 1865-link tier), 9 raw SPARQL queries: `totalItemCount`, `allItems`, `unprocessedItems`, `recentConversations`, `paginatedMessages`.

S8 medium tier (58460 links): every query within ±8% of dev; parity dominates at that scale.

No regressions; small incidental wins where the per-call serialize/parse overhead from K is a meaningful fraction of total latency.

v1 SHACL-emission bug (now fixed)
Reviewers note: an earlier version of this branch put `@HasOne` relations in a separate top-level `relations: []` array, which Rust's `SHACLShape` deserializer silently drops. That made `include actually fires: no` a false positive and inflated ratios 3-10×. Fixed in `1f11143` by emitting relations as PropertyShape entries with `relation_kind: "hasOne"` per the canonical `@coasys/ad4m` `SHACLShape.toJSON()` form.

Client surface added
`InstrumentedClient`:
- `addSdna(uuid, name, shaclJson, sdnaType?, sdnaCode?)`
- `modelQuery(uuid, className, queryJson)`
- `addLinks(uuid, links, status?)` (bulk variant)

Test plan
- `npx tsc --noEmit` clean

🤖 Generated with Claude Code