feat(s16): SPARQL vs Ad4mModel side-by-side scenario#6

Open
HexaField wants to merge 3 commits into main from feat/s16-sparql-vs-model

HexaField (Collaborator) commented Jun 4, 2026

Summary

Adds s16-sparql-vs-model — a side-by-side bench of raw perspective.querySparql vs perspective.modelQuery on identical Flux-shaped data.

Drives the per-site convert-vs-keep decisions in flux's docs/sparql-to-ad4m-model-migration.md (flux #605) and serves as the regression gate against the model_query orchestrator overhaul in coasys/ad4m#846.

What S16 does

Seeds channel → messages (body/author/timestamp), embeddings, SemanticRelationship (SR) reifiers linking each message to an embedding, and topics. Registers SHACL classes inline (Message, Embedding, Topic, SemanticRelationship). Then, for each candidate query, times raw querySparql against the equivalent modelQuery call back-to-back on the same perspective.

9 cases × 2 tiers (small = 100 items / 1051 links, medium = 1000 / 10151):

| Case | Tests |
|---|---|
| sr_by_expression_limit1 | LIMIT 1 short-circuit |
| sr_by_expression_with_include | @HasOne polymorphic-on-same-predicate |
| sr_all | Full-scan overhead (~1030 SRs × 3 properties at medium) |
| embeddings_all | Simpler scan (1000 × 2 properties) |
| topics_all | Smallest result set: does fixed overhead dominate? |
| embeddings_all_no_metadata | #846 audit item A (`withMetadata: false`) |
| sr_by_expression_limit1_no_count | #846 audit item B (`count: false`) |
| sr_by_id_single_plan | #846 audit item C: selective WHERE skips TwoPhase |
| sr_all_no_metadata_no_count | #846 A+B combined |

Auto-flags `include actually fires: yes/no` (within-5% noise check). `S16_RUNS=N` overrides per-case runs (default 10).
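The measurement loop described above could be sketched roughly as follows. This is a minimal illustration, not the scenario's actual code: every function name here is hypothetical, and the `includeFires` check is an assumed reading of the "within-5% noise check" (the include counts as firing only if it shifts mean latency by more than 5%).

```typescript
// Hypothetical helper names; a sketch of the measurement approach, not the scenario's code.

/** Parse a run count such as the S16_RUNS value; fall back to 10 when unset or invalid. */
function parseRuns(raw: string | undefined, fallback = 10): number {
  const n = parseInt(raw ?? "", 10);
  return n > 0 ? n : fallback;
}

/** Run one untimed warm-up, then time `runs` calls of `fn`; returns per-call ms. */
async function timeRuns(
  fn: () => Promise<unknown>,
  runs: number,
  now: () => number = () => performance.now(),
): Promise<number[]> {
  await fn(); // warm-up, excluded from the samples
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = now();
    await fn();
    samples.push(now() - start);
  }
  return samples;
}

function mean(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  return samples.reduce((sum, x) => sum + x, 0) / samples.length;
}

/** Overhead ratio: mean modelQuery latency over mean raw SPARQL latency. */
function overheadRatio(rawMs: number[], modelMs: number[]): number {
  return mean(modelMs) / mean(rawMs);
}

/** Assumed noise check: the include "fires" only if it moves the mean by more than `tolerance`. */
function includeFires(withoutMs: number[], withMs: number[], tolerance = 0.05): boolean {
  const base = mean(withoutMs);
  return Math.abs(mean(withMs) - base) / base > tolerance;
}

/** Time the raw query and then the model query back-to-back against the same data. */
async function timeCase(
  raw: () => Promise<unknown>,
  model: () => Promise<unknown>,
  runs: number,
): Promise<{ rawMs: number[]; modelMs: number[]; ratio: number }> {
  const rawMs = await timeRuns(raw, runs);
  const modelMs = await timeRuns(model, runs);
  return { rawMs, modelMs, ratio: overheadRatio(rawMs, modelMs) };
}
```

A caller would presumably pass the `S16_RUNS` environment value to `parseRuns` and wrap each `querySparql` / `modelQuery` call in a closure before handing the pair to `timeCase`.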

Cross-branch results (dev HEAD 1f29d0b1 vs refactor/sparql-pushdown-last-write-wins HEAD 376d4b1b)

Both binaries built fresh into the same CARGO_TARGET_DIR from the same Rust toolchain. Apple Silicon, 10 runs/case + 1 warm-up.

S16 — Medium tier (1000 items, 10151 links)

| Case | dev ratio | #846 ratio | improvement |
|---|---|---|---|
| sr_by_expression_limit1 | 4.6× | 4.5× | 1.03× |
| sr_by_expression_with_include | 4.8× | 4.6× | 1.05× |
| sr_all | 9.0× | 8.5× | 1.06× |
| embeddings_all | 8.7× | 8.8× | 0.98× |
| topics_all | 25.1× | 29.7× | 0.85× (raw is sub-ms, RPC floor dominates) |
| embeddings_all_no_metadata | 9.0× | 3.7× | 2.44× ✅ |
| sr_by_expression_limit1_no_count | 4.1× | 2.8× | 1.47× ✅ |
| sr_by_id_single_plan | 4.1× | 1.2× | 3.36× ✅ |
| sr_all_no_metadata_no_count | 9.3× | 3.3× | 2.80× ✅ |

S16 — Small tier (100 items, 1051 links)

| Case | dev ratio | #846 ratio | improvement |
|---|---|---|---|
| embeddings_all_no_metadata | 7.0× | 3.3× | 2.11× ✅ |
| sr_by_id_single_plan | 2.5× | 1.3× | 1.97× ✅ |
| sr_all_no_metadata_no_count | 7.6× | 2.8× | 2.74× ✅ |
| sr_by_expression_limit1_no_count | 2.4× | 1.8× | 1.32× |
| (other five cases) | | | parity (within ±15% noise) |

Back-compat: cases that don't pass the new opt-in flags are within ±10% noise on #846 — the cheaper paths only engage when the caller asks for them.
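For readers checking the improvement column, it appears to be the dev ratio divided by the #846 ratio. The sketch below recomputes it for three medium-tier rows under that assumption; the type and function names are illustrative. Because the inputs here are the rounded one-decimal figures, recomputed values can differ slightly from the table (for example about 3.42× rather than 3.36× for `sr_by_id_single_plan`), presumably because the table was derived from unrounded timings.

```typescript
// Ratios copied from the medium-tier results (rounded to one decimal).
interface CaseRatios {
  name: string;
  devRatio: number; // modelQuery latency / raw SPARQL latency on dev
  prRatio: number; // the same ratio on the #846 branch
}

/** Improvement factor: how many times smaller the overhead ratio became. */
function improvement(c: CaseRatios): number {
  return c.devRatio / c.prRatio;
}

const mediumTier: CaseRatios[] = [
  { name: "embeddings_all_no_metadata", devRatio: 9.0, prRatio: 3.7 },
  { name: "sr_by_id_single_plan", devRatio: 4.1, prRatio: 1.2 },
  { name: "topics_all", devRatio: 25.1, prRatio: 29.7 },
];

for (const c of mediumTier) {
  // Values above 1 mean the overhead shrank on #846; below 1 means it grew.
  console.log(`${c.name}: ${improvement(c).toFixed(2)}x`);
}
```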

Cross-scenario impact — S5 + S8 (no S16-style changes, but share SparqlStore)

Ran S5 and S8 against both binaries to verify the K (Solutions → Vec<Value>) refactor doesn't regress pre-existing query paths.

S5 (queryLinks scaling) — never touches model_query:

| dataSize | queryAll dev | queryAll #846 | ratio |
|---|---|---|---|
| 100 | 4.11 ms | 3.75 ms | 0.91× |
| 500 | 22.39 ms | 19.46 ms | 0.87× |
| 1000 | 46.43 ms | 45.48 ms | 0.98× |

S8 (raw querySparql on Flux community graph, 1865-link tier) — 9 raw SPARQL queries:

| Query | dev avg | #846 avg | ratio |
|---|---|---|---|
| totalItemCount | 0.51 ms | 0.44 ms | 0.86× |
| allItems | 1.86 ms | 1.67 ms | 0.90× |
| unprocessedItems | 0.72 ms | 0.62 ms | 0.86× |
| recentConversations | 0.42 ms | 0.30 ms | 0.71× |
| paginatedMessages | 1.98 ms | 1.76 ms | 0.89× |
| (other 4 queries) | 0.18-0.36 ms | 0.15-0.29 ms | 0.81-0.96× |

S8 medium tier (58460 links): every query within ±8% of dev — parity dominates at that scale.

No regressions; small incidental wins where the per-call serialize/parse overhead from K is a meaningful fraction of total latency.

v1 SHACL-emission bug (now fixed)

Reviewers note: an earlier version of this branch put `@HasOne` relations in a separate top-level `relations: []` array, which Rust's `SHACLShape` deserializer silently drops. The relation was therefore never registered, so the reported `include actually fires: no` was spurious and ratios were inflated 3-10×. Fixed in 1f11143 by emitting relations as PropertyShape entries with `relation_kind: "hasOne"`, per the canonical `@coasys/ad4m` `SHACLShape.toJSON()` form.
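To make the shape change concrete, here is a minimal sketch of moving relations from a top-level array into property entries. Only `relations`, `properties`, `relation_kind`, and `target_class_name` come from this PR; the `name` and `path` fields, the `example://` predicate, and all type and function names are hypothetical placeholders, so the real `SHACLShape` JSON may carry different or additional fields.

```typescript
// Hypothetical field names except relations / properties / relation_kind / target_class_name.
interface RelationV1 {
  name: string;
  path: string;
  targetClassName: string;
}

interface PropertyShape {
  name: string;
  path: string;
  relation_kind?: "hasOne";
  target_class_name?: string;
}

interface ShapeV1 {
  name: string;
  properties: PropertyShape[];
  relations: RelationV1[]; // top-level array that the deserializer drops
}

interface ShapeFixed {
  name: string;
  properties: PropertyShape[];
}

/** Fold each relation into `properties` so no top-level `relations` key is emitted. */
function emitRelationsAsProperties(shape: ShapeV1): ShapeFixed {
  const relationProps = shape.relations.map(
    (r): PropertyShape => ({
      name: r.name,
      path: r.path,
      relation_kind: "hasOne",
      target_class_name: r.targetClassName,
    }),
  );
  return { name: shape.name, properties: shape.properties.concat(relationProps) };
}

const v1: ShapeV1 = {
  name: "SemanticRelationship",
  properties: [{ name: "expression", path: "example://expression" }],
  relations: [
    { name: "embeddingTag", path: "example://tag", targetClassName: "Embedding" },
  ],
};

const fixedShape = emitRelationsAsProperties(v1);
console.log(JSON.stringify(fixedShape, null, 2));
```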

Client surface added

InstrumentedClient:

  • addSdna(uuid, name, shaclJson, sdnaType?, sdnaCode?)
  • modelQuery(uuid, className, queryJson)
  • addLinks(uuid, links, status?) (bulk variant)
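As a rough picture of how a scenario might drive this surface, the sketch below types the three methods and issues one opt-in query. The parameter names come from the list above, but every type, the return values, the `LinkInput` shape, and the placement of `withMetadata` / `count` at the top level of the query JSON are assumptions. The fake client exists only so the example runs without an executor.

```typescript
// Parameter names follow the PR; all types and return values are assumptions.
interface LinkInput {
  source: string;
  predicate: string;
  target: string;
}

interface ModelQueryClient {
  addSdna(
    uuid: string,
    name: string,
    shaclJson: string,
    sdnaType?: string,
    sdnaCode?: string,
  ): Promise<void>;
  modelQuery(uuid: string, className: string, queryJson: string): Promise<unknown>;
  addLinks(uuid: string, links: LinkInput[], status?: string): Promise<void>;
}

/** Scan all SemanticRelationship instances with both opt-in flags switched off. */
function scanAllNoMetadataNoCount(client: ModelQueryClient, uuid: string): Promise<unknown> {
  const queryJson = JSON.stringify({ withMetadata: false, count: false });
  return client.modelQuery(uuid, "SemanticRelationship", queryJson);
}

// Recording fake used only for illustration.
const calls: { uuid: string; className: string; queryJson: string }[] = [];
const fakeClient: ModelQueryClient = {
  addSdna: () => Promise.resolve(),
  addLinks: () => Promise.resolve(),
  modelQuery: (uuid, className, queryJson) => {
    calls.push({ uuid, className, queryJson });
    return Promise.resolve([]);
  },
};
```

Calling `scanAllNoMetadataNoCount(fakeClient, "some-perspective-uuid")` records a single `modelQuery` call whose JSON payload carries both flags set to `false`.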

Test plan

  • npx tsc --noEmit clean
  • S16 cross-branch — back-compat parity + 1.5-3.4× opt-in ratio improvement
  • S5 cross-branch — 0.87-0.98× across all sizes (no regression in queryLinks)
  • S8 cross-branch — small tier 0.71-0.96× wins, medium tier parity ±8% (no regression in raw SPARQL)
  • cc @lucksus @data-bot-coasys for review

🤖 Generated with Claude Code

HexaField added 3 commits June 4, 2026 20:47
Side-by-side bench of raw `perspective.querySparql` vs
`perspective.modelQuery` on identical Flux-shaped data — drives the
per-site convert-vs-keep decisions in flux's
`docs/sparql-to-ad4m-model-migration.md` (PR coasys/flux#605).

Seeds channel → messages → embeddings + SR reifiers + topics,
registers SHACL classes inline (Message / Embedding / Topic /
SemanticRelationship), and runs five cases at small + medium tiers:

  sr_by_expression_limit1        1-row lookup, WHERE + LIMIT 1
  sr_by_expression_with_include  same + include: { embeddingTag }
  sr_all                         scan all SRs
  embeddings_all                 scan all embeddings
  topics_all                     scan smaller topic set

Auto-flags whether `include` actually fires (within-5% noise check).
S16_RUNS=N overrides per-case runs (default 10).

Also extends InstrumentedClient with addSdna / modelQuery / addLinks
(bulk) so this and future model-query scenarios can drive the RPC
directly instead of falling back to the Apollo SDK.

Baseline against dev: model_query is 14-150x slower than raw SPARQL
across all cases at medium scale; ratio scales linearly with corpus
size regardless of LIMIT. Full results land in
flux PR #605's migration doc.
The previous SHACL JSON put `@HasOne` relations in a separate top-level
`relations: []` array, which the Rust executor's `SHACLShape`
deserializer (`rust-executor/src/perspectives/shacl_parser.rs`) silently
drops — that field doesn't exist on `SHACLShape`. The shape loader then
never registered `embeddingTag` / `topicTag` in `include_relations`,
so `resolve_includes_recursive` silently no-op'd the include. That made
the v1 of this scenario report "include actually fires: no" as a
finding, when in reality the relation had never been registered.

Fix: emit relations inside `properties:` with `relation_kind: "hasOne"`
and `target_class_name: "..."` (the canonical SHACL form the executor
expects — see `SHACLShape.toJSON()` in `@coasys/ad4m`'s
`core/src/shacl/SHACLShape.ts`). Same for `@Flag` properties, which are
now emitted as `has_value` + `min_count: 1` per the round-trip tests
in `model_query/round_trip_tests.rs`.

With the fix:

- `include actually fires: yes` on both tiers
- medium-tier ratios drop from 18-150x to 5-25x:
    sr_by_expression_limit1         56x →  5.0x
    sr_by_expression_with_include   55x →  5.5x
    sr_all                          20x →  9.4x
    embeddings_all                  18x → 10.6x
    topics_all                     150x → 25.0x

The remaining gap (5-25x) is real and attributable to the
unconditional reifier-metadata join in `build_instance_sparql` plus
the always-fired COUNT query. Per-phase profiling lives in the flux
PR #605 doc.
Four new cases test the orchestrator-level opt-ins landing in
coasys/ad4m#846:

| Case | Tests audit item | Expected impact |
|---|---|---|
| embeddings_all_no_metadata | A (withMetadata: false) | scan-all ratio ~3x lower |
| sr_by_expression_limit1_no_count | B (count: false) | -0.1ms per call |
| sr_by_id_single_plan | C (selective WHERE)+A+B | matches a Single-plan baseline |
| sr_all_no_metadata_no_count | A+B combined | scan-all near-parity with raw |

Each case keeps the same raw SPARQL on the left side so the ratio
column reads "how much overhead does modelQuery still carry vs the
hand-written SPARQL doing comparable work."  Running against `dev`
exercises the back-compat default (these opt-in flags are ignored,
so we get the same numbers as the pre-existing cases).  Running
against `refactor/sparql-pushdown-last-write-wins` (PR #846) will
show the ratio collapses to roughly raw-SPARQL parity for these
flagged-off cases.

Test plan
- npx tsc --noEmit clean
- ./run.sh --branch dev / #846 --scenario s16 — TBC after #846's
  release executor finishes building