reduce change storage by 90% #3688
| Command | Status | Duration | Result |
|---|---|---|---|
| `nx run-many --target=test --parallel` | ❌ Failed | 9m 4s | View ↗ |
| `nx run-many --target=lint --parallel` | ✅ Succeeded | 50s | View ↗ |
☁️ Nx Cloud last updated this comment at 2025-09-05 23:26:53 UTC
Closes opral/lix-sdk#366.
Reduces the changes per mutation from 30 to 3, a 90% decrease. Besides the storage reduction, benchmarks improved across the board:

- Commits are ca. 1.3x faster
- Version diffs are 3.5x faster
Write Amplification → Incremental Optimization Plan
Goal: Reduce change rows per user mutation while staying compatible with today’s flows (`create-checkpoint`, history-by-commit) and without introducing a new “commit package” until it clearly adds value.

Scope: 1 domain mutation on a single version, 1 author, typical commit (no merge).
Notation (per‑commit complexity)
`parent_commit_ids`). Usually 1; >1 for merges.

Baseline (Status Quo)
Observed new change rows for a single domain mutation (30 in total):

- `lix_key_value`: 1
- `lix_change_author`: 1
- `lix_change_set_element`: 20
- `lix_change_set`: 2
- `lix_commit`: 2
- `lix_commit_edge`: 2
- `lix_version`: 2

Notes:
Step 1 — Derive CSEs (no hot‑path CSE writes)
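A minimal sketch of the derivation this step proposes, under stated assumptions: the `CommitSnapshot` and `ChangeSetElement` shapes are simplified and hypothetical (the real lix-sdk rows carry more fields), and `deriveChangeSetElements` and `changeSetElementAll` are illustrative names, not SDK functions. It shows CSEs being computed from commit membership and unioned with physical rows from legacy commits.

```typescript
// Hypothetical, simplified shapes; the real lix-sdk snapshots may differ.
interface CommitSnapshot {
  id: string;
  change_set_id: string;
  change_ids: string[]; // commit membership, as in `commit.change_ids`
}

interface ChangeSetElement {
  change_set_id: string;
  change_id: string;
}

// Derive one CSE per member change instead of writing a row on commit.
function deriveChangeSetElements(commit: CommitSnapshot): ChangeSetElement[] {
  return commit.change_ids.map((change_id) => ({
    change_set_id: commit.change_set_id,
    change_id,
  }));
}

// Compatibility read: physical rows (legacy commits) unioned with
// derived rows (new commits), de-duplicated on (change_set_id, change_id).
function changeSetElementAll(
  physicalRows: ChangeSetElement[],
  commits: CommitSnapshot[],
): ChangeSetElement[] {
  const seen = new Set<string>();
  const out: ChangeSetElement[] = [];
  const derivedRows = commits.flatMap(deriveChangeSetElements);
  for (const row of [...physicalRows, ...derivedRows]) {
    const key = `${row.change_set_id}\u0000${row.change_id}`;
    if (!seen.has(key)) {
      seen.add(key);
      out.push(row);
    }
  }
  return out;
}
```

In a database-backed implementation the same union would more likely live in a view than in application code; the in-memory version only illustrates the shape of the result.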
- `lix_change_set_element` on commit. Derive domain‑only CSEs in a materializer/view from commit membership (initially `commit.change_ids`; later, the commit package’s `domain_change_ids`).
- `state-history`: It joins `change_set_element_all`. Provide a compatibility view that unions physical rows (for old commits) with derived rows (for new commits), or switch to the derived view behind the existing name.
- `create-checkpoint`: Unchanged. It references `commit.change_set_id` and does not require physical CSE rows.
- `state-history` can read derived CSEs in O(D) via JSON extraction instead of table joins.

Step 2 — Replace `lix_commit_edge` rows with derived edges

- `lix_commit_edge` rows. Keep `parent_commit_ids` inside the `lix_commit` snapshot and expose a view that explodes parents into an edge shape for queries.
- `state-history`: Today it joins `commit_edge_all`. Provide a compatibility view `commit_edge_all` that explodes `commit_all.parent_commit_ids` to `(parent_id, child_id)` so existing queries continue to work.
- `create-checkpoint`: Still emits a parent relationship; it can write a no‑op (or rely on the derived view).

Step 3 — Drop Dual Commit (de‑duplicate commit, version, and change_set)
- `lix_commit`: only the version’s commit (no global duplicate)
- `lix_version`: only the mutated version’s tip move (no global duplicate)
- `lix_change_set`: only one row per commit (no second/global duplicate)
- `create-checkpoint`: Unchanged. It updates `version.commit_id` and `version.working_commit_id` and labels the checkpoint.
- `state-history`: Unchanged. Edges are derived from `parent_commit_ids` and CSEs are derived/materialized; both remain global in views/cache.
- `lix_commit_edge` in the global scope from `parent_commit_ids` so cache/queries do not depend on a “global” commit change row.

Result After Steps 1–3 (1 domain mutation)
- `lix_key_value`: 1
- `lix_change_author`: 1
- `lix_change_set`: 1
- `lix_commit`: 1 (public, minimal; includes `change_set_id`)
- `lix_version`: 1
- `lix_change_set_element` (global) from commit membership
- `lix_commit_edge` from `parent_commit_ids`

Step 4 (Optional) — Author normalization for multi‑change commits
- `change_author_all`.

Step 5 (Later) — Introduce `meta_change_ids`

- `meta_change_ids` to `lix_commit` that carries `commit_id`, `parent_commit_ids`, `change_set_id`, and split membership: `domain_change_ids` vs `meta_change_ids`.

Potential follow‑up: Unify version pointers under control (tip)
- `commit_id` (version tip) lives in the control ledger (`lix_version_tip`), while `working_commit_id` lives in the descriptor (`lix_version_descriptor`).
- `working_commit_id` while also reading tip for `commit_id`.
- `working_commit_id` into the control plane (extend `lix_version_tip` or add a sibling control entity to carry the working pointer). Keep descriptor purely domain (id, name, inherits, hidden).

Compatibility Summary
- `create-checkpoint`: Remains valid throughout. It needs a real `commit` with a `change_set_id`, and will continue to link the previous head as a checkpoint and create a new empty working commit.
- `state-history`: Continue to “query by commit” by ensuring two compatibility views exist when steps land:
  - `commit_edge_all` (derived from `commit_all.parent_commit_ids`).
  - `change_set_element_all` (derived for new commits; union with physical for legacy).

Rollout Guidance
`parent_commit_ids`.

Cleanup TODOs
- In `packages/lix-sdk/src/version/merge-version.ts`, we temporarily filter control/meta schemas out of winners/deletions to keep commit membership deterministic under cache-miss. Do not blanket-filter `lix_*`; some are valid domain (e.g., `lix_key_value`, `lix_file_descriptor`). Remove this filter after Step 5 introduces `meta_change_ids` and formally splits domain vs meta membership.
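
The difference between an explicit meta filter and a blanket `lix_*` filter can be sketched as follows. This is an illustration, not the code in `merge-version.ts`: `META_SCHEMA_KEYS`, `ChangeRef`, and `filterDomainChanges` are hypothetical names, and the set of meta schema keys is an assumption taken from the schemas this plan de-duplicates or derives.

```typescript
// Assumed control/meta schema keys (from the schemas this plan derives or
// de-duplicates); the actual list used in merge-version.ts may differ.
const META_SCHEMA_KEYS: ReadonlySet<string> = new Set([
  "lix_change_set",
  "lix_change_set_element",
  "lix_commit",
  "lix_commit_edge",
  "lix_version",
]);

// Hypothetical minimal shape of a change reference.
interface ChangeRef {
  id: string;
  schema_key: string;
}

// Explicit membership test: only known meta schemas are excluded, so domain
// schemas that happen to start with "lix_" pass through.
function isDomainChange(change: ChangeRef): boolean {
  return !META_SCHEMA_KEYS.has(change.schema_key);
}

function filterDomainChanges<T extends ChangeRef>(changes: T[]): T[] {
  return changes.filter(isDomainChange);
}

// For contrast, a blanket prefix filter would wrongly drop valid domain
// changes such as lix_key_value and lix_file_descriptor.
function blanketFilter<T extends ChangeRef>(changes: T[]): T[] {
  return changes.filter((c) => !c.schema_key.startsWith("lix_"));
}
```

With an input containing `lix_key_value`, `lix_file_descriptor`, and `lix_commit` changes, `filterDomainChanges` keeps the first two, while `blanketFilter` drops all three.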