Skip to content

refactor(workspace): component loader rewrite — diff harness + spike + first seam#10349

Draft
davidfirst wants to merge 19 commits intomasterfrom
refactor/component-loading-v2-take-2
Draft

refactor(workspace): component loader rewrite — diff harness + spike + first seam#10349
davidfirst wants to merge 19 commits intomasterfrom
refactor/component-loading-v2-take-2

Conversation

@davidfirst
Copy link
Copy Markdown
Member

Take 2 of the component-loading rewrite (PR #10086 was abandoned). This PR sets up the safety net and extracts the first seam.

Why this is a fresh start, not a rebase of #10086

See `docs/rfcs/component-loading-rewrite/README.md` and `DECISIONS.md` for the full analysis. Short version: the previous attempt collapsed V1's two-pass loading into a single inline pipeline and lost the ordering guarantee that prevents the env↔component recursion. Stubbing out Enrichment to "fix" it just deferred the work that needed to happen.

What's in this PR

1. Diff harness (`BIT_LOADER_DIFF=1`) — wraps `workspace.componentLoader` so two loader instances run side-by-side and any difference in their output is logged to JSONL. Off by default. Snapshot fields are deliberately small at the start; expand as the rewrite proceeds. e2e test verifies V1-vs-V1 produces zero diffs on a 2-component workspace.

2. Recursion root-cause spike (`DECISIONS.md` D-001) — traces the env↔component "recursion" through three call paths, identifies the one real DAG-ordering problem, and rules out the approaches the previous PR took (single-pass pipeline, lazy env binding, cycle detection).

3. First seam: env-DAG layering extracted as a pure function (`env-dag-sort.ts`). Behavior-preserving — the existing one-level limitation is pinned by a unit test. Recursive upgrade is in a follow-up PR.

Verification

  • Unit tests: 6 cases for `groupEnvsByDepLayer`
  • e2e contract tests: all 21 scenarios pass
  • Diff harness e2e: V1-vs-V1 produces zero diffs

Known limitations (for follow-up)

  • Diff harness OOMs Node's default 4GB heap on the bit7 workspace itself (300+ components × 2 loaders). Intended for small representative workspaces; bump heap for bit7. Sampling mode TBD.
  • `bit list` doesn't trigger the harness (probably doesn't go through `workspace.componentLoader`); `bit show` similarly. Worth mapping the loader's true entry surface in a follow-up.

davidfirst added 4 commits May 7, 2026 10:23
Builds the safety net before rewriting the component loader: a deterministic
component snapshot + a wrapper that runs two loader instances in parallel and
logs diffs to JSONL. Off by default, opt-in via BIT_LOADER_DIFF=1.
The env↔component "recursion" is a DAG topological-ordering problem, not
a true cycle. PR #10086 failed because it collapsed V1's two-pass load
into a single inline pipeline and lost the ordering guarantee. Recorded
in D-001 with file:line evidence.
Behavior-preserving extraction of regroupEnvsIdsFromTheList into
groupEnvsByDepLayer with unit tests. Sets up the seam to upgrade to a
proper recursive topological sort in a follow-up — see D-001.
Replaces V1's one-level partition with a proper topological sort over the
env-of-env DAG. Cycles fall back to a single layer defensively. Per D-001,
this is the load-order primitive the rewrite depends on.
Adds BIT_LOADER_DIFF_SAMPLE=N to skip the partner on most calls. Lets the
harness run on workspaces big enough that V1+V1 in parallel would OOM
the default 4GB heap. Adds a bit-show e2e (the previous "bit show isn't
covered" report turned out to be a stale-compile artifact). Documents
actual command coverage in SNAPSHOT-CONTRACT.md.
…uildLoadGroups

D-001 predicted that the recursive env-DAG sort would make the prepended
'always load core-aspect-env first' workaround unnecessary. Verified:
311-component bit7 workspace (the workspace the special case was written
for) loads cleanly without it.
davidfirst added 13 commits May 7, 2026 15:52
Extracts a generic topo-sort helper (topoLayerByDeps) and uses it for both
envs (refactored to share) and extensions. regroupExtIdsFromTheList was a
documented no-op TODO; this implements it. Same shape fix as #10350 for
envs, applied to ext-of-ext chains.
…onfig only)

Adds extensions[].config and aspects[].config to the diff snapshot, sorted
by id with stable JSON canonicalization. Excludes data because it depends on
cache warmth — the partner's cold cache eagerly computes deps that the
primary's warm cache short-circuits, producing false positives in sampled
mode. Documented the trade-off in SNAPSHOT-CONTRACT.md.
Pulls the canonical group-ordering logic out of WorkspaceComponentLoader as
buildLoadPlanGroups in load-plan.ts. The loader retains the side-effecting
parts (warming the extensions cache iteratively, registering newly-discovered
extension comp ids) and calls into the pure function for the actual ordering.
Removes the now-redundant regroupEnvsIdsFromTheList / regroupExtIdsFromTheList
wrappers — load-plan.ts calls groupEnvsByDepLayer / groupExtsByDepLayer directly.
…re function

Pulls the workspace-vs-scope partitioning logic out of groupAndUpdateIds as
classifyIds in discovery.ts, with 7 unit tests covering ws/scope mixes, version
resolution, and append-to-existing-result. Side effect: the workspace's id
surface (listIds + locallyDeletedIds) is now fetched once per call instead of
once per id — V1 fanned this out via Promise.all but the result was always
identical for every parallel call within one invocation, so caching upfront
is behavior-preserving.
…ments

The original RFC called the loader's caching "the caching nightmare". After
tracing the actual behavior, the rules are coherent — they were just
undocumented. D-002 enumerates the four caches, the cache-key composition
(four flags that genuinely produce a different Component), and the
"exact-match-or-fully-loaded" lookup rule. Inline comments at the
saveInCache / getFromCache / createComponentCacheKey touchpoints point
back to D-002 so a reader of the source can find the invariants.

No behavior change.
Single env var: BIT_LOADER_DIFF=N. Unset/0 = off. 1 = on, every call. N>1 =
sample every Nth call. Drops BIT_LOADER_DIFF_SAMPLE (folded into BIT_LOADER_DIFF)
and the user-facing BIT_LOADER_DIFF_OUT (kept internal-only for e2e tests).
The harness prints the log path to stderr at startup so the user can find
the output without consulting docs.
The flag tracked "we attempted to load this as an aspect" to suppress retry
of failures. AspectLoaderMain.isAspectLoaded already returns true for both
successfully-loaded aspects (via harmony.get) and previously-failed ones (via
the failedAspects registry populated in requireAspects). The local memoization
flag was redundant — removed, along with its references in clearCache,
clearComponentCache, the constructor, and loadCompsAsAspects.
…instance

Two of the loader's caches (scopeComponentsCache, componentsExtensionsCache)
were used as scratch state for one load operation but lived as instance
fields, accumulating across calls. Now they're a per-operation LoadScratch
value created at the top of getAndLoadSlotOrdered and threaded through
buildLoadGroups → getAndLoadSlot → getComponentsWithoutLoadExtensions →
this.get → loadOne. Reads fall back to direct workspace lookups when scratch
is absent (single-component get() path), so the public surface is unchanged.

Net effect on the class:
  Before: 4 instance caches, instance state outliving any single operation.
  After:  1 instance cache (componentsCache, the actual output cache).
          clearCache touches one thing. Scratch lifetime is visible at every
          call site instead of hidden in instance fields.
…ession)

The earlier "drop core-aspect-env special case" claim was wrong. The
recursive env-DAG sort handles load *ordering* correctly, but the special
case was also marking these envs with `core: true` — the flag that
`getAndLoadSlot` checks before triggering `loadCompsAsAspects`. Without it,
`bit status` on bit7 prints "environment with ID: teambit.harmony/envs/core-aspect-env
was not loaded (run 'bit install')" because the env aspect wasn't
registered with Harmony when components configured to use it were loaded.

Restored in load-plan.ts (Step 7b) and pinned by a unit test. D-001 updated
with a correction noting the misprediction. The diff harness didn't catch
this because V1-vs-V1 mode compares the modified V1 against itself —
they're equally wrong, so they match. Real coverage gap.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant