fix(wiki): clear stale articles before regenerating to prevent orphan accumulation #558
Open
szsip239 wants to merge 1 commit into safishamsi:v5 from
… accumulation
to_wiki() writes a fresh set of community + god-node articles each call but never deletes old files from previous runs. Since community labels are LLM-generated and non-deterministic across rebuilds (per skill.md Step 5), the same conceptual community is often named differently each time, leaving its previous file as an orphan. After N rebuilds, wiki/ contains roughly N times the active article count, with index.md only referencing the most recent run's labels.
Real-world: a knowledge corpus accumulated 822 wiki .md files over 5 rebuilds, of which only 111 were referenced by index.md (710 orphans).
Fix: clear *.md files in the output directory at the start of to_wiki(). This is consistent with its existing fully regenerative behavior: it always writes the full set of articles + index, never partial updates. Subdirectories and non-.md files are preserved (only top-level .md is touched), so any user-placed auxiliary assets survive.
Tests: two new regression tests cover (1) stale article cleanup across runs with different labels, and (2) preservation of non-.md user files and nested subdirectories.
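The orphan count quoted above could be checked with a small audit script. The following is a minimal sketch and not part of the PR: `find_orphans` is a hypothetical helper, and it assumes an article counts as referenced when its filename stem appears anywhere in the text of index.md. That heuristic may not match how the wiki actually links articles, and a stem that is a substring of another label would be miscounted as referenced.

```python
from pathlib import Path


def find_orphans(wiki_dir: Path, index_name: str = "index.md") -> list[str]:
    """Return top-level .md files whose stem never appears in the index text.

    Hypothetical audit helper: the substring match is an assumption about
    how index.md refers to articles, not the project's actual link format.
    """
    index_text = (wiki_dir / index_name).read_text(encoding="utf-8")
    orphans = []
    # glob("*.md") is non-recursive, so only top-level articles are checked.
    for md_file in sorted(wiki_dir.glob("*.md")):
        if md_file.name == index_name:
            continue  # the index is not expected to reference itself
        if md_file.stem not in index_text:
            orphans.append(md_file.name)
    return orphans
```

Running `len(find_orphans(Path("wiki")))` before and after a rebuild would show whether stale articles are piling up.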
rosschurchill added a commit to rosschurchill/graphify-super that referenced this pull request on Apr 27, 2026
to_wiki() now globs and unlinks all *.md in the output dir before writing fresh articles, preventing orphan accumulation across rebuilds. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
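The glob-and-unlink step described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: `clear_stale_articles` is a hypothetical name, and the real to_wiki() may perform the cleanup inline rather than through a helper.

```python
from pathlib import Path


def clear_stale_articles(out_dir: Path) -> int:
    """Delete top-level *.md files in out_dir and return how many were removed.

    Hypothetical helper for illustration; not the name used in the PR.
    """
    removed = 0
    # Path.glob("*.md") does not recurse, so files in subdirectories are kept.
    # The matches are collected into a list before anything is deleted.
    for md_file in list(out_dir.glob("*.md")):
        if md_file.is_file():  # skip a directory that happens to end in .md
            md_file.unlink()
            removed += 1
    return removed
```

Called at the start of a regeneration, a helper like this leaves images, JSON files, and nested directories alone, which matches the preservation behavior the PR describes.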
Problem
to_wiki() writes a fresh set of community + god-node articles each call but never deletes old files from previous runs. Since community labels are LLM-generated and non-deterministic across rebuilds (per skill.md Step 5), the same conceptual community is often named differently each time, leaving its previous file as an orphan. After N rebuilds, wiki/ contains roughly N times the active article count, with index.md only referencing the most recent run's labels.
Repro
Real-world: a knowledge corpus accumulated 822 wiki .md files over 5 rebuilds, of which only 111 were referenced by index.md (710 orphans).
Fix
Clear *.md files in the output directory at the start of to_wiki(). This is consistent with its existing fully regenerative behavior: it always writes the full set of articles + index, never partial updates. Subdirectories and non-.md files are preserved (only top-level .md is touched), so any user-placed auxiliary assets survive.
Tests
Two new regression tests in tests/test_wiki.py:
test_to_wiki_clears_stale_articles: calls to_wiki() twice with different community labels, asserts old files are gone and new files exist.
test_to_wiki_preserves_non_md_files: places PNG/JSON/subdirectory content in the wiki dir, asserts they survive cleanup.
All 17 tests in test_wiki.py pass (15 existing + 2 new).
Compatibility
… .md files to clear)
… .md files placed at top level of wiki/ (not a documented workflow) would be removed on next --wiki. Subdirectories and non-.md files are unaffected.
Related work
Existing wiki improvements addressed adjacent concerns but not orphan cleanup:
.graphify_labels.json)