
fix(knowledge): promote stale processing status to ready when index is ready (#418) #419

Open
voidborne-d wants to merge 1 commit into HKUDS:main from voidborne-d:fix/418-promote-status-to-ready-when-index-ready

@voidborne-d

Summary

Fixes #418. get_info() now recognises the case where kb_config.json still says status: "processing" (or "initializing") but a flat version-N/ directory already contains a ready index, and reports status: "ready" instead of a perpetual processing banner.

This happens when the progress writer or worker process is interrupted after the LlamaIndex version is finalised but before update_kb_status(name, "ready") runs and rewrites kb_config.json, so the on-disk truth and the persisted status diverge.
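For reviewers who want to picture the diverged state, here is a minimal sketch that builds it in a temporary directory. The directory layout (`kb_config.json` next to `version-1/docstore.json`), the flat shape of the config dict, and the helpers `make_diverged_kb` and `is_diverged` are assumptions for illustration. The real `_is_storage_ready` may check more than the presence of `docstore.json`.

```python
import json
import tempfile
from pathlib import Path


def make_diverged_kb(root: Path) -> Path:
    """Write a KB directory whose persisted status lags behind the index on disk."""
    kb_dir = root / "demo_kb"
    version_dir = kb_dir / "version-1"
    version_dir.mkdir(parents=True)
    # The index was finalised: a docstore.json exists in the flat version directory.
    (version_dir / "docstore.json").write_text("{}", encoding="utf-8")
    # The worker stopped before the persisted status was rewritten to "ready".
    config = {
        "status": "processing",
        "progress": {"stage": "processing_documents"},
        "needs_reindex": False,
    }
    (kb_dir / "kb_config.json").write_text(json.dumps(config), encoding="utf-8")
    return kb_dir


def is_diverged(kb_dir: Path) -> bool:
    """Hypothetical check: a live sentinel is persisted while a version dir looks ready."""
    config = json.loads((kb_dir / "kb_config.json").read_text(encoding="utf-8"))
    has_ready_version = any(
        (d / "docstore.json").is_file() for d in kb_dir.glob("version-*") if d.is_dir()
    )
    return config.get("status") in {"processing", "initializing"} and has_ready_version


with tempfile.TemporaryDirectory() as tmp:
    diverged = is_diverged(make_diverged_kb(Path(tmp)))
```

In this sketch `diverged` ends up `True`; deleting `version-1/docstore.json` before the check makes it `False`, which corresponds to genuine in-flight indexing.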

Root cause

In deeptutor/knowledge/manager.py::get_info() (lines 646–660 on main), the elif chain that resolves the reported status had no branch for "status is a live sentinel + a ready index already exists":

if effective_needs_reindex:
    status = "needs_reindex"
elif not status and dir_exists:
    ...
elif not status:
    status = "unknown"

When kb_config.json had:

  • status: "processing"
  • progress.stage: "processing_documents" (stale)
  • needs_reindex: false

…and version-1 had a docstore.json (ready: true per _is_storage_ready), the chain fell through and status stayed "processing". Issue #418 was opened by @TecFancy with this exact reproduction and a suggested fix.
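The fall-through can be reproduced with a standalone sketch of the chain quoted above. The function name `resolve_status_on_main` and its parameter list are hypothetical, and the body of the `not status and dir_exists` branch is elided in this description, so it is left as a no-op placeholder here rather than guessed.

```python
def resolve_status_on_main(status, dir_exists, effective_needs_reindex):
    """Standalone copy of the pre-fix elif chain, reduced to what the snippet shows."""
    if effective_needs_reindex:
        status = "needs_reindex"
    elif not status and dir_exists:
        pass  # body elided in the PR description; not reached in this repro
    elif not status:
        status = "unknown"
    # No branch looks at the index on disk, so a truthy status passes through as-is.
    return status


stuck = resolve_status_on_main(
    "processing", dir_exists=True, effective_needs_reindex=False
)
```

Because `"processing"` is truthy and `effective_needs_reindex` is false, none of the three conditions match and `stuck` stays `"processing"`, regardless of what is in `version-1`.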

Fix

Insert a new branch after effective_needs_reindex that fires only when:

  • status in {"processing", "initializing"} (live sentinel left over)
  • has_ready_llamaindex (a flat version-N already passes _is_storage_ready)
  • progress.stage != "error" (don't silently mask real failures when an older ready version exists)

When all three hold, promote status to "ready" and clear progress for that read so consumers don't render "ready" and a stale processing bar at the same time. This matches the existing update_kb_status(status="ready") semantics, which already pop progress from the persistent payload.

The persistent kb_config.json is not rewritten inside get_info; it stays a pure read. The next legitimate update_kb_status call cleans up the on-disk state.
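The three conditions and the precedence of the needs_reindex branch can be sketched as a pure function. This is an illustration, not the diff itself: `resolve` and `LIVE_SENTINELS` are hypothetical names, the `not status` branches are omitted, and passing `progress` through unchanged on the needs_reindex path is an assumption.

```python
LIVE_SENTINELS = {"processing", "initializing"}


def resolve(status, progress, *, effective_needs_reindex, has_ready_llamaindex):
    """Return the (status, progress) pair reported for one read; nothing is persisted."""
    stage = (progress or {}).get("stage")
    if effective_needs_reindex:
        # Existing branch keeps precedence over the new promotion.
        return "needs_reindex", progress
    if status in LIVE_SENTINELS and has_ready_llamaindex and stage != "error":
        # Stale live sentinel plus a ready index: report ready and drop the stale bar.
        return "ready", None
    return status, progress


promoted = resolve(
    "processing",
    {"stage": "processing_documents"},
    effective_needs_reindex=False,
    has_ready_llamaindex=True,
)
failed = resolve(
    "processing",
    {"stage": "error"},
    effective_needs_reindex=False,
    has_ready_llamaindex=True,
)
```

Here `promoted` is `("ready", None)`, while `failed` keeps `"processing"` together with its error-stage progress, so a failed re-index on top of an older ready version stays visible. The function returns new values and never touches the input dict or the file on disk, which mirrors the "pure read" property described above.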

Tests

7 new cases in tests/knowledge/test_manager_get_info_status.py, all green:

| Test | Asserts |
| --- | --- |
| test_processing_with_ready_index_promotes_to_ready | Headline #418 repro: processing + stale processing_documents + ready version → ready, progress=None |
| test_processing_with_completed_progress_and_ready_index_promotes | Variant: progress.stage == "completed" |
| test_initializing_with_ready_index_promotes | initializing is also a live sentinel |
| test_processing_with_error_stage_is_not_promoted | progress.stage == "error" is not auto-healed |
| test_processing_without_ready_index_not_promoted | Genuine in-flight indexing (empty version-1/) stays processing |
| test_ready_status_unaffected | status="ready" happy path is unchanged |
| test_needs_reindex_takes_precedence | Mismatched embedding signature still resolves to needs_reindex (existing branch wins) |

tests/knowledge/ total: 22 passed (15 pre-existing + 7 new), no regressions.

$ .venv/bin/python -m pytest tests/knowledge/ -v
======================== 22 passed, 1 warning in 0.23s =========================

ruff check is clean on both touched files. The pre-existing ruff format drift in deeptutor/knowledge/manager.py (4 spots, all on main@28e225e) is unrelated and untouched here.

Scope

  • One narrow elif inserted in get_info(), no signature changes.
  • No callers updated — info["status"], info["progress"], info["statistics"]["status"], and info["statistics"]["progress"] all consume the same locals.
  • No persistent state is rewritten on read.
  • The Web UI's perpetual "Processing…" banner (the issue's primary user-visible symptom) is fixed transitively because the UI reads info["status"].
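To illustrate why no callers need updating, here is a rough sketch of how the four fields named above can share the same two locals. `build_info` and `shows_processing_banner` are hypothetical stand-ins; the real dict contains more keys, and the Web UI code is not part of this PR.

```python
LIVE_SENTINELS = {"processing", "initializing"}


def build_info(status, progress):
    """Assemble the four status/progress fields from the same two locals."""
    return {
        "status": status,
        "progress": progress,
        "statistics": {"status": status, "progress": progress},
    }


def shows_processing_banner(info):
    """Hypothetical stand-in for the UI check; assumes it keys off info["status"]."""
    return info["status"] in LIVE_SENTINELS


before = build_info("processing", {"stage": "processing_documents"})
after = build_info("ready", None)
```

Under these assumptions, changing the two locals once updates the top-level and `statistics` copies together, so `shows_processing_banner(after)` is false without any consumer-side change.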

Disclosure

Implementation by an AI agent (Claude Opus 4.7) with manual review. The reporter's analysis in #418 was also AI-disclosed (deepseek-v4-pro); their suggested fix shape (elif status in {"processing","initializing"} and has_ready_llamaindex: status = "ready") matched my read of the code, with two refinements driven by reviewing the surrounding callsites:

  1. Guard against progress.stage == "error" so a failed re-index that still has an older ready version doesn't silently report "ready".
  2. Clear progress on promotion to match update_kb_status(status="ready"), which pops progress from the persistent payload, so consumers don't render a "ready" badge alongside a stale progress bar.

Happy to drop the error-stage guard or the progress-clear if the maintainers prefer a thinner patch.

…ts (HKUDS#418)

When the persisted ``status`` in ``kb_config.json`` is still a "live"
sentinel (``processing`` / ``initializing``) but a flat ``version-N``
directory on disk is already ready, ``get_info()`` previously returned
``status="processing"`` because the elif chain had no branch to recover
the case. The UI therefore showed a perpetual "Processing…" banner even
though the KB was fully usable.

This typically happens when the progress writer or worker process
crashes after the LlamaIndex version is finalised but before
``update_kb_status(name, "ready")`` rewrites kb_config.json — the
on-disk truth and the persisted status diverge.

Fix: in ``get_info``, after the existing ``effective_needs_reindex``
short-circuit, add a branch that promotes status to ``"ready"`` when:

  - status is currently ``processing`` or ``initializing``,
  - a ready ``version-N`` exists (``has_ready_llamaindex``),
  - and the persisted progress stage is not ``error``.

The ``error`` guard preserves visibility of genuine indexing failures
when an older ready version still exists. The promotion clears the
stale progress banner to match the existing semantic in
``update_kb_status`` (where ready KBs drop progress).

The persistent ``kb_config.json`` is left untouched on read; the next
legitimate ``update_kb_status`` call cleans it up.

7 new tests in ``tests/knowledge/test_manager_get_info_status.py``
cover the headline HKUDS#418 reproduction, the completed-stage variant, the
``initializing`` variant, and the negative cases (error stage, no ready
version yet, ready-status pass-through, mismatched-signature ->
needs_reindex precedence).
Development

Successfully merging this pull request may close these issues.

[Bug]: kb_config.json status stuck at "processing" after indexing completes successfully
