Fix KeyError on missing image/docx keys in configure_content (#1317) #1338

Open
eldar702 wants to merge 1 commit into khoj-ai:master from eldar702:fix/1317-keyerror-image-docx

eldar702 commented Jun 8, 2026

Closes #1317.

Problem

GET /api/update returns HTTP 500 (Failed to update content index) on a fresh install with the default t=all. The inner handler logs Failed to setup images: 'image' / Failed to setup docx: 'docx', sets success=False, and the outer caller raises 500. Scoping to a single type such as t=markdown avoids the crash.
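
The failure path described above can be sketched in isolation. This is a minimal illustration, not khoj's actual code: `configure_images_sketch` is a hypothetical stand-in that only mimics the log-and-set-`success=False` pattern reported in the issue.

```python
import logging

logger = logging.getLogger(__name__)


def configure_images_sketch(files: dict) -> bool:
    """Hypothetical stand-in for the image branch of configure_content."""
    success = True
    try:
        # Bracket lookup in the guard: raises KeyError if "image" is absent.
        if files["image"]:
            pass  # image indexing would happen here
    except Exception as e:
        # str(KeyError("image")) is "'image'", which matches the reported log line.
        logger.error(f"Failed to setup images: {e}")
        success = False
    return success
```

With a payload that has no "image" key the sketch returns False, which is the condition the outer caller turns into the HTTP 500.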

Fix

In src/khoj/routers/helpers.py::configure_content(), the image and docx guard conditions used a bracket lookup on a possibly-missing key (files["image"], files["docx"]), while every other content type — and even the bodies of these same two blocks — already used the safe files.get(...). When files lacks those keys the guard raises KeyError. The fix changes the two guards to files.get("image") / files.get("docx"), matching the convention used throughout the function (1 file, no new deps, no behavior change on the happy path).
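
The difference between the two guard styles can be shown with a plain dict. The helper names and the sample payload below are hypothetical and exist only for illustration; they are not taken from the khoj codebase.

```python
# Hypothetical payload: only markdown was sent, so "image" and "docx" are absent.
files = {"markdown": {"notes.md": "# hello"}}


def guard_with_brackets(files: dict, key: str) -> bool:
    # Raises KeyError when the key is missing.
    return bool(files[key])


def guard_with_get(files: dict, key: str) -> bool:
    # Returns None for a missing key, so the guard is simply falsy.
    return bool(files.get(key))
```

When the key is present the two guards agree, which is why the change should not alter behavior on the happy path.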

Test

Added two DB-free regression tests in tests/test_helpers.py. Each calls configure_content with a files dict that omits the image/docx keys (and includes a non-empty unrelated key so the GitHub/Notion server-side branches, which hit the DB, are skipped) and asserts it returns True. Both fail before the fix (KeyError → success=False) and pass after.
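
The shape of such a regression test can be sketched as follows. This is an assumption-laden outline rather than the tests in the PR: `configure_content_stub` is a hypothetical stand-in that mirrors only the fixed guards, because the real `configure_content` signature is not shown here.

```python
def configure_content_stub(files: dict) -> bool:
    """Hypothetical stand-in that mirrors only the fixed image/docx guards."""
    success = True
    for key in ("image", "docx"):
        try:
            # Safe lookup: a missing key yields None instead of raising.
            if files.get(key):
                pass  # indexing for this content type would happen here
        except Exception:
            success = False
    return success


def test_missing_image_key_returns_true():
    # A non-empty unrelated key stands in for the other content in the payload.
    files = {"markdown": {"notes.md": "# hello"}}
    assert configure_content_stub(files) is True


def test_missing_docx_key_returns_true():
    files = {"plaintext": {"todo.txt": "buy milk"}}
    assert configure_content_stub(files) is True
```

Swapping `files.get(key)` for `files[key]` in the stub makes both tests fail, which reproduces the before/after behavior described above.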

Verification

  • pytest tests/test_helpers.py -k configure_content — green (full file: 6 passed, 2 skipped — network tests skipped without API keys)
  • mypy src/khoj/routers/helpers.py — no new errors introduced (pre-existing baseline unchanged)
  • ruff check / ruff format --check on touched lines — clean

Note: khoj is AGPL-3.0; this contribution is offered under the same license.

🤖 AI-assistance disclosure: drafted with assistance from Claude (Opus 4.8). All changes reviewed for correctness before submission.

…#1317)

configure_content guarded the image and docx indexing branches with a
bracket lookup (files["image"] / files["docx"]) while every other content
type — and even the bodies of these same two blocks — used the safe
files.get(...). On a fresh install with the default t=all, files often
lacks these keys, so the guard raised KeyError. The inner try/except
swallowed it into success=False, surfacing as HTTP 500
"Failed to update content index" from GET /api/update.

Use files.get("image") / files.get("docx") in the guards to match the
convention already used throughout the function. Add DB-free regression
tests asserting configure_content returns True when the image/docx keys
are absent.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@eldar702 eldar702 marked this pull request as ready for review June 10, 2026 22:43
Development

Successfully merging this pull request may close these issues.

KeyError: 'image' / 'docx' in configure_content() on /api/update (2.0.0b28)
