Fix LRU eviction silent failure allowing unbounded memory growth (#449) #567

Merged
itomek merged 3 commits into main from fix/lru-eviction-449
Mar 19, 2026
Collaborator

@itomek itomek commented Mar 18, 2026

Summary

  • Pre-flight rejection: _has_indexing_capacity() (new read-only method) checks limits before indexing starts. If at limit with eviction disabled, returns success=False with memory_limit_reached=True and a descriptive error — no silent growth.
  • Logging: _evict_lru_document() and _check_memory_limits() now emit structured self.log warnings/errors on all failure paths (disabled, no files, eviction failed, limit exceeded).
  • _check_memory_limits() returns bool: callers can now detect post-index limit violations; sets memory_limit_warning=True in stats.
  • Cache-load bug fix: pre-existing bug where cache-loaded files were never added to file_access_times/file_index_times, making them invisible to LRU eviction.
  • Stats enrichment: max_indexed_files and max_total_chunks always present in index_document() stats dict.
  • CLI flags: --max-indexed-files and --max-total-chunks wired through ChatAgentConfig → RAGConfig for runtime configuration and UI testing.
  • 6 new unit tests in TestMemoryLimits covering all acceptance criteria.
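
For readers who have not opened the diff, the pre-flight rejection described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `IndexLimits`, `TinyIndex`, and the `enable_lru_eviction` field name are hypothetical, and the eviction step is a placeholder that drops the oldest-inserted file rather than doing real LRU tracking.

```python
from dataclasses import dataclass, field


@dataclass
class IndexLimits:
    """Hypothetical stand-in for the limits carried by the RAG config."""
    max_indexed_files: int = 2
    max_total_chunks: int = 1000
    enable_lru_eviction: bool = True  # assumed name for the eviction toggle


@dataclass
class TinyIndex:
    """Toy index: maps a file path to its chunk count."""
    limits: IndexLimits
    indexed_files: dict = field(default_factory=dict)

    def _has_indexing_capacity(self) -> bool:
        """Read-only check: True if one more file fits under both limits."""
        total_chunks = sum(self.indexed_files.values())
        return (
            len(self.indexed_files) < self.limits.max_indexed_files
            and total_chunks < self.limits.max_total_chunks
        )

    def index_document(self, path: str, chunks: int) -> dict:
        # The limits are always reported, matching the stats enrichment bullet.
        stats = {
            "success": False,
            "max_indexed_files": self.limits.max_indexed_files,
            "max_total_chunks": self.limits.max_total_chunks,
        }
        if not self._has_indexing_capacity():
            if not self.limits.enable_lru_eviction:
                # Reject before any work is done instead of growing silently.
                stats["memory_limit_reached"] = True
                stats["error"] = "Memory limit reached and LRU eviction is disabled"
                return stats
            if self.indexed_files:
                # Placeholder eviction: drop the oldest-inserted file.
                oldest = next(iter(self.indexed_files))
                del self.indexed_files[oldest]
        self.indexed_files[path] = chunks
        stats["success"] = True
        return stats
```

With `max_indexed_files=2` and eviction disabled, the third call to `index_document()` returns `success=False` with `memory_limit_reached=True`, and the index still holds two files.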

Test plan

  • uv run python -m pytest tests/test_rag.py -xvs — 28/28 pass
  • python util/lint.py --all --fix — passes
  • UI test (requires --no-lru-eviction flag, not yet implemented): gaia chat --ui --max-indexed-files 2 --no-lru-eviction → upload 3 docs → 3rd rejected with "Memory limit reached" error
  • UI test (eviction path): gaia chat --ui --max-indexed-files 2 → upload 3 docs → 3rd succeeds, 1st evicted
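
As a rough picture of how the two limit flags used in the UI tests could travel from the command line into a config object, here is a standalone `argparse` sketch. It is not the `gaia` CLI source: the parser, the `rag_overrides()` helper, and the choice to forward only explicitly set values are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing the two limit flags from this PR."""
    parser = argparse.ArgumentParser(prog="gaia-chat-sketch")
    parser.add_argument("--ui", action="store_true")
    parser.add_argument("--max-indexed-files", type=int, default=None)
    parser.add_argument("--max-total-chunks", type=int, default=None)
    return parser


def rag_overrides(args: argparse.Namespace) -> dict:
    """Forward only the limits the user set, so config defaults still apply."""
    candidates = {
        "max_indexed_files": args.max_indexed_files,
        "max_total_chunks": args.max_total_chunks,
    }
    return {key: value for key, value in candidates.items() if value is not None}


# Mirrors the eviction-path test invocation: --ui --max-indexed-files 2
args = build_parser().parse_args(["--ui", "--max-indexed-files", "2"])
overrides = rag_overrides(args)
```

Here `overrides` contains only `max_indexed_files`, so an unset `--max-total-chunks` leaves whatever default the config already defines.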

Open items

  • --no-lru-eviction CLI flag not yet implemented — needed to fully validate the rejection path through the UI (see issue comments for full test instructions)

Closes #449

🤖 Generated with Claude Code

@itomek itomek self-assigned this Mar 18, 2026
@github-actions github-actions Bot added the agents, rag (RAG system changes), cli (CLI changes), tests (Test changes), and performance (Performance-critical changes) labels Mar 18, 2026
Base automatically changed from kalin/chat-ui to main March 18, 2026 17:43
@itomek itomek marked this pull request as ready for review March 18, 2026 18:28
@itomek itomek requested a review from kovtcharov-amd as a code owner March 18, 2026 18:28
@itomek itomek force-pushed the fix/lru-eviction-449 branch from 49f417f to 6cb61aa Compare March 19, 2026 17:54
@github-actions github-actions Bot added the documentation Documentation changes label Mar 19, 2026
@itomek itomek marked this pull request as draft March 19, 2026 17:55
@itomek itomek force-pushed the fix/lru-eviction-449 branch from 6cb61aa to 34fbfa1 Compare March 19, 2026 18:02
@itomek itomek marked this pull request as ready for review March 19, 2026 18:05
@itomek itomek added this pull request to the merge queue Mar 19, 2026
Merged via the queue into main with commit 8a6452f Mar 19, 2026
49 checks passed
@itomek itomek deleted the fix/lru-eviction-449 branch March 19, 2026 18:05
itomek added a commit that referenced this pull request Mar 19, 2026
Forward --max-indexed-files to UI server via GAIA_MAX_INDEXED_FILES env
var. Add DB-level capacity check with LRU eviction in upload_by_path()
since per-upload RAGSDK instances can't track cross-upload state. Add
O_BINARY flag on Windows for safe_open_document() and RAGSDK._safe_open()
to prevent binary/text mode issues with fd-based file reads.
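
The two mechanisms named in this commit message can be illustrated with the standard library alone. The sketch below is an assumption about the general shape, not the contents of `safe_open_document()` or `upload_by_path()`: both helper names are hypothetical, and only the `GAIA_MAX_INDEXED_FILES` variable name comes from the commit.

```python
import os


def open_binary_fd(path: str) -> int:
    """Open a read-only file descriptor in binary mode on every platform.

    os.O_BINARY exists only on Windows, so fall back to 0 elsewhere.
    """
    flags = os.O_RDONLY | getattr(os, "O_BINARY", 0)
    return os.open(path, flags)


def max_indexed_files_from_env(default: int) -> int:
    """Read the limit from GAIA_MAX_INDEXED_FILES, ignoring bad values."""
    raw = os.environ.get("GAIA_MAX_INDEXED_FILES")
    if raw is None:
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default
```

Using `getattr(os, "O_BINARY", 0)` keeps a single code path for Windows and POSIX, since OR-ing in zero leaves the flags unchanged where the constant is missing.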
itomek and others added 3 commits March 19, 2026 18:01
- Add pre-flight capacity check (_has_indexing_capacity) that rejects
  indexing when memory limits are exceeded and eviction is disabled,
  returning success=False with memory_limit_reached=True and a
  descriptive error message
- Change _check_memory_limits() to return bool (True if limits met)
- Add structured logging to _evict_lru_document and _check_memory_limits
  for eviction attempts, failures, and limit violations
- Fix pre-existing bug: cache-load path never set file_access_times or
  file_index_times, making cached files invisible to LRU eviction
- Add max_indexed_files and max_total_chunks to index_document stats dict
- Add > 1 guard to files eviction loop (parity with chunks loop)
- Add 6 unit tests covering all acceptance criteria

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
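
The eviction-related items in the commit above (the bool return, the `> 1` guard, and registering cache-loaded files for LRU tracking) fit together roughly as in this sketch. `LruTracker` and its `add()` method are hypothetical; only the method and attribute names `_evict_lru_document`, `_check_memory_limits`, and `file_access_times` are taken from the commit message, and the bodies are an illustration rather than the merged code.

```python
class LruTracker:
    """Toy model of the file-count limit and LRU eviction."""

    def __init__(self, max_indexed_files: int, enable_lru_eviction: bool = True):
        self.max_indexed_files = max_indexed_files
        self.enable_lru_eviction = enable_lru_eviction
        self.indexed_files = set()
        self.file_access_times = {}
        self._clock = 0  # simple counter used as the access stamp

    def add(self, path: str) -> None:
        """Register a file. Cache-loaded files must pass through here too,
        otherwise they never get an access stamp and cannot be evicted."""
        self._clock += 1
        self.indexed_files.add(path)
        self.file_access_times[path] = self._clock

    def _evict_lru_document(self) -> bool:
        """Evict the least recently used file; False if nothing was evicted."""
        if not self.enable_lru_eviction or not self.file_access_times:
            return False
        victim = min(self.file_access_times, key=self.file_access_times.get)
        self.indexed_files.discard(victim)
        del self.file_access_times[victim]
        return True

    def _check_memory_limits(self) -> bool:
        """Evict until under the limit; True if the limit is met afterwards."""
        while (
            len(self.indexed_files) > self.max_indexed_files
            and len(self.indexed_files) > 1  # never evict the last file
        ):
            if not self._evict_lru_document():
                break  # eviction disabled or failed: report via return value
        return len(self.indexed_files) <= self.max_indexed_files
```

Returning a bool lets the caller distinguish "limit restored" from "eviction could not help", which is the case the original code passed over silently.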
Add --max-indexed-files and --max-total-chunks flags to `gaia chat`
so memory limits can be set at runtime for UI testing and validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use monotonic counter for LRU ordering instead of time.time() (Windows resolution issue)
- Remove redundant default fallbacks in cli.py kwargs.get()
- Document --max-indexed-files and --max-total-chunks in CLI reference
- Rename misleading test and add proper eviction-failure test
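
To show why a monotonic counter gives a more reliable LRU order than wall-clock stamps, the sketch below compares the two. It is illustrative only: the helper names are hypothetical, and the tied `100.0` values simulate two accesses falling into the same clock tick rather than calling `time.time()`.

```python
import itertools

# Process-wide counter: every access gets a strictly larger stamp.
_access_counter = itertools.count(1)


def next_access_stamp() -> int:
    """Return a unique, strictly increasing access stamp."""
    return next(_access_counter)


def lru_victim(access_stamps: dict) -> str:
    """Pick the entry with the smallest stamp (least recently used)."""
    return min(access_stamps, key=access_stamps.get)


# Simulated coarse clock: a.pdf, b.pdf, then a.pdf again, all in one tick.
coarse = {"a.pdf": 100.0, "b.pdf": 100.0}
coarse["a.pdf"] = 100.0  # the re-access is indistinguishable from the first
coarse_victim = lru_victim(coarse)  # tie broken by dict order, not by recency

# Same access sequence with counter stamps.
stamps = {}
stamps["a.pdf"] = next_access_stamp()
stamps["b.pdf"] = next_access_stamp()
stamps["a.pdf"] = next_access_stamp()  # re-access moves a.pdf to the newest slot
counter_victim = lru_victim(stamps)
```

With tied timestamps the victim is `a.pdf`, the file touched most recently, whereas the counter version selects `b.pdf`, the true least recently used entry.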
@itomek itomek mentioned this pull request Mar 27, 2026
4 tasks
github-merge-queue Bot pushed a commit that referenced this pull request Mar 27, 2026
## Summary

Release v0.17.0 — **GAIA Agent UI**, eval benchmark framework, tool
execution guardrails, system prompt optimization, and security
hardening.

### Files Changed
- **`docs/releases/v0.17.0.mdx`** — Comprehensive release notes (new
file)
- **`docs/docs.json`** — Added `releases/v0.17.0` to Releases tab,
updated navbar to `v0.17.0 · Lemonade 10.0.0`
- **`src/gaia/version.py`** — Already at `0.17.0` on main (no change
needed)

### Release Highlights

**New Features:**
- **GAIA Agent UI** — Full-stack privacy-first desktop chat with
streaming responses, 53+ format document Q&A, ngrok tunnel for mobile,
page-level citations, session management (PR #428)
- **Agent UI Eval Framework** — `gaia eval agent` command with
7-dimension weighted scoring across 34 scenarios, redesigned Settings
modal, `<think>` block display, performance stats (PR #607)
- **Tool Execution Guardrails** — Blocking confirmation popup
(Allow/Deny/Always Allow) before write/shell tools, 60s timeout (PR
#565, #604)
- **Device Support Detection** — AMD Ryzen AI Max + Radeon ≥24GB
detection, `--base-url` remote bypass, `GAIA_SKIP_DEVICE_CHECK` override
(PR #593)
- **Terminal UI Design** — Typewriter welcome page, pixelated AMD
cursor, glassmorphism, `prefers-reduced-motion` support (PR #568)

**Performance:**
- **78% System Prompt Reduction** — 17,600 → 3,853 tokens via two-tier
RAG gating, 600s chat timeout, MCP runtime status display (PR #617)

**Security:**
- **TOCTOU Race Condition** — Atomic `O_NOFOLLOW` + `fstat` fix in
document upload, per-file `asyncio.Lock` (PR #564)

**Bug Fixes:**
- LRU eviction silent failure + new
`--max-indexed-files`/`--max-total-chunks` CLI flags (PR #567)
- Lemonade v10 device key renames: `npu` → `amd_npu`, `gpu` →
`amd_igpu`/`amd_dgpu` (PR #548)
- Agent UI rendering, Windows paths, JSON safety regex, RAG indexing
guards (PR #566, #604, #605)
- Restored accidentally reverted changes from PRs #564, #565, #568 (PR
#608)

### Post-Merge
After merging, tag and push:
```bash
git checkout main && git pull
git tag v0.17.0 && git push origin v0.17.0
```
CI runs `validate-release` → `publish-release`. PyPI gated on Kalin
approval.

## Test plan
- [ ] `docs.json` is valid JSON and renders on Mintlify
- [ ] `validate_release_notes.py` passes for v0.17.0
- [ ] `version.py` reads `0.17.0`
- [ ] Release notes content matches actual PR changes
Labels

cli (CLI changes), documentation (Documentation changes), performance (Performance-critical changes), rag (RAG system changes), tests (Test changes)

Development

Successfully merging this pull request may close these issues.

Fix LRU eviction silent failure allowing unbounded memory growth

2 participants