@jnorthrup jnorthrup commented Nov 20, 2025

Major Feature Release: Interactive REPL, NVIDIA Provider, Async Database, Memory Systems

This PR adds a lite, minimalist CLI wheel install with a --repl option.

🎯 Major Features

1. Interactive REPL Mode (aider-style CLI)


Changelog

collector branch (2025-11-19)

Features

  • Python 3.14 compatibility: Forward-compatible pydantic, puremagic instead of imghdr
  • REPL mode: Interactive CLI with tab completion for model/provider selection
  • NVIDIA provider: Qwen3-coder-480b default, Bayesian model ranking
  • Memory systems: Hashtable, memvid QR, DuckDB vector search
  • Installation options: Full (2GB) or lite (200MB via requirements-lite.txt)

Implementation

  • src/ii_agent/cli/: REPL mode entry point and implementation
  • src/ii_agent/db/: Async SQLite, DuckDB integration
  • src/ii_agent/storage/: Three memory backend options
  • src/ii_agent/server/api/nvidia_models.py: NVIDIA model fetching
  • scripts/fetch_nvidia_models.py: Bayesian ranking script
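The "Bayesian ranking" mentioned for `scripts/fetch_nvidia_models.py` is not spelled out in this PR text; a common minimal form is a Beta-Bernoulli posterior mean over per-model success/trial counts, which avoids ranking a model with 1/1 successes above one with 95/100. A hedged sketch under that assumption (priors and names are illustrative, not the script's actual code):

```python
# Illustrative Beta-Bernoulli ranking; the actual scoring in
# scripts/fetch_nvidia_models.py may differ.
def bayesian_score(successes: int, trials: int,
                   prior_alpha: float = 1.0, prior_beta: float = 1.0) -> float:
    """Posterior mean success rate under a Beta(alpha, beta) prior."""
    return (successes + prior_alpha) / (trials + prior_alpha + prior_beta)

def rank_models(stats: dict[str, tuple[int, int]]) -> list[str]:
    """Sort model names by posterior mean score, best first."""
    return sorted(stats, key=lambda m: bayesian_score(*stats[m]), reverse=True)
```

With a uniform Beta(1, 1) prior, a model with no data scores 0.5, so unseen models land mid-pack instead of at either extreme.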

Breaking Changes

  • Unpinned pydantic (was ==2.11.7, now >=2.11.7)
  • Removed deprecated test files

Documentation

  • docs/INSTALL.md: Installation guide
  • requirements-lite.txt: Minimal REPL-only install

…ystems

Features:
- Python 3.14 compatibility (forward-compatible pydantic, puremagic)
- Interactive REPL mode with tab completion
- NVIDIA provider with qwen3-coder-480b default
- Three memory backends: hashtable, memvid QR, DuckDB
- Split install: full (2GB) vs lite (200MB)

Implementation:
- src/ii_agent/cli/: REPL entry point
- src/ii_agent/db/: Async SQLite, DuckDB
- src/ii_agent/storage/: Memory backends
- requirements-lite.txt: Minimal install

35 files changed, 3904 insertions(+), 1644 deletions(-)
- Checkpoint without eviction (Strategy 5)
- Microkernel generation before checkpoint creation
- Dynamic model-specific thresholds (90% for a 64K window → 15% for a 1M-token window)
- Context mode transitions: SUSPENDED, HIGH_DETAIL, HIGH_CAPACITY, NORMAL
- RecallContext tool for breadcrumb trail queries
- MicrocontextSubroutine for temporary context expansion
- Tile generation at 33% context for future work streams
- Full integration in ChatService with mode transitions
- Harmonic miss tracking for context pressure analysis
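The "dynamic model-specific thresholds" bullet above gives only two endpoints (90% at 64K, 15% at 1M); how the PR interpolates between them is not shown. One plausible scheme is log-interpolation over window size — a sketch under that assumption:

```python
import math

# Hypothetical interpolation of the checkpoint threshold between the two
# endpoints named in the changelog; the log-linear scheme is an assumption.
def checkpoint_threshold(window: int,
                         lo: tuple[int, float] = (64_000, 0.90),
                         hi: tuple[int, float] = (1_000_000, 0.15)) -> float:
    """Fraction of the context window at which to checkpoint."""
    if window <= lo[0]:
        return lo[1]
    if window >= hi[0]:
        return hi[1]
    t = (math.log(window) - math.log(lo[0])) / (math.log(hi[0]) - math.log(lo[0]))
    return lo[1] + t * (hi[1] - lo[1])
```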
jnorthrup and others added 7 commits November 20, 2025 18:58
Implemented comprehensive context management system to prevent performance degradation:

Core ACE Components:
- ContextWindowManager: Orchestrates context window monitoring and auto-summarization
- ContextBandOptimizer: Golden band tiling for optimal breadcrumb retrieval
- ContextCliffBenchmark: Background needle-in-haystack testing for cliff detection
- SlabCheckpoint: Non-evicting checkpoint system with microkernel summaries
- TileGenerator: Future work tile generation from TODO structure
- DictionaryStorage: LRU cache-backed memvid storage integration

Model-Specific Features:
- Model constants with context windows and performance cliffs
- Dynamic checkpoint thresholds based on cliff data
- Golden band placement for Claude 3.5, GPT-4o, Gemini, Llama 4, DeepSeek
- Harmonic miss tracking for error-induced threshold adjustment

REPL Enhancements:
- install-repl.sh: Local installation script to ~/.local/bin
- REPL with memvid and duckdb support
- Context window status display
- Token counting infrastructure
- Checkpoint and tile generation integration

Storage Systems:
- MemVid QR-encoded MP4 checkpoints
- DuckDB analytics support
- Slab checkpointing (context NOT evicted)
- Dictionary storage with hashtable indexing

Progressive Reduction Strategy:
- 33% context: Dump and generate tiles
- 90% context: Reduce message tokens
- 95% context: Force summarization
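The three-stage strategy above maps directly to a utilization check; a minimal sketch, with action names as illustrative stand-ins for the PR's internal handlers:

```python
# Minimal sketch of the progressive reduction ladder described above.
def reduction_action(used_tokens: int, window_tokens: int) -> str:
    ratio = used_tokens / window_tokens
    if ratio >= 0.95:
        return "force_summarize"          # 95%: force summarization
    if ratio >= 0.90:
        return "reduce_message_tokens"    # 90%: trim message tokens
    if ratio >= 0.33:
        return "dump_and_generate_tiles"  # 33%: dump and generate tiles
    return "none"
```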

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
The REPL uses DuckDB for local storage and doesn't need PostgreSQL migrations.
Set IIAGENT_SKIP_MIGRATIONS and IIAGENT_SKIP_SERVER_APP_IMPORT environment
variables in the ii-repl wrapper script to prevent migration errors.

Fixes:
- Error running migrations: 'duckdb'
- Prevents unnecessary server app initialization in REPL mode
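The `ii-repl` wrapper itself is presumably a shell script; the Python-side equivalent of what it does, per the text above, is to set both skip flags before the package is imported. A hedged sketch (the flag names come from this PR; the `"1"` value is an assumption):

```python
import os

# Set the skip flags before importing anything from ii_agent, so the
# server app and its PostgreSQL migrations are never touched in REPL mode.
os.environ.setdefault("IIAGENT_SKIP_MIGRATIONS", "1")
os.environ.setdefault("IIAGENT_SKIP_SERVER_APP_IMPORT", "1")
```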

Replaced static hardcoded model lists with dynamic API-based model fetching:

Model Fetcher:
- Query the NVIDIA models/ endpoint for 200+ available models
- Cache fetched models (1-hour TTL) in ~/.ii_agent/model_cache/
- Fetch OpenAI models dynamically from API
- Include known Anthropic and Gemini models

Tab Completion Improvements:
- Support provider/model-slug/sub-slug format (e.g., nvidia/qwen/qwen3-coder-480b)
- Dynamic completion from cached model lists
- File path completion for /add and /drop commands
- Workspace-relative path completion with directory traversal
- CompositeCompleter merges file and model completers

Command Updates:
- /model now accepts provider/model-slug format
- Backward compatible with /model provider model format
- Auto-detects format based on slash presence

Fixes:
- Removed static hardcoded NVIDIA model list (5 models → 200+)
- Proper model slug parsing for NVIDIA multi-slash format
- File completion skips hidden files unless explicitly requested
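The slash-based auto-detection in the Command Updates list reduces to one branch; a hedged sketch (function name is hypothetical, and note that `partition` splits on the first slash, so NVIDIA's multi-slash slugs like `qwen/qwen3-coder-480b` survive intact):

```python
# Illustrative parser for the two /model argument forms described above.
def parse_model_arg(args: list[str]) -> tuple[str, str]:
    """Return (provider, model_slug) from either argument format."""
    if len(args) == 1 and "/" in args[0]:
        provider, _, slug = args[0].partition("/")  # split on first slash only
        return provider, slug
    if len(args) == 2:                              # legacy: provider model
        return args[0], args[1]
    raise ValueError("usage: /model <provider>/<slug> or /model <provider> <slug>")
```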

Updated model tab completion to show only one directory level at a time:

Before:
- /model nvidia/[TAB] → shows all 200+ models
- Overwhelming and hard to navigate

After:
- /model nvidia/[TAB] → shows: qwen/, meta/, google/, nvidia/, kimi/, etc.
- /model nvidia/qwen/[TAB] → shows only qwen models
- /model nvidia/meta/[TAB] → shows only meta models

Implementation:
- Group models by next path segment after typed prefix
- Show unique prefixes with trailing slash
- Only show final model names when no more slashes
- Applied to both provider/model and provider model formats
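The grouping step in the Implementation list above can be sketched as follows (a minimal illustration, not the PR's actual code):

```python
# Given a typed prefix, offer only the next path segment, with a trailing
# slash when deeper segments remain -- like bash cd completion.
def next_level_completions(prefix: str, models: list[str]) -> list[str]:
    out: set[str] = set()
    for m in models:
        if not m.startswith(prefix):
            continue
        rest = m[len(prefix):]
        head, slash, _ = rest.partition("/")
        out.add(head + "/" if slash else head)  # slash means more levels below
    return sorted(out)
```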

UX matches bash cd completion behavior for better usability.

Added slash command completion and improved path-like navigation:

Command Completion:
- / + TAB → shows all available commands (/help, /model, /add, /drop, etc.)
- /mo + TAB → completes to /model
- /ad + TAB → completes to /add

Model Completion (bash-like):
- /model nvidia/ + TAB → qwen/, meta/, google/, nvidia/
- /model nvidia/qwen/ + TAB → shows only qwen models at this level
- Hierarchical navigation exactly like bash cd

File Completion (bash-like):
- /add src/ + TAB → shows files in src/ directory
- /add src/ii_agent/ + TAB → shows files in ii_agent/
- Shows just filenames at current level (not full paths)
- Directories have trailing slash
- Preserves typed path prefix

Implementation:
- CommandCompleter: handles / command completion
- FileCompleter: bash-like file path navigation
- ModelCompleter: hierarchical model path navigation
- CompositeCompleter: merges all three completers

UX now matches bash completion behavior exactly.
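The merge step of the CompositeCompleter can be sketched as below. This is an assumption about shape only — the PR text does not say which completion framework (readline, prompt_toolkit, etc.) backs these classes, so child completers are modeled as plain callables:

```python
# Hypothetical sketch of CompositeCompleter: try each child completer in
# turn and concatenate their candidates, preserving registration order.
class CompositeCompleter:
    def __init__(self, *completers):
        self.completers = completers  # e.g. command, file, model completers

    def complete(self, text: str) -> list[str]:
        results: list[str] = []
        for completer in self.completers:
            results.extend(completer(text))
        return results
```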

…unds

Replaced all unreadable colors with readable ones on black backgrounds:
- Warnings: YELLOW → WHITE
- Info: CYAN → WHITE
- Commands: CYAN → WHITE
- Env vars: YELLOW → WHITE
- Header: CYAN → BLUE

Only using readable ANSI colors on black backgrounds:
✓ RED - error messages
✓ GREEN - success, files, models
✓ BLUE - prompts, workspace, headers
✓ WHITE - info, warnings, commands

Removed: CYAN, YELLOW, MAGENTA (unreadable on black)
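The reduced palette above corresponds to the standard ANSI SGR escape codes; a minimal sketch (helper name is illustrative, not the PR's actual function):

```python
# Standard ANSI escape codes for the readable-on-black palette above.
RED, GREEN, BLUE, WHITE, RESET = (
    "\033[31m", "\033[32m", "\033[34m", "\033[37m", "\033[0m"
)

def colorize(msg: str, color: str) -> str:
    """Wrap msg in a color code and reset afterwards."""
    return f"{color}{msg}{RESET}"
```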
…l completion

Replaces text.split() tokenization with cursor position-based word extraction
to enable bash-like file completion behavior for model paths with slashes.

Key changes:
- Extract current word at cursor position instead of tokenizing entire command
- Maintain one-level-at-a-time completion like bash cd
- Handle partial model paths correctly at any cursor position
- Add model_suggestions attribute for test compatibility
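The cursor-position word extraction described above can be sketched in a few lines — find the last space before the cursor and take everything up to the cursor, so slash-containing model paths stay one word (a simplified illustration; the real change presumably lives inside the completer classes):

```python
# Extract the word under the cursor instead of tokenizing the whole line,
# so partial paths like "nvidia/qw" complete correctly mid-command.
def word_at_cursor(text: str, cursor: int) -> str:
    start = text.rfind(" ", 0, cursor) + 1  # -1 + 1 == 0 when no space found
    return text[start:cursor]
```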
@jnorthrup (Author) commented:
This adds a new shortcut script, ii-repl.

There is tab completion for models, and the NVIDIA models are working well, with all ~200 of them available under tab completion.


Anthropic Claude produced a long chain of reasoning steps to wire up the TODO, which I handed over to the Raptor model for persistent clean contexts.

The ability of sonnet-4.5 to fail dozens of times on simple tab completions, while GLM solves them in under a minute, must be commended.

Wiring up the tools remains to be completed, and a productive finite state machine has yet to be mined from the server code or borrowed by walking backwards through Hugging Face papers in search of a good one.

I brought over some adaptive context engineering notions I've had, which to my knowledge are not often designed for boundless collections of providers and quotas.

These await lacing up the tools; the model does not yet appear to receive the combined prompt and tool listings.
