feat: REPL mode, NVIDIA integration, async DB, memory systems #166
Open · jnorthrup wants to merge 9 commits into Intelligent-Internet:develop from jnorthrup:collector
…ystems

Features:
- Python 3.14 compatibility (forward-compatible pydantic, puremagic)
- Interactive REPL mode with tab completion
- NVIDIA provider with qwen3-coder-480b default
- Three memory backends: hashtable, memvid QR, DuckDB
- Split install: full (2GB) vs lite (200MB)

Implementation:
- src/ii_agent/cli/: REPL entry point
- src/ii_agent/db/: Async SQLite, DuckDB
- src/ii_agent/storage/: Memory backends
- requirements-lite.txt: Minimal install

35 files changed, 3904 insertions(+), 1644 deletions(-)
- Checkpoint without eviction (Strategy 5)
- Microkernel generation before checkpoint creation
- Dynamic model-specific thresholds (90% for 64K → 15% for 1MB)
- Context mode transitions: SUSPENDED, HIGH_DETAIL, HIGH_CAPACITY, NORMAL
- RecallContext tool for breadcrumb trail queries
- MicrocontextSubroutine for temporary context expansion
- Tile generation at 33% context for future work streams
- Full integration in ChatService with mode transitions
- Harmonic miss tracking for context pressure analysis
Implemented comprehensive context management system to prevent performance degradation:

Core ACE Components:
- ContextWindowManager: Orchestrates context window monitoring and auto-summarization
- ContextBandOptimizer: Golden band tiling for optimal breadcrumb retrieval
- ContextCliffBenchmark: Background needle-in-haystack testing for cliff detection
- SlabCheckpoint: Non-evicting checkpoint system with microkernel summaries
- TileGenerator: Future work tile generation from TODO structure
- DictionaryStorage: LRU cache-backed memvid storage integration

Model-Specific Features:
- Model constants with context windows and performance cliffs
- Dynamic checkpoint thresholds based on cliff data
- Golden band placement for Claude 3.5, GPT-4o, Gemini, Llama 4, DeepSeek
- Harmonic miss tracking for error-induced threshold adjustment

REPL Enhancements:
- install-repl.sh: Local installation script to ~/.local/bin
- REPL with memvid and duckdb support
- Context window status display
- Token counting infrastructure
- Checkpoint and tile generation integration

Storage Systems:
- MemVid QR-encoded MP4 checkpoints
- DuckDB analytics support
- Slab checkpointing (context NOT evicted)
- Dictionary storage with hashtable indexing

Progressive Reduction Strategy:
- 33% context: Dump and generate tiles
- 90% context: Reduce message tokens
- 95% context: Force summarization

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
REPL uses DuckDB for local storage and doesn't need PostgreSQL migrations. Set IIAGENT_SKIP_MIGRATIONS and IIAGENT_SKIP_SERVER_APP_IMPORT environment variables in the ii-repl wrapper script to prevent migration errors.

Fixes:
- "Error running migrations: 'duckdb'"
- Prevents unnecessary server app initialization in REPL mode
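A sketch of what the ii-repl wrapper might look like. The environment variable names come from the commit; the Python entry-point module is an assumption and is left commented out:

```shell
#!/usr/bin/env bash
# Hypothetical ii-repl wrapper sketch.
export IIAGENT_SKIP_MIGRATIONS=1        # REPL uses DuckDB, not PostgreSQL
export IIAGENT_SKIP_SERVER_APP_IMPORT=1 # skip server app initialization
# exec python -m ii_agent.cli "$@"      # hand off to the REPL entry point (module name assumed)
```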
Replaced static hardcoded model lists with dynamic API-based model fetching:

Model Fetcher:
- Query NVIDIA models/ endpoint for 200+ available models
- Cache fetched models (1-hour TTL) in ~/.ii_agent/model_cache/
- Fetch OpenAI models dynamically from API
- Include known Anthropic and Gemini models

Tab Completion Improvements:
- Support provider/model-slug/sub-slug format (e.g., nvidia/qwen/qwen3-coder-480b)
- Dynamic completion from cached model lists
- File path completion for /add and /drop commands
- Workspace-relative path completion with directory traversal
- CompositeCompleter merges file and model completers

Command Updates:
- /model now accepts provider/model-slug format
- Backward compatible with /model provider model format
- Auto-detects format based on slash presence

Fixes:
- Removed static hardcoded NVIDIA model list (5 models → 200+)
- Proper model slug parsing for NVIDIA multi-slash format
- File completion skips hidden files unless explicitly requested
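The caching layer described above (1-hour TTL under ~/.ii_agent/model_cache/) could be sketched as below. The cache path and TTL come from the commit; the function name and signature are illustrative, and the actual API call is delegated to a caller-supplied `fetch` callable:

```python
import json
import time
from pathlib import Path

# Cache location from the commit message; default TTL is the stated 1 hour.
CACHE_DIR = Path.home() / ".ii_agent" / "model_cache"

def cached_models(provider, fetch, cache_dir=CACHE_DIR, ttl=3600):
    """Return the provider's model list, refetching once the TTL expires."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"{provider}.json"
    # Serve from cache while the file is younger than the TTL.
    if cache_file.exists() and time.time() - cache_file.stat().st_mtime < ttl:
        return json.loads(cache_file.read_text())
    models = fetch()  # caller-supplied call to the provider's models endpoint
    cache_file.write_text(json.dumps(models))
    return models
```

With this shape, a second call within the hour reads the JSON file instead of hitting the provider's endpoint again.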
Updated model tab completion to show only one directory level at a time:

Before:
- /model nvidia/[TAB] → shows all 200+ models
- Overwhelming and hard to navigate

After:
- /model nvidia/[TAB] → shows: qwen/, meta/, google/, nvidia/, kimi/, etc.
- /model nvidia/qwen/[TAB] → shows only qwen models
- /model nvidia/meta/[TAB] → shows only meta models

Implementation:
- Group models by next path segment after typed prefix
- Show unique prefixes with trailing slash
- Only show final model names when no more slashes
- Applied to both provider/model and provider model formats

UX matches bash cd completion behavior for better usability.
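The grouping step the commit describes (one level per TAB, unique prefixes with trailing slashes) can be sketched like this. The helper name is hypothetical; only the behavior matches the commit's description:

```python
def complete_level(prefix: str, models: list[str]) -> list[str]:
    """Return one directory level of completions, like bash `cd`.

    Group matching models by the next path segment after the typed
    prefix; intermediate segments get a trailing slash, leaves are
    shown whole.
    """
    suggestions = set()
    for model in models:
        if not model.startswith(prefix):
            continue
        remainder = model[len(prefix):]
        head, sep, _ = remainder.partition("/")
        suggestions.add(prefix + head + ("/" if sep else ""))
    return sorted(suggestions)

# Illustrative model slugs, not a real provider listing.
models = [
    "qwen/qwen3-coder-480b",
    "qwen/qwen2.5-7b",
    "meta/llama-3.1-70b",
]
print(complete_level("", models))       # ['meta/', 'qwen/']
print(complete_level("qwen/", models))  # ['qwen/qwen2.5-7b', 'qwen/qwen3-coder-480b']
```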
Added slash command completion and improved path-like navigation:

Command Completion:
- / + TAB → shows all available commands (/help, /model, /add, /drop, etc.)
- /mo + TAB → completes to /model
- /ad + TAB → completes to /add

Model Completion (bash-like):
- /model nvidia/ + TAB → qwen/, meta/, google/, nvidia/
- /model nvidia/qwen/ + TAB → shows only qwen models at this level
- Hierarchical navigation exactly like bash cd

File Completion (bash-like):
- /add src/ + TAB → shows files in src/ directory
- /add src/ii_agent/ + TAB → shows files in ii_agent/
- Shows just filenames at current level (not full paths)
- Directories have trailing slash
- Preserves typed path prefix

Implementation:
- CommandCompleter: handles / command completion
- FileCompleter: bash-like file path navigation
- ModelCompleter: hierarchical model path navigation
- CompositeCompleter: merges all three completers

UX now matches bash completion behavior exactly.
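The merging role of CompositeCompleter could look like the sketch below. This is an illustration of the merge-and-deduplicate idea only; the PR's real classes and their interfaces are not shown in the diff here, so each child completer is reduced to a plain callable:

```python
class CompositeCompleter:
    """Merge suggestions from several completion sources, de-duplicated,
    preserving each source's order (an illustrative sketch)."""

    def __init__(self, *completers):
        self.completers = completers

    def complete(self, text):
        seen, merged = set(), []
        for completer in self.completers:
            for suggestion in completer(text):
                if suggestion not in seen:  # de-duplicate across sources
                    seen.add(suggestion)
                    merged.append(suggestion)
        return merged

# Toy stand-ins for CommandCompleter and FileCompleter.
commands = lambda text: [c for c in ["/help", "/model", "/add"] if c.startswith(text)]
files = lambda text: [f for f in ["src/", "README.md"] if f.startswith(text)]
completer = CompositeCompleter(commands, files)
print(completer.complete("/"))  # ['/help', '/model', '/add']
```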
…unds

Replaced all unreadable colors with readable ones on black backgrounds:
- Warnings: YELLOW → WHITE
- Info: CYAN → WHITE
- Commands: CYAN → WHITE
- Env vars: YELLOW → WHITE
- Header: CYAN → BLUE

Only using readable ANSI colors on black backgrounds:
✓ RED - error messages
✓ GREEN - success, files, models
✓ BLUE - prompts, workspace, headers
✓ WHITE - info, warnings, commands

Removed: CYAN, YELLOW, MAGENTA (unreadable on black)
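The retained palette maps onto standard ANSI escape codes. A small sketch, with role names chosen to mirror the list above (the dict and helper are illustrative, not the PR's code):

```python
# Palette from the commit: only colors readable on black backgrounds.
ANSI = {
    "error": "\033[31m",    # RED    - error messages
    "success": "\033[32m",  # GREEN  - success, files, models
    "prompt": "\033[34m",   # BLUE   - prompts, workspace, headers
    "info": "\033[37m",     # WHITE  - info, warnings, commands
}
RESET = "\033[0m"

def colorize(kind: str, message: str) -> str:
    """Wrap a message in the ANSI code for its role, resetting afterwards."""
    return f"{ANSI[kind]}{message}{RESET}"

print(colorize("error", "migration failed"))
```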
…l completion

Replaces text.split() tokenization with cursor position-based word extraction to enable bash-like file completion behavior for model paths with slashes.

Key changes:
- Extract current word at cursor position instead of tokenizing entire command
- Maintain one-level-at-a-time completion like bash cd
- Handle partial model paths correctly at any cursor position
- Add model_suggestions attribute for test compatibility
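The core of the change, cursor-position word extraction instead of text.split(), can be sketched in a few lines (the helper name is hypothetical):

```python
def word_at_cursor(text: str, cursor: int) -> str:
    """Return the whitespace-delimited word ending at `cursor`.

    Taking everything back to the last space before the cursor keeps
    slash-containing model paths intact as a single completion target,
    which plain text.split() tokenization would also do per token but
    only relative to the whole line, not the cursor position.
    """
    start = text.rfind(" ", 0, cursor) + 1
    return text[start:cursor]

line = "/model nvidia/qwen/qw"
print(word_at_cursor(line, len(line)))  # nvidia/qwen/qw
```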

Major Feature Release: Interactive REPL, NVIDIA Provider, Async Database, Memory Systems
This PR adds a lite, minimalist CLI wheel option (`--repl`).
🎯 Major Features
1. Interactive REPL Mode (aider-style CLI)
Changelog
collector branch (2025-11-19)
Features
Implementation
- src/ii_agent/cli/: REPL mode entry point and implementation
- src/ii_agent/db/: Async SQLite, DuckDB integration
- src/ii_agent/storage/: Three memory backend options
- src/ii_agent/server/api/nvidia_models.py: NVIDIA model fetching
- scripts/fetch_nvidia_models.py: Bayesian ranking script

Breaking Changes
Documentation
- docs/INSTALL.md: Installation guide
- requirements-lite.txt: Minimal REPL-only install