| 1 | +# Project Guide |
| 2 | + |
| 3 | +This guide is a practical map of the entire repository for contributors and maintainers. |
| 4 | + |
| 5 | +It focuses on: |
| 6 | +- What each folder is responsible for |
| 7 | +- Which Python environment and package workflow are the defaults |
| 8 | +- Which commands are currently valid |
| 9 | +- What to improve next in architecture, testing, performance, and developer experience |
| 10 | + |
| 11 | +## 1) System Summary |
| 12 | + |
| 13 | +AI Imaging Agent is a RAG plus VLM recommender for imaging software. |
| 14 | + |
| 15 | +High-level flow: |
| 16 | +1. User uploads file(s) and enters a task. |
| 17 | +2. Retrieval stage finds candidate tools (BGE-M3 + FAISS + reranker). |
| 18 | +3. Agent/VLM stage ranks candidates with image-aware reasoning. |
| 19 | +4. UI renders ranked recommendations and optional demo links. |
| 20 | + |
| 21 | +Primary orchestrator: [src/ai_agent/api/pipeline.py](../src/ai_agent/api/pipeline.py) |
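
The four-step flow above can be pictured as a retrieve-then-rank pipeline. The following is a minimal sketch under stated assumptions, not the code in `pipeline.py`: word overlap stands in for BGE-M3 + FAISS + reranker retrieval, a format-support bonus stands in for the image-aware VLM ranking, and every name (`Candidate`, `retrieve`, `rank`, `recommend`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    score: float


def retrieve(query: str, catalog: dict[str, str], top_k: int = 3) -> list[Candidate]:
    """Stand-in retrieval stage: score each tool by word overlap with the query."""
    query_words = set(query.lower().split())
    scored = [
        Candidate(name, float(len(query_words & set(description.lower().split()))))
        for name, description in catalog.items()
    ]
    scored.sort(key=lambda c: c.score, reverse=True)
    return scored[:top_k]


def rank(
    candidates: list[Candidate],
    image_formats: set[str],
    supported: dict[str, set[str]],
) -> list[Candidate]:
    """Stand-in agent/VLM stage: boost tools that support the uploaded formats."""

    def bonus(candidate: Candidate) -> float:
        return 1.0 if image_formats & supported.get(candidate.name, set()) else 0.0

    return sorted(candidates, key=lambda c: c.score + bonus(c), reverse=True)


def recommend(
    task: str,
    image_formats: set[str],
    catalog: dict[str, str],
    supported: dict[str, set[str]],
) -> list[str]:
    """Run both stages and return tool names in recommended order."""
    return [c.name for c in rank(retrieve(task, catalog), image_formats, supported)]
```

The point of the sketch is the stage boundary: the second stage only reorders what the first stage returned, which is why retrieval quality caps the quality of the final recommendations.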
| 22 | + |
| 23 | +## 2) Default Python Environment And Packages (Dev Container Canonical) |
| 24 | + |
| 25 | +Assume development is done inside the dev container. |
| 26 | + |
| 27 | +Source of truth: |
| 28 | +- Dev container: [.devcontainer/devcontainer.json](../.devcontainer/devcontainer.json) |
| 29 | +- Package metadata and pinned dependencies: [pyproject.toml](../pyproject.toml) |
| 30 | +- Secondary dependency list: [requirements.txt](../requirements.txt) |
| 31 | + |
| 32 | +Default environment: |
| 33 | +- OS: Debian Bookworm (dev container) |
| 34 | +- Python: 3.12 |
| 35 | +- Environment manager: uv |
| 36 | +- Virtual environment path: .venv |
| 37 | + |
| 38 | +Recommended commands: |
| 39 | + |
| 40 | +```bash |
| 41 | +uv venv |
| 42 | +uv pip install -e . |
| 43 | +uv pip install -e ".[dev]" |
| 44 | +``` |
| 45 | + |
| 46 | +Run and test: |
| 47 | + |
| 48 | +```bash |
| 49 | +ai_agent chat |
| 50 | +ai_agent sync |
| 51 | +pytest tests/ |
| 52 | +``` |
| 53 | + |
| 54 | +Important note on command drift: |
| 55 | +- The CLI supports the `chat` and `sync` modes, as defined in [src/ai_agent/cli.py](../src/ai_agent/cli.py). |
| 56 | +- The [justfile](../justfile) still references `ai_agent ui`, which is not a current CLI mode. |
| 57 | +- This guide follows the actual CLI implementation. |
| 58 | + |
| 59 | +## 3) Repository Top-Level Map |
| 60 | + |
| 61 | +- [.github/](../.github/): automation and agent instructions |
| 62 | +- [.devcontainer/](../.devcontainer/): dev container build and editor defaults |
| 63 | +- [docs/](.): MkDocs source pages |
| 64 | +- [src/](../src/): application source code |
| 65 | +- [tests/](../tests/): test suite |
| 66 | +- [data/](../data/): sample data assets |
| 67 | +- [tools/](../tools/): container/tooling helpers |
| 68 | +- [CHANGELOG.md](../CHANGELOG.md): release history |
| 69 | +- [config.yaml](../config.yaml): model/provider configuration |
| 70 | +- [mkdocs.yml](../mkdocs.yml): docs site navigation and theme |
| 71 | +- [pyproject.toml](../pyproject.toml): package metadata, dependencies, entrypoints |
| 72 | + |
| 73 | +## 4) Detailed Source Folder Responsibilities |
| 74 | + |
| 75 | +Package root: [src/ai_agent/](../src/ai_agent) |
| 76 | + |
| 77 | +### 4.1 [src/ai_agent/agent/](../src/ai_agent/agent) |
| 78 | + |
| 79 | +Purpose: conversational orchestration using PydanticAI. |
| 80 | + |
| 81 | +Key files: |
| 82 | +- [src/ai_agent/agent/agent.py](../src/ai_agent/agent/agent.py): agent setup, tool wiring, response flow |
| 83 | +- [src/ai_agent/agent/models.py](../src/ai_agent/agent/models.py): state/output models |
| 84 | +- [src/ai_agent/agent/utils.py](../src/ai_agent/agent/utils.py): helper utilities and guardrails |
| 85 | +- [src/ai_agent/agent/tools/](../src/ai_agent/agent/tools): concrete tool implementations |
| 86 | +- [src/ai_agent/agent/tools/mcp/](../src/ai_agent/agent/tools/mcp): MCP adapters |
| 87 | + |
| 88 | +Boundary: |
| 89 | +- Should orchestrate tools and policy, not own retrieval internals. |
| 90 | + |
| 91 | +### 4.2 [src/ai_agent/api/](../src/ai_agent/api) |
| 92 | + |
| 93 | +Purpose: pipeline orchestration between inputs, retrieval, and selection. |
| 94 | + |
| 95 | +Key file: |
| 96 | +- [src/ai_agent/api/pipeline.py](../src/ai_agent/api/pipeline.py) |
| 97 | + |
| 98 | +Responsibilities: |
| 99 | +- validate files |
| 100 | +- extract metadata |
| 101 | +- build retrieval query |
| 102 | +- call retrieval and selection stages |
| 103 | +- manage index refresh/reload behavior |
| 104 | + |
| 105 | +Boundary: |
| 106 | +- Keep UI concerns out of this module. |
| 107 | + |
| 108 | +### 4.3 [src/ai_agent/retriever/](../src/ai_agent/retriever) |
| 109 | + |
| 110 | +Purpose: deterministic retrieval stack (no LLM calls). |
| 111 | + |
| 112 | +Key files: |
| 113 | +- [src/ai_agent/retriever/text_embedder.py](../src/ai_agent/retriever/text_embedder.py) |
| 114 | +- [src/ai_agent/retriever/vector_index.py](../src/ai_agent/retriever/vector_index.py) |
| 115 | +- [src/ai_agent/retriever/reranker.py](../src/ai_agent/retriever/reranker.py) |
| 116 | +- [src/ai_agent/retriever/software_doc.py](../src/ai_agent/retriever/software_doc.py) |
| 117 | + |
| 118 | +Boundary: |
| 119 | +- Retrieval quality logic should stay here. |
| 120 | + |
| 121 | +### 4.4 [src/ai_agent/generator/](../src/ai_agent/generator) |
| 122 | + |
| 123 | +Purpose: selection schema and prompting primitives. |
| 124 | + |
| 125 | +Key files: |
| 126 | +- [src/ai_agent/generator/prompts.py](../src/ai_agent/generator/prompts.py) |
| 127 | +- [src/ai_agent/generator/schema.py](../src/ai_agent/generator/schema.py) |
| 128 | + |
| 129 | +Boundary: |
| 130 | +- Keep this layer focused on schema and prompt contracts, not transport/UI concerns. |
| 131 | + |
| 132 | +### 4.5 [src/ai_agent/ui/](../src/ai_agent/ui) |
| 133 | + |
| 134 | +Purpose: Gradio app and interaction handling. |
| 135 | + |
| 136 | +Key files: |
| 137 | +- [src/ai_agent/ui/app.py](../src/ai_agent/ui/app.py) |
| 138 | +- [src/ai_agent/ui/handlers.py](../src/ai_agent/ui/handlers.py) |
| 139 | +- [src/ai_agent/ui/components.py](../src/ai_agent/ui/components.py) |
| 140 | +- [src/ai_agent/ui/formatters.py](../src/ai_agent/ui/formatters.py) |
| 141 | +- [src/ai_agent/ui/state.py](../src/ai_agent/ui/state.py) |
| 142 | +- [src/ai_agent/ui/visualizations.py](../src/ai_agent/ui/visualizations.py) |
| 143 | + |
| 144 | +Boundary: |
| 145 | +- UI should call orchestrators, not reimplement retrieval/selection decisions. |
| 146 | + |
| 147 | +### 4.6 [src/ai_agent/utils/](../src/ai_agent/utils) |
| 148 | + |
| 149 | +Purpose: cross-cutting utility functions. |
| 150 | + |
| 151 | +Key files: |
| 152 | +- [src/ai_agent/utils/config.py](../src/ai_agent/utils/config.py) |
| 153 | +- [src/ai_agent/utils/file_validator.py](../src/ai_agent/utils/file_validator.py) |
| 154 | +- [src/ai_agent/utils/image_meta.py](../src/ai_agent/utils/image_meta.py) |
| 155 | +- [src/ai_agent/utils/image_io.py](../src/ai_agent/utils/image_io.py) |
| 156 | +- [src/ai_agent/utils/previews.py](../src/ai_agent/utils/previews.py) |
| 157 | +- [src/ai_agent/utils/tags.py](../src/ai_agent/utils/tags.py) |
| 158 | +- [src/ai_agent/utils/temp_file_manager.py](../src/ai_agent/utils/temp_file_manager.py) |
| 159 | + |
| 160 | +Boundary: |
| 161 | +- Keep utilities reusable and independent from UI-specific logic. |
| 162 | + |
| 163 | +### 4.7 [src/ai_agent/catalog/](../src/ai_agent/catalog) |
| 164 | + |
| 165 | +Purpose: catalog synchronization and refresh helpers. |
| 166 | + |
| 167 | +Key file: |
| 168 | +- [src/ai_agent/catalog/sync.py](../src/ai_agent/catalog/sync.py) |
| 169 | + |
| 170 | +Boundary: |
| 171 | +- Catalog IO and sync logic should stay isolated from ranking logic. |
| 172 | + |
| 173 | +### 4.8 [src/ai_agent/core/](../src/ai_agent/core) |
| 174 | + |
| 175 | +Purpose: shared core coordination such as pipeline registry. |
| 176 | + |
| 177 | +Key file: |
| 178 | +- [src/ai_agent/core/pipeline_registry.py](../src/ai_agent/core/pipeline_registry.py) |
| 179 | + |
| 180 | +Boundary: |
| 181 | +- Keep core primitives minimal and dependency-light. |
| 182 | + |
| 183 | +### 4.9 [src/ai_agent/queries/](../src/ai_agent/queries) |
| 184 | + |
| 185 | +Purpose: query assets used by catalog sync/retrieval support. |
| 186 | + |
| 187 | +Key file: |
| 188 | +- [src/ai_agent/queries/get_relevant_software.rq](../src/ai_agent/queries/get_relevant_software.rq) |
| 189 | + |
| 190 | +Boundary: |
| 191 | +- Keep query definitions versioned and testable. |
| 192 | + |
| 193 | +### 4.10 [src/ai_agent/cli.py](../src/ai_agent/cli.py) |
| 194 | + |
| 195 | +Purpose: command entry point and mode dispatch. |
| 196 | + |
| 197 | +Current modes: |
| 198 | +- `chat` |
| 199 | +- `sync` |
| 200 | + |
| 201 | +This is the command contract that the docs and task runners should follow. |
| 202 | + |
| 203 | +## 5) Supporting Folders |
| 204 | + |
| 205 | +### 5.1 [tests/](../tests) |
| 206 | + |
| 207 | +Contains unit/integration tests and test fixtures under [tests/data/](../tests/data). |
| 208 | + |
| 209 | +Improvement target: |
| 210 | +- add more focused tests for UI handler edge cases and tool failure handling. |
| 211 | + |
| 212 | +### 5.2 [tools/](../tools) |
| 213 | + |
| 214 | +Container and deployment support assets. |
| 215 | + |
| 216 | +Notable file: |
| 217 | +- [tools/image/Dockerfile](../tools/image/Dockerfile) (uv + Python 3.12 baseline) |
| 218 | + |
| 219 | +### 5.3 [docs/](.) |
| 220 | + |
| 221 | +Documentation source for MkDocs. |
| 222 | + |
| 223 | +Add new pages to [mkdocs.yml](../mkdocs.yml) nav to keep docs discoverable. |
| 224 | + |
| 225 | +## 6) Known Inconsistencies To Track |
| 226 | + |
| 227 | +1. [justfile](../justfile) uses `ai_agent ui`, while [src/ai_agent/cli.py](../src/ai_agent/cli.py) defines `chat` and `sync`. |
| 228 | +2. Installation docs often show pip-first flow, while dev container bootstrap is uv-first. |
| 229 | +3. [requirements.txt](../requirements.txt) is looser than [pyproject.toml](../pyproject.toml), which contains current pinned/runtime dependencies. |
| 230 | + |
| 231 | +## 7) Codebase Improvement Guidelines |
| 232 | + |
| 233 | +### 7.1 Architecture And Modularity |
| 234 | + |
| 235 | +1. Keep strict stage boundaries: retrieval logic in `retriever`, selection contracts in `generator`, orchestration in `api`. |
| 236 | +2. Minimize cross-layer imports from `ui` to low-level modules. |
| 237 | +3. Introduce lightweight interface contracts for tool adapters to reduce coupling in `agent/tools`. |
| 238 | +4. Centralize shared constants/env defaults to reduce duplicated configuration behavior. |
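
One way to express the lightweight interface contract from item 3 is a `typing.Protocol`. This is a sketch only: `ToolAdapter`, `EchoSearch`, and `call_tool` are hypothetical names, and the real adapters in `agent/tools` may have a different shape.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ToolAdapter(Protocol):
    """Hypothetical contract: every tool exposes a name and a run() method."""

    name: str

    def run(self, query: str) -> dict[str, Any]: ...


class EchoSearch:
    """Toy adapter that satisfies the contract without inheriting from it."""

    name = "echo_search"

    def run(self, query: str) -> dict[str, Any]:
        return {"tool": self.name, "results": [query]}


def call_tool(tool: ToolAdapter, query: str) -> dict[str, Any]:
    """Invoke a tool, rejecting objects that do not satisfy the contract."""
    if not isinstance(tool, ToolAdapter):
        raise TypeError(f"{type(tool).__name__} does not implement ToolAdapter")
    return tool.run(query)
```

Structural typing keeps adapters decoupled: the agent depends on the protocol, and a new tool only has to provide `name` and `run`. Note that the runtime `isinstance` check verifies that the members exist, not their signatures; a type checker such as `mypy` covers the rest.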
| 239 | + |
| 240 | +### 7.2 Testing And Quality Gates |
| 241 | + |
| 242 | +1. Add regression tests for format-token query construction and retry broadening behavior. |
| 243 | +2. Add failure-path tests for image preview generation and graceful degradation. |
| 244 | +3. Add contract tests for agent tool outputs (search, alternative search, repo info). |
| 245 | +4. Enforce formatting/lint/type checks in CI (`ruff`, `black --check`, `mypy`, `pytest`). |
| 246 | + |
| 247 | +### 7.3 Performance And Retrieval Quality |
| 248 | + |
| 249 | +1. Add benchmark fixtures for retrieval latency and reranker throughput. |
| 250 | +2. Track retrieval quality with a small fixed evaluation set (top-k recall, MRR). |
| 251 | +3. Cache expensive metadata extraction where safe for repeated files in a session. |
| 252 | +4. Make index reload behavior observable with structured counters in logs. |
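
The two metrics named in item 2 are simple to compute over a fixed evaluation set. The sketch below uses the standard definitions of recall@k and mean reciprocal rank; the function names and the input layout (a ranked list of tool names plus a set of relevant names per query) are assumptions for illustration.

```python
def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / position of the first relevant item, or 0.0 if none is retrieved."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs: list[tuple[list[str], set[str]]]) -> float:
    """Average reciprocal rank over (ranked results, relevant set) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, relevant) for ranked, relevant in runs) / len(runs)
```

For example, if the only relevant tool is ranked second for one query and first for another, the MRR is (1/2 + 1) / 2 = 0.75. Tracking both numbers on the same small set across commits makes retrieval regressions visible before they reach the ranking stage.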
| 253 | + |
| 254 | +### 7.4 Developer Experience And CI |
| 255 | + |
| 256 | +1. Align `just` tasks with real CLI contract (`chat`/`sync`). |
| 257 | +2. Add a docs link checker in CI to prevent markdown drift. |
| 258 | +3. Document one canonical local workflow (dev container first, optional local pip fallback). |
| 259 | +4. Add a short maintainer checklist for release prep and changelog updates. |
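
The link checker from item 2 does not need to be elaborate. As a rough sketch, assuming plain inline Markdown links and ignoring reference-style links, the check can resolve each relative target against the file's own directory. The `broken_relative_links` helper and its regular expression are hypothetical, not an existing project script.

```python
import re
from pathlib import Path

# Matches inline links such as [text](target); reference-style links are not covered.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_relative_links(markdown_file: Path) -> list[str]:
    """Return relative link targets in markdown_file that do not exist on disk."""
    text = markdown_file.read_text(encoding="utf-8")
    broken = []
    for target in LINK_PATTERN.findall(text):
        # Skip external URLs and same-page anchors; only filesystem paths are checked.
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue
        path_part = target.split("#", 1)[0]
        if not (markdown_file.parent / path_part).exists():
            broken.append(target)
    return broken
```

Because targets resolve relative to the containing file, a check like this would also catch a link that omits the `../` prefix needed to reach the repository root from `docs/`.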
| 260 | + |
| 261 | +## 8) Practical Contributor Checklist |
| 262 | + |
| 263 | +Before opening a PR: |
| 264 | +1. Install/update in editable mode in the active environment. |
| 265 | +2. Run tests relevant to changed modules. |
| 266 | +3. Validate docs links if docs were touched. |
| 267 | +4. Update [CHANGELOG.md](../CHANGELOG.md) for user-visible changes. |
| 268 | +5. Confirm command and environment docs still match real behavior. |
| 269 | + |
| 270 | +## 9) Related References |
| 271 | + |
| 272 | +- [README.md](../README.md) |
| 273 | +- [docs/index.md](index.md) |
| 274 | +- [docs/architecture/overview.md](architecture/overview.md) |
| 275 | +- [docs/development/structure.md](development/structure.md) |
| 276 | +- [AGENTS.md](../AGENTS.md) |
| 277 | +- [.github/copilot-instructions.md](../.github/copilot-instructions.md) |