feat: file attachment #6766

dinhlongviolin1 · 2025-10-08T09:36:19Z

Describe Your Changes

Overview: File Attachment Feature

What Files Are Supported

Document Formats:

PDF - Text extraction from PDF files
Text files (.txt, .md, etc.)
Code files - Various programming languages
HTML - Web pages and HTML documents
Office documents (.docx, .pptx, .csv) - The parser infrastructure is in place

How It Works

Architecture:

RAG Extension - Orchestrates the retrieval-augmented generation workflow
Vector DB Extension - Local vector storage using SQLite with sqlite-vec for ANN search
Embedding Model - Uses sentence-transformer-mini (All-MiniLM-L6-v2) for generating embeddings

Key Components:

Settings (extensions/rag-extension/settings.json):

enabled - Toggle attachment feature on/off
max_file_size_mb - Max file size (default: 20MB)
retrieval_limit (top_k) - Number of chunks to retrieve (default: 3)
retrieval_threshold - Similarity threshold (default: 0.3)
chunk_size_tokens - Chunk size for splitting (default: 512 tokens)
overlap_tokens - Overlap between chunks (default: 64 tokens)
search_mode - Auto/ANN/Linear search modes
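
As a rough illustration of how these settings might be typed and validated, here is a minimal sketch. The field names and numeric defaults mirror the list above. The `AttachmentSettings` interface, the `resolveSettings` and `isWithinSizeLimit` helpers, and the defaults for `enabled` and `search_mode` are assumptions for illustration, not the extension's actual schema.

```typescript
// Hypothetical typing of the settings listed above; the real
// settings.json schema in extensions/rag-extension may differ.
type SearchMode = 'auto' | 'ann' | 'linear'

interface AttachmentSettings {
  enabled: boolean
  max_file_size_mb: number
  retrieval_limit: number
  retrieval_threshold: number
  chunk_size_tokens: number
  overlap_tokens: number
  search_mode: SearchMode
}

const DEFAULT_SETTINGS: AttachmentSettings = {
  enabled: true, // assumption: the PR does not state the default
  max_file_size_mb: 20,
  retrieval_limit: 3,
  retrieval_threshold: 0.3,
  chunk_size_tokens: 512,
  overlap_tokens: 64,
  search_mode: 'auto', // assumption: the PR does not state the default
}

// Merge user overrides onto the defaults and reject an overlap that
// would prevent the chunker from making forward progress.
function resolveSettings(
  overrides: Partial<AttachmentSettings>
): AttachmentSettings {
  const merged: AttachmentSettings = { ...DEFAULT_SETTINGS, ...overrides }
  if (merged.overlap_tokens >= merged.chunk_size_tokens) {
    throw new Error('overlap_tokens must be smaller than chunk_size_tokens')
  }
  return merged
}

// Check a file size in bytes against max_file_size_mb.
function isWithinSizeLimit(
  sizeBytes: number,
  settings: AttachmentSettings
): boolean {
  return sizeBytes <= settings.max_file_size_mb * 1024 * 1024
}
```

With the defaults, `isWithinSizeLimit` accepts a file of exactly 20 MiB and rejects anything larger.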

Internal Tools Exposed:

list_attachments - List files attached to a thread
retrieve - Semantic search across attached documents
get_chunks - Retrieve specific chunks by file and order

Ingestion Flow:

File uploaded → Parsed to text
Text chunked with configurable size/overlap
Each chunk embedded using local embedding model
Vectors stored in SQLite collection per thread
Available for semantic search during chat
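
The chunking step in the flow above can be sketched as a sliding window over tokens. This is a minimal sketch under stated assumptions: `chunkTokens` is a hypothetical helper, and the tokens are plain strings, whereas the actual extension presumably uses a real tokenizer.

```typescript
// Split a token sequence into windows of `chunkSize` tokens, where each
// window shares `overlap` tokens with the previous one.
// Hypothetical helper, not the extension's actual chunker.
function chunkTokens(
  tokens: string[],
  chunkSize: number,
  overlap: number
): string[][] {
  if (chunkSize <= 0) throw new Error('chunkSize must be positive')
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error('overlap must be in the range [0, chunkSize)')
  }
  const step = chunkSize - overlap
  const chunks: string[][] = []
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize))
    // Stop once this window has reached the end of the input, so no
    // trailing chunk consisting only of overlap is emitted.
    if (start + chunkSize >= tokens.length) break
  }
  return chunks
}
```

With the default settings (512-token chunks, 64-token overlap) the window advances by 448 tokens per chunk.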

Notable Features:

Auto-unload protection: Embedding models are not unloaded when autoUnload is enabled and a new text generation model is loaded
Lazy embedding model loading: Downloads and loads the sentence-transformer model on first use
Graceful fallback: If the embeddings endpoint is not available (501), the model is automatically reloaded in embedding mode
ANN vs Linear search: Automatically detects sqlite-vec availability and falls back to linear search if it is unavailable
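
To make the linear-search fallback concrete, here is a minimal sketch of brute-force retrieval with cosine similarity, a similarity threshold, and a top-k limit. The `StoredChunk` and `ScoredChunk` types and the `linearSearch` function are hypothetical. The actual implementation lives in the Rust vector DB plugin and may differ. The zero-magnitude guard mirrors the early return quoted from `db.rs` in the review comments.

```typescript
interface StoredChunk {
  id: string
  text: string
  embedding: number[]
}

interface ScoredChunk {
  id: string
  text: string
  score: number
}

// Cosine similarity; returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch')
  let dot = 0
  let magA = 0
  let magB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    magA += a[i] * a[i]
    magB += b[i] * b[i]
  }
  if (magA === 0 || magB === 0) return 0
  return dot / (Math.sqrt(magA) * Math.sqrt(magB))
}

// Score every chunk against the query, drop those below the threshold,
// and keep the topK best. Defaults follow retrieval_limit and
// retrieval_threshold from the settings.
function linearSearch(
  query: number[],
  chunks: StoredChunk[],
  topK = 3,
  threshold = 0.3
): ScoredChunk[] {
  return chunks
    .map((c) => ({
      id: c.id,
      text: c.text,
      score: cosineSimilarity(query, c.embedding),
    }))
    .filter((c) => c.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
}
```

This scan is linear in the number of stored chunks, which is why an ANN index is preferable when sqlite-vec is available.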

Fixes Issues

Closes #
Closes #

Self Checklist

Added relevant comments, esp in complex areas
Updated docs (for bug fixes / features)
Created issues for follow-up changes or refactoring needed

This commit introduces a new field, `is_embedding`, to the `SessionInfo` structure to clearly mark sessions running dedicated embedding models.

Key changes:

- Adds `is_embedding` to the `SessionInfo` interface in `AIEngine.ts` and the Rust backend.
- Updates the `loadLlamaModel` command signatures to pass this new flag.
- Modifies the llama.cpp extension's **auto-unload logic** to explicitly **filter out** and **not unload** any currently loaded embedding models when a new text generation model is loaded. This is a critical performance fix to prevent the embedding model (e.g., used for RAG) from being repeatedly reloaded.

Also includes minor code style cleanup/reformatting in `jan-provider-web/provider.ts` for improved readability.
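
The auto-unload filter described in this commit message could look roughly like the sketch below. Only the `is_embedding` flag and the `SessionInfo` name come from the commit message. The other fields and the `sessionsToUnload` helper are hypothetical and do not reflect the actual `AIEngine.ts` definition.

```typescript
// Simplified stand-in for SessionInfo; only is_embedding is confirmed
// by the commit message, the other fields are illustrative.
interface SessionInfo {
  pid: number
  model_id: string
  is_embedding: boolean
}

// Pick the sessions that auto-unload may stop before a new text
// generation model is loaded. Embedding sessions are skipped so the
// RAG embedding model stays resident.
function sessionsToUnload(loaded: SessionInfo[]): SessionInfo[] {
  return loaded.filter((session) => !session.is_embedding)
}
```

Given one chat session and one embedding session, only the chat session would be selected for unloading.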

github-actions · 2025-10-08T21:47:00Z

Barecheck - Code coverage report

Total: 29.74%

Your code coverage diff: -0.69% ▾

Uncovered files and lines

File	Lines
core/src/browser/extension.ts	96-97, 104-107, 116-118, 127-128, 130-146, 149-150, 184, 186-195
core/src/browser/extensions/rag.ts	25-26
core/src/browser/extensions/vector-db.ts	49-50
extensions-web/src/jan-provider-web/provider.ts	6, 16-18, 21, 23-25, 27-28, 30, 32, 35-40, 42-43, 46-51, 53-54, 57, 59-76, 79, 81, 83, 85, 87-97, 99-104, 106-107, 110-112, 114-116, 118-139, 141-143, 145-162, 164-165, 168, 170-177, 179, 181-189, 191-193, 195-200, 202-203, 205-218, 220-224, 226-228, 232-235, 238-244, 246-259, 261, 263-264, 266, 269-271, 273-287, 293-301, 303-311, 313-316, 319-322, 324-329, 331, 333-366, 369-373, 375-378, 381-382, 385-388, 390-392, 394-395, 397-401, 403-407, 409-413, 415-419, 421-425, 427-431, 433, 435-438
scripts/download-bin.mjs	1-8, 10-40, 42-57, 59-88, 90-96, 98-99, 101-104, 106-115, 117-126, 128-130, 132, 134-149, 151-152, 154-162, 164-175, 177-178, 180-183, 185-262, 264-336, 338-343, 345-394, 396-397, 399-402
web-app/src/routeTree.gen.ts	13-36, 40-44, 46-50, 52-56, 58-62, 64-68, 70-74, 76-80, 82-86, 88-92, 94-98, 100-104, 106-110, 112-116, 118-122, 124-128, 130-134, 136-140, 142-146, 148-152, 154-158, 160-164, 166-171, 173-177, 533-557, 559-561
web-app/src/containers/ChatInput.tsx	133-134, 138-143, 161-165, 169-172, 178, 200-201, 203-205, 222-224, 226-227, 233-241, 245-267, 281-282, 286-289, 303-304, 309-310, 316-317, 324-327, 332-334, 341-342, 345-373, 376-380, 382-383, 385-389, 391-408, 410-414, 416, 418, 421-428, 431-432, 434-440, 442-452, 454-455, 457-474, 476-494, 497-498, 501-511, 514-523, 526, 528-534, 536, 538-541, 544-545, 547-551, 554-555, 558, 560-563, 565-569, 571-583, 585-592, 594-600, 603-606, 608-614, 616-618, 620, 622-638, 640-655, 657-661, 664-673, 675-678, 681-682, 684-687, 690-691, 694-698, 701-702, 704-707, 710-712, 715-717, 720-723, 725-726, 728-732, 734-736, 740-742, 745-748, 750-751, 753-754, 756-761, 764-770, 772-776, 779, 781-785, 788-795, 797-800, 802-804, 806-817, 819-825, 827-833, 836-839, 841, 877-886, 888-895, 898-903, 905-910, 912, 916-922, 926-934, 936-942, 944-959, 962-965, 967-968, 970, 972-973, 1031, 1069-1074, 1076-1080, 1082-1085, 1087-1097, 1104-1118, 1125-1129, 1131-1133, 1147-1149, 1154-1158, 1173-1177, 1192-1206, 1209-1223, 1231-1248, 1256, 1270, 1282-1288, 1290-1296, 1301-1318
web-app/src/containers/SettingsMenu.tsx	37, 39-40, 232-234
web-app/src/containers/ThreadContent.tsx	3-12, 17, 22-23, 25, 27-30, 32-34, 36-40, 42-45, 47-51, 53-60, 62, 64, 67-69, 83-85, 88-91, 93-96, 98-100, 102-105, 108-113, 115, 117-120, 122-123, 126-133, 136-145, 147-148, 150-152, 154, 156-164, 166-187, 189-191, 193-205, 207-210, 212-217, 219-223, 225, 229-232, 234-245, 249-255, 257-261, 263-274, 276-279, 283-293, 295-306, 308-312, 315-322, 324-334, 336-347, 350-357, 359-360, 363-368, 370-371, 374-377, 379-388, 390-393, 395-400, 402-409, 411-414, 416-419, 421-426, 428-434, 436, 438-445, 447, 450-458, 460-461, 463-464
web-app/src/hooks/useAttachments.ts	46-54, 56-70, 72-86, 88-102, 104-118, 120-134, 136-150, 152-166, 174-177, 179-193
web-app/src/hooks/useChat.ts	103-104, 108-112, 115-125, 127-136, 139-141, 144-146, 148-152, 158-167, 177-199, 202, 204, 206, 209, 212-217, 219-220, 225-227, 230-234, 237, 240-242, 244-254, 287-288, 290-293, 295-297, 299-319, 322-323, 325-328, 331-333, 335-338, 341-348, 351-365, 367-369, 383-384, 406-407, 412, 416-420, 426-427, 429-432, 434-447, 449-460, 462-471, 474-478, 480-486, 488-498, 500-504, 506-509, 511-545, 547-572, 574-576, 579, 581-583, 585-586, 589-595, 597-599, 601-602, 604-610, 612-617, 619-622, 624-632, 635, 637, 639-654, 656-661, 663-672, 674-684, 687-694, 696-709, 711, 714, 716-718, 724-725
web-app/src/hooks/useThreadScrolling.tsx	1-3, 5-7, 9-20, 22-26, 28-30, 32-34, 36-43, 46, 48-55, 58-62, 64-72, 74-81, 83-85, 87-89, 91-93, 95-104, 107-111, 113-116, 118-120, 122-123, 126-132, 134, 136-143, 145-146, 148-149, 151-153, 155-157, 160-163, 165-166, 168-170, 173-174, 176-181, 183-189, 191-194, 196-197, 199-200, 202-210, 212-220
web-app/src/hooks/useThreads.ts	43-44, 46-47, 67, 72, 114, 116-120, 187-189, 202-203, 206-209, 211-219, 224-229, 233-235, 250-272, 274, 277, 280, 282-288, 290-311, 313-330, 355-357, 360-363, 366-367, 370, 372-379, 381-383, 385-389, 391, 393-401
web-app/src/hooks/useTools.ts	27-32
web-app/src/lib/completion.ts	69-73, 91-99, 171-179, 181, 183-184, 186-188, 190, 192, 195-200, 202-209, 211-221, 224-232, 235-244, 246-247, 249, 251-267, 269-290, 306-312, 363, 366-371, 373-376, 392-397, 402-403, 405, 407-437, 440-453, 455-457, 460-470, 472-493, 495, 497-518, 520-530, 532-546, 548-551
web-app/src/lib/fileMetadata.ts	23-26, 28-36, 38, 40-41, 51-54, 56-57, 59-61, 64-67, 70-98, 101-103, 105-106
web-app/src/lib/messages.ts	48-52, 56-65, 67-70, 83-84
web-app/src/lib/platform/const.ts	6-7, 13, 15-16, 19-20, 23-24, 27-28, 31-32, 35-36, 39, 42-43, 46, 49, 52, 55-56, 59-60, 63, 66, 69, 72-73, 76, 79, 82, 85-87
web-app/src/routes/settings/attachments.tsx	1-5, 7-12, 14-16, 19-26, 29-32, 34-36, 38-42, 45-47, 50-52, 55-72, 75, 78, 81-86, 89, 91-93, 96-99, 102-106, 109-119, 121, 123-139, 142-147, 149-160, 162-181, 183-184, 187-193, 195-206, 208-209, 211-221, 223, 225-231, 233
web-app/src/services/index.ts	165-189, 191-201, 245-248, 253-256, 351-353, 356-358, 361-363
web-app/src/services/rag/default.ts	8-17, 20-24, 27-37, 40-42, 44-49
web-app/src/services/uploads/default.ts	9, 11-13, 16-31
web-app/src/types/attachment.ts	37-42, 52-57

Copilot

Pull Request Overview

This PR introduces a comprehensive file attachment feature for document retrieval-augmented generation (RAG) in the Jan application. It provides capabilities for uploading, indexing, and semantically searching documents (PDFs, text files, Office docs, etc.) during chat conversations.

Implements a complete RAG pipeline with vector storage, semantic search, and document parsing
Adds settings UI for configuring attachment behavior (file size limits, chunk sizes, search modes)
Introduces new extension architecture for RAG and vector database operations

Reviewed Changes

Copilot reviewed 92 out of 95 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
web-app/src/types/attachment.ts	Defines unified attachment types for images and documents
web-app/src/services/uploads/default.ts	Implements document ingestion service with RAG extension integration
web-app/src/routes/settings/attachments.tsx	Provides settings UI for attachment configuration
web-app/src/hooks/useAttachments.ts	Manages attachment settings state and persistence
web-app/src/containers/ChatInput.tsx	Extends chat input with document and image attachment capabilities
src-tauri/plugins/tauri-plugin-vector-db	New Tauri plugin for vector database operations with sqlite-vec support
extensions/rag-extension	RAG extension providing document retrieval tools and orchestration
core/src/browser/extensions/vector-db.ts	Core vector database extension interface

Comments suppressed due to low confidence (2)

web-app/src/locales/en/common.json:1

Removed trailing empty line at end of file.

src-tauri/plugins/tauri-plugin-vector-db/src/db.rs:1

[nitpick] Missing space after closing brace. Consider formatting: `if mag_a == 0.0 || mag_b == 0.0 { return Ok(0.0) }`

use crate::VectorDBError;


web-app/src/services/uploads/types.ts

web-app/src/lib/fileMetadata.ts

web-app/src/hooks/useThreadScrolling.tsx

Co-authored-by: Copilot <[email protected]>

Copilot

Pull Request Overview

Copilot reviewed 92 out of 95 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

web-app/src/hooks/useThreadScrolling.tsx:1

Dependencies in `useCallback` should only include values that actually affect the callback. Remove `setIsAtBottom` and `setHasScrollbar` from dependencies as they are stable setState functions.

import { useCallback, useEffect, useMemo, useRef, useState } from 'react'

src-tauri/plugins/tauri-plugin-vector-db/src/db.rs:1

[nitpick] Format the early return with proper spacing and line breaks for better readability.

use crate::VectorDBError;


web-app/src/services/uploads/default.ts

web-app/src/containers/ChatInput.tsx

web-app/src/lib/fileMetadata.ts

Co-authored-by: Copilot <[email protected]>

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

- Added shallow equality guard for `connectedServers` state to prevent redundant updates when the fetched server list hasn't changed.
- Updated error handling for server fetch to only clear the state when it actually contains data.
- Introduced `newHasActiveModels` variable and conditional updater for `hasActiveModels` to avoid unnecessary state changes.
- Adjusted error handling for active model fetch to only set `hasActiveModels` to `false` when the current state differs.

These changes reduce needless re-renders and improve component performance.
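
The shallow equality guard mentioned in this commit can be sketched as follows. `shallowEqualArrays` and `nextListState` are hypothetical names, and the real hook code may compare server entries differently. The idea is that returning the previous array reference from a React state updater lets React skip the re-render, because the new state is identical to the old one.

```typescript
// True when both arrays have the same length and strictly equal items
// at every index.
function shallowEqualArrays<T>(a: readonly T[], b: readonly T[]): boolean {
  if (a === b) return true
  if (a.length !== b.length) return false
  return a.every((item, i) => item === b[i])
}

// Intended for use inside a state updater, e.g.
//   setConnectedServers((prev) => nextListState(prev, fetched))
// Keeps the previous reference when nothing changed.
function nextListState<T>(
  prev: readonly T[],
  fetched: readonly T[]
): readonly T[] {
  return shallowEqualArrays(prev, fetched) ? prev : fetched
}
```

The comparison is element-wise with `===`, so it suits lists of primitives such as server names. Lists of freshly created objects would always compare as different.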

dinhlongviolin1 added 3 commits October 7, 2025 10:36

initial layout

a72c74d

working attachments

510c4a5

Merge branch 'dev' into feat/file-attachment

ff93dc3

github-project-automation bot added this to Jan Oct 8, 2025

github-actions bot assigned dinhlongviolin1 Oct 8, 2025

qnixsynapse and others added 5 commits October 8, 2025 20:03

Merge branch 'dev' into feat/file-attachment

6dd2d2d

ui ux enhancement

3400426

fix tests

fc78462

fix thread scrolling

a2fbce6

dinhlongviolin1 marked this pull request as ready for review October 8, 2025 21:46

Copilot AI review requested due to automatic review settings October 8, 2025 21:46

dinhlongviolin1 changed the title ~~Feat/file attachment~~ feat: file attachment Oct 8, 2025

Copilot AI reviewed Oct 8, 2025

web-app/src/services/uploads/types.ts (resolved)

web-app/src/lib/fileMetadata.ts (outdated, resolved)

web-app/src/hooks/useThreadScrolling.tsx (resolved)

Update web-app/src/lib/fileMetadata.ts

f4066e6

Co-authored-by: Copilot <[email protected]>

dinhlongviolin1 requested review from Copilot, louis-menlo, qnixsynapse and urmauur October 8, 2025 21:50

Copilot AI reviewed Oct 8, 2025

web-app/src/services/uploads/default.ts (outdated, resolved)

web-app/src/containers/ChatInput.tsx (outdated, resolved)

web-app/src/lib/fileMetadata.ts (resolved)

Update web-app/src/services/uploads/default.ts

45d57dd

Co-authored-by: Copilot <[email protected]>

Copilot AI review requested due to automatic review settings October 8, 2025 21:53

Copilot AI reviewed Oct 9, 2025


2 participants
