Skip to content

feat(file-manager): add schema and foundation for file management system#13451

Open
EurFelux wants to merge 11 commits intov2from
v2-file-manager-phase1
Open

feat(file-manager): add schema and foundation for file management system#13451
EurFelux wants to merge 11 commits intov2from
v2-file-manager-phase1

Conversation

@EurFelux
Copy link
Collaborator

What this PR does

Before this PR:
No file management data layer exists. The codebase lacks unified types, DB schema, and API definitions for the planned file manager feature.

After this PR:
Adds the complete Phase 1 foundation for the file management system (RFC: docs/tmp/rfc-file-manager.md):

  • Zod provider config schemas (fileProvider.ts) — runtime-validated discriminated union for mount provider types (local_managed, local_external, remote, system)
  • Entity types (fileNode.ts) — FileNode, FileRef, and corresponding DTOs
  • Drizzle schema (node.ts) — node and file_ref tables with indexes, CHECK constraints, self-referencing FK with CASCADE
  • DB migration — auto-generated SQL for the two new tables
  • API schema (files.ts) — FileSchemas type covering all file CRUD, tree ops, refs, batch ops, and mount listing endpoints
  • Placeholder handlers — stub implementations that satisfy type requirements (to be filled in Phase 2)
  • System node seeding — idempotent creation of 3 mount nodes: mount_files, mount_notes, system_trash
  • Path resolver utilityresolvePhysicalPath() with 9 passing unit tests

Why we need it and why it was done in this way

This is the data layer foundation that Phase 2 (services, handlers, UI) builds on. Splitting into phases keeps PRs reviewable and allows early validation of the schema design.

The following tradeoffs were made:

  • Placeholder handlers with throw Error('Not implemented') were added to satisfy ApiImplementation exhaustive type checking. These will be replaced in Phase 2.
  • Timestamps use number (ms epoch) instead of ISO strings, following the newer v2 DB convention.

The following alternatives were considered:

  • Deferring API schema and handlers to Phase 2 — rejected because adding FileSchemas to ApiSchemas validates the type system integration early.

Breaking changes

None. This is purely additive — new tables, new types, no changes to existing functionality.

Special notes for your reviewer

  • The RFC is at docs/tmp/rfc-file-manager.md for full design context.
  • nodeSeeding.ts calls getFilesDir() and getNotesDir() which depend on Electron app.getPath('userData') — this runs at app startup after Electron is ready.

Checklist

  • PR: The PR description is expressive enough and will help future contributors
  • Code: Write code that humans can understand and Keep it simple
  • Refactor: You have left the code cleaner than you found it (Boy Scout Rule)
  • Upgrade: Impact of this change on upgrade flows was considered and addressed if required
  • Documentation: Not required — internal data layer, no user-facing changes
  • Self-review: I have reviewed my own code before requesting review from others

Release note

NONE

EurFelux and others added 6 commits February 7, 2026 15:16
Add comprehensive documentation covering:
- VS Code explorer tree implementation overview
- Current notes file tree implementation
- Existing file architecture problems
- Current file management implementation details
- RFC for new file manager design (single node table + file tree)

These documents provide technical reference and design context for the file management system, highlighting current issues and proposed solutions.
Key improvements from 3 rounds of review:
- Add cross-mount move prohibition with explicit moveNode impl
- Fix permanentDelete ordering (physical files before DB)
- Fix OrphanRefScanner pagination bug (cursor-based instead of offset)
- Add ext format normalization spec and resolvePhysicalPath reference impl
- Add batch APIs (trash/move/delete) and children sorting/pagination
- Add cycle detection in moveNode and system node protection in trashNode/permanentDelete
- Add file_ref migration fault tolerance (skip missing nodes)
- Add local_external physical file sync for trash/restore operations
- Add .trash directory spec and chokidar exclusion rules
- Add operation lock mechanism to prevent chokidar race conditions
- Upgrade system_trash from dir to mount (provider_type: 'system')
- Sync Handler/Service signatures with new API schema additions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: icarus <eurfelux@gmail.com>
Phase 1 implementation: Zod provider config schemas, FileNode/FileRef
entity types, Drizzle node+file_ref tables with migration, API schema
definitions with placeholder handlers, system node seeding (Files,
Notes, Trash mounts), and path resolver utility with unit tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: icarus <eurfelux@gmail.com>
@EurFelux EurFelux changed the base branch from main to v2 March 14, 2026 06:14
@EurFelux EurFelux requested a review from a team March 14, 2026 06:14
Copy link
Collaborator Author

@EurFelux EurFelux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall

Solid Phase 1 foundation PR! The schema design, type system integration, and documentation quality are all excellent. The RFC is thorough with clear trade-off reasoning.

Highlights

  • Idempotent seeding logicNodeSeed.migrate() checks existing records before inserting, very robust
  • Minimal interfaces in pathResolverPathResolvableNode / MountInfo instead of full FileNode, clean decoupling
  • Placeholder handlersnotImplemented + exhaustive type checking validates the type system integration early
  • Test coverage — 9 test cases for pathResolver covering all edge cases
  • Documentation — RFC and supporting docs are detailed with clear trade-off records

Issues found

  1. [Medium] nodeSeeding.tsgetFilesDir() / getNotesDir() called at module load time (see inline comment)
  2. [Low] files.tsDELETE with request body in batch/delete endpoint (see inline comment)
  3. [Nit] Missing remote provider type test case (see inline comment)

None of these are blocking. Great work! 🎉

- Defer getFilesDir()/getNotesDir() calls to migrate() runtime to avoid
  calling app.getPath() at module load time before app.ready
- Change batch delete endpoint from DELETE to POST to avoid DELETE-with-body
  compatibility issues
- Add missing test case for remote provider type path resolution

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: icarus <eurfelux@gmail.com>
Copy link
Collaborator Author

@EurFelux EurFelux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review after fixes

All feedback from the previous review has been addressed:

  1. nodeSeeding.ts lazy initSYSTEM_NODES module-level constant replaced with getSystemNodes() function, ensuring getFilesDir() / getNotesDir() are only called at migrate() time (after Electron app.ready).

  2. remote test case added — New describe('remote') block correctly tests the throw path, with proper Zod-required fields (auto_sync, options) included.

  3. ℹ️ DELETE with body — Acknowledged as low priority for an internal IPC-based API, acceptable to defer.

CI Status

  • general-test
  • basic-checks
  • translate
  • render-test / skills-check-windows still pending

LGTM! Clean Phase 1 foundation with solid schema design, good test coverage, and thorough documentation. 🎉

EurFelux and others added 2 commits March 14, 2026 16:35
Replace plain TypeScript interfaces with Zod schemas for runtime
validation. Adds system node ID constants, UUID v7 format validation,
type invariants via superRefine, and lifecycle state guards.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: icarus <eurfelux@gmail.com>
Use a nullable ms-epoch timestamp instead of a boolean flag so the
service layer can compare with remote updatedAt to detect stale caches.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: icarus <eurfelux@gmail.com>
Copy link
Collaborator Author

@EurFelux EurFelux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Architecture Review — Phase 1 Foundation

Note

This review was translated by Claude.

The schema design is solid overall, and the RFC documentation quality is high. Below are several architectural concerns; none of these block Phase 1, but I recommend addressing them during Phase 2 implementation:

Summary of Inline Comments

# Severity File Concern
1 Medium node.ts:60 parentId CASCADE deletion may accidentally delete the entire file tree; recommend adding delete protection for system nodes
2 Medium fileNode.ts:110 Type invariants rely only on Zod runtime validation; no constraints at the DB layer; recommend validating on write as well
3 Low node.ts:83-98 file_ref polymorphic references have no FK; may leave orphaned references when source is deleted
4 Low files.ts:76-93 recursive children query may have performance risks; suggest changing to maxDepth

Not Filed as Inline (Minor)

  • local_external Trash Path: The RFC describes moving Trash files to {base_path}/.trash/{nodeId}/, but the current pathResolver's local_external branch doesn't reserve this path. When implementing trash in Phase 2, you'll need to modify the resolvePhysicalPath signature.
  • Seeding Order: Currently migrateSeed('preference')migrateSeed('node') is manually sorted; this may become fragile as the number of seeds increases.

Overall: clean Phase 1 foundation, good to go once CI passes ✅


Original Content

Architecture Review — Phase 1 Foundation

Schema 设计整体扎实,RFC 文档质量很高。以下是从架构视角的几点担忧,均不 blocking Phase 1,但建议在 Phase 2 实现时注意:

Summary of Inline Comments

# Severity File Concern
1 Medium node.ts:60 parentId CASCADE 删除可能误删整棵文件树,建议 system node 加删除保护
2 Medium fileNode.ts:110 Type invariants 仅靠 Zod 运行时验证,DB 层无约束,建议写入时也 validate
3 Low node.ts:83-98 file_ref 多态引用无 FK,source 删除时可能留下孤儿引用
4 Low files.ts:76-93 recursive children 查询可能有性能风险,建议改为 maxDepth

Not Filed as Inline (Minor)

  • local_external Trash 路径:RFC 描述 Trash 文件移动到 {base_path}/.trash/{nodeId}/,但当前 pathResolverlocal_external 分支未预留这个分支。Phase 2 实现 trash 时需要改造 resolvePhysicalPath 签名。
  • Seeding 顺序:目前 migrateSeed('preference')migrateSeed('node') 是手动排序,seed 增多后可能变脆弱。

Overall: clean Phase 1 foundation, good to go once CI passes ✅

},
(t) => [
// Self-referencing FK: cascade delete children when parent is deleted (Trash cleanup)
foreignKey({ columns: [t.parentId], foreignColumns: [t.id] }).onDelete('cascade'),
Copy link
Collaborator Author

@EurFelux EurFelux Mar 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

This comment was translated by Claude.

[Medium] CASCADE Deletion Needs Phase 2 Protection

onDelete('cascade') is convenient when clearing Trash, but it also means that if mount_files seed node is accidentally deleted, the entire file tree along with all file_ref will cascade disappear, completely bypassing the service layer—no chance to trigger physical file cleanup.

The Phase 2 service layer is recommended to add hard-coded deletion protection for SYSTEM_NODE_IDS (mount_files, mount_notes, system_trash) (directly reject delete requests), to avoid any bugs or accidental operations triggering cascades through the service layer.

Not blocking for Phase 1.


Original Content

[Medium] CASCADE 删除需要 Phase 2 加保护

onDelete('cascade') 在清空 Trash 时很方便,但也意味着如果误操作删除了 mount_files 这个 seed 节点,整棵文件树连同所有 file_ref 都会级联消失,且完全绕过 service 层——没有机会触发物理文件清理。

Phase 2 的 service 层建议对 SYSTEM_NODE_IDSmount_filesmount_notessystem_trash)加 hard-coded 删除保护(直接拒绝 delete 请求),避免任何 bug 或意外操作通过 service 层触发级联。

Not blocking for Phase 1.

Comment on lines +83 to +98
{
id: uuidPrimaryKey(),

// Referenced file node ID
nodeId: text()
.notNull()
.references(() => nodeTable.id, { onDelete: 'cascade' }),

// Business source type (e.g. 'chat_message', 'knowledge_item', 'painting', 'note')
// Enum validated at application layer (Zod), no CHECK constraint
sourceType: text().notNull(),
// Business object ID (polymorphic, no FK constraint)
sourceId: text().notNull(),
// Reference role (e.g. 'attachment', 'source', 'asset')
role: text().notNull(),

Copy link
Collaborator Author

@EurFelux EurFelux Mar 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Low] Orphan cleanup for file_ref polymorphic references

Note

This comment was translated by Claude.

sourceId has no FK constraint (polymorphic design, reasonable), but this means when deleting business objects like messages/knowledge_items, the corresponding service layer must actively clean up file_ref. If a service misses this step, orphan references will be created—potentially blocking a future "cleanup if unreferenced" auto-reclamation strategy.

Suggested Phase 2 considerations:

  1. Add defensive deleteFileRefsBySource() calls in each source entity's delete logic
  2. Periodic GC job / health check to detect records where sourceId points to non-existent business objects

Not blocking.


Original Content

[Low] file_ref 多态引用的孤儿清理

sourceId 没有 FK 约束(多态设计,合理),但这意味着删除 message / knowledge_item 等业务对象时,必须 由对应的 service 层主动清理 file_ref。如果某个 service 漏了这一步,就会产生孤儿引用——可能阻碍未来"无引用则可清理"的自动回收策略。

建议 Phase 2 考虑以下之一:

  1. 在各 source entity 的 delete 逻辑中加防御性 deleteFileRefsBySource() 调用
  2. 定期 GC job / health check 检测 sourceId 指向不存在的业务对象的记录

Not blocking.

Comment on lines +100 to +110
/** Remote file ID (e.g. OpenAI file-abc123) */
remoteId: z.string().nullable(),
/** When the local cache was last downloaded (ms epoch). Null if not cached. Compare with remote updatedAt to detect staleness */
cachedAt: z.int().nullable(),
/** Original parent ID before moving to Trash (only for Trash direct children) */
previousParentId: NodeIdSchema.nullable(),
/** Creation timestamp (ms epoch) */
createdAt: z.int(),
/** Last update timestamp (ms epoch) */
updatedAt: z.int()
})
Copy link
Collaborator Author

@EurFelux EurFelux Mar 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

This comment was translated by Claude.

[Medium] Type invariants are only validated during Zod reads, with no DB-level guarantees

The superRefine here implements rich type invariants (mount must have parentId=null, dir cannot have providerConfig, etc.), which is of high quality.

However, these constraints only take effect during Zod parse—if Phase 2's service layer has a bug and directly writes data that violates constraints through Drizzle (e.g., type=dir but with providerConfig), SQLite will silently accept it until the next read parse exposes the problem.

It is recommended that Phase 2 add defensive assertions in the repository layer's write methods (validate before writing), rather than only relying on parse during reads. For example:

async createNode(dto: CreateNodeDto): Promise<FileNode> {
  const row = await db.insert(nodeTable).values(mapped).returning()
  // Parse and validate immediately after write to ensure DB data conforms to invariants
  return FileNodeSchema.parse(row[0])
}

This can move "bad write" problems from "discovered during read" to "fail immediately during write".


Original Content

[Medium] Type invariants 只在 Zod 读取时验证,DB 层无保障

这里的 superRefine 实现了丰富的 type invariants(mount 必须 parentId=null、dir 不能有 providerConfig 等),质量很高。

但这些约束仅在 Zod parse 时生效——如果 Phase 2 的 service 层有 bug 直接通过 Drizzle 写入了违反约束的数据(比如 type=dir 但带了 providerConfig),SQLite 会默默接受,直到下次读取 parse 才暴露问题。

建议 Phase 2 在 repository 层的写入方法中加防御性断言(写入前 validate),而不是只依赖读取时的 parse。比如:

async createNode(dto: CreateNodeDto): Promise<FileNode> {
  const row = await db.insert(nodeTable).values(mapped).returning()
  // 写入后立即 parse 验证,确保 DB 中数据符合 invariants
  return FileNodeSchema.parse(row[0])
}

这样可以把 "bad write" 问题从 "读取时才发现" 提前到 "写入时立即失败"。

Allows callers to limit tree depth when using recursive=true,
preventing accidental full-tree queries on large directories.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: icarus <eurfelux@gmail.com>
@DeJeune DeJeune self-assigned this Mar 14, 2026
DeJeune added a commit that referenced this pull request Mar 14, 2026
…tion plan

Add comments documenting how the note table will integrate with the
file manager node table (PR #13451): add nodeId FK, backfill from
relativePath, and eventually derive path from node tree.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: suyao <sy20010504@gmail.com>
@DeJeune DeJeune added the v2 label Mar 14, 2026
@DeJeune DeJeune added this to the v2.0.0 milestone Mar 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants