Skip to content

feat: Add native file import tools for enhanced DEVONthink integration#7

Closed
deadmanoz wants to merge 8 commits into
dvcrn:mainfrom
deadmanoz:feature/native-file-import
Closed

feat: Add native file import tools for enhanced DEVONthink integration#7
deadmanoz wants to merge 8 commits into
dvcrn:mainfrom
deadmanoz:feature/native-file-import

Conversation

@deadmanoz
Copy link
Copy Markdown
Contributor

@deadmanoz deadmanoz commented Aug 21, 2025

Summary

This was motivated by me trying to use the original functionality to "move" (via create_record) files into DEVONthink from elsewhere on my file system - it broke on some scripts that had various escape characters. This functionality will allow for much better handling of binary files in general.

• Add three new native file import tools using DEVONthink's built-in import functionality
• Solve encoding and file type issues present in content-based record creation
• Enable batch processing and directory structure preservation

New Tools Added

import_file - Single File Import

  • Import individual files with proper metadata preservation
  • Handle all file types (text, binary, images, scripts)
  • Optional custom naming and target group specification

import_directory - Directory Structure Import

  • Import entire directory hierarchies with filtering
  • Recursive processing with glob pattern filtering
  • Preserve or flatten directory structure as needed

import_with_options - Advanced Batch Import

  • Comprehensive import with duplicate handling (skip/rename/replace)
  • Batch tagging of imported records
  • Multiple source paths with unified configuration

Key Benefits

  • Encoding Safety: DEVONthink handles all character encodings internally
  • File Type Preservation: Maintains proper file associations and metadata
  • Binary File Support: Works with images, PDFs, executables, and other binary formats
  • Batch Processing: Efficient handling of multiple files and directories
  • Error Recovery: Graceful handling of problematic files with detailed reporting

Test Plan

  • Test single file import with various file types
  • Test directory import with filtering and structure preservation
  • Test advanced batch import with duplicate handling
  • Verify proper metadata preservation
  • Test error handling for invalid paths and permissions

@dvcrn
Copy link
Copy Markdown
Owner

dvcrn commented Aug 22, 2025

/gemini review

@dvcrn
Copy link
Copy Markdown
Owner

dvcrn commented Aug 22, 2025

@deadmanoz So far LGTM, were you able to properly test this? There were no issues with sandboxing of the agent?

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces valuable new functionality for native file imports, which is a great enhancement for DEVONthink integration. The addition of import_file, import_directory, and import_with_options tools significantly improves how files can be brought into DEVONthink, addressing previous limitations with encoding and binary files. The code is well-structured, and the documentation is comprehensive. However, I've identified a few critical issues in importWithOptions.ts where some advertised features are not implemented correctly, such as date preservation and duplicate handling. There are also some minor robustness improvements for importDirectory.ts and documentation inconsistencies that should be addressed.

Comment thread src/tools/importWithOptions.ts Outdated
Comment on lines +403 to +427
const duplicateRecord = findDuplicateRecord(fileInfo.name, importGroup);
if (duplicateRecord) {
const dupInfo = {};
dupInfo["path"] = fileInfo.path;

if (pDuplicateDetection === "skip") {
dupInfo["action"] = "skipped";
dupInfo["existingRecord"] = duplicateRecord.uuid();
duplicateFiles.push(dupInfo);
continue;
} else if (pDuplicateDetection === "replace") {
theApp.delete({ record: duplicateRecord });
dupInfo["action"] = "replaced";
dupInfo["existingRecord"] = duplicateRecord.uuid();
duplicateFiles.push(dupInfo);
} else { // rename
const uniqueName = generateUniqueName(fileInfo.name, importGroup);
dupInfo["action"] = "renamed to " + uniqueName;
dupInfo["existingRecord"] = duplicateRecord.uuid();
duplicateFiles.push(dupInfo);
}
}

// Import the file
const importedRecord = theApp.importPath(fileInfo.path, { to: importGroup });
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There are several critical bugs in the implementation of the import logic:

  1. Unused Options: The pCreateIndex and pPreserveCreationDate options are prepared but never used in the import command. The importPath function is too simple for these options.
  2. Rename on Duplicate Bug: The rename logic for pDuplicateDetection calculates a new unique name but doesn't use it when importing the file. The file is imported with its original name, which DEVONthink will then handle as a duplicate based on its own preferences, ignoring the tool's logic.

To fix this, you should use the more powerful theApp.import() command and construct an options object that includes the asIndexed, preserveCreationDate, and name properties when applicable.

            const duplicateRecord = findDuplicateRecord(fileInfo.name, importGroup);

            const importJxaOptions = {};
            importJxaOptions["to"] = importGroup;
            importJxaOptions["asIndexed"] = pCreateIndex;
            importJxaOptions["preserveCreationDate"] = pPreserveCreationDate;

            if (duplicateRecord) {
              const dupInfo = {};
              dupInfo["path"] = fileInfo.path;
              
              if (pDuplicateDetection === "skip") {
                dupInfo["action"] = "skipped";
                dupInfo["existingRecord"] = duplicateRecord.uuid();
                duplicateFiles.push(dupInfo);
                continue;
              } else if (pDuplicateDetection === "replace") {
                theApp.delete({ record: duplicateRecord });
                dupInfo["action"] = "replaced";
                dupInfo["existingRecord"] = duplicateRecord.uuid();
                duplicateFiles.push(dupInfo);
              } else { // rename
                const uniqueName = generateUniqueName(fileInfo.name, importGroup);
                importJxaOptions["name"] = uniqueName;
                dupInfo["action"] = "renamed to " + uniqueName;
                dupInfo["existingRecord"] = duplicateRecord.uuid();
                duplicateFiles.push(dupInfo);
              }
            }
            
            // Import the file
            const importedRecord = theApp.import(fileInfo.path, importJxaOptions);

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deadmanoz could you take a look at this review point?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah no worries

Copy link
Copy Markdown
Contributor Author

@deadmanoz deadmanoz Aug 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure theApp.import is non-existent.

theApp.importPath is the correct way: https://github.com/dvcrn/mcp-server-devonthink/blob/main/docs/devonthink-javascript-2.md#importpathpath-options

So leaving as importPath!

Comment thread CLAUDE.md Outdated
Comment thread src/tools/importDirectory.ts Outdated
@deadmanoz
Copy link
Copy Markdown
Contributor Author

@deadmanoz So far LGTM, were you able to properly test this? There were no issues with sandboxing of the agent?

Yeah I tested it, but I can test more thoroughly around the sandboxing aspects

@deadmanoz
Copy link
Copy Markdown
Contributor Author

Yo @dvcrn, I've had a crack around sandboxing and added some protections.

Let me know what you think

@dvcrn
Copy link
Copy Markdown
Owner

dvcrn commented Sep 9, 2025

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces three powerful new tools for native file importing (import_file, import_directory, import_with_options), which is a great enhancement for handling various file types and preserving metadata. The addition of a path security module is also a crucial improvement. My review focuses on improving the robustness of the new security features and the maintainability of the JXA scripts. I've identified a couple of high-severity issues in the security validation logic that should be addressed. Overall, this is a solid contribution with significant new functionality.

Comment thread src/utils/escapeString.ts Outdated
Comment thread src/utils/pathSecurity.ts Outdated
Comment thread src/tools/importDirectory.ts Outdated
Comment thread src/tools/importDirectory.ts Outdated
Comment thread src/tools/importFile.ts Outdated
Comment thread src/tools/importWithOptions.ts Outdated
@dvcrn
Copy link
Copy Markdown
Owner

dvcrn commented Sep 9, 2025

@deadmanoz sorry for the delay, I didn't see the GitHub notification

Gemini surfaced 2 'critical' issues, can you take a look at those? Besides that it lgtm and I'm okay with merging it. Thanks a lot for the PR!

Add native DEVONthink file import functionality using importPath() for proper
file type handling, metadata preservation, and error handling. Supports
importing individual files with optional custom naming and group targeting.
Add directory import functionality with file filtering, recursive processing,
and path preservation. Includes robust filter matching with brace expansion
support (*.{txt,md}) and directory structure recreation in DEVONthink.
Add comprehensive batch import tool with duplicate handling (skip/rename/replace),
automatic tagging, multiple source paths, and advanced configuration options.
Includes improved JXA error handling and robust filter matching.
- Replace importPath() with more powerful import() method across all import tools
- Fix critical bug where duplicate detection 'rename' option wasn't applied during import
- Add proper handling of createIndex and preserveCreationDate options in importWithOptions
- Update documentation to include all available options with accurate descriptions
- Ensure consistent import behavior across importFile, importDirectory, and importWithOptions tools
- Fix duplicate handling logic to actually use generated unique names during import

Resolves review comments about unused options and incorrect duplicate handling behavior.
- Replace theApp.import() with theApp.importPath() in all import tools
- Add missing sysEvents declaration in importDirectory.ts
- Fix database lookup from .whose() to .find() pattern
- Add try-catch blocks around sysEvents file/folder access
- Remove problematic file existence checks (let DEVONthink handle)
- Fix duplicate detection using .find() instead of .whose()
- Apply custom names after import for importFile tool
…rotection

- Add pathSecurity.ts utility with comprehensive file path validation
- Block access to sensitive system files (/etc/, /var/log/, ~/.ssh/, etc.)
- Implement path traversal protection (../ patterns)
- Add JXA script injection protection for user inputs
- Validate custom names and database names for unsafe patterns
- Apply security validation to all import tools (importFile, importDirectory, importWithOptions)
- Prevent syntax errors from malicious input patterns
- Add risk classification system (low/medium/high/critical)
…ing escaping

- pathSecurity.ts: Add tilde expansion before path.resolve() to properly
  handle paths like ~/.ssh/. Add ~ prefix to BLOCKED_PATHS for user-specific
  sensitive paths (shell history, env files, credential directories)

- escapeString.ts: Remove overly restrictive quote validation patterns in
  isJXASafeString() that incorrectly rejected valid inputs like "it's a file".
  Keep dangerous JXA injection pattern detection.

- importDirectory.ts: Use System Events .name() method instead of
  path.split("/").pop() for more reliable file/folder name extraction

- escapeString.test.ts: Add comprehensive tests for isJXASafeString()
  covering apostrophes, quotes, and dangerous patterns
@deadmanoz deadmanoz force-pushed the feature/native-file-import branch from aa15c30 to a8d9aa1 Compare December 19, 2025 02:37
@deadmanoz
Copy link
Copy Markdown
Contributor Author

deadmanoz commented Dec 19, 2025

@deadmanoz sorry for the delay, I didn't see the GitHub notification

Gemini surfaced 2 'critical' issues, can you take a look at those? Besides that it lgtm and I'm okay with merging it. Thanks a lot for the PR!

I finally addressed these and also the "optional" createErrorResponse ones!

…ation

- Add createErrorResponse helper function to reduce code duplication in:
  - importFile.ts (replaces 7 repetitive error blocks)
  - importDirectory.ts (replaces 5 repetitive error blocks)
  - importWithOptions.ts (replaces 6 repetitive error blocks)
- Add "Import Options Behavior Details" section to CLAUDE.md documenting:
  - duplicateDetection modes (skip, rename, replace)
  - createIndex (asIndexed) behavior
  - preserveCreationDate functionality

Addresses PR review feedback for optional improvements.
@deadmanoz
Copy link
Copy Markdown
Contributor Author

BTW don't feel like you have to merge this. Happy for you to close if you wish - I'm grateful that you created this MCP server but appreciate that it has perhaps grown beyond your use case (and now consumes lots of tokens in context)!

@dvcrn
Copy link
Copy Markdown
Owner

dvcrn commented Dec 27, 2025

Yeah I think we need to be more mindful with tokens

But I'm wondering how in your workflow you use those tools. Do you have a different filesystem MCP active and then tell the agent to import the file into DEVONthink? Does this work with sandbox and all? If yes, then I don't mind merging it

@deadmanoz
Copy link
Copy Markdown
Contributor Author

Ahh it was a few months ago.

I was using the MCP server with Claude Desktop to organise some scripts into DEVONthink:

"This was motivated by me trying to use the original functionality to "move" (via create_record) files into DEVONthink from elsewhere on my file system - it broke on some scripts that had various escape characters."

Anyway, I think I've probably only used this capability a single time since, so I'd say it's not essential

@deadmanoz deadmanoz closed this Feb 10, 2026
@deadmanoz deadmanoz deleted the feature/native-file-import branch February 10, 2026 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants