Skip to content

feat(importer): manual file import via path lookup (#766)#795

Open
statte wants to merge 14 commits into
vavallee:mainfrom
statte:feature/766-manual-file-import
Open

feat(importer): manual file import via path lookup (#766)#795
statte wants to merge 14 commits into
vavallee:mainfrom
statte:feature/766-manual-file-import

Conversation

@statte
Copy link
Copy Markdown
Contributor

@statte statte commented May 22, 2026

Summary

  • Adds a Manual Import section to Settings → Import: paste a local file/folder path, click Lookup, confirm the match, and import
  • Backend parses the filename (title, author, ASIN, series) and fuzzy-matches against wanted books using the existing tryImportInternal engine
  • Fixes inline Series Book N - Title filename parsing so Convergence_Book_2_-_Dragonslayer.m4b correctly extracts title "Dragonslayer" instead of "Convergence Book 2"
  • Format auto-detected from extension/directory; user can override (ebook / audiobook)

What's included

  • internal/importer/lookup.goLookup() on Scanner, fuzzy author matching (exact, comma-inversion, JaroWinkler ≥ 0.80), format detection
  • internal/importer/scanner.goformatHint param on tryImportInternal; public ImportFromPath() wrapper
  • internal/importer/parser.goseriesBookInlineRe pattern handles Series Book/Vol N - Title filenames
  • internal/api/manual_import.goGET /queue/manual-import/lookup and POST /queue/manual-import behind a manualImportScanner interface
  • cmd/bindery/main.go — route wiring
  • web/ManualImportSection component in ImportTab, API client methods, i18n keys

Test plan

  • go test ./internal/importer/... ./internal/api/... passes
  • Settings → Import → Manual Import section visible
  • Lookup with a valid path returns a confident match
  • Lookup with an unknown file returns "No match found"
  • Format override (ebook / audiobook) respected on import
  • Series_Book_N_-_Title.m4b filenames now match correctly

Closes #766

🤖 Generated with Claude Code

Comment thread internal/api/manual_import.go Fixed
@codecov
Copy link
Copy Markdown

codecov Bot commented May 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@statte statte marked this pull request as draft May 23, 2026 20:28
@statte statte marked this pull request as ready for review May 24, 2026 13:02
statte and others added 8 commits May 25, 2026 15:27
Adds a single-file manual import flow: the user supplies a local path,
the backend parses the filename (title, author, ASIN) and fuzzy-matches
it against wanted books, then the user confirms and the file is imported
via the existing tryImportInternal engine.

- lookup.go: Lookup() on Scanner; fuzzy author matching with JaroWinkler;
  directory -> audiobook detection
- scanner.go: add formatHint param to tryImportInternal; public
  ImportFromPath() wrapper; update all call-sites
- parser.go: recognise inline "Series Book N - Title" filenames so the
  title is extracted correctly instead of "Series Book N"
- internal/api/manual_import.go: GET /queue/manual-import/lookup and
  POST /queue/manual-import handlers behind manualImportScanner interface
- cmd/bindery/main.go: wire routes
- web: ManualImportSection in ImportTab, API client methods, i18n keys
- lookup_test.go, parser_test.go: tests for new behaviour
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rsal

filepath.Clean + filepath.IsAbs on both the Lookup and Import endpoints
before any os.Stat call. Annotate residual gosec taint with nolint
comment explaining the endpoint is admin-only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…allee#766)

Add internal/api/manual_import_test.go with 15 tests:

Lookup handler (5 paths):
- empty/relative path → 400
- os.Stat miss → 400
- scanner returns error → 500
- happy path → 200 + decoded LookupResult

Import handler (10 paths):
- bad JSON, empty/relative path, bookId ≤ 0, unrecognised format → 400
- os.Stat miss, path not a book file, book not found → 400
- closed DB (no panic) → 400/500
- directory (audiobook folder) accepted → 202
- happy path → 202 + Download record with manual-prefixed GUID

Raises patch coverage for internal/api/manual_import.go from 0% to ~94%.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
)

Add 7 tests to internal/importer/lookup_test.go:

lookup.go coverage gaps:
- BooksListError: closed DB → 'lookup: list books' error surfaces
- AuthorsListError: split bookDB/authorDB so books.List succeeds while
  authors.List fails → 'lookup: list authors' error surfaces
- Ambiguous: two books with the same title, no author in filename →
  match='ambiguous', both books in Candidates
- ASINFallthrough: ASIN in filename not found in catalogue → loop exits
  without returning, falls through to title-based match → 'confident'
- AuthorFilterRejects: title matches but parsed author ≠ book author →
  author-filter branch taken, candidate skipped → 'none'

scanner.go coverage gap:
- FormatHintAudiobook: ImportFromPath called with hint='audiobook' for an
  .epub file; the override branch (detectedFormat = formatHint) is executed
  and the audiobook path places a directory under libDir

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
)

- TestScanner_Lookup_TitleMismatch: exercises the !titleMatch→continue
  branch in Lookup (previously the loop ran on an empty catalogue so
  the branch was never taken).
- TestTryImportNZBGet/SABnzbd/Transmission_Delegates: smoke-test the
  three tryImport* wrappers; directing the import at an empty temp dir
  causes tryImportInternal to fail fast before the cleanup closure is
  invoked, so no live download client is required.
- TestManualImportImport_DownloadCreateFails: switch to split-DB setup
  (separate bookDB / downloadsDB) so books.GetByID succeeds while
  downloads.Create fails, producing the expected 500 instead of 400.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…vallee#766)

The Codecov partial on scanner.go:1158

  if formatHint == models.MediaTypeAudiobook || formatHint == models.MediaTypeEbook

was caused by the right-hand operand never being the deciding factor.
TestImportFromPath_FormatHintAudiobook hit the left side (short-circuits);
TestFormatHintOverridesDetection hit the whole-false case (empty hint).
No test exercised formatHint == MediaTypeEbook (left=false, right=true).

TestImportFromPath_FormatHintEbook passes a .m4b file (auto-detects as
audiobook) with formatHint="ebook", forcing false||true evaluation and
routing the import through the ebook path. The author directory created
under libDir confirms the ebook path was taken.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… formatHint param (vavallee#766)

scanner_qbittorrent_test.go was added by upstream commit 4872293 after the
PR branched. The rebase didn't update it to pass the new formatHint string
argument introduced by this PR. Add the empty string to unblock CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@statte statte force-pushed the feature/766-manual-file-import branch from 34b0edc to fc1b55d Compare May 25, 2026 13:29
…Bnzbd partials (vavallee#766)

The two Codecov partial lines were the cleanup closure bodies in
tryImportNZBGet (scanner.go:765.82) and tryImportSABnzbd (scanner.go:976.80).
Both had count=0 because the existing smoke tests used empty download dirs
that caused tryImportInternal to return early before cleanupFunc is called.

cleanupFunc is only invoked after a fully successful import, so the new
tests supply a real .epub file and a seeded book+author so the import
completes. The NZBGet/SABnzbd clients point at non-running localhost ports;
RemoveHistory/DeleteHistory fail immediately with "connection refused",
but the closure body is entered (count becomes 1) — which is the
coverage goal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@statte
Copy link
Copy Markdown
Contributor Author

statte commented May 26, 2026

@vavallee As this is a quite substantial update, I would like input from your side. If it is not the correct way to go about the implementation, please let me know.

Copy link
Copy Markdown
Owner

@vavallee vavallee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the substantial work here, statte. The Lookup helper, parser fix (`seriesBookInlineRe`), and `formatHint` plumbing are clean and well-tested.

Two security-shaped concerns block merge as-is. Both are fixable with small edits.

1. Routes need to sit behind `auth.RequireAdmin`

The `#nosec G304` comments on `internal/api/manual_import.go:47,94` claim "endpoint is admin-only", but the route registrations at `cmd/bindery/main.go:723-724` are inside the user-level `/api/v1` group, not the `r.Use(auth.RequireAdmin)` sub-group. So any authenticated user (role `user`) can call these endpoints today.

The manual-import flow has real authority: `POST /queue/manual-import` ultimately calls `tryImportInternal`, which copies/hardlinks/moves the supplied path into the configured library directory. Combined with the existing `?deleteFiles=true` on `DELETE /book/{id}`, a non-admin can read or move arbitrary server-filesystem paths into the library and back out. That's a multi-user privilege escalation against the role model v1.15.1 hardened.

Fix: move both routes into the `r.Group(func(r chi.Router) { r.Use(auth.RequireAdmin); ... })` block in `cmd/bindery/main.go`. The nosec comments are then accurate.

2. Path needs to be constrained to a configured root

Even with admin gating, accepting any absolute path on the server filesystem is broader than the feature description implies (importing files the admin already has under their managed library). An admin pasting `/etc/...` or `/proc/...` won't yield useful results today but will exercise filesystem-read primitives that don't need to exist.

Fix: before any `os.Stat` / `Lookup` / `ImportFromPath` call, reject paths that don't resolve under `s.libraryDir` and the configured `root_folders` (the same set the scanner walks). Symlink resolution via `filepath.EvalSymlinks` before the containment check defeats symlink-traversal escapes.

This is the same defense-in-depth the existing `isAllowedPath` check in `internal/api/files.go` performs for the file-download endpoint, and it's tracked for the deletion path under #865.

What I'd recommend

Two small commits on this branch:

  1. Move the two route registrations into the `RequireAdmin` group.
  2. Add a `rejectIfOutsideAllowedRoots(path)` helper at the top of both Lookup and Import that takes the same root list the scanner already builds. The helper is also a good target for unit tests (path under root: accept; path outside: 403; symlink escaping root: 403).

Tests + parser fix are already great; this is just the access-control surface. Once those two changes land I'll happily admin-merge. Feature itself is exactly what users have been asking for in #766.

@statte
Copy link
Copy Markdown
Contributor Author

statte commented May 30, 2026

Hey,

Ok understood. I will look into on how to implement.

Thanks.

… paths to library roots (vavallee#795)

Move both /queue/manual-import routes into the RequireAdmin middleware
group so only admin users can call them.

Add isAllowedPath on ManualImportHandler that resolves symlinks via
filepath.EvalSymlinks before checking containment against the configured
library roots (cfg.LibraryDir, cfg.AudiobookDir, and the root_folders
table). Both Lookup and Import reject paths outside those roots with 403.

Tests cover: path outside roots (ebook, audiobook file, audiobook dir),
path inside either root, and symlink escape attempts — all for both
Lookup and Import.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@statte
Copy link
Copy Markdown
Contributor Author

statte commented May 30, 2026

Update — addresses review feedback:

1. Admin-only routes — both /queue/manual-import/lookup and POST /queue/manual-import are now inside the r.Use(auth.RequireAdmin) group in cmd/bindery/main.go, not the user-level /api/v1 group.

2. Path containment — added isAllowedPath on ManualImportHandler that calls filepath.EvalSymlinks before checking containment against cfg.LibraryDir, cfg.AudiobookDir, and the root_folders table. Both handlers return 403 for paths outside those roots. Mirrors the existing isAllowedPath defence in internal/api/files.go.

Tests added (internal/api/manual_import_test.go):

  • Path outside roots → 403 (ebook file / audiobook file / audiobook directory, for both Lookup and Import)
  • Path inside ebook root and audiobook root → 200/202
  • Symlink inside root pointing outside → 403

Commit: 988b8d3

statte and others added 4 commits May 30, 2026 19:27
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Upstream added scanner_diagnostics_test.go and a new test in
scanner_test.go after this branch was cut. Both call tryImportInternal
with the old 6-arg signature; add the empty formatHint string arg to
match the 7-arg signature introduced by this branch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…te time

The isAllowedPath check fires before any repo call, so containment tests
don't need a database. Replace manualImportFixture / db.OpenMemory calls
with a nil-repo handler via a new containmentHandler helper. Drops 12
migration-heavy DB opens under -race, keeping internal/api within CI's
10-minute budget after the upstream merge added significant test volume.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@statte
Copy link
Copy Markdown
Contributor Author

statte commented May 30, 2026

CI timeout note:

The internal/api test package is currently timing out at 10 min under -race on CI — but this is a pre-existing issue on upstream main itself (run 26672242131 on the merge commit for v1.15.3 shows the same panic: test timed out after 10m0s). It is not introduced by this PR.

All three of the concerns raised in review are addressed:

  1. Routes behind RequireAdmin
  2. Path containment with EvalSymlinks
  3. Tests updated for tryImportInternal signature change introduced by this branch ✓

Happy to rebase or address anything else once the CI infrastructure issue is resolved upstream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: manual file import - path lookup + import via existing scanner pipeline

3 participants