feat(importer): manual file import via path lookup (#766)#795
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
Adds a single-file manual import flow: the user supplies a local path, the backend parses the filename (title, author, ASIN) and fuzzy-matches it against wanted books, then the user confirms and the file is imported via the existing tryImportInternal engine. - lookup.go: Lookup() on Scanner; fuzzy author matching with JaroWinkler; directory -> audiobook detection - scanner.go: add formatHint param to tryImportInternal; public ImportFromPath() wrapper; update all call-sites - parser.go: recognise inline "Series Book N - Title" filenames so the title is extracted correctly instead of "Series Book N" - internal/api/manual_import.go: GET /queue/manual-import/lookup and POST /queue/manual-import handlers behind manualImportScanner interface - cmd/bindery/main.go: wire routes - web: ManualImportSection in ImportTab, API client methods, i18n keys - lookup_test.go, parser_test.go: tests for new behaviour
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rsal filepath.Clean + filepath.IsAbs on both the Lookup and Import endpoints before any os.Stat call. Annotate residual gosec taint with nolint comment explaining the endpoint is admin-only. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…allee#766) Add internal/api/manual_import_test.go with 15 tests: Lookup handler (5 paths): - empty/relative path → 400 - os.Stat miss → 400 - scanner returns error → 500 - happy path → 200 + decoded LookupResult Import handler (10 paths): - bad JSON, empty/relative path, bookId ≤ 0, unrecognised format → 400 - os.Stat miss, path not a book file, book not found → 400 - closed DB (no panic) → 400/500 - directory (audiobook folder) accepted → 202 - happy path → 202 + Download record with manual-prefixed GUID Raises patch coverage for internal/api/manual_import.go from 0% to ~94%. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
) Add 7 tests to internal/importer/lookup_test.go: lookup.go coverage gaps: - BooksListError: closed DB → 'lookup: list books' error surfaces - AuthorsListError: split bookDB/authorDB so books.List succeeds while authors.List fails → 'lookup: list authors' error surfaces - Ambiguous: two books with the same title, no author in filename → match='ambiguous', both books in Candidates - ASINFallthrough: ASIN in filename not found in catalogue → loop exits without returning, falls through to title-based match → 'confident' - AuthorFilterRejects: title matches but parsed author ≠ book author → author-filter branch taken, candidate skipped → 'none' scanner.go coverage gap: - FormatHintAudiobook: ImportFromPath called with hint='audiobook' for an .epub file; the override branch (detectedFormat = formatHint) is executed and the audiobook path places a directory under libDir Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
) - TestScanner_Lookup_TitleMismatch: exercises the !titleMatch→continue branch in Lookup (previously the loop ran on an empty catalogue so the branch was never taken). - TestTryImportNZBGet/SABnzbd/Transmission_Delegates: smoke-test the three tryImport* wrappers; directing the import at an empty temp dir causes tryImportInternal to fail fast before the cleanup closure is invoked, so no live download client is required. - TestManualImportImport_DownloadCreateFails: switch to split-DB setup (separate bookDB / downloadsDB) so books.GetByID succeeds while downloads.Create fails, producing the expected 500 instead of 400. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…vallee#766) The Codecov partial on scanner.go:1158 if formatHint == models.MediaTypeAudiobook || formatHint == models.MediaTypeEbook was caused by the right-hand operand never being the deciding factor. TestImportFromPath_FormatHintAudiobook hit the left side (short-circuits); TestFormatHintOverridesDetection hit the whole-false case (empty hint). No test exercised formatHint == MediaTypeEbook (left=false, right=true). TestImportFromPath_FormatHintEbook passes a .m4b file (auto-detects as audiobook) with formatHint="ebook", forcing false||true evaluation and routing the import through the ebook path. The author directory created under libDir confirms the ebook path was taken. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… formatHint param (vavallee#766) scanner_qbittorrent_test.go was added by upstream commit 4872293 after the PR branched. The rebase didn't update it to pass the new formatHint string argument introduced by this PR. Add the empty string to unblock CI. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
34b0edc to
fc1b55d
Compare
…Bnzbd partials (vavallee#766) The two Codecov partial lines were the cleanup closure bodies in tryImportNZBGet (scanner.go:765.82) and tryImportSABnzbd (scanner.go:976.80). Both had count=0 because the existing smoke tests used empty download dirs that caused tryImportInternal to return early before cleanupFunc is called. cleanupFunc is only invoked after a fully successful import, so the new tests supply a real .epub file and a seeded book+author so the import completes. The NZBGet/SABnzbd clients point at non-running localhost ports; RemoveHistory/DeleteHistory fail immediately with "connection refused", but the closure body is entered (count becomes 1) — which is the coverage goal. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@vavallee As this is a quite substantial update, I would like input from your side. If it is not the correct way to go about the implementation, please let me know. |
vavallee
left a comment
There was a problem hiding this comment.
Thanks for the substantial work here, statte. The Lookup helper, parser fix (`seriesBookInlineRe`), and `formatHint` plumbing are clean and well-tested.
Two security-shaped concerns block merge as-is. Both are fixable with small edits.
1. Routes need to sit behind `auth.RequireAdmin`
The `#nosec G304` comments on `internal/api/manual_import.go:47,94` claim "endpoint is admin-only", but the route registrations at `cmd/bindery/main.go:723-724` are inside the user-level `/api/v1` group, not the `r.Use(auth.RequireAdmin)` sub-group. So any authenticated user (role `user`) can call these endpoints today.
The manual-import flow has real authority: `POST /queue/manual-import` ultimately calls `tryImportInternal`, which copies/hardlinks/moves the supplied path into the configured library directory. Combined with the existing `?deleteFiles=true` on `DELETE /book/{id}`, a non-admin can read or move arbitrary server-filesystem paths into the library and back out. That's a multi-user privilege escalation against the role model v1.15.1 hardened.
Fix: move both routes into the `r.Group(func(r chi.Router) { r.Use(auth.RequireAdmin); ... })` block in `cmd/bindery/main.go`. The nosec comments are then accurate.
2. Path needs to be constrained to a configured root
Even with admin gating, accepting any absolute path on the server filesystem is broader than the feature description implies (importing files the admin already has under their managed library). An admin pasting `/etc/...` or `/proc/...` won't yield useful results today but will exercise filesystem-read primitives that don't need to exist.
Fix: before any `os.Stat` / `Lookup` / `ImportFromPath` call, reject paths that don't resolve under `s.libraryDir` and the configured `root_folders` (the same set the scanner walks). Symlink resolution via `filepath.EvalSymlinks` before the containment check defeats symlink-traversal escapes.
This is the same defense-in-depth the existing `isAllowedPath` check in `internal/api/files.go` performs for the file-download endpoint, and it's tracked for the deletion path under #865.
What I'd recommend
Two small commits on this branch:
- Move the two route registrations into the `RequireAdmin` group.
- Add a `rejectIfOutsideAllowedRoots(path)` helper at the top of both Lookup and Import that takes the same root list the scanner already builds. The helper is also a good target for unit tests (path under root: accept; path outside: 403; symlink escaping root: 403).
Tests + parser fix are already great; this is just the access-control surface. Once those two changes land I'll happily admin-merge. Feature itself is exactly what users have been asking for in #766.
|
Hey, Ok understood. I will look into on how to implement. Thanks. |
… paths to library roots (vavallee#795) Move both /queue/manual-import routes into the RequireAdmin middleware group so only admin users can call them. Add isAllowedPath on ManualImportHandler that resolves symlinks via filepath.EvalSymlinks before checking containment against the configured library roots (cfg.LibraryDir, cfg.AudiobookDir, and the root_folders table). Both Lookup and Import reject paths outside those roots with 403. Tests cover: path outside roots (ebook, audiobook file, audiobook dir), path inside either root, and symlink escape attempts — all for both Lookup and Import. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Update — addresses review feedback: 1. Admin-only routes — both 2. Path containment — added Tests added (
Commit: 988b8d3 |
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Upstream added scanner_diagnostics_test.go and a new test in scanner_test.go after this branch was cut. Both call tryImportInternal with the old 6-arg signature; add the empty formatHint string arg to match the 7-arg signature introduced by this branch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…te time The isAllowedPath check fires before any repo call, so containment tests don't need a database. Replace manualImportFixture / db.OpenMemory calls with a nil-repo handler via a new containmentHandler helper. Drops 12 migration-heavy DB opens under -race, keeping internal/api within CI's 10-minute budget after the upstream merge added significant test volume. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
CI timeout note: The All three of the concerns raised in review are addressed:
Happy to rebase or address anything else once the CI infrastructure issue is resolved upstream. |
Summary
tryImportInternalengineSeries Book N - Titlefilename parsing soConvergence_Book_2_-_Dragonslayer.m4bcorrectly extracts title "Dragonslayer" instead of "Convergence Book 2"What's included
internal/importer/lookup.go—Lookup()on Scanner, fuzzy author matching (exact, comma-inversion, JaroWinkler ≥ 0.80), format detectioninternal/importer/scanner.go—formatHintparam ontryImportInternal; publicImportFromPath()wrapperinternal/importer/parser.go—seriesBookInlineRepattern handlesSeries Book/Vol N - Titlefilenamesinternal/api/manual_import.go—GET /queue/manual-import/lookupandPOST /queue/manual-importbehind amanualImportScannerinterfacecmd/bindery/main.go— route wiringweb/—ManualImportSectioncomponent inImportTab, API client methods, i18n keysTest plan
go test ./internal/importer/... ./internal/api/...passesSeries_Book_N_-_Title.m4bfilenames now match correctlyCloses #766
🤖 Generated with Claude Code