Release 0.9.0a1#20

Open
github-actions[bot] wants to merge 7 commits into master from release-0.9.0a1

@github-actions

Human review requested!

JarbasAl and others added 7 commits April 28, 2026 17:37
* feat: declare mediavocab as hard runtime dep

audiobooker.converters.audiobook_to_release() has imported mediavocab
implicitly since it was added; this commit makes the relationship
explicit:

- mediavocab>=0.1.0 in [project] dependencies (was unlisted).
- pytest declared as a `test` optional extra so `pip install
  audiobooker[test]` works without surprises.
- CHANGELOG entry.

115 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: track latest mediavocab API in audiobook_to_release

- Populate Release.release_date from AudioBook.year as IsoDate-compatible YYYY.
- Stamp Release.license = "public_domain" for LibriVox content so the
  Release.parsed_license property resolves to an open License.
- README + docs: mention mediavocab as a hard runtime dep, drop stale
  requests-cache references (the PR replaced CachedSession with a plain
  requests.Session), and show a search → typed Release example.
- New examples/mediavocab_release.py demonstrating parsed_license filtering.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: populate rich mediavocab fields (chapters, narrators, genres, ids)

LibriVox now emits one AudioBook per book (instead of one per section) with
full per-section Chapter offsets/titles, deduplicated reader cast, librivox_id
external id, content_genres, codec/bitrate, audio_language, and total runtime
from the API. LoyalBooks similarly aggregates RSS entries into chapters of a
single book.
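
The per-section chapter offsets described above can be derived by accumulating section durations. The following is a minimal sketch, not the project's code: `Chapter`, `sections_to_chapters`, `total_runtime`, and the `(title, seconds)` input shape are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chapter:
    """Hypothetical stand-in for a per-section chapter record."""
    title: str
    offset: int  # start position, in seconds from the beginning of the book
    end: int     # end position, in seconds from the beginning of the book


def sections_to_chapters(sections):
    """Turn (title, duration_seconds) pairs into chapters with running offsets."""
    chapters = []
    cursor = 0
    for title, seconds in sections:
        chapters.append(Chapter(title=title, offset=cursor, end=cursor + seconds))
        cursor += seconds
    return chapters


def total_runtime(chapters):
    """Total runtime is the end of the last chapter, or 0 for an empty book."""
    return chapters[-1].end if chapters else 0
```

With two sections of 600 and 300 seconds, the second chapter starts at 600 and the book runs 900 seconds in total.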

The audiobook_to_release converter now maps:
  - AudioBook.chapters         → Release.chapters (offset, end, title)
  - AudioBook.narrators[]      → Work.credits (PERFORMER, deduped)
  - AudioBook.genres           → Work.content_genres
  - AudioBook.external_ids     → Work.external_ids / Release.external_ids
  - AudioBook.codec/bitrate    → Release.codec/bitrate
  - AudioBook.language         → Release.audio_language
  - LoyalBooks                 → license="public_domain"

base.AudioBook gains chapters, narrators, genres, codec, bitrate,
external_ids fields; AudioBookChapter is exported. The example script now
shows off the rich Release output end-to-end.
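
As a rough illustration of the mapping listed above, the sketch below converts a book record into a plain dict. It does not use mediavocab: `Book` and `book_to_release` are hypothetical stand-ins, the dict keys merely echo the field names in this commit message, and case-insensitive narrator deduplication is an assumption about how "deduped" is implemented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Book:
    """Hypothetical stand-in for AudioBook; not the project's class."""
    title: str
    year: Optional[int] = None
    narrators: List[str] = field(default_factory=list)
    genres: List[str] = field(default_factory=list)
    language: str = ""


def book_to_release(book):
    """Map a Book onto release-like fields (a dict stands in for Release/Work)."""
    seen = set()
    performers = []
    for name in book.narrators:
        cleaned = name.strip()
        key = cleaned.casefold()  # assumption: dedup ignores case and whitespace
        if key and key not in seen:
            seen.add(key)
            performers.append(cleaned)
    return {
        "release_date": f"{book.year:04d}" if book.year else None,  # YYYY
        "license": "public_domain",
        "credits": [("PERFORMER", name) for name in performers],
        "content_genres": list(book.genres),  # copy, do not alias
        "audio_language": book.language,
    }
```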

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: CHANGELOG for the mediavocab integration

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: post-mediavocab audit cleanup

- ruff F/B sweep: add __all__ exports, drop unused imports, remove F841
- fix GoldenAudioBooks scraper: domain moved to goldenaudiobooks.com
  (sitemap_index.xml); _iter_post_sitemaps was filtering leaf URLs by
  index-only substring and yielded nothing
- harden Librivox _api_get: treat HTTP 500 / decode errors as empty
  result so combined queries (e.g. title=^X & author=Y) don't raise
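
The "treat HTTP 500 / decode errors as empty result" behaviour from the last bullet might look roughly like this. It is a sketch under assumptions: `books_from_response` is a hypothetical helper that takes the status code and raw body rather than a live response, and the top-level `"books"` key is assumed, not confirmed by this PR.

```python
import json


def books_from_response(status_code, body):
    """Return the list of books, or [] when the server errored or sent non-JSON."""
    if status_code == 500:
        return []
    try:
        payload = json.loads(body)
    except (json.JSONDecodeError, TypeError):
        # Non-JSON body (for example an HTML error page) or a missing body.
        return []
    if not isinstance(payload, dict):
        return []
    books = payload.get("books", [])  # assumed response shape
    return books if isinstance(books, list) else []
```

Returning an empty list lets callers that combine several queries keep iterating instead of handling an exception per query.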

* fix: address CodeRabbit review comments

- Use `as X` re-export aliases on all public imports in __init__.py to
  satisfy ruff F401 (unused-import on intentional re-exports)
- Reconcile narrator/narrators when both supplied: narrators list is
  authoritative, narrator singular always mirrors narrators[0]
- genres=list(genres) in the LibriVox scraper to avoid sharing the same
  mutable list object between tags and genres
- Remove hardcoded bitrate="128" from LibriVox (serves both 64 and 128 kbps)
- Simplify stable_id() guard in converters (always returns non-empty string)
- Fix stale "cached session" wording in docs/api.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: VCR cassette template for HTTP-backed parsers (librivox)

Adds the reference pattern for cassette-backed parser tests:
- test/conftest.py — vcr_config + vcr_cassette_dir fixtures
- test/test_librivox_vcr.py — one test per public Librivox method
- test/cassettes/ — recorded YAML cassettes (offline replay)
- .github/workflows/nightly-live.yml — daily re-record vs live API
- pyproject.toml — vcrpy + pytest-vcr in [test] extra

Replays in 0.7 s offline; nightly job catches upstream drift within 24h.
Pattern to be replicated across remaining scrapers and companion repos.
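
The record-then-replay idea behind cassettes can be sketched without vcrpy. This is a deliberately simplified illustration, not how vcrpy or pytest-vcr is implemented: `Cassette` is a hypothetical class, it keys on the URL only, and it stores JSON rather than YAML.

```python
import json
from pathlib import Path


class Cassette:
    """Record responses on first use, replay them from disk afterwards."""

    def __init__(self, path, fetch):
        self.path = Path(path)
        self.fetch = fetch  # callable performing the real request: url -> str
        # Load a previously recorded tape if one exists.
        self.tape = json.loads(self.path.read_text()) if self.path.exists() else {}

    def get(self, url):
        if url not in self.tape:
            # Cache miss: hit the network once and persist the response.
            self.tape[url] = self.fetch(url)
            self.path.write_text(json.dumps(self.tape, indent=2))
        return self.tape[url]
```

Once a tape exists, tests run offline; re-recording on a schedule is what surfaces upstream changes.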

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: VCR cassettes for AudioAnarchy scraper

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: VCR cassettes for DarkerProjects scraper

* test: VCR cassettes for GoldenAudioBooks scraper (silent-zero regression guard)

* chore: drop legacy requirements.txt — pyproject.toml is authoritative

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: VCR cassettes for HPAudioTales scraper

* test: VCR cassettes for LoyalBooks scraper

* test: VCR cassettes for StephenKingAudioBooks scraper

* fix: copy tags list before assigning to genres in LoyalBooks scraper

Prevents shared mutable reference between tags and genres fields so
mutations to one list do not silently affect the other.
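
A small illustration of the aliasing problem and the fix. The two builder functions are hypothetical and use dicts in place of the real book object.

```python
def build_aliased(tags):
    """Both keys point at the same list object, so they change together."""
    return {"tags": tags, "genres": tags}


def build_copied(tags):
    """genres gets its own shallow copy, so the two lists are independent."""
    return {"tags": tags, "genres": list(tags)}
```

Appending to `genres` from `build_aliased` also changes `tags`; with `build_copied`, `tags` is left untouched.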

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: pluggable HTTP transport with optional curl_cffi stealth backend

Allow per-instance Session injection on AudioBookSource subclasses,
add audiobooker.transport.default_session() honouring the
AUDIOBOOKER_TRANSPORT=curl_cffi env var, and expose a [stealth] extra
that pulls in curl_cffi for TLS-fingerprint impersonation. Class-level
default session is preserved so existing callers keep working.
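
One way to get per-instance injection while preserving a class-level default is to let an instance attribute shadow the class attribute. The sketch below is an assumption about the general shape, not the project's code: `PlainSession` stands in for `requests.Session` and `DemoSource` for an `AudioBookSource` subclass.

```python
class PlainSession:
    """Stand-in for requests.Session so the example has no dependencies."""


class DemoSource:
    # Class-level default: shared by every instance that does not override it.
    session = PlainSession()

    def __init__(self, session=None):
        if session is not None:
            # The instance attribute shadows the class default for this object only.
            self.session = session
```

Callers that never pass `session` keep using `DemoSource.session`, so existing code paths are unaffected.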

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: add coverage for converters, cli, search, base, utils, scrappers base

Adds 100+ tests covering audiobook_to_release conversion, the Click CLI surface
(search/index/cache subcommands), the parallel search orchestrator with timeout
and dedup, AudioBook/BookAuthor/Narrator equality and narrator sync, utils
HTTP/sitemap/score edge cases, and AudioBookSource base methods.

Raises overall package coverage from 53% to 76%.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: cover cache.download/play/main, index CLI, IndexedSource, youtube scraper

Adds offline tests for the cache module's download progress/failure paths,
play with cached and uncached files, _open_file across platforms, the
argparse main(), and the _find_book index/Librivox fallback chain.

Adds tests for the index CLI (build/update/stats/follow/unfollow/list/search),
IndexedSource delegation, _resolve_sources name normalisation, and the
_followed_as_sources YouTube instantiation path.

Adds offline tests for youtube.py covering _length_to_seconds, _parse_name,
_parse_video_item (both old videoRenderer and new lockupViewModel layouts),
_parse_playlist_video, extract_yt_metadata (authors/narrator/year/hashtags/
suffix stripping), _video_to_book metadata-resolution rules, _YtSourceMixin
search methods, iterate_all on channel/playlist sources (min_runtime and
title_blacklist filters), the pre-configured TheCybrarian / HorrorBabble /
TheDustyTome subclasses, and the tutubo-backed _iter_channel_videos /
_iter_playlist_videos parsers with mocked initial_data.

Raises overall package coverage from 76% to 94%.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: cover scraper edge paths (no-soup, parse errors, RSS quirks)

Adds offline tests for AudioAnarchy / DarkerProjects / GoldenAudioBooks /
HPTalesAudioBooks / StephenKingAudioBooks / Librivox / LoyalBooks covering
the get_soup-returns-None branches, parse failures (missing h1/content,
no streams ParseErrorException, Harry Potter / Stephen Fry narrator branch),
LoyalBooks calc_runtime invalid input, from_rss with no chapters and with
duplicate authors, search_by_tag genre matching with absolute/relative hrefs,
iterate_popular and iterate_all RSS exception handling, and Librivox
iterate_all on empty API responses.

Raises overall package coverage from 94% to 96%, exceeding the 95% target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(cache): print help when 'python -m audiobooker.cache' is run without a subcommand

Previously the command fell through to an args.query lookup, which raised AttributeError.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): install test extras in coverage workflow so pytest-vcr loads cassettes

Without `install_extras: test`, vcrpy/pytest-vcr are absent from the
coverage environment; pytest.mark.vcr is treated as unknown and tests
make unguarded live HTTP calls that return None, causing all
test_loyalbooks_vcr assertions to fail.

Also tighten narrator/narrators reconciliation in AudioBook.__post_init__
so that when both fields are supplied and diverge the list stays
consistent (narrators list is authoritative, singular is set to [0]).
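
The reconciliation rule can be sketched as follows. `Book` is a hypothetical stand-in for `AudioBook`; the branch that seeds the list from a lone singular value is an assumption, since the commit only specifies the case where both are supplied.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Book:
    """Hypothetical stand-in for AudioBook, showing only the narrator fields."""
    title: str
    narrator: Optional[str] = None
    narrators: List[str] = field(default_factory=list)

    def __post_init__(self):
        if self.narrators:
            # The list is authoritative: the singular mirrors its first entry.
            self.narrator = self.narrators[0]
        elif self.narrator:
            # Assumption: a lone singular value seeds the list.
            self.narrators = [self.narrator]
```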

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): pass .[test] extras to coverage workflow instead of bare 'test'

The reusable coverage workflow installs install_extras as a raw pip
argument; 'test' was being treated as a package name rather than the
project's [test] optional-dependency group.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: add 30 s timeout to LibriVox API request

Prevents indefinite hangs on network stalls, addressing CodeRabbit
feedback on PR #14.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…import (#19)

default_session() falls back to an unblock_requests.CloudflareSession
(env_prefix AUDIOBOOKER, wayback_fallback) when importable, after the
explicit curl_cffi opt-in and before the plain requests session. The
import is guarded so the dependency stays optional.
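
The resolution order described above can be sketched as a pure function that returns a backend name. `choose_backend` and `module_available` are hypothetical helpers; falling through when curl_cffi is requested but not installed is an assumption, and constructing the actual session objects is left out.

```python
import importlib.util


def module_available(name):
    """True if the module can be located, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def choose_backend(env, is_available=module_available):
    """Pick a transport: explicit curl_cffi opt-in, then unblock_requests, then requests."""
    if env.get("AUDIOBOOKER_TRANSPORT") == "curl_cffi" and is_available("curl_cffi"):
        return "curl_cffi"
    if is_available("unblock_requests"):
        return "unblock_requests"
    return "requests"
```

Passing `os.environ` as `env` gives the runtime behaviour; passing a plain dict and a fake `is_available` keeps the ordering testable offline.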

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>