Skip to content

Add OPDS endpoint#11379

Merged
mekarpeles merged 17 commits intointernetarchive:masterfrom
cdrini:feature/opds-endpoint
Nov 6, 2025
Merged

Add OPDS endpoint#11379
mekarpeles merged 17 commits intointernetarchive:masterfrom
cdrini:feature/opds-endpoint

Conversation

@cdrini
Copy link
Collaborator

@cdrini cdrini commented Oct 28, 2025

Add a few new endpoints to handle OPDS for @marcc

  • /opds/search
  • /opds/books/OL123M
  • /opds

This pull request introduces a number of improvements and new features, primarily focused on adding OPDS (Open Publication Distribution System) API endpoints, enhancing book provider logic, and improving internationalization and homepage subject presentation. The most significant changes include implementing new OPDS endpoints, adding a Better World Books provider, updating CORS handling, and improving the way featured subjects are displayed on the homepage.

OPDS API and Homepage Improvements

  • Added new OPDS endpoints (/opds, /opds/search, /opds/books/(OLID)) to serve book and catalog data in OPDS 2.0 JSON format, including trending, classic, romance, kids, thrillers, and textbooks sections, with internationalized titles and featured subject navigation. Featured subjects now include emojis for a more engaging UI. [1] [2] [3] [4]

  • Improved homepage subject caching and context propagation for bots and language/host settings, ensuring correct rendering and cache keys. [1] [2]

Book Provider Enhancements

  • Added a new BetterWorldBooksProvider to the book providers, enabling acquisition links for books available via Better World Books, and included it in the provider order after Internet Archive. [1] [2]

  • Updated InternetArchiveProvider.get_acquisitions to use the correct acquisition access literal and support both Solr and database edition objects, improving accuracy of acquisition links. [1] [2] [3]

  • Added a helper function to aggregate acquisitions from all providers for a given edition, supporting the new provider logic.

Internationalization (i18n)

  • Added and updated translation strings for new OPDS-related UI elements such as "Search Results", "Welcome to Open Library", "Trending Books", "Classic Books", "Romance", "Kids", "Thrillers", and "Textbooks". [1] [2] [3] [4]

CORS and Processor Handling

  • Updated CORS processor configuration to support OPDS endpoints and ensure correct cross-origin headers for both /api/ and /opds/ routes. [1] [2]

Miscellaneous

  • Minor code quality improvements, such as using functools.cached_property and type casting, and importing missing modules.

These changes collectively provide a richer API for external consumers, improve the discoverability and presentation of books, and lay the groundwork for further expansion of Open Library's digital lending and browsing capabilities.

@cdrini cdrini force-pushed the feature/opds-endpoint branch 3 times, most recently from 304c6e4 to 8582682 Compare October 31, 2025 23:43
@cdrini cdrini force-pushed the feature/opds-endpoint branch from 8582682 to 4410e3f Compare November 5, 2025 23:26
@mekarpeles mekarpeles marked this pull request as ready for review November 6, 2025 01:09
Copilot AI review requested due to automatic review settings November 6, 2025 01:09
@mekarpeles mekarpeles merged commit 3f2d9ec into internetarchive:master Nov 6, 2025
6 checks passed
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request adds OPDS (Open Publication Distribution System) support to Open Library, enabling the platform to serve book catalogs in a standardized format for e-book readers and aggregators. The changes also refactor book provider acquisition logic and enhance CORS configuration.

Key changes:

  • Adds new OPDS endpoints (/opds, /opds/search, /opds/books/<edition>) with caching
  • Refactors CORSProcessor to support exact path matching and a "CORS everything" mode
  • Introduces get_acquisitions() function to aggregate acquisitions from multiple providers
  • Adds BetterWorldBooksProvider for purchase options
  • Adds emoji decorations to featured subjects for better OPDS navigation

Reviewed Changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
requirements.txt Adds pyopds2 and pyopds2_openlibrary libraries from GitHub repositories
openlibrary/plugins/openlibrary/api.py Implements three new OPDS endpoints with catalog building and caching logic
openlibrary/plugins/openlibrary/processors.py Extends CORSProcessor with exact path matching and cors_everything flag
openlibrary/plugins/openlibrary/code.py Updates CORS configuration to include OPDS paths
openlibrary/plugins/openlibrary/home.py Adds emoji field to featured subjects and caches web.ctx.host for threading
openlibrary/book_providers.py Adds BetterWorldBooksProvider, new get_acquisitions() function, and acquisition access conversion methods
openlibrary/plugins/worksearch/schemes/works.py Updates to use new get_acquisitions() function instead of provider.get_acquisitions()
openlibrary/coverstore/code.py Enables CORS for all coverstore endpoints
openlibrary/i18n/messages.pot Updates translation strings, reorganizing existing entries
Comments suppressed due to low confidence (1)

openlibrary/plugins/openlibrary/home.py:1

  • [nitpick] Extremely long line (>150 characters) that reduces readability. Consider breaking this into multiple lines or extracting the query string construction into a separate variable.
"""Controller for home page."""

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

self,
ed_or_solr: Edition | dict,
) -> list[Acquisition]:
key = cast(str, ed_or_solr['key'])
Copy link

Copilot AI Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code assumes ed_or_solr always has a 'key' field, but it could be a dict without this field (especially for solr edition documents which may use different structures). This will raise a KeyError. Add a guard to check if 'key' exists before accessing it.

Suggested change
key = cast(str, ed_or_solr['key'])
key = ed_or_solr.get('key') if isinstance(ed_or_solr, dict) else getattr(ed_or_solr, 'key', None)
if not key:
return []

Copilot uses AI. Check for mistakes.
or path.endswith(".json")
or path in self.cors_paths
or any(
web.ctx.path.startswith(path_segment)
Copy link

Copilot AI Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent use of path variable. Line 72 casts web.ctx.path to a local variable path for type safety, but line 78 uses web.ctx.path directly instead of the path variable. This should use path for consistency.

Suggested change
web.ctx.path.startswith(path_segment)
path.startswith(path_segment)

Copilot uses AI. Check for mistakes.
Comment on lines +628 to +631
# basically just check if it has an isbn?
return (ed_or_solr.get('isbn') or ed_or_solr.get('isbn_10') or []) + (
ed_or_solr.get('isbn_13') or []
)
Copy link

Copilot AI Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expression chains multiple fallbacks in a confusing way. When isbn exists but is an empty list, or [] evaluates to [], but the intent seems to be to combine all available ISBNs. Consider using (ed_or_solr.get('isbn_10') or []) + (ed_or_solr.get('isbn_13') or []) or clarifying the logic if isbn should be checked separately.

Suggested change
# basically just check if it has an isbn?
return (ed_or_solr.get('isbn') or ed_or_solr.get('isbn_10') or []) + (
ed_or_solr.get('isbn_13') or []
)
# Combine all available ISBNs (isbn, isbn_10, isbn_13)
return (ed_or_solr.get('isbn') or []) + (ed_or_solr.get('isbn_10') or []) + (ed_or_solr.get('isbn_13') or [])

Copilot uses AI. Check for mistakes.
}

for acq in results.values():
acq.provider_name = "Better World Books"
Copy link

Copilot AI Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The provider name 'Better World Books' is hardcoded here but the class already has long_name = 'Better World Books' defined at line 621. Use self.long_name instead for consistency and maintainability.

Suggested change
acq.provider_name = "Better World Books"
acq.provider_name = self.long_name

Copilot uses AI. Check for mistakes.
return False

def get_identifiers(self, ed_or_solr: Edition | dict) -> list[str]:
# basically just check if it has an isbn?
Copy link

Copilot AI Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment ends with a question mark, suggesting uncertainty about the implementation. Either confirm the logic and remove the question mark, or add a TODO if this needs refinement.

Suggested change
# basically just check if it has an isbn?
# Return all ISBNs (isbn, isbn_10, isbn_13) from the edition or solr document.

Copilot uses AI. Check for mistakes.
@cdrini cdrini deleted the feature/opds-endpoint branch November 6, 2025 03:53
This was referenced Nov 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants