Add OPDS endpoint by cdrini · Pull Request #11379 · internetarchive/openlibrary

cdrini · 2025-10-28T14:34:21Z

Add a few new endpoints to handle OPDS for @marcc

/opds/search
/opds/books/OL123M
/opds
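
For anyone wanting to poke at the three paths above, here is a small sketch of URL builders. The host and the `query` parameter name are assumptions (the PR text only names the paths), so check the search link advertised by /opds before relying on them.

```python
from urllib.parse import urlencode, urljoin

# Assumption: the production host; the PR description only lists paths.
BASE_URL = "https://openlibrary.org"


def opds_home_url(base_url: str = BASE_URL) -> str:
    """URL of the root catalog (/opds)."""
    return urljoin(base_url, "/opds")


def opds_search_url(query: str, base_url: str = BASE_URL) -> str:
    """URL of the search feed (/opds/search).

    Assumption: the search term is passed as a `query` parameter.
    """
    return urljoin(base_url, "/opds/search") + "?" + urlencode({"query": query})


def opds_book_url(olid: str, base_url: str = BASE_URL) -> str:
    """URL of a single publication (/opds/books/<OLID>), e.g. OL123M."""
    return urljoin(base_url, f"/opds/books/{olid}")
```

The helper names are hypothetical and not part of the PR.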

This pull request introduces a number of improvements and new features, primarily focused on adding OPDS (Open Publication Distribution System) API endpoints, enhancing book provider logic, and improving internationalization and homepage subject presentation. The most significant changes include implementing new OPDS endpoints, adding a Better World Books provider, updating CORS handling, and improving the way featured subjects are displayed on the homepage.

OPDS API and Homepage Improvements

Added new OPDS endpoints (/opds, /opds/search, /opds/books/(OLID)) to serve book and catalog data in OPDS 2.0 JSON format, including trending, classic, romance, kids, thrillers, and textbooks sections, with internationalized titles and featured subject navigation. Featured subjects now include emojis for a more engaging UI. [1] [2] [3] [4]
Improved homepage subject caching and context propagation for bots and language/host settings, ensuring correct rendering and cache keys. [1] [2]
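
To make "OPDS 2.0 JSON format" concrete, the sketch below assembles a minimal navigation feed by hand as a plain dict. It does not use pyopds2 and is not the PR's implementation; the helper name, the subject entry, its emoji, and its href are illustrative assumptions.

```python
import json

OPDS_JSON = "application/opds+json"


def build_navigation_feed(title: str, self_href: str, subjects: list) -> dict:
    """Assemble a minimal OPDS 2.0 navigation feed as a plain dict."""
    return {
        "metadata": {"title": title},
        "links": [{"rel": "self", "href": self_href, "type": OPDS_JSON}],
        "navigation": [
            {
                # Emoji prefix mirrors the featured-subject decoration.
                "title": f"{s['emoji']} {s['title']}",
                "href": s["href"],
                "type": OPDS_JSON,
            }
            for s in subjects
        ],
    }


feed = build_navigation_feed(
    "Welcome to Open Library",
    "https://openlibrary.org/opds",
    [
        # Hypothetical entry; the real hrefs and emoji may differ.
        {
            "title": "Romance",
            "emoji": "💕",
            "href": "https://openlibrary.org/opds/search?query=subject%3Aromance",
        },
    ],
)
print(json.dumps(feed, ensure_ascii=False, indent=2))
```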

Book Provider Enhancements

Added a new BetterWorldBooksProvider to the book providers, enabling acquisition links for books available via Better World Books, and included it in the provider order after Internet Archive. [1] [2]
Updated InternetArchiveProvider.get_acquisitions to use the correct acquisition access literal and support both Solr and database edition objects, improving accuracy of acquisition links. [1] [2] [3]
Added a helper function to aggregate acquisitions from all providers for a given edition, supporting the new provider logic.
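
A rough sketch of what aggregating acquisitions across providers could look like. The `Acquisition` fields, the two stub providers, the access strings, and the `providers` argument are all simplified assumptions for illustration; the real types live in openlibrary/book_providers.py.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Acquisition:
    # Simplified stand-in for the real Acquisition type.
    access: str
    url: str
    provider_name: str


class Provider(Protocol):
    def get_acquisitions(self, edition: dict) -> list: ...


class ArchiveStub:
    """Hypothetical provider keyed on the edition's ocaid."""

    def get_acquisitions(self, edition: dict) -> list:
        ocaid = edition.get("ocaid")
        if not ocaid:
            return []
        url = f"https://archive.org/details/{ocaid}"
        return [Acquisition("borrow", url, "Internet Archive")]


class BookstoreStub:
    """Hypothetical purchase provider keyed on ISBN-13."""

    def get_acquisitions(self, edition: dict) -> list:
        isbns = edition.get("isbn_13") or []
        return [
            Acquisition("buy", f"https://example.com/isbn/{isbn}", "Bookstore")
            for isbn in isbns
        ]


def get_acquisitions(edition: dict, providers: list) -> list:
    """Concatenate acquisitions from each provider, preserving provider order."""
    result = []
    for provider in providers:
        result.extend(provider.get_acquisitions(edition))
    return result
```

Iterating in provider order means a provider placed after Internet Archive contributes its links after the Internet Archive ones.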

Internationalization (i18n)

Added and updated translation strings for new OPDS-related UI elements such as "Search Results", "Welcome to Open Library", "Trending Books", "Classic Books", "Romance", "Kids", "Thrillers", and "Textbooks". [1] [2] [3] [4]

CORS and Processor Handling

Updated CORS processor configuration to support OPDS endpoints and ensure correct cross-origin headers for both /api/ and /opds/ routes. [1] [2]

Miscellaneous

Minor code quality improvements, such as using functools.cached_property and type casting, and importing missing modules.
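
As a quick reminder of what functools.cached_property does, here is a minimal sketch. The class and attribute names are hypothetical and unrelated to the PR's code.

```python
from functools import cached_property


class SubjectPage:
    """Hypothetical class, used only to illustrate the decorator."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.loads = 0

    @cached_property
    def works(self) -> list:
        # Stands in for an expensive lookup; the body runs once per instance.
        self.loads += 1
        return [f"{self.name} work {i}" for i in range(3)]


page = SubjectPage("romance")
first = page.works   # computes and stores the value on the instance
second = page.works  # served from the instance, no recomputation
```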

These changes collectively provide a richer API for external consumers, improve the discoverability and presentation of books, and lay the groundwork for further expansion of Open Library's digital lending and browsing capabilities.

Co-authored-by: Michael E. Karpeles (mek) <michael.karpeles@gmail.com>

Copilot

Pull Request Overview

This pull request adds OPDS (Open Publication Distribution System) support to Open Library, enabling the platform to serve book catalogs in a standardized format for e-book readers and aggregators. The changes also refactor book provider acquisition logic and enhance CORS configuration.

Key changes:

Adds new OPDS endpoints (/opds, /opds/search, /opds/books/<edition>) with caching
Refactors CORSProcessor to support exact path matching and a "CORS everything" mode
Introduces get_acquisitions() function to aggregate acquisitions from multiple providers
Adds BetterWorldBooksProvider for purchase options
Adds emoji decorations to featured subjects for better OPDS navigation

Reviewed Changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 5 comments.


File	Description
requirements.txt	Adds pyopds2 and pyopds2_openlibrary libraries from GitHub repositories
openlibrary/plugins/openlibrary/api.py	Implements three new OPDS endpoints with catalog building and caching logic
openlibrary/plugins/openlibrary/processors.py	Extends CORSProcessor with exact path matching and cors_everything flag
openlibrary/plugins/openlibrary/code.py	Updates CORS configuration to include OPDS paths
openlibrary/plugins/openlibrary/home.py	Adds emoji field to featured subjects and caches web.ctx.host for threading
openlibrary/book_providers.py	Adds BetterWorldBooksProvider, new get_acquisitions() function, and acquisition access conversion methods
openlibrary/plugins/worksearch/schemes/works.py	Updates to use new get_acquisitions() function instead of provider.get_acquisitions()
openlibrary/coverstore/code.py	Enables CORS for all coverstore endpoints
openlibrary/i18n/messages.pot	Updates translation strings, reorganizing existing entries

Comments suppressed due to low confidence (1)

openlibrary/plugins/openlibrary/home.py:1

[nitpick] Extremely long line (>150 characters) that reduces readability. Consider breaking this into multiple lines or extracting the query string construction into a separate variable.

"""Controller for home page."""


Copilot · 2025-11-06T01:11:37Z

openlibrary/book_providers.py

+        self,
+        ed_or_solr: Edition | dict,
+    ) -> list[Acquisition]:
+        key = cast(str, ed_or_solr['key'])


The code assumes ed_or_solr always has a 'key' field, but it could be a dict without this field (especially for solr edition documents which may use different structures). This will raise a KeyError. Add a guard to check if 'key' exists before accessing it.

Suggested change

-        key = cast(str, ed_or_solr['key'])
+        key = ed_or_solr.get('key') if isinstance(ed_or_solr, dict) else getattr(ed_or_solr, 'key', None)
+        if not key:
+            return []
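
The guard suggested above can be tried in isolation. This is a sketch only: `EditionStub` is a hypothetical stand-in for an attribute-style edition object, and `extract_key` is not a function in the PR.

```python
from typing import Optional


class EditionStub:
    """Hypothetical attribute-style edition object."""

    def __init__(self, key: Optional[str] = None) -> None:
        if key is not None:
            self.key = key


def extract_key(ed_or_solr) -> Optional[str]:
    """Return the key from a dict or an object, or None when it is absent."""
    if isinstance(ed_or_solr, dict):
        return ed_or_solr.get("key")
    return getattr(ed_or_solr, "key", None)


def acquisitions_for(ed_or_solr) -> list:
    key = extract_key(ed_or_solr)
    if not key:
        # Nothing to look up without a key, so return early instead of raising.
        return []
    return [key]  # placeholder for the real acquisition lookup
```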

Copilot · 2025-11-06T01:11:37Z

openlibrary/plugins/openlibrary/processors.py

+            or path.endswith(".json")
+            or path in self.cors_paths
+            or any(
+                web.ctx.path.startswith(path_segment)


Inconsistent use of path variable. Line 72 casts web.ctx.path to a local variable path for type safety, but line 78 uses web.ctx.path directly instead of the path variable. This should use path for consistency.

Suggested change

-                web.ctx.path.startswith(path_segment)
+                path.startswith(path_segment)
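
The condition under review can be sketched as a standalone predicate that only ever reads the local `path`. The function and parameter names here are assumptions; the real CORSProcessor may check more than this (request method, configuration flags, and so on).

```python
def should_add_cors(
    path: str,
    cors_paths: frozenset = frozenset(),
    cors_prefixes: tuple = (),
    cors_everything: bool = False,
) -> bool:
    """Decide whether a request path should receive CORS headers."""
    return (
        cors_everything                      # "CORS everything" mode
        or path.endswith(".json")            # JSON views of pages
        or path in cors_paths                # exact path match, e.g. "/opds"
        or any(path.startswith(prefix) for prefix in cors_prefixes)
    )
```

Keeping exact paths separate from prefixes avoids a prefix such as "/opds" accidentally matching an unrelated path like "/opdsx".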

Copilot · 2025-11-06T01:11:38Z

openlibrary/book_providers.py

+        # basically just check if it has an isbn?
+        return (ed_or_solr.get('isbn') or ed_or_solr.get('isbn_10') or []) + (
+            ed_or_solr.get('isbn_13') or []
+        )


The expression chains multiple fallbacks in a confusing way: isbn_10 is only consulted when isbn is missing or empty, but the intent seems to be to combine all available ISBNs. Consider using (ed_or_solr.get('isbn_10') or []) + (ed_or_solr.get('isbn_13') or []), or clarify the logic if isbn should be checked separately.

Suggested change

-        # basically just check if it has an isbn?
-        return (ed_or_solr.get('isbn') or ed_or_solr.get('isbn_10') or []) + (
-            ed_or_solr.get('isbn_13') or []
-        )
+        # Combine all available ISBNs (isbn, isbn_10, isbn_13)
+        return (ed_or_solr.get('isbn') or []) + (ed_or_solr.get('isbn_10') or []) + (ed_or_solr.get('isbn_13') or [])
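
The difference between the two expressions is easiest to see side by side. This sketch copies both onto plain dicts; the function names and the placeholder ISBN values are made up for illustration.

```python
def isbns_original(doc: dict) -> list:
    # As written in the PR: isbn_10 is only consulted when isbn is missing or empty.
    return (doc.get("isbn") or doc.get("isbn_10") or []) + (doc.get("isbn_13") or [])


def isbns_suggested(doc: dict) -> list:
    # As suggested in review: concatenate all three fields.
    return (doc.get("isbn") or []) + (doc.get("isbn_10") or []) + (doc.get("isbn_13") or [])


all_fields = {"isbn": ["A"], "isbn_10": ["B"], "isbn_13": ["C"]}
no_isbn = {"isbn_10": ["B"], "isbn_13": ["C"]}

print(isbns_original(all_fields))   # ['A', 'C']
print(isbns_suggested(all_fields))  # ['A', 'B', 'C']
print(isbns_original(no_isbn))      # ['B', 'C']
print(isbns_suggested(no_isbn))     # ['B', 'C']
```

One thing to watch with the suggested form: if `isbn` already repeats the values held in `isbn_10` or `isbn_13`, the concatenation will contain duplicates.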

Copilot · 2025-11-06T01:11:38Z

openlibrary/book_providers.py

+        }
+
+        for acq in results.values():
+            acq.provider_name = "Better World Books"


The provider name 'Better World Books' is hardcoded here but the class already has long_name = 'Better World Books' defined at line 621. Use self.long_name instead for consistency and maintainability.

Suggested change

-            acq.provider_name = "Better World Books"
+            acq.provider_name = self.long_name

Copilot · 2025-11-06T01:11:38Z

openlibrary/book_providers.py

+        return False
+
+    def get_identifiers(self, ed_or_solr: Edition | dict) -> list[str]:
+        # basically just check if it has an isbn?


The comment ends with a question mark, suggesting uncertainty about the implementation. Either confirm the logic and remove the question mark, or add a TODO if this needs refinement.

Suggested change

-        # basically just check if it has an isbn?
+        # Return all ISBNs (isbn, isbn_10, isbn_13) from the edition or solr document.

cdrini force-pushed the feature/opds-endpoint branch 3 times, most recently from 304c6e4 to 8582682 October 31, 2025 23:43

cdrini and others added 14 commits November 5, 2025 13:47

Add OPDS endpoint

9cdb8f9

Co-authored-by: Michael E. Karpeles (mek) <michael.karpeles@gmail.com>

Allow CORS access for /opds endpoints

54781d0

Add experimental /opds endpoint

ef36fdd

Use absolute URLs instead of relative URLs in OPDS feed

1123333

Fix /opds/search endpoint error

426ee50

Add links to /opds/search

9933760

Fix sort order of navigation in /opds

7eeb2ba

Experiment with emoji in OPDS navigation

35e1ece

Add opds2 publication endpoint

9599b7c

[opds2] Small tweaks to provider init + vbump

e6d2c86

[opds2] Fix search url relative

2b81ed1

[opds2] fix base url in publications

3ccc92c

[opds2] Fix content type

16c3fe9

Update to latest pyopds2

4410e3f

cdrini force-pushed the feature/opds-endpoint branch from 8582682 to 4410e3f November 5, 2025 23:26

cdrini added 3 commits November 5, 2025 18:51

Add BWB provider with dummy prices

5b0460e

Fix OPDS IA availability

9b2709f

[opds2] Fix archive labS url in requirements.txt

d20f192

mekarpeles marked this pull request as ready for review November 6, 2025 01:09

Copilot AI review requested due to automatic review settings November 6, 2025 01:09

mekarpeles merged commit 3f2d9ec into internetarchive:master Nov 6, 2025
6 checks passed

Copilot AI reviewed Nov 6, 2025


cdrini deleted the feature/opds-endpoint branch November 6, 2025 03:53

This was referenced Nov 7, 2025

Add types to vendors.py #11433

Open

Fix BWB provider error #11452

Merged

mekarpeles merged 17 commits into internetarchive:master from cdrini:feature/opds-endpoint


2 participants

