providers#73
Conversation
There was a problem hiding this comment.
Pull request overview
Adds an “activated metadata providers” surface area: endpoints to list configured metadata providers and to search for metadata candidates via a new use case + aggregator, including a new Hardcover metadata provider implementation.
Changes:
- Add
GET /api/metadata/providersto expose activated provider IDs + capabilities. - Add
POST /api/metadata/searchbacked bySearchMetadataCandidatesUseCase. - Implement
HardcoverMetadataProviderand update metadata provider factory + server composition wiring.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| sake/src/routes/api/metadata/search/+server.ts | New API endpoint to search metadata candidates using the new use case. |
| sake/src/routes/api/metadata/providers/+server.ts | New API endpoint to list activated metadata providers + their capabilities. |
| sake/src/lib/server/infrastructure/metadata-providers/metadataProviderFactory.ts | Adds Hardcover provider + changes factory to return null for unavailable/unimplemented providers. |
| sake/src/lib/server/infrastructure/metadata-providers/hardcoverMetadataProvider.ts | New Hardcover GraphQL-backed metadata provider with rate limiting + field mapping. |
| sake/src/lib/server/application/use-cases/SearchMetadataCandidatesUseCase.ts | New use case to build a metadata query from a bookId and/or provided query overrides. |
| sake/src/lib/server/application/composition.ts | Wires activated metadata providers/aggregator and exports searchMetadataCandidatesUseCase. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Token-bucket rate limiter — 60 requests per minute (Hardcover's stated limit) | ||
| // --------------------------------------------------------------------------- | ||
|
|
||
| const RATE_LIMIT_PER_MINUTE = 60; | ||
| const REFILL_INTERVAL_MS = (60 / RATE_LIMIT_PER_MINUTE) * 1_000; // ms per token = 1000ms | ||
|
|
||
| class TokenBucket { | ||
| private tokens: number; | ||
| private lastRefill: number; | ||
|
|
||
| constructor(private readonly capacity: number) { | ||
| this.tokens = capacity; | ||
| this.lastRefill = Date.now(); | ||
| } | ||
|
|
||
| private refill(): void { | ||
| const now = Date.now(); | ||
| const elapsed = now - this.lastRefill; | ||
| const newTokens = Math.floor(elapsed / REFILL_INTERVAL_MS); | ||
| if (newTokens > 0) { | ||
| this.tokens = Math.min(this.capacity, this.tokens + newTokens); | ||
| this.lastRefill = now; | ||
| } | ||
| } | ||
|
|
||
| tryConsume(): boolean { | ||
| this.refill(); | ||
| if (this.tokens > 0) { | ||
| this.tokens--; | ||
| return true; | ||
| } | ||
| return false; | ||
| } | ||
| } | ||
|
|
||
| const rateLimiter = new TokenBucket(RATE_LIMIT_PER_MINUTE); |
There was a problem hiding this comment.
This token bucket is initialized with capacity = 60, which allows a burst of up to 60 requests immediately after startup/idle (before refilling becomes the limiter). If Hardcover enforces a strict rolling-window 60/minute limit, this can still trigger 429s. Consider lowering the capacity (e.g., 1–5) or switching to a leaky-bucket / fixed-window approach to smooth request rate.
| // Token-bucket rate limiter — 60 requests per minute (Hardcover's stated limit) | |
| // --------------------------------------------------------------------------- | |
| const RATE_LIMIT_PER_MINUTE = 60; | |
| const REFILL_INTERVAL_MS = (60 / RATE_LIMIT_PER_MINUTE) * 1_000; // ms per token = 1000ms | |
| class TokenBucket { | |
| private tokens: number; | |
| private lastRefill: number; | |
| constructor(private readonly capacity: number) { | |
| this.tokens = capacity; | |
| this.lastRefill = Date.now(); | |
| } | |
| private refill(): void { | |
| const now = Date.now(); | |
| const elapsed = now - this.lastRefill; | |
| const newTokens = Math.floor(elapsed / REFILL_INTERVAL_MS); | |
| if (newTokens > 0) { | |
| this.tokens = Math.min(this.capacity, this.tokens + newTokens); | |
| this.lastRefill = now; | |
| } | |
| } | |
| tryConsume(): boolean { | |
| this.refill(); | |
| if (this.tokens > 0) { | |
| this.tokens--; | |
| return true; | |
| } | |
| return false; | |
| } | |
| } | |
| const rateLimiter = new TokenBucket(RATE_LIMIT_PER_MINUTE); | |
| // Smoothed rate limiter — 60 requests per minute (Hardcover's stated limit) | |
| // --------------------------------------------------------------------------- | |
| const RATE_LIMIT_PER_MINUTE = 60; | |
| const REFILL_INTERVAL_MS = (60 / RATE_LIMIT_PER_MINUTE) * 1_000; // 1000ms between requests | |
| class TokenBucket { | |
| private nextAllowedAt: number; | |
| constructor() { | |
| this.nextAllowedAt = 0; | |
| } | |
| tryConsume(): boolean { | |
| const now = Date.now(); | |
| if (now >= this.nextAllowedAt) { | |
| this.nextAllowedAt = now + REFILL_INTERVAL_MS; | |
| return true; | |
| } | |
| return false; | |
| } | |
| } | |
| const rateLimiter = new TokenBucket(); |
| // GraphQL query — kept to depth ≤ 3 per Hardcover's documented limit | ||
| // --------------------------------------------------------------------------- | ||
|
|
||
| const SEARCH_QUERY = /* GraphQL */ ` | ||
| query SakeMetadataSearch($query: String!, $limit: Int!) { | ||
| search(query: $query, query_type: "Book", per_page: $limit) { | ||
| results { | ||
| hit { | ||
| id | ||
| title | ||
| description | ||
| rating | ||
| ratings_count | ||
| slug | ||
| cached_contributors | ||
| cached_tags | ||
| default_edition_id | ||
| default_edition { | ||
| isbn_13 | ||
| isbn_10 | ||
| pages | ||
| release_date | ||
| image { url width height } | ||
| publisher { name } | ||
| language { language } | ||
| } |
There was a problem hiding this comment.
The comment says this GraphQL query is kept to depth ≤ 3, but the selection set nests beyond that (e.g., search.results.hit.default_edition.image). If Hardcover actually enforces a depth limit, this comment is misleading (or the query may be rejected). Consider updating the comment to match the real constraint, or reducing query nesting if a hard depth limit exists.
| import { isMetadataLookupEnabled } from '$lib/server/config/activatedMetadataProviders'; | ||
| import { errorResponse } from '$lib/server/http/api'; | ||
| import { json } from '@sveltejs/kit'; | ||
| import type { RequestHandler } from './$types'; | ||
|
|
||
| export const GET: RequestHandler = async () => { | ||
| if (!isMetadataLookupEnabled()) { |
There was a problem hiding this comment.
isMetadataLookupEnabled() only checks whether any provider IDs are listed in ACTIVATED_METADATA_PROVIDERS, but activatedMetadataProviders can still be empty (e.g., isbndb not implemented or hardcover token missing). That leads to this endpoint returning 200 with an empty providers array while the feature is effectively unusable. Consider basing the guard on activatedMetadataProviders.length > 0, or updating isMetadataLookupEnabled() to reflect actually-instantiated providers.
| import { isMetadataLookupEnabled } from '$lib/server/config/activatedMetadataProviders'; | |
| import { errorResponse } from '$lib/server/http/api'; | |
| import { json } from '@sveltejs/kit'; | |
| import type { RequestHandler } from './$types'; | |
| export const GET: RequestHandler = async () => { | |
| if (!isMetadataLookupEnabled()) { | |
| import { errorResponse } from '$lib/server/http/api'; | |
| import { json } from '@sveltejs/kit'; | |
| import type { RequestHandler } from './$types'; | |
| export const GET: RequestHandler = async () => { | |
| if (activatedMetadataProviders.length === 0) { |
| import type { RequestHandler } from './$types'; | ||
|
|
||
| function str(v: unknown): string | null { | ||
| return typeof v === 'string' ? v : null; |
There was a problem hiding this comment.
str() returns any string verbatim, including whitespace-only values. That can make downstream providers treat a query term as present even when it’s effectively empty (e.g., ' '), producing noisy/broad upstream searches. Consider trimming and converting empty strings to null (similar to asString() in metadataProviderUtils).
| return typeof v === 'string' ? v : null; | |
| if (typeof v !== 'string') return null; | |
| const trimmed = v.trim(); | |
| return trimmed === '' ? null : trimmed; |
| hardcoverId: str(r.hardcoverId), | ||
| limit: typeof r.limit === 'number' ? r.limit : undefined | ||
| }; |
There was a problem hiding this comment.
limit is accepted as any number (including floats/negatives/very large values). This can break providers that require an integer (Hardcover GraphQL Int!) and can also amplify upstream load. Consider validating it as a safe positive integer and clamping to a reasonable max (and/or letting providers clamp internally).
| const bookId = typeof body.bookId === 'number' ? body.bookId : undefined; | ||
| const query = parseQuery(body.query); |
There was a problem hiding this comment.
bookId is accepted as any number, which includes non-integers and negative values. If bookId is intended to be a database ID, consider validating it as a positive integer (e.g., Number.isInteger(...) && bookId > 0) and returning 400 otherwise.
| export function createMetadataProvider(providerId: MetadataProviderId): MetadataProviderPort | null { | ||
| switch (providerId) { | ||
| case 'googlebooks': | ||
| return new GoogleBooksMetadataProvider(); | ||
| case 'openlibrary': | ||
| return new OpenLibraryMetadataProvider(); | ||
| case 'hardcover': | ||
| throw new Error('Hardcover metadata provider is not yet implemented'); | ||
| // Only instantiate when token is configured; silently skipped otherwise | ||
| return process.env.HARDCOVER_API_TOKEN?.trim() ? new HardcoverMetadataProvider() : null; | ||
| case 'isbndb': | ||
| throw new Error('ISBNdb metadata provider is not yet implemented'); | ||
| return null; // not yet implemented — Phase 6 | ||
| default: { |
There was a problem hiding this comment.
Returning null for activated-but-unavailable providers (e.g., missing HARDCOVER_API_TOKEN or not-yet-implemented isbndb) can result in isMetadataLookupEnabled() being true while zero providers are actually instantiated. That makes /api/metadata/providers and /api/metadata/search behave as if metadata lookup is enabled but silently do nothing. Consider either (a) always instantiating the provider and letting it return a 503 (so the aggregator can surface a provider error), or (b) changing the “enabled” check to be based on the instantiated providers list rather than the env list.
| const limit = query.limit ?? 5; | ||
|
|
||
| try { | ||
| let books: HardcoverBook[]; | ||
|
|
||
| if (query.isbn) { | ||
| const data = await graphqlFetch<HardcoverISBNResult['data']>( | ||
| token, | ||
| ISBN_LOOKUP_QUERY, | ||
| { isbn: query.isbn, limit } | ||
| ); |
There was a problem hiding this comment.
limit is passed directly into GraphQL variables where it is declared as Int!. If query.limit is a float/negative/NaN (possible from API input), Hardcover will reject the request. Consider coercing limit to a safe positive integer and clamping it to a sane max before using it in GraphQL calls.
| async execute( | ||
| input: SearchMetadataCandidatesInput | ||
| ): Promise<ApiResult<SearchMetadataCandidatesResult>> { | ||
| if (input.bookId == null && input.query == null) { | ||
| return apiError('Either bookId or query must be provided', 400); | ||
| } | ||
|
|
||
| let query: MetadataQuery; | ||
|
|
||
| if (input.bookId != null) { | ||
| const book = await this.bookRepository.getById(input.bookId); | ||
| if (!book) { | ||
| return apiError('Book not found', 404); | ||
| } | ||
| query = { | ||
| title: book.title, | ||
| author: book.author ?? null, | ||
| isbn: book.identifier ?? null, | ||
| language: book.language ?? null, | ||
| googleBooksId: book.google_books_id ?? null, | ||
| openLibraryKey: book.open_library_key ?? null, | ||
| ...input.query | ||
| }; | ||
| } else { | ||
| query = input.query!; | ||
| } | ||
|
|
||
| const result = await this.aggregator.lookup(query); | ||
| return apiOk(result); | ||
| } |
There was a problem hiding this comment.
This introduces a new use case with multiple input paths (bookId lookup + query override vs direct query). There don’t appear to be existing tests covering this behavior; adding unit tests would help prevent regressions (e.g., merging precedence, 404 on missing book, and 400 when neither bookId nor query is provided).
No description provided.