This is just a loosely ordered list of things I already have on my radar, to be cleaned up later™️.
Harmonized Data
- Merge missing properties into the preferred provider's data
- Check for conflicting properties during merge
  - Warning for duration
  - Error for GTIN
  - Error for incompatible medium track counts
    - Merge tracks if the total track count matches (1 medium vs N media)
    - Skip missing tracklist! (by far the most common error in the test phase logs)
    - Merge empty medium into medium with tracks
- Release date quality ranking, plausibility checks for each provider (new attribute "date.warning"), merge strategy "prefer latest" (Don't seed the release date if it's before the source service existed #101)
- Generally warn about pre-release data
- Guess featured artists from titles (better handling of feat. artists #39)
- Copyright notices
- Explicitness of tracks
- Optional title cleanup (numeric prefix, ETI style etc.)
- Deezer allows crediting the same artist multiple times
- Customizable search & replace rules
- Audiobook / audio drama mode?
- Preserve catalog numbers
- Improve language detection (skip too short inputs, try alternatives?)
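A rough sketch of how the first two items (merging missing properties, checking for conflicts) might fit together. The `SimpleRelease` shape, the `mergeRelease` name and the 2 second duration tolerance are assumptions made for illustration, not the project's actual types or settings.

```typescript
// Hypothetical, heavily simplified release shape for illustration only.
interface SimpleRelease {
  title?: string;
  gtin?: string;
  /** Total duration in milliseconds. */
  duration?: number;
  copyright?: string;
}

interface MergeMessage {
  type: "warning" | "error";
  text: string;
}

/** Strips leading zeros so that padded and unpadded GTINs compare as equal. */
function normalizeGtin(gtin: string): string {
  return gtin.replace(/^0+/, "");
}

function mergeRelease(
  preferred: SimpleRelease,
  other: SimpleRelease,
  durationToleranceMs = 2000, // assumed tolerance, not a project setting
): { release: SimpleRelease; messages: MergeMessage[] } {
  const messages: MergeMessage[] = [];

  // Conflicting GTINs are reported as an error.
  if (
    preferred.gtin !== undefined && other.gtin !== undefined &&
    normalizeGtin(preferred.gtin) !== normalizeGtin(other.gtin)
  ) {
    messages.push({
      type: "error",
      text: `GTIN mismatch: ${preferred.gtin} vs ${other.gtin}`,
    });
  }

  // Diverging durations only produce a warning.
  if (
    preferred.duration !== undefined && other.duration !== undefined &&
    Math.abs(preferred.duration - other.duration) > durationToleranceMs
  ) {
    messages.push({
      type: "warning",
      text: `Durations differ: ${preferred.duration} ms vs ${other.duration} ms`,
    });
  }

  // Values of the preferred provider win, missing ones are filled in from the other provider.
  const release: SimpleRelease = {
    title: preferred.title ?? other.title,
    gtin: preferred.gtin ?? other.gtin,
    duration: preferred.duration ?? other.duration,
    copyright: preferred.copyright ?? other.copyright,
  };

  return { release, messages };
}
```

Collecting messages instead of throwing keeps the merged result usable even when a conflict is found, which seems to match the warning/error split in the list.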
Providers
- iTunes: Ensure that collection.trackCount equals the number of returned tracks
- iTunes: Warn about responses which contain multiple release variants for a UPC
- iTunes: Country list for optional region lookups
- iTunes: Use region from URL
- iTunes: Try to use artist (or ISRC?) region for canonical (region-specific) URL
- iTunes: Try next region instead of throwing if JSON parsing fails (see screenshot)
- Spotify: Pad GTIN with zeros if no results are found (example)
- Deezer: Truncate padded GTIN
- Bandcamp: Add untitled hidden tracks, only their count is available as OG meta header
- Extract more than just trAlbum into snapshots -> wrapper object, avoid deserializing JSON
- Bandcamp: Try band URL as label URL, extract label from packages
- Bandcamp: Check whether band is part of the release artist before using it as label
- iTunes: Warn about missing tracklist
- Bandcamp: VA releases
- Bandcamp: /track URLs (Support standalone Bandcamp tracks #7)
- Bandcamp: Extract release ISRCs from all /track URLs (expensive, only if there is no better source)
- iTunes: Drop " - Single" from title (Remove "- Single" & "- EP" from release names automagically #9)
- Bandcamp: Extract track images, from embedded player (only for pre-releases so far)
- Beatport: Warn about catalog numbers which look like GTINs #46
- iTunes: When all lookup attempts failed, show the URLs for all tried regions instead of only the last one
- Bandcamp: no tracks https://2nxmusic.bandcamp.com/album/stolen-lullabies
- Bandcamp: custom domains (fix(Bandcamp): Fall back to raw release URL for custom domains #8)
- Deezer: API sometimes returns too many tracks: https://musicbrainz.org/edit/112474481 or https://www.deezer.com/album/303245
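The two GTIN padding items (Spotify, Deezer) could share one small helper that produces the common spellings of a GTIN. This is only a sketch: `gtinVariants` is a made-up name, and which length each provider API actually expects is not stated above, so the caller would have to try the variants in turn.

```typescript
/**
 * Returns the given GTIN together with its unpadded form and its
 * zero-padded 12, 13 and 14 digit forms, without duplicates.
 * Assumes the input consists of digits only and is not all zeros.
 */
function gtinVariants(gtin: string): string[] {
  const unpadded = gtin.replace(/^0+/, "");
  // A Set drops duplicates and keeps the insertion order.
  const variants = new Set<string>([gtin, unpadded]);
  for (const length of [12, 13, 14]) {
    if (unpadded.length <= length) {
      variants.add(unpadded.padStart(length, "0"));
    }
  }
  return Array.from(variants);
}

// A provider lookup could try the original value first and fall back to the other spellings.
const candidates = gtinVariants("0602445790128");
```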
MusicBrainz
- Suggest existing release group
- Find release group or similar releases, reuse recordings? Similar query as MB duplicates tab?
- Resolve external links to MBIDs
- Don't resolve ambiguous URLs to MBIDs
- Allow two URL rels if they are for the same target entity (e.g. download and streaming)
- Cache pending requests in a map, parallel resolving of all identifiers
- Use resolved MBID of release artist for unresolved but identically named track artists
- Combine release and track artists which share identifiers or names to avoid inconsistent results in edge cases (Improve artist matching and combining #54)
- Guess release group types (Support (more) release group types #15)
- Create edit note: Add permalink / homepage / repository URL (and version?)
- Optionally fill the annotation with additional data (make the sections configurable)
  - Copyright notice
  - Availability
  - Release and track level credits (text only so far)
  - Explicitness (show, but do not seed for tracks; add to release/recording disambiguation?)
- Detect European releases (special country XE)
- Target seeder at existing release (Option to edit release instead of adding new release #73)
- Use ampersand for last joinphrase by default
- Support track URLs for other providers and suggest looking up their release
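The "cache pending requests in a map" item could look roughly like this. `PendingRequestCache` and its `resolver` callback are hypothetical names; the real resolver would call the MusicBrainz API, which is left out here.

```typescript
/** Shares one in-flight promise per key so identical identifiers are only requested once. */
class PendingRequestCache<T> {
  private readonly pending = new Map<string, Promise<T>>();
  private readonly resolver: (key: string) => Promise<T>;

  constructor(resolver: (key: string) => Promise<T>) {
    this.resolver = resolver;
  }

  /** Returns the pending (or already settled) request for the key, starting it if necessary. */
  resolve(key: string): Promise<T> {
    let request = this.pending.get(key);
    if (request === undefined) {
      request = this.resolver(key);
      this.pending.set(key, request);
    }
    return request;
  }

  /** Resolves all keys in parallel, duplicates share the same request. */
  resolveAll(keys: string[]): Promise<T[]> {
    return Promise.all(keys.map((key) => this.resolve(key)));
  }
}
```

Note that this sketch also keeps rejected promises in the map; a real implementation would probably want to drop failed entries so they can be retried.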
Infrastructure
- URL lookup -> GTIN -> parallel GTIN lookups
- Support provider-specific messages
- Return all provider error messages if no lookup was successful
- Allow choosing and excluding providers (Provider preferences)
- Allow providers to return multiple releases #70 i.e. different variants (e.g. for Bandcamp)
- Cache management: https://github.com/kellnerd/snap_storage
  - Invalidation strategy: FIFO/LRU/TTL? Maximum age (optional)
  - In memory and/or long-term cache? JSON files with compression or Redis?
  - Cache multiple versions with timestamps (daily? only if there have been changes?)
  - Let the requester know how old the data is and whether it is from the cache
  - Permalinks to specific cached version (include GTIN, enabled providers, optional additional URLs or ProviderName=ProviderId pairs)
- Optimize lookups (perform no GTIN lookup if ID was already looked up)
  - These repeated lookups also skew the calculated processing time for the initial provider (e.g. Deezer track requests are now cached)
- Use as few requests as possible (only make additional API calls for a provider if data is missing, e.g. iTunes regions or Deezer ISRCs)
- Lookup by metadata (label and catno, title, artist, track count etc.) for providers without GTIN
- Create provider feature categories (e.g. streaming, physical, with GTIN/ISRC, GTIN lookup, scraper, audio drama, Japanese etc.)
- Look up the entire discography of a given artist/label
- Make MusicBrainz base URL configurable (environment variable)
- Deduplicate lookup ReleaseOptions.regions option by using an ordered set
- Manage lookup state: Each provider "Example" is split into two classes ExampleProvider and ExampleReleaseLookup, where ExampleReleaseLookup has a (readonly) property provider
  - Splits general request logic and release processing logic
  - Possible to store release lookup state as class properties
  - Separation of unrelated tasks once we add artist/label lookups later, e.g. as ExampleArtistLookup and ExampleLabelLookup
- Warn that available regions may not be accurate before the release date has passed (anywhere on earth, UTC-12)
- Extract provider URLs from link shortener pages
- Extract provider IDs and GTIN from a-tisket URLs
- Write more test cases...
- Preserve URL blurb (for Beatport)
- Improve logging of AggregateErrors, they make it a PITA to find the real issue
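A minimal sketch of the provider/lookup split and the ordered set of regions described in the list above. Only the class names and the readonly `provider` property come from the list; `buildReleaseUrl`, `candidateUrls`, `attemptedUrls` and the API URL are placeholders invented for this example.

```typescript
interface ReleaseOptions {
  regions?: string[];
}

/** Holds general request logic which is shared by all lookups. */
class ExampleProvider {
  readonly name = "Example";

  // Hypothetical API URL, only here to make the sketch runnable.
  buildReleaseUrl(id: string, region: string): string {
    return `https://api.example.com/${region.toLowerCase()}/releases/${id}`;
  }
}

/** Holds the state of one release lookup. */
class ExampleReleaseLookup {
  readonly provider: ExampleProvider;
  /** A Set iterates in insertion order, so it works as an ordered set without duplicates. */
  readonly regions: Set<string>;
  /** Example of lookup state which is kept as a class property. */
  readonly attemptedUrls: string[] = [];

  constructor(provider: ExampleProvider, options: ReleaseOptions = {}) {
    this.provider = provider;
    this.regions = new Set(
      (options.regions ?? []).map((region) => region.toUpperCase()),
    );
  }

  /** Builds one candidate URL per unique region and remembers each of them. */
  candidateUrls(id: string): string[] {
    return Array.from(this.regions, (region) => {
      const url = this.provider.buildReleaseUrl(id, region);
      this.attemptedUrls.push(url);
      return url;
    });
  }
}
```

With this split, a later ExampleArtistLookup could reuse the same ExampleProvider instance without sharing any release-specific state.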
Web Interface
- Display header with logo and description
  - Harmony: Music Metadata Aggregator and MusicBrainz Importer/Seeder
  - Design banner logo and icon
- Display footer with version, repo URL and support URL (environment variables DENO_DEPLOYMENT_ID, REPO_BASE_URL, optional COMMIT_BASE_URL, SUPPORT_URL)
- Add OpenGraph meta tags
- Allow choosing and excluding providers (persistent provider checkboxes)
- Persist preferred regions input
- Show provider and alternative values for interesting properties
- Improve track length comparison, Deezer truncates instead of rounding
- Settings page/section with persisted checkboxes
- Multiple URL inputs (dynamic form)
- Provider URL detection on the frontend (URLPattern polyfill for Firefox and Safari? https://caniuse.com/mdn-api_urlpattern)
- CSS
- Responsive grid
- Beautiful icons: https://tabler.io/icons enhanced by a few custom ones in the same style
- Provider icons (external links or data URIs? inline TSX SVG? SVG sprite built with TSX)
- Post-submission route/page ("release actions"):
  - ISRC submission (kepstin/tatsumo/custom?)
  - Artwork (ECAU)
  - External links (for artists, maybe for labels?) (seed artist URLs to MB artist #33)
- Dynamic region list display: count, compact flags, detailed list
  - Group regions by continent
- Serve documentation, written in Markdown
- Support X-Forwarded-Proto proxy header (alternative: use HTTPS by passing key and cert options to start())
- Trim GTIN input to avoid unnecessary errors
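For the track length comparison item, one possible approach is to accept a whole-second value if it matches either the truncated or the rounded precise duration. `isSameTrackLength` is a hypothetical helper; that Deezer truncates is taken from the list above, everything else is an assumption.

```typescript
/**
 * Compares a precise duration (in milliseconds) with a duration in whole seconds
 * from a provider which may have truncated or rounded the value.
 */
function isSameTrackLength(preciseMs: number, wholeSeconds: number): boolean {
  const seconds = preciseMs / 1000;
  return Math.floor(seconds) === wholeSeconds ||
    Math.round(seconds) === wholeSeconds;
}

// 3:35.7 matches both 3:35 (truncated) and 3:36 (rounded), but not 3:34.
const truncatedMatches = isSameTrackLength(215700, 215);
const roundedMatches = isSameTrackLength(215700, 216);
const tooShortMatches = isSameTrackLength(215700, 214);
```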