Bandcamp Crawler lets you explore, analyze, and export rich metadata from Bandcamp pages, including artists, albums, tracks, and search results. It turns the public catalog into structured data you can pipe into dashboards, research workflows, or music-discovery tools. It is designed for music analysts, indie label teams, and developers building features around Bandcamp content.

Created by Bitbash, built to showcase our approach to scraping and automation.

Bandcamp Crawler is a command-line and scriptable tool that navigates Bandcamp pages and converts them into structured JSON records. It supports search pages, artist discographies, albums, and individual tracks, making it easy to analyze discographies, track-level details, and catalog metadata at scale.
It is ideal for:
- Music data scientists and analysts who want high-quality catalog data.
- Indie labels and artist managers tracking releases and tags across catalogs.
- Developers integrating Bandcamp metadata into apps, dashboards, or recommendation engines.

Key capabilities:
- Supports multiple entry points including search, artist music pages, album pages, track pages, and discovery feeds.
- Traverses pagination on search and discovery views to capture more results with configurable limits.
- Extracts detailed album metadata including tags, tracklists, artwork, and artist information.
- Captures track-level attributes such as duration, position, album links, and artist details.
- Provides flexible input flags to control whether albums, tracks, or artists are followed and stored as individual records.

| Feature | Description |
|---|---|
| Multi-entry crawling | Start from search, artist, album, track, or discover pages and let the crawler resolve all supported entities. |
| Discography extraction | Collect complete artist discographies including albums, tags, and associated metadata. |
| Track-level insights | Extract track titles, positions, durations, album references, and artist information. |
| Configurable depth | Use boolean flags to decide whether to follow albums from search, tracks from albums, or albums from tracks. |
| Pagination control | Limit how many search or discover pages are traversed with a simple numeric setting. |
| Proxy-ready networking | Plug in your own proxy configuration for safer, more reliable large-scale runs. |
| Verbose debugging mode | Enable debug logging to inspect crawling flow, parsed entities, and edge cases. |
| Export-friendly output | Save results as structured JSON that can be converted to CSV, Excel, or imported into your own database or analytics stack. |
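
The pagination limit can be pictured as a small decision function. The following is a minimal sketch, not the crawler's actual code: `nextPageToVisit` is a hypothetical helper, and it assumes a search or discover record carries the `pagination` object documented in this README, with `maxPages` standing in for a numeric limit such as `maxPagesToSearch`.

```javascript
// Hypothetical helper: decide whether another results page should be fetched.
// `record` is a search/discover record with a `pagination` object;
// `maxPages` plays the role of the numeric page limit.
function nextPageToVisit(record, maxPages) {
  const pagination = record && record.pagination;
  if (!pagination || !pagination.urls) return null;

  // Stop once the configured number of pages has been visited.
  if (pagination.page >= maxPages) return null;

  // `urls.next` is only expected when another page exists.
  return pagination.urls.next || null;
}

const firstPage = {
  dataType: "search",
  pagination: {
    page: 1,
    pages: 4,
    urls: {
      next: "https://bandcamp.com/search?page=2&q=five%20finger%20death%20punch",
    },
  },
};

console.log(nextPageToVisit(firstPage, 3)); // URL of page 2
console.log(nextPageToVisit(firstPage, 1)); // null, because the limit is reached
```

Keeping the check in one place means a run stops at whichever comes first: the configured limit or the last available page.
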

Output fields:

| Field Name | Field Description |
|---|---|
| dataType | Type of record scraped (e.g., search, album, track, artist). |
| title | Human-readable title of the entity (album title, track title, artist name, etc.). |
| url | Canonical URL of the scraped entity on Bandcamp. |
| image.url | URL of the primary artwork or thumbnail image associated with the entity. |
| pagination.page | Current page number in a search or discovery result set. |
| pagination.pages | Total number of available pages for the search query. |
| pagination.urls.first | URL of the first page in the search result set. |
| pagination.urls.last | URL of the last page in the search result set. |
| pagination.urls.next | URL of the next page, if another page exists. |
| results[] | Array of search results (artists, albums, tracks), each with its own dataType, title, url, and image. |
| artist.name | Name of the album or track’s artist. |
| artist.url | Canonical URL of the artist profile. |
| tags[] | List of tags describing genres, locations, or themes for an album. |
| tags[].title | Display label of the tag (e.g., metal, rock, Los Angeles). |
| tags[].url | URL link to the corresponding tag page on Bandcamp. |
| tracklist[] | Collection of tracks belonging to an album. |
| tracklist[].title | Title of the track in the album. |
| tracklist[].url | URL to the track’s page. |
| tracklist[].position | Numeric position of the track within the album. |
| tracklist[].duration | Duration of the track in mm:ss format. |
| album.title | Title of the album when scraping a track entity. |
| album.url | URL of the album containing the track. |
| duration | Duration of the track (when scraping a track entity). |
| position | Track number within the album (when scraping a track entity). |
| images[] | List of image variants associated with an album (e.g., different sizes). |
| images[].url | URL of a specific album cover variant. |
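
The example output in this README shows search-result URLs carrying `from` and `search_item_id` query parameters. If you deduplicate records downstream, one option is to strip those before comparing URLs. The sketch below is an illustration only: `stripTrackingParams` is a hypothetical helper, and treating these two parameters as safe to drop is an assumption rather than documented crawler behavior.

```javascript
// Hypothetical downstream helper; not part of the crawler itself.
// Assumption: these query parameters only record where a link was clicked.
const TRACKING_PARAMS = ["from", "search_item_id"];

function stripTrackingParams(rawUrl) {
  const url = new URL(rawUrl); // throws on malformed input
  for (const name of TRACKING_PARAMS) {
    url.searchParams.delete(name);
  }
  return url.toString();
}

const cleaned = stripTrackingParams(
  "https://fivefingerdeathpunch.bandcamp.com/album/afterlife?from=search&search_item_id=730568840"
);
console.log(cleaned);
```
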
Example:
```json
[
  {
    "dataType": "search",
    "pagination": {
      "page": 1,
      "pages": 4,
      "urls": {
        "first": "https://bandcamp.com/search?q=five+finger+death+punch&page=1",
        "last": "https://bandcamp.com/search?page=4&q=five%20finger%20death%20punch",
        "next": "https://bandcamp.com/search?page=2&q=five%20finger%20death%20punch"
      }
    },
    "results": [
      {
        "dataType": "artist",
        "title": "Five Finger Death Punch",
        "url": "https://fivefingerdeathpunch.bandcamp.com?from=search&search_item_id=3335222211",
        "image": {
          "url": "https://f4.bcbits.com/img/0027318719_23.jpg"
        }
      },
      {
        "dataType": "album",
        "title": "AfterLife",
        "url": "https://fivefingerdeathpunch.bandcamp.com/album/afterlife?from=search&search_item_id=730568840",
        "image": {
          "url": "https://f4.bcbits.com/img/a3711245885_7.jpg"
        }
      }
    ]
  },
  {
    "dataType": "album",
    "title": "N.A.T.I.O.N.",
    "url": "https://badwolves.bandcamp.com/album/n-a-t-i-o-n",
    "artist": {
      "name": "Bad Wolves",
      "url": "https://badwolves.bandcamp.com"
    },
    "tags": [
      { "title": "metal", "url": "https://bandcamp.com/tag/metal?from=tralbum&artist=924521020" },
      { "title": "rock", "url": "https://bandcamp.com/tag/rock?from=tralbum&artist=924521020" },
      { "title": "Los Angeles", "url": "https://bandcamp.com/tag/los-angeles?from=tralbum&artist=924521020" }
    ],
    "tracklist": [
      {
        "title": "I'll Be There",
        "url": "https://badwolves.bandcamp.com/track/ill-be-there-1",
        "position": 1,
        "duration": "04:02"
      },
      {
        "title": "No Messiah",
        "url": "https://badwolves.bandcamp.com/track/no-messiah",
        "position": 2,
        "duration": "04:20"
      }
    ],
    "images": [
      { "url": "https://f4.bcbits.com/img/a0888598634_16.jpg" },
      { "url": "https://f4.bcbits.com/img/a0888598634_10.jpg" }
    ]
  },
  {
    "dataType": "track",
    "title": "In The Dark",
    "url": "https://inflamesofficial.bandcamp.com/track/in-the-dark",
    "album": {
      "title": "Foregone",
      "url": "https://inflamesofficial.bandcamp.com/album/foregone"
    },
    "artist": {
      "name": "In Flames",
      "url": "https://inflamesofficial.bandcamp.com"
    },
    "duration": "04:17",
    "position": 9
  }
]
```
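
Because durations arrive as `mm:ss` strings, they need converting before you can sum or compare them. Here is a minimal sketch of that conversion, using the two tracks from the album record above. `durationToSeconds` and `albumLengthSeconds` are hypothetical helper names, and the code assumes every duration is strictly `mm:ss`; it throws on anything else rather than guessing.

```javascript
// Convert an "mm:ss" string into a number of seconds.
function durationToSeconds(duration) {
  const match = /^(\d+):([0-5]\d)$/.exec(duration);
  if (!match) {
    throw new Error(`Unexpected duration format: ${duration}`);
  }
  return Number(match[1]) * 60 + Number(match[2]);
}

// Sum the durations of every track listed on an album record.
function albumLengthSeconds(album) {
  return (album.tracklist || []).reduce(
    (total, track) => total + durationToSeconds(track.duration),
    0
  );
}

const album = {
  dataType: "album",
  title: "N.A.T.I.O.N.",
  tracklist: [
    { title: "I'll Be There", position: 1, duration: "04:02" },
    { title: "No Messiah", position: 2, duration: "04:20" },
  ],
};

console.log(albumLengthSeconds(album)); // 502
```
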

Project structure:

```text
Bandcamp Crawler/
├── src/
│   ├── index.js
│   ├── cli.js
│   ├── config/
│   │   ├── defaults.js
│   │   └── schema.json
│   ├── crawlers/
│   │   ├── searchCrawler.js
│   │   ├── artistCrawler.js
│   │   ├── albumCrawler.js
│   │   └── trackCrawler.js
│   ├── parsers/
│   │   ├── searchParser.js
│   │   ├── albumParser.js
│   │   └── trackParser.js
│   ├── services/
│   │   ├── httpClient.js
│   │   ├── proxyManager.js
│   │   └── logger.js
│   └── utils/
│       ├── htmlHelpers.js
│       ├── urlNormalizer.js
│       └── pagination.js
├── config/
│   ├── example.input.json
│   └── proxy.example.json
├── data/
│   ├── samples/
│   │   ├── search-sample.json
│   │   ├── album-sample.json
│   │   └── track-sample.json
│   └── exports/
│       └── README.md
├── tests/
│   ├── searchCrawler.test.js
│   ├── albumParser.test.js
│   └── trackParser.test.js
├── package.json
├── package-lock.json
├── README.md
└── LICENSE
```

- Music data analysts use it to collect large-scale album, track, and artist metadata, so they can build dashboards and run catalog analytics without manual data entry.
- Indie labels and managers use it to monitor their artists’ discographies and tags, so they can track genre positioning, discoverability, and catalog completeness.
- Playlist and recommendation app developers use it to ingest structured Bandcamp metadata, so they can power search, filter, and recommendation features in their apps.
- Market researchers use it to study genre trends, location tags, and release patterns, so they can identify emerging scenes and niches in the Bandcamp ecosystem.
- Archivists and collectors use it to build personal or institutional catalogs of albums and tracks, so they can maintain curated offline or mirrored datasets for long-term reference.

Q: What kinds of URLs can I start from?
A: You can start from search pages, artist music pages, album pages, track pages, and discovery pages. The crawler automatically detects what type of page it is and structures the output accordingly.
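
To make the idea of page-type detection concrete, here is one way a URL could be classified from its host and path. This is a sketch under assumptions, not the crawler's real logic: `detectPageType` is a hypothetical function, the `/discover` path is assumed, and artists on custom domains are not handled.

```javascript
// Hypothetical classifier; the crawler's actual detection may differ.
function detectPageType(rawUrl) {
  const { hostname, pathname } = new URL(rawUrl);
  const isMainSite =
    hostname === "bandcamp.com" || hostname === "www.bandcamp.com";

  if (isMainSite && pathname.startsWith("/search")) return "search";
  if (isMainSite && pathname.startsWith("/discover")) return "discover";
  if (pathname.startsWith("/album/")) return "album";
  if (pathname.startsWith("/track/")) return "track";

  // On an artist subdomain, the root or /music page is treated as the artist page.
  if (!isMainSite && (pathname === "/" || pathname === "/music")) {
    return "artist";
  }
  return "unknown";
}

console.log(detectPageType("https://badwolves.bandcamp.com/album/n-a-t-i-o-n")); // "album"
console.log(detectPageType("https://badwolves.bandcamp.com")); // "artist"
```
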

Q: How do I limit how deep the crawler goes?
A: Set maxPagesToSearch to cap how many result pages are traversed, and use the boolean options (for example, whether to fetch albums from search results or tracks from album pages) to choose which relationships are followed. Together these control both pagination and relationship-following behavior.
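
As an illustration, an input object combining a page limit with follow flags might be assembled and validated like this. Only `maxPagesToSearch` is named in this README; the three boolean flag names, the default values, and the `buildInput` helper are hypothetical placeholders, so check `config/example.input.json` for the real option names.

```javascript
// Only `maxPagesToSearch` comes from the README. The boolean names and all
// default values below are hypothetical stand-ins for the documented options.
const DEFAULTS = {
  maxPagesToSearch: 1,
  followAlbumsFromSearch: false, // hypothetical name
  followTracksFromAlbums: false, // hypothetical name
  followAlbumsFromTracks: false, // hypothetical name
};

// Merge user overrides into the defaults and reject malformed values early.
function buildInput(overrides = {}) {
  const input = { ...DEFAULTS, ...overrides };

  if (!Number.isInteger(input.maxPagesToSearch) || input.maxPagesToSearch < 1) {
    throw new RangeError("maxPagesToSearch must be a positive integer");
  }
  for (const key of Object.keys(DEFAULTS)) {
    if (key !== "maxPagesToSearch" && typeof input[key] !== "boolean") {
      throw new TypeError(`${key} must be true or false`);
    }
  }
  return input;
}

const input = buildInput({ maxPagesToSearch: 3, followAlbumsFromSearch: true });
console.log(JSON.stringify(input, null, 2));
```

Validating up front keeps a typo such as a string `"3"` from silently changing how far a run goes.
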

Q: Can I customize networking and proxy settings?
A: Yes. The crawler accepts a proxy configuration object where you can enable or disable proxy usage and plug in your own proxy endpoints, giving you flexibility in how requests are routed.

Q: In what formats can I export the data?
A: Data is produced as structured JSON records which you can easily transform into CSV, Excel, or load into your own databases, warehouses, or BI tools using standard conversion utilities or custom scripts.
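
A custom conversion script can be quite small. The sketch below flattens album records into one CSV row per track, using only fields documented in this README. `csvCell` and `albumsToTrackCsv` are hypothetical helpers written for this example, and the column choice is arbitrary.

```javascript
// Quote a value for CSV: wrap it in double quotes when it contains a comma,
// quote, or line break, and double any embedded quotes.
function csvCell(value) {
  const text = value === undefined || value === null ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Produce one CSV row per track of every album record; other records are skipped.
function albumsToTrackCsv(records) {
  const rows = [["album", "artist", "position", "track", "duration", "url"]];
  for (const record of records) {
    if (record.dataType !== "album") continue;
    for (const track of record.tracklist || []) {
      rows.push([
        record.title,
        record.artist ? record.artist.name : "",
        track.position,
        track.title,
        track.duration,
        track.url,
      ]);
    }
  }
  return rows.map((row) => row.map(csvCell).join(",")).join("\n");
}

const records = [
  {
    dataType: "album",
    title: "N.A.T.I.O.N.",
    artist: { name: "Bad Wolves" },
    tracklist: [
      {
        title: "I'll Be There",
        url: "https://badwolves.bandcamp.com/track/ill-be-there-1",
        position: 1,
        duration: "04:02",
      },
    ],
  },
];

console.log(albumsToTrackCsv(records));
```
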

- Primary Metric: In typical usage, the crawler processes around 80–120 pages per minute when fetching search and discovery results with moderate pagination, while still collecting associated album and track metadata.
- Reliability Metric: With sensible rate limits and optional proxy usage, it commonly achieves a 95%+ successful request rate across long-running sessions spanning hundreds of pages.
- Efficiency Metric: On a mid-range machine, a full run that includes search, album, and track traversal remains memory-efficient, routinely handling thousands of entities without exceeding a few hundred megabytes of RAM.
- Quality Metric: Field completeness for core attributes (title, URL, artist, basic tags, track positions, and durations) typically exceeds 98%, ensuring the resulting dataset is robust enough for analytics, cataloging, and integration into downstream systems.
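
If you want to check field completeness on your own exports, a rough measurement could look like the sketch below. The README does not say how its figure was computed, so this is one plausible definition (the share of records with a non-empty value per field). `fieldCompleteness` is a hypothetical helper, and the second sample record is invented purely to show a gap.

```javascript
// For each dotted field path, compute the fraction of records with a non-empty value.
function fieldCompleteness(records, fields) {
  const result = {};
  for (const field of fields) {
    const filled = records.filter((record) => {
      // Walk the dotted path, stopping safely at any missing level.
      const value = field
        .split(".")
        .reduce((current, key) => (current == null ? undefined : current[key]), record);
      return value !== undefined && value !== null && value !== "";
    }).length;
    result[field] = records.length === 0 ? 0 : filled / records.length;
  }
  return result;
}

const sample = [
  {
    title: "In The Dark",
    url: "https://inflamesofficial.bandcamp.com/track/in-the-dark",
    artist: { name: "In Flames" },
    duration: "04:17",
  },
  // Invented record with missing artist name and duration, for illustration only.
  { title: "Untitled", url: "https://example.bandcamp.com/track/untitled", artist: {} },
];

const report = fieldCompleteness(sample, ["title", "url", "artist.name", "duration"]);
console.log(report); // title and url are 1; artist.name and duration are 0.5
```
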
