Bandcamp Crawler

Bandcamp Crawler lets you explore, analyze, and export rich metadata from Bandcamp pages, including artists, albums, tracks, and search results. It turns the public catalog into structured data you can pipe into dashboards, research workflows, or music-discovery tools. It is designed for music analysts, indie label teams, and developers building features around Bandcamp content.

Created by Bitbash, built to showcase our approach to scraping and automation.
If you are looking for bandcamp-crawler, you've just found your team.

Introduction

Bandcamp Crawler is a command-line and scriptable tool that navigates Bandcamp pages and converts them into structured JSON records. It supports search pages, artist discographies, albums, and individual tracks, making it easy to analyze discographies, track performance, and catalog metadata at scale.

It is ideal for:

Music data scientists and analysts who want high-quality catalog data.
Indie labels and artist managers tracking releases and tags across catalogs.
Developers integrating Bandcamp metadata into apps, dashboards, or recommendation engines.

Bandcamp Catalog Intelligence

Supports multiple entry points including search, artist music pages, album pages, track pages, and discovery feeds.
Traverses pagination on search and discovery views to capture more results with configurable limits.
Extracts detailed album metadata including tags, tracklists, artwork, and artist information.
Captures track-level attributes such as duration, position, album links, and artist details.
Provides flexible input flags to control whether albums, tracks, or artists are followed and stored as individual records.
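
The pagination behavior described above can be sketched as a loop that follows `pagination.urls.next` until a page limit is reached. This is only an illustration, not the crawler's actual code: `collectSearchPages` and the injected `fetchPage` callback are hypothetical names, and the sketch assumes each fetched page has the `pagination` shape documented in the field table.

```javascript
// Hypothetical helper: walk search pages by following pagination.urls.next.
// fetchPage is an assumed async callback that takes a URL and resolves to a
// parsed search record; it stands in for whatever the crawler does internally.
async function collectSearchPages(firstUrl, maxPages, fetchPage) {
  const pages = [];
  let nextUrl = firstUrl;

  // Stop when the limit is reached or when no further page is advertised.
  while (nextUrl && pages.length < maxPages) {
    const page = await fetchPage(nextUrl);
    pages.push(page);

    const urls = page.pagination && page.pagination.urls;
    nextUrl = urls && urls.next ? urls.next : null;
  }

  return pages;
}
```

Passing the fetcher in as a callback keeps the limit logic testable without any network access.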

Features

Feature	Description
Multi-entry crawling	Start from search, artist, album, track, or discover pages and let the crawler resolve all supported entities.
Discography extraction	Collect complete artist discographies including albums, tags, and associated metadata.
Track-level insights	Extract track titles, positions, durations, album references, and artist information.
Configurable depth	Use boolean flags to decide whether to follow albums from search, tracks from albums, or albums from tracks.
Pagination control	Limit how many search or discover pages are traversed with a simple numeric setting.
Proxy-ready networking	Plug in your own proxy configuration for safer, more reliable large-scale runs.
Verbose debugging mode	Enable debug logging to inspect crawling flow, parsed entities, and edge cases.
Export-friendly output	Save results as structured JSON that can be converted to CSV, Excel, or imported into your own database or analytics stack.

What Data This Scraper Extracts

Field Name	Field Description
dataType	Type of record scraped (e.g., `search`, `album`, `track`, `artist`).
title	Human-readable title of the entity (album title, track title, artist name, etc.).
url	Canonical URL of the scraped entity on Bandcamp.
image.url	URL of the primary artwork or thumbnail image associated with the entity.
pagination.page	Current page number in a search or discovery result set.
pagination.pages	Total number of available pages for the search query.
pagination.urls.first	URL of the first page in the search result set.
pagination.urls.last	URL of the last page in the search result set.
pagination.urls.next	URL of the next page, if another page exists.
results[]	Array of search results (artists, albums, tracks), each with its own `dataType`, `title`, `url`, and `image`.
artist.name	Name of the album or track’s artist.
artist.url	Canonical URL of the artist profile.
tags[]	List of tags describing genres, locations, or themes for an album.
tags[].title	Display label of the tag (e.g., `metal`, `rock`, `Los Angeles`).
tags[].url	URL link to the corresponding tag page on Bandcamp.
tracklist[]	Collection of tracks belonging to an album.
tracklist[].title	Title of the track in the album.
tracklist[].url	URL to the track’s page.
tracklist[].position	Numeric position of the track within the album.
tracklist[].duration	Duration of the track in `mm:ss` format.
album.title	Title of the album when scraping a track entity.
album.url	URL of the album containing the track.
duration	Duration of the track (when scraping a track entity).
position	Track number within the album (when scraping a track entity).
images[]	List of image variants associated with an album (e.g., different sizes).
images[].url	URL of a specific album cover variant.
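
As a rough illustration of how these fields fit together, the sketch below flattens one album record into one row per track and converts the `mm:ss` duration into seconds. The helper names (`durationToSeconds`, `albumToTrackRows`) and the output column names are made up for this example and are not part of the crawler.

```javascript
// Convert an "mm:ss" duration string to a number of seconds.
// Returns null when the value is missing or not in that shape.
function durationToSeconds(duration) {
  const match = /^(\d+):([0-5]\d)$/.exec(duration ?? "");
  if (!match) return null;
  return Number(match[1]) * 60 + Number(match[2]);
}

// Turn one album record into a flat array with one row per track.
// The row keys are illustrative; rename them to suit your own schema.
function albumToTrackRows(album) {
  const tracks = Array.isArray(album.tracklist) ? album.tracklist : [];
  return tracks.map((track) => ({
    albumTitle: album.title,
    artistName: album.artist ? album.artist.name : null,
    position: track.position,
    trackTitle: track.title,
    durationSeconds: durationToSeconds(track.duration),
  }));
}
```

Flat rows like these are usually easier to load into a spreadsheet or SQL table than the nested album record.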

Example Output

Example:

[
  {
    "dataType": "search",
    "pagination": {
      "page": 1,
      "pages": 4,
      "urls": {
        "first": "https://bandcamp.com/search?q=five+finger+death+punch&page=1",
        "last": "https://bandcamp.com/search?page=4&q=five%20finger%20death%20punch",
        "next": "https://bandcamp.com/search?page=2&q=five%20finger%20death%20punch"
      }
    },
    "results": [
      {
        "dataType": "artist",
        "title": "Five Finger Death Punch",
        "url": "https://fivefingerdeathpunch.bandcamp.com?from=search&search_item_id=3335222211",
        "image": {
          "url": "https://f4.bcbits.com/img/0027318719_23.jpg"
        }
      },
      {
        "dataType": "album",
        "title": "AfterLife",
        "url": "https://fivefingerdeathpunch.bandcamp.com/album/afterlife?from=search&search_item_id=730568840",
        "image": {
          "url": "https://f4.bcbits.com/img/a3711245885_7.jpg"
        }
      }
    ]
  },
  {
    "dataType": "album",
    "title": "N.A.T.I.O.N.",
    "url": "https://badwolves.bandcamp.com/album/n-a-t-i-o-n",
    "artist": {
      "name": "Bad Wolves",
      "url": "https://badwolves.bandcamp.com"
    },
    "tags": [
      { "title": "metal", "url": "https://bandcamp.com/tag/metal?from=tralbum&artist=924521020" },
      { "title": "rock", "url": "https://bandcamp.com/tag/rock?from=tralbum&artist=924521020" },
      { "title": "Los Angeles", "url": "https://bandcamp.com/tag/los-angeles?from=tralbum&artist=924521020" }
    ],
    "tracklist": [
      {
        "title": "I'll Be There",
        "url": "https://badwolves.bandcamp.com/track/ill-be-there-1",
        "position": 1,
        "duration": "04:02"
      },
      {
        "title": "No Messiah",
        "url": "https://badwolves.bandcamp.com/track/no-messiah",
        "position": 2,
        "duration": "04:20"
      }
    ],
    "images": [
      { "url": "https://f4.bcbits.com/img/a0888598634_16.jpg" },
      { "url": "https://f4.bcbits.com/img/a0888598634_10.jpg" }
    ]
  },
  {
    "dataType": "track",
    "title": "In The Dark",
    "url": "https://inflamesofficial.bandcamp.com/track/in-the-dark",
    "album": {
      "title": "Foregone",
      "url": "https://inflamesofficial.bandcamp.com/album/foregone"
    },
    "artist": {
      "name": "In Flames",
      "url": "https://inflamesofficial.bandcamp.com"
    },
    "duration": "04:17",
    "position": 9
  }
]
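
In the sample above, the search result URLs carry query parameters such as `from=search` and `search_item_id`, which look like referral parameters rather than part of the entity's address. If you want to deduplicate or join on URLs, one option is to strip them after the fact. The sketch below is an assumption-laden example, not the crawler's own normalizer: `stripTrackingParams` and `cleanSearchRecord` are hypothetical names, and it relies only on the standard `URL` class.

```javascript
// Remove the query string and fragment from a result URL.
function stripTrackingParams(rawUrl) {
  const url = new URL(rawUrl);
  url.search = "";
  url.hash = "";
  return url.toString();
}

// Return a copy of a search record whose result URLs are cleaned.
// Pagination URLs are left untouched because their query string carries
// the search term and page number.
function cleanSearchRecord(searchRecord) {
  const results = (searchRecord.results || []).map((result) => ({
    ...result,
    url: stripTrackingParams(result.url),
  }));
  return { ...searchRecord, results };
}
```

Note that the `URL` class serializes a bare host with a trailing slash, so an artist URL without a path comes back as `https://<artist>.bandcamp.com/`.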

Directory Structure Tree

Bandcamp Crawler/
├── src/
│   ├── index.js
│   ├── cli.js
│   ├── config/
│   │   ├── defaults.js
│   │   └── schema.json
│   ├── crawlers/
│   │   ├── searchCrawler.js
│   │   ├── artistCrawler.js
│   │   ├── albumCrawler.js
│   │   └── trackCrawler.js
│   ├── parsers/
│   │   ├── searchParser.js
│   │   ├── albumParser.js
│   │   └── trackParser.js
│   ├── services/
│   │   ├── httpClient.js
│   │   ├── proxyManager.js
│   │   └── logger.js
│   └── utils/
│       ├── htmlHelpers.js
│       ├── urlNormalizer.js
│       └── pagination.js
├── config/
│   ├── example.input.json
│   └── proxy.example.json
├── data/
│   ├── samples/
│   │   ├── search-sample.json
│   │   ├── album-sample.json
│   │   └── track-sample.json
│   └── exports/
│       └── README.md
├── tests/
│   ├── searchCrawler.test.js
│   ├── albumParser.test.js
│   └── trackParser.test.js
├── package.json
├── package-lock.json
├── README.md
└── LICENSE

Use Cases

Music data analysts use it to collect large-scale album, track, and artist metadata, so they can build dashboards and run catalog analytics without manual data entry.
Indie labels and managers use it to monitor their artists’ discographies and tags, so they can track genre positioning, discoverability, and catalog completeness.
Playlist and recommendation app developers use it to ingest structured Bandcamp metadata, so they can power search, filter, and recommendation features in their apps.
Market researchers use it to study genre trends, location tags, and release patterns, so they can identify emerging scenes and niches in the Bandcamp ecosystem.
Archivists and collectors use it to build personal or institutional catalogs of albums and tracks, so they can maintain curated offline or mirrored datasets for long-term reference.

FAQs

Q: What kinds of URLs can I start from? A: You can start from search pages, artist music pages, album pages, track pages, and discovery pages. The crawler automatically detects what type of page it is and structures the output accordingly.
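
To give a feel for what page-type detection could look like, here is a minimal sketch that classifies a URL by its path. It is not the crawler's actual logic. The `/album/`, `/track/`, and `/search` patterns match the URLs in the example output; the `/music` and `/discover` paths and the function name `detectPageType` are assumptions made for illustration. Artists on custom domains are reported as `unknown`.

```javascript
// Hypothetical classifier: guess the Bandcamp page type from a URL.
function detectPageType(rawUrl) {
  const { hostname, pathname } = new URL(rawUrl);

  // Only handle bandcamp.com and its subdomains in this sketch.
  if (!/(^|\.)bandcamp\.com$/.test(hostname)) return "unknown";

  if (pathname.startsWith("/album/")) return "album";
  if (pathname.startsWith("/track/")) return "track";

  const isRootSite = hostname === "bandcamp.com" || hostname === "www.bandcamp.com";
  if (isRootSite) {
    if (pathname.startsWith("/search")) return "search";
    if (pathname.startsWith("/discover")) return "discover"; // assumed path
    return "unknown";
  }

  // An artist subdomain with no path, or its /music page (assumed path).
  if (pathname === "/" || pathname === "/music") return "artist";
  return "unknown";
}
```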

Q: How do I limit how deep the crawler goes? A: Set maxPagesToSearch to cap how many search or discover pages are traversed, and use the boolean options (for example, fetching albums from search results or tracks from album pages) to choose which relationships are followed. Together these control both pagination and relationship-following behavior.
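
A depth configuration along these lines might be merged with defaults and validated before a run. In the sketch below, only `maxPagesToSearch` comes from this README; the three boolean key names, the default values, and the `resolveInput` helper are hypothetical placeholders, so check `config/example.input.json` for the real option names.

```javascript
// Assumed defaults. Only maxPagesToSearch is a documented option name;
// the boolean keys below are illustrative stand-ins.
const DEFAULTS = {
  maxPagesToSearch: 1,
  fetchAlbumsFromSearch: false, // hypothetical name
  fetchTracksFromAlbums: false, // hypothetical name
  fetchAlbumsFromTracks: false, // hypothetical name
};

// Merge user input over the defaults and reject obviously invalid values.
function resolveInput(userInput = {}) {
  const input = { ...DEFAULTS, ...userInput };

  if (!Number.isInteger(input.maxPagesToSearch) || input.maxPagesToSearch < 1) {
    throw new RangeError("maxPagesToSearch must be a positive integer");
  }
  for (const key of Object.keys(DEFAULTS)) {
    if (typeof DEFAULTS[key] === "boolean" && typeof input[key] !== "boolean") {
      throw new TypeError(`${key} must be true or false`);
    }
  }
  return input;
}
```

For example, `resolveInput({ maxPagesToSearch: 3, fetchTracksFromAlbums: true })` keeps the other flags at their defaults.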

Q: Can I customize networking and proxy settings? A: Yes. The crawler accepts a proxy configuration object where you can enable or disable proxy usage and plug in your own proxy endpoints, giving you flexibility in how requests are routed.

Q: In what formats can I export the data? A: Data is produced as structured JSON records which you can easily transform into CSV, Excel, or load into your own databases, warehouses, or BI tools using standard conversion utilities or custom scripts.
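
As one example of such a custom script, the sketch below turns an array of flat objects into CSV text with quoting in the style of RFC 4180. The function names `csvEscape` and `toCsv` are invented for this example; nested fields such as `artist.name` would need to be flattened first.

```javascript
// Quote a value when it contains a comma, double quote, or line break.
function csvEscape(value) {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build CSV text from flat row objects using an explicit column order.
function toCsv(rows, columns) {
  const header = columns.map(csvEscape).join(",");
  const lines = rows.map((row) =>
    columns.map((column) => csvEscape(row[column])).join(",")
  );
  return [header, ...lines].join("\r\n");
}
```

Passing the column list explicitly keeps the output stable even when some records lack optional fields.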

Performance Benchmarks and Results

Primary Metric: In typical usage, the crawler processes around 80–120 pages per minute when fetching search and discovery results with moderate pagination, while still collecting associated album and track metadata.

Reliability Metric: With sensible rate limits and optional proxy usage, it commonly achieves a 95%+ successful request rate across long-running sessions spanning hundreds of pages.

Efficiency Metric: On a mid-range machine, a full run that includes search, album, and track traversal remains memory-efficient, routinely handling thousands of entities without exceeding a few hundred megabytes of RAM.

Quality Metric: Field completeness for core attributes (title, URL, artist, basic tags, track positions, and durations) typically exceeds 98%, ensuring the resulting dataset is robust enough for analytics, cataloging, and integration into downstream systems.

"Bitbash is a top-tier automation partner, innovative, reliable, and dedicated to delivering real results every time."

Nathan Pennington
Marketer
★★★★★

"Bitbash delivers outstanding quality, speed, and professionalism, truly a team you can rely on."

Eliza
SEO Affiliate Expert
★★★★★

"Exceptional results, clear communication, and flawless delivery. Bitbash nailed it."

Syed
Digital Strategist
★★★★★
