Spotify Albums Scraper

Scrape Spotify albums by keywords and collect clean, structured album metadata you can use in catalogs, dashboards, or research workflows. It captures essential Spotify album details like artists, cover art, release dates, and playability so you can build reliable music datasets fast.

Created by Bitbash, built to showcase our approach to Scraping and Automation!

Introduction

This project searches Spotify albums by keyword and extracts a consistent set of album-level details into a structured dataset. It removes the need to browse search pages manually when collecting album metadata at scale, and it is built for developers, analysts, and teams that need repeatable, keyword-based Spotify album scraping for data pipelines, content ops, and reporting.

Keyword-Based Album Discovery

Searches album results for one or more keywords and iterates through result pages automatically
Captures album entities with artist details, artwork, and release information in a normalized format
Uses browser automation with request interception to collect data efficiently and consistently
Includes randomized browser signals (user-agent + stealth) to reduce interruptions during long runs
Streams results incrementally so partial runs still produce usable output

Features

Feature	Description
Keyword album search	Scrape album results for one or more keywords with a simple input list.
Structured album metadata	Collect album identifiers, names, artists, images, release dates, and more in a consistent schema.
Request interception extraction	Reads data directly from relevant responses for more stable parsing than DOM-only approaches.
Pagination via offsets	Continues fetching additional pages until the end of results or `maxItems` is reached.
Incremental dataset pushing	Pushes batches as they are found so you don’t lose progress on partial runs.
Stealth + randomized user-agent	Mimics typical browser traits to reduce blocking and improve session stability.
Tunable timeouts	Longer navigation and handler timeouts for slow networks and heavy pages.
Multi-keyword runs	Processes multiple keywords in a single run and tags each result with its source keyword.
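
The offset pagination and incremental pushing described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `collectAlbums`, `fetchPage`, `pushBatch`, and `pageSize` are hypothetical names, and the real scraper reads pages from intercepted browser responses rather than from a function you pass in.

```javascript
// Hypothetical sketch: fetchPage and pushBatch are stand-ins supplied by the caller.
async function collectAlbums(keyword, { maxItems = Infinity, pageSize = 50, fetchPage, pushBatch }) {
  let offset = 0;
  let collected = 0;
  while (collected < maxItems) {
    const items = await fetchPage(keyword, offset, pageSize);
    if (items.length === 0) break; // end of results for this keyword
    // Trim the last page so the total never exceeds maxItems, and tag each album.
    const batch = items.slice(0, maxItems - collected).map((album) => ({ ...album, keyword }));
    await pushBatch(batch); // push immediately so a partial run keeps its progress
    collected += batch.length;
    offset += items.length;
  }
  return collected;
}

// Usage with in-memory fake data (no network involved).
const fakeAlbums = Array.from({ length: 120 }, (_, i) => ({ uri: `spotify:album:demo${i}` }));
const pushed = [];
collectAlbums("fine line", {
  maxItems: 70,
  fetchPage: async (_keyword, offset, limit) => fakeAlbums.slice(offset, offset + limit),
  pushBatch: async (batch) => { pushed.push(...batch); },
}).then((count) => console.log(`collected ${count}, pushed ${pushed.length}`));
```

Pushing each batch inside the loop, rather than once at the end, is what lets an interrupted run still leave usable output behind.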

What Data This Scraper Extracts

Field Name	Field Description
uri	Spotify album URI identifier for the album entity.
name	Album title as shown in search results.
albumUrl	Direct URL to the album page built from the album URI.
artists	List of contributing artists (names, URIs, and related artist identifiers when available).
images	Cover art image set (commonly multiple sizes) for thumbnails and previews.
releaseDate	Album release date (when available) for timeline and freshness analysis.
releaseDatePrecision	Precision of the release date (day/month/year) when provided.
playability	Whether the album is playable in the current context/region/session.
label	Album label/publisher when included in the data payload.
totalTracks	Total number of tracks for the album (when available).
keyword	The originating keyword used to discover this album result.
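
As an illustration of how albumUrl can be derived from uri, here is a small sketch. The function name `albumUrlFromUri` is hypothetical and the project may build the URL differently; the sketch assumes the standard `spotify:album:<id>` URI form and the public `https://open.spotify.com/album/<id>` page URL.

```javascript
// Hypothetical helper: converts a Spotify album URI into its web page URL.
// Returns null for anything that is not an album URI.
function albumUrlFromUri(uri) {
  const match = /^spotify:album:([A-Za-z0-9]+)$/.exec(uri ?? "");
  if (!match) return null;
  return `https://open.spotify.com/album/${match[1]}`;
}

console.log(albumUrlFromUri("spotify:album:1ATL5GLyefJaxhQzSPVrLX"));
console.log(albumUrlFromUri("spotify:artist:6KImCVD70vtIoJWnq6nGn3")); // null: not an album
```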

Example Output

[
      {
            "uri": "spotify:album:1ATL5GLyefJaxhQzSPVrLX",
            "name": "Fine Line",
            "albumUrl": "https://open.spotify.com/album/1ATL5GLyefJaxhQzSPVrLX",
            "artists": [
                  {
                        "name": "Harry Styles",
                        "uri": "spotify:artist:6KImCVD70vtIoJWnq6nGn3"
                  }
            ],
            "images": [
                  {
                        "url": "https://i.scdn.co/image/ab67616d0000b273....",
                        "width": 640,
                        "height": 640
                  },
                  {
                        "url": "https://i.scdn.co/image/ab67616d00001e02....",
                        "width": 300,
                        "height": 300
                  }
            ],
            "releaseDate": "2019-12-13",
            "releaseDatePrecision": "day",
            "totalTracks": 12,
            "playability": {
                  "playable": true
            },
            "keyword": "fine line"
      }
]
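
Records shaped like the one above are straightforward to post-process. The sketch below shows two common steps: picking the largest cover image and reading the release year. Both helper names are hypothetical, the image URLs are placeholders, and the year parser assumes releaseDate always begins with a four-digit year regardless of releaseDatePrecision.

```javascript
// Pick the widest cover image; returns null when there are no images.
function largestImage(images) {
  let best = null;
  for (const image of images ?? []) {
    if (best === null || (image.width ?? 0) > (best.width ?? 0)) best = image;
  }
  return best;
}

// Read the release year; returns null when releaseDate is missing or malformed.
function releaseYear(album) {
  const match = /^(\d{4})/.exec(album.releaseDate ?? "");
  return match ? Number(match[1]) : null;
}

// Usage with a trimmed-down record (placeholder image URLs).
const album = {
  releaseDate: "2019-12-13",
  releaseDatePrecision: "day",
  images: [
    { url: "https://example.com/cover-300.jpg", width: 300, height: 300 },
    { url: "https://example.com/cover-640.jpg", width: 640, height: 640 },
  ],
};
console.log(largestImage(album.images).url, releaseYear(album));
```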

Directory Structure Tree

Spotify Albums Scraper/
├── src/
│   ├── main.js
│   ├── scraper/
│   │   ├── SpotifyAlbumsScraper.js
│   │   ├── interceptors.js
│   │   ├── processors.js
│   │   └── cookies.js
│   ├── utils/
│   │   ├── delays.js
│   │   ├── logger.js
│   │   └── validators.js
│   └── config/
│       ├── defaults.json
│       └── selectors.json
├── input/
│   ├── schema.json
│   └── example.input.json
├── test/
│   ├── fixtures/
│   │   └── searchAlbums.response.sample.json
│   └── unit/
│       └── processors.test.js
├── .env.example
├── .gitignore
├── package.json
├── package-lock.json
└── README.md

Use Cases

Music marketers use it to collect album metadata for keyword themes, so they can plan campaigns around releases and catalog clusters.
Data analysts use it to build searchable album datasets, so they can track release patterns and compare artist output over time.
Playlist curators use it to discover albums by genre/keyword terms, so they can refresh collections with consistent metadata.
Developers use it to feed album data into apps and dashboards, so they can power browsing, recommendations, and catalog pages.
Researchers use it to compile structured music metadata, so they can run studies without manual data entry.

FAQs

1) What inputs does this project expect? Provide a keywords array (one or more search terms). Optionally set maxItems to cap the number of albums collected per run. Results are tagged with the originating keyword for easy grouping.
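
A sketch of how such an input could be validated before a run. `normalizeInput` is a hypothetical helper, not part of the project; trimming and de-duplicating keywords, and treating a missing maxItems as "no cap", are assumptions made for this example.

```javascript
// Hypothetical validator for an input object like { keywords: [...], maxItems: 100 }.
function normalizeInput(input) {
  const raw = Array.isArray(input?.keywords) ? input.keywords : [];
  // Assumption: blank entries are dropped and repeated keywords are collapsed.
  const keywords = [...new Set(raw.map((k) => String(k).trim()).filter(Boolean))];
  if (keywords.length === 0) {
    throw new Error("`keywords` must contain at least one search term.");
  }
  // Assumption: an omitted maxItems means no cap.
  const maxItems = input.maxItems == null ? Infinity : Number(input.maxItems);
  const valid = maxItems === Infinity || (Number.isInteger(maxItems) && maxItems > 0);
  if (!valid) {
    throw new Error("`maxItems` must be a positive integer when provided.");
  }
  return { keywords, maxItems };
}

console.log(normalizeInput({ keywords: ["fine line", " jazz piano "], maxItems: 100 }));
```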

2) Why do results sometimes stop before reaching maxItems? If the search reaches the end of available album results for a keyword (or no new items appear after multiple polling cycles), the run will conclude for that keyword. This prevents infinite loops when the search page has no more data.
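
The "no new items after multiple polling cycles" rule can be pictured with a small detector like the one below. It is a sketch only: `createStallDetector` and the default of three idle cycles are assumptions for illustration, and the project's actual threshold is not stated here.

```javascript
// Hypothetical stall detector: tracks seen album URIs and counts consecutive
// polling cycles that produced nothing new.
function createStallDetector(maxIdleCycles = 3) {
  const seen = new Set();
  let idleCycles = 0;
  return {
    // Returns only albums not seen before and updates the idle counter.
    accept(items) {
      const fresh = [];
      for (const album of items) {
        if (!seen.has(album.uri)) {
          seen.add(album.uri);
          fresh.push(album);
        }
      }
      idleCycles = fresh.length === 0 ? idleCycles + 1 : 0;
      return fresh;
    },
    shouldStop() {
      return idleCycles >= maxIdleCycles;
    },
  };
}

// Usage: two cycles that only repeat old results trigger the stop condition.
const detector = createStallDetector(2);
detector.accept([{ uri: "spotify:album:demoA" }]);
detector.accept([{ uri: "spotify:album:demoA" }]);
detector.accept([]);
console.log(detector.shouldStop());
```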

3) Can I run multiple keywords in one job? Yes. The runner processes keywords sequentially and collects results per keyword. This is useful when building a multi-topic dataset in a single execution.
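
Sequential processing of this kind might look like the following sketch. `runKeywords` and the `scrapeKeyword` callback are hypothetical names; the callback stands in for whatever the project does to scrape a single keyword.

```javascript
// Hypothetical runner: scrapes keywords one at a time and groups results per keyword.
async function runKeywords(keywords, scrapeKeyword) {
  const resultsByKeyword = {};
  for (const keyword of keywords) {
    // Awaiting inside the loop keeps the run strictly sequential.
    const albums = await scrapeKeyword(keyword);
    resultsByKeyword[keyword] = albums.map((album) => ({ ...album, keyword }));
  }
  return resultsByKeyword;
}

// Usage with a fake scraper that returns one album per keyword.
runKeywords(["fine line", "jazz piano"], async (keyword) => [
  { uri: `spotify:album:demo-${keyword.length}` },
]).then((results) => console.log(Object.keys(results)));
```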

4) How do I reduce timeouts or improve stability on slower networks? Increase navigation and handler timeouts, and lower concurrency if you add it later. If the environment is constrained, ensure enough memory for headless Chromium and avoid running many browser processes at once.

Performance Benchmarks and Results

Primary Metric: ~120–220 album items/minute on a typical server-grade connection when responses load consistently via intercepted search queries.

Reliability Metric: 92–97% successful keyword runs across mixed workloads (short + long keywords), with most failures attributable to temporary network stalls or search response changes.

Efficiency Metric: Steady memory footprint for long runs by pushing incremental batches and avoiding heavy DOM parsing for every item; typical headless usage remains stable when running one keyword at a time.

Quality Metric: 95%+ field completeness for core metadata (album name, URI, artists, images, keyword), with optional fields (label, totalTracks, playability details) varying by album and region context.
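
To check field completeness on your own output, one plausible definition is the share of core fields that are present and non-empty across all records. The sketch below uses that definition; it is an assumption, since the exact formula behind the figure above is not stated, and `fieldCompleteness` is a hypothetical helper.

```javascript
// Core fields named in the quality metric.
const CORE_FIELDS = ["name", "uri", "artists", "images", "keyword"];

// Fraction of (record, field) pairs that hold a non-empty value, from 0 to 1.
function fieldCompleteness(records, fields = CORE_FIELDS) {
  if (records.length === 0) return 0;
  const isFilled = (value) =>
    value != null && value !== "" && !(Array.isArray(value) && value.length === 0);
  let filled = 0;
  for (const record of records) {
    for (const field of fields) {
      if (isFilled(record[field])) filled += 1;
    }
  }
  return filled / (records.length * fields.length);
}

// Usage: the second record has no images, so 9 of 10 core values are filled.
const sample = [
  { name: "A", uri: "spotify:album:demoA", artists: [{ name: "X" }], images: [{ url: "https://example.com/a.jpg" }], keyword: "k" },
  { name: "B", uri: "spotify:album:demoB", artists: [{ name: "Y" }], images: [], keyword: "k" },
];
console.log(fieldCompleteness(sample)); // 0.9
```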

"Bitbash is a top-tier automation partner, innovative, reliable, and dedicated to delivering real results every time."

Nathan Pennington
Marketer
★★★★★

"Bitbash delivers outstanding quality, speed, and professionalism, truly a team you can rely on."

Eliza
SEO Affiliate Expert
★★★★★

"Exceptional results, clear communication, and flawless delivery. Bitbash nailed it."

Syed
Digital Strategist
★★★★★
