wkosmos/MusicalGeography

Musical Geography

Exploring Associations Between Geographic and Musical Characteristics

Contents

  1. Background

  2. Data

  3. Analysis

  4. Discussion

  5. Resources/References

Background

How does someone's life experience affect their music?

Many factors in a person's life might influence the kind of music they create, but most are subjective and hard to get data on (individual experiences and feelings, for example). This exploratory analysis instead uses a few factors that are available for nearly every artist, giving a zoomed-out view of the relationship between personal situation and musical attributes.

Data

Data sources: Spotify API (musical data), MusicBrainz API (artist data), ArcGIS Hub (shapefiles), World Bank (geographic microdata), and GitHub (country codes).

Acquisition

Spotify

The project proposal called for building a Python wrapper for Spotify's API in order to make specific, granular requests. This was later abandoned as unnecessary, because a third-party Python wrapper for the API (Spotipy) already exists.

MusicBrainz

MusicBrainz is an open database of aggregated music metadata. It was needed because Spotify doesn't store any personal information about artists. For this project, the MusicBrainz Python API wrapper MusicBrainzngs was used to search for each artist's name and write the birth country from the top result to a CSV file.
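The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the helper names are hypothetical, and it assumes the birth country is read from the `country` field of the first entry in the `artist-list` returned by `musicbrainzngs.search_artists`. The search function is passed in as a parameter so the CSV logic can be tried without network access.

```python
import csv
from typing import Callable, Iterable, Optional, TextIO


def top_result_country(search_result: dict) -> Optional[str]:
    """Return the 'country' code of the first artist in a search result, if any.

    Assumption: the result is shaped like musicbrainzngs search output,
    i.e. {"artist-list": [{"name": ..., "country": "US", ...}, ...]}.
    """
    artists = search_result.get("artist-list", [])
    if not artists:
        return None
    return artists[0].get("country")


def write_artist_countries(
    names: Iterable[str],
    search: Callable[[str], dict],
    out_file: TextIO,
) -> int:
    """Write one (artist, country) row per name; return how many had a country."""
    writer = csv.writer(out_file)
    writer.writerow(["artist", "country"])
    found = 0
    for name in names:
        country = top_result_country(search(name))
        if country:
            found += 1
        # Artists without a country get an empty cell rather than being dropped.
        writer.writerow([name, country or ""])
    return found


def musicbrainz_search(name: str) -> dict:
    """Query MusicBrainz for the single best match on an artist name."""
    import musicbrainzngs  # third-party; imported lazily so the helpers above work without it

    # musicbrainzngs.set_useragent(...) must have been called before any request.
    return musicbrainzngs.search_artists(artist=name, limit=1)
```

To run it against the live API, call `musicbrainzngs.set_useragent` with an application name and version first, then pass `musicbrainz_search` as the `search` argument along with an open output file.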

ArcGIS Hub

Originally a world countries shapefile from ArcGIS Hub was used for generating maps, but later in the project it was replaced with GeoPandas' built-in naturalearth_lowres dataset for simplicity.

Worldbank

The project proposal included plans to source geographic microdata (population, education, health indicators, income, etc.) from the World Bank, but this was removed from the scope due to time constraints.


Exploration

After the Spotify dataset was read into a pandas DataFrame, the distribution of each numerical column was plotted.

distributions of numerical columns

Note:
Popularity, acousticness, and valence seemed to have more extreme values than expected, possibly due to some sort of threshold in Spotify's calculation of these metrics.
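One way to put a number on the pile-up at the extremes noted above is to measure the share of tracks that sit near either end of a metric's scale. The sketch below is illustrative only: `edge_share` and the 5% margin are assumptions, not part of the original analysis. Acousticness and valence are on a 0.0 to 1.0 scale, while popularity runs from 0 to 100.

```python
import pandas as pd


def edge_share(values: pd.Series, low: float, high: float, width: float = 0.05) -> float:
    """Fraction of values within `width` of the scale's range from either bound."""
    margin = width * (high - low)
    near_edge = (values <= low + margin) | (values >= high - margin)
    return float(near_edge.mean())
```

For example, `edge_share(df["valence"], 0.0, 1.0)` and `edge_share(df["popularity"], 0, 100)` would give comparable figures for the two columns, assuming `df` is the Spotify DataFrame.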

Intuitiveness of Metrics

Spotify's descriptions of the subjective metrics danceability, energy, and valence are too vague to form a confident idea of what they measure, so some comparisons were necessary to gauge their intuitiveness.


Spotify API Docs Definitions:

Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity.

Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy.

Valence is a measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).


First, each subjective metric was plotted against two objective metrics: tempo and loudness.

danceability vs tempo and loudness


  • Spotify appeared to be treating the 100-150 bpm range as the most danceable, which is roughly intuitive if it's assumed that people don't want to dance either too slow or too fast.
  • High danceability appearing to coincide with high loudness fit with the typical idea of dance music.

energy vs tempo and loudness


  • Energy appeared to have a very rough positive association with tempo.
  • Energy had a strong positive association with loudness, which makes intuitive sense and suggests that loudness might have been used in the calculation of energy.

valence vs tempo and loudness


  • Valence had nearly no association with tempo or loudness other than a slightly wider range of loudness values being present at 0 and 1 valence, which was likely due to the significantly larger number of tracks with these valence values.

The three subjective metrics were also plotted against each other, and all appeared to have a rough positive association with one another.

danceability vs energy vs valence
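The visual comparisons in this section could be complemented with a correlation table. The following is a sketch under assumptions rather than the code behind the plots: it assumes the Spotify DataFrame has columns named `danceability`, `energy`, `valence`, `tempo`, and `loudness`, and it uses pandas' default Pearson correlation, which only captures linear association (the danceability peak around 100-150 bpm, for instance, would not show up well).

```python
import pandas as pd

SUBJECTIVE = ["danceability", "energy", "valence"]
OBJECTIVE = ["tempo", "loudness"]


def metric_correlations(tracks: pd.DataFrame) -> pd.DataFrame:
    """Correlate each subjective metric with every subjective and objective metric.

    Rows are the subjective metrics; columns are all five metrics.
    """
    cols = SUBJECTIVE + OBJECTIVE
    corr = tracks[cols].corr()  # Pearson by default
    return corr.loc[SUBJECTIVE, cols]
```

A value near 1 in the `energy` row under `loudness` would be consistent with the strong association seen in the scatter plot.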


Finally, five illustrative genres were chosen and the density distribution of the three subjective metrics was plotted for each. These distributions mostly aligned with expectations of the genres.

five genres danceability energy valence


Cleaning/Organization

The Spotify dataset was very clean as acquired, though only 11 of its 18 columns were used.
Of the 14565 unique artists in the dataset, 9141 (about 63%) had a birth country value found via the MusicBrainz API, so only these could be included in the later analysis of music metrics by birth country.

Country Codes

Unfortunately, the MusicBrainz database stores country codes in the ISO Alpha-2 format (2-character string), while the geopandas naturalearth_lowres dataframe contains country codes only in the ISO Alpha-3 format (3-character string).
A CSV including both ISO Alpha-2 and ISO Alpha-3 country codes was found on GitHub, and this was loaded into pandas and merged into the world dataframe.
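The code merge can be sketched as a left join on the Alpha-3 column. This is an illustration under assumptions: `iso_a3` is the column name used by naturalearth_lowres, while `alpha2`, `alpha3`, and the output column `iso_a2` are hypothetical names chosen here (the lookup CSV's actual headers may differ and would need renaming first). A plain pandas DataFrame stands in for the world GeoDataFrame.

```python
import pandas as pd


def add_alpha2(world: pd.DataFrame, codes: pd.DataFrame) -> pd.DataFrame:
    """Attach an 'iso_a2' column to `world` by matching on Alpha-3 codes.

    world: one row per country with an 'iso_a3' column.
    codes: lookup table with 'alpha2' and 'alpha3' columns (assumed names).
    """
    # One row per Alpha-3 code so the left join cannot duplicate countries.
    lookup = codes[["alpha2", "alpha3"]].drop_duplicates("alpha3")
    merged = world.merge(lookup, how="left", left_on="iso_a3", right_on="alpha3")
    return merged.drop(columns="alpha3").rename(columns={"alpha2": "iso_a2"})
```

A left join keeps every country in the world dataframe, so rows whose Alpha-3 code has no match end up with a missing `iso_a2` value instead of disappearing from the map.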

Once the country codes in the geopandas dataframe matched those sourced from MusicBrainz, columns could be added with counts and means of columns from the Spotify dataset.
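The per-country counts and means could be computed along these lines. This is a sketch, not the original code: the column names `artist` and `country` and the function name are assumptions, and the mean here is taken over tracks, so artists with more tracks weigh more heavily.

```python
import pandas as pd

METRICS = ["danceability", "energy", "valence"]


def country_summary(tracks: pd.DataFrame, artist_country: pd.DataFrame) -> pd.DataFrame:
    """Mean of each metric and number of distinct artists per birth country.

    tracks: one row per track with an 'artist' column plus the metric columns.
    artist_country: one row per artist with 'artist' and 'country' columns.
    """
    # The inner join drops tracks whose artist has no known birth country.
    joined = tracks.merge(artist_country, on="artist", how="inner")
    grouped = joined.groupby("country")
    summary = grouped[METRICS].mean()
    summary["n_artists"] = grouped["artist"].nunique()
    return summary.reset_index()
```

The resulting table has one row per country code and can then be merged into the world dataframe on the matching code column.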

Analysis

Artists per Country

To get a sense of how skewed the analysis might be, I checked how many artists in the dataset were from each country. The plot ended up being so USA-skewed that I had to use a log scale.

num artists per country
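A minimal sketch of the log scaling, assuming the per-country artist counts are held in a pandas Series indexed by country code (the function name is hypothetical). Countries with zero artists are dropped, since the logarithm of zero is undefined.

```python
import numpy as np
import pandas as pd


def log10_counts(counts: pd.Series) -> pd.Series:
    """Base-10 log of artist counts, keeping only countries with at least one artist."""
    positive = counts[counts > 0]
    return np.log10(positive)
```

On this scale a country with 1000 artists and one with 10 differ by 2 units instead of 990, which keeps smaller countries visible. An alternative that leaves the data untransformed is to pass a `matplotlib.colors.LogNorm` instance as the `norm` argument when plotting the map.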

Danceability

danceability by birthplace

Energy

energy by birthplace

Valence

valence by birthplace

Discussion

Conclusion

  • All three of Spotify's subjective metrics appear to show association with the birthplace of the artist, and deeper investigation seems worthwhile.

Notes

  • Artist birthplace is a flawed metric, in that it doesn't account for artists who have moved since birth, or for societal influences at a smaller than national level. This could be partly accounted for by weighting with data on what percentage of each country's population was born there.
  • Interesting future analyses could include investigating the association of various cultural diasporas (esp. African) with Spotify's metrics.

Resources/References

Data used in this analysis were sourced from:
Spotify API

  • Musical data - tempo, popularity, acousticness, danceability, energy, instrumentalness, loudness, liveness, speechiness, valence

MusicBrainz API

  • Artist birthplace

[Country Codes](https://gist.github.com/tadast/8827699)

  • ISO Alpha-2 and ISO Alpha-3 country code matches
