Hit Music Prediction Problem


For my undergraduate thesis project, I was lucky enough to work on a topic near and dear to my heart: the fusion of music and machine learning. Under the supervision of Prof. William Hamilton (McGill University), I applied machine learning methods to the hit music prediction problem.

Below, I’m going to explain the project in short form.



“What is the hit music prediction problem?” you ask. In a nutshell, this open question asks which music will be popular in the future. Seems simple enough, right? Not really: how do you define ‘popular’, and how do you define ‘music’?

In our work, we decided to use a popularity metric that is widely used across the North American music industry: the Billboard Hot 100 chart. Every week, Billboard releases a list of 100 songs that are ‘trending’. The rankings on this list are based on sales, radio airplay, and streaming. It is most definitely the list to be on if you’re trying to make it big in the music industry.

Now came the difficult part.

We wanted to see how well we could perform two tasks:

  • Task 1: Distinguishing between songs that had never been on the Hot 100 chart and songs that had made it on.

  • Task 2: Distinguishing between songs that were ‘getting some heat’ and those that had never made it on.

In ML terms, the first task was a binary classification of popular songs in a dataset made up of randomly selected songs and Hot 100 songs. The second, more complex task tested whether a neural network trained on Task 1 could also separate Emerging Artists from Hot 100 Artists.
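To make the setup concrete, here is a minimal sketch of the "train on Task 1, then evaluate on Task 2" idea. It is not the code from the thesis: the two features are synthetic random numbers, the cluster centres are made up, and a plain logistic regression stands in for the neural network so that the example runs with only the Python standard library. All function names are hypothetical.

```python
import math
import random

def make_songs(rng, center, label, n):
    """Draw n synthetic two-feature 'songs' around a centre value (made-up data)."""
    return [([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label) for _ in range(n)]

def predict_proba(weights, bias, x):
    """Probability that a song is a hit under a logistic model."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=300, lr=0.5):
    """Full-batch gradient descent on the logistic loss."""
    weights, bias = [0.0, 0.0], 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in data:
            err = predict_proba(weights, bias, x) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def accuracy(model, data):
    """Fraction of songs whose predicted label matches the true label."""
    weights, bias = model
    correct = sum(int(predict_proba(weights, bias, x) >= 0.5) == y for x, y in data)
    return correct / len(data)

rng = random.Random(0)

# Task 1: random songs (label 0) versus Hot 100 songs (label 1).
task1_train = make_songs(rng, -1.0, 0, 100) + make_songs(rng, 1.0, 1, 100)
task1_test = make_songs(rng, -1.0, 0, 100) + make_songs(rng, 1.0, 1, 100)

# Task 2: emerging songs (label 0) versus Hot 100 songs (label 1).
# The emerging cluster is placed closer to the hits, so this set is harder.
task2_test = make_songs(rng, 0.0, 0, 100) + make_songs(rng, 1.0, 1, 100)

model = train(task1_train)            # trained on Task 1 only
task1_acc = accuracy(model, task1_test)
task2_acc = accuracy(model, task2_test)  # same model, no retraining
print(f"Task 1 accuracy: {task1_acc:.2f}")
print(f"Task 2 accuracy: {task2_acc:.2f}")
```

The point of the sketch is the evaluation protocol: the model never sees Task 2 data during training, so its Task 2 score measures how well the learned decision rule carries over. The numbers it prints say nothing about the real results.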

To study these tasks we used the Million Song Dataset, a GitHub library for loading Billboard charts, and generated features using the Spotify API.
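As a rough illustration of how chart data and a song collection can be combined into labels, the sketch below marks each candidate song as 1 if its title and artist appear in a set of charted songs, and 0 otherwise. The song entries, the `normalize` helper, and the `label_songs` function are all hypothetical, and the matching rule (lowercase, punctuation stripped) is an assumption rather than the approach used in the thesis.

```python
import re

def normalize(text):
    """Lowercase and collapse punctuation so near-identical names match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def label_songs(candidates, charted):
    """Return (title, artist, label) triples; label is 1 if the song ever charted."""
    charted_keys = {(normalize(title), normalize(artist)) for title, artist in charted}
    return [
        (title, artist, int((normalize(title), normalize(artist)) in charted_keys))
        for title, artist in candidates
    ]

# Made-up entries purely for illustration; real data would come from the
# chart listings and the song dataset.
charted = [("Song A", "Artist One"), ("Song B!", "Artist Two")]
candidates = [
    ("song a", "ARTIST ONE"),
    ("Song C", "Artist Three"),
    ("Song B", "Artist Two"),
]

labeled = label_songs(candidates, charted)
for title, artist, label in labeled:
    print(title, "|", artist, "|", label)
```

Exact matching after normalization is the simplest choice; in practice, titles with featured artists or remix suffixes would likely need fuzzier matching.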

In the end, we got pretty exciting results! We achieved an accuracy of 91% on Task 1 and 87% on Task 2. This suggests that the model was able to pick up on a ‘signature’ of what makes songs popular.
