Did the Apollo 11 landing make songs more hopeful? Music is often said to define a generation. Music as History looks at Billboard's weekly top songs since 1958 and applies topic modeling algorithms to their lyrics to understand how emotional sentiment has changed over time. This was my senior capstone, completed with Sam Johnson, Sarah Grace, and Jared Schober. I primarily worked on the sentiment modeling and the front end for the data visualization. The code was inspired by Flowing Data’s visualization.
Song information (title, artist, rank, etc.) was scraped from the Billboard website, covering 1958 to the present. Song lyrics were scraped from metrolyrics.com. Of a total of 301,342 songs, 172,689 were successfully scraped, a success rate of 57.3%.
We used the NRC Word-Emotion Association Lexicon, a list of words and their associations with emotions and sentiments, curated through Amazon’s Mechanical Turk. It covers eight emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) and two sentiments (negative and positive). Each song’s emotional array was calculated from the emotional makeup of the words in its lyrics, ignoring stop words (a, the, etc.).
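As a rough illustration of that scoring step, here is a minimal Python sketch. It is not the project's actual code: the `TOY_LEXICON` entries and the `STOP_WORDS` set are hypothetical stand-ins for the real NRC lexicon and stop-word list, and dividing each category count by the number of non-stop-word tokens is an assumed normalization.

```python
import re
from collections import Counter

# The eight emotions and two sentiments listed above.
CATEGORIES = [
    "anger", "fear", "anticipation", "trust", "surprise",
    "sadness", "joy", "disgust", "negative", "positive",
]

# Hypothetical stand-in for the NRC lexicon (word -> associated categories).
# These entries are made up for illustration, not copied from the lexicon.
TOY_LEXICON = {
    "love": {"joy", "positive"},
    "cry": {"sadness", "negative"},
    "afraid": {"fear", "negative"},
}

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"a", "the", "and", "i", "to", "of", "you"}


def emotion_profile(lyrics, lexicon=TOY_LEXICON, stop_words=STOP_WORDS):
    """Return the share of non-stop-word tokens associated with each category."""
    tokens = [
        token
        for token in re.findall(r"[a-z']+", lyrics.lower())
        if token not in stop_words
    ]
    counts = Counter()
    for token in tokens:
        # A word can belong to several categories; count it once for each.
        for category in lexicon.get(token, ()):
            counts[category] += 1
    total = len(tokens)
    # Assumed normalization: divide by the number of kept tokens.
    return {c: (counts[c] / total if total else 0.0) for c in CATEGORIES}


profile = emotion_profile("I love you and I cry")
```

With the toy lexicon, the two remaining tokens ("love" and "cry") each contribute to one emotion and one sentiment, so a song's array is just the ten proportions in a fixed order.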
War in Iraq (2003-2011): Lyrics were statistically significantly more negative in 2004-2005 than in 2001-2002
Financial Crisis (2007-2008): Lyrics showed statistically significantly more fear in 2008-2009 than in 2005-2006
One-time events don’t seem to have as much impact as long-term events
After 1998, a sentiment gap emerges: music leans toward more extreme negative sentiment, and the reverse holds for positive sentiment
The relationship between historical events and lyric sentiment can only be described as correlational, not causal
How quickly music reacts to events is generally unknown (production, distribution, etc.)
In seasonal comparisons, music sentiment reflects the “opposite” of the sentiment marketed for that season (i.e. winter has higher positive emotional correlations and summer has more negative ones)
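The period comparisons above (for example, negativity in 2004-2005 versus 2001-2002) rest on a significance test. This write-up does not say which test was used, so the sketch below swaps in a simple two-sided permutation test on the difference in means, purely as an illustration. The function name `permutation_p_value` and the sample scores are hypothetical, not project data.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Shuffles the pooled scores repeatedly and counts how often a random
    split differs at least as much as the observed split does.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up per-song negativity scores for two periods (illustration only).
before = [0.10, 0.11, 0.12, 0.13, 0.14, 0.15]
after = [0.30, 0.31, 0.32, 0.33, 0.34, 0.35]
p_value = permutation_p_value(before, after)
```

A small p-value says only that the two periods differ more than random relabeling would suggest. It says nothing about why, which matches the correlational caveat above.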
I wanted to take this project further by creating a web interface that lets someone explore the data. One thing I learned was that one-time events don’t seem to have as much impact as long-term events, so the interface gives users a way to find their own patterns.