Emotionally-Relevant Features for Classification and Regression of Music Lyrics



This research addresses the role of lyrics in the music emotion recognition process. Our approach is based on several state-of-the-art features complemented by novel stylistic, structural and semantic features. To evaluate our approach, we created a ground-truth dataset containing 180 song lyrics, annotated according to Russell’s emotion model. We conduct four types of experiments: regression, and classification by quadrant, arousal and valence categories. Compared to the state-of-the-art features (n-grams, the baseline), adding other features, including the novel ones, improved the F-measure from 69.9%, 82.7% and 85.6% to 80.1%, 88.3% and 90%, respectively, for the three classification experiments. To study the relation between features and emotions (quadrants), we performed experiments to identify the features that best describe and discriminate each quadrant. To further validate these experiments, we built a validation set comprising 771 lyrics extracted from the AllMusic platform, achieving a 73.6% F-measure in the classification by quadrants. We also conducted experiments to identify interpretable rules that show the relation between features and emotions and the relations among features. Regarding regression, results show that, compared to similar studies for audio, we achieve a similar performance for arousal and a much better performance for valence.
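As an illustration of the three classification targets named in the abstract (quadrant, arousal category and valence category), the following minimal sketch maps a pair of valence and arousal values to each of them. It is not taken from the paper: the function names, the origin-centred scale, the treatment of zero as positive, and the quadrant numbering (1 = positive valence and high arousal, proceeding counter-clockwise) are all assumptions made for the example.

```python
# Hypothetical helpers: names, scale and quadrant numbering are assumptions,
# not the authors' implementation.

def valence_category(valence: float) -> str:
    """Binary valence class, assuming the scale is centred on 0."""
    return "positive" if valence >= 0 else "negative"


def arousal_category(arousal: float) -> str:
    """Binary arousal class, assuming the scale is centred on 0."""
    return "high" if arousal >= 0 else "low"


def quadrant(valence: float, arousal: float) -> int:
    """Quadrant of the valence-arousal plane, numbered counter-clockwise.

    Assumed numbering: 1 = (+V, +A), 2 = (-V, +A), 3 = (-V, -A), 4 = (+V, -A).
    """
    if arousal >= 0:
        return 1 if valence >= 0 else 2
    return 3 if valence < 0 else 4


if __name__ == "__main__":
    # Made-up annotation values, purely for demonstration.
    for v, a in [(0.6, 0.7), (-0.5, 0.8), (-0.4, -0.6), (0.3, -0.2)]:
        print(v, a, quadrant(v, a), valence_category(v), arousal_category(a))
```

Under these assumptions, the arousal and valence experiments are two-class problems and the quadrant experiment is a four-class problem.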


Music Emotion Recognition, Music Information Retrieval, Natural Language Processing

Related Project

MOODetector: A System for Mood-based Classification and Retrieval of Audio Music


IEEE Transactions on Affective Computing, IEEE, August 2016



Cited by

Year 2017: 1 citation

 Çano, E., Morisio, M. "MoodyLyrics: A Sentiment Annotated Lyrics Dataset". International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence, Hong Kong, March 2017.