Content Based Recommendation

Introduction to CBR

One of the most actively researched topics is analyzing the content of songs to judge the relatedness of two songs, typically in order to make a recommendation: "if you like A, you'll also like B". Surprisingly, you can tell a lot about a song without knowing anything about its melody, harmony or even rhythm. The overall texture, or timbre, is often quite characteristic of songs from a given musical style. The current crop of content-based recommenders relies on timbral features because they are easy to compute completely automatically, whereas it is currently very difficult to extract the melody or rhythm of a song reliably.

The standard approach to building a model of a song (one that can be used to measure its similarity to other songs) is to summarize the timbre of short snippets of the sound, using spectral features such as MFCCs. We do this on short snippets rather than on the whole song so that the way the timbre varies over the course of the song is preserved; a single summary computed over the entire song would lose this information. We then build a model, usually made of several bumps (really ellipsoids in a high-dimensional space), that correspond to the areas of the timbre space activated by the song. This Gaussian mixture model (GMM) then becomes the representation of the song. Comparing two songs is then a matter of computing the distance between their GMMs.

In music recommendation problems where this approach has been tried, it turns out that certain songs seem to be similar to very many other songs. They appear as recommendations for other songs even when they are not appropriate (although, of course, it is not easy to determine when a music recommendation is good). This false-positive problem is called the "hubness" problem. Mark is trying to understand which aspects of the modeling and of the timbre space lead to hubs (it turns out that anti-hubs, rather than hubs, may be responsible). More on that at Mark's blog.
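The pipeline described above can be illustrated with a small, heavily simplified sketch. It is not the project's implementation. It assumes that per-frame feature vectors (for example MFCCs) have already been extracted, and replaces them here with synthetic numbers. It swaps the full GMM for its simplest special case, a single Gaussian with diagonal covariance, so that a symmetrised Kullback-Leibler divergence between two song models has a closed form. It then counts how often each song shows up in other songs' top-k neighbour lists, which is one common way to look for hubs. All function names, song names and parameters are hypothetical.

```python
import random
from collections import Counter


def fit_diag_gaussian(frames):
    """Summarise a song's per-frame feature vectors as a per-dimension
    mean and variance (a one-component, diagonal-covariance 'GMM')."""
    n = len(frames)
    dims = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dims)]
    # A small floor keeps variances strictly positive.
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / n + 1e-6
           for d in range(dims)]
    return mean, var


def symmetric_kl(a, b):
    """Symmetrised KL divergence, KL(a||b) + KL(b||a), between two
    diagonal Gaussians given as (mean, variance) pairs."""
    (m1, v1), (m2, v2) = a, b
    total = 0.0
    for mu1, s1, mu2, s2 in zip(m1, v1, m2, v2):
        diff2 = (mu1 - mu2) ** 2
        total += 0.5 * (s1 / s2 + s2 / s1 - 2.0)
        total += 0.5 * diff2 * (1.0 / s1 + 1.0 / s2)
    return total


def k_occurrence(models, k):
    """Count how often each song appears among the k nearest
    neighbours of the other songs; unusually high counts suggest hubs."""
    counts = Counter({name: 0 for name in models})
    for query, q_model in models.items():
        others = [(symmetric_kl(q_model, m), name)
                  for name, m in models.items() if name != query]
        for _, name in sorted(others)[:k]:
            counts[name] += 1
    return counts


def toy_song(centre, spread, n_frames=200, dims=4, seed=0):
    """Synthetic stand-in for a song's frame-level timbre features."""
    rng = random.Random(seed)
    return [[rng.gauss(centre, spread) for _ in range(dims)]
            for _ in range(n_frames)]


songs = {
    "song_a": toy_song(0.0, 1.0, seed=1),
    "song_b": toy_song(0.2, 1.0, seed=2),
    "song_c": toy_song(5.0, 1.0, seed=3),
    "song_d": toy_song(2.5, 4.0, seed=4),
}
models = {name: fit_diag_gaussian(frames) for name, frames in songs.items()}

print(symmetric_kl(models["song_a"], models["song_b"]))
print(k_occurrence(models, k=1))
```

With real mixtures of several components there is no closed-form KL divergence, so the distance would have to be approximated, for instance by sampling; the single-Gaussian version is used here only to keep the example short.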

Application to Indian Music

We are also interested in creating song models that are appropriate for Indian music. For example, we are currently building a recommendation system for Indian classical music. Many people have heard a little of it and are intrigued, but don't know much more. It's a classic music discovery problem: you've heard a bit and you want to hear more. We are working to give people an easy way to hear Indian classical music they like, based on simple high-level descriptions of what they might want to hear. As they listen more, their likes and dislikes are used to personalize the stream. This is what sites like Last.fm and Pandora try to do, albeit by different means. From a research perspective, we're interested in whether CBR can be improved by song models that are specific to the type of music. In addition to the standard timbre modeling approaches, a survey we recently conducted showed that certain melodic characteristics are strongly correlated with certain emotions, so we are attempting to incorporate melodic features as well.

Database

nicm2008