This notebook presents a cover identification algorithm based on the Magnitude 2DFT.
Abstract: We approach cover song identification using a novel time-series representation of audio based on the 2D Fourier Transform (2DFT). The audio is represented as a sequence of magnitude 2DFTs. This representation is robust to key changes, timbral changes, and small local tempo deviations. We compute the cross-similarity between these time series and extract a distance measure that is invariant to changes in musical structure. Our approach is state-of-the-art on a recent cover song dataset, and expands on previous work using the 2DFT for music representation and on work on live song recognition.
A cover version of a song is one performed by someone other than the original artist. Many things can change between a cover version and the original version, such as:
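To make the key-invariance claim in the abstract concrete, here is a minimal sketch using only NumPy. It is not the notebook's actual pipeline: the helper `magnitude_2dft_sequence`, the patch length, the hop size, and the random 12-bin "spectrogram" are all hypothetical stand-ins chosen for illustration. The key change is idealized as a circular shift along the frequency axis, which holds exactly for chroma-like features and only approximately for a log-frequency representation such as the CQT.

```python
import numpy as np

def magnitude_2dft_sequence(spec, patch_len=8, hop=4):
    """Hypothetical helper: slice a (freq_bins x frames) array into
    overlapping patches and take the magnitude 2D FFT of each patch."""
    n_frames = spec.shape[1]
    starts = range(0, n_frames - patch_len + 1, hop)
    return np.stack(
        [np.abs(np.fft.fft2(spec[:, s:s + patch_len])) for s in starts]
    )

rng = np.random.default_rng(0)
spec = rng.random((12, 32))             # toy stand-in: 12 pitch bins x 32 frames
transposed = np.roll(spec, 3, axis=0)   # idealized key change: circular pitch shift

seq_a = magnitude_2dft_sequence(spec)
seq_b = magnitude_2dft_sequence(transposed)

print(seq_a.shape)                 # (7, 12, 8): 7 patches, each 12 x 8
print(np.allclose(seq_a, seq_b))   # True: the shift only changes the phase
```

A circular shift of the input changes only the phase of its Fourier transform, so discarding the phase leaves the two sequences identical. Robustness to timbre and tempo is not demonstrated by this toy example.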
Successful automatic cover song identification approaches try to be invariant to these changes while keeping the aspects of music that are transferred from the original to the cover, such as:
Here are some examples! Each example shows the constant-Q transform (CQT), the LiveID fingerprint (explained later), and the audio file.
First, the original song:
load_and_display("../datasets/Elvis Presley - Can't Help Falling In Love/Can't Help Falling In Love-5V430M59Yn8.mp3",
                 "Can't Help Falling In Love - Elvis Presley (original)", 2)