Abstract:
Given a large audio database of music recordings, the goal of classical audio
identification is to identify a particular audio recording by means of a short
audio fragment. Even though recent identification algorithms show a significant
degree of robustness towards noise, MP3 compression artifacts, and uniform
temporal distortions, their underlying notion of similarity remains close to identity.
In this paper, we address a higher-level retrieval problem, which we refer to
as audio matching: given a short query audio clip, the goal is to automatically
retrieve all excerpts from all recordings within the database that musically
correspond to the query. In contrast to classical audio identification, our
matching scenario allows semantically motivated variations as they typically
occur in different interpretations of a piece of music. To this end, this paper
presents an efficient and robust audio matching procedure that works even in
the presence of significant variations, such as nonlinear temporal, dynamical,
and spectral deviations, where existing algorithms for audio identification
would fail. Furthermore, the combination of various deformation- and
fault-tolerance mechanisms allows us to employ standard indexing techniques to
obtain an efficient, index-based matching procedure, thus providing an
important step towards semantically searching large-scale real-world music
collections.