A graphical representation and dissimilarity measure for basic everyday sound events

Adiloglu, Kamil; Anniés, Robert; Wahlen, Elio; Purwins, Hendrik; Obermayer, Klaus

dc.contributor.author	Adiloglu, Kamil
dc.contributor.author	Anniés, Robert
dc.contributor.author	Wahlen, Elio
dc.contributor.author	Purwins, Hendrik
dc.contributor.author	Obermayer, Klaus
dc.date.accessioned	2019-05-27T14:52:39Z
dc.date.available	2019-05-27T14:52:39Z
dc.date.issued	2012
dc.identifier.citation	Adiloglu K, Anniés R, Wahlen E, Purwins H, Obermayer K. A graphical representation and dissimilarity measure for basic everyday sound events. IEEE Trans Audio Speech Lang Process. 2012;20(5):1542-52. DOI: 10.1109/TASL.2012.2184752
dc.identifier.issn	1558-7916
dc.identifier.uri	http://hdl.handle.net/10230/39612
dc.description.abstract	Studies by Gaver (W. W. Gaver, “How do we hear in the world? Explorations in ecological acoustics,” Ecological Psychology, 1993) revealed that humans categorize everyday sounds according to the processes that generated them. He organized these categories into a taxonomy based on the aggregate states of the materials involved (solid, liquid, gas) and the physical nature of the sound-generating interaction (for solids, e.g., deformation or friction). We exemplified this taxonomy in an everyday sound database that contains recordings of basic isolated sound events from these categories. We used a sparse method to represent and visualize these sound events. This representation relies on a sparse decomposition of sounds into atomic filter functions in the time-frequency domain. The filter functions maximally correlated with a given sound are selected automatically to perform the decomposition. The resulting sparse point pattern depicts the skeleton of the given sound. Visualizing these point patterns revealed that acoustically similar sounds have similar point patterns. To detect these similarities, we defined a novel dissimilarity function by treating the point patterns as 3-D point graphs and applying a graph matching algorithm, which assigns the points of one sound to the points of the other. This dissimilarity measure is used in combination with a kernel machine for the classification experiments, yielding an average accuracy of 95% in one-versus-one discrimination tasks.
dc.format.mimetype	application/pdf
dc.language.iso	eng
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.ispartof	IEEE Transactions on Audio, Speech, and Language Processing. 2012;20(5):1542-52.
dc.rights	© 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. http://dx.doi.org/10.1109/TASL.2012.2184752
dc.title	A graphical representation and dissimilarity measure for basic everyday sound events
dc.type	info:eu-repo/semantics/article
dc.identifier.doi	http://dx.doi.org/10.1109/TASL.2012.2184752
dc.subject.keyword	Audio coding
dc.subject.keyword	Audio analysis and synthesis
dc.rights.accessRights	info:eu-repo/semantics/openAccess
dc.type.version	info:eu-repo/semantics/acceptedVersion
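
The abstract describes comparing two sounds by assigning the points of one sparse point pattern to the points of the other. The sketch below illustrates that general idea only; it is not the graph matching algorithm from the article, whose details are not given in this record. It assumes each point is a hypothetical (time, frequency, amplitude) triple, that both patterns have the same number of points, and that the cost of an assignment is the sum of Euclidean distances. The function name `assignment_dissimilarity` and the sample values are invented for illustration.

```python
from itertools import permutations
from math import dist


def assignment_dissimilarity(points_a, points_b):
    """Mean point distance under the cheapest one-to-one assignment.

    Brute force over all permutations, so only usable for a handful of
    points. Assumption of this sketch: both patterns are the same size.
    """
    if len(points_a) != len(points_b):
        raise ValueError("this sketch assumes equally sized point patterns")
    if not points_a:
        return 0.0
    # Try every way of pairing points_a with points_b and keep the
    # pairing whose summed Euclidean distance is smallest.
    best_total = min(
        sum(dist(p, q) for p, q in zip(points_a, perm))
        for perm in permutations(points_b)
    )
    return best_total / len(points_a)


# Hypothetical (time, frequency, amplitude) points, made up for illustration.
sound_a = [(0.0, 1.0, 0.5), (1.0, 2.0, 0.25)]
sound_b = [(1.0, 2.0, 0.25), (0.0, 1.0, 0.5)]  # same points, different order
sound_c = [(0.0, 1.0, 0.5), (1.0, 2.0, 2.25)]  # one point has a larger amplitude

d_ab = assignment_dissimilarity(sound_a, sound_b)
d_ac = assignment_dissimilarity(sound_a, sound_c)
print(d_ab, d_ac)
```

Because the assignment is searched over all orderings, two patterns containing the same points in a different order come out with zero dissimilarity, while a change in a single point raises the value. For realistic pattern sizes the exhaustive search would need to be replaced by a polynomial-time assignment or graph matching method.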