A brief analysis of SLAVC method for sound source

dc.contributor.author: Juanola, Xavier
dc.contributor.author: Haro Ortega, Gloria
dc.date.accessioned: 2025-06-10T08:51:55Z
dc.date.available: 2025-06-10T08:51:55Z
dc.date.issued: 2024
dc.description.abstract: Mo and Morgado introduced in 2022 a novel self-supervised learning approach for Visual Sound Source Localization, denoted as SLAVC [Mo, S. and Morgado, P., A Closer Look at Weakly-Supervised Audio-Visual Source Localization, Advances in Neural Information Processing Systems, 2022]. The proposed method is based on multiple-instance contrastive learning. In addition to improving on the results of previous methods, it addresses two critical problems that earlier methods faced: 1) excessive overfitting despite training on extensive datasets, and 2) a tendency to hallucinate sound sources even when the video contains no visual evidence to support them. In this paper, we briefly present the method, offer an online executable version that allows users to test it on their own image-audio pairs, and propose some improvements that could benefit the model as future work.
dc.description.sponsorship: This work has been supported by MICINN/FEDER UE project PID2021-127643NB-I00.
dc.format.mimetype: application/pdf
dc.identifier.citation: Juanola X, Haro G. A brief analysis of SLAVC method for sound source. Image Processing On Line. 2024;14:159-72. DOI: 10.5201/ipol.2024.525
dc.identifier.doi: http://dx.doi.org/10.5201/ipol.2024.525
dc.identifier.issn: 2105-1232
dc.identifier.uri: http://hdl.handle.net/10230/70650
dc.language.iso: eng
dc.publisher: IPOL
dc.relation.ispartof: Image Processing On Line. 2024;14:159-72
dc.relation.projectID: info:eu-repo/grantAgreement/ES/3PE/PID2021-127643NB-I00
dc.rights: © 2024 IPOL & the authors. The article is distributed under a Creative Commons CC-BY-NC-SA license (http://creativecommons.org/licenses/by-nc-sa/3.0/).
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/3.0/
dc.subject.keyword: Audio-visual
dc.subject.keyword: Sound source localization
dc.title: A brief analysis of SLAVC method for sound source
dc.type: info:eu-repo/semantics/article
dc.type.version: info:eu-repo/semantics/publishedVersion

Files

Original bundle

Name: Haro_ipo_abri.pdf
Size: 2.98 MB
Format: Adobe Portable Document Format