Data leakage in cross-modal retrieval training: a case study

Weck, Benno; Serra, Xavier

dc.contributor.author Weck, Benno
dc.contributor.author Serra, Xavier
dc.date.accessioned 2025-05-30T05:48:59Z
dc.date.embargoEnd info:eu-repo/date/embargoEnd/2025-12-31
dc.date.issued 2023
dc.description.abstract The recent progress in text-based audio retrieval was largely propelled by the release of suitable datasets. Since the manual creation of such datasets is a laborious task, obtaining data from online resources can be a cheap solution to create large-scale datasets. We study the recently proposed SoundDesc benchmark dataset, which was automatically sourced from the BBC Sound Effects web page. In our analysis, we find that SoundDesc contains several duplicates that cause leakage of training data to the evaluation data. This data leakage ultimately leads to overly optimistic retrieval performance estimates in previous benchmarks. We propose new training, validation, and testing splits for the dataset that we make available online. To avoid weak contamination of the test data, we pool audio files that share similar recording setups. In our experiments, we find that the new splits serve as a more challenging benchmark.
dc.embargo.liftdate 2025-12-31
dc.format.mimetype application/pdf
dc.identifier.citation Weck B, Serra X. Data leakage in cross-modal retrieval training: a case study. In: Maragos P, Berberidis K, Boufounos P, editors. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023); 2023 June 4-10; Rhodes Island: Greece. [Piscataway]: IEEE; 2023. 5 p. DOI: 10.1109/ICASSP49357.2023.10094617
dc.identifier.uri http://hdl.handle.net/10230/70563
dc.language.iso eng
dc.publisher Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.ispartof Maragos P, Berberidis K, Boufounos P, editors. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023); 2023 June 4-10; Rhodes Island: Greece. [Piscataway]: IEEE; 2023.
dc.rights © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. http://dx.doi.org/10.1109/ICASSP49357.2023.10094617
dc.rights.accessRights info:eu-repo/semantics/embargoedAccess
dc.subject.keyword Text-based audio retrieval
dc.subject.keyword Cross-modal
dc.subject.keyword Duplicates
dc.subject.keyword Data leakage
dc.subject.keyword Deep learning
dc.title Data leakage in cross-modal retrieval training: a case study
dc.type info:eu-repo/semantics/conferenceObject
dc.type.version info:eu-repo/semantics/acceptedVersion

Collections

Conferences (Department of Information and Communication Technologies)