Heroes Corpus

Öktem, Alp

dc.contributor.author	Öktem, Alp
dc.date.accessioned	2018-10-05T10:32:10Z
dc.date.available	2018-10-05T10:32:10Z
dc.date.issued	2018-10-05
dc.identifier.uri	http://hdl.handle.net/10230/35572
dc.description	Each episode directory contains word-level and segment-level information for the whole episode, together with parallel samples extracted under the segments_eng and segments_spa subdirectories. Each sample is stored as a WAV audio file, a text file, and a CSV file containing word timing information and word-level paralinguistic and prosodic features.
dc.description	This dataset contains short audio and text excerpts from the TV series "Heroes" (Copyright Universal Media Studios (2006-2007, 2007-2008, 2008-2009)). It is compiled and used for research purposes only. Creation of this dataset was partially financed by the UPF DTIC-Maria de Maeztu Strategic Program. The dataset was created with automated tools, so it may contain errors introduced by the automated process.
dc.description.abstract	The Heroes corpus contains mapped bilingual (English and Spanish) speech segments from the TV series Heroes. It contains 7000 single-speaker speech segments extracted from the original and Spanish-dubbed versions of 21 episodes. Audio segments are accompanied by subtitle transcriptions and word-level prosodic/paralinguistic information.
dc.description.sponsorship	Maria de Maeztu Programme/DTIC
dc.language.iso	eng
dc.language.iso	spa
dc.relation	Related publication: Öktem A, Farrús M, Bonafonte A. Bilingual prosodic dataset compilation for spoken language translation. Paper presented at: IberSPEECH'18; 2018 Nov 21-23; Barcelona, Spain. http://hdl.handle.net/10230/35600
dc.relation.isreferencedby	http://hdl.handle.net/10230/35600
dc.rights	This dataset is licensed under a Creative Commons licence. This licence does not affect the content of the files, which may be protected by copyright.
dc.rights.uri	https://creativecommons.org/licenses/by-sa/4.0/
dc.title	Heroes Corpus
dc.type	info:eu-repo/semantics/other
dc.type	Dataset
dc.subject.keyword	Parallel bilingual speech corpus prosody
dc.rights.accessRights	info:eu-repo/semantics/openAccess
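
The layout given in dc.description (an episode directory with segments_eng and segments_spa subdirectories, each sample stored as a WAV, text and CSV file) could be read with a sketch like the one below. It rests on assumptions that this record does not state: that the three files of a sample share a file stem, that they use the extensions .wav, .txt and .csv, and that each CSV has a header row. The function names are hypothetical and are not part of the dataset's own tooling. How English and Spanish samples are matched to each other is not specified here, so the sketch only lists the samples on each side.

```python
import csv
from pathlib import Path

# Assumed extensions for the three files that make up one sample.
SAMPLE_EXTENSIONS = (".wav", ".txt", ".csv")


def collect_samples(segment_dir):
    """Group the files of one segments directory by file stem.

    Returns {stem: {extension: Path}}, keeping only stems for which
    the audio, text and CSV files are all present.
    """
    grouped = {}
    for path in sorted(Path(segment_dir).iterdir()):
        ext = path.suffix.lower()
        if path.is_file() and ext in SAMPLE_EXTENSIONS:
            grouped.setdefault(path.stem, {})[ext] = path
    return {
        stem: files
        for stem, files in grouped.items()
        if all(ext in files for ext in SAMPLE_EXTENSIONS)
    }


def read_word_rows(csv_path):
    """Read a per-sample CSV into a list of dicts, one per row.

    Column names come from the header row (assumed to exist).
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def load_episode(episode_dir):
    """Collect the samples of both languages for one episode directory."""
    episode = Path(episode_dir)
    return {
        "eng": collect_samples(episode / "segments_eng"),
        "spa": collect_samples(episode / "segments_spa"),
    }
```

Calling load_episode on an episode directory returns one dictionary per language; passing the ".csv" path of any sample to read_word_rows then gives its word-level rows. A missing subdirectory raises FileNotFoundError rather than being silently skipped.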