Öktem, Alp 2018-10-05T10:32:10Z 2018-10-05T10:32:10Z 2018-10-05
dc.description Each episode directory contains word-level and segment-level information for the whole episode, together with parallel samples extracted under the segments_eng and segments_spa subdirectories. Each sample is stored as a WAV audio file, a text file, and a CSV file containing word timing information and word-level paralinguistic and prosodic features.
dc.description This dataset contains short audio and text excerpts from the TV series "Heroes" (Copyright Universal Media Studios (2006-2007, 2007-2008, 2008-2009)). It was compiled and is used for research purposes only. Its creation was partially financed by the UPF DTIC-Maria de Maeztu Strategic Program. The dataset was created with automated tools, so it may contain errors introduced by the automated process.
dc.description.abstract The Heroes corpus contains mapped bilingual (English and Spanish) speech segments from the TV series Heroes. It contains 7000 single-speaker speech segments extracted from the original and Spanish-dubbed versions of 21 episodes. Audio segments are accompanied by subtitle transcriptions and word-level prosodic/paralinguistic information.
dc.description.sponsorship Maria de Maeztu Programme/DTIC
dc.language.iso eng
dc.language.iso spa
dc.relation Related publication: Öktem A, Farrús M, Bonafonte A. Bilingual prosodic dataset compilation for spoken language translation. Paper presented at: IberSPEECH'18; 2018 Nov 21-23; Barcelona, Spain.
dc.rights This dataset is licensed under a Creative Commons licence. The licence does not affect the content of the files, which may be protected by copyright.
dc.title Heroes Corpus
dc.type info:eu-repo/semantics/other
dc.type Dataset
dc.subject.keyword Parallel bilingual speech corpus prosody
dc.rights.accessRights info:eu-repo/semantics/openAccess
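
The directory layout described above (an episode directory with segments_eng and segments_spa subdirectories, each sample stored as a WAV, text and CSV file) could be traversed with something like the following minimal sketch. The record does not state how files are named, so two things here are assumptions: that the three files of one sample share a file stem, and that an English sample and its Spanish counterpart share that same stem. The function names `group_samples` and `pair_parallel_samples` are hypothetical helpers, not part of the dataset.

```python
from collections import defaultdict
from pathlib import Path


def group_samples(segment_dir):
    """Group the files in one segments directory by shared file stem.

    Returns {stem: {suffix: Path}}. Assumes (not confirmed by the record)
    that the WAV, text and CSV files of one sample share a stem.
    """
    samples = defaultdict(dict)
    for path in sorted(Path(segment_dir).iterdir()):
        if path.is_file():
            samples[path.stem][path.suffix.lower()] = path
    return dict(samples)


def pair_parallel_samples(episode_dir):
    """Pair English and Spanish samples of one episode directory.

    Assumes (not confirmed by the record) that parallel samples use the
    same stem under segments_eng and segments_spa.
    """
    episode_dir = Path(episode_dir)
    eng = group_samples(episode_dir / "segments_eng")
    spa = group_samples(episode_dir / "segments_spa")
    # Keep only stems present in both languages.
    return {stem: (eng[stem], spa[stem]) for stem in sorted(eng.keys() & spa.keys())}
```

If the actual naming scheme differs (for example, a language tag inside the file name), only the pairing key in `pair_parallel_samples` would need to change.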
