Prosodically annotated TED talks
Citation
- Öktem A, Farrús M, Lai C. Prosodically annotated TED talks. Repositori Digital de la UPF: Barcelona; 2018. Available at: http://hdl.handle.net/10230/33981
Abstract
TED talks are a set of conference talks that have been held worldwide in more than 100 languages. They cover a large variety of topics, from technology and design to science, culture and academia. This corpus consists of speech recordings and Proscript format annotations of 1046 talks by 877 English speakers, who utter a total of 155174 sentences.
Description
Audio files of the recordings are provided in WAV format in the partitioned archives. The "talk_proscripts" archive contains Proscript format annotations of complete talks. The "punkProse_dataset" archive contains the sampled dataset partitioning used in prosodic punctuation modelling experiments (see http://github.com/alpoktem/punkProse). The README.txt file contains information on the dataset and its authors. TED_talk_ids.txt lists the file indices and the talks they correspond to. Proscript format files contain the sequence of words uttered in a recording, their approximate timings and the corresponding acoustic measurements (pitch, intensity, speech rate). For more information on the Proscript format, see http://github.com/alpoktem/proscript.
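As a rough illustration of how word-level timings such as those described above might be used, the following Python sketch reads words with start and end times from a delimited text stream and derives the pause before each word. The delimiter, the column names (word, start_time, end_time) and the sample rows are assumptions made for this example only; they are not taken from the Proscript documentation, so check a real file and the Proscript repository before relying on them.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class WordEntry:
    word: str
    start: float  # seconds
    end: float    # seconds


def read_words(handle, delimiter="|", word_col="word",
               start_col="start_time", end_col="end_time"):
    """Read word-level rows from a delimited text stream.

    The delimiter and column names are assumptions, not the documented
    Proscript layout; adjust them after inspecting a real file.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    return [
        WordEntry(row[word_col], float(row[start_col]), float(row[end_col]))
        for row in reader
    ]


def pause_before(entries):
    """Return the silence (in seconds) preceding each word.

    The first word has no predecessor, so its pause is set to 0.0.
    """
    if not entries:
        return []
    pauses = [0.0]
    for prev, cur in zip(entries, entries[1:]):
        pauses.append(max(0.0, cur.start - prev.end))
    return pauses


# Made-up sample in the assumed layout, only to exercise the code.
SAMPLE = (
    "word|start_time|end_time\n"
    "thank|0.50|0.80\n"
    "you|0.80|1.00\n"
    "so|1.40|1.60\n"
)

entries = read_words(io.StringIO(SAMPLE))
pauses = pause_before(entries)
```

To apply this to a real annotation file, pass an open file handle instead of the in-memory sample and set the delimiter and column arguments to match the file's actual header.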