A multilingual annotated corpus for the study of information structure

Citation

  • Brunetti L, Bott S, Costa J, Vallduví E. A multilingual annotated corpus for the study of information structure. In: Konopka M et al. Grammatik und Korpora 2009 dritte internationale Konferenz Mannheim, 22.-24.09.2009 = Grammar & Corpora 2009 third international conference. 1st ed. Tübingen: Narr; 2011. p. 305-27.

Description

  • Abstract

    This paper presents a corpus of spoken narrative texts in Catalan, Italian, Spanish, English, and German. The corpus was compiled as an empirical resource for the comparative study of Information Structure. A total of 68 speakers were each asked to tell a story, in an acoustically isolated room, while looking at the pictures of three textless books. The resulting 222 narrations amount to about 16 hours of speech. The recordings have been transcribed, and an original annotation scheme for non-canonical constructions has been proposed for the Romance subgroup, covering morphosyntactically or prosodically marked constructions that relate to informational categories such as topic, focus, and contrast. Transcriptions and annotations of selected high-quality recordings have been aligned to the acoustic signal. The corpus is available in audio and text format.