Monolingual corpus acquired in five languages and two domains

Citation

Papavassiliou V, Prokopidis P, Toral A, Arranz V, Bel N, Quochi V. Monolingual corpus acquired in five languages and two domains [Internet]. Final report 01 Feb 2011. 29 p. (Panacea Project. Work Package Reports, no. D4.3). Available from:

Permanent link

Description
This deliverable describes how the monolingual corpora were created using the first version of the Corpus Acquisition and Annotation (CAA) subsystem. That version was developed during the first year of the project, following insights from D4.1, Technologies and tools for corpus creation, normalization and annotation (T6). The main component currently integrated in the CAA is the Corpus Acquisition Component (CAC). Although the CAC is described in more detail in D4.2, Initial functional prototype and documentation (due T13), we include extracts from that document here where we believe they help the reader understand the corpus acquisition process we followed.
Collections
IULA. Documentació del Projecte Panacea

Files