Description:
This deliverable describes how the monolingual corpora were created with the first version of the Corpus Acquisition and Annotation subsystem (CAA). This first version was developed during the first year of the project, following insights from D4.1 Technologies and tools for corpus creation, normalization and annotation (T6). The main component currently integrated in the CAA is the Corpus Acquisition Component (CAC). Although the CAC is described in more detail in D4.2 Initial functional prototype and documentation (due T13), we include extracts from that document in the present deliverable where we believe they help the reader understand the corpus acquisition process we followed.