Cross-lingual text categorization
Citation
- Bel N, Koster CHA, Villegas M. Cross-lingual text categorization. In: Koch T, Sølvberg IT, editors. Research and Advanced Technology for Digital Libraries. 7th European Conference (ECDL); 2003 Aug 17-22; Trondheim, Norway. Berlin: Springer; 2003. p. 126-39. (LNCS, no. 2769). DOI: 10.1007/978-3-540-45175-4_13
Description
Abstract
This article addresses the problem of Cross-Lingual Text Categorization (CLTC), which arises when documents in different languages must be classified according to the same classification tree. We describe practical and cost-effective solutions for automatic CLTC, both when a sufficient number of training examples is available for each new language and when no training examples are available for some language. Experimental results for the bi-lingual classification of the ILO corpus (with documents in English and Spanish) are obtained using bi-lingual training, terminology translation, and profile-based translation.