Creation of a program for analyzing and tagging proper nouns in the English press

Description

  • Abstract

    In recent years computational linguistics has begun to make a name for itself and to gain traction in the computing world, a field with which linguists previously had little connection. Well-known programmers such as Guido van Rossum, who created Python in 1991, and several universities, such as the University of Pennsylvania, where the “Natural Language Toolkit” originated in 2001, have devoted themselves to developing tools for text processing. These tools have been used worldwide over the years and have served as a basis for programmers of all levels of expertise to create their own programs according to their specific needs. To date, no program for named-entity analysis has been developed. For this reason, this work describes the procedure followed to create a program that analyzes named entities in a corpus built from British press articles available online. The program was developed in the Python programming language, and the work is divided into three sections: the compilation of the corpus, the creation of the program's functions, and the evaluation of the program's results and the corpus results. The functions developed throughout the study show that person names are the most frequently used named entities in the British press, followed by organization names. The main conclusion is therefore that the program achieved the expected results.
    This work documents the process of creating a working corpus and a dedicated program for analyzing and tagging articles from the English press. It also critically assesses the results obtained and proposes changes for future improvements and further work on the program.
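
The evaluation step mentioned in the abstract, comparing how often each type of named entity occurs, could be sketched roughly as follows. This is only an illustration and not the author's program: the function name `count_entity_labels`, the label names, and the sample pairs are all hypothetical, and the sketch assumes a tagger has already produced (entity, label) pairs.

```python
from collections import Counter

# Hypothetical output of an entity tagger: (entity text, label) pairs.
# These pairs are invented for illustration; they are not from the corpus.
tagged_entities = [
    ("Theresa May", "PERSON"),
    ("BBC", "ORGANIZATION"),
    ("London", "GPE"),
    ("Boris Johnson", "PERSON"),
    ("Labour Party", "ORGANIZATION"),
    ("Jeremy Corbyn", "PERSON"),
]


def count_entity_labels(pairs):
    """Return a Counter mapping each entity label to its frequency."""
    return Counter(label for _, label in pairs)


label_counts = count_entity_labels(tagged_entities)

# List the labels from most to least frequent.
for label, count in label_counts.most_common():
    print(f"{label}: {count}")
```

Ranking the labels with `most_common()` gives the kind of ordering the abstract reports, with one entity type ahead of another, without tying the sketch to any particular tagging library.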