Testing optimality in the morphospace of language networks with empirical data

Description

  • Abstract

    Whether human language is optimal remains an open question, and optimality might be behind some universal features observed across tongues. These features may stem from a tension between hearers and speakers as each tries to minimize the costs associated with their use of language. Optimality might also be critical to understanding the evolution of the language faculty. A toy model introduced by Ferrer i Cancho and Solé captures the tension between hearers and speakers. In it, tongues are reduced to a mapping from signals to objects of an external world. Theoretical studies grounded in information theory followed this work, but the framework remains of limited empirical use because word-object mappings are difficult to build for real tongues. A recent attempt by Seoane used WordNet, but this database has some relevant limitations, such as the lack of data for some grammatical classes. In this project, we look at alternative ways to map empirical data from human languages into this least-effort, information-theoretic framework. Human language consistently falls within one of two related categories: i) fairly optimal mappings (for hearers and speakers simultaneously); and ii) word-object mappings that are less simultaneously optimal, yet present interesting features such as diverse clustering of concepts and a good fit to Zipf’s law of word frequency. Our novel empirical analysis of linguistic data allows us to consider more grammatical classes and to bring together words from different classes coherently. Our results offer intuitive representations of human languages in an abstract space where they can be compared with other communication systems. This also offers a way to quantify the relevance of the two conflicting views about optimality in human language introduced above. Insofar as optimality can be disregarded, our results also suggest alternative pressures that might have shaped human language. Future work will aim to scale the proposed methodology to larger data sets to support our findings.
  • Description

    Master's thesis for: Master in Intelligent Interactive Systems
    Supervisors: Luís F. Seoane, Ricard Solé
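The abstract describes languages as mappings from signals to objects and a tension between speaker and hearer costs, without giving formulas. The sketch below illustrates one common way such a toy model is formalized; it is not taken from the thesis. It assumes a binary matrix `A` (rows are signals, columns are objects), equally likely objects, a speaker who picks uniformly among the signals that name an object, speaker effort measured as the normalized entropy of signals, and hearer effort measured as the normalized conditional entropy of objects given a signal. The function names `efforts` and `combined_cost` and the weight `lam` are hypothetical.

```python
import math


def efforts(A):
    """Return (speaker_effort, hearer_effort) for a binary signal-object matrix A.

    Assumptions (not from the source): objects are equally likely, and the
    speaker chooses uniformly among the signals linked to the intended object.
    """
    n = len(A)      # number of signals (rows)
    m = len(A[0])   # number of objects (columns)
    if n < 2 or m < 2:
        raise ValueError("need at least two signals and two objects")

    # Number of signals that name each object.
    col_sums = [sum(A[i][j] for i in range(n)) for j in range(m)]
    if any(c == 0 for c in col_sums):
        raise ValueError("every object must be named by at least one signal")

    # Joint probability p(signal i, object j).
    joint = [[A[i][j] / (m * col_sums[j]) for j in range(m)] for i in range(n)]
    p_signal = [sum(row) for row in joint]

    # Speaker effort: entropy of signal usage, normalized to [0, 1].
    h_s = -sum(p * math.log(p) for p in p_signal if p > 0)

    # Hearer effort: ambiguity about the object once the signal is heard.
    h_r_given_s = -sum(
        joint[i][j] * math.log(joint[i][j] / p_signal[i])
        for i in range(n)
        for j in range(m)
        if joint[i][j] > 0
    )
    return h_s / math.log(n), h_r_given_s / math.log(m)


def combined_cost(A, lam):
    """Weighted cost: lam weights the hearer, (1 - lam) weights the speaker."""
    speaker, hearer = efforts(A)
    return lam * hearer + (1 - lam) * speaker


# Two extreme mappings over three signals and three objects.
one_to_one = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # no ambiguity for the hearer
one_for_all = [[1, 1, 1], [0, 0, 0], [0, 0, 0]]  # a single word for everything

speaker_a, hearer_a = efforts(one_to_one)
speaker_b, hearer_b = efforts(one_for_all)
```

Under these assumptions, the one-to-one mapping is the cheapest for the hearer and the most costly for the speaker, while the single-signal mapping reverses the roles. Empirical word-object mappings would be placed between these extremes by computing the same two quantities.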