Conference papers (Departament de Traducció i Ciències del Llenguatge)
http://hdl.handle.net/10230/16225

Reference bias in monolingual machine translation evaluation
Fomicheva, Marina; Specia, Lucia
http://hdl.handle.net/10230/59399
In the translation industry, human translations are assessed by comparison with the source texts. In the Machine Translation (MT) research community, however, it is common practice to perform quality assessment using a reference translation instead of the source text. In this paper we show that this practice has a serious drawback: annotators are strongly biased by the reference translation provided, and this can negatively affect the assessment of MT quality.
Paper presented at the 54th Annual Meeting of the Association for Computational Linguistics, held August 7-10, 2016 in Berlin, Germany.
2016-01-01

F-to-c-structure mapping: accounting for inflectional morphology and periphrasis
Alsina i Keith, Àlex
http://hdl.handle.net/10230/58989
The treatment of inflectional periphrasis is problematic in LFG, apparently because of the lexicalist nature of the framework. A close inspection of what is usually understood by lexicalism reveals two distinct, but related, notions: lexicalism and lexical encapsulation. Complex inflectional systems show that one can preserve lexicalism (the idea that words and phrases are different in terms of units and rules of composition), but that it is necessary to reject lexical encapsulation (the idea that words are formed without input from syntax). An adequate theory of inflectional morphology needs a framework that is not constrained by lexical encapsulation. With such a framework, it is then possible to give a correct account of inflectional periphrasis. The paper develops the analysis of two periphrastic constructions, one in Latin and one in Catalan, within a non-encapsulated version of LFG.
2023-01-01

Partitivity in Romance and the syntax-morphology connection
Alsina i Keith, Àlex
http://hdl.handle.net/10230/58988
This paper claims that the relationship between morphology and syntax is multidirectional. It argues against the generally accepted position in LFG that word formation feeds the syntax and that syntax cannot feed word formation. The proposal is that the rules of inflectional morphology take f-structure information, together with other information, as their input. The main argument for this claim is provided by the comparative analysis of two Romance languages, one with the partitive affix and one without it. The observation that languages without the partitive affix have null indefinite objects, whereas languages with this affix seemingly do not, follows straightforwardly only if we assume that syntax feeds word formation.
2022-01-01

Unnatural language processing: how do language models handle machine-generated prompts?
Kervadec, Corentin; Franzon, Francesca; Baroni, Marco
http://hdl.handle.net/10230/58560
Language model prompt optimization research has shown that semantically and grammatically well-formed, manually crafted prompts are routinely outperformed by automatically generated token sequences with no apparent meaning or syntactic structure, including sequences of vectors from a model's embedding space. We use machine-generated prompts to probe how models respond to input that is not composed of natural language expressions. We study the behavior of models of different sizes in multiple semantic tasks in response to both continuous and discrete machine-generated prompts, and compare it to the behavior in response to human-generated natural-language prompts. Even when producing a similar output, machine-generated and human prompts trigger different response patterns through the network processing pathways, including different perplexities, different attention and output entropy distributions, and different unit activation profiles. We provide preliminary insight into the nature of the units activated by different prompt types, suggesting that only natural language prompts recruit a genuinely linguistic circuit.
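One of the signals the abstract mentions, perplexity, can be illustrated with a toy stand-in. The bigram model, training corpus, and prompts below are assumptions for illustration only, not the paper's LLM setup; the sketch just shows why a well-formed prompt typically scores lower perplexity than a scrambled one under a model trained on natural text.

```python
import math
from collections import Counter

# Toy stand-in (illustrative assumption, not the paper's models):
# a whitespace-token bigram model with add-one smoothing.
corpus = ("the model responds to prompts . natural language prompts are "
          "well formed . machine generated prompts may lack structure .")
tokens = corpus.split()
bigram_counts = Counter(zip(tokens, tokens[1:]))
unigram_counts = Counter(tokens)
vocab_size = len(set(tokens))

def perplexity(prompt: str) -> float:
    """Add-one-smoothed bigram perplexity of a whitespace-tokenised prompt."""
    words = prompt.split()
    log_prob = 0.0
    for prev, cur in zip(words, words[1:]):
        p = (bigram_counts[(prev, cur)] + 1) / (unigram_counts[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / max(len(words) - 1, 1))

natural = "natural language prompts are well formed"
shuffled = "formed prompts are language natural well"  # same tokens, no syntax

# The well-formed prompt matches the training bigrams and scores lower.
print(perplexity(natural) < perplexity(shuffled))  # True
```

The paper's point is subtler than this sketch: effective machine-generated prompts can have high perplexity and still outperform natural ones on the downstream task, which is precisely why the authors examine internal response patterns rather than output quality alone.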
Paper presented at the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), held in Singapore, December 6-10, 2023.
2023-01-01