Recent Submissions

Can dictionary consultation during reading be improved? (Universitat Politècnica de València, 2025) Rees, Geraint Paul; Frankenberg-Garcia, Ana
While many ELT websites continue to promote contextual guessing as the preferred strategy for understanding unfamiliar words, there is limited evidence that this practice outperforms dictionary use for vocabulary comprehension. Moreover, learners reading from their screens can now instantly access definitions through search bars or clickable text, even without leaving the page they are reading. Despite these lexicographic advances, there is room for better integration of dictionaries with reading. This study examines the effectiveness, efficiency, and usability of Reverso, a new dictionary designed to support reading comprehension. Thirty-two undergraduates in Spain completed a reading-comprehension quiz using either the Reverso Chrome extension or Oxford Learner’s Dictionaries online (chosen as a benchmark). Performance was evaluated through measures of task completion time (efficiency), accurate word meaning selection (effectiveness), and responses to a standardised usability questionnaire. Both tools were effective. However, Reverso users completed tasks more efficiently and rated the tool slightly higher in usability, suggesting it is a less disruptive and more user-friendly aid for vocabulary consultation during reading.
The development and exploratory analysis of the Writing and AI Knowledge Scale (WAIKS) (Universitat Politècnica de València, 2025) Rees, Geraint Paul; Fisher, Rebecca Sarah
Generative AI chatbots powered by large language models (e.g., ChatGPT, Gemini, CoPilot) offer both opportunities and challenges for L2 writers at university. Their ease of use and the apparent fluency of the texts produced make them popular with students. However, this convenience may limit opportunities for deeper processing and learning. Furthermore, AI-generated texts may contain contextual, stylistic, and pragmatic errors; plagiarised content; and implicit biases related to gender, race, and other issues which novice writers struggle to detect. The Writing and AI Knowledge Scale (WAIKS) questionnaire is envisaged as a way for teachers to quickly assess students’ attitudes towards and knowledge of these issues. This study evaluates a WAIKS prototype with 128 L2 English writers (CEFR B2-C2) at a Spanish university. In an exploratory analysis of the prototype, Principal Components Analysis identified three constructs: Academic Appropriateness, Impression, and Intention to Use. Comparisons across language proficiency and academic experience groups suggest that students with higher English proficiency may be more aware of chatbot limitations, while students with more university experience may have greater practical motivation to use chatbots. These tentative results support the construct validity of WAIKS. Further evaluation is needed with a broader range of participants. This will help establish benchmarks for interpreting scores.
Towards a Methodology for the Analysis of Neutralisation in Spanish Subtitling (Tradulex, 2015) Arias Badia, Blanca
Throughout the Spanish-speaking world, people watch films and TV serials originally made in English and other languages, either dubbed or subtitled into Spanish. The wording used for the Spanish subtitles is constrained by space and time. An additional constraint, less widely recognized, is comprehensibility. The research reported in this paper shows that translators tend, possibly unconsciously, to neutralize colourful language present in the source texts. Typically, expressions such as 'He's like a kid on a sugar rush' are neutralized in Spanish to 'Es como un niño lleno de energía' (He's like a kid full of energy). Perhaps in some cases translators do this in order to make the texts more easily comprehensible. No doubt, in other cases, the literal translation of the creative unit in the source text does not work in the target language, or no direct equivalent is readily available. Whatever the reasons, the cumulative effect is that translated texts in subtitles tend to be more neutral and less attention-grabbing than the source texts, and this tendency is sometimes most noticeable at crucial points in the plot development. While the specialized literature has acknowledged the phenomenon of neutralisation in subtitling (Zaro 2001: 59-60; Díaz-Cintas 2003: 286; Bartoll 2012: 158), existing studies have not made it their focus, so it has yet to be described in depth or in a systematic manner. This paper reports a detailed corpus-driven study of neutralisation in three different American crime TV shows, i.e. Dexter (2006), The Mentalist (2008) and Castle (2009). For each episode, the corpus contains the transcript of the source text (ST) and the DVD subtitled version in Castilian Spanish, the target text (TT). A preliminary analysis demonstrated that, in this corpus, neutralisation occurs mainly at the lexical level.
Therefore, the methodology employed in this study involves corpus pattern analysis and the distinction made by Hanks (2004, 2013) between norms and exploitations. The study focuses on metaphorical uses, puns and other creative uses of language. Each sentence of both the ST and the TT is scrutinised in order to detect exploitations that affect the semantics of the discourse. To date, this corpus analysis methodology for the study of lexicon has been applied mainly in the field of lexicography. Adopting it for the study of an audiovisual corpus enables us, on the one hand, to detect where neutralisation takes place in the transfer of content from the ST to the TT; and, on the other, to gather quantitative results about a generalised neutrality of the Spanish subtitles of crime fiction series.
Contexts of language learning: predicting child language by interactive speech in 9 languages (Cascadilla Proceedings Project, 2025) Rüst, Olivier; Baroni, Marco; Stoll, Sabine
A central challenge for the language learning child is to extract the components of their communicative system from the input. While children are known to be excellent statistical learners (e.g., Saffran et al. 1999), less is understood about the structure of the input that supports this learning. Recent research suggests that, beyond input quantity, input quality (Anderson et al. 2021) and especially the linguistic interactions children experience play a crucial role (Feurstein et al. 2022; Donnelly & Kidd 2021; Newman et al. 2016; Weisleder & Fernald 2013; Rowe 2012; Shneidman & Goldin-Meadow 2012; Meredith L. Rowe & Ayoub 2005). This study investigates whether and how caregiver-child interactions facilitate language learning. Specifically, we hypothesize that interactive moments characterized by increased turn-taking between child and caregiver create optimal conditions for learning. This hypothesis is tested across nine typologically diverse languages: Chintang (Sino-Tibetan), Cree (Algic), Indonesian (Austronesian), Japanese (Japonic), Ku Waru (Trans-New Guinea), Russian (Indo-European), Sesotho (Niger-Congo), Turkish (Turkic), and Yucatec (Mayan).
Evil twins are not that evil: qualitative insights into machine-generated prompts (ACL (Association for Computational Linguistics), 2025) Carraz Rakotonirina, Nathanaël; Kervadec, Corentin; Franzon, Francesca; Baroni, Marco
It has been widely observed that language models (LMs) respond in predictable ways to algorithmically generated prompts that are seemingly unintelligible. This is both a sign that we lack a full understanding of how LMs work, and a practical challenge, because opaqueness can be exploited for harmful uses of LMs, such as jailbreaking. We present the first thorough analysis of opaque machine-generated prompts, or autoprompts, pertaining to 6 LMs of different sizes and families. We find that machine-generated prompts are characterized by a last token that is often intelligible and strongly affects the generation. A small but consistent proportion of the previous tokens are prunable, probably appearing in the prompt as a by-product of the fact that the optimization process fixes the number of tokens. The remaining tokens fall into two categories: filler tokens, which can be replaced with semantically unrelated substitutes, and keywords, which tend to have at least a loose semantic relation with the generation, although they do not engage in well-formed syntactic relations with it. Additionally, human experts can reliably identify the most influential tokens in an autoprompt a posteriori, suggesting these prompts are not entirely opaque. Finally, some of the ablations we applied to autoprompts yield similar effects in natural language inputs, suggesting that autoprompts emerge naturally from the way LMs process linguistic inputs in general.
Not a nuisance but a useful heuristic: outlier dimensions favor frequent tokens in language models (ACL (Association for Computational Linguistics), 2025) Macocco, Iuri; Graichen, Nora; Boleda, Gemma; Baroni, Marco
We study last-layer outlier dimensions, i.e. dimensions that display extreme activations for the majority of inputs. We show that outlier dimensions arise in many different modern language models, and trace their function back to the heuristic of constantly predicting frequent words. We further show how a model can block this heuristic when it is not contextually appropriate, by assigning a counterbalancing weight mass to the remaining dimensions, and we investigate which model parameters boost outlier dimensions and when they arise during training. We conclude that outlier dimensions are a specialized mechanism discovered by many distinct models to implement a useful token prediction heuristic.
Documenting the final days of monolingual English learners' dictionaries using the archived web (Lexical Computing CZ s.r.o., 2025) Rees, Geraint Paul
Online dictionaries have many advantages over their physical counterparts. However, the ephemeral nature of web content means that they are often changed without notice and no ostensible record of what came before remains. This makes research on historical online dictionaries difficult and perhaps explains why, while the history of printed monolingual English learners’ dictionaries (MELDs) has been comprehensively explored, studies of online dictionaries have tended to take a cross-sectional rather than longitudinal view. This is not ideal since it means that a large period of MELD history is yet to be explored. Moreover, given recent predictions of the decline of MELDs, as we know them, in light of developments with AI chatbots and other digital tools, this gap is all the more significant. In an attempt to remedy this situation, this study applies Brügger’s (2018) framework for archived web research to explore the feasibility of using the web archive, the Wayback Machine, to trace the development of websites that give, or have given, access to ‘the big five’ MELDs. Some key challenges of using archived web material to conduct lexicographic research are discussed along with suggestions for potential solutions.
Prediction hubs are context-informed frequent tokens in LLMs (ACL (Association for Computational Linguistics), 2025) Nielsen, Beatrix MG; Macocco, Iuri; Baroni, Marco
Hubness, the tendency for a few points to be among the nearest neighbours of a disproportionate number of other points, commonly arises when applying standard distance measures to high-dimensional data, often negatively impacting distance-based analysis. As autoregressive large language models (LLMs) operate on high-dimensional representations, we ask whether they are also affected by hubness. We first prove that the only large-scale representation comparison operation performed by LLMs, namely that between context and unembedding vectors to determine continuation probabilities, is not characterized by the concentration of distances phenomenon that typically causes the appearance of nuisance hubness. We then empirically show that this comparison still leads to a high degree of hubness, but the hubs in this case do not constitute a disturbance. They are rather the result of context-modulated frequent tokens often appearing in the pool of likely candidates for next token prediction. However, when other distances are used to compare LLM representations, we do not have the same theoretical guarantees, and, indeed, we see nuisance hubs appear. There are two main takeaways. First, hubness, while omnipresent in high-dimensional spaces, is not a negative property that needs to be mitigated when LLMs are being used for next token prediction. Second, when comparing representations from LLMs using Euclidean or cosine distance, there is a high risk of nuisance hubs and practitioners should use mitigation techniques if relevant.
Emergence of a high-dimensional abstraction phase in language transformers (International Conference on Learning Representations, 2025) Cheng, Emily; Doimo, Diego; Kervadec, Corentin; Macocco, Iuri; Yu, Lei; Laio, Alessandro; Baroni, Marco
A language model (LM) is a mapping from a linguistic context to an output token. However, much remains to be known about this mapping, including how its geometric properties relate to its function. We take a high-level geometric approach to its analysis, observing, across five pre-trained transformer-based LMs and three input datasets, a distinct phase characterized by high intrinsic dimensionality. During this phase, representations (1) correspond to the first full linguistic abstraction of the input; (2) are the first to viably transfer to downstream tasks; (3) predict each other across different LMs. Moreover, we find that an earlier onset of the phase strongly predicts better language modelling performance. In short, our results suggest that a central high-dimensionality phase underlies core linguistic processing in many common LM architectures.
Unagreement and how morphology sees syntax (Lexical-Functional Grammar, 2024) Alsina i Keith, Àlex
The phenomenon of unagreement, found in Spanish, Catalan, and Greek, among other languages, poses four theoretical problems: 1) how to account for an apparent mismatch between trigger and target in an agreement relation; 2) how to account for the fact that not all languages have this phenomenon; 3) how to account for variation in the NPs that trigger unagreement within a given language and across languages; 4) how to account for the correlation between the presence or absence of unagreement and the type of adnominal pronoun construction (APC) allowed in the language. The analysis assumes a lexicalist unencapsulated view of the relationship between syntax and inflectional morphology, which implies that agreement is a strictly morphological phenomenon. The fundamental idea is that some determiners in some languages do not specify person information. This implies that a phrase headed by such a determiner is compatible with any person feature.
A multilingual annotated corpus for the study of information structure (Narr Francke Attempto Verlag, 2009) Brunetti, Lisa; Bott, Stefan Markus; Costa, Joan; Vallduví, Enric
This paper presents a corpus of spoken narrative texts in Catalan, Italian, Spanish, English, and German. The aim of this corpus compilation is to create an empirical resource for a comparative study of Information Structure. A total of 68 speakers were asked to tell a story in an acoustically isolated room by looking at the pictures of three textless books. A total of 222 narrations resulted in about 16 hours of speech. The recordings have been transcribed and an original annotation of non-canonical constructions for the Romance subgroup has been proposed, namely of morphosyntactically/prosodically marked constructions that relate informational categories such as topic, focus, and contrast. Transcriptions and annotations of some selected high-quality recordings have been aligned to the acoustic signal stream. The corpus is available in audio and text format.
La dislocació del verb en català (Publicacions de l'Abadia de Montserrat, 2001) Vallduví, Enric
The (left) dislocation of arguments and adjuncts is a fairly well-known phenomenon in the syntax of Catalan (e.g. Solà 1991, Vallduví 1992a) and other Romance languages. Verb dislocation, however, is almost entirely unknown in the academic literature, despite being very much alive in all varieties of Catalan. This paper presents examples of verb dislocation and shows its formal equivalence to argument dislocation by means of a series of tests (government of a coreferential element within the clause, placement of peripheral particles), and it notes the differences (resumption by a verbal rather than a pronominal form, morphological shift towards the infinitive form). It also distinguishes verb dislocation from verb phrase fronting, a different structural phenomenon. From a formal point of view, the interaction between verb dislocation and weak pronouns is of interest, since weak pronouns may optionally be copied onto the dislocated infinitive. Finally, the paper analyses the semantic-pragmatic import associated with this configuration, which it shares with the other left dislocations.
Exploring undergraduates’ attitudes towards ChatGPT: is AI resistance constraining the acceptance of chatbot technology? (Springer, 2024) Sánchez Reina, Jesús Roberto; Theophilou, Emily; Hernández Leo, Davinia; Ognibene, Dimitri
The advent of Artificial Intelligence (AI) has revolutionized multiple sectors including education. The popularization of tools such as ChatGPT has sparked debate concerning the impact of AI on traditional education and the nature of learning. This paper explores undergraduate students’ attitudes towards AI and ChatGPT acceptance. A descriptive cross-sectional study with 72 Public Relations students (M age = 19.2 years old) took place in Barcelona (Spain) during the first semester of 2023. The study implemented a mixed-method approach with two validated questionnaires and an open text question to gather comprehensive insights. Findings reveal positive attitudes towards artificial intelligence and ChatGPT acceptance. The assessment of negative perceptions shows concerns regarding artificial intelligence and the use of ChatGPT among participants. The correlational analysis of scales showed an intricate relationship between AI attitudes and ChatGPT acceptance, while the qualitative analysis highlighted three major attitudes among students: openness, awareness, and alertness. The present study contributes to the ongoing discourse surrounding the use of ChatGPT in educational settings, emphasizing the importance of exploring students’ attitudes and concerns. As artificial intelligence continues to permeate various aspects of our daily lives, it becomes crucial to explore its impact on education, particularly in higher education. By understanding students’ attitudes, both educators and institutions can integrate artificial intelligence more efficiently, ensuring a well-balanced approach that maximizes benefits while mitigating potential drawbacks of adopting AI technology.
Generative AI chatbot in PyramidApp: students’ behaviors and design principles (Springer, 2024) Gutiérrez-Ferré, Aldric; Hernández Leo, Davinia; Sánchez Reina, Jesús Roberto
Generative Artificial Intelligence (GenAI) offers new opportunities to implement useful features within Computer Supported Collaborative Learning (CSCL) environments. Despite these growing prospects, there is still limited research concerning the application of GenAI in learning environments. This work in progress aims to evaluate the mediation of a masked GenAI chatbot in the setting of the CSCL web application PyramidApp. A quasi-experimental within-subjects study was designed to assess the effect of GenAI chatbot intervention within the environment of PyramidApp. In the setting of 9 online activities, we evaluated the effect of the GenAI chatbot activity in 105 conversational chat rooms. The findings revealed that the GenAI chatbot provides useful feedback, as students rate the chatbot’s answers higher than their peers’ answers (MChatbot = 4.11, MStudents = 3.91). The presence of the chatbot has an effect on group communication, with message length increasing in chat rooms where the chatbot was present. Moreover, the chatbot’s behavior in rating the students’ answers was correlated with the students’ behavior. The present study offers valuable insights into the optimal strategies for integrating a GenAI Large Language Model into educational tools and computer supported learning.
SemEval-2016 task 5: aspect based sentiment analysis (ACL (Association for Computational Linguistics), 2016) Pontiki, Maria; Galanis, Dimitris; Papageorgiou, Haris; Androutsopoulos, Ion; Manandhar, Suresh; Al-Smadi, Mohammad; Al-Ayyoub, Mahmoud; Zhao, Yanyan; Qin, Bing; De Clercq, Orphée; Hoste, Véronique; Apidianaki, Marianna; Tannier, Xavier; Loukachevitch, Natalia; Kotelnikov, Evgeniy; Bel, Nuria; Bel Rafecas, Núria; Jiménez-Zafra, Salud María; Eryiğit, Gülşen
This paper describes the SemEval 2016 shared task on Aspect Based Sentiment Analysis (ABSA), a continuation of the respective tasks of 2014 and 2015. In its third year, the task provided 19 training and 20 testing datasets for 8 languages and 7 domains, as well as a common evaluation procedure. From these datasets, 25 were for sentence-level and 14 for text-level ABSA; the latter was introduced for the first time as a subtask in SemEval. The task attracted 245 submissions from 29 teams.
MemoryPrompt: a light wrapper to improve context tracking in pre-trained language models (ELRA (European Language Resources Association), 2024) Carraz Rakotonirina, Nathanaël; Baroni, Marco
Transformer-based language models (LMs) track contextual information through large, hard-coded input windows. We introduce MemoryPrompt, a leaner approach in which the LM is complemented by a small auxiliary recurrent network that passes information to the LM by prefixing its regular input with a sequence of vectors, akin to soft prompts, without requiring LM finetuning. Tested on a task designed to probe an LM’s ability to keep track of multiple fact updates, a MemoryPrompt-augmented LM outperforms much larger LMs that have access to the full input history. We also test MemoryPrompt on a long-distance dialogue dataset, where its performance is comparable to that of a model conditioned on the entire conversation history. In both experiments we also observe that, unlike full-finetuning approaches, MemoryPrompt does not suffer from catastrophic forgetting when adapted to new tasks, thus not disrupting the generalist capabilities of the underlying LM.
Rhetorically-based scalar-additivity: the view from Italian addirittura (Cornell University. Department of Linguistics, 2022) Pistoia-Reda, Salvatore; McNally, Louise, 1965-
Even-like particles have widely been analyzed as inducing scalar and additive presuppositions (cf. Horn 1969; Karttunen & Peters 1979; Rooth 1992; Gast & van der Auwera 2011). However, the additivity of even has been controversial since at least Rullmann 1997 and increasingly called into question (see Greenberg & Umbach 2021 for references); Greenberg specifically argues that scalar even-like particles can vary in additivity. This claim is surprising in light of the typological study in Gast & van der Auwera 2011, which subsumes even and similar expressions under a larger class of additive particles. Against this background, we present an analysis of Italian addirittura, which with perfino has been described as scalar-additive (Visconti 2005) – but only optionally so – and is chosen preferentially over perfino precisely in those contexts that Greenberg takes to challenge the additivity of even. We argue, drawing on observations in Atayan 2017, that addirittura contrasts with perfino in deriving its scalar alternatives from rhetorical structure rather than focus structure. Once this is recognized we can view addirittura as additive, after all, in a rhetorical sense we describe below.
Reference bias in monolingual machine translation evaluation (ACL (Association for Computational Linguistics), 2016) Fomicheva, Marina; Specia, Lucia
In the translation industry, human translations are assessed by comparison with the source texts. In the Machine Translation (MT) research community, however, it is a common practice to perform quality assessment using a reference translation instead of the source text. In this paper we show that this practice has a serious issue – annotators are strongly biased by the reference translation provided, and this can have a negative impact on the assessment of MT quality.
F-to-c-structure mapping: accounting for inflectional morphology and periphrasis (CSLI Publications, 2023) Alsina i Keith, Àlex
The treatment of inflectional periphrasis is problematic in LFG, apparently because of the lexicalist nature of the framework. A close inspection of what is usually understood by lexicalism reveals two distinct, but related, notions: lexicalism and lexical encapsulation. Complex inflectional systems show that one can preserve lexicalism (the idea that words and phrases are different in terms of units and rules of composition), but that it is necessary to reject lexical encapsulation (the idea that words are formed without input from syntax). An adequate theory of inflectional morphology needs a framework that is not constrained by lexical encapsulation. With such a framework, it is then possible to give a correct account of inflectional periphrasis. The paper develops the analysis of two periphrastic constructions, one in Latin and one in Catalan, within a non-encapsulated version of LFG.
Partitivity in romance and the syntax-morphology connection (CSLI Publications, 2022) Alsina i Keith, Àlex
This paper claims that the relationship between morphology and syntax is multidirectional. It argues against the generally accepted position in LFG that word formation feeds the syntax and that syntax cannot feed word formation. The proposal is that the rules of inflectional morphology take f-structure information, together with other information, as their input. The main argument for this claim is provided by the comparative analysis of two Romance languages, one with the partitive affix and one without it. The observation that languages without the partitive affix have null indefinite objects, whereas languages with this affix seemingly do not, follows straightforwardly only if we assume that syntax feeds word formation.
