Recent Submissions

Towards caring institutional analytics: a case study illustrating the exploration of language biases (CIDUI, 2025). Marques, Francielle; Hernández Leo, Davinia; Castillo, Carlos; Trenchs i Parera, Mireia. Our study focuses on institutional analytics in higher education decision-making and the need to consider contextual biases. We contribute a case study at a university in Catalonia using (reliable) student grades and satisfaction data from 22 professors who teach at least one course in Catalan and at least one course in English to identify language-related biases or interrelationships. Our study sheds light on the relevance of this perspective and its implications.
AI-mediated scaffolding for student agency and peer collaboration (2026). El Aadmi-Laamech, Khadija; Santos Rodríguez, Patrícia; Fàbrega, Xènia. In early childhood contexts, agency refers to the capacity to influence what and how students learn, closely linked to competence, autonomy, reflexivity, and purpose (Varpanen, 2019; Pantić, 2017). Fostering this capacity has been shown to yield various benefits, such as heightened motivation, self-regulation, well-being, and participation in the classroom (Baker et al., 2023; Sirkko et al., 2019; Reunamo, 2007). Agency is described as a natural driver in humans (Shogren et al., 2017) according to the Self-Determination Theory (SDT) (Ryan & Deci, 2017). Specifically, the SDT connects this agentic ability to the Basic Psychological Needs (BPNs) (Shogren et al., 2015): autonomy, competence and relatedness. In the corpus of work by Wehmeyer (2004), when presenting the Causal Agency Theory, self-determination is described as “acting as the primary causal agent in one’s life and making choices and decisions regarding one’s quality of life free from undue external influence or interference” (Wehmeyer, 2004; Wehmeyer, 2001; Wehmeyer, 1996). Meeting these basic needs fosters agentic potential by providing the essential scaffolding, namely the context and competencies required for agentic action. This work-in-progress examines how Generative Artificial Intelligence (GenAI) can be used, in the context of collaborative practices, to scaffold students’ agency through the lens of the SDT.
Exploring morphology-aware tokenization: a case study on Spanish language modeling (ACL (Association for Computational Linguistics), 2025). Táboas García, Alba; Przybyła, Piotr; Wanner, Leo. This paper investigates to what extent the integration of morphological information can improve subword tokenization and thus also language modeling performance. We focus on Spanish, a language with fusional morphology, where subword segmentation can benefit from linguistic structure. Instead of relying on purely data-driven strategies like Byte Pair Encoding (BPE), we explore a linguistically grounded approach: training a tokenizer on morphologically segmented data. To do so, we develop a semi-supervised segmentation model for Spanish, building gold-standard datasets to guide and evaluate it. We then use this tokenizer to pre-train a masked language model and assess its performance on several downstream tasks. Our results show improvements over a baseline with a standard tokenizer, supporting our hypothesis that morphology-aware tokenization offers a viable and principled alternative for improving language modeling.
Countering disinformation by finding reliable sources: a citation-based approach (Institute of Electrical and Electronics Engineers (IEEE), 2022). Przybyła, Piotr; Borkowski, Piotr; Kaczyński, Konrad. We propose a new task aimed at countering dis- and misinformation, called Finding Reliable Sources. Given a one-sentence claim, the challenge is to automatically find a knowledge source (e.g. a book, a research article, a web page) that could support or refute the claim. We show that this capability could be learnt by observing associations between sentences in English Wikipedia and citations provided for them. Thus, we collect a corpus of over 50 million references to 24 million identified sources with the citation context from Wikipedia, and build search indices using several meaning representation methods. For evaluation, apart from the Wikipedia corpus, we prepare another test set based on the FEVER fact-checking dataset.
Attacking misinformation detection using adversarial examples generated by language models (ACL (Association for Computational Linguistics), 2025). Przybyła, Piotr; McGill, Euan; Saggion, Horacio. Large language models have many beneficial applications, but can they also be used to attack content-filtering algorithms in social media platforms? We investigate the challenge of generating adversarial examples to test the robustness of text classification algorithms detecting low-credibility content, including propaganda, false claims, rumours and hyperpartisan news. We focus on simulation of content moderation by setting realistic limits on the number of queries an attacker is allowed to attempt. Within our solution (TREPAT), initial rephrasings are generated by large language models with prompts inspired by meaning-preserving NLP tasks, such as text simplification and style transfer. Subsequently, these modifications are decomposed into small changes, applied through a beam search procedure, until the victim classifier changes its decision. We perform quantitative evaluation using various prompts, models and query limits, targeted manual assessment of the generated text and qualitative linguistic analysis. The results confirm the superiority of our approach in the constrained scenario, especially in the case of long input text (news articles), where exhaustive search is not feasible.
Capturing the style of fake news (AAAI Press, 2020). Przybyla, Piotr. In this study we aim to explore automatic methods that can detect online documents of low credibility, especially fake news, based on the style they are written in. We show that general-purpose text classifiers, despite seemingly good performance when evaluated simplistically, in fact overfit to sources of documents in training data. In order to achieve a truly style-based prediction, we gather a corpus of 103,219 documents from 223 online sources labelled by media experts, devise realistic evaluation scenarios and design two new classifiers: a neural network and a model based on stylometric features. The evaluation shows that the proposed classifiers maintain high accuracy in the case of documents on previously unseen topics (e.g. new events) and from previously unseen sources (e.g. emerging news websites). An analysis of the stylometric model indicates it indeed focuses on sensational and affective vocabulary, known to be typical of fake news.
Meta-research on Artificial Intelligence in research practices: exploring the impact of Artificial Intelligence in scientific research (Universitat Pompeu Fabra, 2025). Hernández Leo, Davinia; Amarasinghe, Ishari. Artificial intelligence (AI) is rapidly expanding, and its presence is increasingly visible across many domains of society. With generative systems capable of producing text, images, and code, these developments are reshaping how new knowledge is produced, communicated, and evaluated. Within the context of scientific research, AI tools are no longer limited to peripheral support tasks; instead, they are becoming integral to workflows related to literature search, data processing, analysis, writing, and even peer review. While these technologies offer opportunities such as enhanced efficiency and accessibility, they also raise critical questions about transparency, bias, and academic integrity. Within this context, the Meta-Research Conference 2025 (MERE 2025) brings together contributions that examine how AI influences research workflows at present. The papers presented in this volume explore both the potential and the limitations of AI-supported research, addressing questions such as how AI tools reshape information seeking and analytical practices, as well as what methodological challenges and biases emerge when AI systems are integrated into research. The MERE Conference is an academic initiative of the Research Methods course in the Master’s programs at the Department of Information and Communication Technologies, Universitat Pompeu Fabra. As part of the Meta-Research Project, students engage in the full research cycle, from formulating research questions to data collection and analysis, interpretation, and academic writing. This proceedings volume includes a collection of manuscripts whose authors have agreed to share their work and findings.
We are pleased to present these contributions and acknowledge the effort, curiosity, and critical engagement of the students involved.
Recovering the history of Bergen Belsen using an interactive 3D reconstruction in a mixed reality space: the role of pre-knowledge on memory recollection (Institute of Electrical and Electronics Engineers (IEEE), 2015). Oliva, Laura S; Mura, Anna; Betella, Alberto; Pacheco Estefan, Daniel; Martínez Bueno, Enrique; Verschure, Paul F. M. J. The question addressed by our work is twofold: on the one hand, we want to contribute to the preservation of the Holocaust cultural heritage using digital technology; on the other hand, we want to investigate the impact of pre-knowledge on context information when this information is acquired in a virtual environment. Specifically, we wanted to investigate the user experience when factual or emotional information is given prior to exposure to a virtual environment showing historical information, in this case related to the Holocaust. We developed a 3D reconstruction of the delousing building of the Bergen Belsen concentration camp and deployed it in an interactive mixed reality space. Here the user was engaged in a guided tour of the delousing building and was exposed to factual information on the configuration of the building and its history through pictures and a narrating voice. The results of our study show that prior knowledge (i.e., emotional vs. factual) affects memory recollection and thus our ability to retain relevant information. The outcome of our study supports the usefulness of digital and interactive technologies as a tool to recover and preserve cultural heritage.
BrainX3: a virtual reality tool for neurosurgical intervention in epilepsy (Springer, 2017). Pacheco Estefan, Daniel; Zucca, Riccardo; Arsiwalla, Xerxes D.; Dalmazzo, David; Principe, Alessandro; Rocamora Zúñiga, Rodrigo Alberto; Verschure, Paul F. M. J. Localizing functional regions of the cortex and deep-brain areas in epileptic patients is an important pre-surgical procedure prior to resection. This is often achieved using intra-cortical electrodes for both stimulation and recordings of Local Field Potentials (LFPs), concurrent with behavioral tasks related to language or motor function. To assist and improve accuracy in the identification of relevant brain networks associated with core cognitive functions, we present a novel interactive virtual reality system that aids in 3D visualization of precise electrode placement within cortical and sub-cortical regions of the patient and facilitates dynamical functional connectivity analysis of the aforementioned networks. The system is an extension of the BrainX3 tool [1][2][3], developed at the Laboratory for the Synthetic, Perceptive, Emotive and Cognitive Systems (SPECS).
Two dimensional shapes for emotional interfaces: Assessing the influence of angles, curvature, symmetry and movement (International Academy, Research and Industry Association, 2015). Pacheco Estefan, Daniel; Le Groux, Sylvain; Verschure, Paul F. M. J. Recent investigations aiming to identify which are the most influential parameters of graphical representations on human emotion have presented mixed results. In this study, we manipulated four emotionally relevant geometric and kinematic characteristics of non-symbolic bidimensional shapes and animations, and evaluated their specific influence on the affective state of human observers. The controlled modification of basic geometric and kinematic features of such shapes (i.e., angles, curvature, symmetry and motion) led to the generation of a variety of forms and animations that elicited significantly different self-reported affective states in the axes of valence and arousal. Curved shapes evoked more positive and less arousing emotional states than edgy shapes, while figures translating slowly were perceived as less arousing and more positive than those translating fast. In addition, we found significant interactions between angles and curvature factors both in the valence and the arousal scales. Our results constitute a direct proof of the efficacy of abstract, non-symbolic shapes and animations to evoke emotion in a parameterized way, and can be generalized for the development of real-time, emotionally aware user interfaces.
The Freesound API: Advances in Audio Search and Retrieval (Web Audio Conference WAC, 2025). Anastasopoulou, Panagiota; Porter, Alastair; Font Corbera, Frederic. This paper showcases the latest advances of the Freesound API, focusing on its updated features for improved access and exploration of community audio content shared on the web. Important advancements include restructured metadata retrieval, integration of audio analysis descriptors for richer, content-based search, and further reliance on Solr search technology for both efficient metadata-based querying and vector search. The API introduces a streamlined query system to combine metadata and content-based queries, new similarity spaces that enable more precise sound retrieval, and a unified, easy naming scheme for all audio analysis descriptors. This system is designed to support a range of creative applications, including live performance, browser-based composition tools, educational uses, and sound exploration in general. We overview the new features, demonstrating how users can leverage the API to build more sophisticated audio applications and workflows, and how online sound collections can be utilized for accessible and flexible creative work.
A temporal estimate of integrated information for intracranial functional connectivity (Institute of Electrical and Electronics Engineers (IEEE), 2018). Arsiwalla, Xerxes D.; Pacheco Estefan, Daniel; Principe, Alessandro; Rocamora Zúñiga, Rodrigo Alberto; Verschure, Paul F. M. J. A major challenge in computational and systems neuroscience concerns the quantification of information processing at various scales of the brain's anatomy. In particular, using human intracranial recordings, the question we ask in this paper is: How can we estimate the informational complexity of the brain given the complex temporal nature of its dynamics? To address this, we work with a recent formulation of network integrated information that is based on the Kullback-Leibler divergence between the multivariate distribution on the set of network states versus the corresponding factorized distribution over its parts. In this work, we extend this formulation to temporal networks and then apply it to human brain data obtained from intracranial recordings in epilepsy patients. Our findings show that compared to random re-wirings of the data, functional connectivity networks, constructed from human brain data, score consistently higher in the above measure of integrated information. This work suggests that temporal integrated information may indeed be a good starting point as a future measure of cognitive complexity.
Oscillatory dynamics of active learning in the human brain (Cognitive Computational Neuroscience (CCN), 2019). Pacheco Estefan, Daniel; Zucca, Riccardo; Arsiwalla, Xerxes D.; Principe, Alessandro; Rocamora Zúñiga, Rodrigo Alberto; Axmacher, Nikolai; Verschure, Paul F. M. J. While the benefits of self-directed learning on human memory are well-acknowledged, little is known about its underlying neurophysiological substrate. Here, we investigated the key signatures of volitional learning in the brain as assessed by representational similarity analysis applied to human intracranial EEG (iEEG) data. Epilepsy patients performed an episodic memory task during virtual navigation which tests differences in recognition memory for self-directed versus passive learning. Consistent with previous literature, higher recognition accuracy was observed for items studied in active as opposed to passive movement conditions at the behavioral level. In addition, we demonstrate a critical role of hippocampal low-frequency oscillations for active learning. This is observed in 1) increased hippocampal 2-6 Hz power for active versus passive information sampling and 2) significantly greater encoding-retrieval similarity (ERS) for volitional as compared to passive conditions in the first second after cue onset at retrieval. Follow-up analyses will address the contribution of activity at different frequencies for item-specific ERS and volitional versus passive learning. Together, these results offer a first perspective on the key oscillatory mechanisms underlying volitional learning in the human brain.
A location-based augmented reality system for the spatial interaction with historical datasets (Institute of Electrical and Electronics Engineers (IEEE), 2015). Pacheco Estefan, Daniel; Wierenga, Sytse; Omedas, Pedro; Oliva, Laura S; Wilbricht, Stefan; Billib, Stephanie; Knoch, Habbo; Verschure, Paul F. M. J. The key role that space and spatial organization of content play in memory has been taken very little into account in the design of human-data interaction systems. Here, we present a location-based Augmented Reality application for the exploration and visualization of historical files, which is based on the argument that the embodied interaction with content by moving in the real, physical space will enhance its recollection from memory and comprehension. Our software architecture integrates a historical 3D reconstruction with georeferenced historical documents, as well as specific guidance components for narrative generation. All content of the application database is spatialized and can be navigated in a completely free/exploratory mode or in a passive/guided mode. We present the results of an experiment comparing spatial memory performance in the two modes. Our data confirm previous findings in the spatial navigation literature, suggesting that active exploration of an environment leads to a better spatial understanding of it.
Beyond passive audiences: children’s agency in media literacy research (Il Capoverso, 2025). Sánchez Reina, Jesús Roberto. One of the most salient contemporary debates in media research concerns the role of child audiences. The enduring legacy of media-centric approaches has contributed to the construction of childhood as marginalized or invisible, frequently framing young audiences as passive recipients (mere vessels or sponges). This perspective has often eclipsed essential questions such as: What do children do with the media and information they consume? What meanings and interpretations do they construct from content that is frequently presumed to “affect” them? (Buckingham, 2013; Kellner & Share, 2007; Livingstone, 2004). This paper analyzes the opportunities and challenges of conducting fieldwork with children, framed by the principles of the sociology of childhood. The study highlights key indicators that underscore the relevance of this perspective in our research and critically reflects on the practical limitations encountered in studying young audiences.
Navigating the infodemic: what can we learn from the COVID-19 crisis? (Il Capoverso, 2025). Sánchez Reina, Jesús Roberto; González-Lara, Ericka-Fernanda. In times of uncertainty, an excess of information can become a significant source of psychological distress. The COVID-19 pandemic thrust millions into a whirlwind of fear and confusion, in which access to reliable information offered a sense of security against invisible threats (Gao et al., 2020; Garfin et al., 2020). However, amid the urgent search for clarity, a parallel crisis quietly unfolded: a dramatic increase in global information consumption, often marked by the spread of misinformation and fake news (Cinelli et al., 2020; Masip et al., 2020; Ramírez et al., 2020). In this context, understanding how individuals sought, interpreted, and placed trust in information offers valuable lessons for addressing future global crises and developing effective strategies to combat infodemics.
Leveraging melodic context for improved Svara representation (Laboratory PRISM, 2025). Nuttall, Thomas; Vijayan, Vivek; Serra, Xavier; Pearson, Lara. For the South Indian musical tradition known as Carnatic music, embeddings of svara (note) pitch time series have proven useful for tasks such as svara classification and performance analysis. In this paper, we extend an existing embedding method by incorporating findings from musicological research on the relationship between the performance of a svara and its immediate melodic context, in order to improve the learning of these embedding models. We present a context-aware GRU-based model, adapting the existing DeepGRU architecture to encode both svara and its surrounding melodic context, before combining them via a co-attention mechanism prior to classification. For a ground truth dataset of 2,077 expert svara annotations across two performances in rāga Bhairavi, we observe that the inclusion of melodic context leads to a 6.6% absolute increase in F1 score for svara label classification (from 78.3% to 84.9%), and a 7.8% absolute increase (from 59.9% to 67.7%) for classification of svara-form: sub-svara clusters that capture gamaka (ornamentation) variations in the performed svara.
Groombuster: a narrative mini game to educate teenagers about online grooming (Institute of Electrical and Electronics Engineers (IEEE), 2025). Mateo-Gorina, Paula; Theophilou, Emily; Sánchez Reina, Jesús Roberto; Hernández Leo, Davinia. Social media use among teenagers has surged in recent years, heightening their exposure to risks such as online grooming, a process where adults build online relationships with minors for abusive purposes. To address this threat, it is essential to explore innovative, technological approaches for educating adolescents. Therefore, this study presents the development of a narrative-based mini-game to help educate adolescents about online grooming. A pilot study with 25 adolescents and young adults evaluated the mini-game's effectiveness in assessing their understanding of grooming and detection skills through a questionnaire and their in-game behavior. The results demonstrate that the mini-game significantly improved participants' knowledge of grooming and their ability to detect grooming behaviors, as evidenced by enhanced performance throughout the game.
Social interactions and online engagement in CSCL environments: examining a measurement scale (Springer, 2025). Sánchez Reina, Jesús Roberto; Theophilou, Emily; Hernández Leo, Davinia. Social interactions are key to promoting effective coordination and participation in Computer-Supported Collaborative Learning (CSCL) environments. While current research examines how structural and technological features can enhance collaboration, there is still a need for validated instruments to assess learners' social interaction and online engagement during online collaborative activities. This study presents an exploratory validation of a scale aimed at assessing learners’ perceptions of social interaction and online engagement after CSCL activities. The scale was piloted within a questionnaire assessing online learning and collaboration, administered in the context of a standard Higher Education course. Data were collected from 68 undergraduate students who completed a synchronous task in the environment of the CSCL tool PyramidApp. The exploratory factor analysis showed three interpretable factors that mediate social interaction in the tool: Cognitive Engagement, Social Engagement and Perception Value of the Experience, with satisfactory internal consistency for each subscale. Further research is needed to confirm the factor structure.
SoccerHigh: a benchmark dataset for automatic soccer video summarization (ACM (Association for Computing Machinery), 2025). Díaz-Juan, Artur; Ballester, Coloma; Haro Ortega, Gloria. Video summarization aims to extract key shots from longer videos to produce concise and informative summaries. One of its most common applications is in sports, where highlight reels capture the most important moments of a game, along with notable reactions and specific contextual events. Automatic summary generation can support video editors in the sports media industry by reducing the time and effort required to identify key segments. However, the lack of publicly available datasets poses a challenge in developing robust models for sports highlight generation. In this paper, we address this gap by introducing a curated dataset for soccer video summarization, designed to serve as a benchmark for the task. The dataset includes shot boundaries for 237 matches from the Spanish, French, and Italian leagues, using broadcast footage sourced from the SoccerNet dataset. Alongside the dataset, we propose a baseline model specifically designed for this task, which achieves an F1 score of 0.3956 on the test set. Furthermore, we propose a new metric constrained by the length of each target summary, enabling a more objective evaluation of the generated content. The dataset and code are available at https://ipcv.github.io/SoccerHigh/.
