Recent submissions

  • Open Access. Item type: Item
    Predictive modeling for sales forecasting: a time series analysis of Molaris biotech product sales
    (2025-11-12) Pericot Masdevall, Pere; Ortiz de Pazos, Álvaro; Mirabent Rubinat, Guillem
    Accurate sales forecasting is essential for effective inventory planning, resource allocation, and customer service. This thesis evaluates machine learning methods for predicting sales using Molaris Biotech data from January 2014 to April 2024. We compare the performance of ElasticNet, Random Forest, XGBoost, and neural networks. Results show that ElasticNet and Random Forest deliver the highest performance, reducing MAPE by up to 30% relative to the firm’s current approach. These findings highlight the value of predictive analytics for improving operational efficiency and strengthening competitive positioning in the retail sector.
  • Open Access. Item type: Item
    Verifiable report generation: a GraphRAG approach to grounded security analysis
    (2025-07-04) Jiménez, Blanca; Fernández, Pablo; Chernavskaia, Anastasiia
    The United Nations places community protection at the core of its humanitarian mission, requiring resource deployment in volatile, high-risk areas. We present a tool leveraging up-to-date conflict and political event data to generate comprehensive country reports. The automated pipeline includes data ingestion, Knowledge Graph construction, GraphRAG-based report generation, and self-evaluation using large language models. To address traditional RAG limitations in processing complex geopolitical queries, we adopt GraphRAG, integrating knowledge graph structures into retrieval processes. This improves precision by leveraging entity and relationship awareness, enabling accurate, explainable analyses. GraphRAG enhances synthesis from diverse sources (conflict data, humanitarian reporting, and socio-political indicators), resulting in actionable assessments.
  • Open Access. Item type: Item
    Folk around and find out: algorithmic collusion and the limits of coordination
    (2025-07-04) Peist, Moritz; Romero, Julián; Sauer, Lucia
    The Folk Theorem establishes that collusion can be sustained in repeated interactions, yet empirical evidence suggests coordination becomes more difficult as market participants increase. This thesis presents the first test of whether Large Language Model (LLM) agents exhibit this pattern. In controlled experiments with 2–5 competing agents, we find LLM coordination erodes predictably with competition. Our results show a 3.7% reduction in equilibrium price for each additional firm (p < 0.001), with prices declining smoothly. This culminates in a 10.6% total price reduction from duopoly to five-agent markets, providing quantitative evidence on algorithmic collusion boundaries in the AI era.
  • Open Access. Item type: Item
    Synthetic data generation with denoising diffusion probabilistic models for data augmentation in data-limited satellite image classification
    (2025-06) Gómez Argüelles, Gerardo; Tausendschön, Oliver; Cassel, Timothy
    Data augmentation is essential for improving deep learning performance with limited data. This thesis examines whether class-conditional Denoising Diffusion Probabilistic Models (DDPMs) can enhance satellite image classification on the EuroSAT dataset. Using a U-Net-based DDPM, we generated synthetic images for ten land cover classes and evaluated ResNet-18 with different real-to-synthetic ratios. Results show that geometric transformations consistently outperform synthetic data, which often degrades performance, especially at higher proportions. However, hybrid approaches improved specific classes, such as AnnualCrop (+2.65 points). Overall, geometric augmentation remains most effective, though class-dependent synthetic strategies show potential for targeted enhancement.
  • Open Access. Item type: Item
    Bayesian bandits for algorithm selection: latent-state modeling and spatial reward structures
    (2025-06-04) Ernst, Marvin Michel; Gelabert Cortés, Oriol; Vadenja, Melisa
    This thesis extends the classical Multi-Armed Bandit (MAB) framework to dynamic and spatial environments. In dynamic settings, Bayesian latent-state models with Thompson Sampling and UCB are evaluated for their ability to adapt to non-stationary rewards, with comparisons to simpler autoregressive (AR) models. For spatially structured problems, Gaussian Process (GP) and Lipschitz bandits are used to exploit correlations between arms. Algorithms such as GP-UCB and Zoom-In demonstrate improved learning efficiency. Empirical results highlight the benefits of modeling temporal and spatial structure, while also emphasizing the computational trade-offs compared to classical, more tractable bandit algorithms.
  • Open Access. Item type: Item
    Rostering optimization for schedule stability in last-mile delivery
    (2024-06) Bramwell-Codd, Jonny; Mottet, Clarice; Pérez Ricardo, Carlos
    The last-mile delivery sector has experienced rapid growth in recent years, with the irregular and unpredictable work schedules of couriers frequently leading to high employee turnover and dissatisfaction. Building on the scheduling model proposed by Mandal, Santini, and Archetti (2024), this thesis provides a mathematical model for optimizing workforce size and rostering couriers to a week-long schedule with a focus on shift stability and employee welfare. Our results indicate a significant cost increase associated with rostering; however, implementing flexible shift patterns that better align with demand forecasts can reduce this to some extent.
  • Open Access. Item type: Item
    Comparison of vision transformers and convolution neural networks
    (2024-06-09) Belka, Caroline; Chen, Joshua; Wallstein, Jonas
    This thesis explores the differences between Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) to understand how these architectures perceive and learn from images. We first provide an in-depth explanation and comparative literature review of CNNs and ViTs. We then investigate how both models adapt to classifying satellite images and rotated scene images, evaluated in terms of rotational invariance and learned representations using Centered Kernel Alignment (CKA). ViTs demonstrated better performance and stability, which we attribute to their ability to integrate global information through self-attention mechanisms, while CNNs showed more variation due to their hierarchical feature learning and local receptive fields.
  • Open Access. Item type: Item
    War through the lens of AI: image analysis of the Russia-Ukraine war in Spanish news
    (2024-07) Di Gianvito, Angelo; Gatland, Oliver; Yuzkiv, Viktoriia
    This study aims to analyse the evolution of visual reporting on the Russia-Ukraine war in Spanish news broadcasts. We investigate how the depiction of the war changed from December 2022 to April 2024, focusing on the evolution of war coverage and on-the-ground war imagery. To achieve this, we use a subset of over 10,000 manually labelled screenshots from news broadcasts covering the Russia-Ukraine war, distinguishing between war-related and non-war-related content. Using a fine-tuned ResNet50 model, we track how the imagery of the war evolved over time, finding that the fluctuations in war images do not strongly correlate with actual events and military actions, suggesting a divergence between media representation and reality.
  • Open Access. Item type: Item
    How to generate new versions of an original character?: an application of LoRA and DreamBooth fine-tuning of diffusion models
    (2024-07) Boudier, Maëlys; Beltrán, Natalia; Michelangelo, Arianna
    This research explores the use of LoRA (Low-Rank Adaptation) and DreamBooth fine-tuning techniques on Stable Diffusion models to generate new versions of an original comic book character. By addressing the challenge of data scarcity, these techniques enable the creation of varied character poses from limited images, significantly enhancing efficiency in the comic creation process. The study demonstrates that fine-tuning pre-trained models with minimal computational resources can produce high-quality, consistent character images, reducing repetitive tasks for artists. This research paves the way for integrating AI into comic book creation, allowing artists greater creative freedom and productivity.
  • Open Access. Item type: Item
    Harnessing big data news media for conflict prediction and anticipatory decision-making
    (2023-07) Chaves, Giovanna; Philipp, Margherita; Quiñones, Luis
    Advances in data and computing techniques have opened possibilities for real-time and cost-efficient conflict prediction and early warning capabilities, with news-based data being utilized to generate relevant forecasts. This Master’s thesis explores the use of big data news media for conflict prediction and anticipatory decision-making, with a focus on harnessing the Global Database of Events, Language and Tone (GDELT). We investigate the effectiveness of using GDELT events to predict conflict at the country level by extracting relevant features and comparing the performance of text-based models with different target definitions and time horizons. The results show that GDELT-based features perform well in conflict prediction, particularly in tree-based and LSTM models, indicating the value of using text data for capturing patterns and providing insights into potential conflict events.
  • Open Access. Item type: Item
    Forecasting global refugee flows: a machine learning approach using non-conventional data
    (2023-08-18) De los Santos, Daniela; Frey, Eric; Vassallo, Renato
    This study presents a novel forecasting framework for global refugee flows, incorporating non-conventional data sources such as Google Trends, the GDELT project event dataset, and conflict forecasts, among others. Our main objective is to generate accurate predictions for the number of new refugee arrivals per country pair, in order to help facilitate effective humanitarian response. We develop a comprehensive global model which predicts refugee outflows and country-pair flows separately. Our results reveal a significant improvement in prediction accuracy by augmenting traditional variables with non-conventional data, with Random Forest and Gradient Boosting as effective regressors.
  • Open Access. Item type: Item
    Exploring user retention in enhance VR: a comprehensive analysis using predictive models and clustering
    (2023-07) Odizzio, Catalina; Pissinis, Agostina
    This study examines and predicts user engagement in Enhance VR, a virtual reality cognitive training application, through data-driven approaches. The dataset encompasses de-identified user data including demographic characteristics, mood, and session-related variables. Initial data exploration involves descriptive statistics, data visualization, and inferential statistics, assessing correlations between attributes and their effects on engagement and performance. Machine learning models including Random Forests and Gradient Boosting are developed to predict user engagement levels. K-Prototypes clustering is employed for segmentation, identifying distinct user groups based on behavioral and demographic attributes. This research informs the strategic design and content delivery of Enhance VR by identifying distinct user groups and predicting engagement patterns.
  • Open Access. Item type: Item
    Leveraging satellite imagery to assess road quality in the Democratic Republic of the Congo
    (2023-06) Conner Bonmatí, Miguel; Talvi Robledo, Ramón; Wielath, Dominik Johannes
    We attempt to build a road quality classifier to detect bad roads using satellite imagery in the province of Sud-Kivu in the Democratic Republic of the Congo (DRC). Using 60 cm/pixel resolution imagery from Google Earth, paired with 100 m IRI road quality data for Liberia, we train a CNN (EfficientNetV2) that achieves an accuracy of 47% for 5 classes and 80% for 2 classes (AUC: 0.75). We then establish a connection between the model trained in Liberia and road quality in the DRC. We find that our methods seem to work well given the many limitations of the project.
  • Open Access. Item type: Item
    Corpus construction and social media analysis about immigration in Chile
    (2022-06) Couble, Andrés; Schindler, Mathias; Stassinos, Kalliope
    This thesis presents a general-purpose corpus construction methodology with Twitter data for a given political topic in a given country. It applies the methodology to immigration in Chile from November 2021 to April 2022, resulting in a corpus with 573,999 tweets. Our results indicate increasing anti-immigration views from Chilean Twitter users. Right-leaning users are more active and more anti-immigration. Left-leaning users are mostly concerned with xenophobia and racism. Utilizing network analysis methods, we find that right-leaning users are also more influential and interconnected. The results are consistent with previous studies and the methodology is robust to other political topics such as feminism.
  • Open Access. Item type: Item
    Risk detection in cryptocurrency markets: meeting the needs of traditional finance
    (2022-05) Aguilar, Iván; Jones, Rebecca; Lovicu, Gian-Piero
    Financial institutions are beginning to integrate cryptocurrencies into their payment systems but must ensure they comply with anti-money laundering regulations to avoid facilitating transactions linked to criminal activities. We propose a cryptocurrency risk detection model that could be used by these institutions. It is novel in two ways: firstly, it prioritises high recall, and secondly, it organises the transaction data in a different 'address-level' manner. We test different Graph Neural Network (GNN) models and find that the Graph Attention Network using our address-level data achieves a recall of 83%, an improvement on results achieved in previous literature.
  • Open Access. Item type: Item
    Bayesian optimization with uncertainty aware neural networks
    (2022-05) Ampudia, David; Leung, Clinton
    Bayesian optimization has emerged as an effective and efficient approach for finding the global optimum of highly complex derivative-free black-box functions. It typically models the objective function with Gaussian processes (GPs) as a surrogate. Based on this surrogate, an auxiliary acquisition function proposes candidate optima locations at which to query the objective function. In this paper, we explore recent developments that may help alleviate two key limitations of GPs: poor performance with large datasets, and non-stationary target functions. To this end, we propose and implement several scalable uncertainty-aware neural networks as alternative surrogates. In a series of tests, we showcase the relative performance of ensembles, Bayesian, and direct estimation neural network approaches against that of traditional GPs and state-of-the-art Sparse Variational Gaussian Processes (SVGPs) in Bayesian optimization settings. Our results show that not only are neural networks a scalable solution with comparable performance to GPs, but they also hold the potential to outperform SVGPs.
  • Open Access. Item type: Item
    Deep vector autoregression for macroeconomic data
    (2021-07-20) Agustí, Marc; Altmeyer, Patrick; Vidal-Quadras, Ignacio
    Vector autoregression (VAR) models are a popular choice for forecasting macroeconomic time series data. Due to their simplicity and success at modelling monetary economic indicators, VARs have become a standard tool for central bankers to construct economic forecasts. In light of the recent advancements in computational power and the development of advanced machine learning and deep learning algorithms, we propose a simple way to integrate these tools into the VAR framework. This paper aims to contribute to the time series literature by introducing a ground-breaking methodology which we refer to as Deep Vector Autoregression (Deep VAR). By fitting each equation of the VAR system with a deep neural network, the Deep VAR outperforms the VAR in terms of in-sample fit, out-of-sample fit and point forecasting accuracy. In particular, we find that the Deep VAR is able to better capture the structural economic changes during periods of uncertainty and recession.
  • Open Access. Item type: Item
    Understanding latent vector arithmetic for attribute manipulation in normalizing flows
    (2021) Gimenez Funes, Eduard
    Normalizing flows are an elegant approximation to generative modelling. It can be shown that learning a probability distribution of a continuous variable X is equivalent to learning a mapping f from the domain where X is defined to R^n such that the final distribution is a Gaussian. In “Glow: Generative flow with invertible 1x1 convolutions”, Kingma et al. introduced the Glow model. Normalizing flows arrange the latent space in such a way that feature additivity is possible, allowing synthetic image generation. For example, it is possible to take the image of a person not smiling, add a smile, and obtain the image of the same person smiling. Using the CelebA dataset, we report new experimental properties of the latent space such as specular images and linear discrimination. Finally, we propose a mathematical framework that helps to understand why feature additivity works.
  • Open Access. Item type: Item
    Tracking the economy using FOMC speech transcripts
    (2020-07-20) Battaglia, Laura; Salunina, Maria
    In this study, we propose an approach for the extraction of a low-dimensional signal from a collection of text documents ordered over time. The proposed framework applies Latent Dirichlet Allocation (LDA) to obtain a meaningful representation of documents as a mixture over a set of topics. Such representations can then be modeled via a Dynamic Linear Model (DLM) as noisy realisations of a limited number of latent factors that evolve with time. We apply this approach to Federal Open Market Committee (FOMC) speech transcripts for the period of Greenspan's chairmanship. We are able to extract a latent factor that fairly resembles the Economic Policy Uncertainty Index for the United States.
  • Open Access. Item type: Item
    Structure and power dynamics in economic networks: a quantitative analysis of labour flow and company control networks in the UK
    (2020-06-25) Pap, Aron
    In this thesis project I analyse labour flow networks and company control networks in the UK. I observe that these networks exhibit characteristics that are typical of empirical networks, such as heavy-tailed degree distribution, strong communities with geo-industrial clustering and high assortativity. I document that distinguishing between the types of investors in firms can help to explain their degree centrality in the company control network, and that large institutional entities having significant and exclusive control in a firm seem to be responsible for emerging hubs in this network. I also devise a simple network formation model to study the underlying causal processes in this company control network. I perform numerical simulations and model parameter calibration, obtaining a model that captures the empirically observed patterns in the data.
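
The sales forecasting abstract at the top of this list reports its gains as a reduction in MAPE (mean absolute percentage error) relative to a baseline. As a reading aid, here is a minimal sketch of how such a comparison might be computed. The function names and all numbers are hypothetical and chosen only for illustration; they are not taken from the thesis or its data.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def relative_reduction(baseline_error: float, candidate_error: float) -> float:
    """Fraction by which the candidate error is lower than the baseline error."""
    return (baseline_error - candidate_error) / baseline_error


# Hypothetical monthly sales and two hypothetical forecasts (illustrative only).
actual = [100.0, 200.0, 400.0]
baseline_forecast = [80.0, 240.0, 480.0]   # each forecast is off by 20%
model_forecast = [90.0, 220.0, 440.0]      # each forecast is off by 10%

baseline_mape = mape(actual, baseline_forecast)
model_mape = mape(actual, model_forecast)
reduction = relative_reduction(baseline_mape, model_mape)

print(f"baseline MAPE: {baseline_mape:.1f}%")
print(f"model MAPE:    {model_mape:.1f}%")
print(f"relative reduction: {reduction:.0%}")
```

Note that a "reduction of up to 30%" in this sense is relative to the baseline error, not a drop of 30 percentage points. MAPE is also undefined when any actual value is zero, which is why the sketch rejects such inputs rather than guessing at a fallback.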