Repositori Digital de la UPF


Bayesian bandits for algorithm selection: latent-state modeling and spatial reward structures

This thesis extends the classical Multi-Armed Bandit (MAB) framework to dynamic and spatial environments. In dynamic settings, Bayesian latent-state models with Thompson Sampling and UCB are evaluated for their ability to adapt to non-stationary rewards, with comparisons to simpler autoregressive (AR) models. For spatially structured problems, Gaussian Process (GP) and Lipschitz bandits are used to exploit correlations between arms. Algorithms such as GP-UCB and Zoom-In demonstrate improved learning efficiency. Empirical results highlight the benefits of modeling temporal and spatial structure, while also emphasizing the computational trade-offs compared to classical, more tractable bandit algorithms.
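As background to the abstract above, a minimal sketch of Thompson Sampling on a stationary Beta-Bernoulli bandit is shown below. This is illustrative only: the thesis itself studies latent-state (non-stationary) and spatially structured extensions, and all names and parameters here are hypothetical.

```python
import random

def thompson_sampling(true_means, horizon, seed=0):
    """Beta-Bernoulli Thompson Sampling on a stationary bandit.

    Illustrative sketch, not the thesis's method: each arm keeps a
    Beta posterior over its success probability; at each round we
    sample a mean from every posterior and play the argmax.
    """
    rng = random.Random(seed)
    k = len(true_means)
    alpha = [1] * k  # Beta prior: successes + 1
    beta = [1] * k   # Beta prior: failures + 1
    total_reward = 0
    for _ in range(horizon):
        # Posterior sampling step: draw one estimate per arm.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        # Observe a Bernoulli reward from the chosen arm.
        r = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += r
        beta[arm] += 1 - r
        total_reward += r
    return total_reward, alpha, beta

total, a, b = thompson_sampling([0.2, 0.5, 0.8], horizon=2000)
```

Over time the posterior of the best arm concentrates and exploration of weaker arms decays; the latent-state and Gaussian Process variants discussed in the abstract replace the independent Beta posteriors with models that share information across time or across arms.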

(2025-06-04) Ernst, Marvin Michel; Gelabert Cortés, Oriol; Vadenja, Melisa