Reports (Department of Information and Communication Technologies)

Open-access research documents, such as working papers, research reports, and technical reports, from the Department of Information and Communication Technologies at UPF.

Recent submissions

Leveraging pre-trained autoencoders for interpretable prototype learning of music audio

Alonso Jiménez, Pablo; Pepino, Leonardo; Batlle-Roca, Roser; Zinemanas, Pablo; Bogdanov, Dmitry; Serra, Xavier; Rocamora, Martín (Institute of Electrical and Electronics Engineers (IEEE), 2024)

We present PECMAE, an interpretable model for music audio classification based on prototype learning. Our model is based on a previous method, APNet, which jointly learns an autoencoder and a prototypical network. Instead, ...

Computers in education: how can we support the teachers?

Hernández Leo, Davinia (2024-01-09)

Completing audio drum loops with symbolic drum suggestions

Haki, Behzad; Pelinski, Teresa; Nieto, Marina; Jordà Puig, Sergi (2023-11-20)

Sampled drums can be used as an affordable way of creating human-like drum tracks, or perhaps more interestingly, can be used as a means of experimentation with rhythm and groove. Similarly, AI-based drum generation tools ...

Carnatic singing voice separation using cold diffusion on training data with bleeding

Plaja-Roglans, Genís; Miron, Marius; Shankar, Adithi; Serra, Xavier (2023-10-30)

Supervised music source separation systems using deep learning are trained by minimizing a loss function between pairs of predicted separations and ground-truth isolated sources. However, open datasets comprising ...

Efficient notation assembly in optical music recognition

Penarrubia, Carlos; Garrido-Muñoz, Carlos; Valero-Mas, Jose J.; Calvo Zaragoza, Jorge (2023-10-30)

Optical Music Recognition (OMR) is the field of research that studies how to computationally read music notation from written documents. Thanks to recent advances in computer vision and deep learning, there are successful ...

Activities using smart IoT planters in learning spaces: human-centred design of a dashboard

Hernández Leo, Davinia; Ferrer, Josep; Vujovic, Milica; Tabuenca, Bernardo; Ortiz-Beltran, Ariel; Greller, Wolfgang; Carrió, Mar; Moyano Claramunt, Elisabet (2023-10-25)

Education plays a transversal key role in the UN's Sustainable Development Goals agenda. Educating young people on natural health and the interpretation of scientific evidence can contribute to increased levels of informed ...

TapTamDrum: a dataset for dualized drum patterns

Haki, Behzad; Kotowski, Błażej; Lee, Cheuk Lun Isaac; Jordà Puig, Sergi (2023-10-24)

Drummers spend extensive time practicing rudiments to develop technique, speed, coordination, and phrasing. These rudiments are often practiced on "silent" practice pads using only the hands. Additionally, many percussive ...

Predicting performance difficulty from piano sheet music images

Ramoneda, Pedro; Valero-Mas, Jose J.; Jeong, Dasaem; Serra, Xavier (2023-10-24)

Estimating the performance difficulty of a musical score is crucial in music education for adequately designing the learning curriculum of the students. Although the Music Information Retrieval community has recently ...

High-resolution violin transcription using weak labels

Tamer, Nazif Can; Özer, Yigitcan; Müller, Meinard; Serra, Xavier (2023-10-24)

A descriptive transcription of a violin performance requires detecting not only the notes but also the fine-grained pitch variations, such as vibrato. Most existing deep learning methods for music transcription do not ...

TRIAD: capturing harmonics with 3D convolutions

Perez, Miguel; Kirchhoff, Holger; Serra, Xavier (2023-10-20)

Thanks to advancements in deep learning (DL), automatic music transcription (AMT) systems recently outperformed previous ones fully based on manual feature design. Many of these highly capable DL models, however, are ...

Sounds out of pläce? Score-independent detection of conspicuous mistakes in piano performances

Morsi, Alia; Tatsumi, Kana; Maezawa, Akira; Fujishima, Takuya; Serra, Xavier (2023-10-20)

In piano performance, some mistakes stand out to listeners, whereas others may go unnoticed. Former research concluded that the salience of mistakes depended on factors including their contextual appropriateness and a ...

Efficient supervised training of audio transformers for music representation learning

Alonso Jiménez, Pablo; Serra, Xavier; Bogdanov, Dmitry (2023-10-03)

In this work, we address music representation learning using convolution-free transformers. We build on top of existing spectrogram-based audio transformers such as AST and train our models on a supervised task using ...

DiffVel: note-level MIDI velocity estimation for piano performance by a double conditioned diffusion model

Kim, Hyon; Serra, Xavier (2023-08-31)

In any piano performance, expressiveness is paramount for effectively conveying the intent of the performer, and one of the most significant aspects of expressiveness is the loudness at the individual key or note level. ...

Com ho avaluem? Repensem la universitat: impacte de la intel·ligència artificial (IA) en l’aprenentatge

Hernández Leo, Davinia (2023-07-17)

What impact does Artificial Intelligence (AI) have on assessment? How should we adapt assessment methods so that they remain effective and also test the competencies related to the use of AI?

Score-Informed MIDI Velocity Estimation for Piano Performance by FiLM Conditioning

Kim, Hyon; Miron, Marius; Serra, Xavier (2023-05-12)

Piano is one of the most popular instruments among people that learn to play music. When playing the piano, the level of loudness is crucial for expressing emotions as well as manipulating tempo. These elements convey ...

TAPE: An End-to-End Timbre-Aware Pitch Estimator

Tamer, Nazif C; Özer, Yigitcan; Müller, Meinard; Serra, Xavier (2023-04-25)

Pitch estimation of a target musical source within a multi-source polyphonic signal is of great interest for music performance analysis. One possible approach for extracting the pitch of a target source is to first perform ...

Pre-Training Strategies Using Contrastive Learning and Playlist Information for Music Classification and Similarity

Alonso-Jiménez, Pablo; Favory, Xavier; Foroughmand, Hadrien; Bourdalas, Grigoris; Serra, Xavier; Lidy, Thomas; Bogdanov, Dmitry (2023-04-25)

In this work, we investigate an approach that relies on contrastive learning and music metadata as a weak source of supervision to train music representation models. Recent studies show that contrastive learning can be ...

Note level midi velocity estimation for piano performance

Kim, Hyon; Miron, Marius; Serra, Xavier (2023-01-16)

Piano is one of the most popular musical instruments. During a piano performance, loudness is an important factor for expressiveness, alongside tempo. Changes in dynamics play with expectation, convey various emotions, ...

Essentia API: a web API for music audio analysis

Correya, Albin Andrew; Bogdanov, Dmitry; Alonso Jiménez, Pablo; Serra, Xavier (2023-01-10)

We present Essentia API, a web API to access a collection of state-of-the-art music audio analysis and description algorithms based on Essentia, an open-source library and machine learning (ML) models for audio and music ...

MUSAV: a dataset of relative arousal-valence annotations for validation of audio models

Bogdanov, Dmitry; Lizarraga Seijas, Xavier; Alonso-Jiménez, Pablo; Serra, Xavier (2022-09-27)

We present MusAV, a new public benchmark dataset for comparative validation of arousal and valence (AV) regression models for audio-based music emotion recognition. To gather the ground truth, we rely on relative ...