Can audio reveal music performance difficulty? Insights from the piano syllabus dataset
Citation
- Ramoneda P, Lee M, Jeong D, Valero-Mas JJ, Serra X. Can audio reveal music performance difficulty? Insights from the piano syllabus dataset. IEEE Trans Audio Speech Lang Process. 2025;33:1129-41. DOI: 10.1109/TASLPRO.2025.3539018
Permanent link
Description
Abstract
Automatically estimating the performance difficulty of a music piece is a key process in music education, as it allows curricula to be tailored to the individual needs of students. Given its relevance, the Music Information Retrieval (MIR) field comprises some proof-of-concept works addressing this task, which mainly focus on high-level music abstractions such as machine-readable scores or sheet music images. The potential of directly analyzing audio recordings has generally been neglected. This work addresses this gap with two contributions: (i) PSyllabus, the first audio-based difficulty estimation dataset, collected from the Piano Syllabus community and featuring 7,901 piano pieces across 11 difficulty levels from 1,233 composers, together with two additional benchmark datasets compiled specifically for evaluation purposes; and (ii) a recognition framework capable of handling different input representations derived from audio, in both unimodal and multimodal settings, to perform the difficulty estimation task. Comprehensive experimentation covering different pre-training schemes, input modalities, and multi-task scenarios proves the validity of the hypothesis and establishes PSyllabus as a reference dataset for audio-based difficulty estimation in the MIR field. The dataset, developed code, and trained models are publicly shared to promote further research in the field.