A Novel method for open-vocabulary panoptic segmentation
- dc.contributor.author Kormushev, Nikolay
- dc.date.accessioned 2025-11-10T18:33:38Z
- dc.date.available 2025-11-10T18:33:38Z
- dc.date.issued 2025
- dc.description Master's thesis for: Erasmus Mundus Joint Master in Artificial Intelligence (EMAI)
- dc.description Mentor: Prof. dr. Matej Kristan; Co-mentor: Dr. Josip Saric
- dc.description.abstract Open-vocabulary panoptic segmentation aims to segment and classify visual content into both known and unseen categories using natural language supervision. While class-agnostic mask generators produce reasonably high-quality masks, this thesis identifies two main bottlenecks limiting performance: mask quality assessment, where valid masks are often mistakenly discarded as background, and semantic classification, which remains challenging, especially for unseen categories. To address these, we propose a two-part solution: a novel background mask reclassification module that recovers valid masks misclassified as background, and a CLIP fine-tuning strategy that preserves alignment between visual and textual embeddings. Together, these methods improve panoptic quality (PQ) on ADE20K from a baseline of 26.6 to 27.94, with analysis showing that addressing errors in mask quality and semantic classification could theoretically increase PQ to 65.9. These findings offer practical advancements and valuable insights toward bridging the gap between open- and closed-vocabulary segmentation.
- dc.identifier.uri http://hdl.handle.net/10230/71838
- dc.language.iso eng
- dc.rights CC Attribution-NonCommercial-ShareAlike 4.0 International license (CC BY-NC-SA 4.0)
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri https://creativecommons.org/licenses/by-sa/4.0/
- dc.subject.other Computer vision
- dc.title A Novel method for open-vocabulary panoptic segmentation
- dc.type info:eu-repo/semantics/masterThesis
