FoodMem: near real-time and precise food video segmentation

  • dc.contributor.author AlMughrabi, Ahmad
  • dc.contributor.author Galán, Adrián
  • dc.contributor.author Marques, Ricardo
  • dc.contributor.author Radeva, Petia
  • dc.date.accessioned 2025-11-11T07:15:11Z
  • dc.date.available 2025-11-11T07:15:11Z
  • dc.date.issued 2025
  • dc.description.abstract Food segmentation, including in videos, is vital for addressing real-world health, agriculture, and food biotechnology issues. Current limitations lead to inaccurate nutritional analysis, inefficient crop management, and suboptimal food processing, impacting food security and public health. Improving segmentation techniques can enhance dietary assessments, agricultural productivity, and the food production process. This study introduces the development of a robust framework for high-quality, near-real-time segmentation and tracking of food items in videos, using minimal hardware resources. We present FoodMem, a novel framework designed to segment food items from video sequences of 360-degree unbounded scenes. FoodMem can consistently generate masks of food portions in a video sequence, overcoming the limitations of existing semantic segmentation models, such as flickering and prohibitive inference speeds in video processing contexts. To address these issues, FoodMem leverages a two-phase solution: a transformer segmentation phase to create initial segmentation masks and a memory-based tracking phase to monitor food masks in complex scenes. Our framework outperforms current state-of-the-art food segmentation models, yielding superior performance across various conditions, such as camera angles, lighting, reflections, scene complexity, and food diversity. This results in reduced segmentation noise, elimination of artifacts, and completion of missing segments. We also introduce a new annotated food dataset encompassing challenging scenarios absent in previous benchmarks. Extensive experiments conducted on MetaFood3D, Nutrition5k, and Vegetables & Fruits datasets demonstrate that FoodMem enhances the state-of-the-art by 2.5% mean average precision in food video segmentation and is faster on average. The source code is available at: 3.
  • dc.description.sponsorship This work was partially funded by the EU project MUSAE (No. 01070421), 2021-SGR-01094 (AGAUR), ICREA Academia'2022 (Generalitat de Catalunya), Robo STEAM (2022-1-BG01-KA220-VET000089434, Erasmus+ EU), DeepSense (ACE053/22/000029, ACCIÓ), and Grants PID2022-141566NB-I00 (IDEATE), PDC2022-133642-I00 (DeepFoodVol), and CNS2022-135480 (A-BMC) funded by MICIU/AEI/10.13039/501100011033, by FEDER (UE), and by European Union NextGenerationEU/PRTR. A. AlMughrabi acknowledges the support of FPI Becas, MICINN, Spain.
  • dc.format.mimetype application/pdf
  • dc.identifier.citation AlMughrabi A, Galán A, Marques R, Radeva P. FoodMem: near real-time and precise food video segmentation. Pattern Recognit Lett. 2025 Jun;192:59-64. DOI: 10.1016/j.patrec.2025.03.014
  • dc.identifier.doi http://dx.doi.org/10.1016/j.patrec.2025.03.014
  • dc.identifier.issn 0167-8655
  • dc.identifier.uri http://hdl.handle.net/10230/71850
  • dc.language.iso eng
  • dc.publisher Elsevier
  • dc.relation.ispartof Pattern Recognition Letters. 2025 Jun;192:59-64
  • dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/01070421
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/3PE/PID2022-141566NB-I00
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/3PE/PDC2022-133642-I00
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/3PE/CNS2022-135480
  • dc.rights © 2025 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri http://creativecommons.org/licenses/by/4.0/
  • dc.subject.keyword Video food segmentation
  • dc.subject.keyword Fast segmentation
  • dc.subject.keyword Food tracking
  • dc.subject.keyword Segmentation transformer
  • dc.subject.keyword Memory-based models
  • dc.subject.keyword Near real-time segmentation
  • dc.title FoodMem: near real-time and precise food video segmentation
  • dc.type info:eu-repo/semantics/article
  • dc.type.version info:eu-repo/semantics/publishedVersion