This paper explores the development of a real-time people tracking system for immersive interactive environments using open-source deep learning models. The goal was to create an AI-based solution capable of tracking people in the complex environments where immersive interactive systems are installed, including under varying lighting conditions.
The research focuses on leveraging pre-trained models and fine-tuning them to meet specific application needs, rather than building a tracking system from scratch. The study employed a detection and tracking framework using the RTMDet-tiny detection model, fine-tuned with specific datasets, and integrated with the DeepSORT tracking algorithm. The refined model's performance was evaluated for accuracy and robustness across different sequences using mean Average Precision (mAP), Higher Order Tracking Accuracy (HOTA), Multiple Object Tracking Accuracy (MOTA), and Identification F1 Score (IDF1).
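As a point of reference for two of the tracking metrics named above, the following is a minimal sketch of the standard MOTA and IDF1 definitions. It assumes the error counts have already been accumulated over a whole sequence; the function and parameter names are illustrative and are not taken from the study's evaluation code.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """IDF1 = 2*IDTP / (2*IDTP + IDFP + IDFN), from identity-level matches."""
    denom = 2 * idtp + idfp + idfn
    # An empty sequence has no identity matches; report 0.0 by convention here.
    return 2 * idtp / denom if denom else 0.0
```

MOTA penalizes missed targets, false alarms, and identity switches together, while IDF1 measures how consistently each person keeps the same identity, which is why the two can diverge on the same sequence.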
The refined detection model achieved an overall mean Average Precision (mAP) of 0.741, with significant variation depending on object size and intersection over union (IoU) threshold. The system detected medium-sized objects well but struggled with small and large objects because the annotated training data lacked diversity.
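To make the dependence on IoU threshold and object size concrete, here is a minimal sketch of how IoU is computed for two axis-aligned boxes and how a box might be assigned to a size bucket. The `(x1, y1, x2, y2)` box format and the COCO-style area thresholds (32² and 96² pixels) are assumptions for illustration; the abstract does not state which size definitions were used.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped to zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def size_bucket(box):
    """Assumed COCO-style buckets based on box area in pixels."""
    x1, y1, x2, y2 = box
    area = (x2 - x1) * (y2 - y1)
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"
```

A detection counts as a true positive only if its IoU with a ground-truth box meets the chosen threshold, so stricter thresholds lower the reported precision for the same set of predictions.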
The discussion highlights the challenges and limitations encountered, such as modular integration issues with the OpenMMLab repositories and the high manual cost of data annotation. Future work should focus on enriching the training dataset with more varied images, implementing posture and body part detection, and exploring alternative tracking algorithms such as ByteTrack for potentially better performance. Additionally, filtering detections at the edges of the image to reduce fluctuations and improving visual descriptors for distinguishing individuals in complex environments are suggested as important steps to enhance the system's accuracy and reliability.
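The edge-filtering suggestion could be sketched as follows. This is one possible interpretation, not the method used in the study: it assumes detections are `(x1, y1, x2, y2)` pixel boxes and drops any box that comes within a hypothetical `margin` of the image border before the boxes are passed to the tracker.

```python
def drop_edge_detections(boxes, image_width, image_height, margin=10):
    """Keep only boxes that lie at least `margin` pixels inside the image.

    `margin` is an illustrative parameter; a suitable value would need to be
    tuned for the camera setup.
    """
    kept = []
    for x1, y1, x2, y2 in boxes:
        near_edge = (
            x1 < margin
            or y1 < margin
            or x2 > image_width - margin
            or y2 > image_height - margin
        )
        if not near_edge:
            kept.append((x1, y1, x2, y2))
    return kept
```

Partially visible people at the border tend to produce unstable boxes, so discarding them trades a slightly smaller tracked area for fewer flickering tracks.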
This study contributes to the field of computer vision by demonstrating the practical application of deep learning models to real-time people tracking for immersive interactive systems. The insights gained can inform future developments in AI-based tracking solutions, supporting more engaging and personalized user experiences in various interactive settings.