Presentation + Paper
13 November 2024
Context-aware model training for attention-based multicamera multiobject tracking
Miguel M. Ortiz, Francisco J. Iriarte, Hugo D. Rodríguez, Luis Unzueta, Luis M. Bergasa
Abstract
Attention-based Siamese networks have shown remarkable results for occlusion-aware single-camera multi-object tracking (MOT) of persons, as they can effectively combine motion and appearance features. However, extending them to multi-camera MOT in crowded areas such as train stations and airports is challenging. In these scenarios, the visual appearance of people varies more widely because the viewpoints from which they are observed as they move can be very diverse. This adds to the already high variability caused by partial occlusions and differences in body pose (standing, sitting, or lying). In addition, attention-based MOT methods are computationally intensive and therefore difficult to scale to multiple cameras. To overcome these problems, we propose a method that exploits contextual information about the scenario, such as the viewpoint-, occlusion-, and pose-related visual appearance characteristics of persons, to improve the inter- and intra-feature representations in attention-based Siamese networks. Our approach combines a context-aware training data batching and hard triplet mining strategy with an automated model complexity tuning procedure to train the optimal model for the scenario. The method improves the fusion of motion and appearance features of persons in the data association cost matrix of the MOT algorithm. Experimental results on the MOT17 dataset demonstrate the effectiveness and efficiency of our approach and show promising results for real-world applications requiring robust MOT in multi-camera setups.
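To make the idea of fusing motion and appearance features in a data association cost matrix concrete, here is a minimal sketch in plain Python. It is not the authors' method: the abstract does not specify the fusion rule, so the weighted sum, the weight `alpha`, the use of IoU as the motion cue, cosine distance as the appearance cue, and the `box`/`emb` dictionary keys are all illustrative assumptions.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm if norm > 0 else 1.0


def fused_cost_matrix(tracks, detections, alpha=0.5):
    """Rows are tracks, columns are detections.

    Assumed fusion rule (hypothetical, not from the paper):
        cost = alpha * (1 - IoU) + (1 - alpha) * cosine_distance
    """
    return [
        [
            alpha * (1.0 - iou(t["box"], d["box"]))
            + (1.0 - alpha) * cosine_distance(t["emb"], d["emb"])
            for d in detections
        ]
        for t in tracks
    ]


# Toy example: one track, two candidate detections.
tracks = [{"box": (0, 0, 10, 10), "emb": (1.0, 0.0)}]
detections = [
    {"box": (0, 0, 10, 10), "emb": (1.0, 0.0)},    # same place, same look
    {"box": (20, 20, 30, 30), "emb": (0.0, 1.0)},  # elsewhere, different look
]
cost = fused_cost_matrix(tracks, detections)
```

A matrix like this is typically handed to an assignment solver (for example the Hungarian algorithm) to match tracks to detections; a lower entry means a more plausible match.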
© (2024) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Miguel M. Ortiz, Francisco J. Iriarte, Hugo D. Rodríguez, Luis Unzueta, and Luis M. Bergasa "Context-aware model training for attention-based multicamera multiobject tracking", Proc. SPIE 13206, Artificial Intelligence for Security and Defence Applications II, 132060K (13 November 2024); https://doi.org/10.1117/12.3034014
KEYWORDS
Education and training
Cameras
Feature extraction
Motion models
Data modeling
Matrices
Performance modeling