29 December 1997
Integrated audiovisual processing for object localization and tracking
Gopal Sarma Pingali
Proceedings Volume 3310, Multimedia Computing and Networking 1998; (1997) https://doi.org/10.1117/12.298421
Event: Photonics West '98 Electronic Imaging, 1998, San Jose, CA, United States
Abstract
This paper presents a system that combines audio and visual cues to locate and track an object, typically a person, in real time. It is shown that combining a speech source localization algorithm with a video-based head tracking algorithm yields a more accurate and robust tracker than either the audio or the visual modality alone. Performance evaluation results are presented for a system that runs in real time on a general-purpose processor. The multimodal tracker has several applications, such as teleconferencing, multimedia kiosks, and interactive games.
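The abstract does not say how the audio and video estimates are combined. As a minimal illustration of why fusing two independent estimates can beat either one alone, the sketch below uses inverse-variance weighting, a standard technique that is assumed here and is not necessarily the paper's method. The function name `fuse_estimates` and its parameters are hypothetical.

```python
def fuse_estimates(audio_pos, audio_var, video_pos, video_var):
    """Fuse two independent position estimates by inverse-variance weighting.

    Assumption (not from the paper): each modality reports a scalar position
    (for example an azimuth in degrees) together with an error variance.
    """
    if audio_var <= 0 or video_var <= 0:
        raise ValueError("variances must be positive")

    # Each estimate is weighted by its precision (1 / variance),
    # so the more reliable modality dominates the result.
    w_audio = 1.0 / audio_var
    w_video = 1.0 / video_var

    fused_pos = (w_audio * audio_pos + w_video * video_pos) / (w_audio + w_video)
    # The fused variance is smaller than either input variance.
    fused_var = 1.0 / (w_audio + w_video)
    return fused_pos, fused_var


if __name__ == "__main__":
    # Hypothetical numbers: a noisy audio estimate and a sharper video estimate.
    pos, var = fuse_estimates(audio_pos=10.0, audio_var=4.0,
                              video_pos=14.0, video_var=1.0)
    print(pos, var)
```

Under these assumptions the fused variance is always lower than the smaller of the two input variances, which is one simple way to see how a multimodal tracker could be more accurate than a single-modality one.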
© (1997) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Gopal Sarma Pingali "Integrated audiovisual processing for object localization and tracking", Proc. SPIE 3310, Multimedia Computing and Networking 1998, (29 December 1997); https://doi.org/10.1117/12.298421
CITATIONS
Cited by 7 scholarly publications.
KEYWORDS
Cameras
Optical tracking
Visualization
Video
Imaging systems
Head
Detection and tracking algorithms