Multiple target tracking (MTT), also called multiple object tracking (MOT), is a key stage in many computer vision applications and relies on detecting and identifying targets within videos. The targets may be, for instance, pedestrians, vehicles, or animals, regardless of their number or appearance. The main goal of MOT is to follow the paths, or trajectories, of multiple targets through a sequence while maintaining each object's identity. MOT has a wide range of applications, from video surveillance and human-computer interaction to robotics. Developing an effective and accurate visual tracking system is very challenging because of problems such as illumination variation, background clutter (here meaning the overlap between objects), low resolution (which may be due to camera limitations), scale variation, occlusion, and changes in targets' positions within the video. A wide range of approaches and methods address these challenges in this evolving research area, with several algorithms in each subfield. This paper reviews the main research directions in MOT and is intended to help researchers carry out their projects by drawing on the wide range of algorithms covered in this review.
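To make the idea of maintaining identities concrete, the following is a minimal sketch of one common tracking-by-detection step: greedily matching existing tracks to new detections by bounding-box overlap (intersection over union, IoU). It is an illustration only and is not taken from the reviewed paper. The function names `iou` and `associate`, the `(x1, y1, x2, y2)` box format, and the threshold of 0.3 are all assumptions, and unmatched tracks are simply dropped for brevity.

```python
from typing import Dict, List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(
    tracks: Dict[int, Box],
    detections: List[Box],
    next_id: int,
    iou_threshold: float = 0.3,  # hypothetical value, tune per dataset
) -> Tuple[Dict[int, Box], int]:
    """Greedy IoU matching: keep an identity when a detection overlaps its track.

    Returns the updated {track_id: box} mapping and the next unused id.
    Tracks with no matching detection are dropped (a simplification).
    """
    # Score every (track, detection) pair and visit the best overlaps first.
    pairs = sorted(
        (
            (iou(box, det), tid, j)
            for tid, box in tracks.items()
            for j, det in enumerate(detections)
        ),
        reverse=True,
    )
    updated: Dict[int, Box] = {}
    used_tracks, used_dets = set(), set()
    for score, tid, j in pairs:
        if score < iou_threshold:
            break  # remaining pairs overlap too little to be the same object
        if tid in used_tracks or j in used_dets:
            continue
        updated[tid] = detections[j]
        used_tracks.add(tid)
        used_dets.add(j)
    # Detections that matched no track start new identities.
    for j, det in enumerate(detections):
        if j not in used_dets:
            updated[next_id] = det
            next_id += 1
    return updated, next_id
```

Calling `associate` once per frame, with the previous frame's tracks and the current frame's detections, carries identities forward. Practical trackers typically add motion prediction, appearance features, and a tolerance for temporarily missed detections, which is where the challenges listed above (occlusion, scale variation, clutter) come into play.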
Computer vision is a field of artificial intelligence that trains computers to gain a high-level understanding of images or videos. Its best-known subfields include object detection, object tracking, and motion estimation. Object detection is a computer vision technique for detecting instances of objects in images or videos. It can be performed either with bounding boxes or with object segmentation. It is widely used in tasks such as image annotation, activity recognition, face recognition, and object tracking. There are two types of approaches to detecting objects in an image, two-stage detectors and one-stage detectors, each with its own advantages and disadvantages. Object detection algorithms generally fall under either machine learning approaches or deep learning approaches. Deep learning approaches are based on convolutional neural networks, which can perform many tasks, such as clustering, classification, or regression. This survey reviews the different deep learning approaches and models for detecting objects in images, the advantages and disadvantages of each approach, and their fields of application. It also presents many of the data sets proposed for evaluating object detection methods.