Recent advancements in signal processing and computer vision are largely due to machine learning (ML). While exciting, the reality is that most modern ML approaches are based on supervised learning and require large, diverse collections of well-annotated data. Furthermore, top-performing ML models are black (opaque) rather than glass (transparent) boxes: it is not clear what they are doing or when/where they work. Herein, we use modern video game engine technology to better understand and help create improved ML solutions by confronting the real-world annotated-data bottleneck. Specifically, we discuss a procedural environment and dataset collection process in the Unreal Engine (UE) for explosive hazard detection (EHD). This process is driven by the underlying variables impacting EHD: object, environment, and platform/sensor (a low-altitude drone herein). Furthermore, we outline a process for generating data at different levels of visual abstraction to train ML algorithms, encourage improved features, and evaluate ML model generalizability. Encouraging preliminary results and insights are provided relative to simulated aerial EHD experiments.
KEYWORDS: Computer simulations, Data modeling, Unmanned aerial vehicles, Image segmentation, RGB color model, Explosives, Visualization, Machine learning, 3D modeling, Video
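As a rough illustration of how such a procedural collection process might be parameterized, the Python sketch below samples the three driving variable groups (object, environment, platform/sensor) into a single scene configuration that could then drive a UE render and labeling pass. All names, value ranges, and the SceneConfig structure are illustrative assumptions, not the authors' actual interface.

```python
import random
from dataclasses import dataclass

# Hypothetical scene configuration; field names and ranges are illustrative
# assumptions, not the authors' actual UE interface.
@dataclass
class SceneConfig:
    eh_type: str              # object variable: which hazard surrogate to place
    burial_depth_cm: float    # object variable: how deeply it is emplaced
    environment: str          # environment variable: terrain / clutter preset
    time_of_day_h: float      # environment variable: lighting condition
    drone_altitude_m: float   # platform/sensor variable (low-altitude drone)
    camera_pitch_deg: float   # platform/sensor variable

def sample_scene(rng: random.Random) -> SceneConfig:
    """Draw one procedural scene by sampling each EHD driving variable."""
    return SceneConfig(
        eh_type=rng.choice(["metal_ap", "plastic_at", "ied_surrogate"]),
        burial_depth_cm=rng.uniform(0.0, 10.0),
        environment=rng.choice(["desert", "grassland", "gravel_road"]),
        time_of_day_h=rng.uniform(6.0, 18.0),
        drone_altitude_m=rng.uniform(2.0, 30.0),
        camera_pitch_deg=rng.uniform(-90.0, -30.0),
    )

if __name__ == "__main__":
    rng = random.Random(0)
    for cfg in (sample_scene(rng) for _ in range(5)):
        print(cfg)  # each config would parameterize one UE render + label pass
```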
Datasets with accurate ground truth from unmanned aerial vehicles (UAV) are cost- and time-prohibitive to collect. This is a problem, as most modern machine learning (ML) algorithms are based on supervised learning and require large, diverse, well-annotated datasets. As a result, new creative ideas are needed to drive innovation in robust and trustworthy artificial intelligence (AI) / ML. Herein, we use the Unreal Engine (UE) to generate simulated visual spectrum imagery for explosive hazard detection (EHD) with corresponding pixel-level labels, UAV metadata, and environment metadata. We also have access to a relatively small set of real-world EH data with less precise ground truth, namely axis-aligned bounding box labels, and sparse metadata. In this article, we train a lightweight, real-time, pixel-level EHD pre-screener for a low-altitude UAV. Specifically, we focus on training with respect to different combinations of simulated and real data. Encouraging preliminary results are provided relative to real-world EH data. Our findings suggest that simulated data not only augments real-world data of limited volume and variety, but may even be sufficient on its own to train an EHD pre-screener.
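A minimal sketch of how training sets mixing simulated and real imagery might be assembled is shown below; the directory layout, the sim_fraction parameter, and the boxes_to_mask_stub helper are hypothetical, and the paper's actual pre-screener training pipeline is not reproduced here.

```python
import glob
import random

def boxes_to_mask_stub(box_label_path: str) -> str:
    """Placeholder: real data only has axis-aligned boxes, so a coarse pixel
    mask would have to be derived from each box (details omitted here)."""
    return box_label_path.replace("boxes", "coarse_masks")

def build_training_mix(sim_dir: str, real_dir: str, sim_fraction: float, seed: int = 0):
    """Return (image, mask) pairs drawn from simulated and real pools.

    sim_fraction = 1.0 -> simulated only, 0.0 -> real only; values in between
    mix the two sources, mirroring the combinations studied in the paper.
    """
    rng = random.Random(seed)
    sim = [(p, p.replace("rgb", "masks"))
           for p in sorted(glob.glob(f"{sim_dir}/rgb/*.png"))]
    real = [(p, boxes_to_mask_stub(p.replace("rgb", "boxes")))
            for p in sorted(glob.glob(f"{real_dir}/rgb/*.png"))]
    n_total = len(sim) + len(real)
    n_sim = int(round(sim_fraction * n_total))
    mix = rng.sample(sim, min(n_sim, len(sim))) + \
          rng.sample(real, min(n_total - n_sim, len(real)))
    rng.shuffle(mix)
    return mix
```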
Numerous real-world applications require the intelligent combination of disparate sensor information streams to create a more complete and enhanced observation in support of underlying tasks like classification, regression, or decision making. An often overlooked and underappreciated part of fusion is context. Herein, we focus on two contextual fusion challenges: incomplete (limited knowledge) models and metadata. Examples of metadata available to unmanned aerial systems (UAS) include time of day, platform/sensor position, etc., all of which can have a drastic impact on sensor measurements and, subsequently, on the decisions derived from them. Additionally, incomplete models limit machine learning, specifically when training data is under-sampled. To address these challenges, we investigate contextually adaptive online Choquet integration. First, we cluster and partition the training metadata. Second, a single machine learning model is trained per partition. Third, a Choquet integral is learned for the combination of these models per partition. Fourth, at test/run time we compute the degree of typicality of a new sample relative to our known contexts. Fifth, our trained integrals are decomposed into a bag of underlying aggregation operators, and a new contextually relevant operator is imputed using a combination of the metadata clustering and observation statistics of the integral variables. This process enables machine learning model selection, ensemble fusion, and metadata outlier detection, with subsequent mitigation strategy identification or decision suppression. The above ideas are demonstrated on explosive hazard detection using surrogate data simulated with the Unreal Engine. In particular, the Unreal Engine is used because it gives us the flexibility to explore the proposed ideas across a range of diverse and controlled experiments. Our preliminary results show improved performance for fusion in different contexts, and a sensitivity analysis is performed with respect to metadata degradation.
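Since the fusion step relies on the discrete Choquet integral, a small reference implementation of that aggregation (the standard definition with respect to a fuzzy measure over the model outputs) is sketched below. The example measure values are made up, and the per-context learning and imputation of measures described in the abstract are not shown.

```python
import numpy as np

def choquet_integral(h: np.ndarray, mu: dict) -> float:
    """Discrete Choquet integral of model outputs h with respect to a fuzzy
    measure mu, given as a dict mapping frozensets of source indices to [0, 1],
    with mu[empty set] = 0 and mu[full set] = 1."""
    order = np.argsort(-h)            # sort sources by descending output
    total, prev = 0.0, 0.0
    subset = set()
    for idx in order:
        subset.add(int(idx))
        g = mu[frozenset(subset)]     # measure of the top-k sources so far
        total += h[idx] * (g - prev)
        prev = g
    return total

# Example: three detectors and an (illustrative) measure that rewards
# agreement between sources 0 and 1.
h = np.array([0.9, 0.4, 0.7])
mu = {
    frozenset(): 0.0,
    frozenset({0}): 0.4, frozenset({1}): 0.3, frozenset({2}): 0.35,
    frozenset({0, 1}): 0.8, frozenset({0, 2}): 0.7, frozenset({1, 2}): 0.6,
    frozenset({0, 1, 2}): 1.0,
}
print(choquet_integral(h, mu))  # 0.9*0.4 + 0.7*0.3 + 0.4*0.3 = 0.69
```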
Modern supervised machine learning for electro-optical and infrared imagery is based on data-driven learning of features and decision making. State-of-the-art algorithms are largely opaque, and questions exist regarding their interpretability and generalizability. For example, what are the learned features, what contexts do they work in, and are the algorithms simply memorizing observations and exploiting unwanted correlations, or have they learned an internal representation and causal associations that generalize to new environments? Under the hood, current convolutional neural networks (CNN) are sophisticated curve fitters that are sensitive to sampling (volume and variety). This is problematic, as collecting data from real systems is often expensive and time consuming. Furthermore, labeling and quality checking that data can also be prohibitive. As a result, many are looking to augmentation and simulation to efficiently generate more samples. Herein, we focus on ways to combine augmentation and simulation to improve explosive hazard detection. Specifically, we use the Unreal Engine to produce ray-traced simulated datasets of environments and emplacements not captured in real data. We also present a new technique, coined altitude modulated augmentation (AMA), that inserts simulated objects into real-world background imagery based on metadata to augment new training data. Thus, the goal of AMA is to increase sampling of observed environments. Preliminary results show that the combination of all techniques performs best, followed by augmentation, simulation, then real-world data.
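For intuition, a minimal sketch of an AMA-style insertion is given below: a simulated object chip is rescaled using the real frame's altitude metadata and alpha-composited into the background. The inverse-altitude scaling rule and the Pillow-based compositing are assumptions for illustration, not the paper's exact procedure.

```python
from PIL import Image

def insert_chip(background: Image.Image, chip_rgba: Image.Image,
                altitude_m: float, ref_altitude_m: float, xy) -> Image.Image:
    """Paste a simulated object chip into a real frame, shrinking it as the
    platform flies higher (apparent size taken as roughly inversely
    proportional to altitude). `xy` is the top-left paste position (x, y)."""
    scale = ref_altitude_m / max(altitude_m, 1e-3)
    new_size = (max(1, int(chip_rgba.width * scale)),
                max(1, int(chip_rgba.height * scale)))
    chip = chip_rgba.resize(new_size, Image.BILINEAR)
    out = background.copy()
    out.paste(chip, xy, mask=chip)  # alpha channel blends chip over background
    return out
```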