Multi-domain operations drastically increase the scale and speed required to generate, evaluate, and disseminate command and control (C2) directives. In this work, we evaluate the effectiveness of using reinforcement learning (RL) within an Army C2 system to design an artificial intelligence (AI) agent that accelerates the commander and staff's decision-making process. Leveraging RL's capacity for exploration and exploitation produces novel strategies that widen a commander's decision space without increasing cognitive burden. Integrating RL into an efficient course-of-action war-gaming simulator and training on hundreds of thousands of simulated battles using DoD supercomputing resources produced an AI agent that generates acceptable strategic actions during a simulated operation. Moreover, this approach played an unexpected but significant role in strengthening the underlying wargame simulation engine by discovering and exploiting weaknesses in its design. This highlights a future role for RL in testing and improving DoD systems during their development.
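The abstract above does not detail the training setup; the following is a minimal, hypothetical sketch of how such an RL agent could be trained against a Gym-style wrapper around a course-of-action wargame simulator. The environment id C2WargameEnv-v0 and the use of Stable-Baselines3 PPO are illustrative assumptions, not the system described in the paper.

```python
# Hypothetical sketch only: the simulator wrapper "C2WargameEnv-v0" is a
# placeholder, and PPO stands in for whichever RL algorithm the authors used.
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("C2WargameEnv-v0")        # assumed Gym-compatible wargame wrapper
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=1_000_000)   # on the order of the "hundreds of thousands" of simulated battles
model.save("c2_coa_agent")

# Roll out the trained policy to propose a candidate action for the current state.
obs, _ = env.reset()
action, _ = model.predict(obs, deterministic=True)
```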
Fast, efficient, and robust algorithms that can run smoothly on airborne embedded systems are needed for real-time visual tracking. The flux tensor can be used to provide motion-based cues in visual tracking. In order to apply any object motion detection to a raw image sequence captured by a moving platform, the motion caused by the camera movement must first be stabilized. Using feature points to estimate the homography matrix between frames is a simple registration method that can be used for this stabilization. To obtain a good homography estimate, most of the feature points should lie on the same plane in the images; however, when the scene contains complex structure, estimating a good homography becomes very challenging. In this work, we propose a robust video stabilization algorithm that allows flux-tensor motion detection to efficiently identify moving objects. Our experiments show satisfactory results on raw videos of a type on which other methods have been shown to fail.
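As a concrete illustration of the feature-based registration step described above (not the authors' implementation), the following OpenCV sketch estimates a RANSAC-filtered homography between consecutive frames and warps the current frame onto the previous one; the detector choice, match count, and thresholds are assumptions.

```python
# Illustrative OpenCV pipeline for feature-based frame registration prior to
# flux-tensor motion detection; parameters are example values only.
import cv2
import numpy as np

def register_frame(prev_gray, curr_gray):
    """Estimate the homography mapping curr_gray onto prev_gray."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(prev_gray, None)
    kp2, des2 = orb.detectAndCompute(curr_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des2, des1), key=lambda m: m.distance)[:500]
    src = np.float32([kp2[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    # RANSAC rejects feature points off the dominant plane, which is where a
    # naive estimate breaks down in scenes with complex 3D structure.
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H

def stabilize(prev_gray, curr_frame, H):
    h, w = prev_gray.shape[:2]
    return cv2.warpPerspective(curr_frame, H, (w, h))
```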
Visual perception has become a core technology in autonomous robotics for identifying and localizing objects of interest to ensure successful and safe task execution. As part of the recently concluded Robotics Collaborative Technology Alliance (RCTA) program, a collaborative research effort among government, academic, and industry partners, a vision acquisition and processing pipeline was developed and demonstrated to support manned-unmanned teaming for Army-relevant applications. The perception pipeline provided accurate and cohesive situational awareness to support autonomous robot capabilities for maneuver in dynamic and unstructured environments, collaborative human-robot mission planning and execution, and mobile manipulation. Development of the pipeline involved a) collecting domain-specific data, b) curating ground-truth annotations, e.g., bounding boxes and keypoints, c) retraining deep networks to obtain updated object detection and pose estimation models, and d) deploying and testing the trained models on ground robots. We discuss the process of delivering this perception pipeline under limited time and resource constraints and without a priori knowledge of the operational environment. We focus on experiments conducted to optimize the models despite training data that was noisy and contained sparse examples for some object classes. Additionally, we discuss the augmentation techniques used to enhance the dataset given its skewed class distributions. These efforts highlight initial work directly related to learning and updating visual perception systems quickly in the field under sudden environment or mission changes.
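For the "retraining deep networks" step (c), a hedged sketch of one common way to fine-tune a pretrained detector on domain-specific classes is shown below; the class count, optimizer settings, and torchvision model choice are assumptions rather than details from the RCTA pipeline.

```python
# Sketch: fine-tuning a COCO-pretrained Faster R-CNN head on new object classes.
# All specifics (model family, class count, hyperparameters) are placeholders.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_model(num_classes):
    # num_classes includes the background class expected by torchvision detectors.
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

model = build_model(num_classes=6)  # e.g., 5 mission-relevant classes + background
optimizer = torch.optim.SGD([p for p in model.parameters() if p.requires_grad],
                            lr=0.005, momentum=0.9, weight_decay=5e-4)
# The training loop over the curated, annotated domain data would follow the
# standard torchvision detection recipe (images plus targets with 'boxes' and 'labels').
```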
Object detection from images captured by Unmanned Aerial Vehicles (UAVs) is widely used for surveillance, precision agriculture, package delivery, and aerial photography, among other applications. Very recently, a benchmark for object detection on UAV-collected images, called VisDrone2018, was released. However, a large performance drop is observed when current state-of-the-art object detection approaches developed primarily for ground-to-ground images are directly applied to the VisDrone2018 dataset. For example, the best detection model on VisDrone2018 achieved a detection accuracy of only 0.31 mAP, significantly lower than that of ground-based object detection. This performance drop is mainly caused by several challenges, such as 1) flying altitudes varying from 1,000 feet down to 10 feet, 2) different weather conditions such as fog, rain, and low light, and 3) a wide range of camera viewing angles. To overcome these challenges, in this paper we propose a novel adversarial training approach that aims to learn features invariant to varying altitudes, viewing angles, weather conditions, and object scales. The adversarial training draws on "free" meta-data that comes with UAV datasets and provides information about the data themselves, such as altitude, scene visibility, and viewing angle. We demonstrate the effectiveness of our proposed algorithm on the recently proposed UAVDT dataset and show that it generalizes well when applied to the different VisDrone2018 dataset. We also show the robustness of the proposed approach to variations in altitude, viewing angle, weather, and object scale.
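One standard way to realize this kind of adversarial training, shown below as an illustrative sketch rather than the paper's method, is a gradient reversal layer feeding an auxiliary head that predicts the "free" meta-data (e.g., an altitude bin or weather class) from backbone features; all names and dimensions here are assumptions.

```python
# Sketch of adversarial meta-data training with gradient reversal: the auxiliary
# head learns to predict meta-data, while reversed gradients push the backbone
# toward meta-data-invariant features. Names and sizes are placeholders.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip and scale the gradient flowing back into the feature extractor.
        return -ctx.lam * grad_output, None

class MetaDataDiscriminator(nn.Module):
    def __init__(self, feat_dim, num_meta_classes, lam=1.0):
        super().__init__()
        self.lam = lam
        self.head = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                                  nn.Linear(256, num_meta_classes))

    def forward(self, features):
        return self.head(GradReverse.apply(features, self.lam))

# During training, the total loss would combine the detector's loss with a
# cross-entropy loss on this head, computed against labels parsed from the
# dataset's meta-data (altitude, visibility, viewing angle, etc.).
```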
Machine learning (ML)-based perception algorithms are increasingly being used in the development of autonomous navigation systems for self-driving vehicles. These vehicles are mainly designed to operate on structured roads or lanes, and the ML algorithms are primarily used for functionalities such as object tracking, lane detection, and semantic understanding. On the other hand, Autonomous/Unmanned Ground Vehicles (UGVs) being developed for military applications need to operate in unstructured combat environments, including diverse off-road terrain, inclement weather conditions, water hazards, GPS-denied environments, smoke, etc. Therefore, the perception algorithm requirements are different and have to be robust enough to account for diverse terrain conditions and degradations in the visual environment. In this paper, we present military-relevant requirements and challenges for scene perception that are not met by current state-of-the-art algorithms, and discuss potential strategies to address these capability gaps. We also present a survey of ML algorithms and datasets that could be employed to support maneuver of autonomous systems in complex terrain, focusing on techniques for (1) distributed scene perception using heterogeneous platforms, (2) computation in resource-constrained environments, and (3) object detection in degraded visual imagery.
Efficient and accurate real-time perception systems are critical for Unmanned Aerial Vehicle (UAV) applications that aim to provide enhanced situational awareness to users. Specifically, object recognition is a crucial element for surveillance and reconnaissance missions since it provides fundamental semantic information about the aerial scene. In this study, we describe the development and implementation of a perception framework on an embedded computer vision platform, mounted on a hexacopter, for real-time object detection. The framework includes a camera driver and a deep neural network based object detection module and has distributed computing capabilities between the aerial platform and the corresponding ground station. Preliminary real-time object detections using YOLO are performed onboard the UAV, and a sequence of images is streamed to the ground station, where an advanced computer vision algorithm, referred to as Multi-Expert Region-based CNN (ME-RCNN), is leveraged to provide enhanced and fine-grained analytics on the aerial video feeds. Since annotated aerial imagery in the UAV domain is hard to obtain and not routinely available, we train the neural network on a combination of real aerial data and air-to-ground synthetic images of objects such as vehicles, generated by video game engines. Through this study, we quantify the improvements gained from using the synthetic dataset and the efficacy of using advanced object detection algorithms.
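A rough sketch of the onboard half of such a distributed pipeline is given below: a YOLO model run through OpenCV's DNN module produces real-time detections, and JPEG-compressed frames are streamed to the ground station for heavier analysis. The weight files, host name, and port are placeholders, and this is not the study's actual implementation.

```python
# Onboard sketch: run YOLO via OpenCV DNN and stream length-prefixed JPEG frames
# to a ground station. File paths, host, and port are illustrative placeholders.
import socket
import struct
import cv2

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
out_layers = net.getUnconnectedOutLayersNames()
sock = socket.create_connection(("ground-station.local", 5000))

cap = cv2.VideoCapture(0)  # onboard camera
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    detections = net.forward(out_layers)      # onboard, real-time YOLO detections
    ok, jpg = cv2.imencode(".jpg", frame)
    payload = jpg.tobytes()
    sock.sendall(struct.pack(">I", len(payload)) + payload)  # frame for ground-station analysis
```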
Development of machine vision systems to examine fruit for quality and contamination problems has been stalled by the lack of an inexpensive, fast method for appropriately orienting fruit for imaging. We recently discovered that apples could be oriented based on their inertial properties. Apples were rolled down a ramp consisting of two parallel rails. When sufficient angular velocity was achieved, the apples moved to a configuration in which the stem/calyx axis was perpendicular to the direction of travel. This discovery provides a potential basis for development of a commercially viable orientation system. However, many questions remain concerning the underlying dynamic principles that govern this phenomenon. An imaging system and software were constructed to allow detailed observation of the orientation process. Sequential 640×480 monochrome images are acquired at 60 fps with a 1/500 s exposure. The software finds the center of the apple in each image as well as the vertical movement of the track at a selected coordinate. Early tests revealed that the compliance of the track played a significant role in the orientation process. These data will be used to compare results from empirical tests with predictions of dynamic models.
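A minimal sketch of the per-frame analysis described above (finding the apple's center and sampling the track's vertical position) is given below, assuming simple Otsu thresholding and contour moments; the segmentation scheme and the chosen track column are illustrative, not the authors' settings.

```python
# Illustrative per-frame measurements on the 640x480 monochrome images; the
# actual software's segmentation and tracking method may differ.
import cv2

def binarize(gray):
    # Example foreground segmentation via Otsu thresholding.
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask

def apple_center(mask):
    # Centroid of the largest blob, assumed to be the apple, via contour moments.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    apple = max(contours, key=cv2.contourArea)
    m = cv2.moments(apple)
    return m["m10"] / m["m00"], m["m01"] / m["m00"]  # (x, y) in pixels

def track_row(mask, column=320):
    # Vertical position of the first foreground pixel in a selected column,
    # used to observe track deflection (compliance) during the roll.
    rows = mask[:, column].nonzero()[0]
    return int(rows[0]) if rows.size else None
```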