27 May 2015 Leveraging human oversight and intervention in large-scale parallel processing of open-source data
Enrico Casini, Niranjan Suri, Jeffrey M. Bradshaw
Abstract
The popularity of cloud computing, along with the increased availability of cheap storage, has created the need to process and transform large volumes of open-source data in parallel. One way to handle such volumes of information is to take advantage of distributed computing frameworks such as MapReduce. Unfortunately, an entirely automated approach that excludes human intervention is often unpredictable and error-prone. Highly accurate data processing and decision-making can be achieved by supporting the automated process with human collaboration in a variety of domains, such as warfare, cyber security, and threat monitoring. Although this collaboration seems easy to exploit, human-machine collaboration in data analysis presents several challenges. First, because human intervention is asynchronous, it is necessary to verify that once a correction is made, all the dependent reprocessing is carried out in a chain. Second, the amount of reprocessing often needs to be minimized, because computing resources are limited. To better meet these strict requirements, this paper introduces improvements to an innovative approach for human-machine collaboration in the parallel processing of large amounts of open-source data.
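The two challenges named in the abstract, chained reprocessing after a correction and keeping that reprocessing minimal, can be illustrated with a small dependency-graph sketch. This is not the authors' implementation, which the abstract does not describe. The stage names, the `PIPELINE` mapping, and both helper functions are hypothetical, and the sketch assumes the processing steps form a directed acyclic graph.

```python
from collections import deque

# Hypothetical pipeline: each stage maps to the stages that consume its output.
PIPELINE = {
    "ingest": ["extract"],
    "extract": ["translate", "geotag"],
    "translate": ["summarize"],
    "geotag": ["summarize"],
    "summarize": ["report"],
}


def downstream(dependents, corrected):
    """Return every stage reachable from `corrected` (assumes an acyclic graph)."""
    seen, queue = set(), deque([corrected])
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def reprocess_order(dependents, corrected):
    """List only the affected stages, each after the affected stages it depends on."""
    affected = downstream(dependents, corrected)
    # For each affected stage, count how many of its inputs are also affected.
    pending = {node: 0 for node in affected}
    for parent in affected:
        for child in dependents.get(parent, ()):
            pending[child] += 1
    # Stages whose affected inputs are all done can run (sorted for determinism).
    ready = deque(sorted(n for n, count in pending.items() if count == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in dependents.get(node, ()):
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)
    return order


if __name__ == "__main__":
    # A human corrects the output of "translate": only its dependents rerun.
    print(reprocess_order(PIPELINE, "translate"))
```

In this sketch, a correction to the hypothetical `translate` stage triggers only `summarize` and then `report`, while `ingest`, `extract`, and `geotag` are left untouched. That captures both requirements in miniature: the chain of dependent work is completed, and nothing outside it is recomputed.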
© (2015) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Enrico Casini, Niranjan Suri, and Jeffrey M. Bradshaw "Leveraging human oversight and intervention in large-scale parallel processing of open-source data", Proc. SPIE 9499, Next-Generation Analyst III, 94990K (27 May 2015); https://doi.org/10.1117/12.2177264
KEYWORDS
Data processing
Data storage
Databases
Clouds
Data analysis
Distributed computing
Parallel processing