Presentation
6 June 2017
Matrix sketching for big data reduction (Conference Presentation)
Abstract
In recent years, Big Data has become an increasingly prominent concern as both the volume of data and the velocity at which it is produced grow exponentially. By 2020, the amount of stored data is estimated to reach 44 zettabytes, and over 31 terabytes of data are currently generated every second. Algorithms and applications must scale effectively to this volume. One application designed to work effectively and efficiently with Big Data is IBM’s Skylark. Skylark (Sketching-based Matrix Computations for Machine Learning) is part of DARPA’s XDATA program, an open-source catalog of tools for dealing with Big Data. It is a library of functions designed to reduce the complexity of large-scale matrix problems, and it also implements kernel-based machine learning tasks. Sketching reduces the dimensionality of matrices through randomization, compressing them while preserving key properties and thereby speeding up computations. Matrix sketches can be used to find accurate solutions in less time, or to summarize data by identifying important rows and columns. In this paper, we investigate the effectiveness of sketched matrix computations using IBM’s Skylark versus non-sketched computations. We judge effectiveness on two factors: computational complexity and validity of outputs. Initial results from testing with smaller matrices are promising, showing that Skylark achieves a considerable reduction ratio while still performing matrix computations accurately.
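To make the idea of sketching concrete, the following is a minimal illustration of a sketched least-squares solve compared with the non-sketched one. It is not Skylark's API and not the authors' experimental setup: it assumes NumPy, uses a dense Gaussian random projection as one common choice of sketching operator, and the names `gaussian_sketch`, `sketched_least_squares`, `reduction_ratio`, and `rel_residual_gap` are hypothetical, introduced only for this example.

```python
import numpy as np


def gaussian_sketch(M, sketch_rows, rng):
    """Compress the rows of M with a dense Gaussian random projection.

    Assumption: a Gaussian sketch is used purely for illustration;
    other sketching transforms exist and may be preferable in practice.
    """
    n = M.shape[0]
    # Scaling by 1/sqrt(sketch_rows) preserves squared norms in expectation.
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    return S @ M


def sketched_least_squares(A, b, sketch_rows, seed=0):
    """Solve min ||Ax - b|| approximately on a row-compressed problem."""
    rng = np.random.default_rng(seed)
    # Stack A and b so both are compressed by the same random matrix.
    Ab = np.column_stack([A, b])
    SAb = gaussian_sketch(Ab, sketch_rows, rng)
    SA, Sb = SAb[:, :-1], SAb[:, -1]
    x, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
    return x


# Synthetic tall problem (hypothetical sizes, chosen only for the demo).
rng = np.random.default_rng(42)
n, d, sketch_rows = 5000, 10, 400
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

# Non-sketched baseline versus sketched solve.
x_full, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch = sketched_least_squares(A, b, sketch_rows)

# Two simple effectiveness measures: how much the row count shrank, and
# how much worse the sketched residual is relative to the optimal one.
reduction_ratio = n / sketch_rows
rel_residual_gap = (
    np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_full - b) - 1.0
)
```

Here the sketched problem has `sketch_rows` rows instead of `n`, so the solve touches a much smaller matrix, while `rel_residual_gap` gives one rough way to check the validity of the sketched output against the non-sketched baseline.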
Conference Presentation
© (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Soundararajan Ezekiel and Michael Giansiracusa "Matrix sketching for big data reduction (Conference Presentation)", Proc. SPIE 10199, Geospatial Informatics, Fusion, and Motion Video Analytics VII, 101990F (6 June 2017); https://doi.org/10.1117/12.2262937
KEYWORDS
Matrices
Machine learning
Analytics
Current controlled current source
Digital video recorders
Information science
Video