8 February 2015 Boost OCR accuracy using iVector based system combination approach
Proceedings Volume 9402, Document Recognition and Retrieval XXII; 94020E (2015) https://doi.org/10.1117/12.2076241
Event: SPIE/IS&T Electronic Imaging, 2015, San Francisco, California, United States
Abstract
Optical character recognition (OCR) is a challenging task because most existing preprocessing approaches are sensitive to writing style, writing material, noise, and image resolution. Thus, a single recognition system cannot address all factors of real document images. In this paper, we describe an approach to combine diverse recognition systems by using iVector based features, a method recently developed in the field of speaker verification. Prior to system combination, document images are preprocessed and text line images are extracted with a different approach for each system; an iVector is then derived from the high-dimensional supervector of each text line and used to predict OCR accuracy. We merge hypotheses from multiple recognition systems according to the overlap ratio and the predicted OCR score of the text line images. We present evaluation results on an Arabic document database, where the proposed method is compared against the single best OCR system using the word error rate (WER) metric.
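The merging step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact overlap definition or selection rule, so everything below is an assumption made for illustration: text lines are axis-aligned boxes, the overlap ratio is the intersection area divided by the area of the smaller box, and hypotheses are chosen greedily by predicted score. The names `LineHypothesis`, `overlap_ratio`, `merge_hypotheses`, and the `min_overlap` threshold are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed layout


@dataclass
class LineHypothesis:
    system: str             # which OCR system produced this line
    box: Box                # text line location in the page image
    text: str               # recognized text for the line
    predicted_score: float  # predicted OCR accuracy (higher is better)


def overlap_ratio(a: Box, b: Box) -> float:
    """Intersection area over the smaller box area (an assumed definition)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    smaller = min((a[2] - a[0]) * (a[3] - a[1]),
                  (b[2] - b[0]) * (b[3] - b[1]))
    return (ix * iy) / smaller if smaller > 0 else 0.0


def merge_hypotheses(hyps: List[LineHypothesis],
                     min_overlap: float = 0.5) -> List[LineHypothesis]:
    """Greedy merge: visit hypotheses from best to worst predicted score and
    keep one only if it does not overlap an already kept line too much."""
    kept: List[LineHypothesis] = []
    for h in sorted(hyps, key=lambda h: h.predicted_score, reverse=True):
        if all(overlap_ratio(h.box, k.box) < min_overlap for k in kept):
            kept.append(h)
    # Return lines in top-to-bottom reading order.
    return sorted(kept, key=lambda h: h.box[1])


hyps = [
    LineHypothesis("A", (0, 0, 100, 20), "line one (A)", 0.6),
    LineHypothesis("B", (0, 2, 100, 22), "line one (B)", 0.9),
    LineHypothesis("A", (0, 30, 100, 50), "line two (A)", 0.8),
]
merged = merge_hypotheses(hyps)
```

In this toy input, the two systems disagree on the first line, and the sketch keeps the hypothesis with the higher predicted score while retaining the non-overlapping second line. In the paper, the predicted score comes from the iVector based predictor; here it is simply supplied as a number.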
© (2015) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xujun Peng, Huaigu Cao, and Prem Natarajan "Boost OCR accuracy using iVector based system combination approach", Proc. SPIE 9402, Document Recognition and Retrieval XXII, 94020E (8 February 2015); https://doi.org/10.1117/12.2076241
CITATIONS
Cited by 1 scholarly publication.
KEYWORDS
Optical character recognition
Feature extraction
Systems modeling
Image processing
Speaker recognition
Classification systems
Image classification