Optical character recognition (OCR) is a challenging task because most existing preprocessing approaches are
sensitive to writing style, writing material, noise, and image resolution. Thus, a single recognition system cannot
address all of the factors found in real document images. In this paper, we describe an approach that combines
diverse recognition systems using iVector-based features, a method recently developed in the field of speaker
verification. Prior to system combination, document images are preprocessed and text line images are extracted
with a different approach for each system. For each text line, an iVector is derived from a high-dimensional
supervector and is used to predict OCR accuracy. We merge hypotheses from multiple recognition systems according
to the overlap ratio and the predicted OCR score of the text line images. We present evaluation results on an Arabic
document database, where the proposed method is compared against the single best OCR system using the word
error rate (WER) metric.
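The merging step described above can be illustrated with a minimal sketch. The abstract does not specify the merging rule, so everything here is an assumption made for illustration: the `LineHyp` record, the definition of overlap ratio as intersection area over the smaller box area, the `min_overlap` threshold, and the greedy "keep the highest predicted score among overlapping lines" policy are all hypothetical, not the authors' algorithm.

```python
from dataclasses import dataclass


@dataclass
class LineHyp:
    """One text line hypothesis from one recognition system (hypothetical record)."""
    system: str
    box: tuple    # (x0, y0, x1, y1) in page pixels
    text: str
    score: float  # predicted OCR accuracy for this line, higher is better


def overlap_ratio(a, b):
    """Intersection area divided by the area of the smaller box (assumed definition)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    smaller = min(area_a, area_b)
    return (ix * iy) / smaller if smaller > 0 else 0.0


def merge_hypotheses(hyps, min_overlap=0.5):
    """Greedy merge: visit hypotheses by descending predicted score and keep
    each one unless it overlaps a hypothesis that was already kept."""
    kept = []
    for h in sorted(hyps, key=lambda h: h.score, reverse=True):
        if all(overlap_ratio(h.box, k.box) < min_overlap for k in kept):
            kept.append(h)
    # Return the surviving lines in top-to-bottom, left-to-right order.
    return sorted(kept, key=lambda h: (h.box[1], h.box[0]))


hyps = [
    LineHyp("A", (0, 0, 100, 20), "line one from A", 0.9),
    LineHyp("A", (0, 30, 100, 50), "line two from A", 0.4),
    LineHyp("B", (2, 1, 98, 21), "line one from B", 0.6),
    LineHyp("B", (0, 31, 100, 51), "line two from B", 0.8),
]
merged = merge_hypotheses(hyps)
```

In this toy input each system segments the same two lines slightly differently, and the merged page takes the first line from system A and the second from system B, since those carry the higher predicted scores.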
KEYWORDS: Image segmentation, Cameras, Magnetorheological finishing, Optical character recognition, Image processing algorithms and systems, Image processing, Lutetium, Detection and tracking algorithms, Statistical modeling, Control systems
Document binarization is one of the initial and critical steps in many document analysis systems. With the
success and popularity of hand-held devices, considerable effort is now devoted to converting documents into
digital format using hand-held cameras. In this paper, we propose a Bayesian maximum a posteriori
(MAP) estimation algorithm to binarize camera-captured document images. A novel adaptive segmentation
surface estimation and normalization method is proposed as the preprocessing step, followed by
a Markov random field-based refinement procedure that removes noise and smooths the binarized result.
Experimental results show that our method outperforms other algorithms on document images with poor or
uneven illumination.
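To give a feel for what a Markov random field refinement of a binarized image can look like, here is a minimal sketch. It is not the method of the paper, whose energy function and optimizer are not given in the abstract. It assumes an Ising-style prior over 4-connected neighbors and uses iterated conditional modes (ICM) as the optimizer; the function name `icm_refine` and the parameters `data_cost` and `beta` are hypothetical.

```python
def icm_refine(labels, data_cost=1.0, beta=0.7, max_iters=10):
    """Smooth a binary label grid (list of lists of 0/1) with ICM.

    Assumed local energy of giving pixel (y, x) the value v:
      data_cost if v differs from the initial label at (y, x), plus
      beta for every 4-connected neighbor whose current label differs from v.
    """
    h, w = len(labels), len(labels[0])
    cur = [row[:] for row in labels]  # working copy; the input stays untouched
    for _ in range(max_iters):
        changed = False
        for y in range(h):
            for x in range(w):
                nbrs = [
                    cur[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w
                ]

                def energy(v):
                    return data_cost * (v != labels[y][x]) + beta * sum(v != n for n in nbrs)

                other = 1 - cur[y][x]
                # Flip only when the other label is strictly cheaper.
                if energy(other) < energy(cur[y][x]):
                    cur[y][x] = other
                    changed = True
        if not changed:
            break
    return cur


# An isolated foreground speck on an empty background is treated as noise.
speck = [[0] * 5 for _ in range(5)]
speck[2][2] = 1
cleaned = icm_refine(speck)

# A solid 3x3 stroke is large enough to survive the smoothing.
block = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)] for y in range(5)]
kept = icm_refine(block)
```

The ratio of `beta` to `data_cost` controls how aggressively isolated pixels are overruled by their neighbors; with the values assumed here a single pixel is removed while a 3x3 region is left unchanged.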