Word-level recognition of multifont Arabic text using a feature vector matching approach
7 March 1996
Erik J. Erlandson, John M. Trenkle, Robert C. Vogt III
Proceedings Volume 2660, Document Recognition III; (1996) https://doi.org/10.1117/12.234725
Event: Electronic Imaging: Science and Technology, 1996, San Jose, CA, United States
Abstract
Many text recognition systems recognize text imagery at the character level and assemble words from the recognized characters. An alternative approach is to recognize text imagery at the word level, without analyzing individual characters. This approach avoids the problem of segmenting individual characters and can overcome local errors in character recognition. A word-level recognition system for machine-printed Arabic text has been implemented. Arabic is written in a cursive script in which the characters within a word are joined, and it is therefore difficult to segment at the character level. The system avoids character segmentation by recognizing images of complete words. It computes a vector of image-morphological features on a query word image and matches this vector against a precomputed database of vectors from a lexicon of Arabic words. The database vectors with the highest match scores are returned as hypotheses for the unknown image. Several feature vectors may be stored for each word in the database. Database feature vectors generated using multiple fonts and noise models allow the system to be tuned to its input stream. Used in conjunction with database pruning techniques, this Arabic recognition system has obtained promising word recognition rates on low-quality multifont text imagery.
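The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features or the match score, so cosine similarity, the `Entry` record, the `match_word` function, the four-dimensional vectors, and the transliterated word labels are all assumptions chosen for illustration. The sketch only shows the general shape of the idea: several stored vectors per lexicon word, each word scored by its best-matching vector, and the top-scoring words returned as hypotheses.

```python
import math
from collections import namedtuple

# Hypothetical record: one stored feature vector for a lexicon word,
# tagged with the font/noise variant it was generated under.
Entry = namedtuple("Entry", ["word", "variant", "vector"])


def cosine_similarity(a, b):
    """Assumed match score; the abstract does not name the actual measure."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_word(query_vector, database, top_k=3):
    """Return up to top_k (word, score) hypotheses, best first.

    A word may have several stored vectors (one per font or noise model);
    each word is scored by its best-matching stored vector.
    """
    best = {}
    for entry in database:
        score = cosine_similarity(query_vector, entry.vector)
        if score > best.get(entry.word, float("-inf")):
            best[entry.word] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Toy database with made-up feature values (placeholders, not real features).
DATABASE = [
    Entry("kitab", "font_a", [0.9, 0.1, 0.3, 0.0]),
    Entry("kitab", "font_b", [0.8, 0.2, 0.4, 0.1]),
    Entry("qalam", "font_a", [0.1, 0.9, 0.0, 0.5]),
    Entry("bayt", "font_a", [0.3, 0.3, 0.9, 0.2]),
]

# Query vector standing in for features computed on an unknown word image.
hypotheses = match_word([0.85, 0.15, 0.35, 0.05], DATABASE, top_k=2)
for word, score in hypotheses:
    print(f"{word}: {score:.3f}")
```

Scoring each word by its best stored vector is one simple way to use multiple fonts and noise models per word; the paper may combine them differently, and the database pruning it mentions is not modeled here.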
© (1996) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Erik J. Erlandson, John M. Trenkle, and Robert C. Vogt III "Word-level recognition of multifont Arabic text using a feature vector matching approach", Proc. SPIE 2660, Document Recognition III, (7 March 1996); https://doi.org/10.1117/12.234725
CITATIONS
Cited by 16 scholarly publications.
KEYWORDS
Image segmentation
Databases
Image quality
Optical character recognition
Computing systems
Binary data
Data modeling