23 January 2012
An Oracle-based co-training framework for writer identification in offline handwriting
Utkarsh Porwal, Sreeranga Rajan, Venu Govindaraju
Proceedings Volume 8297, Document Recognition and Retrieval XIX; 82970P (2012) https://doi.org/10.1117/12.912221
Event: IS&T/SPIE Electronic Imaging, 2012, Burlingame, California, United States
Abstract
State-of-the-art techniques for writer identification have centered primarily on improving identification performance. Machine learning algorithms have been used extensively to improve the accuracy of such systems, on the assumption that a sufficient amount of training data is available. Little attention has been paid to the prospect of harnessing the information contained in large amounts of un-annotated data. This paper focuses on a co-training based framework that can be used for iterative labeling of an unlabeled data set by exploiting the independence between multiple views (features) of the data. This paradigm relaxes the assumption that sufficient data is available and tries to generate labeled data from the unlabeled data set while also improving the accuracy of the system. However, the performance of a co-training based framework depends on the effectiveness of the algorithm used to select the data points that are added to the labeled set. We propose an Oracle-based approach to data selection that learns the patterns in the score distribution of classes for labeled data points and then predicts the labels (writers) of the unlabeled data points. This selection method statistically learns the class distribution and predicts the most probable class, unlike traditional selection algorithms, which are based on heuristics. We conducted experiments on the publicly available IAM dataset and illustrate the efficacy of the proposed approach.
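The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, under stated assumptions. Two feature views each train a simple nearest-centroid classifier, and a second-stage "oracle" (here also a nearest-centroid model, chosen purely for brevity) is trained on the concatenated class scores of the labeled points and used to pick which unlabeled points to label in each round. The classifier choice, the softmax-of-distance scores, the margin-based ranking, the synthetic two-writer data, and every function name (`centroids`, `scores`, `predict`, `co_train`, `make_demo`) are hypothetical and are not taken from the paper.

```python
import math
import random


def centroids(X, y):
    """Fit a nearest-centroid model: one mean vector per class."""
    sums, counts = {}, {}
    for x, c in zip(X, y):
        acc = sums.setdefault(c, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[c] = counts.get(c, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def scores(model, x):
    """Per-class scores (softmax of negative distances), in sorted class order."""
    exps = [math.exp(-math.dist(model[c], x)) for c in sorted(model)]
    total = sum(exps)
    return [e / total for e in exps]


def predict(model, x):
    """Return (most probable class, margin between the top two scores)."""
    classes, s = sorted(model), scores(model, x)
    best = max(range(len(s)), key=s.__getitem__)
    top_two = sorted(s, reverse=True)[:2]
    return classes[best], top_two[0] - top_two[1]


def co_train(v1, v2, labels, rounds=5, per_round=2):
    """Iteratively label points whose entry in `labels` is None.

    v1 and v2 are the two views (lists of feature vectors, same order).
    Assumption: the oracle is a model over the score vectors of both views.
    """
    labels = list(labels)  # work on a copy; the caller's list is untouched
    for _ in range(rounds):
        lab = [i for i, c in enumerate(labels) if c is not None]
        unl = [i for i, c in enumerate(labels) if c is None]
        if not unl:
            break
        y = [labels[i] for i in lab]
        m1 = centroids([v1[i] for i in lab], y)
        m2 = centroids([v2[i] for i in lab], y)

        def joint(i):
            # Concatenated class scores from both views for point i.
            return scores(m1, v1[i]) + scores(m2, v2[i])

        # The oracle learns what score patterns look like for each writer.
        oracle = centroids([joint(i) for i in lab], y)
        ranked = sorted(((predict(oracle, joint(i)), i) for i in unl),
                        key=lambda t: -t[0][1])
        # Add only the points the oracle is most confident about.
        for (label, _margin), i in ranked[:per_round]:
            labels[i] = label
    return labels


def make_demo(seed=0, n_per_class=10, n_labeled=2):
    """Synthetic stand-in data: two writers, two 2-D views, few labels."""
    rng = random.Random(seed)
    v1, v2, truth, labels = [], [], [], []
    for writer, centre in (("A", 0.0), ("B", 5.0)):
        for k in range(n_per_class):
            v1.append([centre + rng.uniform(-1, 1) for _ in range(2)])
            v2.append([centre + rng.uniform(-1, 1) for _ in range(2)])
            truth.append(writer)
            labels.append(writer if k < n_labeled else None)
    return v1, v2, labels, truth


v1, v2, labels, truth = make_demo()
result = co_train(v1, v2, labels, rounds=10, per_round=4)
print(sum(r == t for r, t in zip(result, truth)), "of", len(truth), "correct")
```

The synthetic clusters are deliberately well separated, so the sketch says nothing about accuracy on real handwriting data such as IAM; it only shows where a learned selector sits in the co-training loop in place of a fixed heuristic threshold.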
© (2012) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Utkarsh Porwal, Sreeranga Rajan, and Venu Govindaraju "An Oracle-based co-training framework for writer identification in offline handwriting", Proc. SPIE 8297, Document Recognition and Retrieval XIX, 82970P (23 January 2012); https://doi.org/10.1117/12.912221
CITATIONS
Cited by 3 scholarly publications.
KEYWORDS
Expectation maximization algorithms
Machine learning
Feature selection
System identification
Algorithm development
Analytical research
Data processing