Neural networks are a powerful technology for classifying character patterns and object images, and a large number of training samples is critical for classification accuracy. A novel method for recognizing handwritten hiragana characters is proposed that combines a pre-trained convolutional neural network (CNN) with a support vector machine (SVM). The training samples are augmented by pattern distortions such as cosine translation and elastic distortion. A pre-trained CNN, AlexNet, which was trained on a large-scale object image dataset, is used as the feature extractor, and an SVM is used as the trainable classifier. The original hiragana samples of 46 classes in the ETL9B database are divided into two folds by odd and even dataset numbers. The SVM is trained on the odd-numbered samples together with their augmented patterns; the feature vectors of the character patterns are passed from AlexNet to the SVM. Averaged over five test runs with 100 test patterns for each of the 46 classes, the error rate was 1.130%, and the lowest error rate was 0.978% with 506,138 training patterns of distorted hiragana characters. Experimental results show that the proposed method is effective for recognizing handwritten hiragana characters.
Neural networks are a powerful technology for classifying character patterns and object images, and a large number of training samples is critical for classification accuracy. A novel method for recognizing handwritten hiragana characters is proposed that combines a pre-trained convolutional neural network (CNN) with a support vector machine (SVM). The training samples are augmented by pattern distortions such as cosine translation and elastic distortion. A pre-trained CNN, AlexNet, which was trained on a large-scale object image dataset, is used as the feature extractor, and an SVM is used as the trainable classifier. The original hiragana samples of 71 classes in the ETL9B database are divided into two folds by odd and even dataset numbers. The SVM is trained on the odd-numbered samples together with their augmented patterns; the feature vectors of the character patterns are passed from AlexNet to the SVM. Averaged over five test runs with 100 test patterns for each of the 71 classes, the error rate was 2.378%, and the lowest error rate was 2.113% with 468,600 training patterns of distorted hiragana characters. Experimental results show that the proposed method is effective for recognizing handwritten hiragana characters.
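The abstracts above do not give the exact form of the distortions; the following is a minimal sketch of one plausible cosine translation, assuming each pixel row is shifted horizontally by a cosine of its vertical position (the amplitude and phase parameters are hypothetical, not values from the papers):

```python
import math

def cosine_translate(image, amplitude, phase):
    """Shift each pixel row of a binary image horizontally by a cosine
    of its row index. `image` is a list of rows (lists of 0/1 values);
    amplitude and phase are illustrative parameters, not values taken
    from the abstracts above."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y, row in enumerate(image):
        # The horizontal shift varies smoothly with vertical position.
        shift = round(amplitude * math.cos(2 * math.pi * y / height + phase))
        for x, pixel in enumerate(row):
            nx = x + shift
            if 0 <= nx < width:
                out[y][nx] = pixel
    return out

# Each (amplitude, phase) pair yields one extra distorted training sample.
original = [[0, 1, 0, 0],
            [0, 1, 0, 0],
            [0, 1, 0, 0],
            [0, 1, 0, 0]]
distorted = cosine_translate(original, amplitude=1.0, phase=0.0)
```

Sweeping amplitude and phase over a small grid is one way such a distortion could multiply the original sample count into the hundreds of thousands of training patterns reported above.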
Digital cameras and smartphones with orientation sensors support auto-rotation of portrait images. Auto-rotation is performed using the image file's metadata in the exchangeable image file format (EXIF): the sensor output sets the EXIF orientation flag to reflect the positioning of the camera with respect to the ground. Unfortunately, software support for this feature is neither widespread nor consistently applied. Our research goal is to generate the EXIF orientation flag by detecting the upright direction of face images that have no orientation flag, and to apply the result in photo-organizing software. In this paper, we propose a novel upright-detection scheme for face images that relies on generating rotated images in four directions and on part-based face detection with Haar-like features. The input images are frontal faces, which are rotated in-plane into the four possible directions. Among the four rotated images, if exactly one is accepted by the face detector and the other three are rejected, the upright direction is taken from the accepted rotation. The EXIF rotation angle is 0 degrees, 90 degrees clockwise, 90 degrees counter-clockwise, or 180 degrees. Experimental results on 450 face image samples show that the proposed method is very effective in detecting the upright direction of face images with background variations.
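The accept-exactly-one voting logic described above can be sketched as follows; the face detector is left as a hypothetical callable (the paper uses part-based detection with Haar-like features), and images are plain 2-D lists:

```python
def rotate_cw(image):
    """Rotate a 2-D list image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def detect_upright(image, is_face):
    """Try all four in-plane rotations; accept only if exactly one
    rotation passes the (hypothetical) face detector `is_face`.
    Returns the clockwise angle in degrees that makes the image
    upright, or None when zero or several rotations are accepted."""
    accepted = []
    candidate = image
    for angle in (0, 90, 180, 270):
        if is_face(candidate):
            accepted.append(angle)
        candidate = rotate_cw(candidate)
    return accepted[0] if len(accepted) == 1 else None

# Toy stand-in detector: accepts only when the marker pixel sits at the
# top-left corner; the real system uses Haar-like-feature face detection.
upright = detect_upright([[0, 0], [0, 1]], lambda img: img[0][0] == 1)
```

An always-accepting detector yields `None`, which matches the scheme's rejection of ambiguous cases where more than one rotation passes.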
KEYWORDS: Binary data, Image processing, Data conversion, Image segmentation, Matrices, Surgery, Data storage, Digital image processing, Feature extraction, Data processing
The rotation of an image is one of the fundamental functions in image processing and is applied to document image processing in the office. A method of image rotation based on digital image data has been developed. This paper assumes binary digital data and proposes a method that differs from the traditional one based on pixel data: it executes high-speed rotation of a binary image using coordinate data for the start and end of each run. With the proposed method, rotation at an arbitrary angle can be realized by real-number operations on the run data, which suits a general-purpose processor. The method is practically useful because the processing is fast and little memory capacity is required. This paper first discusses the format of the run data, the number of runs, and the data complexity for binary data. The newly devised rotation for binary images is then described: the rotation is performed by successively applying skew coordinate transformations in the vertical and horizontal directions to determine the rotated image. Finally, a document image is actually rotated on a computer, and the processing time is examined to demonstrate experimentally the usefulness of the proposed method.
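The key efficiency idea above is that a skew need only transform run endpoints, not pixels. A minimal sketch of the horizontal-skew step applied directly to run data follows; the run format `(row, start_col, end_col)`, the skew coefficient, and the companion vertical pass are illustrative assumptions, not the paper's exact formulation:

```python
import math

def skew_runs_horizontal(runs, angle_deg):
    """Apply a horizontal skew x' = x + y * tan(angle) to run-length data.
    Each run is (row, start_col, end_col); only the two endpoints of each
    run are transformed, so the cost is proportional to the number of
    runs rather than the number of pixels."""
    a = math.tan(math.radians(angle_deg))
    skewed = []
    for y, x0, x1 in runs:
        shift = round(y * a)  # whole-row shift keeps each run contiguous
        skewed.append((y, x0 + shift, x1 + shift))
    return skewed

# A vertical bar of three one-pixel runs leans to the right under the skew.
bar = [(0, 5, 5), (1, 5, 5), (2, 5, 5)]
leaned = skew_runs_horizontal(bar, 45.0)
```

A rotation would chain this with an analogous vertical pass, per the two-direction skew decomposition the abstract describes.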
An automatic registration method for graph images that recognizes the structure of graphs is proposed. This method saves cost and time compared with manually registering the types and labels of graphs as search indexes in electronic filing systems. The method is implemented in an advanced document-image retrieval system, and the efficiency of the index registration is confirmed by experiment. The method proceeds as follows. First, the types of graphs, the axes, and the labels of the axes are detected; the graphs are classified into several categories and labels are attached to the axes. Next, the types are registered as indexes of the graphs, and the label images are recognized, converted into ASCII data, and also stored as indexes. An automaton model is used in the label-attachment process to analyze the components of the images: it consults a knowledge base of stored layout-structure rules and chooses the appropriate labels for each axis.
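As a rough illustration of rule-driven label attachment, the sketch below assigns text components to axes by their position relative to the plot area. The rules, coordinate convention (image coordinates, y growing downward), and component format are illustrative stand-ins for the paper's automaton and knowledge base:

```python
def attach_labels(components, axes_box):
    """Assign each detected text component to a role using simple layout
    rules (a stand-in for the paper's automaton and knowledge base; the
    rules here are illustrative only). `axes_box` is (left, top, right,
    bottom) in image coordinates; components are (text, x, y) centers."""
    left, top, right, bottom = axes_box
    labels = {}
    for text, x, y in components:
        if y > bottom:              # text below the plot area
            labels[text] = "x-axis label"
        elif x < left:              # text left of the plot area
            labels[text] = "y-axis label"
        elif y < top:               # text above the plot area
            labels[text] = "title"
        else:
            labels[text] = "unclassified"
    return labels

result = attach_labels(
    [("time (s)", 50, 120), ("volts", 5, 60), ("Fig. 1", 50, 2)],
    axes_box=(10, 10, 90, 100),
)
```

In the actual system, each assigned label image would then be recognized and stored as an ASCII index alongside the graph type.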