Paper
30 March 1995 Text/graphics separation for technical papers
Oleg G. Okun, Sergey V. Ablameyko
Author Affiliations +
Proceedings Volume 2422, Document Recognition II; (1995) https://doi.org/10.1117/12.205819
Event: IS&T/SPIE's Symposium on Electronic Imaging: Science and Technology, 1995, San Jose, CA, United States
Abstract
One of the important operations in automatic analysis of technical papers is a text separation from graphics. In practice, a document skew often occurs both for initial document and for its image after scanning. Also text and graphic blocks can exist which have no rectangular shape. In these cases, the standard text/graphics separation methods such as projection profiles or run length smoothing are not always suitable. In this paper, we propose the text/graphics separation algorithm based on two simple and standard properties of technical paper pages. We call them as area and text compactness properties. The area property takes into account the geometrical relationships between text and graphics. The text compactness property reflects the spatial relationships between text components within block and between text and graphics. An application of both properties allows us to accurately perform the separation in the cases above. No skew correction is required before separation and text and graphic blocks can have arbitrary shape.
© (1995) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Oleg G. Okun and Sergey V. Ablameyko "Text/graphics separation for technical papers", Proc. SPIE 2422, Document Recognition II, (30 March 1995); https://doi.org/10.1117/12.205819
Lens.org Logo
CITATIONS
Cited by 1 scholarly publication.
Advertisement
Advertisement
RIGHTS & PERMISSIONS
Get copyright permission  Get copyright permission on Copyright Marketplace
KEYWORDS
Visualization

Bromine

Detection and tracking algorithms

Image processing

Algorithm development

Binary data

Image compression

RELATED CONTENT


Back to Top