18 January 2010 Using definite clause grammars to build a global system for analyzing collections of documents
Joseph Chazalon, Bertrand Coüasnon
Proceedings Volume 7534, Document Recognition and Retrieval XVII; 75340R (2010) https://doi.org/10.1117/12.840436
Event: IS&T/SPIE Electronic Imaging, 2010, San Jose, California, United States
Abstract
Collections of documents are sets of heterogeneous documents, such as a specific series of ancient books, with their own structural and semantic properties linking them. A particular collection contains document images with specific physical layouts, such as text pages or full-page illustrations, appearing in a specific order. Its contents, such as journal articles, may be spread over several pages that are not necessarily consecutive, which produces strong dependencies between the interpretations of pages. In order to build an analysis system that can bring contextual information from the collection to the appropriate recognition modules for each page, we propose to express the structural and semantic properties of a collection with a definite clause grammar. This is made possible by representing collections as streams of document images, and by using extensions to the formalism that we present here. We are then able to automatically generate a parser dedicated to a collection. Besides allowing structural variations and complex information flows, we also show that this approach enables the design of analysis stages, on a document or a set of documents. The benefit of using context is illustrated with several examples and their formalization in this framework.
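To make the idea of parsing a collection as a stream of page images more concrete, the following is a minimal sketch in Python, not the authors' system and not their extended formalism. It mimics the way a definite clause grammar threads the remaining input through each rule and backtracks over alternatives. The layout labels ("title", "text", "illustration"), the toy grammar, and all function names are assumptions made for illustration only.

```python
# Illustrative sketch only: a DCG-style backtracking parser over a stream of pages.
# Each "nonterminal" is a function (pages, position) -> iterator of (value, new_position),
# which plays the role of the hidden difference-list arguments of a DCG rule.

def terminal(layout):
    """Match one page whose (hypothetical) 'layout' field equals `layout`."""
    def parse(pages, i):
        if i < len(pages) and pages[i]["layout"] == layout:
            yield pages[i], i + 1
    return parse

def seq(*parsers):
    """Match each parser in order, like the body of a DCG rule: a --> b, c."""
    def parse(pages, i):
        def go(k, j, acc):
            if k == len(parsers):
                yield acc, j
                return
            for value, j2 in parsers[k](pages, j):
                yield from go(k + 1, j2, acc + [value])
        yield from go(0, i, [])
    return parse

def alt(*parsers):
    """Try each alternative in turn, like several clauses for one nonterminal."""
    def parse(pages, i):
        for p in parsers:
            yield from p(pages, i)
    return parse

def many(parser):
    """Match zero or more repetitions, trying the longest match first."""
    def parse(pages, i):
        for value, j in parser(pages, i):
            if j > i:  # skip empty matches to avoid infinite recursion
                for rest, k in parse(pages, j):
                    yield [value] + rest, k
        yield [], i
    return parse

# Toy grammar (an assumption, not taken from the paper):
#   collection --> article*
#   article    --> title_page, (text_page | illustration_page)*
title_page = terminal("title")
text_page = terminal("text")
illustration_page = terminal("illustration")
article = seq(title_page, many(alt(text_page, illustration_page)))
collection = many(article)

def parse_collection(pages):
    """Return the list of parsed articles, or None if the stream does not fit the grammar."""
    for articles, end in collection(pages, 0):
        if end == len(pages):  # accept only parses that consume the whole stream
            return articles
    return None

pages = [
    {"layout": "title", "id": 1},
    {"layout": "text", "id": 2},
    {"layout": "illustration", "id": 3},
    {"layout": "text", "id": 4},
    {"layout": "title", "id": 5},
    {"layout": "text", "id": 6},
]
articles = parse_collection(pages)
```

Here `articles` groups the six pages into two articles, each made of a title page followed by its body pages. In a real system, the rules would additionally carry arguments so that information recognized on one page (for example an article title) can be passed as context to the recognition of later pages, which is the kind of information flow the abstract refers to.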
© (2010) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Joseph Chazalon and Bertrand Coüasnon "Using definite clause grammars to build a global system for analyzing collections of documents", Proc. SPIE 7534, Document Recognition and Retrieval XVII, 75340R (18 January 2010); https://doi.org/10.1117/12.840436
KEYWORDS
Image processing
Silver
Image analysis
Image segmentation
Statistical analysis
Associative arrays
Computer programming