The papers included in this volume were part of the technical conference cited on the cover and title page. Papers were selected and subject to review by the editors and conference program committee. Some conference presentations may not be available for publication. The papers published in these proceedings reflect the work and thoughts of the authors and are published herein as submitted. The publishers are not responsible for the validity of the information or for any outcomes resulting from reliance thereon.
Please use the following format to cite material from this book: Author(s), “Title of Paper,” in Document Recognition and Retrieval XXI, edited by Bertrand Coüasnon, Eric K. Ringger, Proceedings of SPIE-IS&T Electronic Imaging, SPIE Vol. 9021. Article CID Number (2014)
ISSN: 0277-786X
ISBN: 9780819499387
Copublished by SPIE, P.O. Box 10, Bellingham, Washington 98227-0010 USA, Telephone +1 360 676 3290 (Pacific Time) · Fax +1 360 647 1445, and IS&T—The Society for Imaging Science and Technology, 7003 Kilworth Lane, Springfield, Virginia, 22151 USA, Telephone +1 703 642 9090 (Eastern Time) · Fax +1 703 642 9094
Copyright © 2014, Society of Photo-Optical Instrumentation Engineers and The Society for Imaging Science and Technology.
Copying of material in this book for internal or personal use, or for the internal or personal use of specific clients, beyond the fair use provisions granted by the U.S. Copyright Law is authorized by the publishers subject to payment of copying fees. The Transactional Reporting Service base fee for this volume is $18.00 per article (or portion thereof), which should be paid directly to the Copyright Clearance Center (CCC), 222 Rosewood Drive, Danvers, MA 01923. Payment may also be made electronically through CCC Online at copyright.com. Other copying for republication, resale, advertising or promotion, or any form of systematic or multiple reproduction of any material in this book is prohibited except with permission in writing from the publisher. The CCC fee code is 0277-786X/14/$18.00.
Printed in the United States of America.
Paper Numbering: Proceedings of SPIE follow an e-First publication model, with papers published first online and then in print and on CD-ROM. Papers are published as they are submitted and meet publication criteria. A unique, consistent, permanent citation identifier (CID) number is assigned to each article at the time of the first publication. Utilization of CIDs allows articles to be fully citable as soon as they are published online, and connects the same identifier to all online, print, and electronic versions of the publication. SPIE uses a six-digit CID article numbering system in which:
The CID Number appears on each page of the manuscript. The complete citation is used on the first page, and an abbreviated version on subsequent pages. Numbers in the index correspond to the last two digits of the six-digit CID Number.
Conference Committee
Symposium Chair
Symposium Cochair
Conference Chairs
Conference Program Committee
Additional Paper Reviewers
Session Chairs
Introduction
On behalf of the Document Recognition and Retrieval XXI 2014 (DRR XXI) Program Committee, welcome to the Twenty-first Document Recognition and Retrieval conference, held in San Francisco, California, USA. DRR is held annually as part of the IS&T/SPIE Symposium on Electronic Imaging. It is one of the leading international conferences on document recognition, with a presence for related research on information retrieval and text mining.
This year we received 37 paper submissions, of which 28 were accepted, for an overall acceptance rate of 76%. Of the 37 submissions, 21 were selected for oral presentation (57%) and 7 for poster presentation (19%). We want to sincerely thank the Program Committee members and additional referees for helping us create a strong technical program. This year’s program includes excellent tracks on Handwriting, Form Classification, Text Recognition, Handwritten Text Line Segmentation, Layout Analysis, Information Retrieval, and Data Sets and Ground-Truthing.
Eight authors have applied for the Best Student Paper Award. We are grateful to Elisa H. Barney Smith (chair) and the award committee for carrying out the difficult task of choosing the winning paper. The winner will be announced at the EI Symposium-wide award ceremony on Wednesday morning of the conference. Google has provided $500 for the Best Student Paper Award for the third year, and we are truly grateful for their continued support of the conference.
This year we have two very interesting invited presentations. Ashok Popat and Ray Smith of Google Research will give a joint presentation on “OCR for Google Books,” where many challenges arise from the scale and the diverse nature of the scanned corpus. Alexei A. Efros from the University of California, Berkeley, will give a talk entitled “What makes Big Visual Data Hard?” and speak about problems encountered in collecting and using large visual data sets, based on his extensive research in computer vision.
We hope that you all have an excellent experience at DRR XXI!
Bertrand Coüasnon
Eric K. Ringger