
Document Image Understanding: Links to Research Projects

TeV has a long history of research in document image segmentation and camera-based text detection. This survey page provides a selection of links to academic research projects around the world related to Document Image Understanding and to Text Extraction from generic images, arranged by topic.

Document Image Understanding combines image analysis and pattern recognition techniques to process and extract information from documents. Document images, in a broad sense, include paper documents that have been scanned or captured by a camera (including a smartphone camera), as well as video frames with superimposed captions and, more generally, pictures of scenes that contain text.

Research therefore includes layout analysis of paper documents, OCR/ICR and symbol recognition, graphics recognition, and text extraction from scenes. The research projects grouped below, some of which may already be closed but remain of interest, cover topics such as: DOCUMENT LAYOUT ANALYSIS, TEXT IN SCENE IMAGES, HANDWRITTEN TEXT INTERPRETATION, SPECIAL TEXT, ...

DOCUMENT LAYOUT ANALYSIS

TEXT IN SCENE IMAGES

  • FBK Italy:
  • Computer Vision Center (CVC), The Document Analysis Group:
  • Multimedia Analysis and Data Mining Lab MADM at DFKI (Germany):
    • Recognition of Overlay Text
    • Recognition of Scene Text
  • Ai Lab, Kyungpook U., Korea:
    • Text Location in complex images (1999)
    • Support Vector Machine-based Text Detection in Videos (2001).
    • Text Location using Cluster-based Template Matching (2002).
    • Text Detection and Removal in Video Sequences (2003).
  • Tohoku University (Japan), Education Center for Information Processing:
    • dlabel, in Project-O2. (2002)
  • ParisTech, Mathematical Morphology Centre
  • SRI International's Advanced Automation Technology Center:
  • Institute of Computer Science and Applied Mathematics, Bern University:
    • Identification of Text on Colored Book and Journal Covers (1999)
  • Language And Media Processing (LAMP) Lab, University of Maryland:
    • Extraction of Text from Video
  • Joint Research & Development Laboratory (JDL) for Advanced Computer and Communication Technologies:
    • Multimedia analysis.
    • Detection of Text on Road Signs from Video (2005)
    • Jersey Number Detection in Sports Video (2005)
  • CEDAR, at Buffalo University:
  • Department of Computer Science, University of Bristol:
  • PRImA, Pattern Recognition and Image Analysis (University of Salford). Text Segmentation in Web Images (A. Antonacopoulos)
    • Colour text segmentation in web images (2007)
  • PRIP, Michigan State University:
    • A Survey on Text Information Extraction from Images and Video (PR 37(5):977-997, 2004)
  • Institut Eurecom, France
    • Recognition of document images (2004) with report: Mobile Document Imaging
  • Praktische Informatik IV, University of Mannheim. MoCA project - Text Segmentation and Recognition in Digital Videos (2002)
  • Four Eyes Lab, Department of Computer Science, University of California:
  • HP Labs:
  • Circuit Theory and Signal Processing (TCTS Lab), Faculté Polytechnique de Mons (Belgium):
    • SYPOLE project: portable text recognizer for blind or visually impaired people. (2003-2006)
    • RECITE, Interactive text recognition (2007)
    • Color text extraction with selective metric-based clustering (2007)
    • Natural Scene Text Understanding, Chap 16 in: Vision Systems, Segmentation and Pattern Recognition (2007)
  • Peter Meijer Research:
  • Smith-Kettlewell Rehabilitation Engineering Research Center (RERC) (2008)
    • Applications of CV for the Blind and Visually Impaired
    • H. Shen and J. Coughlan. "Grouping Using Factor Graphs: an Approach for Finding Text with a Camera Phone." IAPR Workshop on Graph-based Representations in Pattern Recognition (GbRPR '07)
  • Institut National de Recherche en Informatique et en Automatique:
    • "Navisio: Towards an integrated reading aid system for low vision patients" (2008) (pdf)

 
HANDWRITTEN TEXT INTERPRETATION

  • CEDAR, at Buffalo University:
  • CENPARMI, Concordia University:
    • Handwriting recognition and signature verification
    • Publications.
  • Humanity & Industry at Department of Information Technology, Uppsala University:
    • Q2B project, automatic recognition in historical hand-written manuscripts.
  • Language And Media Processing (LAMP) Lab, University of Maryland:
  • Multimedia Architectures Lab, Essex University:
    • OSCAR, an offline script and character recognition toolset. Work on OCR algorithms for discrete characters, cursive script and signatures.
  • PRHLT, Pattern Recognition and Human Language Technologies, Universitat Politecnica de Valencia:
    • MITTRAL Multimodal Interaction for Text Transcription with Adaptive Learning
  • University of Bern:
  • University of Michigan - Dearborn, Department of Electrical and Computer Engineering:
    • Character segmentation in handwritten words (1996)

 
SPECIAL TEXT

 
PERFORMANCE ASSESSMENT

  • F. Shafait, D. Keysers, T.M. Breuel
    Performance Comparison of Six Algorithms for Page Segmentation, IAPR Workshop on Document Analysis Systems, DAS 2006, LNCS vol. 3872, pp. 368-379
  • S. Mao, T. Kanungo
    Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms, PAMI 23(2):242-256, 2001
  • J. Kanai, T.A. Nartker, S.V. Rice, G. Nagy
    Performance metrics for document understanding systems, 2nd International Conference on Document Analysis and Recognition, Tsukuba, Japan, October 20-22, 1993, pp. 424-427
  • The PinkPanther Environment for automatic benchmarking of document page segmentation, Pattern Recognition, Vol. 31, No. 9, pp. 1191-1204, September 1998
  • PRImA, Pattern Recognition and Image Analysis (University of Salford). Performance Assessment of Document Analysis Systems (A. Antonacopoulos)

This page provides some links to academic research projects around the world related to Document Image Understanding, arranged by topic. You may also be interested in our list of OCR/ICR commercial research and products, which also links to several open source OCR systems.
Disclaimer: TeV of FBK cannot attest to the accuracy of information provided by these links or any other linked site. Providing links to a non-FBK web site does not constitute an endorsement by FBK or any of its employees of the sponsors of the site or the information or products presented on the site.

Responsible person: 
Carla Maria Modena