Lukas Neumann (2010)
A method for text localization and recognition in real-world images
Master's thesis, Czech Technical University in Prague, Faculty of Electrical Engineering, Dept. of Cybernetics.
This paper presents a general algorithm for text detection and
recognition in real-world images, a task that, owing to numerous
problems, is harder than standard printed document
recognition. The algorithm finds text areas in photographs taken by a
standard camera or a mobile phone and `reads' the content of the
detected text areas, even when the text occupies just a small part of
the image, when the camera is not aimed directly at the text and when
lighting conditions are not perfect. The algorithm uses many
innovative components, such as simultaneous processing of multiple text
line hypotheses, feedback loops to correct wrong decisions of previous
steps, use of synthetic fonts to train the algorithm (so that there is
no need for time-consuming acquisition and labeling of real-world
training data) or character recognition based on line context using a
typographic model. The algorithm was tested on four datasets of
real-world images, two of which are public and have already been
used to evaluate the performance of other existing methods
for text detection and recognition. The proposed algorithm outperforms
the other existing methods, correctly reading 71\% and 60\% of the
characters in these two datasets, respectively, thus achieving
state-of-the-art results. We conclude by showing possible applications of the proposed
algorithm, such as automatic indexing of images with text into a
database, automatic information retrieval for mapping applications,
a mobile phone application to help blind people, an automatic system for
drivers that warns about surrounding traffic signs, or an automatic
translator of foreign signs and labels. The algorithm was also
successfully tested on Cyrillic text.
In Czech.

