Character Recognition of Degraded Document Images Removing Strike

Author(s):

Liz Maria Mathew, ST. JOSEPH'S COLLEGE OF ENGINEERING AND TECHNOLOGY PALAI, KERALA, INDIA; Suma R, ST. JOSEPH'S COLLEGE OF ENGINEERING AND TECHNOLOGY PALAI, KERALA, INDIA

Keywords:

Binarization, Character Recognition, Discrete Wavelet Transform (DWT), Adaptive image contrast

Abstract

Optical character recognition (OCR) is the mechanical or electronic conversion of images of typewritten or printed text into machine-encoded text. Character recognition becomes difficult when the document images are degraded, so binarization is used to remove the degradations. Libraries and archives around the world store large numbers of old and historically important documents and manuscripts. Environmental factors, the poor quality of the materials used in their creation, and improper handling cause these documents to suffer a high degree of degradation of various types. Today, there is a strong move towards digitizing old documents such as manuscripts so that their content can be preserved for future generations. Character recognition from noisy and degraded documents remains a challenging task. In historical document analysis, old printed documents have a high occurrence of degraded characters, especially characters broken by ink fading. The objective of document image analysis is to recognize the text in degraded document images and extract the intended information. Here, a character recognition technique is proposed for degraded document images. A strike removal method is also implemented to remove strikes drawn over the text in the document images. In addition, the adaptive contrast image is enhanced using three methods: linear contrast enhancement, piecewise linear stretch, and homomorphic filtering. The performance analysis is based on the contrast enhancement applied to the adaptive contrast image. The character recognition step includes horizontal scanning, vertical scanning, and recognition using the Discrete Wavelet Transform (DWT). The proposed system recognizes almost all characters of the input image.
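The abstract does not spell out how the horizontal and vertical scanning steps work. One common reading is projection-profile segmentation: scan rows to find text lines, then scan columns within each line to find characters. The sketch below illustrates that reading only; it is an assumption, not the authors' implementation. The function names `runs` and `segment_characters`, the 1 = ink convention, and the half-open box format are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Assumption: the input is already binarized, with 1 = ink and 0 = background.
Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, bottom, left, right), half-open


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges where the profile is non-zero."""
    spans = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            spans.append((start, i))       # the run ended at the previous index
            start = None
    if start is not None:
        spans.append((start, len(profile)))  # run reaches the end of the profile
    return spans


def segment_characters(image: Image) -> List[Box]:
    """Split a binary page into character boxes in reading order."""
    boxes = []
    # Horizontal scanning: count ink pixels per row to locate text lines.
    row_profile = [sum(row) for row in image]
    for top, bottom in runs(row_profile):
        line = image[top:bottom]
        # Vertical scanning: count ink pixels per column within this line.
        col_profile = [sum(col) for col in zip(*line)]
        for left, right in runs(col_profile):
            boxes.append((top, bottom, left, right))
    return boxes
```

Each returned box could then be cropped and passed to a recognizer, for example one that compares DWT coefficients against stored templates. Note that this simple scheme assumes characters are separated by at least one blank column; touching or broken characters, which the abstract identifies as common in degraded documents, would need additional handling.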

Other Details

Paper ID: IJSRDV3I90502
Published in: Volume : 3, Issue : 9
Publication Date: 01/12/2015
Page(s): 701-705
