Abstract: Segmentation of text from badly degraded document images is a very challenging task due to the high inter/intra-variation between the document background and the foreground text of different document images. In this project, we propose a novel document image binarization technique that addresses these issues by using adaptive image contrast. The adaptive image contrast is a combination of the local image contrast and the local image gradient, and it is tolerant to the text and background variation caused by different types of document degradation. In the proposed technique, an adaptive contrast map is first constructed for an input degraded document image. The contrast map is then binarized and combined with Canny's edge map to identify the text stroke edge pixels. The document text is further segmented by a local threshold that is estimated from the intensities of the detected text stroke edge pixels within a local window. The proposed method is simple, robust, and involves minimal parameter tuning. Experiments on the Bickley diary dataset, which consists of several challenging, poor-quality document images, show the superior performance of the proposed method compared with other techniques.
Keywords: Binarization, Text Stroke Detection, Local Thresholding, Adaptive Contrast, Local Gradient, Post-Processing
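To make the first step of the pipeline concrete, the following is a minimal sketch of how an adaptive contrast map might be computed. The abstract states only that the adaptive contrast combines the local image contrast and the local image gradient, so the details here are assumptions made for illustration: the function name `adaptive_contrast`, the fixed weight `alpha`, the 3x3 window, the scaling of the gradient by 255, and the small constant `eps` are all hypothetical choices, not the method's actual parameters.

```python
def adaptive_contrast(img, alpha=0.5, eps=1e-6):
    """Illustrative adaptive contrast map (not the authors' exact formula).

    img   : 2D list of grayscale intensities in [0, 255]
    alpha : assumed fixed weight between local contrast and local gradient
    eps   : small constant that avoids division by zero on black regions
    """
    height, width = len(img), len(img[0])
    cmap = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # 3x3 neighbourhood, clipped at the image borders (assumed size).
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            hi, lo = max(window), min(window)
            # Local image gradient: raw max-min difference, scaled to [0, 1].
            gradient = (hi - lo) / 255.0
            # Local image contrast: the same difference normalized by the
            # local brightness, which reduces sensitivity to background
            # intensity variation.
            contrast = (hi - lo) / (hi + lo + eps)
            cmap[y][x] = alpha * contrast + (1.0 - alpha) * gradient
    return cmap


# Usage: a bright page with one dark vertical stroke in the middle column.
page = [[200] * 5 for _ in range(5)]
for row in page:
    row[2] = 50
contrast_map = adaptive_contrast(page)
```

In this toy input, pixels whose window touches the stroke receive a high value while flat background pixels receive zero, which is the property that the later binarization and edge-combination steps rely on. A weight that adapts to the image statistics could replace the fixed `alpha`, but the abstract does not specify how the combination is weighted.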