Comparison Of Various Binarization Technique For Degraded Documents

Rajwinder Kaur and Anand Kumar Mittal
Page No: 819-824

Binarization is the process of converting an image into black and white: a threshold value is defined, and the colors above that value are converted to white while the colors below it are converted to black. This is a very simple process in digital image processing when the document is written in black ink on white paper. Document image binarization is an important step in the document image analysis and recognition pipeline, and the performance of a binarization technique directly affects the recognition analysis. The quality of the images, however, has a significant impact on OCR performance, since most historical archive document images are of poor quality due to aging, discolored cards, and ink fading. In recent years this method has gained popularity over its competitors due to its simplicity, superior convergence characteristics, and high solution quality. Two algorithms are presented that are suitable for scanning document images at high speed. They are designed to operate on a portion of the image while the document is being scanned; thus, they fit a pipeline architecture and lend themselves to real-time implementation. The first algorithm is based on adaptive thresholding and uses local edge information to switch between global thresholding and adaptive local thresholding determined from the statistics of a local image window. The second thresholding algorithm is based on tracking the foreground and background levels using clustering based on a variant of the K-means algorithm. The two approaches may be used independently or may be combined to improve performance.
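The difference between global and local thresholding described above can be illustrated with a minimal sketch. This is not the algorithm presented in the paper: it uses a plain local-mean rule without the edge-based switching, and the function names, the window `radius`, the `offset`, and the sample `page` values are all assumptions chosen for illustration.

```python
def global_threshold(image, t):
    """Map pixels above t to white (255) and all others to black (0)."""
    return [[255 if p > t else 0 for p in row] for row in image]


def local_mean_threshold(image, radius=1, offset=20):
    """Threshold each pixel against the mean of its local window.

    The window is (2*radius + 1) pixels wide, clipped at the image border.
    `offset` (an assumed tuning parameter) lowers the threshold slightly so
    that uniform background is not mistaken for ink.
    """
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            local_t = sum(window) / len(window) - offset
            row.append(255 if image[y][x] > local_t else 0)
        out.append(row)
    return out


# Hypothetical degraded page: the background fades from left to right,
# with one ink pixel on the bright side and one on the dark side.
page = [
    [220, 200, 180, 160, 140, 120, 100],
    [220, 100, 180, 160, 140,  20, 100],
    [220, 200, 180, 160, 140, 120, 100],
]

global_result = global_threshold(page, 128)
local_result = local_mean_threshold(page, radius=1, offset=20)
```

On this sample, the single global threshold of 128 turns the two darkest background columns black along with the ink, while the local rule marks only the two ink pixels as black. That is the kind of uneven background, typical of aged documents, that motivates local window statistics.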
