Bleed-through is a frequent, pervasive degradation in ancient manuscripts, which is caused by ink seeped from the opposite side of the sheet. Bleed-through, appearing as an extra interfering text, hinders document readability and makes it difficult to decipher the information contents. Digital image restoration techniques have been successfully employed to remove or significantly reduce this distortion. This paper proposes a two-step restoration method for documents affected by bleed-through, exploiting information from the recto and verso images. First, the bleed-through pixels are identified, based on a non-stationary, linear model of the two texts overlapped in the recto-verso pair. In the second step, a dictionary learning-based sparse image inpainting technique, with non-local patch grouping, is used to reconstruct the bleed-through-contaminated image information. An overcomplete sparse dictionary is learned from the bleed-through-free image patches, which is then used to estimate a befitting fill-in for the identified bleed-through pixels. The non-local patch similarity is employed in the sparse reconstruction of each patch, to enforce the local similarity. Thanks to the intrinsic image sparsity and non-local patch similarity, the natural texture of the background is well reproduced in the bleed-through areas, and even a possible overestimation of the bleed through pixels is effectively corrected, so that the original appearance of the document is preserved. We evaluate the performance of the proposed method on the images of a popular database of ancient documents, and the results validate the performance of the proposed method compared to the state of the art.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited