Wednesday, November 9, 2011

OCR Trick for Old Books

I was a book lover as a child and spent all my allowance on books. By the time I became an adult I had a few hundred of them. Unfortunately, the storage capacity of the house stayed the same, and people at home were not ready to throw things out to make room for my books. I did not know what to do. White ants (termites) ate through a huge pile of bound volumes of my childhood comics, which I loved even after I had grown up. During one school vacation the pile was intact. By the next vacation only a ghost of it remained: it looked whole from the outside, but inside it was a sticky mass full of ants. It broke my heart. The thing would not even catch fire.

In those days there were no computers or scanners. When they came along, I converted my books into ebooks by scanning them and running optical character recognition (OCR). Then I gave away all my books.

Unfortunately, the pages of the old books had all yellowed. With the standard settings for OCR scanning, a 300 dpi black-and-white image, nothing could be converted to text. Then I had a bright idea: I scanned the pages as 300 dpi color images and ran OCR on those. It worked like a charm. The images were considerably larger, but image size is not a problem with the huge hard disks in modern computers, and the images are deleted anyway once the OCR is complete. Now all my books are ebooks. Using Calibre, I have converted them to MOBI format, so I can read them on an Amazon Kindle too.
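One plausible explanation for why the color scans succeed where the black-and-white ones fail (an assumption, not something measured here): a 1-bit scan decides "black or white" with a fixed cutoff, and a darkened, yellowed page can fall on the black side along with the ink. A color scan keeps the full brightness information, so the OCR software can choose its own cutoff between paper and ink. The sketch below illustrates that idea on made-up pixel values, using Otsu's method as a stand-in for whatever a real OCR engine does internally. The pixel colors, the cutoff of 128, and all function names are hypothetical.

```python
# Illustrative sketch only: synthetic pixels, not real scanner output.

def luminance(rgb):
    """Convert an (R, G, B) pixel to a 0-255 gray level (ITU-R BT.601 weights)."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def otsu_threshold(gray_values):
    """Return the gray level that best separates dark from light pixels (Otsu's method)."""
    hist = [0] * 256
    for v in gray_values:
        hist[v] += 1
    total = len(gray_values)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_score = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue
        w_light = total - w_dark
        if w_light == 0:
            break
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        # Between-class variance (up to a constant factor).
        score = w_dark * w_light * (mean_dark - mean_light) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t

def count_ink(gray_values, threshold):
    """Count pixels classified as ink: gray level at or below the threshold."""
    return sum(1 for v in gray_values if v <= threshold)

# Hypothetical yellowed page: 90 brownish paper pixels and 10 faded ink pixels.
PAPER = (150, 120, 70)
INK = (60, 45, 30)
page = [luminance(PAPER)] * 90 + [luminance(INK)] * 10

FIXED_CUTOFF = 127  # assumed midpoint cutoff of a 1-bit scan
fixed_ink = count_ink(page, FIXED_CUTOFF)             # paper is swallowed too
adaptive_ink = count_ink(page, otsu_threshold(page))  # only the real ink

if __name__ == "__main__":
    print("fixed cutoff marks", fixed_ink, "of", len(page), "pixels as ink")
    print("adaptive cutoff marks", adaptive_ink, "of", len(page), "pixels as ink")
```

With these made-up values the fixed cutoff turns the entire page black, which would leave nothing for OCR to read, while the adaptive cutoff isolates just the ink pixels.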
