After writing the post on converting PNMs to DjVu, I ran into some trouble scanning documents written in blue ink. The problem: XSane didn't let me set the threshold for converting the scanned image to line art (B&W). So I tried scanning the document in grayscale and in color and converting it to bitonal afterwards using imagemagick. I tried two approaches. With the -monochrome command line switch, the conversion looked good, but it used halftones (dithering), and the resulting DjVu document was twice as large as a normal B&W one. With the -threshold switch, the compressed DjVu document was much smaller, but it looked awful: either it was too dark, or some of the text disappeared. After giving it some thought, I knew I could find a better solution.
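One possible reason a single cutoff struggles with blue ink is that blue contributes little to the usual grayscale value. The sketch below assumes the Rec. 601 weights that PIL documents for its "L" mode (imagemagick may weight the channels differently), and the sample pixels are made-up values, not measurements from a real scan.

```python
# Illustration only: Rec. 601 luma weights, as documented for PIL's "L" mode.
def luma(rgb):
    r, g, b = rgb
    return (299 * r + 587 * g + 114 * b) // 1000

def global_threshold(rgb, cutoff=127):
    """Return 0 (black) or 255 (white) using one fixed cutoff."""
    return 0 if luma(rgb) < cutoff else 255

# Hypothetical sample pixels (assumed values, not taken from a scan).
dark_blue_ink = (60, 70, 180)
light_blue_ink = (120, 140, 215)
paper = (245, 245, 240)

for name, px in [("dark ink", dark_blue_ink),
                 ("light ink", light_blue_ink),
                 ("paper", paper)]:
    print(name, luma(px), global_threshold(px))
```

With these assumed values the lighter ink stroke lands above the cutoff and turns white, which matches the "text disappeared" symptom. Raising the cutoff enough to catch it risks darkening the background instead.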
I came up with the idea that I need to find a way to identify the text written in blue ink and make sure it turns black. The model I used identifies the ink by its blue channel. So I tried a dynamic threshold that gives more weight to the blue channel. I used the gamma option in imagemagick, but the result wasn't good enough. I had to take different actions depending on the blue level.
My next shot at this was to use imagemagick's fx operator. It lets you iterate over the pixels in the image and apply a custom expression. While it sounded very good, it turned out to be a disaster because it was very slow. It is so slow that it becomes useless on high-resolution pictures: it hadn't finished processing the 300dpi scanned document even an hour after it started. In my opinion, it is below par even on relatively small images. This feature would be great if only it ran faster.
At this point I realized I probably couldn't do it directly from the command line, so I decided to implement a solution using Python and the Python Imaging Library (PIL). This was the start of my new project, biscan (blue ink scan). The program iterates over the pixels in the image and checks the blue level; if the blue level is high (but not too high), it forces the pixel to become black. The following code is still pretty experimental, and is the first prototype.
#!/usr/bin/python
"""
biscan - Blue Ink Scan 0.1.
Takes a color image of a scanned document written using blue ink, and turns
it into a black-and-white (lineart) image.

(C) 2008 Guy Rutenberg. Released under the terms of the GPLv2.
"""

import sys
import Image
from optparse import OptionParser

def parseArguments():
    parser = OptionParser(usage="%prog [options] FILEIN FILEOUT",
                          version="%prog 0.1 ")
    parser.add_option("-v", "--verbose", dest="verbose",
                      action="store_true", default=False,
                      help="be verbose")
    parser.add_option("-b", "--blue-threshold", dest="blue_threshold",
                      type="int", default=220,
                      help="Set the blue threshold (0..255) to NUM",
                      metavar="NUM")
    return parser.parse_args()

def main(options, args):
    if len(args)<2:
        print "See --help for usage instructions"
        return 1

    # args[0] is the input file, args[1] the output file
    im = Image.open(args[0])
    pix = im.load()
    # im.size is (width, height); pixels are assumed to be (R, G, B) tuples
    for x in xrange(im.size[0]):
        for y in xrange(im.size[1]):
            # index 2 is the blue channel
            if pix[x,y][2]<options.blue_threshold:
                pix[x,y] = (0,0,0)
        if options.verbose:
            print "processed line",x

    def threshold(i):
        if i<127:
            return 0
        return 255

    # convert to grayscale, then apply a fixed cutoff to get B&W
    im.convert("L").point(threshold).save(args[1])

if __name__== '__main__':
    (options, args) = parseArguments()
    sys.exit(main(options, args))
Run python biscan --help for options. This version outperforms all the methods mentioned above in both quality and the compression ratio achieved on the output. Its performance is reasonable, around 10 seconds for an 8 megapixel image. I've tested it with several different blue inks (Uniball, Pilot, Ballograf and some generic ones) and it worked pretty well on all of them.
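To see what the per-pixel rule does without needing PIL or a scan, here is a minimal sketch of the same decision on plain RGB tuples. The function names and sample pixels are hypothetical, and the grayscale step only approximates PIL's "L" conversion using its documented Rec. 601 weights.

```python
def luma(rgb):
    # Approximation of PIL's "L" conversion (Rec. 601 weights).
    r, g, b = rgb
    return (299 * r + 587 * g + 114 * b) // 1000

def biscan_pixel(rgb, blue_threshold=220, cutoff=127):
    """Return 0 (black) or 255 (white) for one RGB pixel."""
    # Step 1: anything whose blue level is below the threshold is forced black.
    if rgb[2] < blue_threshold:
        return 0
    # Step 2: remaining pixels go through the fixed grayscale cutoff.
    return 0 if luma(rgb) < cutoff else 255

def biscan_rows(rows, blue_threshold=220):
    """Apply the rule to an image given as a list of rows of RGB tuples."""
    return [[biscan_pixel(px, blue_threshold) for px in row] for row in rows]

# Hypothetical pixels: two ink strokes and the paper background.
sample = [[(60, 70, 180), (120, 140, 215), (245, 245, 240)]]
print(biscan_rows(sample))
print(biscan_rows(sample, blue_threshold=250))
```

With the default threshold of 220, both assumed ink pixels come out black and the paper stays white. Pushing the threshold up to 250 in this example also blackens the paper pixel, which suggests why the -b option needs tuning per scan.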
So what next? I plan to improve the script and allow it to almost perfectly convert documents written in blue ink to B&W. I’m going to experiment with a new model for identifying the blue ink using HSL values (instead of RGB). The next version will also be more polished, as I’ve released this one under the motto of “release early” (and I hope I will also release often). So stay tuned for updates.
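As a rough idea of what an HSL-based test could look like, here is a small sketch using Python's standard colorsys module. It is not the planned implementation: the function name, the hue range and the saturation cutoff are all assumptions picked for illustration, and the sample pixels are made up.

```python
import colorsys

def looks_like_blue_ink(rgb, hue_range=(0.55, 0.75), min_saturation=0.25):
    """Guess whether an RGB pixel (0..255 per channel) is blue ink.

    hue_range and min_saturation are assumed values, not tuned on real scans.
    """
    r, g, b = [c / 255.0 for c in rgb]
    # colorsys returns hue, lightness, saturation, each in the range 0..1;
    # pure blue has a hue of 2/3.
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    return hue_range[0] <= h <= hue_range[1] and s >= min_saturation

# Hypothetical pixels: two blue strokes, paper, pencil gray, red pen.
for px in [(60, 70, 180), (120, 140, 215), (245, 245, 240),
           (40, 40, 40), (200, 40, 40)]:
    print(px, looks_like_blue_ink(px))
```

Judging by hue rather than the raw blue level would keep bright paper, which also has a high blue value, from being confused with ink, at the cost of two more parameters to tune.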
If you give this script a try, I will be glad to hear about it. Of course, if you have any questions, you're welcome to contact me.
biscan requires PIL 1.1.6 (the latest one as of this time). PIL can be found in Gentoo under