<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Guy Rutenberg &#187; Python</title>
	<atom:link href="http://www.guyrutenberg.com/tag/python/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.guyrutenberg.com</link>
	<description>Keeping track of what I do</description>
	<lastBuildDate>Wed, 16 Jun 2010 19:53:40 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Audio Based True Random Number Generator POC</title>
		<link>http://www.guyrutenberg.com/2010/05/14/audio-based-true-random-number-generator-poc/</link>
		<comments>http://www.guyrutenberg.com/2010/05/14/audio-based-true-random-number-generator-poc/#comments</comments>
		<pubDate>Fri, 14 May 2010 12:18:14 +0000</pubDate>
		<dc:creator>Guy</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.guyrutenberg.com/?p=678</guid>
		<description><![CDATA[A few days ago I came up with an idea to create a true random number generator based on noise gathered from a cheap microphone attached to my computer. Tests showed that when sampling the microphone, the least significant bit behaves pretty randomly. This led me to think it might be a good source for gathering entropy [...]]]></description>
			<content:encoded><![CDATA[<p>A few days ago I came up with an idea to create a true random number generator based on noise gathered from a cheap microphone attached to my computer. Tests showed that when sampling the microphone, the least significant bit behaves pretty randomly. This led me to think it might be a good source for gathering entropy for a true random number generator.<br />
<span id="more-678"></span><br />
The base design was to gather the noise from the microphone, then apply a process that makes it more uniform and refines its randomness. After some design iterations I came up with a process based on applying a hash function to the noise. Each iteration fills one block of the hash function with the least significant bits of the microphone output, applies the hash, and outputs the current hash digest. Assuming the hash function is uniform, this will output uniformly distributed blocks of bits. Furthermore, because the previous state of the hash function influences the next digest computation, the process accumulates entropy that can smooth out potentially less random blocks. Because for all commonly used hash functions the block size is much larger than the digest size, the output can&#8217;t tell much about the current state or any future or past state. This holds true even if someone can find all pre-images of the hash function, as the number of possible states will be too big.</p>
<p>I&#8217;ve built a Python proof of concept (using md5 as a hash function) suitable for Linux.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">import</span> hashlib
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">struct</span>
&nbsp;
&nbsp;
<span style="color: #ff7700;font-weight:bold;">class</span> GRandom:
    <span style="color: #ff7700;font-weight:bold;">def</span> <span style="color: #0000cd;">__init__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #008000;">self</span>.<span style="color: black;">audio</span> = <span style="color: #008000;">open</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;/dev/dsp&quot;</span>,<span style="color: #483d8b;">&quot;rb&quot;</span><span style="color: black;">&#41;</span>
        <span style="color: #008000;">self</span>.<span style="color: #008000;">hash</span> = hashlib.<span style="color: #dc143c;">md5</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> get_raw_block<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        buffer = <span style="color: #008000;">self</span>.<span style="color: black;">audio</span>.<span style="color: black;">read</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span>.<span style="color: #008000;">hash</span>.<span style="color: black;">block_size</span><span style="color: #66cc66;">*</span><span style="color: #ff4500;">8</span><span style="color: black;">&#41;</span>
        <span style="color: #dc143c;">bytes</span> = <span style="color: #dc143c;">struct</span>.<span style="color: black;">unpack</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;%iB&quot;</span><span style="color: #66cc66;">%</span><span style="color: #008000;">len</span><span style="color: black;">&#40;</span>buffer<span style="color: black;">&#41;</span>, buffer<span style="color: black;">&#41;</span>
&nbsp;
        longs = <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span>
        <span style="color: #ff7700;font-weight:bold;">for</span> i <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">range</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span>.<span style="color: #008000;">hash</span>.<span style="color: black;">block_size</span>/<span style="color: #ff4500;">4</span><span style="color: black;">&#41;</span>:
            temp = <span style="color: #ff4500;">0</span>
            <span style="color: #ff7700;font-weight:bold;">for</span> b <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #dc143c;">bytes</span><span style="color: black;">&#91;</span>i<span style="color: #66cc66;">*</span><span style="color: #ff4500;">32</span>:<span style="color: black;">&#40;</span>i+<span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span><span style="color: #66cc66;">*</span><span style="color: #ff4500;">32</span><span style="color: black;">&#93;</span>:
                temp = <span style="color: black;">&#40;</span>temp <span style="color: #66cc66;">&lt;&lt;</span> <span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span> ^ <span style="color: black;">&#40;</span>b <span style="color: #66cc66;">&amp;</span> <span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span>
            longs.<span style="color: black;">append</span><span style="color: black;">&#40;</span>temp<span style="color: black;">&#41;</span>
        <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #dc143c;">struct</span>.<span style="color: black;">pack</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;%iI&quot;</span><span style="color: #66cc66;">%</span><span style="color: #008000;">len</span><span style="color: black;">&#40;</span>longs<span style="color: black;">&#41;</span>, <span style="color: #66cc66;">*</span>longs<span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> get_block<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #008000;">self</span>.<span style="color: #008000;">hash</span>.<span style="color: black;">update</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span>.<span style="color: black;">get_raw_block</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
        <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">self</span>.<span style="color: #008000;">hash</span>.<span style="color: black;">digest</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></div></div>
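
<p>As a rough illustration of the bit-packing step, here is a minimal sketch that works on an in-memory buffer instead of <code>/dev/dsp</code>, so it can be tried without a microphone. It targets Python 3, and the names <code>pack_lsbs</code> and <code>next_digest</code> are made up for this example. It also fixes the byte order to little-endian, which is an assumption (the code above uses the native order).</p>

```python
import hashlib
import struct


def pack_lsbs(samples, word_bits=32):
    """Pack the least significant bit of each sample byte into 32-bit words.

    `samples` is a bytes object of 8-bit samples; leftover samples that do
    not fill a whole word are ignored.
    """
    words = []
    for start in range(0, len(samples) - word_bits + 1, word_bits):
        word = 0
        for sample in samples[start:start + word_bits]:
            # Shift previous bits left and append this sample's lowest bit.
            word = (word << 1) | (sample & 1)
        words.append(word)
    # "<" = little-endian (an assumption), "I" = unsigned 32-bit integer.
    return struct.pack("<%dI" % len(words), *words)


def next_digest(state, samples):
    """Feed one packed block into a running hash and return its digest."""
    state.update(pack_lsbs(samples))
    return state.digest()


if __name__ == "__main__":
    state = hashlib.md5()
    # 512 one-byte samples yield 512 bits, exactly one 64-byte MD5 block.
    fake_samples = bytes(range(256)) * 2
    print(next_digest(state, fake_samples).hex())
```

<p>The hash object is kept between calls, so every digest depends on all the samples seen so far, not only on the latest block.</p>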

<p>The number of bits generated per second is given by (sample rate)*(digest size)/(block size), since each sample contributes a single bit. So for an 8KHz (default) sampling rate and md5 (128-bit digest, 512-bit block) we&#8217;ll get a theoretical speed of about 2,000 bits (250 bytes) per second. SHA type hashes have a higher digest to block size ratio and thus may result in higher speeds. Another source of speed-up may be to raise the sample rate of the microphone, but setting it too high may have negative effects on the entropy. The code may get a considerable performance gain by porting it to C/C++, as it uses both bit manipulations and hash calculations. Anyway, even the Python implementation&#8217;s speed allows it to be used in many cases where true randomness is required, such as generating passwords.</p>
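<p>To make the formula concrete, here is a small sketch that plugs in the digest and block sizes reported by <code>hashlib</code>. The function name <code>output_bits_per_second</code> is made up for this example, and it assumes one usable bit per sample, as in the design above.</p>

```python
import hashlib


def output_bits_per_second(sample_rate_hz, hash_name):
    """Theoretical output rate: (sample rate) * (digest size) / (block size).

    Assumes every sample contributes exactly one bit to the hash block.
    """
    h = hashlib.new(hash_name)
    # digest_size and block_size are both in bytes, so the units cancel.
    return sample_rate_hz * h.digest_size // h.block_size


if __name__ == "__main__":
    for name in ("md5", "sha1", "sha256", "sha512"):
        print(name, output_bits_per_second(8000, name))
```

<p>At 8KHz this gives 2,000 bits per second for md5, 2,500 for SHA-1 and 4,000 for SHA-256 and SHA-512.</p>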
]]></content:encoded>
			<wfw:commentRss>http://www.guyrutenberg.com/2010/05/14/audio-based-true-random-number-generator-poc/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Python&#8217;s base64 Module Fails to Decode Unicode Strings</title>
		<link>http://www.guyrutenberg.com/2010/05/03/pythons-base64-module-fails-to-decode-unicode-strings/</link>
		<comments>http://www.guyrutenberg.com/2010/05/03/pythons-base64-module-fails-to-decode-unicode-strings/#comments</comments>
		<pubDate>Mon, 03 May 2010 18:18:24 +0000</pubDate>
		<dc:creator>Guy</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Errors]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.guyrutenberg.com/?p=672</guid>
		<description><![CDATA[If you&#8217;ve got a base64 string as a unicode object and you try to use Python&#8217;s base64 module with altchars set, it fails with the following error:

TypeError: character mapping must return integer, None or unicode

This pretty unhelpful error message also occurs if you try any method that indirectly uses altchars. For example:

base64.urlsafe_b64decode&#40;unicode&#40;'aass'&#41;&#41;
base64.b64decode&#40;unicode&#40;'aass'&#41;,'-_'&#41;

both fail while [...]]]></description>
			<content:encoded><![CDATA[<p>If you&#8217;ve got a <code>base64</code> string as a <code>unicode</code> object and you try to use Python&#8217;s <a href="http://docs.python.org/library/base64.html"><code>base64</code></a> module with <code>altchars</code> set, it fails with the following error:</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">TypeError: character mapping must return integer, None or unicode</pre></div></div>

<p>This pretty unhelpful error message also occurs if you try any method that indirectly uses <code>altchars</code>. For example:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #dc143c;">base64</span>.<span style="color: black;">urlsafe_b64decode</span><span style="color: black;">&#40;</span><span style="color: #008000;">unicode</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'aass'</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
<span style="color: #dc143c;">base64</span>.<span style="color: black;">b64decode</span><span style="color: black;">&#40;</span><span style="color: #008000;">unicode</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'aass'</span><span style="color: black;">&#41;</span>,<span style="color: #483d8b;">'-_'</span><span style="color: black;">&#41;</span></pre></div></div>

<p>Both fail, while the following work:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #dc143c;">base64</span>.<span style="color: black;">urlsafe_b64decode</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'aass'</span><span style="color: black;">&#41;</span>
<span style="color: #dc143c;">base64</span>.<span style="color: black;">b64decode</span><span style="color: black;">&#40;</span><span style="color: #008000;">unicode</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'aass'</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span></pre></div></div>

<p>While it&#8217;s not complicated to fix (just encode any <code>unicode</code> string to an <code>ascii</code> byte string before decoding), it&#8217;s still annoying.</p>
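<p>A minimal sketch of that workaround is a small wrapper that encodes text to ASCII bytes before decoding. The name <code>urlsafe_b64decode_text</code> is made up for this example. The sketch is written for Python 3, where all text strings are Unicode; on Python 2 the same <code>.encode('ascii')</code> call applies to <code>unicode</code> objects.</p>

```python
import base64


def urlsafe_b64decode_text(data):
    """Decode URL-safe Base64 given either text or bytes.

    Text is first encoded to ASCII bytes, so the decoder never sees a
    Unicode string; non-ASCII input raises UnicodeEncodeError.
    """
    if not isinstance(data, bytes):
        data = data.encode("ascii")
    return base64.urlsafe_b64decode(data)


if __name__ == "__main__":
    print(urlsafe_b64decode_text(u"aass"))
```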
]]></content:encoded>
			<wfw:commentRss>http://www.guyrutenberg.com/2010/05/03/pythons-base64-module-fails-to-decode-unicode-strings/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>URL-Safe Timestamps using Base64</title>
		<link>http://www.guyrutenberg.com/2010/04/30/url-safe-timestamps-using-base64/</link>
		<comments>http://www.guyrutenberg.com/2010/04/30/url-safe-timestamps-using-base64/#comments</comments>
		<pubDate>Fri, 30 Apr 2010 17:08:56 +0000</pubDate>
		<dc:creator>Guy</dc:creator>
				<category><![CDATA[Tips]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Snippets]]></category>
		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://www.guyrutenberg.com/?p=667</guid>
		<description><![CDATA[Passing around timestamps in URLs is a common task. We usually want our URLs to be as short as possible. I&#8217;ve found using Base64 to result in the shortest URL-safe representation, just 6 chars. This compares with the 12 chars of the naive way, and 8 chars when using hex representation.
The following Python functions allow [...]]]></description>
			<content:encoded><![CDATA[<p>Passing around timestamps in URLs is a common task. We usually want our URLs to be as short as possible. I&#8217;ve found using Base64 to result in the shortest URL-safe representation, just 6 chars. This compares with the 12 chars of the naive way, and 8 chars when using hex representation.</p>
<p>The following Python functions allow you to build and read these 6-char URL-safe timestamps:<br />
<span id="more-667"></span></p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">base64</span>
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">struct</span>
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">time</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> build_timestamp<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>:
    <span style="color: #483d8b;">&quot;&quot;&quot;
    Return a 6 chars url-safe timestamp
    &quot;&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #dc143c;">base64</span>.<span style="color: black;">urlsafe_b64encode</span><span style="color: black;">&#40;</span><span style="color: #dc143c;">struct</span>.<span style="color: black;">pack</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;!L&quot;</span>,<span style="color: #008000;">int</span><span style="color: black;">&#40;</span><span style="color: #dc143c;">time</span>.<span style="color: #dc143c;">time</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span>:-<span style="color: #ff4500;">2</span><span style="color: black;">&#93;</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> read_timestamp<span style="color: black;">&#40;</span>t<span style="color: black;">&#41;</span>:
    <span style="color: #483d8b;">&quot;&quot;&quot;
    Convert a 6 chars url-safe timestamp back to time
    &quot;&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #dc143c;">struct</span>.<span style="color: black;">unpack</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;!L&quot;</span>,<span style="color: #dc143c;">base64</span>.<span style="color: black;">urlsafe_b64decode</span><span style="color: black;">&#40;</span>t+<span style="color: #483d8b;">&quot;==&quot;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span></pre></div></div>

<p>These functions work by translating the timestamp into a 4-byte binary form and then encoding it using a URL-safe version of Base64. Finally, we strip the two padding characters, which are neither URL-safe nor necessary (as we know the size of the encoded data).</p>
<p>The result looks something like this:</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">In [72]: build_timestamp()
Out[72]: 'S9sNOQ'</pre></div></div>

<p>We got a timestamp using only 6 URL-safe chars.</p>
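<p>The functions above target Python 2, where Base64 output is a plain string. Below is a sketch of the same idea for Python 3, where the encoder returns bytes. The optional <code>timestamp</code> parameter is an addition for this example, so that a round trip can be checked against a fixed value.</p>

```python
import base64
import struct
import time


def build_timestamp(timestamp=None):
    """Return a 6-char URL-safe timestamp (defaults to the current time)."""
    if timestamp is None:
        timestamp = int(time.time())
    # 4 bytes encode to 8 Base64 chars, the last two being "=" padding.
    encoded = base64.urlsafe_b64encode(struct.pack("!L", timestamp))
    return encoded[:-2].decode("ascii")


def read_timestamp(t):
    """Convert a 6-char URL-safe timestamp back to seconds since the epoch."""
    # Restore the stripped padding before decoding.
    return struct.unpack("!L", base64.urlsafe_b64decode(t + "=="))[0]


if __name__ == "__main__":
    stamp = build_timestamp()
    print(stamp, read_timestamp(stamp))
```

<p>Decoding the example value <code>'S9sNOQ'</code> this way gives 1272646969.</p>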
]]></content:encoded>
			<wfw:commentRss>http://www.guyrutenberg.com/2010/04/30/url-safe-timestamps-using-base64/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>An Early Release of the New cssrtl.py-2.0</title>
		<link>http://www.guyrutenberg.com/2009/09/20/an-early-release-of-the-new-cssrtl-py-2-0/</link>
		<comments>http://www.guyrutenberg.com/2009/09/20/an-early-release-of-the-new-cssrtl-py-2-0/#comments</comments>
		<pubDate>Sun, 20 Sep 2009 10:00:50 +0000</pubDate>
		<dc:creator>Guy</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[cssrtl.py]]></category>
		<category><![CDATA[RTL]]></category>

		<guid isPermaLink="false">http://www.guyrutenberg.com/?p=578</guid>
		<description><![CDATA[It has been three years since I&#8217;ve released the ]]></description>
			<content:encoded><![CDATA[<p>It has been three years since I released the <a href="/2007/12/28/convert-css-layout-to-rtl-cssrtlpy/">original version</a> of <code>cssrtl.py</code> (and two since its re-release). The old version did a nice job, but experience gained during that time led me to write a new version from scratch. More than a month ago, I detailed the <a href="/2009/08/05/designing-a-better-a-css-rtl-convertor/">basic principles and ideas</a> that guided me in designing a better tool to help adapt CSS files from left-to-right to right-to-left.</p>
<p>The guidelines weren&#8217;t just empty words; they were written while working on the <a href="/2009/08/15/rtl-and-hebrew-adaptation-of-the-fusion-wordpress-theme/">Hebrew adaptation to the Fusion theme</a> and at the same time writing a new proof-of-concept version of <code>cssrtl.py</code>. The original intent was to release a more mature version of that code once it was complete. However, due to the apparent shortage of time in the present and foreseeable future, I can&#8217;t see myself completing the project any time soon. So following the &#8220;release early&#8221; mantra, I&#8217;ve decided to release the code as-is. As I said, the code is in working state, but not polished, so it may be of benefit but may contain bugs. If you find any bugs or have any suggestions, I would be glad to hear about them.<br />
<span id="more-578"></span></p>
<h3>Download</h3>
<p>You can download the new version from here: <a href="/wp-content/uploads/2009/09/cssrtl.py-2.0.tar.bz2">cssrtl.py-2.0.tar.bz2</a>. The code is available under the GPLv2 or any later version.</p>
<h3>Usage</h3>
<p>The new version works by creating a separate CSS &#8220;fix&#8221; file that fixes the directionality of the CSS styles.</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">./cssrtl.py &lt; main.css &gt; main-rtl.css</pre></div></div>

<p>Afterward, just link the <code>main-rtl.css</code> after <code>main.css</code> in your HTML files.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.guyrutenberg.com/2009/09/20/an-early-release-of-the-new-cssrtl-py-2-0/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scanning Documents Written in Blue Ink &#8211; biscan</title>
		<link>http://www.guyrutenberg.com/2008/03/19/scanning-documents-written-in-blue-ink-biscan/</link>
		<comments>http://www.guyrutenberg.com/2008/03/19/scanning-documents-written-in-blue-ink-biscan/#comments</comments>
		<pubDate>Tue, 18 Mar 2008 22:08:28 +0000</pubDate>
		<dc:creator>Guy</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[biscan]]></category>
		<category><![CDATA[PIL]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.guyrutenberg.com/2008/03/19/scanning-documents-written-in-blue-ink-biscan/</guid>
		<description><![CDATA[After writing the post on converting PNMs to DjVu I ran into some trouble scanning documents written in blue ink. The problem: XSane didn&#8217;t allow me to set the threshold for converting the scanned image to line-art (B&#038;W). So, I tried scanning the document in grayscale and in color and converting it afterwards to bitonal [...]]]></description>
			<content:encoded><![CDATA[<p>After writing the post on converting <a href="/2008/03/11/convert-pnms-to-djvu/">PNMs to DjVu</a> I ran into some trouble scanning documents written in blue ink. The problem: XSane didn&#8217;t allow me to set the threshold for converting the scanned image to line-art (B&#038;W). So I tried scanning the document in grayscale and in color and converting it afterwards to bitonal using imagemagick. I tried two approaches. When I used the <code>-monochrome</code> command line switch, the conversion looked good, but it used halftones (dithering), and converting the result to DjVu produced a document twice as large as a normal B&#038;W one would be. The other thing I tried was the <code>-threshold</code> switch. The DjVu compressed document size was much better now, but the document looked awful: either it was too dark, or some of the text disappeared. After giving it some thought I knew I could find a better solution.<br />
<span id="more-46"></span><br />
I came up with the idea that I needed a way to identify the text written in blue ink and make sure it turns out black. The model I used was to identify it by the blue channel, so I tried a dynamic threshold that gives more weight to the blue channel. I used the gamma option in imagemagick, but it didn&#8217;t turn out good enough. I had to take different actions depending on the blue level.</p>
<p>My next shot at this was to use imagemagick&#8217;s <code>fx</code> operator. It allows you to iterate over the pixels in the image and apply some custom expression. While it sounded very good, it turned out to be a disaster, as it was very slow. It is so slow that it becomes useless on high-resolution pictures: it didn&#8217;t finish operating on the 300dpi scanned document even an hour after it started. In my opinion, it is below par even when operating on relatively small images. This feature would be great if only it were faster.</p>
<p>At this point I realized I probably couldn&#8217;t do it directly from the command line, so I decided to implement a solution using Python and the Python Imaging Library (PIL). This was the start of my new project &#8211; <code>biscan</code> (blue ink scan). The program iterates over the pixels in the image and checks the blue level; if the blue level is high (but not too high), it forces the pixel to become black. The following code is still pretty experimental, and is the first prototype.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #808080; font-style: italic;">#!/usr/bin/python</span>
<span style="color: #483d8b;">&quot;&quot;&quot;
biscan - Blue Ink Scan 0.1. Takes a color image of a scanned document written using
blue ink, and turns it into a black-and-white (lineart) image.
(C) 2008 Guy Rutenberg. Released under the terms of the GPLv2.
&quot;&quot;&quot;</span>
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">sys</span>
<span style="color: #ff7700;font-weight:bold;">import</span> Image
<span style="color: #ff7700;font-weight:bold;">from</span> <span style="color: #dc143c;">optparse</span> <span style="color: #ff7700;font-weight:bold;">import</span> OptionParser
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> parseArguments<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>:
	<span style="color: #dc143c;">parser</span> = OptionParser<span style="color: black;">&#40;</span>usage=<span style="color: #483d8b;">&quot;%prog [options] FILEIN FILEOUT&quot;</span>, 
			version=<span style="color: #483d8b;">&quot;%prog 0.1 &quot;</span><span style="color: black;">&#41;</span>
	<span style="color: #dc143c;">parser</span>.<span style="color: black;">add_option</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;-v&quot;</span>, <span style="color: #483d8b;">&quot;--verbose&quot;</span>, dest=<span style="color: #483d8b;">&quot;verbose&quot;</span>, action=<span style="color: #483d8b;">&quot;store_true&quot;</span>, default=<span style="color: #008000;">False</span>,
			<span style="color: #008000;">help</span>=<span style="color: #483d8b;">&quot;be verbose&quot;</span><span style="color: black;">&#41;</span>
	<span style="color: #dc143c;">parser</span>.<span style="color: black;">add_option</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;-b&quot;</span>, <span style="color: #483d8b;">&quot;--blue-threshold&quot;</span>, dest=<span style="color: #483d8b;">&quot;blue_threshold&quot;</span>, <span style="color: #008000;">type</span>=<span style="color: #483d8b;">&quot;int&quot;</span>, default=<span style="color: #ff4500;">220</span>,
			<span style="color: #008000;">help</span>=<span style="color: #483d8b;">&quot;Set the blue threshold (0..255) to NUM&quot;</span>, metavar=<span style="color: #483d8b;">&quot;NUM&quot;</span><span style="color: black;">&#41;</span>
&nbsp;
	<span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #dc143c;">parser</span>.<span style="color: black;">parse_args</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: #ff7700;font-weight:bold;">def</span> main<span style="color: black;">&#40;</span>options, args<span style="color: black;">&#41;</span>:
	<span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>args<span style="color: black;">&#41;</span><span style="color: #66cc66;">&lt;</span><span style="color: #ff4500;">2</span>:
		<span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">&quot;See --help for usage instructions&quot;</span>
		<span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #ff4500;">1</span>
&nbsp;
	im = Image.<span style="color: #008000;">open</span><span style="color: black;">&#40;</span>args<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>
	pix = im.<span style="color: black;">load</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
&nbsp;
	<span style="color: #ff7700;font-weight:bold;">for</span> x <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">xrange</span><span style="color: black;">&#40;</span>im.<span style="color: black;">size</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>:
		<span style="color: #ff7700;font-weight:bold;">for</span> y <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">xrange</span><span style="color: black;">&#40;</span>im.<span style="color: black;">size</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>:
			<span style="color: #ff7700;font-weight:bold;">if</span> pix<span style="color: black;">&#91;</span>x,y<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">2</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">&lt;</span>options.<span style="color: black;">blue_threshold</span>:
				pix<span style="color: black;">&#91;</span>x,y<span style="color: black;">&#93;</span> = <span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span>,<span style="color: #ff4500;">0</span>,<span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span>
		<span style="color: #ff7700;font-weight:bold;">if</span> options.<span style="color: black;">verbose</span>: <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">&quot;processed line&quot;</span>,x
&nbsp;
	<span style="color: #ff7700;font-weight:bold;">def</span> threshold<span style="color: black;">&#40;</span>i<span style="color: black;">&#41;</span>:
		<span style="color: #ff7700;font-weight:bold;">if</span> i<span style="color: #66cc66;">&lt;</span><span style="color: #ff4500;">127</span>: <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #ff4500;">0</span>
		<span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #ff4500;">255</span>
	im.<span style="color: black;">convert</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;L&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">point</span><span style="color: black;">&#40;</span>threshold<span style="color: black;">&#41;</span>.<span style="color: black;">save</span><span style="color: black;">&#40;</span>args<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">if</span> __name__== <span style="color: #483d8b;">'__main__'</span>:
	<span style="color: black;">&#40;</span>options, args<span style="color: black;">&#41;</span> = parseArguments<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
	<span style="color: #dc143c;">sys</span>.<span style="color: black;">exit</span><span style="color: black;">&#40;</span>main<span style="color: black;">&#40;</span>options, args<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span></pre></div></div>

<p>See <code>python biscan --help</code> for options. This version outperforms all the above-mentioned methods in both quality and the compression ratio achieved on the output. Its performance is reasonable, around 10 sec for an 8-megapixel image. I&#8217;ve tested it with several different blue inks (Uniball, Pilot, Ballograf and some generic ones) and it operated pretty well on all of them.</p>
<p>So what next? I plan to improve the script and allow it to almost perfectly convert documents written in blue ink to B&#038;W. I&#8217;m going to experiment with a new model for identifying the blue ink using HSL values (instead of RGB). The next version will also be more polished, as I&#8217;ve released this one under the motto of &#8220;release early&#8221; (and I hope I will also release often). So stay tuned for updates.</p>
<p>If you give this script a try, I will be glad to hear about it. Of course, if you have any questions, you&#8217;re welcome to contact me.</p>
<p>N.B. <code>biscan</code> requires PIL 1.1.6 (the latest one as of this time). PIL can be found in Gentoo under <code>dev-python/imaging</code>.</p>
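<p>For readers who want to experiment with the thresholding rule without installing PIL, here is a minimal pure-Python sketch of the same two steps on a list of RGB tuples. The names <code>luma</code> and <code>to_bitonal</code> are made up for this example. The grayscale weights are the usual ITU-R 601 luma weights and are only assumed to approximate what PIL&#8217;s <code>convert("L")</code> does.</p>

```python
def luma(rgb):
    """Approximate grayscale value of an (R, G, B) pixel (ITU-R 601 weights)."""
    r, g, b = rgb
    return (r * 299 + g * 587 + b * 114) // 1000


def to_bitonal(rows, blue_threshold=220, gray_threshold=127):
    """Turn rows of (R, G, B) pixels into rows of 0 (black) / 255 (white).

    Step 1: any pixel whose blue channel is below blue_threshold is forced
            to black, as in the script above.
    Step 2: the remaining pixels are converted to gray and thresholded.
    """
    out = []
    for row in rows:
        out_row = []
        for pixel in row:
            if pixel[2] < blue_threshold:
                pixel = (0, 0, 0)
            out_row.append(0 if luma(pixel) < gray_threshold else 255)
        out.append(out_row)
    return out


if __name__ == "__main__":
    ink = (40, 60, 200)       # bluish ink stroke
    paper = (250, 250, 230)   # slightly yellow paper
    print(to_bitonal([[ink, paper]]))
```

<p>With the default thresholds, the ink pixel comes out black and the paper pixel white.</p>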
]]></content:encoded>
			<wfw:commentRss>http://www.guyrutenberg.com/2008/03/19/scanning-documents-written-in-blue-ink-biscan/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

