Tarsum in Rust

Almost 14 years ago, I wrote a small utility, named tarsum, to calculate checksums on files inside a tar archive. It was useful for verifying data inside backups. Recently, I decided to rewrite it in Rust. It’s available from https://github.com/guyru/tarsum.

Installation using cargo is straightforward:

$ cargo install --git https://github.com/guyru/tarsum

Surprisingly, when tested on a large tar archive (a recent Linux tarball, 1.3 GB), the performance of the Python and Rust implementations is very similar.
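For readers unfamiliar with the idea, here is a minimal Python sketch of what such a tool does: walk the archive and hash each regular file without extracting it to disk. This is not the code of either implementation; the function name tarsum() and the default of SHA-256 are assumptions made for illustration.

```python
import hashlib
import io
import tarfile


def tarsum(fileobj, algorithm="sha256"):
    """Yield (hex digest, member name) for every regular file in a tar stream."""
    # "r|*" reads the archive as a stream with transparent decompression,
    # so members must be consumed in order, which is what this loop does.
    with tarfile.open(fileobj=fileobj, mode="r|*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, symlinks, devices, ...
            digest = hashlib.new(algorithm)
            extracted = archive.extractfile(member)
            # Hash in chunks so large files never sit in memory as a whole.
            for chunk in iter(lambda: extracted.read(64 * 1024), b""):
                digest.update(chunk)
            yield digest.hexdigest(), member.name


# Demo on a tiny in-memory archive holding a single file.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as archive:
    data = b"hello\n"
    info = tarfile.TarInfo("hello.txt")
    info.size = len(data)
    archive.addfile(info, io.BytesIO(data))
buffer.seek(0)

for digest, name in tarsum(buffer):
    print(digest, name)
```

To checksum a real archive, pass an open binary file (or sys.stdin.buffer) instead of the in-memory buffer.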

Introducing mdview – a lightweight Markdown viewer

My favorite editor is vim, but it has downsides as well. Vim doesn’t have the GUI needed to extend it to preview things like Markdown properly. Sure, vim can highlight Markdown syntax, but that is not a replacement for real previewing. With that itch in mind, I searched for a solution but found none that satisfied me. For reStructuredText I had found a solution that worked well: it started a local web server and did the previewing in the browser. Inspired by it, I started writing mdview.

mdview allows you to instantly preview any Markdown file you’re editing in your favorite browser. It automatically refreshes when the file changes, so it is great for working with the editor and browser side by side for a live preview.

name-taken – Check if your project name is taken

Every time I want to start a new open-source project I come across this small “problem”: making sure that the name for the project isn’t already taken. Today I decided to solve it by creating a simple script that queries different open-source repositories to check whether a project with the desired name already exists.

Usage is quite simple:

$ name_taken.py enlightenment
Debian: Name not taken :-)
SourceForge: Name taken :-(

The script is currently at an early stage and can search for projects in Debian’s list of packages and on SourceForge. The code is hosted on GitHub at https://github.com/guyru/name_taken and is licensed under GPLv2 or later. Suggestions on how to make this tool more useful (and, of course, patches) are very welcome.

spass-3.1 Secure Password Generator Released

Usually release announcements go together with the actual release. Somehow, I’ve postponed writing about the new release for quite some time, but better late than never.

spass is a tool that creates cryptographically strong passwords and passphrases by generating random bits from your sound card. It works by passing noise from the sound card through a Von Neumann process to remove bias and then uses MD5 to “distill” a truly random bit from every 4 bits of input.
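The unbiasing step mentioned above is easy to show in isolation. The sketch below is a generic von Neumann extractor in Python, not code taken from spass: it reads the input bits in pairs, emits the first bit of every pair whose two bits differ, and discards pairs of equal bits. The function name is made up for illustration, and the MD5 step is left out because its details are not given here.

```python
def von_neumann(bits):
    """Yield unbiased bits from a possibly biased, independent bit stream."""
    stream = iter(bits)
    # zip(stream, stream) consumes the input two bits at a time.
    for first, second in zip(stream, stream):
        if first != second:
            # (0, 1) and (1, 0) are equally likely for independent bits,
            # so emitting the first bit of such a pair gives a fair bit.
            yield first
        # (0, 0) and (1, 1) carry the bias and are thrown away.


sample = [0, 1, 1, 0, 1, 1, 0, 0, 1, 0]
print(list(von_neumann(sample)))  # pairs 01, 10, 11, 00, 10 -> [0, 1, 1]
```

The price of removing the bias is throughput: at best one output bit per four input bits on average, and fewer as the source gets more biased.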

The new version of spass, version 3.1, was released two months ago. The code should now compile easily on both Linux (ALSA, OSS and PortAudio backends) and Windows (only PortAudio is supported). There are some minor tweaks to the CLI, but the main addition is a new Qt interface; screenshots of it are available on the project’s SourceForge page. I’ve also migrated the build system to CMake (from automake), which should make it easier to build.

You can download the sources, a 64-bit Debian package and binaries for Windows from here. If you use spass and can create binary packages for more platforms, that would be great.

BTW, as you can see, I’ve migrated the code from GitHub to SourceForge. I know it’s not a popular move, but GitHub’s lack of binary downloads is really frustrating.

spass-3.0 Released

Today I released the new version of spass, a tool that creates cryptographically strong passwords and passphrases by generating random bits from your sound card.

On the user-facing side, spass can now create passphrases as well as passwords. The words for the passphrases are chosen out of a list of 8192 words, which means each word adds 13 bits of entropy to the passphrase (8192 = 2^13).
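The entropy arithmetic can be checked with a few lines of Python. This is only a sketch: it draws words with Python's secrets module instead of sound-card noise, and the word list is a placeholder (word0000 to word8191), not the list shipped with spass.

```python
import math
import secrets


def entropy_bits(wordlist_size, word_count):
    """Entropy of a passphrase whose words are picked uniformly at random."""
    return word_count * math.log2(wordlist_size)


def make_passphrase(wordlist, word_count):
    """Pick word_count words independently and uniformly from wordlist."""
    return " ".join(secrets.choice(wordlist) for _ in range(word_count))


# Placeholder list of 8192 entries; a real list would hold actual words.
wordlist = [f"word{i:04d}" for i in range(8192)]

print(entropy_bits(len(wordlist), 1))  # 13.0 bits per word
print(entropy_bits(len(wordlist), 5))  # 65.0 bits for a five-word passphrase
print(make_passphrase(wordlist, 5))
```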

spass can now use one of three audio backends (the old version could only use OSS):

  • Advanced Linux Sound Architecture (ALSA)
  • Open Sound System (OSS)
  • PortAudio

The PortAudio support will hopefully make it easy to port spass to other platforms as well (such as Windows). The random number generator was overhauled, and there is now an unbiasing step before the hash function is applied. This should help get consistent results in terms of entropy. Behind the scenes, I’ve migrated the project from autotools to CMake.

You can find more information, as well as both source and binary packages, at https://github.com/guyru/spass.

SQL Dump for MS Access databases (.mdb files) on Linux

I recently had to work with some data that came in a huge Microsoft Access database. Because I like SQLite (and despise Access), I decided to export the data to an SQLite file. The first thing I needed to do was to somehow get all the data out of the db. Being a Linux user complicates things a bit, but thanks to mdb-tools it’s possible to process .mdb files without resorting to Windows and buying Access. Using mdb-tools directly can be tedious if you want to export a large db with multiple tables, so when I looked for a way to automate it, I came across Liberating data from Microsoft Access “.mdb” files. This post shows a nice script that dumps every table in a .mdb file to a separate CSV file.

While useful, I wanted something that I could easily import into SQLite, so I modified their script to generate an SQL dump of the db. Given a db file, it writes to stdout SQL statements describing the schema of the db, followed by INSERTs for each table. Actually, because mdb-tools doesn’t support SQLite as a backend, the dump uses a MySQL dialect, but it should be fine with SQLite as well (SQLite will mostly ignore the parts it can’t process, such as COMMENTs). The easiest way to use the script is:

$ python AccessDump.py access.mdb | sqlite3 new.db
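To make the shape of such a dump concrete, here is a small self-contained Python sketch. It is not AccessDump.py and does not call mdb-tools; it just emits a CREATE TABLE statement followed by INSERTs for an in-memory table and loads the result into SQLite. The table, column names and helper functions are made up for illustration.

```python
import sqlite3


def quote_literal(value):
    """Render a Python value as an SQL literal (NULL, number or quoted string)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # Single quotes inside strings are escaped by doubling them.
    return "'" + str(value).replace("'", "''") + "'"


def sql_dump(table, columns, rows):
    """Yield the schema statement for a table, then one INSERT per row."""
    column_defs = ", ".join(f'"{name}" {sql_type}' for name, sql_type in columns)
    yield f'CREATE TABLE "{table}" ({column_defs});'
    names = ", ".join(f'"{name}"' for name, _ in columns)
    for row in rows:
        values = ", ".join(quote_literal(value) for value in row)
        yield f'INSERT INTO "{table}" ({names}) VALUES ({values});'


# Hypothetical table standing in for one exported from the .mdb file.
dump = "\n".join(sql_dump(
    "people",
    [("id", "INTEGER"), ("name", "TEXT")],
    [(1, "O'Brien"), (2, None)],
))
print(dump)

# Loading the dump is the in-process equivalent of piping it into sqlite3.
connection = sqlite3.connect(":memory:")
connection.executescript(dump)
loaded = connection.execute('SELECT id, name FROM "people" ORDER BY id').fetchall()
print(loaded)
```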

If the original db contains non-ASCII characters and isn’t encoded in UTF-8, you should set the MDB_JET3_CHARSET environment variable to the correct charset. The dump itself will be UTF-8 encoded.

$ MDB_JET3_CHARSET="cp1255" python AccessDump.py access.mdb | sqlite3 new.db
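Why the charset matters can be seen with a few bytes of Hebrew text. The following Python sketch is independent of mdb-tools; the helper name to_utf8() is hypothetical. It shows that the same bytes decode correctly only with the right source charset, and that the result can then be re-encoded as UTF-8.

```python
def to_utf8(raw, source_charset):
    """Decode bytes from source_charset and re-encode them as UTF-8."""
    return raw.decode(source_charset).encode("utf-8")


text = "שלום"
raw = text.encode("cp1255")      # how a cp1255-encoded db would store it

print(raw.decode("latin-1"))     # wrong charset: unrelated Latin characters
print(raw.decode("cp1255"))      # right charset: the original text
print(to_utf8(raw, "cp1255"))    # the same text as UTF-8 bytes
```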


A Note About Open Sound System (OSS)

A while ago I wrote about creating random numbers out of noise gathered from an audio device, and also created a password generator based on the idea. The implementation was based on Open Sound System (commonly known as OSS). OSS was the de facto way to access audio devices until a couple of years ago, when it hit licensing issues and was subsequently replaced by ALSA. As Ubuntu no longer supports OSS (and even the ALSA wrapper for it is in Universe), I’ve decided to rewrite the code using a modern alternative.

Debugging File Type (MIME) Associations

I’m having less and less time to blog and write stuff lately, so it’s a good opportunity to catch up on old things I did. Back in the happy days when I used Gentoo, one of the irritating issues I faced was messed-up file type associations. The MIME type of some files was recognized incorrectly, and as a result, KDE offered to open files with unsuitable applications. To debug it, I wrote a small Python script that shows how KDE applications are associated with MIME types and what MIME type is inferred from each file.

It does so by querying KMimeType and KMimeTypeTrader. The script does three things:

  • Given a MIME type, show its hierarchy and a list of applications associated with it.
  • Given an application, list all MIME types it’s associated with.
  • Given a file, show its MIME type (and also the accuracy, which allows one to know why that MIME type was selected, although I admit that in the two years since I wrote it, I forgot how it works :))

The script is pasted below. I hope someone who still fiddles with less-than-standard installations will find it helpful.

Installing culmus-latex on Ubuntu 11.10

After someone complained to me that he couldn’t install culmus-latex on Ubuntu 11.10, I decided to look into the issue. Apparently culmus-latex can’t be installed as-is on Ubuntu 11.10 (and probably other new versions of Debian and Ubuntu). The problem has been reported in a few places such as Whatsup, but as I don’t frequent the forum lately, I wasn’t aware of it. Skip below if you’re just interested in the workaround.

Technical Details

The problem manifests itself as:

sudo make install
... snipped for brevity ...
mktexlsr: Done.
updmap-sys --enable Map=culmus.map
updmap: This is updmap, version $Id: updmap 14402 2009-07-23 17:09:15Z karl $
updmap: using transcript file `/var/lib/texmf/web2c/updmap.log'
updmap: initial config file is `/var/lib/texmf/web2c/updmap.cfg'
make: *** [install] Error 2

The manpage for updmap does not document its return codes, and nowhere in its code does it explicitly exit with return code 2. After some strace-ing, I found the culprit: the combination of the set -e at the top of /usr/bin/updmap and the function pickLocalFile in /usr/share/tex-common/debianize-updmap, which overrides certain behaviors in updmap. pickLocalFile uses the following lines:

localfile="`ls $debDirname/*local*cfg 2>/dev/null`"
if [ -n "$localfile" ]; then

These lines check whether there is a local configuration file under /etc/texmf/updmap.d. If no such file exists, then instead of creating one (as the maintainers of debianize-updmap intended), the script fails: ls returns a non-zero exit status when the glob matches nothing, and the set -e in /usr/bin/updmap aborts the script on it. Thus updmap exits with error code 2 instead of completing the installation.

Until the bug is fixed, there is a simple workaround.

Before installing, execute:

sudo touch /etc/texmf/updmap.d/10local.cfg

Now the regular sudo make install should finish successfully.
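Both the failure and the workaround can be reproduced outside of updmap. The Python sketch below runs a shell fragment modelled on the two lines quoted above, with set -e in front, against an empty temporary directory, and then again after creating a 10local.cfg file in it. The fragment is a stand-in rather than the real updmap code: it uses $(...) instead of backticks, a temporary directory instead of /etc/texmf/updmap.d, and it assumes a POSIX sh is available.

```python
import pathlib
import subprocess
import tempfile

# Stand-in for the relevant part of updmap: set -e followed by an
# assignment whose command substitution runs ls on a glob.
SCRIPT = """
set -e
localfile="$(ls "$1"/*local*cfg 2>/dev/null)"
echo "reached the end: $localfile"
"""


def run_check(directory):
    """Run the shell fragment with `directory` as $1 and return the result."""
    return subprocess.run(
        ["sh", "-c", SCRIPT, "sh", str(directory)],
        capture_output=True,
        text=True,
    )


with tempfile.TemporaryDirectory() as tmp:
    # No *local*cfg file yet: ls fails and set -e aborts the script.
    before = run_check(tmp)
    # The workaround: create an empty local configuration file.
    (pathlib.Path(tmp) / "10local.cfg").touch()
    # Now ls succeeds and the script runs to the end.
    after = run_check(tmp)

print("before:", before.returncode, repr(before.stdout))
print("after: ", after.returncode, repr(after.stdout))
```

The exact non-zero code in the first run comes from ls itself, so it may differ between ls implementations.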

As the problem is the result of a Debian bug, I don’t expect to release a new version of culmus-latex; instead, I’ll report the bug to the Debian team.

Solving Sudoku using Python and Prolog

Two weeks ago, I came up with an interesting algorithm for solving Hidato, which basically involves decomposing the grid (which can be square, hexagonal or any other shape) into classes of pieces and then arranging them (maybe I’ll write a detailed post on it in the future). While pondering whether it would be interesting enough to actually implement the algorithm, given the work it would require, I started thinking about what would be the simplest way to solve such puzzles, as opposed to the most efficient.

At first I looked at general-purpose constraint solvers, and decided to tackle Sudoku instead, as it’s a bit simpler to define in terms of constraints. I considered several libraries but in the end settled on plain Prolog. I chose Prolog because, as a logic programming language, constraints are its bread and butter. I also kind of liked the idea, as I hadn’t done anything in Prolog for quite a few years.

Describing Sudoku in terms of constraints is extremely simple. You need to state that every cell is in a given range and that every row, column and sub-grid contains distinct integers. As wrangling lists in Prolog isn’t fun, I wrote a Python program that outputs all the Prolog statements with hardcoded references to the variables that make up the board. It’s ugly but dead simple. The script takes the dimensions of the sub-grid as input.
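A generator along those lines might look like the sketch below. It is an illustration, not the script from this post: it assumes SWI-Prolog's library(clpfd) (ins, all_different and label), and the predicate name sudoku/1 and the variable naming scheme are made up. A different Prolog system would need different constraint predicates.

```python
def sudoku_program(sub_rows, sub_cols):
    """Return Prolog source constraining a Sudoku with the given sub-grid size."""
    size = sub_rows * sub_cols  # the board is size x size, e.g. 9 for 3x3

    def var(row, col):
        return f"X{row}_{col}"  # Prolog variables start with an uppercase letter

    rows = [[var(r, c) for c in range(size)] for r in range(size)]
    cols = [[var(r, c) for r in range(size)] for c in range(size)]
    boxes = [
        [var(r, c)
         for r in range(top, top + sub_rows)
         for c in range(left, left + sub_cols)]
        for top in range(0, size, sub_rows)
        for left in range(0, size, sub_cols)
    ]

    def plist(items):
        return "[" + ", ".join(items) + "]"

    lines = [
        ":- use_module(library(clpfd)).",
        "",
        "sudoku(Rows) :-",
        "    Rows = " + plist(plist(row) for row in rows) + ",",
        "    Vars = " + plist(v for row in rows for v in row) + ",",
        f"    Vars ins 1..{size},",  # every cell is in the given range
    ]
    # Every row, column and sub-grid must contain distinct integers.
    for group in rows + cols + boxes:
        lines.append(f"    all_different({plist(group)}),")
    lines.append("    label(Vars).")
    return "\n".join(lines)


print(sudoku_program(2, 2))  # a small 4x4 board with 2x2 sub-grids
```

Under these assumptions, the generated file can be loaded into SWI-Prolog and queried with a partially filled list of rows, using _ for the empty cells.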