SQL Dump for MS Access databases (.mdb files) on Linux

I recently had to work with some data that came in a huge Microsoft Access database. Because I like SQLite (and despise Access), I decided to export the data to an SQLite file. The first thing I needed to do was somehow get all the data out of the db. Being a Linux user complicates things a bit, but thanks to mdb-tools, it’s possible to process .mdb files without resorting to Windows and buying Access. Using mdb-tools directly can be tedious if you want to export a large db with multiple tables, so when I looked for a way to automate it, I came across Liberating data from Microsoft Access “.mdb” files. This post shows a nice script that dumps every table in a .mdb file to a separate CSV file.

While that is useful, I wanted something that I could easily import into SQLite, so I modified their script to generate an SQL dump of the db. Given a db file, it writes SQL statements describing the schema of the db to stdout, followed by INSERTs for each table. Because mdb-tools doesn’t support SQLite as a backend, the dump actually uses the MySQL dialect, but it should work fine with SQLite as well (SQLite will mostly ignore the parts it can’t process, such as COMMENTs). The easiest way to use the script is:

$ python AccessDump.py access.mdb | sqlite3 new.db

If the original db contains non-ASCII characters and isn’t encoded in UTF-8, you should set the MDB_JET3_CHARSET environment variable to the correct charset. The dump itself will be UTF-8 encoded.

$ MDB_JET3_CHARSET="cp1255" python AccessDump.py access.mdb | sqlite3 new.db
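
For readers who want a feel for the approach, here is a minimal sketch of the schema-then-INSERTs flow described above. It is not the AccessDump.py script itself. The function names are made up for illustration, and it assumes a reasonably recent mdb-tools in which mdb-tables accepts -1, mdb-schema takes a backend name as its second argument, and mdb-export accepts -I with a backend name.

```python
import os
import subprocess
import sys


def dump_plan(mdb_path, tables, backend="mysql"):
    """Return the mdb-tools command lines for a schema-then-data dump."""
    # The schema comes first so the tables exist before any INSERT runs.
    plan = [["mdb-schema", mdb_path, backend]]
    for table in tables:
        # Assumption: -I takes a backend name (true for recent mdb-tools).
        plan.append(["mdb-export", "-I", backend, mdb_path, table])
    return plan


def list_tables(mdb_path, env=None):
    """Ask mdb-tables for one table name per line (names may contain spaces)."""
    result = subprocess.run(
        ["mdb-tables", "-1", mdb_path],
        check=True, capture_output=True, encoding="utf-8", env=env,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def dump(mdb_path, charset=None, out=sys.stdout):
    """Write the whole dump to `out`, optionally setting MDB_JET3_CHARSET."""
    env = dict(os.environ)
    if charset:
        env["MDB_JET3_CHARSET"] = charset
    for command in dump_plan(mdb_path, list_tables(mdb_path, env)):
        result = subprocess.run(
            command, check=True, capture_output=True, encoding="utf-8", env=env,
        )
        out.write(result.stdout)
```

Calling dump("access.mdb", charset="cp1255") would then produce output that can be piped into sqlite3, in the same spirit as the commands above.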


Conditional Compilation in Autoconf and Automake

While working on my audio-based random password generator (you can view the source on GitHub), I wanted to do some conditional compilation: compiling certain parts of the program only if some option is passed to the configure script. As usual with GNU’s autotools, doing so is kind of hell. The documentation is spread across dozens of sources, each covering only a specific part of what to do. I’m writing it down here on the blog in the hope that I’ll never have to search for it again.

Some Thoughts About Android’s Full-Disk Encryption

One of the new features touted by ICS is full-disk encryption (actually, it was first available in Android 3). At first look, it is promising. The Android developers went with dm-crypt as the underlying transparent disk encryption subsystem, which is the de facto way to perform full-disk encryption in Linux nowadays. This ensures both portability of the encrypted file systems and a tried-and-tested implementation. The cipher itself is 128-bit AES in CBC mode with ESSIV, and the key derived from the password using PBKDF2 doesn’t encrypt the disk directly: it encrypts the actual encryption key, which allows fast password changes. So where do I think it went wrong?
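
As an aside, the two-level key scheme mentioned above can be sketched in a few lines. This is only an illustration of the idea, not Android’s code: the iteration count, salt size and function names are arbitrary choices of mine, and a toy XOR stands in for the AES step that a real implementation would use to encrypt the master key.

```python
import hashlib
import os


def derive_kek(password, salt, iterations=2000, length=16):
    """Derive a key-encryption key (KEK) from the password with PBKDF2."""
    # Illustrative parameters only; they are not taken from Android's source.
    return hashlib.pbkdf2_hmac(
        "sha1", password.encode("utf-8"), salt, iterations, dklen=length
    )


def xor_bytes(a, b):
    """Toy stand-in for AES: NOT secure, used only to keep the sketch stdlib-only."""
    return bytes(x ^ y for x, y in zip(a, b))


def wrap_master_key(master_key, password, salt):
    """Encrypt the disk's master key under the password-derived KEK."""
    return xor_bytes(master_key, derive_kek(password, salt, length=len(master_key)))


def unwrap_master_key(wrapped, password, salt):
    """Recover the master key from its wrapped form."""
    return xor_bytes(wrapped, derive_kek(password, salt, length=len(wrapped)))


def change_password(wrapped, old_password, new_password, salt):
    """Re-wrap the same master key; the data on disk is never re-encrypted."""
    master_key = unwrap_master_key(wrapped, old_password, salt)
    new_salt = os.urandom(16)
    return wrap_master_key(master_key, new_password, new_salt), new_salt
```

Only the small wrapped key changes when the password does, which is why a password change is fast no matter how large the disk is.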

Enabling the full disk encryption.


A Note About Open Sound System (OSS)

A while ago I wrote about creating random numbers out of noise gathered from an audio device, and I also created a password generator based on the idea. The implementation was based on Open Sound System (commonly known as OSS). OSS was the de facto way to access audio devices a couple of years ago, until it hit licensing issues and was subsequently replaced by ALSA. As Ubuntu no longer supports OSS (and even the ALSA wrapper for it is in Universe), I’ve decided to rewrite the code using a more modern alternative.

Fixing virtualenv after Upgrading Your Distribution/Python

After you upgrade your Python/distribution (specifically, this happened to me after upgrading from Ubuntu 11.10 to 12.04), your existing virtualenv environments may stop working. This manifests itself as reports that some modules are missing. For example, when I tried to open a Django shell, it complained that urandom was missing from the os module. I guess almost any module can break.

Apparently, the solution is dead simple. Just re-create the virtualenv environment:

virtualenv /PATH/TO/EXISTING/ENVIRONMENT

or

virtualenv --system-site-packages /PATH/TO/EXISTING/ENVIRONMENT

(depending on how you created it) in the same place. All the modules you’ve already installed should keep working as before (at least it was that way for me).
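
To check whether an environment is affected, or whether re-creating it helped, something like the following can be run with the environment’s own interpreter. It is a rough sketch: the function name and the list of checks are my own picks, seeded with the os.urandom case mentioned above.

```python
import importlib


def missing_attributes(checks):
    """Return the modules or 'module.attr' names that cannot be found."""
    missing = []
    for module_name, attr in checks:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # The module itself failed to import in this environment.
            missing.append(module_name)
            continue
        if not hasattr(module, attr):
            missing.append(f"{module_name}.{attr}")
    return missing


# Hypothetical starting list; extend it with whatever your project imports.
CHECKS = [("os", "urandom"), ("hashlib", "sha256"), ("random", "SystemRandom")]

if missing_attributes(CHECKS):
    print("Broken environment, missing:", ", ".join(missing_attributes(CHECKS)))
```

An empty result doesn’t prove that everything works, but a non-empty one is a good hint that the environment should be re-created.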

Debugging File Type (MIME) Associations

I have less and less time to blog and write lately, so it’s a good opportunity to catch up on old things I did. Back in the happy days when I used Gentoo, one of the irritating issues I faced was messed-up file type associations. The MIME type of some files was recognized incorrectly, and as a result, KDE offered to open files with unsuitable applications. To track this down, I wrote a small Python script that shows how KDE applications are associated with MIME types and what MIME type is inferred for each file.

The script works by querying KMimeType and KMimeTypeTrader, and it does three things:

  • Given a MIME type, show its hierarchy and a list of applications associated with it.
  • Given an application, list all MIME types it’s associated with.
  • Given a file, show its MIME type (and also the accuracy, which allows one to know why that MIME type was selected, although I admit that in the two years since I wrote it, I forgot how it works :))
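
For readers without KDE at hand, the third item can be roughly approximated with Python’s standard mimetypes module. This is not the script from this post and it doesn’t touch KMimeType or KMimeTypeTrader: it only looks at the file name, never at the content, and the describe helper is a name I made up.

```python
import mimetypes


def describe(path):
    """Guess (MIME type, encoding) from the file name alone."""
    mime_type, encoding = mimetypes.guess_type(path)
    # Fall back to the generic binary type when the extension is unknown.
    return mime_type or "application/octet-stream", encoding


print(describe("report.pdf"))
print(describe("archive.tar.gz"))
```

Because the guess is purely name-based, it can disagree with what KDE reports for the same file, and that disagreement can itself be a useful debugging hint.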

The script is pasted below. I hope someone who still fiddles with less-than-standard installations will find it helpful.

Installing culmus-latex on Ubuntu 11.10

After someone complained to me that he couldn’t install culmus-latex on Ubuntu 11.10, I decided to look into the issue. Apparently, culmus-latex can’t be installed as-is on Ubuntu 11.10 (and probably on other recent versions of Debian and Ubuntu). The problem has been reported in a few places such as Whatsup, but as I haven’t frequented the forum lately, I wasn’t aware of it. Skip below if you’re just interested in the workaround.

Technical Details

The problem manifests itself as:

sudo make install
... snipped for brevity ...
mktexlsr: Done.
updmap-sys --enable Map=culmus.map
updmap: This is updmap, version $Id: updmap 14402 2009-07-23 17:09:15Z karl $
updmap: using transcript file `/var/lib/texmf/web2c/updmap.log'
updmap: initial config file is `/var/lib/texmf/web2c/updmap.cfg'
make: *** [install] Error 2

However, updmap’s man page doesn’t document its return codes, and nowhere in the code does updmap explicitly exit with return code 2. After some strace’ing, I found the culprit in the combination of the set -e at the top of /usr/bin/updmap and the function pickLocalFile in /usr/share/tex-common/debianize-updmap, which overrides certain behaviors in updmap. The pickLocalFile function uses the following lines

localfile=""
localfile="`ls $debDirname/*local*cfg 2>/dev/null`"
if [ -n "$localfile" ]; then

to check whether there is a local configuration file under /etc/texmf/updmap.d. If no such file exists, the glob matches nothing and ls exits with status 2. The assignment takes the exit status of the command substitution, so the set -e in /usr/bin/updmap aborts the script right there instead of letting it create the file (as the maintainers of debianize-updmap intended). Thus, updmap exits with error code 2 instead of completing the installation.
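
The failure is easy to reproduce outside of updmap. The sketch below runs a paraphrase of those lines under sh with set -e, first in an empty temporary directory and then again after creating a matching file. The shell snippet is my reduction of pickLocalFile rather than a verbatim copy, and the Python wrapper and its names exist only for this demonstration.

```python
import os
import subprocess
import tempfile

# Reduced version of the lookup, with set -e in effect as in /usr/bin/updmap.
SCRIPT = r'''
set -e
debDirname="$1"
localfile=""
localfile="`ls "$debDirname"/*local*cfg 2>/dev/null`"
echo "reached: $localfile"
'''


def run_lookup(directory):
    """Run the lookup against `directory`; return (exit status, stdout)."""
    result = subprocess.run(
        ["sh", "-c", SCRIPT, "sh", directory],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


with tempfile.TemporaryDirectory() as conf_dir:
    # No matching file: ls fails and set -e stops the script before the echo.
    status_without, out_without = run_lookup(conf_dir)
    # Create a matching file, then run the same lookup again.
    open(os.path.join(conf_dir, "10local.cfg"), "w").close()
    status_with, out_with = run_lookup(conf_dir)

print("without local cfg:", status_without, repr(out_without))
print("with local cfg:   ", status_with, repr(out_with))
```

The first run ends with a non-zero status and never reaches the echo, while the second one completes normally.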

Meanwhile, until the bug is fixed, there is a simple workaround.

Workaround

Before installing, execute

sudo touch /etc/texmf/updmap.d/10local.cfg

And now the regular sudo make install installation should finish successfully.

As the problem is a result of a Debian bug, I don’t expect to release a new version of culmus-latex. Instead, I’ll report the bug to the Debian team.

mechanize – Writing Bots in Python Made Simple

I’ve been using Python to write various bots and crawlers for a long time. A few days ago I needed to write a simple bot to remove some 400+ spam pages in Sikumuna, so I took an old script of mine (from 2006) and set out to modify it. The script used ClientForm, a Python module that allows you to easily parse and fill HTML forms. I quickly found that ClientForm is now deprecated in favor of mechanize. At first I was somewhat put off by the change, as ClientForm was pretty easy to use, and mechanize’s documentation could use some improvement. However, I quickly changed my mind about mechanize. The basic interface for mechanize is a simple browser object that literally allows you to browse using Python. It takes care of handling cookies and such, and it has form-filling abilities similar to ClientForm’s, but this time they are integrated into the browser object.

For my own future reference, and as another code example to supplement mechanize’s sparse documentation, I’m giving below the gist of the simple bot I wrote:


Bye Bye OmniCppComplete, Hello Clang Complete

For years OmniCppComplete has been the de facto standard for C++ completion in Vim. But as time progressed, I got more and more annoyed by its shortcomings. OmniCppComplete relies on the tokenization provided by ctags. The ctags parsing of C++ code is problematic; you can’t even run it on the libstdc++ headers (you need to download modified headers). You want to use an external library? You’ll need to run ctags separately on each library. Not to mention its inability to deduce types in anything more than trivial code. The core of the problem is that OmniCppComplete isn’t a compiler, and you can’t expect something that isn’t a compiler to fully understand code. This is what makes Visual Studio’s IntelliSense so great: it uses the Visual C++ compiler for parsing. It doesn’t make wild guesses about types or the current scope; it knows them.