Guy Rutenberg

Keeping track of what I do

SQL Dump for MS Access databases (.mdb files) on Linux

with 6 comments

I recently had to work with some data that came in a huge Microsoft Access database. Because I like SQLite (and despise Access), I’ve decided to export the data to an SQLite file. The first thing I needed to do was to somehow get all the data out of the db. Being a Linux user, complicates things a bit, but thanks to mdb-tools it’s possible to process the .mdb files without resorting to Windows and buying Access. Using mdb-tools directly can be tedious if you want to export a large db with multiple tables, so when I’ve looked for a way to automate it, I came across Liberating data from Microsoft Access “.mdb” files. This post shows a nice script that dumps every table in a .mdb file to separate CSV file.

While useful, I wanted something that I could easily import into SQLite. So I’ve modified their script to generate an SQL dump of the db. Given a db file, it writes to stdout SQL statements describing the schema of the DB followed by INSERTs for each table. Actually because mdb-tools doesn’t support SQLite as a backend, the dump uses a MySQL dialect, but it should be fine with SQLite as well (SQLite will mostly ignore the parts it can’t process such as COMMENTs). The easiest way to use the script is

$ python AccessDump.py access.mdb | sqlite3 new.db

If the original db contains non-ascii characters, and isn’t encoded in UTF-8, you should set the MDB_JET3_CHARSET environment variable to the correct charset. The dump itself will be UTF-8 encoded.

$ MDB_JET3_CHARSET="cp1255" python AccessDump.py access.mdb | sqlite3 new.db


And now the script itself:

#!/usr/bin/env python
#
# AccessDump.py
# A simple script to dump the contents of a Microsoft Access Database.
# It depends upon the mdbtools suite:
#   http://sourceforge.net/projects/mdbtools/
 
import sys, subprocess, os
 
DATABASE = sys.argv[1]
 
# Dump the schema for the DB
subprocess.call(["mdb-schema", DATABASE, "mysql"])
 
# Get the list of table names with "mdb-tables"
table_names = subprocess.Popen(["mdb-tables", "-1", DATABASE],
                               stdout=subprocess.PIPE).communicate()[0]
tables = table_names.splitlines()
 
print "BEGIN;" # start a transaction, speeds things up when importing
sys.stdout.flush()
 
# Dump each table as a CSV file using "mdb-export",
# converting " " in table names to "_" for the CSV filenames.
for table in tables:
    if table != '':
        subprocess.call(["mdb-export", "-I", "mysql", DATABASE, table])
 
print "COMMIT;" # end the transaction
sys.stdout.flush()

Written by Guy

July 16th, 2012 at 10:52 pm

Posted in Projects

Tagged with , , ,

6 Responses to 'SQL Dump for MS Access databases (.mdb files) on Linux'

Subscribe to comments with RSS or TrackBack to 'SQL Dump for MS Access databases (.mdb files) on Linux'.

  1. return with Can’t allocate filename
    altought i added “sudo” and fullPathName

    till now i uses kexi in order to export tables.

    queency

    17 Jul 12 at 11:49

  2. This seems to happen (at least for me) when the database filename is incorrect. Are you sure you’re passing the filename correctly (properly escaped etc.)?

    Guy

    17 Jul 12 at 19:56

  3. Can I run it on a Mac? Any way to install mdbtools on Mac?

    George

    10 Oct 12 at 22:04

  4. My guess is that you can, but I think you should check the mdb-tools website to make sure.

    Guy

    10 Oct 12 at 22:24

  5. I got rid of “Can’t allocate filename” error replacing
    subprocess.call(["mdb-export", "-I", "mysql", DATABASE, table])
    with
    subprocess.call(["mdb-export", "-I", DATABASE, table])

    hemam

    16 Nov 12 at 10:29

  6. That was awesome, thanks!

    jarrett

    24 Apr 13 at 19:53

Leave a Reply

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>