gettext with Autotools Tutorial

In this tutorial, we walk through the steps needed to add localizations to an existing project that uses GNU Autotools as its build system.

We start by taking a slightly modified version of the Hello World example that comes with the Automake sources. You can keep track of the changes to the source throughout this tutorial by following the commits to amhello-gettext on GitHub. We start with the following files:

$ ls -RF
.:
configure.ac  Makefile.am  README  src/

./src:
main.c  Makefile.am

Running gettextize

The first step is copying some necessary gettext infrastructure into your project. This is done by running gettextize in the root directory of your project. The command will create a bunch of new files and modify some existing files. Most of these files are auto-generated, so there is no need to add them to your version control. You should only add those files you create or modify manually.

You will need to add the following lines to your configure.ac.

AM_GNU_GETTEXT([external])
AM_GNU_GETTEXT_VERSION([0.18])

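The version given to AM_GNU_GETTEXT_VERSION is the version of the gettext infrastructure files that autopoint copies into your project. In practice, it is the minimum version of gettext that must be installed in order to build the package from version control.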

Copy po/Makevars.template to po/Makevars and modify it as needed.

The next step is to copy over gettext.h to your sources.

$ cp /usr/share/gettext/gettext.h src/

libintl.h is the header that provides the different translation functions. gettext.h is a convenience wrapper around it that lets the program build without gettext when --disable-nls is passed to the ./configure script. It is recommended to include gettext.h instead of including libintl.h directly.
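
To make the role of gettext.h more concrete, here is a simplified sketch of the idea behind it. It is not the real header, which covers many more functions and platform quirks; hello_message is a hypothetical helper added only for illustration. When ENABLE_NLS is not defined (which is what --disable-nls leads to), the translation calls collapse into macros that return their argument unchanged:

```c
#include <stdio.h>

#if ENABLE_NLS
# include <libintl.h>   /* the real translation functions */
#else
/* NLS disabled: every call collapses to its untranslated argument. */
# define gettext(Msgid) ((const char *) (Msgid))
# define textdomain(Domainname) ((const char *) (Domainname))
# define bindtextdomain(Domainname, Dirname) \
    ((void) (Domainname), (const char *) (Dirname))
#endif

/* Hypothetical helper, used only for this illustration. */
static const char *hello_message(void)
{
    return gettext("Hello World!");
}
```

Because the fallback is done with macros, the rest of the sources can call gettext() unconditionally and still compile on systems without libintl.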

Triggering gettext in main()

In order for gettext to work, you need to initialize it in your main(). This is done by adding the following lines at the start of the main() function, before any translated string is used:

setlocale (LC_ALL, "");
bindtextdomain (PACKAGE, LOCALEDIR);
textdomain (PACKAGE);

You should also add #include "gettext.h" and #include <locale.h> (which declares setlocale) to the list of includes.

PACKAGE should be the name of your program, and is usually defined in the config.h file that ./configure generates from the config.h.in template produced by autoheader. To define LOCALEDIR, we need to add the following line to src/Makefile.am:

AM_CPPFLAGS = -DLOCALEDIR='"$(localedir)"'

If AM_CPPFLAGS is already defined, just append the -DLOCALEDIR='"$(localedir)"' part to it.
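
Put together, the initialization might look like the following sketch. The init_i18n helper and the fallback values for PACKAGE and LOCALEDIR are assumptions made so the snippet compiles on its own; in the real project the values come from config.h and AM_CPPFLAGS, and you would include "gettext.h" rather than <libintl.h>.

```c
#include <locale.h>    /* setlocale, LC_ALL */
#include <libintl.h>   /* bindtextdomain, textdomain (gettext.h wraps this) */

/* Fallbacks so the sketch compiles standalone (assumed values); in the
   real project these come from config.h and from AM_CPPFLAGS. */
#ifndef PACKAGE
# define PACKAGE "amhello"
#endif
#ifndef LOCALEDIR
# define LOCALEDIR "/usr/local/share/locale"
#endif

/* Hypothetical helper: returns 0 on success, -1 if a gettext call failed. */
static int init_i18n(void)
{
    /* Adopt the locale from the environment (LANG, LC_ALL, ...). If the
       requested locale is unavailable, the program stays in the "C"
       locale and simply shows the untranslated strings. */
    setlocale(LC_ALL, "");

    /* Tell gettext where the catalogs for PACKAGE are installed. */
    if (bindtextdomain(PACKAGE, LOCALEDIR) == NULL)
        return -1;

    /* Make PACKAGE the default domain for plain gettext() calls. */
    if (textdomain(PACKAGE) == NULL)
        return -1;

    return 0;
}
```

Call init_i18n() once at the top of main(), before the first translated string is printed.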

Marking strings for translation

At this point, your program should compile with gettext. But since we have not translated anything yet, it will not do anything useful. Before translating, we need to mark the translatable strings in the sources. Wrap each translatable string in _(...), and add the following lines to each file that contains translatable strings:

#include "gettext.h"
#define _(String) gettext (String)
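
As a small illustration of marking, the sketch below wraps a format string in _(...). The format_greeting helper is hypothetical, and <libintl.h> is included directly only so the snippet stands on its own; in the project you would include "gettext.h" as shown above. Marking the whole sentence, placeholder included, lets translators reorder the words, which is not possible when translated fragments are glued together in code.

```c
#include <stdio.h>
#include <libintl.h>

#define _(String) gettext (String)

/* Hypothetical helper: formats a greeting for `name` into buf and
   returns the snprintf result. */
static int format_greeting(char *buf, size_t size, const char *name)
{
    /* The complete sentence is one translatable unit; xgettext picks up
       the literal inside _(...) when extracting strings. */
    return snprintf(buf, size, _("Hello, %s!"), name);
}
```

When no catalog is found for the current locale, gettext returns the original string, so the program keeps working untranslated.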

Extracting strings for translation

Before extracting the strings, we need to tell gettext where to look. This is done by listing each source file with translatable strings in po/POTFILES.in. So in our example, po/POTFILES.in should look like:

# List of source files which contain translatable strings.
src/main.c

Then run the following command to extract the strings into po/amhello.pot (which should go in version control):

make -C po/ update-po

If you haven’t run ./configure yet, you need to run autoreconf --install && ./configure before running the above make command.

Translating strings

To begin translating, you need a *.po file for your language. You create one with msginit:

cd po/ && msginit --locale he_IL.utf8

The locale should be specified as a two-letter language code and a two-letter country code joined by an underscore, optionally followed by an encoding (he_IL.utf8 here). In my example, I’ve used Hebrew; hence, it will create a po/he.po file. To translate the program, you edit the .po file, using either a text editor or a dedicated PO editor.

After you update the .po file for your language, list the language in po/LINGUAS (you need to create it). For example, in my case:

# Set of available languages
he

Now you should be ready to compile and test the translation. Unfortunately, gettext requires installing the program in order to properly load the message catalogs, so we need to call make install.

./configure --prefix=/tmp/amhello
make
make install

Now, to check the translation, simply run /tmp/amhello/bin/hello (you might need to change LC_ALL or LANGUAGE, depending on your locale, to see the translation).

$ LANGUAGE=he /tmp/amhello/bin/hello 
שלום עולם!
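
If the translation does not show up, it helps to know where gettext looks for the compiled catalog: under the directory passed to bindtextdomain, in a subdirectory named after the language, inside LC_MESSAGES, in a file named after the text domain. The sketch below only builds that path so you can check that make install put the file there; catalog_path is a hypothetical helper, and the real lookup also tries variants of the locale name (for example he_IL before he).

```c
#include <stdio.h>

/* Hypothetical helper: writes the path where the compiled catalog of
   `domain` for language `lang` is expected under `localedir`.
   Returns the snprintf result. */
static int catalog_path(char *buf, size_t size, const char *localedir,
                        const char *lang, const char *domain)
{
    return snprintf(buf, size, "%s/%s/LC_MESSAGES/%s.mo",
                    localedir, lang, domain);
}
```

With the prefix used above and the default localedir of $(prefix)/share/locale, the Hebrew catalog should end up at /tmp/amhello/share/locale/he/LC_MESSAGES/amhello.mo.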

Final note about bootstrapping: when people check out your code from version control, many autogenerated files will be missing. The simplest way to bootstrap the code into a state where you can simply call ./configure && make is by using autoreconf:

autoreconf --install

This will add any missing files and run all the Autotools utilities (aclocal, autoconf, automake, autoheader, etc.) in the right order. Additionally, it will call autopoint, which copies the necessary gettext files that were generated when you called gettextize earlier in the tutorial. If your project is using a ./autogen.sh script that calls the Autotools utilities manually, you should add a call to autopoint --force before the call to aclocal.

Finally, these are the files that end up under version control in our example:

$ ls -RF
.:
configure.ac  Makefile.am  po/  README  src/

./po:
amhello.pot  he.po  LINGUAS  Makevars  POTFILES.in

./src:
gettext.h  main.c  Makefile.am

Question Marks Instead of Non-ASCII Chars When Using Gettext in PHP

Yesterday I ported a PHP website to use Gettext for localization (l10n). After reading through the Gettext documentation and going through the documentation on the PHP site, I managed to get (almost) everything working. I had one problem: all the non-ASCII characters (accented Latin chars, Japanese, and Chinese) were displayed as question marks (?) instead of in the correct form. This happened despite my using UTF-8 encoded files.

While some people suggested that it’s not possible to use non-ASCII characters when using UTF-8 encoded message files, there is a solution, and it’s quite simple. All you have to do is call bind_textdomain_codeset with your text domain and pass it UTF-8 as the codeset.
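
The PHP function mirrors the C library call of the same name, so the same fix can be sketched in C, the language used in the tutorial above. The use_utf8_output helper is a hypothetical wrapper added for illustration; the PHP call takes the same two arguments (domain, then codeset).

```c
#include <libintl.h>

/* Hypothetical helper: asks gettext to return the messages of `domain`
   encoded as UTF-8, whatever charset the current locale uses.
   Returns the codeset now in effect for the domain. */
static const char *use_utf8_output(const char *domain)
{
    return bind_textdomain_codeset(domain, "UTF-8");
}
```

Call it once during initialization, next to bindtextdomain, with the same domain name.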

Vim Macros for Wrapping Strings for Gettext

I’m working on a website, and we decided to localize it using GNU gettext. Soon enough, I found it tiring to wrap each string manually in _( and ), and also to do it in Smarty (using {t}string{/t}). So I decided that I needed a macro that would wrap whatever string I highlight for translation.

I ended up writing two macros: one for PHP files (but it’s also good for C/C++, etc.) and one for Smarty.

:vmap tg di_(<ESC>pa)<ESC>
:vmap ts di{t}<ESC>pa{/t}<ESC>

To use these macros, just highlight the string for translation in Vim’s visual mode and press tg (or ts), and your string will be wrapped for translation.