Changing language twice with gettext

PledgeBank is quite an unusual site. Many international websites simply need translation (e.g. Debian in Chinese); there aren’t any data items which vary between regions. Others have multiple international markets, with a special website tweaked for each one (for example Amazon in Canada, which has some French and English text on every page).

PledgeBank is slightly different. First of all, the interface needs translating into other languages, as Debian’s does. But we don’t quite have markets like Amazon. Partly this is because we don’t yet know what our markets are, so we just make sites for every country and language combination. We have pledges, which have both a local area and a language associated with them. We’ve also got global pledges.

All this means that sometimes pledges and text in multiple languages get shown on one page. For example, if your browser is configured for Brazilian Portuguese, and you are in Brazil, then www.pledgebank.com will look like this. At the time of writing there is only one Brazilian pledge, so below it we show some global pledges in English as examples.

We use some software called GNU gettext to do our translations. Obviously, I’m not telling the truth – people do the translation, gettext just substitutes the translations into the pages. It’s a great piece of software: simple, old, well used and supported, with good tools for translators to update translations.

For some time there’s been a bug in PledgeBank. On certain pages the language can change back and forth several times, and gettext would start returning translations for an earlier language rather than the current one. I set the LANG environment variable to tell it which language to use. After much debugging and an email to GNU, it turned out that this is to do with gettext’s cache. The cache was behaving differently on FreeBSD and Linux, which was confusing me even more.

To clear the cache you rebind the text domain, that is, call textdomain(textdomain(NULL)) after changing the environment variable. This makes everything work happily everywhere. And the main point of this post is to get that nugget into search engines, so anybody else with the same problem has a hope of finding out.
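
For anyone who wants to see the fix in context, here is a minimal sketch in C of the sequence described above: change LANG, then rebind the text domain to itself. The helper name switch_language is made up for illustration, and the setlocale call is my assumption about what a typical program needs so that the C library notices the new environment; it is not something the fix itself requires. On systems where gettext lives outside the C library (FreeBSD, for instance) you would probably also need to link with -lintl.

```c
#include <libintl.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper (not part of gettext): switch the language that
 * gettext uses in the middle of a running process.
 * Returns 0 on success, -1 if the environment could not be updated. */
int switch_language(const char *lang)
{
    /* Tell gettext which language to use, as described in the post. */
    if (setenv("LANG", lang, 1) != 0)
        return -1;

    /* Assumption: re-read the message locale from the environment.
     * This returns NULL if the locale is not installed; that is not
     * treated as fatal here. */
    setlocale(LC_MESSAGES, "");

    /* Rebind the current text domain to itself. textdomain(NULL) only
     * queries the current domain; passing that name back in is what
     * makes gettext drop its cached lookups. */
    textdomain(textdomain(NULL));

    return 0;
}
```

The idea is to call something like switch_language("pt_BR.UTF-8") before emitting the Portuguese parts of a page and switch_language("en_GB.UTF-8") before the English ones; those locale names are only examples and must match locales installed on the machine.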