Translating strings in internationalized applications

Internationalization (“i18n”) and localization (“l10n”) of software is a complex topic with many facets. One aspect of internationalization is the translation of strings in programs into different languages.

Here’s an example of how not to do it (assuming t is a translation lookup function):

StringBuilder sb = new StringBuilder(t("User "));
sb.append(user.name());
sb.append(t(" logged in "));
sb.append(minutes);
sb.append(" ");
if (minutes == 1) {
    sb.append(t("minute"));
} else {
    sb.append(t("minutes"));
}
sb.append(t(" ago."));
return sb.toString();

Translatable strings and concatenation don’t mix well, be it via StringBuilder, the plus operator or in template files like JSPs. Different languages have different sentence structures. You can’t know in advance in which order the parts must appear in the translated text. So the most basic rule is: never construct sentences programmatically from sentence fragments if they are intended for translation.

Here’s a slightly better variant:

if (minutes == 1) {
    return t("User {0} logged in {1} minute ago.", user.name(), minutes);
}
return t("User {0} logged in {1} minutes ago.", user.name(), minutes);

I18n frameworks always offer the possibility to pass arguments to the translation lookup function. This way translators can freely choose the positions of these arguments via placeholders in the translated string.

However, not all languages have pluralization rules similar to English, where you have to handle only two cases (one and zero/many). For example, Russian and Polish use different forms of nouns with different numerals higher than one. Here’s an extensive table listing the plural rules for different languages: The rules are classified into these categories: “one”, “two”, “few”, “many”, “other”. Good i18n frameworks provide translation lookup functions where you can pass the count as an additional argument. The framework then dispatches to different translation keys, depending on the count and the target language:

user.login.minutes.one=...
user.login.minutes.two=...
user.login.minutes.many=...
user.login.minutes.other=...

There are other traps that you have to watch out for, e.g.

  • different punctuation marks: you can’t simply assume that you can convert any translated text into a label by appending “:” to it, or that you can convert any translated text into a quotation by surrounding it with ” and “.
  • gender rules, which can be handled similarly to the pluralization rules

Conclusion

This article gave a small glimpse into the topic of internationalization, to help avoid the most basic mistakes. Check out the documentation of your internationalization framework to see what it can offer.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s