Internationalization
Internationalization vs. Localization
- i18n: internationalization
- l10n: localization
- Roughly speaking, when it comes to multilingual messages, internationalization is usually handled by programmers and localization by translators.

This page deals with both internationalization and localization. Apologies if the terms are mistakenly used for one another in places; it is a new concept to the editor that they are separate ideas.
There seem to be two schools of thought in the area of internationalization/localization:
- Compile-time: messages are selected using version().
- Run-time: messages are selected using some sort of plugin architecture or language resource files.
Compile-time
Version identifiers should have a prefix (such as "locale_" or "loc_"). This should make it clear to most readers of the code what is happening. Not everyone would intuitively know that ky_KG is a locale feature, but locale_ky_KG is clear. (By the way, "ky" is a language code and "KG" is a country code. The combination "ky_KG" is a locale.)
version (locale_en_GB) { ... }
else version (locale_en_US) { ... }
else version (locale_en) { ... }

This is just a start...
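The fragment above can be fleshed out into a small complete program. This is only a sketch: the identifiers locale_en_GB and locale_en_US follow the prefix convention proposed here and are not predefined by any compiler, and the greeting strings are made up for illustration.

```d
import std.stdio : writeln;

// Select the message at compile time. The locale_* identifiers are
// hypothetical; they must be supplied on the command line, e.g.
//   dmd -version=locale_en_GB hello.d
version (locale_en_GB)
{
    enum greeting = "Hello, what lovely colours!";
}
else version (locale_en_US)
{
    enum greeting = "Hello, what lovely colors!";
}
else
{
    // Fallback when no locale version identifier is set.
    enum greeting = "Hello!";
}

void main()
{
    writeln(greeting);
}
```

Built without any -version flag, the program uses the fallback branch. Each locale needs its own build, which is the main trade-off of the compile-time approach.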
Run-time
Proposal for D
- We need a class Locale (or possibly a struct Locale) containing the ISO language and country codes. Java might use strings internally, but there are several reasons why that is not a good idea. For one, "fr", "fra" and "fre" are all, equivalently, the language code for French, and should all compare as equal. For another, there are case and punctuation concerns ("en-us" == "en-US" == "en_us" == "en_US", etc.). I'd vote for putting enums inside the class (enum Language and enum Country; the variant field will still need to be a string). I imagine that the gettext implementation will need to use our yet-to-be-invented Locale class, and the Unicode library certainly will (and soon).
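As a rough illustration of the Locale idea, here is a minimal sketch of a struct with enum fields. Everything in it is an assumption made for the example: the names Locale.parse, toLanguage and toCountry are hypothetical, the enums list only a handful of codes, and the enums are declared at module level rather than inside the type to keep the sketch short.

```d
import std.array : replace, split;
import std.uni : toLower, toUpper;

// Tiny illustrative subsets; real enums would cover the full ISO lists.
enum Language { unknown, en, fr, ky }
enum Country { none, FR, GB, KG, US }

struct Locale
{
    Language language;
    Country country;
    string variant; // free-form, so it stays a string

    // Hypothetical helper: parse tags such as "en-us", "en_US" or "fra".
    static Locale parse(string tag)
    {
        // Treat '-' and '_' as the same separator.
        auto parts = tag.replace("-", "_").split("_");
        Locale loc;
        if (parts.length > 0) loc.language = toLanguage(parts[0].toLower);
        if (parts.length > 1) loc.country = toCountry(parts[1].toUpper);
        if (parts.length > 2) loc.variant = parts[2];
        return loc;
    }
}

// Two-letter and three-letter codes map onto the same enum member.
Language toLanguage(string code)
{
    switch (code)
    {
        case "en", "eng":        return Language.en;
        case "fr", "fra", "fre": return Language.fr;
        case "ky", "kir":        return Language.ky;
        default:                 return Language.unknown;
    }
}

Country toCountry(string code)
{
    switch (code)
    {
        case "FR": return Country.FR;
        case "GB": return Country.GB;
        case "KG": return Country.KG;
        case "US": return Country.US;
        default:   return Country.none;
    }
}

void main()
{
    // Equivalent spellings compare equal once parsed.
    assert(Locale.parse("en-us") == Locale.parse("en_US"));
    assert(Locale.parse("fr") == Locale.parse("fra"));
    assert(Locale.parse("fra") == Locale.parse("fre"));
    assert(Locale.parse("ky_KG").country == Country.KG);
}
```

Because the codes are normalized once at parse time, later comparisons are plain enum comparisons and do not need to worry about case or separators.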

gettext
- What about just porting GNU gettext to Phobos? This way you have a semi-standard way of localizing programs (which a lot of translators know about), and a set of pre-written tools (even nice GUI ones).
- Looking at the Python implementation, it should not be difficult; gettext.py is only 493 lines. I'll see if I can find enough time to do it in the next few weeks.

- GNU gettext documentation:
http://www.gnu.org/software/gettext/manual/html_chapter/gettext_toc.html
- The Python implementation (about 500 lines) doesn't use any external C library at all; it's 100% pure Python.
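To give a feel for what the core of such a port might look like, here is a minimal sketch of a gettext-style lookup in D. It is not an actual Phobos API: the Catalog struct is hypothetical, and the translations are held in an in-memory associative array, whereas a real port would load them from a compiled .mo file.

```d
import std.stdio : writeln;

// Hypothetical in-memory message catalog: msgid -> translated string.
struct Catalog
{
    string[string] messages;

    // Return the translation for msgid, or msgid itself when the
    // catalog has no entry, mirroring gettext's fallback behaviour.
    string gettext(string msgid)
    {
        if (auto p = msgid in messages)
            return *p;
        return msgid;
    }
}

void main()
{
    auto fr = Catalog(["Hello": "Bonjour", "Goodbye": "Au revoir"]);
    writeln(fr.gettext("Hello"));  // prints "Bonjour"
    writeln(fr.gettext("Thanks")); // no entry, prints "Thanks"
}
```

Falling back to the original string means an incomplete translation still produces usable output, which is one reason the gettext model is popular with translators.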

Common Locale Data Repository
- D should define locales exclusively in terms of ISO language and country codes, plus variant extensions. Unicode defines locales that way, and the etc.unicode library will have no choice but to use the ISO codes. Collation and similar locale-dependent features will need to rely on data from the CLDR (Common Locale Data Repository - see
http://www.unicode.org/cldr/).

More links
- Qt/KDE:
http://doc.trolltech.com/3.0/linguist-manual.html
- Java:
http://java.sun.com/j2se/1.5.0/docs/api/java/util/Formattable.html
- Python:
http://doc.astro-wise.org/locale.html