Kitchen.i18n Module
I18N is an important piece of any modern program. Unfortunately, setting up i18n in your program is often a confusing process. The functions provided here aim to make the programming side of that a little easier.
Most projects will be able to do something like this when they start up:
    # myprogram/__init__.py:
    import os
    import sys

    from kitchen.i18n import easy_gettext_setup

    _, N_ = easy_gettext_setup('myprogram', localedirs=(
        os.path.join(os.path.realpath(os.path.dirname(__file__)), 'locale'),
        os.path.join(sys.prefix, 'lib', 'locale')
    ))
Then, in other files that have strings that need translating:
    # myprogram/commands.py:
    from myprogram import _, N_

    def print_usage():
        # _() marks a string that has no plural forms
        print(_(u"""available commands are:
        --help              Display help
        --version           Display version of this program
        --bake-me-a-cake    as fast as you can
        """))

    def print_invitations(age):
        print(_('Please come to my party.'))
        # N_() picks the singular or plural form based on age
        print(N_('I will be turning %(age)s year old',
                 'I will be turning %(age)s years old', age) % {'age': age})
See the documentation of easy_gettext_setup() and
get_translation_object() for more details.
See also
- gettext
For details of how the python gettext facilities work
- babel
The babel module for in depth information on gettext, message catalogs, and translating your app. babel provides some nice features for i18n on top of gettext
Functions
easy_gettext_setup() should satisfy the needs of most users.
get_translation_object() is designed to ease the way for anyone that
needs more control.
- kitchen.i18n.easy_gettext_setup(domain, localedirs=(), use_unicode=True)
Set up translation functions for an application
- Parameters:
domain – Name of the message domain. This should be a unique name that can be used to look up the message catalog for this app.
localedirs – Iterator of directories to look for message catalogs under. The first directory to exist is used regardless of whether messages for this domain are present. If none of the directories exist, fall back on sys.prefix + /share/locale. Default: No directories to search so we just use the fallback.
use_unicode – If True, return the gettext functions for str strings, else return the functions for bytes for the translations. Default is True.
- Returns:
tuple of the gettext function and gettext function for plurals
Setting up gettext can be a little tricky because of lack of documentation. This function will set up gettext using the Class-based API for you. For the simple case, you can pass just the message domain, leave the other arguments at their defaults, and call it like this:
    _, N_ = easy_gettext_setup('myprogram')
This will get you two functions, _() and N_(), that you can use to mark strings in your code for translation. _() is used to mark strings that don’t need to worry about plural forms no matter what the value of the variable is. N_() is used to mark strings that do need to have a different form if a variable in the string is plural.
See also
- Kitchen.i18n Module
This module’s documentation has examples of using _() and N_()
- get_translation_object()
for information on how to use localedirs to get the proper message catalogs both when in development and when installed to FHS compliant directories on Linux.
Note
The gettext functions returned from this function should be superior to the ones returned from gettext. The traits that make them better are described in the DummyTranslations and NewGNUTranslations documentation.
Changed in version kitchen-0.2.4; API kitchen.i18n 2.0.0: Changed easy_gettext_setup() to return the lgettext functions instead of gettext functions when use_unicode=False.
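The localedirs rule described for easy_gettext_setup() (use the first directory that exists, otherwise fall back on sys.prefix + /share/locale) can be sketched with the standard library alone. This is not kitchen's implementation. The names first_existing_dir() and simple_gettext_setup() are hypothetical, and the sketch returns the plain stdlib gettext functions rather than kitchen's safer ones.

```python
import gettext
import os
import sys


def first_existing_dir(localedirs):
    """Return the first directory in localedirs that exists.

    Falls back on sys.prefix + /share/locale when none of them exist.
    """
    for directory in localedirs:
        if os.path.isdir(directory):
            return directory
    return os.path.join(sys.prefix, 'share', 'locale')


def simple_gettext_setup(domain, localedirs=()):
    """Hypothetical stdlib-only analogue of easy_gettext_setup()."""
    localedir = first_existing_dir(localedirs)
    # fallback=True yields a NullTranslations object when no catalog is found
    translations = gettext.translation(domain, localedir, fallback=True)
    return translations.gettext, translations.ngettext
```

Because the directory is chosen before any catalog is looked up, an existing but empty directory early in localedirs hides catalogs that live in later directories. get_translation_object() avoids this by searching every directory.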
- kitchen.i18n.get_translation_object(domain, localedirs=(), languages=None, class_=None, fallback=True, codeset=None, python2_api=True)
Get a translation object bound to the message catalogs
- Parameters:
domain – Name of the message domain. This should be a unique name that can be used to look up the message catalog for this app or library.
localedirs – Iterator of directories to look for message catalogs under. The directories are searched in order for message catalogs. For each of the directories searched, we check for message catalogs in any language specified in languages. The message catalogs are used to create the Translation object that we return. The Translation object will attempt to look up the msgid in the first catalog that we found. If it’s not in there, it will go through each subsequent catalog looking for a match. For this reason, the order in which you specify the localedirs may be important. If no message catalogs are found, either return a DummyTranslations object or raise an IOError depending on the value of fallback. The default localedir from gettext, which is os.path.join(sys.prefix, 'share', 'locale') on Unix, is implicitly appended to the localedirs, making it the last directory searched.
languages – Iterator of language codes to check for message catalogs. If unspecified, the user’s locale settings will be used. See also gettext.find() for information on what environment variables are used.
class_ – The class to use to extract translations from the message catalogs. Defaults to NewGNUTranslations.
fallback – If set to False, raise an IOError if no message catalogs are found. If True, the default, return a DummyTranslations object.
codeset – Set the character encoding to use when returning bytes objects. This is equivalent to calling output_charset() on the Translations object that is returned from this function.
python2_api – When True (default), return Translation objects that use the python2 gettext api (gettext() and lgettext() return bytes. ugettext() exists and returns str strings). When False, return Translation objects that use the python3 gettext api (gettext returns str strings and lgettext returns bytes. ugettext does not exist.)
- Returns:
Translation object to get gettext methods from
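The fallback parameter mirrors the one on the standard library's gettext.translation(), so its two behaviours can be illustrated without kitchen. The sketch below is a stdlib analogue under that assumption: it returns gettext.NullTranslations where kitchen would return DummyTranslations, and the empty temporary directory exists only to guarantee that no catalog is found.

```python
import gettext
import tempfile

# An empty directory, so no message catalog can be found in it.
empty_localedir = tempfile.mkdtemp()

# fallback=True: a translation object that translates nothing is returned.
dummy = gettext.translation('foo', empty_localedir, languages=['es'],
                            fallback=True)
print(dummy.gettext('My Message'))

# fallback=False: the missing catalog is reported as an IOError
# (IOError is an alias of OSError on python3).
try:
    gettext.translation('foo', empty_localedir, languages=['es'],
                        fallback=False)
    raised = False
except IOError:
    raised = True
print(raised)
```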
If you need more flexibility than easy_gettext_setup(), use this function. It sets up a gettext Translation object and returns it to you. Then you can access any of the methods of the object that you need directly. For instance, if you specifically need to access lgettext():
    translations = get_translation_object('foo')
    translations.lgettext('My Message')
This function is similar to the python standard library gettext.translation() but makes it better in two ways:
- It returns NewGNUTranslations or DummyTranslations objects by default. These are superior to the gettext.GNUTranslations and gettext.NullTranslations objects because they are consistent in the string type they return and they fix several issues that can cause the python standard library objects to throw UnicodeError.
- This function takes multiple directories to search for message catalogs.
The latter is important when setting up gettext in a portable manner. There is not a common directory for translations across operating systems so one needs to look in multiple directories for the translations. get_translation_object() is able to handle that if you give it a list of directories to search for catalogs:
    translations = get_translation_object('foo', localedirs=(
        os.path.join(os.path.realpath(os.path.dirname(__file__)), 'locale'),
        os.path.join(sys.prefix, 'lib', 'locale')))
This will search several different directories:
- A directory named locale in the same directory as the module that called get_translation_object()
- /usr/lib/locale
- /usr/share/locale (the fallback directory)
This allows gettext to work on Windows and in development (where the message catalogs are typically in the toplevel module directory) and also when installed under Linux (where the message catalogs are installed in /usr/share/locale). You (or the system packager) just need to install the message catalogs in /usr/share/locale and remove the locale directory from the module to make this work. ie:
    In development:
        ~/foo                                    # Toplevel module directory
        ~/foo/__init__.py
        ~/foo/locale                             # With message catalogs below here:
        ~/foo/locale/es/LC_MESSAGES/foo.mo

    Installed on Linux:
        /usr/lib/python2.7/site-packages/foo
        /usr/lib/python2.7/site-packages/foo/__init__.py
        /usr/share/locale/                       # With message catalogs below here:
        /usr/share/locale/es/LC_MESSAGES/foo.mo
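The directory search described above can be approximated with the standard library's gettext.find(). The following is a rough sketch and not kitchen's code: find_catalogs() is a hypothetical helper that walks the given directories in order, appends the default sys.prefix + /share/locale directory last, and collects every matching .mo file.

```python
import gettext
import os
import sys


def find_catalogs(domain, localedirs=(), languages=None):
    """Return every matching .mo file in the order the directories are searched."""
    default_dir = os.path.join(sys.prefix, 'share', 'locale')
    found = []
    for directory in tuple(localedirs) + (default_dir,):
        # all=True makes gettext.find() return a list of every match
        found.extend(gettext.find(domain, directory, languages, all=True))
    return found
```

In development the first entry would typically come from the locale directory next to the module. Once the catalogs are installed system-wide and that directory is removed, the same call would only find the copy under the default directory.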
Note
This function will set up Translation objects that attempt to look up msgids in all of the found message catalogs. This means if you have several versions of the message catalogs installed in different directories that the function searches, you need to make sure that localedirs specifies the directories so that newer message catalogs are searched first. It also means that if a newer catalog does not contain a translation for a msgid but an older one that’s in localedirs does, the translation from that older catalog will be returned.
Changed in version kitchen-1.1.0; API kitchen.i18n 2.1.0: Add more parameters to get_translation_object() so it can more easily be used as a replacement for gettext.translation(). Also change the way we use localedirs. We cycle through them until we find a suitable locale file rather than simply cycling through until we find a directory that exists. The new code is based heavily on the python standard library gettext.translation() function.
Changed in version kitchen-1.2.0; API kitchen.i18n 2.2.0: Add python2_api parameter
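The lookup order described in the note (newer catalog first, older catalogs consulted only for missing msgids) is the fallback chaining that the standard library's translation classes already provide through add_fallback(). Here is a small sketch of that behaviour. DictTranslations is a hypothetical class that keeps its messages in a dict instead of reading a .mo file, and the Spanish strings are made-up sample data.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical catalog backed by a plain dict instead of a .mo file."""

    def __init__(self, messages):
        gettext.NullTranslations.__init__(self)
        self._messages = messages

    def gettext(self, message):
        if message in self._messages:
            return self._messages[message]
        # Not in this catalog: defer to the next catalog in the chain, or
        # return the msgid unchanged when there is none.
        return gettext.NullTranslations.gettext(self, message)


newer = DictTranslations({'Hello': 'Hola'})
older = DictTranslations({'Hello': 'Buenas', 'Goodbye': 'Adios'})
newer.add_fallback(older)  # newer is searched first, then older

print(newer.gettext('Hello'))
print(newer.gettext('Goodbye'))
```

The first lookup is answered by the newer catalog. The second falls through to the older one, which is why stale translations can still appear when an old catalog remains on the search path.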
Translation Objects
The standard translation objects from the gettext module suffer from
several problems:
- They can throw UnicodeError
- They can’t find translations for non-ASCII byte str messages
- They may return either unicode string or byte str from the same function even though the functions say they will only return unicode or only return byte str.
DummyTranslations and NewGNUTranslations were written to fix
these issues.
- class kitchen.i18n.DummyTranslations(fp=None, python2_api=True)
Safer version of gettext.NullTranslations
This Translations class doesn’t translate the strings and is intended to be used as a fallback when there were errors setting up a real Translations object. It’s safer than gettext.NullTranslations in its handling of bytes vs str strings.
Unlike NullTranslations, this Translation class will never throw a UnicodeError. The code that you have around a call to DummyTranslations might throw a UnicodeError but at least that will be in code you control and can fix. Also, unlike NullTranslations, all of this Translation object’s methods guarantee to return bytes except for ugettext() and ungettext() which guarantee to return str strings.
When bytes are returned, the strings will be encoded according to this algorithm:
- If a fallback has been added, the fallback will be called first. You’ll need to consult the fallback to see whether it performs any encoding changes.
- If bytes were given, the same bytes will be returned.
- If a str string was given and set_output_charset() has been called then we encode the string using the output_charset
- If a str string was given and this is gettext() or ngettext() and _charset was set, output in that charset.
- If a str string was given and this is gettext() or ngettext() we encode it using ‘utf-8’.
- If a str string was given and this is lgettext() or lngettext() we encode using the value of locale.getpreferredencoding()
For ugettext() and ungettext(), we go through the same set of steps with the following differences:
- We transform bytes into str strings for these methods.
- The encoding used to decode the bytes is taken from input_charset if it’s set, otherwise we decode using UTF-8.
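The encoding rules listed above can be condensed into a small decision function. This is an illustrative sketch of the documented order, not kitchen's source. The names pick_output_encoding() and to_bytes() are hypothetical, and the fallback step is left out because it depends on whatever fallback object was added.

```python
import locale


def pick_output_encoding(method, output_charset=None, catalog_charset=None):
    """Choose the charset used to encode a str message, per the rules above."""
    if output_charset:
        # set_output_charset() was called: it wins for every method
        return output_charset
    if method in ('gettext', 'ngettext'):
        # charset recorded for the message catalog, else UTF-8
        return catalog_charset or 'utf-8'
    # lgettext() and lngettext() use the locale's preferred encoding
    return locale.getpreferredencoding()


def to_bytes(message, method, output_charset=None, catalog_charset=None):
    """Return message as bytes; bytes input is passed through untouched."""
    if isinstance(message, bytes):
        return message
    encoding = pick_output_encoding(method, output_charset, catalog_charset)
    # 'replace' substitutes '?' for characters the charset cannot represent
    return message.encode(encoding, 'replace')
```

For example, to_bytes(u'caf\xe9', 'gettext') produces UTF-8 bytes, while passing output_charset='ascii' produces b'caf?' instead of raising a UnicodeError.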
- input_charset
is an extension to the python standard library gettext that specifies what charset a message is encoded in when decoding a message to str. This is used for two purposes:
- If the message string is bytes, this is used to decode the string to a str string before looking it up in the message catalog.
- In ugettext() and ungettext() methods, if bytes is given as the message and is untranslated, this is used as the encoding when decoding to str. This is different from _charset which may be set when a message catalog is loaded because input_charset is used to describe an encoding used in a python source file while _charset describes the encoding used in the message catalog file.
Any characters that aren’t able to be transformed from bytes to str string or vice versa will be replaced with a replacement character (ie: u'�' in unicode based encodings, '?' in other ASCII compatible encodings).
See also
- gettext.NullTranslations
For information about what methods are available and what they do.
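The decoding side can be sketched the same way. The helper below is hypothetical (to_text() is not a kitchen function). It shows how input_charset and the replacement-character rule interact for an untranslated bytes message, assuming UTF-8 as the default as described above.

```python
def to_text(message, input_charset=None):
    """Decode bytes to str, replacing anything that cannot be decoded."""
    if isinstance(message, bytes):
        # input_charset describes how the source file encoded the message
        return message.decode(input_charset or 'utf-8', 'replace')
    return message


# The same latin-1 encoded bytes, decoded with and without input_charset
print(repr(to_text(b'caf\xe9', input_charset='latin-1')))
print(repr(to_text(b'caf\xe9')))
```

Without input_charset the latin-1 byte is not valid UTF-8, so it is replaced with U+FFFD rather than triggering a UnicodeError.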
Changed in version kitchen-1.1.0; API kitchen.i18n 2.1.0:
- Although we had adapted gettext(), ngettext(), lgettext(), and lngettext() to always return bytes, we hadn’t forced those bytes to always be in a specified charset. We now make sure that gettext() and ngettext() return bytes encoded using output_charset if set, otherwise charset and if neither of those, UTF-8. With lgettext() and lngettext(), output_charset if set, otherwise locale.getpreferredencoding().
- Make setting input_charset and output_charset also set those attributes on any fallback translation objects.
Changed in version kitchen-1.2.0; API kitchen.i18n 2.2.0: Add python2_api parameter to __init__()
- output_charset()
Compatibility for python2.3 which doesn’t have output_charset
- set_output_charset(charset)
Set the output charset
This serves two purposes. The normal gettext.NullTranslations.set_output_charset() does not set the output on fallback objects. On python-2.3, gettext.NullTranslations objects don’t contain this method.
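Propagating the output charset to fallback objects amounts to a short recursive call. The class below is a hypothetical, self-contained illustration of that idea (ChainedCharset is not part of kitchen or of the standard library) and it tracks nothing except the charset and the fallback chain.

```python
class ChainedCharset(object):
    """Hypothetical minimal object that only tracks charset and fallback."""

    def __init__(self):
        self._fallback = None
        self._output_charset = None

    def add_fallback(self, fallback):
        # Append to the end of the chain, as the stdlib translation classes do
        if self._fallback is not None:
            self._fallback.add_fallback(fallback)
        else:
            self._fallback = fallback

    def output_charset(self):
        return self._output_charset

    def set_output_charset(self, charset):
        self._output_charset = charset
        # Unlike the stdlib method described above, also update the fallbacks
        if self._fallback is not None:
            self._fallback.set_output_charset(charset)


primary, secondary = ChainedCharset(), ChainedCharset()
primary.add_fallback(secondary)
primary.set_output_charset('utf-8')
print(secondary.output_charset())
```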
- class kitchen.i18n.NewGNUTranslations(fp=None, python2_api=True)
Safer version of gettext.GNUTranslations
gettext.GNUTranslations suffers from two problems that this class fixes.
- gettext.GNUTranslations can throw a UnicodeError in gettext.GNUTranslations.ugettext() if the message being translated has non-ASCII characters and there is no translation for it.
- gettext.GNUTranslations can return bytes from gettext.GNUTranslations.ugettext() and str strings from the other gettext() methods if the message being translated is the wrong type
When bytes are returned, the strings will be encoded according to this algorithm:
- If a fallback has been added, the fallback will be called first. You’ll need to consult the fallback to see whether it performs any encoding changes.
- If bytes were given, the same bytes will be returned.
- If a str string was given and set_output_charset() has been called then we encode the string using the output_charset
- If a str string was given and this is gettext() or ngettext() and a charset was detected when parsing the message catalog, output in that charset.
- If a str string was given and this is gettext() or ngettext() we encode it using UTF-8.
- If a str string was given and this is lgettext() or lngettext() we encode using the value of locale.getpreferredencoding()
For ugettext() and ungettext(), we go through the same set of steps with the following differences:
- We transform bytes into str strings for these methods.
- The encoding used to decode the bytes is taken from input_charset if it’s set, otherwise we decode using UTF-8
- input_charset
is an extension to the python standard library gettext that specifies what charset a message is encoded in when decoding a message to str. This is used for two purposes:
- If the message string is bytes, this is used to decode the string to a str string before looking it up in the message catalog.
- In ugettext() and ungettext() methods, if bytes is given as the message and is untranslated, this is used as the encoding when decoding to str. This is different from the _charset parameter that may be set when a message catalog is loaded because input_charset is used to describe an encoding used in a python source file while _charset describes the encoding used in the message catalog file.
Any characters that aren’t able to be transformed from bytes to str string or vice versa will be replaced with a replacement character (ie: u'�' in unicode based encodings, '?' in other ASCII compatible encodings).
See also
- gettext.GNUTranslations.gettext
For information about what methods this class has and what they do
Changed in version kitchen-1.1.0; API kitchen.i18n 2.1.0: Although we had adapted gettext(), ngettext(), lgettext(), and lngettext() to always return bytes, we hadn’t forced those bytes to always be in a specified charset. We now make sure that gettext() and ngettext() return bytes encoded using output_charset if set, otherwise charset and if neither of those, UTF-8. With lgettext() and lngettext(), output_charset if set, otherwise locale.getpreferredencoding().