ocPortal Tutorial: Localisation and internationalisation

Written by Chris Graham, ocProducts
ocPortal has support for internationalisation, including:
  • time zones
  • translation into different languages
  • different character sets (for example, Cyrillic)
  • different locales, for different number-formatting conventions (for example, whether a comma or a point is used as the decimal separator)



Time zones

In ocPortal, time zones can be adjusted in two ways:
  • adjusting the site time-zone relative to server time (this is a configuration option). This is convenient if the server is located in a different time-zone from the site (e.g. a UK site using an American hosting company)
  • adjusting member time-zones, relative to the site time-zone (OCF only)
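
The two adjustments stack: the site offset is applied to server time, and a member offset is then applied to site time. A minimal sketch in Python (ocPortal itself is written in PHP; the function name and the use of whole-hour offsets are assumptions made for illustration, not ocPortal's actual configuration format):

```python
from datetime import datetime, timedelta

def member_time(server_time, site_offset_hours, member_offset_hours=0):
    """Apply the site offset to server time, then the member offset to site time."""
    site_time = server_time + timedelta(hours=site_offset_hours)
    return site_time + timedelta(hours=member_offset_hours)

# Hypothetical case: server clock 5 hours behind the site, member 1 hour ahead of the site.
server_now = datetime(2024, 1, 15, 7, 0)
print(member_time(server_now, 5))     # 2024-01-15 12:00:00 (site time)
print(member_time(server_now, 5, 1))  # 2024-01-15 13:00:00 (member time)
```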

Language file format

This section describes the format used to store language strings in ocPortal. In theory you do not need to know this, as the Admin Zone provides an interface that works with the format behind the scenes; however, it is useful to know, especially if you wish to work through the language files in a text editor.

ocPortal language packs are made up of .ini files, containing mappings between special codes (based on the English) and the actual string as displayed. For example, a common string in the 'global' language file (the one containing common strings used throughout the portal) is coded as:

Code

PROCEED=Proceed
ocPortal is developed in British English, which is technically known as the 'fall-back language' because it should always have a complete set of files and strings.
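
To make the CODE=String mapping concrete, here is a rough Python sketch of reading such lines into a dictionary. It is not ocPortal's own parser: the function name is hypothetical, and the handling of comment lines and [section] headers is an assumption, since the tutorial only shows the basic CODE=String form.

```python
def parse_lang_file(text):
    """Return a dict mapping string codes to display strings."""
    strings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments and [section] headers (assumed, not confirmed by the tutorial).
        if not line or line.startswith((";", "#", "[")):
            continue
        # Split on the first '=' only, so the string itself may contain '='.
        code, sep, value = line.partition("=")
        if sep:
            strings[code.strip()] = value.strip()
    return strings

print(parse_lang_file("PROCEED=Proceed\n"))  # {'PROCEED': 'Proceed'}
```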

Choosing a language and language file to edit in the language editor


The .ini files for any translation are stored together in a directory that is named with the standard two-letter code to denote that language; for example, English is 'EN'. A list of these codes is in lang/langs.ini.

All bundled language packs are located in the 'lang' directory of ocPortal. There is also a 'lang_custom' directory which contains custom language packs, or language packs that 'override' those available in the 'lang' directory on a file-by-file basis. Whenever language files are edited in the Admin Zone, the file is automatically overridden to a lang_custom one if it has not been already.

Not all language files need to be translated, and language files do not have to be complete: if a string cannot be found and the fall-back language (English) isn't already being used, ocPortal will look it up in the English language pack using the fall-back mechanism.
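
The override and fall-back rules described above can be sketched as a two-stage lookup: pick the lang_custom file over the bundled one, then try the requested language before EN. This is an illustrative Python sketch only; the function names, the `base` argument (the directory holding 'lang' and 'lang_custom') and the latin-1 decoding are assumptions, not ocPortal's actual code.

```python
from pathlib import Path

FALLBACK_LANG = "EN"

def load_strings(path):
    """Read CODE=String lines from one .ini file into a dict."""
    strings = {}
    # latin-1 accepts any byte; a real reader would use the pack's own charset.
    for line in path.read_text(encoding="latin-1").splitlines():
        if line.startswith(("[", ";")):
            continue
        code, sep, value = line.partition("=")
        if sep:
            strings[code.strip()] = value.strip()
    return strings

def find_lang_file(base, lang, name):
    """Prefer the lang_custom override; otherwise use the bundled file."""
    for root in ("lang_custom", "lang"):
        candidate = Path(base) / root / lang / f"{name}.ini"
        if candidate.is_file():
            return candidate
    return None

def lookup(base, lang, name, code):
    """Try the requested language first, then the fall-back language."""
    for candidate_lang in dict.fromkeys([lang, FALLBACK_LANG]):
        path = find_lang_file(base, candidate_lang, name)
        if path is not None:
            strings = load_strings(path)
            if code in strings:
                return strings[code]
    return None
```

For example, with only PROCEED translated in a hypothetical lang_custom/FR/global.ini, `lookup(base, "FR", "global", "PROCEED")` would return the French string, while any other code present in lang/EN/global.ini would come back in English.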

Using the language editor to translate language strings





The language editor

{!DOC_TRANSLATE}

Special strings

Language string codes that are in lower-case are special strings that should not be translated directly. These strings contain encoded information relating to the language pack.

String codename and purpose:
  • charset: The character set needed for the language (the standard code for an ASCII-based character set)
  • dir: The direction of text (usually ltr, but sometimes rtl for languages such as Arabic)
  • locale: The locale. There are standard locale codes for unix, based on language codes, but they vary across operating systems: use what works on your server. The locale code is used to prepare certain operating system date strings, and for number formatting.
  • en_right: Sometimes templates have to apply property values of "left" or "right" to table cells, according to the text direction. For an rtl language, this becomes "left" instead of "right".
  • en_left: As en_right, but reversed.
  • language_author: Your name
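
The relationship between dir, en_left and en_right can be written down as a small rule. The Python helper below is hypothetical (it is not part of ocPortal) and simply encodes the behaviour described above: for an rtl language the two values swap.

```python
def direction_strings(text_dir):
    """Return the dir/en_left/en_right values implied by a text direction."""
    if text_dir not in ("ltr", "rtl"):
        raise ValueError("text_dir must be 'ltr' or 'rtl'")
    flipped = text_dir == "rtl"
    return {
        "dir": text_dir,
        "en_left": "right" if flipped else "left",
        "en_right": "left" if flipped else "right",
    }

print(direction_strings("rtl"))
# {'dir': 'rtl', 'en_left': 'right', 'en_right': 'left'}
```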


Character sets

There are three systems that are in common usage to allow diverse characters to be displayed in a document:
  • HTML entities
  • Unicode
  • Character sets

ocPortal supports character sets. In some places, HTML entities will work, but there are definitely places where, in the current version of ocPortal, they will not. Unicode is not appropriate for PHP systems like ocPortal, due to the "binary safe" design of PHP strings.

To understand character sets, you need to understand how strings (or text files) are composed. Each character (a symbol, represented by a 'glyph' on the screen) is essentially represented by a number from 0 to 255. The values 0-127 are usually standard, as specified by the "7-bit ASCII code"; the 128-255 range is essentially free, and what those numbers map to depends on the "character set" used. As different languages use different characters (for example, accented characters, a whole different alphabet, or even a pictographic script), different languages use different character sets.
A file that uses "high" characters will look different when viewed in editors set to different character sets. To enter and view text in the appropriate character set, your editor must be set to it; this is likely to be the default already if you are translating into your native language.
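
As an illustration of why the same file looks different under different character sets, the following Python sketch decodes one "high" byte value under three single-byte character sets and prints the Unicode name of the resulting character. The helper name is hypothetical; the codec names are standard Python ones.

```python
import unicodedata

def describe_byte(byte_value, charset):
    """Decode one byte under the given character set and name the result."""
    character = bytes([byte_value]).decode(charset)
    return unicodedata.name(character)

CHARSETS = ("iso-8859-1", "iso-8859-5", "windows-1251")

# The same "high" byte means something different in each character set.
for charset in CHARSETS:
    print(charset, "->", describe_byte(0xE9, charset))
# iso-8859-1 -> LATIN SMALL LETTER E WITH ACUTE
# iso-8859-5 -> CYRILLIC SMALL LETTER SHCHA
# windows-1251 -> CYRILLIC SMALL LETTER SHORT I

# A value below 128 decodes identically in all three.
print({describe_byte(0x41, charset) for charset in CHARSETS})
# {'LATIN CAPITAL LETTER A'}
```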

Comcode and HTML pages

In addition to the language files, Comcode and HTML pages may be translated. To translate a Comcode page, either manually copy the Comcode page .txt file from the pages/comcode/EN directory to the appropriate pages/comcode/<lang> directory and change it there, or simply choose the target language and edit the file using ocPortal.
As HTML pages are created outside ocPortal, you must copy the file manually, in the same way as described for Comcode pages.
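
The manual copy step can be scripted. The Python sketch below is one possible way to do it, under the assumption that `site_root` is whichever directory contains the pages/comcode/EN directory mentioned above; the function name is hypothetical, and the script refuses to overwrite an existing translation.

```python
import shutil
from pathlib import Path

def copy_page_for_translation(site_root, page, lang, source_lang="EN"):
    """Copy pages/comcode/<source_lang>/<page>.txt into pages/comcode/<lang>/."""
    comcode_dir = Path(site_root) / "pages" / "comcode"
    source = comcode_dir / source_lang / f"{page}.txt"
    target_dir = comcode_dir / lang
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    if target.exists():
        # Never clobber a translation that is already in progress.
        raise FileExistsError(f"{target} already exists")
    shutil.copyfile(source, target)
    return target
```

After copying, the new file is edited in place with the translated text, using an editor set to the target character set.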

Multiple languages on one site

Language configuration

It is possible to configure ocPortal such that members may select which language to use on your site, and pages are then presented in this language. There are a number of ways a user may choose a language:
  • via the language block (which inserts a keep_lang parameter into the URL, to preserve their choice until they close the browser window)
  • via their member profile (OCF supports this better than other language drivers, although the integration can be improved by editing the lang/map.ini file)
  • via their web browser stated language (disabled by default, as most users unfortunately have it misconfigured)
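
The three sources of a language choice listed above could be combined roughly as follows. This Python sketch is an assumption-laden illustration: the tutorial does not state the order of precedence, so the order used here (keep_lang, then member profile, then browser language, then the site default) is a guess, and every name other than the keep_lang parameter is hypothetical.

```python
from urllib.parse import urlparse, parse_qs

def choose_language(url, installed, member_lang=None, browser_lang=None,
                    use_browser_lang=False, default="EN"):
    """Pick the first installed language from the available hints."""
    query = parse_qs(urlparse(url).query)
    candidates = [
        query.get("keep_lang", [None])[0],           # set by the language block
        member_lang,                                 # from the member profile
        browser_lang if use_browser_lang else None,  # disabled by default
    ]
    for lang in candidates:
        if lang and lang.upper() in installed:
            return lang.upper()
    return default

installed = {"EN", "FR"}
print(choose_language("http://example.com/index.php?page=start&keep_lang=FR", installed))
# FR
print(choose_language("http://example.com/index.php?page=start", installed, browser_lang="FR"))
# EN
```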

Changing the default site language

However, at this point in time we do not support preparation of content in multiple languages. As a result, if a user selects a language pack that uses an unusual character set, content may appear garbled.

Criticising language packs

Choosing a language to criticise the translation of

{!DOC_CRITICISE_LANGUAGE_PACK}

ocProducts policy on languages

Officially, ocPortal is only supported in British English, because it is difficult for us to test and maintain translations in languages foreign to the development team, and to provide support in those languages. However, we provide incentives to individuals or groups to do translations (a free 2-year registration for those significantly involved, in return for permission to re-distribute), and will distribute them with ocPortal releases for convenience. At the time of writing, the bundled languages are only partial translations, as they were made for a previous version, but they provide a base to work from and a good demonstration of how the translation system works.
If problems are reported to us, such as bugs occurring when a non-English pack is used, or difficulties with the way templates are constructed for a right-to-left script, we will endeavour to fix them. This can sometimes pose problems for us though:
  • understanding technical problems described in poor English is not easy
  • we often don't have the necessary environment to easily reproduce the problem
  • we sometimes do not have the insight into the language involved, which means we need to conduct research
  • a lower proportion of non-English-speaking users register than English-speaking ones, for various reasons (such as difficulty in paying through a primarily American payment gateway, our inability to offer support in their language, or economic differences making registration less affordable), and thus resolving language problems cannot be our top priority
Support for non-English packs is therefore wholly unofficial. In practice, we very much want ocPortal to be widely used by people in any language, even if people feel they cannot register.

Criticising the translation of a language

When reporting a language issue, or asking a question not made clear in this tutorial:
  • please use clear, accurate, and sufficient English. Do not just pass phrases through babelfish or SysTrans, because the result comes out garbled (for example 'Gate' being used to refer to our product, instead of 'Portal'). Do not use 'pidgin English' or 'street English' with words such as 'da' instead of 'the'
  • please do not expect us to be able to provide accurate responses or a high level of service if the above is not done: regardless of what we offer as a part of registration, we will not learn a foreign language, or spend hours decoding bad English
  • please do not expect us to understand any language other than English, whether it is spoken to us directly or appears in a resource we are pointed to
  • please do not expect us to get personally involved in a translation, other than making sure important questions do not go unanswered

Making a language pack

This section may seem rather negative, but we are actually enthusiastic about taking on translations and helping translators. We'd like:
  • ocPortal available in all major languages
  • to work to make sure translations to any language (even oriental and right-to-left ones) are possible without third party re-coding
  • to compensate translators for their work (but in a way that doesn't cost us money we cannot re-make in related sales)
  • to provide the information and tools that any translator needs
  • ocPortal to work seamlessly with multiple language packs installed at once






Concepts

language string
A piece of text, often a phrase, used by ocPortal; identified by a short code WRITTEN_LIKE_THIS
character set
A mapping between the values of the one-byte-per-character representation system and the characters they stand for; choosing between character sets allows more characters to be represented than a single byte can distinguish, so that computers may show many different language scripts