ocPortal Tutorial: Localisation and internationalisation

Written by Chris Graham, ocProducts
This tutorial is designed as a comprehensive guide to ocPortal's translation features, written for people wanting to make a complete ocPortal translation and understand the full technical details. We also have a simpler tutorial.

ocPortal has support for internationalisation, including:
  • time zones
  • translation of text into different languages (.ini or .po files)
  • translation of text into different languages (Comcode pages)
  • translation of text into different languages (text files)
  • translation of images into different languages (e.g. labelled buttons)
  • different character sets (for example, Cyrillic)
  • different locales, for different numbering systems (for example, European comma and decimal-point difference)
  • translation of content into different languages


Time zones

In ocPortal, time zones can be adjusted in two ways:
  • adjusting the site time-zone relative to server time (this is a site configuration option). This is convenient if the server is located in a different time-zone to the site (e.g. a British website using an American hosting company)
  • adjusting member time-zones, relative to the site time-zone (OCF only)
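
As a rough illustration of how the two adjustments stack up, here is a minimal sketch. It is not ocPortal's actual code: the function name and the whole-hour offsets are assumptions made purely for the example.

```python
from datetime import datetime, timedelta

def member_local_time(server_time, site_offset_hours, member_offset_hours=0.0):
    """Apply the site offset (relative to server time), then the member
    offset (relative to site time). Both offsets are hypothetical inputs."""
    site_time = server_time + timedelta(hours=site_offset_hours)
    return site_time + timedelta(hours=member_offset_hours)

# Hypothetical case: server 5 hours behind the site, member 1 hour ahead of the site.
server_now = datetime(2010, 1, 15, 7, 0)
print(member_local_time(server_now, 5, 1))  # 2010-01-15 13:00:00
```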

ocProducts' relationship with translations

The core development team really wants ocPortal to be widely used by people in any language, but does not get involved in maintaining or developing individual language packs (other than the standard English). We may distribute third-party packs (which you can make from within ocPortal) if there is popular demand.

Internationalisation can be difficult and time consuming if someone has not already created a language pack for your language. We recommend that you try to plan ahead and bring together a team from your country to make translations go faster.

If you have any feedback on how translation can be easier without the core development team having to get involved with the work/maintenance/politics of your individual language then please report it. The team is willing to work hard to make translation easier, but doesn't have the resources to work alongside each individual translator.

If someone has already created a language pack it might be as simple as installing it from the ocPortal addon directory.

Language file format (technical overview)

This section describes the format used to store language strings in ocPortal. In theory you do not need to know this, as the Admin Zone provides an interface that works with the format behind the scenes; however, it is useful to know, especially if you wish to work through the language files in a text editor.

ocPortal language packs are made up of .ini files, containing mappings between special codes (based on the English) and the actual string as displayed. For example, a common string in the 'global' language file (the one containing common strings used throughout the portal), is coded as:

Code

PROCEED=Proceed
ocPortal is developed in British English, and this is technically known as the 'fall-back language', because it always has a complete set of language files and strings.

Choosing a language and language file to edit in the language editor


The .ini files for any translation are stored together in a directory that is named with the standard two-letter code to denote that language; for example, English is 'EN'. A list of these codes is in lang/langs.ini.

All bundled language packs are located in the 'lang' directory of ocPortal. There is also a 'lang_custom' directory which contains custom language packs, or language packs that 'override' those available in the 'lang' directory on a file-by-file basis. Whenever language files are edited in the Admin Zone, the file is automatically overridden to a lang_custom one if it has not been already.
Not all language files need to be translated, and language files do not have to be complete: if a string cannot be found in the language being used, ocPortal will look it up in the English language pack instead (the fall-back mechanism).
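
The fall-back behaviour can be pictured with a small sketch. This is only an illustration of the idea, not ocPortal's implementation: the parser handles plain KEY=value lines only, the function names are invented, and BACK=Back is a hypothetical string added for the example.

```python
def parse_lang_file(text):
    """Parse simple KEY=value lines; section headers such as [strings]
    and any line without '=' are skipped."""
    strings = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("[") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        strings[key.strip()] = value.strip()
    return strings

def lookup(code, lang, packs, fallback="EN"):
    """Return the string for `code` in `lang`, or the English one if missing."""
    if code in packs.get(lang, {}):
        return packs[lang][code]
    return packs[fallback][code]

packs = {
    "EN": parse_lang_file("[strings]\nPROCEED=Proceed\nBACK=Back\n"),
    "FR": parse_lang_file("[strings]\nPROCEED=Continuer\n"),  # incomplete pack
}
print(lookup("PROCEED", "FR", packs))  # Continuer
print(lookup("BACK", "FR", packs))     # Back
```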

Using the language editor to translate language strings


The language editor (i.e. how to change strings)

The language editor allows you to translate 'strings' so that your website is displayed in a language other than the original British English. Alternatively, you may just wish to change language strings to change the 'style' of the website.

The language editor is very easy to use. All you need to do is go to the translation module, choose your language, choose the language file to translate, and then you are presented with an interface to translate the strings.
A small level of integration is provided for languages which Google can translate, so as to provide a guide.

You can reach the language editor from the 'Style' section of the Admin Zone, under the 'Language' icon.

We now recommend doing translations via Launchpad (although you can translate locally and transfer onto Launchpad, see below). See the "Collaborative translations on Launchpad" section.

Many users like to translate stuff just on the public part of their own website. There is an option to change the language strings that you see on a page from the page footer.

It is possible to export local language changes to Launchpad, so that you can pass them back to the community to co-operate and collaborate. This is done by exporting your local translations (.ini files) to .po files and then uploading those on Launchpad.

Special strings

Language string codes that are in lower-case are special strings that should not be translated directly. These strings contain encoded information relating to the language pack.

String codename | Purpose
charset | The character set needed for the language (the standard name of the character set, e.g. 'ISO-8859-1'). Many people change this to 'utf-8' (Unicode, works with any characters), although regional character sets are supported also.
locale | The locale: there are standard locale codes for unix, based on language codes, but they vary across operating systems: use what works on your server. The locale code is used to prepare certain operating system date strings, and number formatting.
dir | The direction of text (usually ltr, but sometimes rtl for languages such as Arabic). An "rtl" language would likely require quite a few template changes as well as language changes. If someone does this we would consider integrating the changes back into a future version of ocPortal.
en_right | Sometimes templates have to apply CSS property values of 'left' or 'right', according to the text direction. For an rtl language, this becomes 'left' instead of 'right'.
en_left | As above, but opposite.
language_author | Your name
date_* / time_* / calendar_* | Date/time formatting in one of the two PHP time formats ("date" or "strftime"). If there are no '%' signs it's "date", if there are '%' signs it's "strftime".
dont_escape_trick | Ignore this one
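
The '%' rule for the date/time strings is simple enough to state as code. The sketch below only illustrates that rule; the function name and the two sample format strings are examples chosen here, not values taken from an ocPortal language file.

```python
def format_kind(fmt):
    """Apply the rule above: any '%' sign means strftime, otherwise date."""
    return "strftime" if "%" in fmt else "date"

print(format_kind("%d %B %Y"))  # strftime
print(format_kind("jS F Y"))    # date
```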


Also, generally you should not translate things inside single quotes.

For example,

Code

The renderer to use (hook-type: 'blocks/main_custom_gfx').
would be translated like:

Code

Le moteur de rendu à utiliser (type crochet: 'blocks/main_custom_gfx').

Character sets

There are three systems that are in common usage to allow diverse characters to be displayed in a document:
  • HTML entities
  • Unicode
  • Character sets

ocPortal supports character sets and mostly supports Unicode too. In some places, HTML entities will work, but there are definitely places where, in the current version of ocPortal, they will not. Unicode is not ideal for PHP systems like ocPortal, due to the 'binary safe' design of PHP strings – however, in practice it does work due to backwards-compatibility in Unicode and the fact that ocPortal has special code to take Unicode into account when it matters.

To understand character sets, you need to understand how strings (or text files) are composed. Each character (a symbol, represented by a 'glyph' on the screen) is essentially represented by a number, 0-255. The numbers 0-127 are usually standard, as specified by the 7-bit ASCII code; the 128-255 range is essentially free, and what those numbers map to depends on the 'character set' used. As different languages use different characters (for example, accented characters, or a whole different alphabet, or even a pictographical language), different languages use different character sets.

A file that uses 'high' characters will look different when viewed in editors set to different character sets. In order to enter text in the appropriate character set, and to view it, your editor must be set to that character set; this will usually be the default if you are translating into your native language.
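
To make this concrete, the short sketch below decodes the same single 'high' byte under three different character sets. It is an illustration only; the choice of byte and of character sets is arbitrary.

```python
def decode_in(raw, charsets):
    """Decode the same bytes under each named character set."""
    return {name: raw.decode(name) for name in charsets}

high_byte = bytes([0xE9])   # one 'high' byte, value 233
ascii_byte = bytes([0x41])  # one 'low' byte, value 65

charsets = ("iso-8859-1", "cp1251", "iso-8859-7")
print(decode_in(high_byte, charsets))   # a different character in each set
print(decode_in(ascii_byte, charsets))  # 'A' in every set
```

The byte in the 0-127 range reads the same everywhere, while the byte in the 128-255 range comes out as a Western European, Cyrillic, or Greek letter depending on the character set.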

Things you can translate

As well as the core .ini files, there are other things that may be translated.

Comcode (and HTML) pages

To translate a Comcode page, either manually copy the Comcode page .txt file from the pages/comcode/EN directory, to the appropriate pages/comcode/<lang> directory and change it there, or simply choose the target language and edit the file using ocPortal.

As HTML pages are created outside ocPortal, you must manually copy the file in the equivalent way to as stated for Comcode pages.

Text files

There are some other text files you might want to translate, in a similar way to Comcode pages (see above):
  • text/EN/quotes.txt
  • text/EN/rules*.txt
And these files don't need translating but could be replaced with equivalents in your language:
  • text/EN/too_common_words.txt (a list of words that should not be considered in search results, for example)
  • text/EN/word_characters.txt (a list of characters that appear in words in your language – most languages have all the English characters, but also accented ones)

None of these files are very important; only translate them if you want to.

Images

If you look under the themes/default/images/ directory you will see there is an EN directory that contains images with English text on. You can copy this to the ISO codename of your language pack (e.g. FR), and then replace the images with translated ones. Make sure you clear your theme image cache (Admin Zone, Tools section, Cleanup tools icon) after doing this. We have the PSD files (requires Adobe Photoshop or compatible software) for many of the images in our downloads database. The font is a commercial font called 'Kabel', so you may wish to use a free font like 'Arial' instead.

WYSIWYG editor

ocPortal uses a third-party WYSIWYG editor – CKEditor.
It has its own translations, which should automatically be linked to your own by the standard ISO language name.

Template/CSS editor

ocPortal uses a third-party code editor – a modified version of EditArea.
You need to make sure you have translated versions of all data/editarea/lang/<lang>.js files. There are quite a few translations already in there.

MySQL collations

MySQL has 'collations', which define how text is compared and sorted; each collation belongs to a particular MySQL character set. ocPortal does not handle these; it uses whatever is there.
This generally does not matter a lot (because anything that you ask to store will be correctly stored and retrieved regardless of collation), but there are two special cases:
  1. It does make a small difference in searches. For example, in languages there are usually 'equivalent' characters (e.g. lower case and upper case), and the MySQL collation tells MySQL about those.
  2. If the charset ocPortal is using does not match what MySQL is using in terms of Unicode vs non-Unicode (e.g. MySQL uses UTF-8 but ocPortal uses ISO-8859-1), then conversion errors can happen, as there are character code sequences that non-Unicode text might use which are totally invalid in Unicode and hence won't be stored at all. Users of English (who have a limited alphabet that is all in lower ASCII and thus interchangeably compatible with both latin1 and UTF-8) would likely not notice this problem, but it becomes a problem for anyone doing internationalisation who has such a Unicode vs non-Unicode conflict.
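
The second case can be reproduced outside MySQL. The sketch below is an illustration in Python rather than anything ocPortal or MySQL runs: it encodes two sample strings as ISO-8859-1 and checks whether the resulting bytes are valid UTF-8. The function name and sample words are assumptions for the example.

```python
def is_valid_utf8(raw):
    """Return True if the bytes form a valid UTF-8 sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

english = "Proceed".encode("iso-8859-1")       # lower ASCII only
french = "Proc\u00e9der".encode("iso-8859-1")  # contains one 'high' byte (0xE9)

print(is_valid_utf8(english))  # True
print(is_valid_utf8(french))   # False
```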

The normal Western European collation (used by English) is 'latin1_swedish_ci'. If anybody wonders why 'Swedish' is used for 'English', it is because English does not use accented characters and hence was considered a subset of Swedish, which does.

When ocPortal 9 or earlier installs, it will use latin1 (equivalent of ISO-8859-1) by default, because this is what the default English language pack uses. MySQL doesn't make it easy to convert the character set of the database, so the best way to do it is to export to an SQL dump, edit the dump in a text editor, and reimport into a new database:
  • Use phpMyAdmin to export all the tables to a .sql file on your computer; if it asks what character set to make the file, choose UTF-8
  • Do whatever you would normally do to back up your database; that will usually mean keeping a copy of the above file (but ensure it is complete before relying on it, as sometimes SQL dumps don't download correctly)
  • Use a text editor to replace all instances of latin1 with utf8 in the file
  • Save the edited file
  • Use phpMyAdmin to drop all tables in your database
  • Use phpMyAdmin to import the edited file
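
If the dump is too large to edit comfortably in a text editor, the find-and-replace step could be scripted. The following is a minimal sketch under the assumption that the dump was exported as UTF-8 (as suggested above); the function name and file names are hypothetical. It writes to a new file so the original export stays intact as a backup.

```python
from pathlib import Path

def convert_dump(src, dst):
    """Write a copy of the SQL dump at `src` to `dst` with every 'latin1'
    replaced by 'utf8'; return the number of replacements made."""
    sql = Path(src).read_text(encoding="utf-8")
    count = sql.count("latin1")
    Path(dst).write_text(sql.replace("latin1", "utf8"), encoding="utf-8")
    return count

# Hypothetical usage:
# convert_dump("site_export.sql", "site_export_utf8.sql")
```

Like the manual replace, this is a blanket substitution, so it will also change the word 'latin1' wherever it happens to appear inside stored content.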

GD fonts

If you find that the vertical text shown on permission editing interfaces is incorrect, it may be due to an incompatibility between PHP and the free Bitstream fonts that ocPortal bundles.
This is known to happen with Russian characters. The solution is to replace the data/fonts/FreeMonoBoldOblique.ttf file with Courier New Bold Italic.ttf from your own computer. We would distribute this file with ocPortal, except we don't have a license to; however if you have a copy of Windows or Mac OS you should have your own licensed copy of this file.
This OcCLE command can also be used to grab it from a URL that works at the time of writing:

Code

:file_put_contents(get_file_base().'/data/fonts/Vera.ttf',http_download_file('http://typo3.org/extensions/repository/fulllist/pdf_generator2_fonts/0.0.1/info/?tx_terfe_pi1%5BdownloadFile%5D=fonts%252Fverdana.ttf&cHash=909a78c3bd'));

Collaborative translations on Launchpad

You can use Launchpad to translate ocPortal into your language with the help of others.

Launchpad is great because:
  • You do not need to feel that you are alone translating everything yourself anymore
  • It's very easy to work together. People can be translating the same language at the same time
  • Anyone can download the current set of translations at any time

The process is as follows:
  1. Go to the Launchpad site.
  2. Sign up
  3. Log in
  4. Set your languages
  5. Start translating (the strings are split across about 60 files; it often works well to work with other people, each doing different files)
  6. It is advisable to translate something inside the 'global' language file before doing any downloading, as ocPortal needs a partially or fully translated 'global' .po file from Launchpad to automatically flip the site into utf-8, which is the encoding Launchpad uses for language strings.

Note that some of the "English" will be written as "English: (English value). Explanation: (Explanation)". This is because Launchpad has no specific way to explain what strings are used for, so when importing we have put the explanation and original English together like this. These particular strings are just documentation strings; they should not be translated.

Also, you should not translate the strings marked as located as follows:
  • Located in [strings]en_left
  • Located in [strings]en_right
This is because the values are used for CSS, not human language. They should only be altered if you're trying to achieve a right-to-left layout (in which case you would switch them around).

Need some help? Try the translation forum.

Downloading translations from Launchpad

  1. There is a link to download the .po files on the page for the version you are translating (it'll archive all files for you in all languages and then e-mail you a download link). Do not download individual .po files, because Launchpad names them in a weird way; download the whole set for the language.
  2. Extract all the files to a single directory (the download from Launchpad uses subdirectories, but ocPortal wants all the .po files together). You only need to extract the files relating to your language but it won't matter if you extract all languages as ocPortal can still find the right ones.
  3. If there are any .ini files with the same name as .po files you are about to place, they will need removing. .ini files take precedence so would block .po files from working. If the .ini files have stuff not in the .po files you should step back and export them to .po and into Launchpad, and restart the downloading process once they've merged back in.
  4. Copy the .po files to the usual language directory, i.e. lang_custom/XX, where XX is the two-letter codename for the language. So, for example, you should have a lang_custom/XX/global-XX.po file. More details are in the Launchpad FAQ.
  5. You may need to create some extra language directories for Comcode pages and template caching if you are not on a suEXEC server: basically, anywhere ocPortal has an EN directory, create a directory for your language pack's two-letter codename too.
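
Steps 2 to 4 above could be scripted along the following lines. This is a hedged sketch, not an official tool: the function name is invented, and it assumes the .po files for a language are named like global-XX.po (as in the example path above). It copies the files into one directory and reports any same-named .ini files that would block them, leaving the decision to remove those to you.

```python
import shutil
from pathlib import Path

def install_po_files(extracted_dir, lang_custom_dir, lang):
    """Copy every '*-<lang>.po' file found under `extracted_dir` (including
    subdirectories) into '<lang_custom_dir>/<lang>/'. Return the names of
    existing .ini files there that would take precedence over a copied .po."""
    target = Path(lang_custom_dir) / lang
    target.mkdir(parents=True, exist_ok=True)
    suffix = f"-{lang}.po"
    blocking = []
    for po in sorted(Path(extracted_dir).rglob(f"*{suffix}")):
        shutil.copy2(po, target / po.name)
        ini = target / (po.name[: -len(suffix)] + ".ini")
        if ini.exists():
            blocking.append(ini.name)  # reported only, never deleted here
    return blocking

# Hypothetical usage:
# print(install_po_files("launchpad-export", "lang_custom", "FR"))
```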

Language packs

Once you have a perfect translation on Launchpad, have imported it into your site, and are completely happy with it, you can export it to a language pack. This is done from the addon section of ocPortal. We encourage you to make these language packs and upload them to ocPortal.com's addon section, to make it easier for people who don't understand Launchpad.

Turning on a different language

Changing the default site language

To change the default language used on the whole site, use the http://yourbaseurl/config_editor.php script (load up the URL, with yourbaseurl substituted with your real base URL).

Language configuration

It is possible to configure ocPortal such that members may select which language to use on your site, and pages are then presented in this language. There are a number of ways a user may choose a language:
  • via the language block (which inserts a keep_lang parameter into the URL, to preserve their choice until they close the browser window)
  • via their member profile (OCF supports this better than other forum drivers, although the integration can be improved by editing the lang/map.ini file)
  • via their web browser stated language (disabled by default, as most users unfortunately have it misconfigured)

As members can select their language by editing their member profile it may be necessary to edit your own profile to the language you're trying to check even if you changed the default, because you might already have your profile saved as the previous different language (usually English).

To test a language without editing anything you can append &keep_lang=FR to the URL (this is an example for French). If the URL did not contain an "?" symbol already you would need to append ?keep_lang=FR instead.
If this confuses you, put the side_language block onto one of your panels. This does the same thing.
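
The "?" versus "&" rule can be captured in a few lines. This is just a helper sketch for building test URLs; the function name and the example.com addresses are placeholders.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def with_keep_lang(url, lang):
    """Return `url` with keep_lang=<lang> added to its query string,
    replacing any keep_lang value that is already there."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "keep_lang"]
    query.append(("keep_lang", lang))
    return urlunsplit(parts._replace(query=urlencode(query)))

print(with_keep_lang("http://example.com/index.php", "FR"))
# http://example.com/index.php?keep_lang=FR
print(with_keep_lang("http://example.com/index.php?page=start", "FR"))
# http://example.com/index.php?page=start&keep_lang=FR
```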

Debugging

If you're having problems getting things working a good early diagnosis step is to check what your site is trying to do. If you look at your page source from inside your web browser, you will see something like the following near the top of the code:

Code

<html lang="EN" dir="ltr">
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />

Here you can see the site is running for you with the language "EN" which is specifying a charset of "ISO-8859-1" and a text direction of "ltr" (left to right).

If this is not what you thought was the case it might just tell you where your problem exists.
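
If you would rather check this from a script than by reading the page source, something like the following sketch would pull out the three values. It uses simple regular expressions matched against markup shaped like the example above, so treat it as a rough diagnostic rather than a general HTML parser; the function name is made up.

```python
import re

def page_language_info(html):
    """Extract lang, dir and charset from the top of a page's HTML source."""
    lang = re.search(r'<html[^>]*\blang="([^"]*)"', html)
    direction = re.search(r'<html[^>]*\bdir="([^"]*)"', html)
    charset = re.search(r'charset=([\w-]+)', html)
    return {
        "lang": lang.group(1) if lang else None,
        "dir": direction.group(1) if direction else None,
        "charset": charset.group(1) if charset else None,
    }

source = '''<html lang="EN" dir="ltr">
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
'''
print(page_language_info(source))
# {'lang': 'EN', 'dir': 'ltr', 'charset': 'ISO-8859-1'}
```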

Criticising language packs

Choosing a language to criticise the translation of

A tool to criticise language packs is provided, to identify what has not been translated, amongst other things. This tool is intended for those who translate language files without using the inbuilt editor, or for those who have upgraded the software and need to update their language packs.

Advanced: Translating content

ocPortal can have its content translated and delivered for each language, without requiring any duplication.

ocPortal's multi-language support automatically becomes available when you have more than one language installed and have the OCF "Enable regionalisation" option enabled.

We need to consider the following cases:
  1. Sending newsletters
  2. Editing theme images
  3. Editing Comcode pages
  4. Using the Zone Editor
  5. Everything else

For '1' (newsletters), you will get a choice of which language to send it for when you go to the newsletter module. Subscribers choose their language when they sign up.

For '2'-'4', you will get a choice of which language to edit under when you go to the respective section of ocPortal. What you save will be saved accordingly.

You will notice for any of '1'-'4', when you choose your language you will temporarily see the website in the language you are working under, until you finish. This is useful, but also ocPortal does it for architectural reasons. Be aware however, that the reason content is saved in a certain language here is due to the language selection you just made, and not necessarily directly related to the language you are viewing. This will be clarified in the next paragraph.

For '5', translation is performed in a special 'Translate content' part of the Admin Zone. It is crucial to understand that it is not performed just by editing content to your own language on normal edit screens. Content added to ocPortal is saved against the language being used by the submitter (except for cases '1' to '4' above). Therefore, when adding content you must ensure you have the right language choice; a good rule of thumb is to check that the language ocPortal uses in its interface matches the language you expect to be submitting content in. When editing content, the content is always saved against the language you see it in when you are editing: if it has been translated already then it will be edited as such, otherwise it will still be in the original submitted language. Never translate from an edit screen. If something is edited (so long as there were actual changes), all translations are automatically marked 'broken', and will be put back into the translation queue.

You will see there is an option in the footer for opening up a 'Translate content' screen just with language strings that were included on the page you are viewing.

In ocPortal almost everything (*) can be translated, but obviously you would not want to translate every forum post for a large community (for example). For this reason, ocPortal saves language with 'priorities', and that of the highest priority will be presented for translation first. For example, the names of zones would be the highest priority, whilst forum posts would be the lowest.

(*) A few things cannot be translated such as forum names. The reasoning is that you do not want such things translated, but rather you should have a different copy of each forum for each language. This is an exceptional situation, and is only designed like this due to the way forums are used. Other kinds of category may be translated as described above.
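
The behaviour described in this section (content saved against a language, translations marked 'broken' on edit, and a queue ordered by priority) can be modelled in a few lines. This is purely a conceptual sketch: the class, field names and priority numbers are invented for illustration and say nothing about how ocPortal stores content internally.

```python
from dataclasses import dataclass, field

@dataclass
class ContentString:
    label: str
    priority: int      # assumption for this sketch: 1 is the highest priority
    source_lang: str   # the language the content was submitted in
    text: str
    translations: dict = field(default_factory=dict)  # lang -> (text, broken)

    def edit(self, new_text):
        """Change the original text; real changes mark every translation broken."""
        if new_text != self.text:
            self.text = new_text
            self.translations = {lang: (old, True)
                                 for lang, (old, _) in self.translations.items()}

def translation_queue(items, lang):
    """Items still needing translation into `lang`, highest priority first."""
    pending = [item for item in items
               if item.source_lang != lang
               and (lang not in item.translations or item.translations[lang][1])]
    return sorted(pending, key=lambda item: item.priority)

zone = ContentString("zone name", 1, "EN", "Welcome")
post = ContentString("forum post", 4, "EN", "Hello all")
zone.translations["FR"] = ("Bienvenue", False)
zone.edit("Welcome back")  # a real change, so the French translation is now broken
print([item.label for item in translation_queue([post, zone], "FR")])
# ['zone name', 'forum post']
```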

Advanced: Non-ISO languages

By default ocPortal supplies language codes and their names, based on the ISO 639 standards. You can add new codes by overriding the lang/langs.ini file to lang_custom/langs.ini if you like. When you add a new language pack in ocPortal you are limited to either choosing an existing code or typing a new one of 1-4 characters – the 1-4 character code would be mapped to a nice name via langs.ini, or if no mapping exists, it would be shown as-is.

Advanced: Right-to-left languages

ocPortal has built-in support for right-to-left languages. You need to change the 'dir', 'en_left' and 'en_right' language strings to activate it.
However there is one issue. Because Comcode is written in English, and punctuation symbols are considered right-to-left punctuation when "automatic bi-directional detection" is enabled, there is a conflict between the desire to type Comcode in English and the desire to type normal right-to-left script.
The following is in our CSS, but commented out:

Code

input[type="text"],textarea { /* So Comcode can be typed */
   unicode-bidi: bidi-override;
   direction: ltr;
}
Uncommenting this makes text input areas work in left-to-right. You can choose to enable it, to make Comcode easier to type, but it will make right-to-left languages harder to type and understand.

We have tried to make our default theme support right-to-left nicely, but unfortunately there are many cases where we could not elegantly do it, because we position things by pixel values on a fixed side rather than in a direction-neutral way. For example, you may see list bullets displaying on the wrong side of a list element. It is caused by CSS like:

Code

ul.compact_list li {
   margin: 0 0 0 17px;
   padding: 0;
}
which would need changing to:

Code

ul.compact_list li {
   margin: 0 17px 0 0;
   padding: 0;
}
Therefore to make things display neatly you will need to make a modified theme that makes these kinds of changes for margin settings, padding settings, and background settings.
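
For the margin and padding cases, part of this work could be automated. The sketch below is a rough helper, not something shipped with ocPortal: it swaps the right and left values of four-value margin/padding shorthands only, and leaves everything else (including background settings and one-, two- or three-value shorthands) for manual review.

```python
import re

# Matches e.g. "margin: 0 0 0 17px;" and captures top, right, bottom, left.
_BOX = re.compile(
    r'\b(margin|padding)\s*:\s*'
    r'([^\s;{}]+)\s+([^\s;{}]+)\s+([^\s;{}]+)\s+([^\s;{}]+)\s*;'
)

def mirror_box_shorthand(css):
    """Swap the right and left values in four-value margin/padding shorthands."""
    return _BOX.sub(
        lambda m: f"{m.group(1)}: {m.group(2)} {m.group(5)} {m.group(4)} {m.group(3)};",
        css,
    )

ltr_css = """ul.compact_list li {
   margin: 0 0 0 17px;
   padding: 0;
}"""
print(mirror_box_shorthand(ltr_css))
```

Run against the rule shown above, this turns the margin into "0 17px 0 0" and leaves the single-value padding alone.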

Advanced: just one language

If you don't want multiple languages, just your own, you can do this. In other words, you can disable the English language pack as a choice (anything non-translated would still be in English, e.g. most of the admin stuff if you didn't translate that).

Two steps are required:
  1. Disable the 'Enable regionalisation' option
  2. Open up http://yoursite/config_editor.php and change the default language

Cheating

If you don't want to worry about a proper translation, but do want to support multiple languages, Google provide some code for Google Translate that you can easily include in your site footer (or header, or a panel) to allow people to translate the site.

Concepts

language string
A piece of text, often a phrase, used by ocPortal; identified by a short code WRITTEN_LIKE_THIS
character set
A mapping between the numbers of the one-byte-per-character representation system and a particular set of characters; by switching between character sets, computers can represent more than 256 characters in total and so show many different language scripts
