Written by ninadgawad on March 2nd, 2008
Implementing Websites For Internationalization
Most businesses in today's open market want their websites and web applications to offer multilingual support. The aim is to attract non-English-speaking audiences, grow the customer base, and let people use the application comfortably in their native language. To get there, an existing business application has to be adapted to support the user's locale. The process of producing an application that can be localized for a particular country without any changes to the program code is called internationalization.
The following is a list of things to take care of when internationalizing an application.
1. Producing Content in Unicode
Make sure that every file containing multilingual content is saved in a Unicode encoding. By default many text editors save files in ASCII or a legacy single-byte encoding (Notepad calls it ANSI), which cannot represent the characters of most other languages. If content is saved that way, the browser may not display the native-language text correctly when the page is loaded. Such files need to be converted to UTF-8.
For example, if you are using Notepad, the steps are:
- Go to File > Save As > Encoding
- Change Encoding from ANSI to UTF-8
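When many files need the same treatment, the Notepad steps can be scripted. The following is a minimal sketch, assuming the files were originally saved as ISO-8859-1; the class name EncodingConverter and the method toUtf8 are hypothetical names chosen for illustration. The sample text uses a Unicode escape so that the example itself does not depend on how the source file is saved.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any framework: re-encodes a text file as UTF-8.
final class EncodingConverter {

    // Reads 'source' using the charset it was saved in and writes
    // the same text to 'target' encoded as UTF-8.
    static void toUtf8(Path source, Charset sourceCharset, Path target) throws IOException {
        String text = new String(Files.readAllBytes(source), sourceCharset);
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Simulate a file that an editor saved as ISO-8859-1 (an assumption for this demo).
        Path legacy = Files.createTempFile("content", ".txt");
        Files.write(legacy, "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1));

        Path converted = Files.createTempFile("content-utf8", ".txt");
        toUtf8(legacy, StandardCharsets.ISO_8859_1, converted);

        // The accented letter takes one byte in ISO-8859-1 and two bytes in UTF-8.
        System.out.println("Legacy size: " + Files.size(legacy));
        System.out.println("UTF-8 size:  " + Files.size(converted));
    }
}
```

The conversion only works if the source charset is known; guessing it wrong produces garbled text that is then faithfully stored as UTF-8.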
2. Files That Need to Be in UTF-8
Each properties file should carry a suffix that identifies the language of its content. For example,
- For English, content_en.properties; for French, content_fr.properties; and so on.
- Most Java frameworks, such as Struts and Spring, have built-in support for detecting the JVM or browser locale and selecting the matching properties file by its suffix. You don't need to write any code to detect the setting and pick the file.
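To make the naming convention concrete, here is a small sketch of what such a lookup amounts to. It is not how Struts or Spring implement it; MessageLoader, fileNameFor and load are hypothetical names, and the directory layout is an assumption. The file is read through a Reader with an explicit UTF-8 charset because Properties.load(InputStream) assumes ISO-8859-1.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Properties;

// Hypothetical helper illustrating the suffix convention; frameworks do this for you.
final class MessageLoader {

    // Builds the file name for a base name and locale,
    // e.g. "content" with the French locale gives "content_fr.properties".
    static String fileNameFor(String baseName, Locale locale) {
        return baseName + "_" + locale.getLanguage() + ".properties";
    }

    // Loads the properties file for the locale from 'dir', decoding it as UTF-8.
    static Properties load(Path dir, String baseName, Locale locale) throws IOException {
        Properties messages = new Properties();
        Path file = dir.resolve(fileNameFor(baseName, locale));
        try (Reader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            messages.load(reader);
        }
        return messages;
    }

    public static void main(String[] args) throws IOException {
        // Create a sample French file in a temporary directory (demo data only).
        Path dir = Files.createTempDirectory("i18n");
        Files.write(dir.resolve("content_fr.properties"),
                "greeting=Bienvenue \u00e0 tous\n".getBytes(StandardCharsets.UTF_8));

        Properties fr = load(dir, "content", Locale.FRENCH);
        System.out.println(fr.getProperty("greeting"));
    }
}
```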
Make sure that the JVM is started with the parameter -Dfile.encoding=UTF-8.
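A quick way to confirm what the JVM actually picked up is to print its default charset at startup. The sketch below does that, and also shows the more defensive habit of naming the charset explicitly whenever text is turned into bytes, which removes the dependency on the flag. CharsetCheck and encodeExplicitly are hypothetical names. On JDK 18 and later UTF-8 is already the default, so the flag mainly matters on older runtimes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical startup check, for illustration only.
final class CharsetCheck {

    // Encodes with an explicitly named charset, so the result
    // does not depend on the file.encoding setting.
    static byte[] encodeExplicitly(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The charset the JVM falls back to when none is given.
        System.out.println("Default charset: " + Charset.defaultCharset());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));

        // One accented letter occupies two bytes in UTF-8.
        System.out.println("Bytes: " + encodeExplicitly("\u00e9").length);
    }
}
```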
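Set the charset attribute of the SCRIPT tag to "UTF-8". For example, with messages.js as a placeholder file name,
<script type="text/javascript" src="messages.js" charset="UTF-8"></script>
Use the steps from the first example to make sure the content of your XML files is in UTF-8 format.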
Add this line to the head of all HTML pages.
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
Set the Content-Type explicitly in your server-side scripts. To do this, add this line to all your JSP pages:
<%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8" %>
3. Configuring the Web Server
Your web server has to be configured so that the Content-Type header carries the UTF-8 charset, since many default configurations make the server send ISO-8859-1 in that header instead. For example:
- In Apache Web Server, edit httpd.conf to set AddDefaultCharset Off, so that Apache does not add its own default charset to responses.
- In Tomcat, set URIEncoding="UTF-8" on the Connector element in server.xml.
4. Handling Request & Response Objects
Before reading request parameters or writing to the response, set the character encoding to UTF-8. This can be handled in a filter, so that everything coming in and going out uses UTF-8.
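In a servlet filter this comes down to calling request.setCharacterEncoding("UTF-8") before the first getParameter call and response.setCharacterEncoding("UTF-8") before the writer is obtained. The sketch below is not a filter; it uses only the JDK so that it runs without a servlet container, and it shows why the encoding has to be fixed before parameters are read: the same submitted bytes decode to different text depending on the charset. ParameterDecoding is a hypothetical class name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo of what happens when a parameter is decoded with the wrong charset.
final class ParameterDecoding {

    // Decodes a percent-encoded parameter value using the given charset name.
    static String decode(String rawValue, String charsetName) throws UnsupportedEncodingException {
        return URLDecoder.decode(rawValue, charsetName);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // An accented e submitted by a UTF-8 page arrives as the two bytes C3 A9.
        String raw = "caf%C3%A9";

        // Decoded as UTF-8 the two bytes become one character: 4 characters in total.
        System.out.println(decode(raw, "UTF-8").length());

        // Decoded as ISO-8859-1 each byte becomes its own character: 5 characters, garbled.
        System.out.println(decode(raw, "ISO-8859-1").length());
    }
}
```

Once a container has parsed the parameters with the wrong charset, later calls to set the encoding have no effect on them, which is why a filter at the front of the chain is the usual place for this.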
5. Creating Database with Unicode encoding
When using a database, make sure it is created with a Unicode encoding. Most database servers, such as Oracle, PostgreSQL, MySQL and MSSQL, support this. Note that in MySQL the utf8 character set stores at most three bytes per character; characters outside the Basic Multilingual Plane need utf8mb4. To create a MySQL database with Unicode encoding, use the following query:
CREATE DATABASE MyWebApps DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Once all the above steps are implemented, your web application should work smoothly for all languages that UTF-8 supports. In some cases you will also have to take care of locale-sensitive formatting with NumberFormat, DecimalFormat and DateFormat. Sun provides some FAQs on internationalization to get you started. Feel free to post any questions about internationalization in the comments; I will be glad to help you out.
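On the NumberFormat and DateFormat point, a minimal sketch of locale-sensitive formatting follows. LocaleFormatting and its methods are hypothetical names, and the two locales are arbitrary picks for the demo. The exact date text depends on the JDK's locale data, so no output is shown for it.

```java
import java.text.DateFormat;
import java.text.NumberFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical demo of formatting the same values for different locales.
final class LocaleFormatting {

    // Formats a number with the grouping and decimal separators of the locale.
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Formats a date in the long style of the locale.
    static String formatDate(Date date, Locale locale) {
        return DateFormat.getDateInstance(DateFormat.LONG, locale).format(date);
    }

    public static void main(String[] args) {
        double amount = 1234567.5;
        System.out.println(formatNumber(amount, Locale.US));      // 1,234,567.5
        System.out.println(formatNumber(amount, Locale.GERMANY)); // 1.234.567,5

        Date now = new Date();
        System.out.println(formatDate(now, Locale.US));
        System.out.println(formatDate(now, Locale.GERMANY));
    }
}
```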
(Ninad Gawad is a Java EE Developer who blogs at Technology Discussion)