
Internationalization, Part 1
by David Flanagan
Editor's note: Writing software that is truly multilingual is not an easy task. In this excerpt from Chapter 8 of Java Examples in a Nutshell, 3rd Edition, author David Flanagan offers real-world programming examples covering the three steps to internationalization in Java. This week, he covers how to use Unicode character encoding and how to handle local customs. Next week's excerpt will cover the third step: localizing user-visible messages.
Internationalization is the process of making a program flexible enough to run correctly in any locale. The required corollary to internationalization is localization, the process of arranging for a program to run in a specific locale.
There are several distinct steps to the task of internationalization. Java (1.1 and later) addresses these steps with several different mechanisms:
- A program must be able to read, write, and manipulate localized text. Java uses the Unicode character encoding, which by itself is a huge step toward internationalization. In addition, the InputStreamReader and OutputStreamWriter classes convert text from a locale-specific encoding to Unicode and from Unicode to a locale-specific encoding, respectively.
- A program must conform to local customs when displaying dates and times, formatting numbers, and sorting strings. Java addresses these issues with the classes in the java.text package.
- A program must display all user-visible text in the local language. Translating the messages a program displays is always one of the main tasks in localizing a program. A more important task is writing the program so that all user-visible text is fetched at runtime, rather than hardcoded directly into the program. Java facilitates this process with the ResourceBundle class and its subclasses in the java.util package.
This chapter discusses all three aspects of internationalization.
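The first step in the list above, converting between a locale-specific encoding and Unicode, might be sketched as follows. This is a minimal illustration rather than code from the book: the class name EncodingSketch and its helper methods are hypothetical, and the choice of ISO-8859-1 and UTF-8 as the two encodings is an assumption made only for the demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the book's examples
public class EncodingSketch {
    /** Decode bytes in the given encoding into a Unicode String. */
    static String decode(byte[] bytes, Charset encoding) {
        StringBuilder text = new StringBuilder();
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(bytes), encoding)) {
            int c;
            while ((c = in.read()) != -1) text.append((char) c);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    /** Encode a Unicode String into bytes in the given encoding. */
    static byte[] encode(String text, Charset encoding) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, encoding)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray(); // the writer was closed (and flushed) above
    }

    public static void main(String[] args) {
        // "cafe" with an accented e, stored as four Latin-1 bytes
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};
        String text = decode(latin1, StandardCharsets.ISO_8859_1);
        byte[] utf8 = encode(text, StandardCharsets.UTF_8);
        // The accented character needs two bytes in UTF-8
        System.out.println(latin1.length + " bytes in, " + utf8.length + " bytes out");
    }
}
```

In a real program the streams would typically wrap files or sockets rather than byte arrays; the in-memory streams here only keep the sketch self-contained.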
A Word About Locales
A locale represents a geographic, political, or cultural region. In Java, locales are represented by the java.util.Locale class. A locale is frequently defined by a language, which is represented by its standard lowercase two-letter code, such as en (English) or fr (French). Sometimes, however, language alone is not sufficient to uniquely specify a locale, and a country is added to the specification. A country is represented by an uppercase two-letter code. For example, the United States English locale (en_US) is distinct from the British English locale (en_GB), and the French spoken in Canada (fr_CA) is different from the French spoken in France (fr_FR). Occasionally, the scope of a locale is further narrowed with the addition of a system-dependent variant string.
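As a rough sketch of how these language and country codes map onto Locale objects, consider the following. The class name LocaleSketch and its of( ) helper are hypothetical names introduced for illustration. The two-argument Locale constructor used here is the classic form; more recent JDKs flag it as deprecated, but it still compiles and behaves as shown.

```java
import java.util.Locale;

// Hypothetical helper class, not part of the book's examples
public class LocaleSketch {
    /** Build a locale from a lowercase language code and an uppercase country code. */
    static Locale of(String language, String country) {
        return new Locale(language, country);
    }

    public static void main(String[] args) {
        Locale us = of("en", "US");
        Locale gb = of("en", "GB");
        Locale canadianFrench = of("fr", "CA");

        System.out.println(canadianFrench);               // prints "fr_CA"
        System.out.println(canadianFrench.getLanguage()); // prints "fr"
        System.out.println(canadianFrench.getCountry());  // prints "CA"

        // Same language, different country: these are distinct locales
        System.out.println(us.equals(gb));                // prints "false"

        // A language-only locale has an empty country code
        System.out.println(Locale.FRENCH.getCountry().isEmpty()); // prints "true"
    }
}
```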
The Locale class maintains a static default locale, which can be set and queried with Locale.setDefault( ) and Locale.getDefault( ). Locale-sensitive methods in Java typically come in two forms. One uses the default locale, and the other uses a Locale object that is explicitly specified as an argument. A program can create and use any number of nondefault Locale objects, although it is more common simply to rely on the default locale, which is inherited from the underlying default locale on the native platform. Locale-sensitive classes in Java often provide a method to query the list of locales that they support.
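The two forms of a locale-sensitive method can be illustrated with java.text.NumberFormat, which offers both a no-argument getInstance( ) and one that takes a Locale. This is only a sketch: the class name LocaleFormsSketch and its helper methods are hypothetical, and NumberFormat is chosen here simply as one convenient locale-sensitive class.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical helper class, not part of the book's examples
public class LocaleFormsSketch {
    /** Form 1: rely on whatever the default locale happens to be. */
    static String formatDefault(double value) {
        return NumberFormat.getInstance().format(value);
    }

    /** Form 2: name the locale explicitly. */
    static String formatFor(double value, Locale locale) {
        return NumberFormat.getInstance(locale).format(value);
    }

    public static void main(String[] args) {
        System.out.println("Default locale: " + Locale.getDefault());
        System.out.println(formatFor(1234567.5, Locale.US));      // prints "1,234,567.5"
        System.out.println(formatFor(1234567.5, Locale.GERMANY)); // prints "1.234.567,5"

        // Changing the default affects the no-argument form
        Locale saved = Locale.getDefault();
        try {
            Locale.setDefault(Locale.GERMANY);
            System.out.println(formatDefault(1234567.5));         // prints "1.234.567,5"
        } finally {
            Locale.setDefault(saved); // restore the platform default
        }

        // Many locale-sensitive classes can list the locales they support
        Locale[] supported = NumberFormat.getAvailableLocales();
        System.out.println(supported.length + " locales supported by NumberFormat");
    }
}
```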
Finally, note that AWT and Swing GUI components (see Chapter 11) have a locale property, so it is possible for different components to use different locales. (Most components, however, are not locale-sensitive; they behave the same in any locale.)
Unicode
Java uses the Unicode character encoding. (Java 1.3 uses Unicode Version 2.1. Support for Unicode 3.0 will be included in Java 1.4 or another future release.) Unicode is a 16-bit character encoding established by the Unicode Consortium, which describes the standard as follows (see http://unicode.org):
The Unicode Standard defines codes for characters used in the major languages written today. Scripts include the European alphabetic scripts, Middle Eastern right-to-left scripts, and scripts of Asia. The Unicode Standard also includes punctuation marks, diacritics, mathematical symbols, technical symbols, arrows, dingbats, etc. ... In all, the Unicode Standard provides codes for 49,194 characters from the world's alphabets, ideograph sets, and symbol collections.
In the canonical form of Unicode encoding, which is what Java char and String types use, every character occupies two bytes. The Unicode characters \u0020 to \u007E are equivalent to the ASCII and ISO8859-1 (Latin-1) characters 0x20 through 0x7E. The Unicode characters \u00A0 to \u00FF are identical to the ISO8859-1 characters 0xA0 to 0xFF. Thus, there is a trivial mapping between Latin-1 and Unicode characters. A number of other portions of the Unicode encoding are based on preexisting standards, such as ISO8859-5 (Cyrillic) and ISO8859-8 (Hebrew), though the mappings between these standards and Unicode may not be as trivial as the Latin-1 mapping.
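The trivial Latin-1 mapping described above can be checked with a short program. This is a sketch under stated assumptions: the class name Latin1MappingSketch and its methods are hypothetical, and UTF-16BE is used as a stand-in for the two-bytes-per-character form, which holds for the characters tested here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the book's examples
public class Latin1MappingSketch {
    /** Decode a single Latin-1 byte using the ISO-8859-1 charset. */
    static char decodeLatin1(int b) {
        return new String(new byte[] {(byte) b}, StandardCharsets.ISO_8859_1).charAt(0);
    }

    /** True if every byte in [from, to] decodes to the character with the same numeric value. */
    static boolean rangeIsTrivial(int from, int to) {
        for (int b = from; b <= to; b++) {
            if (decodeLatin1(b) != (char) b) return false;
        }
        return true;
    }

    /** Check both ranges mentioned in the text: 0x20-0x7E and 0xA0-0xFF. */
    static boolean mappingIsTrivial() {
        return rangeIsTrivial(0x20, 0x7E) && rangeIsTrivial(0xA0, 0xFF);
    }

    /** Average number of bytes per char when the text is written as UTF-16 (big-endian, no byte-order mark). */
    static int utf16BytesPerChar(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length / text.length();
    }

    public static void main(String[] args) {
        System.out.println(mappingIsTrivial());            // prints "true"
        System.out.println(utf16BytesPerChar("A\u00E9"));  // prints "2"
    }
}
```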
Note that Unicode support may be limited on many platforms. One of the difficulties with the use of Unicode is the poor availability of fonts to display all the Unicode characters. Figure 8-1 shows some of the characters that are available in the standard fonts that ship with Sun's Java 1.3 SDK for Linux. (Note that these fonts do not ship with the Java JRE, so even if they are available on your development platform, they may not be available on your target platform.) Note the special box glyph that indicates undefined characters.
Figure 8-1. Some Unicode characters and their encodings
Example 8-1 lists the code used to create the displays of Figure 8-1. Because Unicode characters are integrated so fundamentally into the Java language, this UnicodeDisplay program does not need any sophisticated internationalization techniques to display Unicode glyphs. Thus, you'll find that Example 8-1 is more of a Swing GUI example than an internationalization example. If you haven't read Chapter 11 yet, you may not understand all the code in this example.
Example 8-1. UnicodeDisplay.java
package je3.i18n;
import javax.swing.*;
import java.awt.*;
import java.awt.event.*;

/**
 * This program displays Unicode glyphs using user-specified fonts
 * and font styles.
 **/
public class UnicodeDisplay extends JFrame implements ActionListener {
    int page = 0;
    UnicodePanel p;
    JScrollBar b;
    String fontfamily = "Serif";
    int fontstyle = Font.PLAIN;

    /**
     * This constructor creates the frame, menubar, and scrollbar
     * that work along with the UnicodePanel class, defined below
     **/
    public UnicodeDisplay(String name) {
        super(name);
        p = new UnicodePanel( );                // Create the panel
        p.setBase((char)(page * 0x100));        // Initialize it
        getContentPane( ).add(p, "Center");     // Center it

        // Create and set up a scrollbar, and put it on the right.
        // The largest value a scrollbar can take is max - extent, so the
        // maximum is 0x100 to make the last page (0xFF) reachable.
        b = new JScrollBar(JScrollBar.VERTICAL, 0, 1, 0, 0x100);
        b.setUnitIncrement(1);
        b.setBlockIncrement(0x10);
        b.addAdjustmentListener(new AdjustmentListener( ) {
            public void adjustmentValueChanged(AdjustmentEvent e) {
                page = e.getValue( );
                p.setBase((char)(page * 0x100));
            }
        });
        getContentPane( ).add(b, "East");

        // Set things up so we respond to window close requests
        this.addWindowListener(new WindowAdapter( ) {
            public void windowClosing(WindowEvent e) { System.exit(0); }
        });

        // Handle Page Up and Page Down and the up and down arrow keys
        this.addKeyListener(new KeyAdapter( ) {
            public void keyPressed(KeyEvent e) {
                int code = e.getKeyCode( );
                int oldpage = page;
                if ((code == KeyEvent.VK_PAGE_UP) ||
                    (code == KeyEvent.VK_UP)) {
                    if (e.isShiftDown( )) page -= 0x10;
                    else page -= 1;
                    if (page < 0) page = 0;
                }
                else if ((code == KeyEvent.VK_PAGE_DOWN) ||
                         (code == KeyEvent.VK_DOWN)) {
                    if (e.isShiftDown( )) page += 0x10;
                    else page += 1;
                    if (page > 0xff) page = 0xff;
                }
                if (page != oldpage) {    // if anything has changed...
                    p.setBase((char) (page * 0x100)); // update the display
                    b.setValue(page);     // and update scrollbar to match
                }
            }
        });

        // Set up a menu system to change fonts. Use a convenience method.
        JMenuBar menubar = new JMenuBar( );
        this.setJMenuBar(menubar);
        menubar.add(makemenu("Font Family",
                             new String[ ] {"Serif", "SansSerif", "Monospaced"},
                             this));
        menubar.add(makemenu("Font Style",
                             new String[ ]{
                                 "Plain","Italic","Bold","BoldItalic"
                             }, this));
    }

    /** This method handles the items in the menubars */
    public void actionPerformed(ActionEvent e) {
        String cmd = e.getActionCommand( );
        if (cmd.equals("Serif")) fontfamily = "Serif";
        else if (cmd.equals("SansSerif")) fontfamily = "SansSerif";
        else if (cmd.equals("Monospaced")) fontfamily = "Monospaced";
        else if (cmd.equals("Plain")) fontstyle = Font.PLAIN;
        else if (cmd.equals("Italic")) fontstyle = Font.ITALIC;
        else if (cmd.equals("Bold")) fontstyle = Font.BOLD;
        else if (cmd.equals("BoldItalic")) fontstyle = Font.BOLD + Font.ITALIC;
        p.setFont(fontfamily, fontstyle);
    }

    /** A convenience method to create a Menu from an array of items */
    private JMenu makemenu(String name, String[ ] itemnames,
                           ActionListener listener)
    {
        JMenu m = new JMenu(name);
        for(int i = 0; i < itemnames.length; i++) {
            JMenuItem item = new JMenuItem(itemnames[i]);
            item.addActionListener(listener);
            item.setActionCommand(itemnames[i]);  // okay here, though
            m.add(item);
        }
        return m;
    }

    /** The main( ) program just creates a window, packs it, and makes it visible */
    public static void main(String[ ] args) {
        UnicodeDisplay f = new UnicodeDisplay("Unicode Displayer");
        f.pack( );
        f.setVisible(true);   // replaces the deprecated show( )
    }

    /**
     * This nested class is the one that displays one "page" of Unicode
     * glyphs at a time. Each "page" is 256 characters, arranged into 16
     * rows of 16 columns each.
     **/
    public static class UnicodePanel extends JComponent {
        protected char base;  // What character we start the display at
        protected Font font = new Font("serif", Font.PLAIN, 18);
        protected Font headingfont = new Font("monospaced", Font.BOLD, 18);
        static final int lineheight = 25;
        static final int charspacing = 20;
        static final int x0 = 65;
        static final int y0 = 40;

        /** Specify where to begin displaying, and redisplay */
        public void setBase(char base) { this.base = base; repaint( ); }

        /** Set a new font name or style, and redisplay */
        public void setFont(String family, int style) {
            this.font = new Font(family, style, 18);
            repaint( );
        }

        /**
         * The paintComponent( ) method actually draws the page of glyphs
         **/
        public void paintComponent(Graphics g) {
            int start = (int)base & 0xFFF0; // Start on a 16-character boundary

            // Draw the headings in a special font
            g.setFont(headingfont);

            // Draw 0..F on top
            for(int i=0; i < 16; i++) {
                String s = Integer.toString(i, 16);
                g.drawString(s, x0 + i*charspacing, y0-20);
            }

            // Draw column down left.
            for(int i = 0; i < 16; i++) {
                int j = start + i*16;
                String s = Integer.toString(j, 16);
                g.drawString(s, 10, y0+i*lineheight);
            }

            // Now draw the characters
            g.setFont(font);
            char[ ] c = new char[1];
            for(int i = 0; i < 16; i++) {
                for(int j = 0; j < 16; j++) {
                    c[0] = (char)(start + j*16 + i);
                    g.drawChars(c, 0, 1, x0 + i*charspacing, y0+j*lineheight);
                }
            }
        }

        /** Custom components like this one should always have this method */
        public Dimension getPreferredSize( ) {
            return new Dimension(x0 + 16*charspacing,
                                 y0 + 16*lineheight);
        }
    }
}
