Internationalization: Understanding Locale in the Java Platform

By John O'Conner,
September 20, 2005

Language and geographic environment are two important influences on our culture. They create the system in which we interpret other people and events in our life. They also affect, even define, proper form for presenting ourselves and our thoughts to others. To communicate effectively with another person, we must consider and use that person's culture, language, and environment.

Similarly, a software system should respect its users' language and geographic region to be effective. Language and region form a locale, which represents the target setting and context for localized software. The Java platform uses java.util.Locale objects to represent locales. This article describes the Locale object and its implications for programs written for the Java platform.

Locales identify a specific language and geographic region. Locale-sensitive objects use java.util.Locale objects to customize how they present and format data to the user. Locales affect user interface language, case mapping, collation (sorting), date and time formats, and number and currency formats. Locales are critical to many culturally and linguistically sensitive data operations.

A java.util.Locale is a lightweight object that contains only a few important members:

  • A language code
  • An optional country or region code
  • An optional variant code

When writing or talking about locales, you can use a text abbreviation for a convenient representation. This notation separates each component of a locale with an underscore character:

<language code>[_<country code>[_<variant code>]]

These three elements provide enough information to other locale-sensitive objects so that they can modify their behavior for a specific linguistic or cultural purpose. For example, a java.text.NumberFormat object created for a German-speaking Swiss locale will format numbers differently than it would for a German-speaking Austrian locale. See Table 1.

Table 1. Formatted Output Varies by Locale

Locale                    Formatted Numbers
German (Germany)          123.456,789
German (Switzerland)      123'456.789
English (United States)   123,456.789

Because Locale objects are just identifiers, locale-sensitive classes like java.text.NumberFormat or java.text.DateFormat do all the work to provide localized number or date formats. The java.text.DateFormat class, for example, uses a Locale object during its instantiation to decide how to format dates correctly.
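
As a minimal sketch of the behavior in Table 1, the following class formats one value for three locales. The class name LocaleNumberDemo and its format helper are illustrative and are not part of the platform. The Swiss grouping separator depends on the locale data shipped with your runtime, so that line may differ slightly from the table.

```java
import java.text.NumberFormat;
import java.util.Locale;

class LocaleNumberDemo {
    // Formats the value with the number format that belongs to the given locale.
    static String format(Locale locale, double value) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        return nf.format(value);
    }

    public static void main(String[] args) {
        double value = 123456.789;
        Locale[] locales = { Locale.GERMANY, new Locale("de", "CH"), Locale.US };
        for (Locale locale : locales) {
            // The Locale only identifies the target; NumberFormat does the work.
            System.out.println(locale + ": " + format(locale, value));
        }
    }
}
```

The de_DE line should read 123.456,789 and the en_US line 123,456.789, matching the table.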

The following sections describe each component of a locale.

Language Codes

Language codes are defined by ISO 639, an international standard that assigns two- and three-letter codes to most languages of the world. Locale uses the two-letter codes to identify the target language. Table 2 lists several of these language codes.

Table 2. Language Code Examples in the ISO 639 Standard

Language   Code
Arabic     ar
German     de
English    en
Spanish    es
Japanese   ja
Hebrew     he

Language is an important component of a locale because it describes the language that a particular group of customers uses. Your applications will use this information to provide a user interface that conforms to your customer's language.

Of course, language doesn't paint the entire picture of a locale. For example, even though you may use de as the locale language code, de alone doesn't tell you anything about where German is spoken. Several countries use German as an official first or even second language. One of the differences in German from one country to another is sorting order. For this reason and others, language is not always sufficient to precisely define a locale.

Country (Region) Codes

Country codes are defined by ISO 3166, another international standard. It defines two- and three-letter abbreviations for each country or major region in the world. In contrast to the language codes, country codes are written in uppercase. Locale uses the two-letter codes instead of the three-letter codes that this standard also defines. Table 3 shows a few of the defined codes.

Table 3. Some Country Codes Defined in the ISO 3166 Standard

Country         Code
United States   US
Canada          CA
France          FR
Japan           JP
Germany         DE

Country codes are an important locale component because java.text.Format objects for dates, time, numbers, and currency are particularly sensitive to this element. Country codes add precision to the language component of a locale. For example, French is used in both France and Canada. However, precise usage and idiomatic expressions vary in the two countries. These differences can be captured with different locale designators in which only the country code is different. For example, the code fr_CA (French-speaking Canada) is different from fr_FR (French-speaking France).
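
One way to see the country code at work is to ask for the currency of two locales that share a language. This is a small sketch that assumes java.util.Currency is available in your runtime; the class name CountryCurrencyDemo and the currencyCode helper are illustrative.

```java
import java.util.Currency;
import java.util.Locale;

class CountryCurrencyDemo {
    // Returns the currency code associated with the locale's country.
    static String currencyCode(Locale locale) {
        return Currency.getInstance(locale).getCurrencyCode();
    }

    public static void main(String[] args) {
        Locale frCanada = new Locale("fr", "CA");
        Locale frFrance = new Locale("fr", "FR");
        // Same language, different country, different currency.
        System.out.println(frCanada + " -> " + currencyCode(frCanada)); // fr_CA -> CAD
        System.out.println(frFrance + " -> " + currencyCode(frFrance)); // fr_FR -> EUR
    }
}
```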

Variant Code

Operating system (OS), browser, and other software vendors can use the optional variant code to provide additional functionality or customization that isn't possible with just a language and country designation. For example, a software company may need to indicate a locale for a specific operating system, so its developers may create an es_ES_MAC or an es_ES_WIN locale for the Macintosh or Windows platforms for customers in Spain.

One historical example from the Java platform itself is the use of the EURO variant for European locales that use the euro currency. During the transition period for those countries, the Java 2 Platform, Standard Edition (J2SE) version 1.3 used this variant. For example, although a de_DE (German-speaking Germany) locale existed, a de_DE_EURO (German-speaking Germany locale with a euro variant) was added to the Java environment. Because the euro currency is now the standard for the affected countries, those variants were removed in J2SE 1.4. Most application designs will probably not require variant codes.


The Locale class has several constructors:

  • Locale(String language)
  • Locale(String language, String country)
  • Locale(String language, String country, String variant)

The following shows how each constructor can be used:

  // Create a generic English-speaking locale.
  Locale locale1 = new Locale("en");
  // Create an English-speaking, Canadian locale.
  Locale locale2 = new Locale("en", "CA");
  // Create a very specific English-speaking, U.S. locale
  // for Silicon Valley.
  Locale locale3 = new Locale("en", "US", "SiliconValley");

The en code is the ISO 639 two-letter code for English. The ISO 3166 codes CA and US represent Canada and the United States, respectively. The last line of code above shows how to create a locale with an optional variant: an en_US_SiliconValley locale. This locale is more specific than the first two. Not only is the locale a U.S. English-speaking one, but it is also associated with an additional variant, SiliconValley. Because one purpose of the variant is to let developers define custom locales, the variant can be anything you need.
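
To check what each constructor stored, you can read the three components back. The following sketch uses only the accessor methods of Locale; the class name LocaleComponentsDemo and the describe helper are made up for illustration.

```java
import java.util.Locale;

class LocaleComponentsDemo {
    // Lists the three components of a locale; missing ones are empty strings.
    static String describe(Locale locale) {
        return "language=" + locale.getLanguage()
             + ", country=" + locale.getCountry()
             + ", variant=" + locale.getVariant();
    }

    public static void main(String[] args) {
        System.out.println(describe(new Locale("en")));
        System.out.println(describe(new Locale("en", "CA")));
        // Prints: language=en, country=US, variant=SiliconValley
        System.out.println(describe(new Locale("en", "US", "SiliconValley")));
    }
}
```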

Although the compiler and runtime environment won't complain if you make up your own language and country or region identifiers, you should use only the codes that the ISO standards define. By constraining yourself to the ISO definitions, you'll ensure compatibility with other applications and coding standards. More importantly, locale-sensitive class libraries use only the ISO codes. For example, the java.text.NumberFormat class will understand how to behave for the de_DE (German language, Germany) locale, but it won't understand what to do with a fictitious foo_biz locale. If you use non-ISO identifiers, you will have to write the locale-sensitive code to support them.
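
Because the constructors accept any strings, a defensive check can be useful. The sketch below is one possible approach, not a required pattern: it compares the codes against the arrays returned by Locale.getISOLanguages() and Locale.getISOCountries(). The class name IsoCodeCheck and the usesIsoCodes helper are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class IsoCodeCheck {
    // Two-letter codes known to this runtime.
    private static final Set<String> LANGUAGES =
            new HashSet<>(Arrays.asList(Locale.getISOLanguages()));
    private static final Set<String> COUNTRIES =
            new HashSet<>(Arrays.asList(Locale.getISOCountries()));

    // True when both codes appear in the ISO lists that the runtime reports.
    static boolean usesIsoCodes(String language, String country) {
        return LANGUAGES.contains(language) && COUNTRIES.contains(country);
    }

    public static void main(String[] args) {
        System.out.println("de_DE:   " + usesIsoCodes("de", "DE"));   // true
        System.out.println("foo_biz: " + usesIsoCodes("foo", "biz")); // false
    }
}
```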

Preconstructed Locales

The Locale class has many static fields that represent instantiated Locale objects. For example, Locale.FRANCE is a premade static Locale object that represents French-speaking France. You can use Locale.FRANCE anywhere you might use new Locale("fr", "FR"). Table 4 shows some of the preconstructed Locale objects that are available.

Table 4. Some of the Preconstructed Locale Objects

Locale Name      Locale
Locale.CHINA     zh_CN
Locale.CHINESE   zh
Locale.PRC       zh_CN
Locale.TAIWAN    zh_TW
Locale.ENGLISH   en
Locale.UK        en_GB
Locale.US        en_US
Locale.FRANCE    fr_FR
Locale.FRENCH    fr
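
A quick way to confirm that a constant and a constructed locale are interchangeable is to compare them with equals(). This is only an illustrative sketch; the class and method names are not part of the platform.

```java
import java.util.Locale;

class PreconstructedLocaleDemo {
    // Compares the premade constant with an explicitly constructed locale.
    static boolean franceMatchesConstructor() {
        return Locale.FRANCE.equals(new Locale("fr", "FR"));
    }

    public static void main(String[] args) {
        System.out.println(Locale.FRANCE);                    // fr_FR
        System.out.println(franceMatchesConstructor());       // true
        // PRC and CHINA are two names for the same zh_CN locale.
        System.out.println(Locale.PRC.equals(Locale.CHINA));  // true
    }
}
```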

Preconstructed locales exist for convenience. However, the list of static class constants is small and incomplete. Not every important locale is represented. Locale-sensitive class support is not determined by whether the locale exists as a Locale class constant. For example, no South American Locale class constants exist, yet South American locales have complete support in several locale-sensitive classes, including DateFormat and NumberFormat.

Because so few premade locales exist, you should probably just avoid these static objects altogether. I have described them here because they exist and because you may see them in someone else's code. Although these shortcut identifiers are convenient, your code will be more consistent without them.

Identifying Supported Locales

What locales does the Java platform support? You can create any locale that you'd like. However, your runtime environment may not fully support the Locale object you create.

If you want to know what Locale objects you can create, the answer is simple: You can create any locale you'd like. The constructors won't complain about non-ISO arguments. However, a more helpful restatement of the question is this: For which locales do the class libraries provide more extensive information? For which locales can the libraries provide collation, time, date, number, and currency information? Also, you might ask what scripts or writing systems your runtime environment supports.

The following sections describe how to determine supported locales in the runtime libraries. Additionally, they describe the supported scripts that can be displayed by the text components. Finally, they enumerate the localizations that are available for the runtime libraries and the software development kit (SDK) itself.

Enabled Locales in java.util and java.text Packages

The runtime environment has no requirement that all locales be supported equally by every locale-sensitive class. Every locale-sensitive class implements its own support for a set of locales, and that set can be different from class to class. For example, a number-format class can support a different set of locales than can a date-format class.

Additionally, there is no requirement that all runtime implementations support the same set of locales. But all implementations must support a minimal list of them. This list is quite short: English (U.S.). Fortunately, the runtime environment that Sun provides is much more extensive. Although it is not formally required, Sun's runtime implementation supports the same set of locales in each of its locale-sensitive data format classes. This provides a consistent set of support across classes. The J2SE 5.0 platform's Supported Locales guide provides a full list of all supported locales. Table 5 shows a few of the supported locales.

Table 5. Some of the Supported Locales in java.util and java.text Packages

Language                Country          Locale ID
Arabic                  Saudi Arabia     ar_SA
Chinese (simplified)    China            zh_CN
Chinese (traditional)   Taiwan           zh_TW
Dutch                   Netherlands      nl_NL
English                 Australia        en_AU
English                 Canada           en_CA
English                 United Kingdom   en_GB
English                 United States    en_US
French                  Canada           fr_CA
French                  France           fr_FR
German                  Germany          de_DE
Hebrew                  Israel           he_IL
Hindi                   India            hi_IN
Italian                 Italy            it_IT
Japanese                Japan            ja_JP
Korean                  South Korea      ko_KR
Portuguese              Brazil           pt_BR
Spanish                 Spain            es_ES
Swedish                 Sweden           sv_SE
Thai (Western digits)   Thailand         th_TH
Thai (Thai digits)      Thailand         th_TH_TH

To find out what locales your Java Runtime Environment (JRE) supports, you have to ask each locale-sensitive class. Each class that supports multiple locales implements the getAvailableLocales() method. For example:

Locale[] localeList = NumberFormat.getAvailableLocales();

The getAvailableLocales() method is implemented in many of the classes in the java.text and java.util packages. For example, NumberFormat, DateFormat, Calendar, and BreakIterator provide it.
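
The following sketch shows one way to test whether a particular class supports a particular locale. It assumes nothing beyond NumberFormat.getAvailableLocales(); the class name AvailableLocalesDemo and the numberFormatSupports helper are illustrative.

```java
import java.text.NumberFormat;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class AvailableLocalesDemo {
    // Asks NumberFormat whether it has localized data for the given locale.
    static boolean numberFormatSupports(Locale locale) {
        List<Locale> available = Arrays.asList(NumberFormat.getAvailableLocales());
        return available.contains(locale);
    }

    public static void main(String[] args) {
        // The count varies from one runtime implementation to another.
        System.out.println("Locales: " + NumberFormat.getAvailableLocales().length);
        System.out.println("en_US supported:   " + numberFormatSupports(Locale.US));
        System.out.println("foo_BIZ supported: " + numberFormatSupports(new Locale("foo", "BIZ")));
    }
}
```

The same check can be repeated for DateFormat, Calendar, or any other class that provides getAvailableLocales(), because each class answers only for itself.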

The Locale class itself is localized for several locales. In the example below, a German locale instance provides information about itself in English (the default on the author's host), German, and French:

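  Locale deLocale = new Locale("de", "DE");
  Locale frLocale = new Locale("fr", "FR");
  // Each call names the same language, localized for a different target locale.
  System.out.println("German language name (default): " +
                      deLocale.getDisplayLanguage());
  System.out.println("German language name (German): " +
                      deLocale.getDisplayLanguage(deLocale));
  System.out.println("German language name (French): " +
                      deLocale.getDisplayLanguage(frLocale));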

The output would look like this:

  German language name (default): German
  German language name (German): Deutsch
  German language name (French): allemand

Enabled Script Support

Text components don't usually support single locales. Instead, text widgets support a set of scripts that often cross locale boundaries. Although it is impossible to get a listing of supported scripts from the various text components themselves, the list is published as part of the J2SE 5.0 platform's Supported Locales guide.

In general, peered AWT components can render scripts that the underlying host supports. If your host system is localized to Arabic, the AWT text components will display Arabic. On an Arabic system, you would also be able to enter Arabic text into components like TextField or TextArea. However, you cannot expect those same AWT components to display text that isn't in the same script as the localized host. An English host would not typically be able to display Arabic in a TextField, for example. Java Foundation Classes/Swing (JFC/Swing) components, however, can often support multiple scripts because of their independence from the host platform and their use of Unicode as a multiscript character set. Therefore, Swing components can often display a script even when peered AWT components cannot. Table 6 shows some of the supported scripts.

Table 6. Some of the Supported Scripts for Text Display

Writing System                   Language
Arabic                           Arabic
Chinese (simplified)             Chinese
Chinese (traditional)            Chinese
Devanagari                       Hindi
Hebrew                           Hebrew
Japanese                         Japanese
Korean                           Korean
Latin: Western European subset   English, French, German, Italian, Spanish, Swedish, and so on
Thai                             Thai
Greek                            Greek
Cyrillic                         Belorussian, Russian, and so on
Latin: Baltic subset             Latvian, Lithuanian
Latin: Central European subset   Czech, Hungarian, Polish, and so on
Latin: Turkic subset             Turkish and so on

JRE and SDK Localizations

User interface elements in the runtime environment have been localized for several locales. These elements include AWT and Swing components and other messages generated by the JRE and included tools. Table 7 shows all the localizations provided for J2SE 5.0.

Table 7. User Interface Translations for the JRE

Language                Locale ID
Chinese (simplified)    zh_CN
Chinese (traditional)   zh_TW
English                 en
French                  fr
German                  de
Italian                 it
Japanese                ja
Korean                  ko
Spanish                 es
Swedish                 sv

Some tools, such as the javac compiler, come only with the J2SE Software Development Kit (SDK). These tools provide error, warning, and informational messages to the user. The messages in the SDK utilities and tools, including the compiler, are available in English and Japanese. These translations ship with the J2SE 5.0 SDK.

Representing Locale as a String

Although most of your use of locale will require a Locale object reference, it is sometimes convenient to use an alternative representation, especially for internal debugging purposes. A Locale object's toString() method returns a String that is the concatenation of the language, region, and variant codes. The toString() method separates each component with the underscore character, _. You may want to use this method during debugging because it provides a convenient and readable representation.

Consider the locale created with this code:

Locale l = new Locale("ja", "JP");

The toString() method will return ja_JP for this locale.

This string is not appropriate for presentation to most end users. Most customers are not familiar with the ISO 639 and 3166 standards for language and country codes and will think the string is too cryptic. Fortunately, more user-friendly text representations are available, and we will discuss those later in this article.

Using Locale

Although not always obvious, Locale objects are used throughout the Java class libraries. Even when you don't explicitly ask for a locale preference, the Java environment uses default locale settings to provide you with localized information and behavior. When you use an explicit locale, you can use a different locale for each part of your application.

For example, you can use the es_MX, Spanish (Mexico), locale for displaying localized messages in Spanish and use the en_US, English (United States), locale for number and currency formatting. This type of support would be useful for Spanish speakers who live and work in the United States. Although the application user could view Spanish menus, prompts, and text, the rest of the application could format numbers and currency correctly for the United States. This is a simple example of how you can use multiple locales in a single application. If your applications ever require this level of locale support, you have the freedom to determine the behavior of every aspect of an application.
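
A rough sketch of this arrangement follows. The Spanish prompt is a hard-coded, hypothetical string; a real application would more likely load it from a resource bundle for the user interface locale. The class name MixedLocaleDemo and the formatAmount helper are also illustrative.

```java
import java.text.NumberFormat;
import java.util.Locale;

class MixedLocaleDemo {
    // Formats an amount with U.S. currency conventions, whatever the UI language is.
    static String formatAmount(double amount) {
        return NumberFormat.getCurrencyInstance(Locale.US).format(amount);
    }

    public static void main(String[] args) {
        Locale uiLocale = new Locale("es", "MX"); // locale for messages
        String prompt = "Saldo disponible";       // hypothetical Spanish message
        System.out.println("UI locale: " + uiLocale);
        // Prints: Saldo disponible: $1,234.50
        System.out.println(prompt + ": " + formatAmount(1234.5));
    }
}
```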

Some of the locale-sensitive classes format numbers, currencies, dates, and time. Others provide collation as well as word-breaking services. Classes will typically provide a constructor or factory method for creating instances. In each case, you usually have the option of providing an explicit locale preference.

Using a Default Locale

The default locale is used by locale-sensitive objects whenever your application doesn't specify an explicit locale choice. Depending on the default locale is not wise. In multiuser applications, a single default locale is usually not appropriate for everyone using the system. Instead your application should explicitly provide a preference to all locale-sensitive objects. The default locale is a systemwide resource, available throughout your application to any locale-sensitive object. As the default, it may be correct for your application's user, although you should be explicit in multilingual or multicultural environments. This is especially important when your application runs on a single host that supports many users.

Retrieve the default locale using the following method:

public static Locale getDefault()

The default locale of your application is determined in three ways. First, unless you have explicitly changed the default, the getDefault() method returns the locale that was initially determined by the Java Virtual Machine (JVM) when it first loaded. That is, the JVM determines the default locale from the host environment. The host environment's locale is determined by the host operating system and the user preferences established on that system.

Second, on some Java runtime implementations, the application user can override the host's default locale on the command line by setting the user.language, user.country, and user.variant system properties.

The following code will print out a different locale depending on the value of these properties when it is invoked:

  import java.util.Locale;
  public class Default {
    public static void main(String[] args) {
      System.out.println(Locale.getDefault());
    }
  }

You can experiment with the above code example. Running on a U.S. English system, the above code prints en_US. If you provide command-line options as described above, you can coax your application into using any locale you need. For example, you can invoke the application like this:

java -Duser.language=fr -Duser.country=CA Default

This invocation prints fr_CA as the default locale.

Third, your application can call the setDefault(Locale aLocale) method. The setDefault(Locale aLocale) method lets your application set a systemwide resource. After you set the default locale with this method, subsequent calls to Locale.getDefault() will return the newly set locale.

Note: Take care not to call setDefault() in applets. The Security Manager will not allow you to call this method because it affects a systemwide resource within the JVM that runs on the host.
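
If an application does change the default, restoring the previous value afterward limits the effect on the rest of the JVM. The following is a minimal sketch of that idea, assuming the code is allowed to call setDefault(); the class name DefaultLocaleDemo and the switchAndRead helper are illustrative.

```java
import java.util.Locale;

class DefaultLocaleDemo {
    // Temporarily switches the default locale, reads it back, then restores the original.
    static Locale switchAndRead(Locale temporary) {
        Locale original = Locale.getDefault();
        try {
            Locale.setDefault(temporary);
            return Locale.getDefault();
        } finally {
            // The default is a JVM-wide resource, so always put it back.
            Locale.setDefault(original);
        }
    }

    public static void main(String[] args) {
        System.out.println("Before: " + Locale.getDefault());
        System.out.println("During: " + switchAndRead(Locale.FRANCE)); // During: fr_FR
        System.out.println("After:  " + Locale.getDefault());
    }
}
```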

In most cases, using the default locale with other classes means ignoring it altogether. For example, if you want to format a number for your default locale, you simply create a NumberFormat without any arguments:

NumberFormat nf = NumberFormat.getInstance();

That's it. Using the default locale requires almost nothing on your part. Other locale-sensitive classes follow this same pattern. If you want the default locale behavior, do nothing special when creating the object. However, the default isn't always appropriate, and you'll need to be more explicit at those times.

Using an Explicit Locale

In some computing environments, applications use only a single locale throughout their life cycle. In other environments, applications use a global locale that can be changed. Those environments allow you to programmatically change the global locale preference, which remains in effect until you explicitly change it again. The Java application environment is unique, providing you with the ability to use a variety of locales throughout your application in any way you require.

Multinational companies have customers all around the globe. This means that both their customers and employees may speak different languages and have different expectations for how the company and its software should behave. Moreover, it is entirely possible, even common, to have a French employee handle a sales record for an Italian customer. In those situations, you will need absolute control over which locale your business and user interface objects use to manipulate and represent data. Your application may need to print sales receipts using Italian date and currency formats, yet sort customer lists for an English sales employee. The combinations are far too numerous to list, but Java technology provides you the flexibility to handle that complexity.
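
As one illustration of per-task locale control, the sketch below sorts a customer list with a java.text.Collator created for the employee's locale. The customer names, the class name CustomerSortDemo, and the sortFor helper are invented for the example.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class CustomerSortDemo {
    // Returns a copy of the names, sorted by the collation rules of the given locale.
    static List<String> sortFor(Locale locale, List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted, Collator.getInstance(locale));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> customers = Arrays.asList("Rossi", "bianchi", "Verdi");
        // Locale-aware order for an English-speaking employee: [bianchi, Rossi, Verdi]
        System.out.println(sortFor(Locale.ENGLISH, customers));
        // Plain String order puts uppercase first: [Rossi, Verdi, bianchi]
        List<String> plain = new ArrayList<>(customers);
        Collections.sort(plain);
        System.out.println(plain);
    }
}
```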

In order to get the most flexibility, you must explicitly request support for a target locale for each locale-sensitive class that you use. That means you must track the locale preferences for multiple aspects of the application or assign locale preferences to different users and customers.

If you have tracked the user's locale preference, you would create instances of locale-sensitive classes by explicitly providing a locale in a constructor or creation method. Imagine that a preferences object stores your customer's locale choice:

  Locale userLocale = preferences.getLocale(); 
  NumberFormat nf = NumberFormat.getInstance(userLocale); 
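
The preferences object in the snippet above is not a platform class. A self-contained sketch might look like the following, where UserPreferences is a hypothetical stand-in for however your application stores the customer's locale choice, and ExplicitLocaleDemo and formatFor are illustrative names.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical holder for a customer's locale choice.
class UserPreferences {
    private final Locale locale;

    UserPreferences(Locale locale) {
        this.locale = locale;
    }

    Locale getLocale() {
        return locale;
    }
}

class ExplicitLocaleDemo {
    // Formats a number for the locale stored in the preferences, not the JVM default.
    static String formatFor(UserPreferences preferences, double value) {
        Locale userLocale = preferences.getLocale();
        NumberFormat nf = NumberFormat.getInstance(userLocale);
        return nf.format(value);
    }

    public static void main(String[] args) {
        System.out.println(formatFor(new UserPreferences(Locale.US), 1234.56));      // 1,234.56
        System.out.println(formatFor(new UserPreferences(Locale.GERMANY), 1234.56)); // 1.234,56
    }
}
```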

Retrieving Locale Information

Although Locale objects don't contain much information, they do provide a few interesting methods. As you might expect, the information is tightly related to the object's language, country, and variant. Some of this information is locale-independent; some is locale-dependent. All this means is that the Locale object has two different forms for most of its methods. One set of information is not customer-oriented or localized. The other set is localized and is suitable for presentation to the user.

Locale-Independent Information

The getLanguage() method returns the ISO 639 two-letter abbreviation for the locale's language. For example, if you have created the locale ja_JP, this method returns the code ja. The method's full signature is

public String getLanguage()

An extension of the ISO 639 standard defines three-letter language codes. Although these codes are not currently used in J2SE 5.0, the codes are available. Use the following method to retrieve the three-letter language code:

public String getISO3Language()

An example shows the difference:

  Locale aLocale = Locale.JAPAN;
  System.out.println("Locale: " + aLocale);
  System.out.println("ISO 2 letter: " + aLocale.getLanguage());
  System.out.println("ISO 3 letter: " + aLocale.getISO3Language());
  aLocale = Locale.US;
  System.out.println("Locale: " + aLocale);
  System.out.println("ISO 2 letter: " + aLocale.getLanguage());
  System.out.println("ISO 3 letter: " + aLocale.getISO3Language());

The output would look like this:

  Locale: ja_JP 
  ISO 2 letter: ja
  ISO 3 letter: jpn 
  Locale: en_US 
  ISO 2 letter: en 
  ISO 3 letter: eng 

The getCountry() method returns the ISO 3166 two-letter abbreviation for the locale's region or country member. Its full signature is the following:

public String getCountry()

An ISO extension defines a three-letter code for countries too:

public String getISO3Country()

An example demonstrates their difference:

  Locale aLocale = Locale.CANADA_FRENCH;
  System.out.println("Locale: " + aLocale);
  System.out.println("ISO 2 letter: " + aLocale.getCountry());
  System.out.println("ISO 3 letter: " + aLocale.getISO3Country());

The output would look like this:

  Locale: fr_CA
  ISO 2 letter: CA
  ISO 3 letter: CAN

If your Locale object has a variant field, the getVariant() method will identify and return it as a String. If the Locale object has not defined a variant, this method returns an empty String. This method's declaration is the following:

public String getVariant()
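
Because a missing variant is reported as an empty String rather than null, a simple emptiness test is enough. The sketch below shows this; the class name VariantDemo and the variantOrNone helper are illustrative.

```java
import java.util.Locale;

class VariantDemo {
    // Returns the variant, or a placeholder when the locale defines none.
    static String variantOrNone(Locale locale) {
        String variant = locale.getVariant();
        return variant.isEmpty() ? "(none)" : variant;
    }

    public static void main(String[] args) {
        System.out.println(variantOrNone(new Locale("ja", "JP")));        // (none)
        System.out.println(variantOrNone(new Locale("es", "ES", "WIN"))); // WIN
    }
}
```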

The following class methods can be used to retrieve arrays of all the valid language and country codes available:

  • public static String[] getISOCountries()
  • public static String[] getISOLanguages()

A developer is much more likely to appreciate and to use the code returned by getLanguage() than is a customer. Your customer probably expects something different, as described in the next section.

Locale-Dependent Information

The codes supplied by the getLanguage(), getCountry(), and getVariant() methods are not especially user-friendly. Your customer should probably not have to interpret these codes, so Locale provides additional methods that return more readable, customer-oriented information.

Locale objects provide methods that return human-understandable text representations. This text representation is different from what the toString() method provides. Unlike the simple concatenation of the language, country, and variant fields, these methods provide human-readable, localized information about the locale:

  • public final String getDisplayLanguage()
  • public final String getDisplayCountry()
  • public final String getDisplayVariant()

Display Language

When you need to display a locale's language to your user, you should use the Locale object's getDisplayLanguage() method. This method returns the displayable, human-readable name of the locale's language. The display name is localized for the default locale if you don't provide a target locale argument. There are two forms of this method:

  • public final String getDisplayLanguage()
  • public final String getDisplayLanguage(Locale targetLocale)

The following examples show how these methods can be used:

  Locale deLocale = Locale.GERMANY;
  // Default system Locale is en_US for this method call.
  String defaultLanguage = deLocale.getDisplayLanguage();
  // Target de_DE is used as an explicit target language.
  String targetLanguage = deLocale.getDisplayLanguage(deLocale);

The output would look like this:

  German
  Deutsch

The output German is the U.S. English word for the locale's language. That's not especially impressive, but notice how you can provide a target locale argument. In that situation, getDisplayLanguage() tries to find and return a localized version of its language component.

This is important because you can show your customer the language of each locale that your application supports in the customer's target language. You can provide this list in your application to allow your customers to choose their preferred locale.
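
Such a list could be built along the following lines. This is only a sketch: the set of supported locales is a made-up example, and LocaleMenuDemo and labels are illustrative names. The exact display names come from the runtime's locale data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class LocaleMenuDemo {
    // Builds one label per supported locale, localized for the customer's locale.
    static List<String> labels(Locale[] supported, Locale customerLocale) {
        List<String> labels = new ArrayList<>();
        for (Locale locale : supported) {
            labels.add(locale.getDisplayLanguage(customerLocale));
        }
        return labels;
    }

    public static void main(String[] args) {
        // Hypothetical set of locales that an application supports.
        Locale[] supported = { Locale.GERMANY, Locale.FRANCE, Locale.US };
        System.out.println(labels(supported, Locale.ENGLISH));
        System.out.println(labels(supported, Locale.FRENCH));
    }
}
```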

This brings up an interesting question: How do you present a locale's display language in the locale's language? You would do it with the following code:

String displayLang = aLocale.getDisplayLanguage(aLocale);

In other words, you simply provide the Locale object to itself in the call to getDisplayLanguage(). This same trick works with other displayable locale elements as well. For example, the display country and variant can be handled this way too. The following code snippet demonstrates this technique (see Figure 1):

  Locale[] locales = { new Locale("en", "US"), new Locale("ja", "JP"),
          new Locale("es", "ES"), new Locale("it", "IT") };
  for (int x = 0; x < locales.length; ++x) {
      // Pass each locale to itself to get its language name in that language.
      String displayLanguage = locales[x].getDisplayLanguage(locales[x]);
      System.out.println(locales[x].toString() + ": " + displayLanguage);
  }
Figure 1. Displaying the Display Language in the Locale's Language

Display Country

Retrieve a locale's country or region component for user display with the following code:

  • public final String getDisplayCountry()
  • public final String getDisplayCountry(Locale targetLocale)

The first method form provides a localized country name for the default locale. The second form provides the same information localized for the target locale.

  Locale deLocale = Locale.GERMANY;
  // default en_US
  String defaultCountry = deLocale.getDisplayCountry();
  // target de_DE
  String targetCountry = deLocale.getDisplayCountry(deLocale);

The output would look like this:

  Germany
  Deutschland

Display Variant

Variants are less used than other elements of a Locale. However, at times you still need to access that information. The getDisplayVariant() method returns the display name for the private variant member of Locale:

  • public final String getDisplayVariant()
  • public final String getDisplayVariant(Locale targetLocale)

One way the Java platform uses this variant is to support the Thai language. By convention, a NumberFormat object for the th and th_TH locales will use common Arabic digit shapes, or Arabic numerals, to format Thai numbers. However, a NumberFormat for the th_TH_TH locale uses Thai digit shapes. The following code demonstrates (see Figure 2):

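  // Requires java.text.NumberFormat and java.util.Locale.
  Locale[] thaiLocale = {new Locale("th"), new Locale("th", "TH"),
              new Locale("th", "TH", "TH")};
  for (Locale locale : thaiLocale) {
      NumberFormat nf = NumberFormat.getNumberInstance(locale);
      StringBuffer msgBuff = new StringBuffer();
      msgBuff.append(locale.toString() + ": ");
      // Format a sample value so that each locale's digit shapes are visible.
      msgBuff.append(nf.format(573.34));
      // textArea is assumed to be an existing text component such as a JTextArea.
      textArea.append(msgBuff.toString() + "\n");
  }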

Figure 2. Displaying Numbers in Arabic and Traditional Thai Digits

Display Name

The display name is simply a combination of the localized language, country, and variants demonstrated earlier. The method forms are the following:

  • public final String getDisplayName()
  • public final String getDisplayName(Locale targetLocale)

Unlike the toString() method of Locale, which concatenates the individual components and separates them with an underscore character, the getDisplayName() method places the country and variant in parentheses after the language.

  Locale deLocale = Locale.GERMANY;
  // default en_US
  String defaultName = deLocale.getDisplayName();
  // target de_DE
  String targetName = deLocale.getDisplayName(deLocale);

The output would look like this:

  German (Germany)
  Deutsch (Deutschland)


Locale is an identifier for a language, an optional country (or region), and an optional variant code. Locale objects provide several methods that provide information about the locale's state. Although a locale doesn't itself contain much functionality, locale-sensitive objects depend on locale for an indication on how to behave. Locale-sensitive objects use the locale to customize their behavior so that they meet the user's expectations. In the Java platform, each locale-sensitive object is responsible for its own locale-dependent behavior. By design, locale-sensitive classes are independent of each other. That is, the set of supported locales in one class does not need to be the same as the set in another class.

In traditional operating systems and localization models, only one locale setting is active at a time. In those systems, after you programmatically set the locale, all locale-sensitive functions use the specified locale, which remains active throughout the application as a global locale. It changes when there is another global locale activation through a setlocale or similar call. The Java platform, however, treats locales a little differently. A Java application can actually have multiple locales active at the same time. That is, it is possible to use a U.S. date format and a German number format in the same application. The ability to use multiple Locale objects with various Format objects provides developers the opportunity to create complex combinations necessary for creating multicultural and multilingual applications.
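
The combination mentioned above, a U.S. date format next to a German number format, can be sketched as follows. The sample date and value are arbitrary, and MultipleLocalesDemo, usDate, and germanNumber are illustrative names. The exact date text depends on the locale data in your runtime.

```java
import java.text.DateFormat;
import java.text.NumberFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

class MultipleLocalesDemo {
    // Formats a date with U.S. conventions.
    static String usDate(Date date) {
        return DateFormat.getDateInstance(DateFormat.LONG, Locale.US).format(date);
    }

    // Formats a number with German conventions.
    static String germanNumber(double value) {
        return NumberFormat.getNumberInstance(Locale.GERMANY).format(value);
    }

    public static void main(String[] args) {
        Date date = new GregorianCalendar(2005, Calendar.SEPTEMBER, 20).getTime();
        // Two locales are active side by side; neither call touches the default locale.
        System.out.println(usDate(date));
        System.out.println(germanNumber(1234.56)); // 1.234,56
    }
}
```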

For More Information