Java Forum / First Aid / June 2007

Recording an ISO-639-1 language code

Haggis McMutton - 24 Jun 2007 21:32 GMT
I want to make a class where a variable stores an ISO-639-1 language code.
I'm not sure whether to store this as a string (e.g. 'EN', 'ES', 'FR',
etc.) or as an 8-bit int where -128 would be 'AA', -127 would be 'AB', etc.

It seems to me that an 8-bit integer would be better programming. But to
do this I'd have to have a way to convert this 8-bit int into a language
code and back again when it's been entered by a human or given to a human
as it seems bad to me to expect a human to understand my own internal
language code system.

There would obviously be a need for a function that converts between the
two. Is it better to do this or just record it as a string?
Tom Hawtin - 24 Jun 2007 22:28 GMT
> I want to make a class where a variable stores an ISO-639-1 language code.
> I'm not sure whether to store this as a string (e.g. 'EN', 'ES', 'FR',
> etc.) or as an 8-bit int where -128 would be 'AA', -127 would be 'AB', etc.

Only that would give you 26*26 = 676 possibilities, which wouldn't fit in
an 8-bit integer (256 values). Valid country codes would fit in one byte,
at the moment. How many million of these are you going to have to make it
worth packing?
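
To make the arithmetic concrete, here is a minimal sketch of the packing scheme under discussion. The class name PackedCode and its methods are hypothetical, and it assumes codes consist only of the letters A to Z. It shows that the natural 'AA' = 0, 'AB' = 1 numbering needs a 16-bit short rather than a byte.

```java
// Hypothetical illustration: packs any two-letter code into an index 0..675.
final class PackedCode {
    static final int COMBINATIONS = 26 * 26; // 676, more than a byte's 256 values

    // Converts a code such as "EN" to its index; 'AA' -> 0, 'AB' -> 1, ...
    static short pack(String code) {
        if (code == null || code.length() != 2) {
            throw new IllegalArgumentException("Expected exactly two letters");
        }
        char first = Character.toUpperCase(code.charAt(0));
        char second = Character.toUpperCase(code.charAt(1));
        if (first < 'A' || first > 'Z' || second < 'A' || second > 'Z') {
            throw new IllegalArgumentException("Expected letters A-Z: " + code);
        }
        return (short) ((first - 'A') * 26 + (second - 'A'));
    }

    // Converts an index back to its upper-case two-letter code.
    static String unpack(short index) {
        if (index < 0 || index >= COMBINATIONS) {
            throw new IllegalArgumentException("Index out of range: " + index);
        }
        return "" + (char) ('A' + index / 26) + (char) ('A' + index % 26);
    }

    public static void main(String[] args) {
        System.out.println(COMBINATIONS);       // 676
        System.out.println(pack("ZZ"));         // 675
        System.out.println(unpack(pack("fr"))); // FR
    }
}
```

Every read and write of the field now has to go through pack and unpack, which is the clarity cost being weighed against the bytes saved.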

> It seems to me that an 8-bit integer would be better programming. But to
> do this I'd have to have a way to convert this 8-bit int into a language
> code and back again when it's been entered by a human or given to a human
> as it seems bad to me to expect a human to understand my own internal
> language code system.

"Better programming" almost always means clearer programming. Having
individual bytes represent country codes probably isn't going to help.
Having Strings for everything similarly isn't particularly helpful. You
might want to think about a country code (or just country) class. You
can share instances, so if you have lots of references to country codes
you are using 4 or 8 bytes per reference (on a machine with perhaps over
1,000,000,000 bytes available).

import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ISOCountry {
    // One shared instance per ISO country code known to the JDK.
    private static final Map<String,ISOCountry> instances;
    static {
        instances = new HashMap<String,ISOCountry>();
        for (String code : Locale.getISOCountries()) {
            instances.put(code, new ISOCountry(code));
        }
    }

    // Returns the shared instance for the given code; rejects unknown codes.
    public static ISOCountry valueOf(String code) {
        ISOCountry country = instances.get(code);
        if (country == null) {
            throw new IllegalArgumentException("Unknown country code: " + code);
        } else {
            return country;
        }
    }

    private final String code;

    // Private so that instances can only be obtained through valueOf.
    private ISOCountry(String code) {
        this.code = code;
    }

    public String getISOCountry() {
        return code;
    }
}

(Disclaimer: Code not tested or even compiled.)

Tom Hawtin
Malcolm Dew-Jones - 25 Jun 2007 18:48 GMT
: I want to make a class where a variable stores an ISO-639-1 language code.

Then store it as an ISO-639-1 language code.

: I'm not sure whether to store this as a string (e.g. 'EN', 'ES', 'FR',
: etc.) or as an 8-bit int where -128 would be 'AA', -127 would be 'AB', etc.

How is an ISO-639-1 language code documented?  I haven't looked it up but
I bet it is simply a two character string.  Therefore you should use the
same.

Hide that in a class if you want, but if a programmer dumps the data then
they should see a two character string.

I'm not sure why there would be any reason to consider anything else in
any normal situation.
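
As a rough sketch of "hide that in a class", the following wraps the two-character string and validates it against the codes the JDK reports. The class name LanguageCode and the method of are hypothetical. Lower-casing the input is an assumption, chosen because Locale.getISOLanguages() returns lower-case codes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical wrapper: stores the code as the plain two-letter string.
final class LanguageCode {
    // Two-letter ISO 639 codes known to this JDK (lower case).
    private static final Set<String> KNOWN =
            new HashSet<String>(Arrays.asList(Locale.getISOLanguages()));

    private final String code;

    private LanguageCode(String code) {
        this.code = code;
    }

    // Accepts "EN", "en", " en " and so on; rejects unknown codes.
    static LanguageCode of(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("Language code is null");
        }
        String normalized = raw.trim().toLowerCase(Locale.ROOT);
        if (!KNOWN.contains(normalized)) {
            throw new IllegalArgumentException("Unknown language code: " + raw);
        }
        return new LanguageCode(normalized);
    }

    // Dumping the object shows the two-character string itself.
    @Override
    public String toString() {
        return code;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof LanguageCode
                && ((LanguageCode) other).code.equals(code);
    }

    @Override
    public int hashCode() {
        return code.hashCode();
    }
}
```

Anyone inspecting the data in a debugger or a log sees the documented two-letter code, and no conversion table is needed.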

$0.10
Roedy Green - 26 Jun 2007 21:04 GMT
On Sun, 24 Jun 2007 20:32:27 GMT, Haggis McMutton
<haggis@somewhere.ere> wrote, quoted or indirectly quoted someone who
said :

>There would obviously be a need for a function that converts between the
>two. Is it better to do this or just record it as a string?

You can store it in one byte as an int, or 32 bits as a fixed length
character pair, even longer for a string since you need a length.

It really depends on how many of these you have.  If you have a
database of millions of records it might be worth sweating over.  Most
likely there are tens of thousands of things more deserving of such
detailed attention.

Consider that you can't do much interesting with ad-hoc SQL queries in
int form.
--
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com
Lew - 29 Jun 2007 04:52 GMT
> On Sun, 24 Jun 2007 20:32:27 GMT, Haggis McMutton
> <haggis@somewhere.ere> wrote, quoted or indirectly quoted someone who
[quoted text clipped - 13 lines]
> Consider that you can't do much interesting with ad-hoc SQL queries in
> int form.

<http://java.sun.com/javase/6/docs/api/java/util/Locale.html#getISOLanguages()>
<http://java.sun.com/javase/6/docs/api/java/util/Locale.html#getLanguage()>
<http://java.sun.com/javase/6/docs/api/java/util/Locale.html#ENGLISH> et al.
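
The three links above point at java.util.Locale, which already carries the language code as a two-letter string. A short sketch of how those members might be used follows. The class LocaleLanguageDemo and the helper isIsoLanguage are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical demo of the Locale members linked above.
final class LocaleLanguageDemo {
    // True if the lower-case code appears in the JDK's list of ISO 639 codes.
    static boolean isIsoLanguage(String code) {
        return Arrays.asList(Locale.getISOLanguages()).contains(code);
    }

    public static void main(String[] args) {
        // A predefined constant already knows its language code.
        System.out.println(Locale.ENGLISH.getLanguage()); // en

        // A Locale built from a code hands the same code back.
        Locale spanish = new Locale("es");
        System.out.println(spanish.getLanguage());        // es

        // Checking user input against the JDK's list.
        System.out.println(isIsoLanguage("fr"));          // true
    }
}
```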

Lew


