Please excuse this question from an internationalization newbie.
I have a database containing Unicode strings for Norwegian and German
place names with special characters, e.g. the letter
"ö" (UTF-8: C3 B6; an "o" with two dots above it)
Say a string variable containing placenames from the database is
called
String unicodeString;
and say the string contains the place name "Sömmerda".
Now I want to draw the Unicode string onto a Graphics object, like
g2D_of_Image.drawString(unicodeString, x, y);
This works, except that the "ö" shows up "funny": it looks like
"Ã¶" (rather than the "o" with two dots above it).
(Other special characters similarly show up as funny stuff.)
So what do you do to make Unicode (UTF-8) show up as the proper
special character on the Graphics object?
(I'm not actually sure, but the unicode may get mangled on the way
from the database into my string "unicodeString", so perhaps the
question is bigger, how do you write a Unicode-enabled java code :-o)
Thanks for any advice.
Wolfgang,
Santa Barbara
> (I'm not actually sure, but the unicode may get mangled on the way
> from the database into my string "unicodeString", so perhaps the
> question is bigger, how do you write a Unicode-enabled java code :-o)
see DataInput#readUTF();

Signature
http://uio.dev.java.net
http://reader.imagero.com
> Now I want to draw the Unicode string onto a Graphics object, like
>
[quoted text clipped - 6 lines]
> So what do you do to make Unicode (UTF-8) show up as the proper
> special character on the Graphics object?
It looks very much like you've actually got two characters in your Java
String object to represent this one character. That's definitely wrong.
You should have one character (which apparently should be \u00F6).
Where that two-byte sequence comes from is a different issue, as
discussed below. You can pretty safely drop your concern about the
drawString method, though, because it's apparent that your String is
incorrect before you get there. A problem in drawString would not cause
anything to draw two characters instead of one.
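One way to confirm this before blaming drawString is to dump the code units actually stored in the String. The following is only a minimal sketch; the class and method names (StringDump, dump) are hypothetical and not part of the original program.

```java
final class StringDump {
    // Lists every UTF-16 code unit of the string in U+XXXX form,
    // separated by spaces, so a wrongly decoded character is visible
    // as two entries instead of one.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }
}
```

If the dump of the database value shows U+00C3 U+00B6 where a single U+00F6 is expected, the String was already wrong when it arrived from the database.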
Your last bit doesn't make a lot of sense. When you call drawString,
you give it a Java String (which is a sequence of characters), not a
sequence of bytes. A Java String doesn't have an encoding like UTF-8,
because it's not encoded. So "Unicode (UTF-8)" is meaningless. All
Java Strings are Unicode, and no Java Strings are UTF-8.
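To illustrate the difference between bytes and characters, here is a small sketch showing how the same two bytes (C3 B6) become either one character or two, depending on which charset is used to decode them. The class name MojibakeDemo is made up for this example, and it assumes a Java version that has java.nio.charset.StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    // The UTF-8 encoding of U+00F6 (o with diaeresis) is the two bytes C3 B6.
    static final byte[] UTF8_BYTES = {(byte) 0xC3, (byte) 0xB6};

    // Decoding with the charset the bytes were written in yields one char.
    static String decodeAsUtf8() {
        return new String(UTF8_BYTES, StandardCharsets.UTF_8);
    }

    // Decoding the same bytes as ISO-8859-1 turns each byte into its own
    // char, which is the two-character garbage seen on screen.
    static String decodeAsLatin1() {
        return new String(UTF8_BYTES, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(decodeAsUtf8().length());   // prints 1
        System.out.println(decodeAsLatin1().length()); // prints 2
    }
}
```

Once the bytes have been decoded, the resulting String has no encoding any more; it is just a sequence of characters, right or wrong.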
> (I'm not actually sure, but the unicode may get mangled on the way
> from the database into my string "unicodeString", so perhaps the
> question is bigger, how do you write a Unicode-enabled java code :-o)
That seems quite likely. This is sometimes tricky to get right,
depending on your JDBC driver. Some databases can be built with
multiple different encodings, and some JDBC drivers can get confused
unless you help them along.
Care to share which DBMS this is, and any special build-time or run-time
configuration? The exact definition of the table and the Java code you
use to retrieve the value would be good, too.
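As one example of "helping the driver along": if the driver turns out to be MySQL Connector/J, it accepts the connection properties useUnicode and characterEncoding in the JDBC URL. The sketch below only builds such a URL; the helper name utf8Url and the host and database names are hypothetical, and whether these properties are sufficient depends on the server version and on how the table's character set is declared.

```java
final class MySqlUrl {
    // Builds a Connector/J URL that asks the driver to exchange text
    // with the server as UTF-8.
    static String utf8Url(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }
}
```

The resulting string would be passed to DriverManager.getConnection(url, user, password) in place of the plain URL.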

Signature
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.
Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
Wolfgang - 31 May 2004 20:33 GMT
The database I use is MySQL. I found out that MySQL prior to version
4.1.0-alpha does not support Unicode, so there is the first problem. I
then switched to MySQL version 4.1.0-alpha last night and there is
some improvement in that I can now see special characters IF I query
the database with a graphical client (I use iSQL-Viewer 2.1.1 from
http://prdownloads.sourceforge.net/isql; from isqlviewer.com). But
for this, a query has to be formatted in the following contorted way:
SELECT full_name FROM gm WHERE full_name LIKE
CONVERT(_latin1'HaussÃ¶mmern' USING utf8);
(this looks for the place name string "Haussömmern")
Given this query, the graphical SQL client actually returns and
displays the string "Haussömmern", incl. the correct special character
"ö".
If I send the exact same query using my Java program, which extracts and
draws the place name on a map, my graphic shows "HaussÃ¶mmern". My
Java program does things like this:
ResultSet rs = stmt.executeQuery(
    "SELECT full_name FROM gm WHERE full_name LIKE "
    + "CONVERT(_latin1'HaussÃ¶mmern' USING utf8);");
rs.next(); // position the cursor on the first row before reading it
String unicodeString = rs.getString("full_name");
g2D_of_Image.drawString(unicodeString, x, y);
I'm not surprised that the Unicode for "ö" shows up as the two
characters "Ã¶" instead of just one character, because the
Unicode consists of the two bytes C3 B6 (in fact, if you search Google
for "ö", the URL that appears contains the very same two
characters, as in
http://www.google.com/webhp?hl=en&tab=iw&q=%22%C3%B6%22).
I'm just frustrated that I can't make it show up as the "ö" it's
supposed to represent.
Thanks,
Wolfgang
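For reference, a common stopgap when a String has already been decoded with the wrong charset is to turn it back into the original bytes and decode those as UTF-8. This is only a sketch under the assumption that the wrong charset was ISO-8859-1; the class and method names (MojibakeRepair, repair) are hypothetical, and the proper fix is still to get the driver and database to agree on the encoding.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeRepair {
    // Re-encodes the garbled chars as ISO-8859-1 to recover the original
    // bytes, then decodes those bytes as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

With this helper, the value from rs.getString("full_name") would be passed through repair(...) before drawString. Any character outside ISO-8859-1 is lost in the round trip, so this only works while every char in the garbled String stands for exactly one original byte.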