Java Forum / General / October 2005

Unicode and such

EdwardH - 20 Oct 2005 12:39 GMT
The file "höhö" is shown as "h?h?" when I get a file.getName().

java.nio.charset.Charset.defaultCharset().name()
US-ASCII

System.getProperty("file.encoding")
ANSI_X3.4-1968

I've played around and set file.encoding to ascii, utf-8, utf-16, cp437
and iso-8859-1. Nothing helps.

Can anyone tell me what to do to fix this?

(I'm running an amd64 Linux system, btw).
Roedy Green - 20 Oct 2005 12:45 GMT
>Can anyone tell me what to do to fix this?

Try setting the encoding specifically at file open.
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Again taking new Java programming contracts.

EdwardH - 20 Oct 2005 13:03 GMT
> Try setting the encoding specifically at file open.

Where would one do that?

File doesn't take a (String filename, String encoding) constructor.
EdwardH - 20 Oct 2005 14:23 GMT
>> Try setting the encoding specifically at file open.
>
> Where would one do that?
>
> File doesn't take a (String filename, String encoding) constructor.

Fixed!

export LC_CTYPE=en_US

It was previously POSIX, which I'm sure is short for "Piece of sh.t IX".
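A minimal sketch of how one might check which locale setting the JVM was started with, assuming the usual POSIX precedence (LC_ALL overrides LC_CTYPE, which overrides LANG). The class name LocaleEncodingCheck and the helper effectiveCtype are made up for illustration, and sun.jnu.encoding is an implementation-specific property that may be missing on some JVMs.

```java
import java.nio.charset.Charset;

public class LocaleEncodingCheck {
    /**
     * Returns the locale name that applies to character handling,
     * following POSIX precedence: LC_ALL, then LC_CTYPE, then LANG.
     * Falls back to "POSIX" when none of them is set.
     */
    static String effectiveCtype(String lcAll, String lcCtype, String lang) {
        if (lcAll != null && !lcAll.isEmpty()) return lcAll;
        if (lcCtype != null && !lcCtype.isEmpty()) return lcCtype;
        if (lang != null && !lang.isEmpty()) return lang;
        return "POSIX";
    }

    public static void main(String[] args) {
        String ctype = effectiveCtype(
                System.getenv("LC_ALL"),
                System.getenv("LC_CTYPE"),
                System.getenv("LANG"));
        System.out.println("effective LC_CTYPE: " + ctype);
        System.out.println("default charset:    " + Charset.defaultCharset().name());
        System.out.println("file.encoding:      " + System.getProperty("file.encoding"));
        // Implementation-specific property; may print null on some JVMs.
        System.out.println("sun.jnu.encoding:   " + System.getProperty("sun.jnu.encoding"));
    }
}
```

Running this before and after the export should show whether the default charset moves away from US-ASCII.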
Roedy Green - 20 Oct 2005 23:58 GMT
>Where would one do that?
>
>File doesn't take a (String filename, String encoding) constructor.

The File class has nothing to do with contents or reading or writing.
It is about file names and existence.

You need to look elsewhere.  In regular file I/O it is the Readers and
Writers.

In NIO, look at Charset and CharsetDecoder.
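As a rough sketch of setting the encoding at the Reader level: the helper below wraps a stream in an InputStreamReader with an explicit Charset. The class name ExplicitCharsetRead and the helper readFirstLine are hypothetical, and an in-memory byte array stands in for a real FileInputStream. Note that this concerns file contents only, not file names.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetRead {
    /** Reads the first line of the stream, decoding bytes with the given charset. */
    static String readFirstLine(InputStream in, Charset charset) {
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, charset))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for file contents; with a real file this would be
        // new FileInputStream(path) instead of a ByteArrayInputStream.
        byte[] utf8Bytes = "h\u00F6h\u00F6\n".getBytes(StandardCharsets.UTF_8);

        String asUtf8 = readFirstLine(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8);
        String asLatin1 = readFirstLine(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.ISO_8859_1);

        System.out.println(asUtf8.length());   // 4 chars when decoded correctly
        System.out.println(asLatin1.length()); // 6 chars when decoded with the wrong charset
    }
}
```

The same bytes give different strings depending on the charset passed in, which is why relying on the platform default is fragile.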

Chris Uppal - 20 Oct 2005 13:47 GMT
> The file "höhö" is shown as "h?h?" when I get a file.getName().
>
> java.nio.charset.Charset.defaultCharset().name()
> US-ASCII

So the system has no way of printing out the name using the default charset.
If you check the four chars in the name then they, presumably, will not include
63 (the question mark), but will have the correct Unicode code point for ö
(whatever that might be).

You don't say how you are viewing the filename, but whatever it is (debugger,
System.out.println(), ...) will need to be told to use a charset that can
represent ö.
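The check described above could be sketched roughly as follows. The class name FileNameCodePoints and the helper describe are invented for illustration, a string literal stands in for the result of file.getName(), and the code uses String.codePoints(), which needs Java 8 or later. It assumes the ö is stored as the single precomposed code point U+00F6; some file systems store it differently.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class FileNameCodePoints {
    /** Formats each code point of the string as U+XXXX, separated by spaces. */
    static String describe(String name) {
        StringBuilder sb = new StringBuilder();
        name.codePoints().forEach(cp -> {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Stand-in for file.getName(); \u00F6 is the precomposed ö.
        String name = "h\u00F6h\u00F6";

        // Pure ASCII output, so it is readable even with a US-ASCII default charset.
        System.out.println(describe(name)); // U+0068 U+00F6 U+0068 U+00F6

        // Printing through a stream with an explicit charset avoids the '?' substitution,
        // provided the terminal itself expects UTF-8.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(name);
    }
}
```

If the code points come out as U+003F, the name was already damaged when it was read; if they come out as U+00F6, only the output side is at fault.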

> System.getProperty("file.encoding")
> ANSI_X3.4-1968
>
> I've played around and set file.encoding to ascii, utf-8, utf-16, cp437
> and iso-8859-1. Nothing helps.

I don't know (off the top of my head) what the 'file.encoding' property is used
for, but I very much doubt if it's relevant here.  At a guess it's used as the
default charset for interpreting the /contents/ of files -- but that's a guess.

   -- chris
Mike Schilling - 21 Oct 2005 02:29 GMT
> I don't know (off the top of my head) what the 'file.encoding' property is used
> for, but I very much doubt if it's relevant here.  At a guess it's used as the
> default charset for interpreting the /contents/ of files -- but that's a guess.

You're right; it's the default encoding used by FileReader and FileWriter.
Thomas Fritsch - 21 Oct 2005 20:59 GMT
>> I don't know (off the top of my head) what the 'file.encoding' property
>> is used for, [...] At a guess it's used as the
>> default charset for interpreting the /contents/ of files -- but that's a
>> guess.
>
> You're right; it's the default encoding used by FileReader and FileWriter.

Even more: it's the default encoding used by
 InputStreamReader, OutputStreamWriter
 String ( constructor String(byte[]), method getBytes() )
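To illustrate the difference between relying on the default encoding and naming one, here is a small sketch. The class name DefaultEncodingDemo and the helper roundTrip are hypothetical. It encodes a string to bytes and decodes it again with a given charset; with US-ASCII the ö cannot be encoded and is replaced by '?', which matches the symptom at the top of the thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultEncodingDemo {
    /** Encodes text to bytes and decodes it again, both with the same charset. */
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String name = "h\u00F6h\u00F6";

        // What String(byte[]) and getBytes() use when no charset is passed.
        System.out.println("default charset: " + Charset.defaultCharset().name());

        // US-ASCII has no ö, so getBytes substitutes '?' for each one.
        System.out.println(roundTrip(name, StandardCharsets.US_ASCII)); // h?h?

        // UTF-8 can represent ö, so the round trip is lossless.
        System.out.println(roundTrip(name, StandardCharsets.UTF_8).equals(name)); // true
    }
}
```

Passing the charset explicitly makes the result independent of file.encoding and of the locale the JVM was started under.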

Signature

"TFritsch$t-online:de".replace(':','.').replace('$','@')

Mike Schilling - 21 Oct 2005 22:52 GMT
>>> I don't know (off the top of my head) what the 'file.encoding' property
>>> is used for, [...] At a guess it's used as the
[quoted text clipped - 7 lines]
>  InputStreamReader, OutputStreamWriter
>  String ( constructor String(byte[]), method getBytes() )

So it is.  That is, it's the "default encoding", period.  Misleadingly named,
if you ask me.
zero - 20 Oct 2005 14:39 GMT
EdwardH <edwardh@N:O:S:P:A:M:edward.dyndns.org> wrote in news:nnL5f.148696$dP1.506571@newsc.telia.net:

> The file "höhö" is shown as "h?h?" when I get a file.getName().
>
[quoted text clipped - 10 lines]
>
> (I'm running an amd64 Linux system, btw).

omg I'm getting nightmares again...  I had a similar problem with
retrieving a name from a Clipper database in an internship (I had to
convert an old Clipper program to Java).  In the end I just gave up and
added some code that replaced the ö characters with their Unicode
equivalent.


©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.