
Java Forum / General / June 2005


native2ascii

wkijava - 27 Jun 2005 14:08 GMT
I have two questions about native2ascii:

1. I receive an xls file from a Japanese Windows user. It has two
columns: the key for a properties file and the Japanese text value for
that key. I intend to create a properties file from this, which I can
run through native2ascii. But if I export the xls file as Unicode text,
the (Latin) key values also get converted to 2-byte characters; this
shows up as a space between the characters, thus corrupting the key:
e.g. bws.welcome becomes b w s . w e l c o m e . How can I solve this
issue?

2. If I create a text file with the Japanese text only and save it as
Unicode, I can run it through native2ascii. But whatever encoding I
use, the Unicode values generated are not valid and browsers don't
display anything resembling Japanese. The encodings I tried are: SJIS,
MS932, EUC_JP and ISO2022JP. JISAutoDetect does not work; the command
complains that the encoding is not found. I use SDK 1.4.2_06.
My Japanese colleague says that he has a standard Japanese Windows
with no special settings. SJIS works best, but the translation still
contains values which are obviously not Unicode. Any suggestions?

Kind regards,
Wolfgang
Thomas Weidenfeller - 27 Jun 2005 16:56 GMT
> 1. I receive an xls file from a Japanese Windows user. It has two
> columns: the key for a properties file and the Japanese text value for
> that key. I intend to create a properties file from this, which I can
> run through native2ascii. But if I export the xls file as Unicode text,

There is no single "Unicode text" format. There are several Unicode
encodings, such as UTF-8, UTF-16LE and UTF-16BE. Figure out which one
you actually got.

> the (Latin) key values also get converted to 2-byte characters; this
> shows up as a space between the characters, thus corrupting the key:
> e.g. bws.welcome becomes b w s . w e l c o m e . How can I solve this
> issue?

Up to this point there is no issue. You have likely written a file in
UTF-16 (LE or BE).
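
To illustrate the point, here is a minimal Java sketch (not from the thread; the class name Utf16BytesDemo and the helper toHex are made up for illustration, and StandardCharsets needs Java 7 or later). It encodes the first three letters of the key in three Unicode encodings. In UTF-16 every Latin letter is accompanied by a zero byte, and a viewer that assumes a single-byte encoding will typically render those zero bytes as gaps between the letters.

```java
import java.nio.charset.StandardCharsets;

final class Utf16BytesDemo {
    /** Renders bytes as two-digit hex values separated by single spaces. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String key = "bws";
        // One byte per Latin letter in UTF-8
        System.out.println(toHex(key.getBytes(StandardCharsets.UTF_8)));    // 62 77 73
        // Two bytes per letter in UTF-16; only the byte order differs
        System.out.println(toHex(key.getBytes(StandardCharsets.UTF_16LE))); // 62 00 77 00 73 00
        System.out.println(toHex(key.getBytes(StandardCharsets.UTF_16BE))); // 00 62 00 77 00 73

        // Misreading UTF-16LE bytes as a single-byte encoding leaves a NUL
        // character after every letter, which is what looks like "b w s".
        byte[] utf16le = key.getBytes(StandardCharsets.UTF_16LE);
        String misread = new String(utf16le, StandardCharsets.ISO_8859_1);
        System.out.println(misread.length()); // 6
    }
}
```

The file itself is intact in this situation; only the tool reading it has been told the wrong encoding.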

> 2. If I create a text file with the Japanese text only and save it as
> Unicode, I can run it through native2ascii. But whatever encoding I
> use, the Unicode values generated are not valid and browsers don't
> display anything resembling Japanese. The encodings I tried are: SJIS,
> MS932, EUC_JP and ISO2022JP. JISAutoDetect does not work; the command
> complains that the encoding is not found.

a) Why do you think that should work? You have just told us that you
have saved the file in some Unicode encoding. None of the above encodings
is a Unicode encoding.

b) Why do you think a browser should display any Japanese from the
output of native2ascii? native2ascii does not write HTML. The Japanese
characters in the native2ascii output are not HTML character entities.
They are written as Java's \u Unicode escape sequences in plain ASCII.
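
As a rough illustration of that output format, the following Java sketch mimics the kind of escaping native2ascii performs. It is not the tool's actual implementation; the class name UnicodeEscapeDemo and the method escapeNonAscii are hypothetical. Every character outside 7-bit ASCII is replaced by a backslash, the letter u and four hex digits, which a properties loader understands but an HTML renderer does not (HTML would need a character reference such as &#x3088; instead).

```java
final class UnicodeEscapeDemo {
    /** Replaces every non-ASCII UTF-16 code unit with a backslash-u escape of four hex digits. */
    static String escapeNonAscii(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                sb.append(c); // plain ASCII passes through unchanged
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The Japanese sample value is written with escapes so that this
        // source file itself stays pure ASCII.
        String line = "bws.welcome=\u3088\u3046\u3053\u305d";
        System.out.println(escapeNonAscii(line));
        // prints: bws.welcome= followed by the four escapes for 3088 3046 3053 305d
    }
}
```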

> contains values which are obviously not Unicode. Any suggestions?

Pass the Unicode encoding in which you wrote your data to native2ascii
via its -encoding option.

/Thomas

Signature

The comp.lang.java.gui FAQ:
ftp://ftp.cs.uu.nl/pub/NEWS.ANSWERS/computer-lang/java/gui/faq

wkijava - 28 Jun 2005 09:22 GMT
Well, actually your tip helped. By the way, Excel only offers an export
to 'Unicode text'; it is not more specific than that. But I have to
admit that I misunderstood the native2ascii documentation. I ran it with
encoding UTF-16 (just a guess) and it worked. The properties file is now
processed properly and the web application shows Japanese characters.
Thanks very much.
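
For reference, the invocation that worked would look something like "native2ascii -encoding UTF-16 export.txt messages.properties", where both file names are placeholders. The same conversion can also be sketched directly in Java. The sketch below is an assumption-laden illustration, not something from the thread: it assumes the export is a tab-separated UTF-16 text file with one key and one value per row, the class name UnicodeTextToProperties and the method toPropertyLine are made up, and it does not handle quoted cells or characters that need additional escaping in properties files.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class UnicodeTextToProperties {
    /** Turns one "key TAB value" row into an ASCII-only "key=value" line, or null if the row has no tab. */
    static String toPropertyLine(String row) {
        int tab = row.indexOf('\t');
        if (tab < 0) {
            return null; // not a two-column row, skip it
        }
        String pair = row.substring(0, tab).trim() + "=" + row.substring(tab + 1).trim();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < pair.length(); i++) {
            char c = pair.charAt(i);
            if (c < 0x80) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c)); // backslash-u escape
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java UnicodeTextToProperties <export.txt> <out.properties>");
            return;
        }
        // The "UTF-16" charset uses a leading byte order mark to pick the
        // byte order and falls back to big-endian when there is none.
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_16);
             BufferedWriter out = Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.US_ASCII)) {
            String row;
            while ((row = in.readLine()) != null) {
                String line = toPropertyLine(row);
                if (line != null) {
                    out.write(line);
                    out.newLine();
                }
            }
        }
    }
}
```

Reading with the correct charset is the step that matters here: once the rows are decoded as UTF-16, the Latin keys come out as ordinary strings without any gaps.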
Roedy Green - 28 Jun 2005 10:17 GMT
>1. I receive an xls file from a Japanese Windows user. It has two
>columns: the key for a properties file and the Japanese text value for
[quoted text clipped - 4 lines]
>e.g. bws.welcome gets b w s . w e l c o m e . How can I solve this
>issue?

Your file is partly encoded in one way and partly in another.
native2ascii is not prepared to deal with that. You will have to write
some custom code to open the file and read the two different sections
with different encodings, or split the file in two and process the two
halves conventionally.

The fileio amanuensis will generate a skeleton program to read the
pieces.

See http://mindprod.com/applets/fileio.html

Signature

Bush crime family lost/embezzled $3 trillion from Pentagon.
Complicit Bush-friendly media keeps mum. Rumsfeld confesses on video.
http://www.infowars.com/articles/us/mckinney_grills_rumsfeld.htm

Canadian Mind Products, Roedy Green.
See http://mindprod.com/iraq.html photos of Bush's war crimes

Roedy Green - 28 Jun 2005 10:21 GMT
>2. If I create a text file with the Japanese text only and save it as
>Unicode, I can run it through native2ascii. But whatever encoding I
[quoted text clipped - 5 lines]
>with no special settings. SJIS works best, but the translation still
>contains values which are obviously not Unicode. Any suggestions?

I suggest you have a peek at some files on the net that do display
Japanese correctly.  I suspect they may start with a Unicode header
then flip to Japanese for the body or something similar.

There needs to be a way of noting the encoding of a file in some
standard way.  
Kludges include:

1. a companion file with the name originalfile.encoding

2. embedded code in first few bytes.

3. hiding it in the filename somewhere.

I think the world is resisting this hoping that non-Unicode encodings
will disappear.  See http://mindprod.com/jgloss/encoding.html
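
Kludge 2 already exists for Unicode files in the form of the byte order mark (BOM). A minimal Java sketch of sniffing it follows; the class name BomSniffer and the method detect are hypothetical, and the check is deliberately limited to UTF-8 and UTF-16. It cannot identify non-Unicode encodings such as SJIS, which carry no such marker.

```java
final class BomSniffer {
    /**
     * Returns the encoding announced by a byte order mark at the start of
     * the data, or null if there is none. UTF-32 is ignored for brevity,
     * so a UTF-32LE file would be reported as UTF-16LE.
     */
    static String detect(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xff) == 0xEF
                && (head[1] & 0xff) == 0xBB
                && (head[2] & 0xff) == 0xBF) {
            return "UTF-8";
        }
        if (head.length >= 2 && (head[0] & 0xff) == 0xFE && (head[1] & 0xff) == 0xFF) {
            return "UTF-16BE";
        }
        if (head.length >= 2 && (head[0] & 0xff) == 0xFF && (head[1] & 0xff) == 0xFE) {
            return "UTF-16LE";
        }
        return null;
    }

    public static void main(String[] args) {
        // FF FE followed by the letter b in little-endian UTF-16
        byte[] sample = {(byte) 0xFF, (byte) 0xFE, 0x62, 0x00};
        System.out.println(detect(sample)); // UTF-16LE
        // Plain ASCII bytes carry no marker at all
        System.out.println(detect(new byte[] {0x62, 0x77, 0x73})); // null
    }
}
```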




