Java Forum / General / June 2005

Newbie problem: conversion with unicode

Francois - 23 Jun 2005 11:08 GMT
Hi all,

I have characters represented as escape sequences in a file:
\u00B3
\u2074
\u2075

These represent superscript 3, 4 and 5 respectively.
Now I read the file and get each of them as the String "\u00B3", that is,
the literal escape text. I would like to convert this representation into
a new String object that contains the actual superscript 3, 4 and 5.

What is the correct code for this? I get compilation errors or
conversion errors ...

Thanks a lot for any help !

Francois Rappaz
Harald - 23 Jun 2005 22:27 GMT
> I have characters represented as escape sequences in a file:
> \u00B3
[quoted text clipped - 7 lines]
> What is the correct code for this? I get compilation errors or
> conversion errors ...

Strip the backslash, strip the u, convert the resulting four hex digits
to an int (Integer.parseInt() or the like should help), cast the
resulting int to char, append it to a StringBuilder (or StringBuffer),
and convert the resulting StringBuilder to a String.
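
The steps above can be sketched roughly as follows. This is only a minimal illustration: the class name EscapeToChar and the method decode are made up for the example, and it assumes each input is exactly a backslash, a 'u', and four hex digits, with no validation.

```java
class EscapeToChar {
    // Hypothetical helper: turns one escape (backslash, 'u', four hex digits)
    // into a String holding the single character it names.
    static String decode(String rawCode) {
        // Strip the leading backslash and the 'u', leaving the hex digits.
        String hex = rawCode.substring(2);
        // Parse the digits as a base-16 int, then narrow the int to a char.
        char ch = (char) Integer.parseInt(hex, 16);
        // Append to a StringBuilder and convert the builder to a String.
        StringBuilder sb = new StringBuilder();
        sb.append(ch);
        return sb.toString();
    }

    public static void main(String[] args) {
        // In Java source, a doubled backslash yields one literal backslash,
        // so each input below is the six-character escape text.
        String[] inputs = {"\\u00B3", "\\u2074", "\\u2075"};
        StringBuilder all = new StringBuilder();
        for (String in : inputs) {
            all.append(decode(in));
        }
        System.out.println(all.length()); // prints 3
    }
}
```

Whether the decoded characters display correctly when printed depends on the console or file encoding, which this sketch does not address.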

 Harald.

---------------------+---------------------------------------------
Harald Kirsch (@home)|
Java Text Crunching: http://www.ebi.ac.uk/Rebholz-srv/whatizit/software

Francois - 24 Jun 2005 09:39 GMT
Thanks a lot !

That is exactly what I needed, and it gives something like

String s = rawCode;
byte[] c = new byte[1];
try {
    c[0] = Integer.decode("0x" + rawCode.substring(2)).byteValue();
} catch (NumberFormatException e) { e.printStackTrace(); }
String result = new String(c);

Could I have encoding problems with that code? Or should it work in
any situation?
TIA
Francois
HK - 24 Jun 2005 17:43 GMT
> Thanks a lot !
>
> That is exactly what I needed, and it gives something like
>
> String s = rawCode;

You don't use the s below, but rawCode (should not matter).

> byte[] c = new byte[1];

This should definitely be char[], not byte[].

> try {
>     c[0] = Integer.decode("0x" + rawCode.substring(2)).byteValue();

You certainly want .intValue() followed by a cast to char, not
.byteValue(). Remember that char in Java, very much unlike in C/C++,
is 2 bytes long. In addition, I would rather use

 c[0] = (char) Integer.parseInt(rawCode.substring(2), 16);
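
A small sketch of why .byteValue() loses information for these code points. The class and method names here are invented for illustration; only Integer.decode, Integer.parseInt and Integer.toHexString are real library calls.

```java
class NarrowingDemo {
    // Hypothetical helper mirroring the posted code: byteValue() keeps
    // only the low 8 bits of the parsed value.
    static byte viaByteValue(String hex) {
        return Integer.decode("0x" + hex).byteValue();
    }

    // Hypothetical helper: parses base 16 and narrows to a 16-bit char,
    // which is wide enough for four hex digits.
    static char viaCharCast(String hex) {
        return (char) Integer.parseInt(hex, 16);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(viaByteValue("2074"))); // prints 74
        System.out.println(Integer.toHexString(viaCharCast("2074")));  // prints 2074
    }
}
```

With the byte version, 0x2074 collapses to 0x74, which is the letter 't' rather than superscript 4.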

> } catch (NumberFormatException e) { e.printStackTrace(); }
> String result = new String(c);
>
> Could I have encoding problems with that code? Or should it work in
> any situation?

I assume you checked beforehand that rawCode starts with '\\' and 'u'
and that it contains at most 4 hex digits. If it contains 5 digits,
you lose something in the cast to char.
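
One possible way to do that check, as a sketch only. The names EscapeCheck and isEscape are hypothetical, and the regular expression encodes the assumption stated above: a backslash, a 'u', then one to four hex digits and nothing else.

```java
class EscapeCheck {
    // Hypothetical validator: true only for a backslash, a 'u', and
    // one to four hex digits. String.matches tests the whole string.
    static boolean isEscape(String rawCode) {
        return rawCode.matches("\\\\u[0-9A-Fa-f]{1,4}");
    }

    // Shows what an unchecked fifth digit does: the cast to char keeps
    // only the low 16 bits of the parsed int.
    static char unchecked(String hex) {
        return (char) Integer.parseInt(hex, 16);
    }

    public static void main(String[] args) {
        System.out.println(isEscape("\\u2074"));   // prints true
        System.out.println(isEscape("\\u12074"));  // prints false
        // 0x12074 narrowed to char becomes 0x2074.
        System.out.println(Integer.toHexString(unchecked("12074"))); // prints 2074
    }
}
```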

 Harald.
Francois - 28 Jun 2005 13:52 GMT
Well, probably nobody cares, but the code above does not work.
The following seems to be all right:

char c[] = new char[n];
// s has been read from a file and contains "\u2079", for example
...
c[i] = (char) Integer.parseInt(s.substring(2), 16);
...
// and then, for the whole array c:
String result = new String(c);
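
For completeness, here is a self-contained version of the same idea. It is a sketch under assumptions: the class name EscapeArray and the method decodeAll are invented, the escapes are passed in as a list instead of being read from a file, and each one is assumed to be a backslash, a 'u', and four hex digits.

```java
import java.util.Arrays;
import java.util.List;

class EscapeArray {
    // Hypothetical helper: converts each escape to one char and builds
    // a single String from the resulting array.
    static String decodeAll(List<String> escapes) {
        char[] c = new char[escapes.size()];
        for (int i = 0; i < c.length; i++) {
            String s = escapes.get(i);
            // Skip the backslash and the 'u', parse the rest as base 16.
            c[i] = (char) Integer.parseInt(s.substring(2), 16);
        }
        return new String(c);
    }

    public static void main(String[] args) {
        // Stand-in for lines read from the file.
        List<String> fromFile = Arrays.asList("\\u00B3", "\\u2074", "\\u2079");
        String result = decodeAll(fromFile);
        System.out.println(result.length()); // prints 3
    }
}
```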

Francois

