Java Forum / General / May 2006

Encoding Problem with ToString
Peter Plumber - 17 May 2006 15:53 GMT
Hi,

I am very much a beginner at programming in Java.
I am trying to use java.beans.XMLEncoder to create a String
containing the XML serialization of my object.
I am using the following code (probably clumsy):

  /**
  * serialize object to XML as String.
  */
  public String serialize(){
    ByteArrayOutputStream streamOut = new ByteArrayOutputStream();
    XMLEncoder xmlCreater = new XMLEncoder(streamOut);
    xmlCreater.writeObject(this);
    xmlCreater.close();
    return streamOut.toString();
  }

My problem is that some characters in the result are changed,
e.g. "PhÃ?nomene" instead of "Phänomene".

how could I solve this problem?
is there a less lengthy way to get the bean XML?

thanks

Peter
Dale King - 17 May 2006 16:50 GMT
> Hi,
>
[quoted text clipped - 19 lines]
> how could I solve this problem?
> is there a less lengthy way to get the bean XML?

Despite its appearing to be a textual format, XML really is a binary
format. It is not in general valid to convert it to a string. It has
internal information about character encodings, which can change from
one entity to another.

From the result you got it looks like the XML output is in UTF-8
encoding, which I see is what XMLEncoder is specified to produce.

Why do you think you need to convert it to a string? If it is just for
display for debug purposes then you can use streamOut.toString("UTF8"),
but once again you really should not in general convert the XML output
to a string. If you are saving the XML output or transmitting it then
the raw bytes are what should be used.

 Dale King

Peter Plumber - 17 May 2006 17:32 GMT
Thanks a lot for that info.
What should the function be like at best?

  /**
  * serialize object to XML as String.
  */
  public ByteArrayOutputStream serialize(){
    ByteArrayOutputStream streamOut = new ByteArrayOutputStream();
    XMLEncoder xmlCreater = new XMLEncoder(streamOut);
    xmlCreater.writeObject(this);
    xmlCreater.close();
    return streamOut;
  }

  /**
  * serialize object to XML as String.
  */
  public void serialize(OutputStream streamOut){
    XMLEncoder xmlCreater = new XMLEncoder(streamOut);
    xmlCreater.writeObject(this);
    xmlCreater.close();
  }

Or something completely different?

Thanks

Peter

Dale King wrote:

>> Hi,
>>
[quoted text clipped - 33 lines]
> to a string. If you are saving the XML output or transmitting it then
> the raw bytes are what should be used.
Dale King - 18 May 2006 17:58 GMT
> Thanks a lot for that info.
> What should the function be like at best?
[quoted text clipped - 20 lines]
>
> Or something completely different?

Besides removing the word string from the comments, I would change the
first method to:

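   public byte[] serialize()
   {
       ByteArrayOutputStream streamOut = new ByteArrayOutputStream();
       // serialize(OutputStream) closes the XMLEncoder, which also
       // flushes and closes streamOut
       serialize(streamOut);
       // no explicit streamOut.close() here: it has no effect on a
       // ByteArrayOutputStream, but it is declared to throw IOException
       // and would not compile without a try/catch or a throws clause
       return streamOut.toByteArray();
   }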

There are ways to actually convert the bytes of XML into a
string, but they are non-trivial. You would have to parse the XML data and
re-encode it using the correct encoding, changing the encoding
declarations as you go. But there really shouldn't be a need to do that.
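
As a rough sketch of the byte-oriented approach described above: the XML is kept as a byte[] from start to finish and handed straight to java.beans.XMLDecoder, so no charset conversion happens in user code. The class and method names (XmlBytesRoundTrip, toXmlBytes, fromXmlBytes) are made up for this illustration, and a plain String stands in for the bean from the original post.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical helper class, not part of the original thread.
class XmlBytesRoundTrip {

    /** Serializes an XMLEncoder-compatible object to raw XML bytes. */
    static byte[] toXmlBytes(Object value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        XMLEncoder encoder = new XMLEncoder(out);
        encoder.writeObject(value);
        encoder.close(); // writes the closing tag, flushes, closes 'out'
        return out.toByteArray();
    }

    /** Reads the first object back from raw XML bytes. */
    static Object fromXmlBytes(byte[] xml) {
        XMLDecoder decoder = new XMLDecoder(new ByteArrayInputStream(xml));
        try {
            return decoder.readObject();
        } finally {
            decoder.close();
        }
    }

    public static void main(String[] args) {
        String original = "Ph\u00e4nomene"; // \u00e4 is the German a-umlaut
        byte[] xml = toXmlBytes(original);
        Object restored = fromXmlBytes(xml);
        // The umlaut survives because the bytes were never decoded by hand.
        System.out.println(original.equals(restored));
    }
}
```

The same byte[] could be written to a file or a socket unchanged; the encoding declaration inside the XML tells the reader how to interpret it.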

 Dale King

Thomas Fritsch - 17 May 2006 18:04 GMT
> I am very much a beginner at programming in Java.
> I am trying to use java.beans.XMLEncoder for creating a String
[quoted text clipped - 9 lines]
>      xmlCreater.writeObject(this);
>      xmlCreater.close();
So far, so good. Your streamOut contains a byte[] array with the
XML-representation encoded in UTF-8. This is consistent with the first
generated XML line saying
 <?xml version="1.0" encoding="UTF-8"?>
Note that in UTF-8 a German 'ä' character ('\u00e4') is encoded as 2 bytes
(0xc3, 0xa4), not as 1 byte (0xe4) as you might expect.

>      return streamOut.toString();
According to the API doc of ByteArrayOutputStream#toString(), here you are
decoding the byte[] array to a String using the system's *default*
encoding, whatever that may be (and in your case it is definitely *not*
UTF-8).
What you really want is to decode the byte[] array to a String using the UTF-8
encoding (hence: exactly reverse the UTF-8 encoding done by the
XMLEncoder). That means you have to use
      return streamOut.toString("UTF-8");

This will convert the 2 bytes (0xc3, 0xa4) back to the 1 character 'ä'.
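
The effect can be reproduced without XMLEncoder. The following is only a sketch: the class and method names (Utf8DecodeDemo, utf8Buffer, decode) are invented for the illustration, and ISO-8859-1 is merely a stand-in for "some default encoding that is not UTF-8"; the actual default on the original poster's machine is not known. Note that toString(String) declares UnsupportedEncodingException, which the helper below wraps because both charsets used are guaranteed to exist on every Java platform.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;

// Hypothetical demo class, not part of the original thread.
class Utf8DecodeDemo {

    /** Fills a buffer with the UTF-8 bytes of the given text. */
    static ByteArrayOutputStream utf8Buffer(String text) {
        try {
            byte[] bytes = text.getBytes("UTF-8");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            out.write(bytes, 0, bytes.length);
            return out;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Decodes the buffered bytes using an explicitly named charset. */
    static String decode(ByteArrayOutputStream out, String charsetName) {
        try {
            return out.toString(charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "Ph\u00e4nomene";           // 9 characters
        ByteArrayOutputStream out = utf8Buffer(original);

        String right = decode(out, "UTF-8");          // reverses the encoding
        String wrong = decode(out, "ISO-8859-1");     // one char per byte

        System.out.println(out.size());               // 10
        System.out.println(right.equals(original));   // true
        System.out.println(wrong.length());           // 10
    }
}
```

The wrongly decoded string is one character longer than the original because the two bytes 0xc3 and 0xa4 each become a separate character instead of being combined into a single 'ä'.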
>    }
>
[quoted text clipped - 3 lines]
> how could I solve this problem?
> is there a less lengthy way to get the bean XML?

Thomas
