Hi,
I am a complete beginner at Java programming.
I am trying to use java.beans.XMLEncoder for creating a String
containing the XML serialization of my object.
I am using the following code (probably clumsy code)
/**
 * serialize object to XML as String.
 */
public String serialize(){
    ByteArrayOutputStream streamOut = new ByteArrayOutputStream();
    XMLEncoder xmlCreater = new XMLEncoder(streamOut);
    xmlCreater.writeObject(this);
    xmlCreater.close();
    return streamOut.toString();
}
My problem is that some characters in the result are changed,
e.g. "PhÃ?nomene" instead of "Phänomene".
How can I solve this problem?
Is there a less lengthy way to get the bean XML?
Thanks,
Peter
Dale King - 17 May 2006 16:50 GMT
> Hi,
>
[quoted text clipped - 19 lines]
> how could I solve this problem?
> is there a less lengthy way to get the bean XML?
Although it looks like a textual format, XML is really a binary
format. In general it is not valid to convert it to a string, because
the document carries its own information about character encodings,
which can change from one entity to another.
From the result you got it looks like the XML output is in UTF-8
encoding, which I see is what XMLEncoder is specified to produce.
Why do you think you need to convert it to a string? If it is just for
display while debugging, you can use streamOut.toString("UTF-8"),
but once again, in general you really should not convert the XML output
to a string. If you are saving the XML output or transmitting it, then
the raw bytes are what should be used.
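As a rough sketch of the "raw bytes" advice, the encoder can write straight to the destination stream so that no String is ever involved. The class name SaveXml, the method saveAsXml and the Path parameter are made up for illustration, and the sketch assumes Java 7 or later (java.nio.file and try-with-resources).

```java
import java.beans.XMLEncoder;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class SaveXml {
    /**
     * Writes the XML form of the given bean directly to a file.
     * The bytes produced by XMLEncoder (UTF-8) reach the file unchanged.
     */
    static void saveAsXml(Object bean, Path target) throws IOException {
        // Closing the encoder writes the closing tag and closes the file stream.
        try (XMLEncoder encoder = new XMLEncoder(Files.newOutputStream(target))) {
            encoder.writeObject(bean);
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("bean", ".xml");
        saveAsXml("Ph\u00e4nomene", target);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

The same pattern should work for a socket or servlet output stream: hand the stream to XMLEncoder and let the receiver read the encoding from the XML declaration.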

Dale King
Peter Plumber - 17 May 2006 17:32 GMT
Thanks a lot for that info.
What would the method ideally look like?
/**
 * serialize object to XML as String.
 */
public ByteArrayOutputStream serialize(){
    ByteArrayOutputStream streamOut = new ByteArrayOutputStream();
    XMLEncoder xmlCreater = new XMLEncoder(streamOut);
    xmlCreater.writeObject(this);
    xmlCreater.close();
    return streamOut;
}
/**
 * serialize object to XML as String.
 */
public void serialize(OutputStream streamOut){
    XMLEncoder xmlCreater = new XMLEncoder(streamOut);
    xmlCreater.writeObject(this);
    xmlCreater.close();
}
Or something completely different?
Thanks,
Peter
Dale King wrote:
>> Hi,
>>
[quoted text clipped - 33 lines]
> to a string. If you are saving the XML output or transmitting it then
> the raw bytes are what should be used.
Dale King - 18 May 2006 17:58 GMT
> Thanks a lot for that info.
> What should the function be like at best?
[quoted text clipped - 20 lines]
>
> sth completely different?
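Besides removing the word "string" from the comments, I would change the
first method to:
public byte[] serialize()
{
    ByteArrayOutputStream streamOut = new ByteArrayOutputStream();
    // XMLEncoder.close() inside serialize(OutputStream) already closes the
    // stream; an extra streamOut.close() would also require handling IOException.
    serialize(streamOut);
    return streamOut.toByteArray();
}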
There are ways to actually convert the bytes of XML into a
string, but they are non-trivial. You would have to parse the XML data and
re-encode it using the correct encoding, changing the encoding
declarations as you go. But there really shouldn't be a need to do that.
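Putting the two methods together, a self-contained sketch could look like the following. It is written with static helpers that take the bean as a parameter instead of using `this`; the class name XmlBytes and the deserialize helper are assumptions added for illustration, not part of the code above.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;

class XmlBytes {
    /** Serializes the bean as XML onto the given stream and closes it. */
    static void serialize(Object bean, OutputStream streamOut) {
        XMLEncoder encoder = new XMLEncoder(streamOut);
        encoder.writeObject(bean);
        encoder.close(); // flushes, writes the closing tag, closes streamOut
    }

    /** Serializes the bean as XML and returns the raw bytes. */
    static byte[] serialize(Object bean) {
        ByteArrayOutputStream streamOut = new ByteArrayOutputStream();
        serialize(bean, streamOut);
        return streamOut.toByteArray();
    }

    /** Reads one object back from bytes produced by serialize(Object). */
    static Object deserialize(byte[] xml) {
        XMLDecoder decoder = new XMLDecoder(new ByteArrayInputStream(xml));
        try {
            return decoder.readObject();
        } finally {
            decoder.close();
        }
    }

    public static void main(String[] args) {
        byte[] xml = serialize("Ph\u00e4nomene");
        // The bytes go straight back into a decoder; no String conversion needed.
        System.out.println(deserialize(xml));
    }
}
```

Because the bytes are handed to XMLDecoder untouched, the umlaut survives the round trip regardless of the platform's default encoding.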

Dale King
Thomas Fritsch - 17 May 2006 18:04 GMT
> I am a very beginner with programming java.
> I am trying to use java.beans.XMLEncoder for creating a String
[quoted text clipped - 9 lines]
> xmlCreater.writeObject(this);
> xmlCreater.close();
So far, so good. Your streamOut contains a byte[] array with the
XML representation encoded in UTF-8. This is consistent with the first
generated XML line, which says
<?xml version="1.0" encoding="UTF-8"?>
Note that in UTF-8 a German 'ä' character ('\u00e4') is encoded as 2 bytes
(0xc3, 0xa4), not as 1 byte (0xe4) as you might expect.
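The byte values are easy to check yourself. This small sketch prints the bytes of 'ä' in UTF-8 and, for comparison, in ISO-8859-1; the class and method names are made up, and StandardCharsets assumes Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Utf8Bytes {
    /** Encodes the text in the given charset and renders each byte as two hex digits. */
    static String hexBytes(String text, Charset charset) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        String umlaut = "\u00e4"; // the character 'ä'
        System.out.println(hexBytes(umlaut, StandardCharsets.UTF_8));      // c3 a4
        System.out.println(hexBytes(umlaut, StandardCharsets.ISO_8859_1)); // e4
    }
}
```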
> return streamOut.toString();
According to the API doc of ByteArrayOutputStream#toString(), here you are
decoding the byte[] array to a String using the system's *default*
encoding, whatever that may be (and in your case it is definitely *not*
UTF-8).
What you really want is to decode the byte[] array to a String using the
UTF-8 encoding (that is, to exactly reverse the UTF-8 encoding done by the
XMLEncoder). That means you have to use
return streamOut.toString("UTF-8");
This will convert the 2 bytes (0xc3, 0xa4) back to the 1 character 'ä'.
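To see the difference side by side, here is a small sketch that decodes the same XMLEncoder output once as UTF-8 and once as ISO-8859-1. ISO-8859-1 only stands in for "some non-UTF-8 default encoding"; your system's actual default may differ, and the class and method names are made up for illustration.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;

class DebugXml {
    /** Runs XMLEncoder on the bean and returns the filled buffer. */
    private static ByteArrayOutputStream encode(Object bean) {
        ByteArrayOutputStream streamOut = new ByteArrayOutputStream();
        XMLEncoder encoder = new XMLEncoder(streamOut);
        encoder.writeObject(bean);
        encoder.close();
        return streamOut;
    }

    /** Decodes the XML bytes with the named charset. */
    private static String decode(Object bean, String charsetName) {
        try {
            return encode(bean).toString(charsetName);
        } catch (UnsupportedEncodingException e) {
            // Both charsets used below are required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Correct: reverses the UTF-8 encoding done by XMLEncoder. */
    static String toDebugString(Object bean) {
        return decode(bean, "UTF-8");
    }

    /** Wrong on purpose: imitates a default encoding that is not UTF-8. */
    static String toGarbledString(Object bean) {
        return decode(bean, "ISO-8859-1");
    }

    public static void main(String[] args) {
        System.out.println(toDebugString("Ph\u00e4nomene"));
        System.out.println(toGarbledString("Ph\u00e4nomene"));
    }
}
```

In the second output each of the two UTF-8 bytes of 'ä' is turned into a character of its own, which is the kind of garbling shown in the original question.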
> }
>
[quoted text clipped - 3 lines]
> how could I solve this problem?
> is there a less lengthy way to get the bean XML?

Thomas