Java Forum / General / September 2006

Encoding in file

Lukasz - 27 Sep 2006 09:24 GMT
Hi,

In my application I create some files and write some text into them. I
want to use UTF-8 encoding, but both methods that I tried seem to
ignore the specified encoding. I used:

OutputStream fout= new FileOutputStream(nazwa);
OutputStream bout= new BufferedOutputStream(fout);
OutputStreamWriter out = new OutputStreamWriter(bout, "UTF8");

and

BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new
FileOutputStream(nazwa),"UTF8"));

The problem seems simple, but it is often hard to find an answer
to the simplest questions.
Thomas Kellerer - 27 Sep 2006 09:41 GMT
Lukasz wrote on 27.09.2006 10:24:
> Hi,
>
> In my application I create some files and write some text into them. I
> want to use UTF-8 encoding, but both methods that I tried seem to
> ignore the specified encoding. I used:

Can you be more specific about what you mean by "seem to ignore"?

> BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new
> FileOutputStream(nazwa),"UTF8"));

This works for me; the only difference is that I use "UTF-8".
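
One way to check whether the encoding is really being ignored is to write a string and look at the bytes that land on disk. The following is only a sketch, assuming Java 7 or later (for `StandardCharsets` and try-with-resources); the class name `Utf8WriteDemo` and the helper `writeAndReadBack` are made up for illustration.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8WriteDemo {

    // Hypothetical helper: writes the text as UTF-8 and returns the raw file bytes.
    static byte[] writeAndReadBack(Path file, String text) throws IOException {
        try (Writer out = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(file.toFile()), StandardCharsets.UTF_8))) {
            out.write(text);
        } // closing the writer flushes the buffer to disk
        return Files.readAllBytes(file);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("utf8-demo", ".txt");
        String text = "He said \"hi\"";
        byte[] bytes = writeAndReadBack(file, text);

        // Decoding the bytes as UTF-8 should give back the original text.
        System.out.println(new String(bytes, StandardCharsets.UTF_8).equals(text)); // prints "true"
        Files.delete(file);
    }
}
```

If the decoded text matches, the writer is honouring the encoding; the quote character simply stays a quote character in the file.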

Thomas
Lukasz - 27 Sep 2006 10:44 GMT
Thomas Kellerer napisal(a):
> Lukasz wrote on 27.09.2006 10:24:
> > Hi,
[quoted text clipped - 11 lines]
>
> Thomas

In UTF-8, for example, the " sign should be replaced with ;quote (or
something like that). Neither of my methods does that.
Thomas Kellerer - 27 Sep 2006 11:01 GMT
Lukasz wrote on 27.09.2006 11:44:

> In UTF-8, for example " sign should be replaced with ;quote

Not at all!

What you are describing is HTML (or XML) "escaping".
That has nothing to do with the encoding of characters.

UTF-8 is an encoding that stores characters outside the 7-bit ASCII range as
a variable number of bytes. Some characters are encoded with one byte, some
with two, some with three or four.

The " sign is within the ASCII range and is encoded with one byte
(hex 22).
The Euro symbol (code point U+20AC), for example, is outside the ASCII range and
is encoded with three bytes in UTF-8 (E2 82 AC).
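
Printing the UTF-8 bytes of a few characters makes the variable width easy to see. This is a small sketch, assuming Java 7 or later; the class `Utf8Bytes` and its `hex` helper are illustrative names only. Note that 20 AC is the Euro sign's code point (U+20AC); its UTF-8 form is the three bytes E2 82 AC.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Bytes {

    // Hypothetical helper: returns the UTF-8 bytes of s as space-separated hex pairs.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex("\""));      // prints "22" (one byte, plain ASCII)
        System.out.println(hex("\u00E9"));  // prints "C3 A9" (e with acute accent, two bytes)
        System.out.println(hex("\u20AC"));  // prints "E2 82 AC" (Euro sign, three bytes)
    }
}
```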

Thomas
Lukasz - 27 Sep 2006 11:12 GMT
Thomas Kellerer napisal(a):
> Lukasz wrote on 27.09.2006 11:44:
> >
[quoted text clipped - 15 lines]
>
> Thomas

And what should I do to replace this " sign with :quote, and to apply
XML escaping to other signs as well?
Thomas Kellerer - 27 Sep 2006 11:32 GMT
Lukasz wrote on 27.09.2006 12:12:
> Thomas Kellerer napisal(a):
>> Lukasz wrote on 27.09.2006 11:44:
[quoted text clipped - 17 lines]
> And what should I do to replace this " sign with :quote, and to apply
> XML escaping to other signs as well?

There is no standard API (as far as I know). You'll have to roll your own. But
maybe the Jakarta site has something.
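
A roll-your-own version could look roughly like the sketch below. It replaces only the five characters that have predefined entities in XML; the class name `XmlEscaper` and the method `escapeXml` are invented for this example and are not part of the JDK. On the Jakarta side, Commons Lang has a `StringEscapeUtils` class with an `escapeXml` method that covers similar ground.

```java
public class XmlEscaper {

    // Hypothetical helper: replaces the five XML special characters with their entities.
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);        // everything else is copied unchanged
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeXml("a < b & \"c\"")); // prints: a &lt; b &amp; &quot;c&quot;
    }
}
```

The escaped string can then be written through the same UTF-8 `OutputStreamWriter` as before; escaping and encoding are separate steps.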

Thomas
Steve W. Jackson - 27 Sep 2006 15:57 GMT
> Lukasz wrote on 27.09.2006 12:12:
> > Thomas Kellerer napisal(a):
[quoted text clipped - 23 lines]
>
> Thomas

If the information being written is actually XML, it should be a
non-issue.  I've found that it's necessary to use the UTF-8 encoding
name on the OutputStreamWriter to ensure that the file itself gets that
encoding. The method used to serialize the XML must also be told to
use UTF-8, and it will then take care of this "escaping"
automatically.
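
As one possible illustration of this approach, the sketch below builds a tiny DOM document and serializes it with the JDK's `javax.xml.transform` API, setting the output encoding to UTF-8 and writing through a UTF-8 `OutputStreamWriter`. The class name `XmlUtf8Demo`, the method `serialize`, and the element name `note` are assumptions made for the example. A serializer escapes only what the context requires, so `&` and `<` in element text become entities, while a double quote in element text is usually left as it is.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class XmlUtf8Demo {

    // Hypothetical helper: wraps the text in a <note> element and serializes it as UTF-8 XML.
    static String serialize(String text) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("note");
        root.setTextContent(text); // raw text; the serializer does the escaping
        doc.appendChild(root);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Tell the serializer which encoding to declare and assume.
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            transformer.transform(new DOMSource(doc), new StreamResult(out));
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize("Tom & Jerry < 3"));
    }
}
```

In a real application the `ByteArrayOutputStream` would be replaced by a `FileOutputStream` for the target file.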

= Steve =

Steve W. Jackson
Montgomery, Alabama


