
Java Forum / General / July 2007


How to write Unicode

Stefan Ram - 05 Jul 2007 00:47 GMT
When writing into a Unicode text file, given that the Stream
 encoding was set to »UTF-8«, what is the proper, best or
 canonical way to terminate a line?

 Some possibilities are given on the following lines.

printStream.printf( "\n" );
printStream.printf( "%n" );
printStream.print(( char )0x000A  );
printStream.print(( char )0x000D  );
printStream.print(( char )0x000D  ); printStream.print(( char )0x000A  );
printStream.print(( char )0x0085  ); // 0x0085 is Unicode »NEL - next line«
printStream.print(( char )0x2028  ); // 0x2028 is Unicode »line separator«
Arne Vajhøj - 05 Jul 2007 00:59 GMT
>   When writing into a Unicode text file, given that the Stream
>   encoding was set to »UTF-8«, what is the proper, best or
[quoted text clipped - 9 lines]
> printStream.print(( char )0x0085  ); // 0x0085 is Unicode »NEL - next line«
> printStream.print(( char )0x2028  ); // 0x2028 is Unicode »line separator«

For a disk file in UTF-8, I cannot really see any reason not to use
System.getProperty("line.separator").
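
A minimal sketch of that suggestion, not code from the thread: each line is terminated with the value of the "line.separator" property and encoded through a Writer configured for UTF-8. The class name LineSeparatorSketch and the method encodeLines are made up for illustration, and a ByteArrayOutputStream stands in for the disk file so the example stays self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named for this example only.
class LineSeparatorSketch {

    // Encodes each line as UTF-8 and terminates it with the platform line separator.
    static byte[] encodeLines(String... lines) {
        String eol = System.getProperty("line.separator");
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // The Writer, not the underlying stream, is what carries the encoding.
        try (Writer out = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            for (String line : lines) {
                out.write(line);
                out.write(eol);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = encodeLines("Gr\u00fc\u00dfe", "second line");
        System.out.println(data.length + " bytes written");
    }
}
```

For a real file, the ByteArrayOutputStream could presumably be swapped for a FileOutputStream without changing the rest.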

Arne
Rob - 12 Jul 2007 15:39 GMT
> >   When writing into a Unicode text file, given that the Stream
> >   encoding was set to »UTF-8«, what is the proper, best or
[quoted text clipped - 14 lines]
>
> Arne

If you're trying to get from a Java String to UTF-8 bytes, you could
try using String.getBytes("UTF-8"). The JDK will take care of
converting for you. If your Java String contains \n I'd expect it to
be converted to UTF-8 properly. Once you have the byte array you can
write the bytes directly to the file.
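
A rough sketch of this approach, under a few assumptions: it uses the getBytes(Charset) overload with StandardCharsets.UTF_8 instead of getBytes("UTF-8"), which avoids the checked UnsupportedEncodingException, and it writes to a temporary file created just for the demonstration. The class name GetBytesSketch and the method toUtf8 are illustrative only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, named for this example only.
class GetBytesSketch {

    // Converts a Java String to its UTF-8 byte representation.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // "\n" is hard-coded here; it is not the platform line separator everywhere.
        byte[] data = toUtf8("caf\u00e9\n");
        Path file = Files.createTempFile("utf8-demo", ".txt");
        Files.write(file, data);
        System.out.println("Wrote " + data.length + " bytes to " + file);
    }
}
```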
Arne Vajhøj - 14 Jul 2007 20:24 GMT
>>>   When writing into a Unicode text file, given that the Stream
>>>   encoding was set to »UTF-8«, what is the proper, best or
[quoted text clipped - 15 lines]
> be converted to UTF-8 properly. Once you have the byte array you can
> write the bytes directly to the file.

1)  \n is the line separator on Unix/Linux; it is not the line separator
    on all platforms.

2)  \n and \r are encoded as the same bytes in ASCII, ISO-8859-1, UTF-8, etc.
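
Point 2 can be checked with a small sketch like the one below. It is not from the thread; the class name NewlineBytesSketch and the helper hex are invented for illustration. It prints the bytes each charset produces for \n and \r, and for contrast the UTF-8 bytes of the two non-ASCII candidates from the original question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named for this example only.
class NewlineBytesSketch {

    // Returns the encoded bytes of text as space-separated two-digit hex values.
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset[] charsets = {
            StandardCharsets.US_ASCII, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8
        };
        for (Charset cs : charsets) {
            System.out.println(cs + ": LF=" + hex("\n", cs) + " CR=" + hex("\r", cs));
        }
        // The non-ASCII terminators take more than one byte in UTF-8.
        System.out.println("NEL in UTF-8: " + hex("\u0085", StandardCharsets.UTF_8));
        System.out.println("LS  in UTF-8: " + hex("\u2028", StandardCharsets.UTF_8));
    }
}
```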

Arne
Thomas Fritsch - 05 Jul 2007 08:52 GMT
Stefan Ram wrote:
>   When writing into a Unicode text file, given that the Stream
>   encoding was set to »UTF-8«, what is the proper, best or
[quoted text clipped - 9 lines]
> printStream.print(( char )0x0085  ); // 0x0085 is Unicode »NEL - next line«
> printStream.print(( char )0x2028  ); // 0x2028 is Unicode »line separator«

Not to forget
  printStream.println();


Thomas

Lew - 07 Jul 2007 16:17 GMT
Stefan Ram wrote:
>>   When writing into a Unicode text file, given that the Stream
>>   encoding was set to »UTF-8«, what is the proper, best or
[quoted text clipped - 10 lines]
>> printStream.print(( char )0x2028  ); // 0x2028 is Unicode »line
>> separator«

> Not to forget
>   printStream.println();

Lest we forget:
> All characters printed by a PrintStream are converted into bytes using the platform's default character encoding. The PrintWriter class should be used in situations that require writing characters rather than bytes.

Assuming that your variable "printStream" is of type "PrintStream", which you
did not aver.

I get a cringe seeing "the Stream encoding was set" - Java IO Streams don't
have encodings.  The PrintStream methods use encodings, but the Stream doesn't.

To answer your question, printf()'s "%n" specifies "the platform-specific line
separator", but that has nothing to do with encodings.
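
To make the two points concrete, here is a hedged sketch (the class name PrintStreamEncodingSketch and the method printLine are invented for this example). It assumes the variable in the question is a java.io.PrintStream and passes an explicit encoding name to the constructor rather than relying on the platform default; "%n" then supplies the platform line separator regardless of which encoding was chosen.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper class, named for this example only.
class PrintStreamEncodingSketch {

    // Prints one line through a PrintStream that encodes characters as UTF-8.
    static byte[] printLine(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // The third constructor argument names the encoding used for characters.
        try (PrintStream out = new PrintStream(bytes, true, "UTF-8")) {
            out.printf("%s%n", text); // %n expands to the platform line separator
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should be available on every JVM", e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = printLine("\u00bbUTF-8\u00ab");
        System.out.println(data.length + " bytes, including the line terminator");
    }
}
```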


Lew

Roedy Green - 12 Jul 2007 04:16 GMT
>I get a cringe seeing "the Stream encoding was set" - Java IO Streams don't
>have encodings.  The PrintStream methods use encodings, but the Stream doesn't.

When you are playing with encodings you use a Reader/Writer.

see http://mindprod.com/applet/fileio.html
for sample code.
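
As a rough illustration of the Reader/Writer approach (this is not the sample code from the linked page; the class name ReaderWriterSketch and both method names are made up), the sketch below writes lines through an OutputStreamWriter set to UTF-8, using BufferedWriter.newLine() for the platform line separator, and reads them back with a matching InputStreamReader.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, named for this example only.
class ReaderWriterSketch {

    // Writes each line as UTF-8, terminated by the platform line separator.
    static byte[] writeLines(List<String> lines) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(bytes, StandardCharsets.UTF_8))) {
            for (String line : lines) {
                out.write(line);
                out.newLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decodes UTF-8 bytes and splits them into lines; readLine accepts \n, \r, or \r\n.
    static List<String> readLines(byte[] data) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> original = Arrays.asList("first", "zw\u00f6lf");
        System.out.println(readLines(writeLines(original)).equals(original));
    }
}
```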
--
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

