
Java Forum / First Aid / February 2006

Peculiar issue with French characters

sumitra@gmail.com - 30 Jan 2006 11:47 GMT
Hello All,

I need to print out French characters
(ççÇÇààÀÀèèÈÈééÉÉ) in a PDF file by running my code on
Unix. I'm using iText to create the PDF. The configurations in iText
for the fonts include BaseFont.IDENTITY_H for encoding and
BaseFont.EMBEDDED.

The PDF encoding I have given is:
/BaseFont /Courier /Encoding /WinAnsiEncoding

which generates the PDFs with the French text fine on Windows. Should I
be changing this??

The problem is that with these parameters, on Unix, all I get is
garbled text in my pdf doc.

Compiling with -encoding ISO-8859-1 does not help because these French
values are picked up at run time from a Hashtable. I have checked the
Hashtable contents and they look good.

My code uses a lot of StringWriter() and I would like to know if I need
to explicitly set the encoding here to "8859_1" and if so, how?? I've
tried the ByteArrayOutputStream approach to replace the StringWriter
and wrapped that in OutputStreamWriter with the encoding 8859_1. That
did not help.

I also tried the getBytes() method of StringWriter and tried to convert
it to another encoding, but that did not help either!!

I really am at a loss now as to how to resolve my problem.
If anyone out there has an idea do let me know please!
Thanks in advance.

--Sum
opalpa@gmail.com opalinski from opalpaweb - 30 Jan 2006 12:51 GMT
What happens when you generate the pdf on unix and view it on windows?

Opalinski
opalpa@gmail.com
http://www.geocities.com/opalpaweb/
Sum - 31 Jan 2006 04:34 GMT
When I generate the pdf on Unix and view it on Windows, I see only
garbled text.
Thomas Hawtin - 30 Jan 2006 16:41 GMT
> My code uses a lot of StringWriter() and I would like to know if I need
> to explicitly set the encoding here to "8859_1" and if so, how?? I've
[quoted text clipped - 4 lines]
> I also tried the getBytes() method of StringWriter and tried to convert
> it to another encoding, but that did not help either!!

Character encoding matters at the point where you encode characters as
bytes (or, going the other way, decode bytes as characters).

Lots of APIs confuse the matter by picking the encoding up from the
system defaults. So code may work on one setup, but not on another. To
get around a fatal bug in Adobe Acrobat Reader I had to change
encodings, meaning I could get different results depending upon which
window/tab I launched an application from.

FileWriter doesn't support character encodings, so don't use that class.
OutputStreamWriter has constructors to take character encodings, and one
which doesn't (so don't use that one). StringWriter.getBytes does not
exist. String has various methods which may depend upon the configured
encoding, a specified encoding or just chopping the top byte off each
character (including surrogates).

Tom Hawtin
Unemployed English Java programmer
http://jroller.com/page/tackline/
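
A minimal sketch of the advice above, assuming ISO-8859-1 is the byte encoding wanted downstream (the thread never confirms which encoding the PDF step expects). A StringWriter only accumulates characters, so it has no encoding to set; the encoding is chosen where characters become bytes, here in the OutputStreamWriter constructor. The class and method names are made up for illustration. StandardCharsets needs Java 7 or later; on the JVMs of 2006 the charset would be passed by name, e.g. "ISO-8859-1". The FileWriter remark also reflects its date: since Java 11 FileWriter has constructors that take a Charset.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of iText or of the poster's code.
class ExplicitEncoding {

    // Encode text to bytes through a Writer whose charset is named explicitly,
    // so the result does not depend on the platform default.
    static byte[] encodeLatin1(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, StandardCharsets.ISO_8859_1)) {
            out.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decode with the same explicit charset that was used for encoding.
    static String decodeLatin1(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file independent of the compiler's -encoding.
        String french = "\u00e7\u00c7\u00e0\u00c0\u00e8\u00c8\u00e9\u00c9";
        byte[] encoded = encodeLatin1(french);
        System.out.println("bytes: " + encoded.length
                + ", round trip ok: " + decodeLatin1(encoded).equals(french));
    }
}
```

The same rule applies to String.getBytes: the overload that takes a charset gives the same bytes on every machine, while the no-argument overload uses the platform default.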

Sum - 31 Jan 2006 04:41 GMT
My bad, I meant the String.getBytes() method and not
StringWriter.getBytes(), which as you rightly pointed out, does not
exist.

What I noticed while running my app on Unix was that the French string
being returned to my program was:

ççÃÃà à ÃÃèèÃÃééÃÃ

whereas I expected to see:

ççÇÇààÀÀèèÈÈééÉÉ

This does not happen on Windows. Also, I actually compile my code on
Windows, and put the tarball onto Unix.
What do you suppose is happening now??
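
Runs of "Ã" like those in the garbled string are typical of text that was encoded as UTF-8 and then decoded with a single-byte charset such as ISO-8859-1. The sketch below reproduces that effect; it is an illustration of one common cause, not a diagnosis, since the thread does not establish which encodings were actually involved. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing how an encode/decode mismatch garbles text.
class MojibakeDemo {

    // Encode as UTF-8, then wrongly decode the bytes as ISO-8859-1.
    // Each accented letter becomes two characters, the first being U+00C3.
    static String misdecodeUtf8AsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverse the mistake: recover the original bytes, then decode them as UTF-8.
    // This works because ISO-8859-1 maps every byte value to exactly one character.
    static String repair(String garbled) {
        byte[] original = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String french = "\u00e7\u00c7\u00e0\u00c0\u00e8\u00c8\u00e9\u00c9";
        String garbled = misdecodeUtf8AsLatin1(french);
        System.out.println("length before: " + french.length()
                + ", after: " + garbled.length());
        System.out.println("repaired equals original: " + repair(garbled).equals(french));
    }
}
```

For the upper-case letters the second character of each pair is an invisible control character, which would explain a bare "Ã" appearing on screen.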
Roedy Green - 31 Jan 2006 04:49 GMT
>This does not happen on Windows. Also, I actually compile my code on
>Windows, and put the tarball onto Unix.
>What do you suppose is happening now??

There is an implied default encoding used for any byte <=> String
conversion that does not name one.  See http://mindprod.com/jgloss/encoding.html
Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.
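
To see which default a given JVM has picked up, something like the following can be run on both machines and the output compared. It is only a sketch with a hypothetical class name. On the JVMs of this era the default was derived from the locale environment at startup; from Java 18 onward the default charset is UTF-8 regardless of locale.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic class: reports the encoding-related settings of this JVM.
class DefaultEncodingReport {

    // Build a one-line summary of the default charset and the settings behind it.
    static String describe() {
        return "default charset=" + Charset.defaultCharset().name()
                + ", file.encoding=" + System.getProperty("file.encoding")
                + ", LANG=" + System.getenv("LANG");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```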

sumitra@gmail.com - 06 Feb 2006 04:47 GMT
Figured it out. The one thing that I did not do was to start the
application (in Unix) from the same session where I had set LANG to
fr_FR. I assumed that setting LANG=fr_FR would take effect across the
whole environment; however, it turned out to apply only to that telnet
session!

Thanks for the help everyone.  :-D
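
Because the fix depended on which session launched the JVM, a start-up check can catch the problem early. The sketch below, with hypothetical names, tests whether a charset can represent the French characters from the original post and warns if the platform default cannot. It is a guard under those assumptions, not a substitute for naming the encoding explicitly wherever bytes are produced.

```java
import java.nio.charset.Charset;

// Hypothetical start-up guard: warns when the default charset cannot encode French text.
class FrenchEncodingCheck {

    // The characters from the original post, written as escapes so the
    // source file itself does not depend on any particular encoding.
    static final String FRENCH_SAMPLE = "\u00e7\u00c7\u00e0\u00c0\u00e8\u00c8\u00e9\u00c9";

    // True if the charset supports encoding and can represent every sample character.
    static boolean canEncodeFrench(Charset charset) {
        return charset.canEncode() && charset.newEncoder().canEncode(FRENCH_SAMPLE);
    }

    public static void main(String[] args) {
        Charset current = Charset.defaultCharset();
        if (canEncodeFrench(current)) {
            System.out.println("Default charset " + current.name() + " can encode the sample.");
        } else {
            System.err.println("Warning: default charset " + current.name()
                    + " cannot encode accented French characters; LANG="
                    + System.getenv("LANG"));
        }
    }
}
```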

