> I have some text data in a file I need to parse.
> .
[quoted text clipped - 33 lines]
> thanks
> lbrtchx
> // !! use "UTF8" for java.io classes
: Well, actually I had tried both "UTF8" and "UTF-8", and Java appears
to treat both as the same
.
> // !! your input file may not be UTF-8, actually ...
: This is the very first thing I checked, using KDE's Kate
.
> // !! aRdLn is/are discarded ...
: What do you mean? What I posted was an extract from my actual code
.
The problem I am having might be related to the BOM ("byte order
mark") under Linux/Knoppix, but I am not sure about it
.
I see there was a much-despised Sun bug that was declared "Closed,
will not be fixed"
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4508058
.
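If the BOM is indeed the culprit, one common workaround is to peek at the first three bytes of the stream and drop them when they are EF BB BF, before handing the stream to InputStreamReader. What follows is only a minimal sketch under that assumption; the class and method names (BomSkippingReader, bomLength, openUtf8, readAll) are made up for illustration and are not from the code posted earlier. It sticks to classes that already exist in Java 1.4.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;

// Hypothetical helper, not part of the original program.
public class BomSkippingReader {

    // Returns 3 if the first n bytes of head start with the UTF-8 BOM
    // (EF BB BF), otherwise 0.
    public static int bomLength(byte[] head, int n) {
        boolean hasBom = n >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        return hasBom ? 3 : 0;
    }

    // Wraps the stream in a UTF-8 reader, discarding a leading BOM if present.
    public static Reader openUtf8(InputStream in) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // read() may return fewer bytes than requested, so loop until
        // three bytes are available or the stream ends.
        while (n < 3) {
            int r = pin.read(head, n, 3 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        int skip = bomLength(head, n);
        // Push back everything that was read except the BOM itself.
        if (n > skip) {
            pin.unread(head, skip, n - skip);
        }
        return new InputStreamReader(pin, "UTF-8");
    }

    // Reads the whole reader into a String (for small inputs only).
    public static String readAll(Reader r) throws IOException {
        StringBuffer sb = new StringBuffer();
        int c;
        while ((c = r.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }
}
```

With something like this, the parser would call BomSkippingReader.openUtf8(new FileInputStream(...)) instead of constructing the InputStreamReader directly. A file without a BOM passes through unchanged.
.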
// __ I am using the following JVM
sh-3.1# java -version
java version "1.4.2_11"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_11-b06)
Java HotSpot(TM) Client VM (build 1.4.2_11-b06, mixed mode)
.
// __ Default encoding is "ANSI_X3.4-1968"
String aDefEnc = System.getProperty("file.encoding");
System.out.println("// __ aDefEnc=" + aDefEnc);
// __ aDefEnc=ANSI_X3.4-1968
.
// __ if I use -Dfile.encoding=UTF-8 as a JVM parameter
String aDefEnc = System.getProperty("file.encoding");
System.out.println("// __ aDefEnc=" + aDefEnc);
// __ aDefEnc=UTF-8
// __ OStrmRdr.getEncoding()=UTF8
.
// __ if I use aEnc="UTF-8";
sh-3.1# java k_killed08Test
// __ OStrmRdr.getEncoding()=UTF8
.
// __ if I use aEnc="UTF8";
sh-3.1# java k_killed08Test
// __ OStrmRdr.getEncoding()=UTF8
.
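The two runs above print the same name because "UTF8" is the historical java.io name for the charset whose canonical java.nio name is "UTF-8", and getEncoding() reports the historical one. A small sketch to confirm that both strings resolve to the same charset; the class name EncodingAliasCheck and its methods are hypothetical, everything else is standard java.io / java.nio.charset (available since 1.4):

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical helper, not part of the original program.
public class EncodingAliasCheck {

    // True if both names resolve to the same Charset object.
    public static boolean sameCharset(String a, String b) {
        return Charset.forName(a).equals(Charset.forName(b));
    }

    // The name java.io reports for a reader opened with the given encoding.
    public static String ioName(String enc) throws UnsupportedEncodingException {
        InputStreamReader r =
                new InputStreamReader(new ByteArrayInputStream(new byte[0]), enc);
        return r.getEncoding();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("// __ same charset? " + sameCharset("UTF8", "UTF-8"));
        System.out.println("// __ canonical name: " + Charset.forName("UTF8").name());
        System.out.println("// __ java.io name:   " + ioName("UTF-8"));
    }
}
```

So the choice between "UTF8" and "UTF-8" should make no difference to how the file is decoded.
.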
// __ if I use some nonsense string aEnc="8FTU";
java.io.UnsupportedEncodingException: 8FTU
at sun.io.Converters.getConverterClass(Converters.java:215)
at sun.io.Converters.newConverter(Converters.java:248)
at sun.io.CharToByteConverter.getConverter(CharToByteConverter.java:64)
at sun.nio.cs.StreamEncoder$ConverterSE.<init>(StreamEncoder.java:189)
at sun.nio.cs.StreamEncoder$ConverterSE.<init>(StreamEncoder.java:172)
at sun.nio.cs.StreamEncoder.forOutputStreamWriter(StreamEncoder.java:72)
at java.io.OutputStreamWriter.<init>(OutputStreamWriter.java:82)
at k_killed08Test.parse(k_killed08Test.java:54)
at k_killed08Test.main(k_killed08Test.java:26)
.
lbrtchx
hiwa - 24 Dec 2006 03:24 GMT
Does your document really have BOM?