I'm trying to parse a file using the StreamTokenizer class.
The file has strings like "Hello\u1234There".
StreamTokenizer returns this as the string "Hellou1234There".
I understand this class is not Unicode-capable, but are there any hints
as to how this could be parsed properly?
I thought about treating any 'u' followed by 4 hex digits as an escape,
but that would not work because I've got some strings where
'u' + hex digits is legitimately part of the string.
> I'm trying to parse a file using the StreamTokenizer class.
> The file has strings like "Hello\u1234There".
> I thought about treating any 'u' followed by 4 hex digits as an escape,
> but that would not work because I've got some strings where
> 'u' + hex digits is legitimately part of the string.
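The \uXXXX escape (backslash, u, four hex digits) is handled by the
Java compiler when it processes source code, including string literals.
It is not applied to data read at runtime. StreamTokenizer only knows a
few C-style escapes inside quoted strings, so for \u it drops the
backslash and keeps the 'u'.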
Since you are reading from a stream, how about wrapping it in a filter
(a FilterReader subclass, say) that does any preprocessing you want
before StreamTokenizer sees the data?
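
One way such a filter might look is sketched below. This is only an illustration under a few assumptions: the class names UnicodeEscapeReader and UnicodeEscapeDemo and the helper firstString are made up for this example, only a backslash immediately followed by 'u' and four hex digits is decoded, and a doubled backslash is not treated specially. A bare 'u' + hex digits with no backslash in front is left alone, which is the case you were worried about.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;

public class UnicodeEscapeDemo {

    /** Hypothetical filter: decodes backslash + 'u' + 4 hex digits into one char. */
    static class UnicodeEscapeReader extends FilterReader {
        private final PushbackReader src;

        UnicodeEscapeReader(Reader in) {
            // Room to push back 'u' plus up to four following characters.
            super(new PushbackReader(in, 5));
            src = (PushbackReader) this.in;
        }

        @Override
        public int read() throws IOException {
            int c = src.read();
            if (c != '\\') {
                return c;
            }
            int u = src.read();
            if (u != 'u') {
                if (u != -1) {
                    src.unread(u);
                }
                return c; // some other escape: leave it for the tokenizer
            }
            char[] buf = new char[5];
            buf[0] = 'u';
            int n = 1;
            boolean ok = true;
            while (n < 5) {
                int h = src.read();
                if (h == -1) {
                    ok = false;
                    break;
                }
                buf[n++] = (char) h;
                if (Character.digit((char) h, 16) < 0) {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                return Integer.parseInt(new String(buf, 1, 4), 16);
            }
            // Not a well-formed escape: put everything back, emit the backslash.
            src.unread(buf, 0, n);
            return c;
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            // Route bulk reads through read() so they are decoded as well.
            int i = 0;
            for (; i < len; i++) {
                int c = read();
                if (c == -1) {
                    break;
                }
                cbuf[off + i] = (char) c;
            }
            return (i == 0 && len > 0) ? -1 : i;
        }
    }

    /** Returns the string value of the first token in the given text. */
    static String firstString(String text) throws IOException {
        Reader r = new UnicodeEscapeReader(new StringReader(text));
        StreamTokenizer st = new StreamTokenizer(r);
        st.nextToken();
        return st.sval;
    }

    public static void main(String[] args) throws IOException {
        // The Java literal below holds: "Hello" backslash u1234 "There", in quotes.
        String s = firstString("\"Hello\\u1234There\"");
        System.out.println(s.length());                       // prints 11
        System.out.println(Integer.toHexString(s.charAt(5))); // prints 1234
    }
}
```

For a real file you would pass a FileReader (or an InputStreamReader with the right charset) in place of the StringReader. The decoding happens before StreamTokenizer sees the backslash, so its own escape handling never gets a chance to strip it.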
Patricia