Hi,
We need help handling special characters when XML is first parsed with
SAX and that output is then converted into a DOM document.
This is what we do:
The input XML has all special characters, such as the ampersand, replaced
with the correct entity references: &amp;
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser parser = factory.newSAXParser();
parser.parse( new File( FileWithXml ), handler );
The handler saves all the parsed XML into a string in a particular
format. In the parsed XML, the &amp; gets converted into &
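The conversion described above is standard SAX behaviour: the parser resolves the predefined entities before calling characters(). A minimal sketch that shows this, assuming a handler that simply appends whatever characters() delivers; the class SaxEntityDemo and the helpers collectText and escape are hypothetical names, not part of the original code:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEntityDemo {

    // Hypothetical helper: collects character data exactly as SAX reports it.
    static String collectText(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one text node, so append each chunk.
                text.append(ch, start, length);
            }
        });
        return text.toString();
    }

    // Hypothetical helper: re-escapes the characters that are significant in text content.
    // The ampersand must be replaced first, otherwise the other replacements get escaped twice.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) throws Exception {
        String text = collectText("<a>Tom &amp; Jerry</a>");
        System.out.println(text);          // the entity arrives already decoded
        System.out.println(escape(text));  // escaped again, safe to embed in XML
    }
}
```

If the handler applied something like escape() to each chunk as it appends, the saved string would presumably stay well-formed without a separate replacement pass afterwards.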
String parsedString = parsedXml.toString();
parsedString needs to be converted into a document:
// renamed from "factory", which is already declared above as a SAXParserFactory
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
document = domFactory.newDocumentBuilder().parse(
    new InputSource(new StringReader(parsedString)));
But due to the presence of the bare &, we cannot convert to a document
unless & is again replaced with &amp;
Is there a way to retain special characters the first time around, so
we don't have to replace all occurrences again before converting to a
document? Can a custom entity reference handler be used for anything
like this?
Thanks for any help
Rohit
Greg R. Broderick - 05 Apr 2007 14:47 GMT
Piper707@hotmail.com wrote in news:1175728257.299882.68170@p77g2000hsh.googlegroups.com:
> Is there a way to retain special characters the first time around, so
> we don't have to replace all occurrences again before converting to a
> document? Can a custom entity reference handler be used for anything
> like this?
I'd recommend not attempting to re-use the parsed XML (parsed by the SAX
parser) as input to the DOM parser. Instead, just create a new InputSource
from the input file and use that to feed the DOM parser.
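
A minimal sketch of that approach, assuming the DOM is only needed for the original file contents; the class name DomFromFile and the parse helper are made up for illustration:

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DomFromFile {

    // Hypothetical helper: builds a DOM straight from the file, bypassing the SAX output string.
    static Document parse(File xmlFile) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // An InputSource created from a system ID (here the file's URI) lets the parser open the file itself.
        InputSource source = new InputSource(xmlFile.toURI().toASCIIString());
        return builder.parse(source);
    }

    public static void main(String[] args) throws Exception {
        Document document = parse(new File(args[0]));
        // Entities such as &amp; are resolved by the DOM parser; no manual replacement is needed.
        System.out.println(document.getDocumentElement().getTextContent());
    }
}
```

This only helps if the DOM does not depend on the reformatting done by the SAX handler; if it does, the handler has to emit well-formed XML itself.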
Cheers
GRB

---------------------------------------------------------------------
Greg R. Broderick gregb+usenet200612@blackholio.dyndns.org
A. Top posters.
Q. What is the most annoying thing on Usenet?
---------------------------------------------------------------------