Handling Whitespace in Java DOM

Jason Cavett - 12 Dec 2007 15:46 GMT
Before I get flamed, I have already read on how to ignore whitespace
in an XML document via the Java DOM.  However, according to the
following thread, it is currently not working as intended.  (See
http://forums.java.net/jive/thread.jspa?messageID=226957 for the
thread and http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6564400
for the bug report.)

Since I can't seem to use DocumentBuilderFactory's
setIgnoringElementContentWhitespace, I am wondering if there is
another way to handle whitespace in an XML file.  When I parse the XML
file and get the children, the text nodes look something like this...

name: #text
data: (all whitespace - spaces, \n, etc.)

Now, I suppose I could just check to see if the name is #text and
ignore it as I loop through the nodes, but that seems kind of crummy
to do that.  Is there a better way that I'm not seeing?
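
The skip-as-you-loop idea can be sketched as a small helper that hands back only the element children, so the whitespace-only #text nodes never reach the calling loop. This is a minimal sketch, not part of the DOM API: the class name ElementChildren and the method elementChildren are made up for illustration, and it assumes the caller only cares about child elements, not mixed text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ElementChildren {
    // Hypothetical helper (not a DOM API): returns only the element children
    // of a node, filtering on node type instead of comparing names to "#text".
    public static List<Element> elementChildren(Node parent) {
        List<Element> result = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) c);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<all>\n  <one>A</one>\n  <one>BB</one>\n  <one>CCC</one>\n</all>\n";
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        // Only the three <one> elements are visited; the indentation text is skipped.
        for (Element e : elementChildren(doc.getDocumentElement())) {
            System.out.println(e.getTagName() + "=" + e.getTextContent());
        }
    }
}
```

Checking getNodeType() rather than the node name keeps the filter in one place and also skips comments and processing instructions.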

Thanks
Arne Vajhøj - 15 Dec 2007 23:12 GMT
> Before I get flamed, I have already read on how to ignore whitespace
> in an XML document via the Java DOM.  However, according to the
[quoted text clipped - 14 lines]
> ignore it as I loop through the nodes, but that seems kind of crummy
> to do that.  Is there a better way that I'm not seeing?

We had this question here back in September.

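My conclusion was that you need a DTD to get it working. The parser
can only treat whitespace as ignorable when the DTD declares
element-only content for the enclosing element.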

See code below.

Arne

===========================================

import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.TreeWalker;
import org.xml.sax.InputSource;

public class XMLandWS {
    public static void parse(String xml) throws Exception {
        System.out.print(xml);
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setIgnoringElementContentWhitespace(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        TreeWalker walk = ((DocumentTraversal) doc).createTreeWalker(
                doc.getDocumentElement(), NodeFilter.SHOW_TEXT, null, false);
        Node n;
        while ((n = walk.nextNode()) != null) {
            System.out.println("=" + n.getNodeValue()
                    .replace("\n", "\\n").replace(" ", "_"));
        }
    }
    public static void main(String[] args) throws Exception {
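        // Case 1: no DTD, so the parser cannot know which whitespace is
        // ignorable; the whitespace-only text nodes are expected to remain.
        parse("<all>\n" +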
              "  <one>A</one>\n" +
              "  <one>BB</one>\n" +
              "  <one>CCC</one>\n" +
              "</all>\n");
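        // Case 2: the DTD declares element-only content for <all>, so the
        // whitespace between the <one> elements is expected to be dropped.
        parse("<!DOCTYPE all [\n" +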
              "<!ELEMENT all (one)*>\n" +
              "<!ELEMENT one (#PCDATA)>\n" +
              "]>\n" +
              "<all>\n" +
              "  <one>A</one>\n" +
              "  <one>BB</one>\n" +
              "  <one>CCC</one>\n" +
              "</all>\n");
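        // Case 3: the DTD declares mixed content for <all>, so the whitespace
        // counts as character data and is expected to be kept.
        parse("<!DOCTYPE all [\n" +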
                "<!ELEMENT all (#PCDATA|one)*>\n" +
                "<!ELEMENT one (#PCDATA)>\n" +
                "]>\n" +
                "<all>\n" +
                "  <one>A</one>\n" +
                "  <one>BB</one>\n" +
                "  <one>CCC</one>\n" +
                "</all>\n");
    }
}
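
If adding a DTD is not an option, one DTD-free alternative is to remove the whitespace-only text nodes from the tree after parsing. The following is a minimal sketch under that assumption; StripWhitespace, strip and printText are names made up for this illustration, not JAXP or DOM APIs.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class StripWhitespace {
    // Hypothetical helper: recursively removes text nodes that consist only
    // of whitespace. CDATA sections have their own node type and are left alone.
    public static void strip(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // fetch before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                strip(child);
            }
            child = next;
        }
    }

    // Prints every remaining text node in document order, using the same
    // "=value" format as the XMLandWS example.
    static void printText(Node node) {
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.TEXT_NODE) {
                System.out.println("=" + c.getNodeValue()
                        .replace("\n", "\\n").replace(" ", "_"));
            } else {
                printText(c);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<all>\n  <one>A</one>\n  <one>BB</one>\n  <one>CCC</one>\n</all>\n";
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        strip(doc.getDocumentElement());
        printText(doc.getDocumentElement()); // prints =A, =BB, =CCC on separate lines
    }
}
```

Unlike the DTD route, this removes every whitespace-only text node, including ones inside mixed content (for example a single space between two inline elements), so it only suits documents where such whitespace carries no meaning.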

