Java Forum / General / February 2006

extracting part of xml

puzzlecracker - 16 Feb 2006 05:03 GMT
Let's say I have the following XML file:
<info>
     <item>
     ............
     </item>

     <item>
     ............
     </item>

     <item>
     ............
     </item>

</info>

I want to extract each <item> in its entirety; thus, in the example
above, I want to create 3 files,
each containing just
      <item>
     ............
     </item>

I tried using XPath but it didn't help; I'm not sure how to read off
the actual tags.

Thanks.
Jean-Francois Briere - 16 Feb 2006 07:25 GMT
This is how to retrieve the nodes:

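// needs: import javax.xml.xpath.*; import org.w3c.dom.NodeList;
//        import org.xml.sax.InputSource;
String xpathExpr = "/info/item";
String inputFilename = "yourFile.xml";
XPath xpath = XPathFactory.newInstance().newXPath();
InputSource inputSource = new InputSource(inputFilename);
// evaluate() throws XPathExpressionException if the expression cannot be evaluated
NodeList nodes = (NodeList) xpath.evaluate(xpathExpr, inputSource,
        XPathConstants.NODESET);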
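
The snippet above stops at the NodeList. As a rough sketch of one way to
continue (not necessarily what was intended here), each matched node can be
turned back into markup, tags included, with the JDK's identity Transformer.
The class name ItemSerializer, the extract helper and the inline sample XML
are made up for illustration; it parses from a string rather than
yourFile.xml so that it runs on its own.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class ItemSerializer {

    // Returns the markup of every node matched by xpathExpr, tags included.
    static List<String> extract(String xml, String xpathExpr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        InputSource inputSource = new InputSource(new StringReader(xml));
        NodeList nodes = (NodeList) xpath.evaluate(xpathExpr, inputSource,
                XPathConstants.NODESET);

        // A Transformer created without a stylesheet copies its input unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Leave out the <?xml ...?> header so only the element itself is returned.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> result = new ArrayList<String>();
        for (int i = 0; i < nodes.getLength(); i++) {
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(nodes.item(i)), new StreamResult(out));
            result.add(out.toString());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<info><item>hello</item><item>world</item><item>!</item></info>";
        for (String item : extract(xml, "/info/item")) {
            System.out.println(item);
        }
    }
}
```

Each returned string could then be written to its own file.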

Regards
Denis - 16 Feb 2006 07:38 GMT
Have you tried using http://jaxen.org/ ?

DM
Jean-Paul - 16 Feb 2006 07:40 GMT
- Download the JDOM library from http://www.jdom.org.

- Import the library in your project / favorite IDE

Given an XML file called items.xml with the following
contents:
<?xml version="1.0"?>
<info>
    <item>hello</item>
    <item>world</item>
    <item>!</item>
</info>

We will produce 3 files, named item1.xml, item2.xml and
item3.xml, with the following piece of code using the JDOM library:

import org.jdom.*;
import org.jdom.input.*;
import org.jdom.output.*;
import java.io.*;
import java.util.*;

public class XMLItemManipulator {

    private List<Element> items;

    public XMLItemManipulator() {
        items = null;
    }

    public void readItems(File xmlFile) throws FileNotFoundException, IOException {

        // make sure the file exists and can be read
        if (!xmlFile.exists())
            throw new FileNotFoundException("cannot find the xml file");

        if (!xmlFile.canRead())
            throw new IOException("file exists but does not have *read* permission");

        // now that we have made sure we got the file, just get the objects
        // necessary to read it and create an XML doc out of it
        SAXBuilder builder = new SAXBuilder();
        Document doc;

        try {
            doc = builder.build(xmlFile);
        } catch (JDOMException e) {
            // without a parsed document there is nothing to read, so report
            // the failure to the caller instead of carrying on with a null doc
            throw new IOException("An error occurred while building the XML doc: "
                    + e.getMessage());
        }

        // get the root element, in your case this would be <info>
        Element root = doc.getRootElement();

        // get the list of children of the root element
        // which have the "item" tag.
        // meaning that even if you had other tags that
        // were children of the root, we really wouldn't care
        // perfect for a heterogeneous xml file containing more
        // than the "item" elements
        items = root.getChildren("item");
    }

    // now that you got the items you might want to manipulate them
    // it depends on what you wanna do with them while they're in
    // memory. I recommend you have a look at the JDOM doc for more info.
    public void manipulateItems() {
        // put some code here
    }

    // once you have manipulated them or since you got the items,
    // you can now decide to write them separately to files.
    // To do this, it's very simple.
    public void writeItems() throws IOException {
        XMLOutputter out = new XMLOutputter();
        int size = items.size();

        for (int counter = 0; counter < size; counter++) {
            // copy the whole element (attributes included) so that the
            // original document is left untouched
            Element root = (Element) items.get(counter).clone();
            Document doc = new Document(root);

            // number the files from 1: item1.xml, item2.xml, ...
            File target = new File("item" + (counter + 1) + ".xml");

            // write through an OutputStream so the outputter applies the
            // encoding named in the XML declaration (UTF-8 by default);
            // a FileWriter would use the platform default encoding instead
            FileOutputStream stream = new FileOutputStream(target);
            try {
                out.output(doc, stream);
                out.output(doc, System.out);
            } finally {
                // close every file, not just the last one
                stream.close();
            }
        }
    }

    // testing all of this with a main method (normally you'd write
    // a full test case to do this, but that's your decision)
    public static void main(String[] args) {
        XMLItemManipulator manip = new XMLItemManipulator();
        File file = new File("items.xml");

        try {
            manip.readItems(file);
            manip.manipulateItems(); // this is optional
            manip.writeItems();

        } catch(Exception e) {
            e.printStackTrace();
        }
    }
}

There you go. Let us know how it goes.

Regards,

Jean-Paul H.
ab2305@gmail.com - 17 Feb 2006 02:25 GMT
> - Download the JDOM library from http://www.jdom.org.
>
[quoted text clipped - 124 lines]
>
> Jean-Paul H.

Thanks, but it didn't work.

Item is not the root tag; the items are scattered across the doc...

<Info>

<Item>
..............
</Item>

etc
</Info>

Any suggestions?
Jean-Paul - 17 Feb 2006 09:55 GMT
Even so, you should be able to modify the code to make it work. This
code gives you the basics. From here, and with the documentation of
the JDOM library, you should be able to get to a solution
on your own. Also try to read up on how to properly manipulate XML with
Java.
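
To make the kind of modification concrete, here is a minimal sketch using the
JDK's XPath API rather than JDOM. It assumes the elements are named Item
(capitalised, as in the last post) and may sit at any depth; the sample
document with its Group wrapper and the class name ScatteredItems are
hypothetical, since the real file was not posted. Note that XML names are
case-sensitive, so "item" and "Item" are different elements.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class ScatteredItems {

    // Counts the nodes that the given XPath expression matches in the XML text.
    static int count(String xml, String xpathExpr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(xpathExpr,
                new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Made-up document: one Item nested in a Group, one directly under Info.
        String xml = "<Info><Group><Item>a</Item></Group><Item>b</Item></Info>";

        // "/Info/Item" only sees direct children of the root element.
        System.out.println("/Info/Item matches " + count(xml, "/Info/Item"));
        // "//Item" sees Item elements at any depth.
        System.out.println("//Item matches " + count(xml, "//Item"));
        // Lower-case "item" is a different name and matches nothing here.
        System.out.println("//item matches " + count(xml, "//item"));
    }
}
```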
ab2305@gmail.com - 19 Feb 2006 05:42 GMT
I am wary about using JDOM in commercial software. Is it possible to
achieve the same with standard tools that are part of 1.5?

Thanks.
James McGill - 19 Feb 2006 07:12 GMT
> I am wary about using JDOM in commercial software. Is it possible to
> achieve the same with standard tools that are part of 1.5?

What are you trying to do? (I missed the thread.)

The JDK has reference implementations of DOM and SAX, all in JAXP which
shares ancestry with Xerces.  I prefer DOM4J but I can't give you an
intelligent rationale other than, "it's always worked well when I've
used it".  

Since 1.5, it seems like it should be unnecessary to use anything
additional for XML processing, unless you need a particular
implementation for performance or compatibility reasons. But I must
admit, I didn't see the original question and I might be being naive.
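
For what it's worth, a rough sketch of the original task (one file per item)
using only classes that ship with the JDK. This is one possible approach under
assumptions, not a drop-in replacement for the JDOM code: it assumes the items
are direct children of the root and are named "item", and the class name
ItemSplitter, the output file names and the sample document in main are made
up for illustration. The main method uses java.nio.file.Files, which needs
Java 7 or later; the split method itself does not.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class, for illustration only.
class ItemSplitter {

    // Writes each <item> child of the root element to outDir/item1.xml,
    // outDir/item2.xml, ... and returns how many files were written.
    static int split(File xmlFile, File outDir) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(xmlFile);

        // A Transformer created without a stylesheet copies its input unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

        int written = 0;
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace text nodes and elements with other names.
            if (child.getNodeType() != Node.ELEMENT_NODE
                    || !"item".equals(child.getNodeName())) {
                continue;
            }
            written++;
            OutputStream out = new FileOutputStream(new File(outDir, "item" + written + ".xml"));
            try {
                // Writing to a byte stream lets the transformer apply the encoding itself.
                transformer.transform(new DOMSource(child), new StreamResult(out));
            } finally {
                out.close();
            }
        }
        return written;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample input, written to a temporary directory.
        File dir = Files.createTempDirectory("items").toFile();
        File input = new File(dir, "items.xml");
        String xml = "<info><item>hello</item><item>world</item><item>!</item></info>";
        Files.write(input.toPath(), xml.getBytes("UTF-8"));
        System.out.println(split(input, dir) + " file(s) written to " + dir);
    }
}
```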
ab2305@gmail.com - 23 Feb 2006 02:45 GMT
After I do the extraction, I save the items to files. However, when I
read them back, one item at a time (using the SAX parser provided by
Eclipse), rarely, but for some of them, I get the following exception.
Can someone point out the problem? Thanks.

org.xml.sax.SAXParseException: Invalid byte 2 of 3-byte UTF-8 sequence.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at com.touchgraph.amazoncache.io.AmazonParser.parse(AmazonParser.java:33)
    at com.touchgraph.amazoncache.io.AmazonCacheReader.readCache(AmazonCacheReader.java:35)
    at com.touchgraph.amazoncache.io.AmazonCacheStore.getBooksFromCache(AmazonCacheStore.java:185)
    at com.touchgraph.amazoncache.io.AmazonCacheStore.loadSimilarFromCache(AmazonCacheStore.java:131)
    at com.touchgraph.amazoncache.io.AmazonCacheStore.getSimilarBooks(AmazonCacheStore.java:44)
    at com.touchgraph.amazoncache.io.AmazonDataModel.addSimilarBooks(AmazonDataModel.java:70)
    at com.touchgraph.amazoncache.io.AmazonCacheFrame$1.actionPerformed(AmazonCacheFrame.java:85)
    at javax.swing.AbstractButton.fireActionPerformed(Unknown Source)
    at javax.swing.AbstractButton$Handler.actionPerformed(Unknown Source)
    at javax.swing.DefaultButtonModel.fireActionPerformed(Unknown Source)
    at javax.swing.DefaultButtonModel.setPressed(Unknown Source)
    at javax.swing.plaf.basic.BasicButtonListener.mouseReleased(Unknown Source)
    at java.awt.Component.processMouseEvent(Unknown Source)
    at javax.swing.JComponent.processMouseEvent(Unknown Source)
    at java.awt.Component.processEvent(Unknown Source)
    at java.awt.Container.processEvent(Unknown Source)
    at java.awt.Component.dispatchEventImpl(Unknown Source)
    at java.awt.Container.dispatchEventImpl(Unknown Source)
    at java.awt.Component.dispatchEvent(Unknown Source)
    at java.awt.LightweightDispatcher.retargetMouseEvent(Unknown Source)
    at java.awt.LightweightDispatcher.processMouseEvent(Unknown Source)
    at java.awt.LightweightDispatcher.dispatchEvent(Unknown Source)
    at java.awt.Container.dispatchEventImpl(Unknown Source)
    at java.awt.Window.dispatchEventImpl(Unknown Source)
    at java.awt.Component.dispatchEvent(Unknown Source)
    at java.awt.EventQueue.dispatchEvent(Unknown Source)
    at java.awt.EventDispatchThread.pumpOneEventForHierarchy(Unknown Source)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
    at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
    at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
    at java.awt.EventDispatchThread.run(Unknown Source)
null
Jean-Paul - 25 Feb 2006 12:07 GMT
It seems that there is a problem with the way you're reading the files
back into your system. Can you show us some code?
puzzlecracker - 25 Feb 2006 15:52 GMT
> It seems that there is a problem with the way you're reading the files
> back into your system. Can you show us some code?
I already solved it. All I needed to do was write the files with a
different encoding.
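
The thread does not say which encoding was used or how the files were written,
so the following is only a guess at the mechanism: an "Invalid byte ... UTF-8
sequence" error typically shows up when the bytes on disk were produced with
one encoding while the XML declaration (or the UTF-8 default) promises
another. A minimal sketch that reproduces this with the JDK's SAX parser; the
class name EncodingCheck and the sample content are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class EncodingCheck {

    // The declaration promises UTF-8; \u00e9 is a non-ASCII character (e-acute).
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?><item>caf\u00e9</item>";

    // Returns true if the bytes parse as XML, false if the parser rejects them.
    static boolean parses(byte[] bytes) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(bytes), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            // depending on the parser, malformed bytes may also surface as an
            // IOException rather than a SAXParseException
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Bytes that match the declared encoding.
        byte[] matching = XML.getBytes("UTF-8");
        // Bytes written with another encoding while the declaration still says UTF-8.
        byte[] mismatched = XML.getBytes("ISO-8859-1");

        System.out.println("UTF-8 bytes parse: " + parses(matching));
        System.out.println("ISO-8859-1 bytes parse: " + parses(mismatched));
    }
}
```

Files containing only ASCII characters are unaffected by such a mismatch,
which would fit the error appearing only for some items.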

thanks