
Java Forum / General / August 2007


XPathAPI(node, xpathStr) & XPathContext.getDTMHandleFromNode(node) slow

David  Portabella - 27 Jul 2007 23:33 GMT
Hello,

I am using xalan 2.7.0: http://xml.apache.org/xalan-j/
When I run XPathAPI.eval(node, xpathStr) over and over again on several
nodes, it gets slower and slower.
This is noted in the XPathAPI documentation, which suggests using
the low-level XPath API instead:
http://xml.apache.org/xalan-j/apidocs/org/apache/xpath/XPathAPI.html

I am now using the low-level XPath API as follows:
   XPathContext xpathSupport = new XPathContext();
   PrefixResolverDefault prefixResolver = new PrefixResolverDefault(document);
   XPath xpath = new XPath(xpathStr, null, prefixResolver, XPath.SELECT, null);

and then, for each node:
   int ctxtNode = xpathSupport.getDTMHandleFromNode(node);
   XObject object = xpath.execute(xpathSupport, ctxtNode, prefixResolver);

This is a bit better, but after running it over and over again on
several nodes, it still gets slower and slower.
I think that the problem is that
XPathContext.getDTMHandleFromNode(child) does not free memory.

You can test it yourself with this simple example:
++++++++++++++++++++++++++++++++++++++++++++
import org.w3c.dom.*;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import org.apache.xpath.*;
import org.apache.xml.utils.*;

public class Test {
   public static void main(String[] argv) throws Exception {
       int numChilds = 100000 + 1;

       // Build <root> with numChilds <child>/<sub-child>/<sub-sub-child title="..."> subtrees.
       System.out.println("Building a document with " + numChilds + " childs");
       Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
       Element root = doc.createElement("root");
       doc.appendChild(root);
       for (int i = 0; i < numChilds; i++) {
           Element child = doc.createElement("child");
           root.appendChild(child);
           Element subChild = doc.createElement("sub-child");
           child.appendChild(subChild);
           Element subSubChild = doc.createElement("sub-sub-child");
           subChild.appendChild(subSubChild);
           subSubChild.setAttribute("title", "title" + i);
       }

       // Compile the XPath once and reuse the same context for every child.
       XPathContext xpathSupport = new XPathContext();
       PrefixResolverDefault prefixResolver = new PrefixResolverDefault(doc);
       XPath titleXpath = new XPath("sub-child/sub-sub-child/@title",
                                    null, prefixResolver, XPath.SELECT, null);
       Runtime r = Runtime.getRuntime();

       System.out.println("Evaluating XPath for each " + numChilds + " childs");
       NodeList nodeList = root.getChildNodes();
       int size = nodeList.getLength();
       for (int i = 0; i < size; i++) {
           long start = System.currentTimeMillis();
           Element child = (Element) nodeList.item(i);
           int ctxtNode = xpathSupport.getDTMHandleFromNode(child);
           // Evaluation is commented out so that only getDTMHandleFromNode is timed:
           // String title = titleXpath.execute(xpathSupport, ctxtNode, prefixResolver).toString();
           long duration = System.currentTimeMillis() - start;
           if (i < 10 || (i % (numChilds / 10)) == 0)
               System.out.println("child #" + i + "\t took " + duration + " ms." +
                                  "\tfreeMemory: " + r.freeMemory() +
                                  "\ttotalMemory: " + r.totalMemory());
           else if (i == 10)
               System.out.println("printing some selected childs only from now on...");
       }
   }
}

++++++++++++++++++++++++++++++++++++++++++++
Here you can see an example of the result:

$ java Test
Building a document with 100001 childs
Evaluating XPath for each 100001 childs
child #0         took 77 ms.    freeMemory: 10642840    totalMemory: 45129728
child #1         took 1 ms.     freeMemory: 10583848    totalMemory: 45129728
child #2         took 0 ms.     freeMemory: 10583848    totalMemory: 45129728
child #3         took 0 ms.     freeMemory: 10583848    totalMemory: 45129728
child #4         took 0 ms.     freeMemory: 10583848    totalMemory: 45129728
child #5         took 0 ms.     freeMemory: 10583848    totalMemory: 45129728
child #6         took 0 ms.     freeMemory: 10583848    totalMemory: 45129728
child #7         took 1 ms.     freeMemory: 10583848    totalMemory: 45129728
child #8         took 0 ms.     freeMemory: 10583848    totalMemory: 45129728
child #9         took 0 ms.     freeMemory: 10583848    totalMemory: 45129728
printing some selected childs only from now on...
child #10000     took 3 ms.     freeMemory: 10980392    totalMemory: 45129728
child #20000     took 5 ms.     freeMemory: 9976808     totalMemory: 45129728
child #30000     took 7 ms.     freeMemory: 6332656     totalMemory: 45129728
child #40000     took 9 ms.     freeMemory: 5112168     totalMemory: 45129728
child #50000     took 12 ms.    freeMemory: 1373472     totalMemory: 45129728
child #60000     took 14 ms.    freeMemory: 19851264    totalMemory: 66650112
child #70000     took 16 ms.    freeMemory: 16515832    totalMemory: 66650112
child #80000     took 19 ms.    freeMemory: 15040280    totalMemory: 66650112
child #90000     took 21 ms.    freeMemory: 7435744     totalMemory: 66650112
child #100000    took 24 ms.    freeMemory: 17416944    totalMemory: 66650112

++++++++++++++++++++++++++++++++++++++++++++
Each time I call xpathSupport.getDTMHandleFromNode(child), it does not
free the memory, and so it gets slower and slower.

How can I solve this problem?
Some people have suggested using the DOM4J package instead of Xalan.
However, we already have quite a lot of software that uses Xalan, and
changing the code would have some cost.
Is it possible to solve this problem without discarding xalan?

Regards,
DAvid
Piotr Kobzda - 30 Jul 2007 15:04 GMT
David Portabella wrote:

> I am now using the low-level XPath API as follows:
>     XPathContext xpathSupport = new XPathContext();
[quoted text clipped - 12 lines]
> I think that the problem is that
> XPathContext.getDTMHandleFromNode(child) does not free memory.

It seems to me that Xalan holds on to all the references to the DTM nodes
it creates (possibly in the XPathContext or the DTMManager instance).  I'm
not very familiar with the Xalan API or its internals, but I came to that
conclusion after some experimenting with your example code under the
version of Xalan embedded in Java SE (I don't know which particular
version of Xalan each Java release embeds).

I tried to remove those DTM references from the context as follows:

    DTMManager dtmManager = xpathSupport.getDTMManager();

    DTM dtm = dtmManager.getDTM(ctxtNode);
    dtmManager.release(dtm, true);

But it seems that the DTMs released that way are still referenced
somewhere (possibly in a per-document context).  As a result, your
example performs even slower with that change.

However, the above turns out to be handy when each child node is handled
separately from the original DOM document, i.e. when the DTM handle is
obtained for a clone of the node, like this:

    int ctxtNode = xpathSupport.getDTMHandleFromNode(child.cloneNode(true));

Without releasing the clone's DTM handle, the example very quickly ends
with OutOfMemoryError.  But when both the above changes are used, xpath
performs equally fast for each child's clone created in the loop.

Of course, the above solution will work correctly only as long as your
xpath expression does not reference any data from the child's parent node
(or from any other node outside its subtree).  There may be other
limitations caused by this trick that I can't think of right now.  But in
your particular example, it seems to work fast and properly.
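
As a rough illustration of the limitation just described, the sketch below uses the standard javax.xml.xpath API rather than Xalan's low-level classes (an assumption made so that it runs on a plain JDK; the names DetachedCloneDemo, buildDoc and evalOnClone are made up for this example). A deep clone has no parent, so an expression that stays inside the child's subtree still works, while one that steps up to the parent finds nothing:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical demo class; not part of Xalan or of the original example.
final class DetachedCloneDemo {

    // Builds a small tree: root(name="top") > child > sub-child > sub-sub-child(title="title0").
    static Document buildDoc() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            root.setAttribute("name", "top");
            doc.appendChild(root);
            Element child = doc.createElement("child");
            root.appendChild(child);
            Element subChild = doc.createElement("sub-child");
            child.appendChild(subChild);
            Element subSubChild = doc.createElement("sub-sub-child");
            subSubChild.setAttribute("title", "title0");
            subChild.appendChild(subSubChild);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Evaluates expr as a string, using a detached deep copy of node as the context node.
    static String evalOnClone(String expr, Node node) {
        try {
            Node clone = node.cloneNode(true); // deep copy; its getParentNode() is null
            return XPathFactory.newInstance().newXPath().evaluate(expr, clone);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node child = buildDoc().getDocumentElement().getFirstChild();
        // Stays inside the cloned subtree.
        System.out.println("inside subtree: '"
                + evalOnClone("sub-child/sub-sub-child/@title", child) + "'");
        // Steps up to the parent, which the clone does not have.
        System.out.println("via parent:     '" + evalOnClone("../@name", child) + "'");
    }
}
```

The first expression should print title0 and the second an empty string, because the clone is cut off from root. Evaluated against the original child instead, ../@name would find the root's name attribute.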

For those who'd like to check this with the version of Xalan embedded in
Sun's Java 5 and 6, it is enough to replace the following two imports:

> import org.apache.xpath.*;
> import org.apache.xml.utils.*;

with:

import com.sun.org.apache.xml.internal.utils.*;
import com.sun.org.apache.xpath.internal.*;
import com.sun.org.apache.xml.internal.dtm.*;

> Is it possible to solve this problem without discarding xalan?

Hope so.  Let us know if the above solves your problem.

piotr
David  Portabella - 27 Aug 2007 16:29 GMT
> David Portabella wrote:
> > I am now using the low-level XPath API as follows:
[quoted text clipped - 67 lines]
>
> piotr

Hello Piotr,

Thanks for your help; sorry for the long delay.

Your solution works for me also, thanks a lot!

I'm now looking at the implications.
For instance, it may be difficult to use the trick for selecting some
nodes which need to be modified.
I'll let you know.

Many thanks,
DAvid
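
The concern about modifying selected nodes can be sketched as follows, again with the standard javax.xml.xpath API as a stand-in for Xalan's low-level classes (an assumption, so that the example runs on a plain JDK; CloneEditDemo and its methods are hypothetical names). A node selected through the clone belongs to the clone, so changing it leaves the original document untouched:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; not part of Xalan or of the original example.
final class CloneEditDemo {
    static final String TITLE_PATH = "sub-child/sub-sub-child/@title";

    // Builds root > child > sub-child > sub-sub-child(title="title0") and returns the child.
    static Element buildChild() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);
            Element child = doc.createElement("child");
            root.appendChild(child);
            Element subChild = doc.createElement("sub-child");
            child.appendChild(subChild);
            Element subSubChild = doc.createElement("sub-sub-child");
            subSubChild.setAttribute("title", "title0");
            subChild.appendChild(subSubChild);
            return child;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Selects @title through a detached clone, changes it, and returns
    // { title seen in the clone, title seen in the original }.
    static String[] editThroughClone(Element child, String newTitle) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Element clone = (Element) child.cloneNode(true);
            Attr selected = (Attr) xpath.evaluate(TITLE_PATH, clone, XPathConstants.NODE);
            selected.setValue(newTitle); // changes the clone's attribute only
            String inOriginal = xpath.evaluate(TITLE_PATH, child);
            return new String[] { selected.getValue(), inOriginal };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] titles = editThroughClone(buildChild(), "changed");
        System.out.println("clone:    " + titles[0]);
        System.out.println("original: " + titles[1]);
    }
}
```

So the clone trick seems best suited to read-only queries. To change the real document, the result would have to be mapped back to the corresponding original node in some way, which this sketch does not attempt.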

