
SAX callback method question

steve_marjoribanks@hotmail.com - 23 Feb 2006 16:59 GMT
If I have an XML document with some elements like this:

<line>
   <point>0 2</point>
   <point>1 4</point>
   <point>3 5</point>
   etc.

</line>

i.e. a collection of points whose coordinates I want to extract from
the XML file and draw using Java.
I was thinking I could use the startElement method to test whether the
element is a <point>, then use the characters method to extract the
coordinates as strings, parse them into integers and store them in an
array or similar. This might sound like a silly question, but will the
parser always traverse the XML document in order, parsing as it goes?
I.e., if I use the method just described, will the coordinates of the
points be stored in the correct order in the array?

Also, if my XML document was like:

<line>
   <point>5 1</point>
   <point>4 8</point>
   etc
</line>
<line>
   <point>3 4</point>
   <point>4 1</point>
   etc
</line>
etc

how would I go about making sure that the point coordinates for each
line remain separate from each other and do not get mixed up?
I'm starting to think that DOM might have been a better idea than SAX!!
:-(

Steve
Robert Klemme - 23 Feb 2006 17:17 GMT
> If I have an XML document with some elements like this:
>
[quoted text clipped - 16 lines]
> described, will the coordinates of the points be stored in the
> correct order in the array?

Yes.  A SAX parser reports its events in document order, and AFAIK element
order is significant by the XML standard.
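
To illustrate, here is a minimal sketch of the approach described in the question. The class name PointCollector and its parse helper are made up for this example. It assumes the default (non-namespace-aware) JAXP parser, so element names arrive in qName. Note that characters may be called more than once for a single text node, so the text is buffered and only converted in endElement.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects "x y" pairs from <point> elements in document order.
public class PointCollector extends DefaultHandler {
    private final List<int[]> points = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inPoint;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("point".equals(qName)) {
            inPoint = true;
            text.setLength(0); // start a fresh buffer for this point
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may split one text node across several calls, so append.
        if (inPoint) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("point".equals(qName)) {
            String[] xy = text.toString().trim().split("\\s+");
            points.add(new int[] { Integer.parseInt(xy[0]), Integer.parseInt(xy[1]) });
            inPoint = false;
        }
    }

    public List<int[]> getPoints() {
        return points;
    }

    public static List<int[]> parse(String xml) throws Exception {
        PointCollector handler = new PointCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getPoints();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<line><point>0 2</point><point>1 4</point><point>3 5</point></line>";
        for (int[] p : parse(xml)) {
            System.out.println(p[0] + "," + p[1]);
        }
    }
}
```

The strings are converted with Integer.parseInt rather than a cast, since a String cannot be cast to an int in Java.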

> Also, if my XML document was like:
>
[quoted text clipped - 14 lines]
> I'm starting to think that DOM might have been a better idea than
> SAX!! :-(

I prefer SAX as it's less resource intensive and you can easily skip
things you want to ignore without wasting mem or CPU cycles.

The way I usually do it is this: create a proxy that implements the
callback interface(s) I need.  Internally, when it sees an opening element,
it creates a delegate instance based on the name of the element and
pushes it onto a stack, giving it a reference to the current element.
The proxy then delegates each method call to the topmost delegate on the
stack.  Delegates store state as they see fit, and model instances are
updated when the closing tag is detected.

Hope that was clear enough.
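
A rough sketch of that idea, under some assumptions: the names DelegatingHandler, Delegate, LineDelegate and PointDelegate are invented for illustration, the "model" is just a list of point lists, and the <line> elements are wrapped in a hypothetical <lines> root so that the document is well-formed. It is one possible reading of the description above, not the only way to do it.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical proxy: the only handler the parser knows about.
public class DelegatingHandler extends DefaultHandler {
    // Hypothetical delegate interface (not part of SAX), one instance per open element.
    interface Delegate {
        default void text(String chunk) { }
        default void end() { }
    }

    // Used for every element we do not care about.
    private static final Delegate IGNORE = new Delegate() { };

    private final Deque<Delegate> stack = new ArrayDeque<>();
    private final List<List<int[]>> lines = new ArrayList<>();

    private class LineDelegate implements Delegate {
        final List<int[]> points = new ArrayList<>();

        @Override
        public void end() {
            lines.add(points); // update the model when </line> is seen
        }
    }

    private static class PointDelegate implements Delegate {
        private final LineDelegate parent; // reference to the enclosing element's delegate
        private final StringBuilder text = new StringBuilder();

        PointDelegate(LineDelegate parent) {
            this.parent = parent;
        }

        @Override
        public void text(String chunk) {
            text.append(chunk);
        }

        @Override
        public void end() {
            String[] xy = text.toString().trim().split("\\s+");
            parent.points.add(new int[] { Integer.parseInt(xy[0]), Integer.parseInt(xy[1]) });
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Delegate parent = stack.peek(); // null at the root element
        Delegate next = IGNORE;
        if ("line".equals(qName)) {
            next = new LineDelegate();
        } else if ("point".equals(qName) && parent instanceof LineDelegate) {
            next = new PointDelegate((LineDelegate) parent);
        }
        stack.push(next);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        Delegate top = stack.peek();
        if (top != null) {
            top.text(new String(ch, start, length));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        stack.pop().end();
    }

    public static List<List<int[]>> parse(String xml) throws Exception {
        DelegatingHandler handler = new DelegatingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.lines;
    }
}
```

Because each point is added to the delegate of its own enclosing <line>, the coordinates of different lines cannot get mixed up.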

Btw, your points are really structured elements.  I'd rather do something
like:

<line>
   <point>
     <x>0</x>
     <y>2</y>
   </point>
</line>

(With better names probably.)

Kind regards

   robert
steve_marjoribanks@hotmail.com - 23 Feb 2006 17:33 GMT
Thanks for the reply. With regards to the naming, I just made up an
example, a sample of the real XML I am using is shown below. The
problem is that my schema is an extension of other schemas and as such
contains elements and complex types whose naming is out of my control.

                <geotechml:layers>
                        <geotechml:Layer materialID="1">
                            <geotechml:layerTop>
                                <geotechml:Curve>
                                    <gml:LineString>
                                        <gml:pos>0 10</gml:pos>
                                        <gml:pos>30 10</gml:pos>
                                        <gml:pos>60 40</gml:pos>
                                    </gml:LineString>
                                </geotechml:Curve>
                            </geotechml:layerTop>
                        </geotechml:Layer>
                        <geotechml:Layer materialID="2">
                            <geotechml:layerTop>
                                <geotechml:Curve>
                                    <gml:LineString>
                                        <gml:pos>0 30</gml:pos>
                                        <gml:pos>20 40</gml:pos>
                                        <gml:pos>60 40</gml:pos>
                                    </gml:LineString>
                                </geotechml:Curve>
                            </geotechml:layerTop>
                        </geotechml:Layer>
                        <geotechml:Layer materialID="3">
                            <geotechml:layerTop>
                                <geotechml:Curve>
                                    <gml:LineString>
                                        <gml:pos>0 60</gml:pos>
                                        <gml:pos>20 65</gml:pos>
                                        <gml:pos>50 70</gml:pos>
                                        <gml:pos>70 70</gml:pos>
                                        <gml:pos>100 80</gml:pos>
                                    </gml:LineString>
                                </geotechml:Curve>
                            </geotechml:layerTop>
                        </geotechml:Layer>
                    </geotechml:layers>

In the example above I need to extract the values of the 3 coordinate
points given for each lineString and then draw them in my Java
application.
Sorry, but being a newbie I have no idea what you're talking about when
you gave your solution using a proxy? Any chance you could explain
further please? (sorry!).
Do you think in this instance it would be easier to use DOM? I say this
because although I don't need to extract data from every element (as
shown above) there are a number of elements which I need to get the
data from and they're not all named the same as in the example above
either.

Steve
Robert Klemme - 23 Feb 2006 18:16 GMT
> Thanks for the reply. With regards to the naming, I just made up an
> example, a sample of the real XML I am using is shown below. The
> problem is that my schema is an extension of other schemas and as such
> contains elements and complex types whose naming is out of my control.

Well, bad. :-}

<snip/>

> In the example above I need to extract the values of the 3 coordinate
> points given for each lineString and then draw them in my Java
> application.
> Sorry, but being a newbie I have no idea what you're talking about
> when you gave your solution using a proxy? Any chance you could
> explain further please? (sorry!).

You create an object that does just part of the job (finding the one that
should do the real work) and then delegates the method invocation to that
object.

> Do you think in this instance it would be easier to use DOM? I say
> this because although I don't need to extract data from every element
> (as shown above) there are a number of elements which I need to get
> the data from and they're not all named the same as in the example
> above either.

Can't really tell as I don't see the whole picture.  If you use DOM,
you'll have to do the traversal or work with an XSLT processor.  If those
documents can be large I'd favour the other approach but YMMV (especially
if you need a lot of the data from the tree).

Kind regards

   robert
steve_marjoribanks@hotmail.com - 24 Feb 2006 12:12 GMT
>You create an object that does just part of the job (finding the one that
>should do the real work) and then delegates the method invocation to that
>object.

Do you mean kind of 'nesting' callback methods? So would I have one
handler that finds a certain node and then delegates the handling of the
children of that node to another handler and so on until I get the data I
need? Sorry for all the questions!

>Can't really tell as I don't see the whole picture.  If you use DOM,
>you'll have to do the traversal or work with an XSLT processor.  If those
>documents can be large I'd favour the other approach but YMMV (especially
>if you need a lot of the data from the tree).

Hmm, it's a tricky one. I originally chose SAX because the documents I'm
working with have the potential to become fairly large, not
massive but not particularly small either. Also, I have no need to
write or change the XML so I thought I'd use SAX. Having thought about
it now though, I do need to extract a fair amount of data from the tree
but as shown above I'll need to traverse down through a fairly large
tree structure to get the information I need because there is a
reasonably 'deep' tree structure and the information needed is at the
bottom of the tree.
Robert Klemme - 24 Feb 2006 13:18 GMT
>> You create an object that does just part of the job (finding the one
>> that should do the real work) and then delegates the method
[quoted text clipped - 4 lines]
> the children of that node to another node and so on until I get the
> data I need? Sorry for all the questions!

Yes.  I think you get the hang of it.

>> Can't really tell as I don't see the whole picture.  If you use DOM,
>> you'll have to do the traversal or work with an XSLT processor.  If
[quoted text clipped - 10 lines]
> reasonably 'deep' tree structure and the information needed is at the
> bottom of the tree.

But if you just need info from some top level nodes and leaf nodes and
there's a lot of stuff in between that you want to ignore, then it
sounds as if you'd only extract something like 20% of the data.  In that
case I'd go for SAX.
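
For the layer sample posted earlier, a sketch along these lines might do. The class name LayerExtractor is invented, and the code assumes the default non-namespace-aware parser and that the documents always use the prefixes geotechml: and gml:, since it matches on qName. The namespace URI for geotechml in the sample string is a placeholder, as the real one was not given. Everything between Layer and pos is simply ignored.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: groups gml:pos coordinates by the materialID of the enclosing Layer.
public class LayerExtractor extends DefaultHandler {
    // Cut-down sample; "urn:example:geotechml" is a placeholder URI, not the real one.
    public static final String SAMPLE =
        "<geotechml:layers xmlns:geotechml=\"urn:example:geotechml\""
        + " xmlns:gml=\"http://www.opengis.net/gml\">"
        + "<geotechml:Layer materialID=\"1\"><geotechml:layerTop><geotechml:Curve>"
        + "<gml:LineString><gml:pos>0 10</gml:pos><gml:pos>30 10</gml:pos></gml:LineString>"
        + "</geotechml:Curve></geotechml:layerTop></geotechml:Layer>"
        + "<geotechml:Layer materialID=\"2\"><geotechml:layerTop><geotechml:Curve>"
        + "<gml:LineString><gml:pos>0 30</gml:pos><gml:pos>20 40</gml:pos></gml:LineString>"
        + "</geotechml:Curve></geotechml:layerTop></geotechml:Layer>"
        + "</geotechml:layers>";

    // LinkedHashMap keeps the layers in the order they appear in the document.
    private final Map<String, List<int[]>> layers = new LinkedHashMap<>();
    private List<int[]> current;  // points of the Layer being read, or null
    private StringBuilder text;   // non-null only while inside a gml:pos

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("geotechml:Layer".equals(qName)) {
            current = new ArrayList<>();
            layers.put(atts.getValue("materialID"), current);
        } else if ("gml:pos".equals(qName) && current != null) {
            text = new StringBuilder();
        }
        // All other elements (layerTop, Curve, LineString, ...) are skipped.
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("gml:pos".equals(qName) && text != null) {
            String[] xy = text.toString().trim().split("\\s+");
            current.add(new int[] { Integer.parseInt(xy[0]), Integer.parseInt(xy[1]) });
            text = null;
        } else if ("geotechml:Layer".equals(qName)) {
            current = null;
        }
    }

    public static Map<String, List<int[]>> parse(String xml) throws Exception {
        LayerExtractor handler = new LayerExtractor();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.layers;
    }

    public static void main(String[] args) throws Exception {
        for (Map.Entry<String, List<int[]>> e : parse(SAMPLE).entrySet()) {
            System.out.println("material " + e.getKey() + ": " + e.getValue().size() + " points");
        }
    }
}
```

If the prefixes can vary between documents, a namespace-aware parser matching on uri and localName would be the more robust choice.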

Kind regards

   robert
steve_marjoribanks@hotmail.com - 26 Feb 2006 14:11 GMT
Having had a think about it, I'm struggling to get my head around how
this would actually be implemented. I've read up on the DefaultHandler
and as far as I can work out you can only assign one per reader. How
can I use multiple handlers on just one input?
Robert Klemme - 27 Feb 2006 11:00 GMT
> Having had a think about it, I'm struggling to get my head around how
> this would actually be implemented. I've read up on the DefaultHandler
> and as far as I can work out you can only assign one per reader. How
> can I use multiple handlers on just one input?

This is a basic pattern called "delegation" (also "strategy pattern" and
"state pattern").  Information about this abounds on the web but you might
be better off by first reading a book about OO design and / or software
design in general.
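
As a starting point, here is a small sketch of delegation with a single reader. You still register only one handler with the parser, but that handler forwards the events to whichever other handler is responsible for the current subtree. The names Dispatcher, register and TextCollector are invented for this example, and the sample document in main is made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical dispatcher: the one handler given to the parser.
public class Dispatcher extends DefaultHandler {
    private final Map<String, ContentHandler> byElement = new HashMap<>();
    private ContentHandler active; // handler for the current subtree, or null
    private int depth;             // nesting depth inside that subtree

    // Associates an element name with the handler that should process its subtree.
    public void register(String qName, ContentHandler handler) {
        byElement.put(qName, handler);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (active == null) {
            active = byElement.get(qName); // stays null for elements nobody registered
            depth = 0;
        }
        if (active != null) {
            depth++;
            active.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (active != null) {
            active.characters(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (active != null) {
            active.endElement(uri, localName, qName);
            if (--depth == 0) {
                active = null; // subtree finished, hand control back
            }
        }
    }

    // Simple example delegate: records the trimmed text of each element it sees.
    public static class TextCollector extends DefaultHandler {
        public final List<String> values = new ArrayList<>();
        private final StringBuilder buf = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            buf.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            buf.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String t = buf.toString().trim();
            if (!t.isEmpty()) {
                values.add(t);
            }
            buf.setLength(0);
        }
    }

    public static void parse(String xml, Dispatcher dispatcher) throws Exception {
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), dispatcher);
    }

    public static void main(String[] args) throws Exception {
        TextCollector titles = new TextCollector();
        TextCollector points = new TextCollector();
        Dispatcher dispatcher = new Dispatcher();
        dispatcher.register("title", titles);
        dispatcher.register("line", points);
        parse("<doc><title>Slope</title><note>skip me</note>"
                + "<line><point>0 2</point><point>1 4</point></line></doc>", dispatcher);
        System.out.println(titles.values + " " + points.values);
    }
}
```

The parser only ever talks to the Dispatcher; which object does the real work is decided per element, which is the delegation idea in a nutshell.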

Kind regards

   robert