Java Forum / General / December 2007
Looking for XML package - need to read only
Ramon F Herrera - 11 Dec 2007 13:38 GMT
I was given a bunch of XML files with the task of converting them to spreadsheet (helped by OpenOffice, of course). The one part that I am missing is a Class that loads the *.xml file and provides me with the ability to make queries and retrieve the data using different methods.
This is the first time I have to deal with XML. I didn't even know that you can open them with a browser.
TIA,
-Ramon
Steve W. Jackson - 11 Dec 2007 15:10 GMT In article <f7098f9f-4eb3-4f1c-89a0-ddb6d83241c2@d4g2000prg.googlegroups.com>,
> I was given a bunch of XML files with the task of converting them to
> spreadsheet (helped by OpenOffice, of course). The one part that I am
[quoted text clipped - 7 lines]
>
> -Ramon
Since you're asking in a Java group, I assume you're seeking a Java XML parser. In that event, look no further than the API Javadocs for your parser. Since at least 1.4, if not earlier, Sun has included XML parsing in Java. See the javax.xml package and its subpackages. In particular, you'll find parsers in javax.xml.parsers, where you can get a SAXParser if you want or you can use the DocumentBuilderFactory to provide you with a DocumentBuilder, which has parse methods.
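For the SAXParser route mentioned above, a minimal sketch might look like the following. The class name SaxElementNames, the helper elementNames and the sample document are made up for illustration; only the javax.xml.parsers and org.xml.sax calls come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of any library.
class SaxElementNames {

    // Parses the XML text and returns every element name in document order.
    static List<String> elementNames(String xml) throws Exception {
        final List<String> names = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // SAX calls this once for each opening tag it encounters.
                names.add(qName);
            }
        };
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document.
        String xml = "<rows><row><a>1</a><b>2</b></row></rows>";
        System.out.println(elementNames(xml)); // prints [rows, row, a, b]
    }
}
```

SAX pushes events at your handler while it reads, so nothing is kept in memory unless you store it yourself. The DocumentBuilder route builds the whole tree first and lets you walk it afterwards.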
= Steve =
 Signature Steve W. Jackson Montgomery, Alabama
Ramon F Herrera - 11 Dec 2007 15:43 GMT On Dec 11, 11:10 am, "Steve W. Jackson" <stevewjack...@knology.net> wrote:
> In article
> <f7098f9f-4eb3-4f1c-89a0-ddb6d8324...@d4g2000prg.googlegroups.com>,
[quoted text clipped - 23 lines]
> Steve W. Jackson
> Montgomery, Alabama
Thanks, Steve...
I guess the decision to make at this stage is whether the Sun-provided XML support is enough or if it should be (complemented?, replaced?) by any of these:
http://java-source.net/open-source/xml-parsers
BTW: Now I am looking for an XML tutorial for javax.xml.
TIA,
-Ramon
Steve W. Jackson - 11 Dec 2007 17:11 GMT In article <041e303f-738d-4592-8be4-4409d1787cbb@j20g2000hsi.googlegroups.com>,
> On Dec 11, 11:10 am, "Steve W. Jackson" <stevewjack...@knology.net>
> wrote:
[quoted text clipped - 39 lines]
>
> -Ramon
Since Java 5 (or 1.5), the Sun JDK has included Xerces. In my work environment, we used Xerces-J well before that, and only minor changes have been made to discard the external jar files and use what Sun now gives us.
As for a tutorial...you might try Google. A search for "java xml tutorial" turned up some interesting possibilities.
 Signature Steve W. Jackson Montgomery, Alabama
Wayne - 12 Dec 2007 02:32 GMT
> On Dec 11, 11:10 am, "Steve W. Jackson" <stevewjack...@knology.net>
> wrote:
[quoted text clipped - 35 lines]
>
> -Ramon
I've been looking over VTD-XML for the last few days. It "sounds" promising; has anyone used it yet?
-Wayne
Arne Vajhøj - 12 Dec 2007 01:48 GMT
> I was given a bunch of XML files with the task of converting them to
> spreadsheet (helped by OpenOffice, of course). The one part that I am
> missing is a Class that loads the *.xml file and provides me with the
> ability to make queries and retrieve the data using different methods.
There is a standard API for XML in Java, so your code will actually be the same regardless of which implementation you use: Xerces, Crimson, or whatever already comes bundled with your Java (which is in reality either Xerces or Crimson).
There are also APIs other than the standard one. JDOM is one example.
I will assume you will go with the standard API.
Since you use the term query, I will assume you will want to use a W3C DOM parser and not the SAX parser.
I will also assume you will want to use XPath for queries, since that is usually the easiest.
You will need to read a bit, but here is a code snippet to guide you to the right classes and methods in the Java API docs:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new File("/path/to/your/xml/file"));
NodeList elements = XPathAPI.selectNodeList(doc.getDocumentElement(), "/roottag/othertag");
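The XPathAPI class used in the snippet is, as far as I can tell, the Apache Xalan helper (org.apache.xpath.XPathAPI) rather than part of javax.xml. If you would rather stay inside the standard API, the javax.xml.xpath package (added in Java 5) can run the same kind of query. A rough sketch, where the class name XPathQuery, the helper select and the sample document are made up for illustration:

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of any library.
class XPathQuery {

    // Parses the XML text and returns the nodes matching the XPath expression.
    static NodeList select(String xml, String expression) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // NODESET asks for all matches; evaluate() returns Object, hence the cast.
        return (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        // Made-up document reusing the placeholder tag names from the snippet.
        String xml = "<roottag><othertag>first</othertag><othertag>second</othertag></roottag>";
        NodeList elements = select(xml, "/roottag/othertag");
        for (int i = 0; i < elements.getLength(); i++) {
            System.out.println(elements.item(i).getTextContent());
        }
    }
}
```

To read from a file instead of a string, pass a java.io.File to db.parse as in the snippet.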
I have a lot of small Java XML examples on the shelf, so write if you need something specific.
Arne
Ramon F Herrera - 13 Dec 2007 15:33 GMT
> You will need to read a bit...
You can say that again...
It seems that the harder problem is to ignore about 95% of the material out there. There are so many ways to skin the proverbial (XML) cat.
Every piece of code I have seen takes a different approach, with varying sub-approaches. :-/
I will post a specific question next.
-Ramon
Ramon F Herrera - 13 Dec 2007 15:51 GMT
> I have a lot of small Java XML examples on the shelf, so write
> if you need something specific.
>
> Arne
Later on, I might take you up on that offer, Arne. Thanks so much...
Meanwhile, I have a question for the XML crowd. I found a tutorial:
http://www.totheriver.com/learn/xml/xmltutorial.html#6
which is very close to what I need, with one exception. The tutorial traverses the XML file below:
<?xml version="1.0" encoding="UTF-8"?>
<Personnel>
  <Employee type="permanent">
    <Name>Seagull</Name>
    <Id>3674</Id>
    <Age>34</Age>
  </Employee>
  <Employee type="contract">
    <Name>Robin</Name>
    <Id>3675</Id>
    <Age>25</Age>
  </Employee>
  <Employee type="permanent">
    <Name>Crow</Name>
    <Id>3676</Id>
    <Age>28</Age>
  </Employee>
</Personnel>
and displays the fields. Pretty straightforward stuff. In the case above the tag/field names are known in advance ("Name", "Id", and "Age"). In my case, those names are not known until run time.
-Ramon
---------------------------------------------------------------
private Employee getEmployee(Element empEl) {

    // for each <Employee> element get the text or int values of
    // name, id, age and type
    // I DO NOT know these names below in advance.
    // How can I retrieve them? - Ramon
    String name = getTextValue(empEl, "Name");
    int id = getIntValue(empEl, "Id");
    int age = getIntValue(empEl, "Age");

    String type = empEl.getAttribute("type");

    // Create a new Employee with the values read from the XML nodes
    Employee e = new Employee(name, id, age, type);

    return e;
}

private void parseDocument() {
    // get the root element
    Element docEle = dom.getDocumentElement();

    // get a NodeList of <Employee> elements
    // I know the name of this top element. No problem here - Ramon
    NodeList nl = docEle.getElementsByTagName("Employee");
    if (nl != null && nl.getLength() > 0) {
        for (int i = 0; i < nl.getLength(); i++) {

            // get the employee element
            Element el = (Element) nl.item(i);

            // get the Employee object
            Employee e = getEmployee(el);

            // add it to list
            myEmpls.add(e);
        }
    }
}
Arne Vajhøj - 16 Dec 2007 02:34 GMT
> Meanwhile, I have a question for the XML crowd. I found a tutorial:
>
> http://www.totheriver.com/learn/xml/xmltutorial.html#6
>
> which is very close to what I need, with one exception. The tutorial
> traverses the XML file below:

> <Employee type="permanent">
> <Name>Seagull</Name>
> <Id>3674</Id>
> <Age>34</Age>
> </Employee>

> and display the fields. Pretty straightforward stuff. In the case
> above the tag/field names are known in advance ("Name", "Id", and
> "Age"). In my case, those names are not known until run time.

> //for each <employee> element get text or int values of
> //name ,id, age and name
[quoted text clipped - 4 lines]
> int age = getIntValue(empEl, "Age");
> String type = empEl.getAttribute("type");
Why is that a problem?
They take a String as argument, but it does not matter whether it is a String literal or a String variable.
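To illustrate the point about String variables, here is a small sketch in which the field names arrive at run time (from the command line in this case). The class RuntimeTagNames and its helpers are made up for illustration; textValue is only a stand-in in the spirit of the tutorial's getTextValue, not the tutorial's actual code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of any library.
class RuntimeTagNames {

    // Returns the text of the first <tagName> descendant of parent, or null if none exists.
    static String textValue(Element parent, String tagName) {
        NodeList nl = parent.getElementsByTagName(tagName);
        return nl.getLength() > 0 ? nl.item(0).getTextContent() : null;
    }

    // Looks up each requested field; the names are ordinary String variables.
    static List<String> values(Element parent, List<String> fieldNames) {
        List<String> result = new ArrayList<>();
        for (String field : fieldNames) {
            result.add(textValue(parent, field));
        }
        return result;
    }

    // Parses XML text and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element employee = parseRoot(
                "<Employee type=\"permanent\"><Name>Seagull</Name><Id>3674</Id><Age>34</Age></Employee>");
        // Field names come from the command line, e.g.: java RuntimeTagNames Name Age
        System.out.println(values(employee, Arrays.asList(args)));
    }
}
```

This only helps when the names come from somewhere else at run time, such as a config file or user input.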
If you do not even know them before you read them, then you will need to parse slightly differently.
But it is possible to get all children of a node and get their tag name.
Arne
Ramon F Herrera - 16 Dec 2007 06:54 GMT
> > Meanwhile, I have a question for the XML crowd. I found a tutorial:
>
[quoted text clipped - 23 lines]
> They take a String as argument, but it does not matter whether
> it is a String literal or a String variable.
Either way, I am screwed. I cannot use a variable that says:
String clueless = "I have no idea about your name, can you tell me what it is?";
-Ramon
Ramon F Herrera - 16 Dec 2007 07:06 GMT
> > Meanwhile, I have a question for the XML crowd. I found a tutorial:
>
[quoted text clipped - 18 lines]
> > int age = getIntValue(empEl, "Age");
> > String type = empEl.getAttribute("type");

> Why is that a problem ?
>
> But it is possible to get all children of a node and get their tag name.
Well, this is more like it.
I am sure there must be a way. The question is HOW? Can you provide some code?
My current (subject to change, as all my beliefs!) hypothesis is that I am using the wrong approach (DOM) and that I have to try something else (SAX). It seems to me that DOM doesn't want to allow you to retrieve something as confusing as a (variable) tag name. The DOM designer assumed (as you did, Arne) that the programmer knows the tag names.
-Ramon
Roger Lindsjö - 16 Dec 2007 10:44 GMT
>>> Meanwhile, I have a question for the XML crowd. I found a tutorial:
>>> http://www.totheriver.com/learn/xml/xmltutorial.html#6
[quoted text clipped - 33 lines]
> designer assumed (as you did, Arne) that the programmer knows the tag
> names.
Not the best-looking code, but it traverses the DOM and prints the node names. Use with: java XMLTest < xmlfile
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class XMLTest {

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(System.in);
        printNode(document, "");
    }

    private static void printNode(Node node, String indent) {
        System.out.println(indent + node.getNodeName());
        NodeList list = node.getChildNodes();
        for (int i = 0; i < list.getLength(); i++) {
            printNode(list.item(i), indent + " ");
        }
    }
}
//Roger Lindsjö
Arne Vajhøj - 16 Dec 2007 17:01 GMT
> > But it is possible to get all children of a node and get their tag name.
>
> Well, this is more like it.
>
> I am sure there must be a way. The question is HOW? Can you provide
> some code?
It is straightforward:
NodeList subelements = elementyouhave.getChildNodes();
for(int j = 0; j < subelements.getLength(); j++) {
    String tag = subelements.item(j).getNodeName();
    // do some more
}
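One thing to watch for with getChildNodes(): in an indented file the whitespace between tags shows up as text nodes, whose node name is "#text". A possible sketch that keeps only element children and collects tag name and text into a map, which is close to a spreadsheet row with unknown column names. The class ChildFields and its helpers are made up for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of any library.
class ChildFields {

    // Maps each child element's tag name to its trimmed text, in document order.
    static Map<String, String> fields(Element parent) {
        Map<String, String> result = new LinkedHashMap<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace text nodes, comments and anything else that is not an element.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return result;
    }

    // Parses XML text and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element employee = parseRoot(
                "<Employee type=\"permanent\">\n  <Name>Seagull</Name>\n  <Id>3674</Id>\n  <Age>34</Age>\n</Employee>");
        System.out.println(fields(employee)); // prints {Name=Seagull, Id=3674, Age=34}
    }
}
```

Note that a map keeps only the last value if the same tag name occurs twice under one parent; a list of pairs would be needed if repeated tags matter.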
In some cases XPathAPI.selectNodeList can also be an option.
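For the XPath option, the wildcard expression "*" selects every child element of the context node, whatever its name. The sketch below uses the standard javax.xml.xpath package instead of XPathAPI (a substitution on my part), and the class XPathChildNames and its helpers are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of any library.
class XPathChildNames {

    // Returns the tag names of all child elements of the given context node.
    static List<String> childNames(Node context) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // "*" matches child elements only, so whitespace text nodes are not returned.
        NodeList nodes = (NodeList) xpath.evaluate("*", context, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getNodeName());
        }
        return names;
    }

    // Parses XML text and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element employee = parseRoot(
                "<Employee type=\"permanent\">\n  <Name>Seagull</Name>\n  <Id>3674</Id>\n  <Age>34</Age>\n</Employee>");
        System.out.println(childNames(employee)); // prints [Name, Id, Age]
    }
}
```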
> My current (subject to change, as all my beliefs!) hypothesis is that
> I am using the wrong approach (DOM) and that I have to try something
> else (SAX). It seems to me that DOM doesn't want to allow you to
> retrieve something as confusing as a (variable) tag name. The DOM
> designer assumed (as you did, Arne) that the programmer knows the tag
> names.
No. You can discover an XML DOM tree fine.
I still think DOM will be better than SAX for you.
Arne