Hi,
I'm in the process of making an application that should convert a
Word-document into XML. The Word files will use style names such as
Heading1, Heading2 etc. Since the application has to work on both Macs
and PCs, I cannot use the Word 2003 XML features, and thus Java is the way
to go. Now I'm a little unsure how to "attack" this problem. We
already have a Word application that extracts all information from a
Word document into a text file. This file is tagged, but not in XML.
I'm planning to extract all text from Word, into a Java application
that makes the XML-file.
Any thoughts or suggestions on how to build an XML file from this tagged
text file?
Eivind Løland-Andersen
Paul Davis - 31 Jul 2006 12:49 GMT
> Any thoughts or suggestions on how to build an XML file from this tagged
> text file?
First, you probably want to define some kind of schema for your XML
file. I would suggest going for the OpenDocument format.
http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office
This could help with future maintainability (they also already have a
schema).
Just curious, what are you using to read the Word docs?
(I'm guessing POI or the libs in Open Office)
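
On the question of building XML from the tagged text file, here is a minimal
sketch of one way it might be done. The tagged format is not described in
this thread, so the sketch assumes a hypothetical layout of one paragraph
per line, written as "[StyleName] text". The class name TaggedTextToXml and
the element names "document" and "para" are likewise placeholders, not
OpenDocument names. It uses the StAX XMLStreamWriter from the standard
library, which takes care of escaping characters such as & and <.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class TaggedTextToXml {

    // ASSUMPTION: each line looks like "[Heading1] Some text".
    // The real tagged format is not given, so adjust this pattern to match it.
    private static final Pattern TAGGED_LINE =
            Pattern.compile("^\\[(\\w+)\\]\\s*(.*)$");

    // Converts the tagged lines into one XML string.
    public static String convert(List<String> lines) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter xml =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out);

        xml.writeStartElement("document");      // placeholder root element
        for (String line : lines) {
            Matcher m = TAGGED_LINE.matcher(line);
            if (!m.matches()) {
                continue;                       // skip lines without a style tag
            }
            xml.writeStartElement("para");      // placeholder paragraph element
            xml.writeAttribute("style", m.group(1));
            xml.writeCharacters(m.group(2));    // special characters are escaped here
            xml.writeEndElement();
        }
        xml.writeEndElement();
        xml.writeEndDocument();
        xml.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        List<String> sample = Arrays.asList(
                "[Heading1] Introduction",
                "[Normal] Fish & chips");
        System.out.println(convert(sample));
    }
}
```

If the target really is OpenDocument, the element and attribute names would
have to follow that schema instead; the line-by-line loop would stay much
the same.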