Java Forum / General / April 2007


Parsing with complex regular expressions

kevin  cline - 24 Apr 2007 22:58 GMT
I have a complex multi-line string to parse, so I created a complex
regular expression by combining a bunch of simpler regular
expressions, like this:

   private static final String WS = " +";
   private static final String EOL = " *\n";
   private static final String REST_OF_LINE = ".*\n";
   private static final String REST_OF_BLOCK = REST_OF_LINE
        + "(?:" + WS + REST_OF_LINE + ")*";
   private static final String AMOUNT = "\\d+\\.\\d+";
   private static final String CURRENCY = "[A-Z]{3}" + AMOUNT;

   private static final String FARE = "[A-Z]{3} +\\d*" + EOL
        + WS + CURRENCY + " +" + CURRENCY + EOL
        + WS + AMOUNT + REST_OF_LINE
        + WS + AMOUNT + "[A-Z]*" + EOL
        + " {7}" + REST_OF_LINE;

  ...

  private static final java.util.regex.Pattern PAT =
        Pattern.compile( ... );

This works great for recognizing valid input, but extracting the
parsed data is not so easy.  I wanted to capture it all with capturing
groups, but I ran into two problems: first, the Matcher only stores
the last match for each group, and second, the groups have to be
accessed by index, which would require keeping track of their
positions across the whole expression.
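
To make the first problem concrete, here is a minimal sketch. The class name and the sample input are made up for illustration and are not part of the real data. A capturing group placed inside a repetition matches three times, but the Matcher keeps only the last iteration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastGroupDemo {
    // Hypothetical input: three amounts, one per line.
    static final String INPUT = "1.00\n2.50\n3.75\n";

    // Group 1 sits inside a repeated non-capturing group.
    static final Pattern REPEATED = Pattern.compile("(?:(\\d+\\.\\d+)\n)+");

    static String lastCaptured() {
        Matcher m = REPEATED.matcher(INPUT);
        if (!m.matches()) {
            throw new IllegalStateException("input did not match");
        }
        // Earlier iterations of group 1 are overwritten; only the final one survives.
        return m.group(1);
    }

    public static void main(String[] args) {
        System.out.println(lastCaptured()); // prints 3.75
    }
}
```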

Is there a more powerful regular expression class out there somewhere,
or a more powerful parsing technology that would help with this
problem?  It would be a trivial matter in either Perl (by attaching
code to the sub-expressions) or in C++ (using the SPIRIT parsing
library), but in Java I'm pretty clueless.

Thanks for the help.
Kai Schwebke - 25 Apr 2007 03:51 GMT
kevin cline wrote:
> I have complex multi-line string to parse, so I created a complex
> regular expression by combining a bunch of simpler regular
> expressions, like this:
...
> Is there a more powerful regular expression class out there somewhere,
> or a more powerful parsing technology that would help with this
> problem?

You might have a look at JavaCC, a parser generator for Java, like
yacc or bison for C (https://javacc.dev.java.net/).

Kai
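
If staying with java.util.regex is preferred, one possible sketch addresses both problems from the question. It assumes Java 7 or later, which added named capturing groups and so postdates this thread. Named groups remove the need to track group indices, and looping with find() over a small sub-pattern collects every occurrence instead of only the last one. The class name, method name, and sample line below are hypothetical; AMOUNT and the currency pattern reuse the building blocks from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FareLineParser {
    // Same building blocks as in the question, with named groups added.
    static final String AMOUNT = "\\d+\\.\\d+";
    static final Pattern CURRENCY =
            Pattern.compile("(?<code>[A-Z]{3})(?<amount>" + AMOUNT + ")");

    /** Collects every currency-code/amount pair found in the text, in order. */
    static List<String[]> parseCurrencies(String text) {
        List<String[]> result = new ArrayList<>();
        Matcher m = CURRENCY.matcher(text);
        // Each find() advances to the next occurrence, so no match is overwritten.
        while (m.find()) {
            result.add(new String[] { m.group("code"), m.group("amount") });
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample line, not taken from the original data.
        for (String[] pair : parseCurrencies("   USD123.45 EUR98.76 \n")) {
            System.out.println(pair[0] + " " + pair[1]);
        }
    }
}
```

The large combined pattern can still be used to validate a whole block, with smaller patterns like this one applied afterwards to the matched text to pull out the repeated pieces.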

