Hi group,
I am implementing a fairly simple and straightforward text-search (I
display each line that contains required pattern ) that supports
Boolean queries in the following format:
str1 AND str2 NOT str3 - where not is a unary operation thus the
following would be equivalent to str1 AND str ANDNOT str3 by default
unless a user specifies otherwise....
Additionally, It is left-associative;
I would also like to have parenthesis as well: (str1 AND (str2 OR
str3))
That is it....
Should I write a parser for that followed by walking the AST (abstract
syntax tree) - javacc or antlar? - or would java regular expressions
suffice? it should be fast, whereas java regex is known to be slow....
Any suggestions, examples, references would be highly appreciated.
Thx
Stefan Schulz - 09 Jan 2006 13:53 GMT
As soon as you have parenthesis, you will need a context free parser.
Regular expressions can not even check if the expression has
well-formed brackets (see pumping lemma)
Oliver Wong - 10 Jan 2006 21:42 GMT
> Hi group,
>
> I am implementing a fairly simple and straightforward text-search (I
> display each line that contains required pattern ) that supports
> Boolean queries in the following format:
[...]
> I would also like to have parenthesis as well: (str1 AND (str2 OR
> str3))
[...]
> Should I write a parser for that followed by walking the AST (abstract
> syntax tree) - javacc or antlar? - or would java regular expressions
> suffice? it should be fast, whereas java regex is known to be slow....
>
> Any suggestions, examples, references would be highly appreciated.
Schulz is correct in saying regular expressions are not sufficiently
powerful. As for references, I recommend you post on comp.compilers to get
your design worked out if need be, and then comp.compilers.tools.javacc for
specific help on implementing your parser.
- Oliver