java pattern matcher

Moiristo - 12 Jul 2006 16:35 GMT
I have a problem with the java Matcher.

Consider the following regex: <<<(.)+>>>
Now, consider the following String containing two occurrences of the
above regex:

SELECT TOP <<<Upper limit>>> FROM (SELECT TOP <<<Lower limit>>> FROM person)

However, when I use Matcher.find(), it does not do what I expect.
Instead of matching the two substrings, it finds only one, namely:

'Upper limit>>> FROM (SELECT TOP <<<Lower limit'

This is of course correct, but not what I intended. Can someone help me
to solve this?
Stefan Ram - 12 Jul 2006 16:42 GMT
>Consider the following regex: <<<(.)+>>>

 Try

<<<(.)+?>>>

 or

<<<([^>])+>>>
Stefan Ram - 12 Jul 2006 16:46 GMT
>  Try
><<<(.)+?>>>
>  or
><<<([^>])+>>>

 But you actually might want

<<<(.+?)>>>

 or

<<<([^>]+)>>>

 , respectively.
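
 A minimal sketch of why the placement of the parentheses matters,
 using the query string from the original post. The class and method
 names are made up for illustration. With (.)+? the group is repeated,
 so group(1) only keeps the last character it matched; with (.+?) the
 group spans the whole placeholder name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupPlacementDemo {
    // Hypothetical helper: returns group(1) of every match of regex in input.
    public static List<String> captures(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String sql = "SELECT TOP <<<Upper limit>>> FROM "
                   + "(SELECT TOP <<<Lower limit>>> FROM person)";

        // Repeated one-character group: only the last iteration is kept.
        System.out.println(captures("<<<(.)+?>>>", sql)); // prints [t, t]

        // Quantifier inside the group: the whole name is captured.
        System.out.println(captures("<<<(.+?)>>>", sql)); // prints [Upper limit, Lower limit]
    }
}
```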
Moiristo - 12 Jul 2006 18:24 GMT
>>  Try
>> <<<(.)+?>>>
[quoted text clipped - 10 lines]
>
>   , respectively.

Thank you! Could you please explain why this works and not the regex I used?
Oliver Wong - 12 Jul 2006 18:36 GMT
>>>  Try
>>> <<<(.)+?>>>
[quoted text clipped - 11 lines]
> Thank you! Could you please explain why this works and not the regex I
> used?

   Quantifiers in Java's regular expression engine are greedy by default
(as in most RE engines). In the first alternative, (.+?), the '?' acts as
a modifier on '+', telling it not to be greedy. That is, instead of
matching the longest possible substring, it tries to match the shortest
possible substring.

   In the second alternative, instead of accepting "<<<" followed by
anything, followed by ">>>", it accepts "<<<" followed by anything except
">", followed by ">>>".

   - Oliver
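
   The explanation above can be checked with a short sketch that runs the
greedy, reluctant and negated-class patterns over the original query
string and prints the full text of each match. The class and method names
are only illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctantDemo {
    // Hypothetical helper: returns the full text of every match found by find().
    public static List<String> matches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String sql = "SELECT TOP <<<Upper limit>>> FROM "
                   + "(SELECT TOP <<<Lower limit>>> FROM person)";

        // Greedy: '+' runs to the last ">>>" on the line, so there is one long match.
        System.out.println(matches("<<<(.+)>>>", sql).size());    // prints 1

        // Reluctant: '+?' stops at the first ">>>" it can reach.
        System.out.println(matches("<<<(.+?)>>>", sql));
        // prints [<<<Upper limit>>>, <<<Lower limit>>>]

        // Negated class: the repetition cannot cross a '>' at all.
        System.out.println(matches("<<<([^>]+)>>>", sql));
        // prints [<<<Upper limit>>>, <<<Lower limit>>>]
    }
}
```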
lordy - 13 Jul 2006 00:29 GMT
>>>>  Try
>>>> <<<(.)+?>>>
[quoted text clipped - 23 lines]
>
>     - Oliver

And the latter is more efficient. The first will do a lot of
backtracking given the expected input strings.

Lordy
Stefan Ram - 12 Jul 2006 18:59 GMT
>> <<<(.+?)>>>
>> <<<([^>]+)>>>
>Thank you! Could you please explain why this works and not the regex I used?

 The first suggestion should be used, because the second one
 fails to match "<<<ab<c>def>>>".

 Oliver by now has explained the expressions. See also:

http://download.java.net/jdk6/docs/api/java/util/regex/Pattern.html
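
 A small sketch of the counterexample, assuming placeholder names may
 themselves contain '>'. The class and method names are invented for
 this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NegatedClassLimitDemo {
    // Hypothetical helper: group(1) of the first match, or null if find() fails.
    public static String firstCapture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "<<<ab<c>def>>>";

        // [^>]+ stops at the '>' after "c", where ">>>" does not follow.
        System.out.println(firstCapture("<<<([^>]+)>>>", input)); // prints null

        // .+? keeps extending until the text that follows is ">>>".
        System.out.println(firstCapture("<<<(.+?)>>>", input));   // prints ab<c>def
    }
}
```

 If a '>' can never occur inside a placeholder, both patterns find the
 same matches.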
Moiristo - 12 Jul 2006 20:50 GMT
>>> <<<(.+?)>>>
>>> <<<([^>]+)>>>
[quoted text clipped - 6 lines]
>
> http://download.java.net/jdk6/docs/api/java/util/regex/Pattern.html

Thank you both. I knew it was something like that, but I didn't know
about quantifier modifiers in regexes; I only knew that '?' stood for
'once or not at all'.

