
Java Forum / General / March 2008

Question about quantifiers in Java regular expressions

NeoGeoSNK - 02 Mar 2008 14:42 GMT
Hello,
I have been learning Java regular expressions for a long time, but I am
still confused about quantifiers:

import java.util.regex.*;

public class NRGRegex {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("a??");
        String a = "aaa";
        Matcher m = p.matcher(a);
        while (m.find()) {
            System.out.println("found char = " + m.group()
                    + " at " + m.start() + " and " + m.end());
        }
    }
}

The output is:
found char =  at 0 and 0
found char =  at 1 and 1
found char =  at 2 and 2
found char =  at 3 and 3
Here "a??" is a reluctant quantifier, but why does it never match any of
the 'a' characters?

When I use the greedy quantifier, Pattern p = Pattern.compile("a?");
the output is:
found char = a at 0 and 1
found char = a at 1 and 2
found char = a at 2 and 3
found char =  at 3 and 3

I thought a greedy quantifier first consumes the whole string "aaa" at
once, so why are there no empty matches at (0,0), (1,1) and (2,2), as
there are with the reluctant quantifier?

Thanks!
Joshua Cranmer - 02 Mar 2008 18:40 GMT
> Hello,
> I have learned Java Regular expression for a long time, but still
[quoted text clipped - 19 lines]
> here "a??" is Reluctant quantifiers but why all char 'a' not match
> successful?

The regex "a?" means that the a is either matched or it isn't. In its
plain (greedy) form, it attempts to match a first and only omits the a
when it can't match. However, you specified the reluctant quantifier,
which makes the `?' operator attempt to not match first.

Pseudocode for "a?":
try to match `a' and then the rest of the regex
if match fails:
   try to match nothing and rest of regex
   return result of match
else:
   return true

For "a??":
try to match nothing and then the rest of the regex
if match fails:
   try to match `a' and rest of regex
   return result of match
else:
   return true

Since "a??" is the full regex, the first attempt (to match nothing) will
succeed at every position, and the fallback of matching `a' will never occur.
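
To see the fallback actually happen, something has to follow the reluctant quantifier. The sketch below is only an illustration (the class name ReluctantFallbackDemo and the helper findAll are made up here); it records each match as "text@start-end" so the two patterns can be compared:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReluctantFallbackDemo {
    // Hypothetical helper: collects every match as "text@start-end".
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return out;
    }

    public static void main(String[] args) {
        // "a??" alone: matching nothing always succeeds, so 'a' is never consumed.
        System.out.println(findAll("a??", "aaa"));
        // "a??b": matching nothing leaves 'a' where 'b' is required, so the
        // engine falls back to consuming the 'a' and the match becomes "ab".
        System.out.println(findAll("a??b", "ab"));
    }
}
```

With "a??b" the empty attempt fails because the next character is not `b', so the second branch of the pseudocode is taken.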

> when I use greedy quantifiers Pattern p = Pattern.compile("a?");
> the output result is:
[quoted text clipped - 6 lines]
> but why the emtry char at (0,0) (1,1) (2,2) can't match successful
> compare with Reluctant quantifiers ?

Greedy means, essentially, that the quantifier matches as much as it can
first and only gives a character back if the rest of the regex then fails.
A reluctant quantifier matches as little as it can first, attempts the
rest of the regex, and only matches more if it has to.

A typical example is this:
Finding a closing parenthesis in an arithmetic expression (can't handle
nested):
"(1+4)*5-6/(1+9)": the obvious regex "\\(.*\\)" will match the entire
string, whereas "\\(.*?\\)" will match only "(1+4)".

If you want to match "aaa", the regex "a*" or "a+" will do so.
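
A small sketch of that, with the reluctant forms alongside for comparison (the class name RepeatQuantifierDemo and the helper firstMatch are invented for this example, not part of java.util.regex):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatQuantifierDemo {
    // Hypothetical helper: returns the first match of regex in input, or null.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "aaa";
        // Greedy forms take all three characters in one match.
        System.out.println("[" + firstMatch("a*", input) + "]");
        System.out.println("[" + firstMatch("a+", input) + "]");
        // Reluctant forms stop as early as they are allowed to.
        System.out.println("[" + firstMatch("a*?", input) + "]"); // zero characters
        System.out.println("[" + firstMatch("a+?", input) + "]"); // one character
    }
}
```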

Finally, there is the possessive quantifier, which refuses to backtrack
on failed matches. I can imagine that there are times when this would be
helpful, but none that I can think of off the top of my head...
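
As a rough illustration of "refuses to backtrack" (a sketch only; the class name PossessiveDemo and the helper firstMatch are made up), compare the greedy and possessive forms of the parenthesis regex from above:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PossessiveDemo {
    // Hypothetical helper: returns the first match of regex in input, or null.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "(1+4)*5";
        // Greedy: ".*" runs to the end, then backs up until "\\)" can match.
        System.out.println(firstMatch("\\(.*\\)", input));
        // Possessive: ".*+" runs to the end and never gives anything back,
        // so "\\)" has nothing left to match and the whole regex fails.
        System.out.println(firstMatch("\\(.*+\\)", input));
        // Possessive on a class that excludes ')': nothing needs to be given
        // back, so the match succeeds without any backtracking.
        System.out.println(firstMatch("\\([^)]*+\\)", input));
    }
}
```

The last pattern hints at one situation where a possessive quantifier fits: when the repeated part can never overlap with what follows, backtracking into it could not help anyway.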

Signature

Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth

NeoGeoSNK - 03 Mar 2008 08:45 GMT
> > Hello,
> > I have learned Java Regular expression for a long time, but still
[quoted text clipped - 74 lines]
> Beware of bugs in the above code; I have only proved it correct, not
> tried it. -- Donald E. Knuth

Thanks, that's very clear.
> The definition of "a?" means that either a is matched or it isn't.
> Without a quantifier, it attempts to match a first and only omit the a
> when it can't match. However, you specified the reluctant quantifier,
> which makes the `?' operator attempt to not match first.
So do you mean:
X? means X, once or not at all
but
X?? means not at all or X, once

One question:
> "(1+4)*5-6/(1+9)": the obvious regex "\\(.*\\)" will match the entire
> string, whereas "\\(.*?\\)" will match only "(1+4)".

I have tested it, and "\\(.*?\\)" matches both (1+4) and (1+9). Why do
you think it only matches (1+4)?

Thanks again for your reply.
Lars Enderin - 03 Mar 2008 12:25 GMT
NeoGeoSNK wrote:
>>> Hello,
>>> I have learned Java Regular expression for a long time, but still
[quoted text clipped - 17 lines]
> I have test it, and "\\(.*?\\)" match both (1+4) and (1+9), why do you
> think it only match (1+4) ?

That regexp matches first (1+4), then (1+9). The other regexp matches
from the first ( up to and including the last ), once.
Joshua Cranmer - 03 Mar 2008 21:49 GMT
>> The definition of "a?" means that either a is matched or it isn't.
>> Without a quantifier, it attempts to match a first and only omit the a
[quoted text clipped - 4 lines]
> but
> X?? meaning not at all or X,once

Right.

>> "(1+4)*5-6/(1+9)": the obvious regex "\\(.*\\)" will match the entire
>> string, whereas "\\(.*?\\)" will match only "(1+4)".
>>
> I have test it, and "\\(.*?\\)" match both (1+4) and (1+9), why do you
> think it only match (1+4) ?

Oops, I should have been clearer. "(1+9)" will be matched as well. What
I had intended to say was that the first match would not match the whole
string but merely the indicated substring.
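
A short sketch that lists every match for both regexes on that input (the class name ParenMatchDemo and the helper findAll are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenMatchDemo {
    // Hypothetical helper: returns the text of every successive match.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String expr = "(1+4)*5-6/(1+9)";
        // Greedy: one match, from the first '(' to the last ')'.
        System.out.println(findAll("\\(.*\\)", expr));
        // Reluctant: each find() stops at the nearest ')', giving two matches.
        System.out.println(findAll("\\(.*?\\)", expr));
    }
}
```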



