Java Forum / First Aid / April 2008

reluctant quantifier doesn't work

alex_us01 - 09 Apr 2008 21:01 GMT
Hi all,

I use a regex in Java 6 like this:
"<a .*?some string here</a>"

and expect that .*? will match the smallest char sequence.
However, it matches the longest.

I give this regex to a method I wrote:
    private Matcher getMatcherMLDA(String regex, String input) {
        Pattern pat = Pattern.compile(regex, Pattern.MULTILINE | Pattern.DOTALL);
        Matcher matcher = pat.matcher(input);
        return matcher;
    }

It looks like the reluctant quantifier doesn't work.
thanks!
Roedy Green - 09 Apr 2008 22:37 GMT
On Wed, 9 Apr 2008 13:01:50 -0700 (PDT), alex_us01
<arisalex@gmail.com> wrote, quoted or indirectly quoted someone who
said :

>I use an regex in Java 6 as this:
>"<a .*?some string here</a>"

Try this

"<a( .*)?some string here</a>"

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

Gordon Beaton - 10 Apr 2008 06:59 GMT
> Try this
>
> "<a( .*)?some string here</a>"

Now it isn't reluctant anymore.

/gordon
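
To make Gordon's point concrete, here is a minimal sketch comparing the two patterns with find(). The class name, the helper firstMatch() and the two-link input are made up for illustration; only the two regexes and the flags come from the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Hypothetical helper: returns the first substring the regex finds, or null.
    // Uses the same flags as getMatcherMLDA() from the original post.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.MULTILINE | Pattern.DOTALL).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Assumed input: two identical links on one line.
        String input = "<a href=\"1\">some string here</a> <a href=\"2\">some string here</a>";

        // Reluctant .*? stops at the first "some string here</a>".
        System.out.println(firstMatch("<a .*?some string here</a>", input));

        // ( .*)? is an optional group around a greedy .*, so it runs to the
        // last "some string here</a>" and swallows both links.
        System.out.println(firstMatch("<a( .*)?some string here</a>", input));
    }
}
```

The trailing ? in "( .*)?" makes the group optional; it does not make the .* inside it reluctant.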

--
alex_us01 - 30 Apr 2008 23:09 GMT
> > Try this
>
[quoted text clipped - 5 lines]
>
> --

True!
Therefore, this is not a solution.
Thanks, Gordon.
Gordon Beaton - 10 Apr 2008 06:58 GMT
> I use an regex in Java 6 as this:
> "<a .*?some string here</a>"
>
> and expect that .*? will match the smallest char sequence.
> However, it matches the longest.

First, you failed to show us how you are observing that.

Second, you need to realize that unless you specified a region, the
regexp is matched against the *entire* input, and the string matched
by the reluctant quantifier will match as much as necessary (but not
more) for the entire input to match.

You likely need to add an additional quantifier to the end of your
regexp to match the remainder of the input you aren't interested in,
thus allowing the reluctant quantifier to match less:

 "<a .*?some string here</a>.*"

/gordon
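
The behaviour Gordon describes applies when the pattern is run with Matcher.matches(), which requires the whole input to match. A rough sketch follows; the class name, the helper captureWithMatches(), the added capturing group and the sample input are assumptions for illustration and do not appear in the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesDemo {
    // Assumed input: two identical links on one line.
    static final String TWO_LINKS =
        "<a href=\"1\">some string here</a> <a href=\"2\">some string here</a>";

    // Hypothetical helper: returns what group 1 captured when the regex has to
    // cover the entire input, or null if matches() fails.
    static String captureWithMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.MULTILINE | Pattern.DOTALL).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Without a trailing .* the reluctant group is forced to stretch
        // across the first link so that the match can end at the end of input.
        System.out.println(captureWithMatches("<a (.*?)some string here</a>", TWO_LINKS));

        // With a trailing .* the remainder is absorbed there, so the
        // reluctant group can stay as short as possible.
        System.out.println(captureWithMatches("<a (.*?)some string here</a>.*", TWO_LINKS));
    }
}
```

With matches(), "reluctant" still means "as little as possible", but only among the ways that let the whole input match.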

--
Hendrik Maryns - 10 Apr 2008 13:34 GMT
Gordon Beaton wrote:
>> I use an regex in Java 6 as this:
>> "<a .*?some string here</a>"
[quoted text clipped - 14 lines]
>
>   "<a .*?some string here</a>.*"

Or use find() instead of matches().
(but you’ll want to introduce some capturing groups)

H.
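
A sketch of the find() approach with a capturing group might look like this. The class name, the helper captures() and the sample input are invented for the example; the regex is the one from the thread with parentheses added around the reluctant part.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindDemo {
    // Hypothetical helper: collects group 1 of every match that find() locates.
    static List<String> captures(String regex, String input) {
        List<String> found = new ArrayList<String>();
        Matcher m = Pattern.compile(regex, Pattern.MULTILINE | Pattern.DOTALL).matcher(input);
        // find() scans forward for the next match; it does not require the
        // match to cover the whole input.
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumed input: two identical links on one line.
        String input = "<a href=\"1\">some string here</a> <a href=\"2\">some string here</a>";
        // Each link is matched separately; group 1 holds the reluctant part.
        System.out.println(captures("<a (.*?)some string here</a>", input));
    }
}
```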

Hendrik Maryns
http://tcl.sfs.uni-tuebingen.de/~hendrik/
==================
http://aouw.org
Ask smart questions, get good answers:
http://www.catb.org/~esr/faqs/smart-questions.html

alex_us01 - 30 Apr 2008 23:00 GMT
I do not use find() or matches() at all.
Once I get the matcher via getMatcherMLDA(),
I use replaceFirst() or replaceAll() on it.

For example:

Matcher matcher = getMatcherMLDA(X,Y);
matcher.replaceAll("");
OR
matcher.replaceFirst("");

Let me know if any more clarification is needed. Thanks.
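
For reference, Matcher.replaceFirst() and Matcher.replaceAll() reset the matcher and then scan the input for matches, so they behave like find() rather than matches(). A small sketch, with getMatcherMLDA() copied from the first post (made static here) and everything else (class name, wrapper methods, input) assumed for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceDemo {
    // Same method as in the original post, made static for this sketch.
    static Matcher getMatcherMLDA(String regex, String input) {
        Pattern pat = Pattern.compile(regex, Pattern.MULTILINE | Pattern.DOTALL);
        return pat.matcher(input);
    }

    // Hypothetical wrappers around the two calls shown above.
    static String removeFirst(String regex, String input) {
        return getMatcherMLDA(regex, input).replaceFirst("");
    }

    static String removeAll(String regex, String input) {
        return getMatcherMLDA(regex, input).replaceAll("");
    }

    public static void main(String[] args) {
        // Assumed input: two identical links separated by one space.
        String input = "<a href=\"1\">some string here</a> <a href=\"2\">some string here</a>";
        String regex = "<a .*?some string here</a>";

        // Only the first link is removed; the space and second link remain.
        System.out.println("[" + removeFirst(regex, input) + "]");

        // Both links are removed one by one; only the separating space remains.
        System.out.println("[" + removeAll(regex, input) + "]");
    }
}
```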
alex_us01 - 30 Apr 2008 23:12 GMT
> > I use an regex in Java 6 as this:
> > "<a .*?some string here</a>"
[quoted text clipped - 3 lines]
>
> First, you failed to show us how you are observing that.

Not quite. I chose not to.
It is too complicated.
I avoid posting long code unless asked.

> Second, you need to realize that unless you specified a region, the
> regexp is matched against the *entire* input, and the string matched
> by the reluctant quantifier will match as much as necessary (but not
> more) for the entire input to match.

Wrong.
I tried in a small example with the method I provided above and it
worked.
It did NOT need to match the *entire* input string
and it worked for a small pattern and small input.
Perhaps I will find an example where it doesn't work and post again.
(However, in the meantime, if you get an insight, let me know.)

Alex
alex_us01 - 30 Apr 2008 23:15 GMT
Alex wrote:
> It did NOT need to match the *entire* input string
> and it worked for a small pattern and small input.

Here is an example where this worked:

---
Matcher matcher = ctr.getMatcherMLDA("<.*?>", "<alex> <alex>");
String out = matcher.replaceFirst("REPLACED");
---

When I ran this code, the out variable contained:
"REPLACED <alex>"

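The failing input was never posted, so the following is only a guess at what may have happened. A reluctant quantifier keeps the match short, but the match still starts at the leftmost position where any match is possible. If an unrelated <a ...> tag precedes the wanted link, the match begins there. The sketch below uses an invented input and an invented alternative pattern; the alternative assumes the link text contains no nested tags.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LeftmostDemo {
    // Hypothetical helper mirroring getMatcherMLDA(...).replaceFirst("REPLACED").
    static String replaceFirst(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.MULTILINE | Pattern.DOTALL).matcher(input);
        return m.replaceFirst("REPLACED");
    }

    public static void main(String[] args) {
        // Assumed input: an unrelated link followed by the wanted one.
        String input = "<a href=\"1\">other link</a> <a href=\"2\">some string here</a>";

        // The match starts at the first "<a " and .*? has to grow until it
        // reaches "some string here</a>", so both links are replaced.
        System.out.println(replaceFirst("<a .*?some string here</a>", input));

        // Character classes that cannot cross a tag boundary keep the match
        // inside a single link (assumes no nested tags in the link text).
        System.out.println(replaceFirst("<a [^>]*>[^<]*some string here</a>", input));
    }
}
```

In the "<.*?>" example above, the leftmost "<" happens to be the start of the wanted match, which would explain why that small case behaves as expected.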
