
Java Forum / General / January 2008


Regexes: Forcing the LAST Match

Hal Vaughan - 07 Jan 2008 10:54 GMT
I'm not sure what terms to look for here.  I want to use a regex that will
match as little as possible in a string from the *end* of the string.  For
example, if I have "OneTwoThreeTwoOne", I want to know how I can match
only "TwoOne" at the end and not "TwoThreeTwoOne".

I've been experimenting with quantifiers, but something like "Two.*?$" will
grab everything from the first occurrence of "Two".  How can I make it grab
from only the last occurrence?

Thanks!

Hal
Stefan Ram - 07 Jan 2008 14:50 GMT
>I'm not sure what terms to look for here. I want to use a regex that will
>match as little as possible in a string from the *end* of the string.

 »As little as possible from the end of the string« would be "".

>For example, if I have "OneTwoThreeTwoOne", I want to know how
>I can match only "TwoOne" at the end and not "TwoThreeTwoOne".

 When I tell you that this can be done by »TwoOne$«, you will
 not be satisfied, I guess. But it is the actual answer to the
 preceding English sentence.

 It might help to try to express what you want in English,
 in a manner that does not need examples to be understood.

 »As little as possible from the end« does not seem to be that
 expression, because it contradicts the example given.
Hal Vaughan - 07 Jan 2008 17:40 GMT
>>I'm not sure what terms to look for here. I want to use a regex that will
>>match as little as possible in a string from the *end* of the string.
[quoted text clipped - 10 lines]
>   It might help to try to express what you want in English,
>   in a manner that does not need examples to be understood.

Okay.

I want to be able to specify a phrase with a regex and remove from the last
occurrence of that phrase to the end of the original string.

Examples:

Full String: "The date 2008-01-07 is one day before Elvis' birthday on
2008-01-08, which is tomorrow."
Match: "[0-9]{2,4}-[0-9]{1,2}-[0-9]{1,2}"
Desired Result: "The date 2008-01-07 is one day before Elvis' birthday on "

It matches the LAST date that fits the format and removes from the last
match on.  If I used "[0-9]{2,4}-[0-9]{1,2}-[0-9]{1,2}.*?" it'll match from
the first date on.

Full String: "One Two Three Two One"
Match: "Two"
Desired Result: "One Two Three "

I've tried using negative lookaheads, but, as best I can guess, in that last
example, if I use any kind of quantifier, then the first match I get
is "Two Three Two One" and it seems to not see it can match just "Two One".
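For reference, a negative lookahead can express "the last occurrence" if it asserts that the phrase does not occur again anywhere later in the string. The following is only a minimal sketch, assuming the phrase is a well-formed regex fragment; the class name LastOccurrenceLookahead and the method cutFromLast are invented for illustration.

```java
import java.util.regex.Pattern;

class LastOccurrenceLookahead {
    // Removes everything from the last match of `phrase` to the end of `text`.
    // `phrase` is treated as a regex fragment and wrapped in non-capturing
    // groups so that alternations inside it stay contained.
    static String cutFromLast(String text, String phrase) {
        Pattern p = Pattern.compile(
            "(?:" + phrase + ")(?!.*(?:" + phrase + ")).*$",
            Pattern.DOTALL); // DOTALL lets '.' cross line breaks
        // If nothing matches, replaceFirst returns the text unchanged.
        return p.matcher(text).replaceFirst("");
    }

    public static void main(String[] args) {
        // Brackets make the trailing space visible.
        System.out.println("[" + cutFromLast("One Two Three Two One", "Two") + "]");
    }
}
```

The lookahead is re-evaluated at every candidate position, so this can be slow on long inputs with many occurrences.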

I have used Matcher.find() and a loop to get the last position of the match,
then used Matcher.start() to get the position, then get a substring of the
original string, using that start position as where the substring ends, but
if I want to delete from just after the match, then I have to check to make
sure I'm not out of bounds and so on.  I would think there would be an easy
way to match the last occurrence of a phrase to the end of a string instead
of having to loop through it.
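The loop approach described in this paragraph might look roughly like the sketch below. The class and method names are made up for illustration. Matcher.end() never exceeds the length of the input, so the "delete from just after the match" variant should not need a separate bounds check.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastMatchLoop {
    // Returns the text up to (not including) the last match of `regex`,
    // or the text unchanged if there is no match.
    static String cutFromLastMatch(String text, String regex) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int lastStart = -1;
        while (m.find()) {
            lastStart = m.start(); // remember the most recent match start
        }
        return lastStart < 0 ? text : text.substring(0, lastStart);
    }

    // Variant: keeps the last match itself and removes only what follows it.
    static String cutAfterLastMatch(String text, String regex) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int lastEnd = -1;
        while (m.find()) {
            lastEnd = m.end(); // always <= text.length()
        }
        return lastEnd < 0 ? text : text.substring(0, lastEnd);
    }

    public static void main(String[] args) {
        System.out.println("[" + cutFromLastMatch("One Two Three Two One", "Two") + "]");
        System.out.println("[" + cutAfterLastMatch("One Two Three Two One", "Two") + "]");
    }
}
```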

Thanks!

Hal
Robert Klemme - 07 Jan 2008 18:09 GMT
>>> I'm not sure what terms to look for here. I want to use a regex that will
>>> match as little as possible in a string from the *end* of the string.
[quoted text clipped - 40 lines]
> way to match the last occurrence of a phrase to the end of a string instead
> of having to loop through it.

In your case you can use this

http://java.sun.com/javase/6/docs/api/java/lang/String.html#lastIndexOf(java.lang.String)
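
A minimal sketch of that suggestion, assuming the phrase is a literal string rather than a regex (the class and method names are invented for illustration):

```java
class LastIndexOfCut {
    // Removes everything from the last literal occurrence of `phrase`
    // to the end of `text`; returns `text` unchanged if it does not occur.
    static String cutFromLastLiteral(String text, String phrase) {
        int i = text.lastIndexOf(phrase); // -1 when the phrase is absent
        return i < 0 ? text : text.substring(0, i);
    }

    public static void main(String[] args) {
        System.out.println(cutFromLastLiteral("OneTwoThreeTwoOne", "Two"));
    }
}
```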

    robert
Hal Vaughan - 07 Jan 2008 18:17 GMT
>>>> I'm not sure what terms to look for here. I want to use a regex that
>>>> will match as little as possible in a string from the *end* of the
[quoted text clipped - 46 lines]
>
> In your case you can use this

> http://java.sun.com/javase/6/docs/api/java/lang/String.html#lastIndexOf(java.lang.String)

But that uses a String not a regex.  I tried it.

Thanks, though.

Hal
Stefan Ram - 07 Jan 2008 18:12 GMT
>I want to be able to specify a phrase with a regex and remove from the last
>occurrence of that phrase to the end of the original string.

 Thus, for the phrase »alpha« and the text
 »alpha beta alpha gamma alpha delta alpha epsilon«,
 the text to be removed is »alpha epsilon«.

public class Main
{
 public static void test( final java.lang.String text )
 {
   final java.util.regex.Matcher matcher =
   java.util.regex.Pattern.compile
   ( "^(.*)alpha.*$" ).
   matcher( text );

   while( matcher.find() )
   java.lang.System.out.println( matcher.group( 1 )); }

 public static void main( final java.lang.String[] args )
 { test
   ( "alpha beta alpha gamma alpha delta alpha epsilon alpha zeta" ); }}

alpha beta alpha gamma alpha delta alpha epsilon
Hal Vaughan - 07 Jan 2008 18:24 GMT
>>I want to be able to specify a phrase with a regex and remove from the
>>last occurrence of that phrase to the end of the original string.
[quoted text clipped - 20 lines]
>
> alpha beta alpha gamma alpha delta alpha epsilon

Then what you're doing, in essence, is using a greedy quantifier at the
start to gobble up as much as possible so it only finds the last occurrence
and then just using capture to get the text that's to be kept and using
that as the replacement.

Am I right in how this is working?  I see it works, I just want to be sure I
understand it clearly.

Thanks!

Hal
Stefan Ram - 07 Jan 2008 18:31 GMT
>Then what you're doing, in essence, is using a greedy
>quantifier at the start to gobble up as much as possible so it
>only finds the last occurrence and then just using capture to
>get the text that's to be kept and using that as the
>replacement.

 I believe so.
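
One caveat when the phrase is a regex with quantifiers, such as the date pattern from earlier in the thread: the greedy prefix stops at the last position where the phrase *can* start, which may lie inside the last occurrence. The sketch below (class and method names invented for illustration) shows the effect and one possible fix, assuming a word boundary is an acceptable anchor for the phrase.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyPrefixCut {
    // Keeps the text before the last place where `phrase` can match.
    // The greedy (.*) consumes as much as possible before the phrase.
    static String keepBeforeLast(String text, String phrase) {
        Matcher m = Pattern
            .compile("^(.*)(?:" + phrase + ").*$", Pattern.DOTALL)
            .matcher(text);
        return m.matches() ? m.group(1) : text;
    }

    public static void main(String[] args) {
        String text = "The date 2008-01-07 is one day before Elvis' "
                    + "birthday on 2008-01-08, which is tomorrow.";
        String date = "[0-9]{2,4}-[0-9]{1,2}-[0-9]{1,2}";

        // Without an anchor, "08-01-08" also fits the pattern, so the
        // kept text ends in "on 20".
        System.out.println("[" + keepBeforeLast(text, date) + "]");

        // A leading \b forces the phrase to start at a word boundary,
        // so the kept text ends in "on ".
        System.out.println("[" + keepBeforeLast(text, "\\b" + date) + "]");
    }
}
```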
Hal Vaughan - 07 Jan 2008 18:34 GMT
>>Then what you're doing, in essence, is using a greedy
>>quantifier at the start to gobble up as much as possible so it
[quoted text clipped - 3 lines]
>
>   I believe so.

Okay.  I got it.

Thanks!

Hal
Stefan Ram - 07 Jan 2008 19:49 GMT
>Okay.  I got it.

 One still might ask, whether there is a way to just inspect
 the end of the string. I am not absolutely sure, whether the
 following code really does that, but I would try it this way:

public class Main
{
 public static java.lang.String pos( final java.lang.String text )
 {
   final java.util.regex.Matcher matcher =
   java.util.regex.Pattern.compile
   ( "(?<=(alpha.{0,2147483642}?)$)" ).
   matcher( text );

   return matcher.find() ? matcher.group( 1 ) : ""; }

 public static void main( final java.lang.String[] args )
 {
   final java.lang.String source = "alpha beta alpha gamma alpha delta";

   final java.lang.String stringToBeRemoved = pos( source );

   java.lang.System.out.println( stringToBeRemoved );

   final java.lang.String result = source.substring
   ( 0, source.length() - stringToBeRemoved.length() );

   java.lang.System.out.println( result ); }}

alpha delta
alpha beta alpha gamma
Ronny Schuetz - 07 Jan 2008 15:30 GMT
I guess you want to look for the difference of greedy vs. reluctant
(non-greedy) matches. However, see Stefan's reply.

Ronny

