Java Forum / Tools / June 2005

Eclipse advanced regex help

Chris - 02 Jun 2005 08:51 GMT
How do I search for a pattern only if the pattern is before a specific
character eg '%'. For example:-

    1. pattern % something
    2. something % pattern

I want to be able to match line 1 but not line 2.

Any ideas?
Robert Klemme - 02 Jun 2005 11:59 GMT
> How do I search for a pattern only if the pattern is before a specific
> character eg '%'. For example:-
[quoted text clipped - 5 lines]
>
> Any ideas?

Make it part of the pattern and / or use lookahead and lookbehind.
http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html#special

Kind regards

   robert
Chris - 03 Jun 2005 08:27 GMT
Thanks for that, lookbehind is what I wanted to know about.

However, it could be that I don't fully understand how to use
lookbehind. In eclipse, I'm using the regex in eclipse search with case
insensitivity with the following expression:-

    (?<!%).*pattern

and it's matching the case I'm expecting not to match. What am I doing
wrong?
Robert Klemme - 03 Jun 2005 08:52 GMT
> Thanks for that, lookbehind is what I wanted to know about.
>
[quoted text clipped - 6 lines]
> and it's matching the case I'm expecting not to match. What am I doing
> wrong?

Two things catch my eye:

- ".*" this is always dangerous, I'd prefer something better here ("\s*"
for example)

- Could be a multiline issue if the unwanted prefix and your pattern sit
on different lines

Kind regards

   robert
Chris - 03 Jun 2005 09:16 GMT
> Two things catch my eye:
>
>  - ".*" this is always dangerous, I'd prefer something better here ("\s*"
> for example)

In my example, there is a space, but there could be any number of any
character between the '%' and the pattern.

>  - Could be a multiline issue if the unwanted prefix and your pattern sit
> on different lines

In this case, they're on the same line.
Robert Klemme - 03 Jun 2005 09:26 GMT
>> Two things catch my eye:
>>
[quoted text clipped - 3 lines]
> In my example, there is a space, but there could be any number of any
> character between the '%' and the pattern.

Then I'd try ".*+" and I'd also put this part into the brackets - it's not
really a part of the pattern that you want to match.

>>  - Could be a multiline issue if the unwanted prefix and your pattern
>> sit on different lines
>
> In this case, they're on the same line.

   robert
Chris - 03 Jun 2005 10:14 GMT
> Then I'd try ".*+" and I'd also put this part into the brackets - it's not
> really a part of the pattern that you want to match.

No luck. I tried with and without brackets. I also tried ".+", ".?" and
".*?". The results act like "(?<!%)" wasn't there.
Robert Klemme - 03 Jun 2005 11:36 GMT
>> Then I'd try ".*+" and I'd also put this part into the brackets -
>> it's not really a part of the pattern that you want to match.
>
> No luck. I tried with and without brackets. I also tried ".+", ".?"
> and ".*?". The results act like "(?<!%)" wasn't there.

Can you post some sample content along with the exact pattern that you try
to match?

   robert
1i5t5.googlegroups@3mai1.com - 04 Jun 2005 09:01 GMT
Create a file and add the following lines:-

    1. pattern % something
    2. something % pattern

Find within that file (ctl+F) using the regex "(?<!%)(pattern)".

The result is "pattern" is found in both lines.

The expected result is "pattern" is only found in line 1.
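
One likely reason for this result, assuming the Eclipse find dialog behaves like java.util.regex: "(?<!%)" only looks at the single character immediately before the match. In line 2 that character is a space, not '%', so the negative lookbehind succeeds. The sketch below illustrates this; the class and method names are made up for the example.

```java
import java.util.regex.Pattern;

public class SingleCharLookbehind {
    // Returns true if the regex finds a match anywhere in the given line.
    static boolean found(String regex, String line) {
        return Pattern.compile(regex).matcher(line).find();
    }

    public static void main(String[] args) {
        String regex = "(?<!%)(pattern)";
        // The character right before "pattern" is a space in both lines,
        // so the lookbehind does not block either match.
        System.out.println(found(regex, "1. pattern % something")); // true
        System.out.println(found(regex, "2. something % pattern")); // true
        // Only a '%' directly adjacent to the match blocks it.
        System.out.println(found(regex, "something %pattern"));     // false
    }
}
```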
Robert Klemme - 05 Jun 2005 11:31 GMT
> Create a file and add the following lines:-
>
[quoted text clipped - 6 lines]
>
> The expected result is "pattern" is only found in line 1.

This doesn't work:

(?<!%.{0,100})pattern

But this does for the example

(?<!%\s{0,100})pattern

Note that

(?<!%.*)pattern

Leads to the error message, that the look behind pattern does not have an
obvious max length.

This could mean

1)  You found a bug in Eclipse

2) You found a bug in Java std lib

3) Our understanding of the RE engine's lookbehind mechanism is incomplete

Kind regards

   robert
Alan Moore - 05 Jun 2005 21:45 GMT
>This doesn't work:
>
[quoted text clipped - 18 lines]
>
>3) Our understanding of the RE engine's lookbehind mechanism is incomplete

It's #2.  Lookbehind means that the enclosed subexpression matches
starting at some position before the current match position and ending
AT the current match position.  The way it's implemented, the
subexpression is allowed to match BEYOND the current match position.
That's what's happening with the first regex above: the ".{0,100}"
matches all the way to the end of the line, then that position is
compared to the current match position.  They don't line up, so the
subexpression fails and the negative lookbehind incorrectly succeeds.

In fact, quantifiers don't really work at all.  I thought the regex
above would work if a reluctant quantifier were used:

 (?<!%.{0,100}?)pattern

...but it doesn't.  It just tries to match the dot zero times, fails
and gives up.  Whether the subexpression matches too much or too
little, it never goes back to try matching a different amount.

I'll go ahead and file the bug report if nobody has any objection.
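
While quantified lookbehind is unreliable, the "no '%' earlier on the line" check can also be done in plain code instead of in the regex. This is only a sketch of that workaround, not a fix for the bug; the class name and the helper occursBeforePercent are hypothetical, and it assumes the input is a single line.

```java
public class PrefixCheck {
    // True if `needle` occurs in `line` with no '%' anywhere before
    // that occurrence. Checking the first occurrence is enough: if it
    // already follows a '%', every later occurrence does too.
    static boolean occursBeforePercent(String line, String needle) {
        int hit = line.indexOf(needle);
        if (hit < 0) {
            return false; // needle not present at all
        }
        int percent = line.indexOf('%');
        return percent < 0 || percent >= hit;
    }

    public static void main(String[] args) {
        System.out.println(occursBeforePercent("1. pattern % something", "pattern")); // true
        System.out.println(occursBeforePercent("2. something % pattern", "pattern")); // false
        System.out.println(occursBeforePercent("no percent, just pattern", "pattern")); // true
    }
}
```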
Chris - 06 Jun 2005 03:04 GMT
I don't object as long as you point out that it's the negative lookbehind that
isn't working. The original regex is "(?<!%).*string" which is to say
find all lines that contain "string" that doesn't have a "%" anywhere
in the line before the "string".
Alan Moore - 06 Jun 2005 06:55 GMT
>I don't object as long as you point out that it's the negative lookbehind that
>isn't working. The original regex is "(?<!%).*string" which is to say
>find all lines that contain "string" that doesn't have a "%" anywhere
>in the line before the "string".

The only difference between positive and negative lookbehind is
whether a match is treated as success or failure.  They're both
supposed to exhaust all possibilities trying to find the match but, as
you've discovered, they don't.  Below is a test case I wrote up.  All
four regexes should match "foo1", "foo2" and "foo3", but only the
fourth one does, and it's a hack.

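import java.util.regex.*;

// Test case: every regex below should match foo1, foo2 and foo3 only.
public class Test
{
 public static void main(String[] args)
 {
   // foo1..foo3 sit within 5 characters of a '%';
   // foo4 is too far from its '%', and foo5 has no '%' at all.
   String str =
     "%foo1\n%bar foo2\n%bar  foo3\n%blahblah foo4\nfoo5";

   String[] rgxs = { "(?<=%.{0,5})foo\\d",          // greedy quantifier
                     "(?<=%.{0,5}?)foo\\d",         // reluctant quantifier
                     "(?<=%.{0,5}\\b)foo\\d",       // anchored with a word boundary
                     "foo\\d(?<=%.{0,5}foo\\d)" };  // hack: lookbehind placed after the match

   for (int i=0; i<rgxs.length; i++)
   {
     Pattern p = Pattern.compile(rgxs[i]);
     Matcher m = p.matcher(str);
     System.out.println();
     System.out.println(p.pattern());
     // Print every match this regex finds in the test string.
     while (m.find())
     {
       System.out.println(m.group());
     }
   }
 }
}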
Robert Klemme - 06 Jun 2005 10:09 GMT
> I don't object as long as you point out that it's the negative lookbehind that
> isn't working. The original regex is "(?<!%).*string" which is to say
> find all lines that contain "string" that doesn't have a "%" anywhere
> in the line before the "string".

Btw, this one should also do the job if you just want to omit a single
char:

^[^%]*pattern

Kind regards

   robert
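
A small sketch of how "^[^%]*pattern" behaves, assuming each line is tested on its own with java.util.regex (the class and method names are illustrative). Because "[^%]*" can only consume characters other than '%', the match can never reach a "pattern" that sits after one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NoPercentBefore {
    // "pattern" preceded, from the start of the line, only by non-'%' characters.
    static final Pattern P = Pattern.compile("^[^%]*pattern");

    // Returns the matched text, or null if the line does not match.
    static String firstMatch(String line) {
        Matcher m = P.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("1. pattern % something")); // 1. pattern
        System.out.println(firstMatch("2. something % pattern")); // null
    }
}
```

Note that the match includes everything from the start of the line, not just "pattern"; wrapping "pattern" in a capturing group would be one way to isolate it.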
Chris - 08 Jun 2005 05:17 GMT
Robert,

Thanks, that worked.
Chris - 06 Jun 2005 03:06 GMT
A more generic case would be to find some artifact that's *not* commented
out!
Alan Moore - 06 Jun 2005 01:01 GMT
>> Create a file and add the following lines:-
>>
[quoted text clipped - 21 lines]
>Leads to the error message, that the look behind pattern does not have an
>obvious max length.

Try this:  pattern(?<=%.{0,100}pattern)

Note that "pattern" has to have an obvious maximum length.
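
A sketch of this trick with java.util.regex: the lookbehind is placed after "pattern" and repeats it, so the subexpression has to end exactly at the current position. The negated variant (?<!...) is not from the thread; it is an assumption added here to cover the original "not after '%'" question. Class and field names are made up, and results on older JDKs may differ given the lookbehind bug discussed above.

```java
import java.util.regex.Pattern;

public class LookbehindAfterMatch {
    // "pattern" that has a '%' at most 100 characters before it.
    static final Pattern AFTER_PERCENT =
        Pattern.compile("pattern(?<=%.{0,100}pattern)");

    // Assumed counterpart: "pattern" with no '%' in the 100 characters before it.
    static final Pattern NOT_AFTER_PERCENT =
        Pattern.compile("pattern(?<!%.{0,100}pattern)");

    // Returns true if the pattern finds a match anywhere in the given line.
    static boolean found(Pattern p, String line) {
        return p.matcher(line).find();
    }

    public static void main(String[] args) {
        String line1 = "1. pattern % something";
        String line2 = "2. something % pattern";
        System.out.println(found(AFTER_PERCENT, line1));
        System.out.println(found(AFTER_PERCENT, line2));
        System.out.println(found(NOT_AFTER_PERCENT, line1));
        System.out.println(found(NOT_AFTER_PERCENT, line2));
    }
}
```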

