Java Forum / General / April 2006

Pattern and Regex

newsnet customer - 10 Apr 2006 09:37 GMT
Hi,

I would like to use a regular expression to find patterns in a string.
Consider the string seq = "ABCDEFHIJKLMNOP".
I would like to find whichever of three patterns (ABC, NOP, HIJ) comes first.
I expect the output "0", the index of ABC.
Long story short, I couldn't get it to work; my two attempts are below.
Any help with this regular expression, which is doing my head in, is much appreciated.

Cheers
ST

Attempt One:
  String seq = "ABCDEFHIJKLMNOP";

  Pattern p1 = Pattern.compile("ABC" || "NOP" || "HIJ");
  Matcher m = p1.matcher(seq);

  while ( m.find())
  {
       System.out.println(m.start());
       //nothing happens
  }

Attempt Two:
  String seq = "ABCDEFHIJKLMNOP";

  Pattern p1 = Pattern.compile("ABC");
  Pattern p2 = Pattern.compile("NOP");
  Pattern p3 = Pattern.compile("HIJ");
 
  Matcher m1 = p1.matcher(seq);
  Matcher m2 = p2.matcher(seq);
  Matcher m3 = p3.matcher(seq);
   
  while ( m1.find() || m2.find() || m3.find() )
  {
        //won't work cos I don't know which matcher gave me the true
  }
Bart Cremers - 10 Apr 2006 10:07 GMT
You got pretty close with your first attempt. The problem is that you're
mixing regex with a Java OR instead of sticking within the regex.

Pattern p1 = Pattern.compile("ABC|NOP|HIJ");

If you run this you'll see 3 matches. If you only want the first,
replace the "while" with a simple "if".
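
A minimal sketch of that suggestion, for anyone who wants something to
paste and run. The class name FirstMatch and the helper firstIndex are
made up for illustration; only Pattern, Matcher, find() and start() come
from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstMatch {
    // Hypothetical helper: returns the start index of the earliest
    // occurrence of ABC, NOP or HIJ in seq, or -1 if none occurs.
    static int firstIndex(String seq) {
        // The alternation lives inside the regex string itself.
        Pattern p = Pattern.compile("ABC|NOP|HIJ");
        Matcher m = p.matcher(seq);
        // "if" rather than "while": only the first match is wanted.
        if (m.find()) {
            return m.start();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstIndex("ABCDEFHIJKLMNOP")); // prints 0
    }
}
```

Because find() scans from left to right, the order of the alternatives in
the pattern does not decide which one is reported; the leftmost position
in the input does.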

Regards,

Bart
newsnet customer - 10 Apr 2006 13:24 GMT
> You got pretty close with your first attempt. The problem is that you're
> mixing regex with a Java OR instead of sticking within the regex.
[quoted text clipped - 6 lines]
> Regards
> Bart

Thanks.
Consider this harder problem, which I'm not sure a regular expression can
solve.
Imagine the string below. Ignore the white space; it shouldn't be there,
but I deliberately put it there so you and others can see what I'm
talking about.
I want to find the first instance of any of the three patterns (FHI,
HIJ, NOP), but the first instance must fall within a block of three.
That rules out FHI, leaving HIJ and NOP. Since HIJ comes before NOP,
HIJ is the output.

String seq = "ABC DEF HIJ KLM NOP";

Pattern p1 = Pattern.compile("FHI|"HIJ"|"NOP");
Matcher m = p1.matcher(seq);

if ( m.find())
{
   //do something
}

EXPECT:
HIJ

(1) I tried putting *** in front of each pattern, but that doesn't work
   Pattern p1 = Pattern.compile("***FHI|"***HIJ"|"***NOP");
   //compile error
(2) I tried a single * instead, but that doesn't work either
   Pattern p1 = Pattern.compile("*FHI|"*HIJ"|"*NOP");
   //doesn't give me the right output

I know I'm close, like before.
Help appreciated.

Cheers
ST
Bart Cremers - 10 Apr 2006 13:38 GMT
You could simply combine the regex operation with a simple modulo
operation on the start of the match. It works in your simple example
case, but might not work for more complex cases:

       String seq = "ABCDEFHIJKLMNOP";

       Pattern p1 = Pattern.compile("FHI|HIJ|NOP");

       Matcher m = p1.matcher(seq);

       int start = 0;
       while (m.find(start)) {
           System.out.printf("%3d - %s", m.start(),
                   seq.substring(m.start(), m.end()));
           if (m.start() % 3 == 0) {
               System.out.println(" -> OK");
               // maybe break out here
           } else {
               System.out.println(" -> ignore");
           }
           start = m.start() + 1;
       }

Bart
newsnet customer - 10 Apr 2006 14:40 GMT
> You could simply combine the regex operation with a simple modulo
> operation on the start of the match. It works in your simple example
[quoted text clipped - 20 lines]
>
> Bart

Cheers, Bart.
You have been really helpful.
If I can't get it to work without the modulus, that is, using only the
regular expression, then I will use your code.

ST
Jussi Piitulainen - 10 Apr 2006 14:49 GMT
> If i can't get it to work without using the modulus.
> that is, just using the regular expression [...]

The following prints the shortest prefix of triples in args[0] that
ends in one of FHI, HIJ and NOP.

   Pattern p = Pattern.compile("(...)*?(FHI|HIJ|NOP)");
   Matcher m = p.matcher(args[0]);
   if (m.find()) {
       System.out.println(m.group(0));
   }
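
A hedged variation on the same idea, in case the index of the aligned
triple is wanted rather than the prefix. It assumes the blocks of three
are counted from the start of the string. The class name AlignedMatch and
the helper firstAlignedIndex are made up for illustration. It uses
lookingAt() instead of find(), so the match must begin at index 0; with
find(), an input that has no aligned match could still match starting at
index 1 or 2, which would break the alignment.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlignedMatch {
    // (?:...)*? lazily skips whole triples; group 1 is the triple we want.
    private static final Pattern P =
            Pattern.compile("(?:...)*?(FHI|HIJ|NOP)");

    // Hypothetical helper: index of the first FHI, HIJ or NOP that starts
    // on a multiple of three, or -1 if there is none.
    static int firstAlignedIndex(String seq) {
        Matcher m = P.matcher(seq);
        // lookingAt() anchors the match at the start of the input.
        if (m.lookingAt()) {
            return m.start(1);
        }
        return -1;
    }

    public static void main(String[] args) {
        // FHI at index 5 is skipped; HIJ at index 6 is aligned.
        System.out.println(firstAlignedIndex("ABCDEFHIJKLMNOP")); // prints 6
    }
}
```

Note that "." does not match line terminators by default, so this sketch
assumes the input is a single line.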
Gordon Beaton - 10 Apr 2006 13:49 GMT
> (1) I have tried putting * but that doesnt work
>     Pattern p1 = Pattern.compile("***FHI|"***HIJ"|"***NOP");
[quoted text clipped - 4 lines]
>
> I know im close like before.

First, the whole regex must be a single string, enclosed between one
pair of quotation marks. Try to remember this. Neither of your
examples is even compilable.

Second, quantifiers (such as * and ?) can't be used on their own; they
must be preceded by a pattern to modify. So .* will match 0 or more
characters, while \s* will match 0 or more whitespace characters, etc.
Instead of guessing, read the regex documentation and think about what
you're trying to do.

If you have optional whitespace among the stuff you really want to
match, try something like this (untested):

 "\\s*((F\\s*H\\s*I)|(H\\s*I\\s*J)|(N\\s*O\\s*P))\\s*"
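
A small sketch of both points, with made-up names (QuantifierDemo,
compiles, firstSpacedMatch). It checks that a pattern starting with a
bare * is rejected by Pattern.compile with a PatternSyntaxException, and
runs the whitespace-tolerant pattern above against the spaced string from
the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class QuantifierDemo {
    // Hypothetical helper: true if the regex is syntactically valid.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // A quantifier with nothing before it ends up here.
            return false;
        }
    }

    // Hypothetical helper: first match of the whitespace-tolerant
    // pattern (group 1, without surrounding whitespace), or null.
    static String firstSpacedMatch(String seq) {
        Pattern p = Pattern.compile(
                "\\s*((F\\s*H\\s*I)|(H\\s*I\\s*J)|(N\\s*O\\s*P))\\s*");
        Matcher m = p.matcher(seq);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(compiles("*FHI"));   // prints false
        System.out.println(compiles(".*FHI"));  // prints true
        // Prints "F HI": the F of DEF, a space, then HI.
        System.out.println(firstSpacedMatch("ABC DEF HIJ KLM NOP"));
    }
}
```

On the spaced string this pattern reports "F HI", so on its own it
tolerates whitespace but does not enforce the block-of-three alignment;
that still needs something like the modulo check or a triple-skipping
prefix.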

/gordon

Jussi Piitulainen - 10 Apr 2006 10:21 GMT
>    String seq = "ABCDEFHIJKLMNOP";
>  
[quoted text clipped - 6 lines]
>         //nothing happens
>    }

Did you just ignore the error message from the Java compiler?
Don't do that.

As to the pattern, you want "ABC|NOP|HIJ".

