Java Forum / General / December 2006

regex bug jre6???

triVinci@gmail.com - 12 Dec 2006 05:51 GMT
Hello all and TIA for any and all insight...

My application reads in a regex pattern as a string from an external
source.
When the pattern contains an escaped backslash ("\\"), my existing code
works great under 1.4.2 and 1.5.

I'm simply calling...

   string.matches(regex);

In JRE 6, the same code processing the same regex throws a
PatternSyntaxException complaining that "\\" is an illegal/unsupported
escape sequence.

Is this a bug in JRE 6?  Are there any suggestions on how to work around
this issue?

-tv
hiwa - 12 Dec 2006 09:11 GMT
> Hello all and TIA for any and all insight...
>
[quoted text clipped - 15 lines]
>
>  -tv
1.6 works normally. Here's a SSCCE.
----------------------------------------------------------------------
/** content of regex.txt **
\\[newline]
***************************/
import java.io.*;
import java.util.regex.*;

public class RegX{

 public static void main(String[] args){
   String regex = null;
   String text = "abc\\efg\\xyz\\";

   try{
     BufferedReader br
       = new BufferedReader(new FileReader("regex.txt"));
     regex = br.readLine();
   }
   catch (Exception e){
     e.printStackTrace();
   }

   Pattern pat = Pattern.compile(regex);
   Matcher mat = pat.matcher(text);

   while (mat.find()){
     System.out.println(mat.group());
   }
 }
}
--------------------------------------------------------------------------------------
triVinci@gmail.com - 12 Dec 2006 13:47 GMT
hiwa,

Thanks for taking the time to write that up and respond. It helped me
shed a little more light on the issue.  It's not the "\\" that causes
the problem, but rather "\\Q".  I've modified RegX and regex.txt a bit
to highlight the problem.  Runtime output is from 1.4, 1.5, and 1.6
(with the Exception pasted in).

/** content of regex.txt **
^((VINCENT)|(GEORGIA)|(GIACOMO\\QUARENGHI)|(CLAUDE))$[newline]
***************************/

import java.io.*;
import java.util.regex.*;

public class RegX
{
   public static void main(String[] args)
   {
       System.out.print("\nJava Version " +
           System.getProperty("java.specification.version"));
       System.out.println("\n----------------");
       String regex = null;
       String text = "GEORGIA";

       try
       {
           BufferedReader br
               = new BufferedReader(new FileReader("regex.txt"));
           regex = br.readLine();
       }
       catch (Exception e)
       {
           e.printStackTrace();
       }

       Pattern pat = Pattern.compile(regex);
       Matcher mat = pat.matcher(text);

       System.out.println("\nLooking for \"" + text + "\" in \"" +
           regex + "\"");
       while (mat.find())
       {
           System.out.println("\t--> " + mat.group());
       }

       System.out.print("\n\"" + text + "\" matches \"" +
           regex + "\"...  ");
       System.out.println(text.matches(regex));
       System.out.println(
           "\n====================================================\n");
   }
}

OUTPUT...

Java Version 1.4
----------------

Looking for "GEORGIA" in
"^((VINCENT)|(GEORGIA)|(GIACOMO\\QUARENGHI)|(CLAUDE))$"
    --> GEORGIA

"GEORGIA" matches
"^((VINCENT)|(GEORGIA)|(GIACOMO\\QUARENGHI)|(CLAUDE))$"...  true

====================================================

Java Version 1.5
----------------

Looking for "GEORGIA" in
"^((VINCENT)|(GEORGIA)|(GIACOMO\\QUARENGHI)|(CLAUDE))$"
    --> GEORGIA

"GEORGIA" matches
"^((VINCENT)|(GEORGIA)|(GIACOMO\\QUARENGHI)|(CLAUDE))$"...  true

====================================================

Java Version 1.6
----------------

Exception in thread "main" java.util.regex.PatternSyntaxException:
Illegal/unsupported escape squence near index 31
^((VINCENT)|(GEORGIA)|(GIACOMO\\QUARENGHI)|(CLAUDE))$
                              ^
       at java.util.regex.Pattern.error(Unknown Source)
       at java.util.regex.Pattern.escape(Unknown Source)
       at java.util.regex.Pattern.atom(Unknown Source)
       at java.util.regex.Pattern.sequence(Unknown Source)
       at java.util.regex.Pattern.expr(Unknown Source)
       at java.util.regex.Pattern.group0(Unknown Source)
       at java.util.regex.Pattern.sequence(Unknown Source)
       at java.util.regex.Pattern.expr(Unknown Source)
       at java.util.regex.Pattern.group0(Unknown Source)
       at java.util.regex.Pattern.sequence(Unknown Source)
       at java.util.regex.Pattern.expr(Unknown Source)
       at java.util.regex.Pattern.compile(Unknown Source)
       at java.util.regex.Pattern.<init>(Unknown Source)
       at java.util.regex.Pattern.compile(Unknown Source)
       at RegX.main(RegX.java:25)
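
When comparing the regex.txt line above with patterns written as Java string literals, it helps to remember that every backslash in the pattern text must be doubled in source code. The sketch below is only an illustration; the class name LiteralVersusFile and the helper patternText() are made up for this example. It builds the same fragment that regex.txt contains and prints it along with its length.

```java
class LiteralVersusFile {

    // Hypothetical helper: returns the pattern fragment exactly as it
    // appears in regex.txt, i.e. two backslashes followed by Q.
    // In Java source, each of those two backslashes is written twice.
    static String patternText() {
        return "GIACOMO\\\\QUARENGHI";
    }

    public static void main(String[] args) {
        String regex = patternText();
        // The regex engine sees: GIACOMO, an escaped backslash, QUARENGHI.
        System.out.println("pattern text: " + regex);
        System.out.println("length:       " + regex.length());
    }
}
```

So a file line containing \\Q and a literal written as "\\\\Q" hand the same three characters to Pattern.compile.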
Oliver Wong - 12 Dec 2006 17:46 GMT
> Thanks for taking the time to write that up and respond. It helped me
> shed a little more light on the issue.  It's not the "\\" that causes
> the problem, but rather "\\Q".  I've modified RegX and regex.txt a bit
> to highlight the problem.  Runtime output is from 1.4, 1.5, and 1.6
> (with the Exception pasted in).

   I think you found a bug. Here's an SSCCE that more readily demonstrates
the problem:

<SSCCE>
import java.util.regex.Pattern;

public class RegExpTest {
 public static void main(String[] args) {
  System.out.println("Java Version "
    + System.getProperty("java.specification.version"));
  System.out.println("----------------");
  {
   // This works: the regex text is G\\A (G, an escaped backslash, A)
   String regex = "G\\\\A";
   Pattern pat = Pattern.compile(regex);
  }
  {
   // This works: the regex text is G\\B
   String regex = "G\\\\B";
   Pattern pat = Pattern.compile(regex);
  }
  {
   // This fails on 1.6: the regex text is G\\Q
   String regex = "G\\\\Q";
   Pattern pat = Pattern.compile(regex);
  }
 }
}
</SSCCE>

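   It probably has to do with the fact that \Q and \E are used for "super
quoting" in regular expressions, and the parser looks for \Q before it has
processed the escaped backslashes (\\), so the second backslash of \\ plus
the following Q gets mistaken for the start of a quoted section.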
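
   As for a work-around, one possibility is to rewrite each escaped backslash in the incoming pattern as the hex escape \x5C, so that no backslash ever sits directly in front of a Q. The following is only a sketch under that assumption: the class name and the helper rewriteEscapedBackslashes are hypothetical, it uses StringBuilder (so 1.5 or later), and it has not been verified against the affected 1.6 build. Genuine \Q...\E sections are copied through unchanged.

```java
class EscapedBackslashRewrite {

    // Hypothetical helper: replaces every escaped backslash (\\) in the
    // pattern text with \x5C, which denotes the same literal backslash.
    static String rewriteEscapedBackslashes(String regex) {
        StringBuilder out = new StringBuilder(regex.length() + 8);
        int i = 0;
        while (i < regex.length()) {
            char c = regex.charAt(i);
            if (c != '\\' || i + 1 >= regex.length()) {
                // Ordinary character, or a dangling backslash at the end.
                out.append(c);
                i++;
                continue;
            }
            char next = regex.charAt(i + 1);
            if (next == '\\') {
                // Escaped backslash: emit the hex escape instead.
                out.append("\\x5C");
                i += 2;
            } else if (next == 'Q') {
                // Real quoted section: copy verbatim up to and including \E
                // (or to the end of the pattern if there is no \E).
                int end = regex.indexOf("\\E", i + 2);
                int stop = (end < 0) ? regex.length() : end + 2;
                out.append(regex, i, stop);
                i = stop;
            } else {
                // Any other escape (\d, \(, \x41, ...) is kept as is.
                out.append(c).append(next);
                i += 2;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String original =
            "^((VINCENT)|(GEORGIA)|(GIACOMO\\\\QUARENGHI)|(CLAUDE))$";
        String rewritten = rewriteEscapedBackslashes(original);
        System.out.println(rewritten);
        System.out.println("GEORGIA".matches(rewritten));
        System.out.println("GIACOMO\\QUARENGHI".matches(rewritten));
    }
}
```

   The rewritten pattern should accept the same input as the original on a JRE without the problem, so the helper can be left in place once a fixed release is available.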

   So go ahead and file a bug report with Sun.

   - Oliver
hiwa - 13 Dec 2006 02:22 GMT
> > Thanks for taking the time to write that up and respond. It helped me
> > shed a little more light on the issue.  It's not the "\\" that causes
[quoted text clipped - 37 lines]
>
>     - Oliver
Hmmm..
1.4 fails for:
String regex = "G\\\\Qabc\\E";
at 'E',
but 1.6 doesn't.

and,
1.6 fails for:
String regex = "G\\\\Q";
at resultant '\\Q',
but 1.4 doesn't.

1.4 succeeds with:
String regex = "G\\\\\\Qabc\\E";

Yes, 1.6 has introduced a ner bug.....
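
To repeat this comparison on whichever JRE is at hand, a small probe can compile each of the three patterns above and report either success or the index and description from the PatternSyntaxException. This is a minimal sketch; the class name PatternProbe and the helper probe() are invented for illustration, and the enhanced for loop needs 1.5 or later. What it prints will depend on the Java version it runs under.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class PatternProbe {

    // Hypothetical helper: tries to compile the pattern and returns "OK",
    // or a short failure report taken from the exception.
    static String probe(String regex) {
        try {
            Pattern.compile(regex);
            return "OK";
        } catch (PatternSyntaxException e) {
            return "FAIL at index " + e.getIndex() + ": " + e.getDescription();
        }
    }

    public static void main(String[] args) {
        // The three patterns discussed in this post, as Java literals.
        String[] patterns = {
            "G\\\\Q",
            "G\\\\Qabc\\E",
            "G\\\\\\Qabc\\E"
        };
        System.out.println("Java Version "
            + System.getProperty("java.specification.version"));
        for (String p : patterns) {
            System.out.println(p + " -> " + probe(p));
        }
    }
}
```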
hiwa - 13 Dec 2006 02:24 GMT
> ner bug
new bug
triVinci@gmail.com - 13 Dec 2006 14:47 GMT
Thanks again!

A bug report was submitted on Nov24.

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6497148

> > > Thanks for taking the time to write that up and respond. It helped me
> > > shed a little more light on the issue.  It's not the "\\" that causes
[quoted text clipped - 53 lines]
>
> Yes, 1.6 has introduced a ner bug.....
Andrew Thompson - 13 Dec 2006 15:02 GMT
...
> A bug report was submitted on Nov24.
>
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6497148

Sterling job!

With Sun accepting it as a bug so quickly,
(& identifying a possible source) it seems
my offer of a web-start based test is redundant.
We'll see how it goes.

Andrew T.
Andrew Thompson - 13 Dec 2006 04:03 GMT
....
>     I think you found a bug. Here's an SSCCE that more readily demonstrates
> the problem:
....
>     So go ahead and file the bug report at Sun's.

JWS would be good for selecting the JVM against
which to test a piece of code.

Would putting up such a test at my site help the
progress of this bug, or simply be redundant?
(It seems pretty simple and clear-cut, suggesting 'no')

Opinions/thoughts welcome.

Andrew T.
hiwa - 13 Dec 2006 04:19 GMT
> ....
> >     I think you found a bug. Here's an SSCCE that more readily demonstrates
[quoted text clipped - 12 lines]
>
> Andrew T.

> Would putting up such a test at my site
It would surely help with testing across multiple JDK versions in general.

It's a bit awkward and painful to run such tests on a personal local machine.

