
Java Forum / General / February 2006


escape meta for Pattern?

Markus Dehmann - 14 Feb 2006 21:56 GMT
On
http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html

it says:
> The string literal "\(hello\)" is illegal and leads to a compile-time error; in order to match the string (hello) the string literal "\\(hello\\)" must be used.

Now, if I read my input strings from a file, do I have to convert my
strings in order to match them against a pattern? How do I do that?
Is there a predefined method to do it? Like quotemeta in Perl?

Thanks!
Markus
Oliver Wong - 14 Feb 2006 22:08 GMT
> On
> http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html
[quoted text clipped - 7 lines]
> strings in order to match them against a pattern.  How do I do that?
> Is  there a predefined method to  do it? Like quotemeta in perl?

   There are several concepts you need to get straight here. One is "what is
the character-content of a String in memory?" which I will call "String A"
for short, and "What string do I have to type in my Java source code to get
String A into memory?" which I will call "String B".

   So if you type a String B like "\\(hello\\)" then String A will be
"\(hello\)".

   If you type a String B like "\t\\t\t", then String A will be something
like "        \t        " (a tab character, then a backslash followed by
the letter t, then another tab character).

   Now, let me define a new string called String C, as follows: "What does
my file have to contain so that when I read in that string, String A gets
loaded into memory?"

   It turns out that String C and String A are exactly the same. If you
want "        \t        " to appear in memory, then your file should contain
"        \t        ". If you want "\(hello\)" to appear in memory, then your
file should contain "\(hello\)".
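
   As a minimal sketch of the point above: the class and method names are
made up for illustration, and a temporary file stands in for the real input
file. The backslash is written as the byte value 92 so that no escaping is
involved when the file is created.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;

class LiteralVsFile {
    // "String B": what is typed in source. The compiler turns each \\ into one backslash.
    static final String FROM_SOURCE = "\\(hello\\)";

    // Writes the nine raw characters \(hello\) to a temp file ("String C")
    // and reads them back into memory ("String A").
    static String readFromTempFile() {
        // 92 is the character code of a single backslash.
        byte[] raw = {92, '(', 'h', 'e', 'l', 'l', 'o', 92, ')'};
        try {
            Path file = Files.createTempFile("pattern", ".txt");
            try {
                Files.write(file, raw);
                return new String(Files.readAllBytes(file), StandardCharsets.US_ASCII);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String fromFile = readFromTempFile();
        System.out.println(FROM_SOURCE.length());                 // 9 characters in memory
        System.out.println(fromFile.equals(FROM_SOURCE));         // true
        System.out.println(Pattern.matches(fromFile, "(hello)")); // true
    }
}
```

   The string read from the file can be handed to Pattern as is; no extra
backslashes are needed because no compiler ever looks at it.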

   - Oliver
jamesahart79@gmail.com - 15 Feb 2006 02:06 GMT
Yes, the previous poster is completely correct.

I think it should be made absolutely clear that it is the Java compiler
that turns '\\' into '\'.  Thus only string constants in code that will
be compiled, i.e. in source code, need to have the overabundance of
'\\'.  Everywhere else (external files, memory images, etc), what you
see is what you get.
Roedy Green - 15 Feb 2006 04:14 GMT
>I think it should be made absolutely clear that it is the Java compiler
>that turns '\\' into '\'.  Thus only string constants in code that will
>be compiled, i.e. in source code, need to have the overabundance of
>'\\'.  Everywhere else (external files, memory images, etc), what you
>see is what you get.

A SCID could hide this \ quoting goofiness by displaying and editing
strings in two colours, one for literal chars, and one for
representations of unprintable characters.  Unicode has special glyphs
for the control chars you could use.  Ditto for regex.  We have the
hardware. We act as if we had only TTYs to code on.
It would make proofreading 100 times easier. 40% of the difficulty of
regexes comes from the double layer of quoting.
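
The double layer can be seen in a small sketch (the class and member names
are hypothetical, chosen only for this example). The regex engine needs two
characters, \\, to match one literal backslash, and the Java compiler needs
each of those doubled again in source:

```java
import java.util.regex.Pattern;

class DoubleQuoting {
    // Layer 1 (regex): a literal backslash is written as two backslashes.
    // Layer 2 (Java source): each of those is doubled again, giving four.
    static final String ONE_BACKSLASH_REGEX = "\\\\";

    // Three characters in memory: a, backslash, b.
    static final String INPUT = "a\\b";

    // True if the input contains at least one backslash character.
    static boolean containsBackslash(String s) {
        return Pattern.compile(ONE_BACKSLASH_REGEX).matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(ONE_BACKSLASH_REGEX.length()); // 2
        System.out.println(INPUT.length());               // 3
        System.out.println(containsBackslash(INPUT));     // true
        System.out.println(containsBackslash("ab"));      // false
    }
}
```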

See
http://mindprod.com/projects/regexcomposer.html
http://mindprod.com/projects/regexproofreader.html

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

John C. Bollinger - 16 Feb 2006 02:14 GMT
> On
> http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html
[quoted text clipped - 6 lines]
> strings in order to match them against a pattern.  How do I do that?
> Is  there a predefined method to  do it? Like quotemeta in perl?

As others have pointed out, pattern strings obtained by means other than
string literals are not bound by the constraints of string literals
(though they may have their own constraints).  Another thing to
consider, though, is how to use a string -- from whatever source -- as a
literal pattern, handling erstwhile metacharacters as normal characters.
 It isn't clear to me whether that's what you want, but if it is then
you should look into Pattern.quote().
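
A minimal sketch of that approach follows. Pattern.quote() was added in
Java 5, so it does not appear in the 1.4.2 documentation linked in the
question. The class name, the helper method, and the sample strings are
assumptions made up for this example; the "(hello)" value stands in for a
line read from a file.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Searches 'input' for 'text' with every character of 'text' taken literally.
    static boolean containsLiteral(String text, String input) {
        return Pattern.compile(Pattern.quote(text)).matcher(input).find();
    }

    public static void main(String[] args) {
        String fromFile = "(hello)"; // stands in for a line read from a file

        // Unquoted, the parentheses form a group, so plain "hello" matches.
        System.out.println(Pattern.compile(fromFile).matcher("say hello now").find()); // true

        // Quoted, the parentheses must appear in the input.
        System.out.println(containsLiteral(fromFile, "say hello now"));   // false
        System.out.println(containsLiteral(fromFile, "say (hello) now")); // true
    }
}
```

This is roughly the role quotemeta plays in Perl: the caller does not need
to know which characters are metacharacters.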


John Bollinger
jobollin@indiana.edu


