Regex: Capturing and replacing question

Hal Vaughan - 02 Jan 2008 21:54 GMT
I can't find an actual reference to this in the API (I'm using 1.4.2), so I
want to be sure what I'm doing is "legal."

I'm using regexes to find any place in a string where there is a small
letter followed by a capital one and put a space between them.  I'm
using "([a-z])([A-Z])" as the pattern to search for and using "$1 $2" as
the replacement string.  This is working well in all my tests, but since I
didn't find it documented where I'd feel safe, I thought I should check
(I've also learned, in Perl, just how tricky regexes can be).

Is it correct that $1 in a replacement string references the first captured
text sequence in the regex?  And so on with $2, $3....?

I've included my test case below (and I've tested more strings than what I
have in the code for now).  I just want to be sure there aren't side
effects or other issues I'm not aware of!

Thanks!

Hal
---------------------------
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Tester {

    public static void main(String[] args) {
        Tester tTest = new Tester();
        tTest.test("1AlphaBeta");
        tTest.test("AlphaBetaGammaDelta");
        // Also test any strings passed on the command line
        for (int x = 0; x < args.length; x++) {
            tTest.test(args[x]);
        }
    }

    public Tester() {}

    // Insert a space between each lowercase letter and the uppercase letter that follows it
    public void test(String sInput) {
        String sPattern = "([a-z])([A-Z])", sOutput = "", sReplace = "$1 $2";
        Pattern pRegex;
        Matcher mRegex;
        pRegex = Pattern.compile(sPattern);
        mRegex = pRegex.matcher(sInput);
        // $1 and $2 in the replacement refer to the two capture groups in sPattern
        sOutput = mRegex.replaceAll(sReplace);
        System.out.println("Input: " + sInput + ", Output: " + sOutput);
    }

}
Roedy Green - 02 Jan 2008 22:30 GMT
>"([a-z])([A-Z])" a

You might want to experiment without the ().  I use 3 different regex
schemes in a day.  I forget which ones need the ().
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

Hal Vaughan - 02 Jan 2008 22:57 GMT
>>"([a-z])([A-Z])" a
>
> You might want to experiment without the ().  I use 3 different regex
> schemes in a day.  I forget which ones need the ().

My understanding is I need them to create capture groups.  Without them, I
get an error at the line with the Matcher.replaceAll() command.
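
For anyone who wants to reproduce that error, here is a minimal sketch. The class name MissingGroupDemo and the helper tryReplace are made up for illustration; the only library calls are Pattern.compile, Pattern.matcher and Matcher.replaceAll. As far as I know the failure is an IndexOutOfBoundsException, and it only shows up when the pattern actually matches something, because the replacement string is not processed until a match is found.

```java
import java.util.regex.Pattern;

public class MissingGroupDemo {

    // Hypothetical helper: runs replaceAll and reports a missing-group failure by name.
    static String tryReplace(String regex, String input, String replacement) {
        try {
            return Pattern.compile(regex).matcher(input).replaceAll(replacement);
        } catch (IndexOutOfBoundsException e) {
            // Thrown when the replacement refers to a group the pattern does not define
            return "IndexOutOfBoundsException";
        }
    }

    public static void main(String[] args) {
        // With capture groups: $1 and $2 resolve normally
        System.out.println(tryReplace("([a-z])([A-Z])", "AlphaBeta", "$1 $2"));
        // Without groups: the first match triggers the exception
        System.out.println(tryReplace("[a-z][A-Z]", "AlphaBeta", "$1 $2"));
        // Without groups and without a match: nothing is replaced, so no error
        System.out.println(tryReplace("[a-z][A-Z]", "alpha", "$1 $2"));
    }
}
```

The third call is the one to watch for in testing: a pattern with no groups can look fine until the first input that actually matches.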

I just want to be sure the $1 and $2 specifically refer to captured
sequences.

Hal
Joshua Cranmer - 02 Jan 2008 23:19 GMT
> I'm using regexes to find any place in a string where there is a small
> letter followed by a capital one and put a space between them.  I'm
> using "([a-z])([A-Z])" as the pattern to search for and using "$1 $2" as
> the replacement string.  This is working well in all my tests, but since I
> didn't find it documented where I'd feel safe, I thought I should check
> (I've also learned, in Perl, just how tricky regexes can be).

Alternatively, this should work:

"(?<=[a-z])(?=[A-Z])" replaced with " ".
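
A small sketch comparing the two approaches side by side. The class and method names (LookaroundDemo, withGroups, withLookaround) are invented for the example. The lookaround pattern matches the empty position between a lowercase and an uppercase letter, so nothing is consumed and the replacement is just the space.

```java
public class LookaroundDemo {

    // Capture-group version from the original post
    static String withGroups(String s) {
        return s.replaceAll("([a-z])([A-Z])", "$1 $2");
    }

    // Lookaround version: zero-width match, so only a space is inserted
    static String withLookaround(String s) {
        return s.replaceAll("(?<=[a-z])(?=[A-Z])", " ");
    }

    public static void main(String[] args) {
        String[] inputs = { "1AlphaBeta", "AlphaBetaGammaDelta" };
        for (int i = 0; i < inputs.length; i++) {
            System.out.println(inputs[i] + " -> " + withGroups(inputs[i])
                    + " | " + withLookaround(inputs[i]));
        }
    }
}
```

Both versions should give the same result on the test strings from the original post; the lookaround form simply avoids needing group references in the replacement.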

> Is it correct that $1 in a replacement string references the first captured
> text sequence in the regex?  And so on with $2, $3....?

The Matcher.appendReplacement documentation says that $1 should be the output of
group(1), so that is correct.
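
To make the $1/group(1) correspondence concrete, here is a rough sketch that does the same replacement twice: once with "$1 $2" and once by reading group(1) and group(2) by hand in a find() loop. The class and method names are made up for the example. The hand-built replacement is passed straight to appendReplacement only because the captured text here is letters; since $ and \ are special in replacement strings, arbitrary captured text would need escaping first.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupReferenceDemo {

    private static final Pattern LOWER_UPPER = Pattern.compile("([a-z])([A-Z])");

    // Replacement string with group references
    static String viaDollarRefs(String input) {
        return LOWER_UPPER.matcher(input).replaceAll("$1 $2");
    }

    // Same result, built explicitly from group(1) and group(2)
    static String viaGroupCalls(String input) {
        Matcher m = LOWER_UPPER.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Safe without escaping here: both groups can only contain letters
            m.appendReplacement(sb, m.group(1) + " " + m.group(2));
        }
        m.appendTail(sb); // copy whatever follows the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "AlphaBetaGammaDelta";
        System.out.println(viaDollarRefs(input));
        System.out.println(viaGroupCalls(input));
    }
}
```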

Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth

Lew - 03 Jan 2008 06:06 GMT
Hal Vaughan wrote:
>> I'm using regexes to find any place in a string where there is a small
>> letter followed by a capital one and put a space between them.  I'm
>> using "([a-z])([A-Z])" as the pattern to search for and using "$1 $2" as
>> the replacement string.  This is working well in all my tests, but since I
>> didn't find it documented where I'd feel safe,

From <http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html>
>  Groups and capturing
>
[quoted text clipped - 7 lines]
>
> Group zero always stands for the entire expression.
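
As a quick illustration of that numbering, the sketch below prints every group for the nested pattern ((A)(B(C))), which is the example the Pattern Javadoc itself uses. Groups are numbered by counting opening parentheses from left to right. The class name and the groups helper are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupNumberingDemo {

    // Returns group(0)..group(groupCount()) for a full match, or an empty array if no match
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            out[i] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] g = groups("((A)(B(C)))", "ABC");
        for (int i = 0; i < g.length; i++) {
            // Index 0 is the whole match; 1..n follow the opening parentheses
            System.out.println("group(" + i + ") = " + g[i]);
        }
    }
}
```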

Hal Vaughan wrote:
>> Is it correct that $1 in a replacement string references the first captured
>> text sequence in the regex?  And so on with $2, $3....?

> The Matcher.appendReplacement says that $1 should be the output of
> group(1), so that is correct.

<http://java.sun.com/javase/6/docs/api/java/util/regex/Matcher.html#appendReplacement(java.lang.StringBuffer,%20java.lang.String)>

The documentation is in the Javadocs for the relevant classes.  Javadocs are
often a great first place to look.  The Javadocs are a place where you will
"find it documented" and should be "where [you]'d feel safe" to trust it.

Lew

Hal Vaughan - 03 Jan 2008 16:07 GMT
> Hal Vaughan wrote:
>>> I'm using regexes to find any place in a string where there is a small
[quoted text clipped - 25 lines]
>> The Matcher.appendReplacement says that $1 should be the output of
>> group(1), so that is correct.

<http://java.sun.com/javase/6/docs/api/java/util/regex/Matcher.html#appendReplacement(java.lang.StringBuffer,%20java.lang.String)>

> The documentation is in the Javadocs for the relevant classes.  Javadocs are
> often a great first place to look.  The Javadocs are a place where you will
> "find it documented" and should be "where [you]'d feel safe" to trust it.

I goofed on that.  I read all the info for Pattern, since that had all the
regex expressions and patterns, then went over the documentation at the
start of Matcher, but I didn't expect to find something like this explained
in the description of a particular method.  I also had Googled quite a bit,
but had a high noise-to-signal ratio and kept getting sections that gave
the same general regex explanations.  I would think that something like
this should have been in the main part of Pattern or Matcher and not in a
method description.

Hal
Lew - 04 Jan 2008 00:09 GMT
Lew wrote:
> <http://java.sun.com/javase/6/docs/api/java/util/regex/Matcher.html#appendReplacement(java.lang.StringBuffer,%20java.lang.String)>

> I read all the info for Pattern, since that had all the
> regex expressions and patterns, then went over the documentation at the
> start of Matcher, but I didn't expect to find something like this explained
> in the description of a particular method.  ...   I would think that something like
> this should have been in the main part of Pattern or Matcher and not in a
> method description.

I couldn't agree more.  In fact, I was feeling somewhat incensed at Sun for
that when I finally turned up the reference.  I had remembered that it was in
the 'docs for Pattern or Matcher, but not that it was obscurely buried in a
method description.  That'll teach us to read every dripping word in every
corner of the Javadocs, for sure!

This is definitely a frak-up by Sun.

C'mon, Sun, polish up that Javadoc page!

Lew

Hal Vaughan - 04 Jan 2008 05:59 GMT
> Lew wrote:

<http://java.sun.com/javase/6/docs/api/java/util/regex/Matcher.html#appendReplacement(java.lang.StringBuffer,%20java.lang.String)>

>> I read all the info for Pattern, since that had all the
>> regex expressions and patterns, then went over the documentation at the
[quoted text clipped - 11 lines]
> method description.  That'll teach us to read every dripping word in every
> corner of the Javadocs, for sure!

I find, though, that no matter what I read when I get a question like this,
it doesn't matter.  I'll check the API docs, I'll check my books, then I'll
Google under the terms that I think would work (and often
include "tutorial" since that gives me good example pages), and it's always
on the one doc page I didn't read or I need to use one term in Google that
I didn't think of.  There are times I've had to post just to ask what the
proper term is for something so I know what to look up.

> This is definitely a frak-up by Sun.

Ah, a fellow Galactica fan! ;)

> C'mon, Sun, polish up that Javadoc page!

It definitely should be included in the main part, and in the Pattern page
as well. Even though it's not used as part of a pattern (maybe it could be,
it works that way in Perl), it's related closely enough it should be
included there.
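
On the "maybe it could be" point: to my knowledge, inside a Java pattern a captured group is referred to with a backslash (\1, \2, ...), not with $1, because $ in a pattern is the end-of-line anchor. The $n form applies only to replacement strings. A rough sketch follows, with an invented class name and helper, which collapses an immediately repeated word by using \1 in the pattern and $1 in the replacement.

```java
import java.util.regex.Pattern;

public class BackreferenceDemo {

    // \1 inside the pattern means "the same text group 1 just matched"
    private static final Pattern REPEATED_WORD = Pattern.compile("\\b(\\w+) \\1\\b");

    // Hypothetical helper: replaces "word word" with a single "word"
    static String collapseRepeats(String input) {
        // $1 in the replacement string refers to that same group
        return REPEATED_WORD.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(collapseRepeats("the the cat"));
        System.out.println(collapseRepeats("this is fine"));
    }
}
```

The word boundaries keep "this is" from being treated as a repeated "is".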

Hal
Lew - 04 Jan 2008 06:33 GMT
Lew wrote:
>> This is definitely a frak-up by Sun.

> Ah, a fellow Galactica fan! ;)

Farscape.

Lew

Lew - 04 Jan 2008 06:38 GMT
> Lew wrote:
>>> This is definitely a frak-up by Sun.
>
>> Ah, a fellow Galactica fan! ;)
>
> Farscape.

Let me explain.  "Frak" is the Galactica term, "Frell" the Farscape term.  I
got the word from Galactica, but I was much more a Farscape fan.

Even on SCIFI.com they cop to the cognate nature:
<http://scifipedia.scifi.com/index.php/Farscape#Farspeak>
> Frell — A commonly used curse similar in use to
> "Frak" on Battlestar Galactica with the same general
> meaning as the "F" word on Earth.

Lew


