
Java Forum / General / November 2007


Regular expressions - marking up a URL

rico.fabrini@gmail.com - 06 Nov 2007 16:33 GMT
Hi Everyone,

I've got the following code:
======================================================================
    private static final Pattern rgxUrlsInHTML = Pattern.compile("(http:\\/\\/([\\w.]+\\/?)\\S*)");

    public static String emailBodyToHtml(String body)
    {
        Matcher matcher = rgxUrlsInHTML.matcher(body);
        int start = 0;
        ArrayList<String> matches = new ArrayList<String>();
        while(matcher.find())
        {
            matches.add(matcher.group());
        }

        for(String match : matches)
        {
            int index = body.indexOf(match);
            if(index>-1)
                body = body.replaceFirst(match, "<a href=\"" + match + "\">" + match + "</a>");
        }
        return body;
    }
======================================================================

1) I might have missed it, but I couldn't find a method that would
just return a collection of all matches. Accordingly, the API feels
rather low-level. Probably sufficient, but still low-level.

2) My actual query here:
                body = body.replaceFirst(match, "<a href=\"" + match + "\">" + match + "</a>");

seems to work as I would wish only for the first member of the
matches collection.
Even though the if block is executed, the 2nd URL isn't substituted.

Is there something glaringly obvious that's eluding me here?
Also, comments for improvement from people familiar with the Regular
Expressions API are welcome.
Thanks.

Rico.
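
One possible explanation for point 2, offered as a sketch rather than a diagnosis of the actual input: String.replaceFirst() treats its first argument as a regular expression, not as literal text. A URL that contains characters such as '?' or '+' is then read as a different pattern and may no longer match itself. Pattern.quote() and Matcher.quoteReplacement() make both arguments literal. The class and method names below are hypothetical, chosen only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralReplace {
    // Hypothetical helper: wraps the first literal occurrence of `target` in an anchor tag.
    static String linkFirst(String text, String target) {
        String replacement = "<a href=\"" + target + "\">" + target + "</a>";
        // Pattern.quote makes `target` match literally;
        // Matcher.quoteReplacement keeps '$' and '\' in the replacement literal.
        return text.replaceFirst(Pattern.quote(target),
                                 Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String body = "go http://x.example/a?b=1 now";
        String url = "http://x.example/a?b=1";

        // Unquoted: "a?" is read as "optional a", so the URL does not match itself.
        System.out.println(body.replaceFirst(url, "X"));

        // Quoted: the URL is matched literally and wrapped in a link.
        System.out.println(linkFirst(body, url));
    }
}
```

If the same URL appears more than once in the body, this approach still links only the first occurrence per call.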
Joshua Cranmer - 06 Nov 2007 16:56 GMT
> Hi Everyone,
>
[quoted text clipped - 4 lines]
> Expressions API are welcome.
> Thanks.

What you want to do is a regex-replace:
body.replaceAll( < regex >, "<a href=\"$0\">$0</a>");

The 0-th group is the entire matched string; the dollar signs mark
group references in the replacement string, per
<http://java.sun.com/javase/6/docs/api/java/util/regex/Matcher.html#appendReplacement(java.lang.StringBuffer,%20java.lang.String)>

Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
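
A minimal sketch of the replaceAll() suggestion above, assuming a simplified version of the pattern from the question (the class name UrlLinker and the constant URL are hypothetical). Matcher.replaceAll() on a precompiled Pattern behaves like String.replaceAll() but avoids recompiling the regex on every call.

```java
import java.util.regex.Pattern;

class UrlLinker {
    // Assumption: a simplified form of the pattern in the question; it is not a full URL grammar.
    private static final Pattern URL = Pattern.compile("http://[\\w.]+/?\\S*");

    // Wraps every match in an anchor tag; $0 expands to the text of the current match.
    static String linkify(String body) {
        return URL.matcher(body).replaceAll("<a href=\"$0\">$0</a>");
    }

    public static void main(String[] args) {
        System.out.println(linkify("see http://a.example/x and http://b.example/y?z=1"));
    }
}
```

Note that \S* also swallows trailing punctuation such as a full stop directly after a URL, and the sketch does no HTML escaping.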

rico.fabrini@gmail.com - 06 Nov 2007 18:10 GMT
> rico.fabr...@gmail.com wrote:
> > Hi Everyone,
[quoted text clipped - 16 lines]
> Beware of bugs in the above code; I have only proved it correct, not
> tried it. -- Donald E. Knuth

Thanks Joshua. That works and does what I was looking for.

I've got to admit that I've struggled somewhat to picture the idea of
"the entire matched string" though. So, I think of replaceAll() having
to scan the input sequence, and at every match $0 is a placeholder for
that particular match.

Without that scanning process in mind, I was baffled by the idea that
"entire matched string" meant some kind of concatenation of all the
matches.

Rico.
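
The scanning picture described above can be written out by hand with Matcher.find(), appendReplacement() and appendTail(). The sketch below is an illustration of that mental model, not the JDK's actual source; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceAllByHand {
    // Hand-rolled equivalent of replaceAll(): one pass over the input, one substitution per match.
    static String replaceEach(Pattern pattern, String input, String template) {
        Matcher matcher = pattern.matcher(input);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            // Copies the text between the previous match and this one, then the
            // template with $0 expanded against the current match only.
            matcher.appendReplacement(out, template);
        }
        // Copies whatever follows the last match.
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceEach(Pattern.compile("\\d+"), "a1b22c", "[$0]"));
    }
}
```

Each pass through the loop sees exactly one match, so $0 never refers to a concatenation of all matches.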

