
Java Forum / General / August 2006


Help with regular expression

Grost - 21 Aug 2006 01:42 GMT
Hi all,

I'm writing an application to perform some HTML text manipulation from
templates and I have a regex formulation problem. For example, in the
template I have a line:

    <tr><td class="caption"><!--@caption--><!--<br />(@caption)--></td></tr>

where the parts I want to replace are HTML comments: <!-- ??? -->
There are two styles of comment I want to search/replace:
    1) <!--@caption-->
    2) <!--XXX(@caption)YYY-->, where XXX and YYY can represent other HTML

Case 1 is easy, and I just use: <!--\s*?@caption\s*?-->
Case 2 is the problem. I'm trying to use this for conditional insertion
of additional HTML, depending on whether @caption exists in the
application. If I have a value for @caption, then the following is
produced from the above example:

    <tr><td class="caption">foo<br />foo</td></tr>

This seems easy enough in principle, but every regex pattern I've tried
unsurprisingly matches the <!-- from the first comment. My initial try,
which of course failed, was: <!--(.*?)\(@caption\)(.*?)-->

What I need is a way of saying:
    Match "(@caption)" within an HTML comment, and capture the text on
either side of the tag and within the comment, but make sure there are no
other comment-like tags within that text. I'm guessing I need something
along the lines of the lookaround operators, but I have little
experience with them. Any help anyone...?

(For clarity I removed the extra escaping required for Java inline strings.)

Stan
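
The failure described in the post above can be reproduced with a short sketch. It applies the initial pattern to the template line and returns what the first capturing group holds. The class and method names (LazyMatchDemo, firstGroup) are made up for illustration and are not part of the original application.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyMatchDemo {
    // The template line from the post.
    static final String TEMPLATE =
        "<tr><td class=\"caption\"><!--@caption--><!--<br />(@caption)--></td></tr>";

    // The initial attempt from the post, escaped for a Java string literal.
    static final Pattern FIRST_TRY =
        Pattern.compile("<!--(.*?)\\(@caption\\)(.*?)-->");

    /** Returns what group 1 captured, or null if nothing matched. */
    static String firstGroup(String input) {
        Matcher m = FIRST_TRY.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Prints: @caption--><!--<br />
        System.out.println(firstGroup(TEMPLATE));
    }
}
```

The match starts at the leftmost <!--, and the lazy (.*?) then grows until "(@caption)" is found, so it runs straight across the end of the first comment and the start of the second.
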
hiwa - 21 Aug 2006 03:25 GMT
Grost wrote:

> Hi all,
>
[quoted text clipped - 31 lines]
>
> Stan
I think your description does not formalize the requirement well
enough.
Here's a rough stab in the dark. HTH.
------------------------------------------------------------------
public class Grost{

 public static void main(String[] args){
   // Template line from the original post, kept on one line so the
   // string literal compiles.
   String text =
     "<tr><td class=\"caption\"><!--@caption--><!--<br />(@caption)--></td></tr>";
   String result = "<tr><td class=\"caption\">foo<br />foo</td></tr>";
   // regex1: a comment that opens with a tag; the tag is captured as $1.
   String regex1 = "<!--(<[^>]+>).*-->";
   // regex2: any comment still left on the line.
   String regex2 = "<!--.*-->";

   text = text.replaceAll(regex1, "foo$1foo");
   text = text.replaceAll(regex2, "");

   if (result.equals(text)){
     System.out.println("success");
   }
 }
}
Grost - 21 Aug 2006 05:48 GMT
> Grost wrote:
>
[quoted text clipped - 56 lines]
>   }
> }

I figured that formalisation may be a problem, and that's quite likely
to be the aspect for which I need the most help. Essentially I want to
allow arbitrary text (including HTML) on either side of a caption tag:
    <!--XXX(@caption)YYY-->
with the only restriction being that the text CANNOT be an HTML comment.
    XXX cannot contain <!--.*-->
    YYY cannot contain <!--.*-->

In regex terms, if I use my non-working version:
    <!--(.*?)\(@caption\)(.*?)-->
then neither the $1 nor the $2 capturing group in this match should contain any
HTML comments.

Any clearer?

Stan
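
One possible way to express the restriction stated above is a negative lookahead applied before every character of XXX and YYY, so that neither group can step over a "<!--" or a "-->". The sketch below is only an illustration of that idea, not a tested solution for the real templates. The names CaptionFiller, fill and NO_COMMENT are hypothetical, and treating a null value as "remove both comments" is an assumption about the conditional behaviour.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaptionFiller {
    // Case 1 from the thread: <!--@caption-->
    private static final Pattern SIMPLE =
        Pattern.compile("<!--\\s*?@caption\\s*?-->");

    // Case 2: <!--XXX(@caption)YYY-->. Each character of XXX and YYY is
    // consumed only if no "<!--" or "-->" starts at that position.
    private static final String NO_COMMENT = "((?:(?!<!--|-->).)*?)";
    private static final Pattern CONDITIONAL =
        Pattern.compile("<!--" + NO_COMMENT + "\\(@caption\\)" + NO_COMMENT + "-->");

    static String fill(String template, String value) {
        // Assumption: a null value means both comment styles are removed.
        String simple = (value == null) ? "" : Matcher.quoteReplacement(value);
        String conditional = (value == null) ? "" : "$1" + simple + "$2";
        String out = CONDITIONAL.matcher(template).replaceAll(conditional);
        return SIMPLE.matcher(out).replaceAll(simple);
    }

    public static void main(String[] args) {
        String template =
            "<tr><td class=\"caption\"><!--@caption--><!--<br />(@caption)--></td></tr>";
        // Prints: <tr><td class="caption">foo<br />foo</td></tr>
        System.out.println(fill(template, "foo"));
    }
}
```

Matcher.quoteReplacement keeps any "$" or "\" in the caption value from being read as a group reference. By default "." does not match line terminators, so a comment that spans several lines would need Pattern.DOTALL added to the compile call.
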

