Java Forum / General / October 2007

complex regex

carlbernardi@gmail.com - 10 Oct 2007 02:56 GMT
Hi,

I am new to the java.util.regex package, which I am using to detect each
occurrence of the javascript tag in an HTML file and delete it. I tried
the following code to find tags such as the ones below, but instead it
matches from the first occurrence of "<" to the last occurrence of ">",
which is not what I am looking for.

<script>
<script src="script.js">
</script>

       String mat = "<html><script><p><font></script>";
       String pat = "<*[\\x00-\\x7f]*jscript*[a-z0-9]*>";
       Pattern pattern = Pattern.compile(pat);
       Matcher matcher = pattern.matcher(mat);
       while (matcher.find()) {
           System.out.println("Match: " + matcher.group()
                   + " Start:" + matcher.start() + " End:" + matcher.end());
       }

output:
Match: <html><javascript><p><font><javascript> Start:0 End:39

I would be looking for an output of:
Match: <javascript> Start:6 End:18
Match: <javascript> Start:27 End:39

Appreciate any input,

Carl
Gordon Beaton - 10 Oct 2007 07:22 GMT
> I am new to the java.util.regex package, which I am using to detect
> each occurrence of the javascript tag in an HTML file and delete it.
> I tried the following code to find tags such as the ones below, but
> instead it matches from the first occurrence of "<" to the last
> occurrence of ">", which is not what I am looking for.

Of course, because you are using greedy quantifiers, which will match
as much as possible. Use reluctant quantifiers instead, or at least a
more restrictive set of characters before and after "jscript".

/gordon
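
Both suggestions can be sketched as follows. This is only an illustration, not code from the thread: the class name ScriptTagDemo, the helper methods, and the two patterns are assumptions, and the input is the test string from the original post. One pattern uses a negated character class ([^>]*), so a tag can never extend past the next ">"; the other uses a reluctant quantifier (.*?), which consumes as little as possible before the closing ">".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; none of these names come from the original post.
class ScriptTagDemo {
    // Restrictive character set: [^>]* cannot cross a '>', so each match
    // is confined to a single opening or closing script tag.
    static final Pattern SCRIPT_TAG =
            Pattern.compile("</?script\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Reluctant quantifier: .*? expands one character at a time and stops
    // at the first '>' that completes the match.
    static final Pattern SCRIPT_TAG_RELUCTANT =
            Pattern.compile("</?script\\b.*?>", Pattern.CASE_INSENSITIVE);

    // Collects every match together with its start and end offsets.
    static List<String> findAll(Pattern p, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            hits.add(m.group() + " Start:" + m.start() + " End:" + m.end());
        }
        return hits;
    }

    // Deletes the script tags themselves; text between them is left alone.
    static String stripTags(String input) {
        return SCRIPT_TAG.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<html><script><p><font></script>";
        for (String hit : findAll(SCRIPT_TAG, html)) {
            System.out.println("Match: " + hit);
        }
        System.out.println(stripTags(html));
    }
}
```

For that input, both patterns report "<script> Start:6 End:14" and "</script> Start:23 End:32", and stripTags returns "<html><p><font>". Neither pattern is a real HTML parser: a ">" inside an attribute value, for example, would end the match early.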
