Java Forum / General / October 2007


complex regex

carlbernardi@gmail.com - 10 Oct 2007 02:59 GMT
Hi,

I am new to the java.util.regex package, which I am using to detect
each occurrence of the JavaScript tag in an HTML file and delete it. I
tried the following code to find tags such as the ones below, but
instead it matches from the first occurrence of "<" to the last
occurrence of ">", which is not what I am looking for.

<script>
<script src="script.js">
</script>

       String mat = "<html><script><p><font></script>";
       String pat = "<*[\\x00-\\x7f]*jscript*[a-z0-9]*>";
       Pattern pattern = Pattern.compile(pat);
       Matcher matcher = pattern.matcher(mat);
       while(matcher.find()){
           System.out.println("Match: "+matcher.group()
               +" Start:"+matcher.start()+" End:"+ matcher.end());
       }

output:
Match: <html><script><p><font><script> Start:0 End:39

I would be looking for output like:
Match: <script> Start:6 End:18
Match: <script> Start:27 End:18

Appreciate any input,

Carl
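
A minimal sketch of one way to get one match per tag, offered as an illustration rather than the definitive fix. A character class such as `[\x00-\x7f]*` also matches ">", and `*` is greedy, so a match can run on to the last ">" in the input. Using `[^>]*` instead stops each match at the tag's own closing bracket. The class name `ScriptTagFinder`, the method `findTags`, and the exact pattern are assumptions made for this example, not something taken from the post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ScriptTagFinder {
    // "</?script" matches an opening or closing script tag name,
    // "\\b" rejects longer names such as <scripts>,
    // "[^>]*" takes any attributes but cannot run past the tag's own '>'.
    private static final Pattern SCRIPT_TAG =
            Pattern.compile("</?script\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Returns one report line per script tag found in the input.
    static List<String> findTags(String html) {
        List<String> found = new ArrayList<>();
        Matcher m = SCRIPT_TAG.matcher(html);
        while (m.find()) {
            found.add("Match: " + m.group() + " Start:" + m.start() + " End:" + m.end());
        }
        return found;
    }

    public static void main(String[] args) {
        // Prints:
        // Match: <script> Start:6 End:14
        // Match: </script> Start:23 End:32
        for (String line : findTags("<html><script><p><font></script>")) {
            System.out.println(line);
        }
    }
}
```

The end offset reported by `Matcher.end()` is exclusive, which is why the first tag ends at 14 rather than 13.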
carlbernardi@gmail.com - 10 Oct 2007 03:58 GMT
Funny, I think I found my answer. This way seemed to do the trick. Is
it possible to do the same thing with just Matcher.replaceAll()?

       String mat = "(<html><script><p><font><script>";
       String pat = "<[^>]*>";   // one tag: '<', anything except '>', then '>'
       StringBuffer sb = new StringBuffer(mat);   // untouched copy, indexed by matcher offsets
       StringBuffer sb2 = new StringBuffer(mat);  // working copy that tags are deleted from
       Pattern pattern = Pattern.compile(pat);
       Matcher matcher = pattern.matcher(mat);
       int start,end = 0;
       int newStart = 0;   // number of characters deleted from sb2 so far
       while(matcher.find()){
           start = matcher.start();
           end = matcher.end();
           System.out.println("old string --- "+sb.substring(start,end));
           if(sb.substring(start,end).indexOf("script") > -1){
               // shift the offsets left by the amount already deleted
               System.out.println("new string --- "+sb2.delete(start-newStart,end-newStart));
               newStart = sb.length() - sb2.length();
           }
           System.out.println(start+" "+end+" "+newStart);
       }
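
On the replaceAll() question: a minimal sketch, assuming the goal is to drop every opening and closing script tag and keep everything else. It is stricter than the loop above, which deletes any tag containing the text "script"; here only tags whose name is `script` are removed. The class name `ScriptTagStripper`, the method `stripScriptTags`, and the pattern are assumptions made for this example.

```java
import java.util.regex.Pattern;

final class ScriptTagStripper {
    // Same idea as "<[^>]*>", but restricted to tags named "script".
    private static final Pattern SCRIPT_TAG =
            Pattern.compile("</?script\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Replaces every script tag with the empty string in a single pass,
    // so no manual offset bookkeeping is needed.
    static String stripScriptTags(String html) {
        return SCRIPT_TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        // Prints: (<html><p><font>
        System.out.println(stripScriptTags("(<html><script><p><font><script>"));
    }
}
```

This removes only the tags themselves; any text between an opening and a closing script tag is left in place, just as in the loop version.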


