
Java Forum / General / April 2006


Validate opening and closing of html tags

Pradeep - 17 Apr 2006 06:38 GMT
Hi,

Can anyone help me solve this problem?
I have an example input:
sometext<b><i>some text</i></b>
The input may vary, e.g. a tag is opened but not closed, or tags are mismatched.

To do:
1. Check for a few HTML tags such as b, i, u.
2. Opening and closing of tags must be in proper order without
overlapping.

I have to write Java code to validate this.
Can anyone help me?

Thanks in Advance..

Regards,
Pradeep.
Mark Thomas - 17 Apr 2006 12:13 GMT
> Hi,
>
[quoted text clipped - 15 lines]
> Regards,
> Pradeep.

I'd use a finite state machine - googling that might get you started.

Mark
Martin Gregorie - 17 Apr 2006 12:42 GMT
>> Hi,
>>
[quoted text clipped - 17 lines]
>>
> I'd use a finite state machine - googling that might get you started.

Or a stack.

Signature

martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |

Tim Smith - 18 Apr 2006 05:57 GMT
> > 2. Opening and closing of tags must be in proper order without
> > overlapping.
...

> I'd use a finite state machine - googling that might get you started.

Wait a second...isn't checking for closing tags being in the right order
and for tags not overlapping equivalent to the problem of recognizing
palindromes?  And isn't that one of the classic examples of something
that you can't do with a finite state machine?

Signature

--Tim Smith

Oliver Wong - 18 Apr 2006 14:27 GMT
>> > 2. Opening and closing of tags must be in proper order without
>> > overlapping.
[quoted text clipped - 6 lines]
> palindromes?  And isn't that one of the classic examples of something
> that you can't do with a finite state machine?

   You're right. For arbitrarily deep nesting it can't be done with a finite
state machine. You'd need an infinite state machine (or a stack machine, i.e.
a pushdown automaton, or something equally powerful).
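
A rough illustration of the kind of memory involved, not code from this thread: the checker has to remember which tags are open and in what order, to whatever depth the input nests. The sketch below contrasts a plain depth counter, which only remembers how many tags are open, with a stack. The class name, the method names and the simplified token format ("b" opens, "/b" closes) are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class CounterVsStack {

    // Remembers only HOW MANY tags are open, not which ones.
    static boolean counterCheck(String[] tokens) {
        int depth = 0;
        for (String t : tokens) {
            depth += t.startsWith("/") ? -1 : 1;
            if (depth < 0) {
                return false; // a close with nothing open
            }
        }
        return depth == 0;
    }

    // Remembers WHICH tags are open and in what order.
    static boolean stackCheck(String[] tokens) {
        Deque<String> open = new ArrayDeque<>();
        for (String t : tokens) {
            if (t.startsWith("/")) {
                // The close must match the most recently opened tag.
                if (open.isEmpty() || !open.pop().equals(t.substring(1))) {
                    return false;
                }
            } else {
                open.push(t);
            }
        }
        return open.isEmpty();
    }

    public static void main(String[] args) {
        // Stands for <b><i></b></i>, i.e. overlapping tags.
        String[] overlapping = {"b", "i", "/b", "/i"};
        System.out.println(counterCheck(overlapping)); // true  (overlap goes unnoticed)
        System.out.println(stackCheck(overlapping));   // false (overlap detected)
    }
}
```

The counter accepts the overlapping input because it never records which tag was opened last; the stack rejects it.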

   - Oliver
Venkatesh - 17 Apr 2006 13:42 GMT
You can just use a stack and the Java pattern-matching package
(java.util.regex).

Here is the code to find tags in a given HTML string:

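   import java.util.regex.Matcher;
   import java.util.regex.Pattern;

   // Wrapper class added so the fragment compiles; the name TagScanner is illustrative.
   public class TagScanner {

       // Matches anything from '<' up to the next '>'.
       private static final String HTML_TAG_PATTERN = "<[^>]*>";
       private static final Pattern searchPattern = Pattern.compile(HTML_TAG_PATTERN);

       private Matcher m = null;
       private String m_htmlStr = null;
       private boolean m_initDone = false;

       // Must be called before getNextTag().
       public void init(String htmlStr) {
           m_htmlStr = htmlStr;
           m = searchPattern.matcher(m_htmlStr);
           m_initDone = true;
       }

       // Returns the next tag in the input, or null when no tags remain.
       private String getNextTag() throws Exception {
           if (!m_initDone) {
               throw new Exception("Not yet initialized ....");
           }
           String tagToReturn = null;
           if (m.find()) {
               tagToReturn = m_htmlStr.substring(m.start(), m.end());
           }
           return tagToReturn;
       }
   }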

So, use a stack: push every start tag, and whenever you find an end tag,
pop the top of the stack and check that the popped start tag matches the
end tag.
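
A minimal sketch of how the regex and the stack could be combined. It is not code from the original post, and it assumes that only bare <b>, <i> and <u> tags (no attributes) need checking and that every other tag is ignored. The class name TagValidator and the method isValid are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagValidator {

    // Assumption: only bare b, i and u tags are checked; anything else is skipped.
    // Group 1 is "/" for an end tag (empty otherwise), group 2 is the tag name.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([biu])>", Pattern.CASE_INSENSITIVE);

    static boolean isValid(String html) {
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            String name = m.group(2).toLowerCase(Locale.ROOT);
            boolean isEndTag = !m.group(1).isEmpty();
            if (!isEndTag) {
                open.push(name);
            } else if (open.isEmpty() || !open.pop().equals(name)) {
                // End tag with nothing open, or not matching the latest start tag.
                return false;
            }
        }
        // Anything left on the stack was opened but never closed.
        return open.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isValid("sometext<b><i>some text</i></b>")); // true
        System.out.println(isValid("<b><i>overlap</b></i>"));           // false
        System.out.println(isValid("<b>never closed"));                 // false
    }
}
```

Two cases are easy to miss: an end tag arriving while the stack is empty, and start tags still on the stack when the input ends. Both are treated as invalid here.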

Hope this helps

-Venkatesh
Greg R. Broderick - 17 Apr 2006 15:07 GMT
[posted and mailed]

> To do:
> 1. Check for a few HTML tags such as b, i, u.
[quoted text clipped - 3 lines]
> I have to write Java code to validate this.
> Can anyone help me?

Use a stack data structure.

Scan through the text looking for HTML tags.

When you encounter a start tag, push it on the stack.

When you encounter an end tag, pop the top element from the stack and
compare it to the end tag.
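
The steps above could be sketched roughly as follows. This is a hand-rolled scanner rather than anything posted in the thread, and it reports the first problem it finds instead of just yes or no. It assumes only lowercase, attribute-free b, i and u tags are checked, and that a literal '<' in the text is escaped as &lt;. The names TagChecker and firstError are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TagChecker {

    // Returns null when the tags are balanced, otherwise a description of the first problem.
    static String firstError(String text) {
        Deque<String> stack = new ArrayDeque<>();
        int pos = 0;
        while ((pos = text.indexOf('<', pos)) >= 0) {
            int end = text.indexOf('>', pos);
            if (end < 0) {
                break; // unterminated '<': treated as plain text in this sketch
            }
            String tag = text.substring(pos + 1, end);
            pos = end + 1;
            boolean closing = tag.startsWith("/");
            String name = closing ? tag.substring(1) : tag;
            if (!name.equals("b") && !name.equals("i") && !name.equals("u")) {
                continue; // not one of the tags being checked
            }
            if (!closing) {
                stack.push(name);                      // start tag: push it
            } else if (stack.isEmpty()) {
                return "</" + name + "> has no matching start tag";
            } else {
                String expected = stack.pop();         // end tag: pop and compare
                if (!expected.equals(name)) {
                    return "expected </" + expected + "> but found </" + name + ">";
                }
            }
        }
        return stack.isEmpty() ? null : "<" + stack.peek() + "> is never closed";
    }

    public static void main(String[] args) {
        System.out.println(firstError("sometext<b><i>some text</i></b>")); // null
        System.out.println(firstError("<b><i>text</b></i>")); // expected </i> but found </b>
        System.out.println(firstError("text</u>"));           // </u> has no matching start tag
        System.out.println(firstError("<u>text"));            // <u> is never closed
    }
}
```

Besides the pop-and-compare step, the sketch also handles an end tag that arrives when the stack is empty and start tags left on the stack at the end of the input.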

Cheers
GRB

Signature

---------------------------------------------------------------------
Greg R. Broderick                 [rot13] terto@oynpxubyvb.qlaqaf.bet

A. Top posters.
Q. What is the most annoying thing on Usenet?
---------------------------------------------------------------------

Oliver Wong - 17 Apr 2006 18:04 GMT
> Hi,
>
[quoted text clipped - 10 lines]
> I have to write a java code to validate this.
> Can anyone help me..

   You might be interested in HTML Tidy:
http://www.w3.org/People/Raggett/tidy/

   - Oliver
Martin Gregorie - 17 Apr 2006 18:36 GMT
>> Hi,
>>
[quoted text clipped - 13 lines]
>    You might be interested in HTML Tidy:
> http://www.w3.org/People/Raggett/tidy/

Agreed. If you're writing HTML you should not be without it. However, I
think you'll find the latest versions are here:

http://tidy.sourceforge.net/

The new C version is worth having and there's a Java version too.

Signature

martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |


