Hiya,
I need an easy way to validate URLs, i.e., to make sure that they're
syntactically correct according to RFC 2396, not whether they exist, etc.
The Apache class "org.apache.commons.validator.UrlValidator" looks
extremely promising, but I can't get it to work. Even the example in
the Javadoc fails. For instance:
UrlValidator validate = new UrlValidator();
System.out.println(validate.isValid("ftp"));
According to the Javadoc this should print "true", but it doesn't.
Indeed, everything returns "false", including http://example.com.
What am I doing wrong?
Here's a link to the javadoc: http://linkfrog.net/validate.
Thanks in advance,
Walter Gildersleeve
Freiburg, Germany
Oliver Wong - 20 Oct 2005 22:19 GMT
> Hiya,
>
[quoted text clipped - 11 lines]
> Indeed, everything returns "false": http://example.com, for example.
> What am I doing wrong?
If your run contradicts the javadocs, I'd consider this a bug and file a
bug report (if one hasn't been filed already).
- Oliver
Roedy Green - 21 Oct 2005 00:11 GMT
>I need an easy way to validate URLs, ie, to make sure that they're
>syntactically correct according to rfc2396, not if they exist etc.
I had a similar problem with validating email addresses. The
standards allow a lot of addresses that in practice most likely will
not be deliverable.
So I did it with a series of regexes and just tuned the rules to
get the right balance between turfing good addresses and keeping bad ones.
My final check was to contact the mail server and ask, and to keep a
history of probes.
You might do the analogous thing. See
http://mindprod.com/jgloss/domainnames.html
for lists of actual domain suffixes.
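
A minimal sketch of what the analogous check might look like for URLs, assuming Java 9 or later. The splitting regex is the one given in RFC 2396, Appendix B. The class name, the method name, the hostname pattern, and the tiny suffix list are all made up for illustration, and ports, user info, and IP-literal hosts are deliberately not handled.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class UrlRegexCheck {
    // Component-splitting regex from RFC 2396, Appendix B.
    // Group 2 is the scheme, group 4 the authority, group 5 the path.
    private static final Pattern URI_PARTS = Pattern.compile(
        "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    // Assumed hostname shape: two or more dot-separated labels made of
    // letters, digits, and inner hyphens. Ports and user info are rejected.
    private static final Pattern HOST = Pattern.compile(
        "^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?"
        + "(\\.[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?)+$");

    // Placeholder suffix list; a real one would be far longer.
    private static final Set<String> SUFFIXES =
        Set.of("com", "org", "net", "de", "ca");

    static boolean looksValid(String url) {
        Matcher m = URI_PARTS.matcher(url);
        if (!m.matches()) {
            return false;
        }
        String scheme = m.group(2);
        String host = m.group(4);
        // Require both a scheme and an authority part.
        if (scheme == null || host == null) {
            return false;
        }
        if (!HOST.matcher(host).matches()) {
            return false;
        }
        // Compare the last label against the known suffixes.
        String suffix = host.substring(host.lastIndexOf('.') + 1)
                            .toLowerCase(Locale.ROOT);
        return SUFFIXES.contains(suffix);
    }

    public static void main(String[] args) {
        for (String s : new String[] {
                "http://example.com", "http://example.com/path", "ftp" }) {
            System.out.println(s + " -> " + looksValid(s));
        }
    }
}
```

As with the email case, this is stricter than the RFC: a syntactically legal URL with an unlisted suffix is rejected, so the suffix list decides how many good URLs get turfed.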

Canadian Mind Products, Roedy Green.
http://mindprod.com Again taking new Java programming contracts.
NullBock - 21 Oct 2005 12:27 GMT
Just a note: the UrlValidator wasn't properly handling url-paths, and
choked if a URL didn't have a path. Thus:
http://example.com/path
was valid, but these weren't:
http://example.com
http://example.com/
This will be changed in the next release.
Walter
Oliver: thanks, I submitted the Javadoc problem as a bug, and the fix
has been added to the next release.
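
For anyone who only needs the syntax check in the meantime, one alternative from the standard library is java.net.URI, whose Javadoc says it parses according to RFC 2396 as amended by RFC 2732. This is a sketch, not what UrlValidator does internally. The class and method names are made up, and requiring both a scheme and a host is an assumption about what "valid URL" should mean here.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Commons Validator.
final class UriSyntaxCheck {
    static boolean isSyntacticallyValid(String url) {
        try {
            URI uri = new URI(url);
            // Assumption: a usable URL needs a scheme and a host, so a
            // bare relative reference such as "ftp" is rejected.
            return uri.getScheme() != null && uri.getHost() != null;
        } catch (URISyntaxException e) {
            // The string violates the URI grammar, e.g. contains a raw space.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://example.com/path",
            "http://example.com",
            "http://example.com/",
            "ftp"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isSyntacticallyValid(s));
        }
    }
}
```

With this check the three example.com forms above are all accepted, with or without a path, while the bare string "ftp" is rejected because it has no scheme or host.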