Java Forum / General / November 2005
Jar: protocol question
Rhino - 29 Oct 2005 16:20 GMT
I've been using the Jar: protocol a bit in the last few days and I'd like to know if this is a valid use of that protocol:
jar:file:!/Images/foo.gif
Basically, I'm trying to describe the location of a GIF that a program should be able to find in one of the various jars that are on the classpath used by the program.
Since 'this.getClass().getResource()' will search EVERY jar in the classpath for the desired file, it shouldn't be necessary to specify the jar name. Therefore, it seems to me that this should be valid notation for indicating that the jar name isn't necessary in this case: the bang ('!') in the name following the 'file:' suggests to me that the default jar(s), namely all of the jars found in the classpath, will be searched for an Images directory and a file named foo.gif within that directory.
Does that seem reasonable? If not, can anyone suggest a better notation to use for my situation?
I can't find any discussion of this "special case" in the articles I've seen about the Jar: protocol.
 Signature Rhino --- rhino1 AT sympatico DOT ca "There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies." - C.A.R. Hoare
Ben_ - 29 Oct 2005 17:25 GMT
If you need to load the resource from the classpath, then this.getClass().getResource() will do the trick. What you need to know to load it is "Images/foo.gif" and the method returns the location of the resource (wherever it is found first).
If you know in advance there is a risk of collision (the resource exists multiple times on the classpath and you don't want to load it from the first location found), then you had better change the location to make it unique...
Or, I don't see the point of your question... :-)
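To check whether a resource name really does collide, one option is ClassLoader.getResources(), which returns every match on the classpath instead of only the first. The following is a minimal sketch, not something posted in the thread; the class name ResourceMatches and its findAll helper are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, named for this example only.
final class ResourceMatches {
    /** Returns every classpath location that provides the named resource. */
    static List<URL> findAll(String name) {
        // ClassLoader resource names are absolute and carry no leading slash.
        ClassLoader loader = ResourceMatches.class.getClassLoader();
        try {
            return Collections.list(loader.getResources(name));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // More than one line of output means the name is ambiguous.
        for (URL url : findAll("Images/foo.gif")) {
            System.out.println(url);
        }
    }
}
```

If the list has more than one element, getResource() would silently pick the first, which is the collision case described above.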
Rhino - 29 Oct 2005 18:11 GMT
> If you need to load the resource from the classpath, then
> this.getClass().getResource() will do the trick. What you need to know to
[quoted text clipped - 6 lines]
>
> Or, I don't see the point of your question... :-)
Yeah, sorry, I probably didn't ask the clearest question in the world....
I understand how this.getClass().getResource() works. My question is really one of notation more than anything. I think ;-)
One of the articles I saw on the jar: protocol - I can't find it again now, unfortunately! - contained examples like this:
- jar:file:d:\\myJars\\test.jar!/images/foo.gif (refers to a specific file, foo.gif, in the path 'images' within a jar called 'test.jar' in the directory d:\myJars on the local filesystem)
- jar:http://xyz.com/photos/pictures.jar!/pix/baz.jpg (refers to a specific file, baz.jpg, in the path 'pix' within a jar called 'pictures.jar' in the directory 'photos' on the website at http://xyz.com)
I'm looking for a notation that I can use in my programs to refer to files that are within jars that are visible to the program by virtue of being on the program's classpath. In that case, the name of the jar isn't necessary since all jars in the classpath will be scanned for the file. That's why I was thinking that 'jar:file:!/images/foo.png' would be good since it imitates the first example and implies that the file is in the file system but that we don't care what jar contains the file.
When I parse a String like that last example, I merely have to start at the position immediately following the bang and use the rest of the String to build my URL. Then, the file is found or not found as the case may be.
Does this seem like a reasonable notation to use or is there actually an "official" way of saying the same thing? The documentation I've seen has no information on how to denote a file in a jar when you don't care what jar file contains it.
Rhino
Ben_ - 29 Oct 2005 19:10 GMT
Let's say you want to store an identifier for the resource in a config file.
Then why not simply store "Images/foo.gif" and pass that String directly to this.getClass().getResource()?
It will search the classpath and load the resource from the file system or from a jar (or from a URL, if it's a URLClassLoader).
So, I still don't fully understand why you insist that it has to be found in a jar file. It's exactly the point of ClassLoaders to make you ignorant of where the resource is, provided it is on the classpath.
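One detail worth spelling out when storing a bare name like "Images/foo.gif": Class.getResource() resolves a name without a leading slash relative to the package of the class it is called on, while a name starting with "/" is resolved from the root of the classpath. A small sketch of the lookup follows; the class name ResourceLookup and the helper locate are invented for this example.

```java
import java.net.URL;

// Hypothetical helper, named for this example only.
final class ResourceLookup {
    /** Looks up a resource the same way anchor.getResource(name) would. */
    static URL locate(Class<?> anchor, String name) {
        // "foo.gif"         -> resolved relative to the package of 'anchor'
        // "/Images/foo.gif" -> resolved from the root of the classpath
        return anchor.getResource(name);
    }

    public static void main(String[] args) {
        URL image = locate(ResourceLookup.class, "/Images/foo.gif");
        // The result is null when nothing on the classpath matches.
        System.out.println(image == null ? "not on the classpath" : image.toString());
    }
}
```

The returned URL may use file:, jar: or another scheme; the caller does not need to know which.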
Rhino - 29 Oct 2005 19:59 GMT
> Let's say you want to store an identifier for the resource in a config file.
>
[quoted text clipped - 7 lines]
> has to be found in a jar file ? It's exactly the point of ClassLoaders to
> make you ignorant of where the resource is, provided it is on the classpath.
If I only stored "Images/foo.gif" in a config file, someone looking at the code would have to *INFER* that it was in a jar and was going to be obtained via this.getClass().getResource(). But what if I wanted to be able to specify a file that wasn't necessarily in the classpath? The file could be anywhere, even online or elsewhere in the file system.
"Images/foo.gif" is not going to be found by the program if it is *not* in a jar on the classpath. In fact, I wouldn't be able to use this.getClass().getResource() to find it; if it was in a standalone file in the file system, I'd do this:
File myFile = new File(filename);
URL fileURL = myFile.toURL();
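A side note on the snippet above: later Java releases (Java 6 onward) deprecate File.toURL() because it does not escape characters such as spaces, and recommend going through File.toURI() instead. A hedged sketch of that variant, with the class name FileUrls and the helper toUrl invented here:

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper, named for this example only.
final class FileUrls {
    /** Converts a filesystem path to a file: URL, escaping characters such as spaces. */
    static URL toUrl(String filename) {
        try {
            // toURI() escapes characters that are illegal in URLs; toURL() alone does not.
            return new File(filename).toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Cannot convert to URL: " + filename, e);
        }
    }

    public static void main(String[] args) {
        // The space in the directory name appears as %20 in the printed URL.
        System.out.println(toUrl("my images/foo.gif"));
    }
}
```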
It seems to me that if you clearly and unambiguously notate the origins of a file, including the protocol name and the jar name (where applicable!) as well as the path(s) and the specific file name, anyone maintaining the code is going to be very clear on exactly where that file is. The only wrinkle is that I'm not aware of a standard way of saying that a file is in a jar on the classpath and we don't need to know the jar name. That's why I'm proposing the notation that I mentioned.
However, for all I know, there is already an existing convention to get the same idea across which differs from mine. If so, I'd like to use the already established notation, whatever it is, rather than muddying the waters by using a different notation.
In a sense, I suppose this is basically about making the code both self-documenting and flexible. Rather than relying on someone to write and maintain comments in your imaginary config file saying that the file is found in a jar on the classpath, I'd like to be able to point to any file in the file system or online (or in a jar that is in the filesystem or online or in the classpath) and be confident that the program would use the appropriate methods to find the file based on the way the file location was notated.
Am I making sense yet? :-)
I'm glad we're having this dialog, it is helping me clarify in my own mind what I'm trying to accomplish and why it seems useful.
Rhino
Roedy Green - 30 Oct 2005 06:10 GMT
On Sat, 29 Oct 2005 14:59:38 -0400, "Rhino" <no.offline.contact.please@nospam.com> wrote, quoted or indirectly quoted someone who said :
>If I only stored "Images/foo.gif" in config file, someone looking at the
>code would have to *INFER* that it was in a jar and was going to be obtained
>via this.getClass().getResource().
Experienced Java programmers all know what a resource is, and are used to adjusting file locations external to the program to have them still considered resources in different contexts, e.g. debugging, running locally, running on a server, running as an Applet.
If you start monkeying with that, you are going to cause 100 times as much confusion as you imagine you are avoiding.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Rhino - 30 Oct 2005 18:25 GMT
> On Sat, 29 Oct 2005 14:59:38 -0400, "Rhino"
> <no.offline.contact.please@nospam.com> wrote, quoted or indirectly
[quoted text clipped - 11 lines]
> If you start monkeying with that, you are going to cause 100 times as
> much confusion as you imagine you are avoiding.
Causing confusion is the last thing I want. I'm looking to find a way of describing resources so that it is easy for my code to find those resources and easy for maintainers of code to understand what I am doing. I'm just struggling to find a good way to indicate that a given file is in a jar in the filesystem so I'm looking for suggestions on how to notate the file name to indicate that.
Rhino
Roedy Green - 31 Oct 2005 03:48 GMT
On Sun, 30 Oct 2005 12:25:55 -0500, "Rhino" <no.offline.contact.please@nospam.com> wrote, quoted or indirectly quoted someone who said :
>I'm just
>struggling to find a good way to indicate that a given file is in a jar in
>the filesystem so I'm looking for suggestions on how to notate the file name
>to indicate that.
Using getResource or getResourceAsStream is a good clue. Perhaps you might want to collect such strings using the term RESOURCE in their names.
The big problem I have found is remembering to include resources in jars. You get no hint of trouble till you actually load the resource.
Cramfull offers a solution to that.
See http://mindprod.com/jgloss/cramfull.html
Roedy Green - 30 Oct 2005 06:08 GMT
On Sat, 29 Oct 2005 13:11:04 -0400, "Rhino" <no.offline.contact.please@nospam.com> wrote, quoted or indirectly quoted someone who said :
>I'm looking for a notation that I can use in my programs to refer to files
>that are within jars that are visible to the program by virtue of being on
[quoted text clipped - 3 lines]
>imitates the first example and implies that the file is in the file system
>but that we don't care what jar contains the file.
getResource does searching and gives you the FIRST resource match on the classpath.
It produces the ! syntax, which is treated thereafter like any other URL.
You want a method something like this off the top of my head:
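    // needs: import java.net.MalformedURLException; import java.net.URL;

    /**
     * @param c class used to look up plain resource names
     * @param s either a resource name, coded as "myimage.jpg", or a URL to the
     *          resource, coded as "jar:file:///C|/bar/baz.jar!/com/foo/myimage.jpg"
     * @return URL to the resource, or null if a plain resource name is not found
     * @throws MalformedURLException if s looks like a URL but cannot be parsed
     */
    static URL getURLForResource( Class<?> c, String s ) throws MalformedURLException
    {
        if ( s.startsWith( "jar:" ) || s.startsWith( "file:" ) || s.startsWith( "http:" ) )
            return new URL( s );
        else
            return c.getResource( s );
    }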
Rhino - 30 Oct 2005 18:23 GMT
> On Sat, 29 Oct 2005 13:11:04 -0400, "Rhino"
> <no.offline.contact.please@nospam.com> wrote, quoted or indirectly
[quoted text clipped - 30 lines]
> else return c.getResource( s );
> }
You might be on to something there! Hmm, let me think that over a bit....
Rhino
Thomas Fritsch - 30 Oct 2005 03:25 GMT
> I've been using the Jar: protocol a bit in the last few days and I'd like
> to know if this is a valid use of that protocol:
>
> jar:file:!/Images/foo.gif
AFAIK the JAR-URL-specification requires the part between "jar:" and "!" to be a valid URL, i.e. the URL of the jar file. In your case this part is "file:" which surely is invalid. Hence, in my opinion it would be a mis-use of the "jar:" protocol.
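The two-part structure described here (URL of the jar, then "!/", then the entry name) can be inspected with java.net.JarURLConnection. The sketch below uses a made-up path, /tmp/test.jar, and an invented class name JarUrlParts; it only takes the URL apart and never opens the jar, so the file does not have to exist. It does not show how a malformed string such as jar:file:!/Images/foo.gif is treated, which may differ between Java versions.

```java
import java.io.IOException;
import java.net.JarURLConnection;
import java.net.URL;

// Hypothetical helper, named for this example only.
final class JarUrlParts {
    /** Returns { URL of the jar file, entry name inside the jar } for a jar: URL. */
    static String[] split(String jarUrl) {
        try {
            URL url = new URL(jarUrl);
            // openConnection() only creates the connection object; nothing is read yet.
            JarURLConnection connection = (JarURLConnection) url.openConnection();
            return new String[] {
                connection.getJarFileURL().toString(),
                connection.getEntryName()
            };
        } catch (IOException e) { // MalformedURLException is a subclass of IOException
            throw new IllegalArgumentException("Not a usable jar: URL: " + jarUrl, e);
        }
    }

    public static void main(String[] args) {
        String[] parts = split("jar:file:/tmp/test.jar!/images/foo.gif");
        System.out.println("jar file: " + parts[0]);
        System.out.println("entry:    " + parts[1]);
    }
}
```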
> Basically, I'm trying to describe the location of a GIF that a program
> should be able to find in one of the various jars that are on the
[quoted text clipped - 14 lines]
> Does that seem reasonable? If not, can anyone suggest a better notation to
> use for my situation?
Yes, sure. It is a reasonable thing to want. But I would suggest not to use the "jar:" protocol, and instead prefer to invent a new protocol (the name "classpath:" comes to mind). Example URLs might then be:
classpath:/Images/foo.gif
classpath:/javax/swing/plaf/metal/icons/Error.gif
You can push this approach even one step further. If you would write a small Handler for this new protocol, then you could use your new URLs exactly like any other URL. For example:
URL url = new URL("classpath:/Images/foo.gif");
InputStream stream = url.openStream();
More on protocol handlers can be found at http://java.sun.com/j2se/1.4.2/docs/api/java/net/URL.html#constructor_detail
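For readers who want to see roughly what such a Handler could look like, here is a minimal sketch. It is an illustration under stated assumptions, not the code anyone in the thread used: the class name ClasspathHandler and the helpers classpathUrl and firstByte are invented, and the handler is passed straight to the three-argument URL constructor so that no registration is needed for the demo.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

/** Hypothetical handler for "classpath:" URLs; not part of the JDK. */
final class ClasspathHandler extends URLStreamHandler {
    private final ClassLoader loader;

    ClasspathHandler(ClassLoader loader) {
        this.loader = loader;
    }

    @Override
    protected URLConnection openConnection(URL url) throws IOException {
        String name = url.getPath();
        // ClassLoader resource names have no leading slash.
        if (name.startsWith("/")) {
            name = name.substring(1);
        }
        URL resolved = loader.getResource(name);
        if (resolved == null) {
            throw new FileNotFoundException("Not on the classpath: " + name);
        }
        // Delegate to whatever real URL (file:, jar:, ...) the class loader found.
        return resolved.openConnection();
    }

    /** Builds a classpath: URL bound to this handler, so no registration is needed. */
    static URL classpathUrl(String spec) {
        try {
            return new URL(null, spec, new ClasspathHandler(ClassLoader.getSystemClassLoader()));
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad classpath: URL: " + spec, e);
        }
    }

    /** Opens the resource and returns its first byte, as a quick smoke test. */
    static int firstByte(String spec) {
        try (InputStream in = classpathUrl(spec).openStream()) {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this in place, classpathUrl("classpath:/Images/foo.gif").openStream() behaves like the two-line example above, provided Images/foo.gif is on the classpath.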
> I can't find any discussion of this "special case" in the articles I've
> seen
> about the Jar: protocol.
 Signature "TFritsch$t-online:de".replace(':','.').replace('$','@')
Rhino - 30 Oct 2005 18:22 GMT
> > I've been using the Jar: protocol a bit in the last few days and I'd like
> > to know if this is a valid use of that protocol:
[quoted text clipped - 36 lines]
> InputStream stream = url.openStream();
> More on protocol handlers can be found at http://java.sun.com/j2se/1.4.2/docs/api/java/net/URL.html#constructor_detail
I think I like your proposal! It seemed reasonable to me to continue to use the 'jar:file:', since it seems to be acceptable to use 'jar:file:c:\\myJars\\big.jar!/Images/foo.gif' to designate a specific entry within a jar on the filesystem. After all, a jar on the classpath is also on the filesystem, by definition. But using a new protocol like your proposed 'classpath:' is clearer than trying to use 'file:' within 'jar:'.
The downside is that no 'classpath:' protocol is known to anyone but you and me, at least as far as I know. That means it may raise more questions than it solves if anyone else sees my code. Of course, I could approach whatever body creates RFCs and suggest the creation of a 'classpath:' protocol but that strikes me as a process that would drag on for years. Still, that is probably the best solution from a design point of view....
> > I can't find any discussion of this "special case" in the articles I've
> > seen
> > about the Jar: protocol.
Rhino
Chris Smith - 31 Oct 2005 23:44 GMT
> I think I like your proposal! It seemed reasonable to me to continue to use
> the 'jar:file:', since it seems to be acceptable to use
[quoted text clipped - 6 lines]
> me, at least as far as i know. That means it may raise more questions than
> it solves if anyone else sees my code.
Believe me, seeing a malformed "jar:" URL would raise at LEAST as many questions as seeing a new protocol handler. Besides, your new "URL" could not be used properly with the java.net.URL class. So your choice is between something that's unique, or something that's fundamentally broken. I'll take unique any day.
> Of course, I could approach whatever
> body creates RFCs and suggest the creation of a 'classpath:' protocol but
> that strikes me as a process that would drag on for years.
It's also quite unlikely to succeed. You need to realize that the Java concept of a URL is very different from the W3C's concept of a URL. Things like the "jar:" scheme are Java URLs. Though they comply with the syntax of the W3C's URLs, the W3C doesn't standardize any scheme called "jar".
So if you did suggest this, use the Bug Parade at java.sun.com. The best way to proceed would be to implement a protocol handler (a java.net.URLStreamHandler) as described above, and then propose it as an RFE in Bug Parade, and attach your existing code. And yes, it would definitely drag on for years; but at least you've got your own code to use in the interim.
 Signature www.designacourse.com The Easiest Way To Train Anyone... Anywhere.
Chris Smith - Lead Software Developer/Technical Trainer MindIQ Corporation
Roedy Green - 01 Nov 2005 07:18 GMT
>So if you did suggest this, use the Bug Parade at java.sun.com. The
>best way to proceed would be to implement a ProtocolHandler as described
>above,
Would anyone care to compose a paragraph outlining what you have to do to add a new protocol handler? I would like to add this to the java glossary.
What Interface(s) do you have to implement? What do you do to get it registered on the list so that your code will get called when someone does this:
URL url = new URL( "weird://www.billabong.com:80/songs/lyrics.txt" );
URLConnection urlc = url.openConnection();
I know this is possible because a team I worked on added a number of custom protocol handlers to deal with various stock market ticker streams.
Chris Smith - 01 Nov 2005 16:31 GMT
> Would anyone care to compose a paragraph outlining what you have to do
> to add a new protocol handler? I would like to add this to the java
> glossary.
Better yet, there's a nice description at:
http://java.sun.com/developer/onlineTraining/protocolhandlers/
Roedy Green - 02 Nov 2005 01:09 GMT
>Better yet, there's a nice description at:
>
> http://java.sun.com/developer/onlineTraining/protocolhandlers/
Many Thanks. That felt like scratching an itch in the center of my back.
Roedy Green - 02 Nov 2005 02:11 GMT
>Better yet, there's a nice description at:
>
> http://java.sun.com/developer/onlineTraining/protocolhandlers/
In a nutshell, here is how it works:
In URLs, you see officially supported protocols like http: https: file: ftp: jdbc: and rmi:. These describe the rules by which data are extracted over such links. It is possible to define your own protocols, e.g. to extract stock ticker information or to handle encryption or compression. To do that you implement a custom java.net.URLStreamHandler class and a java.net.URLConnection.
You then name your new URLStreamHandler class com.mydomain.protocol.xxxx.Handler where xxxx is the name of your new protocol.
Then you must hook them into the official list of supported protocols so that new URL will recognise your new protocol rather than throwing a MalformedURLException. You do this by adding your implementing package name prefix e.g. com.mydomain.protocol to the java.protocol.handler.pkgs system property, e.g.
// Registering a new protocol handler com.mydomain.protocol.xxxx.Handler
System.setProperty( "java.protocol.handler.pkgs", "com.mydomain.protocol" );
You don't hook your protocol name itself in anywhere. Java finds it via the package/class naming convention.
Normally there is no such java.protocol.handler.pkgs property because there are no custom protocols. If you have more than one package prefix, use | to separate the names, not commas.
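Because the property can hold several prefixes separated by |, code that registers a handler at run time may want to append to the existing value instead of overwriting it. A minimal sketch of that idea follows; the class HandlerPackages and its methods are invented for illustration. As far as I know the property is only consulted when a protocol is looked up for the first time, so the registration would have to happen before the first URL using that protocol is created.

```java
// Hypothetical helper, named for this example only.
final class HandlerPackages {
    static final String PROPERTY = "java.protocol.handler.pkgs";

    /** Returns 'existing' with 'prefix' appended, using '|' as the separator. */
    static String append(String existing, String prefix) {
        if (existing == null || existing.isEmpty()) {
            return prefix;
        }
        return existing + "|" + prefix;
    }

    /** Registers a package prefix without discarding prefixes that are already set. */
    static void register(String prefix) {
        System.setProperty(PROPERTY, append(System.getProperty(PROPERTY), prefix));
    }

    public static void main(String[] args) {
        register("com.mydomain.protocol");
        System.out.println(System.getProperty(PROPERTY));
    }
}
```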
Thomas Fritsch - 02 Nov 2005 19:44 GMT
> You then name your new URLStreamHandler class com.mydomain.protocol.
> xxxx.Handler where xxxx is the name of your new protocol.
You could also name your new URLStreamHandler class "sun.net.www.protocol.xxxx.Handler". Then there would be no need to fiddle with the "java.protocol.handler.pkgs" system property. (But of course this is dirty cheating, and as such is to be frowned upon. ;-) )
> Then you must hook them into the official list of supported protocols > so that new URL will recognise your new protocol rather than throwing [quoted text clipped - 7 lines] > System.setSystemProperty( "java.protocol.handler.pkgs", > "com.mydomain.protocol" );
Roedy Green - 03 Nov 2005 03:37 GMT
On Wed, 2 Nov 2005 19:45:41 +0100, "Thomas Fritsch" <i.dont.like.spam@invalid.com> wrote, quoted or indirectly quoted someone who said :
>You could also name your new URLStreamHandler class
>"sun.net.www.protocol.xxxx.Handler" .
>Then there would be no need to fiddle with the "java.protocol.handler.pkgs"
>system property.
>(But of course this is dirty cheating, and as such is to be frowned upon.
>;-) )
That would be like teaching forgery as a way to solve your financial ills. I suspect if you distributed such code, and Sun found out about it, they could sue you.
Thomas Fritsch - 01 Nov 2005 01:05 GMT
> I think I like your proposal! It seemed reasonable to me to continue to
> use
[quoted text clipped - 14 lines]
> that strikes me as a process that would drag on for years. Still, that is
> probably the best solution from a design point of view....
I agree, going through the official RFC process would be lengthy. But I think that this would not be needed for your goal. From your original post I understand your goal as: Use specialized URLs internal to your application, but *not* for inter-operating with *other* applications.
The "classpath" protocol is not a real protocol in the same sense like the networking protocols ("http", "ftp", ...) are. Actually it is little more than a parsing rule how to interpret URL-strings ("classpath:resourceName"). Hence, the term /protocol/ might be very misleading here. BTW: As far as I know, even Sun's "jar" protocol has never been officially registered by an RFC, probably for the same reason as above.
You might be surprised how simple the "classpath" protocol handler can be implemented (~12 lines of code). See http://gate.ac.uk/gate/doc/java2html/gate/util/protocols/classpath/Handler.java.html for an inspiration. Making Java aware of the new protocol handler is simple, too: Add class "your.package.classpath.Handler" to your app, and start your app with "-Djava.protocol.handler.pkgs=your.package". Java will then automagically find the Handler when needed.
Rhino - 01 Nov 2005 18:44 GMT
> > I think I like your proposal! It seemed reasonable to me to continue to
> > use
[quoted text clipped - 19 lines]
> understand your goal as: Use specialized URLs internal to your application,
> but *not* for inter-operating with *other* applications.
Yes, I think this is the heart of the issue. I really only _need_ a convention/notation/protocol for use within my own programs; however, I'm _hoping_ to find something that other people would understand fairly intuitively if they tried to maintain my code. The ideal would be to come up with a standardized way of saying "the file is in some jar of the classpath but we don't need its name" that the whole industry would understand and accept.
Even better, an overall convention on describing the position of ANY piece of information would be wonderful. It would be really neat to see a concise, descriptive way of noting the location of a piece of data, even if it was in a database or on a network share or a floppy disk in someone's house. Then, just hand that location to a blackbox method that will return the data in the file, assuming the "data source" is actually available (e.g. the floppy disk is actually in someone's drive) and there are no security roadblocks (e.g. there are no file permissions or access issues for that particular piece of data).
> The "classpath" protocol is not a real protocol in the same sense like the
> networking protocols ("http", "ftp", ...) are. Actually it is little more
> than a parsing rule how to interpret URL-strings ("classpath:resourceName").
> Hence, the term /protocol/ might be very misleading here.
> BTW: As far as I know, even Sun's "jar" protocol has never been officially
> registered by an RFC, probably for the same reason as above.
Yes, I agree, "protocol" may not be the right word for this; but if there is an existing word that is more suitable, I haven't thought of what it is. Maybe this needs a new noun - maybe a "Fritsch" or a "Rhino" - but let's put that aside for now :-)
> You might be surprised how simple the "classpath" protocol handler can be
> implemented (~12 lines of code). See http://gate.ac.uk/gate/doc/java2html/gate/util/protocols/classpath/Handler.java.html
> for an inspiration.
> Making Java aware of the new protocol handler is simple, too:
> Add class "your.package.classpath.Handler" to your app, and start your app
> with "-Djava.protocol.handler.pkgs=your.package".
> Java will then automagically find the Handler when needed.
Damn! I'm starting to like this idea a lot! This approach would make it pretty easy for others to adopt this new "protocol" too. Eventually, it could become part of the API itself so that it wouldn't have to be loaded separately.
I'm going to mull this over a bit; I may just decide to take this further. But I need to think about what my own application needs first before I get too caught up in this :-)
Thanks for your very interesting suggestions!
Rhino
Roedy Green - 02 Nov 2005 02:18 GMT
On Tue, 1 Nov 2005 12:44:57 -0500, "Rhino" <no.offline.contact.please@nospam.com> wrote, quoted or indirectly quoted someone who said :
> The ideal would be to come up
>with a standardized way of saying "the file is in some jar of the classpath
>but we don't need it's name" that the whole industry would understand and
>accept.
You specifically want to exclude things on the classpath NOT in jars though?
You could do that with a wrapper around the getResource method that looked at the offered URL and rejected ones without a ! somewhere in them.
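Such a wrapper might look roughly like the sketch below. The class name JarOnlyResources and its methods are invented for this example; the check looks for the jar: scheme as well as the "!/" separator, which is slightly stricter than looking for a bare "!".

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper, named for this example only.
final class JarOnlyResources {
    /** True if the URL points at an entry inside a jar (jar: scheme with a "!/" part). */
    static boolean isInsideJar(URL url) {
        return url != null
                && "jar".equals(url.getProtocol())
                && url.getPath().contains("!/");
    }

    /** Convenience overload for trying the check on a URL given as text. */
    static boolean isInsideJar(String spec) {
        try {
            return isInsideJar(new URL(spec));
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a URL: " + spec, e);
        }
    }

    /** Like getResource, but returns null for matches that are not inside a jar. */
    static URL getJarResource(Class<?> anchor, String name) {
        URL url = anchor.getResource(name);
        return isInsideJar(url) ? url : null;
    }
}
```

A resource found as a loose file in a classes directory comes back as a file: URL and is therefore rejected by getJarResource.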
You would invent a custom protocol mainly if you wanted to smuggle this into other people's code that only understood URLs.
You could use a self explanatory protocol name
in-jar-on-classpath:xxx.txt
Roedy Green - 30 Oct 2005 06:02 GMT
On Sat, 29 Oct 2005 11:20:01 -0400, "Rhino" <no.offline.contact.please@nospam.com> wrote, quoted or indirectly quoted someone who said :
>Since 'this.getClass().getResource()' will search EVERY jar in the classpath
>for the desired file, it shouldn't be necessary to specify the jar name.
[quoted text clipped - 3 lines]
>the jars found in the classpath, will be searched for an Images directory
>and a file named foo.gif within that directory.
getResource does a search and gives you a direct url to the Jar where it found the resource.
If you want to speed that search, use the class-path and index feature of jar.exe to build a multi-jar index for direct lookup.