Java Forum / General / January 2006
Any Checkstyle users?
slippymississippi@yahoo.com - 31 Jan 2006 02:14 GMT I'm curious, I was looking at using Checkstyle on my projects. For some reason, it complains when I use hashmaps. What? Why wouldn't I want to use hashmaps in my code? Heaven forbid I use one of the most useful data structures in all of Java.
mikm - 31 Jan 2006 03:46 GMT What problem does it report? I just installed the Checkstyle plugin for Eclipse and it reported no problem when I created an new HashMap aside from the "magic number" complaint in the constructor.
The "magic number" problem means that you should try to use constants instead of using some number whenever possible.
mikm - 31 Jan 2006 03:55 GMT Dah. I should clarify myself. For example, imagine you have a Hotel class and want to keep track of your guests
import java.util.HashMap; public class Hotel { private HashMap guests; //... public Hotel() { guests = new HashMap(50); } //... }
In this example "50" is ambiguous. Why 50?
A slightly better way would be
public class Hotel { private HashMap guests; private static final int MAX_GUESTS = 50; //... public Hotel() { guests = new HashMap(MAX_GUESTS); } //... }
Now, anybody looking at your code knows that the maximum number of guests is 50 and that the HashMap "guests" can hold that much data.
Tony Morris - 31 Jan 2006 04:18 GMT > Dah. I should clarify myself. For example, imagine you have a Hotel > class and want to keep track of your guests > > import java.util.HashMap; > public class Hotel > { private HashMap guests; You should be using a Map reference. All aside from the false assumption that HashMap is indeed useful.
 Signature Tony Morris http://tmorris.net/
Roedy Green - 31 Jan 2006 05:03 GMT >You should be using a Map reference. >All aside from the false assumption that HashMap is indeed useful. Does that buy you anything other than it prevents you write locking yourself into ArrayList by using non-Map methods?
The penalty is you are using slower interface references rather that direct class ones which require a much more convoluted scheme to find the appropriate method on each call. Another penalty is you increase the complexity of the declare and I figure make it less readable, at least to newbies.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Tony Morris - 31 Jan 2006 05:14 GMT > >You should be using a Map reference. > >All aside from the false assumption that HashMap is indeed useful. [quoted text clipped - 8 lines] > Another penalty is you increase the complexity of the declare and I > figure make it less readable, at least to newbies. Separating contract from implementation is something that Java has failed miserably at, while pretending to do otherwise. The advantages to a correct separation are widely misunderstood and not even addressed in some of the most prominent literature. The argument regarding the "performance hit" for lookup of virtual calls is fall of nothing more than air - I'm sure (at least, I hope) you'll agree. In fact, it has nothing at all to do with interfaces, and everything to do with the level of depth in "virtuality" of the lookup, which class types can also "suffer" from.
"Newbies" very rarely suffer from the apparant complexity because they have no deeply held notions about how things "should" look, in contrast to the many contrived ideas by "non-newbies". At least, this is my observation according to experimentation - I can explain a somewhat trivial concept to a vulnerable mind, such as my first year students, and I will be asked all the right questions - to the junk-filled mind, I typically hear religious refutation and 'reiteration of the guff', which is strongly guarded by ego.
 Signature Tony Morris http://tmorris.net/
Roedy Green - 31 Jan 2006 11:56 GMT > I'm sure (at least, I hope) you'll agree. In >fact, it has nothing at all to do with interfaces, and everything to do with >the level of depth in "virtuality" of the lookup, which class types can also >"suffer" from. Not at all. Calls to methods via interfaces are much trickier than calls to virtual instance methods.. For classes, the method lives at a fixed offset in a vtbl for the calls. Bjarne Stroustrup, the father of C++, invented the vtbl. I am impressed all to heck since I tried for years to invent it on my own for my own language, coming up with nothing nearly as elegant. All you need is the method number and the vtbl for the class this object ACTUALLY is and away you go , not that much worse that a call to a static method.
But for a call to a method via an interface reference the method is at a different offset in every implementing method. Various goofy schemes are used from linearly scanning for it on every call, caching the last hit or two, miniature HashMaps where you look up the interface to get the offset of the method to implement each call...
I understand that much research has gone into efficiently implementing calls to methods via interface references and the penalties are not nearly so severe as in past.
People use interface references in preference to class references, to the point making up dummy interfaces that have only one implementor and always will. I think you should only use interface references when there is a likely benefit.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Daniel Dyer - 31 Jan 2006 11:04 GMT >> You should be using a Map reference. >> All aside from the false assumption that HashMap is indeed useful. > > Does that buy you anything other than it prevents you write locking > yourself into ArrayList by using non-Map methods? Not sure what you mean by "write locking yourself into ArrayList". I would have thought the main advantages are that you can switch implementations (for example change from HashMap to TreeMap) without having to change any other code. The corrollary of that is that it prevents your logic from becoming polluted by details of the current implementation.
I'm interested in Tony's point about HashMap not being useful. What do you mean by this?
> The penalty is you are using slower interface references rather that > direct class ones which require a much more convoluted scheme to find > the appropriate method on each call. You would have to have some convincing evidence of a significant performance hit before I would take that into consideration. This is another one of those micro-optimisations that is best left to the compiler or VM.
Dan.
 Signature Daniel Dyer http://www.dandyer.co.uk
Roedy Green - 31 Jan 2006 11:58 GMT On Tue, 31 Jan 2006 11:04:17 -0000, "Daniel Dyer" <dan@dannospamformepleasedyer.co.uk> wrote, quoted or indirectly quoted someone who said :
>"write locking yourself into ArrayList" You make it difficult to change to some other List implementation. If you use an explicit ArrayList reference, you can inadvertently use methods defined only for ArrayList but not for other Lists.
Using an List reference, forces you write vanilla code where you can change the collection implementation later easily.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Chris Uppal - 31 Jan 2006 11:42 GMT > { private HashMap guests; > private static final int MAX_GUESTS = 50; [quoted text clipped - 4 lines] > Now, anybody looking at your code knows that the maximum number of > guests is 50 and that the HashMap "guests" can hold that much data. Except that that isn't true. The name should be something like private static final int INITIAL_ESTIMATE_OF_PROBABLE_MAX_GUESTS = 50; which is unwieldy in the extreme, but calling it MAX_GUESTS is actively wrong and so must be avoided.
The real problem here is not the use of a "magic number", but the use of a checking tool which will blindly apply the same so-called rule to every situation, regardless of whether it is applicable. If Checkstyle doesn't provide a way to annotate your code to say "I have reviewed this 'problem' code and it is fine as it is", then I'd suggest dropping Checkstyle.
-- chris
Rick Giles - 31 Jan 2006 12:40 GMT >> { private HashMap guests; >> private static final int MAX_GUESTS = 50; [quoted text clipped - 15 lines] > provide a way to annotate your code to say "I have reviewed this 'problem' code > and it is fine as it is", then I'd suggest dropping Checkstyle. Yes, it does; see SuppressionCommentFilter at http://checkstyle.sourceforge.net/config.html
/RG
> -- chris Chris Uppal - 31 Jan 2006 13:15 GMT [me:]
> > The real problem here is not the use of a "magic number", but the use > > of a checking tool which will blindly apply the same so-called rule to [quoted text clipped - 5 lines] > Yes, it does; see SuppressionCommentFilter at > http://checkstyle.sourceforge.net/config.html Good. (Although the feature seems rather awkward to use for my taste.)
BTW, it appears (from more recent posts in the thread) that the issue wasn't the use of a magic number at all, but of a variable of type HashMap.
-- chris
slippymississippi@yahoo.com - 31 Jan 2006 14:03 GMT > BTW, it appears (from more recent posts in the thread) that the issue wasn't > the use of a magic number at all, but of a variable of type HashMap. Not just HashMap:
"java.util.GregorianCalendar, java.util.Hashtable, java.util.HashSet, java.util.HashMap, java.util.ArrayList, java.util.LinkedList, java.util.LinkedHashMap, java.util.LinkedHashSet, java.util.TreeSet, java.util.TreeMap, java.util.Vector"
I suppose the check is pushing me to pass/return custom data objects that implement Java-specific data structures behind the scenes? I can't see how you would write meaningful code without eventually using one of the listed data structures, but I can see vaguely how it would be beneficial to wrapper the implementation of these data structures.
Daniel Dyer - 31 Jan 2006 14:22 GMT >> BTW, it appears (from more recent posts in the thread) that the issue >> wasn't [quoted text clipped - 12 lines] > one of the listed data structures, but I can see vaguely how it would > be beneficial to wrapper the implementation of these data structures. It seems to be advocating that you return java.util.Calendar, java.util.Map, java.util.Set, java.util.List, etc. In other words, return the interface type rather than the concrete implementation. No need to write any wrappers.
Dan.
 Signature Daniel Dyer http://www.dandyer.co.uk
Chris Uppal - 31 Jan 2006 15:26 GMT > Not just HashMap: > [quoted text clipped - 8 lines] > one of the listed data structures, but I can see vaguely how it would > be beneficial to wrapper the implementation of these data structures. I think you may be reading too much into what Checkstyle is telling you.
Forgetting about Checkstyle for the moment, there are at least three different aspects to "good" code that you touch on here. I think it would help to separate them out from each other.
One is that when you use, say, a HashMap in some private bit of code, how do you declare the variable that holds the reference. Do you write: Map map = new HashMap(); or: HashMap map = new HashMap(); The most popular school of thought is that the first form is to be preferred, but there are dissenting opinions too. My own opinion is that it doesn't make much difference in this case, since -- by hypothesis -- we are talking about private code. If the scope of the variable is restricted, then so is the "damage" that can possibly be caused by declaring it too narrowly. In short it doesn't lead to hard-to-manage coupling or brittle code since nothing much can depend on it.
The second is to ask the same question about externally visible fields and APIs. For instance the return type of a public or protected method. In /this/ case I think the standard answer is completely correct, and that -- unless you have some special requirement -- the declared type should be as general as possible (ideally an interface).
The last issue is the use of "bare" collections in APIs at all. Is it "good" to export an API which takes a List<MyObject>, or which returns a Map(String, MyOtherObject>) ? My answer is that its not intrinsically wrong, and is quite often exactly the right thing to do, but that you should stop and think whenever you find yourself doing it. If you find yourself doing it much, or if you find yourself expressing complicated structures as collections (in the API), then you have quite probably gone too far, and have something of a code mess, where the responsibility for managing and interpreting a particular raw structure is scattered all over the code-base, rather than nicely centralised. For instance, I wouldn't blink at: List<MyObject>someMethod() I would feel mildly uneasy about Map<String, MyObject>someMethod() and if I came across List<List<MyObject>> then my alarm bells would be going off like sky-rockets.
Coming back to Checkstyle. I don't know whether it distinguishes between my first and second issues, but I think it is trying to warn you about one or the other. I don't see an easy way that a tool like Checkstyle could provide much help with the third issue since the cut-off between valid uses of raw collections, and culpable failure to create a true object, is not easy to recognise without a true understanding of the code. I suppose some sort of heuristic could be defined, but I'd not expect it to be much use.
-- chris
Roedy Green - 31 Jan 2006 14:17 GMT On Tue, 31 Jan 2006 11:42:01 -0000, "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> wrote, quoted or indirectly quoted someone who said :
>Except that that isn't true. The name should be something like > private static final int INITIAL_ESTIMATE_OF_PROBABLE_MAX_GUESTS = 50; >which is unwieldy in the extreme, but calling it MAX_GUESTS is actively wrong >and so must be avoided. How let us say the capacity factor is .75 and you set the size at 100. How big is the array inside 100 or 133? Not knowing that could give you poor performance with the beasts.
IIRC, I have it documented but I don't recall the answer. See http://mindprod.com/jgloss/hashmap.html http://mindprod.com/jgloss/hashtable.html
These things need a fair bit of slop to work. Putting 100 guests in 100 slots is like asking a ballerina to perform in a telephone booth.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Chris Uppal - 31 Jan 2006 16:04 GMT [me:]
> > Except that that isn't true. The name should be something like > > private static final int [quoted text clipped - 5 lines] > 100. How big is the array inside 100 or 133? Not knowing that could > give you poor performance with the beasts. Good question! I had assumed that the capacity was the number of actual entries expected -- regardless of load factor. But it seems that the code intereprets it as the size of the array to allocate, with the load factor determining how many of the occupied slots are allowed before it grows itself.
In point of fact, the implementation rounds the capacity up to a power of two.
Anyway, none of that's clear from the documentation -- which is unforgivable. I don't care for the choice they've made for the interpretation of "capacity" (in fact I think it's stupid), but leaving the matter open to guesswork is <pauses, and chooses a less loaded word> regrettable.
That being the case, my recommended identifier name is also wrong. I leave it to the reader to find a managable identifier that expresses the concept Initial estimate of the probable maximum number of expected guests, adjusted to take account of the (undocumented) relationship between the "load factor" and "capacity" of a java.util.HashMap.
Personally, I find that just 50 is as clear as anything ;-)
-- chris
Roedy Green - 31 Jan 2006 19:56 GMT On 31 Jan 2006 16:04:07 GMT, "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> wrote, quoted or indirectly quoted someone who said :
>In point of fact, the implementation rounds the capacity up to a power >of two. See http://mindprod.com/jgloss/hashmap.html http://mindprod.com/jgloss/hashtable.html
Look for the hammerhead "gotcha" shark.
That rounding gives HashMaps a bit of extra slop, perhaps giving HashMap an undeserved reputation for being much faster that Hashtable.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
slippymississippi@yahoo.com - 31 Jan 2006 12:59 GMT > What problem does it report? I tried turning on all of the optional checks, and it returned the IllegalType violation. The description is:
"Checks that particular class are never used as types in variable declarations, return values or parameters. Includes a pattern check that by default disallows abstract classes. Rationale: Helps reduce coupling on concrete classes."
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|