Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / General / January 2006

Tip: Looking for answers? Try searching our database.

Any Checkstyle users?

Thread view: 
slippymississippi@yahoo.com - 31 Jan 2006 02:14 GMT
I'm curious, I was looking at using Checkstyle on my projects.  For
some reason, it complains when I use hashmaps.  What?  Why wouldn't I
want to use hashmaps in my code?  Heaven forbid I use one of the most
useful data structures in all of Java.
mikm - 31 Jan 2006 03:46 GMT
What problem does it report?  I just installed the Checkstyle plugin
for Eclipse and it reported no problem when I created an new HashMap
aside from the "magic number" complaint in the constructor.

The "magic number" problem means that you should try to use constants
instead of using some number whenever possible.
mikm - 31 Jan 2006 03:55 GMT
Dah.  I should clarify myself.  For example, imagine you have a Hotel
class and want to keep track of your guests

import java.util.HashMap;
public class Hotel
{   private HashMap guests;
   //...
  public Hotel()
  {   guests = new HashMap(50);
  }
  //...
}

In this example "50" is ambiguous.  Why 50?

A slightly better way would be

public class Hotel
{   private HashMap guests;
   private static final int MAX_GUESTS = 50;
   //...
  public Hotel()
  {   guests = new HashMap(MAX_GUESTS);
  }
  //...
}

Now, anybody looking at your code knows that the maximum number of
guests is 50 and that the HashMap "guests" can hold that much data.
Tony Morris - 31 Jan 2006 04:18 GMT
> Dah.  I should clarify myself.  For example, imagine you have a Hotel
> class and want to keep track of your guests
>
> import java.util.HashMap;
> public class Hotel
> {   private HashMap guests;

You should be using a Map reference.
All aside from the false assumption that HashMap is indeed useful.

Signature

Tony Morris
http://tmorris.net/

Roedy Green - 31 Jan 2006 05:03 GMT
>You should be using a Map reference.
>All aside from the false assumption that HashMap is indeed useful.

Does that buy you anything other than it prevents you write locking
yourself into ArrayList by using non-Map methods?

The penalty is you are using slower interface references rather that
direct class ones which require a much more convoluted scheme to find
the appropriate method on each call.

Another penalty is you increase the complexity of the declare and I
figure make it less readable, at least to newbies.
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Tony Morris - 31 Jan 2006 05:14 GMT
> >You should be using a Map reference.
> >All aside from the false assumption that HashMap is indeed useful.
[quoted text clipped - 8 lines]
> Another penalty is you increase the complexity of the declare and I
> figure make it less readable, at least to newbies.

Separating contract from implementation is something that Java has failed
miserably at, while pretending to do otherwise.
The advantages to a correct separation are widely misunderstood and not even
addressed in some of the most prominent literature.
The argument regarding the "performance hit" for lookup of virtual calls is
fall of nothing more than air - I'm sure (at least, I hope) you'll agree. In
fact, it has nothing at all to do with interfaces, and everything to do with
the level of depth in "virtuality" of the lookup, which class types can also
"suffer" from.

"Newbies" very rarely suffer from the apparant complexity because they have
no deeply held notions about how things "should" look, in contrast to the
many contrived ideas by "non-newbies". At least, this is my observation
according to experimentation - I can explain a somewhat trivial concept to a
vulnerable mind, such as my first year students, and I will be asked all the
right questions - to the junk-filled mind, I typically hear religious
refutation and 'reiteration of the guff', which is strongly guarded by ego.

Signature

Tony Morris
http://tmorris.net/

Roedy Green - 31 Jan 2006 11:56 GMT
> I'm sure (at least, I hope) you'll agree. In
>fact, it has nothing at all to do with interfaces, and everything to do with
>the level of depth in "virtuality" of the lookup, which class types can also
>"suffer" from.

Not at all. Calls to methods via  interfaces are much trickier than
calls to virtual instance methods..  For classes, the method lives at
a fixed offset in a vtbl for the calls. Bjarne Stroustrup, the father
of C++, invented the vtbl.  I am impressed all to heck since I tried
for years to invent it on my own for my own language,  coming up with
nothing nearly as elegant. All you need is the method number and the
vtbl for the class this object ACTUALLY is and away you go , not that
much worse that a call to a static method.

But for a call to a method via an interface reference the method is at
a different offset in every implementing method.  Various goofy
schemes are used from linearly scanning for it on every call, caching
the last hit or two, miniature HashMaps where you look up the
interface to get the offset of the method to implement each call...

I understand that much research has gone into efficiently implementing
calls to methods via interface references and the penalties are not
nearly so severe as in past.

People use interface references in preference to class references, to
the point making up dummy interfaces that have only one implementor
and always will.  I think you should only use interface references
when there is a likely benefit.

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Daniel Dyer - 31 Jan 2006 11:04 GMT
>> You should be using a Map reference.
>> All aside from the false assumption that HashMap is indeed useful.
>
> Does that buy you anything other than it prevents you write locking
> yourself into ArrayList by using non-Map methods?

Not sure what you mean by "write locking yourself into ArrayList".  I  
would have thought the main advantages are that you can switch  
implementations (for example change from HashMap to TreeMap) without  
having to change any other code.  The corrollary of that is that it  
prevents your logic from becoming polluted by details of the current  
implementation.

I'm interested in Tony's point about HashMap not being useful.  What do  
you mean by this?

> The penalty is you are using slower interface references rather that
> direct class ones which require a much more convoluted scheme to find
> the appropriate method on each call.

You would have to have some convincing evidence of a significant  
performance hit before I would take that into consideration.  This is  
another one of those micro-optimisations that is best left to the compiler  
or VM.

Dan.

Signature

Daniel Dyer
http://www.dandyer.co.uk

Roedy Green - 31 Jan 2006 11:58 GMT
On Tue, 31 Jan 2006 11:04:17 -0000, "Daniel Dyer"
<dan@dannospamformepleasedyer.co.uk> wrote, quoted or indirectly
quoted someone who said :

>"write locking yourself into ArrayList"

You make it difficult to change to some other List implementation. If
you use an explicit ArrayList reference, you can inadvertently use
methods defined only for ArrayList but not for other Lists.

Using an List reference, forces you write vanilla code where you can
change the collection implementation later easily.
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Chris Uppal - 31 Jan 2006 11:42 GMT
> {   private HashMap guests;
>     private static final int MAX_GUESTS = 50;
[quoted text clipped - 4 lines]
> Now, anybody looking at your code knows that the maximum number of
> guests is 50 and that the HashMap "guests" can hold that much data.

Except that that isn't true.  The name should be something like
       private static final int INITIAL_ESTIMATE_OF_PROBABLE_MAX_GUESTS = 50;
which is unwieldy in the extreme, but calling it MAX_GUESTS is actively wrong
and so must be avoided.

The real problem here is not the use of a "magic number", but the use of a
checking tool which will blindly apply the same so-called rule to every
situation, regardless of whether it is applicable.  If Checkstyle doesn't
provide a way to annotate your code to say "I have reviewed this 'problem' code
and it is fine as it is", then I'd suggest dropping Checkstyle.

   -- chris
Rick Giles - 31 Jan 2006 12:40 GMT
>> {   private HashMap guests;
>>     private static final int MAX_GUESTS = 50;
[quoted text clipped - 15 lines]
> provide a way to annotate your code to say "I have reviewed this 'problem' code
> and it is fine as it is", then I'd suggest dropping Checkstyle.

Yes, it does; see SuppressionCommentFilter at
http://checkstyle.sourceforge.net/config.html

/RG

>     -- chris
Chris Uppal - 31 Jan 2006 13:15 GMT
[me:]
> > The real problem here is not the use of a "magic number", but the use
> > of a checking tool which will blindly apply the same so-called rule to
[quoted text clipped - 5 lines]
> Yes, it does; see SuppressionCommentFilter at
> http://checkstyle.sourceforge.net/config.html

Good.  (Although the feature seems rather awkward to use for my taste.)

BTW, it appears (from more recent posts in the thread) that the issue wasn't
the use of a magic number at all, but of a variable of type HashMap.

   -- chris
slippymississippi@yahoo.com - 31 Jan 2006 14:03 GMT
> BTW, it appears (from more recent posts in the thread) that the issue wasn't
> the use of a magic number at all, but of a variable of type HashMap.

Not just HashMap:

"java.util.GregorianCalendar, java.util.Hashtable, java.util.HashSet,
java.util.HashMap, java.util.ArrayList, java.util.LinkedList,
java.util.LinkedHashMap, java.util.LinkedHashSet, java.util.TreeSet,
java.util.TreeMap, java.util.Vector"

I suppose the check is pushing me to pass/return custom data objects
that implement Java-specific data structures behind the scenes?  I
can't see how  you would write meaningful code without eventually using
one of the listed data structures, but I can see vaguely how it would
be beneficial to wrapper the implementation of these data structures.
Daniel Dyer - 31 Jan 2006 14:22 GMT
>> BTW, it appears (from more recent posts in the thread) that the issue  
>> wasn't
[quoted text clipped - 12 lines]
> one of the listed data structures, but I can see vaguely how it would
> be beneficial to wrapper the implementation of these data structures.

It seems to be advocating that you return java.util.Calendar,  
java.util.Map, java.util.Set, java.util.List, etc.  In other words, return  
the interface type rather than the concrete implementation.  No need to  
write any wrappers.

Dan.

Signature

Daniel Dyer
http://www.dandyer.co.uk

Chris Uppal - 31 Jan 2006 15:26 GMT
> Not just HashMap:
>
[quoted text clipped - 8 lines]
> one of the listed data structures, but I can see vaguely how it would
> be beneficial to wrapper the implementation of these data structures.

I think you may be reading too much into what Checkstyle is telling you.

Forgetting about Checkstyle for the moment, there are at least three different
aspects to "good" code that you touch on here.  I think it would help to
separate them out from each other.

One is that when you use, say, a HashMap in some private bit of code, how do
you declare the variable that holds the reference.  Do you write:
       Map map = new HashMap();
or:
       HashMap map = new HashMap();
The most popular school of thought is that the first form is to be preferred,
but there are dissenting opinions too.  My own opinion is that it doesn't make
much difference in this case, since -- by hypothesis -- we are talking about
private code.  If the scope of the variable is restricted, then so is the
"damage" that can possibly be caused by declaring it too narrowly.  In short it
doesn't lead to hard-to-manage coupling or brittle code since nothing much can
depend on it.

The second is to ask the same question about externally visible fields and
APIs.  For instance the return type of a public or protected method.  In /this/
case I think the standard answer is completely correct, and that -- unless you
have some special requirement -- the declared type should be as general as
possible (ideally an interface).

The last issue is the use of "bare" collections in APIs at all.  Is it "good"
to export an API which takes a List<MyObject>, or which returns a Map(String,
MyOtherObject>) ?  My answer is that its not intrinsically wrong, and is quite
often exactly the right thing to do, but that you should stop and think
whenever you find yourself doing it.  If you find yourself doing it much, or if
you find yourself expressing complicated structures as collections (in the
API), then you have quite probably gone too far, and have something of a code
mess, where the responsibility for managing and interpreting a particular raw
structure is scattered all over the code-base, rather than nicely centralised.
For instance, I wouldn't blink at:
   List<MyObject>someMethod()
I would feel mildly uneasy about
   Map<String, MyObject>someMethod()
and if I came across
   List<List<MyObject>>
then my alarm bells would be going off like sky-rockets.

Coming back to Checkstyle.  I don't know whether it distinguishes between my
first and second issues, but I think it is trying to warn you about one or the
other.  I don't see an easy way that a tool like Checkstyle could provide much
help with the third issue since the cut-off between valid uses of raw
collections, and culpable failure to create a true object, is not easy to
recognise without a true understanding of the code.  I suppose some sort of
heuristic could be defined, but I'd not expect it to be much use.

   -- chris
Roedy Green - 31 Jan 2006 14:17 GMT
On Tue, 31 Jan 2006 11:42:01 -0000, "Chris Uppal"
<chris.uppal@metagnostic.REMOVE-THIS.org> wrote, quoted or indirectly
quoted someone who said :

>Except that that isn't true.  The name should be something like
>        private static final int INITIAL_ESTIMATE_OF_PROBABLE_MAX_GUESTS = 50;
>which is unwieldy in the extreme, but calling it MAX_GUESTS is actively wrong
>and so must be avoided.

How let us say the capacity factor is .75  and you   set the size at
100.  How big is the array inside 100 or 133?  Not knowing that could
give you poor performance with the beasts.

IIRC, I have it documented but I don't recall the answer. See
http://mindprod.com/jgloss/hashmap.html
http://mindprod.com/jgloss/hashtable.html

These things need a fair bit of slop to work.  Putting 100 guests in
100 slots is like asking a ballerina to perform in a telephone booth.

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Chris Uppal - 31 Jan 2006 16:04 GMT
[me:]
> > Except that that isn't true.  The name should be something like
> >        private static final int
[quoted text clipped - 5 lines]
> 100.  How big is the array inside 100 or 133?  Not knowing that could
> give you poor performance with the beasts.

Good question!  I had assumed that the capacity was the number of
actual entries expected -- regardless of load factor.  But it seems
that the code intereprets it as the size of the array to allocate, with
the load factor determining how many of the occupied  slots are allowed
before it grows itself.

In point of fact, the implementation rounds the capacity up to a power
of two.

Anyway, none of that's clear from the documentation -- which is
unforgivable.  I don't care for the choice they've made for the
interpretation of "capacity" (in fact I think it's stupid), but leaving
the matter open to guesswork is <pauses, and chooses a less loaded
word> regrettable.

That being the case, my recommended identifier name is also wrong.  I
leave it to the reader to find a managable identifier that expresses
the concept
    Initial estimate of the probable maximum number of
    expected guests, adjusted to take account of the (undocumented)
    relationship between the "load factor" and "capacity" of a
    java.util.HashMap.

Personally, I find that just
    50
is as clear as anything ;-)

   -- chris
Roedy Green - 31 Jan 2006 19:56 GMT
On 31 Jan 2006 16:04:07 GMT, "Chris Uppal"
<chris.uppal@metagnostic.REMOVE-THIS.org> wrote, quoted or indirectly
quoted someone who said :

>In point of fact, the implementation rounds the capacity up to a power
>of two.

See http://mindprod.com/jgloss/hashmap.html
http://mindprod.com/jgloss/hashtable.html

Look for the hammerhead "gotcha" shark.

That rounding gives HashMaps a bit of extra slop, perhaps giving
HashMap an undeserved reputation for being much faster that Hashtable.
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

slippymississippi@yahoo.com - 31 Jan 2006 12:59 GMT
> What problem does it report?

I tried turning on all of the optional checks, and it returned the
IllegalType violation.  The description is:

"Checks that particular class are never used as types in variable
declarations, return values or parameters. Includes a pattern check
that by default disallows abstract classes. Rationale: Helps reduce
coupling on concrete classes."


Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.