Java Forum / General / October 2007

why people use "Map m= new HashMap()" or  "List l = new ArrayList()"?

www - 19 Oct 2007 20:10 GMT
Hi,

I saw in many places that people use:

Map m= new HashMap();

or

List l = new ArrayList();

I *rarely* see people do:

HashMap m= new HashMap();

or

ArrayList l = new ArrayList();

Is there any specific reason for such a practice?

I noticed Map does not have clone(), but HashMap does.
Richard Reynolds - 19 Oct 2007 20:39 GMT
> Hi,
>
[quoted text clipped - 17 lines]
>
> I noticed Map does not have clone(), but HashMap does.

They're using the most "general" type that they can to give them the
greatest flexibility in their design. It's a trait of OO programming in
general. Try reading some basic OO design tutorials; in particular, you might
want to pay attention to generalisation, specialisation and, in Java for
instance, the use of abstract classes and interfaces.
Mark Space - 19 Oct 2007 20:42 GMT
> I saw in many places that people use:
>
> Map m= new HashMap();

> I *rarely* see people do:
>
> HashMap m= new HashMap();

1. Convention.  As you say, many people do it this way.

2. Flexibility.  It may be easier to change the implementation of the
former than the latter.  The latter form would encourage other
programmers to code to the specific HashMap API, rather than to the
more general Map interface.  With Map, you can just plug in a new
type of map like

Map m = new LinkedHashMap();

But if you've coded to a specific type of HashMap, this change may
not be so easy to make.

3. Probably some other reasons I can't think of right now...  always it
depends on your application.
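
A minimal sketch of the flexibility point above. The class and method names (SwapDemo, buildIndex) are made up for illustration and do not come from the thread. Only one line names a concrete class, so switching between HashMap, LinkedHashMap and TreeMap should be a one-line change.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; the names are illustrative, not from the thread.
class SwapDemo {

    // Callers only ever see the Map interface.
    static Map<String, Integer> buildIndex(String... words) {
        // The one line that names a concrete class. Replacing LinkedHashMap
        // with HashMap or TreeMap requires no other change in this file.
        Map<String, Integer> index = new LinkedHashMap<String, Integer>();
        for (String word : words) {
            Integer seen = index.get(word);
            index.put(word, seen == null ? 1 : seen + 1);
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, Integer> index = buildIndex("pear", "apple", "pear");
        // LinkedHashMap iterates in insertion order.
        List<String> keys = new ArrayList<String>(index.keySet());
        System.out.println(keys + " " + index.get("pear"));
    }
}
```

Because nothing outside the declaration mentions LinkedHashMap, the compiler guarantees that no caller relies on methods outside Map.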
Lew - 19 Oct 2007 21:03 GMT
>> I saw in many places that people use:
>>
[quoted text clipped - 19 lines]
> 3. Probably some other reasons I can't think of right now...  always it
> depends on your application.

It is a best practice to prefer the most general type possible for the
compile-time type of a variable.  This provides the most bug-free and
maintainable code.

The word is "prefer", not "demand", and "possible for the ... type", which
might be a specific implementation but usually isn't.

Consider the fairly common error of using Vector for a List when you don't
need its special features.  If you declared

List <String> stuff = new Vector <String> ();

then it's much easier to change to

List <String> stuff = new ArrayList <String> ();

and later, if you see that you need specific performance characteristics,

List <String> stuff = new TreeList <String> ();

or even

List <String> stuff
   = Collections.synchronizedList( new ArrayList <String> () );

No other code will depend on non-List methods such as those of Vector, so you
are much safer in making the changes.

Read Joshua Bloch's excellent book /Effective Java/ for details on this and
many other best practices.
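
The Vector example above can be sketched roughly as follows. ListSwapDemo and describe are hypothetical names. The point is that a method written against List cannot call Vector-only methods such as elementAt() or addElement(), so every implementation mentioned above can be passed in unchanged.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Vector;

// Hypothetical demo; class and method names are illustrative only.
class ListSwapDemo {

    // Uses only methods declared by List. A call such as stuff.elementAt(0)
    // would not compile here, so the code cannot come to depend on Vector.
    static String describe(List<String> stuff) {
        stuff.add("a");
        stuff.add("b");
        return stuff.size() + ":" + stuff.get(0);
    }

    public static void main(String[] args) {
        List<String> legacy = new Vector<String>();
        List<String> plain = new ArrayList<String>();
        List<String> shared = Collections.synchronizedList(new ArrayList<String>());

        // The same method works for every implementation.
        System.out.println(describe(legacy));
        System.out.println(describe(plain));
        System.out.println(describe(shared));
    }
}
```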

Signature

Lew

Knute Johnson - 20 Oct 2007 03:34 GMT
>>> I saw in many places that people use:
>>>
[quoted text clipped - 50 lines]
> Read Joshua Bloch's excellent book /Effective Java/ for details on this
> and many other best practices.

It doesn't make a lot of sense to me.  Generics were added to closely
control the type of parameters passed to a class that we then want to
make less type specific?  It makes no sense to turn a LinkedList for
example into a plain List as you lose all of what makes it a LinkedList.
 Of course you could assign it to a Deque.

knute...
Lew - 20 Oct 2007 03:40 GMT
> It doesn't make a lot of sense to me.  Generics were added to closely
> control the type of parameters passed to a class that we then want to
> make less type specific?  It makes no sense to turn a LinkedList for
> example into a plain List as you lose all of what makes it a LinkedList.
>  Of course you could assign it to a Deque.

Well, duh, if you need the specific type you use the specific type.  No one
is saying to be stupid about it.

Signature

Lew

Knute Johnson - 20 Oct 2007 09:37 GMT
>> It doesn't make a lot of sense to me.  Generics were added to closely
>> control the type of parameters passed to a class that we then want to
[quoted text clipped - 4 lines]
> Well, duh, if you need the specific type you use the specific type.  No
> one is saying to be stupid about it.

The only two classes extended from List that you would ever want to do
this with are ArrayList and Vector.  Since everyone hates Vector,
explain to me in what actual case this would be of any benefit and not
just confusing.

knute...
Patricia Shanahan - 20 Oct 2007 11:41 GMT
>>> It doesn't make a lot of sense to me.  Generics were added to closely
>>> control the type of parameters passed to a class that we then want to
[quoted text clipped - 11 lines]
>
> knute...

You might choose LinkedList because you need e.g. a Queue. On the other
hand, it is also reasonable to choose LinkedList because you need a List
and preliminary estimates suggest that non-tail insertions and removals
will dominate over indexed access.

I would use List, except for the actual constructor call, in the second
case.
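
A small sketch of the two cases described above, assuming hypothetical names (RoleDemo, firstServed, insertAfterFirst). The same LinkedList class is constructed in both methods, but the declared type records which behaviour the code actually relies on.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

// Hypothetical demo; names are illustrative only.
class RoleDemo {

    // Case 1: the algorithm needs FIFO behaviour, so the variable is a Queue.
    static String firstServed(String... customers) {
        Queue<String> line = new LinkedList<String>();
        for (String customer : customers) {
            line.offer(customer);
        }
        return line.poll(); // null when nobody is waiting
    }

    // Case 2: only List behaviour is needed. LinkedList is picked on the
    // estimate that non-tail insertions dominate, which may later prove wrong.
    static List<String> insertAfterFirst(String extra, String... items) {
        List<String> list = new LinkedList<String>();
        for (String item : items) {
            list.add(item);
        }
        list.add(Math.min(1, list.size()), extra);
        return list;
    }

    public static void main(String[] args) {
        System.out.println(firstServed("ann", "bob"));
        System.out.println(insertAfterFirst("x", "a", "b"));
    }
}
```

In the second method, only the constructor call would change if measurement later favoured ArrayList.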

Patricia
Lew - 20 Oct 2007 14:10 GMT
>>>> It doesn't make a lot of sense to me.  Generics were added to
>>>> closely control the type of parameters passed to a class that we
[quoted text clipped - 19 lines]
> I would use List, except for the actual constructor call, in the second
> case.

I simply point to the body of literature, such as Joshua Bloch's /Effective
Java/, that makes the case for this practice.  Those authors provide much better
arguments than I will here in favor of it.  That's where I learned it -
reading the experts.  Instead of challenging me to justify the practice, write
to them.

Signature

Lew

Eric Sosman - 22 Oct 2007 17:35 GMT
Knute Johnson wrote On 10/20/07 04:37,:

>>>It doesn't make a lot of sense to me.  Generics were added to closely
>>>control the type of parameters passed to a class that we then want to
[quoted text clipped - 7 lines]
> The only two classes extended from List that you would ever want to do
> this with are ArrayList and Vector.

   The documentation describes ten core classes that
implement List, of which eight are concrete classes.

> Since everyone hates Vector,
> explain to me in what actual case this would be of any benefit and not
> just confusing.

   Do you think your program is in its final form when
you have finished debugging it?  Nobody, not even you,
will ever want to modify it?  Nobody, not even you, will
ever want to change his mind about an implementation
choice?  "Hmm: Profiling tells me that a lot of time is
spent expanding and re-expanding and re-re-expanding this
ArrayList.  We don't seem to need random access; maybe
LinkedList would be better."  This sort of thing never
happens to you?

   If not, all I can say is that you are more gifted at
predicting the future than I am.  Given my temporally
foreshortened foresight, I prefer to leave myself as many
chances to change my mind as I possibly can.  YMMV.

   ... but when Java 9 brings us SuperList ("faster than
a speeding bullet"), I bet I'll have an easier time adapting
my code to use it than you will with yours.

Signature

Eric.Sosman@sun.com

Roedy Green - 21 Oct 2007 07:34 GMT
>It is a best practice to prefer the most general type possible for the
>compile-time type of a variable.  This provides the most bug-free and
>maintainable code.

how does that help reduce bugs?  I would expect the opposite, to a
mild degree since it is less clear what classes you are actually
using.

You could for example use a LinkedList inadvertently and have it not
as obvious.

Signature

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

Lew - 21 Oct 2007 08:13 GMT
>> It is a best practice to prefer the most general type possible for the
>> compile-time type of a variable.  This provides the most bug-free and
[quoted text clipped - 6 lines]
> You could for example use a LinkedList inadvertently and have it not
> as obvious.

Actually, for reasons that you stated,
> The main reason is to avoid the temptation to use methods of HashMap
> that are not part of Map.  That way you can change to some other sort
> of Map very easily.  If you allowed yourself to use methods peculiar
> to HashMap, you would have to rewrite the code to avoid them when you
> wanted to change the Map implementation.

and others that Zig stated.

Joshua Bloch covers it quite well in /Effective Java/.

Signature

Lew

Jim Korman - 20 Oct 2007 03:01 GMT
>Hi,
>
[quoted text clipped - 17 lines]
>
>I noticed Map does not have clone(), but HashMap does.

The rule I use is if the collection I'm using is local to a method,
or totally private to a class, then I'll use the specific collection
type. If the collection is visible anywhere outside the class or
especially subclasses, I use the collection interfaces.

Jim
Lew - 20 Oct 2007 03:29 GMT
> The rule I use is if the collection I'm using is local to a method,
> or totally private to a class, then I'll use the specific collection
> type. If the collection is visible anywhere outside the class or
> especially subclasses, I use the collection interfaces.

I use "declare the most general type" even for private items.  Of course,
"most general" might mean a specific implementation if the algorithm demands it.

Signature

Lew

Wayne - 20 Oct 2007 03:41 GMT
> Hi,
>
[quoted text clipped - 7 lines]
> ArrayList l = new ArrayList();
> Is there any specific reason for such a practice?

You bet.  When the map is visible outside of the
current method (or class), you only want external
code to know that you have a Map.  They should not
care about the *implementation* of that Map, which
the implementer can then change without potentially
harming users of that code.

This is a standard practice called "data hiding"
or other more impressive terms.

Note:  when you (someday) learn about "generics",
don't fall into the trap of:
  List l = new ArrayList<String>();
Instead use:
  List<String> l = new ArrayList<String>();

Verbally you would say that "l is a list of Strings".
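
To illustrate the trap mentioned above, here is a hedged sketch (RawTypeDemo and rawListFailsLate are made-up names). With a raw declared type the compiler only warns, and the mistake surfaces later as a ClassCastException. With List<String> on the left-hand side, the bad add() would be rejected at compile time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo; names are illustrative only.
class RawTypeDemo {

    // Returns true if reading from the list fails with ClassCastException.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsLate() {
        List raw = new ArrayList<String>();  // raw declared type: element type is lost
        raw.add(Integer.valueOf(42));        // compiles, with an unchecked warning
        List<String> typed = raw;            // unchecked conversion, also compiles
        try {
            String s = typed.get(0);         // compiler-inserted cast fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // With "List<String> l = new ArrayList<String>();" the call
        // l.add(Integer.valueOf(42)) would not compile at all.
        System.out.println(rawListFailsLate());
    }
}
```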

-Wayne
Chris ( Val ) - 20 Oct 2007 07:51 GMT
> > Hi,
>
[quoted text clipped - 11 lines]
> current method (or class), you only want external
> code to know that you have a Map.

From what I have read, I don't think that is quite
the reason behind it.

> They should not
> care about the *implementation* of that Map, which
> the implementer can then change without potentially
> harming users of that code.

Yes, however; that is typical of an interface hiding
its implementation, via a layer of abstraction.

> This is a standard practice called "data hiding"
> or other more impressive terms.

But "data hiding" is an artifact of encapsulation,
generally achieved through the implementation of
private accessors - This answers a different question
to what the OP really asked.

From what I understand, the OP wants to know why
one construct "Map m= new HashMap();" is preferred
over another "HashMap m= new HashMap();".

[snip]

In my studies so far (and I have not seen anyone mention
this yet), the following constructs:

 "Map m= new HashMap();" or...
 "public void someFunc( Map m ); // Accepts HashMap, etc."

...appear to be what is known as: "Programming To An Interface".

It is said to be preferred and encouraged because it offers the
most flexible solution of all.

Cheers,

Chris
Roedy Green - 20 Oct 2007 08:19 GMT
>Map m= new HashMap();

The main reason is to avoid the temptation to use methods of HashMap
that are not part of Map.  That way you can change to some other sort
of Map very easily.  If you allowed yourself to use methods peculiar
to HashMap, you would have to rewrite the code to avoid them when you
wanted to change the Map implementation.

From a raw speed point of view, the Map idiom is not a good idea since
method calls via an interface reference are slower than calls via a
class reference.

You might say, but I will NEVER want to change the Map implementation.
None of the others are even remotely close to what I need.  Consider
what happened when Sun introduced StringBuilder to replace
StringBuffer. StringBuilder did not exist at the time many people
wrote their code.  Similarly HashMap largely replaced Hashtable, and
ArrayList replaced Vector.  What you are doing is making it easier to
plug in some new improved implementation in future that you have never
heard of now.

Signature

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

Daniel Dyer - 20 Oct 2007 11:43 GMT
> From a raw speed point of view, the Map idiom is not a good idea since
> method calls via an interface reference are slower than calls via a
> class reference.

I doubt you'd ever notice the difference.  Let the JIT take care of  
optimising this rather than doing it explicitly.

Dan.

Signature

Daniel Dyer
http://www.uncommons.org

Lew - 20 Oct 2007 14:15 GMT
>> From a raw speed point of view, the Map idiom is not a good idea since
>> method calls via an interface reference are slower than calls via a
>> class reference.
>
> I doubt you'd ever notice the difference.  Let the JIT take care of
> optimising this rather than doing it explicitly.

I was at a meeting for software developers the other night.  The guest speaker
spoke of his decades' work in software development, pointing out that *every*
time he profiled a bit of code the bottleneck was *never* where he predicted.

He used the absolutes.  Every time he looked, the bottleneck was never where
he thought.  Only by actual measurement did he find the slowdown.  He says,
every time.

Based on the rest of his presentation, he was knowledgeable and proficient in
software development.  I find his claim credible.

Signature

Lew

Michael Jung - 21 Oct 2007 10:51 GMT
> > On Sat, 20 Oct 2007 08:19:50 +0100, Roedy Green
> >> From a raw speed point of view, the Map idiom is not a good idea since
[quoted text clipped - 10 lines]
> Based on the rest of his presentation, he was knowledgeable and proficient in
> software development.  I find his claim credible.

The people around my place who are responsible for tuning performance never
predict (except the obvious), they measure and then analyse.  The thing is
that premature optimization does more harm than good. Almost every time.

Michael
Roedy Green - 21 Oct 2007 07:49 GMT
>I doubt you'd ever notice the difference.  Let the JIT take care of  
>optimising this rather than doing it explicitly.

What evidence do you have that it is possible to optimise away the
extra overhead of interface references?  When you think about what has
to happen under the hood it would be quite a feat.

See the Goldfish book:

The Java Virtual Machine
ISBN10: 1-56592-194-1
Joshua Engel
http://www.amazon.com/gp/product/1565921941?ie=UTF8&tag=canadianmindprod&linkCode=as2&camp=1789&creative=9325&creativeASIN=1565921941


I have heard that the JVMs have been gradually improving the
implementation of interface references, however if I were given some
code to optimise for speed, one thing I would try would be to convert
interface to class references where possible at least for leaf calls
where the CPU spends most of its time.

There is no virtue in deliberately choosing a slower solution unless
there is some compensating benefit.  

There is a sentiment you hear from people who have read Knuth but who
are not of his generation, reminiscent of the conspicuous consumption
of movie stars that there is some virtue in deliberately wasting
resources.

What Knuth was talking about was making code unreadable with peephole
optimisations better done by the compiler.  I don't think that many
Java programmers are old enough to have ever even seen the kind of
ghastly code he was talking about.  He did not intend for you to close
your eyes to speed considerations or deliberately choose the slowest
implementation on your first cut.  You might as well write fast code
first time out if it is readable and easy to write.

Knuth would be unlikely to bawl you out for choosing the best algorithm
first time out.  That's where you get your massive speed improvements
for the least amount of work.
Signature

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

Lew - 21 Oct 2007 08:22 GMT
>> I doubt you'd ever notice the difference.  Let the JIT take care of  
>> optimising this rather than doing it explicitly.
>
> What evidence do you have that it is possible to optimise away the
> extra overhead of interface references?  When you think about what has
> to happen under the hood it would be quite a feat.

What evidence do you have that it won't?

The problem with optimization discussions is that, by all experienced
accounts, only actual measurement of a specific application's performance will
effect an accurate assessment of what to optimize for a given program on a
given JVM.

I have seen measured jumps of 50% in performance just by switching to a
"-server" JVM configuration for the exact same applications.  Some JVMs do
escape analysis and massive inlining /at runtime/ of heavily-run segments of
code.  Published benchmarks show a block of code speeding up as the optimizer
"settles in" to the usage pattern.

You cite Knuth as if choice of an interface variable over a class variable
were always an algorithmic decision, whereas in many cases the choice of, say,
a List implementation is irrelevant to the particular algorithm.  Knuth warned
against the effects of premature optimization, which by all accounts is
defined as prior to measurement of the actual performance.

Naturally if the overhead of interface calls over class calls is an important
factor, then it makes sense to narrow the type of the relevant variables.
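
As a rough illustration of "measure first", the sketch below times the same lookups through a Map reference and through a HashMap reference. All names (DispatchTiming, sampleMap and so on) and the loop sizes are assumptions for the example. A hand-rolled loop like this is easily distorted by JIT warm-up and other effects, so treat the printed numbers as a starting point only; a purpose-built harness such as OpenJDK's JMH is generally the more reliable tool.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical timing sketch; names and sizes are illustrative only.
class DispatchTiming {

    static HashMap<Integer, Integer> sampleMap() {
        HashMap<Integer, Integer> m = new HashMap<Integer, Integer>();
        for (int i = 0; i < 1024; i++) {
            m.put(i, i);
        }
        return m;
    }

    // Calls get() through the Map interface type.
    static long sumViaInterface(Map<Integer, Integer> m, int rounds) {
        long sum = 0;
        for (int i = 0; i < rounds; i++) {
            sum += m.get(i & 1023).intValue();
        }
        return sum;
    }

    // Calls get() through the HashMap class type.
    static long sumViaClass(HashMap<Integer, Integer> m, int rounds) {
        long sum = 0;
        for (int i = 0; i < rounds; i++) {
            sum += m.get(i & 1023).intValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        HashMap<Integer, Integer> m = sampleMap();
        int rounds = 1000000;
        // Repeat so that later passes run after the JIT has had time to warm up.
        for (int pass = 0; pass < 5; pass++) {
            long t0 = System.nanoTime();
            long a = sumViaInterface(m, rounds);
            long t1 = System.nanoTime();
            long b = sumViaClass(m, rounds);
            long t2 = System.nanoTime();
            System.out.println("pass " + pass
                    + ": interface " + (t1 - t0) / 1000 + " us"
                    + ", class " + (t2 - t1) / 1000 + " us"
                    + ", sums equal: " + (a == b));
        }
    }
}
```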

Signature

Lew

Robert Klemme - 20 Oct 2007 14:35 GMT
> From a raw speed point of view, the Map idiom is not a good idea since
> method calls via an interface reference are slower than calls via a
> class reference.

Um, where do you take that from?  I cannot remember having heard that
claim before and off the top of my head I could not imagine a cause for
this.  Can you elaborate or show some evidence?  Thank you!

Kind regards

    robert
Zig - 20 Oct 2007 19:38 GMT
>> From a raw speed point of view, the Map idiom is not a good idea since
>> method calls via an interface reference are slower than calls via a
[quoted text clipped - 3 lines]
> claim before and off the top of my head I could not imagine a cause for  
> this.  Can you elaborate or show some evidence?  Thank you!

When you call a method on an interface type, the compiler will end up  
emitting an

invokeinterface

call. Whereas, calling the same method on a non-final class method will  
result in a

invokevirtual

In an invokevirtual, the runtime can simply jump straight to the class  
method table and look up the correct address for the method, and jump into  
it. In an invokeinterface, the runtime has to determine which class method  
corresponds to the interface method, thus invokeinterface is slower than  
invokevirtual.

That said, it's a detail. You should assume that 99% of your cpu time is  
spent inside the method itself, not by the runtime looking for which  
method to invoke, unless your method is just { }.

Also, if you were to write your code as:

final Map m=new HashMap()

I assume that a smart compiler will infer the correct type, recognize  
there is no need for virtual method invocation, and use direct method  
invocation, thus providing better performance than either approach.

But the only real place to care is in big number crunching algorithms,  
where you expect to invoke methods trillions of times per calculation.
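
The two instructions described above can be inspected with a small class like the following sketch (InvokeDemo and its method names are hypothetical). Both methods make the same call on the same object; only the declared parameter type differs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo; names are illustrative only.
class InvokeDemo {

    // Declared type is the interface: javac emits invokeinterface for size().
    static int sizeViaInterface(Map<String, String> m) {
        return m.size();
    }

    // Declared type is the class: javac emits invokevirtual for size().
    static int sizeViaClass(HashMap<String, String> m) {
        return m.size();
    }

    public static void main(String[] args) {
        HashMap<String, String> m = new HashMap<String, String>();
        m.put("k", "v");
        System.out.println(sizeViaInterface(m) + " " + sizeViaClass(m));
    }
}
```

After compiling, `javap -c InvokeDemo` lists the bytecode of each method. Note that javac chooses the instruction from the declared type, so `final Map m = new HashMap()` still compiles to invokeinterface; any further optimisation would have to come from the JIT at run time.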

HTH,

-Zig
Lew - 20 Oct 2007 21:34 GMT
> But the only real place to care is in big number crunching algorithms,
> where you expect to invoke methods trillions of times per calculation.

At that point, measurement (with -server on the JVM) will tell you if the
optimizer is smart enough to deal with this for you.

As usual, "loosest appropriate type" hinges on the definition of "appropriate".

Signature

Lew

Zig - 20 Oct 2007 21:35 GMT
> Hi,
>
> I saw in many places that people use:
>
> Map m= new HashMap();

The best reason I can offer here is that this should tell a maintenance  
programmer:

* The HashMap implementation is not strictly required
* Defer to documentation for implicit assumptions required of the  
implementation

Eg, if you were to instead say

import java.util.concurrent.*;

ConcurrentMap m=new ConcurrentHashMap()

This tells the user that the choice of a Hash'ed map is not strictly  
required, but the map must be safe for access by other threads. It also  
implies that callers should be safe to iterate through m.entrySet(), but  
those methods should beware that the iterator could return entries that  
did not exist when the iterator was created, or might not return entries  
that did exist when the iterator was created.

Likewise, using

SortedMap m=new TreeMap()

tells the reader that your algorithm implicitly requires that entries be  
sorted, and may fail spuriously if the implementation changes.

In the end,

Map m=new HashMap()

This says that, in the future, this algorithm will still work correctly if  
the keys are converted to Enums and the map is replaced with the faster  
EnumMap implementation as

Map m=new EnumMap(MyEnum.class);

However, the same programmer should really read the documentation and  
scratch their head before attempting to refit this code with  
Collections.synchronizedMap, Collections.unmodifiableMap, an  
implementation of ConcurrentMap, etc.
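
The idea that the declared type documents what the algorithm requires can be sketched like this. DeclaredTypeDemo, Color, smallestKey and total are invented names, and Color merely stands in for the MyEnum mentioned above.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical demo; names are illustrative only.
class DeclaredTypeDemo {

    enum Color { RED, GREEN, BLUE } // stand-in for MyEnum

    // SortedMap in the signature states that the code relies on key order.
    // firstKey() is not available on a plain Map.
    static String smallestKey(SortedMap<String, Integer> scores) {
        return scores.firstKey();
    }

    // Plain Map: any implementation will do, including EnumMap.
    static int total(Map<Color, Integer> counts) {
        int sum = 0;
        for (int n : counts.values()) {
            sum += n;
        }
        return sum;
    }

    public static void main(String[] args) {
        SortedMap<String, Integer> scores = new TreeMap<String, Integer>();
        scores.put("bob", 2);
        scores.put("ann", 5);
        System.out.println(smallestKey(scores));

        Map<Color, Integer> counts = new EnumMap<Color, Integer>(Color.class);
        counts.put(Color.RED, 3);
        counts.put(Color.BLUE, 4);
        System.out.println(total(counts));
    }
}
```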

HTH,

-Zig
www - 22 Oct 2007 15:34 GMT
Thank you all for all your responses.

However, I still didn't get the answer. My old code is like:

public class MyClass {
    private Map<String, String> _map;

    public MyClass() {
        _map = new HashMap<String, String>();
    }

    public Map getMap() {
        return _map.clone();   //Oops! wrong! no clone() method for Map
    }
}

Now, my new code is like:

public class MyClass {
    private HashMap<String, String> _map;

    public MyClass() {
        _map = new HashMap<String, String>();
    }

    public HashMap getMap() {
        return ((HashMap)_map.clone());   //clone() is available for HashMap
    }
}

Do you see anything wrong with my new code? Should I keep the old code
and re-write getMap() like:

public Map getMap() {
    Map<String, String> tempMap = new HashMap<String, String>();
        //then copy everything from _map into tempMap

    return tempMap;
}

Thank you very much.
Patricia Shanahan - 22 Oct 2007 16:20 GMT
...
> public Map getMap() {
>     Map<String, String> tempMap = new HashMap<String, String>();
>         //then copy everything from _map into tempMap
>
>     return tempMap;
> }

public Map<String,String> getMap(){
  return new HashMap<String,String>(_map);
}

Alternatively, if callers just need a view of _map without being able to
modify it, and should see changes to _map:

public Map<String,String> getMap(){
  return Collections.unmodifiableMap(_map);
}
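
The difference between the two getMap() variants above can be seen in a short sketch. Registry, register, snapshot and view are hypothetical names standing in for MyClass and getMap(). The copy is an independent snapshot, while the unmodifiable wrapper is a read-only view that keeps tracking the original map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo; names are illustrative only.
class Registry {
    private final Map<String, String> entries = new HashMap<String, String>();

    public void register(String key, String value) {
        entries.put(key, value);
    }

    // Independent copy: later changes to entries are not visible to the caller.
    public Map<String, String> snapshot() {
        return new HashMap<String, String>(entries);
    }

    // Read-only live view: sees later changes, rejects modification.
    public Map<String, String> view() {
        return Collections.unmodifiableMap(entries);
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register("a", "1");
        Map<String, String> snapshot = registry.snapshot();
        Map<String, String> view = registry.view();
        registry.register("b", "2");

        System.out.println(snapshot.size() + " vs " + view.size());
        try {
            view.put("c", "3");
        } catch (UnsupportedOperationException e) {
            System.out.println("view is read-only");
        }
    }
}
```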

Patricia
www - 22 Oct 2007 17:01 GMT
> ...
>> public Map getMap() {
[quoted text clipped - 16 lines]
>
> Patricia

Thank you very much. That is great!
Lew - 22 Oct 2007 16:59 GMT
> However, I still didn't get the answer. My old code is like:

You did get the answer to the question that you asked, many times over.

> public class MyClass {
>     private Map<String, String> _map;

By convention, variable names should not include underscores unless they
represent compile-time constants.

>     public MyClass() {
>         _map = new HashMap<String, String>();
[quoted text clipped - 15 lines]
>
>     public HashMap getMap() {

You can still return a type Map, and you must not omit the generic declaration.

>         return ((HashMap)_map.clone());   //clone() is available for
> HashMap

But you are applying the cast to the result of clone(), not the variable
'_map' here.

Also, in Java casting and generics don't mix well.

>     }
> }
>
> Do you see anything wrong with my new code?

Yes.

How about this (which I've compiled, but not run)?

 import java.util.HashMap;
 import java.util.Map;

 public class MapCloner
 {
   private Map <String, String> stuff = new HashMap <String, String> ();

   /** Get the <code>stuff</code> map.
    * @return <code>Map &lt; String, String &gt;</code> stuff.
    */
   public Map <String, String> getStuff()
   {
     return new HashMap <String, String> ( stuff );
   }

   // TODO methods to put values into the Map, etc., go here

   /** Main method.
    * @param args <code>String []</code> program arguments.
    */
   public static void main( String [] args)
   {
     // TODO code application logic here
   }
 }

Like clone(), the HashMap(Map<? extends K,? extends V> m) constructor does a
shallow copy.
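
What "shallow copy" means in practice can be sketched as follows (ShallowCopyDemo, shallowCopy and deepCopy are made-up names). With immutable String values, as in the Map<String, String> above, a shallow copy is normally all that is needed. The distinction only matters when the values themselves are mutable, as with the lists in this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo; names are illustrative only.
class ShallowCopyDemo {

    // Copies the key-to-value associations; the value objects are shared.
    static Map<String, List<String>> shallowCopy(Map<String, List<String>> source) {
        return new HashMap<String, List<String>>(source);
    }

    // Copies each value list as well, so nothing mutable is shared.
    static Map<String, List<String>> deepCopy(Map<String, List<String>> source) {
        Map<String, List<String>> copy = new HashMap<String, List<String>>();
        for (Map.Entry<String, List<String>> entry : source.entrySet()) {
            copy.put(entry.getKey(), new ArrayList<String>(entry.getValue()));
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<String, List<String>> source = new HashMap<String, List<String>>();
        List<String> values = new ArrayList<String>();
        values.add("x");
        source.put("k", values);

        Map<String, List<String>> shallow = shallowCopy(source);
        Map<String, List<String>> deep = deepCopy(source);
        source.get("k").add("y"); // mutate a value held by the original map

        System.out.println(shallow.get("k").size() + " vs " + deep.get("k").size());
    }
}
```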

Signature

Lew

Adam Maass - 23 Oct 2007 03:55 GMT
> Thank you all for all your responses.
>
[quoted text clipped - 32 lines]
> Map<String, String> tempMap = new HashMap<String, String>();
>         //then copy everything from _map into tempMap

There is a copy constructor on HashMap, so the copy is very straightforward
to make:

Map<String, String> tempMap = new HashMap<String, String>(_map);

Additionally, this may be an instance of needing to return a Map such that
the caller can't modify the internal state of the this object. For this,
there is:

Collections.unmodifiableMap(_map)

-- Adam Maass


©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.