Hi,
I have a central cache that is heavily accessed. So far, I'm using a
hashmap protected by a read-write lock. With many threads, I seem to see
contention even on this lock (maybe due to Linux scheduling).
Since the cache is not too big, and rarely modified, I've changed the code
to use volatile:
volatile HashMap cache;
get(key) {
result = cache.get(key);
if(result == null) {
result = compute();
asynchronously(put(key, result));
}
return result;
}
put(key, value) {
newCache = new HashMap(cache);
newCache.put(key, value);
cache = newCache;
}
put is garantueed not to be called concurrently.
I see a massive improvement in scalability and throughput.
My question: is this pattern correct ?
Thanks
Matthias

Signature
Matthias Ernst
Software Engineer
CoreMedia - Smart Content Technology
Erwin Molendyk - 05 Aug 2003 11:55 GMT
> I have a central cache that is heavily accessed. So far, I'm using a
> hashmap protected by a read-write lock. With many threads, I seem to see
[quoted text clipped - 19 lines]
>
> My question: is this pattern correct ?
I'm think the volatile keyword will make the reference to the hashmap object
volatile, and not the hashmap object itself. This would mean unsynchronized
updates might not be seen on other CPU's. You might want to have a look at
Multi-Read-Exclusive-Write synchronization, which seems to be exactly what
you need.
Hope this helps.
Cheers,
Erwin
Matthias Ernst - 05 Aug 2003 14:46 GMT
> My question: is this pattern correct ?
I found there is a related discussion wrt double-checked locking here:
http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
(Scroll down to the end of the page: "Fixing Double-Checked Locking using
Volatile"). As it seems, my solution is not valid under the current memory
model but will be.
Matthias

Signature
Matthias Ernst
Software Engineer
CoreMedia - Smart Content Technology
Doug Pardee - 05 Aug 2003 16:48 GMT
> I have a central cache that is heavily accessed. So far, I'm using a
> hashmap protected by a read-write lock. With many threads, I seem to see
[quoted text clipped - 8 lines]
>
> My question: is this pattern correct ?
No.
Basically what Erwin already said: the "volatile" only assures that
the reference variable "cache" is properly shared. It does not assure
that the HashMap object is properly shared, nor does it assure that
the objects contained IN the HashMap are properly shared, nor any
objects that they reference, nor any objects that those objects
reference, nor...
Chances are, if you're running this on a single-CPU machine, it
probably works. The problems mainly crop up with multi-CPU systems,
because each CPU has its own partial cache of memory contents.
When you're sharing anything over 32 bits between, you *have* to use
"synchronized". There is no safe way around it, because "synchronized"
is the only way to assure that all data is shared between threads and
the CPU caches are updated.
Since the pseudo-code that you gave implies that each update results
in a new and immutable HashMap, you don't really need the mutex
locking effect of "synchronized", so you could synchronize on a
thread-specific object and get the memory-barriers while avoiding the
contention. If you do this, be sure to put some comments in there so
that people understand why you did such an oddball thing, and that it
is critical that the HashMap (and the objects that it contains) be
immutable once its reference is stored into "cache".
Matthias Ernst - 05 Aug 2003 17:05 GMT
> you could synchronize on a
> thread-specific object and get the memory-barriers while avoiding the
> contention.
Now that's cool. Thanks for the tip.
So far, I've helped myself with two methods 'enter' and 'leave' that will
store the current cache into a thread-local. That works fine, and a call to
both methods is easy to ensure using a servlet filter.
Thx again
Matthias

Signature
Matthias Ernst
Software Engineer
CoreMedia - Smart Content Technology
Roedy Green - 05 Aug 2003 20:04 GMT
>I have a central cache that is heavily accessed. So far, I'm using a
>hashmap protected by a read-write lock. With many threads, I seem to see
>contention even on this lock (maybe due to Linux scheduling).
This may be what you have already done. You have two trees you
maintain. However one is designated the active tree for lookups and
the other the inactive tree for updates. After an update you swap
trees and update again. You only have to atomically update a single
volatile pointer to the active tree. Therefore there need be no lock
on that.
It is not quite as wasteful as it sounds since the lookup hash is
duplicated, not the objects themselves.
--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
Matthias Ernst - 07 Aug 2003 09:25 GMT
> You only have to atomically update a single
> volatile pointer to the active tree. Therefore there need be no lock
> on that.
Well, that was my original question, and as it seems, a simple update of a
volatile object reference does *not* do the job. The required memory
barrier that hinders other threads to 'read ahead' the pointed-to memory
only happens on a 'synchronized'.
Matthias

Signature
Matthias Ernst
Software Engineer
CoreMedia - Smart Content Technology
Roedy Green - 07 Aug 2003 20:40 GMT
>Well, that was my original question, and as it seems, a simple update of a
>volatile object reference does *not* do the job. The required memory
>barrier that hinders other threads to 'read ahead' the pointed-to memory
>only happens on a 'synchronized'.
Hmm. while if that is true, you only have to synchronise long enough
to look at the pointer, not to do your entire tree sniffing business.
--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.