> Casting is a very common operation in Java. It does nothing, just
> check that you are not cheating, but it is time consuming. It would
> be a fruitful place to optimise.
For classes (single inheritance), not interfaces (multiple inheritance),
the following trick provides constant time class cast/instanceof:
Organize the class type hierarchy into a tree with Object at the root,
classes extending object children of the root, etc. Order the children
of each node somehow (alphabetically, load order, whatever) so that you
can speak of an ordering. Walk the tree, from left to right,
maintaining a counter as you first enter or last exit each node,
and tag each node with its enter and exit numbers. For example, if
your world is:
Object
A extends Object
X extends A
O extends X
B extends Object
Y extends B
C extends Object
Z extends C
then
class enter exit
Object 1 16
A 2 7
B 8 11
C 12 15
X 3 6
Y 9 10
Z 13 14
O 4 5
P is a superclass-eq of Q if and only if enter(P) <= enter(Q) <= exit(P)
There's some monkeybusiness (an additional indirection to get the
enter/exit numbers) to deal with dynamic class loading, but otherwise,
constant time, and constant space overhead.
Of course, if you know that the class being cast to is final, or is
effectively final, then you can simply compare for equality.
The technique above was described by Paul Dietz (STOC '80 or '81,
I can never get the reference straight), and is also used in Modula-3
implementations.
Interfaces are much more interesting.
David Chase
xarax - 02 Aug 2003 17:59 GMT
> > Casting is a very common operation in Java. It does nothing, just
> > check that you are not cheating, but it is time consuming. It would
> > be a fruitful place to optimise.
/snip/
> There's some monkeybusiness (an additional indirection to get the
> enter/exit numbers) to deal with dynamic class loading, but otherwise,
> constant time, and constant space overhead.
/snip/
I don't think you mean "constant time", but "linear time". Constant
time means that the look-up takes EXACTLY the same amount of
processor time regardless of how deeply inherited the class. Since
your algorithm starts with a tree walk, then the time cannot be
constant, because trees vary in size.
It is possible, however, to arrange a tree of class/interface
names in such a way that the time to find a matching node is
directly proportional to the length of the name being sought
(as opposed to being dependent on the size of the tree). It's
a variant of a patent that I invented for data compression. It
will also work with multiple inheritance of interfaces in
linear time proportional to the length of the name being sought.
I doubt that any JVM uses such an implementation, though, but
who knows?
Roedy Green - 02 Aug 2003 20:48 GMT
>Of course, if you know that the class being cast to is final, or is
>effectively final, then you can simply compare for equality.
that would collapse to the case where the entry and exit numbers are
equal.
I wonder if any JVM actually does it this way, perhaps counting by
tens to leave room for insertions before another tree walk is needed.
The big problem is with new classes showing up late, especially if you
have JITed. You would have to rejit the code if you put the constants
inline.
--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
David Chase - 03 Aug 2003 01:48 GMT
>>Of course, if you know that the class being cast to is final, or is
>>effectively final, then you can simply compare for equality.
[quoted text clipped - 8 lines]
> have JITed. You would have to rejit the code if you put the constants
> inline.
That's one way. Another approach involves an additional indirection
through an array of entry/exit numbers -- each class gets a unique
small number when it is loaded, and that number is used to index into
the array. Class cast loads the address of that array, ONCE, at the
start of each class cast operation, and uses it to complete the
operation. When a renumbering is required, a new array is generated,
and stored into place (there's some memory model foolishness that must
be dealt with, but the plan is to put all the expense onto class loading
since that is rare compared to cast). The old array is reclaimed by
garbage collection.
David
Roedy Green - 03 Aug 2003 03:08 GMT
>Another approach involves an additional indirection
>through an array of entry/exit numbers
Yet another example of the adage, there's no computer science problem
than can't be solved by sufficient layers of indirection.
It looks like the Jalepeno people use "my" technique checking a class
number at a given depth. They show the assembler code they use to do
the cast. Reminds me of writing the BBL/Abundance interpreter where I
would code things many ways and count cycles, or ever turn a brute
force try all possibilities optimiser on it to find any other
sequences that gave the same net result. Nice to know somebody is
minding the store.
I suppose some day, if Java is successful, hardware will adapt to make
things like instanceof and bounds checks fast.
I'd imagine the bounds checking could go on in parallel with the
fetch.
--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
Chris Smith - 04 Aug 2003 15:57 GMT
> For classes (single inheritance), not interfaces (multiple inheritance),
> the following trick provides constant time class cast/instanceof:
[quoted text clipped - 4 lines]
> enter/exit numbers) to deal with dynamic class loading, but otherwise,
> constant time, and constant space overhead.
Another tehnique (which I first encountered in Appel's Modern Compiler
Implementation in C, but may have been published earlier) also provides
constant time/space instanceof and checkcast -- for classes not
interfaces -- with two differences: it is limited (to a variable amount)
in the depth of the class hierarchy that it can deal with, but it much
more easily accomodates dynamic classloading.
In this technique, you reserve space for a certain number of pointers in
the class descriptor. You then start from the base of the object
hierarchy, and store references to all the superclasses down to the
current class, or the maximum depth for which you've reserved space. So
the class descriptor might have (just for example) five slots, which
would be pointers to superclass as follows (and I'm omitting Object,
which is always the same and doesn't require a slot):
1. java.awt.Component
2. java.awt.Container
3. java.awt.Window
4. javax.swing.JWindow
5. (null)
If the JIT compiler encounters an instanceof or checkcast for
java.awt.Window, it *knows* that java.awt.Window is at level 3 -- it
always will be -- and it can just index directly into level 3 of the
table to check if this object is an instance of that class. If the
depth of the class is deeper than this chart, that's also know by the
JIT compiler, and it can generate a tree-walking kind of instanceof
check.
This does require some substantial amount of space to store class
structure in the class descriptors. Five is probably too little, but I
doubt you'd often exceed eight or nine, and this is per-class of course,
not per-object. David's index technique really only requires that the
JIT compiler maintain two numbers per class for the instanceof bounds,
and one number in the descriptor to be compared, so that's a bit of an
advantage.
> Interfaces are much more interesting.
Indeed.

Signature
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.
Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation