Java Forum / General / December 2005

Java VM Address Space

a - 18 Dec 2005 21:20 GMT
Hello Everybody.
I heard that security is not covered completely in the Java resources available
from Sun Microsystems.
I'm trying to understand, then, several things related to the internals of the Java
VM.
Since the codes for the VM are like ordinary machine codes, do they lie in the
same address space for all loaded classes?
Is that address space a chunk of segmented memory 4 GB in size? Or does that
code lie in the address space of the Virtual Machine in the form of instantiated
objects of the VM itself?
Actually, how many kinds of binding (linking) exist?
Late binding, compile-time binding, and dynamic binding: are they all present in
Java?

Jack.
Roedy Green - 19 Dec 2005 00:46 GMT
>Since the codes for the VM are like ordinary machine codes, do they lie in the
>same address space for all loaded classes?
>Is that address space a chunk of segmented memory 4 GB in size? Or does that
>code lie in the address space of the Virtual Machine in the form of instantiated
>objects of the VM itself?

The virtual machine is quite unlike an Intel machine.  Addressing
within a method is relative to start of the method.  Addressing within
an object is by field name.  Addressing on the stack is by stack slot
relative to the current invocation, not letting you peek at the
caller's stack. References are black boxes designed to fetch an
object.  You can't do arithmetic on them.

There is great flexibility in how the JVM actually works. It can even
use 64 bit references if it wants.
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Jack - 24 Dec 2005 08:54 GMT
> The virtual machine is quite unlike an Intel machine.  Addressing
> within a method is relative to start of the method.  Addressing within
[quoted text clipped - 8 lines]
> Canadian Mind Products, Roedy Green.
> http://mindprod.com Java custom programming, consulting and coaching.

While looking at a hex view of a class file I found the limits of 16 bits
(pool entries) and 32 bits for lengths.

Therefore classes should be distinguished by the version number. Finally the
class is not compiled code at all.
The class seems to be, and it is, the same source file, but without human
readable crap.

From your answer I can form the answers to my questions. Please correct me
if I'm wrong.

The Java VM under Windows obviously uses its own address space and keeps
classes in the form of instantiated objects (which are not directly related to
Java objects).
All three kinds of binding are used in Java technology.
Roedy Green - 24 Dec 2005 11:55 GMT
> Finally the
>class is not compiled code at all.
>The class seems to be, and it is, the same source file, but without human
>readable crap.

It is not Pentium native machine code. It is machine code for the Java
Virtual machine. It is the native machine code for PicoJava chips. See
http://mindprod.com/jgloss/picojava.html

It is not source with the comments removed, even though much of it is
intelligible in a hex viewer. See http://mindprod.com/jasm.html

Try using javap -c to disassemble some class files. You will see the
byte code resembles FORTH, a postfix, stack-based language.

see http://mindprod.com/jgloss/disassembler.html
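
To make the stack-based flavour concrete, here is a minimal sketch. The class name AddDemo is made up for illustration. After compiling it with javac, running "javap -c AddDemo" should list the four instructions shown in the comment for the add method.

```java
// AddDemo is a hypothetical class used only for illustration.
final class AddDemo {
    // javap -c shows this body as postfix, stack-based code:
    //   iload_0   push the first int argument
    //   iload_1   push the second int argument
    //   iadd      pop both, push their sum
    //   ireturn   pop the sum and return it
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
    }
}
```

The operands are pushed first and the operation comes last, which is the same ordering FORTH uses.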

Roedy Green - 24 Dec 2005 11:58 GMT
>While looking at a hex view of a class file I found the limits of 16 bits
>(pool entries) and 32 bits for lengths

There are a number of limits in the class file format which are not in
the running JVM itself.  For example, inside the JVM you can have 64
bit references, and the class file format stays the same. The size of the compiled
native code is not limited to 64K for a method. Inside the JVM, you
can have all the strings that will fit in the address space.
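
The 16-bit field mentioned above can be seen by reading the first few bytes of any class file. The following is a rough sketch, assuming the class file is reachable as a resource next to the class. ClassHeaderDemo and readHeader are hypothetical names, not part of any library.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// ClassHeaderDemo is a hypothetical helper, not part of any library.
final class ClassHeaderDemo {
    /** Returns { magic, minor version, major version, constant_pool_count }. */
    static long[] readHeader(Class<?> c) {
        // Resource name relative to the class's own package, e.g. "Foo.class".
        String name = c.getName();
        String resource = name.substring(name.lastIndexOf('.') + 1) + ".class";
        try (InputStream raw = c.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IOException("class file not found: " + resource);
            }
            DataInputStream in = new DataInputStream(raw);
            long magic = in.readInt() & 0xFFFFFFFFL; // u4, 0xCAFEBABE
            int minor = in.readUnsignedShort();      // u2
            int major = in.readUnsignedShort();      // u2
            int poolCount = in.readUnsignedShort();  // u2, hence the 16-bit limit
            return new long[] { magic, minor, major, poolCount };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] h = readHeader(ClassHeaderDemo.class);
        System.out.printf("magic=%X major=%d constant_pool_count=%d%n", h[0], h[2], h[3]);
    }
}
```

The constant pool count is stored as an unsigned 16-bit value, so it cannot exceed 65535 no matter how much memory the running JVM has.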

Chris Smith - 24 Dec 2005 16:01 GMT
> Finally the class is not compiled code at all.

By everyone else's definition of the word "compile" Java bytecode is
definitely compiled.  Perhaps you meant to say that it is not native, in
which case you'd be right.  A second stage of the compiler (the JIT)
generally runs as the program is running to translate that bytecode into
machine code for the processor on which the code will run.

> The class seems to be, and it is, the same source file, but without human
> readable crap.

The class file contains a set of VM-level methods and fields and
constants, which mostly resemble the set written in the Java language.  
However, the *contents* of the methods (the bytecode) bears only some
resemblances to Java.  It does not radically depart from the memory
model, but it throws out decisions about choice of control flow
constructs, block scope, order of operation rules, variable names, etc.

> The Java VM under Windows obviously uses its own address space and keeps
> classes in the form of instantiated objects (which are not directly related to
> Java objects).

I'm not at all sure what you mean by "instantiated objects", which
you've said a couple of times.  I get the feeling from context that you
don't mean the same thing as most other people would when they say the
same thing.

All major Java virtual machines for conventional CPUs with MMUs reside
in a single address space, and load all of their code into that address
space.  I am not aware of any JVM that makes use of shared memory or
other IPC between separate processes (except insofar as pre-NPTL
versions of Linux tend to treat threads as multiple processes that share
everything).  There will exist in that address space one or more forms
of the code: the bytecode itself, a natively compiled (JITed) version,
and potentially other JITed versions if the JIT
compiler determines that it's worth recompiling for special cases to
improve efficiency.

> All three kinds of binding are used in Java technology.

I don't know what distinction you're making between the three kinds of
binding.  In fact, you seem to have made a habit of assuming that
everyone shares your understanding of terminology.  In reality,
terminology differs from context to context; and textbook authors,
technical presenters, etc. often make it up on the fly to express the
distinctions that happen to be useful for them, right now.

This might help.  There are the following different bytecodes used in
the JVM to invoke methods:

   invokevirtual
   invokestatic
   invokespecial
   invokeinterface

(and the special cases: invokevirtual_quick, invokenonvirtual_quick,
invokesuper_quick, invokestatic_quick, invokeinterface_quick,
invokevirtualobject_quick, invokevirtual_quick_w)

Ignoring the special cases, the four invoke opcodes could be said to
correspond loosely to kinds of method binding, but I don't think that
they correspond directly to the terms you've used.  The most dynamic is
invokeinterface.  Two of the four -- virtual and interface -- are
polymorphic, whereas the other two -- special and static -- can be
completely resolved earlier in the class loading and JIT process.  The
only difference between special and static is whether to pass an
implicit this pointer, which is irrelevant to a discussion of method
binding.
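
As a rough illustration of the four opcodes, here is a small sketch. Shape, Square and InvokeDemo are made-up names, and the comments name the opcode javac emits for each call, which javap -c should confirm.

```java
// All names here (Shape, Square, InvokeDemo) are hypothetical.
interface Shape {
    int sides();
}

class Square implements Shape {
    Square() {
        super();                        // invokespecial Object.<init>
    }

    @Override
    public int sides() {
        return 4;
    }

    static String label() {
        return "square";
    }
}

final class InvokeDemo {
    static String describe() {
        Square sq = new Square();       // new, then invokespecial Square.<init>
        Shape sh = sq;
        int viaClass = sq.sides();      // invokevirtual Square.sides
        int viaInterface = sh.sides();  // invokeinterface Shape.sides
        String name = Square.label();   // invokestatic Square.label
        return name + ":" + viaClass + ":" + viaInterface;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints square:4:4
    }
}
```

The same method body, sides(), is reached through two different opcodes depending only on the static type of the variable used for the call.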

Note that the term "compile-time" is problematic when discussing the
JVM, since there are two compilers involved -- the source compiler, and
the JIT compiler.  No methods are ever completely bound by the source
compiler; it always embeds the names of methods into the class file and
leaves the runtime class loader to resolve them.  The JIT compiler will
typically completely resolve an invokespecial or invokestatic all the
way to a direct memory address for a jump or jsr instruction.  It will
resolve invokevirtual to a simple indirect jump (as in, jump to
[obj_base + offset]).  Unless it has some specialized global knowledge
of the application, the JIT will generate table lookup code for an
invokeinterface instruction.
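
The idea of storing a name and resolving it once at run time can be sketched with reflection. This is only a toy analogy and not how any real JVM is written. LazyMethodRef and its methods are hypothetical names.

```java
import java.lang.reflect.Method;

// LazyMethodRef is a toy analogy for a symbolic reference, not real JVM code.
final class LazyMethodRef {
    private final String className;
    private final String methodName;
    private final Class<?>[] parameterTypes;
    private Method resolved;   // stays null until the first use
    private int lookups;       // counts how often the by-name search ran

    LazyMethodRef(String className, String methodName, Class<?>... parameterTypes) {
        this.className = className;
        this.methodName = methodName;
        this.parameterTypes = parameterTypes;
    }

    // Search by name only on first use, then reuse the cached result.
    Method resolve() {
        if (resolved == null) {
            try {
                resolved = Class.forName(className).getMethod(methodName, parameterTypes);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot resolve " + className + "." + methodName, e);
            }
            lookups++;
        }
        return resolved;
    }

    Object invokeStatic(Object... args) {
        try {
            return resolve().invoke(null, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    int lookups() {
        return lookups;
    }
}
```

For example, new LazyMethodRef("java.lang.Math", "max", int.class, int.class) can be invoked many times, yet the search by name runs only once, so the cost of name lookup does not grow with the number of calls.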

All of this, of course, is typical implementation.  The JVM is free to
choose any alternate implementation, so long as the observed behavior is
the same.

Now, I'm confused as to why this would have any impact on security...
but there it is, anyway.

Signature

www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Jack - 25 Dec 2005 16:06 GMT
Thank you for the help; it clarifies many things for me.

Now I have to learn what FORTH and postfix are.

Of course I understand that Java bytecodes are not the physical system's
processor's "byte codes".
So if language constructs are translated to Java bytecodes at compile
time, the addresses (of fields and methods) cannot be determined until
the class is loaded into the VM.
Thus they have to be found by either string names or indexes.
Even after class loading the addresses cannot be found directly.
The VM will have to look up the real address of a field or method (even though it is not
executed by a real CPU).
I suppose this could be called call-time binding.
Such a thing can be done only with a database that consists of names
and addresses. The database should be indexed for faster search. Thus the
bigger the program, the slower it will run. But that is not always true.
But after checking the access rights, the substitution of real addresses
should be totally safe.
But there is still no freedom to use arbitrary numbers, which could potentially lead
to security risks.

Thank You a lot. I'll go learn bit more.
Roedy Green - 25 Dec 2005 22:54 GMT
>Now I have to learn what FORTH and postfix are.

In Forth every word is a verb. You evaluate strictly left to right, in
a stack based machine.

So for example

2  3  +  .

is a Forth program that prints 5.

It works like this

2 is a verb that pushes 2 to the stack

3 is a verb that pushes 3 to the stack. Your stack now looks like:
 2 3 ( with 2 deeper in the stack )

+ is a verb that adds the top two stack elements, discards them and
pushes the sum to the stack. Your stack now looks like:
5

. is a verb that displays the top of stack as an int and discards it.
Your stack is now empty.

Postfix is the natural order of calculation.  You have to calculate
operands before you can do an operation on them.  It is used in
PostScript, HP calculators, and the Java JVM.
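
The left-to-right evaluation described above can be sketched in a few lines of Java. This is a minimal sketch and not a real Forth. PostfixDemo is a hypothetical name, only integers and + - * / are handled, and the result is returned instead of being printed with the "." word.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// PostfixDemo is a hypothetical sketch covering only integers and + - * /.
final class PostfixDemo {
    static int eval(String program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String word : program.trim().split("\\s+")) {
            if (word.matches("-?\\d+")) {
                stack.push(Integer.parseInt(word)); // a number pushes itself
                continue;
            }
            int right = stack.pop();                // top of stack
            int left = stack.pop();                 // one deeper
            switch (word) {
                case "+": stack.push(left + right); break;
                case "-": stack.push(left - right); break;
                case "*": stack.push(left * right); break;
                case "/": stack.push(left / right); break;
                default: throw new IllegalArgumentException("unknown word: " + word);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(eval("2 3 +")); // prints 5
    }
}
```

No parentheses or precedence rules are needed, because every operator finds its operands already on the stack.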

Another example

a = b + ( c - d ) / e

becomes in postfix

b c d - e / + a !

where ! is the store operator
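
The JVM follows the same order. As a sketch, with ExprDemo and compute as made-up names, javac should compile the expression into the instruction sequence shown in the comment, which can be checked with javap -c.

```java
// ExprDemo is a hypothetical class used only for illustration.
final class ExprDemo {
    // javap -c lists the body in the same order as the postfix form:
    //   iload_0     push b
    //   iload_1     push c
    //   iload_2     push d
    //   isub        c - d
    //   iload_3     push e
    //   idiv        (c - d) / e
    //   iadd        b + (c - d) / e
    //   istore 4    store into a, like "a !"
    //   iload 4
    //   ireturn
    static int compute(int b, int c, int d, int e) {
        int a = b + (c - d) / e;
        return a;
    }

    public static void main(String[] args) {
        System.out.println(compute(10, 8, 2, 3)); // prints 12
    }
}
```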

if ( a < b )
   {
   c = d;
   f = g;
   }
else
  {
  c = e;
  }

becomes in FORTH

 a b < IF  d   g f !  ELSE e THEN c !

It looks very strange at first, but if you work with it for a while it
becomes easier than Java notation because there are no precedence
rules.  Everything proceeds strictly left to right.

Jack - 26 Dec 2005 22:30 GMT
Wow! Cool!
So what protection is used to prevent the stack overflow?

It seems it should be: a b < IF  d c !  g f !  ELSE e THEN c !

> Postfix is the natural order of calculation.  You have to calculate
> operands before you can do an operation on them.  It is used in
[quoted text clipped - 15 lines]
>
>   a b < IF  d   g f !  ELSE e THEN c !
Roedy Green - 27 Dec 2005 03:33 GMT
>So what protection is used to prevent the stack overflow?
>
>It seems it should be: a b < IF  d c !  g f !  ELSE e THEN c !

FORTH protection comes from writing very small routines and debugging
each one exhaustively before you move on to the next. This technique
is much more powerful than you would imagine. In Forth there are no
safety nets.  You can do an explicit stack check with ?STACK. I
designed BBL with a bit of slop in the stacks so small amounts of
overflow/underflow would do no damage during debugging.
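
An explicit check in the spirit of ?STACK can be sketched as a bounded stack that tests its depth on every push and pop. This is only an illustration under that assumption, not Forth's or BBL's actual mechanism. CheckedStack is a hypothetical name. The JVM itself takes a different route: each method in a class file declares a maximum operand stack depth, and the bytecode verifier rejects code that could overflow or underflow it.

```java
// CheckedStack is a hypothetical illustration of explicit depth checks.
final class CheckedStack {
    private final int[] cells;
    private int depth;

    CheckedStack(int capacity) {
        cells = new int[capacity];
    }

    void push(int value) {
        if (depth == cells.length) {
            throw new IllegalStateException("stack overflow");
        }
        cells[depth++] = value;
    }

    int pop() {
        if (depth == 0) {
            throw new IllegalStateException("stack underflow");
        }
        return cells[--depth];
    }

    int depth() {
        return depth;
    }
}
```

Comparing depth() before and after a routine is one way to confirm that the routine leaves the stack balanced to its spec.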

The most common bug in Forth is to leave something on the stack you
did not intend or consume something you did not intend.  When you get
your stack balanced to the method spec, nearly always your code is
correct.

Even though FORTH and Java are very similar underneath the hood, FORTH
has a quite different philosophy.  You are permitted to tinker with
the inner workings of everything. They are so simple, you can
understand every last instruction in the entire system, even more so if
you write your own FORTH engine from scratch. You could do a simple
one in about a month.  My 32-bit one was more complex. I had to
simulate a 32-bit virtual machine on a 16-bit 8086.

Jack - 28 Dec 2005 01:32 GMT
> FORTH protection comes from writing very small routines and debugging
> each one exhaustively before you move on to the next. This technique
[quoted text clipped - 15 lines]
> one in about a month.  My 32-bit one was more complex. I had to
> simulate a 32-bit virtual machine on a 16-bit 8086.

I know how to resolve the problem of stack over/underflow FOREVER.

FOREVER.

