Java Forum / General / February 2007

Which scope for variables being used in a loop?

bugnthecode - 03 Feb 2007 02:42 GMT
Hi everyone,

I've been working on an application for work and I'm using a library
provided by someone else. I'm now running into a problem where I'm
running out of memory under certain circumstances and have begun
tracing the problem. Anyway, this made me think about the way I've
coded some of my for loops. Take for instance the following snippet:

public void printCustomers(List<Customer> customers) {
    String customerName;
    String customerPhone;
    String customerLocation;
    for (Customer customer : customers) {
        customerName = customer.getName();
        customerPhone = customer.getPhone();
        customerLocation = customer.getLocation();
        System.out.println(customerName + customerPhone + customerLocation);
    }
}

Now my original thinking was that I would declare the strings outside
of the loop so that I'm not re-creating a reference each time through
the loop; I would just keep re-assigning it. Then I started to think
that maybe this wasn't doing what I expected, and then realized that
those 3 strings stay in scope until the end of the method, meaning the
memory would be unavailable to the gc until the method had finished
processing. In the above snippet that really doesn't matter, but what
if I had much more code below the loop?

So my question is which way would be better on memory or performance.
Which is better coding style if performance and memory usage are
negligible either way?

Thanks in advance for your help.
Will
Manish Pandit - 03 Feb 2007 03:00 GMT
> Hi everyone,
>
[quoted text clipped - 32 lines]
> Thanks in advance for your help.
> Will

In this case, the variables are allocated on the stack, as they are
"local". I do not think this could lead to out of memory (unless the
collection is gigantic). Personally I never like the idea of declaring
variables within a loop, and have not seen a lot of instances where it
is done.

-cheers,
Manish
Karl Uppiano - 03 Feb 2007 04:37 GMT
[snip]

> In this case, the variables are allocated on the stack, as they are
> "local". I do not think this could lead to out of memory (unless the
> collection is gigantic). Personally I never like the idea of declaring
> variables within a loop, and have not seen a lot of instances where it
> is done.

I think it happens a lot, for example if someone calls another method from
within the loop.

> -cheers,
> Manish
Lew - 04 Feb 2007 00:43 GMT
"Manish Pandit" wrote ...
>> In this case, the variables are allocated on the stack, as they are
>> "local". I do not think this could lead to out of memory (unless the
>> collection is gigantic). Personally I never like the idea of declaring
>> variables within a loop, and have not seen a lot of instances where it
>> is done.

There are plenty of instances where variables are declared inside a loop, and
there are good, solid engineering reasons to do so.

Declaring the variable inside a loop, or any other block, limits its scope to
that block. If it is not needed outside the block, then its scope matches its
use.

The variable remains in the JVM until the end of the stack frame, even after
it goes out of scope; if it isn't nulled then it won't be gced until the
method ends. It is still inaccessible to code outside its block.

Limiting variable scope is a good principle of defensive programming. If a
variable doesn't linger after its use, nor is declared until needed, it has
less chance to make mischief. (Joshua Bloch touches on this in /Effective Java/.)

Use of the for ( T thing : things ) idiom is an example of scope limitation.
The variable "thing" is only in scope for the loop.
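
A minimal sketch of the scope limitation described above. The class and method names (ScopeDemo, shout) are made up for illustration and do not come from the thread.

```java
import java.util.Arrays;
import java.util.List;

public class ScopeDemo {
    // Builds a string from the names. 'upper' is declared inside the loop,
    // so it cannot be read or accidentally reused after the loop body ends.
    public static String shout(List<String> names) {
        StringBuilder out = new StringBuilder();
        for (String name : names) {
            String upper = name.toUpperCase();
            out.append(upper).append('!');
        }
        // 'name' and 'upper' are out of scope here; using them would not compile.
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(shout(Arrays.asList("ann", "bob")));
    }
}
```

Only 'out' needs method-wide scope, because it is the only value used after the loop.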

- Lew
Esmond Pitt - 04 Feb 2007 23:08 GMT
> In this case, the variables are allocated on the stack, as they are
> "local".

In this case the *references* are allocated on the stack. The objects to
which they refer are allocated on the heap as always.

> I do not think this could lead to out of memory (unless the
> collection is gigantic).

The effect as written is that the *last* 3 objects allocated can't be
GC'd until the scope they are declared in exits. All the objects
previously allocated inside the loop can be GC'd as soon as the next set
of assignments takes place. So unless one or more of the 3 customer
objects is gigantic this is probably not a memory hog.

> Personally I never like the idea of declaring
> variables within a loop, and have not seen a lot of instances where it
> is done.

I'm the opposite: I favour declaring all variables in the innermost
scope possible, and I haven't seen a lot of instances where it *isn't* done.

> -cheers,
> Manish
Stefan Ram - 04 Feb 2007 23:41 GMT
>>In this case, the variables are allocated on the stack, as
>>they are "local".
>In this case the *references* are allocated on the stack.

 References are values. »Reference« is just short for
 »reference value« of which the JLS says »(...) reference
 values (...) are pointers« (JLS3, 4.3.1). They can not be
 allocated, because they are not storage locations.

 Variables are storage locations (JLS3, 4.12), so they
 might be allocated.

 Reference variables are those entities, which then contain
 reference values.
NT - 08 Feb 2007 10:12 GMT
Escape Analysis in JSE6 might rule it out, i.e. allocate them on
stack, for small objects.

> In this case the *references* are allocated on the stack. The objects to
> which they refer are allocated on the heap as always.

-Nirav Thaker
http://niravthaker.blogspot.com
Chris Uppal - 08 Feb 2007 13:37 GMT
> Escape Analysis in JSE6 might rule it out, i.e. allocate them on
> stack, for small objects.

Do you know if that feature is actually part of JDK 1.6.0 ?  I remember hearing
that it was mooted for inclusion in this cycle, but haven't seen anything to
say whether it actually made it in to the final cut.  (It's something I've been
looking forward to playing with)

   -- chris
Esmond Pitt - 09 Feb 2007 00:07 GMT
> Do you know if that feature is actually part of JDK 1.6.0 ?

Apparently so. See 'Escape Analysis in Mustang' on
http://www-128.ibm.com/developerworks/java/library/j-jtp09275.html
Chris Uppal - 09 Feb 2007 17:25 GMT
[me:]
> > Do you know if that feature is actually part of JDK 1.6.0 ?
>
> Apparently so. See 'Escape Analysis in Mustang' on
> http://www-128.ibm.com/developerworks/java/library/j-jtp09275.html

Thanks for the link (not a bad article, if somewhat overstated IMO), but
unfortunately it's dated 2005-09-27, so it can only be talking about what was
then /hoped/ would make 1.6 final.  I'm looking for some sort of corroboration
that it /did/ make 1.6 final (or, alternatively, that it didn't).

   -- chris
NT - 12 Feb 2007 19:07 GMT
On Feb 9, 10:25 pm, "Chris Uppal" <chris.up...@metagnostic.REMOVE-
THIS.org> wrote:

> [me:]
>
[quoted text clipped - 4 lines]
>
>     -- chris

You may want to try these switches with Mustang:
-XX:+PrintEscapeAnalysis, -XX:+DoEscapeAnalysis, -XX:+PrintOptoAssembly.

Nirav Thaker
http://niravthaker.blogspot.com
Chris Uppal - 13 Feb 2007 16:23 GMT
[me:]
> > Thanks for the link (not a bad article, if somewhat overstated IMO), but
> > unfortunately it's dated 2005-09-27, so it can only be talking about
[quoted text clipped - 6 lines]
> You may want to try these switches with Mustang:
> -XX:+PrintEscapeAnalysis, -XX:+DoEscapeAnalysis, -XX:+PrintOptoAssembly.

Great!   Thank you very much.

(BTW, only the -XX:+DoEscapeAnalysis is enabled in production builds -- at
least the server JVM rejects the other options when I try them.  Not a big
deal...)

The first benchmark I was hoping it would have some significant impact on is
from Jon Harrop's comparison of various languages at:
   http://www.ffconsultancy.com/free/ray_tracer/languages.html
in which I believe Java suffers because the Java expression of the ray-tracing
algorithms creates huge numbers of transient vector objects.  Sadly, it seems
that -XX:+DoEscapeAnalysis and -XX:-DoEscapeAnalysis have no measurable effect
on the execution speed of that code.  Oh well...

Now I shall have to try to invent some benchmark where it /does/ make a
difference...
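
As a rough idea of what such a benchmark candidate could look like, here is a sketch of a loop that creates only short-lived, non-escaping objects. The Vec class and the run method are hypothetical and are not taken from the ray tracer; no claim is made about what effect the switch has on it.

```java
public class TransientVectors {
    // Hypothetical small immutable vector; every operation allocates a new one.
    static final class Vec {
        final double x, y, z;
        Vec(double x, double y, double z) { this.x = x; this.y = y; this.z = z; }
        Vec add(Vec o) { return new Vec(x + o.x, y + o.y, z + o.z); }
        double dot(Vec o) { return x * o.x + y * o.y + z * o.z; }
    }

    // None of the Vec objects created here is stored or returned,
    // so none of them escapes the method.
    public static double run(int n) {
        double acc = 0;
        for (int i = 0; i < n; i++) {
            Vec a = new Vec(i, 1, 2);
            Vec b = a.add(new Vec(1, 1, 1));
            acc += b.dot(b);
        }
        return acc;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        double result = run(50000000);
        long millis = (System.nanoTime() - start) / 1000000;
        System.out.println(result + " in " + millis + " ms");
    }
}
```

Running it once with -XX:+DoEscapeAnalysis and once with -XX:-DoEscapeAnalysis gives two timings to compare; a single timed run like this is crude, so treat any difference with caution.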

   -- chris
robert maas, see http://tinyurl.com/uh3t - 09 Feb 2007 20:14 GMT
> From: Esmond Pitt <esmond.p...@nospam.bigpond.com>
> The effect as written is that the *last* 3 objects allocated
> can't be GC'd until the scope they are declared in exits.

Yes, I agree. In cases where one of those last 3 objects might be
really big, assuming the programmer can anticipate that as a
likelihood, it would seem prudent to explicitly null out the
relevant local pointers just before reaching the bottom of the
inside of the loop. That wouldn't cause any practical difference
for any but the last 3, because each group of 3 pointers (except
the last group) immediately gets overwritten by the next group of 3
when the loop repeats, but it'd protect the last three from being
held too long. I'm assuming what somebody else said, that in fact
the stack frame is created upon entry to the method, to hold the
longest chain of local variables in the whole block, so even though
the variables declared in the inner loop-block go out of scope they
are still unavailable to be garbage collected until the whole
method returns.
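
The nulling-out idea can be sketched as follows. The names (NullOutDemo, loadBig, process) are invented for illustration, and loadBig merely stands in for whatever produces the large object. Whether the assignment makes a measurable difference depends on the JVM.

```java
public class NullOutDemo {
    // Hypothetical stand-in for code that returns a large object.
    static byte[] loadBig(int i) {
        return new byte[1024 * (i + 1)];
    }

    public static long process(int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            byte[] big = loadBig(i);
            total += big.length;
            // Clear the local just before the bottom of the loop body, so the
            // last buffer is not kept reachable through this variable's slot.
            big = null;
        }
        // Any long-running code placed here no longer holds the last buffer
        // through the loop's local variable.
        return total;
    }

    public static void main(String[] args) {
        System.out.println(process(3));
    }
}
```

The variable stays declared inside the loop; only its value is cleared, so the scoping advice elsewhere in the thread still holds.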

And I agree that the variables should be declared within the inner
loop-block, rather than out in the main body of the method, for
several reasons (defense against accidentally re-using a value later
in the same method, and allowing local allocations in one loop to
overlay local allocations in another loop). Hence you can't null-out
the three pointers just once after the loop exits, because the
variables are out of scope already, still held by the GC yet
untouchable by the programmer! (Hey, maybe loops should allow a
"finally" clause which performs actions after the very last time
through the loop but before local variables go out of scope?? Nah,
too ugly. How about a "finally" or "volatile" clause on the local
declaration itself, such that the pointer-value of the variable is
forcibly nulled-out at the moment just before the variable goes out
of scope the last time through the loop? Nah, impractical. How
about a "volatile" declaration for the entire *loop* (not the block
inside the loop), such that all stack values allocated by any block
within that loop are nulled out upon exit from the loop? I think
that would be logically consistent, clean, and doable.)

> I favour declaring all variables in the innermost scope possible,

I agree completely.

But note such a programming style precludes this loop-search pattern:
for (int ix=0; ix<=max && obj.elementAt(ix)!=target; ix++) {}
if (ix>max) return(<didn'tfindvalue>);
else return(<foundvalue>);
because the post-loop test can't work because ix is already out of scope.

But I always preferred this style anyway:
for (int ix=0; ; ix++) {
 if (ix>max) return(<didn'tfindvalue>);
 if (obj.elementAt(ix)==target) return(<foundvalue(ix)>);
}

And who would really prefer this half-of-each pattern?
for (int ix=0; ix<=max; ix++) {
 if (obj.elementAt(ix)==target) return(<foundvalue(ix)>);
}
return(<didn'tfindvalue>);
What I don't like about that is that one return is after the loop,
whereby maintainers might not notice the *other* return from inside
the loop.

And I never liked this pattern either:
boolean success=false; int savedix=-1;
for (int ix=0; ix<=max; ix++) {
 if (obj.elementAt(ix)==target) {
   success=true; savedix=ix; break;
 }
}
if (success) return(<foundvalue(savedix)>);
else return(<didn'tfindvalue>);

Although I did find myself forced to invent that last horrible
pattern very long ago in another language that is overly (IMO)
restrictive about where you can do what. No, it wasn't pascal.
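
For reference, a compilable version of the "both returns inside the loop" style might look like the sketch below. The class and method names are invented, -1 stands in for the didn't-find value, and equals() is used instead of == because the elements are Strings.

```java
import java.util.Arrays;
import java.util.List;

public class SearchDemo {
    // Returns the index of target in items, or -1 if it is not present.
    // The loop has no condition, so both exits are explicit returns and
    // ix never needs to be visible outside the loop.
    public static int indexOf(List<String> items, String target) {
        for (int ix = 0; ; ix++) {
            if (ix >= items.size()) return -1;
            if (items.get(ix).equals(target)) return ix;
        }
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        System.out.println(indexOf(items, "b"));
        System.out.println(indexOf(items, "z"));
    }
}
```

Because the loop can never finish normally, the compiler does not require a return statement after it.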
zhengxianfu@gmail.com - 03 Feb 2007 13:48 GMT
> I'm now running into a problem where I'm running out of memory under
> certain circumstances and have begun tracing the problem.

Do you know under what circumstances? Are you sure the customer List
will be big enough to exhaust your memory? If so, what about the place
that creates the customer list? The problem may be somewhere else.

By the way, if you can add or override a toString() method in class
Customer, that would avoid re-creating and assigning the local String
variables.
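
A sketch of the toString() suggestion, assuming a Customer class that can be modified. The fields and constructor shown are guesses, since the real Customer class is not shown in the thread.

```java
public class CustomerToString {
    // Hypothetical Customer; only the three properties used in the thread.
    public static class Customer {
        private final String name;
        private final String phone;
        private final String location;

        public Customer(String name, String phone, String location) {
            this.name = name;
            this.phone = phone;
            this.location = location;
        }

        // Same concatenation as the original println, kept inside the class.
        @Override
        public String toString() {
            return name + phone + location;
        }
    }

    public static void main(String[] args) {
        Customer c = new Customer("Ann", "555-0100", "Leeds");
        // println calls toString(), so the loop needs no local Strings.
        System.out.println(c);
    }
}
```

This tidies the calling loop but still builds one String per customer, so it is a readability change rather than a memory fix.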
Chris Uppal - 04 Feb 2007 15:12 GMT
> Now my original thinking was that I would declare the strings outside
> of the loop so that I'm not re-creating a reference each time through
[quoted text clipped - 8 lines]
> Which is better coding style if performance and memory usage are
> negligible either way?

I don't think that there's any doubt that the best coding style is to declare
variables in the most restricted scope possible.  Not only does it make the
code more readable, but it reduces the chances of errors by mistakenly re-using
a value which had been set in an earlier passage of code where it was used for
something else.  A counter-point to that view is that if your methods are so
long that it makes any difference where you declare variables, then your
methods are too long anyway.  There's some truth in that, but it is difficult
to write sensible Java code where methods /are/ properly short (say 2 or 3
lines, not counting the brackets).

From an efficiency POV, it makes no difference at all.  None.  Zero.  Java
bytecode has no concept of a local variable (or stack slot, actually) at a
narrower scope than a method.  So the bytecode generated for:

public void printCustomers(List<Customer> customers)
{
    String customerName;
    String customerPhone;
    String customerLocation;
    for (Customer customer : customers)
    {
        customerName = customer.getName();
        customerPhone = customer.getPhone();
        customerLocation = customer.getLocation();
        System.out.println(customerName + customerPhone + customerLocation);
    }
}

and:

public void printCustomers2(List<Customer> customers)
{
    for (Customer customer : customers)
    {
        String customerName = customer.getName();
        String customerPhone = customer.getPhone();
        String customerLocation = customer.getLocation();
        System.out.println(customerName + customerPhone + customerLocation);
    }
}

are essentially identical (as you can verify with javap, if you feel so
inclined).  I tried that with the javac from jdk 1.6.0, and the only difference
(for some reason) was that it assigned the variables to stack slots in a
different order.

Note that javac and/or the JIT is at liberty to generate code to null-out stack
slots once they are no longer live, but in fact (as far as I can tell) the
current implementations do not do so (there seems to be some #ifdef-ed-out code in
the 1.6 JVM source, which /seems/ as if it might be an experimental
implementation of that idea, but...).   However, since there's no difference in
the bytecode, the JIT will, or won't, do that no matter which way you phrase
your code.
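
For anyone who wants to repeat the comparison, here is a self-contained sketch with both variants in one class. It uses plain Strings instead of the Customer class, and the class and method names are made up.

```java
import java.util.Arrays;
import java.util.List;

public class LoopScope {
    // Variant 1: working variable declared before the loop.
    public static String outside(List<String> names) {
        StringBuilder sb = new StringBuilder();
        String upper;
        for (String n : names) {
            upper = n.toUpperCase();
            sb.append(upper);
        }
        return sb.toString();
    }

    // Variant 2: working variable declared inside the loop.
    public static String inside(List<String> names) {
        StringBuilder sb = new StringBuilder();
        for (String n : names) {
            String upper = n.toUpperCase();
            sb.append(upper);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ann", "bob");
        System.out.println(outside(names).equals(inside(names)));
    }
}
```

Compile it with javac LoopScope.java and then run javap -c LoopScope to see the bytecode of the two methods side by side.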

   -- chris
Patricia Shanahan - 04 Feb 2007 18:23 GMT
>> Now my original thinking was that I would declare the strings outside
>> of the loop so that I'm not re-creating a reference each time through
[quoted text clipped - 18 lines]
> to write sensible Java code where methods /are/ properly short (say 2 or 3
> lines, not counting the brackets).

I find the most restricted scope rule helpful for keeping down method
length.

It tends to lead to largely self-contained blocks, with all working
variables that belong to the block declared in it, and no unnecessary
sharing of variables between blocks. Those are the easiest blocks to
convert to separate methods during a refactoring pass.

Patricia
firstsql@ix.netcom.com - 06 Feb 2007 06:50 GMT
On Feb 4, 7:12 am, "Chris Uppal" <chris.up...@metagnostic.REMOVE-
THIS.org> wrote:

> I don't think that there's any doubt that the best coding style is to declare variables in the most restricted scope possible.  Not only does it make the
> code more readable, but it reduces the chances of errors by mistakenly re-using
[quoted text clipped - 8 lines]
> bytecode has no concept of a local variable (or stack slot, actually) at a
> narrower scope than a method.  So the bytecode generated for:

Actually, there is some difference. Sun's javac will reuse slots
between parallel blocks within a method. So, there can be a coupla
efficiency advantages (albeit minor) to most restricted scope:

1) The stack frame for the method will be smaller, perhaps important
for recursive uses, and

2) Reference slots for earlier blocks may be overwritten, allowing
early garbage collection of the referenced objects.
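
A small sketch of what "parallel blocks" means here. The class and method names are invented, and which slot the compiler actually assigns to each variable is an implementation detail that can be checked with javap -c.

```java
public class ParallelBlocks {
    public static int sumBoth(int[] a, int[] b) {
        int total = 0;
        {
            // First block: 'first' is in scope only inside these braces.
            int first = 0;
            for (int x : a) first += x;
            total += first;
        }
        {
            // Second, parallel block: 'second' never overlaps with 'first',
            // so the compiler is free to give it the same stack slot.
            int second = 0;
            for (int y : b) second += y;
            total += second;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoth(new int[] {1, 2}, new int[] {3, 4}));
    }
}
```

Had both variables been declared at the top of the method, each would need its own slot for the whole method.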

--
Lee Fesperman, FFE Software, Inc. (http://www.firstsql.com)
==============================================================
* The Ultimate DBMS is here!
* FirstSQL/J Object/Relational DBMS  (http://www.firstsql.com)
Chris Uppal - 06 Feb 2007 16:51 GMT
[me:]
> > From an efficiency POV, it makes no difference at all.  None.  Zero.
> > Java
[quoted text clipped - 4 lines]
> Actually, there is some difference. Sun's javac will reuse slots
> between parallel blocks within a method. So [...]

That's true.  I should have mentioned it myself.  Thanks.

   -- chris
dagarwal82@gmail.com - 07 Feb 2007 06:27 GMT
How about declaring "String customerName;  String customerPhone;
String customerLocation;" outside the loop with String Buffer. I
mean :-

StringBuffer customerName
StringBuffer customerPhone
StringBuffer customerLocation

now even if you have 1tera billion records in the list , there will be
a single instance of these variables.
Lew - 07 Feb 2007 13:59 GMT
> How about declaring "String customerName;  String customerPhone;
> String customerLocation;" outside the loop with String Buffer. I
[quoted text clipped - 6 lines]
> now even if you have 1tera billion records in the list , there will be
> a single instance of these variables.

Actually, there won't. Variables don't have instances.

There is an engineering principle related to re-use of instances. Compare

public Foo doIt( Foo foo )
{
   Foo fooToo = foo;           // second variable, same instance
   fooToo.setProperty( getAValue() );
   return fooToo;
}

to

public Foo doIt( Foo foo )
{
   Foo fooToo = foo.clone();   // assumes Foo overrides clone() to return Foo
   fooToo.setProperty( getAValue() );
   return fooToo;
}

The latter produces two instances of Foo, the former uses only the one. The
number of variables is the same.
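
The difference can be checked directly with a runnable sketch. The Foo class below is hypothetical, and a copy constructor stands in for clone() to keep the example short.

```java
public class InstanceCount {
    // Hypothetical Foo with one property and a copy constructor.
    public static class Foo {
        public String property;
        public Foo(String property) { this.property = property; }
        public Foo(Foo other) { this.property = other.property; }
    }

    // Two variables (foo, fooToo), one instance: the caller's object changes.
    public static Foo alias(Foo foo) {
        Foo fooToo = foo;
        fooToo.property = "changed";
        return fooToo;
    }

    // Two variables, two instances: the caller's object is left alone.
    public static Foo copy(Foo foo) {
        Foo fooToo = new Foo(foo);
        fooToo.property = "changed";
        return fooToo;
    }

    public static void main(String[] args) {
        Foo a = new Foo("original");
        System.out.println(alias(a) == a);  // same instance
        Foo b = new Foo("original");
        System.out.println(copy(b) == b);   // different instance
    }
}
```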

There are reasons to declare variables outside a loop, either because you need
a wider scope or for syntactic reasons.

For example, parallel loop control variables of different types:
 for ( int i=0, Iterator iter = coll.iterator(); iter.hasNext(); ++i )
is not allowed; declare one before the loop.
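
A compilable sketch of that workaround, with the counter declared before the loop and the iterator in the for statement. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;

public class ParallelLoopVars {
    public static String numbered(Collection<String> coll) {
        StringBuilder sb = new StringBuilder();
        // A for-init can declare variables of only one type,
        // so the int counter is declared before the loop.
        int i = 0;
        for (Iterator<String> iter = coll.iterator(); iter.hasNext(); ++i) {
            sb.append(i).append('=').append(iter.next()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(numbered(Arrays.asList("a", "b")));
    }
}
```

The cost is that i stays in scope after the loop; wrapping the declaration and the loop in their own block would limit it again.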

- Lew

