Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / General / November 2005

Tip: Looking for answers? Try searching our database.

Do java support 'String' methods properly ?

Thread view: 
Red Orchid - 08 Nov 2005 09:01 GMT
I think that if the methods of 'String' are not
efficient, these exert a very bad influence on
the performance of an application.

Java do not support any String methods with ignoreCase
except 'String.equalsIgnoreCase' and 'String.regionMatches'.

Threrefore, I think that java leads a programmer to
write the codes that are inefficient.

In proof of my thinking, I will show examples bellow.

<example_#1>
String src = ..;
String key = ..;

.. = src.toLowerCase().indexOf(key.toLowerCase());

#1 is very inefficient.  because #1 converts the entire
of 'src' to lowerCase though there is no need to do so.

That is,
if src = "abcdefghij" and key = "bc", there is need
to convert only "abc" of src to lowerCase.

The codes like the above are found very common in java
application sources.

<example_#2>
'String.regionMatches' do not provide valuable aid for
solving the inefficient of #1.

Because ..
the folowing code is a part of 'String.regionMatches'
Assuming 'src.regionMatches(true, i, key, j, ...)'.

if (ignoreCase) {
 ...
 char u1 = Character.toUpperCase(c1);
 char u2 = Character.toUpperCase(c2);
 if (u1 == u2) {
    continue;
 }
 ..

Whenver 'regionMatches' is called,  'key' is converted
to upperCase.  it is inefficient.

<example_#3>
A programmer can write his method like the follows.

String k = key.toLowerCase();
for (int i = 0; i < src.length(); i++) {
   for (int j = 0; j < k.length(); j++) {
       char c = Character.toLowerCase(src.charAt(i + j));
       if (c != k.charAt(j)) {

...

#3 calls 'String.charAt'.  That is, #3 has the overhead
of function call (method call).    if his method exists
inside 'String',  he can access the field 'value[]' of
'String' and then his method will not have the overhead.

</example_*>

I think that java do not support 'String' methods enough
or properly.

What is your opinion ?

Thanks.
Ingo R. Homann - 08 Nov 2005 09:28 GMT
Hi,

> [String-operations are slow in Java]

Well, in "normal" applications, String-operations are *not* the bottleneck.

But *if* String-operations are a bottleneck, then even your idea of
linear searching a String is a very bad idea.

Google for Knuth-Morris-Pratt and Boyer-Moore-Horspool for *efficient*
String searching algorithms!

Ciao,
Ingo
Roedy Green - 08 Nov 2005 09:40 GMT
On Tue, 08 Nov 2005 10:28:12 +0100, "Ingo R. Homann"
<ihomann_spam@web.de> wrote, quoted or indirectly quoted someone who
said :

>Google for Knuth-Morris-Pratt and Boyer-Moore-Horspool for *efficient*
>String searching algorithms!

See http://mindprod.com/jgloss/products1.html#BOYER

Boyer-Moore does not make that much difference. I think they would pay
off if you were simultaneously searching for many different strings.
My implementation does NOT handle that.

A string search can be done very efficiently in a native method which
is what Jet does.

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Googmeister - 08 Nov 2005 10:59 GMT
> Hi,
>
> > [String-operations are slow in Java]

> Well, in "normal" applications, String-operations are *not* the bottleneck.

Yes, I agree here. Strings are very convenient to use in Java (as
compared to, say, C where they're associated with buffer overflow
errors). The tradeoff is reduced efficiency and increased memory
usage. If your bottleneck really is dealing with strings, you can
always go back to using an array of bytes or chars, and you're even
free to use the convention of terminating them with '\0'.

> But *if* String-operations are a bottleneck, then even your idea of
> linear searching a String is a very bad idea.

This is how Java implements indexOf() as well. It is efficient (linear)
for "typical" inputs, but has a quadratic worst case.

> Google for Knuth-Morris-Pratt and Boyer-Moore-Horspool for *efficient*
> String searching algorithms!

Anyone know why Java's indexOf doesn't use Boyer-Moore
(sublinear on typical inputs) for long strings? I suspect it's a
combination of requiring extra space proportional to the alphabet
size and difficulties in dealing with the subtleties of Unicode
(e.g., code points). But it would be nice to hear from the experts.
Ingo R. Homann - 08 Nov 2005 12:27 GMT
Hi,

> This is how Java implements indexOf() as well. It is efficient (linear)
> for "typical" inputs, but has a quadratic worst case.

Depends on what is 'typical' for you. :-)

(BTW, linear search is not really quadratic, but M*N.)

>>Google for Knuth-Morris-Pratt and Boyer-Moore-Horspool for *efficient*
>>String searching algorithms!
[quoted text clipped - 4 lines]
> size and difficulties in dealing with the subtleties of Unicode
> (e.g., code points). But it would be nice to hear from the experts.

IIRC, all algorithms (linear, BMH, KMP) have different advantages,
depending on the sizes of the text and the pattern, the size of the
alphabet, the distribution of the characters and if you are searching a
pattern in different texts (do not forget that you need to generate a
"suffix-list" for the pattern)!

If the both the text and the pattern are very short (perhaps only 10 to
20 chars?), it is obvious that linear search ist the best. If the
alphabet is very short (eg only A and B), the pattern might be AAAAA, so
that KMP (*) has no advantage over linear search as well.

Fact is: You simply cannot decide which algorithm is 'best' if you do
not know *anything* about the application (which of course sun can't).

And if you want to implement a large text-database or a gene-sequencer,
it is obvious that you have to think *a lot* about searching-algorithms,
so that *no* implementation (that has not been optimized especially for
the exact application (**)) is perfect.

Ciao,
Ingo

(*) It has been nearly 10 years since I implemented these algorithms,
and I remember there were some slight differences between BM and BMH
wich I do not remember right now. So, forgive me, if I confuse the
algorithms...

(**) In these cases, it might often be useful if you do not have the
whole text in RAM, but e.g. to have a linked list of "chunks" of texts,
with one chunk perhaps of the size of some MB. The rest remains on disk.
 No default-implementation can deal with that!
Googmeister - 08 Nov 2005 13:49 GMT
> Hi,
>
> > This is how Java implements indexOf() as well. It is efficient (linear)
> > for "typical" inputs, but has a quadratic worst case.
>
> Depends on what is 'typical' for you. :-)

English text.

> (BTW, linear search is not really quadratic, but M*N.)

Yes, in the worst case, this is quadratic in the input size.
The worst case occurs when M = N/2.

> >>Google for Knuth-Morris-Pratt and Boyer-Moore-Horspool for *efficient*
> >>String searching algorithms!
[quoted text clipped - 10 lines]
> pattern in different texts (do not forget that you need to generate a
> "suffix-list" for the pattern)!

That's why I specifically said long strings. Arrays.sort uses insertion
sort for small inputs and mergesort/quicksort for large inputs. It
wouldn't
be inconsistent to do something smart (e.g., Boyer-Moore) for long
strings since it can lead to dramatic performance improvements.

> Fact is: You simply cannot decide which algorithm is 'best' if you do
> not know *anything* about the application (which of course sun can't).

Sun provides algorithms like mergesort and red-black trees. We
certainly expect reasonable efficient algorithms here, even though
they're not always the best for any particular application. My question
is why not use Boyer-Moore for long strings, where it is superior to
brute force, and in most common cases vastly superior?
Roedy Green - 08 Nov 2005 14:19 GMT
> why not use Boyer-Moore for long strings, where it is superior to
>brute force, and in most common cases vastly superior?

I wrote one and experimented.   The catch is you usually find the
target string pretty early on, so Boyer Moore never gets a chance to
overcome its startup overhead.

See http://mindprod.com/products1.html#BOYER
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Ingo R. Homann - 08 Nov 2005 14:24 GMT
Hi,

> That's why I specifically said long strings. Arrays.sort uses insertion
> sort for small inputs and mergesort/quicksort for large inputs. It
[quoted text clipped - 7 lines]
> is why not use Boyer-Moore for long strings, where it is superior to
> brute force, and in most common cases vastly superior?

That would indeed be possible. But - as I said in my last posting:

(1) I think that if you want to deal with really large texts, you will
have to  implement an own (chunked) String-class, highly optimized to
your specific application, so that you cannot use String.indexOf at all.

(2) String.indexOf is not a bottleneck in most typical applications.

So, it may not be worth implementing.

Or, perhaps sun hasn't heard of BMH at all. ;-)

Ciao,
Ingo
Chris Uppal - 08 Nov 2005 15:29 GMT
> That's why I specifically said long strings. Arrays.sort uses insertion
> sort for small inputs and mergesort/quicksort for large inputs. It
> wouldn't
> be inconsistent to do something smart (e.g., Boyer-Moore) for long
> strings since it can lead to dramatic performance improvements.

I think it would be at least as useful if it were provided for binary data too
(or even instead).

I'm not sure that your point about sorting carries over all that well.  Sorting
is something that many apps do a lot of, and often the collection to be sorted
is big enough that insert-sort is not suitable.  For a comparable situation to
apply to string searching (text or binary) searches of long strings would have
to be reasonably common -- and I'm not convinced that they are[*].  Especially
when, in practise, BM and naive searching both tend to operate in linear
time -- albeit with different constants.  And then when you consider that the
long strings you are searching must have come /from/ somewhere, with attendent
I/O costs, it is probably quite hard to find an application for which string
matching is a genuine bottleneck[**].

(
[*] Although that might be as much a result of the lack a fast string search as
a cause of it.

[**] I'm not saying there aren't any, just that I wouldn't expect there to be
enough to push Sun into giving string matching high priority.
)

   -- chris
Googmeister - 08 Nov 2005 18:07 GMT
> > That's why I specifically said long strings. Arrays.sort uses insertion
> > sort for small inputs and mergesort/quicksort for large inputs. It
[quoted text clipped - 10 lines]
> apply to string searching (text or binary) searches of long strings would have
> to be reasonably common -- and I'm not convinced that they are[*].

Yes, I do agree that fast sorting is typically more useful than
fast string search.

> Especially
> when, in practise, BM and naive searching both tend to operate in linear
> time -- albeit with different constants.

No, BM runs in *sublinear* time on typical inputs -- that is O(N / M),
where N is the length of the text and M is the length of the pattern.
This is what makes it so useful (assuming you have the text already
in memory). It also comes with a linear time worst case guarantee
that protects against pathological inputs.
Chris Uppal - 08 Nov 2005 19:55 GMT
> > Especially
> > when, in practise, BM and naive searching both tend to operate in linear
> > time -- albeit with different constants.
>
> No, BM runs in *sublinear* time on typical inputs -- that is O(N / M),

Oh I quite agree with the theoretical analysis.  The position I was coming from
was twofold -- still is actually.

One point is that -- depending on circumstances -- "jumping" over bits of the
sequence to match may not buy you genuine sub-linear execution.  For instance,
if you are streaming stuff off-disk, then unless the jumps are bigger than a
disk page, you will still have to read the data into memory in the OS's buffers
and/or the application's.  Which is O(n).   Or if the data is in main RAM, a
similar argument applies about fetching data into caches.  If you end up
limited by disk I/O capacity, or by bandwidth to RAM, then reducing the number
of actual character (or byte) comparisons isn't cutting out the bottleneck
unless it /also/ reduces the amount of data that has to be moved over each
threshold.

The other point is that I would expect the strings you were searching /for/ to
be fixed for many applications (not always, of course).  In such circumstances,
the expected execution time is O(n), simply because the O(1/m) bit is (for that
application) a constant.  The app can't take advantage of the effect of bigger
M because there are no longer strings that it wants to look for :-(

Hence my comment.  I should have been more precise, I admit.

> This is what makes it so useful (assuming you have the text already
> in memory). It also comes with a linear time worst case guarantee
> that protects against pathological inputs.

Agreed, it's a very nice family of algorithms.

   -- chris
Roedy Green - 09 Nov 2005 03:04 GMT
On Tue, 8 Nov 2005 19:55:57 -0000, "Chris Uppal"
<chris.uppal@metagnostic.REMOVE-THIS.org> wrote, quoted or indirectly
quoted someone who said :

>One point is that -- depending on circumstances -- "jumping" over bits of the
>sequence to match may not buy you genuine sub-linear execution.  For instance,
[quoted text clipped - 6 lines]
>unless it /also/ reduces the amount of data that has to be moved over each
>threshold.

The other thing to understand is that mathematicians simplify. To them
all that counts in a sort are the number of compares or number of
swaps. They completely ignore the other overhead. Ditto for Boyer
Moore.  They are all excited about reducing the number of compares,
but that is just a fraction of what goes on. Hopping taking extra
time.  So even though you save in number of comparisons you lose it
setup for the next comparison.  

For the third time, I invite you to experiment with an actual
implementation and see for yourself that the speed up in ordinary use
is not that dramatic. See http://mindprod.com/products1.html#BOYER
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Thomas Hawtin - 08 Nov 2005 09:50 GMT
> I think that if the methods of 'String' are not
> efficient, these exert a very bad influence on
[quoted text clipped - 16 lines]
> #1 is very inefficient.  because #1 converts the entire
> of 'src' to lowerCase though there is no need to do so.

Inefficient, possibly. Very inefficient, no.

Remember that (short lived) memory allocation is very cheap. Also
String.toLowerCase will not create a new String if the source String is
already lower case.

In general if you split code into passes then each pass becomes compact
and fast. If you try to do everything at once, then it rapidly becomes a
mess. It also leads to code duplication.

> I think that java do not support 'String' methods enough
> or properly.

It's a complicated area. And I'll probably make some mistakes...

There are plenty of inconsistencies. equalsIgnoreCase works on a
character basis. String.toUpperCase may create a String with a different
number of chars.

Take the German eszet (<http://en.wikipedia.org/wiki/Eszet>).
"Stra\u00dfe".toUpperCase() should give "STRASSE". Both literal strings
should be equal ignoring case, but equalsIgnoreCase is broken.

Another thing that is usually done incorrectly (including within Java)
is that toUpperCase and toLowerCase almost always should have a
specified Locale.  For instance "i".toUpperCase() does not always give
"I" (Turkish will give "\u0130"). It seems that lots of programs exhibit
bugs when run in Turkish.

Tom Hawtin
Signature

Unemployed English Java programmer
http://jroller.com/page/tackline/

Robert M. Gary - 08 Nov 2005 17:01 GMT
If you profile your application, I would be surprised if you found
String manipulation your bottleneck. If you are doing any I/O at all, I
would look at making that faster.

BTW: If you are creating a string to pass to other methods, consider
using StringBuffer instead. Its a first class object and therefore
passed by reference. The String class is passed by value and results in
a complete copy everytime you call a method.

-Robert
Daniel Dyer - 08 Nov 2005 17:15 GMT
> BTW: If you are creating a string to pass to other methods, consider
> using StringBuffer instead. Its a first class object and therefore
> passed by reference. The String class is passed by value and results in
> a complete copy everytime you call a method.

This is simply not true.  There is no difference between the way Strings  
and StringBuffers are passed and no concept of "first class object" versus  
some other class of object.  Everything in Java is passed by value,  
whether it's a primitive or an object reference (note that it's the  
reference that is passed, not the object itself).  Strings are not copied  
as they are passed around and the fact that they are immutable makes them  
suitable for sharing.

Dan.

Signature

Daniel Dyer
http://www.dandyer.co.uk

Robert M. Gary - 08 Nov 2005 23:38 GMT
Actually, that's not true. Strings are kinda like real objects and
kinda like primatives. For instance you can say
String foo = "hello" just like you can say int i =5. You cannot say
StringBuffer foo = "hello" (because you must create the object first).

Here is the test though...
public class stringtest
{

   stringtest()
   {
       String foo = "hello";
       StringBuffer bar = new StringBuffer("hello");
       testString (foo);
       testBuffer(bar);
       System.out.println("String is "+foo);
       System.out.println("Buffer is "+bar);
   }

   private void testBuffer(StringBuffer bar)
   {
       bar.append("bye");
   }

   private void testString(String foo)
   {
       foo.toUpperCase();
   }

   public static void main(String[] args)
   {
       new stringtest();
   }
}

The out put looks like
String is hello
Buffer is hellobye

Notice the string is left untouched (because it was passed by value)
but the StringBuffer was changed (because it was passed by reference).

-Robert
Thomas Hawtin - 09 Nov 2005 01:07 GMT
> Actually, that's not true. Strings are kinda like real objects and
> kinda like primatives. For instance you can say
> String foo = "hello" just like you can say int i =5. You cannot say
> StringBuffer foo = "hello" (because you must create the object first).

You have created the "hello" String. It happens to be written as a
literal and interned. A bit of syntax for literals doesn't make Strings
in anyway primitive.

> Notice the string is left untouched (because it was passed by value)
> but the StringBuffer was changed (because it was passed by reference).

String was left unchanged because you didn't mutate it. StringBuffer
changed because you mutated it.

If you were to call length on the StringBuffer instead of append, then
it wouldn't have changed. Similarly, if you had used reflection of the
String, then it may have changed. Synchronisation adds an interesting case.

Tom Hawtin
Signature

Unemployed English Java programmer
http://jroller.com/page/tackline/

Roedy Green - 09 Nov 2005 03:15 GMT
On Wed, 09 Nov 2005 01:07:40 +0000, Thomas Hawtin
<usenet@tackline.plus.com> wrote, quoted or indirectly quoted someone
who said :

>if you had used reflection of the
>String, then it may have changed.

Strings are immutable.  That is iron clad. The caller's String will
not change neither will the caller's reference to the String.

The only way I know of to change a String is to use JNI and once
inside cheat, taking advantage of C's total lack of restraints.
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Thomas Hawtin - 09 Nov 2005 08:40 GMT
> Strings are immutable.  That is iron clad. The caller's String will
> not change neither will the caller's reference to the String.
>
> The only way I know of to change a String is to use JNI and once
> inside cheat, taking advantage of C's total lack of restraints.

setAccessible(true) takes the remaining seat belts off reflection.

Tom Hawtin
Signature

Unemployed English Java programmer
http://jroller.com/page/tackline/

Roedy Green - 09 Nov 2005 08:53 GMT
On Wed, 09 Nov 2005 08:40:19 +0000, Thomas Hawtin
<usenet@tackline.plus.com> wrote, quoted or indirectly quoted someone
who said :

>setAccessible(true) takes the remaining seat belts off reflection.

That lets you view a private field. Does it also let you write a
final?
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Daniel Dyer - 09 Nov 2005 09:29 GMT
> On Wed, 09 Nov 2005 08:40:19 +0000, Thomas Hawtin
> <usenet@tackline.plus.com> wrote, quoted or indirectly quoted someone
[quoted text clipped - 4 lines]
> That lets you view a private field. Does it also let you write a
> final?

You can modify the contents of the String's character array, this is  
especially fun when dealing with literals, since they are pooled.  You can  
write code (without using JNI) that makes this line

    System.out.println("Hello World");

print out "Goodbye".

Dan.

Signature

Daniel Dyer
http://www.dandyer.co.uk

Benji - 09 Nov 2005 09:48 GMT
> You can modify the contents of the String's character array, this is  
> especially fun when dealing with literals, since they are pooled.  You can  
> write code (without using JNI) that makes this line

>     System.out.println("Hello World");

> print out "Goodbye".

import java.lang.reflect.Field;
public class Main
{
   public static void main(String[] argv) throws Exception
   {
       String s1 = "This is only a test.";
       String s2  = "This is only a test.";
       destructiveMutate(s2, "Well I'll be darned");
       System.out.println(s1);
   }
   
   public static void destructiveMutate(String s, String sNew) throws Exception
   {
       Field value = String.class.getDeclaredField("value");
       Field offset = String.class.getDeclaredField("offset");
       value.setAccessible(true);
       offset.setAccessible(true);
       
       assert sNew.length() > s.length() : "too long";
       
       int off = offset.getInt(s);
       char[] arr = (char[])value.get(s);
       for(int i=0;i<sNew.length();i++)
       {
           arr[off+i] = sNew.charAt(i);
       }
   }
}

Signature

Of making better designs there is no end,
 and much refactoring wearies the body.

Benji - 09 Nov 2005 09:53 GMT
> You can modify the contents of the String's character array, this is  
> especially fun when dealing with literals, since they are pooled.  You can  
> write code (without using JNI) that makes this line

>     System.out.println("Hello World");

> print out "Goodbye".

import java.lang.reflect.Field;
public class Main
{
   public static void main(String[] argv) throws Exception
   {
       String s1 = "This is only a test.";
       String s2  = "This is only a test.";
       destructiveMutate(s2, "Well I'll be darned");
       System.out.println(s1);
   }
   
   public static void destructiveMutate(String s, String sNew) throws Exception
   {
       Field value = String.class.getDeclaredField("value");
       Field offset = String.class.getDeclaredField("offset");
       value.setAccessible(true);
       offset.setAccessible(true);
       
       assert sNew.length() <= s.length() : "too long";
       
       int off = offset.getInt(s);
       char[] arr = (char[])value.get(s);
       for(int i=0;i<sNew.length();i++)
       {
           arr[off+i] = sNew.charAt(i);
       }
   }
}

Signature

Of making better designs there is no end,
 and much refactoring wearies the body.

Roedy Green - 09 Nov 2005 11:15 GMT
On Wed, 09 Nov 2005 09:29:18 -0000, "Daniel Dyer"
<dan@dannospamformepleasedyer.co.uk> wrote, quoted or indirectly
quoted someone who said :

>You can modify the contents of the String's character array, this is  
>especially fun when dealing with literals, since they are pooled.  You can  
>write code (without using JNI) that makes this line
>
>    System.out.println("Hello World");

My God. It that seems like building a bank vault then leaving the
combination on a slip of paper under the mat.
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Daniel Dyer - 09 Nov 2005 11:33 GMT
> On Wed, 09 Nov 2005 09:29:18 -0000, "Daniel Dyer"
> <dan@dannospamformepleasedyer.co.uk> wrote, quoted or indirectly
[quoted text clipped - 9 lines]
> My God. It that seems like building a bank vault then leaving the
> combination on a slip of paper under the mat.

But think of the possibilities for driving your co-workers nuts.  Hide  
some of this literal-altering code somewhere obscure and watch them debug  
for hours and finally start crying like a baby because even their simple  
System.out.println calls don't do what they expect.

Dan.

Signature

Daniel Dyer
http://www.dandyer.co.uk

Chris Uppal - 09 Nov 2005 12:08 GMT
> But think of the possibilities for driving your co-workers nuts.  Hide
> some of this literal-altering code somewhere obscure and watch them debug
> for hours and finally start crying like a baby because even their simple
> System.out.println calls don't do what they expect.

;-)

Also add some complicated and obscure JNI code (is there any other kind ?)
which does /not/ play games with String contents.  Watch them scratching their
heads over C code, trying to puzzle out the JNI error...

   -- chris
Benji - 09 Nov 2005 16:00 GMT
>>You can modify the contents of the String's character array, this is  
>>especially fun when dealing with literals, since they are pooled.  You can  
>>write code (without using JNI) that makes this line
>>
>>    System.out.println("Hello World");

> My God. It that seems like building a bank vault then leaving the
> combination on a slip of paper under the mat.

Looks like another entry in the unmaintainable code database, Roedy.

Signature

Of making better designs there is no end,
 and much refactoring wearies the body.

Ingo R. Homann - 09 Nov 2005 10:58 GMT
Hi Roedy,

>>setAccessible(true) takes the remaining seat belts off reflection.
>
> That lets you view a private field. Does it also let you write a
> final?

That's not necessary (see below), since you do not need to change the
array itself, but only the elements of the array.

Ciao,
Ingo

import java.lang.reflect.Field;

public class MutableString
{

  public static void main(String[] args)
  {
    String s = "Hello";
    modify(s);
    System.out.println(s);
  }

  private static void modify(String s)
  {
    try
    {
      Field f = String.class.getDeclaredField("value");
      f.setAccessible(true);
      char[] cs = (char[]) f.get(s);
      for (int i = 0; i < cs.length; i++)
      {
        cs[i] = 'X';
      }
    }
    catch (Exception e)
    {
      e.printStackTrace();
    }
  }

}
Benji - 09 Nov 2005 09:27 GMT
> setAccessible(true) takes the remaining seat belts off reflection.

That doesn't have anything to do with strings being immutable.  Immutable
types (String, Integer) depend on immutability for optimizations.  Even if
calling setAccessible(true) could let you modify a String, it would really
screw up your program.  For example...if I have
String a = "foo";
String b = "foo";

And I modify a, I will also end up modifying b because of the way that the
compiler stores Strings.  (since they're immutable, they might as well be
the same object)

Another optimization that a compiler can make (that I'm not sure if it does
or not), if you have

String a = "sub";
String b = "I live on a submarine";

a can be stored as a substring of b.  I believe current JVMs do this.

Signature

Of making better designs there is no end,
 and much refactoring wearies the body.

Thomas Hawtin - 09 Nov 2005 10:14 GMT
>>setAccessible(true) takes the remaining seat belts off reflection.
>
> That doesn't have anything to do with strings being immutable.  Immutable

No, it shows Strings are not passed by value. As in "Notice the string
is left untouched (because it was passed by value) ...". (The reference
is passed by value.)

> types (String, Integer) depend on immutability for optimizations.  Even if
> calling setAccessible(true) could let you modify a String, it would really
> screw up your program.  For example...if I have
> String a = "foo";
> String b = "foo";

You don't need to change final fields to modify String. You can get the
value array. You can fiddle with the hashCode cache. In old versions,
value is not final. synchronize also modifies the object.

> And I modify a, I will also end up modifying b because of the way that the
> compiler stores Strings.  (since they're immutable, they might as well be
> the same object)

Yup.

> Another optimization that a compiler can make (that I'm not sure if it does
> or not), if you have
[quoted text clipped - 3 lines]
>
> a can be stored as a substring of b.  I believe current JVMs do this.

Not on the version I use.

class InternSubstring {
    public static void main(String[] args) throws Exception {
        String a = "$sub$";
        String b = "I live on a $sub$marine";
        show(a);
        show(b);
    }
    private static void show(String str) throws Exception {
        System.err.println(
             field("value", str)+", "+
             field("offset", str)+", "+
             field("count", str)
        );
    }
    private static String field(
        String name, Object object
    ) throws Exception {
        java.lang.reflect.Field field =
             String.class.getDeclaredField(name);
        field.setAccessible(true);
        return name+" <"+field.get(object)+">";
    }
}

Signature

Unemployed English Java programmer
http://jroller.com/page/tackline/

Benji - 09 Nov 2005 15:58 GMT
> <snip incorrect stuff that I cancelled>

lol.  sorry, nobody had replied when I realized that I was wrong, and I
cancelled this.  It was only up for like a minute.  You're quick.  =P

Signature

Of making better designs there is no end,
 and much refactoring wearies the body.

Roedy Green - 09 Nov 2005 03:13 GMT
> Strings are kinda like real objects and
>kinda like primatives.

Strings  may have some native methods and they are final, but they are
ordinary objects.

They are special only in that you can create a String object with a
literal in the language and the + operator will work on them.

The + operator is just sugar to save you writing out the equivalent
StringBuffer code.

The literal feature is deeper. The byte code has a way of specially
recording String literals in a pool in the byte code.

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Daniel Dyer - 09 Nov 2005 09:38 GMT
> Actually, that's not true. Strings are kinda like real objects and
> kinda like primatives. For instance you can say
> String foo = "hello" just like you can say int i =5. You cannot say
> StringBuffer foo = "hello" (because you must create the object first).

OK, I'll give you that, the String class has some special language support  
(literals and the + operator), but they are still just objects like any  
others.

> Here is the test though...
> public class stringtest
[quoted text clipped - 32 lines]
> Notice the string is left untouched (because it was passed by value)
> but the StringBuffer was changed (because it was passed by reference).

The toUpperCase method returns a new String (you are discarding it by not  
assigning it to anything), it does not change the original String.  If  
StringBuffers were passed by reference Pascal-style, the following code  
would print out "Goodbye", but it prints hello because the reference is  
passed by value (http://javadude.com/articles/passbyvalue.htm).

public class ReferenceTest
{
    private StringBuffer myBuffer = new StringBuffer("Hello");

    public static void main(String[] args)
    {
        changeStringBuffer(myBuffer);
        System.out.println(myBuffer.toString());
    }

    private static void changeStringBuffer(StringBuffer buffer)
    {
        buffer = new StringBuffer("Goodbye");
    }
}

Dan.

Signature

Daniel Dyer
http://www.dandyer.co.uk

Roedy Green - 09 Nov 2005 03:06 GMT
>The String class is passed by value and results in
>a complete copy everytime you call a method.

a String  is passed by value of the REFERENCE not the entire String.
Only 32 bits get pushed to the stack.  Strings are immutable.
References share the same copy.

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.



Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.