Java Forum / General / November 2005
Efficient String Manipulation ?
javafreak - 12 Nov 2005 08:32 GMT Guys
I have been thinking on this problem Say I have a String input ="DSDFSDFSFSFDAAAAABBBBBBBBRRRRRRRR" and i have a character array of the characters to be removed say {A,C}
So I need to delete the occurences of A and C in this string
One way is to traverse the String input for each character using charAt() and for each character , run a for loop to check whether that character exists in the "remove" Array
for eg:
1st char D, Check whether it exists in A and C array
if both the string and remove array this soln O(n2) would be bad, can somehow the soln be made more efficient of complexity n
Thanks javafreak
shangiaa - 12 Nov 2005 09:20 GMT > Guys > [quoted text clipped - 18 lines] > Thanks > javafreak shangiaa - 12 Nov 2005 09:35 GMT Hi,
try to use the replace() methd: public class ReplaceChar { public static void main(String str[]) { String strInp = "DSDFSDFSFSFDAAAAABBBBBBBBRRRRRRARR"; char[] array = {'A','B','D'}; for(int i=0;i<array.length;i++) { strInp = strInp.replace(array[i],'\0'); } System.out.println(str); } }
regards, Shangitaa.
Roedy Green - 12 Nov 2005 13:45 GMT > public static void main(String str[]) > { [quoted text clipped - 6 lines] > System.out.println(str); > } Whoa Nelly!
1. You have THREE inputs, str, strInp and array. His problem describes two, the string to be modified and a list of chars to be removed.
2. Octal liberals should have all three digits to avoid ambiguities, e.g. '\000'. See http://mindprod.com/jgloss/literals.html
3. Converting a char to (char)0 is not the same as removing it.
4. the answer is in strInp not str. You printed out the wrong value.
Your method suggests a brute-force solution though; after you are done, then in one more pass, remove the 0s with: replace( "\000", "" );. Even then this method has the side effect of creating many temporary String objects.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Roedy Green - 12 Nov 2005 14:08 GMT On Sat, 12 Nov 2005 13:45:33 GMT, Roedy Green <my_email_is_posted_on_my_website@munged.invalid> wrote, quoted or indirectly quoted someone who said :
>2. Octal liberals should have all three digits to avoid ambiguities, >e.g. '\000'. See http://mindprod.com/jgloss/literals.html Octal literals. Sorry, spell checker.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Roedy Green - 12 Nov 2005 10:12 GMT >I have been thinking on this problem >Say I have a String input ="DSDFSDFSFSFDAAAAABBBBBBBBRRRRRRRR" >and i have a character array of the characters to be removed say {A,C} here is code to remove a single char. Just extend it with a BitSet to track which chars to strip to handle multiples.
public class Stripper { /** * Strips out the given char from a String. * This should be quicker than other ways of getting the same effect. * See also String.replace and String.replaceAll. * @param charToStrip the char e.g. ' ', '/', that you no longer want in your String . * @param s the String, may be empty but not null. * @return new String with all instances of charToStrip removed, or the original * String if there were no instances of charToStrip. */ public static String stripper( char charToStrip, String s ) { int place = s.indexOf( charToStrip ); if ( place < 0 ) { // were no instances, so we can return the original String. return s; }
int length = s.length();
// we know result will be at least one shorter. StringBuilder sb = new StringBuilder( length-1 );
// append first part of s, before first instance of unwanted char sb.append( s.substring( 0, place ) );
// copy over tail end of string, ignoring unwanted chars. for ( int i=place+1; i<length; i++ ) { char c = s.charAt( i );
if ( c != charToStrip ) { sb.append( c ); } } // end for
return sb.toString(); }
/** * test harness * * @param args not used */ public static void main ( String[] args ) { System.out.println ( stripper( ' ', "hello world" ) ); // helloworld System.out.println ( stripper( ' ', "h e l l o w o r l d " ) ); // helloworld System.out.println ( stripper( 'q', "fine as is" ) ); // fine as is } } // end class
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Chris Uppal - 12 Nov 2005 11:10 GMT > if both the string and remove array this soln O(n2) would be bad, can > somehow the soln be made more efficient of complexity n Sure, if you are willing to spend the space.
Build a boolean array which is indexed by character values. Set the values in that array to true for the characters you want to remove. Create a StringByulder (or StringBuffer) of the same size as the input String. Iterate over the chars in the input string, for each one check the value of the array at the corresponding index, if that is false then append the char to the StringBuilder. At the end convert the StringBuilder into a String.
Possible refinements include:
After the run put the relevant slots in the array back to false, that way you can save it and re-use it later. (Saves initialising 2**16 value).
Use an int[] array instead of a boolean[] array, and do bit-twiddling to index bit-positions rather than boolean array slots. (Saves lots of space).
If the characters you are interested in come from a restricted range, then you can reduce the size of the array accordingly. (Saves space)
You could use a HashMap(<Character>, <Boolean>) instead of the array. (Saves space, but in a different way. May, or may not, save time.)
-- chris
Roedy Green - 12 Nov 2005 13:47 GMT On Sat, 12 Nov 2005 11:11:00 -0000, "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> wrote, quoted or indirectly quoted someone who said :
>Build a boolean array which is indexed by character values. or a java.util.BitSet which uses only one bit instead of 32 (or 8?) bits per slot.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Roedy Green - 12 Nov 2005 13:48 GMT On Sat, 12 Nov 2005 11:11:00 -0000, "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> wrote, quoted or indirectly quoted someone who said :
>You could use a HashMap(<Character>, <Boolean>) instead of the array. (Saves >space, but in a different way. May, or may not, save time.) The lookup in a HashMap would be much slower than a BitSet.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Thomas Schodt - 12 Nov 2005 15:32 GMT > Say I have a String input ="DSDFSDFSFSFDAAAAABBBBBBBBRRRRRRRR" > and i have a character array of the characters to be removed say {A,C} I'd use something simple like
public final class ReplaceChar { public static final void main(String[] arg) { String strInp = "DSDFSDFSFSFDAAAAABBBBBBBBRRRRRRARR"; char[] array = {'A','B','D'}; String regex = "["+new String(array)+"]"; System.out.println(strInp.replaceAll(regex,"")); } }
Roedy Green - 13 Nov 2005 03:07 GMT On Sat, 12 Nov 2005 15:32:58 +0000, Thomas Schodt <spamtrap@xenoc.demon.co.uk.invalid> wrote, quoted or indirectly quoted someone who said :
> char[] array = {'A','B','D'}; > String regex = "["+new String(array)+"]"; > you might as well collapse that to
String regex = "[ABD+]";
That is nice and short, does not leave a large number of temporary objects, and is easy to configure. Is suspect the BitSet method would be faster especially if the array were long, but there is no reason the regex method should be a dog. This solution I think is the best submitted.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Edwin Martin - 13 Nov 2005 23:57 GMT >> char[] array = {'A','B','D'}; >> String regex = "["+new String(array)+"]"; [quoted text clipped - 3 lines] > > String regex = "[ABD+]"; You meant "[ABD]+" ?
Edwin Martin
 Signature http://www.bitstorm.org/edwin/en/
Roedy Green - 14 Nov 2005 01:35 GMT On Mon, 14 Nov 2005 00:57:50 +0100, Edwin Martin <e.j.martin@chello.nl> wrote, quoted or indirectly quoted someone who said :
>>> char[] array = {'A','B','D'}; >>> String regex = "["+new String(array)+"]"; [quoted text clipped - 7 lines] > >Edwin Martin if i were simply trying to copy the original, there shoud be no + at all. Otherwise, you are correct.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
javafreak - 14 Nov 2005 02:55 GMT Guys thanks for the replies, just checking all the replies now . HashMap soln seems to be a good one, will go and try tht
Cheers
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|