Java Forum / General / October 2005
Solve this beauty
Sharp Tool - 27 Oct 2005 14:23 GMT Code:
1. String line = "123 456";
2. String firstPart = null;
3. String[] splits = line.split("\\s",2); //breaks line into two parts at white space
4. firstPart.id = splits[0];
Problem: I would like to merge line 3. and 4. into 1 statement to save space. Is this possible?
Sharp Tool
dnasmars - 27 Oct 2005 14:29 GMT
what is firstPart.id ? From where does it come ?
:)
> Code:
> [quoted text clipped - 9 lines]
>
> Sharp Tool
Sharp Tool - 28 Oct 2005 10:13 GMT
> >Sharp Tool wrote:
> > Code:
[quoted text clipped - 10 lines]
> >
> > Sharp Tool
>dnasmars wrote:
> what is firstPart.id ?
> From where does it come ?
> :)
Sorry it should be:
1. String line = "123 456";
2. String firstPart = null;
3. String[] splits = line.split("\\s",2);
4. firstPart = splits[0];
To answer the other questions: I am calling line.split() many times (>70000) although not in succession. And I would like the code to run as fast as possible.
Sharp Tool
John - 28 Oct 2005 10:32 GMT
>>>Sharp Tool wrote:
>>>Code:
[quoted text clipped - 30 lines]
>
> Sharp Tool
Oh I see. I thought you were just messing around rather than looking to speed it up.
If you are trying to make this more efficient, use NOBODY's code which is meant to be quicker.
It might be possible to speed this up further, but it would help to know where the "line" comes from. If you are reading individual lines from a file that is >70000 lines long, you could look into using a BufferedReader and an InputStreamReader.
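A minimal sketch of the BufferedReader/InputStreamReader idea, assuming the lines arrive on an InputStream encoded as UTF-8 (the thread never says where "line" comes from). The names LineFields and firstFields are made up for illustration, and the code is written for a current JDK rather than the 1.4/1.5 releases discussed here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not code from the thread.
final class LineFields {
    // Reads the stream line by line and keeps the text before the first
    // whitespace character of each line.
    static List<String> firstFields(InputStream in) {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line.split("\\s", 2)[0]);
            }
        } catch (IOException e) {
            // Wrapped so callers do not have to declare a checked exception.
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

The buffering only helps with the cost of reading the lines; the per-line split is unchanged.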
Post a larger section of code please.
John
Timbo - 27 Oct 2005 14:37 GMT
> Code:
> [quoted text clipped - 7 lines]
> I would like to merge line 3. and 4. into 1 statement to save space.
> Is this possible?
This should work:
firstPart.id = line.split("\\s",2)[0];
but you will lose the variable declaration of 'splits'.
Thomas Fritsch - 27 Oct 2005 14:57 GMT
> 1. String line = "123 456";
> 2. String firstPart = null;
[quoted text clipped - 4 lines]
> Problem:
> I would like to merge line 3. and 4. into 1 statement to save space.
and to give up readability (?)
> Is this possible?
firstPart.id = line.split("\\s",2)[0];
 Signature "Thomas:Fritsch$ops:de".replace(':','.').replace('$','@')
Malte - 27 Oct 2005 15:14 GMT
> Code:
> [quoted text clipped - 9 lines]
>
> Sharp Tool
OK, I'll bite:
First, what kind of 'space' do you want to save?
Second, I am not sure of the firstPart.id thing. firstPart is a String. I do not recall the String class or its parent (Object) having a public field called id. If you want firstPart to be an instance of the class FirstPart, I guess you might want to instantiate it before use.
Even so, it would seem (untested) that
String line = "123 456";
String firstPart = line.split("\\s", 2)[0];
should work.
Obviously, this isn't very robust code, nor does it save a lot of 'space'.
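As a rough illustration of the merged statement and of where it stops being robust, here is a small sketch. The class name MergeDemo and the extra sample inputs are assumptions added for the example; only "123 456" comes from the thread.

```java
// Hypothetical demo class, not code from the thread.
final class MergeDemo {
    // Lines 3 and 4 of the original snippet folded into one expression.
    static String firstPart(String line) {
        return line.split("\\s", 2)[0];
    }

    public static void main(String[] args) {
        System.out.println(firstPart("123 456"));   // text before the first whitespace
        System.out.println(firstPart("nospace"));   // no whitespace: the whole line comes back
        System.out.println(firstPart(" 123 456"));  // leading whitespace: an empty string comes back
    }
}
```

The last two calls show the kind of input the one-liner silently accepts, which is worth knowing before relying on it.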
John - 27 Oct 2005 16:33 GMT
>> Code:
>> [quoted text clipped - 27 lines]
>
> Obviously, this isn't very robust code, nor does it save a lot of 'space'.
We can even get rid of another line if we wanted.
<sscce filename="Keith.java">
public class Keith {
    public String id;

    public static void main(String[] args) {
        Keith firstPart = new Keith(); //this is the important line.
        firstPart.id = "123 456".split("\\s", 2)[0];
        System.out.println(firstPart.id);
    }
}
</sscce>
john
steve - 27 Oct 2005 22:01 GMT
>>> Code:
>>> [quoted text clipped - 44 lines]
>
> john
Why don't we just get rid of the whole post, and save a lot more "space"? Do any of you realize how much space this wastes on a global level, or how much electricity it wastes?
Obviously they have not learned about "compilers" & byte code yet.
Now if we can just compress the "1"'s into Zeros , that would save space.
Steve
John - 28 Oct 2005 10:14 GMT
>>>>Code:
>>>> [quoted text clipped - 54 lines]
>
> Steve
Hmm. Perhaps at some point in 3 years of Computer Science and another 3 of programming I found out how electricity, the Internet, compilers, byte code and the JVM work.
Malte made the necessary comments, so there was no need to add to them.
I just did it for fun (it's a strange concept I know).
Perhaps we should ban crosswords and Sudoku due to a waste of pencils and paper?
John
Oliver Wong - 28 Oct 2005 16:02 GMT
> Now if we can just compress the "1"'s into Zeros , that would save space.
As I recall, there was a compression algorithm that, given any bitstream, would get rid of all the zeros, because they're worthless anyway. This all by itself would typically yield a 50% compression ratio. But then, the designers realized that all you're left with is a long string of 1s. So why encode the series of 1s when you can just encode how many 1s there are in the stream? E.g., what's shorter to write? "11111111111111111111111" or "23"?
Now here's where the real genius comes in. They then take that 23, and encode it in binary, thus giving you a bitstream of 1s and 0s, and then they repeat the process of eliminating the zeros and so on. Eventually, they can store every message in 1 bit, which is always set to "1", so they never actually need to send that bit in the first place, since the destination can infer its value.
- Oliver
Roedy Green - 28 Oct 2005 20:26 GMT
> which is always set to "1", so they never
>actually need to send that bit in the first place, since the destination can
>infer its value.
Yes, but that only works for streams where the information content approaches zero, e.g. XML schemas.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
HalcyonWild - 28 Oct 2005 00:45 GMT
> 2. String firstPart = null;
> 3. String[] splits = line.split("\\s",2); //breaks line into two parts at
> white space
> 4. firstPart.id = splits[0];
Forget merging to save a line; what is firstPart.id? I mean, firstPart is a String. I just checked the API, both 1.4 and 1.5, to see if String (or Object) has a public field called id. Is your program even compiling?
NOBODY - 28 Oct 2005 00:56 GMT
> 1. String line = "123 456";
> 2. String firstPart = null;
> 3. String[] splits = line.split("\\s",2); //breaks line into two parts
> at white space
> 4. firstPart.id = splits[0];
First of all, avoid String.split() for such simple task. It uses regex underneath! Second, String doesn't have a public field 'id'. Now, given that you do not deal with non-safe input, I will propose
firstpart = (line).substring(0, line.indexOf(' '));
Benji - 28 Oct 2005 01:07 GMT
> First of all, avoid String.split() for such simple task. It uses regex
> underneath!
This only matters if it is a task that might be used very rapidly in succession. He would be the one to make that call. Also, the regex is a very lightweight object in this instance, and doesn't really matter. What matters more is readability, and making sure that bugs are found easily. The below code does not really fit that description.
> firstpart = (line).substring(0, line.indexOf(' '));
 Signature Of making better designs there is no end, and much refactoring wearies the body.
NOBODY - 28 Oct 2005 01:31 GMT
>> First of all, avoid String.split() for such simple task. It uses
>> regex underneath!
>
> This only matters if it is a task that might be used very rapidly in
> succession. He would be the one to make that call.
So, I guess you wash your car with a toothbrush, since, you know, you do it only every month or so, and since you are familiar with a toothbrush since you use it everyday, it is simpler to understand.
> Also, the regex is a very lightweight object in this instance
A millimeter is small, but a micron is small too. About a 1000 times smaller...
> What matters more is readability, and making sure that bugs
> are found easily.
You cannot make that call either. He may need speed. And the point is, Writing good code always pays off. You never know how a piece of code will scale... That's why we need P3 to run w2k (ms fluffy-I-don't-care code), but we were playing 3d doom on a 386.
-- Don't let the computer do what you wouldn't do yourself.
Benji - 28 Oct 2005 02:11 GMT
> So, I guess you wash your car with a toothbrush, since, you know, you do it
> only every month or so, and since you are familiar with a toothbrush since
> you use it everyday, it is simpler to understand.
amusing, but not very true to the point. =)
I'm going to take a wild guess and say that given the code he's trying to figure out, his application is neither high-performance, nor critical.
> A millimeter is small, but a micron is small too. About a 1000 times
> smaller...
the regex runs about 25 times slower. don't be dramatic. =)
> You cannot make that call either. He may need speed. And the point is,
I didn't say I could make the call. I said it was his. You made a valid point about needing to realize that using split is more time intensive, but saying that you shouldn't use it at all is just as bad of a design decision. If he's not running it repeatedly, or if it's a precondition for an operation that takes much longer, it will not matter at all.
> Writing good code always pays off. You never know how a piece of code will
> scale... That's why we need P3 to run w2k (ms fluffy-I-don't-care code),
> but we were playing 3d doom on a 386.
that's true, but your code was not better. what if indexOf returns -1? You will throw an exception, whereas his will not. better and faster are not equivalent. faster code is only better if what it gives up to be fast does not matter as much as the speed.
I'm not trying to bash you, only trying to temper what you said.
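To make the indexOf point concrete, here is a minimal sketch of the substring approach with the -1 case handled. The class and method names are hypothetical, and returning the whole line when no space is found is an assumption chosen to mirror what split would give back.

```java
// Hypothetical helper, not code from the thread.
final class FirstToken {
    // Returns the text before the first space, or the whole line when
    // the line contains no space at all.
    static String beforeFirstSpace(String line) {
        int space = line.indexOf(' ');
        // Without this check, substring(0, -1) throws StringIndexOutOfBoundsException.
        return space < 0 ? line : line.substring(0, space);
    }
}
```

Unlike the "\\s" regex, this only looks for a plain space, so tabs and other whitespace are not treated as separators.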
 Signature Of making better designs there is no end, and much refactoring wearies the body.
Luc The Perverse - 28 Oct 2005 02:33 GMT
>> First of all, avoid String.split() for such simple task. It uses regex
>> underneath!
[quoted text clipped - 6 lines]
>
>> firstpart = (line).substring(0, line.indexOf(' '));
The guy wanted the shortest code possible.
It is a ridiculous request, but still we humored him :)
-LTP
:)
Chris Uppal - 28 Oct 2005 12:33 GMT
> > First of all, avoid String.split() for such simple task. It uses regex
> > underneath!
>
> This only matters if it is a task that might be used very rapidly in
> succession. He would be the one to make that call. Also, the regex
> is a very lightweight object in this instance,
Typically regexps are /not/ lightweight objects, they have a lot of state and take a long time to create. (Although this particular one is small, as you say). Also the code as given will create a new regexp for each call, thus involving the complicated and expensive analysis needed to translate the regexp into a DFA.
> and doesn't really matter.
> What matters more is readability, and making sure that bugs are found
> easily.
Agreed, and that's another reason to avoid regexps. Any day's trawl through this newsgroup will show that they cause many problems.
It is very rare for regexps to be the right tool for the job. They are not powerful enough in themselves for most real-world parsing requirements (though they might be useful as /part/ of a parser). They are difficult to construct. Harder to read after the fact. Harder still to modify. I would say use regexps only with extreme reluctance; they should never be the first tool you think of because they are only rarely the most appropriate tool for the job. (Exceptions include: as one part of a parser, when external configuration is supplied (so the regexp comes from the user, not the programmer), and the few cases where a data format has been specifically designed to be picked apart using regexps.)
OTOH, I can't see the overhead of String.split() being too significant in this case. The OP has said that it's only going to be executed about 70K times in the life of the program.
-- chris
Roedy Green - 28 Oct 2005 07:56 GMT
On Thu, 27 Oct 2005 13:23:59 GMT, "Sharp Tool" <sharp.tool@bigpond.net.au> wrote, quoted or indirectly quoted someone who said :
>3. String[] splits = line.split("\\s",2); //breaks line into two parts at
>white space
[quoted text clipped - 3 lines]
>I would like to merge line 3. and 4. into 1 statement to save space.
>Is this possible?
That sort of code drives newbies crazy, but it is pretty easy to do.
Just replace "splits" in line 4 with the expression in line 3. I added () for clarity. It works just like grade 7 algebra.
firstPart.id = (line.split("\\s",2))[0];
If you do this in a loop, you are better off using the form where you precompile your Pattern.
see http://mindprod.com/jgloss/regex.html#STRING
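A minimal sketch of the precompiled form, assuming the same "\\s" separator and limit of 2 as the original snippet. The class and field names are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not code from the thread.
final class PrecompiledSplit {
    // Compiled once and reused for every call instead of being rebuilt
    // inside each String.split invocation.
    private static final Pattern WHITESPACE = Pattern.compile("\\s");

    static String firstPart(String line) {
        return WHITESPACE.split(line, 2)[0];
    }
}
```

Whether this is measurably faster for roughly 70,000 calls is something to measure rather than assume.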
Note that this code will blow up with an index exception if the line is not of proper format.
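A hedged sketch of guarding against a malformed line. As far as I can tell from the String.split documentation, with a limit of 2 the returned array always has at least one element, so it is index 1 (the second part) that can be missing rather than index 0. The class name and the choice to return null are assumptions for the example.

```java
// Hypothetical helper, not code from the thread.
final class SafeSplit {
    // Returns the text after the first whitespace character, or null
    // when the line has no whitespace separator.
    static String secondPart(String line) {
        String[] splits = line.split("\\s", 2);
        // Check the length before indexing to avoid ArrayIndexOutOfBoundsException.
        return splits.length > 1 ? splits[1] : null;
    }
}
```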
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.