Java Forum / General / August 2006
How to display a "double" in all its precision???
CS Imam - 07 Aug 2006 03:59 GMT
Hello,
Here is a code fragment that is very simple... but I can't get it to work!
public static void main(String[] args) {
    for (int i = 1; i <= 30; i++) {
        double x = Math.pow(2, i);
        x = 1 + 1 / x;
        System.out.printf("For i = %d: %.40f%n", i, x);
        System.out.println(Long.toBinaryString(Double.doubleToLongBits(x)));
        System.out.println();
    }
}
All this code is supposed to do is print out the fractions 1+1/2, 1+1/4, 1+1/8, etc. When one prints out the raw bits (see doubleToLongBits), the code is clearly working.
But on the regular printf("For i...etc"), at i=17 and above, the numbers get frozen at 16 digits displayed after the decimal point (the precision). But it's not really the precision, because the bits ARE changing correctly. What gives???
Help!
- not a stunningly gorgeous woman who would marry you if you solve this problem
EJP - 07 Aug 2006 04:21 GMT
> But on the regular printf("For i...etc"), at i=17 and above, the
> numbers get frozen at 16 digits displayed after the decimal point (the
> precision). But it's not really the precision, because the bits ARE
> changing correctly. What gives???

There *is no more* precision. A double has 53 bits of binary precision, which is about 16 decimal digits. The 40 decimal digits you're expecting would take 133 bits.
Maybe you're getting confused between 40 bits of binary precision and 40 decimal digits?
CS Imam - 07 Aug 2006 04:28 GMT
Thank you for your reply, but here is the output to clarify what I am seeing. As you will see, the binary representation shows that the double is more than capable of representing the numbers in question (1/2^i where i goes from 1 to 30). As you said, the double gives 52 bits of precision. However, when displaying the number in decimal, Java appears to be unable to display it correctly - why does it properly display in base 2, but not in base 10?
I hope I was clearer this time - please pardon me if not! But thanks for your help...
For i = 1: 1.5000000000000000000000000000000000000000 11111111111000000000000000000000000000000000000000000000000000
For i = 2: 1.2500000000000000000000000000000000000000 11111111110100000000000000000000000000000000000000000000000000
For i = 3: 1.1250000000000000000000000000000000000000 11111111110010000000000000000000000000000000000000000000000000
For i = 4: 1.0625000000000000000000000000000000000000 11111111110001000000000000000000000000000000000000000000000000
For i = 5: 1.0312500000000000000000000000000000000000 11111111110000100000000000000000000000000000000000000000000000
For i = 6: 1.0156250000000000000000000000000000000000 11111111110000010000000000000000000000000000000000000000000000
For i = 7: 1.0078125000000000000000000000000000000000 11111111110000001000000000000000000000000000000000000000000000
For i = 8: 1.0039062500000000000000000000000000000000 11111111110000000100000000000000000000000000000000000000000000
For i = 9: 1.0019531250000000000000000000000000000000 11111111110000000010000000000000000000000000000000000000000000
For i = 10: 1.0009765625000000000000000000000000000000 11111111110000000001000000000000000000000000000000000000000000
For i = 11: 1.0004882812500000000000000000000000000000 11111111110000000000100000000000000000000000000000000000000000
For i = 12: 1.0002441406250000000000000000000000000000 11111111110000000000010000000000000000000000000000000000000000
For i = 13: 1.0001220703125000000000000000000000000000 11111111110000000000001000000000000000000000000000000000000000
For i = 14: 1.0000610351562500000000000000000000000000 11111111110000000000000100000000000000000000000000000000000000
For i = 15: 1.0000305175781250000000000000000000000000 11111111110000000000000010000000000000000000000000000000000000
For i = 16: 1.0000152587890625000000000000000000000000 11111111110000000000000001000000000000000000000000000000000000
For i = 17: 1.0000076293945312000000000000000000000000 11111111110000000000000000100000000000000000000000000000000000
For i = 18: 1.0000038146972656000000000000000000000000 11111111110000000000000000010000000000000000000000000000000000
For i = 19: 1.0000019073486328000000000000000000000000 11111111110000000000000000001000000000000000000000000000000000
For i = 20: 1.0000009536743164000000000000000000000000 11111111110000000000000000000100000000000000000000000000000000
For i = 21: 1.0000004768371582000000000000000000000000 11111111110000000000000000000010000000000000000000000000000000
For i = 22: 1.0000002384185790000000000000000000000000 11111111110000000000000000000001000000000000000000000000000000
For i = 23: 1.0000001192092896000000000000000000000000 11111111110000000000000000000000100000000000000000000000000000
For i = 24: 1.0000000596046448000000000000000000000000 11111111110000000000000000000000010000000000000000000000000000
For i = 25: 1.0000000298023224000000000000000000000000 11111111110000000000000000000000001000000000000000000000000000
For i = 26: 1.0000000149011612000000000000000000000000 11111111110000000000000000000000000100000000000000000000000000
For i = 27: 1.0000000074505806000000000000000000000000 11111111110000000000000000000000000010000000000000000000000000
For i = 28: 1.0000000037252903000000000000000000000000 11111111110000000000000000000000000001000000000000000000000000
For i = 29: 1.0000000018626451000000000000000000000000 11111111110000000000000000000000000000100000000000000000000000
For i = 30: 1.0000000009313226000000000000000000000000 11111111110000000000000000000000000000010000000000000000000000
-----------------
> > But on the regular printf("For i...etc"), at i=17 and above, the
> > numbers get frozen at 16 digits displayed after the decimal point (the
[quoted text clipped - 7 lines]
> Maybe you're getting confused between 40 bits of binary precision and 40
> decimal digits?

EJP - 07 Aug 2006 05:41 GMT
> Thank you for your reply, but here is the output to clarify what I am
> seeing. As you will see, the binary representation shows that the
[quoted text clipped - 103 lines]
>>>precision). But it's not really the precision, because the bits ARE
>>>changing correctly. What gives???

No, 16 decimal places really *is* the precision. The *binary* bits change beyond 16 because the radix is binary, not decimal. The 16th decimal digit expresses a lot more precision than the 16th binary digit. You need to understand that. In fact I don't understand what you are expecting to see. The last decimal number printed is 1.0000000009313226 and the last binary number printed converts precisely back to that. Nothing is being lost. You can't get 40 decimal digits out of 53 bits, you can only get 16.
Patricia Shanahan - 07 Aug 2006 06:06 GMT
>> Thank you for your reply, but here is the output to clarify what I am
>> seeing. As you will see, the binary representation shows that the
[quoted text clipped - 112 lines]
> Nothing is being lost. You can't get 40 decimal digits out of 53 bits,
> you can only get 16.

For any number that can be expressed as a terminating binary fraction, including any number that is representable in Java double, there is a unique decimal fraction that is EXACTLY equal to it, not just a close-enough approximation. I believe that is the answer the OP is looking for.
(In another message, I suggested getting it via BigDecimal).
Patricia
CS Imam - 07 Aug 2006 08:16 GMT
Thanks again for replying, you and Patricia both. The BigDecimal DID work, and that is great to solve the problem at hand.
But I'm really interested in understanding my... misunderstanding I suppose.
I am not looking for 40 places of decimal precision; I only used "%.40f" as an overkill to see "all the numbers". I *am* aware that doubles are supposed to give only about 15 digits of decimal precision. However, what I find puzzling is that in binary, we are supposed to get 52 (not 53 as far as I know) bits of precision. So here is my misunderstanding: I see the bits changing in binary. And yet when they are converted into decimal through the "prints" (and Patricia pointed out that the problem is really in "toString"), the decimal equivalent is NOT precise. And this does not make sense to me. If the underlying raw number in binary IS precise, then the converted decimal number should be precise as well, right?
When you wrote:
> The last decimal number printed is 1.0000000009313226
> and the last binary number printed converts precisely back to that.
> Nothing is being lost. You can't get 40 decimal digits out of 53 bits,
> you can only get 16.

Actually as far as I know, something IS being lost. The last binary number, if you convert it to decimal, should be:
1.000000000931322574615478515625
In binary, the underlying bits are as follows:
11111111110000000000000000000000000000010000000000000000000000
So again, if the underlying bits are precisely expressing some number within the 52 bits of accuracy, why does converting it to a decimal representation fail?
I really apologize if I am not seeing something that you are explaining!
thanks, and sorry again.
> > Thank you for your reply, but here is the output to clarify what I am
> > seeing. As you will see, the binary representation shows that the
[quoted text clipped - 112 lines]
> Nothing is being lost. You can't get 40 decimal digits out of 53 bits,
> you can only get 16.

Chris Uppal - 07 Aug 2006 10:14 GMT
> However, what I find puzzling is that in binary, we are
> supposed to get 52 (not 53 as far as I know) bits of precision. So here
[quoted text clipped - 4 lines]
> underlying raw number in binary IS precise, then the converted decimal
> number should be precise as well, right?

I think what you may be missing is that there is a /range/ of precise decimal numbers which would all have the same representation as a double. So, although any given double converts exactly into precisely one arbitrary-precision decimal number, that number is not the only one which the double value may be "trying" to represent.

The string representation has to /choose/ one value from the infinite set of arbitrary-precision decimal numbers which the double value might be intended to represent. One option would be to choose the unique element which is exactly equal to the double, but that's not the only possible design. In fact (and defensibly, IMO, although it would be nice to have a choice) the element which is chosen is the one with the fewest digits -- otherwise, for instance, 0.1D would print out as:

0.1000000000000000055511151231257827021181583404541015625

which is certainly precise, but is probably not what most programmers (or users) would wish to see.
-- chris
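The two designs Chris describes can be put side by side. The following is a minimal sketch, assuming the standard `Double.toString` and `BigDecimal(double)` behaviour; the class and method names are made up for illustration and are not from the thread.

```java
import java.math.BigDecimal;

public class ShortestVersusExact {
    // The default rendering: a short decimal string that still identifies the double.
    static String shortest(double d) {
        return Double.toString(d);
    }

    // The exact decimal expansion of the binary value stored in the double.
    static String exact(double d) {
        return new BigDecimal(d).toString();
    }

    public static void main(String[] args) {
        System.out.println(shortest(0.1));
        System.out.println(exact(0.1));
    }
}
```

Both strings name the same double; they differ only in which member of the range of decimal values is chosen for display.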
hiwa - 07 Aug 2006 11:01 GMT
This is the funniest post of this summer. The OP reminds me of some type of elderly people who stubbornly believe in an impossible. Thanks for a big laugh.
Seriously: Learn and study IEEE 754 bit-array-representation of 64 bit FPN.
Patricia Shanahan - 07 Aug 2006 15:02 GMT
> This is the funniest post of this summer.
> The OP reminds me of some type of elderly people who stubbornly believe
> in an impossible.

Section 3.2.2 Double of ANSI/IEEE Std 754-1985 begins "A 64-bit double format number X is divided as shown in Fig. 2. The value v of X is inferred from its constituent fields thus:" followed by a series of cases.
The relevant case is "(3) If 0 < e < 2047, then ..." followed by a formula for the value of a finite, normalized, non-zero double number. Using "^" for exponentiation, and "*" for multiplication, and "." for the centered dot, it is equivalent to:
(-1)^s * 2^(e-1023) * (1 "." f)

where s is the sign bit, e is the exponent, and f is the fraction.
I interpreted the base article as requesting a print out, in decimal, of that value. My BigDecimal suggestion, which does exactly that, worked for the OP, so that seems to be the correct interpretation.
Why do you consider this to be impossible? Or do you disagree with my interpretation?
> Thanks for a big laugh.

Although I find IEEE 754 floating point arithmetic both interesting and useful, I'm completely missing its humor. I know explanations can sometimes kill a joke, but perhaps you could explain what is so funny?

> Seriously:
> Learn and study IEEE 754 bit-array-representation of 64 bit FPN.

I've read the standard from cover to cover a couple of times, and reread individual sections far more often. I've looked at many, many doubles as bit patterns. To me, the OP's request seemed quite reasonable. Indeed, it is something I've needed for myself when working on understanding some of the subtleties of rounding, so I already had a solution. What am I missing?
Patricia
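The case (3) formula quoted from the standard can be evaluated directly from the bit fields. This is only a sketch under stated assumptions: the class and method names are hypothetical, only finite normalized doubles (0 < e < 2047) are handled, and `BigDecimal` is used so that no rounding sneaks in.

```java
import java.math.BigDecimal;

public class DecodeDouble {
    // Rebuilds the exact value (-1)^s * 2^(e-1023) * (1.f) of a normalized double.
    static BigDecimal decode(double d) {
        long bits = Double.doubleToLongBits(d);
        int s = (int) (bits >>> 63);               // sign bit
        int e = (int) ((bits >>> 52) & 0x7FF);     // 11-bit biased exponent
        long f = bits & 0xFFFFFFFFFFFFFL;          // 52-bit fraction
        if (e == 0 || e == 2047) {
            throw new IllegalArgumentException("not a finite normalized double");
        }
        // 1.f written as an integer: implicit leading 1 followed by the 52 fraction bits.
        long significand = (1L << 52) | f;
        // Shifting the binary point 52 places gives significand * 2^(e - 1023 - 52).
        int power = e - 1023 - 52;
        BigDecimal two = BigDecimal.valueOf(2);
        BigDecimal value = BigDecimal.valueOf(significand);
        if (power >= 0) {
            value = value.multiply(two.pow(power));
        } else {
            // Dividing by a power of two always terminates in decimal, so this is exact.
            value = value.divide(two.pow(-power));
        }
        return s == 0 ? value : value.negate();
    }

    public static void main(String[] args) {
        double x = 1 + 1 / Math.pow(2, 30);
        System.out.println(decode(x).toPlainString());
        System.out.println(decode(x).compareTo(new BigDecimal(x)) == 0);
    }
}
```

The comparison in `main` checks the hand-decoded value against `new BigDecimal(x)`, which is the BigDecimal route suggested in the thread.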
CS Imam - 07 Aug 2006 18:13 GMT
Patricia,
Just wanted to thank you for your information. You turned out to be right on the money. Those posters who are insisting that there is no loss of precision, and that it is impossible to do better, and that this is a funny post... you need to go back and read.
Based on what Patricia wrote, I did some searching on the net, and found that this is indeed a difficult problem: what to print in decimal format given a binary number. And it also turns out that it was addressed in a classic paper by Guy Steele and Jon White. Here is a link to the paper:
"How to Print Floating Point Numbers Accurately"
http://portal.acm.org/ft_gateway.cfm?id=989431&type=pdf&coll=portal&dl=ACM&CFID=15151515&CFTOKEN=6184618
You may read section 2 "Properties of Radix Conversion" to understand the issues involved. If you don't have time, then read Patricia's original answer to me. That was the succinct and correct answer, not repeatedly insisting that nothing was being lost. Indeed, LOTS is being lost... but on purpose it turns out.
- my hat is off to Ms. Shanahan
> > This is the funniest post of this summer.
> > The OP reminds me of some type of elderly people who stubbornly believe
[quoted text clipped - 36 lines]
> > Patricia

The_Sage - 09 Aug 2006 03:36 GMT
>Reply to article by: "CS Imam" <csimam@gmail.com>
>Date written: 7 Aug 2006 10:13:30 -0700
>MsgID:<1154970810.637741.315150@n13g2000cwa.googlegroups.com>

>Just wanted to thank you for your information. You turned out to be
>right on the money. Those posters who are insisting that there is no
>loss of precision, and that it is impossible to do better, and that
>this is a funny post... you need to go back and read.

Isn't it obvious that your "lack of precision" is due to rounding off and quantizing errors -- as would be expected?

Not all floating point decimal numbers can be exactly represented in digit-restricted binary, therefore some truncation or rounding will occur during conversion. This reduces precision.
Remember the math lecture that if all your calculations are done with, say 10-digits, your final answer will not be accurate to 10-digits due to rounding errors? If you want 10-digit accuracy, you need more than 10-digits to work with. The more operations you perform on a digit-restricted number, the less accurate it becomes due to rounding off. This reduces precision and this is the reason for the existence of "guard digits", ie -- 10-digit calculators use 13-digit calculations internally in order to maintain 10-digit accuracy.
Most application programmers do not list the internal precision of their math routines, -- assuming that they are even aware of such math issues. Usually, if you are really serious about your math, you will test the math routines for accuracy by performing repetitive loops.
See http://support.microsoft.com/default.aspx?scid=kb;EN-US;q42980
See http://docs.sun.com/source/806-3568/ncg_goldberg.html
The Sage
============================================================= http://members.cox.net/the.sage/index.htm
"All those painted screens erected by man to shut out reality -- history, religion, duty, social position -- all were illusions, mere opium fantasies" John Fowles, The French Lieutenant's Woman =============================================================
Patricia Shanahan - 09 Aug 2006 06:48 GMT
>> Reply to article by: "CS Imam" <csimam@gmail.com>
>> Date written: 7 Aug 2006 10:13:30 -0700
[quoted text clipped - 7 lines]
> Isn't it obvious that your "lack of precision" is due to rounding off and
> quantizing errors -- as would be expected?

No, in this case it isn't at all obvious. Take another look at the program in the base message of the thread.
The value being printed, x, is calculated as 1+1/Math.pow(2,i) where i ranges from one to 30.
For i in the range one through thirty, each of Math.pow(2,i), 1/Math.pow(2,i) and 1+1/Math.pow(2,i) has a mathematical result that is exactly representable as a double, and that is required to be the result according to the Math.pow documentation and the JLS descriptions of divide and add.
The OP knew that the calculations were exact, and that the final double held the expected result.
The issue was entirely one of output formatting, and using BigDecimal it is possible to get the decimal representation of the exact result.
Patricia
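The exactness claim can be checked mechanically. The sketch below (class and method names are illustrative, not from the thread) recomputes 1 + 1/2^i in exact `BigDecimal` arithmetic and compares it with the exact value held in the double for each i from 1 to 30.

```java
import java.math.BigDecimal;

public class ExactnessCheck {
    // True if the double computed as in the original program holds exactly 1 + 1/2^i.
    static boolean isExact(int i) {
        double x = 1 + 1 / Math.pow(2, i);
        BigDecimal powerOfTwo = BigDecimal.valueOf(2).pow(i);
        // 1/2^i terminates in decimal, so this division is exact.
        BigDecimal expected = BigDecimal.ONE.add(BigDecimal.ONE.divide(powerOfTwo));
        return new BigDecimal(x).compareTo(expected) == 0;
    }

    // True if every value printed by the original loop is exact.
    static boolean allExact() {
        for (int i = 1; i <= 30; i++) {
            if (!isExact(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("all 30 values exact: " + allExact());
    }
}
```

If the check passes, no rounding occurred in the computation, so any missing digits in the output come from formatting alone.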
Chris Smith - 09 Aug 2006 20:23 GMT
> That was the succinct and correct answer, not
> repeatedly insisting that nothing was being lost. Indeed, LOTS is being
> lost... but on purpose it turns out.

I'll point out that while you and Patricia are right, the other responses you got aren't as dumb as you seem to think. Specifically, no information at all was lost in that display under the following two assumptions:
(a) you understand a floating point value as representing a range of possible mathematical values, as Chris Uppal pointed out; AND
(b) you know the original precision of the binary floating point number.
Under those assumptions, which are quite reasonable for most uses, you got back a correct answer with no loss of information versus the original. However, if you don't assume (a), then the answer is incorrect; and if you don't assume (b), then information was lost.
Hope that clarifies,
 Signature Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation
Patricia Shanahan - 10 Aug 2006 01:01 GMT
>> That was the succinct and correct answer, not
>> repeatedly insisting that nothing was being lost. Indeed, LOTS is being
[quoted text clipped - 7 lines]
> (a) you understand a floating point value as representing a range of
> possible mathematical values, as Chris Uppal pointed out; AND

There are three problems with regarding a floating point number as representing a range of possible mathematical values rather than as corresponding to a unique real:
1. It conflicts with both the JLS and ANSI/IEEE Std 754-1985. Each gives a formula for calculating the real number value of a floating point number, based on the values of its bit fields. The formulas differ, but give the same results.
2. It would make describing floating point operations much harder. Every statement of the form "In the remaining cases, where neither an infinity, nor a zero, nor NaN is involved, and the operands have the same sign or have different magnitudes, the exact mathematical sum is computed." would need to be replaced by a more complicated discussion in terms of the ranges of the two floating point numbers.
3. There are different rounding ranges for different purposes. An add is allowed at most half a ulp of rounding error, and must round a result that lies halfway between two representable numbers to the even one. Math.sin is allowed one ulp of rounding error. Which range does a double x represent? Only the add results that would round to it? Or does x's range include sine(y) if Math.sin(y)==x?
I find it simpler to go with the specs, and think of each floating point number as having a unique value, surrounded by a range of real numbers that would be rounded to it under the arithmetic rounding rules, and broader ranges that could be rounded to it under some of the more relaxed function evaluation rules.
> (b) you know the original precision of the binary floating point number.
>
[quoted text clipped - 4 lines]
>
> Hope that clarifies,

Certainly I find the normal Java Double.toString result very practical for most, but not all, purposes. Printing the shortest decimal number that Double.valueOf(String) would round to the double is a reasonable default.
Patricia
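The round-trip property behind that default is easy to test. A minimal sketch, with hypothetical class and method names, that converts each value from the original program to a string and back:

```java
public class RoundTrip {
    // True if the default string form converts back to exactly the same double.
    static boolean roundTrips(double x) {
        String s = Double.toString(x);
        return Double.valueOf(s).doubleValue() == x;
    }

    // Checks the 30 values printed by the original program.
    static boolean allRoundTrip() {
        for (int i = 1; i <= 30; i++) {
            if (!roundTrips(1 + 1 / Math.pow(2, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("all values round-trip: " + allRoundTrip());
    }
}
```

The short string is therefore enough to recover the double, even though it is not the double's exact decimal expansion.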
Chris Smith - 10 Aug 2006 22:23 GMT
Patricia,
I believe that "conflicts" is too strong a word for the relationship between a mental model of floating point numbers as ranges, and the JLS and IEEE specs. A floating point value can have both a range of numbers that it best represents, and also an exact mathematical value. It is more useful to use the exact mathematical value for some purposes, and the range for others.
I do suspect, though, that there is too much emphasis here on the exact mathematical value of a floating point number. For most purposes, this exact value is somewhat arbitrary from the perspective of the programmer; it may or may not be precisely specified by the operations (the "within one ulp" operations cause it to become unspecified), and even when it is specified, it is still often not particularly relevant to the intended operation. For most purposes, the most meaningful thing that can be said about the exact value of the floating point number is that it approximates the correct answer to some degree of accuracy that depends on context. The same can be said of any other number that rounds to that floating point value, and there's not necessarily any good reason to choose one over another except that it happens to be representable.
The ranges of values that are best represented by a given float are not accuracy ranges and have nothing to do with the degree of accuracy of the approximation, so the error in certain calculations is not relevant. An operation can lack accuracy all it wants, and since floating point numbers have no concept of accuracy, this would have to be tracked elsewhere, in separate variables. All it means is that there's generally no reason to believe that 0.100000001490116119384765625 is really a better answer than 0.1 to that question. They are both within the range of numbers that would be represented by a given float.
 Signature Chris Smith - Lead Software Developer / Technical Trainer MindIQ Corporation
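The two decimal values in that last example both map to the same float. The sketch below assumes only standard `Float` and `BigDecimal` behaviour; the class and method names are made up for illustration.

```java
import java.math.BigDecimal;

public class FloatTenth {
    // Exact decimal value of a float (widening float to double loses nothing).
    static BigDecimal exact(float f) {
        return new BigDecimal((double) f);
    }

    // True if the decimal literal rounds to the given float when parsed.
    static boolean roundsTo(String decimal, float f) {
        return Float.parseFloat(decimal) == f;
    }

    public static void main(String[] args) {
        System.out.println(Float.toString(0.1f));
        System.out.println(exact(0.1f).toPlainString());
        System.out.println(roundsTo("0.1", 0.1f));
        System.out.println(roundsTo("0.100000001490116119384765625", 0.1f));
    }
}
```

Both strings parse to the same bit pattern, which is the sense in which either can be called "the" value on display.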
jmcgill - 08 Aug 2006 19:14 GMT
> This is the funniest post of this summer.
> The OP reminds me of some type of elderly people who stubbornly believe
[quoted text clipped - 3 lines]
> Seriously:
> Learn and study IEEE 754 bit-array-representation of 64 bit FPN.

If you're starting from scratch, start by truly understanding unsigned and signed char, then unsigned and signed two's complement, then single precision floating point, and then, with a full comprehension of that, the full spec won't be that hard to understand.
Are there CS programs out there that don't include a computer organization class where this stuff gets drilled into your brain?
blmblm@myrealbox.com - 09 Aug 2006 08:16 GMT
>> This is the funniest post of this summer.
>> The OP reminds me of some type of elderly people who stubbornly believe
[quoted text clipped - 11 lines]
>Are there CS programs out there that don't include a computer
>organization class where this stuff gets drilled into your brain?

It's probably presented somewhere in most CS programs, but drilled into the students' brains -- hm, I'm going to guess that not so many of them do that. The ACM's most recent set of curriculum guidelines (http://acm.org/education/curric_vols/cc2001.pdf) calls for spending about a week's worth of lecture time on bit-level representations of various kinds of data, including integers and floating point. You can only get across so much in a week.
And if you consider the general population of people trying to write code, and not just those who are products of a formal CS program somewhere .... If most programmers understood how floating point works, would there be so many questions along the lines of "how come when I divide 1.0 by 10 I don't get exactly one tenth?" ?
Not a good state of affairs, I agree.
 Signature B. L. Massingill ObDisclaimer: I don't speak for my employers; they return the favor.
Chris Uppal - 09 Aug 2006 10:41 GMT
> Are there CS programs out there that don't include a computer
> organization class where this stuff gets drilled into your brain?

I would imagine there are lots.

And I think that's defensible: a course could validly limit its coverage of floating-point to "don't use floating point (unless you know what you are doing)", with an optional course component which covered not only floating-point representation issues, but also issues of numerical stability and the like. Few programmers would need the optional component, I would think -- it would be of interest primarily to scientists and masochists.
-- chris
Patricia Shanahan - 09 Aug 2006 16:03 GMT
>> Are there CS programs out there that don't include a computer
>> organization class where this stuff gets drilled into your brain?
[quoted text clipped - 7 lines]
> and the like. Few programmers would need the optional component, I would
> think -- it would be of interest primarily to scientists and masochists.

I think some of the confusion in this thread may be a result of this strategy. Programmers seem to know floating point rounding error exists, without being able to recognize exact calculations, or maybe even without realizing that some floating point calculations do have exact results.
Patricia
Patricia Shanahan - 07 Aug 2006 15:09 GMT
...
> I am not looking for 40 places of decimal precision; I only used
> "%.40f" as an overkill to see "all the numbers". I *am* aware that
> doubles are supposed to give only 15 digits of decimal precision
> approximately. However, what I find puzzling is that in binary, we are
> supposed to get 52 (not 53 as far as I know) bits of precision.

Chris has already responded to your main point.
I just want to clarify the reason for 53 bits of precision, rather than 52.
It is a consequence of floating point normalization. Even in non-binary systems, there are advantages to avoiding, wherever possible, leading zero digits in the mantissa. A "normalized" floating point format is one in which the most significant digit of the mantissa is never zero.
For binary, there is an additional advantage. We know the leading digit of the mantissa of a normalized float is a binary digit, and is not zero. There is no point spending a bit in a dense format on a binary digit that must be one, so it does not appear. Normalization buys us an extra bit of precision.
Suppose you have a normalized double with 52 bit fraction f. The full mantissa is 1.f, a 1 before the binary point, followed by the 52 bit fraction after the binary point.
Patricia
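The hidden bit can be seen by splitting a double into its stored fields and by testing which integers survive a conversion to double. This is a sketch with hypothetical names; note that `Long.toBinaryString` drops leading zeros, which is why the raw output earlier in the thread shows 62 digits rather than 64.

```java
public class DoubleFields {
    // Returns "sign exponent fraction" as 1 + 11 + 52 binary digits.
    static String fields(double d) {
        String raw = Long.toBinaryString(Double.doubleToLongBits(d));
        StringBuilder padded = new StringBuilder();
        // Restore the leading zeros that Long.toBinaryString omits.
        for (int i = raw.length(); i < 64; i++) {
            padded.append('0');
        }
        padded.append(raw);
        return padded.substring(0, 1) + " " + padded.substring(1, 12) + " " + padded.substring(12);
    }

    // True if the integer n is exactly representable as a double.
    static boolean fitsExactly(long n) {
        return (long) (double) n == n;
    }

    public static void main(String[] args) {
        System.out.println(fields(1.5));
        // 2^53 - 1 needs 53 significant bits and still fits: 52 stored plus the implicit 1.
        System.out.println(fitsExactly((1L << 53) - 1));
        // 2^53 + 1 needs 54 significant bits and does not fit.
        System.out.println(fitsExactly((1L << 53) + 1));
    }
}
```

Only 52 fraction bits are stored, yet every 53-bit integer converts exactly, which is the extra bit that normalization buys.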
jmcgill - 07 Aug 2006 17:50 GMT
> There *is no more* precision. A double has 53 bits of binary precision
> which is about 16 decimal digits.

I've looked for a formula before to get this kind of constraint. That is, to answer the question, "how many decimal digits of precision is N bits of binary precision for a given floating point model?"
Maybe this is a question for a discrete math forum.
Patricia Shanahan - 07 Aug 2006 18:56 GMT
>> There *is no more* precision. A double has 53 bits of binary precision
>> which is about 16 decimal digits.
[quoted text clipped - 4 lines]
>
> Maybe this is a question for a discrete math forum.

The superficial, rough answer is that N bits have 2^N possible values (using "^" for exponentiation). M decimal digits have 10^M possible values.
If 10^M = 2^N then M = log10(2^N) = N*log10(2)
So the rough answer is to multiply N by log10(2), about 0.301.
53*log10(2) is about 15.954, almost 16.
Once you get into actual arithmetic, everything gets more complicated.
Patricia
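The rough rule M = N*log10(2) is a one-liner in code. A small sketch, with made-up class and method names, that also runs the rule in reverse to estimate how many bits a given number of decimal digits would need:

```java
public class DigitsFromBits {
    // Rough number of decimal digits carried by the given number of bits.
    static double decimalDigits(int bits) {
        return bits * Math.log10(2);
    }

    // Rough number of bits needed to carry the given number of decimal digits.
    static int bitsNeeded(int digits) {
        return (int) Math.ceil(digits / Math.log10(2));
    }

    public static void main(String[] args) {
        System.out.println(decimalDigits(53));  // double significand
        System.out.println(decimalDigits(24));  // float significand
        System.out.println(bitsNeeded(40));     // bits for 40 decimal digits
    }
}
```

This is only the counting argument from the post above; as noted there, actual arithmetic is more complicated.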
jmcgill - 07 Aug 2006 20:24 GMT
> Once you get into actual arithmetic, everything gets more complicated.

Is it at least correct to claim that 53 bits of binary precision guarantees no less than 15 decimal digits? Even this, no doubt, depends on the exponent.
Patricia Shanahan - 07 Aug 2006 20:35 GMT
>> Once you get into actual arithmetic, everything gets more complicated.
>
> Is it at least correct to claim that 53 bits of binary precision
> guarantees no less than 15 decimal digits? Even this, no doubt,
> depends on the exponent.

I'm not sure I see how the exponent affects it that much, as long as you are thinking of significant digits. Of course, if you are thinking of digits after the decimal point, even a 20 digit decimal floating point system does not guarantee 15 decimal digits.
Patricia
jmcgill - 07 Aug 2006 20:44 GMT
>>> Once you get into actual arithmetic, everything gets more complicated.
>>
[quoted text clipped - 4 lines]
> I'm not sure I see how the exponent affects it that much, as long as you
> are thinking of significant digits.

I suspected that larger exponents lead to more granularity in the ranges or something like that.
Also, when I posted to the thread I somehow thought I was posting on a C group, not java. I realize java programmers do not generally deal with the bitwise evaluation of data.
I'm wondering all this because in other disciplines, "error bounds" is always such an early, and often repeated, focus. Yet I had never seen nor been asked for error bounds in IEEE numeric representations.
Patricia Shanahan - 07 Aug 2006 21:16 GMT
>>>> Once you get into actual arithmetic, everything gets more complicated.
>>>
[quoted text clipped - 7 lines]
> I suspected that larger exponents lead to more granularity in the ranges
> or something like that.

I think you need to distinguish more between absolute and relative effects.
For example, the absolute difference x-y between two consecutive representable numbers, x and y, is a strictly increasing function of the exponent. The relative difference (x-y)/x varies within a given value of the exponent, but does not increase with exponent.
> Also, when I posted to the thread I somehow thought I was posting on a C
> group, not java. I realize java programmers do not generally deal with
> the bitwise evaluation of data.

I'm neither a Java programmer nor a C programmer. I'm a programmer who happens to be using Java right now. I may ignore some details when working in high level languages, but that does not mean I understand them any less than when I'm working in assembly language.

> I'm wondering all this because in other disciplines, "error bounds" is
> always such an early, and often repeated, focus. Yet I had never seen
> nor been asked for error bounds in IEEE numeric representations.

There is a required accuracy for all the basic operations, specified in the IEEE 754 standard, although some implementations confuse matters by keeping intermediate results with more accuracy. That tends to reduce the need for discussion.
A good library specification should discuss error bounds for those functions whose implementation is allowed some slack. See, for example, the Java API documentation for sin:
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Math.html#sin(double)
"The computed result must be within 1 ulp of the exact result."
("ulp" is short for "unit in the last place", a difference of one in the least significant bit of the fraction. See the top of the referenced page for a more detailed explanation.)
Beyond the basics, the subject rapidly gets very complex, see "numerical analysis" in any good technical bookstore or library. Floating point application accuracy is a difficult, but intensely studied, subject.
Patricia
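The absolute versus relative distinction, and the size of an ulp, can be inspected with `Math.ulp`. A minimal sketch with hypothetical helper names:

```java
public class UlpSpacing {
    // Absolute gap between x and the next larger representable double.
    static double absoluteGap(double x) {
        return Math.ulp(x);
    }

    // The same gap relative to the magnitude of x.
    static double relativeGap(double x) {
        return Math.ulp(x) / x;
    }

    public static void main(String[] args) {
        double[] samples = {1.0, 1.5, 1024.0, 1.0e10};
        for (double x : samples) {
            System.out.println(x + ": absolute " + absoluteGap(x) + ", relative " + relativeGap(x));
        }
    }
}
```

The absolute gap grows with the exponent (2^-52 at 1.0, 2^-42 at 1024.0), while the relative gap stays at or below 2^-52 and varies only within a single exponent.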
Patricia Shanahan - 07 Aug 2006 05:22 GMT
> Hello,
>
[quoted text clipped - 22 lines]
> precision). But it's not really the precision, because the bits ARE
> changing correctly. What gives???

Double.toString, used implicitly in conversion of x to a String, produces the shortest string that, when converted back to double, will produce the original number.
BigDecimal is the easiest way I know to get all the digits:
System.out.println(new BigDecimal(x));
Patricia
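Applied to the original loop, the BigDecimal suggestion might look like the sketch below. The class and method names are illustrative; the only change from the original program is that each value is printed through `new BigDecimal(x)` instead of `%.40f`.

```java
import java.math.BigDecimal;

public class ExactPrint {
    // Exact decimal expansion of the value held in the double.
    static String exactDecimal(double x) {
        return new BigDecimal(x).toPlainString();
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 30; i++) {
            double x = 1 + 1 / Math.pow(2, i);
            // Print every digit of the stored value rather than the shortest form.
            System.out.println("For i = " + i + ": " + exactDecimal(x));
        }
    }
}
```

For i = 30 this yields the digits of 1.000000000931322574615478515625, the value the original poster expected to see.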