Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / General / June 2006

Tip: Looking for answers? Try searching our database.

Arithmetic with longs

Thread view: 
Twisted - 13 Jun 2006 05:03 GMT
If two unsigned longs are summed, is there an easy and efficient way to
determine if there was an overflow? (An overflow would mean the actual
sum is 2^64 greater than what the result seemed to be.)
Twisted - 13 Jun 2006 05:09 GMT
> If two unsigned longs are summed, is there an easy and efficient way to
> determine if there was an overflow? (An overflow would mean the actual
> sum is 2^64 greater than what the result seemed to be.)

Hrm, Java doesn't seem to even have an unsigned long type. Looks like
I'll have to use positive signed longs up to 2^63-1 and check if the
sum is mysteriously negative...
Twisted - 13 Jun 2006 05:22 GMT
> > If two unsigned longs are summed, is there an easy and efficient way to
> > determine if there was an overflow? (An overflow would mean the actual
[quoted text clipped - 3 lines]
> I'll have to use positive signed longs up to 2^63-1 and check if the
> sum is mysteriously negative...

May be easier to use BigInteger for this ... but would it be faster?
How efficient is the typical Java deployment's implementation of
BigInteger anyway? About as fast as native C code working with arrays
of hardware-word-size integers as "digits" with "carries" based on
low-level trapping of overflow? Somewhat slower? Enormously slower?

(While I'm asking, may as well ask how well BigDecimal compares
speed-wise with C implementations of extended-precision floating- or
fixed-point arithmetic. That BigDecimal apparently stores numbers
decimal digit by decimal digit rather than in chunks of 32 or 64 bits
doesn't leave me hopeful.)
EJP - 13 Jun 2006 07:23 GMT
> (While I'm asking, may as well ask how well BigDecimal compares
> speed-wise with C implementations of extended-precision floating- or
> fixed-point arithmetic. That BigDecimal apparently stores numbers
> decimal digit by decimal digit rather than in chunks of 32 or 64 bits
> doesn't leave me hopeful.)

Of course it does, that's its purpose! If you want a binary radix use
BigInteger.
Patricia Shanahan - 13 Jun 2006 07:23 GMT
...
> (While I'm asking, may as well ask how well BigDecimal compares
> speed-wise with C implementations of extended-precision floating- or
> fixed-point arithmetic. That BigDecimal apparently stores numbers
> decimal digit by decimal digit rather than in chunks of 32 or 64 bits
> doesn't leave me hopeful.)

How did you reach that conclusion? My reading of the source code is that
BigDecimal is implemented as the combination of a BigInteger and an int
scale, and BigInteger uses an int[].

Patricia
Twisted - 13 Jun 2006 13:39 GMT
> How did you reach that conclusion? My reading of the source code is that
> BigDecimal is implemented as the combination of a BigInteger and an int
> scale, and BigInteger uses an int[].

Don't have the source code here.

Hrm. So both might perform decently, though I assume they use O(n^2)
multiplication that isn't as good as some other algorithms (FFT,
matrices) at really large sizes.
Patricia Shanahan - 13 Jun 2006 13:49 GMT
>> How did you reach that conclusion? My reading of the source code is that
>> BigDecimal is implemented as the combination of a BigInteger and an int
>> scale, and BigInteger uses an int[].
>
> Don't have the source code here.

Do you have a JDK installed? If so, unzip the file "src.zip" in its top
directory.

Patricia
Jeffrey Schwab - 13 Jun 2006 14:15 GMT
>> How did you reach that conclusion? My reading of the source code is that
>> BigDecimal is implemented as the combination of a BigInteger and an int
[quoted text clipped - 5 lines]
> multiplication that isn't as good as some other algorithms (FFT,
> matrices) at really large sizes.

For some definition of "good."
Twisted - 13 Jun 2006 16:01 GMT
> >> How did you reach that conclusion? My reading of the source code is that
> >> BigDecimal is implemented as the combination of a BigInteger and an int
[quoted text clipped - 7 lines]
>
> For some definition of "good."

Er?

_Introducing_Algorithms_ (ISBN 0-262-03141-8); also www.mersenne.org
(they use FFT for multiplying bignums); there are O(n log n) bignum
multiplies, but they have a large constant factor so are less efficient
for bignums below a certain length in bits.
Jeffrey Schwab - 13 Jun 2006 16:06 GMT
>>>> How did you reach that conclusion? My reading of the source code is that
>>>> BigDecimal is implemented as the combination of a BigInteger and an int
[quoted text clipped - 12 lines]
> multiplies, but they have a large constant factor so are less efficient
> for bignums below a certain length in bits.

Don't those techniques lose precision in the low-order digits?  I
believe they're fine for most applications, but not quite a direct
replacement for the brute force O(n^2) method.
Mark Thornton - 13 Jun 2006 19:48 GMT
>> _Introducing_Algorithms_ (ISBN 0-262-03141-8); also www.mersenne.org
>> (they use FFT for multiplying bignums); there are O(n log n) bignum
[quoted text clipped - 4 lines]
> believe they're fine for most applications, but not quite a direct
> replacement for the brute force O(n^2) method.

No they are exact.
Jeffrey Schwab - 13 Jun 2006 23:08 GMT
>>> _Introducing_Algorithms_ (ISBN 0-262-03141-8); also www.mersenne.org
>>> (they use FFT for multiplying bignums); there are O(n log n) bignum
[quoted text clipped - 6 lines]
>
> No they are exact.

Of course they are.  Thank you for correcting me, and please excuse the
brain-fart.

Twisted mentioned FFTs, which do the job nicely, and "matrices."  I'm
not sure what technique uses matrices to do fast multiplication of
scalars, except maybe as a technique for parallelization...
Twisted - 14 Jun 2006 19:35 GMT
> Twisted mentioned FFTs, which do the job nicely, and "matrices."  I'm
> not sure what technique uses matrices to do fast multiplication of
> scalars, except maybe as a technique for parallelization...

It's in the book whose isbn I posted. I couldn't find it again with a
quick vgrep through the index, though.

A Web search turns up
http://mathworld.wolfram.com/KaratsubaMultiplication.html which appears
to be the same thing, and has order n log n log log n.

http://citeseer.ist.psu.edu/451468.html has a paper in pdf, ps, and
other formats discussing several bigint multiply algorithms with better
than n^2 performance.

All can presumably be adapted to multiplying bignum reals to a given
precision (the easiest way being to use a bigint under the hood, and
fixed-point arithmetic, with double the precision used in the internal
bigints). In fact, if a BigDecimal in Java is just a movable-point
wrapper around a BigInteger, an analogous wrapper around a MyBigInteger
that uses FFT (or whatever) would do the same job as a BigDecimal, but
faster for very, very big mantissas.

Re: BigInteger vs. int comparison -- was this using BigIntegers that
could fit in a normal int, or bigger? And was it with hotspot and
-server, which seems to give the fastest math performance relative to
native code?
Oliver Wong - 14 Jun 2006 20:11 GMT
> Re: BigInteger vs. int comparison -- was this using BigIntegers that
> could fit in a normal int, or bigger? And was it with hotspot and
> -server, which seems to give the fastest math performance relative to
> native code?

   If this is addressed to me, it'd probably less confusing to the message
hierarchy if you posted it as a seperately, and as a reply to my message,
instead of a reply to Jeffrey Schwabs message. If it isn't addressed to me,
ignore the rest of this post.

   The benchmark provided the exact same input to BigInteger and int, so it
would be BigIntegers that could fit in normal int. I don't remember what
settings I had the JVM running under, and this might have been under 1.4. I
was originally planning on writing several benchmarks (not just the Prime
Sieve test), getting some initial results, and then writing my own custom
BigInteger class to see if I could optimize for speed a bit, and publish the
results. The intent was to see if I could make a blanket recommendation of
"always use BigInteger in preference to int so that you never have to worry
about errors overflow again". I sort of got too busy in the rest of my life
though, so I never finished the research.

   That being said, in a lot of business applications, the bottleneck is
probably not integer arithmatic, so it may make a lot of sense to use
BigInteger "by default", and to fall back on int (or some other "data
structure") in the special cases where performance is critical.

   - Oliver
Twisted - 15 Jun 2006 21:01 GMT
>     That being said, in a lot of business applications, the bottleneck is
> probably not integer arithmatic, so it may make a lot of sense to use
> BigInteger "by default", and to fall back on int (or some other "data
> structure") in the special cases where performance is critical.

The bottleneck could be method calls, and it might actually be worse if
typical deployments make every BigInteger add or similar go down
through JNI than if they let the JIT worry about speed or the user put
whole, much bigger algorithms in native code instead of each add and
multiply separately.

I'm frankly dubious if, with modern JITs, there's any use for JNI
whatsoever anymore, except for the few core library calls that need to
invoke system-dependent OS APIs at a low level, mainly to get drawing
surfaces and window handles for JFrames and so forth.

There is another problem with blanket use of BigIntegers though: the
things are inherently clunkier to use. Since most of the Java community
foams violently at the mouth as soon as anyone mentions the *other* OO
(op*r*t*r ov*rlo*ding, if I must risk using what might be considered by
some here an obscenity :)) this isn't likely to change anytime soon.

Then again, there is a possible compromise. Give the same special
status to BigInteger (and BigDecimal) as given to String ...

Also, how about letting us intern the things?
Oliver Wong - 15 Jun 2006 21:51 GMT
> There is another problem with blanket use of BigIntegers though: the
> things are inherently clunkier to use. Since most of the Java community
[quoted text clipped - 4 lines]
> Then again, there is a possible compromise. Give the same special
> status to BigInteger (and BigDecimal) as given to String ...

   Yeah, the "final stages" of my research, had I had the time, would be to
devise a new language based on Java which would not have any primitive types
at all, and which would use BigInteger for integer-literals appearing in the
code (I hadn't yet decided how to handle decimal numbers -- e.g. should I
use floating point, or fixed point?).

   So in my language, the code:

<code>
int i = 1000000000000000000000000000000000000;
</code>

would translate to something like this in Java:

<code>
BigInteger i = new BigInteger("1000000000000000000000000000000000000");
</code>

   This was assuming I did not run into any show-stopping problems in my
blanket-recommendation during the earlier research phases.

   - Oliver
Twisted - 16 Jun 2006 00:56 GMT
So, does anyone have any benchmark results of int vs. small BigInteger
with a) -server or that expensive java-to-native-code-compiler that
proved to be no faster and b) a throwaway run to let the JIT do its job
followed by a run that's actually timed, in each case? (Floating point
performance of java doubles vs. native doubles came up in this
newsgroup a month or two back and extensive data got posted.)
P.Hill - 16 Jun 2006 05:41 GMT
> The intent was to see if I could
> make a blanket recommendation of "always use BigInteger in preference to
> int so that you never have to worry about errors overflow again". I sort
> of got too busy in the rest of my life though, so I never finished the
> research.

I'll keep your idea to "always use BigInteger in preference to int" in
mind, but since working in biotech, banking, communication, graphics,
business, ... etc for over 20 years I have only a couple of times needed
more than a long when I required fixed-point arithmetic.

I also might suggest the more common solution for most applications:
Java long -- 9,223,372,036,854,775,807 (2^(63-1)) is a whole lot of
accurate counting and covers nearly anything outside of working with
all stars in the universe or other such not very "always" number. So
instead of just guffawing and falling out of my seat, I have to ask what
particular application area where you working in where you commonly
needed to count more than a long worth of fixed point such that you
wanted to even think of the idea of "aways using BigInteger"?

-Paul
Chris Uppal - 16 Jun 2006 12:43 GMT
> So
> instead of just guffawing and falling out of my seat, I have to ask what
> particular application area where you working in where you commonly
> needed to count more than a long worth of fixed point such that you
> wanted to even think of the idea of "aways using BigInteger"?

I'd be more worried about overflow in intermediate expressions than any
inability to represent the values which naturally crop up in <some domain>.
For a rather silly example, comparing the volumes to two containers would
overflow a long if they had sides or the order of 2.5e6 units.  For a slightly
less silly example, the only way I know to caclulate the distance from a point
to a line segment uses intermediate values to the fourth power of the inputs.

But more importantly, I don't want to have to /think/ about overflow at all --
it's a distraction I don't need.  I wouldn't advocate defaulting to using Java
BigInteger/BigDecimal for numeric quantities myself, but that's mostly because
of the incredible[*] awkwardness of expressing any kind of calculation using
them.

   -- chris

([*] "incredible" is something of an overstatement -- I don't find it
impossible to believe, just difficult ;-)
Oliver Wong - 16 Jun 2006 15:14 GMT
>> So
>> instead of just guffawing and falling out of my seat, I have to ask what
>> particular application area where you working in where you commonly
>> needed to count more than a long worth of fixed point such that you
>> wanted to even think of the idea of "aways using BigInteger"?

[...]

> But more importantly, I don't want to have to /think/ about overflow at
> all --
[quoted text clipped - 5 lines]
> using
> them.

   Yeah, that's my main argument for "always use BigInteger" too: You no
longer have to worry about overflow at all.

   Here's the one application I worked on where I got to try it out.

   There's this dance game called In The Groove
(http://www.inthegroove.com/), which is similar to the more widely known
Dance Dance Revolution. It has a few new features, including a USB slot on
the arcade machine itself which allows you to create an account and track
various statistics on your gameplay, including an estimate of how many
calories you've burned playing the game, how many times you played, what
your scores were for each song, etc.

   This data is stored as XML files on the USB key, and is digitally signed
by the machine. I think the developers of the game are open source
advocates. They've released the schema for the XML files and a public key to
check the signature to allow third parties (like myself) to play around with
those files.

   Now I assume that the code for the machine is implemented in C/C++, so
*probably* the machine itself doesn't support writing an XML file that says
the player played more than 2*10^19 games (which would probably actually
take 10^14 years to accomplish), but the schema file itself doesn't put any
restriction on the count. So rather than worrying about what internal limits
the machine uses (which might change from revision to revision; the machines
are currently at Version 2 Revision 8), I just store this count as a
BigInteger.

   As I predicted, the bottleneck for my application (which is basically
just a specialized viewer for the XML file, rendering charts using jGraph
showing average calories burned per day, for example) was not integer
arithmatic, but rather file-IO. So by using BigInteger, I'd never have to
worry about any of the values from the XML ever causing an overflow error in
my application, and I didn't suffer significant performance loss.

   - Oliver
Twisted - 16 Jun 2006 19:35 GMT
> ...would take 10^14 years...

It may be worth noting that the current best guess as to the age of the
entire universe is around 1.37x10^10 years. :)
P.Hill - 17 Jun 2006 18:19 GMT
>> But more importantly, I don't want to have to /think/ about overflow
>> at all -- it's a distraction I don't need.  I wouldn't advocate
[quoted text clipped - 7 lines]
>    Yeah, that's my main argument for "always use BigInteger" too: You no
> longer have to worry about overflow at all.

So in order to draw "charts and graphs" on either your printer or screen
you are worried that somewhere for some polygon calculation or some
perpetual sum there might be an overflow or round-off error.  All at
the expense of less readable thus maintainability.

Chris's example at least suggests the possibility of overflow while your
real-world values to 2D computer graphics hardly suggests a need for
more than a Long.

> I didn't suffer significant performance loss.

But is performance even a concern when drawing graphs and charts?

To me, the use of BigInteger in this example values with known sizes to
accumulations in databases and graphics falls into the "You Aren't Gonna
Need It" category.

http://xp.c2.com/YouArentGonnaNeedIt.html

Hopefully your programming served as a good learning exercise for
BigInteger.

-Paul
Oliver Wong - 19 Jun 2006 15:24 GMT
>>> But more importantly, I don't want to have to /think/ about overflow at
>>> all -- it's a distraction I don't need.  I wouldn't advocate defaulting
[quoted text clipped - 16 lines]
> real-world values to 2D computer graphics hardly suggests a need for
> more than a Long.

   The overflow part comes from the arbitrary integer input. For example,
your score on a song is equal to some integer. That's the mathematical
concept of an integer, not Java's 32-bit or 64-bit concept. As I mentioned
in my previous post, yes, the game engine is programmed in C/C++, and so far
all of the scores in the XML file that the machine generated were within the
32-bit range. But when the company published the XSD schema describing the
file format, they did not in any way limit the range of the integer values.
So I could either design my program to only accept a subset of all legal XML
inputs, or I could design it to accept all legal XML inputs. I went with the
latter.

> > I didn't suffer significant performance loss.
>
> But is performance even a concern when drawing graphs and charts?

   The fact that performance is NOT a concern is an argument FOR using
BigInteger.

> To me, the use of BigInteger in this example values with known sizes to
> accumulations in databases and graphics falls into the "You Aren't Gonna
> Need It" category.
>
> http://xp.c2.com/YouArentGonnaNeedIt.html

   But then how would I have had real-world data for my "Always use
BigInteger" experiment/research/whatever?

   - Oliver
Twisted - 14 Jun 2006 19:21 GMT
> Don't those techniques lose precision in the low-order digits?  I
> believe they're fine for most applications, but not quite a direct
> replacement for the brute force O(n^2) method.

According to the documentation regarding mprime's n log n FFT method,
it does "lose precision in the low order digits" but the error
magnitude can be kept below 1/2, and when the ultimate purpose is
integer arithmetic, you can then snap to the nearest integer at the end
of the calculation.

The matrix based method is, so far as I know, exact.
Oliver Wong - 13 Jun 2006 17:33 GMT
>> > If two unsigned longs are summed, is there an easy and efficient way to
>> > determine if there was an overflow? (An overflow would mean the actual
[quoted text clipped - 9 lines]
> of hardware-word-size integers as "digits" with "carries" based on
> low-level trapping of overflow? Somewhat slower? Enormously slower?

   I didn't do benchmarks against C, but I compared Java's BigInteger to
Java's primitive int, and in my results, BigInteger is approximately 7 times
slower in a Eratosthenes Prime Number Sieve benchmark.

   - Oliver
Tim B - 13 Jun 2006 05:28 GMT
> > If two unsigned longs are summed, is there an easy and efficient way to
> > determine if there was an overflow? (An overflow would mean the actual
[quoted text clipped - 3 lines]
> I'll have to use positive signed longs up to 2^63-1 and check if the
> sum is mysteriously negative...

Perhaps BigInteger would be of help here.


Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.