Java Forum / General / January 2008
Extending Schrage Multiplication
rossum - 08 Dec 2007 00:08 GMT The Schrage Method and its Limitations
Schrage's algorithm ("A More Portable Fortran Random Number Generator." ACM Trans. Math. Software 5, 132-138, 1979.) allows a multiplication with modulus calculation without overflow, given certain limitations on the numbers used:
/** * Calculates (a * b) % m without overflow * given certain conditions */ int schrage1(int a, int b, int m) { // Check limitations if (b >= m - 1) { throw new RuntimeException( "Schrage: b >= m-1."); } if (a == 0) { return 0; } int quot = m / a; int rem = m % a; if (rem >= quot) { throw new RuntimeException( "Schrage: rem >= quot."); }
// Schrage's algorithm int result = a * (b % quot) - rem * (b / quot); if (result < 0) { result += m; }
return result; }
The check for a == 0 is required because of the division m / a; we need to avoid a divide by zero.
Schrage's algorithm works well, but only if the two given conditions are met: b < m-1 and rem < quot. If either of these two is not met then the algorithm will fail.
Since I wanted a way to multiply and mod any numbers without overflowing I looked for a way to extend Schrage's algorithm to deal with any positive inputs. This meant finding a way to remove the two limitations on the parameters.
Removing the First, b < m-1, Limitation
If we have b >= m-1 then we can try to get round it by reducing the value of b. If b is even then a * b = (a * b/2) + (a * b/2), if b is odd then a * b = (a * b/2) + (a * b/2) + a [here b/2 is an integer division with fractional parts ignored]. We can use this to recurse with a reduced value of b and remove the first limitation:
/** * Calculates (a * b) % m without overflow * given certain conditions */ int schrage2(int a, int b, int m) { if (a == 0) { return 0; } if (b >= m - 1) { int result = schrage2(a, b/2, m); result = addMod(result, result, m); if (b % 2 == 1) { result = addMod(result, a, m); } return result; }
// Check second limitation int quot = m / a; int rem = m % a; if (rem >= quot) { throw new RuntimeException( "Schrage: rem >= quot."); }
// Schrage's algorithm int result = a * (b % quot) - rem * (b / quot); if (result < 0) { result += m; }
return result; }
I have moved the a == 0 check to avoid unneccessary recursion. By halving b for each recursive call we ensure that eventually b < m which meets the first limitation of the basic Schrage algorithm. From that we can build up the final result as the recursion unwinds.
The addMod() function is a non-overflowing modular addition:
/** * Calculates (a + b) mod m without overflow. */ int addMod(int a, int b, int m) { int result; if (b <= Integer.MAX_VALUE - a) { result = (a + b) % m; } else { result = addMod(a, b/2, m); result = addMod(result, b/2, m); if (b % 2 == 1) { result = addMod(result, 1, m); } } return result; }
This uses a similar method of recursing with a half size operand to avoid overflow.
Removing the Second, rem < quot, Limitation
Since quot = m / a, by reducing a we increase the value of quot. We also reduce the maximum possible value of rem, since rem = m % a. Hence by reducing a we can bring quot and rem closer to allowed values. If a is even then a * b = (a/2 * b) + (a/2 * b), if a is odd then a * b = (a/2 * b) + (a/2 * b) + b [here a/2 is an integer division with fractional parts ignored]. We can use this to recurse with a reduced value of a and remove the second limitation:
/** * Calculates (a * b) % m without overflow */ int schrage3(int a, int b, int m) { if (a == 0) { return 0; } if (b >= m - 1) { int result = schrage3(a, b/2, m); result = addMod(result, result, m); if (b % 2 == 1) { result = addMod(result, a, m); } return result; } int quot = m / a; int rem = m % a; if (rem >= quot) { int result = schrage3(a/2, b, m); result = addMod(result, result, m); if (a % 2 == 1) { result = addMod(result, b, m); } return result; }
// Schrage's algorithm int result = a * (b % quot) - rem * (b / quot); if (result < 0) { result += m; }
return result; }
I have tested this revised algorithm and it works with random positive inputs, a >= 0, b >= 0, m >= 1. It does not work with negative inputs - for my application I only need to deal with positive numbers. It would not be difficult to add a shell round the algorithm to handle signed inputs.
I have presented this example with integer inputs, which allows testing against the same calculations in longs. For my real application I have it using longs in order to avoid going into Java BigIntegers and slowing things down.
My apologies to those of you to whom I am teaching ovisuction. I hope that some others among you might find it useful.
rossum
Tim Smith - 08 Dec 2007 02:06 GMT > /** > * Calculates (a * b) % m without overflow [quoted text clipped - 4 lines] > if (b >= m - 1) { throw new RuntimeException( > "Schrage: b >= m-1."); } ...
> Since I wanted a way to multiply and mod any numbers without > overflowing I looked for a way to extend Schrage's algorithm to deal [quoted text clipped - 8 lines] > division with fractional parts ignored]. We can use this to recurse > with a reduced value of b and remove the first limitation: That seems overly complicated. Note that
a*b = a*(b%m) mod m
So, you can start with
a = a%m; b = b%m;
to get both your inputs into [0,m-1]. If you really need b < m-1, you can then special case the b=m-1 case. Note that in that case b%m = -1, so your answer is simply (m-a)%m.
I wrote that as (m-a)%m there, rather than (-a)%m, to take into account the fact that most programming languages don't do something sensible when asked for x%y and x < 0, y > 0. By doing (m-a)%m, I make sure the answer is the one that would make sense to a mathematician, rather than the answer that for some inexplicable reason seems to make sense to most programming language designers and most CPU designers. (I haven't checked what Java does here, but I'd sure expect it to go along with the crowd, rather than surprise a lot of programmers by doing the mathematically more sensible thing!).
rossum - 08 Dec 2007 13:47 GMT >> /** >> * Calculates (a * b) % m without overflow [quoted text clipped - 21 lines] > > a*b = a*(b%m) mod m That is an excellent suggestion, and I have incorporated it.
>So, you can start with > > a = a%m; > b = b%m; If I was writing a shell round the basic method then I would probably do that. I am reluctant to add an initial divide to a recursive algorithm where it will only be potentially useful for the top level of recursion.
>to get both your inputs into [0,m-1]. If you really need b < m-1, I do, Schrage can give incorrect results if b == m-1.
>you can then special case the b=m-1 case. Note that in that case b%m = -1, >so your answer is simply (m-a)%m. Again an excellent suggesition which I have incorporated.
>I wrote that as (m-a)%m there, rather than (-a)%m, to take into account >the fact that most programming languages don't do something sensible >when asked for x%y and x < 0, y > 0. By doing (m-a)%m, I make sure the >answer is the one that would make sense to a mathematician, rather than >the answer that for some inexplicable reason seems to make sense to most >programming language designers and most CPU designers. Agreed, that was one of my motives in avoiding negative inputs in this discussion.
I ran some timing tests on my original algorithm (Schrage), an improved version (Fast Schrage) with your suggestions incorporated and a basic non-overflowing mulMod function similar to the addMod I gave earlier:
Over 3000000 random tests, there were 0 mismatched answers. Schrage took 9.485 seconds. Fast Schrage took 9.266 seconds. mulMod took 15.734 seconds.
I suspect that the Schrage method is faster because it can operate on larger numbers and so needs less depth of recursion to reach a point where it can calculate the answer without overflowing.
Thankyou for your suggestions,
rossum
// --- Code ---
static int schrageFast(int a, int b, int m) { if (a < 2) { return (a * b) % m; }
int result; // Check for b >= m-1 if (b > m - 1) { result = schrageFast(a, b % m, m); } else if (b == m - 1) { result = (m - a) % m; } else { // Check for rem >= quot int quot = m / a; int rem = m % a; if (rem >= quot) { result = schrageFast(a/2, b, m); result = addMod(result, result, m); if (a % 2 == 1) { result = addMod(result, b, m); } } else { // Schrage method result = a * (b % quot) - rem * (b / quot); if (result < 0) { result += m; } } // end if } // end if return result; } // end schrageFast()
// --- End Code ---
Lew - 08 Dec 2007 17:58 GMT Tim Smith
>> I wrote that as (m-a)%m there, rather than (-a)%m, to take into account >> the fact that most programming languages don't do something sensible >> when asked for x%y and x < 0, y > 0. By doing (m-a)%m, I make sure the >> answer is the one that would make sense to a mathematician, rather than >> the answer that for some inexplicable reason seems to make sense to most >> programming language designers and most CPU designers. All programming languages do something sensible with that case.
The problem is you have to choose between remainder and modulo. If you define % as modulo, you lose the property that (x%y) + x/y = x when x and y differ in sign. Defining % as remainder preserves that property.
There is absolutely nothing inexplicable about it. You're just trying to interpret % to mean something different from what it does. You ought to try taking it for what it is, instead of for what it isn't.
 Signature Lew
Lew - 08 Dec 2007 18:02 GMT > Tim Smith >>> I wrote that as (m-a)%m there, rather than (-a)%m, to take into account [quoted text clipped - 13 lines] > to interpret % to mean something different from what it does. You ought > to try taking it for what it is, instead of for what it isn't. Typo.
I meant (x%y) + y*(x/y) = x
 Signature Lew
Alf P. Steinbach - 08 Dec 2007 18:17 GMT * Lew:
>> Tim Smith >>>> I wrote that as (m-a)%m there, rather than (-a)%m, to take into account [quoted text clipped - 20 lines] > I meant > (x%y) + y*(x/y) = x The above article was cross-posted to [comp.lang.java.programmer] and [comp.programming].
I'm not sure that the article's claimed difference between "modulo" and "remainder" is language independent, i.e. that these terms can be used in general to indicate one or the other way of rounding, or for that matter that the property (x%y) + y*(x/y) = x is necessarily lost with rounding-towards-zero (it's only lost when "/" doesn't match "%").
Could the author please provide a reference for his terminology.
Cheers,
- Alf
 Signature A: Because it messes up the order in which people normally read text. Q: Why is it such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail?
Lew - 08 Dec 2007 18:27 GMT > * Lew: >>> Tim Smith [quoted text clipped - 35 lines] > > Could the author please provide a reference for his terminology. The terminology is self-evident.
 Signature Lew
Alf P. Steinbach - 08 Dec 2007 19:15 GMT * Lew:
>> * Lew: >>>> Tim Smith [quoted text clipped - 38 lines] > > The terminology is self-evident. I'm sorry, that's bullshit.
Cheers, & hth.,
- Alf
 Signature A: Because it messes up the order in which people normally read text. Q: Why is it such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail?
Lew - 08 Dec 2007 19:28 GMT > * Lew: >>> * Lew: [quoted text clipped - 41 lines] > > I'm sorry, that's bullshit. Actually, your statement is bullshit.
The remainder is the value left when you divide x/y. (/ of course is defined as for Java.) In this case, I'm referring to the remainder which makes the relation x == y*(x/y) + (x%y) hold true. (/ is defined as in Java, of course.) <http://en.wikipedia.org/wiki/Remainder>
That was clearly evident from my post. Just because you couldn't see that doesn't make it bullshit.
Modulo is the standard meaning from mathematics. How is this in any way not standard or well understood? <http://en.wikipedia.org/wiki/Modulo>
I never said the difference was language-independent. That's your requirement. I was, of course, speaking of Java.
 Signature Lew
Alf P. Steinbach - 08 Dec 2007 19:48 GMT * Lew:
>> * Lew: >>>> * Lew: [quoted text clipped - 62 lines] > I never said the difference was language-independent. That's your > requirement. I was, of course, speaking of Java. I'm sorry, again, but the "of course"'s are bullshit.
Note, as already pointed out to you, that this is crossposted to [comp.programming], and that's the group I'm replying in.
Note also that different programming languages differ in their definitions of rem and mod operators, and associated terminology -- it's OK to pick particular meanings, but going the other way, to think meanings are implied by the terminology, is a fallacy.
Cheers, & hth.,
- Alf
 Signature A: Because it messes up the order in which people normally read text. Q: Why is it such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail?
Lew - 08 Dec 2007 20:50 GMT > Note, as already pointed out to you, that this is crossposted to > [comp.programming], and that's the group I'm replying in. Note: The OP made it quite clear that he was talking about Java. Of course I can see this thread is cross-posted - you hardly need to point out what the headers clearly show.
Go on, be a troll.
 Signature Lew
Alf P. Steinbach - 08 Dec 2007 21:09 GMT * Lew:
>> Note, as already pointed out to you, that this is crossposted to >> [comp.programming], and that's the group I'm replying in. > > Note: The OP made it quite clear that he was talking about Java. And you did the opposite.
You wrote "All programming languages do something sensible with that case."
Instead of your bullshit claim that your terminology is "self-evident" in that context, you could just have admitted that your earlier comments about terminology did not apply to "all programming languages".
> Of > course I can see this thread is cross-posted - you hardly need to point > out what the headers clearly show. Since you didn't understand that at first -- it had to be pointed out a second time -- I don't think it was unnecessary.
Cheers, & hth.,
- Alf
 Signature A: Because it messes up the order in which people normally read text. Q: Why is it such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail?
Lew - 08 Dec 2007 21:13 GMT > * Lew: >>> Note, as already pointed out to you, that this is crossposted to [quoted text clipped - 5 lines] > > You wrote "All programming languages do something sensible with that case." But I didn't write that they all do the same thing, now did I?
> Instead of your bullshit claim that your terminology is "self-evident" > in that context, you could just have admitted that your earlier comments > about terminology did not apply to "all programming languages". I never said that they did apply to all programming languages.
> Since you didn't understand that at first -- it had to be pointed out > a second time -- I don't think it was unnecessary. You have no evidence that I didn't understand it, troll.
 Signature Lew
Alf P. Steinbach - 08 Dec 2007 21:26 GMT * Lew:
>> * Lew: >>>> Note, as already pointed out to you, that this is crossposted to [quoted text clipped - 19 lines] > > You have no evidence that I didn't understand it, troll. Uhm, your quoting is highly misleading, leaving out the referent for the last paragraph, and making it refer to something it did not refer to.
Cheers, & hth.,
- Alf
 Signature A: Because it messes up the order in which people normally read text. Q: Why is it such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail?
Lew - 08 Dec 2007 21:47 GMT > * Lew: >>> * Lew: [quoted text clipped - 24 lines] > Uhm, your quoting is highly misleading, leaving out the referent for the > last paragraph, and making it refer to something it did not refer to. Oh, I am so sorry. I was referring to your bullshit claim that I didn't understand that the message was cross-posted by the OP, troll.
 Signature Lew
Alf P. Steinbach - 08 Dec 2007 22:05 GMT * Lew:
>> * Lew: >>>> * Lew: [quoted text clipped - 27 lines] > Oh, I am so sorry. I was referring to your bullshit claim that I didn't > understand that the message was cross-posted by the OP, troll. Good that you're sorry.
There should be no need for you to resort to personal name calling.
I've characterized your claims as bullshit, in response to an initial rudeness and because they tecnically are, but I really don't know enough about you as a person -- other than your tendency here to get personal, with repeated name-calling, which might or might not be characteristic -- to label you as anything, so, I choose not to.
Cheers, & hth.,
- Alf
 Signature A: Because it messes up the order in which people normally read text. Q: Why is it such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail?
Lew - 08 Dec 2007 22:15 GMT > Good that you're sorry. Oooh, score!
> There should be no need for you to resort to personal name calling. Nor should there be any need for you to jump in calling what I say "bullshit" and claiming without evidence that I don't understand something.
> I've characterized your claims as bullshit, in response to an initial > rudeness and because they tecnically are, but I really don't know enough The "initial rudeness" to which you started your use of the word "bullshit" was this post:
> The terminology is self-evident. That was it. The whole thing. You call that rude, eh?
How is "bullshit" a techical term? Especially when you misconstrued my remarks in order to label them so.
> about you as a person -- other than your tendency here to get > personal, with repeated name-calling, which might or might not be > characteristic -- to label you as anything, so, I choose not to. I call you "troll" because you jump right in with using words like "bullshit", you get /ad hominem/ with claims that I don't understand something, you twist the meaning of what I say in order to attack it. All this is trollish behavior.
> Cheers, & hth., Oh, yeah, you're a big help.
Plonk.
 Signature Lew
Alf P. Steinbach - 08 Dec 2007 22:26 GMT * Lew:
>> Good that you're sorry. > [quoted text clipped - 13 lines] > > That was it. The whole thing. You call that rude, eh? Yes, it is very rude to refuse to answer a simple question about what you mean, claiming it is self-evident.
> Plonk. I'm sorry that Lew won't be able to read the above.
Cheers,
- Alf
 Signature A: Because it messes up the order in which people normally read text. Q: Why is it such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail?
Tim Smith - 09 Dec 2007 02:57 GMT >> The problem is you have to choose between remainder and modulo. If you >> define % as modulo, you lose the property that (x%y) + x/y = x when x [quoted text clipped - 8 lines] > I meant > (x%y) + y*(x/y) = x Actually, the problem is with division. If x < 0, y > 0, should x/y be floor(x/y) or ceil(x/y)? If it is floor(x/y), then everything works out fine. x - y * floor(x/y) will be in [0,y) for y > 0, and everyone would be happy.
What most CPUs and programming languages do is, for y > 0, take x/y as floor(x/y) if x > 0, and ceil(x/y) if x < 0, which I think you'll find is the option least preferred by mathematicians.
Lew - 09 Dec 2007 03:48 GMT > Actually, the problem is with division. If x < 0, y > 0, should x/y be > floor(x/y) or ceil(x/y)? If it is floor(x/y), then everything works out [quoted text clipped - 4 lines] > floor(x/y) if x > 0, and ceil(x/y) if x < 0, which I think you'll find > is the option least preferred by mathematicians. True. I was, of course, referring to division as implemented by "most CPUs and programming languages", particularly Java, as that was the language of interest for the OP.
There is a reason why computer engineers prefer the way they do it, which reason escapes my memory at the moment. Something about the consistency of
|x/y| being the same regardless of operand sign.
 Signature Lew
Tim Smith - 09 Dec 2007 18:19 GMT > There is a reason why computer engineers prefer the way they do it, which > reason escapes my memory at the moment. Something about the consistency of >|x/y| being the same regardless of operand sign. That's what I've always assumed. They want (-x)/y = x/(-y) = -(x/y), probably because it lets them design dividors that can ignore sign, and then they can set the sign of the answer at the end.
It's just that mathematically, that's not a particularly interesting property to have for integer division, so I'd prefer that integer division be done in a way that preserves x=y*(x/y) + x%y and makes x%y come out in [0,y). A lot of things in mathematics count on that relationship, and it gets annoying to have to take extra steps in most programming languages to deal with that. I'd much rather have that just work, and if a case ever arose where I needed (-x)/y = -(x/y), I'd rather that be the case that requires extra code.
Colin Barker - 06 Jan 2008 16:38 GMT >>> /** >>> * Calculates (a * b) % m without overflow [quoted text clipped - 96 lines] > > // --- End Code --- Richard F.L.R.Snashall - 08 Dec 2007 19:28 GMT But, if b == m - 1, then b == -1 mod m, so the answer is just -a (or its equivalent, m - a).
Colin Barker - 06 Jan 2008 16:43 GMT > /** > * Calculates (a + b) mod m without overflow. [quoted text clipped - 10 lines] > return result; > } Assuming 0 <= a,b < m wouldn't the following be more efficient?
int addMod(int a, int b, int m) { return (a < m - b) ? (a + b) : (a - (m - b)); }
Sorry about the null reply I made a couple of minutes ago. -- Colin
rossum - 06 Jan 2008 22:32 GMT >Assuming 0 <= a,b < m wouldn't the following be more efficient? > > int addMod(int a, int b, int m) { > return (a < m - b) ? (a + b) : (a - (m - b)); > } Yes it would, it roughly halved the timings of my tests:
Old values: Schrage took 9.485 seconds. Fast Schrage took 9.266 seconds.
New values (with your addMod() method) Schrage took 5.907 seconds. Fast Schrage took 5.813 seconds.
I also tested against my old version of addMod() and there were no differences for positive inputs.
Thank you,
rossum
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|