For instance, suppose you want to calculate the remainder of some very large bignum: x = bignum % divisor. But the bignum is so huge that you want to parallelize the process.
This is the kind of problem you find in Project Euler. Here I split a (random) bignum into 3 parts and calculate 3 remainders. Can you write the missing function that reconstructs the 'whole' remainder?
#!/usr/bin/env python3
import random

CHUNK_SIZE=10000

for _ in range(30000):
    hi=random.randint(0, CHUNK_SIZE-1)
    mi=random.randint(0, CHUNK_SIZE-1)
    lo=random.randint(0, CHUNK_SIZE-1)
    dividend=hi*(CHUNK_SIZE**2) + mi*CHUNK_SIZE + lo
    divisor=random.randint(1, CHUNK_SIZE-1)

    # calculate 3 remainders in parallel:
    rem_hi=hi % divisor
    rem_mi=mi % divisor
    rem_lo=lo % divisor

    remainder_my=reconstruct_remainder(rem_hi, rem_mi, rem_lo, divisor)

    # self-test:
    remainder_real=dividend % divisor
    assert remainder_real==remainder_my
Stop reading here and try to work on the problem for a while. When you are ready, scroll down for the solution.
And here it is, with comments:
#!/usr/bin/env python3
def reconstruct_remainder(rem_hi, rem_mi, rem_lo, divisor):
    # all 4 of these methods work

    # simplest version:
    # two of the remainders are just shifted left;
    # in other words, the 3 remainders are 'stretched' back into place:
    #x=rem_hi*(CHUNK_SIZE**2)
    #x=x+(rem_mi*CHUNK_SIZE)
    #x=x+rem_lo
    #x=x % divisor
    #return x

    # with early modulo operations, so that all intermediate values stay small;
    # this is faster:
    #x=(rem_hi*((CHUNK_SIZE**2) % divisor)) % divisor
    #x=x+((rem_mi*(CHUNK_SIZE % divisor)) % divisor)
    #x=x+rem_lo
    #x=x % divisor
    #return x

    # but if the chunk size and divisor are known beforehand, we can precompute these constants:
    precomputed_mult_hi=(CHUNK_SIZE**2) % divisor
    precomputed_mult_mi=CHUNK_SIZE % divisor

    #x=rem_hi*precomputed_mult_hi
    #x=x+(rem_mi*precomputed_mult_mi)
    #x=x+rem_lo
    #x=x % divisor
    #return x

    # probably the most performant method:
    # precomputed constants plus early modulo operations,
    # so all intermediate values always stay small
    x=(rem_hi*precomputed_mult_hi) % divisor
    x=x+((rem_mi*precomputed_mult_mi) % divisor)
    x=x+rem_lo
    x=x % divisor
    return x
The same scheme works for any number of chunks, so it can run on any number of cores/threads.
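As a sketch of how the parallel version might look (the helper names split_chunks, chunk_remainder and parallel_mod are my own, not from the code above; a CPU-bound version would use processes instead of threads because of the GIL):

```python
#!/usr/bin/env python3
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE=10000

def split_chunks(n, n_chunks):
    # lowest chunk first: n == sum(chunks[i] * CHUNK_SIZE**i)
    chunks=[]
    for _ in range(n_chunks):
        chunks.append(n % CHUNK_SIZE)
        n //= CHUNK_SIZE
    return chunks

def chunk_remainder(args):
    # each worker reduces its own chunk, weighted by its position
    chunk, i, divisor=args
    return (chunk * pow(CHUNK_SIZE, i, divisor)) % divisor

def parallel_mod(n, divisor, n_chunks=4):
    chunks=split_chunks(n, n_chunks)
    with ThreadPoolExecutor() as ex:
        partials=ex.map(chunk_remainder,
                        [(c, i, divisor) for i, c in enumerate(chunks)])
    return sum(partials) % divisor

n=1234567890123456
assert parallel_mod(n, 1237) == n % 1237
```

Note the three-argument pow(): it computes CHUNK_SIZE**i mod divisor without ever building the full power, which keeps every intermediate value small, just like the early-modulo versions above.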
Also, here is another fact from modular arithmetic that helped me solve this problem:
#!/usr/bin/env python3
import random

CHUNK_SIZE=10000

for _ in range(10000):
    hi=random.randint(0, CHUNK_SIZE-1)
    lo=random.randint(0, CHUNK_SIZE-1)
    dividend=hi*CHUNK_SIZE + lo
    divisor=random.randint(1, 50)
    remainder=dividend % divisor

    rem_hi=hi % divisor
    rem_lo=lo % divisor

    # note: here we combine the remainder of the high part with the untouched low part,
    # and the result can still be reduced by divisor:
    x=(rem_hi*CHUNK_SIZE + lo) % divisor
    assert x==remainder

    x=(((rem_hi*(CHUNK_SIZE % divisor)) % divisor)+rem_lo) % divisor
    assert x==remainder
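This fact is a special case of the general identities of modular arithmetic: both addition and multiplication commute with reduction mod m, i.e. (a+b) mod m = ((a mod m)+(b mod m)) mod m and (a*b) mod m = ((a mod m)*(b mod m)) mod m. A quick brute-force check in the same style as the snippets above:

```python
#!/usr/bin/env python3
import random

for _ in range(10000):
    a=random.randint(0, 10**12)
    b=random.randint(0, 10**12)
    m=random.randint(1, 10**6)
    # reduction mod m can be applied to the operands first:
    assert (a + b) % m == ((a % m) + (b % m)) % m
    assert (a * b) % m == ((a % m) * (b % m)) % m
```

The reconstruct_remainder() function is just these two identities applied to dividend = hi*CHUNK_SIZE**2 + mi*CHUNK_SIZE + lo, term by term.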
Actually, I have no need to parallelize such an operation. I did this to better understand modular arithmetic. Hopefully it can be of help to someone else.
