## [Math] Modulo inverse and the solution to the reverse engineering challenge #87

The following reverse engineering challenge was published by me in summer 2017, as the hardest among those challenges. What does it do? Here is it rewritten to pure C:
uint32_t modinv32 (uint32_t x)
{
uint32_t rslt=1;
for (int i=0; i<31; i++)
{
rslt = (rslt * x);
x = (x * x);
}
return rslt;
}


At least two of my readers found solution correctly. This function finds modulo inverse (used when replacing division by multiplication) and was copypasted by me from the GCC source code. Let's see, how it works.

By the method I once used in 'SAT/SMT by Example' book, let's construct a final expression that this function generates:

#!/usr/bin/env python3

class Expr:
def __init__(self,s):
self.s=s

def __str__(self):
return self.s

def convert_to_Expr_if_int(self, n):
if isinstance(n, int):
return Expr(str(n))
if isinstance(n, Expr):
return n
raise AssertionError # unsupported type

def __mul__(self, other):
return Expr("(" + self.s + "*" + self.convert_to_Expr_if_int(other).s + ")")

x=Expr("rslt")
rslt=1

BITS=4

for i in range(BITS-1):
rslt=x*rslt
x=x*x

print (rslt)
print (str(rslt).count("rslt"))

% ./modinv.py
(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*1)))
7


Hmm, this expression seems to be very regular and redundant. Let's simplify it in Wolfram Mathematica:

In[1]:= (((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*1)))
Out[1]= rslt^7


So this is simply $$rslt^7$$.

Now for 8 bits:

(((((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt))))*((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))))*(((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt))))*((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt))))))*((((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt))))*((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))))*(((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt))))*((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*1)))))))
127


Wolfram Mathematica:

In[9]:= (((((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt))))*((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))))*(((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt))))*((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt))))))*((((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt))))*((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))))*(((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt))))*((((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*rslt)))*(((rslt*rslt)*(rslt*rslt))*((rslt*rslt)*(rslt*1)))))))

Out[9]= rslt^127


This is $$rslt^{127}$$. For BITS=16, this would be $$rslt^{32767}$$.

By induction we see that this functions simply computes $$x^{2^{BITS-1}-1}$$.

#include <stdint.h>
#include <stdio.h>

static uint64_t inverse (uint64_t x)
{
uint64_t rslt = 1;
int i;

//for (i = 0; i < 64 - 1; i++)
for (i = 0; i < 16 - 1; i++)
{
rslt = (rslt * x);
x = (x * x);
}

return rslt & 0xffff;
}

void main()
{
printf ("0x%lx\n", inverse(3));
};


That prints:

0xaaab


This is a magic number that can be used by replacing division by 3 on 16-bit registers. Instead, multiply by 0xaaab.

Let's check in Wolfram Mathematica:

In[11]:= BaseForm[PowerMod[3,-1,2^16],16]
Out[11]//BaseForm= Subscript[aaab, 16]

In[18]:= BaseForm[PowerMod[3,(2^(16-1))-1,2^16],16]
Out[18]//BaseForm= Subscript[aaab, 16]


This is explained here in Wikipedia. And as of Euler's totient function...

In[19]:= EulerPhi[2^16]
Out[19]= 32768


(Each odd number under $$2^x$$ is coprime to $$2^x$$. Even numbers are all coprimes to $$2^x$$. And there are $$\frac{2^x}{2}$$ odd numbers.)

So basically, for computing modulo inverse of x modulo $$2^{16}$$, find $$x^{2^{15}}-1 \pmod{m}$$. (Usually, EGCD algorithm is used instead, because it's just faster.)

This pure C code does modulo arithmetic as well, it's just hidden. It uses 16/32/64-bit CPU registers and all results of all operations are just truncated down to $$2^{64}$$. (This wouldn't work in Python, it will just switch to bignum.)