ElGamal encryption system, cracking it using CADO-NFS

ElGamal encryption system shares a lot with Diffie–Hellman key exchange, so please read my previous blog post about it first,

Toy example

All variable names are the same as in the Wikipedia article.

Generate g/p parameters as for Diffie-Hellman scheme. (g is generator, p is a prime.)

#!/usr/bin/env python3
import random, math

# https://stackoverflow.com/questions/567222/simple-prime-number-generator-in-python
def is_prime(num):
    """Returns True if the number is prime else False."""
    if num == 0 or num == 1:
        return False
    for x in range(2, num):
        if num % x == 0:
            return False
    else:
        return True

# https://stackoverflow.com/questions/40190849/efficient-finding-primitive-roots-modulo-n-using-python
def primRoots(modulo):
    coprime_set = {num for num in range(1, modulo) if math.gcd(num, modulo) == 1}
    return [g for g in range(1, modulo) if coprime_set == {pow(g, powers, modulo)
            for powers in range(1, modulo)}]

primes=[i for i in range(500, 2000) if is_prime(i)]
# random prime:
p=random.choice(primes)
g=primRoots(p)[0] # pick the first one
print ("p, g =", (p, g))

Got them:

p, g = (1123, 2)

These parameters are public, again, as in Diffie-Hellman scheme.

Now the 'receiver' generates public/private keys:

#!/usr/bin/env python3
import random, math

p=1123
g=2

x=random.randint(1, p-1)
h=pow(g, x, p)

print ("secret key = ", x)
print ("full public key g, p, h = ", (g, p, h))
secret key =  108
full public key g, p, h =  (2, 1123, 368)

Secret key is kept, public one is broadcasted widely.

To encrypt a message to a 'receiver', you need only g/p parameters and public key. No need to generate any key on sender's side.

Encrypt a message: 123.

#!/usr/bin/env python3
import random

# g/p parameters and public key:
g, p, h = 2, 1123, 368

m=123
assert m>=0 and m<=p

# AKA ephemeral key
# random for each message:
y=random.randint(1, p-2)

s=pow(h, y, p) # AKA shared secret

C1=pow(g, y, p) # here $y$ is transmitted in 'obfuscated' way
C2=divmod(m*s, p)[1]

# encrypted message:
print (C1, C2)

Decrypt it:

#!/usr/bin/env python3
import sys

# private key:
x=108
# g/p parameters and public key:
g, p, h = 2, 1123, 368

C1=int(sys.argv[1])
C2=int(sys.argv[2])

# recover $s$ from $C1$ using $x$
s=pow(C1,x,p)

# divide $C1$ by $s$
decrypted=divmod(C2*pow(s,-1,p),p)[1]

print (decrypted)

Run:

 % ./encrypt.py
962 679

 % ./decrypt.py 962 679
123

An ephemeral key is passed in 'obfuscated' form as C1. A 'receiver' would need a secret key to reconstruct an ephemeral key. Note also that ElGamal scheme has two features: ciphertext's length is twice as long and it requires a good PRNG (ephemeral key must be different each time).

In short, ElGamal encryption is just multiplication:

\[ message \cdot shared\_secret \pmod p \]

Decryption is just division:

\[ \frac{message}{shared\_secret} \pmod p \]

But to divide modulo p, we use multiplication by inverse modulo p (described in my Mathematical recipes):

\[ message \cdot shared\_secret^{-1} \pmod p \]

(Finding modulo inverse in Python is: pow(input, -1, p).)

256-bit modulo, cracking with CADO-NFS

The problem is to 'deobfuscate' C1. And this is discrete logarithm problem.

Let's extend my example. Not toyish anymore. I generated 256-bit p using SageMath, because I want safe prime (to make g=2 always). (The same reason safe primes used in Diffie-Hellman scheme.)

g, p =  (2, 106782337079608867773935245212202223922658909307427458412216465947190648686267)

Generate public/secret keys:

#!/usr/bin/env python3
import random, math

p=106782337079608867773935245212202223922658909307427458412216465947190648686267
g=2

# secret key:
x=random.randint(1, p-1)
# public key:
h=pow(g, x, p)

print ("secret key = ", x)
print ("full public key g, p, h = ", (g, p, h))

Done:

secret key =  42552630690748541237192753211567470672347161483452209090775419238444389011519
full public key g, p, h =  (2, 106782337079608867773935245212202223922658909307427458412216465947190648686267, 31666560919228943900143403193461257805624153173272548606369950392837599874669)

Encryption code:

#!/usr/bin/env python3
import random, operator

def str_to_num(s):
    x=map(ord, list(s))
    y=list(map(lambda x: 256**x, range(len(s))))
    return sum(map(operator.mul, x, y))

# g/p parameters and public key:
g, p, h =  2, 106782337079608867773935245212202223922658909307427458412216465947190648686267, 31666560919228943900143403193461257805624153173272548606369950392837599874669

m=str_to_num("Hello, world! And everyone else!") # 32 characters
assert m>=0 and m<=p

# AKA ephemeral key
# random for each message:
y=random.randint(1, p-2)

print ("y=", y)
s=pow(h, y, p) # AKA shared secret

C1=pow(g, y, p) # here $y$ is transmitted in 'obfuscated' way
C2=divmod(m*s, p)[1]

# encrypted message:
print ("C1, C2=", C1, C2)

Decryption code:

#!/usr/bin/env python3
import sys, math

def num_to_str (t):
    return "".join(map(lambda i: chr((t >> i*8)&0xff), range(math.ceil(math.log(t,256)))))

# private key:
x= 42552630690748541237192753211567470672347161483452209090775419238444389011519
# g/p parameters and public key:
g, p, h =  2, 106782337079608867773935245212202223922658909307427458412216465947190648686267, 31666560919228943900143403193461257805624153173272548606369950392837599874669

C1=int(sys.argv[1])
C2=int(sys.argv[2])

# recover $s$ from $C1$ using $x$
s=pow(C1,x,p)

# divide $C1$ by $s$
decrypted=divmod(C2*pow(s,-1,p),p)[1]

print (num_to_str(decrypted))

Run it:

 % ./encrypt.py
y= 5144495077841497976153694286122959414334781235188500618304821025636874868256
C1, C2= 6987472764437489136246373887954785566501893663768545393064534657896039052304 17399590184257652984342389957591864072251175067586468409794768502613032358607

 % ./decrypt.py 6987472764437489136246373887954785566501893663768545393064534657896039052304 17399590184257652984342389957591864072251175067586468409794768502613032358607
Hello, world! And everyone else!

Now we run CADO-NFS to solve the \( g^y=target \pmod p \) equation. We pass \( ell=\frac{p-1}{2} \), C2 as target and p.

% python3 ./cado-nfs.py -dlp -ell 53391168539804433886967622606101111961329454653713729206108232973595324343133 target=6987472764437489136246373887954785566501893663768545393064534657896039052304 106782337079608867773935245212202223922658909307427458412216465947190648686267

~4 minutes on 6-core AMD Ryzen 5 3600:

Info:HTTP server: Shutting down HTTP server
Info:Complete Factorization / Discrete logarithm: Total cpu/elapsed time for entire Discrete logarithm: 1496.16/218.861
Info:root: CADO_DEBUG is on, data kept in /tmp/cado.7lqwix6m
Info:root:   p = 106782337079608867773935245212202223922658909307427458412216465947190648686267
Info:root:   ell = 53391168539804433886967622606101111961329454653713729206108232973595324343133
Info:root: Checking that log(2) and log(3) are consistent...
Info:root:   unscaled_log2 = 30428972128915818279147905145377664110452663445550892998657716656893355159153 mod ell
Info:root:   unscaled_log3 = 8060334805109884269725591663308071132159292151224498885152880760206664691371 mod ell
Info:root: Checking that log(2) and log(3) are consistent... passed!
Info:root: Also check log(target) vs log(2) ...
Info:root:   target = 6987472764437489136246373887954785566501893663768545393064534657896039052304
Info:root:   unscaled_log(target) = 10376137087718435699981896182249833062555684801728895261346224059295604816178 mod ell
Info:root: Also check log(target) vs log(2) ... passed!
Info:root: Using g=2 as a generator
Info:root:   log2 = 1 mod ell
Info:root:   log3 = 21373637609788537016765289093511200061069952176936395306973887475772315899633 mod ell
Info:root: target = 6987472764437489136246373887954785566501893663768545393064534657896039052304
Info:root: log(target) = 5144495077841497976153694286122959414334781235188500618304821025636874868256 mod ell
5144495077841497976153694286122959414334781235188500618304821025636874868256
Info:root: If you want to compute a new target, run ./cado-nfs.py /tmp/cado.7lqwix6m/p80.parameters_snapshot.0 target=

Ah, you see, CADO-NFS reported \( y \) that was used during encryption (ephemeral key). Everything else can be reconstructed using it.

Of course, larger modulos must be used in practical applications. At least 1024 bits, etc.

1024-bit version

This would be unbreakable to CADO-NFS (yet?)

Generate a 1024-bit safe prime using SageMath again:

g, p =  2, 137317384306991478612368304329360800376234346559644221669650048769789586302529142946225337013730395114826953750636339982051782120897412799937783869292398296942792027911662290058773308135876835152795957816699643984545269896449282627220047872998032473062103551723430882982992050101825994714784360845181100251767

Generate public/secret keys:

#!/usr/bin/env python3
import random, math

p=137317384306991478612368304329360800376234346559644221669650048769789586302529142946225337013730395114826953750636339982051782120897412799937783869292398296942792027911662290058773308135876835152795957816699643984545269896449282627220047872998032473062103551723430882982992050101825994714784360845181100251767
g=2

# secret key:
x=random.randint(1, p-1)
# public key:
h=pow(g, x, p)

print ("secret key = ", x)
print ("full public key g, p, h = ", (g, p, h))
secret key =  125145567444781188291563236814448642991074009386501584522151284465576384840849460470722982180439173635742072131455048006910693927012453056391845492404236237424570006598525046184094062453272170219120390449177131205786467667439971687661069639140348041315057998555150872782256901068188155140833329567148544932439
full public key g, p, h =  (2, 137317384306991478612368304329360800376234346559644221669650048769789586302529142946225337013730395114826953750636339982051782120897412799937783869292398296942792027911662290058773308135876835152795957816699643984545269896449282627220047872998032473062103551723430882982992050101825994714784360845181100251767, 6249556833668681644083438550299497673106834839601060066276497310296385700845899734502627671244366202807168979799285196733422125874828630253195783505028941308011866959264471973821739180982893861881463421391972516267745232824952669360476216145572401140492786897546674079901107972405845351020015289305994148373)

And here I'm introducting a mistake. A bug. Two messages encrypted using the same ephemeral key.

#!/usr/bin/env python3
import random, operator, sys

def str_to_num(s):
    x=map(ord, list(s))
    y=list(map(lambda x: 256**x, range(len(s))))
    return sum(map(operator.mul, x, y))

# g/p parameters and public key:
g, p, h =  2, 137317384306991478612368304329360800376234346559644221669650048769789586302529142946225337013730395114826953750636339982051782120897412799937783869292398296942792027911662290058773308135876835152795957816699643984545269896449282627220047872998032473062103551723430882982992050101825994714784360845181100251767, 6249556833668681644083438550299497673106834839601060066276497310296385700845899734502627671244366202807168979799285196733422125874828630253195783505028941308011866959264471973821739180982893861881463421391972516267745232824952669360476216145572401140492786897546674079901107972405845351020015289305994148373

str1="Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua."
str2="Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat."
m1=str_to_num(str1)
m2=str_to_num(str2)
print ("m1=", m1)
print ("m2=", m2)
assert m1>=0 and m1<=p
assert m2>=0 and m2<=p

# AKA ephemeral key
# random for each message:
y=random.randint(1, p-2)

print ("y=", y)
s=pow(h, y, p) # AKA shared secret

C1=pow(g, y, p) # here $y$ is transmitted in 'obfuscated' way

C2=divmod(m1*s, p)[1]
# encrypted message #1:
print ("C1, C2=", C1, C2)

C2=divmod(m2*s, p)[1]
# encrypted message #2:
print ("C1, C2=", C1, C2)

How it works:

 % ./encrypt.py
m1= 29621909512940638494760964484140759984375625095067376237231284556034193927678593304693703604928742721424602503578550816330552508304029834564314939053627929063784140576852239375974072338377421082828128630065358298657550096313231682094938593100469751578082134400792524191660546879328042651332276044
m2= 87189695534124845841346330292339032268475696505825424426456330437317672577229927821912298409021431706304432575665337914113315350311192548566057684273954696388197763565201964255555319526629349718675786069561800114946423232652077255874042911215069727943586901
y= 134940895478659564763488825551678767452914904526016862981255266965373851022274611148744119297930900591711351266982417934691387640859446906471803793985267686610315267629402542318073956598371342980684122698010291540240572268992326509131182595912050971445954864494167940753125490006989751384348845805435689003644
C1, C2= 38868461258163385350952064492545978079474927358454737597039167965009500483783768589761067678611767984772751216292653023592851441121732044569477993995525509876324601315031672444371765522230733285226469927516803221370700178171737644987194715516926938219663309000601832451676068716429123819891113295284356360229 122084354265190275723695632568222305314579207473081027964362476628423056328282126056359514435352933557667128817597906218481982354732495318494696334727346175816480935530125124185238134256011126407251367580887541578293634433565790951482035240073100294523042279728773292435829170281005907316101220827916733046016
C1, C2= 38868461258163385350952064492545978079474927358454737597039167965009500483783768589761067678611767984772751216292653023592851441121732044569477993995525509876324601315031672444371765522230733285226469927516803221370700178171737644987194715516926938219663309000601832451676068716429123819891113295284356360229 67436295487534111095400425890140032426290810812117420336125568521033562691792885601278968958947708799644218777892115465612814249202184833182137563312093709192261331993748924146537777996565563567193764592610326801112313239259621703548762311180291556279832860627010357404438372742826085454887814456186341933469

It can be decrypted without a hitch:

#!/usr/bin/env python3
import sys, math

def num_to_str (t):
    return "".join(map(lambda i: chr((t >> i*8)&0xff), range(math.ceil(math.log(t,256)))))

# private key:
x= 125145567444781188291563236814448642991074009386501584522151284465576384840849460470722982180439173635742072131455048006910693927012453056391845492404236237424570006598525046184094062453272170219120390449177131205786467667439971687661069639140348041315057998555150872782256901068188155140833329567148544932439
# g/p parameters and public key:
g, p, h =  2, 137317384306991478612368304329360800376234346559644221669650048769789586302529142946225337013730395114826953750636339982051782120897412799937783869292398296942792027911662290058773308135876835152795957816699643984545269896449282627220047872998032473062103551723430882982992050101825994714784360845181100251767, 6249556833668681644083438550299497673106834839601060066276497310296385700845899734502627671244366202807168979799285196733422125874828630253195783505028941308011866959264471973821739180982893861881463421391972516267745232824952669360476216145572401140492786897546674079901107972405845351020015289305994148373

C1=int(sys.argv[1])
C2=int(sys.argv[2])

# recover $s$ from $C1$ using $x$
s=pow(C1,x,p)

# divide $C1$ by $s$
decrypted=divmod(C2*pow(s,-1,p),p)[1]

print ("decrypted=", decrypted)
print (num_to_str(decrypted))
 % ./decrypt.py 38868461258163385350952064492545978079474927358454737597039167965009500483783768589761067678611767984772751216292653023592851441121732044569477993995525509876324601315031672444371765522230733285226469927516803221370700178171737644987194715516926938219663309000601832451676068716429123819891113295284356360229 122084354265190275723695632568222305314579207473081027964362476628423056328282126056359514435352933557667128817597906218481982354732495318494696334727346175816480935530125124185238134256011126407251367580887541578293634433565790951482035240073100294523042279728773292435829170281005907316101220827916733046016
decrypted= 29621909512940638494760964484140759984375625095067376237231284556034193927678593304693703604928742721424602503578550816330552508304029834564314939053627929063784140576852239375974072338377421082828128630065358298657550096313231682094938593100469751578082134400792524191660546879328042651332276044
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

 % ./decrypt.py 38868461258163385350952064492545978079474927358454737597039167965009500483783768589761067678611767984772751216292653023592851441121732044569477993995525509876324601315031672444371765522230733285226469927516803221370700178171737644987194715516926938219663309000601832451676068716429123819891113295284356360229 67436295487534111095400425890140032426290810812117420336125568521033562691792885601278968958947708799644218777892115465612814249202184833182137563312093709192261331993748924146537777996565563567193764592610326801112313239259621703548762311180291556279832860627010357404438372742826085454887814456186341933469
decrypted= 87189695534124845841346330292339032268475696505825424426456330437317672577229927821912298409021431706304432575665337914113315350311192548566057684273954696388197763565201964255555319526629349718675786069561800114946423232652077255874042911215069727943586901
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

As you remember, ciphertext is a plaintext multiplied by 'shared secret'. By knowing plaing text and ciphertext, we can simply get it. Here, using only first plaintext/ciphertext pair, we can decrypt second ciphertext:

#!/usr/bin/env python3
import math

p=137317384306991478612368304329360800376234346559644221669650048769789586302529142946225337013730395114826953750636339982051782120897412799937783869292398296942792027911662290058773308135876835152795957816699643984545269896449282627220047872998032473062103551723430882982992050101825994714784360845181100251767

C2_v1=122084354265190275723695632568222305314579207473081027964362476628423056328282126056359514435352933557667128817597906218481982354732495318494696334727346175816480935530125124185238134256011126407251367580887541578293634433565790951482035240073100294523042279728773292435829170281005907316101220827916733046016

decrypted_v1=29621909512940638494760964484140759984375625095067376237231284556034193927678593304693703604928742721424602503578550816330552508304029834564314939053627929063784140576852239375974072338377421082828128630065358298657550096313231682094938593100469751578082134400792524191660546879328042651332276044

s=C2_v1*pow(decrypted_v1,-1,p)

C2_v2=67436295487534111095400425890140032426290810812117420336125568521033562691792885601278968958947708799644218777892115465612814249202184833182137563312093709192261331993748924146537777996565563567193764592610326801112313239259621703548762311180291556279832860627010357404438372742826085454887814456186341933469

decrypted_v2=divmod(C2_v2*pow(s,-1,p),p)[1]

def num_to_str (t):
    return "".join(map(lambda i: chr((t >> i*8)&0xff), range(math.ceil(math.log(t,256)))))

print (num_to_str(decrypted_v2))

Swapping multiplication and division

Remember I wrote that encryption is multiplication and decryption is division? Given that fact, these operations can be swapped during encryption and decryption, without loss of security. Don't know, who will do this, but I write this to give a better understanding of modulo inverses -- they are inverse operations which can be swapped.

 % diff -u encrypt_old.py encrypt_new.py
...
-C2=divmod(m*s, p)[1]
+C2=divmod(m*pow(s, -1, p), p)[1] # changed
...

 % diff -u decrypt_old.py decrypt_new.py
...
-decrypted=divmod(C2*pow(s,-1,p),p)[1]
+decrypted=divmod(C2*s,p)[1] # changed
...

Decryption without modulo inverse

This is a popular version. It's a merely optimization and harder to grasp. This is why I left it for the end of this blog post.

...
decrypted=divmod(C1**(p-1-x)*C2, p)[1]

#decrypted=divmod(pow(C1, p-1-x, p)*C2, p)[1] # another version - only Python syntax is slightly different
...

It relies on Fermat little theorem and can be simplified neatly. Here are all steps that leads to simplification, I did in Wolfram Mathematica (SageMath can't simplify pow() functions yet). (First steps are manual.)

In[24]:= pub=PowerMod[g,secret,p];

In[25]:= C1=PowerMod[g,k,p];

In[21]:= C2=Mod[m*pub^k,p]
Out[21]= Mod[m PowerMod[g,secret,p]^k,p]

In[26]:= DecryptedMessage=Mod[C1^(p-1-secret)*C2,p]
Out[26]= Mod[Mod[m PowerMod[g,secret,p]^k,p] PowerMod[g,k,p]^(-1+p-secret),p]

In[28]:= DecryptedMessage=Mod[PowerMod[g,k,p]^(p-1-secret)*Mod[m*pub^k,p],p]
Out[28]= Mod[Mod[m PowerMod[g,secret,p]^k,p] PowerMod[g,k,p]^(-1+p-secret),p]

In[30]:= DecryptedMessage=Mod[PowerMod[g,k,p]^(p-1-secret)*m*pub^k,p]
Out[30]= Mod[m PowerMod[g,k,p]^(-1+p-secret) PowerMod[g,secret,p]^k,p]

In[32]:= DecryptedMessage=Mod[g^(k*(p-1-secret))*m*pub^k,p]
Out[32]= Mod[g^(k (-1+p-secret)) m PowerMod[g,secret,p]^k,p]

In[34]:= DecryptedMessage=Mod[PowerMod[g,k,p]^(p-1-secret)*m*PowerMod[g,secret,p]^k,p]
Out[34]= Mod[m PowerMod[g,k,p]^(-1+p-secret) PowerMod[g,secret,p]^k,p]

In[36]:= DecryptedMessage=Mod[PowerMod[g,k,p]^(p-1-secret)*m*g^(secret*k),p]
Out[36]= Mod[g^(k secret) m PowerMod[g,k,p]^(-1+p-secret),p]

In[38]:= DecryptedMessage=Mod[g^(k*(p-1-secret))*m*g^(secret*k),p]
Out[38]= Mod[g^(k (-1+p-secret)+k secret) m,p]

In[39]:= Simplify[DecryptedMessage]
Out[39]= Mod[g^(k (-1+p)) m,p]

In[44]:= DecryptedMessage=Mod[g^(k-1+p)*m,p]
Out[44]= Mod[g^(-1+k+p) m,p]

... replace g^k to dummy ...

In[46]:= DecryptedMessage=Mod[dummy^(p-1)*m,p]
Out[46]= Mod[dummy^(-1+p) m,p]

In[48]:= Simplify[DecryptedMessage]
Out[48]= Mod[dummy^(-1+p) m,p]

It relies on Fermat little theorem in a sence that the last step simplified simply to 'dummy'. You may read about it here: Ctrl-F: "Fermat little theorem".

The files

Python code, Jupyter, SageMath notebooks, etc...


At reddit.

Next part: ElGamal signature scheme.

(the post first published at 20220427.)


List of my other blog posts.

Yes, I know about these lousy Disqus ads. Please use adblocker. I would consider to subscribe to 'pro' version of Disqus if the signal/noise ratio in comments would be good enough.