[RevEng][Math] Data compression and entropy

This is yet another blog post about entropy. This time we will measure the amount of effective information in some data, that is, its payload (not in the exploit sense).

# generate RSA key
openssl genrsa -out keypair.pem 4096
# extract public key out of it
openssl rsa -in keypair.pem -pubout -out pubkey.pub
# dump secret key:
openssl rsa -noout -text -in keypair.pem > secret.txt
# dump public key:
openssl rsa -noout -text -inform PEM -pubin -in pubkey.pub > pub.txt

Start with the public key. It has 4096 bits of high-entropy data:

RSA Public-Key: (4096 bit)
Modulus:
    00:c9:e5:db:c9:f7:ae:d0:f6:6f:44:1a:1c:54:15:
    2d:50:69:93:7e:90:3f:c4:2b:e4:7d:33:1a:78:a9:

...

    e9:b1:f8:fa:fb:90:f0:55:d7:4c:46:12:04:9d:e5:
    b3:c2:7d
Exponent: 65537 (0x10001)

Public key in PEM text format:

-----BEGIN PUBLIC KEY-----
MIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAyeXbyfeu0PZvRBocVBUt
UGmTfpA/xCvkfTMaeKnU55jReRnP9+x7OghZkYkAQT8yHL6dnSZhjs5iSCPQp31q
m/uVLx2oo/vWDI/B2ZRlqXg0xPaDw2HVIhsnhELhF9zqoIaI8j9dflCUqaixnmGo
qEemvFSTWV72aJ4tflHEm+tqaJhstQAqyp7RKOqEJgLc4GeD72hfFGYsubAVslwN
NzYe8vJlpL2CxMcckBt18udIHuh13bTGBAcrrJUQfgQe2WwYPvhs+hFIVSWO3CzN
1XJJOqpZl5WIuDeDJmC3fCxNAUdswhk1XCD0cSzgS9SkZqZFEKlqOcEGU+y3fXD/
cYvB9WqVgRe5/s6xlnXf/YLWBsKgg66JHBpM7dts7kqwZ/mqOw7cx3aVtGJYUpoc
YwtkL4bIZ+XCCp8rSM9G0j1KhDHdVZG06uWP0sY7yugzbBHiJocuRhzLfbjGpQFZ
wOHsEv9XQR7PViX0JbTMaUuc889KhtncAQbs2Z4Euy6HToiqDuggqpBwm9gv22ae
xwO4b96vRiyQv9Fh7rxMEf4wlwh6/AGRZ7UqpeoNjIEXfizfnmMGt7LqZnsCzLjW
BKnqeH89mBK/YFmv+NtcVu9oFZtYEMNmi7L4jj9Uq8VxmdB1diT5l9I4lhjz7Sbp
sfj6+5DwVddMRhIEneWzwn0CAwEAAQ==
-----END PUBLIC KEY-----

Compress the PEM text file with xz -z -9. The final size is 740 bytes. Ideally it should be 4096/8 = 512 bytes. OK, xz is not an ideal compressor, and there is a lot of overhead: base64 encoding, ASN.1 tokens, the "BEGIN PUBLIC KEY" header/footer, etc. An ideal compressor in an ideal world would produce output as close to 512 bytes as possible.
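To see how much of that gap comes from base64 and the xz container alone, here is a minimal sketch in Python. It assumes 512 random bytes are a fair stand-in for the modulus (a real modulus is not uniformly random, but it is close enough for this purpose), and it uses the standard lzma module in place of the xz command-line tool. Note that base64.encodebytes wraps lines at 76 characters, while PEM wraps at 64.

```python
import base64
import lzma
import os

# 512 random bytes stand in for a 4096-bit modulus (an assumption,
# not the actual key from this post).
payload = os.urandom(4096 // 8)

# Base64 turns every 3 bytes into 4 ASCII characters, as PEM does.
text = base64.encodebytes(payload)

# preset=9 matches the -9 level; the default container format is .xz.
packed = lzma.compress(text, preset=9)

print("raw payload:  ", len(payload), "bytes")
print("base64 text:  ", len(text), "bytes")
print("xz compressed:", len(packed), "bytes")
```

The compressed size cannot drop below 512 bytes here: the compressor can at best undo the base64 expansion, and the .xz container adds its own headers and checksum on top.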

Now compress the text dump: xz -z -9 pub.txt gives 948 bytes. Worse, but OK.

What is in the secret key?

RSA Private-Key: (4096 bit, 2 primes)
modulus:
    00:c9:e5:db:c9:f7:ae:d0:f6:6f:44:1a:1c:54:15:
...
    b3:c2:7d
publicExponent: 65537 (0x10001)
privateExponent:
    00:b1:e5:79:5e:62:81:84:da:3f:7c:10:4d:b9:c0:
...
    51:2f:39
prime1:
    00:ff:67:d5:e8:b4:85:7e:63:13:b0:e9:0f:ae:05:
...
    f3:4f
prime2:
    00:ca:5e:24:f5:c5:cc:52:f7:17:d1:09:b5:fd:fe:
...
    aa:73
exponent1:
    51:57:38:81:0c:3d:17:ab:66:32:09:87:bc:dc:54:
...
    71
exponent2:
    00:c1:9d:1d:23:7f:e1:23:27:81:43:e0:54:9c:f4:
...
    1d:fd
coefficient:
    00:b0:42:15:15:3e:f1:e3:65:44:18:bc:c8:6e:ce:
...
    c6:2a

With my comments:

RSA Private-Key: 4096 bits
modulus: 4096 bits
publicExponent: 17 bits
privateExponent: 4096 bits
prime1: 2048 bits
prime2: 2048 bits
exponent1: 2048 bits
exponent2: 2048 bits
coefficient: 2048 bits

Sum: 18449 bits, or about 2307 bytes
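The tally can be reproduced mechanically. This is only a sketch: the field names follow the openssl dump, the widths are the nominal ones from the list, and the public exponent is counted by its actual bit length. The dictionary name `fields` is my own.

```python
# Nominal field widths in bits for a 4096-bit, two-prime RSA key.
fields = {
    "modulus": 4096,
    "publicExponent": (65537).bit_length(),  # 0x10001
    "privateExponent": 4096,
    "prime1": 2048,
    "prime2": 2048,
    "exponent1": 2048,
    "exponent2": 2048,
    "coefficient": 2048,
}

total_bits = sum(fields.values())
total_bytes = -(-total_bits // 8)  # ceiling division: round up to whole bytes

print(total_bits, "bits,", total_bytes, "bytes")
```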

The PEM secret key file compressed with xz is 2676 bytes. Not ideal, but you get the idea.

Compress the dump (which consists mainly of hexadecimal digits and colon characters): xz -z -9 secret.txt gives 3172 bytes.
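Why is a hex dump so bloated? Each byte takes three characters (two hex digits and a colon) but carries only 8 bits. The sketch below estimates the zero-order Shannon entropy of such a dump. It is a simplification: the data is random bytes rather than the real key, the line breaks and indentation of the openssl output are ignored, and the helper name `bits_per_symbol` is my own.

```python
import math
import os
from collections import Counter

def bits_per_symbol(text):
    """Zero-order Shannon entropy of a string, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

# Hypothetical stand-in for a dumped field: random bytes in aa:bb:cc: form,
# so that every byte occupies exactly three characters.
data = os.urandom(4096)
dump = ":".join(f"{b:02x}" for b in data) + ":"

h = bits_per_symbol(dump)
print(f"{h:.2f} bits per character, {h * 3:.2f} bits per dumped byte")
print("information actually carried per dumped byte: 8 bits")
```

A model that looks only at character frequencies should come out near 3.58 bits per character, or roughly 10.75 bits per dumped byte. A compressor has to notice the regular placement of colons to get closer to the real 8 bits.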

# Get my TLS certificate, signed by Let's Encrypt:
openssl s_client -connect yurichev.com:443 < /dev/null > tmp
# Dump it:
openssl x509 -in tmp -text > tmp2
# Compress with xz
xz -z -9 tmp2
# Get file size:
stat tmp2.xz

... 3352 bytes or 26816 bits.

More information about entropy is in my book.

(This post was first published on 20220719.)

