These are my notes on the Ethereum keystore. I just created a key (or address) with the password '1234'. Feel free to use it, but there is no ETH at this address.
% ./geth account new
...
Your new account is locked with a password. Please give a password. Do not forget this password.
Password:
Repeat password:
Your new key was generated
Public address of the key:   0xEC1dC9454d4727294CFB22c341F5560F5293b974
Path of the secret key file: /home/i/.ethereum/keystore/UTC--2022-11-30T18-28-00.028017872Z--ec1dc9454d4727294cfb22c341f5560f5293b974
...
What is in this file?
{"address":"ec1dc9454d4727294cfb22c341f5560f5293b974","crypto":{"cipher":"aes-128-ctr","ciphertext":"33391ffea38bffa499221c34f15ebd8802544c1e357c545044235aa1e2e5d18c","cipherparams":{"iv":"796d401810173d1b2ada5ced48018e14"},"kdf":"scrypt","kdfparams":{"dklen":32,"n":262144,"p":1,"r":8,"salt":"2b4615e610fe3eada89c884fbb34f4894fdd348345dcac43ada4062041ca5176"},"mac":"fe231144df6d3568ee5caecd4c62ac9d0169af99c55e43937aa626ca12ff7eb4"},"id":"ff9f7c5d-d65c-4aa2-9374-d56d4f37a682","version":3}
Tidy version:
% cat UTC--2022-11-30T18-28-00.028017872Z--ec1dc9454d4727294cfb22c341f5560f5293b974 | jq
{
  "address": "ec1dc9454d4727294cfb22c341f5560f5293b974",
  "crypto": {
    "cipher": "aes-128-ctr",
    "ciphertext": "33391ffea38bffa499221c34f15ebd8802544c1e357c545044235aa1e2e5d18c",
    "cipherparams": {
      "iv": "796d401810173d1b2ada5ced48018e14"
    },
    "kdf": "scrypt",
    "kdfparams": {
      "dklen": 32,
      "n": 262144,
      "p": 1,
      "r": 8,
      "salt": "2b4615e610fe3eada89c884fbb34f4894fdd348345dcac43ada4062041ca5176"
    },
    "mac": "fe231144df6d3568ee5caecd4c62ac9d0169af99c55e43937aa626ca12ff7eb4"
  },
  "id": "ff9f7c5d-d65c-4aa2-9374-d56d4f37a682",
  "version": 3
}
Let's see what each field means.
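As a quick orientation, here is a minimal sketch that parses the keystore JSON shown above and reports the size of each hex-encoded field. The helper name summarize is mine, not part of Geth. The sizes already hint at the roles: a 32-byte ciphertext (a 256-bit private key), a 16-byte IV (one AES block), and a 32-byte salt and MAC.

```python
import json

# The keystore file content, copied verbatim from above.
KEYSTORE_JSON = (
    '{"address":"ec1dc9454d4727294cfb22c341f5560f5293b974",'
    '"crypto":{"cipher":"aes-128-ctr",'
    '"ciphertext":"33391ffea38bffa499221c34f15ebd8802544c1e357c545044235aa1e2e5d18c",'
    '"cipherparams":{"iv":"796d401810173d1b2ada5ced48018e14"},'
    '"kdf":"scrypt",'
    '"kdfparams":{"dklen":32,"n":262144,"p":1,"r":8,'
    '"salt":"2b4615e610fe3eada89c884fbb34f4894fdd348345dcac43ada4062041ca5176"},'
    '"mac":"fe231144df6d3568ee5caecd4c62ac9d0169af99c55e43937aa626ca12ff7eb4"},'
    '"id":"ff9f7c5d-d65c-4aa2-9374-d56d4f37a682","version":3}'
)


def summarize(keystore):
    """Return the algorithms and the byte length of each hex-encoded field."""
    crypto = keystore["crypto"]
    return {
        "cipher": crypto["cipher"],
        "kdf": crypto["kdf"],
        "dklen": crypto["kdfparams"]["dklen"],
        "ciphertext_bytes": len(bytes.fromhex(crypto["ciphertext"])),
        "iv_bytes": len(bytes.fromhex(crypto["cipherparams"]["iv"])),
        "salt_bytes": len(bytes.fromhex(crypto["kdfparams"]["salt"])),
        "mac_bytes": len(bytes.fromhex(crypto["mac"])),
    }


summary = summarize(json.loads(KEYSTORE_JSON))
for name, value in summary.items():
    print(f"{name}: {value}")
```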
You have probably heard about GPU/FPGA/ASIC mining rigs. Algorithms like SHA-1/SHA-2 run much faster on these platforms, because the computation can be parallelized easily.
This is where so-called 'memory-hard functions' (MHFs) come into play. They use RAM extensively.
All these devices (CPU, GPU, FPGA and ASIC) can use DDR RAM without problems. But RAM accesses cannot be parallelized in the same way, so RAM becomes the bottleneck. No matter how much faster your GPU/FPGA/ASIC is, its DDR RAM performs the same as if it were attached to a mainstream CPU.
This levels the playing field for all CPU/GPU/FPGA/ASIC owners. Basically, this is protection against brute-force attacks.
Well-known examples are scrypt and Argon2.
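To put a number on "uses RAM extensively": scrypt's main working array holds N blocks of 128*r bytes each, so a single evaluation needs roughly 128*N*r bytes. A small sketch with the parameters from the keystore above (the function name is mine, and the much smaller 128*r*p scratch buffer is ignored):

```python
def scrypt_memory_bytes(n, r):
    """Approximate size of scrypt's main working array: N blocks of 128*r bytes."""
    return 128 * n * r


mem = scrypt_memory_bytes(n=262144, r=8)
print(f"{mem} bytes = {mem // 2**20} MiB")  # prints "268435456 bytes = 256 MiB"
```

So every single password guess has to touch about 256 MiB of RAM.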
Let's measure scrypt performance:
#!/usr/bin/env python3

import hashlib

def key(password, data):
    key = hashlib.scrypt(
        bytes(password, 'utf-8'),
        salt=bytes("12345", "utf-8"),
        n=262144,
        r=8,
        p=1,
        maxmem=2000000000,
        dklen=32
    )
    return key

for _ in range(60):
    key("password", "random data")
My venerable Intel(R) Xeon(R) CPU E31220 @ 3.10GHz spends about one second on each scrypt call. Note that Python's hashlib.scrypt is backed by OpenSSL's C implementation, so a pure C version would not be much faster. But you get the idea.
BTW, this is why you feel lag when you 'unlock' your account in Geth.
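If you want to see the cost on your own machine, here is a minimal timing sketch. The helper time_scrypt and the maxmem formula are my own choices (maxmem is only an upper limit, set generously above the roughly 128*N*r bytes scrypt needs). The demo uses small values of n so it finishes quickly.

```python
import hashlib
import time


def time_scrypt(n, r=8, p=1, repeats=3):
    """Return the average wall-clock seconds of one scrypt call."""
    maxmem = 256 * n * r + (1 << 20)  # generous limit, not an allocation
    start = time.perf_counter()
    for _ in range(repeats):
        hashlib.scrypt(b"password", salt=b"12345",
                       n=n, r=r, p=p, maxmem=maxmem, dklen=32)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    for n in (2**12, 2**14, 2**16):
        print(f"n={n}: {time_scrypt(n):.4f} s per call")
```

Calling time_scrypt(262144) measures the setting Geth used for the keystore above. Time and memory both grow roughly linearly with n.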
Now let's decrypt the keystore file. My code is a reworked version of the code by David Egan.
#!/usr/bin/env python3

import hashlib
import json
import sys
from getpass import getpass

# pip install pycryptodome
from Crypto.Cipher import AES
from Crypto.Util import Counter
from Crypto.Hash import keccak

def password_to_key(password, scrypt_params):
    # Derive the key from the password using the parameters stored in the keystore.
    key = hashlib.scrypt(
        bytes(password, 'utf-8'),
        salt=bytes.fromhex(scrypt_params["salt"]),
        n=scrypt_params["n"],
        r=scrypt_params["r"],
        p=scrypt_params["p"],
        maxmem=2000000000,
        dklen=scrypt_params["dklen"]
    )
    return key

def verify_key(key, ciphertext, mac):
    # MAC = Keccak-256(second half of the derived key + ciphertext)
    validate = key[16:] + bytes.fromhex(ciphertext)
    k = keccak.new(digest_bits=256)
    k.update(validate)
    return mac == k.hexdigest()

def read_json(filename):
    with open(filename) as f_in:
        return json.load(f_in)

def main(filename):
    # Named 'keystore' so that it does not shadow the json module.
    keystore = read_json(filename)
    data = keystore["crypto"]
    password = getpass()
    #password="1234"
    k = password_to_key(password, data["kdfparams"])
    print("key=", k.hex())
    if verify_key(k, data["ciphertext"], data["mac"]):
        print("Password verified.")
        iv_int = int(data["cipherparams"]["iv"], 16)
        ctr = Counter.new(AES.block_size * 8, initial_value=iv_int)
        # The first half of the derived key is the AES-128 key.
        dec_suite = AES.new(k[:16], AES.MODE_CTR, counter=ctr)
        decrypted_private_key = dec_suite.decrypt(bytes.fromhex(data["ciphertext"]))
        print("Private key:", decrypted_private_key.hex())
    else:
        print("Password NOT verified.")

filename = sys.argv[1]
main(filename)
Basically, the password is 'crunched' via scrypt into a 32-byte derived key. The first 16 bytes of that derived key are used as the key for AES-128 in CTR mode.
You then decrypt 'ciphertext' using AES-128 and the IV stored in the JSON file. The decrypted plaintext is the private key for your Ethereum address.
But how can you be sure that the password is correct? CTR mode will happily 'decrypt' the 'ciphertext' with any key and just produce garbage.
The second half of the scrypt-derived key (bytes 16 to 31) is concatenated with 'ciphertext' and hashed using Keccak-256. This is the original Keccak that Ethereum uses, which differs from the standardized SHA-3 only in its padding. The resulting hash is called the MAC. The MAC works like a checksum, and here it is stored in the JSON file as one. Since the 'ciphertext' (not the plaintext) is hashed, this is an 'encrypt-then-MAC' scheme: you can check the MAC before decrypting 'ciphertext'. Other 'checksum' schemes may be much more vulnerable and may leak some information about the encrypted private key.
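The MAC check can be reproduced without pycryptodome. Below is a compact pure-Python Keccak sketch, written for illustration only (it is slow and is not what Geth uses). The function names are mine. The same sponge with padding byte 0x06 instead of 0x01 gives SHA3-256, which lets us cross-check the code against hashlib. The derived key, ciphertext and MAC are the values that appear in these notes; if they are consistent, the last line should report a match.

```python
import hashlib

MASK64 = (1 << 64) - 1
RATE = 136  # bytes absorbed per block for a 256-bit digest (1088 bits)


def rol64(value, shift):
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to lanes[x][y]."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each column parity into the neighbouring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate every lane and move it to a new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi: the only non-linear step
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: add the round constant, generated by a small LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def sponge256(data, suffix):
    """Pad, absorb, and squeeze 32 bytes.
    suffix is 0x01 for original Keccak and 0x06 for standardized SHA-3."""
    padded = bytearray(data) + bytes(RATE - len(data) % RATE)
    padded[len(data)] ^= suffix
    padded[-1] ^= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for offset in range(0, len(padded), RATE):
        for i in range(RATE // 8):
            chunk = padded[offset + 8 * i:offset + 8 * i + 8]
            lanes[i % 5][i // 5] ^= int.from_bytes(chunk, "little")
        lanes = keccak_f1600(lanes)
    return b"".join(lanes[i][0].to_bytes(8, "little") for i in range(4))


def keccak256(data):
    return sponge256(data, 0x01)


def keystore_mac(derived_key, ciphertext):
    """MAC = Keccak-256(derived_key[16:32] + ciphertext), as a hex string."""
    return keccak256(derived_key[16:32] + ciphertext).hex()


# Values copied from these notes.
DERIVED_KEY = bytes.fromhex(
    "6cd27944e3a46ec32e853f46f9ce5a7bc0681bc1a7268718324641d0d56b58a7")
CIPHERTEXT = bytes.fromhex(
    "33391ffea38bffa499221c34f15ebd8802544c1e357c545044235aa1e2e5d18c")
MAC = "fe231144df6d3568ee5caecd4c62ac9d0169af99c55e43937aa626ca12ff7eb4"

print("agrees with hashlib SHA3-256:",
      sponge256(b"abc", 0x06) == hashlib.sha3_256(b"abc").digest())
print("MAC matches keystore:", keystore_mac(DERIVED_KEY, CIPHERTEXT) == MAC)
```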
Let's run it with our JSON key:
% ./keystore_decrypt.py UTC--2022-11-30T18-28-00.028017872Z--ec1dc9454d4727294cfb22c341f5560f5293b974
Password:
key= 6cd27944e3a46ec32e853f46f9ce5a7bc0681bc1a7268718324641d0d56b58a7
Password verified.
Private key: 9a1fee826569e2b9c53ca973d031a90f50d70b78689d58a3c2b76bf7da32cded
This is the private key, meant to be kept secret.
Note: you can't use a password directly as an AES key, for many reasons. Instead, key derivation functions (KDFs) are used. Scrypt is also a KDF.
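One of those reasons is simply size and entropy: '1234' is 4 bytes, while AES-128 wants 16 uniformly random-looking bytes. A minimal sketch of the KDF step, using only hashlib: derive_key and split_key are names I made up, KDFPARAMS is copied from the keystore above, and the maxmem formula is my own generous limit. The demo lowers n so it runs instantly.

```python
import hashlib

# kdfparams copied from the keystore file shown earlier.
KDFPARAMS = {
    "dklen": 32, "n": 262144, "p": 1, "r": 8,
    "salt": "2b4615e610fe3eada89c884fbb34f4894fdd348345dcac43ada4062041ca5176",
}


def derive_key(password, params):
    """Stretch a password of any length into params['dklen'] bytes with scrypt."""
    n, r, p = params["n"], params["r"], params["p"]
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=bytes.fromhex(params["salt"]),
        n=n, r=r, p=p,
        maxmem=256 * n * r + (1 << 20),  # upper limit only, not an allocation
        dklen=params["dklen"],
    )


def split_key(derived):
    """First 16 bytes: AES-128 key. Next 16 bytes: input to the MAC."""
    return derived[:16], derived[16:32]


if __name__ == "__main__":
    cheap = dict(KDFPARAMS, n=1024)  # reduced cost, for a quick demo only
    aes_key, mac_key = split_key(derive_key("1234", cheap))
    print("AES key:", aes_key.hex())
    print("MAC key:", mac_key.hex())
```

Calling derive_key("1234", KDFPARAMS) with the real parameters takes about a second and, assuming the values in these notes are consistent, should reproduce the key= line printed by the decryption script.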
Let's see how to derive the public key and the address from the private key.
Ethereum uses the 'secp256k1' elliptic curve. Here we compute the public key point (X/Y) and then convert it to a hex string. It must not be 'compressed'. (More on 'compressed' keys.)
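To show what "get the public key point" means, here is a dependency-free sketch of secp256k1 scalar multiplication and of the two SEC1 encodings. It is for illustration only: it is slow, not constant-time, and must not be used with real keys. The function names are mine; the curve constants are the standard secp256k1 parameters, and the private key is the one decrypted above.

```python
# secp256k1 domain parameters: y^2 = x^3 + 7 over the prime field P
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a + (-a) = infinity
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, P - 2, P) % P  # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, P - 2, P) % P  # chord
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Double-and-add: compute k * point."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def public_key(private_key):
    """The public key is the private scalar times the generator G."""
    if not 1 <= private_key < N:
        raise ValueError("private key out of range")
    return scalar_mult(private_key, G)


def encode_point(point, compressed):
    """SEC1 encoding: 0x04||X||Y, or 0x02/0x03||X where the prefix is Y's parity."""
    x, y = point
    if compressed:
        return bytes([2 + (y & 1)]) + x.to_bytes(32, "big")
    return b"\x04" + x.to_bytes(32, "big") + y.to_bytes(32, "big")


if __name__ == "__main__":
    pri = int("9a1fee826569e2b9c53ca973d031a90f50d70b78689d58a3c2b76bf7da32cded", 16)
    pub = public_key(pri)
    print("uncompressed:", encode_point(pub, compressed=False).hex())
    print("compressed:  ", encode_point(pub, compressed=True).hex())
```

The compressed form drops Y because it can be recovered from X and one parity bit. Ethereum addresses are computed from the full X and Y, hence the uncompressed form.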
Part of the encoded public key (everything except the leading 0x04 byte, i.e. the 64 bytes of X and Y) is then hashed with Keccak-256 again. The last 20 bytes of that hash are the Ethereum address.
#!/usr/bin/env python3

import sys

pri_key = sys.argv[1]

# from https://docs.ethers.io/v5/api/utils/address/
#pri_key="b976778317b23a1385ec2d483eda6904d9319135b89f1d8eee9f6d2593e2665d"

# pip install fastecdsa
import fastecdsa.keys
import fastecdsa.curve
import fastecdsa.encoding.sec1

curve = fastecdsa.curve.secp256k1
private_key_raw = int(pri_key, base=16)
pubkey = fastecdsa.keys.get_public_key(private_key_raw, curve)

print("pub key as EC point:")
print(pubkey)

#t = fastecdsa.encoding.sec1.SEC1Encoder().encode_public_key(pubkey, compressed=True)
t = fastecdsa.encoding.sec1.SEC1Encoder().encode_public_key(pubkey, compressed=False)

print("not compressed encoded pub key:")
print("0x" + t.hex())

# pip install pysha3
import sha3

z = sha3.keccak_256()
z.update(t[1:])
print("sha3 of pub key:")
print(z.hexdigest()[24:])
% ./prikey_to_address.py 9a1fee826569e2b9c53ca973d031a90f50d70b78689d58a3c2b76bf7da32cded
pub key as EC point:
X: 0xa28333c0d6a8362f839676a7abe5577c520067eace17613caacd82458f340e7c
Y: 0x6ce1fb092b4605b647c52546ab8ad2d0fac89c84f66d5f34aa3d4cf653b5ad14
(On curve <secp256k1>)
not compressed encoded pub key:
0x04a28333c0d6a8362f839676a7abe5577c520067eace17613caacd82458f340e7c6ce1fb092b4605b647c52546ab8ad2d0fac89c84f66d5f34aa3d4cf653b5ad14
sha3 of pub key:
ec1dc9454d4727294cfb22c341f5560f5293b974
ec1dc9454d4727294cfb22c341f5560f5293b974 is the same address we see in the JSON file.
So basically, there is only one thing in your keystore: the EC private key, protected with a password.