This guide demonstrates how RSA (Rivest-Shamir-Adleman, from the name of the inventors) encryption (used in cryptographic software such as PGP) works using small prime numbers. While not secure for real use, it makes the math behind RSA easy to follow.
Read more about RSA: https://en.wikipedia.org/wiki/RSA_cryptosystem
How RSA Works (step-by-step)
RSA relies on number theory and modular arithmetic. A public key is used for encryption, and a private key is used for decryption.
This is similar to using an open padlock (the public key) to lock a box containing a secret message. Anyone can close the lock and send the box, but only the person with the matching key (the private key) can open it and read the message.
-
Choose two (small) prime numbers, with a product larger than 256 (to be able to encrypt bytes):
-
Compute the modulus :
-
Compute Euler’s Totient function :
-
Choose a public exponent :
must follow the rules:
We can choose:
-
Compute the private exponent :
must follow the rule:
Try values of until this holds true.
For and , works:
This Result in The Following RSA Keys:
Public key :
Private key :
Component | Role | Used for | Shared? |
---|---|---|---|
Public exponent | Encryption | Yes | |
Modulus | Both encryption and decryption | Yes | |
Private exponent | Decryption | No | |
Prime numbers | Internal math only | No |
When Used for Encryption:
let be a number (65) or a character (A) to encrypt.
So, 65 is encrypted into 143.
When Used for Decryption:
We have now recovered the original number (65).
Python Script for Demonstration
The script belows is using the primes above for demonstration. You can input a message and get an encrypted version of it, as well as a decrypted version of it.
Warning
Do not use this implementation for secure encryption
def gcd(a, b):
while b:
a, b = b, a % b
return a
def mod_inverse(e, phi):
for d in range(1, phi):
if (d * e) % phi == 1:
return d
return None
def rsa_keygen(p, q, e):
n = p * q
phi = (p - 1) * (q - 1)
if gcd(e, phi) != 1:
raise ValueError("e must be coprime with phi(n)")
d = mod_inverse(e, phi)
return {'public': (e, n), 'private': (d, n)}
def encrypt(public_key, plaintext_numbers):
e, n = public_key
return [pow(num, e, n) for num in plaintext_numbers]
def decrypt(private_key, ciphertext):
d, n = private_key
return [pow(num, d, n) for num in ciphertext]
def text_to_ascii(text):
return [ord(c) for c in text]
def ascii_to_text(ascii_list):
return ''.join(chr(num) for num in ascii_list)
# ==== DEMO VALUES ====
# Small primes for demo (p * q must be larger than 256)
p = 17
q = 23
e = 3 # must be coprime of phi(p * q)
keys = rsa_keygen(p, q, e)
public_key = keys['public']
private_key = keys['private']
print("Public Key:", public_key)
print("Private Key:", private_key)
# Text to encrypt
message = input("Enter message to encrypt: ")
ascii_values = text_to_ascii(message)
print("Original ASCII:", ascii_values)
# Encrypt
encrypted = encrypt(public_key, ascii_values)
print("Encrypted:", encrypted)
# Decrypt
decrypted_ascii = decrypt(private_key, encrypted)
print("Decrypted ASCII:", decrypted_ascii)
# Convert back to text
decrypted_text = ascii_to_text(decrypted_ascii)
print("Decrypted Text:", decrypted_text)
Notes About The Script
- Use small primes only for learning.
- In real world RSA, and are hundreds of digits long.
- RSA in itself is deterministic, which isn’t suitable for cryptographically safe encryption.
Extended Python Program
Below is a more extensive program that include mor efficient algorithms, support for larger primes and results in nondeterministic output.
import argparse
import random
import time
import sys
import base64
def style(text, color='green'):
colors = {'red': '31', 'green': '32', 'yellow': '33', 'blue': '36'}
code = colors.get(color, '0')
return f"\033[{code}m{text}\033[0m"
def base64_encode(c):
length = max((ci.bit_length() + 7) // 8 for ci in c)
header = length.to_bytes(4, 'big')
stream = b''.join(ci.to_bytes(length, 'big') for ci in c)
return base64.b64encode(header + stream).decode()
def base64_decode(b):
raw = base64.b64decode(b.strip())
length = int.from_bytes(raw[:4], byteorder='big')
stream = raw[4:]
return [int.from_bytes(stream[i:i + length], 'big') for i in range(0, len(stream), length)]
def prime_check(n, k=5):
if n < 2:
return False
if n in (2, 3):
return True
if n % 2 == 0:
return False
r, d = 0, n - 1
while d % 2 == 0:
d //= 2
r += 1
for _ in range(k):
a = random.randrange(2, n - 2)
if (x := pow(a, d, n)) == 1 or x == n - 1:
continue
for _ in range(r - 1):
if (x := pow(x, 2, n)) == n - 1:
break
else:
return False
return True
def prime_generate(bit_size):
while True:
p = random.getrandbits(bit_size)
if p % 2 == 0:
p += 1
while not prime_check(p):
p += 2
return p
def rsa_gcd(a, b):
while b:
a, b = b, a % b
return a
def rsa_extended_gcd(phi, e):
_r, r = phi, e
_s, s = 1, 0
_t, t = 0, 1
while r > 0:
q = _r // r
_r, r = r, _r - q * r
_s, s = s, _s - q * s
_t, t = t, _t - q * t
if _r == 1:
return _t + phi
def rsa_coprime_exponent(phi):
for e in range(3, phi, 2):
if rsa_gcd(phi, e) == 1:
return e
def rsa_generate_keys(p, q):
n = p * q
phi = (p - 1) * (q - 1)
e = rsa_coprime_exponent(phi)
d = rsa_extended_gcd(phi, e)
return e, d, n
def rsa_encrypt(message, e, n):
size = (n.bit_length() - 1) // 8
chunks = [message[i:i + size] for i in range(0, len(message), size)]
return [pow(int.from_bytes(chunk, 'big'), e, n) for chunk in chunks]
def rsa_decrypt(cipher_chunks, d, n):
size = (n.bit_length() - 1) // 8
result = bytearray()
for i, c in enumerate(cipher_chunks):
m = pow(c, d, n)
chunk = m.to_bytes(size, 'big')
if i == len(cipher_chunks) - 1: # remove padding of last chunk
chunk = chunk.lstrip(b'\x00')
result.extend(chunk)
return bytes(result)
def apply_xor_mask_to_string(message, nondeterministic=True):
masked = []
for x in message.encode("utf-8", errors="replace"):
masked.append(x ^ random.randint(0, 255) if nondeterministic else 0)
return masked
def recover_message_from_xor_mask(masked, nondeterministic=True):
message = bytearray()
for x in masked:
message.append(x ^ random.randint(0, 255) if nondeterministic else 0)
return message.decode("utf-8", errors="replace")
def print_keys(p, q, e, d, n):
print(style("\n[ PRIMES USED ]", "yellow"))
print(f"p = {p}")
print(f"q = {q}")
print(style("\n[ KEYS GENERATED ]", "green"))
print(f"Public Key (e, n): ({e}, {n})")
print(f"Private Key (d, n): ({d}, {n})")
def print_rng(seed, step):
print(style("\n[ RNG ]", "green"))
print(f"Seed: {seed}")
print(f"Step: {step}")
def print_encryption_steps(message, masked, prefix, encrypted_chunks):
print(style("\n[ ENCRYPTION STEPS BREAKDOWN ]", "red"))
print(style("\n[ 1. MESSAGE INPUT ]", "red"))
print(f"{message}")
print(style("\n[ 2. UTF-8 BYTE ENCODING ]", "red"))
print(f"Each character from input gets encoded as bytes")
print(f"{list(message.encode())}")
print(style("\n[ 3. XOR MASKING ]", "red"))
print(f"Each byte gets obfuscated using a random byte based of RNG seed")
print(f"{masked}")
print(style("\n[ 4. PREFIXING RNG SEED & RANDOM NOISE ]", "red"))
print(f"A prefix gets created containing the seed & random noise")
print(f"{prefix}")
print(style("\n[ 5. CONCATENATE PREFIX + XOR MASKED DATA ]", "red"))
print(f"The prefix and the XOR masked data gets concatenated")
print(f"{list(prefix + masked)}")
print(style("\n[ 6. CHUNKING OF DATA & RSA ENCRYPTION ]", "red"))
print(f"Chunking into ≤ N-byte segments (based on RSA modulus)")
print(f"RSA Encrypt chunks")
print(f"{encrypted_chunks}")
def print_decryption_steps(ciphertext, masked, prefix, recovered):
print(style("\n[ DECRYPTION STEPS BREAKDOWN ]", "red"))
print(style("\n[ 1. CIPHERTEXT INPUT ]", "red"))
print(f"{ciphertext}")
print(style("\n[ 2. RSA DECRYPTION & DECHUNKING OF DATA ]", "red"))
print(f"RSA Decrypt into chunks")
print(f"Dechunking from ≤ N-byte segments (based on RSA modulus)")
print(f"{list(prefix + masked)}")
print(style("\n[ 3. RNG SEED EXTRACTION ]", "red"))
print(f"Prefix and masked data gets separated, and the noise is ignored")
print(style("\n[ 4. XOR UNMASKING ]", "red"))
print(f"Each byte gets de-obfuscated using a random byte based of RNG seed")
print(f"{list(recovered.encode())}")
print(style("\n[ 5. UTF-8 BYTE DECODING ]", "red"))
print(f"Each byte from input gets decoded as UTF-8")
print(f"{recovered}")
def main():
parser = argparse.ArgumentParser(description="RSA encryption tool (educational only)")
parser.add_argument("-e", "--encrypt", action="store_true", help="Encrypt mode")
parser.add_argument("-d", "--decrypt", action="store_true", help="Decrypt mode")
parser.add_argument("-g", "--generate", action="store_true", help="Generate random primes")
parser.add_argument("-p", type=int, help="Prime p")
parser.add_argument("-q", type=int, help="Prime q")
parser.add_argument("-t", "--deterministic", action="store_true", help="Disable XOR masking")
parser.add_argument("-n", action="store_true", help="Enable XOR masking (default)")
parser.add_argument("-b", "--base64", action="store_true", help="Output ciphertext as base64")
parser.add_argument("--bits", type=int, default=16, help="Bit size for generated primes (default: 16)")
args = parser.parse_args()
encrypt = not args.decrypt
nondeterministic = not args.deterministic
# generate primes
if args.generate:
p = prime_generate(args.bits)
q = prime_generate(args.bits)
while p == q:
q = prime_generate(args.bits)
else:
p = args.p or 61
q = args.q or 53
# check primes
if not prime_check(p) or not prime_check(q):
print(style("Error: Both -p and -q must be prime numbers.", "red"), file=sys.stderr)
sys.exit(1)
elif p == q:
print(style("Error: Primes p and q must be distinct.", "red"), file=sys.stderr)
sys.exit(1)
elif p * q < 256:
print(style("Error: The product of p and q must be greater than 256.", "red"), file=sys.stderr)
sys.exit(1)
# generate keys
e, d, n = rsa_generate_keys(p, q)
print_keys(p, q, e, d, n)
if encrypt:
# generate RNG
seed = random.randint(0, 2**32 - 1)
random.seed(seed)
step = random.randint(0, 9)
print_rng(seed, step)
message = input(style("\n[ MESSAGE INPUT ]\n> ", "blue"))
# xor-mask data using randomized bytes
masked = apply_xor_mask_to_string(message, nondeterministic)
# prefix seed and noise as raw bytes
prefix = list(seed.to_bytes(4, byteorder='big'))
for _ in range(step):
prefix.append(random.randint(0, 255))
# encrypt the prefix and masked data
c = rsa_encrypt(bytes(prefix + masked), e, n)
print_encryption_steps(message, masked, prefix, c)
if args.base64:
print(style("\n[ ENCRYPTED OUTPUT (BASE64) ]", "yellow"))
print(base64_encode(c))
else:
print(style("\n[ ENCRYPTED OUTPUT (INTS) ]", "yellow"))
print(" ".join(map(str, c)))
else:
if args.base64:
line = input(style("\n[ CIPHERTEXT INPUT (BASE64) ]\n> ", "blue"))
c = base64_decode(line)
else:
line = input(style("\n[ CIPTHERTEXT INPUT (INTS) ]\n> ", "blue"))
c = list(map(int, line.strip().split()))
encrypted = rsa_decrypt(c, d, n)
# retrieve RNG
seed = int.from_bytes(encrypted[:4], byteorder='big')
random.seed(seed)
step = random.randint(0, 9)
print_rng(seed, step)
# split prefix and masked data
prefix = list(encrypted[:(4 + step)])
masked = list(encrypted[(4 + step):])
recovered = recover_message_from_xor_mask(masked, nondeterministic)
print_decryption_steps(c, masked, prefix, recovered)
print(style("\n[ DECRYPTED MESSAGE ]", "yellow"))
print(recovered)
if __name__ == "__main__":
main()
Description of The Program
A standard RSA implementation yields a strictly deterministic transformation1:
This means the same input always results in the same ciphertext — rendering it potentially vulnerable to chosen plaintext or replay attacks where repeated patterns can be observed.
This program improves upon that by introducing a layer of non-determinism prior to encryption:
- Seeded XOR masking applies a pseudorandom mask to the message, controlled by a seed and noise step. This program uses Python’s
random.seed(seed)
to control the pseudorandom values used in XOR masking. When the same seed is reused, the same sequence of random bytes is generated — making it possible to reproduce the mask and correctly reverse the XOR during decryption. This is not cryptographically secure, but useful for educational purposes. This means even with the same message and key, different ciphertexts may result if seeded differently.
Let be the message bytes, , then:
Encryption then proceeds with:
This results in:
Message Formatting Structure
Field | Size | Description |
---|---|---|
Seed | 4 bytes | RNG seed for deterministic pseudorandomness |
Noise | 0-9 bytes | Added noise depending on step from RNG |
Encrypted message | Variable | + First XOR masked plaintext + Then RSA-encrypted: |
Benefits
- Prevents deterministic ciphertexts for identical plaintexts.
- Enables repeatable encryption/decryption when the same seed is used.
- Makes pattern-based analysis harder for adversaries.
Limitations
While this system demonstrates improved conceptual security, it is not ready for production-grade cryptographic applications. Below are key issues and recommendations:
- Prime generation isn’t cryptographically secure:
Uses Python’s
random.getrandbits()
and a basic Miller-Rabin implementation, which is not resistant to side-channel attacks and vulnerable to advanced factorization methods. However, this could be easily mitigated using thesecrets
module and a robust Miller-Rabin with rounds. - XOR Masking ≠ Cryptographic Padding: XOR masking introduces obfuscation and nondeterminism, not encryption. Masking relies on a predictable seed and gives no semantic security if an attacker knows masking pattern. However, this could be easily mitigated using OAEP or AES for nondeterministic features.
- No Side-Channel or Timing Attack Protections: Modular exponentiation and key operations are not constant-time, as timing-resistance is not offered by Python. However, libraries like OPENSSL and Libsodium could provide this.
- No Integrity or Authenticity Check: No verification methods proving the message was unaltered or came from the right source is implemented. For message authenticity, one could generate a digital signature using the private key or applying HMAC after encryption for message integrity.
Deterministic RSA vs RSA + XOR Masking vs RSA + OAEP
Feature | Deterministic RSA | RSA + XOR Masking2 | RSA + OAEP |
---|---|---|---|
Deterministic | Yes | No | No |
Semantic security | None | Weak (depends on XOR strength) | Strong |
Uses secure randomness | None | Reproducible RNG (seeded random ) | Cryptographically secure RNG |
Padding scheme | None | Custom XOR + prefix | OAEP (Optimal Asymmetric Encryption) |
Resistant to chosen plaintext attacks | No | Somewhat (due to obfuscation, however weak to due to random ) | Yes |
Decryptability | Always | Yes, due to same embeded seed | Yes |
Production-ready? | No | No | Yes (if implemented properly) |
style(text, color='green')
Adds ANSI color codes for terminal output styling. It uses escape sequences to display text in colored formats (e.g., green, red). It does not affect logic or encryption but serves to aid in user experience.
base64_encode(c)
and base64_decode(b)
These functions encode and decode integer lists to/from base64 strings. They are not reimplementations of the base64 encoding and decoding, but extends upon them.
Encoding:
- Let be the list of ciphertext integers.
- Find the maximum byte length:
- Store as a 4-byte prefix in header.
- Serialize each as a fixed-length byte array of length .
- Concatenate header and serialized data and apply base64 encoding.
Decoding:
- Decode base64 stream to bytes.
- Read first 4 bytes to extract .
- Deserialize the reminder as integers of length .
Use case: Provides a portable string representation of RSA-encrypted integer chunks.
Time complexity: , where is the number of integers.
prime_check(n, k=5)
- Miller-Rabin primality test
This is a probabilistic primality test that determines whether a number is likely prime.
- Given an odd integer , express: , where is a positive integer and is an odd positive integer.
- For (the greater the , the safer the test) iterations:
- Randomly pick
- Compute
- If , continue.
- Otherwise, repeatedly square up to times, checking if
- If not, is composite.
Time complexity: using fast modular exponentiation.
prime_generate(bit_size)
Generates a prime with bithlength() = bit_size.
- Randomly samples an odd number
- Applies Miller-Rabin test until success
Expected time complexity: in practice, due to distribution of primes.
rsa_gcd(a, b)
Computes greatest common divisor using Euclidean algorithm:
Time complexity:
rsa_extended_gcd(phi, e)
Implementation of the Extended Euclidean Algorithm.
- Finds integers and such that:
- Returns (i.e., the modular inverse ).
Use case: Finding the private exponent .
Time complexity:
rsa_coprime_exponent(phi)
Linearly scans odd integers starting from 3 until:
Use case: Finding the public exponent .
Time complexity: Worst case , typically small due to early hits.
rsa_generate_keys(p, q)
Computes RSA key pair:
- $n = pq4
- Finds , such that
- Compute
- Returns tuple
rsa_encrypt(message, e, n)
and rsa_decrypt(cipher_chunks, d, n)
Chunk size calculation:
- bytes
Encryption:
- Split message into byte chunks
- Encrypt:
Decryption:
- Decrypt:
- Reconstruct original bytes
pow(a, b, c):
Python’s built-in pow(a, b, mod)
performs modular exponentiation, efficiently computing even for large exponents. Internally, it uses algorithms like square-and-multiply, making it much faster and memory-efficient than computing a ** b
first.
Time complexity: per encryption/decryption where is the number of chunks.
apply_xor_mask_to_string(message, nondeterministic=True)
Masks each byte with random byte , using a defined RNG seed.
If nondeterministic = False
then
Time complexity:
recover_message_from_xor_mask(masked, nondeterministic=True)
Inverts XOR masking using same RNG sequence. Since XOR is involutive: .
Time complexity:
print_keys(...), print_rng(...), print_encryption_steps(...), print_decryption_steps(...)
These provide annotated visualizations of key generation, RNG behavior, and the encryption/decryption pipeline. They are purely informative.
main()
Argument Parsing
Uses argparse
to allow flexible input configuration for primes, bit size, encryption/decryption mode, masking, and usage of base64 input/output.
Encryption Flow
- Key generation: Generate primes
- RNG generation: Seed + noise step stored for later use.
- Message input:
- Message preparation:
- Encode
- Mask with using RNG seed:
- Prefix with RNG seed + noise =
- Final data
- Mask with using RNG seed:
- Encode
- Chunking + encryption:
- Apply RSA encryption on byte string:
- Apply RSA encryption on byte string:
Decryption Flow
- RSA decryption of ciphertext chunks:
- Seed extraction from prefix , reseed RNG.
- XOR unmasking to restore message:
- Decode
Example Usage
Generate (weak) 8 bit primes, encrypt “test” and output the ciphertext as base64.
$ python rsa.py --bits 8 -g -e -b
[ PRIMES USED ]
p = 163
q = 59
[ KEYS GENERATED ]
Public Key (e, n): (5, 9617)
Private Key (d, n): (7517, 9617)
[ RNG ]
Seed: 547092045
Step: 1
[ MESSAGE INPUT ]
> Test
[ ENCRYPTION STEPS BREAKDOWN ]
[ 1. MESSAGE INPUT ]
Test
[ 2. UTF-8 BYTE ENCODING ]
Each character from input gets encoded as bytes
[84, 101, 115, 116]
[ 3. XOR MASKING ]
Each byte gets obfuscated using a random byte based of RNG seed
[240, 165, 72, 96]
[ 4. PREFIXING RNG SEED & RANDOM NOISE ]
A prefix gets created containing the seed & random noise
[32, 155, 246, 77, 113]
[ 5. CONCATENATE PREFIX + XOR MASKED DATA ]
The prefix and the XOR masked data gets concatenated
[32, 155, 246, 77, 113, 240, 165, 72, 96]
[ 6. CHUNKING OF DATA & RSA ENCRYPTION ]
Chunking into ≤ N-byte segments (based on RSA modulus)
RSA Encrypt chunks
[719, 1788, 3181, 2571, 7023, 6157, 2803, 6083, 1611]
[ ENCRYPTED OUTPUT (BASE64) ]
AAAAAgLPBvwMbQoLG28YDQrzF8MGSw==
Use the primes (163, 59), decrypt “AAAAAgLPBvwMbQoLG28YDQrzF8MGSw==” as ciphertext input in base64.
$ python rsa.py -p 163 -q 59 -d -b
[ PRIMES USED ]
p = 163
q = 59
[ KEYS GENERATED ]
Public Key (e, n): (5, 9617)
Private Key (d, n): (7517, 9617)
[ CIPHERTEXT INPUT (BASE64) ]
> AAAAAgLPBvwMbQoLG28YDQrzF8MGSw==
[ RNG ]
Seed: 547092045
Step: 1
[ DECRYPTION STEPS BREAKDOWN ]
[ 1. CIPHERTEXT INPUT ]
[719, 1788, 3181, 2571, 7023, 6157, 2803, 6083, 1611]
[ 2. RSA DECRYPTION & DECHUNKING OF DATA ]
RSA Decrypt into chunks
Dechunking from ≤ N-byte segments (based on RSA modulus)
[32, 155, 246, 77, 113, 240, 165, 72, 96]
[ 3. RNG SEED EXTRACTION ]
Prefix and masked data gets separated, and the noise is ignored
[ 4. XOR UNMASKING ]
Each byte gets de-obfuscated using a random byte based of RNG seed
[84, 101, 115, 116]
[ 5. UTF-8 BYTE DECODING ]
Each byte from input gets decoded as UTF-8
Test
[ DECRYPTED MESSAGE ]
Test