--- jupytext: text_representation: extension: .md format_name: myst format_version: 0.13 jupytext_version: 1.10.3 kernelspec: display_name: Python 3 (ipykernel) language: python name: python3 --- +++ {"tags": []} # Public-Key Encryption (Optional) +++ **CS1302 Introduction to Computer Programming** ___ +++ This notebook will introduce the idea of public-key (asymmetric key) cryptography called the [RSA algorithm](https://en.wikipedia.org/wiki/RSA_(cryptosystem)). +++ {"tags": []} ## Motivation +++ **What is public key encryption?** +++ Suppose there is one receiver trying to receive private messsages from multiple senders. +++ ````{prf:definition} Asymmetric key cipher Every sender uses the same public key $k_e$ to encrypt a plaintext $x$ to a ciphertext $y$ as $$ y = E(x, k_e). $$ (encrypt) The receiver with the private key $k_d$ decrypts the ciphertext back to the plaintext as $$ x = D(y, k_d). $$ (decrypt) $k_e, k_d, E, D$ is chosen to ensure - Decryptability: The receiver can recover the plaintext. - Secrecy: Anyone with only the public key but not the private key cannot recover the plaintext. ```` +++ **Exercise** (Optional) What is the benefit of asymmetric key cipher over symmetric key cipher? +++ {"nbgrader": {"grade": false, "grade_id": "benefits", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": true}, "tags": ["hide-input"]} - Easy to share keys: Encryption key can be shared to all senders publicly. - Easy to store keys: Only the receiver needs to know the private key. +++ **Is public key encryption possible?** +++ Unfortunately, public key encryption is an invertible function given a public key: +++ ````{prf:proposition} The plaintext can be recovered from the ciphertext using the public key, even without the private key. ```` +++ **Exercise** (Optional) Prove the above proposition. +++ {"nbgrader": {"grade": true, "grade_id": "invertible", "locked": false, "points": 0, "schema_version": 3, "solution": true, "task": false}, "tags": ["hide-input"]} ````{prf:proof} It suffices to show that the encryption $E(\cdot, k_e)$ is invertible for any public key $k_e$, i.e., two plaintext $x_1$ and $x_2$ are the same if and only if they encrypts to the same ciphertexts using the same key $k_e$. If two plaintexts are the same, then their ciphertexts must be the same by {eq}`encrypt`. It remains to prove the converse. If the ciphertexts are the same, i.e., $$E(x_1, k_e)=E(x_2, k_e)=y,$$ then, by {eq}`decrypt`, $$x_1 = D(y, k_d) = x_2,$$ i.e., the plaintexts are the same. The converse is trivial. ```` +++ **How to make public key encryption secure?** +++ We can make it computationally infeasible to invert $E(\cdot, k_e)$ unless with the private key $k_d$ is available. Such a function is called the [trapdoor function](https://en.wikipedia.org/wiki/Trapdoor_function#:~:text=A%20trapdoor%20function%20is%20a,are%20widely%20used%20in%20cryptography.). An example of a trapdoor function is: +++ ````{prf:proposition} Integer factorization Computing the product of of two prime numbers $p$ and $q$, i.e., $$ (p,q)\mapsto n, $$ is a trapdoor function because integer factorization (computing $p$ and $q$ from $n$) is [co-NP](https://en.wikipedia.org/wiki/Integer_factorization) (difficult). ```` +++ As the size of $p$ and $q$ increases, the time required to factor $n$ increases dramatically as illustrated [here](https://www.khanacademy.org/computing/computer-science/cryptography/modern-crypt/pi/time-complexity-exploration). +++ {"tags": []} ## RSA Algorithm +++ **How to encrypt/decrypt?** +++ The encryption and decryption use modulo exponentiation instead of addition: $$ \begin{align} E(x, k_e) &:= x^e \bmod n && \text{where }k_e:=(e,n)\\ D(c, k_d) &:= c^d \bmod n && \text{where }k_d:=(d,n), \end{align} $$ and $e$, $d$, and $n$ are positive integers. +++ Computing exponentiation can be fast using [repeated squaring](https://en.wikipedia.org/wiki/Exponentiation_by_squaring). The built-in function `pow` already has an efficient implementation: ```{code-cell} ipython3 %%timeit x, e, n = 3, 2 ** 1000000, 4 pow(x, e, n) ``` **Exercise** (Optional) Implement you own `modulo_power` using a recusion. ```{code-cell} ipython3 --- nbgrader: grade: false grade_id: modulo_power locked: false schema_version: 3 solution: true task: false tags: [hide-input] --- %%timeit def modulo_power(x, e, n): ### BEGIN SOLUTION if e == 0: return 1 % n if e == 1: return x % n return ( x * modulo_power(x ** 2, e // 2, n) if e % 2 else modulo_power(x ** 2, e // 2, n) ) ### END SOLUTION x, e, n = 3, 2 ** 1000000, 4 pow(x, e, n) ``` **How to ensure decryptability?** +++ For $x = D(E(x, k_e), k_d) = x^{ed} \bmod n$, we need $0\leq x < n$ and $$ x^{ed} \equiv x \mod n. $$ (decryptability) +++ **Exercise** (Optional) Derive the above condition using $(a^c \bmod b) = (a\bmod b)^c \bmod b$. +++ {"nbgrader": {"grade": false, "grade_id": "decryptability", "locked": true, "points": 0, "schema_version": 3, "solution": false, "task": true}, "tags": ["hide-input"]} $$ \begin{align} x &= D(E(x, k_e), k_d) \\ &= (x^{e} \bmod n)^{d} \bmod n \\ &= x^{ed} \bmod n. \end{align} $$ +++ RSA makes use of the following result to choose $(e, d, n)$: +++ ````{prf:theorem} Fermat's little Theorem If $p$ is prime, then $$ x^{p-1}\equiv 1 \mod p $$ (fermat) for any integer $x$. ```` +++ There are elegant and elementary [combinatorial proofs](https://en.wikipedia.org/wiki/Proofs_of_Fermat%27s_little_theorem#Combinatorial_proofs). +++ Since {eq}`fermat` implies $x^p = x \bmod p$, can we choose - $n=p$ and - $ed=p$ to satisfies {eq}`decryptability`? +++ No because the private key can then be easily computed from the public key: $d = n/e$. +++ Alternatively, by raising {eq}`fermat` to the power of any integer $m$, $$ x^{m(p-1)} \equiv 1 \mod p. $$ (fermat-m) +++ Can we have $n=p$ and $ed \equiv 1 \bmod p-1$? +++ No because $d$ is the modular multiplicative inverse of $e$, which is [easy to compute](https://en.wikipedia.org/wiki/Modular_multiplicative_inverse#Computation), e.g., using `pow` with an exponent of `-1`. In particular, for prime modulus here, the inverse is $d=e^{p-2}\bmod p$: ```{code-cell} ipython3 e, n = 3, 7 d = pow(e, -1, n) d, e * d % n == 1 and d == e ** (n - 2) % n ``` **How to make it difficult to compute $d$?** +++ RSA makes use of the hardness of factoring a product of large primes to create the desired trapdoor function. +++ In particular, with $m(p-1)$ in {eq}`fermat-m` being the least common multiple $\operatorname{lcm}(p-1,q-1)$ for another prime number $q$, we have $$ \begin{align} x^{\operatorname{lcm}(p-1, q-1)} &\equiv 1 \mod p && \text{and}\\ x^{\operatorname{lcm}(p-1, q-1)} &\equiv 1 \mod q && \text{by symmetry.} \end{align} $$ +++ This implies $x^{\operatorname{lcm}(p-1, q-1)} - 1$ is divisible by both $p$ and $q$, and so $$ x^{\operatorname{lcm}(p-1, q-1)} \equiv 1 \mod p q. $$ Raising both sides to the power of any positive integer $m$ give: +++ ````{prf:proposition} RSA If $p$ and $q$ are prime, then $$ x^{\overbrace{m \operatorname{lcm}(p-1, q-1)}^{ed - 1}} \equiv 1 \mod \overbrace{p q}^n $$ (rsa) for any integer $x$. This implies {eq}`decryptability` with by choosing $n = pq$ and $$ ed \equiv 1 \mod \operatorname{lcm}(p-1, q-1). $$ (ed) ```` +++ Although $d$ is still the modulo multiplicative inverse of $e$, it is with respect to $\operatorname{lcm}(p-1, q-1)$, which is not easy to compute without knowing the factors of $n$, namely $p$ and $q$. It can be shown that computing $d$ is [as hard as](https://crypto.stackexchange.com/questions/16036/is-knowing-the-private-key-of-rsa-equivalent-to-the-factorization-of-n) computing $\operatorname{lcm}(p-1, q-1)$ or factoring $n$. +++ **How to generate the public key and private key?** +++ By {eq}`ed`, we can compute $d$ as the modulo multiplicative inverse of $e$. How to choose $e$ then? +++ We can choose any $e \in \{1, \dots, \operatorname{lcm}(p-1, q-1)\}$ such that $e$ does not divide $\operatorname{lcm}(p-1, q-1)$. +++ **Exercise** (Optional) For $e$ to have the modulo multiplicative inverse, it should not divide $\operatorname{lcm}(p-1, q-1)$. Why? +++ {"nbgrader": {"grade": true, "grade_id": "inverse", "locked": false, "points": 0, "schema_version": 3, "solution": true, "task": false}, "tags": ["hide-input"]} Otherise, $ed \bmod \operatorname{lcm}(p-1, q-1) = 0$, which violates {eq}`ed` for any integer $d$. +++ The following function randomly generate the `e, d, n` for some given prime numbers `p` and `q`: ```{code-cell} ipython3 from math import gcd from random import randint def get_rsa_keys(p, q): n = p * q lcm = (p - 1) * (q - 1) // gcd(p - 1, q - 1) while True: e = randint(1, lcm - 1) if gcd(e, lcm) == 1: break d = pow(e, -1, lcm) return e, d, n, lcm ``` Note that $$ \operatorname{lcm}(p-1, q-1) = \frac{(p-1)(q-1)}{\operatorname{gcd}(p-1, q-1)}. $$ +++ As an example, if we choose two prime numbers $p=17094589121$ and $q=1062873761$: ```{code-cell} ipython3 e, d, n, lcm = get_rsa_keys(17094589121, 1062873761) e, d, n, e * d % lcm == 1 ``` The integer $1302$ can be encrypted as follows: ```{code-cell} ipython3 x = 1302 c = pow(x, e, n) print(f'The plain text "{x}" has been encrypted into "{c}".') ``` With the private key $k_d$, the ciphertext can be decrypted easily as ```{code-cell} ipython3 output = pow(c, d, n) print(f'The cipher "{c}" has been decrypted into "{output}".') ``` **Exercise** (optional) Using the `rsa` module, [generate a RSA key pair](https://stuvel.eu/python-rsa-doc/usage.html#generating-keys) with suitable length and then use it to encrypt and decrypt your own message. You can install the module as follows: ```{code-cell} ipython3 :tags: [remove-output] !pip install rsa ```