RSA cryptosystem


The RSA 'cryptosystem' is a family of public-key cryptosystems, one of the oldest widely used for secure data transmission. The initialism "RSA" comes from the surnames of Ron Rivest, Adi Shamir and Leonard Adleman, who publicly described the algorithm in 1977. An equivalent system was developed secretly in 1973 at Government Communications Headquarters, the British signals intelligence agency, by the English mathematician Clifford Cocks. That system was declassified in 1997.
RSA is used in digital signature such as RSASSA-PSS or RSA-FDH,
public-key encryption of very short messages such as RSAES-OAEP,
and public-key key encapsulation.
In RSA-based cryptography, a user's private key—which can be used to sign messages, or decrypt messages sent to that user—is a pair of large prime numbers chosen at random and kept secret.
A user's public key—which can be used to verify messages from the user, or encrypt messages so that only that user can decrypt them—is the product of the prime numbers.
The security of RSA is related to the difficulty of factoring the product of two large prime numbers, the "factoring problem". Breaking RSA encryption is known as the RSA problem. Whether it is as difficult as the factoring problem is an open question. There are no published methods to defeat the system if a large enough key is used.

History

The idea of an asymmetric public-private key cryptosystem is attributed to Whitfield Diffie and Martin Hellman, who published this concept in 1976. They also introduced digital signatures and attempted to apply number theory. Their formulation used a shared-secret-key created from exponentiation of some number, modulo a prime number. However, they left open the problem of realizing a one-way function, possibly because the difficulty of factoring was not well-studied at the time. Moreover, like Diffie-Hellman, RSA is based on modular exponentiation.
Ron Rivest, Adi Shamir, and Leonard Adleman at the Massachusetts Institute of Technology made several attempts over the course of a year to create a function that was hard to invert. Rivest and Shamir, as computer scientists, proposed many potential functions, while Adleman, as a mathematician, was responsible for finding their weaknesses. They tried many approaches, including "knapsack-based" and "permutation polynomials". For a time, they thought what they wanted to achieve was impossible due to contradictory requirements. In April 1977, they spent Passover at the house of a student and drank a good deal of wine before returning to their homes at around midnight. Rivest, unable to sleep, lay on the couch with a math textbook and started thinking about their one-way function. He spent the rest of the night formalizing his idea, and he had much of the paper ready by daybreak. The algorithm is now known as RSA the initials of their surnames in same order as their paper.
Clifford Cocks, an English mathematician working for the British intelligence agency Government Communications Headquarters, described a similar system in an internal document in 1973. However, given the relatively expensive computers needed to implement it at the time, it was considered to be mostly a curiosity and, as far as is publicly known, was never deployed. His ideas and concepts were not revealed until 1997 due to its top-secret classification.
Kid-RSA is a simplified, insecure public-key cipher published in 1997, designed for educational purposes. Kid-RSA gives insight into RSA and other public-key ciphers, analogous to simplified DES.

Patent

A patent describing the RSA algorithm was granted to MIT on 20 September 1983: "Cryptographic communications system and method". From DWPI's abstract of the patent:
A detailed description of the algorithm was published in August 1977, in Scientific Americans Mathematical Games column. This preceded the patent's filing date of December 1977. Consequently, the patent had no legal standing outside the United States. Had Cocks' work been publicly known, a patent in the United States would not have been legal either.
When the patent was issued, terms of patent were 17 years. The patent was about to expire on 21 September 2000, but RSA Security released the algorithm to the public domain on 6 September 2000.

Operation

The RSA algorithm involves four steps: key generation, key distribution, public-key operation, and private key operation.
A basic principle behind RSA is the observation that it is practical to find three very large positive integers,, and, such that for all integers , both and have the same remainder when divided by :However, when given only and, it is infeasible to compute th roots modulo ; that is, for uniform random , it is extremely difficult to find such that.
The integers and form the public key and is the private key. The modular exponentiation to the power of is used in encryption and in verifying signatures, and exponentiation to the power of is used in decryption and in signing messages.

Key generation

The keys for the RSA algorithm are generated in the following way:
  1. Choose two large prime numbers and.
  2. * To make factoring infeasible, and must be chosen at random from a large space of possibilities, such as all prime numbers between and . Many different algorithms for prime selection are used in practice.
  3. * and are kept secret.
  4. Compute.
  5. * is used as the modulus for both the public and private keys. Its length, usually expressed in bits, is the key length.
  6. * is released as part of the public key.
  7. Compute, where is Carmichael's totient function. Since, and since and are prime,, and likewise. Hence.
  8. * The may be calculated through the Euclidean algorithm, since.
  9. * is kept secret.
  10. Choose an integer such that and ; that is, and are coprime.
  11. * having a short bit-length and small Hamming weight results in more efficient encryption the most commonly chosen value for is. The smallest possible value for is 3, but such a small value for may expose vulnerabilities in insecure padding schemes.
  12. * is released as part of the public key.
  13. Determine as ; that is, is the modular multiplicative inverse of modulo.
  14. * This means: solve for the equation ; can be computed efficiently by using the extended Euclidean algorithm, since, thanks to and being coprime, said equation is a form of Bézout's identity, where is one of the coefficients.
  15. * is kept secret as the private key exponent.
The public key consists of the modulus and the public exponent. The private key consists of the private exponent, which must be kept secret.,, and must also be kept secret because they can be used to calculate. In fact, they can all be discarded after has been computed.
In the original RSA paper, the Euler totient function is used instead of for calculating the private exponent. Since is always divisible by, the algorithm works as well. The possibility of using Euler totient function results also from Lagrange's theorem applied to the multiplicative group of integers modulo pq. Thus any satisfying also satisfies. However, computing modulo will sometimes yield a result that is larger than necessary. Most of the implementations of RSA will accept exponents generated using either method, but some standards such as may require that. Any "oversized" private exponents not meeting this criterion may always be reduced modulo to obtain a smaller equivalent exponent.
Note: The authors of the original RSA paper carry out the key generation by choosing and then computing as the modular multiplicative inverse of modulo, whereas most current implementations of RSA, such as those following PKCS#1, do the reverse—choose and compute from it. Since can safely be small and fixed, whereas must be chosen from a large enough space to resist attack, the modern approach can reduce the cost of the public-key operation without loss of security.

Key distribution

Suppose that Bob wants to send secret messages to Alice, or verify messages from Alice. If they decide to use RSA, Bob must know Alice's public key to encrypt his secret messages or verify Alice's messages, and Alice must use her private key to decrypt Bob's secret messages or sign her own messages.
To enable Bob to send his encrypted messages or verify her future messages, Alice transmits her public key to Bob via a reliable, but not necessarily secret, route. Alice's private key is never distributed.

Encryption

After Bob obtains Alice's public key, he can send a message to Alice.
To do it, he first turns into an integer, the padded plaintext, such that, by using an agreed-upon reversible protocol known as a [|padding scheme]. He then computes the ciphertext, using Alice's public key, by:
This can be done reasonably quickly, even for very large numbers, using modular exponentiation. Bob then transmits to Alice. Note that at least nine values of will yield a ciphertext equal to
, but this is very unlikely to occur in practice.

Decryption

Alice can recover from by using her private key exponent by computing
Given, she can recover the original message by reversing the padding scheme, or discard it as corrupted if the padding is invalid.
Alice must discard if the padding is invalid: if she reveals any information about when it has invalid padding, an adversary could exploit this to decrypt messages without knowing the private key, by sending her random or maliciously crafted ciphertexts and observing how she responds.

Example

Here is an example of RSA encryption and decryption, ignoring the details of padding:
  1. Choose two distinct prime numbers, such as
  2. : and.
  3. Compute giving
  4. :
  5. Compute the Carmichael's totient function of the product as giving
  6. :
  7. Choose any number that is coprime to 780. Choosing a prime number for leaves us only to check that is not a divisor of 780.
  8. : Let.
  9. Compute, the modular multiplicative inverse of, yielding
as
The public key is. For a padded plaintext message, the encryption function is
The private key is. For an encrypted ciphertext, the decryption function is
For instance, in order to encrypt, one calculates
To decrypt, one calculates
Both of these calculations can be computed efficiently using the square-and-multiply algorithm for modular exponentiation. In real-life situations the primes selected would be much larger; in our example it would be trivial to factor back to the primes and., also from the public key, is then inverted to get, thus acquiring the private key.
Practical implementations use the Chinese remainder theorem to speed up the calculation using modulus of factors.
The values, and, which are part of the private key are computed as follows:
Here is how, and are used for efficient decryption :