Cryptanalysis


Cryptanalysis refers to the process of analyzing information systems in order to understand hidden aspects of the systems. Cryptanalysis is used to breach cryptographic security systems and gain access to the contents of encrypted messages, even if the cryptographic key is unknown.
In addition to mathematical analysis of cryptographic algorithms, cryptanalysis includes the study of side-channel attacks that do not target weaknesses in the cryptographic algorithms themselves, but instead exploit weaknesses in their implementation.
Even though the goal has been the same, the methods and techniques of cryptanalysis have changed drastically through the history of cryptography, adapting to increasing cryptographic complexity, ranging from the pen-and-paper methods of the past, through machines like the British Bombes and Colossus computers at Bletchley Park in World War II, to the mathematically advanced computerized schemes of the present. Methods for breaking modern cryptosystems often involve solving carefully constructed problems in pure mathematics, the best-known being integer factorization.

Overview

In encryption, confidential information (the plaintext) is sent securely to a recipient by the sender first converting it into an unreadable form (the ciphertext) using an encryption algorithm. The ciphertext is sent through an insecure channel to the recipient. The recipient decrypts the ciphertext by applying an inverse decryption algorithm, recovering the plaintext. To decrypt the ciphertext, the recipient requires secret knowledge from the sender, usually a string of letters, numbers, or bits, called a cryptographic key. The concept is that even if an unauthorized person gets access to the ciphertext during transmission, without the secret key they cannot convert it back to plaintext.
Encryption has been used throughout history to send important military, diplomatic and commercial messages, and today is very widely used in computer networking to protect email and internet communication.
The goal of cryptanalysis is for a third party, a cryptanalyst, to gain as much information as possible about the original message (the plaintext), attempting to "break" the encryption in order to read the ciphertext and to learn the secret key so that future messages can be decrypted and read. A mathematical technique to do this is called a cryptographic attack. Cryptographic attacks can be characterized in a number of ways:

Amount of information available to the attacker

Attacks can be classified based on what type of information the attacker has available. As a basic starting point it is normally assumed that, for the purposes of analysis, the general algorithm is known; this is Shannon's maxim, "the enemy knows the system", which is in turn equivalent to Kerckhoffs's principle. This is a reasonable assumption in practice: throughout history, there are countless examples of secret algorithms falling into wider knowledge, variously through espionage, betrayal and reverse engineering. The main attack models are:
  • Ciphertext-only: the cryptanalyst has access only to a collection of ciphertexts or codetexts.
  • Known-plaintext: the attacker has a set of ciphertexts to which they know the corresponding plaintext.
  • Chosen-plaintext: the attacker can obtain the ciphertexts corresponding to an arbitrary set of plaintexts of their own choosing.
  • Adaptive chosen-plaintext: like a chosen-plaintext attack, except the attacker can choose subsequent plaintexts based on information learned from previous encryptions, similar to the adaptive chosen-ciphertext attack.
  • Related-key attack: like a chosen-plaintext attack, except the attacker can obtain ciphertexts encrypted under two different keys. The keys are unknown, but the relationship between them is known; for example, two keys that differ in one bit.
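As an illustration of why the attack model matters, the following minimal Python sketch shows a known-plaintext attack against a toy repeating-key XOR cipher. The cipher, the function names `xor_cipher` and `recover_key_known_plaintext`, and the assumption that the attacker already knows the key length are all hypothetical choices made for this example; they are not taken from any real system.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy cipher: XOR each byte with the repeating key.

    Encryption and decryption are the same operation.
    """
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def recover_key_known_plaintext(plaintext: bytes, ciphertext: bytes,
                                key_len: int) -> bytes:
    """Recover the key from one matching plaintext/ciphertext pair.

    Assumes the key length is known and that at least key_len bytes
    of plaintext are known. Since c = p XOR k, it follows k = p XOR c.
    """
    return bytes(p ^ c for p, c in zip(plaintext[:key_len],
                                       ciphertext[:key_len]))


# Usage: one known pair is enough to read every later message.
secret_key = b"k3y"
known_plain = b"attack at dawn"
known_cipher = xor_cipher(known_plain, secret_key)

recovered = recover_key_known_plaintext(known_plain, known_cipher, 3)
later_cipher = xor_cipher(b"retreat at dusk", secret_key)
later_plain = xor_cipher(later_cipher, recovered)
```

With only ciphertexts available, the same toy cipher would require statistical analysis instead, which is why a cipher is usually judged against the strongest attack model it must withstand.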

Computational resources required

Attacks can also be characterised by the resources they require. Those resources include:
  • Time – the number of computation steps which must be performed.
  • Memory – the amount of storage required to perform the attack.
  • Data – the quantity and type of plaintexts and ciphertexts required for a particular approach.
It is sometimes difficult to predict these quantities precisely, especially when the attack is not practical to actually implement for testing. But academic cryptanalysts tend to provide at least the estimated order of magnitude of their attacks' difficulty, saying, for example, "SHA-1 collisions now 2^52."
Bruce Schneier notes that even computationally impractical attacks can be considered breaks: "Breaking a cipher simply means finding a weakness in the cipher that can be exploited with a complexity less than brute force. Never mind that brute-force might require 2^128 encryptions; an attack requiring 2^110 encryptions would be considered a break...simply put, a break can just be a certificational weakness: evidence that the cipher does not perform as advertised."
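The arithmetic behind such a "certificational" break can be made concrete. The sketch below compares the two work factors from the quotation; the helper names and the rate of 10^12 trial encryptions per second are assumptions chosen only for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def speedup_factor(brute_force_bits: int, attack_bits: int) -> int:
    """How many times cheaper the attack is than exhaustive search."""
    return 2 ** (brute_force_bits - attack_bits)


def years_required(work_bits: int, operations_per_second: float) -> float:
    """Rough time to perform 2**work_bits operations at a fixed rate."""
    return 2 ** work_bits / operations_per_second / SECONDS_PER_YEAR


# The attack is 2^18 (262,144) times faster than brute force ...
factor = speedup_factor(128, 110)

# ... yet at an assumed 10^12 operations per second it would still
# take far longer than any practical time scale.
years = years_required(110, 1e12)
```

Under these assumptions the attack counts as a break in the academic sense while remaining far out of reach in practice.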

Partial breaks

The results of cryptanalysis can also vary in usefulness. Cryptographer Lars Knudsen classified various types of attack on block ciphers according to the amount and quality of secret information that was discovered:
  • Total break – the attacker deduces the secret key.
  • Global deduction – the attacker discovers a functionally equivalent algorithm for encryption and decryption, but without learning the key.
  • Instance deduction – the attacker discovers additional plaintexts not previously known.
  • Information deduction – the attacker gains some Shannon information about plaintexts not previously known.
  • Distinguishing algorithm – the attacker can distinguish the cipher from a random permutation.
Academic attacks are often against weakened versions of a cryptosystem, such as a block cipher or hash function with some rounds removed. Many, but not all, attacks become exponentially more difficult to execute as rounds are added to a cryptosystem, so it's possible for the full cryptosystem to be strong even though reduced-round variants are weak. Nonetheless, partial breaks that come close to breaking the original cryptosystem may mean that a full break will follow; the successful attacks on DES, MD5, and SHA-1 were all preceded by attacks on weakened versions.
In academic cryptography, a weakness or a break in a scheme is usually defined quite conservatively: it might require impractical amounts of time, memory, or known plaintexts. It also might require the attacker be able to do things many real-world attackers can't: for example, the attacker may need to choose particular plaintexts to be encrypted or even to ask for plaintexts to be encrypted using several keys related to the secret key. Furthermore, it might only reveal a small amount of information, enough to prove the cryptosystem imperfect but too little to be useful to real-world attackers. Finally, an attack might only apply to a weakened version of cryptographic tools, like a reduced-round block cipher, as a step towards breaking the full system.

History

Cryptanalysis has coevolved with cryptography, and the contest can be traced through the history of cryptography: new ciphers are designed to replace old broken designs, and new cryptanalytic techniques are invented to crack the improved schemes. In practice, they are viewed as two sides of the same coin: secure cryptography requires design against possible cryptanalysis.

Classical ciphers

Although the actual word "cryptanalysis" is relatively recent, methods for breaking codes and ciphers are much older. David Kahn notes in The Codebreakers that Arab scholars were the first people to systematically document cryptanalytic methods.
The first known recorded explanation of cryptanalysis was given by Al-Kindi, a 9th-century Arab polymath, in Risalah fi Istikhraj al-Mu'amma. This treatise contains the first description of the method of frequency analysis. Al-Kindi is thus regarded as the first codebreaker in history. His breakthrough work was influenced by Al-Khalil, who wrote the Book of Cryptographic Messages, which contains the first use of permutations and combinations to list all possible Arabic words with and without vowels.
Frequency analysis is the basic tool for breaking most classical ciphers. In natural languages, certain letters of the alphabet appear more often than others; in English, "E" is likely to be the most common letter in any sample of plaintext. Similarly, the digraph "TH" is the most likely pair of letters in English, and so on. Frequency analysis relies on a cipher failing to hide these statistics. For example, in a simple substitution cipher, the most frequent letter in the ciphertext would be a likely candidate for "E". Frequency analysis of such a cipher is therefore relatively easy, provided that the ciphertext is long enough to give a reasonably representative count of the letters of the alphabet that it contains.
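The idea can be sketched in a few lines of Python. For brevity the example attacks a shift (Caesar) cipher, a special case of simple substitution in which guessing a single letter reveals the whole key; a general substitution cipher would need further letters and digraphs to be matched. The function names are hypothetical, and the guess only works when "E" really is the most common plaintext letter and the ciphertext contains at least one letter.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase


def shift_encrypt(text: str, shift: int) -> str:
    """Shift every letter by `shift` places; other characters pass through."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


def letter_counts(text: str) -> Counter:
    """Count how often each letter A-Z occurs in the text."""
    return Counter(ch for ch in text.upper() if ch in ALPHABET)


def guess_shift(ciphertext: str) -> int:
    """Assume the most frequent ciphertext letter stands for 'E'."""
    most_common_letter, _ = letter_counts(ciphertext).most_common(1)[0]
    return (ALPHABET.index(most_common_letter) - ALPHABET.index("E")) % 26


# Usage: the cipher fails to hide letter statistics, so the key leaks.
plaintext = "MEET ME NEAR THE SECRET TREE AT ELEVEN"
ciphertext = shift_encrypt(plaintext, 3)
guessed = guess_shift(ciphertext)
decrypted = shift_encrypt(ciphertext, -guessed)
```

On short or unusual texts the most frequent letter may not correspond to "E", which is why the ciphertext must be long enough to be statistically representative.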
Al-Kindi's invention of the frequency analysis technique for breaking monoalphabetic substitution ciphers was the most significant cryptanalytic advance until World War II. His Risalah fi Istikhraj al-Mu'amma described the first cryptanalytic techniques, including some for polyalphabetic ciphers, cipher classification, Arabic phonetics and syntax, and, most importantly, gave the first descriptions of frequency analysis. He also covered methods of encipherment, cryptanalysis of certain encipherments, and statistical analysis of letters and letter combinations in Arabic. An important contribution of Ibn Adlan concerned the sample size needed for frequency analysis.
In Europe, Italian scholar Giambattista della Porta was the author of a seminal work on cryptanalysis, De Furtivis Literarum Notis.
Successful cryptanalysis has undoubtedly influenced history; the ability to read the presumed-secret thoughts and plans of others can be a decisive advantage. For example, in England in 1587, Mary, Queen of Scots was tried and executed for treason as a result of her involvement in three plots to assassinate Elizabeth I of England. The plans came to light after her coded correspondence with fellow conspirators was deciphered by Thomas Phelippes.
In Europe during the 15th and 16th centuries, the idea of a polyalphabetic substitution cipher was developed by, among others, the French diplomat Blaise de Vigenère. For some three centuries, the Vigenère cipher, which uses a repeating key to select different encryption alphabets in rotation, was considered to be completely secure. Nevertheless, Charles Babbage and later, independently, Friedrich Kasiski succeeded in breaking this cipher. During World War I, inventors in several countries developed rotor cipher machines such as Arthur Scherbius' Enigma, in an attempt to minimise the repetition that had been exploited to break the Vigenère system.
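The repetition in question can be demonstrated with a short Python sketch of the approach commonly called Kasiski examination: when the same plaintext fragment is enciphered at positions a multiple of the key length apart, the ciphertext repeats too, so the distances between repeated ciphertext fragments tend to be multiples of the key length. The function names, the restriction to uppercase A-Z, and the sample message and key are assumptions made for this illustration.

```python
from functools import reduce
from math import gcd


def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Encrypt uppercase A-Z text with a repeating uppercase key."""
    out = []
    for i, ch in enumerate(plaintext):
        shift = ord(key[i % len(key)]) - ord("A")
        out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)


def repeat_distances(ciphertext: str, n: int = 3) -> list[int]:
    """Distances between successive occurrences of each repeated n-gram."""
    last_seen = {}
    distances = []
    for i in range(len(ciphertext) - n + 1):
        gram = ciphertext[i:i + n]
        if gram in last_seen:
            distances.append(i - last_seen[gram])
        last_seen[gram] = i
    return distances


def kasiski_gcd(ciphertext: str, n: int = 3) -> int:
    """GCD of all repeat distances; returns 0 if nothing repeats."""
    return reduce(gcd, repeat_distances(ciphertext, n), 0)


# Usage: the phrase "ATTACKATD" occurs twice, 12 letters apart.
ciphertext = vigenere_encrypt("ATTACKATDAWNATTACKATDUSK", "KEY")
distances = repeat_distances(ciphertext)
candidate = kasiski_gcd(ciphertext)
```

Here every repeat distance is 12, so the key length is likely a divisor of 12 (the true length, 3, is one of them). Once a key length is fixed, each column of the ciphertext is a simple shift cipher that can be attacked with frequency analysis. On real ciphertexts, coincidental repeats can distort the GCD, so the result is a hint rather than a proof.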