Block cipher

In cryptography, a block cipher is a deterministic algorithm that operates on fixed-length groups of bits, called blocks. Block ciphers are the elementary building blocks of many cryptographic protocols. They are ubiquitous in the storage and exchange of data, where such data is secured and authenticated via encryption.
A block cipher uses blocks as an unvarying transformation. Even a secure block cipher is suitable for the encryption of only a single block of data at a time, using a fixed key. A multitude of modes of operation have been designed to allow their repeated use in a secure way to achieve the other security goals of confidentiality and authenticity. However, block ciphers may also feature as building blocks in other cryptographic protocols, such as universal hash functions and pseudorandom number generators.

Definition

A block cipher consists of two paired algorithms, one for encryption,, and the other for decryption,. Both algorithms accept two inputs: an input block of size bits and a key of size bits; and both yield an -bit output block. The decryption algorithm is defined to be the inverse function of encryption, i.e.,. More formally, a block cipher is specified by an encryption function
which takes as input a key, of bit length, and a bit string, of length, and returns a string of bits. is called the plaintext, and is termed the ciphertext. For each, the function is required to be an invertible mapping on. The inverse for is defined as a function
taking a key and a ciphertext to return a plaintext value, such that
For example, a block cipher encryption algorithm might take a 128-bit block of plaintext as input, and output a corresponding 128-bit block of ciphertext. The exact transformation is controlled using a second input – the secret key. Decryption is similar: the decryption algorithm takes, in this example, a 128-bit block of ciphertext together with the secret key, and yields the original 128-bit block of plain text.
For each key K, E_K is a permutation over the set of input blocks. Each key selects one permutation from the set of possible permutations.

History

The modern design of block ciphers is based on the concept of an iterated product cipher. In his seminal 1949 publication, Communication Theory of Secrecy Systems, Claude Shannon analyzed product ciphers and suggested them as a means of effectively improving security by combining simple operations such as substitutions and permutations. Iterated product ciphers carry out encryption in multiple rounds, each of which uses a different subkey derived from the original key. One widespread implementation of such ciphers named a Feistel network after Horst Feistel is notably implemented in the DES cipher. Many other realizations of block ciphers, such as the AES, are classified as substitution–permutation networks.
The root of all cryptographic block formats used within the Payment Card Industry Data Security Standard and American National Standards Institute standards lies with the Atalla Key Block, which was a key innovation of the Atalla Box, the first hardware security module. It was developed in 1972 by Mohamed M. Atalla, founder of Atalla Corporation, and released in 1973. The AKB was a key block, which is required to securely interchange symmetric keys or PINs with other actors in the banking industry. This secure interchange is performed using the AKB format. The Atalla Box protected over 90% of all ATM networks in operation as of 1998, and Atalla products still secure the majority of the world's ATM transactions as of 2014.
The publication of the DES cipher by the United States National Bureau of Standards in 1977 was fundamental in the public understanding of modern block cipher design. It also influenced the academic development of cryptanalytic attacks. Both differential and linear cryptanalysis arose out of studies on DES design., there is a palette of attack techniques against which a block cipher must be secure, in addition to being robust against brute-force attacks.

Design

Iterated block ciphers

Most block cipher algorithms are classified as iterated block ciphers which means that they transform fixed-size blocks of plaintext into identically sized blocks of ciphertext, via the repeated application of an invertible transformation known as the round function, with each iteration referred to as a round.
Usually, the round function R takes different round keys ''K_i as a second input, which is derived from the original key:
where is the plaintext and the ciphertext, with r'' being the number of rounds.
Frequently, key whitening is used in addition to this. At the beginning and the end, the data is modified with key material :
Given one of the standard iterated block cipher design schemes, it is fairly easy to construct a block cipher that is cryptographically secure, simply by using a large number of rounds. However, this will make the cipher inefficient. Thus, efficiency is the most important additional design criterion for professional ciphers. Further, a good block cipher is designed to avoid side-channel attacks, such as branch prediction and input-dependent memory accesses that might leak secret data via the cache state or the execution time. In addition, the cipher should be concise, for small hardware and software implementations.

Substitution–permutation networks

One important type of iterated block cipher known as a substitution–permutation network takes a block of the plaintext and the key as inputs and applies several alternating rounds consisting of a substitution stage followed by a permutation stage—to produce each block of ciphertext output. The non-linear substitution stage mixes the key bits with those of the plaintext, creating Shannon's confusion. The linear permutation stage then dissipates redundancies, creating diffusion.
A substitution box substitutes a small block of input bits with another block of output bits. This substitution must be one-to-one, to ensure invertibility. A secure S-box will have the property that changing one input bit will change about half of the output bits on average, exhibiting what is known as the avalanche effect—i.e. it has the property that each output bit will depend on every input bit.
A permutation box is a permutation of all the bits: it takes the outputs of all the S-boxes of one round, permutes the bits, and feeds them into the S-boxes of the next round. A good P-box has the property that the output bits of any S-box are distributed to as many S-box inputs as possible.
At each round, the round key is combined using some group operation, typically XOR.
Decryption is done by simply reversing the process.

Feistel ciphers

In a Feistel cipher, the block of plain text to be encrypted is split into two equal-sized halves. The round function is applied to one half, using a subkey, and then the output is XORed with the other half. The two halves are then swapped.
Let be the round function and let
be the sub-keys for the rounds respectively.
Then the basic operation is as follows:
Split the plaintext block into two equal pieces,
For each round, compute
Then the ciphertext is.
The decryption of a ciphertext is accomplished by computing for
Then is the plaintext again.
One advantage of the Feistel model compared to a substitution–permutation network is that the round function does not have to be invertible.

Lai–Massey ciphers

The Lai–Massey scheme offers security properties similar to those of the Feistel structure. It also shares the advantage that the round function does not have to be invertible. Another similarity is that it also splits the input block into two equal pieces. However, the round function is applied to the difference between the two, and the result is then added to both half blocks.
Let be the round function and a half-round function and let be the sub-keys for the rounds respectively.
Then the basic operation is as follows:
Split the plaintext block into two equal pieces,
For each round, compute
where and
Then the ciphertext is.
The decryption of a ciphertext is accomplished by computing for
where and
Then is the plaintext again.

Operations

ARX (add–rotate–XOR)

Many modern block ciphers and hashes are ARX algorithms—their round function involves only three operations: modular addition, rotation with fixed rotation amounts, and XOR. Examples include ChaCha20, Speck, XXTEA, and BLAKE. Many authors draw an ARX network, a kind of data flow diagram, to illustrate such a round function.
These ARX operations are popular because they are relatively fast and cheap in hardware and software, their implementation can be made extremely simple, and also because they run in constant time, and therefore are immune to timing attacks. The rotational cryptanalysis technique attempts to attack such round functions.

Other operations

Other operations often used in block ciphers include data-dependent rotations as in RC5 and RC6, a substitution box implemented as a lookup table as in Data Encryption Standard and Advanced Encryption Standard, a permutation box, and multiplication as in IDEA.

Modes of operation

A block cipher by itself allows encryption only of a single data block of the cipher's block length. For a variable-length message, the data must first be partitioned into separate cipher blocks. In the simplest case, known as electronic codebook mode, a message is first split into separate blocks of the cipher's block size, and then each block is encrypted and decrypted independently. However, such a naive method is generally insecure because equal plaintext blocks will always generate equal ciphertext blocks, so patterns in the plaintext message become evident in the ciphertext output.
To overcome this limitation, several so-called block cipher modes of operation have been designed and specified in national recommendations such as NIST 800-38A and BSI TR-02102 and international standards such as ISO/IEC 10116. The general concept is to use randomization of the plaintext data based on an additional input value, frequently called an initialization vector, to create what is termed probabilistic encryption. In the popular cipher block chaining mode, for encryption to be secure the initialization vector passed along with the plaintext message must be a random or pseudo-random value, which is added in an exclusive-or manner to the first plaintext block before it is encrypted. The resultant ciphertext block is then used as the new initialization vector for the next plaintext block. In the cipher feedback mode, which emulates a self-synchronizing stream cipher, the initialization vector is first encrypted and then added to the plaintext block. The output feedback mode repeatedly encrypts the initialization vector to create a key stream for the emulation of a synchronous stream cipher. The newer counter mode similarly creates a key stream, but has the advantage of only needing unique and not random values as initialization vectors; the needed randomness is derived internally by using the initialization vector as a block counter and encrypting this counter for each block.
From a security-theoretic point of view, modes of operation must provide what is known as semantic security. Informally, it means that given some ciphertext under an unknown key one cannot practically derive any information from the ciphertext over what one would have known without seeing the ciphertext. It has been shown that all of the modes discussed above, with the exception of the ECB mode, provide this property under so-called chosen plaintext attacks.

Padding

Some modes such as the CBC mode only operate on complete plaintext blocks. Simply extending the last block of a message with zero bits is insufficient since it does not allow a receiver to easily distinguish messages that differ only in the number of padding bits. More importantly, such a simple solution gives rise to very efficient oracle attacks. A suitable padding scheme is therefore needed to extend the last plaintext block to the cipher's block size. While many popular schemes described in standards and in the literature have been shown to be vulnerable to padding oracle attacks, a solution that adds a one-bit and then extends the last block with zero-bits, standardized as "padding method 2" in ISO/IEC 9797-1, has been proven secure against these attacks.

Cryptanalysis

Cryptanalysis is the technique in which ciphers are decrypted without knowledge of the used key. Different attacks can be employed based on the information available to the cryptanalyst, these Attack models are:Ciphertext-only: the cryptanalyst has access only to a collection of ciphertexts or codetexts.Known-plaintext: the attacker has a set of ciphertexts to which they know the corresponding plaintext.Chosen-plaintext : the attacker can obtain the ciphertexts corresponding to an arbitrary set of plaintexts of their own choosing.Adaptive chosen-plaintext: like a chosen-plaintext attack, except the attacker can choose subsequent plaintexts based on information learned from previous encryptions, similarly to the Adaptive chosen ciphertext attack.Related-key attack: Like a chosen-plaintext attack, except the attacker can obtain ciphertexts encrypted under two different keys. The keys are unknown, but the relationship between them is known; for example, two keys that differ in the one bit.

Brute-force attacks

This property results in the cipher's security degrading quadratically, and needs to be taken into account when selecting a block size. There is a trade-off though as large block sizes can result in the algorithm becoming inefficient to operate. Earlier block ciphers such as the DES have typically selected a 64-bit block size, while newer designs such as the AES support block sizes of 128 bits or more, with some ciphers supporting a range of different block sizes.

Linear cryptanalysis

A linear cryptanalysis is a form of cryptanalysis based on finding affine approximations to the action of a cipher. Linear cryptanalysis is one of the two most widely used attacks on block ciphers; the other being differential cryptanalysis.
The discovery is attributed to Mitsuru Matsui, who first applied the technique to the FEAL cipher.

Integral cryptanalysis

Integral cryptanalysis is a cryptanalytic attack that is particularly applicable to block ciphers based on substitution–permutation networks. Unlike differential cryptanalysis, which uses pairs of chosen plaintexts with a fixed XOR difference, integral cryptanalysis uses sets or even multisets of chosen plaintexts of which part is held constant and another part varies through all possibilities. For example, an attack might use 256 chosen plaintexts that have all but 8 of their bits the same, but all differ in those 8 bits. Such a set necessarily has an XOR sum of 0, and the XOR sums of the corresponding sets of ciphertexts provide information about the cipher's operation. This contrast between the differences between pairs of texts and the sums of larger sets of texts inspired the name "integral cryptanalysis", borrowing the terminology of calculus.

Other techniques

In addition to linear and differential cryptanalysis, there is a growing catalog of attacks: truncated differential cryptanalysis, partial differential cryptanalysis, integral cryptanalysis, which encompasses square and integral attacks, slide attacks, boomerang attacks, the XSL attack, impossible differential cryptanalysis, and algebraic attacks. For a new block cipher design to have any credibility, it must demonstrate evidence of security against known attacks.

Provable security

When a block cipher is used in a given mode of operation, the resulting algorithm should ideally be about as secure as the block cipher itself. ECB emphatically lacks this property: regardless of how secure the underlying block cipher is, ECB mode can easily be attacked. On the other hand, CBC mode can be proven to be secure under the assumption that the underlying block cipher is likewise secure. Note, however, that making statements like this requires formal mathematical definitions for what it means for an encryption algorithm or a block cipher to "be secure". This section describes two common notions for what properties a block cipher should have. Each corresponds to a mathematical model that can be used to prove properties of higher-level algorithms, such as CBC.
This general approach to cryptography – proving higher-level algorithms are secure under explicitly stated assumptions regarding their components – is known as provable security.

Standard model

Informally, a block cipher is secure in the standard model if an attacker cannot tell the difference between the block cipher and a random permutation.
To be a bit more precise, let E be an n-bit block cipher. We imagine the following game:

The person running the game flips a coin.
* If the coin lands on heads, he chooses a random key K and defines the function f = E_K.
* If the coin lands on tails, he chooses a random permutation on the set of n-bit strings and defines the function f =.
The attacker chooses an n-bit string X, and the person running the game tells him the value of f.
Step 2 is repeated a total of q times.
The attacker guesses how the coin landed. He wins if his guess is correct.

The attacker, which we can model as an algorithm, is called an adversary. The function f is called an oracle.
Note that an adversary can trivially ensure a 50% chance of winning simply by guessing at random. Therefore, let P_E denote the probability that adversary A wins this game against E, and define the advantage of A as 2. It follows that if A guesses randomly, its advantage will be 0; on the other hand, if A always wins, then its advantage is 1. The block cipher E is a pseudo-random permutation if no adversary has an advantage significantly greater than 0, given specified restrictions on q and the adversary's running time. If in Step 2 above adversaries have the option of learning f⁻¹ instead of f then E is a strong PRP. An adversary is non-adaptive if it chooses all q values for X before the game begins.
These definitions have proven useful for analyzing various modes of operation. For example, one can define a similar game for measuring the security of a block cipher-based encryption algorithm, and then try to show that the probability of an adversary winning this new game is not much more than P_E for some A. Equivalently, if P_E is small for all relevant A, then no attacker has a significant probability of winning the new game. This formalizes the idea that the higher-level algorithm inherits the block cipher's security.

Practical evaluation

Block ciphers may be evaluated according to multiple criteria in practice. Common factors include:

Key parameters, such as its key size and block size, both of which provide an upper bound on the security of the cipher.
The estimated security level, which is based on the confidence gained in the block cipher design after it has largely withstood major efforts in cryptanalysis over time, the design's mathematical soundness, and the existence of practical or certificational attacks.
The cipher's complexity and its suitability for implementation in hardware or software. Hardware implementations may measure the complexity in terms of gate count or energy consumption, which are important parameters for resource-constrained devices.
The cipher's performance in terms of processing throughput on various platforms, including its memory requirements.
The cost of the cipher refers to licensing requirements that may apply due to intellectual property rights.
The flexibility of the cipher includes its ability to support multiple key sizes and block lengths.

Notable block ciphers

Lucifer / DES

Lucifer is generally considered to be the first civilian block cipher, developed at IBM in the 1970s based on work done by Horst Feistel. A revised version of the algorithm was adopted as a U.S. government Federal Information Processing Standard: FIPS PUB 46 Data Encryption Standard. It was chosen by the U.S. National Bureau of Standards after a public invitation for submissions and some internal changes by NBS. DES was publicly released in 1976 and has been widely used.
DES was designed to, among other things, resist a certain cryptanalytic attack known to the NSA and rediscovered by IBM, though unknown publicly until rediscovered again and published by Eli Biham and Adi Shamir in the late 1980s. The technique is called differential cryptanalysis and remains one of the few general attacks against block ciphers; linear cryptanalysis is another but may have been unknown even to the NSA, prior to its publication by Mitsuru Matsui. DES prompted a large amount of other work and publications in cryptography and cryptanalysis in the open community and it inspired many new cipher designs.
DES has a block size of 64 bits and a key size of 56 bits. 64-bit blocks became common in block cipher designs after DES. Key length depended on several factors, including government regulation. Many observers in the 1970s commented that the 56-bit key length used for DES was too short. As time went on, its inadequacy became apparent, especially after a special-purpose machine designed to break DES was demonstrated in 1998 by the Electronic Frontier Foundation. An extension to DES, Triple DES, triple-encrypts each block with either two independent keys or three independent keys. It was widely adopted as a replacement. As of 2011, the three-key version is still considered secure, though the National Institute of Standards and Technology standards no longer permit the use of the two-key version in new applications, due to its 80-bit security level.

IDEA

The International Data Encryption Algorithm is a block cipher designed by James Massey of ETH Zurich and Xuejia Lai; it was first described in 1991, as an intended replacement for DES.
IDEA operates on 64-bit blocks using a 128-bit key and consists of a series of eight identical transformations and an output transformation. The processes for encryption and decryption are similar. IDEA derives much of its security by interleaving operations from different groups – modular addition and multiplication, and bitwise exclusive or – which are algebraically "incompatible" in some sense.
The designers analysed IDEA to measure its strength against differential cryptanalysis and concluded that it is immune under certain assumptions. No successful linear or algebraic weaknesses have been reported., the best attack which applies to all keys can break a full 8.5-round IDEA using a narrow-bicliques attack about four times faster than brute force.

RC5

RC5 is a block cipher designed by Ronald Rivest in 1994 which, unlike many other ciphers, has a variable block size, key size, and a number of rounds. The original suggested choice of parameters was a block size of 64 bits, a 128-bit key, and 12 rounds.
A key feature of RC5 is the use of data-dependent rotations; one of the goals of RC5 was to prompt the study and evaluation of such operations as a cryptographic primitive. RC5 also consists of a number of modular additions and XORs. The general structure of the algorithm is a Feistel-like a network. The encryption and decryption routines can be specified in a few lines of code. The key schedule, however, is more complex, expanding the key using an essentially one-way function with the binary expansions of both e and the golden ratio as sources of "nothing up my sleeve numbers". The tantalizing simplicity of the algorithm together with the novelty of the data-dependent rotations has made RC5 an attractive object of study for cryptanalysts.
12-round RC5 is susceptible to a differential attack using 2⁴⁴ chosen plaintexts. 18–20 rounds are suggested as sufficient protection.

Rijndael / AES

The Rijndael cipher developed by Belgian cryptographers, Joan Daemen and Vincent Rijmen was one of the competing designs to replace DES. It won the 5-year public competition to become the AES.
Adopted by NIST in 2001, AES has a fixed block size of 128 bits and a key size of 128, 192, or 256 bits, whereas Rijndael can be specified with block and key sizes in any multiple of 32 bits, with a minimum of 128 bits. The block size has a maximum of 256 bits, but the key size has no theoretical maximum. AES operates on a 4×4 column-major order matrix of bytes, termed the state.

Blowfish

Blowfish is a block cipher, designed in 1993 by Bruce Schneier and included in a large number of cipher suites and encryption products. Blowfish has a 64-bit block size and a variable key length from 1 bit up to 448 bits. It is a 16-round Feistel cipher and uses large key-dependent S-boxes. Notable features of the design include the key-dependent S-boxes and a highly complex key schedule.
It was designed as a general-purpose algorithm, intended as an alternative to the aging DES and free of the problems and constraints associated with other algorithms. At the time Blowfish was released, many other designs were proprietary, encumbered by patents, or were commercial/government secrets. Schneier has stated that "Blowfish is unpatented, and will remain so in all countries. The algorithm is hereby placed in the public domain, and can be freely used by anyone." The same applies to Twofish, a successor algorithm from Schneier.

Generalizations

Tweakable block ciphers

M. Liskov, R. Rivest, and D. Wagner have described a generalized version of block ciphers called "tweakable" block ciphers. A tweakable block cipher accepts a second input called the tweak along with its usual plaintext or ciphertext input. The tweak, along with the key, selects the permutation computed by the cipher. If changing tweaks is sufficiently lightweight, then some interesting new operation modes become possible. The disk encryption theory article describes some of these modes.

Format-preserving encryption

Block ciphers traditionally work over a binary alphabet. That is, both the input and the output are binary strings, consisting of n zeroes and ones. In some situations, however, one may wish to have a block cipher that works over some other alphabet; for example, encrypting 16-digit credit card numbers in such a way that the ciphertext is also a 16-digit number might facilitate adding an encryption layer to legacy software. This is an example of format-preserving encryption. More generally, format-preserving encryption requires a keyed permutation on some finite language. This makes format-preserving encryption schemes a natural generalization of block ciphers. In contrast, traditional encryption schemes, such as CBC, are not permutations because the same plaintext can encrypt multiple different ciphertexts, even when using a fixed key.

Relation to other cryptographic primitives

Block ciphers can be used to build other cryptographic primitives, such as those below. For these other primitives to be cryptographically secure, care has to be taken to build them the right way.

Stream ciphers can be built using block ciphers. OFB mode and CTR mode are block modes that turn a block cipher into a stream cipher.
Cryptographic hash functions can be built using block ciphers. See the one-way compression function for descriptions of several such methods. The methods resemble the block cipher modes of operation usually used for encryption.
Cryptographically secure pseudorandom number generators can be built using block ciphers.
Secure pseudorandom permutations of arbitrarily sized finite sets can be constructed with block ciphers; see Format-Preserving Encryption.
A publicly known unpredictable permutation combined with key whitening is enough to construct a block cipher -- such as the single-key Even–Mansour cipher, perhaps the simplest possible provably secure block cipher.
Message authentication codes are often built from block ciphers. CBC-MAC, OMAC, and PMAC are such MACs.
Authenticated encryption is also built from block ciphers. It means to both encrypt and MAC at the same time. That is to both provide confidentiality and authentication. CCM, EAX, GCM, and OCB are such authenticated encryption modes.

Just as block ciphers can be used to build hash functions, like SHA-1 and SHA-2 are based on block ciphers which are also used independently as SHACAL, hash functions can be used to build block ciphers. Examples of such block ciphers are BEAR and LION.