Where encryption starts getting really interesting, in my opinion, is
block ciphers. Block ciphers are a general category of ciphers that
are sort of a combination of substitution and transposition ciphers, and
sort of something entirely different. They’re really fascinating
things, but they’re pretty complicated.
The basic core of block ciphers is encryption of blocks. A block is
a fixed-length series of bits. The basic cipher is a pair of functions (E,E-1), where E (the encryption function) takes a block B and a key K, and generates a new block B’=E(K,B), which is the encrypted form of the block; and E-1 (the decryption function) takes a key and an encrypted block, and returns the original plaintext block: B=E-1(K,B’).
The ideal block cipher will have the properties that:
- For most blocks B and keys K, E(K, B) is not very similar to B.
- For most blocks B and C which differ by one bit, E(K,B) and E(K,C)
will differ by much more than one bit.
- Given E and K, for any block B, it’s easy to compute E(K,B)
- Given E and an encrypted text B, it’s hard to compute either
the encryption key, or the plaintext block.
What happens to the block? It depends a lot on the particular block cipher. But in
general, it’s treated as a set of sub-blocks, which are, in turn, treated as a sort of
large alphabet. The block is encrypted using what amounts to some substitutions, and some
transpositions, performed in sequence. In many block ciphers, there’s a fairly simple
basic mechanism for substitution and transposition, which is repeated multiple times to
produce a final result. The specific set of substitutions and transpositions is determined
both by the specifics of the block cipher being used, and by the encryption key
used for the message. So the way that the key gets incorporated in the encryption of a
single block in a block cipher is by selecting the particular sequence of substitutions
and transpositions on the elements of the block.
Of course, there’s a huge problem with the description so far, even given how
incomplete it is. Not all messages that we want to encrypt are the same size, but we’ve
only described the basic idea of encrypting a single block! Filling in that gap makes it
clear why this kind of crypto is called a block cipher. Take a message of arbitrary size,
S. To encrypt it with a block cipher with block size B, you break the message into
⌈S/B⌉ blocks, padding out the last block in some way to make it full. Then you
encrypt the blocks individually one at a time, and then…
You were expecting me to say something like “output them”, right? Sorry. Not that
simple. There’s another little trick for block ciphers. You don’t really just
do the blocks in sequence, and then combine them. You usually add something: some kind of feedback between the blocks, some kind of transposition to re-order them, etc. Without that, you can get really poor results. Wikipedia has a fantastic example of what happens if you naively just encrypt each block, without any feedback or transposition. The
first figure to the right is a bitmap image of Tux the Linux penguin without any encryption. The second image is the Tux bitmap, encrypted with a block cipher without
any feedback or transposition.
So, given a message that consists of multiple blocks, you don’t want to just naively
encode them and combine them into a message. In fact, there’s no one answer to “How do you
generate the encrypted blocks and put them together to produce the ciphertext?” The answer
depends on something called the ciphers mode of operation.
I hate the term “Mode of operation”. It’s one of those terms that’s so mundane that it
doesn’t sound like it describes something important or technical. It sounds like
IBM-manager-speak for trying to pad out a description, to make it sound deep. You know,
like when a manager says “form-factor” instead of “size”. But it really is something very
important. There are two key functions performed by the mode of operation (hereafter
“MoO”): the first, I’ve already mentioned: re-assemble the separately encrypted blocks to
produce a ciphertext message. The second is to provide an extra layer of security. Used
naively, two identical blocks of cleartext will produce two identical blocks of
ciphertext. That’s bad: it gives someone trying to crack the message a major clue about
the structure of the message. The mode of operation can define some kind of additional
state mechanism which will alter the ciphertext, so that even identical blocks of
plaintext will generate different blocks of ciphertext.
The naive method of just encrypting each block separately and concatenating them is a
valid mode of operation, called the electronic codebook (ECB) MoO – it’s just a lousy one
that no one in their right mind would use, because it provides such pathetic security.
As I get to real examples of block ciphers, I’ll describe more modes of operation, but I’ll use one here: the cipher-block chaining MoO. In cipher-block chaining (CBC) mode,
you maintain a state vector the size of a block. You initialize that with something
computed from the encryption key. Then, for each block, you first do a bitwise exclusive-or of the plaintext block with the state vector. Then you encrypt the modified plaintext, and set the state vector to the value of the encrypted block. So
in CBC mode, the blocks are output in sequence, but the ciphertext of each
block depends on the encryption key, the cleartext, and the ciphertext of the previous
The selection of a mode of operation is critical for ensuring that the encryption
system works appropriately for your application. Selecting an appropriate MoO involves some fairly hairy tradeoffs between (among others):
- Ease of cracking;
- Ease of detecting errors
- Ease of detecting tampering
- Ease of detecting improper message duplication
In particular, most of the basic MoOs can provide either message
integrity (meaning that you can be confident that you got the complete message without
errors or tampering), or message confidentiality (meaning that you can be confident
that no one other that the intended recipient received the message). Basic MoOs
can’t provide both.
The first major standardized block cipher was called DES, for “data encryption standard”. It was a variation on one of the first published block ciphers, an IBM system called lucifer. In the next post, I’ll describe the DEA – the data encryption algorithm of the DES. It’s a pretty typical example of a block cipher. It’s fairly complicated, and so it deserves a post of its own to describe its
basic mechanism, and its modes of operation.