Now, we’re finally reaching the point where the block-cipher stuff gets really fun: block cryptanalysis.
As I’ve explained before, the key properties of a really good
encryption system are:
- It’s easy to compute the ciphertext given the plaintext and the key;
- It’s easy to compute the plaintext given the ciphertext and the key;
- It’s hard to compute the plaintext given the ciphertext
but not the key;
- It’s hard to compute the key.
That last property is actually a bit of a weasel. There are really a wide variety of attacks that try to crack an encryption
system – meaning, basically, to discover the key. What makes that
statement of the property so weasely is that it omits the information available to the person trying to crack it. In the first three properties, I clearly stated what information you had available to produce a result. In the last, I didn’t.
There’s a reason that I weaseled that. Partly, it’s because a correct statement of it would be ridiculously long and incomprehensible; and partly because it’s often deliberately set up differently for different encryption systems. You can design systems that are extremely strong against certain attacks, but not so good against others. There’s no universally ideal encryption system: it’s always a matter of tradeoffs, where you can handle some scenarios better than others.
Today we’re going to look at one particularly fascinating attack that’s used against block ciphers. It’s called differential cryptanalysis.
So what is differential cryptanalysis? It’s a technique that looks at how making specific changes in the plaintext input to the cipher produce changes in the output from that cipher. The basic idea, in its simplest form, is that you’re able to select a collection of plaintexts, and encrypt them. Then based on the results of doing
that, you try to compute the key.
Differential cryptanalysis is a form of the basic chosen plaintext attack. What makes it a differential attack is that you use related families of plaintext. That is, you’ve got a family of texts T0, T1, T2, …, where each text Ti is equal to T0 plus some small difference. For many differential attacks, that small difference is one bit.
How can you crack a cipher this way? To start, we assume that you know what cipher we’re using. For example, you know that we’re using DES with a 56 bit key. So you take your initial plaintext,
T0, and you encrypt it, giving you a ciphertext, C0. Then you change one bit of T0, giving you T1, and you encrypt that, giving you C1.
C1 differs from C0 in some number of bits. Now, you’ve reduced the problem from “What key could produce the plaintext/ciphertext pair (T0, C0)?” to
“What set of keys could produce the variation C0 to C1 by changing one bit of T0?”. I’ve drawn an outline of this process in a diagram to the right.
This reduction can be dramatic. For example, in one of the early proposal rivals to DES, caled FEAL, it turned out that you could crack it with a set of 8 one-block (plaintext,ciphertext) pairs. The problem is reduced from a blind search of a search-space of 264 possible keys to utter triviality by a set of 8 chosen plaintexts!
The subtlety of this attack is in how you choose the
set of plaintexts to use. For a typical Feistel network block
cipher, you need to do a very delicate analysis of the S-blocks. You trace out the possible paths of each bit through the network,
and using that, you can work out a set of probable differences – that is, bit patterns that are likely to be able to produce an informative difference in the ciphertexts. These probable patterns are called differential characteristics. Once you’ve worked out a large enough set of probably characteristics, you can use them to assemble a set of informative plaintexts that should provide you with what you need to determine the key.
It can get even more subtle than that. Since you know the algorithm, you can isolate a single stage of the cipher (as illustrated to the right), and throw bit-blocks at it in differential style. From that, you can work out the set of subkeys that could produce the observed differential result for that stage. If you do that for all of the stages, you’ll get a set of keys for each stage. Figure out which subkeys keys from each stage are mutually compatible, and then combine them, and you get a very small (usually one) key that could produce valid subkeys for each stage.
You can even repeat this process on multiple levels. If you’ve got a block cipher like DES, and you know the structure, you can do the differential analysis of a single stage by decomposing it into its individual S-boxes. Analyze each S-box, and come up with a set of potential subkey-fragments, and then combine those to create a set of potential subkeys for the stage. Then combine the stage results.
Differential analysis also reveals something fascinating about
DES. Most texts attribute the invention of this attack to Shamir in
the late 1980s, when he published a set of papers describing
differential attacks. But when you analyze DES, you find that it’s
not very susceptible to differential attacks! In fact, it
takes 255 chosen plaintexts to crack 56-bit DES – so differential analysis is effectively no better than simple brute force. But if you make some very minor changes to S-boxes in the algorithm, it becomes nearly as susceptible as FEAL.
So how is DES so resistant to an attack that wasn’t discovered
until almost 15 years after DES was first published? It turns out
that the IBM team that designed DES had figured out differential
attacks before they designed DES in 1974. They knew about that attack and its properties, and specifically designed DES to not be susceptible to it. But at the time, they were working with the NSA to produce an official standard encryption system, and the NSA requested that they not tell anyone about the attack they’d discovered.
This, naturally, has driven a lot of interesting discussion. Why did the NSA want to keep that secret? They claim that they did it because revealing the differential attack and how DES protected against it would “weaken the competitive advantage that the US enjoyed over other countries in the field of cryptography”. But
a lot of people (to be honest including me) believe that the real
advantage they wanted to maintain was the ability of the US
intelligence community to crack encryption systems used by other countries.
One last comment, and I’ll wrap up this post. Things like differential cryptanalysis are one of the many reasons not to roll your own cryptographic cipher. Some of the best cryptographers in the world – people who spent all of their time working on designing secure cryptosystems – designed systems that were easy to break using this technique. If you don’t know about every technique that could be used against your system, and you haven’t considered how your system will behave under probing by those techniques, then you have close to no chance of designing something unbreakable. Over time, so many cryptosystems have failed, because someone didn’t notice one tiny little problem, or didn’t think about one obscure technique, or didn’t notice one tiny little bug in their implementation. Those things are fatal to a good cryptosystem. Many of us who are math geeks have tried designing a cipher for the heck of it; how many of us who’ve done it are able to say that we’re sure that it couldn’t be cracked in five minutes of differential analysis? I’d wager that the answer is zero.