{"id":730,"date":"2009-01-08T21:52:29","date_gmt":"2009-01-08T21:52:29","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/2009\/01\/08\/cryptographic-padding-in-rsa\/"},"modified":"2009-01-08T21:52:29","modified_gmt":"2009-01-08T21:52:29","slug":"cryptographic-padding-in-rsa","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2009\/01\/08\/cryptographic-padding-in-rsa\/","title":{"rendered":"Cryptographic Padding in RSA"},"content":{"rendered":"<p> Ok, away from politics, and back to the good stuff. When I left off talking about<br \/>\nencryption, we were <a href=\"http:\/\/scienceblogs.com\/goodmath\/2008\/12\/public_key_cryptography_using.php\">looking at<br \/>\nRSA<\/a>, as an example of an asymmetric encryption system.<\/p>\n<p> Since it&#8217;s been a while, I&#8217;ll start off with a quick review of RSA.<\/p>\n<p> Asymmetric encryption systems like RSA are based on computations that are easy to perform if you know the keys, and incredibly hard to perform if you don&#8217;t. In the specific case<br \/>\nof RSA, everything is based on a pair of very large prime numbers. If you know those two<br \/>\nprimes, and you know the public key, it&#8217;s really easy to compute the private key. But<br \/>\nif you <em>don&#8217;t<\/em> know the two prime numbers, then even given the public key,<br \/>\nit&#8217;s incredibly difficult to compute the private key.<\/p>\n<p> To be a bit more specific, in RSA, you get a pair of large prime numbers, P and Q. You<br \/>\ncompute from them a <em>totient<\/em> of their product, which is the number<br \/>\nN=(P-1)&times;(Q-1). Then you arbitrarily pick a public exponent, E, which is<br \/>\nsmaller than N, and which is prime relative to N.  You can then compute<br \/>\nthe private key exponent, D. If you know what P and Q are, it&#8217;s pretty easy to compute<br \/>\nD.<\/p>\n<p> Once you&#8217;ve got D and E, your public key is the pair is (N,E), and the private key is the pair is (N,D).<\/p>\n<p> If you&#8217;ve got a plaintext message M, then encrypting it with the public key<br \/>\nis done by computing M<sup>E<\/sup> mod N. If you&#8217;ve got a ciphertext C encrypted<br \/>\nwith the public key, then decrypting it is done by computing C<sup>D<\/sup> mod N.<\/p>\n<p> <em>Encrypting<\/em> a plaintext with the public key is <em>exactly<\/em> the same process<br \/>\nas <em>decrypting<\/em> a ciphertext produced with the private key. And vice versa.<\/p>\n<p><!--more--><\/p>\n<p> RSA is amazingly elegant. Every time I look at it for any reason, I&#8217;m struck<br \/>\nby its beauty and it&#8217;s simplicity. It&#8217;s really an amazing example of how<br \/>\nsomething seemingly abstract like number theory can produce practical, down-to-earth,<br \/>\nand useful.<\/p>\n<p> But RSA isn&#8217;t perfect. In fact, it&#8217;s got a some rather serious problems that<br \/>\nare a direct result of its mathematical structure. I&#8217;m just going to mention<br \/>\none, but there are several of them.<\/p>\n<p> Suppose you&#8217;ve got your public\/private keypair. You want to encrypt two messages, I<br \/>\nand J with your private key. Let&#8217;s call the function for encrypting a message<br \/>\nwith the private key E &#8211; so E(M) = M<sup>D<\/sup> mod N. Then<br \/>\nC<sub>I<\/sub> = E(I), and C<sub>J<\/sub>=E(J). Good so far.<\/p>\n<p> The problem is that given those two messages &#8211; for which an attacker has access to both<br \/>\nthe plaintext and the ciphertext &#8211; it happens that there&#8217;s a vulnerability. Because of the<br \/>\nway that encryption works, E(I&times;J) = E(I) &times; E(J)!<\/p>\n<p> This opens you up to something called <em>chosen ciphertext attacks<\/em>. A chosen<br \/>\nciphertext attack is one where you attack an crytographic system by running chosen<br \/>\nciphertexts through the decryption system to see if you can compute the encryption key.<br \/>\nThere&#8217;s an effective attack against the naive use of RSA based on a selected ciphertext<br \/>\nmethod. <\/p>\n<p> In addition to this, there are a few other interesting attacks that I won&#8217;t describe in<br \/>\ndetail. They all rely on various problems of the numerical structure of RSA encryption. In<br \/>\norder to defeat those attacks, RSA is typically used with something called a <em>padding<br \/>\nscheme<\/em>. The idea of the padding scheme is twofold. First, it increases the size of the<br \/>\nmessage &#8211; which guarantees that the encrypted message is large enough that it will not be<br \/>\neasy to use for an attack. Second, it intersperses pseudo-random information in a way that<br \/>\nmeans that a given plaintext message could be encrypted to a wide range of different<br \/>\nciphertexts, depending on the choices made during padding.<\/p>\n<p> A common scheme for padding is OAEP &#8211; Optimal Asymmetric Encryption Padding. OAEP<br \/>\nis a basic Feistel network system, like the ones I described when I was talking<br \/>\nabout DES. OAEP ends up doing a mixture of permuting the plaintext, and<br \/>\nadding pseudo-random noise to it. It&#8217;s a reversible transformation,<br \/>\nand the receiver of the encrypted message (and thus any attacker) knows how to<br \/>\ndo the reversal of the padding given the decrypted plaintext.  But the process<br \/>\nof permutation and random injection performed by the padding has the effect<br \/>\nof breaking the properties of RSA that make it easy to attack the system.<\/p>\n<p> For example, suppose that the padding function is called P. E(P(I))&times;E(P(J)) is<br \/>\n<em>not<\/em> equal to E(P(I&times;J)). Similarly, the message-size-related problems<br \/>\nwith RSA are not usable on messages whose size is increased past the vulnerability threshold<br \/>\nusing OAEP padding.<\/p>\n<p> So what does the padding look like? Let&#8217;s start with the pieces.<\/p>\n<ol>\n<li> You have a number, N, which is the length of the cryptographic modulus.<\/li>\n<li> You have two integers, k<sub>0<\/sub> and k<sub>1<\/sub>, which are parameters<br \/>\nof the protocol in use.<\/li>\n<li> You have the original plaintext message, M. By definition, the length of M is<br \/>\n(n &#8211; k<sub>0<\/sub> &#8211; k<sub>1<\/sub>). (The protocol is responsible for breaking<br \/>\nup large messages into sub-messages via an appropriate mode of operation.)<\/li>\n<li> You have a 0-padded version of M, M&#8217;, which is M followed by k<sub>1<\/sub><br \/>\nzeros &#8211; so M&#8217; has length N-k<sub>0<\/sub>.<\/li>\n<li> You have a random string of bits, r, which has length k<sub>0<\/sub>.<\/li>\n<li> You have a cryptographic hash function G, which expands a string of<br \/>\nk<sub>0<\/sub> bits to a string of N-k<sub>0<\/sub> bits.<\/li>\n<li> You have another cryptographic hash function H, which contracts<br \/>\na string of n-k<sub>0<\/sub> bits to a srting of k<sub>0<\/sub> bits.<\/li>\n<\/ol>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" alt=\"oaep.png\" src=\"https:\/\/i0.wp.com\/scientopia.org\/img-archive\/goodmath\/img_360.png?resize=243%2C219\" width=\"243\" height=\"219\" class=\"inset right\" \/><\/p>\n<p> The OAEP padding algorithm is illustrated off to the side. The way that<br \/>\nit works is:  you compute G(r), giving you a string of N-k<sub>0<\/sub> bits.<br \/>\nYou XOR G(r) with M&#8217;, producing a string of N-k<sub>0<\/sub> bits, which we&#8217;ll<br \/>\ncall X. Then compute H(X), and XOR that with r, producing a result that we&#8217;ll call<br \/>\nY. The end result of the padding is the concatenation of X and Y.<\/p>\n<p> The way to understand this is that you&#8217;ve got some random bits in R, which you&#8217;re shuffling up (using G and H) and mixing with the bits of the original message. The resulting message is longer, and has a random element mixed into it, which defeats the numerical properties that make some attacks against RSA work. It&#8217;s easy to compute<br \/>\nX and Y given M and r. And given X and Y, if you know G and H it&#8217;s easy to compute<br \/>\nM and r. But given E(X concat Y), as an attacker, you&#8217;re screwed. You can<br \/>\nstill decript the message, and get X and Y, decode them, and get the plaintext. But<br \/>\nthe process of doing the padding obscures the numerical properties of RSA so that<br \/>\neven knowing the padding function, your attacks won&#8217;t work.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Ok, away from politics, and back to the good stuff. When I left off talking about encryption, we were looking at RSA, as an example of an asymmetric encryption system. Since it&#8217;s been a while, I&#8217;ll start off with a quick review of RSA. Asymmetric encryption systems like RSA are based on computations that are [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[84],"tags":[],"class_list":["post-730","post","type-post","status-publish","format-standard","hentry","category-encryption"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-bM","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/730","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=730"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/730\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=730"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=730"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=730"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}