Applied Cryptography CS 6260
Popular in Course
Popular in ComputerScienence
This 0 page Class Notes was uploaded by Alayna Veum on Monday November 2, 2015. The Class Notes belongs to CS 6260 at Georgia Institute of Technology - Main Campus taught by Alexandra Boldyreva in Fall. Since its upload, it has received 11 views. For similar materials see /class/234025/cs-6260-georgia-institute-of-technology-main-campus in ComputerScienence at Georgia Institute of Technology - Main Campus.
Reviews for Applied Cryptography
Report this Material
What is Karma?
Karma is the currency of StudySoup.
Date Created: 11/02/15
Chapter 1 INTRODUCTION Welcome This course is your invitation to the fascinating eld of modern cryptog raphy The word cryptography comes from the Latin crypt meaning secret and graphr39a meaning writing So cryptography is literally the science of secret writing the study of how to obscure what you write so as to render it selectively unintelli gible Nowadays cryptography entails much more than secret writing Although main taining privacy of communications remains one of the eld s central goals cryptog raphy has grown to encompass a host of related problems and goals Cryptography has been used almost since writing was invented For the larger part of its history cryptography remained an art a game of ad hoc designs and attacks Although the eld retains some of this avor the last twenty ve years have brought in something new The art of cryptography has now been supplemented with a legitimate science In this course we shall focus on that science which is modern cryptography Modern cryptography is a remarkable discipline It is a cornerstone of computer and communications security with end products that are imminently practical Yet its study touches on branches of mathematics that may have been considered es oteric and it brings together elds like number theory computational complexity theory and probabiltity theory Be forewarned cryptography is a slippery subject That which seems meaningful can turn out to be meaningless that which seems true can turn out to be false and that which seems impossible can turn out to be doable So have funibut retain a healthy skepticism and always watch your step 2 INTRODUCTION v S BR A Figure 11 Several cryptographic goals aim to imitate some aspect of an ideal channel connecting a sender S to a receiver R 11 Goals and settings Modern cryptography addresses a wide range of problems But the most basic problem remains the classical one of ensuring security of communication across an insecure medium Let s introduce the rst two members of our cast of characters our sender S and our receiver R The sender and receiver want to communicate with each other Sometimes people call these characters Alice A and Bob B Alice and Bob gure in many works on cryptography But we re going to want the letter A for someone else anyway THE IDEAL CHANNEL lmagine our two parties are provided with a dedicated untappable impenetrable pipe or tube into which the sender can whisper a message and the receiver will hear it Nobody else can look inside the pipe or change what s there This pipe provides the perfect medium available only to the sender and receiver as though they were alone in the world It is an ideal communication channel from the security point of view See Figure 11 Unfortunately in real life there are no ideal channels connecting the pairs of parties that might like to communicate with each otehr Usually such parties are communicating over some public network like the Internet The most basic goal of cryptography is to provide such parties with a means to imbue their communications with security properties akin to those provided by the ideal channel At this point we should introduce the third member of our cast This is our adversary denoted A An adversary models the source of all possible threats We imagine the adversary as having access to the network and wanting to compromise the security of the parties communications in some way Not all aspects of an ideal channel can be emulated lnstead cryptographers distill a few central security goals and try to achieve them The rst such goal is privacy Providing privacy means hiding the content of a transmission from the adversary The second goal is authenticity or integrity We want the receiver upon receiving a communication pertaining to be from the sender to have a way of assuring itself that it really did originate with the sender and was not sent by the Bellare and Rogaway 3 adversary or modi ed en route from the sender to the receiver PROTOCOLS In order to achieve security goals such as privacy or authenticity cryptography supplies the sender and receiver with a protocol A protocol is just a collection of programs equivalently algorithms software one for each party in volved In our case there would be some program for the sender to run and another for the receiver to run The sender s program tells her how to package or encapsu late her data for transmission The receiver s program tells him how to decapsulate the received package to recover the data together possibly with associated infor mation telling her whether or not to regard it as authentic Both programs are a function of some cryptographic keys as we discuss next TRUST MODELS It is not hard to convince yourself that in order to communicate securely there must be something that a party knows or can do that the adversary does not know or cannot do There has to be some asymmetry between the situation in which the parties nds themselves and situation in which the adversary nds itself The trust model speci es who initially has what keys There are two central trust models the symmetric or shared key trust model and the asymmetric or public key trust model We look at them and the cryptographic problems they give rise to in turn 111 The symmetric setting In practice the simplest and also most common setting is that the sender and receiver share a key that the adversary does not know This is called the symmetric setting or symmetric trust model The encapsulation and decapsulation procedures above would both depend on this same shared key The shared key is usually a uniformly distributed random string having some number of bits k Recall that a string is just a sequence of bits For language theoretic background see Figure 12 The sender and receiver must somehow use the key K to overcome the presence of the adversary One might ask how the symmetric setting is realized Meaning how do a sender and receiver initially come into possession of a key unknown to the adversary We will discuss this later The symmetric model is not concerned with how the parties got the key but with how to use it In cryptography we assume that the secret key is kept securely by the party using it If it is kept on a computer we assume that the adversary cannot penetrate these machines and recover the key Ensuring that this assumption is true is the domain of computer systems security Let us now take a closer look at some speci c problems in the symmetric setting We ll describe these problems quite informally but we ll be returning to them later in our studies when they ll get a much more thorough treatment SYMMETRIC ENCRYPTION SCHEMES A protocol used to provide privacy in the 4 INTRODUCTION We will sometimes use words from the theory of formal languages77 Here is the vocabulary you should know An alphabet is a nite nonempty set We usually use the Greek letter Z to denote an alphabet The elements in an alphabet are called characters So for example 2 0 1 2 34 5 6 7 8 9 is an alphabet having ten characters and E 0 1 is an alphabet called the binary alphabet which has two characters We ll assume the binary alphabet A string is nite sequence of characters The number of characters in a string is called its length and the length of a string X is denoted So X 1011 is a string of length four over the binary alphabet and Y cryptography is a string of length 12 over the alphabet of English letters The string of length zero is called the emptystring and is denoted E If X and Y are strings then the concatenation of X and Y denoted X Y is the characters of X followed by the characters of Y So for example 1011 0 10110 The i th character of a string X where 1 S i S le is denoted Xi so that X X1 X2 lfi S j then denotes the string Xj This is the empty string ifi j If a is a character and i 2 0 is a number then a is the string consisting of the character a repeated i times It is understood that a0 E for any character a So for example 03 000 and 17 is how you d write the number n in unary notation We can encode almost anything into a string Usually the details of how one does this are irrelevant and so we use the notation something for any xed natural way to encode something as a string For example if n is a number and X is a string then Y ltnXgt is some string which encodes n and X It is easy to go from n and X to Y n X and it is also easy to go from Y ltnXgt back to n and X A language is a set of strings all of the strings being drawn from the same alphabet 2 1f 2 is an alphabet then 2 denotes the set of all strings whose characters are drawn from E For example 01 50100011011000 Figure 12 Elementary notation from formal language theory symmetric setting is called a symmetric encryption scheme When we specify such a scheme H we must specify three algorithms so that the scheme is a triple of algorithms H IC D The encapsulation algorithm we discussed above is in this context called an encryption algorithm and is the algorithm 5 The message M that the sender Wishes to transmit is usually referrred to as a plaintext The sender encrypts the plaintext under the shared key K by applying 5 to K and M to obtain a ciphertext C The ciphertext is transmitted to the receiver The above mentioned decapsulation procedure in this context is called a decryption algorithm and is the algorithm D The receiver applies D to K and C The decryption process might be unsuccessful indicated by its returning a special symbol 1 but if successful it ought to return the message that was originally encrypted The rst algorithm in H is the key generation algorithm which speci es the manner in which the key is to be chosen In most cases this algorithm simply returns a random string of length the key length The encryption algorithm 5 may be randomized or it might keep Bellare and Rogaway 5 Figure 13 Symmetric encryption The sender and the receiver share a secret key K The adversary lacks this key The message M is the plaintext the message C is the ciphertext some state around A picture for symmetric encryption can be found in Figure 13 The encryption scheme does not tell the adversary what to do It does not say how the key once generated winds its way into the hands of the two parties And it does not say how messages are transmitted It only says how keys are generated and how the data is processed WHAT IS PRIVACY The goal of a symmetric encryption scheme is that an adversary who obtains the ciphertext be unable to learn anything about the plaintext What exactly this means however is not clear and obtaining a de nition of privacy will be an important objective in later chapters One thing encryption does not do is hide the length of a plaintext string This is usually recoverable from the length of the ciphertext string As an example of the issues involved in de ning privacy let us ask ourselves whether we could hope to say that it is impossible for the adversary to gure out M given C But this cannot be true because the adversary could just guess M by outputting a random sequence of bits As indicated above the length of the plaintext is usually computable from the length of the ciphertext She would be right with probability 2 Not bad if say 71 11 Does that make the scheme bad No But it tells us that security is a probabilistic thing The scheme is not secure or insecure there is just some probability of breaking it Another issue is a priori knowledge Before M is transmitted the adversary might know something about it For example that M is either 0 or 1 Why Because she knows Alice and Bob are talking about buying or selling a xed stock and this is just a buy or sell message Now she can always get the message right with probability 12 How is this factored in So far one might imagine that an adversary attacking the privacy of an encryp tion scheme is passive merely obtaining and examining ciphertexts In fact this might not be the case at all We will consider adversaries that are much more powerful than that 6 INTRODUCTION Algorithm K Algorithm 5KM Algorithm DKC K lt1J 01 C static ctr lt7 0 ctr C lt7 C returnK 771quot 771quot lCl ifctrmgtk ifctrmgtlKl then return error then return error foriHltomdo fori1tomdo C iltiKctriEBMi MiKctriEBCi ctr lt7 ctr m return M return ctr 7 m C Figure 14 Encryption with a onetime pad The rst algorithm generates the key K the second encrypts plaintext M and the last decrypts ciphertext C SYMMETRIC ENCRYPTION WITH A ONE TIME PAD Now let s give an example of an encryption scheme one that arises in numerous spy novels Let K K1 KM denote the shared key which is a random sequence of k bits Think of k as some big number like a million Let M M1 denote the plaintext message that the sender wants to send also divided up into bits Assume that m S k that is the key is at least as long as the plaintext What the sender does is to compute I M EB for each 2 1 m The symbol EB denotes the exclusiveor XOR operation 0 ED 0 1 G91 0 While 0 G91 1 ED 0 1 The string C is the main part of the ciphertext which the sender sends out The receiver receives C C 1 C m and can recover M Via I M EB for all 1 S 2 S n This is possible for the receiver because he too knows the key K What we have just done is specify a particular encryption scheme or protocol H 1C D Its constituent algorithms implementing the above are depicted in Figure 14 We ll be discussing the security of onetime pad encryption later In this scheme when the sender wants to encrypt another message she has to use new key bits That is she keeps track of where she is in the key Via a counter and goes on from there Key bits are never re used That s Why this is called a onetime pad each key bit is used at most once You cannot encrypt more data than you have key bits An indication of where the sender is in the key should be included in the ciphertext to allow the receiver to decrypt MESSAGE AUTHENTICITY In the message authentication problem the receiver gets some message which is claimed to have originated with a particular sender The channel on which this message ows is insecure Thus the receiver R wants to distinguish the case in which the message really did originate with the claimed sender S from the case in which the message originated with some imposter A In such a case we consider the design of an encapsulation mechanism with the property that un authentic transmissions lead to the decapsulation algorithm outputting the Bellare and Rogaway 7 accept reject Figure 15 A message authentication code The tag 0 accompanies the message M The receiver R uses it to decide if the message really did originate with the sender S with whom he shares the key K special symbol 1 The most common tool for solving the message authentication problem in the symmetric setting is a message authentication scheme also called a message auth entication code MAC Such a scheme is speci ed by a triple of algorithms H K T V When the sender wants to send a message M to the receiver she computes a tag 0 by applying T to the shared key K and the message M and then trans mits the pair M o The encapsulation procedure referred to above thus consists of taking M and returning this pair The tag is also called a MAC The compu tation of the MAC might be probabilistic or use state just as with encryption Or it may well be deterministic The receiver on receipt of M and o uses the key K to check if the tag is OK by applying the veri cation algorithm V to KM and o If this algorithms returns 1 he accepts M as authentic otherwise he regards M as a forgery An appropriate reaction might range from ignoring the bogus message to tearing down the connection to alerting a responsible party about the possible mischief See Figure 15 112 The asymmetric setting A shared key K between the sender and the receiver is not the only way to create the information asymmetry that we need between the parties and the adversary In the asymmetric setting also called the publichey setting a party possesses a pair of keysia public key pk and an associated secret key sk A party s public key is made publicly known and bound to its identity For example a party s public key might be published in a phone book The problems that arise are the same as before but the difference in the setting leads to the development of different kinds of tools ASYMMETRIC ENCRYPTION The sender is assumed to be able to obtain an authentic 8 INTRODUCTION coins A Public Secret Figure 16 Asymmetric encryption The receiver R has a public key ka which the sender knows belongs to R The receiver also has a corresponding secret key SkR copy ka of the receiver s public key The adversary is assumed to know pk R too To send a secret message M to the receiver the sender computes a ciphertext C e EPkR and sends C to the receiver When the receiver receives a ciphertext C he computes M e DSkRC The asymmetric encryption scheme H IC D is speci ed by the algorithms for key generation encryption and decryption For a picture of encryption in the public key setting see Figure 16 The idea of public key cryptography and the fact that we can actually realize this goal is remarkable You ve never met the receiver before But you can send him a secret message by looking up some information in a phone book and then using this information to help you garble up the message you want to send The intended receiver will be able to understand the content of your message but nobody else will The idea of public key cryptography is due to Whit eld Di ie and Martin Hellman and was published in 1976 DIGITAL SIGNATURES The tool for solving the messageauthentication problem in the asymmetric setting is a digital signature Here the sender has a public key pk S and a corresponding secret key sks The receiver is assumed to know the key pkg and that it belongs to party S The adversary is assumed to know pkg too When the sender wants to send a message M she attaches to it some extra bits 7 which is called a signature for the message and is computed as a function of M and sks by applying to them a signing algorithm 5 The receiver on receipt of M and a checks if it is OK using the public key of the sender pkg by applying a veri cation algorithm V If this algorithm accepts the receiver regards M as authentic otherwise he regards M as an attempted forgery The digital signature scheme H 1C 5 V is speci ed by the algorithms for key generation signing and verifying A picture is given in Figure 17 One difference between a MAC and a digital signature concerns what is called Bellare and Rogaway 9 t accep reject Public Secret Figure 17 A digital signature scheme The signature 7 accompanies the message M The receiver R uses it to decide if the message really did originate with the sender S with has public key pk S nonrepudiation With a MAC anyone who can verify a tagged message can also produce one and so a tagged message would seem to be of little use in proving authenticity in a court of law But with a digitally signed message the only party who should be able to produce a message that veri es under public key pk S is the party S herself Thus if the signature scheme is good party S cannot just maintain that the receiver or the one presenting the evidence concocted it If signature 7 authenticates M with respect to public key pk S then it is only S that should have been able to devise a The sender cannot refute that Probably the sender S can claim that the key sks was stolen from her Perhaps this if true might still be construed the sender s fault 113 Summary To summarize there are two common aims concerned with mimicking an ideal channel achieving message privacy and achieving message authenticity There are two main trust models in which we are interested in achieving these goals the symmetric trust model and the asymmetric trust model The tools used to achieve these four goals are named as shown in Figure 18 10 INTRODUCTION symmetric trust model asymmetric trust model message symmetric aka private asymmetric aka public privacy key encryption key encryption message message authentication digital signature scheme authenticity code MAC Figure 18 Summary of main goals and trust models 12 Other goals Cryptography has numerous other goals some related to the ones above some not Let us discuss a few of them 121 Pseudorandom Number Generation Lots of applications require random numbers or bits These applications involve simulation ef cient algorithms and cryptography itself In particular randomness is essential to key generation and additionally many cryptographic algorithms such as encryption algorithms are randomized A pseudorandom number generator is a deterministic algorithm that takes as input a short random string called a seed and stretches it to output a longer sequence of bits that is pseudorandom In some applications people use Linear Congruential Generators LCGs for pseudorandom number generation But LCGs do not have good properties with regard to the quality of pseudorandomness of the bits output With the ideas and techniques of modern cryptography one can do much better We will say what it means for a pseudorandom number generator to be good and then how to design one that is good in this sense Our notion of good is such that our generators provably suf ce for typical applications It should be clari ed that pseudorandom generators do not generate pseudo random bits from scratch They need as input a random seed and their job is to stretch this Thus they reduce the task of random number generation to the task of generating a short random seed As to how to do the latter we must step outside the domain of cryptography We might wire to our computer a Geiger counter that generates a random bit every second and run the computer for say 200 seconds to get a 200 bit random seed which we can then stretch via the pseudorandom num ber generator Sometimes more ad hoc methods are used a computer might obtain a random seed by computing some function of various variable system parameters such as the time and system load We won t worry about the philosophical question as to whether the bits that form the seed are random in any real sense We ll simply assume that these bits are completely unpredictable to anything beyond the computer which has gath Bellare and Rogaway ll ered this dataimathematically we ll treat these bits as random We will then study pseudorandom number generation under the assumption that a random seed is available 122 Authenticated key exchange It is common for a pair of communicating parties to wish to establish a secure session This is a communication session in which they exchange information with the conviction that each is indeed speaking to the other and the content of the information remains hidden to any third party One example is a login session in which Alice wishes to remotely logon to her computer Another example is a web browsing session in which a client wants to communicate securely with a server for some period Parties who already either share a secret key or are in possession of authentic copies of each other s public keys could use these keys directly to provide privacy and integrity of communicated data via symmetric or asymmetric cryptography However this is not what is commonly done Rather the parties will use their existing keys icalled longlived keys in this contexti to derive a session key This is done via an authenticated key exchange protocol This is a message exchange whose goal is to provide the parties a fresh77 and authentic shared key that will then be used to encrypt and authenticate tra ic in the session using symmetric cryptography Once the session is over the session key is discarded Authenticated key exchange is one of the more subtle goals in cryptography and will spend some time later applying the paradigms of modern cryptography to see how to de ne this goal and provide high assurance solutions 123 Coin Flipping Alice and Bob are getting divorced and want to decide who gets to keep the car Alice calls Bob on the telephone and offers a simple solution Bob77 she says I ve got a penny in my pocket I m going to toss it in the air right now You call heads or tails If you get it right you get the car If you get it wrong I get the car77 Bob is not as bright as Alice but something troubles him about this arrangement The telephonecoin ip problem is to come up with a protocol so that to the maximal extent possible neither Alice nor Bob can cheat the other and at the same time each of them learn the outcome of a fair coin toss Here is a solutionisort of Alice puts a random bit 04 inside an envelope and sends it to Bob Bob announces a random bit 8 Now Alice opens the envelope for Bob to see The shared bit is de ned as 04 ED 8 See Figure 19 To do this over the telephone we need some sort of electronic envelope77 in cryptography this called a commitment scheme Alice can put a value in the envelope and Bob can t see what the envelope contains Later Alice can open the envelope so that Bob can see what the envelope contains Alice can t change her mind about an envelope s contentsiit can only be opened up in one way 12 INTRODUCTION Choose bit at at random Put x in an envelope amp send it A Choose bit B 393 at random and I send it The shared bit is x xor 3 0L Open up the gt envelope for so B can likewise compute it compute the Shared bit at xor 3 Figure 19 Envelope solution to the telephonecoin ipping 5problem Here is a simple technique to implement an electronic envelope To put a 0 inside an envelope Alice chooses two random 500 bit primes p and 1 subject to the constraints that p lt q and p E 1 mod 4 and q E 3 mod 4 The product of p and 1 say N pq is the commitment to zero that is What Alice would send to commit to 0 To put a 1 inside an envelope Alice chooses too random 500 bit primes p and 1 subject to the constraints that p lt q and p E 3 mod 4 and q E 1 mod 4 The product of these N pq is the commitment to 1 Poor Bob seeing N would like to gure out if the smaller of its two prime factors is congruent to l or to 3 modulo 4 We have no idea how to make that determination short of factoring Niand we don t know how to factor 1000 digit numbers which are the product of random 500 digit primes Our best algorithms would take way too long to run When Alice wants to decommit open the envelope N she announces p and 1 Bob veri es that they are prime this is easy to do and multiply to N and then he looks to see if the smaller factor is congruent to l or to 3 modulo 4 13 What cryptography is about Let us now move away from the particular examples we have given and ask What in general is cryptography about 131 Protocols parties and adversaries Brie y cryptography is about constructing and analyzing protocols which overcome the in uence of adversaries In the last sections we gave examples of several different protocol problems and a couple of different protocols Suppose that you are trying to solve some cryptographic problem The problem will usually involve some number of parties Us cryptographers often like to anthro pomorphize our parties giving them names like Alice and Bob and referring to them as though they are actual people We do this because it s convenient and fun But you shouldn t think that it means that the parties are really human beings Bellare and Rogaway 13 They might beibut they could be lots of other things too Like a cell phone a computer a processes running on a computer an institution or maybe a little gadget sitting on the top of your television set We usually think of the parties as the good guys and we want to help them accomplish their goal We do this by making a protocol for the parties to use A protocol tells each party how to behave A protocol is essentially a program but it s a distributed program Here are some features of protocols for you to understand A protocol instructs the parties what to do It doesn t tell the adversary what to do That is up to her A protocol can be probabilistic This means that it can make random choices To formalize this we usually assume that the model of computation that allows a party to specify a number n 2 2 and then obtain a random value 2 lt101 n 7 1 This notation means that 2 is a random value from the indicated set all values being equally likely A protocol can be statefal This means that when a party nishes what he is doing he can retain some information for the next time that he is active When that party runs again he will remember the state that he was last in So for example you could have a party that knows this is the rst time I ve been run this is the second time I ve been run and so on When we formalize protocols they are usually tuples of algorithms But the actual formalization will vary from problem to problem For example a protocol for symmetric encryption isn t the same type of thing as a protocol for a telephone coin ip Another word for a protocol is a scheme We ll use the two words inter changeably So an encryption scheme is a protocol for encryption and a message authentication scheme is a protocol for message authentication For us a function computed by a deterministic sequential algorithm is also a protocol It s a partic ularly simple kind of protocol How can we devise and analyze protocols The rst step is to try to understand the threats and the goals for our particular problem Once we have a good idea about these we can try to nd a protocol solution The adversary is the agent that embodies the source of the threat Adversaries aim to defeat our protocol s goals Protocols in turn are designed to to surmount the behavior of adversaries It is a gameia question of who is more clever protocol designer or adversary The adversary is usually what we focus on In rigorous formalizations of cryp tographic problems the parties may actually vanish being absorbed into the formalization But the adversary will never vanish She will be at center stage Cryptography is largely about thinking about the adversary What can she do and what can t she do What is she trying to accomplish We have to answer these questions before we can get very far Just as we warned that one shouldn t literally regard our parties as people so 14 INTRODUCTION Specify a RAM model Should we use xed width registers or arbitrary precision Figure 110 The RAM model with an oracle An adversaries is a program written in this model of computation Details of the model are not important but one has to x some model of computation too for the adversary The adversary might represent an actual person but it might just as well be an automated attack program a competitor s company a criminal organization a government institution one or more of the protocol s legitimate parties a group of friendly hackers or merely some unlucky circumstances conspiring together not controlled by any intelligence at all By imagining a powerful adversary we take a pessimistic view about what might go wrong We aim to succeed even if someone is out to get us Maybe nobody is out to get us In that case we should at least be achieving high reliability After all if a powerful adversary can t succeed in disrupting our endeavors then neither will noisy lines transmission errors due to software bugs unlucky message delivery times careless programmers sending improperly formatted messages and so forth When we formalize adversaries they will be random access machines RAMs with access to an oracle77 See Figure 110 for a for a description of this model of computation 132 Cryptography and computer security Good protocols are an essential tool for making secure computing systems Badly designed protocols are easily exploited to break into computer systems to eavesdrop on phone calls to steal services and so forth Good protocol design is also hard It is easy to under estimate the task and quickly come up with ad hoc protocols that later turn out to be wrong In industry the necessary time and expertise for proper protocol design is typically under estimated often at future cost It takes knowledge effort and ingenuity to do the job right Security has many facets For a system to be secure many factors must combine Bellare and Rogaway 15 For example it should not be possible for hackers to exploit bugs break into your system and use your account They shouldn t be able to buy off your system administrator They shouldn t be able to steal your back up tapes These things lie in the realm of system security The cryptographic protocol is just one piece of the puzzle lfit is poorly designed the attacker will exploit that For example suppose the protocol transmits your password in the clear that is in a way that anyone watching can understand what it is That s a protocol problem not a system problem And it will certainly be exploited The security of the system is only as strong as its weakest link This is a big part of the dif culty of building a secure system To get security we need to address all the problems how do we secure our machines against intruders how do we administer machines to maintain security how do we design good protocols and so on All of these problems are important but we will not address all of these problems here This course is about the design of secure protocols We usually have to assume that the rest of the system is competent at doing its job We make this assumption because it provides a natural abstraction boundary in dealing with the enormous task of providing security Computer system security is a domain of a different nature requiring different tools and expertise Security can be best addressed by splitting it into more manageable components 133 The rules of the game Cryptography has rules The rst rule is that we may only try to overcome the adversary by means of protocols We aren t allowed to overcome the adversary by intimidating her arresting her or putting poison in her coffee These methods might be effective but they are not cryptography Another rule that most cryptographers insist on is to make the protocols public That which must be secret should be embodied in keys Keys are data not algo rithms Why do we insist that our protocols be public There are several reasons A resourceful adversary will likely nd out what the protocol is anyway since it usu ally has to be embodied in many programs or machines trying to hide the protocol description is likely to be costly or infeasible More than that the attempt to hide the protocol makes one wonder if you ve achieved security or just obfuscation Peer review and academic work cannot progress in the absence of known mechanisms so keeping cryptographic methods secret is often seen as anti intellectual and a sign that ones work will not hold up to serious scrutiny Government organizations which deal in cryptography often do not make their mechanisms public For them learning the cryptographic mechanism is one more hoop that that the adversary must jump through Why give anything away Some organizations may have other reasons for not wanting mechanisms to be public like a fear of disseminating cryptographic know how or a fear that the organization s abilities or inabilities will become better known 16 INTRODUCTION 14 Approaches to the study of cryptography 141 Phases in cryptography s development The history of cryptography can roughly be divided into three stages In the rst early stage algorithms had to be implementable with paper and ink Julius Caesar used cryptograms His and other early schemes often took the form of substitution ciphers If A AB Z is the alphabet Caesar of course used the Roman onel the simplest substitution cipher is simply a permutation f A a A associat ing with each plaintext letter w its ciphertext letter f Permutation means it is one to one and onto that is bijective The mapping f is known to receiver and sender but at least a priori not to an adversary To send a message M View it as a sequence of letters M M1 The sender computes for 2 1 m and transmits C C1 The receiver knowing f also knows f l and can decode The adversary not knowing the association f but seeing only C may be ba led at rst But once enough words have been transmitted the code is soon broken because we can make guesses based on repetitions of letters and knowledge of frequencies of letters in words in the English language The system can be strengthened in various ways but none too effective The second age of cryptography was that of cryptographic engines This is associated to the period of the World War II and the most famous crypto engine was the German Enigma machine How its codes were broken is a fascinating story The last stage is modern cryptography lts central feature is the reliance on mathematics and electronic computers Mathematical tools are used to design pro tocols and computers are used implement them It is during this most recent stage that cryptography becomes much more a science We can characterize much of the work that has been going on in cryptography in a couple of different dimensions The rst distinction is between cryptanalysisdrwen design and proofdriven design The second distinction is between information theoretic cryptography and complexitytheoretic cryptography We would like to take up these two dimensions 142 Cryptanalysisdriven design Traditionally cryptographic mechanisms have been designed by focusing on concrete attacks and how to defeat them The approach has worked something like this 1 A cryptographic goal is recognized 2 A solution is offered One searches for an attack on the proposed solution When one is found if it is deemed damaging or indicative of a potential weak ness you go back to Step 2 and try to come up with a better solution The process then continues The third step is called cryptanalysis In the classical approach to design crypt analysis was an essential component of constructing any new design AAA in 0 Bellare and Rogaway 17 Problem Proposed Solution Bug Revised Solution Implement Bug Figure 111 The classical cryptography approach Sometimes one nds protocol problems in the form of subtle mathematical rela tionships which allow one to subvert the protocol s aims Sometimes instead one jumps out of the system showing that some essential cryptographic issue was overlooked in the design application or implementation of the cryptography Some people like to reserve the word cryptography to refer to the making of cryptographic mechanisms cryptanalysis to refer to the attacking of cryptographic mechanisms and cryptology to refer to union Under this usage we ve been saying cryptography in many contexts where cryptology would be more accurate Most cryptographers don t observe this distinction between the words cryptography and cryptology so neither will we There are some di iculties with the approach of cryptanalysis drive design The obvious problem is that one never knows if things are right nor when one is nished The process should iterate until one feels con dent that the solution is adequate But one has to accept that design errors might come to light at any time If one is making a commercial product one must eventually say that enough is enough ship the product and hope for the best With luck no damaging attacks will subsequently emerge But sometimes they do and when this happens the company that owns the product may nd it di icult or impossible to e fectively x the elded solution They might try to keep secret that there is a good attack but it is not easy to keep secret such a thing See Figure 111 Doing cryptanalysis well takes great cleverness and it is not clear that insightful cryptanalysis is a skill that can be e fectively taught Sure one can study the most famous attacksibut will they really allow you to produce a new equally insightful 18 INTRODUCTION one Great cleverness and great mathematical prowess seem to be the requisite skills not any speci c piece of knowledge Maybe you have heard of Don Copper smith or Adi Shamir These are two of the masters of this eld Sadly it is hard to base a science on an area where signi cant assurance is engendered by knowing that Don thought seriously about the mechanism for some time and couldn t nd an attack We need to pursue things di ferently 143 Shannon security for symmetric encryption The systematic approach to cryptography where proofs and de nitions play a visible role begins in the work of Claude Shannon Shannon was not only the father of information theory but he might also be said to be the father of the modern era of cryptography Let s return to the problem of symmetric encryption and our particular protocol for doing this which was to use a one time pad Security we have said means defeating an adversary so we have to specify what is it the adversary wants to do As we have mentioned before we need some formal way of saying what it means for the scheme to be secure We present the idea of Shannon Let M 0 1 a 0 1 be a probability distribution on the set of 71 bit messages That is assume Alice chooses M with probability This distribution is known to everyone including the adversary Thus before C is transmitted all the adversary knows is that any particular message M has probability MM of being transmitted We want to capture the constraint that the adversary s information about the message does not increase after seeing the ciphertext We have xed some encryption scheme IC D in mind For any string C let PMCM denote the a posteriori probability of M given ciphertext C namely PMCM PrMessage was M Ciphertext was C Here the probability is over the choice of key K and the choice of M from M Note it is a conditional probability namely the probability that M was the message given that a particular ciphertext C has been seen De nition 11 Encryption scheme IC D is Shannon secure if for every distri bution M it is the case that for every ciphertext C which occurs with nonzero probability and message M we have PMCM The way to interpret it is that after having seen C let the adversary take her best guess as to what M was The probability that she is right is not more than the probability that she would have been right had the sender simply chosen a message transmitted nothing at all and asked the adversary to guess this message As long as you don t end up with more information about the message after seeing C than you had before then the encryption is secure It turns out that the one time pad encryption has the above property We prove this in Chapter 7 Bellare and Rogaway 19 Shannon security however has important limitations Recall that the key in the onetimepad scheme had to be at least as long as the number of bits we want to encrypt It turns out that this is necessary to achieve Shannon security That is if an encryption scheme is to meet De nition 11 the number of key bits must be at least the total number of plaintext bits we re going to encrypt This fact has some fundamental implications If we want to do practical cryp tography we must be able to use a single short key to encrypt lots of bits This means that we will not be able to achieve Shannon security We must seek a different paradigm and a different notion of security 144 Computationalcomplexity theory Modern cryptography introduces a new dimension the amount of computing power available to an adversary It seeks to have security as long as adversaries don t have too much77 computing time Schemes are breakable in principle but not in practice Attacks are infeasible not impossible This is a radical shift from many points of view It takes cryptography from the realm of information theory into the realm of computer science and complexity theory in particular since that is where we study how hard problems are to solve as a function of the computational resources invested And it changes what we can ef ciently achieve We will want to be making statements like this Assuming the adversary uses no more than t computing cycles her prob ability of breaking the scheme is at most t2200 Notice again the statement is probabilistic Almost all of our statements will be Notice another important thing Nobody said anything about how the adversary operates What algorithm or technique does she use We do not know anything about that The statement holds nonetheless So it is a very strong statement It should be clear that in practice a statement like the one above would be good enough As the adversary works harder her chance of breaking the scheme increases and if the adversary had 2200 computing cycles at her disposal we d have no security left at all But nobody has that much computing power Now we must ask ourselves how we can hope to get protocols with such proper ties The legitimate parties must be able to ef ciently execute the protocol instruc tions their effort should be reasonable But somehow the task for the adversary must be harder 145 Atomic primitives We want to make a distinction between the protocols that that we use and those that we are designing At the lowest level are what we call atomic primitives Higher level protocols are built on top of these 20 INTRODUCTION Atomic Primitives 1 What s the distinction Perhaps the easiest way to think of it is that the proto cols we build address a cryptographic problem of interest They say how to encrypt how to authenticate how to distribute a key We build our protocols out of atomic primitives Atomic primitives are protocols in their own right but they are simpler protocols Atomic primitives have some sort of hardness or security properties but by themselves they don t solve any problem of interest They must be properly used to achieve some useful end In the early days nobody bothered to make such a distinction between protocols and the primitives that used them And if you think of the one time pad encryption method there is really just one object the protocol itself Atomic primitives are drawn from two sources engineered constructs and math ematical problems In the rst class fall standard block ciphers such as the well known DES algorithm In the second class falls the RSA function We ll be looking at both types of primitives later The computational nature of modern cryptography means that one must nd and base cryptography on computationally hard problems Suitable ones are not so commonplace Perhaps the rst thought one might have for a source of com putationally hard problems is NP complete problems lndeed early cryptosystems tried to use these particularly the Knapsack problem However these efforts have mostly failed One reason is that NP complete problems although apparently hard to solve in the worstcase may be easy on the average An example of a more suitable primitive is a oneway function This is a function f D a R mapping some domain D to some range R with two properties 1 f is easy to compute there is an e icient algorithm that given an E D outputs 3 f w E R 2 f is hard to invert an adversary I given a random 3 E R has a hard time guring out a point x such that fw y as long as her computing time is restricted The above is not a formal de nition The latter which we will see later will talk about probabilities The input at will be chosen at random and we will then talk of the probability an adversary can invert the function at y x as a function of the time for which she is allowed to compute Can we nd objects with this strange asymmetry It is sometimes said that oneway functions are obvious from real life it is easier to break a glass than to put it together again But we want concrete mathematical functions that we can implement in systems One source of examples is number theory and this illustrates the important interplay between number theory and cryptography A lot of cryptography has Bellare and Rogaway 21 been done using number theory And there is a very simple oneway function based on number theoryisomething you already know quite well Multiplication The function f takes as input two numbers a and b and multiplies them together to get N ab There is no known algorithm that given a random N ab always and quickly recovers a pair of numbers not 1 and N of course that are factors of N This backwards direction is the factoring problem and it has remained unsolved for hundreds of years Here is another example Let p be a prime The set Z 1 p 7 1 turns out to be a group under multiplication modulo 1 We x an element 9 E Z which generates the group that is 909192 gp z is all of Z and consider the function f 0 p7 2 a Z de ned by fc gm mod p This is called discrete exponentiation and its inverse is called discrete logarithm loggy is the value cc such that y gm It turns out there is no known fast algorithm that computes discrete logarithms either This means that for large enough 13 say 1000 bits the task is infeasible given current computing power even in thousands of years So this is another oneway function It should be emphasized though that these functions have not been proven to be hard functions to invert Like P versus NP whether or not there is a good one way function out there is an open question We have some candidate examples and we work with them Thus cryptography is build on assumptions If the assumptions are wrong a lot of protocols might fail In the meantime we live with them 146 The provablesecurity approach While there are several different ways in which proofs can be effective tools in cryptography we will generally follow the proof using tradition which has come to be known as provable security Provable security emerged in 1982 with the work of Sha Goldwasser and Silvio Micali At that time Goldwasser and Micali were graduate students at UC Berkeley They and their advisor Manuel Blum wanted to put public key encryption on a scienti cally rm basis And they did that effectively creating a new viewpoint on what cryptography is really about We have explained above that we like to start from atomic primitives and trans form them into protocols Now good atomic primitives are rare as are the people who are good at making and attacking them Certainly an important effort in cryptography is to design new atomic primitives and to analyze the old ones This however is not the part of cryptography that this course will focus on One reason is that the weak link in real world cryptography seems to be between atomic primi tives and protocols It is in this transformation that the bulk of security aws arise And there is a science that can do something about it namely provable security We will view a cryptographer as an engine for turning atomic primitives into protocols That is we focus on protocol design under the assumption that good atomic primitives exist Some examples of the kinds of questions we are interested in are these What is the best way to encrypt a large text le using DES assuming DES is secure What is the best way to design a signature scheme using multiplication 22 INTRODUCTION assuming that multiplication is oneway How secure are known methods for these tasks What do such questions even mean and can we nd a good framework in which to ask and answer them A poorly designed protocol can be insecure even though the underlying atomic primitive is good The fault is not of the underlying atomic primitive but that primitive was somehow misused lndeed lots of protocols have been broken yet the good atomic primitives like DES and multiplication and RSA have never been convincingly broken We would like to build on the strength of such primitives in such a way that protocols can inherit this strength not lose it The provable security paradigm lets us do that The provable security paradigm is as follows Take some goal like achieving pri vacy via symmetric encryption The rst step is to make a formal adversarial model and de ne what it means for an encryption scheme to be secure The de nition explains exactly whenion which runsithe adversary is successful With a de nition in hand a particular protocol based on some particular atomic primitive can be put forward It is then analyzed from the point of view of meeting the de nition The plan is now show security via a reduction A reduction shows that the only way to defeat the protocol is to break the underlying atomic primitive Thus we will also need a formal de nition of what the atomic primitive is supposed to do A reduction is a proof that if the atomic primitive does the job it is supposed to do then the protocol we have made does the job that it is supposed to do Believing this it is no longer necessary to directly cryptanalyze the protocol if you were to nd a weakness in it you would have unearthed one in the underlying atomic primitive So if one is going to do cryptanalysis one might as well focus on the atomic primitive And if we believe the latter is secure then we know without further cryptanalysis of the protocol that the protocol is secure too A picture for the provablesecurity paradigm might look like Figure 112 In order to do a reduction one must have a formal notion of what is meant by the security of the underlying atomic primitive what attacks exactly does it withstand For example we might assume that RSA is a one way function Here is another way of looking at what reductions do When I give you a reduction from the onewayness of RSA to the security of my protocol 1 am giving you a transformation with the following property Suppose you claim to be able to break my protocol P Let A be the adversary that you have that does this My transformation takes A and turns it into another adversary A that breaks RSA Conclusion as long as we believe you can t break RSA there could be no such adversary A In other words my protocol is secure Those familiar with the theory of NP completeness will recognize that the basic idea of reductions is the same When we provide a reduction from SAT to some computational problem E we are saying our E is hard unless SAT is easy when we provide a reduction from RSA to our protocol H we are saying that H is secure unless RSA is easy to invert The analogy is further spelled out in Figure 113 for Bellare and Rogaway Problem De nition Protocol Reduction Implement DONE Figure 112 The provablesecurity paradigm We think that computational problem E can t be solved in polynomial time We think that cryptographic protocol H can t be effectively attacked We believe this because if E could be solved in polynomial time then so could SAT say We believe this because if H could be effec tively attacked then so could RSA say To show this we reduce SAT to E we show that if somebody could solve E in polynomial time then they could solve SAT in polynomial time too To show this we reduce RSA to H we show that if somebody could break H by effective means then they could break RSA by effective means too Figure 113 The analogy between reductionist cryptography and NP Completeness the bene t of those of you familiar with the notion of NP Completeness Experience has taught us that the particulars of reductions in cryptography are a little harder to comprehend than they were in elementary complexity theory Part of the dif culty lies in the fact that every problem domain will have it s own unique notion of What is an effective attack77 It s rather like having a different version of the notion of NP Completeness as you move from one problem to another We will also be concerned with the quality of reductions One could have concerned oneself with this in complexity theory but it s not usually done For doing practical work in cryptography however paying attention to the quality of reductions is important Given these dif culties we will proceed rather slowly through the ideas Don t worry you will get it even if you never heard of NP Completeness The concept of using reductions in cryptography is a beautiful and powerful idea Some of us by now are so used to it that we can forget how innovative it was And for those not used to it it can be hard to understand or perhaps believe at rst hearingiperhaps because it delivers so much Protocols designed this way truly have superior security guarantees 24 INTRODUCTION In some ways the term provable security is misleading As the above indicates what is probably the central step is providing a model and de nition which does not involve proving anything And then one does not prove a scheme secure one provides a reduction of the security of the scheme to the security of some underlying atomic primitive For that reason we sometimes use the term reductionist security instead of provable security to refer to this genre of work 147 Theory for practice As you have by now inferred this course emphasizes general principles not speci c systems We will not be talking about the latest holes in sendmail or Netscape how to con gure PCP or the latest attack against the ISO 9796 signature standard This kind of stuff is interesting and useful but it is also pretty transitory Our focus is to understand the fundamentals so that we know how to deal with new problems as they arise We want to make this clear because cryptography and security are now quite hyped topic There are many buzzwords oating around Maybe someone will ask you if having taken a course you know one of them and you will not have heard of it Don t be alarmed Often these buzzwords don t mean much This is a theory course Make no mistake about that Not in the sense that we don t care about practice but in the sense that we approach practice by trying to understand the fundamentals and how to apply them Thus the main goal is to understand the theory of protocol design and how to apply it We rmly believe it is via an understanding of the theory that good design comes If you know the theory you can apply it anywhere if you only know the latest technology your knowledge will soon by obsolete We will see how the theory and the practice can contribute to each other re ning our understanding of both In assignments you will be asked to prove theorems There may be a bit of math ematics for you to pick up But more than that there is mathematical thinking Don t be alarmed if what you nd in these pages contradicts conventional wis dom Conventional wisdom is often wrong And often the standard texts give an impression that the eld is the domain of experts where to know whether something works or not you must consult an expert or the recent papers to see if an attack has appeared The difference in our approach is that you will be given reasoning tools and you can then think for yourself Cryptography is fun Devising de nitions designing protocols and proving them correct is a highly creative endeavor We hope you come to enjoy thinking about this stuff and that you come to appreciate the elegance in this domain 15 What background do I need Now that you have had some introduction to the material and themes of the class you need to decide whether you should take it Here are some things to consider in Bellare and Rogaway 25 making this decision A student taking this course is expected to be comfortable with the following kinds of things which are covered in various other courses The rst is probability theory Probability is everywhere in cryptography You should be comfortable with ideas like sample spaces events experiments conditional probability random variables and their expectations We won t use anything deep from probability theory but we will draw heavily on the language and basic concepts of this eld You should know about alphabets strings and formal languages in the style of an undergraduate course in the theory of computation You should know about algorithms and how to measure their complexity In par ticular you should have taken and understood at least an undergraduate algorithms class Most of all you should have general mathematical maturity meaning especially you need to be able to understand what is and what is not a proper de nition 16 Historical notes 17 Problems Problem 11 Suppose that you want to encrypt a single message M 6 012 using a random shared key K 6 012 Suppose you do this by representing K and M using two bits 00 01 or 10 and then XORing the two representations Does this seem like a good protocol to you Explain I Problem 12 Suppose that you want to encrypt a single message M 6 012 using a random shared key K 6 012 Explain a good way to do this I Problem 13 Besides the symmetric and the asymmetric trust models think of a couple more ways to create asymmetry between the receiver and the adversary Show how you would encrypt a bit in your model I Problem 14 In the telephone coin ipping protocol what should happen if Alice refuses to send her second message Is this potentially damaging I Problem 15 Argue that what we have said about keeping the algorithm public but the key secret is fundamentally meaningless I Problem 16 A limitation on zedtime faircoin ippiny TMs Consider the model of computation in which we augment a Turing machine so that it can obtain the output of a random coin ip by going into a distinguished state 62 the next state will be QH with probability 12 and the next state will be QT with probability 12 Show that in this model of computation there is no constant time algorithm to perfectly deal out ve cards to each of two players 26 INTRODUCTION A deck of cards consists of 52 cards and a perfect deal means that all hands should be equally likely Saying that the algorithm is constant time means that there is some number T such that the algorithm is guaranteed to stop within T steps I Problem 17 Symmetric encryption with a deck of cards Alice shuffles a deck of cards and deals it all out to herself and Bob each of them gets half of the 52 cards Alice now wishes to send a secret message M to Bob by saying something aloud Eavesdropper Eve is listening in she hears everything Alice says but Eve can t see the cards Part A Suppose Alice s message M is a string of 48 bits Describe how Alice can communicate M to Bob in such a way that Eve will have no information about what is M Part B Now suppose Alice s message M is 49 bits Prove that there exists no protocol which allows Alice to communicate M to Bob in such a way that Eve will have no information about M What does it mean that Eve learns nothing about M 7 Here what we mean is that for all strings C the probability that Alice says C is independent of M for all messages M0M1 we have that Pr Alice says C M M0 Pr Alice says C M M1 The probability is over the the random shuffle of the cards I Problem 18 Composition of EPT Algorithms John designs an EPT expected polynomial time algorithm to solve some computational problem Hibut he as sumes that he has in hand a black box ie a unit time subroutine which solves some other computational problem l l Ted soon discovers an EPT algorithm to solve l l True or false putting these two pieces together John and Ted now have an EPT algorithm for H Give a proof or counterexample When we speak of the worst case running time of machine M we are looking at the function Tn which gives for each n the maximal time which M might spend on an input of size n Tn maxz z nstepsMw When we speak of the expected running time of M we are instead looking at the function Tn which gives for each n the maximal value among inputs of length n of the expected value of the running time of M on this inputithat is Tn maxz z nEStepsMw where the expectation is over the random choices made by M I