Selected Topics CIS 400
These 18 pages of class notes were uploaded by Sam Rau on Wednesday, October 21, 2015. The notes belong to CIS 400 at Syracuse University, taught by Staff in the Fall. Since upload, they have received 25 views. For similar materials see /class/225601/cis-400-syracuse-university in Computer & Information Science at Syracuse University.
INFORMATION THEORY
CIS 400/628, Spring 2005: Introduction to Cryptography
This is based on Chapter 15 of Trappe and Washington.

FINITE PROBABILITY SPACES & RANDOM VARIABLES

DEFINITION
> A probability distribution p = (p1, ..., pn) is a sequence of real numbers such that
  o 0 <= pi <= 1 for each i, and
  o p1 + ... + pn = 1.
> A probability space (X, p) is a set X = {x1, ..., xn} together with a probability distribution (p1, ..., pn); pi is the probability of xi, written pX(xi) = pi. So pX : X -> [0, 1].
> An event E in a probability space (X, p) is a subset of X, and pX(E) := Σ_{xi ∈ E} pi.
Examples: coin flips, cards.

PROBABILITY REVIEW, CONTINUED

DEFINITION Suppose S : X -> Y and (X, p) is a probability space. Then S and p induce a distribution on Y:
    pY(y) = pX({ x ∈ X : S(x) = y }).

EXAMPLE Roll two dice and take their sum.
> X = {1..6} × {1..6}
> Y = {2, ..., 12}
> S(a, b) = a + b
    pY(2) = 1/36          pY(12) = 1/36
    pY(3) = 2/36 = 1/18   pY(11) = 2/36
    pY(4) = 3/36 = 1/12   pY(10) = 3/36
    pY(5) = 4/36 = 1/9    pY(9)  = 4/36
    pY(6) = 5/36          pY(8)  = 5/36
    pY(7) = 6/36 = 1/6

PROBABILITY REVIEW, CONTINUED

DEFINITION Suppose (X, pX) is a probability space and S : X -> Y. S is called a Y-valued random variable on X, and for y ∈ Y,
    pS(y) := pX({ x ∈ X : S(x) = y }) = Prob[S = y].

DEFINITION Suppose (X, pX) is a probability space and S : X -> Y and T : X -> Z are random variables. Then
    pS,T(y, z) := pX({ x ∈ X : S(x) = y & T(x) = z }) = Prob[S = y, T = z].

EXAMPLE X = {1, ..., 6}, S : X -> {0, 1}, T : X -> {0, 1}, where
    S(x) = 1 if x is even, 0 if x is odd;
    T(x) = 1 if x < 3, 0 if x >= 3.

STILL MORE PROBABILITY

DEFINITION S : X -> Y and T : X -> Z are independent iff for all y ∈ Y and z ∈ Z,
    Prob[S = y, T = z] = Prob[S = y] · Prob[T = z].

EXAMPLE
> S : {1..6} -> {0, 1}, S(x) = 1 <=> x is even
> T : {1..6} -> {0, 1}, T(x) = 1 <=> x < 3
> U : {1..6} -> {0, 1}, U(x) = 1 <=> x is prime
S and T are independent; S and U are not independent.

DEFINITION Suppose S : X -> Y, T : X -> Z, and Prob[T = z] > 0. Then the conditional probability of y given z is
    Prob[S = y | T = z] := Prob[S = y, T = z] / Prob[T = z].
Sometimes Prob[S = y | T = z] is written p(y | z).

BAYES'S THEOREM

Note: If S and T are independent, then Prob[S = y | T = z] = Prob[S = y].

Bayes's Theorem: If Prob[S = y] > 0 and Prob[T = z] > 0, then
    Prob[S = y | T = z] = Prob[S = y] · Prob[T = z | S = y] / Prob[T = z].
(proof on board)
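As a quick numerical check of these definitions, here is a small Python sketch (not part of the original notes) of induced distributions and the independence test, using the dice examples above:

```python
from fractions import Fraction
from itertools import product

# Probability space for two fair dice: each of the 36 outcomes has mass 1/36.
X = {(a, b): Fraction(1, 36) for a, b in product(range(1, 7), repeat=2)}

def induced(p_X, S):
    """Distribution p_Y(y) = p_X({x : S(x) = y}) induced by a map S."""
    p_Y = {}
    for x, p in p_X.items():
        y = S(x)
        p_Y[y] = p_Y.get(y, 0) + p
    return p_Y

p_sum = induced(X, lambda ab: ab[0] + ab[1])
assert p_sum[7] == Fraction(1, 6) and p_sum[2] == Fraction(1, 36)

# Independence check on a single fair die, as in the S, T, U example.
D = {x: Fraction(1, 6) for x in range(1, 7)}
S = lambda x: x % 2 == 0      # even?
T = lambda x: x < 3
U = lambda x: x in (2, 3, 5)  # prime?

def independent(p_X, S, T):
    """Check Prob[S = y, T = z] = Prob[S = y] * Prob[T = z] for all y, z."""
    pS, pT = induced(p_X, S), induced(p_X, T)
    pST = induced(p_X, lambda x: (S(x), T(x)))
    return all(pST.get((y, z), 0) == pS[y] * pT[z]
               for y in pS for z in pT)

assert independent(D, S, T)       # parity is independent of "x < 3"
assert not independent(D, S, U)   # but not of primality
```

Using `Fraction` instead of floats keeps every probability exact, so independence can be tested with exact equality rather than a floating-point tolerance.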
ENTROPY

Entropy Axioms
1. H : probability distributions -> R.
2. H is continuous on the probability distributions.
3. H(Un) <= H(Un+1), where Uk = the uniform distribution (1/k, ..., 1/k).
4. Suppose 0 < q < 1. Then
       H(p1, ..., q·pj, (1 − q)·pj, ..., pn) = H(p1, ..., pj, ..., pn) + pj · H(q, 1 − q).

Intuitively, H(X) = the amount of disorder/uncertainty in X = the average number of bits needed to describe outcomes.

Example
1. Roll a fair 6-sided die and observe the result: 2, 4, 6, or 1, 3, 5 (six equally likely outcomes): H(1/6, ..., 1/6).
2. Roll a fair 6-sided die and observe only whether the result lands in {2, 4, 6} or in {1, 3, 5}: H(1/2, 1/2).
What does such an H look like?

MORE ON ENTROPY

Convention: 0 · log2 0 = 0.

SHANNON'S THEOREM Suppose H satisfies axioms 1-4. Then there is a constant λ > 0 such that
    H(p1, ..., pn) = −λ · Σ_{i=1}^{n} pi · log2 pi.

DEFINITION Suppose S : X -> Y is a random variable. The entropy of S, written H(S), is
    H(S) = − Σ_{y ∈ Y} Prob[S = y] · log2 Prob[S = y].

Alternative definition: H(S) = the expected value of −log2 Prob[S = y] over y ∈ Y.

EXAMPLE APPLICATIONS

EXAMPLE A fair coin. X = {heads, tails}, p(heads) = p(tails) = 1/2.
    H(X) = −(1/2 · log2 1/2 + 1/2 · log2 1/2) = 1.
It takes 1 bit to describe the outcome.

EXAMPLE An unfair coin. Suppose 0 < p < 1, Prob(heads) = p, Prob(tails) = 1 − p.
    H(unfair coin toss) = −(p · log2 p + (1 − p) · log2(1 − p)).

EXAMPLE A fair n-sided die.
    H(a roll) = −(1/n · log2 1/n + ... + 1/n · log2 1/n) = log2 n.

EXAMPLE Flipping two fair coins. Heads = no points, tails = 1 point; two flips, sum the points. Outcomes 0, 1, 2 with probabilities 1/4, 1/2, 1/4.
    H(two coin flips) = −(1/4 · log2 1/4 + 1/2 · log2 1/2 + 1/4 · log2 1/4) = 3/2
    = the average number of yes/no questions needed to tell the result
      (ask "Is there exactly one head?"; if not, ask "Are there two heads?").

JOINT AND CONDITIONAL ENTROPY

Suppose S : X -> Y, T : X -> Z, and U : X -> Y × Z, where U(x) = (S(x), T(x)).
    H(S, T) := − Σ_{y ∈ Y} Σ_{z ∈ Z} pS,T(y, z) · log2 pS,T(y, z).
This is just the entropy of U.

We define the conditional entropy of T given S by
    H(T | S) := Σ_y pS(y) · H(T | S = y)
              = − Σ_y pS(y) · Σ_z pT(z | y) · log2 pT(z | y)
              = − Σ_y Σ_z pS,T(y, z) · log2 pT(z | y)    (since pS,T(y, z) = pT(z | y) · pS(y))
              = the uncertainty of T given S.

JOINT AND CONDITIONAL ENTROPY, CONTINUED

CHAIN RULE THEOREM H(X, Y) = H(X) + H(Y | X).
The uncertainty of (X, Y) = the uncertainty of X + the uncertainty of Y given that X happened.

THEOREM
a. H(X) <= log2 |X|, with equality iff all elements of X are equally likely. (You are most uncertain under uniform distributions.)
b. H(X, Y) <= H(X) + H(Y). (The info in (X, Y) is at most the info of X + the info of Y.)
c. H(Y | X) <= H(Y). (Knowing X cannot make you less certain about Y; equality only if X, Y are independent.)

Proof of (c): By the
Chain Rule, H(X) + H(Y | X) = H(X, Y). By (b), H(X, Y) <= H(X) + H(Y). So H(X) + H(Y | X) <= H(X) + H(Y), and hence H(Y | X) <= H(Y).

HUFFMAN CODES, CONTINUED

Constructing a Huffman coding:
> Start with a table of letters with their probabilities.
> Form them into one-item trees.
> Loop (greedy; picture on board):
  o If there is just one tree left, quit.
  o Otherwise, pick the two trees with the lowest probabilities and form them into a new tree whose probability is the sum of the two.

THEOREM Let L be the average number of bits per output for the Huffman encoding for random variable X. Then
    H(X) <= L <= H(X) + 1.

Example: For probabilities 0.5, 0.3, 0.1, 0.1,
    H(X) = −(0.5 · log2 0.5 + 0.3 · log2 0.3 + 0.1 · log2 0.1 + 0.1 · log2 0.1) ≈ 1.685.

PERFECT SECRECY

GOAL Use information theory to explain how one-time pads provide perfect secrecy.
> P = plaintexts, each with a certain probability
> C = ciphertexts (with induced probabilities)
> K = keys (assumed independent of the choice of plaintext)

EXAMPLE P = {a, b, c}; Prob(a) = 0.5, Prob(b) = 0.3, Prob(c) = 0.2.
K = {k1, k2}; Prob(k1) = Prob(k2) = 1/2. C = {U, V, W}.

    eK(x):   a  b  c
    k1:      U  V  W
    k2:      U  W  V

Induced probabilities: Prob(U) = 0.5, Prob(V) = 0.25, Prob(W) = 0.25.
What can Eve learn of the plaintext if she intercepts a ciphertext?

PERFECT SECRECY, CONTINUED

DEFINITION A cryptosystem has perfect secrecy iff H(P | C) = H(P).

THEOREM The one-time pad has perfect secrecy.

Proof. Setup:
> z = the size of the alphabet (e.g., z = 2, 26, 256, etc.)
> P = strings of length L (z^L many)
> K = vectors (s1, ..., sL) of shifts; each key is equally likely: pK(k) = z^(−L)
> C = P; for c ∈ C,
    pC(c) = Σ { Prob[P = x] · Prob[K = k] : x ∈ P, k ∈ K, ek(x) = c }.
Since P and K are independent, Prob[P = x, K = k] = Prob[P = x] · Prob[K = k].

PROOF, CONTINUED

    pC(c) = Σ { Prob[P = x] · Prob[K = k] : x ∈ P, k ∈ K, ek(x) = c }
          = z^(−L) · Σ { Prob[P = x] : x ∈ P, k ∈ K, ek(x) = c }.

Observation: Given p and c, there is only one k such that ek(p) = c. So
    Σ { Prob[P = x] : x ∈ P, k ∈ K, ek(x) = c } = 1.
Therefore pC(c) = z^(−L), and
    H(K) = H(C) = log2(z^L).

    H(P, K, C) = H(P, K) = H(P) + H(K)    (P and K independent; C is determined by P and K)
    H(P, K, C) = H(P, C)                  (K is determined by P and C)
    H(P | C) = H(P, C) − H(C) = H(P) + H(K) − H(C) = H(P).
QED

For RSA, H(P | C) = 0. Why?

THE ENTROPY OF ENGLISH, I

Question: In a typical English-language text, how much information is there per letter?

> For a random text over a-z under the uniform distribution:
    H(T) = log2 26 ≈ 4.70, or H(T) = log2 27 ≈ 4.75 when you include spaces.
> If we use the standard frequency tables for letters (i.e., a: 0.082, b: 0.015, ...), then
    H(T) = −(0.082 · log2 0.082 + 0.015 · log2 0.015 + ...) ≈ 4.18.
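The entropy values quoted in these notes are easy to recompute; a short Python sketch (not from the notes):

```python
import math

def H(probs):
    """Shannon entropy -Σ p·log2(p), with the convention 0·log2(0) = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Uniform distribution over the 26 letters: H = log2(26) ≈ 4.70 bits/letter.
print(round(H([1 / 26] * 26), 2))          # 4.7

# Two fair coin flips scored 0/1 and summed: outcomes 0, 1, 2 w.p. 1/4, 1/2, 1/4.
print(H([0.25, 0.5, 0.25]))                # 1.5

# The four-letter source from the Huffman example: ≈ 1.685 bits.
print(round(H([0.5, 0.3, 0.1, 0.1]), 3))   # 1.685
```

The `if p > 0` guard implements the 0·log2 0 = 0 convention, since `math.log2(0)` would otherwise raise an error.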
But there is a lot more structure to English (or any natural language) than letter frequencies.

THE ENTROPY OF ENGLISH, II

Using digrams: H(T) ≈ 3.56. Using trigrams: H(T) ≈ 3.3.

Let L^n = the n-gram combinations. Then
    H(English) = lim_{n -> ∞} H(L^n) / n
               ≈ the average amount of information per letter in a long text
               = the average amount of uncertainty in guessing the next letter in a long text.
How can we compute this thing?

THE ENTROPY OF ENGLISH, III

How to compute H(English) = lim_{n -> ∞} H(L^n) / n?

Shannon's idea:
> First, suppose you had an optimal next-letter guesser.
  o Given a prefix, it ranks the letters from 1 to 26 as being most likely to be next.
        i t i s s u n n y t o d a y
        2 1 1 1 4 3 2 1 4 1 1 1 1 1
  o Run a text through it and record what guess number each letter corresponds to.
  o From the predictor and the sequence 2 1 1 1 4 3 2 1 4 1 1 1 1 1, we can recover the text.
> Use a native English speaker as the next-letter predictor and gather statistics (assume determinism).
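The key point, that a deterministic predictor makes the guess-number sequence an invertible encoding of the text, can be sketched in a few lines of Python. The fixed frequency ranking below is an illustrative stand-in for the human guesser, not something from the notes; any deterministic ranker works identically.

```python
# Hypothetical fixed ranking of the 26 letters, most to least frequent.
FIXED_RANKING = "etaoinsrhldcumfpgwybvkxjqz"

def rank(prefix: str) -> str:
    """Rank the 26 letters by likelihood of coming next after `prefix`."""
    return FIXED_RANKING  # a real predictor would condition on the prefix

def to_guesses(text: str) -> list[int]:
    """Record, for each letter, which guess (1 = top guess) the predictor needs."""
    return [rank(text[:i]).index(ch) + 1 for i, ch in enumerate(text)]

def from_guesses(guesses: list[int]) -> str:
    """Invert the encoding: replay the predictor and take its g-th guess each step."""
    text = ""
    for g in guesses:
        text += rank(text)[g - 1]
    return text

# The guess sequence plus the (deterministic) predictor recover the text exactly.
msg = "itissunnytoday"
assert from_guesses(to_guesses(msg)) == msg
```

Shannon's estimate of H(English) then comes from the statistics of the recorded guess numbers: the closer they concentrate on 1, the less uncertainty there is per letter.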