# Note 14 for CMPSCI 585 at UMass

This 14-page set of class notes was uploaded by an elite notetaker on Friday, February 6, 2015. The notes belong to a course at the University of Massachusetts taught in the Fall. Since the upload, they have received 15 views.

Date Created: 02/06/15

### Probabilistic Context-Free Grammars

Lecture 14, Introduction to Natural Language Processing, CMPSCI 585, Fall 2004. Andrew McCallum, including slides from Jason Eisner.

### Ambiguity in parsing

- "Time flies like an arrow."
- "Fruit flies like a banana."
- "I saw the man with the telescope."

How do we solve this combinatorial explosion of ambiguity?

1. First try parsing without any weird rules, throwing them in only if needed.
2. Better: every rule has a weight. A tree's weight is the total weight of all its rules. Pick the overall lightest parse of the sentence.
3. Can we pick the weights automatically? We'll get to this later.

### CYK parser

- Input: a string of words and a grammar in Chomsky Normal Form (CNF).
- Output: yes/no (does the grammar generate the string?).
- Data structure: an n × n table, rows labeled 0 to n−1, columns 1 to n; cell [i, j] lists the constituents spanning positions i to j.
- Initialization: for each i from 1 to n, add to cell [i−1, i] all nonterminals that could produce the word between positions i−1 and i.

The running example is "time flies like an arrow", parsed with this weighted grammar:

| Rule | Weight |
|------|--------|
| S → NP VP | 1 |
| S → Vst NP | 6 |
| S → S PP | 2 |
| VP → V NP | 1 |
| VP → VP PP | 2 |
| NP → Det N | 1 |
| NP → NP PP | 2 |
| NP → NP NP | 3 |
| PP → P NP | 0 |

and lexical entries: time NP 3, Vst 3; flies NP 4, VP 4; like P 2, V 5; an Det 1; arrow N 8.

The main loop fills the table one span width at a time:

- For width from 2 to n:
  - For start from 0 to n − width:
    - Define end = start + width.
    - For mid from start + 1 to end − 1:
      - For every constituent in [start, mid] and every constituent in [mid, end]:
        - For all ways of combining them (if any), add the resulting constituent to [start, end].

[A sequence of chart-filling slides follows, showing the table for "time flies like an arrow" being filled cell by cell: cell [0, 2] gets NP 10, S 8, S 13; cell [3, 5] gets NP 10; cell [2, 5] gets PP 12, VP 16; cell [1, 5] gets NP 18, VP 18, S 21; and the top cell [0, 5] gets NP 24, S 22, S 27. Following backpointers from the best S recovers the parse.]

### Which entries do we need?

An entry like S 27 in a cell is not worth keeping when S 22 is also there, since it just breeds worse options: any larger constituent built from the heavier entry would itself be heavier.
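The CYK loop above can be sketched in code. This is a minimal sketch, not the lecture's implementation: the function name and the dictionary-based grammar layout are my own choices, and the chart keeps only the lightest (best-in-class) weight per nonterminal per cell, matching the pruning discussed above. The grammar and lexicon are the slide's weighted example.

```python
from collections import defaultdict

def cyk_min_weight(words, lexicon, rules):
    """Weighted CYK recognizer. chart[(i, j)] maps nonterminal -> lightest
    total weight of any parse spanning positions i..j (0-based fenceposts).
    `lexicon` maps word -> [(nonterminal, weight)];
    `rules` maps (Y, Z) -> [(X, weight)] for binary rules X -> Y Z."""
    n = len(words)
    chart = defaultdict(dict)
    # Initialize width-1 cells from the lexicon.
    for i, w in enumerate(words):
        for nt, wt in lexicon.get(w, []):
            cell = chart[(i, i + 1)]
            cell[nt] = min(cell.get(nt, float("inf")), wt)
    # Fill wider cells, trying every split point (mid).
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for y, wy in chart[(start, mid)].items():
                    for z, wz in chart[(mid, end)].items():
                        for x, wx in rules.get((y, z), []):
                            total = wx + wy + wz
                            cell = chart[(start, end)]
                            if total < cell.get(x, float("inf")):
                                cell[x] = total  # keep only best-in-class
    return chart

lexicon = {"time": [("NP", 3), ("Vst", 3)],
           "flies": [("NP", 4), ("VP", 4)],
           "like": [("P", 2), ("V", 5)],
           "an": [("Det", 1)],
           "arrow": [("N", 8)]}
rules = {("NP", "VP"): [("S", 1)], ("Vst", "NP"): [("S", 6)],
         ("S", "PP"): [("S", 2)], ("V", "NP"): [("VP", 1)],
         ("VP", "PP"): [("VP", 2)], ("Det", "N"): [("NP", 1)],
         ("NP", "PP"): [("NP", 2)], ("NP", "NP"): [("NP", 3)],
         ("P", "NP"): [("PP", 0)]}
chart = cyk_min_weight("time flies like an arrow".split(), lexicon, rules)
```

Run on the example sentence, the top cell comes out as NP 24, S 22, agreeing with the slides' final chart.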
Keep only the best-in-class entry for each nonterminal in each cell, along with backpointers so you can recover the parse; everything else is inferior stock.

### Probabilistic trees

Instead of the lightest-weight tree, take the highest-probability tree. Given any tree, your assignment's generator would have some probability of producing it, just as we used n-grams to choose among strings. What is the probability of a tree, such as the parse of "time flies like an arrow" in which "time" is the subject NP and "like an arrow" is a PP modifying the VP "flies"? You rolled a lot of independent dice: one per rule.

### Chain rule: one word at a time

p(time flies like an arrow) = p(time) × p(flies | time) × p(like | time flies) × p(an | time flies like) × p(arrow | time flies like an)

### Chain rule plus backoff: a trigram model

p(time flies like an arrow) ≈ p(time) × p(flies | time) × p(like | time flies) × p(an | flies like) × p(arrow | like an)

### Chain rule, written differently

p(time flies like an arrow) = p(time) × p(time flies | time) × p(time flies like | time flies) × p(time flies like an | time flies like) × p(time flies like an arrow | time flies like an)

Proof: p(x y | x) = p(x | x) × p(y | x) = 1 × p(y | x).

### Chain rule plus backoff, written differently

p(time flies like an arrow) ≈ p(time) × p(time flies | time) × p(time flies like | time flies) × p(flies like an | flies like) × p(like an arrow | like an)

with the same proof.
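The trigram decomposition above can be sketched as a few lines of code. This is a toy sketch: `trigram_logprob` and the uniform conditional model are my own illustrative names and assumptions, not anything from the lecture.

```python
import math

def trigram_logprob(words, cond_logp):
    """Chain rule with trigram backoff: each word is conditioned on at most
    the two previous words. `cond_logp(w, u, v)` is an assumed function
    returning log p(w | u, v); <s> pads the left context."""
    total = 0.0
    padded = ["<s>", "<s>"] + list(words)
    for i in range(2, len(padded)):
        total += cond_logp(padded[i], padded[i - 2], padded[i - 1])
    return total

# Toy uniform model over a 5-word vocabulary: every conditional is 1/5,
# so a 5-word sentence gets probability (1/5)**5.
vocab = 5
logp = trigram_logprob("time flies like an arrow".split(),
                       lambda w, u, v: math.log(1.0 / vocab))
```

Any real model would replace the uniform lambda with smoothed counts, but the chain-rule structure is the same.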
### Chain rule: one node at a time

The same trick applies to trees: generate the tree one node at a time, conditioning each expansion on the nonterminal being expanded. [A sequence of slides grows the tree for "time flies like an arrow" step by step, with backoff at each node.] In simplified notation, for the VP-attachment parse:

p(tree) = p(S → NP VP | S) × p(NP → time | NP) × p(VP → VP PP | VP) × p(VP → flies | VP) × p(PP → P NP | PP) × p(P → like | P) × p(NP → Det N | NP) × p(Det → an | Det) × p(N → arrow | N)

### We already have a CKY algorithm for weights

Just let w(X → Y Z) = −log p(X → Y Z | X). Then the lightest tree has the highest probability, and we need only best-in-class entries to get the best parse.

### Probabilistic context-free grammars (PCFGs)

A PCFG G consists of the usual parts of a CFG:

- A set of terminals w^k, k = 1 … V.
- A set of nonterminals N^i, i = 1 … n.
- A designated start symbol N^1.
- A set of rules N^i → ζ^j, where ζ^j is a sequence of terminals and nonterminals.

plus a corresponding set of probabilities on rules such that, for all i, Σ_j P(N^i → ζ^j) = 1.

### Why probabilities, not weights?

We just saw that probabilities are really just a special case of weights, but we can estimate them from training data by counting and smoothing, and we can use all of our lovely probability-theory machinery.

### PCFG notation

- Sentence: a sequence of words w_1 … w_m.
- w_ab: the subsequence w_a … w_b.
- N^i_ab: nonterminal N^i dominates w_a … w_b.
- N^i ⇒* ζ: repeated derivation from N^i gives ζ.
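Both points above, the rule-probability constraint Σ_j P(N^i → ζ^j) = 1 and the weight conversion w = −log p, can be checked mechanically. A minimal sketch, using the simple CNF grammar that appears in these notes (the "astronomers" grammar); the tuple-keyed dictionary layout is my own assumption.

```python
import math
from collections import defaultdict

# The notes' simple CNF PCFG, keyed as (lhs, rhs): probability.
rules = {("S", ("NP", "VP")): 1.0,
         ("PP", ("P", "NP")): 1.0,
         ("VP", ("V", "NP")): 0.7,
         ("VP", ("VP", "PP")): 0.3,
         ("P", ("with",)): 1.0,
         ("V", ("saw",)): 1.0,
         ("NP", ("NP", "PP")): 0.4,
         ("NP", ("astronomers",)): 0.1,
         ("NP", ("ears",)): 0.18,
         ("NP", ("saw",)): 0.04,
         ("NP", ("stars",)): 0.18,
         ("NP", ("telescopes",)): 0.1}

# Well-formedness: for every nonterminal, its rule probabilities sum to 1.
totals = defaultdict(float)
for (lhs, _), p in rules.items():
    totals[lhs] += p

# Probabilities as weights: w = -log p, so the lightest tree is the most probable.
weights = {r: -math.log(p) for r, p in rules.items()}
```

A probability-1 rule such as S → NP VP gets weight 0, so it costs nothing to apply, which matches the intuition that certain choices carry no surprise.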
### PCFG probability of a string

P(w_1n) = Σ_t P(w_1n, t), where t ranges over parses of w_1n, = Σ_{t : yield(t) = w_1n} P(t)

### A simple PCFG in CNF

| Rule | Prob | Rule | Prob |
|------|------|------|------|
| S → NP VP | 1.0 | NP → NP PP | 0.4 |
| PP → P NP | 1.0 | NP → astronomers | 0.1 |
| VP → V NP | 0.7 | NP → ears | 0.18 |
| VP → VP PP | 0.3 | NP → saw | 0.04 |
| P → with | 1.0 | NP → stars | 0.18 |
| V → saw | 1.0 | NP → telescopes | 0.1 |

### The two parse trees, their probabilities, and the sentence probability

"Astronomers saw stars with ears" has two parses: in t1 the PP attaches inside the object NP (saw [stars with ears]); in t2 the PP attaches to the VP ([saw stars] with ears).

- P(t1) = 1.0 × 0.1 × 0.7 × 1.0 × 0.4 × 0.18 × 1.0 × 1.0 × 0.18 = 0.0009072
- P(t2) = 1.0 × 0.1 × 0.3 × 0.7 × 1.0 × 0.18 × 1.0 × 1.0 × 0.18 = 0.0006804
- P(w_15) = P(t1) + P(t2) = 0.0015876

### Assumptions of PCFGs

1. Place invariance (like time invariance in an HMM): for all k, P(N^j_{k(k+c)} → ζ) is the same.
2. Context-free: P(N^j_{kl} → ζ | words outside w_k … w_l) = P(N^j_{kl} → ζ).
3. Ancestor-free: P(N^j_{kl} → ζ | ancestor nodes of N^j_{kl}) = P(N^j_{kl} → ζ).

The sufficient statistics of a PCFG are thus simply counts of how often different local tree configurations occurred: counts of which grammar rules were applied.

### Some features of PCFGs

Reasons to use a PCFG, and some idea of their limitations:

- Partial solution for grammar ambiguity: a PCFG gives some idea of the plausibility of a sentence. But in the simple case, not a very good idea, as the independence assumptions are too strong (e.g., not lexicalized).
- Gives a probabilistic language model for English. In the simple case, a PCFG is a worse language model for English than a trigram model.
- Better for grammar induction (Gold 1967 vs. Horning 1969).
- Robustness: admit everything, with low probability.
- A PCFG encodes certain biases, e.g., that smaller trees are normally more probable.
- One can hope to combine the strengths of a PCFG and a trigram model.

We'll look at simple PCFGs first. They have certain inadequacies, but we'll see that most of the state-of-the-art probabilistic parsers are fundamentally PCFG models, just with various enrichments to the grammar.

### A slightly different task

We have been asking: what is the probability of generating a given tree, so as to pick the tree with the highest probability?
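The two tree probabilities can be checked by multiplying the rule probabilities directly. A minimal sketch; the rule-probability sequences are read off the two parses described above, and the helper name is my own.

```python
# Each parse tree's probability is the product of its rule probabilities.
t1_rules = [1.0, 0.1, 0.7, 1.0, 0.4, 0.18, 1.0, 1.0, 0.18]  # PP inside the object NP
t2_rules = [1.0, 0.1, 0.3, 0.7, 1.0, 0.18, 1.0, 1.0, 0.18]  # PP attached to the VP

def product(xs):
    out = 1.0
    for x in xs:
        out *= x
    return out

p_t1 = product(t1_rules)       # 0.0009072
p_t2 = product(t2_rules)       # 0.0006804
p_sentence = p_t1 + p_t2       # 0.0015876, the string probability
```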
That is useful in parsing. But we could also ask: what is the probability of generating a given *string* with the generator? Picking the string with the highest probability is useful in speech recognition, as a substitute for an n-gram model ("Put the file in the folder" vs. "Put the file and the folder"). To get the probability of generating a string, we must add up the probabilities of all trees for the string.

Could we just add up the parse probabilities one at a time? Oops, bad idea: there can be exponentially many parses. Any more efficient way? Yes: add as we go. This is the inside algorithm. [Slides show the chart for "time flies like an arrow" again, with each cell entry accumulating the summed probability of all subtrees over its span.]

### Inside and outside probabilities

- Inside: β_j(p, q) = P(w_pq | N^j_pq, G), the probability of all possible rule rewrites generating the words inside positions p to q, given that nonterminal N^j exactly spans p to q.
- Outside: α_j(p, q) = P(w_1(p−1), N^j_pq, w_(q+1)m | G), the probability of all possible rule rewrites generating the words outside positions p to q, together with N^j exactly spanning p to q.

For example, in "time flies like an arrow today", with the VP spanning words 2 through 5: α_VP(2, 5) = p(time VP today | S), and β_VP(2, 5) = p(flies like an arrow | VP).
Multiplying the two gives α_VP(2, 5) × β_VP(2, 5) = p(time [VP flies like an arrow] today | S), and dividing by the sentence probability,

α_VP(2, 5) β_VP(2, 5) / p(time flies like an arrow today)

is the probability that there is a VP here, given all of the observed data (the words).

### Probability of a string: inside probability

P(w_1m | G) = P(N^1 ⇒* w_1m | G) = P(w_1m | N^1_1m, G) = β_1(1, m)

Base case: we want β_j(k, k), the probability of a rule N^j → w_k:

β_j(k, k) = P(w_k | N^j_kk, G) = P(N^j → w_k)

Induction: we want β_j(p, q) for p < q. With a Chomsky Normal Form grammar, the first rule must be of the form N^j → N^r N^s, so we can proceed by induction, dividing the string in two in various places and summing the results:

β_j(p, q) = Σ_{r,s} Σ_{d=p}^{q−1} P(N^j → N^r N^s) β_r(p, d) β_s(d+1, q)

These inside probabilities can be calculated bottom-up.

### Inside probabilities as CYK

For "astronomers saw stars with ears", the chart of inside probabilities (row = start word, column = end word) is:

| | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | β_NP = 0.1 | | β_S = 0.0126 | | β_S = 0.0015876 |
| 2 | | β_NP = 0.04, β_V = 1.0 | β_VP = 0.126 | | β_VP = 0.015876 |
| 3 | | | β_NP = 0.18 | | β_NP = 0.01296 |
| 4 | | | | β_P = 1.0 | β_PP = 0.18 |
| 5 | | | | | β_NP = 0.18 |

### Outside probabilities

Base case: α_1(1, m) = 1, and α_j(1, m) = 0 for j ≠ 1.

Inductive case: the constituent N^j_pq is either a left branch or a right branch of its parent, so we sum over both possibilities, calculating with outside and inside probabilities. [Several slides of derivation follow.] The result:

α_j(p, q) = Σ_{f,g} Σ_{e=q+1}^{m} α_f(p, e) P(N^f → N^j N^g) β_g(q+1, e) + Σ_{f,g} Σ_{e=1}^{p−1} α_f(e, q) P(N^f → N^g N^j) β_g(e, p−1)

### Probability of a string from outside probabilities

For any k, 1 ≤ k ≤ m:

P(w_1m | G) = Σ_j P(w_1(k−1), w_k, w_(k+1)m, N^j_kk | G) = Σ_j α_j(k, k) P(N^j → w_k)

This is an inductive dynamic-programming calculation: one calculates the outside probabilities top-down, after determining the inside probabilities.
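The inside recursion above is short enough to sketch directly. A minimal sketch of the inside algorithm, run on the "astronomers" grammar from these notes; the function name and grammar layout are my own assumptions.

```python
from collections import defaultdict

def inside(words, unary, binary):
    """Inside algorithm: beta[(nt, p, q)] = P(w_p..w_q | nt spans p..q).
    `unary` maps word -> [(nt, prob)] for preterminal rules nt -> word;
    `binary` maps lhs -> [(Y, Z, prob)] for CNF rules lhs -> Y Z."""
    n = len(words)
    beta = defaultdict(float)
    # Base case: beta_j(k, k) = P(N^j -> w_k).
    for k, w in enumerate(words, start=1):
        for nt, p_rule in unary.get(w, []):
            beta[(nt, k, k)] += p_rule
    # Induction: split the span at every d and sum (bottom-up by width).
    for width in range(2, n + 1):
        for p in range(1, n - width + 2):
            q = p + width - 1
            for lhs, expansions in binary.items():
                for y, z, p_rule in expansions:
                    for d in range(p, q):
                        beta[(lhs, p, q)] += p_rule * beta[(y, p, d)] * beta[(z, d + 1, q)]
    return beta

unary = {"astronomers": [("NP", 0.1)], "ears": [("NP", 0.18)],
         "saw": [("NP", 0.04), ("V", 1.0)], "stars": [("NP", 0.18)],
         "with": [("P", 1.0)], "telescopes": [("NP", 0.1)]}
binary = {"S": [("NP", "VP", 1.0)], "PP": [("P", "NP", 1.0)],
          "VP": [("V", "NP", 0.7), ("VP", "PP", 0.3)],
          "NP": [("NP", "PP", 0.4)]}
beta = inside("astronomers saw stars with ears".split(), unary, binary)
```

The top entry β_S(1, 5) comes out as 0.0015876, the same string probability obtained earlier by summing the two parse trees, which is exactly the point of the algorithm.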
### Probability that a rule is used

As with an HMM, we can form a product of the inside and outside probabilities. This time:

α_j(p, q) β_j(p, q) = P(w_1(p−1), N^j_pq, w_(q+1)m | G) P(w_pq | N^j_pq, G) = P(w_1m, N^j_pq | G)

and therefore

P(N^j_pq | w_1m, G) = α_j(p, q) β_j(p, q) / P(w_1m | G)

This is an expected count for the number of times this rule occurred.

### Overall probability of a node existing

Summing over nonterminals, P(w_1m, N_pq | G) = Σ_j α_j(p, q) β_j(p, q). Just in the cases of the root node and the preterminals, we know there will always be some such constituent.

### Learning PCFGs (1)

We would like to calculate how often each rule is used: an estimate P̂(N^j → ζ). If we have labeled data, we count and find out. Relative frequency again gives maximum-likelihood probability estimates. This is the motivation for building treebanks of hand-parsed sentences.

### Learning PCFGs (2): Inside-Outside

Otherwise, we work iteratively from the expectations of the current model, constructing an EM training algorithm as for HMMs:

- For each sentence, at each iteration, we work out the expectation of how often each rule is used, using inside and outside probabilities.
- We assume sentences are independent and sum expectations over the parses of each.
- We re-estimate the rules based on these counts.
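The supervised case ("count and find out") can be sketched in a few lines. A minimal sketch of relative-frequency estimation; the tiny hand-written rule list stands in for a real treebank and is purely illustrative.

```python
from collections import defaultdict

# Rules read off a (toy, hand-labeled) treebank; each entry is one
# application of a rule (lhs, rhs) in some gold-standard parse tree.
observed_rules = [("S", ("NP", "VP")), ("NP", ("Det", "N")),
                  ("NP", ("Det", "N")), ("NP", ("NP", "PP")),
                  ("VP", ("V", "NP")), ("S", ("NP", "VP"))]

rule_counts = defaultdict(int)
lhs_counts = defaultdict(int)
for lhs, rhs in observed_rules:
    rule_counts[(lhs, rhs)] += 1
    lhs_counts[lhs] += 1

# Relative frequency gives the maximum-likelihood estimate:
# P(N -> zeta) = count(N -> zeta) / count(N).
probs = {(lhs, rhs): c / lhs_counts[lhs] for (lhs, rhs), c in rule_counts.items()}
```

The unsupervised case replaces these integer counts with the expected counts α_j(p, q) β_j(p, q) / P(w_1m | G) computed by inside-outside, and iterates.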
