Lexical Acquisition
Lecture 9, Introduction to Natural Language Processing
CMPSCI 585, Fall 2004
University of Massachusetts Amherst
Andrew McCallum

Words and Their Meaning: Three Lectures
- Collocations: multiple words together whose meaning is different from the sum of the parts.
- Word sense disambiguation: one word, multiple meanings.
- This time, lexical acquisition: verb subcategorization, attachment ambiguity, selectional preference, and semantic similarity (multiple words, same meaning).

Today's Main Points
- What is lexical acquisition, and why is it useful?
- Verb subcategorization
- Attachment ambiguity
- Selectional preference
- Clustering words into semantically similar classes

Lexical Acquisition
Acquiring the properties of words.
- Practical motivation: filling holes in dictionaries. Lots of useful information isn't in dictionaries anyway (e.g., "associated with" versus "associated to").
- Claim: most knowledge of language is encoded in words and their properties.
- Acquiring collocations and word sense disambiguation are examples of lexical acquisition, but there are many other types.

Why Lexical Acquisition?
- Language evolves: new words and new uses of old words are constantly invented.
- Traditional dictionaries were written for the needs of human users. Lexicons are dictionaries formatted for computers.
- Beyond the format, lexicons are more useful if they contain quantitative information. Lexical acquisition can provide such information.

Verb Phrases and Subcategorization
A verb phrase consists of a verb and a number of constituents. Examples:
- VP -> V: "disappear"
- VP -> V NP: "prefer a morning flight"
- VP -> V NP PP: "leave Boston in the morning"
- VP -> V PP: "leave on Thursday"
- VP -> V S (sentential complement): "said you had a $200 fare"

Different verbs, different constituents. A verb phrase can have many possible kinds of constituents, but not every verb is compatible with every verb phrase:
- want: VP -> V NP: "I want a flight"
- want: VP -> V VP-to: "I want to fly to ..."
- find: VP -> V NP: "I found a flight"
- find: VP -> V VP-to: *"I found to fly to ..."

Transitive verbs take a direct object ("I found a flight"); intransitive verbs do not (*"I disappeared a flight"). Transitive and intransitive are the simplest examples of verb subcategorization.

Verb Subcategorization
Verbs express their semantic arguments with different syntactic means.
- Frame: the slots for the arguments of the verb.
- Category: verbs that take the same semantic arguments, e.g. verbs whose semantic arguments are a theme and a recipient.
- Subcategory: verbs that use the same syntactic means to express those semantic arguments. Examples:
  - Subcategory 1, prepositional phrase: "He donated a large sum of money to the church."
  - Subcategory 2, double object: "He gave the church a large sum of money."

Examples of Subcategorization Frames
- Intransitive verb, NP[subject]: "The woman walked."
- Transitive verb, NP[subject] NP[object]: "John loves Mary."
- Ditransitive verb, NP[subject] NP[direct object] NP[indirect object]: "Mary gave Peter flowers."
- Intransitive with PP, NP[subject] PP: "I rent in Northampton."
- Sentential complement, NP[subject] clause: "I know that she likes you."
- Transitive with sentential complement, NP[subject] NP[object] clause: "She told me that Gary is coming."

One Verb, Multiple Subcategorizations
One verb can take different subcategorization frames. Example: find.
- VP -> V NP: "find a flight"
- VP -> V NP NP: "find me a flight"

Subcategorization is needed for parsing:
- "She told the man where Peter grew up." (the clause attaches to the verb)
- "She found the place where Peter grew up." (the clause attaches to the noun)
It helps us get attachment right. Unfortunately, most dictionaries don't contain subcategorization frames, and those that do are horribly incomplete.

Learning Subcategorization Frames (Brent 1993)
Does some particular verb take the direct-object frame VP -> V NP?
- Cues for frames: e.g., assume that the pattern "verb + pronoun (or capitalized word) + punctuation" identifies the direct-object frame, with some known error rate ε.
- Count occurrences: n = number of occurrences of the verb in question; m = number of occurrences of the cue with that verb.
- Hypothesis testing. H0: the verb does not take the frame. Under H0, the probability of seeing the cue m or more times is the binomial tail

  p_E = P(\text{count} \geq m \mid H_0) = \sum_{r=m}^{n} \binom{n}{r} \epsilon^r (1-\epsilon)^{n-r}

  If p_E is small enough, reject H0 and conclude that the verb does take the frame.
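A minimal sketch of this test in Python (not Brent's implementation; the counts n = 200 and m = 5, the error rate ε = 0.01, and the 0.02 significance cutoff are all invented for illustration):

```python
from math import comb

def p_frame_error(n: int, m: int, eps: float) -> float:
    """P(cue seen >= m times in n verb occurrences | H0: verb does not
    take the frame), where eps is the cue's error rate, i.e. the chance
    the cue fires even though the frame is absent."""
    return sum(comb(n, r) * eps**r * (1 - eps)**(n - r) for r in range(m, n + 1))

# Hypothetical counts: verb seen n=200 times, direct-object cue m=5 times.
p = p_frame_error(n=200, m=5, eps=0.01)
print(f"P(>= 5 cue hits by chance) = {p:.4f}")
# If p falls below a chosen significance level (say 0.02, an assumed
# cutoff), reject H0 and record the direct-object frame for the verb.
```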
Learning Subcategorization Frames (Brent 1993; Manning 1993)
Brent's system does well at precision but not at recall. Manning (1993) addresses this problem by using a part-of-speech tagger and running the cue detection on the tagger's output; e.g., a tag pattern such as the verb followed by "DET N" indicates the direct-object frame. Manning's method can learn a large number of subcategorization frames, even those that have only low-reliability cues.

Learned Subcategorization Frames (Manning 1993)
The slide compares the frames learned for a sample of verbs (bridge, burden, depict, emanate, leak, occupy, remark, retire, ...) against the Oxford Advanced Learner's Dictionary, counting correct and incorrect frames for each verb. Almost all learned frames are correct. The one error, an intransitive frame attributed to "remark", is probably due to sentences like:
  "And here we are, 10 years later, with the same problems," Mr. Smith remarked.

Attachment Ambiguity
Where should a phrase be attached in the parse tree? "I saw the man with the telescope." What does "with the telescope" modify? Is the problem AI-complete? Yes, but simple structural factors have been proposed:
- Right association (Kimball 1973): low or "near" attachment; early closure of the NP.
- Minimal attachment (Frazier 1978): depends on the grammar; high or "distant" attachment; late closure of the NP.

Such simple structural factors dominated early psycholinguistics and are still widely invoked. In the V NP PP context, right attachment gets about 55-67% of the cases right; but this means it gets 33-45% of the cases wrong.

The choice of lexical items matters:
- "The children ate the cake with a spoon."
- "The children ate the cake with frosting."
- "Joe included the package for Susan."
- "Joe carried the package for Susan."
Ford, Bresnan and Kaplan (1982): "It is quite evident, then, that the closure effects in these sentences are induced in some way by the choice of the lexical items."

Simple Model: Log Likelihood Ratio
A common and good way of comparing two exclusive alternatives (the same idea as a naive Bayes classifier):

  \lambda(v, n, p) = \log_2 \frac{P(p \mid v)}{P(p \mid n)}

If λ > 0, attach the PP to the verb; if λ < 0, attach it to the noun. For example, P(with a spoon | ate) > P(with a spoon | cake).

Attachment: Problematic Example
"Chrysler confirmed that it would end its troubled venture with Maserati."

  word      C(word)   C(word, with)
  end       5156      607
  venture   1442      155

We get the wrong answer: P(with | end) = 607/5156 = 0.118 beats P(with | venture) = 155/1442 = 0.107, so the PP is attached to the verb. A better model should also express a preference for attaching low.
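These numbers are easy to reproduce; a small sketch (base-2 logarithm assumed, as above):

```python
from math import log2

# Counts from the slide.
c = {"end": 5156, "venture": 1442}
c_with = {"end": 607, "venture": 155}

p_with_end = c_with["end"] / c["end"]              # ~0.118
p_with_venture = c_with["venture"] / c["venture"]  # ~0.107

# lambda > 0 -> attach to the verb; lambda < 0 -> attach to the noun.
lam = log2(p_with_end / p_with_venture)
print(f"P(with|end)     = {p_with_end:.3f}")
print(f"P(with|venture) = {p_with_venture:.3f}")
print(f"lambda          = {lam:.3f}")  # ~0.13 > 0: verb attachment, the wrong answer
```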
Attachment Method (Hindle & Rooth 1993)
Event space: all "V NP PP" sequences, where the PP must modify either the verb or the first noun. Don't directly decide whether the PP modifies V or N; rather, look at two binary random variables:
- VA_p: is there a PP headed by p which attaches to v?
- NA_p: is there a PP headed by p which attaches to n?
Both can be 1: "He put the book [on World War II] [on the table]."

Independence assumptions:

  P(VA_p, NA_p \mid v, n) = P(VA_p \mid v, n) \, P(NA_p \mid v, n) = P(VA_p \mid v) \, P(NA_p \mid n)

Decision space: the first PP after the NP.

  P(\text{Attach}(p) = n \mid v, n)
    = P(VA_p = 0 \mid v) \, P(NA_p = 1 \mid n) + P(VA_p = 1 \mid v) \, P(NA_p = 1 \mid n)
    = P(NA_p = 1 \mid n)

It doesn't matter what VA_p is: if both are true, the first PP after the NP must modify the noun (in phrase structure trees, lines don't cross). Conversely, for the first PP headed by p to attach to the verb, both VA_p = 1 and NA_p = 0 must hold:

  P(\text{Attach}(p) = v \mid v, n) = P(VA_p = 1 \mid v) \, P(NA_p = 0 \mid n)

We assess which is more likely by a log likelihood ratio:

  \lambda(v, n, p) = \log_2 \frac{P(\text{Attach}(p) = v \mid v, n)}{P(\text{Attach}(p) = n \mid v, n)}
                   = \log_2 \frac{P(VA_p = 1 \mid v) \, P(NA_p = 0 \mid n)}{P(NA_p = 1 \mid n)}

If λ is large and positive, decide verb attachment; if large and negative, decide noun attachment.

How do we learn the probabilities? From smoothed maximum likelihood estimates:

  P(VA_p = 1 \mid v) = C(v, p) / C(v)
  P(NA_p = 1 \mid n) = C(n, p) / C(n)

How do we get these counts from an unlabeled corpus? Use a partial parser and look for unambiguous cases:
- "The road to London is long and winding." (no verb competes, so the PP must attach to the noun)
- "She sent him to the nursery to gather up his toys." (a pronoun takes no PP modifier, so the PP must attach to the verb)

Hindle and Rooth heuristically determine C(v, p), C(n, p), C(v), and C(n) from unlabeled data:
1. Build an initial model by counting all unambiguous cases.
2. Apply the initial model to all ambiguous cases, and assign each to the appropriate count if |λ| exceeds a threshold.
3. Divide the remaining ambiguous cases evenly between the counts: increase C(v, p) and C(n, p) by 0.5 each.

Attachment Method Example (Hindle & Rooth 1993)
"Moscow sent more than 100,000 soldiers into Afghanistan."

Other Attachment Issues
There are attachment questions other than prepositional phrases: adverbials, participials, noun compounds. Examples:
- "[door bell] manufacturer" vs. "door [bell manufacturer]"
- "[Unix system] administrator" vs. "Unix [system administrator]"
Data sparseness is a bigger problem with many of these. And in general, indeterminacy is quite common: "We have not signed a settlement agreement with them." Either reading seems equally plausible.

Lexical Acquisition: Semantic Similarity
The previous models give the same estimate to all unseen events. That is unrealistic; we could hope to refine the estimates based on semantic classes of words. Example: "Susan had never eaten a fresh durian before." Although never seen, "eating pineapple" should be more likely than "eating holograms", because pineapple is similar to apples, and we have seen "eating apples".

An Application: Selectional Preferences
Most verbs prefer arguments of a particular type; such regularities are called selectional preferences or selectional restrictions. "Bill drove a ..." (Mustang? car? truck? jeep?) Selectional preference strength: how strongly does a verb constrain its direct objects (compare "see" with "unknotted")?

Measuring Selectional Preference Strength
Assume we are given a clustering of direct-object nouns; Resnik (1993) uses WordNet. Selectional preference strength is the KL divergence between the distribution of noun classes given the verb and their prior distribution:

  S(v) = D\big( P(C \mid v) \,\|\, P(C) \big) = \sum_c P(c \mid v) \log_2 \frac{P(c \mid v)}{P(c)}

The selectional association between a verb and a class is the proportion its summand contributes to the preference strength:

  A(v, c) = \frac{P(c \mid v) \log_2 \frac{P(c \mid v)}{P(c)}}{S(v)}

For nouns in multiple classes, disambiguate as the most likely sense:

  A(v, n) = \max_{c \,:\, n \in c} A(v, c)

Selectional Preference Strength (Made-up Data)

  Noun class c   P(c)   P(c|eat)   P(c|see)   P(c|find)
  people         0.25   0.01       0.25       0.33
  furniture      0.25   0.01       0.25       0.33
  food           0.25   0.97       0.25       0.33
  action         0.25   0.01       0.25       0.01

  SPS S(v):             1.76       0.00       0.35
  A(eat, food) = 1.08;  A(find, action) = -0.13
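A short computation reproduces the made-up table's values (a sketch assuming base-2 logarithms, using only the slide's invented probabilities):

```python
from math import log2

P_c = {"people": 0.25, "furniture": 0.25, "food": 0.25, "action": 0.25}
P_c_given_v = {
    "eat":  {"people": 0.01, "furniture": 0.01, "food": 0.97, "action": 0.01},
    "see":  {"people": 0.25, "furniture": 0.25, "food": 0.25, "action": 0.25},
    "find": {"people": 0.33, "furniture": 0.33, "food": 0.33, "action": 0.01},
}

def S(v):
    """Selectional preference strength: KL divergence D(P(C|v) || P(C))."""
    return sum(p * log2(p / P_c[c]) for c, p in P_c_given_v[v].items())

def A(v, c):
    """Selectional association: class c's share of the preference strength."""
    p = P_c_given_v[v][c]
    return p * log2(p / P_c[c]) / S(v)

for v in ("eat", "see", "find"):
    print(f"S({v}) = {S(v):.2f}")                   # 1.76, 0.00, 0.35
print(f"A(eat, food)    = {A('eat', 'food'):.2f}")     # 1.08
print(f"A(find, action) = {A('find', 'action'):.2f}")  # -0.13
```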
Selectional Preference Strength Example (Resnik, Brown corpus)
The slide shows selectional association values A(v, n) computed by Resnik from the Brown corpus, together with the WordNet class responsible for each value. Typical objects score high and atypical objects score low: e.g., a typical object of "answer" scores 4.49 via the class "speech act", A(remember, reply) = 1.31, and A(write, letter) = 7.26.

But how might we measure word similarity, for building word classes?

Vector Spaces
Represent words as vectors and measure similarity in vector space. The slide illustrates three matrices built from a corpus, shown for the words cosmonaut, astronaut, moon, car, and truck:
- A document-by-word matrix A: entry a_ij counts occurrences of word j in document i (documents d1-d6).
- A word-by-word matrix B: entry b_ij counts co-occurrences of words i and j.
- A modifier-by-head matrix C: entry c_ij counts how often head j occurs with modifier i (e.g., the modifier "American" with the head "astronaut").

Similarity Measures for Binary Vectors
With X and Y the sets of dimensions on which the two binary vectors are non-zero:

  Similarity measure     Definition
  matching coefficient   |X ∩ Y|
  Dice coefficient       2 |X ∩ Y| / (|X| + |Y|)
  Jaccard coefficient    |X ∩ Y| / |X ∪ Y|
  Overlap coefficient    |X ∩ Y| / min(|X|, |Y|)
  cosine                 |X ∩ Y| / sqrt(|X| × |Y|)

Cosine Measure
For real-valued vectors:

  \cos(\vec{x}, \vec{y}) = \frac{\sum_i x_i y_i}{\sqrt{\sum_i x_i^2} \, \sqrt{\sum_i y_i^2}}

The cosine maps vectors onto the unit sphere by dividing through by their lengths. The slide shows examples of the cosine measure on a word-by-word matrix computed from the New York Times: several focus words (e.g., "fallen", "engineered", "Alfred") with their nearest neighbors by cosine, such as "fell" (0.932) for "fallen".

Probabilistic Measures
Alternatively, turn co-occurrence counts into conditional probability distributions and compare the distributions:

  (Dis-)similarity measure            Definition
  KL divergence                       D(p \,\|\, q) = \sum_i p_i \log \frac{p_i}{q_i}
  skew divergence                     D(q \,\|\, \alpha p + (1 - \alpha) q)
  Jensen-Shannon divergence (IRad)    D\big(p \,\|\, \frac{p+q}{2}\big) + D\big(q \,\|\, \frac{p+q}{2}\big)
  L1 norm (Manhattan)                 \sum_i |p_i - q_i|

Neighbors of the Word "company" (Lee)

  Skew           Jensen-Shannon   Euclidean
  airline        business         city
  business       airline          airline
  bank           firm             industry
  agency         bank             program
  firm           mite             organization
  department     agency           bank
  manufacturer   group            system
  network        govt             today
  industry       city             series
  govt           industry         portion
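A minimal sketch of these probabilistic measures (the two distributions are invented for illustration; the skew-divergence argument order shown is one common convention, and α = 0.99 is an assumed value):

```python
from math import log2

def kl(p, q):
    """KL divergence D(p || q); assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def skew(p, q, alpha=0.99):
    """Skew divergence: D(p || alpha*q + (1 - alpha)*p). Mixing q toward p
    keeps the divergence finite; argument-order conventions vary."""
    mix = [alpha * qi + (1 - alpha) * pi for pi, qi in zip(p, q)]
    return kl(p, mix)

def jensen_shannon(p, q):
    """Information radius: D(p || m) + D(q || m), m the midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return kl(p, m) + kl(q, m)

def l1(p, q):
    """Manhattan distance between the two distributions."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

# Invented conditional distributions, e.g. P(preposition | noun) for two nouns.
p = [0.50, 0.30, 0.20, 0.00]
q = [0.40, 0.30, 0.20, 0.10]
print(f"skew = {skew(p, q):.4f}")
print(f"JS   = {jensen_shannon(p, q):.4f}")
print(f"L1   = {l1(p, q):.4f}")
```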
Examples of Verb Subcategorization Frames and Their Functions

  Frame       Functions                     Verb    Example
  NP          subject                       greet   She greeted me.
  NP S        subject, clause               hope    I hope (that) he will attend.
  NP INF      subject, infinitive           hope    She hopes to attend.
  NP NP S     subject, object, clause       tell    I told him (that) he will attend.
  NP NP INF   subject, object, infinitive   tell    I told him to attend.
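To close, a toy sketch of how frames like these might be stored in a machine-readable lexicon; the Python representation and the licenses() helper are hypothetical, with entries drawn from the examples above:

```python
# Toy subcategorization lexicon: verb -> set of complement frames
# (the subject is omitted; frames follow the table above).
SUBCAT = {
    "greet": {("NP",)},
    "hope":  {("S",), ("INF",)},
    "tell":  {("NP", "S"), ("NP", "INF")},
    "want":  {("NP",), ("INF",)},
    "find":  {("NP",), ("NP", "NP")},
}

def licenses(verb: str, complements: tuple) -> bool:
    """Could a parser combine this verb with this complement sequence?"""
    return complements in SUBCAT.get(verb, set())

print(licenses("find", ("NP", "NP")))  # True:  "find me a flight"
print(licenses("find", ("INF",)))      # False: *"I found to fly"
print(licenses("tell", ("NP", "S")))   # True:  "I told him that ..."
```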