TOP INTRO TO MULTIMEDIA NTWRK
These 6 pages of class notes were uploaded by Orrin Rutherford on Tuesday, September 1, 2015. The notes belong to CS 510 at Portland State University, taught by David Maier in Fall. Since upload, they have received 25 views. For similar materials see /class/168260/cs-510-portland-state-university in Computer Science at Portland State University.
Date Created: 09/01/15
Relevance Feedback, Query Expansion, and Inputs to Ranking Beyond Similarity
Lecture 10
CS 410/510: Information Retrieval on the Internet
(c) 2007 Susan Price and David Maier

Outline
- Query reformulation
- Sources of relevance for feedback
- Using relevance feedback
  - Vector model
  - Probabilistic model
- Clustering approaches
  - Local analysis
  - Global analysis
- Examples of relevance feedback and query expansion
- Inputs to document ranking other than similarity

The basic problem
- Queries are not always successful.
- New queries do not learn from previous failures.

Possible solutions
- Retrieve new documents
  - Add new terms to the query (query expansion)
- Reorder the documents
  - Re-weight terms

How?
- Where do we get information for adding query terms or re-weighting documents?
- Use relevant documents to help find more relevant documents.
- Assumption: relevant documents are more similar to each other than they are to non-relevant documents.

How?
How do we get relevant documents?
1. Ask the user
   - System returns documents for the initial query
   - User marks which ones are relevant
   - System uses these as input to a new query
   - Assumption: the user will recognize documents that are useful or close to what he wants
2. Assume top-ranked documents are relevant
   - Automatically reformulate the query

Relevance feedback
[Figure: interaction loop between user and system. Recoverable labels: 1. Query, 3. Relevance feedback, 4. New result list.]

Blind feedback
1. Query
2. Term re-weighting based on top-ranked documents
3. Results

Using relevance feedback
Vector model: use terms and term weights in relevant documents to reformulate the query
- Add new terms (query expansion)
- Re-weight existing terms
Basic approach:
\[ \vec{q}_{new} = \vec{q}_{old} + \vec{v}_{posExamples} - \vec{v}_{negExamples} \]

Relevance feedback: Vector model
Three classic query reformulations. $\alpha$, $\beta$, $\gamma$ were originally set to 1. $D_r$ is the set of relevant docs; $D_n$ is the set of non-relevant docs.
1. Rocchio (divide the positive and negative vectors by the size of the example sets):
\[ \vec{q}_m = \alpha \vec{q} + \frac{\beta}{|D_r|} \sum_{\vec{d}_j \in D_r} \vec{d}_j - \frac{\gamma}{|D_n|} \sum_{\vec{d}_j \in D_n} \vec{d}_j \]
2. Ide:
\[ \vec{q}_m = \alpha \vec{q} + \beta \sum_{\vec{d}_j \in D_r} \vec{d}_j - \gamma \sum_{\vec{d}_j \in D_n} \vec{d}_j \]
3. Ide Dec-Hi (use only the top-ranked non-relevant document):
\[ \vec{q}_m = \alpha \vec{q} + \beta \sum_{\vec{d}_j \in D_r} \vec{d}_j - \gamma \max_{\text{non-relevant}}(\vec{d}_j) \]
All three follow the basic approach: new query = old query + positive examples - negative examples.

Using relevance feedback
Probabilistic model: use relevant documents to recalculate probabilities of relevance
- Re-weight terms
- No query expansion

Clustering approaches
- Cluster hypothesis: "closely associated documents tend to be relevant to the same requests" (van Rijsbergen, Information Retrieval, chapter 3, 1979)
- [Figure: documents plotted as points, with one marker for documents relevant to query A and another for documents relevant to query B.]
- Instead of using weights of individual terms in relevant documents, use relevant documents to describe a cluster of desirable documents.

Clustering for query reformulation
- Local analysis
  - Operate only on documents returned for the current query (the local set)
  - Use term co-occurrences to select terms for query expansion
- Global analysis
  - Use information from all the documents in the collection

Local clustering
- Use term co-occurrence data to calculate the correlation $C_{u,v}$ between terms $u$ and $v$
- Add to the query the $n$ terms $v$ with the highest $C_{u,v}$, where $u$ is a term in the original query
- Reflects term clustering

Local clustering
- Association clusters: $C_{u,v}$ reflects frequency of co-occurrence in the same document
- Metric clusters: $C_{u,v}$ reflects proximity of co-occurrences (number of words between occurrences of $u$ and $v$ in documents)
- Scalar clusters: calculate vectors of correlation values for all terms in the local documents; $C_{u,v}$ reflects similarity of the vectors (e.g., cosine of the angle between the vectors for $u$ and $v$)

Local context analysis
Use noun groups (one or more adjacent nouns in text), not just keywords. These are the "concepts".
1. Retrieve the top $n$ ranked passages with the original query
2. Calculate the similarity between the query and each concept (noun group)
   - Reflects co-occurrence of query terms and concepts in passages
3. Add the top $m$ ranked concepts to the query, weighted by similarity ranking

Global analysis
- Similarity thesaurus
- Statistical thesaurus

Creating a similarity thesaurus
- Index a term based on the documents it appears in
  - Each term $i$ is associated with a vector $\vec{k}_i = (w_{i,1}, w_{i,2}, \ldots, w_{i,N})$, where $w_{i,j}$ is the weight of document $j$ associated with term $i$
  - Weights are calculated similarly to tf-idf, except using inverse term frequency, not inverse document frequency
  - The length of the vector is the number of documents in the collection, $N$
- Calculate a matrix of term correlations:
\[ c_{u,v} = \vec{k}_u \cdot \vec{k}_v = \sum_{j=1}^{N} w_{u,j} \, w_{v,j} \]

Similarity thesaurus
- Previous uses of global analysis added terms based on correlation with individual query terms
- Calculate a virtual query vector that combines the vectors for each query term, weighted by its query weight $w_{u,q}$
- Calculate the similarity of each term in the collection to the query concept
- Add the top $r$ terms to the original query
  - Weighted by similarity to the original query

Statistical thesaurus
- Classes derived from clusters and 3 parameters
  - Similarity threshold (want tight clusters)
  - Cluster size (want small clusters)
  - Minimum inverse document frequency
- Class weight: average term weight, number of terms in class

Using a statistical thesaurus
- Terms within a document class are considered related
- Documents and queries are indexed [illegible] augmented using the thesaurus classes

Query expansion on the Web
[Figure: Google Suggest screenshot.]
From the Google Suggest FAQs: "That's pretty cool. How does it do that? Our algorithms use a wide range of information to predict the queries users are most likely to want to see. For example, Google Suggest uses data about the overall popularity of various searches to help rank the refinements it offers."

Relevance feedback
[Figure: example screenshot; content not recoverable.]

Textpresso
- Text mining application for scientific literature
- Can search terms or categories of terms
  - Biological concepts (gene, cell, nucleic acid)
  - Relationships (association, regulation)
  - Biological processes
- Can search for "gene", or do query expansion to search for any gene name

Query expansion examples
- What if we search using the medical term "varicella"? The search is automatically expanded with the corresponding MeSH term.
- Searching on a biologic concept: expand from category to instances.

Document ranking
Using considerations besides similarity
- Filters
- Ranking adjustments

Possible filters
- Language (English, Spanish, etc.)
- Source
- Other metadata
- Novelty: suppress documents that are
  - Already seen
  - Duplicates or near-duplicates
  - From the same site

Possible ranking inputs
Inputs other than similarity
- Importance (page rank)
- Popularity
- Reading level
- Currency
- Quality
  - What is quality?
  - What are the criteria?
  - Who applies the criteria?

Next
Classification and clustering
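
Sketch: vector-model query reformulation
The three classic reformulations from the vector-model slides (Rocchio, Ide, Ide Dec-Hi) differ only in how the positive and negative example vectors are scaled and which negative examples are used. The following is a minimal sketch, not the lecture's own code. It assumes vectors are stored as sparse dicts mapping term to weight, that the non-relevant list is in rank order (best-ranked first), and that terms whose weight ends up non-positive are dropped. The function name `reformulate`, its parameters, and the sample documents are all hypothetical.

```python
from collections import defaultdict


def reformulate(query, relevant, nonrelevant,
                alpha=1.0, beta=1.0, gamma=1.0, method="rocchio"):
    """Return a reformulated query vector as a dict of term -> weight.

    query, and each element of relevant / nonrelevant, is a sparse
    vector (dict of term -> weight). nonrelevant is assumed to be in
    rank order, best-ranked first (needed for Ide Dec-Hi).
    """
    if method == "rocchio":
        # Rocchio divides each example sum by the size of its set.
        pos_scale = beta / len(relevant) if relevant else 0.0
        neg_scale = gamma / len(nonrelevant) if nonrelevant else 0.0
        negatives = nonrelevant
    elif method == "ide":
        pos_scale, neg_scale = beta, gamma
        negatives = nonrelevant
    elif method == "ide_dec_hi":
        # Ide Dec-Hi uses only the top-ranked non-relevant document.
        pos_scale, neg_scale = beta, gamma
        negatives = nonrelevant[:1]
    else:
        raise ValueError(f"unknown method: {method}")

    new_query = defaultdict(float)
    for term, weight in query.items():
        new_query[term] += alpha * weight
    for doc in relevant:
        for term, weight in doc.items():
            new_query[term] += pos_scale * weight
    for doc in negatives:
        for term, weight in doc.items():
            new_query[term] -= neg_scale * weight

    # Assumption (not on the slides): discard non-positive weights.
    return {t: w for t, w in new_query.items() if w > 0}


query = {"jaguar": 1.0}
relevant = [{"jaguar": 1.0, "cat": 2.0},
            {"jaguar": 1.0, "cat": 1.0, "jungle": 1.0}]
nonrelevant = [{"jaguar": 1.0, "car": 2.0},
               {"car": 1.0, "cat": 1.0}]

rocchio_q = reformulate(query, relevant, nonrelevant, method="rocchio")
ide_q = reformulate(query, relevant, nonrelevant, method="ide")
dec_hi_q = reformulate(query, relevant, nonrelevant, method="ide_dec_hi")
```

In this toy example the query gains the terms "cat" and "jungle" (query expansion) and the weight of "jaguar" changes (re-weighting), while "car" is pushed negative by the non-relevant examples and removed.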
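
Sketch: local clustering with association clusters
The local clustering slides describe computing a correlation between terms from co-occurrence in the local set and adding the best-correlated terms to the query. The sketch below is one possible reading, under stated assumptions: the correlation is the unnormalized sum over local documents of the product of the two terms' frequencies, documents are already tokenized into lists of terms, and ties are broken alphabetically. The names `association_scores` and `expand_query` and the sample documents are hypothetical.

```python
from collections import Counter


def association_scores(u, local_docs):
    """Correlation of every term v with query term u over the local set.

    Assumed definition: sum over documents of freq(u, d) * freq(v, d).
    """
    scores = Counter()
    for doc in local_docs:
        freq = Counter(doc)
        if freq[u] == 0:
            continue  # u does not occur here, so nothing co-occurs with it
        for v, fv in freq.items():
            if v != u:
                scores[v] += freq[u] * fv
    return scores


def expand_query(query_terms, local_docs, n=2):
    """Add up to n of the best-correlated terms for each query term."""
    expanded = list(query_terms)
    for u in query_terms:
        scores = association_scores(u, local_docs)
        # Highest correlation first; alphabetical order breaks ties.
        ranked = sorted(scores, key=lambda v: (-scores[v], v))
        for v in ranked[:n]:
            if v not in expanded:
                expanded.append(v)
    return expanded


# Local set: tokenized documents returned for the current query.
local_docs = [
    ["jaguar", "cat", "cat", "jungle"],
    ["jaguar", "car", "dealer"],
    ["jaguar", "cat", "speed"],
]
jaguar_scores = association_scores("jaguar", local_docs)
expanded = expand_query(["jaguar"], local_docs, n=2)
```

Metric and scalar clusters would replace only `association_scores`: the former with a proximity-based score, the latter with a cosine between the terms' correlation vectors.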