COMP ANALY OF SOCIAL PROCESSES CSCI 6965
These 14 pages of class notes were uploaded by Ransom Blanda on Monday, October 19, 2015. The notes belong to CSCI 6965 at Rensselaer Polytechnic Institute, taught by Staff in Fall.
Date Created: 10/19/15
Q2Semantic: A Lightweight Keyword Interface to Semantic Search
Zhenning Shangguan
For CSCI 6965 Class, Mar 03, 2009

Introduction
- Semantic search should support more expressive queries that address complex information needs.
- However, current semantic query engines only support formal queries (e.g., SPARQL).
  - Users must first learn a complex syntax.
  - Users must also know the underlying schema and vocabularies of the data set.
- Keyword search is one possible solution.
  - Simple syntax
  - Open vocabularies

Keyword Search
- How to bridge the gap between keyword queries and formal queries?
  - IR and DB community
  - SW community: SPARK (ISWC '07), Tran T. (ISWC '07), etc.
- Challenges
  - Open vocabulary
  - Ranking
  - Scalability

Contributions
- Terms extracted from Wikipedia enrich the literals described in the original RDF data.
- Mechanisms for query ranking that consider several relevant factors.
- A novel graph data structure, called the clustered RACK graph, and an exploration algorithm that allows the construction of the top-k queries.

Workflow of Q2Semantic
[Figure: workflow diagram. Search process: Phrase Mapping, Query Construction, Ranking, top-k formal queries. Index process: RDF graph, Mapping, RACK graph, Clustering, clustered RACK graph, Indexing. Example in the figure: keywords k1 = "Capin", k2 = "SVG"; formal query <r, p1, p2> with p2 = <x1, hasAuthor, x2, name, Capin>.]
- Input: a keyword query K composed of keyword phrases k1, k2, ..., kn
- Search process: Phrase Mapping, Query Construction, and Ranking
- Index process: Mapping, Clustering, and Indexing
- Output: a formal query F as a tree of the form <r, p1, p2, ..., pn>, where r is the root node of F and pi is a path in F

Mapping rules
- Every relation is mapped to an R-Edge that is labeled by the relation name and connects two C-Nodes.
- Every attribute is mapped to an A-Edge that is labeled by the attribute name and connects a C-Node with a K-Node.

Clustering rules
- Two R-Edges are clustered if they have the same label and connect the same pair of C-Nodes.
- Two A-Edges are clustered if they have the same label and are connected to the same C-Node.
- Two K-Nodes are clustered to one if they are connected to the same A-Edge. The resulting node inherits the labels of these K-Nodes.

Query Interpretation in Q2Semantic
- Phrase Mapping
- Query Construction
  - Thread Expansion (T-Expansion)
  - Cursor Expansion (C-Expansion)
  - Two strategies for expansion
    - Intra-Thread Strategy
    - Inter-Thread Strategy
  - Optimization for Top-k Termination
  - Optimization for Repeated Expansion

Query Construction Algorithm
Input: K = {k_1, k_2, ..., k_n}, where k_i hits the K-Nodes KL_i = {knode_{i,1}, knode_{i,2}, ..., knode_{i,m_i}} with matching relevance S_i = {s_{i,1}, s_{i,2}, ..., s_{i,m_i}}
Output: A, the result set, initially ∅
Data: T_prune, the pruning threshold, initially T_0

    for i ∈ [1, n] do
        L_i ← new Thread
        for j ∈ [1, m_i] do
            L_i.add(new Cursor(s_{i,j}, knode_{i,j}, NULL, …))
        end
    end
    while ∃ i ∈ [1, n] : L_i.peekCost() ≠ ∞ do
        j ← pick from [1, n] in a round-robin fashion
        c ← L_j.popMin()
        C-Expansion(c)    // A and T_prune will be updated here
        if L_j.peekCost() > T_prune then
            Output the top-k answers in A
        end
    end

Query Ranking in Q2Semantic
- Path only:
  R_1 = \sum_{1 \le i \le n} \sum_{e \in p_i} 1
- Adding matching relevance:
  R_2 = \sum_{1 \le i \le n} \frac{1}{D_i} \sum_{e \in p_i} 1
- Adding importance of edges and nodes:
  R_3 = \sum_{1 \le i \le n} \frac{1}{D_i} \sum_{e \in p_i} c_e
  c_{node} = 2 - \log_2\left(\frac{|node|}{|N|} + 1\right), \quad c_{edge} = 2 - \log_2\left(\frac{|edge|}{|E|} + 1\right)
  Here D_i is the matching relevance associated with path p_i, and c_e is the cost of element e.

Experiment Setup
- TAP: 220K triples (Initiative, www, rdf, perllib)
- DBLP: 26M triples
  - 100 valid queries formed by combining literals from different attributes, from one to three keywords
- LUBM(1,0), LUBM(20,0), and LUBM(50,0)
  - 8 queries from the LUBM Query Set (LQ) are used, after removing 2 cyclic queries and 4 queries requiring reasoning support

Effectiveness Evaluation
- A simple but effective metric: Target Query Position (TQP)
  TQP = 11 - P_{target}
[Figure: TQPs of different ranking schemes on TAP, queries Q1 to Q9. Series: Path Only (R1), Add Relevance (R2), Add Importance (R3).]
[Figure: TQPs on LUBM benchmark queries.]

Efficiency Evaluation
[Figure: search time under different ranking schemes, queries Q1 to Q9.]
[Figure: search time under different top-k, queries Q1 to Q9.]
[Figure: performance of penalty parameters, queries Q1 to Q9.]
[Figure: index size and search time on different datasets, RACK graph vs. clustered RACK graph, including LUBM(1,0), LUBM(20,0), LUBM(50,0), and DBLP.]

Conclusions and Future Work
- For efficiency, we propose a new clustered graph index structure as a summary of the original RDF data and support top-k formal query construction on it.
- For effectiveness, we design well-performing ranking schemes.
- Additionally, we leverage knowledge from Wikipedia to enrich and disambiguate the keyword queries.
- Future work
  - Query capability extension
  - Clustering method

Backup Slides
- Clustered graph structure
  - Corresponds to the summary of the original ontology
  - Considered as a "reduced data space"
  - Used when computing top-k queries
- Query ranking
  - Query length
  - Relevance of ontology elements to the query
  - Importance of ontology elements
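The mapping and clustering rules that produce the clustered RACK graph can be illustrated with a small Python sketch. It is not the authors' implementation. It assumes that every instance carries exactly one concept label and that instances sharing a label collapse into a single C-Node; the function name `cluster_rack_graph` and its three input shapes are hypothetical.

```python
from collections import defaultdict


def cluster_rack_graph(types, relations, attributes):
    """Summarise instance-level triples into a clustered, RACK-style graph.

    types:      dict mapping instance id -> concept label (assumed: one per instance)
    relations:  iterable of (subject instance, relation name, object instance)
    attributes: iterable of (subject instance, attribute name, literal value)
    """
    # Assumption: instances with the same concept label share one C-Node.
    c_nodes = set(types.values())

    # R-Edges with the same label between the same pair of C-Nodes are merged;
    # the counter records how many original relation triples each one summarises.
    r_edges = defaultdict(int)
    for subject, relation, obj in relations:
        r_edges[(types[subject], relation, types[obj])] += 1

    # A-Edges with the same label on the same C-Node are merged, and the K-Nodes
    # hanging off one A-Edge become a single K-Node that keeps all their labels.
    k_nodes = defaultdict(set)
    for subject, attribute, literal in attributes:
        k_nodes[(types[subject], attribute)].add(literal)

    return c_nodes, dict(r_edges), dict(k_nodes)
```

Called with two specifications that each have one author, the sketch returns two C-Nodes, one merged `hasAuthor` R-Edge with count 2, and one K-Node per (concept, attribute) pair, which is the sense in which the clustered graph is a summary of the original data.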
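The round-robin exploration in the Query Construction Algorithm slide can be approximated by the simplified Python sketch below. It keeps one priority queue (thread) per keyword, pops the cheapest cursor from each thread in turn, and stops once no thread can improve on the k-th best answer. Several simplifications are assumptions rather than part of the slides: cursors carry only a cost and a node, C-Expansion is reduced to pushing the neighbours of the popped node, an answer is identified by its root node alone, and the names `top_k_roots`, `graph`, and `keyword_hits` are hypothetical.

```python
import heapq
from itertools import count


def top_k_roots(graph, keyword_hits, k=1):
    """Return up to k (total_cost, root) pairs, cheapest first.

    graph:        dict node -> list of (neighbour, edge_cost); list both directions
                  for an undirected graph
    keyword_hits: one dict per keyword, mapping a matched node to its starting cost
    """
    n = len(keyword_hits)
    tie = count()  # tie-breaker so heap entries never compare nodes directly
    threads = []
    for hits in keyword_hits:
        heap = [(cost, next(tie), node) for node, cost in hits.items()]
        heapq.heapify(heap)
        threads.append(heap)

    best = [dict() for _ in range(n)]  # best[i][node]: cheapest cost from keyword i
    answers = {}                       # root -> summed cost over all keywords
    t_prune = float("inf")             # pruning threshold: k-th best total so far
    j = -1
    while any(threads):
        j = (j + 1) % n                # round-robin over threads
        if not threads[j]:
            continue
        cost, _, node = heapq.heappop(threads[j])
        if node in best[j]:
            continue                   # stale cursor: node was reached more cheaply
        best[j][node] = cost
        if all(node in reached for reached in best):
            answers[node] = sum(reached[node] for reached in best)
            if len(answers) >= k:
                t_prune = sorted(answers.values())[k - 1]
        for neighbour, edge_cost in graph.get(node, []):
            if neighbour not in best[j]:
                heapq.heappush(threads[j], (cost + edge_cost, next(tie), neighbour))
        # Top-k termination: no remaining cursor can beat the k-th best answer.
        if all(not heap or heap[0][0] > t_prune for heap in threads):
            break
    return sorted((total, root) for root, total in answers.items())[:k]
```

Because each thread pops its cursors in cost order, the first time a thread reaches a node is also the cheapest, so a root's total is the sum of its cheapest per-keyword costs. Ties between roots are broken by node name in this sketch.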
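The three ranking schemes from the Query Ranking slide differ only in what each path element contributes, which the following Python sketch makes explicit. It rests on assumptions: a path is represented as a list of element identifiers, the per-path relevance values are positive, and the argument of the logarithm is the fraction of original elements merged into a clustered node or edge. All function and parameter names are hypothetical.

```python
import math


def element_cost(cluster_size, total):
    """Cost 2 - log2(cluster_size / total + 1): larger clusters are cheaper.

    Assumption: cluster_size counts the original nodes (or edges) merged into
    this element, and total counts all original nodes (or edges).
    """
    return 2 - math.log2(cluster_size / total + 1)


def rank_path_only(paths):
    """R1: total number of elements over all paths."""
    return sum(len(path) for path in paths)


def rank_with_relevance(paths, relevance):
    """R2: each path length is divided by its matching relevance."""
    return sum(len(path) / d for path, d in zip(paths, relevance))


def rank_with_importance(paths, relevance, costs):
    """R3: like R2, but each element contributes its own cost instead of 1."""
    return sum(
        sum(costs[element] for element in path) / d
        for path, d in zip(paths, relevance)
    )
```

With every element cost equal to 1, `rank_with_importance` reduces to `rank_with_relevance`, and with every relevance equal to 1 that in turn reduces to `rank_path_only`, so each scheme extends the previous one.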
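The Target Query Position metric from the Effectiveness Evaluation slide is simple enough to compute directly. The Python sketch below assumes that positions are 1-based and that a target ranked below the tenth position, or missing from the list, scores 0; the slide text gives only the formula, so that clamping is an assumption, as are the function and parameter names.

```python
def target_query_position(ranked_queries, target, cutoff=10):
    """TQP = (cutoff + 1) - position of the target query, with cutoff 10 by default.

    Assumption: the score is clamped to 0 when the target is ranked below the
    cutoff or does not appear in the ranking at all.
    """
    try:
        position = ranked_queries.index(target) + 1  # 1-based rank
    except ValueError:
        return 0
    return max(0, cutoff + 1 - position)
```

Under these assumptions a target ranked first scores 10 and one ranked tenth scores 1, so higher TQP values indicate a better ranking scheme.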