by: Santos Fadel



These 66 pages of class notes were uploaded by Santos Fadel on Monday, October 19, 2015. The notes belong to CSCI 6965 (Computer Science) at Rensselaer Polytechnic Institute, taught by Staff in Fall. Since upload, they have received 20 views.

Date Created: 10/19/15
IJCAI Review
James Michaelis
CSCI 6965 Adv Semantic Technologies, Class 10, 3/31/2009

Querying and Managing Provenance through User Views in Scientific Workflows

What is a user view?
[Figure: an original workflow and a user view generated from it]
- A user-oriented level of component granularity in a workflow
- Users select modules from an original workflow (above), and in turn user views are generated which highlight these modules

Outline
- Goals of work: constructing user views of workflows; provenance querying based on user views
- Contributions related to user views: formal definition; generation algorithm
- ZOOM: system for storing and visualizing provenance
- Performance evaluation

Formal Definitions
- Workflow specification: directed graph $G_W(N, E)$ with input ($I$) and output ($O$) nodes indicating beginning and end
  - Every node of $G_W$ must be on some path from input to output
- User view $U$: a partition of the $G_W$ nodes $N$ (excluding input and output) into a set $\{M_1, \dots, M_n\}$ such that
  - $M_i \subseteq N$
  - $M_i$ and $M_j$ are disjoint for $i \neq j$
  - $M_1 \cup M_2 \cup \dots \cup M_n = N$
- In this definition, each $M_i$ is known as a composite module

Constructing User Views
- Authors propose an algorithm called RelevUserViewBuilder that takes as input a workflow specification and a set of relevant modules, and produces as output a user view
- Algorithm adheres to three properties for defining "good" user views

Properties of a Good User View
1. Well-formed: given a workflow $G_W$ and a set of relevant nodes $R \subseteq N$, every module in the user view $U$ contains at most one element of $R$
2. Preserves dataflow: holds for a user view $U$ iff every edge in $G_W$ that induces an edge on an nr-path from $C(r)$ to $C(r')$ in $U(G_W)$ lies on an nr-path from $r$ to $r'$ in $G_W$, where $r$ and $r'$ are nodes in $R \cup \{\text{input}, \text{output}\}$
3. Complete: holds for a user view $U$ w.r.t. dataflow iff for every edge $e$ on an nr-path from $r$ to $r'$ in $G_W$ that induces an edge $e'$ in $U(G_W)$, $e'$ lies on an nr-path from $C(r)$ to $C(r')$. Here $r$, $r'$ are nodes in $R \cup \{\text{input}, \text{output}\}$

RelevUserViewBuilder Algorithm
[Algorithm listing: not recoverable from the source]

Notes about RelevUserViewBuilder
- Preserves Properties 1-3 and produces a minimal user view
  - No two $M_i$, $M_j$ can be removed from $U$ and replaced by $M_i \cup M_j$ to yield a solution that preserves Properties 1-3
- Running time: $O(N^2 E)$ as extracted (operators between the terms may have been lost)
- The proof can be found in "Querying and managing provenance through user views in scientific workflow systems", U of Penn TR MS-CIS-07-13, 2007
- Does not guarantee a minimum solution (one with the smallest number of components)
  - Finding a polynomial-time algorithm for this remains an open research problem

ZOOM Architecture
[Figure: ZOOM architecture diagram; labels not recoverable from the source]

Evaluating RelevUserViewBuilder
- Ran algorithm on 1000 increasingly large randomized workflow specifications (100 to 2000 nodes)
- Scalability: upper bound of 80 ms
- Optimality: results showed that adding one relevant class in a workflow creates only one new composite class

Evaluation of Provenance Querying
- Key metric used: time required to construct user views and subsequently query the provenance warehouse
- Testing data
  - 30 workflows generated by real-world systems in the domain of biological science
  - Series of synthetic workflows generated based on patterns from the real workflows
  - Results of workflow runs

Workflow data
[Table: workflow data; contents not recoverable from the source]

Provenance Querying
- Key metric considered: size of query results returned
- Data used: Using 10 workflows
  in each of the 4 classes described in Table I, we created 30 runs of each kind in Table II (small, medium and large), generating 3600 runs in total

Provenance Querying
- Three varieties of user views considered
  - UAdmin: each step class is relevant (no composite modules)
  - UBio: constructed from relevant modules, based on case studies from biology experts, using RelevUserViewBuilder
  - UBlackBox: the entire workflow is in one composite class

Provenance Querying
[Fig. 10: Size of query result by workflow class, for UAdmin, UBio, and UBlackBox]

Interactivity
- Effect of view granularity on response time
  - Cost of switching user views while analyzing the provenance of a given data item
- Caveat: not clear what testing data was used
- Results
  - On average, takes 13 msec to compute the provenance for a different user view
  - Maximum computation time: 1 sec, for an execution in Class4 with 90% of relevant modules
- Visualization
  - On average it took 300 msec; maximum time was 2 sec

IJCAI Review
- Relevance: 6
- Significance: 7
- Technical Soundness: 8
- Novelty: 8
- Quality of Evaluation: 6
- Clarity: 6
- Overall: 7
- Review Confidence: 7

Relevance: 6
- Calculated as the average of general AI relevance (7) and Semantic Web relevance (5)
- General AI: interesting approach for generating abstracted workflows based on nodes of user interest
  - Could be very attractive to information retrieval groups
- Semantic Web: technologies not applied in this work, but could be in later versions

Significance: 7
- Authors present an algorithm (RelevUserViewBuilder) for generating concise representations of workflows based on user interest in certain workflow components
- This could have widespread implications for improving workflow visualization/querying systems
- However, quantitative metrics for gauging researcher input on the design of the system are missing

Technical Soundness: 8
- Could not find any significant flaws in the technical approach
- However, would have liked a look at the proof of correctness for the RelevUserViewBuilder algorithm

Novelty: 8
- Relatively unexplored approach for
  generating user-centric workflow presentations
- Interesting implications for improvement of provenance-based workflow querying

Quality of Evaluation: 6
- In general, felt the authors were on the right track
- However, some of their results struck me as unclear
  - For instance, their gauging of optimality in the evaluation of the RelevUserViewBuilder algorithm, and the data used in the interaction evaluation of the ZOOM architecture
- As a rule of thumb, use of data from alternate domains aside from biological sciences could have helped

Clarity: 6
- At points, presentation of findings unclear
- Apparent typos, omitting of information

Overall Paper Score: 7
- Average of 6 IJCAI metrics

Confidence of Reviewer: 7
- Fair amount of background with processing techniques for workflow data
- However, do have limited background in working with synthetic workflows, as well as large sets of workflows

IJCAI Review
James Michaelis
CSCI 6965 Adv Semantic Technologies, Class 7, 3/3/2009

Semantic Reasoning: A Path To New Possibilities of Personalization

Objective of Paper
- Present a generic method of improving recommender systems through reasoning over Semantic Web data
  - Content-based strategy: discover extra knowledge about the user's preferences
  - More accurate and flexible personalization processes
- Present results of a study comparing the effectiveness of the system implemented by the authors versus two machine-learning-based approaches

Motivation
- Syntactically based recommendation systems tend to provide overspecialized recommendations, which may not be of much value to users
- Collaborative filtering helps with this, but has issues with sparsity of applicable recommendations

Strategy
- Two key parts
  - Semantic associations
    - Trace semantic bonds between the user's preferences and the items available in the recommender system
    - Formalize these bonds in a domain ontology along with their semantic annotations
  - Spreading Activation techniques
    - Explore these semantic relationships and discover new knowledge related to the user's interests
- Domain for
  demonstration: Movie/TV

Reviewing Semantic Associations
- Domain knowledge is encoded in an ontology
- In turn, users can specify interest in certain movies using the parameter DegreesOfInterest (continuous variable ranging from -1 to 1)
- With user preferences entered, traversal through the domain ontology takes place
- Relevance Index computed through the following metrics
  - ρ-path association: applies for two programs linked by a chain or sequence of properties in the ontology
  - ρ-join association: two programs with respective attributes belonging to the same class in the domain ontology
  - ρ-cp association: two programs sharing a common ancestor in the genre hierarchy defined in the ontology

Semantic Associations
[Figure: example semantic associations; not recoverable from the source]

Spreading Activation
- Used to explore the generated network of user preferences
- Works as follows
  - Nodes of the network have an implicit relevance, named activation level
  - Each link joining two nodes has a given weight
    - The stronger the relationship between both nodes, the higher the assigned weight
  - Initially, a set of nodes is selected, and their activation levels are spread until reaching the nodes connected to them by links (named neighbor nodes)

Author Contributions
- Authors propose modifications to the Spreading Activation (SA) approach based on
  - The kind of links traditionally modeled in the SA network
    - SA network models both the properties defined in the ontology and the semantic associations inferred in the filtering phase
  - Their weighting process
    - Adaptive weighting based on changes in user preferences

Evaluation
- Authors implemented a prototype recommender system for testing purposes, including
  - An OWL ontology covering the domain of TV shows
    - Similar to the movie ontology the technology was developed around
  - User modeling
    technique based on ontology-profiles
- Goal of study was
  - To evaluate the accuracy of the authors' reasoning-based recommendations
  - To compare this approach with two existing machine learning techniques that are devoid of the semantic inference capabilities used by the authors

Evaluation
- 400 students asked to rate 400 TV programs on a continuous scale of [-1, 1]
- 40% of student evaluations used as training data for the two machine learning techniques
- For the remaining 60% of students, the following took place
  - 10 good and 10 bad TV shows randomly selected from each student evaluation to build a profile
  - In turn, these profiles were fed into the authors' recommendation system, as well as the other two systems

Results
[Figure: recall and precision for AssoSA (threshold 0.65), Rules, and SemCF]

Review of Paper
- Relevance to an AI audience: 7/10
  - Clear that the novel use of spreading activation techniques in the paper will be of interest to researchers in information retrieval, as well as applications of Semantic Web reasoning
  - However, not certain that the techniques will carry as much interest in AI as a general field
- Novelty: 7/10
  - Appears to present a novel use of Spreading Activation through combination with Semantic Web data
  - However, this strikes me as a fairly specialized contribution
    - Much work has been done to use Spreading Activation in web analysis in the past
- Significance of Results: 8/10
  - Interesting comparison done between the authors' system and two alternate systems
  - However, it is not particularly clear why these two were selected; a little more background on selection rationale would be helpful
  - In addition, expanding the pool of participants and/or domain data in a follow-up evaluation would be helpful (an intention the authors expressed as future work)
- Technical Soundness: 8/10
  - Specifics of some techniques, such as those used to compute significance of properties related to movies rated by users, omitted from paper due to length constraints
  - However, a thorough technical review of all the techniques used in this system is given in Blanco, Y. et al., "A flexible semantic inference methodology to reason about user preferences in knowledge-based recommender systems", Knowledge-Based Systems, in press
- Quality of Evaluation: 6/10
  - Two metrics used for rating effectiveness of three techniques
    - Recall: defined as the percent of returned programs that the user rated interesting
    - Precision: defined as percent of programs returned with positive user rating on the [-1, 1] scale
  - Not very well defined what distinguishes these metrics in the results
  - Expanded statistical analysis would be helpful
- Clarity: 7/10
  - Admittedly, problems emerged in part due to length constraints
  - At many points, technical specifications abstracted with reference to alternate publications
  - Discussion of metrics used in results, as well as their significance, could have been clearer
- Overall Score: 7/10
- Confidence in review: 5/10

Numeric Reasoning in the Semantic Web
Chimène Fankam, Stéphane Jean, Guy Pierra
Presented by Giovanni Thenstead

- Introduce new ontology database for reasoning and query processing
- Show alternative reasoning engine that utilizes various labeling techniques
- Explore a new database architecture that supports mixed ontology and labeling schemes, i.e., a meta-scheme
- Introduce new ontology-based query language

Background Knowledge
- Binary relation properties
  - Transitivity
  - Reflexivity
  - Symmetry
- Eager and lazy reasoning
  - Eager reasoning is used as an optimization technique for preprocessing deduced facts in order to speed up future query requests
  - Lazy reasoning is the opposite of eager reasoning, whereby deduction is performed on the fly for each query request
- Labeling schemes
  - Interval labeling
  - Geometric labeling

Background Knowledge
- Ontology-Based Database (OBDB), Type II
  - Class of databases that partitions (keeps separate) the ontology data tables from the instance data tables
- Partially ordered sets / tree order for subdivisioning
- Table schemes
  - Binary Table
  - Table Per Class

Numeric Reasoning
- Reasoning that uses number- or string-oriented labeling for qualified data instances in a Type III OBDB
- Problems with this approach
  - Current Type II OBDBs do not support numeric reasoning
  - However, new proposals are being made for Type III OBDBs
- These Type III OBDBs may support
  - Multiple ontology models (OWL, DAML+OIL, PLIB, etc.)
  - New SQL-like query language
  - New database management system (DBMS) that can automatically transform qualified (are they transitive, antisymmetric, and reflexive?) spatial or temporal data instances, via labeling, into an appropriate extended form
  - Will add new meta-schemes (stored as new DBMS tables) for ontologies and labels

Issues with the Semantic Web
- Scaling on large-size data
  - Applications need to manage an amount of ontology data that doesn't fit in memory
- Reasoning over large-size data
  - Oftentimes, OBDBs that are very scalable to real-world ontology and instance data are poor at reasoning, i.e., they have subpar response time when reasoning is being performed

Why Numeric Reasoning?
- Scalability over large-size data
  - Labeling shrinks the representation of instance data, making it more efficient to store in memory
- OntoDB
  - New Type III OBDB that extends current Type II OBDBs with new meta-scheming for handling multiple ontologies and labels
- Labeling
  - Eliminates the need for reasoning over deductive facts
- Can exploit eager reasoning for optimization
  - Because labeling is stable and requires less storage overhead, eager reasoning (preprocessing of instance data) can be used as an effective strategy for improving query processing

Database
- Databases are efficient at handling large-size data and also at performing numeric and string queries
- Eager reasoning can be expensive
  - It's better to represent data as numbers or strings instead of deduced facts, which can get unstable, i.e., data size varies

Popular approaches used for reasoning by current OBDBs
- Eager Reasoning
  - Deduced facts, such as transitive closures for all transitive relationships,
    can be derived and then materialized (stored) as, for example, database views (these are essentially virtual database tables created on the fly)
  - Drawback: requires extra storage and update overhead
- Lazy Reasoning
  - Deduction is done on the fly using virtual deduced facts
  - Done on various database mechanisms such as views, labeling schemes, or subtable relationships found in object-relational databases
  - Drawback: requires extra processing cost, but doesn't impose storage and update overheads
- Subsumption Reasoning
  - Data instances that establish a subclass-like relationship with respect to another, i.e., classOf/instanceOf relationships; this can be useful for carrying out transitive closures

OBDB Approaches
[Fig. 1: Type 1 OBDBs approach]
[Fig. 2: Type 2 OBDBs approach]

Subdivisions
- By the properties of transitivity and inclusion, subdivisions can be established based on the topographic hierarchy
- As such, a partial order on these subdivisions can be used to exploit their positional relationship to create some range, i.e., a unique interval under which a subdivision's domain falls with respect to its scope in a hierarchy

Numeric Reasoning over Partially Ordered Sets
[Fig. 5: Translation of a tree structure into two numeric values; the tree covers France, Île-de-France, Poitou-Charentes, Paris, Poitiers, and La Rochelle]

Numeric Reasoning over Spatial and Temporal Domains
- Example of geometric labeling using circles
[Figure: spatial domain, and temporal domain shown as a time line in years marked 1970, 1985, 2007]

Implementation
[Figure: labeling scheme tables; not recoverable from the source]
[Fig. 6: Query processing steps]
- Query processing on a Table Per Class backend can be 10 times as fast as its binary-represented counterpart, provided the queries are carried out with a class name specified

Implementation
[Figure: query rewriting; not recoverable from the source]
[Fig. 11: COG OntoDB example query]
- Queries that are not a part of OntoQL (the proposed Type III OBDB query language) are automatically converted to their OntoQL counterparts
  - This is important so that the correct ontologies in the OntoDB DBMS are queried given their namespace
[Fig. 7: COG OntoDB ontology part]
[Fig. 9: COG OntoDB the data part]

Questions
- Does the labeling scheme support more complex geographic features (i.e., with holes or irregular geometries) than what is provided? (Gregory Todd Williams)
  - While it may be possible to extend the current labeling scheme definitions via the label-schemes table to include other shapes, what you are asking does not appear to be achievable given the unique dynamic properties that would have to be considered
- How is subclassing supported? (Jesse Weaver)
  - As mentioned earlier, this can be achieved through subsumption reasoning, which utilizes transitive closures on derived data instances with antisymmetric, transitive, and reflexive properties
- How reliable is the automatic transformation process for non-OntoDB ontology and instance data? (Joshua Shinavier)
  - There is a default mechanism, so if no implementation entry (ontology scheme) exists for unknown instance data, then whatever tricks are available to the DBMS for default events may be applied

Application to Research
- This may apply to my research interest, Distributed Reasoning, in the following ways
  - In the case of interval labels, finding possible vertical partitioning points may be used to help distribute workload across multiple reasoners
  - May be interesting to distribute the underlying OntoDB DBMS so that, instead of
    storing the ontology and its respective instance data on one engine, they may be spread across multiple reasoning engines

Chimène Fankam, Stéphane Jean, Guy Pierra. "Numeric reasoning in the Semantic Web". SeMMA, 346 (2008): 84-103.

Questions?
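
The interval labeling idea from the numeric reasoning presentation (translating a tree into two numeric values per node, so that a subsumption check becomes a numeric comparison) can be sketched as below. This is a minimal illustration under stated assumptions, not OntoDB's implementation: the `Node` class and the `label` and `is_within` helpers are hypothetical names, and numbering nodes with a single depth-first counter is one common way to build such intervals, which may differ from the scheme in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A subdivision in a tree-shaped hierarchy (hypothetical helper type)."""
    name: str
    children: list = field(default_factory=list)
    low: int = 0   # interval start, assigned by label()
    high: int = 0  # interval end, assigned by label()


def label(node, counter=1):
    """Assign [low, high] to every node in one depth-first pass.

    Returns the next unused counter value. A child's interval always
    nests strictly inside its parent's interval.
    """
    node.low = counter
    counter += 1
    for child in node.children:
        counter = label(child, counter)
    node.high = counter
    return counter + 1


def is_within(a, b):
    """True if a is b or a descendant of b, using two comparisons only."""
    return b.low <= a.low and a.high <= b.high


# Example hierarchy, using the place names from the slides' tree figure.
paris = Node("Paris")
poitiers = Node("Poitiers")
la_rochelle = Node("La Rochelle")
ile_de_france = Node("Ile-de-France", [paris])
poitou_charentes = Node("Poitou-Charentes", [poitiers, la_rochelle])
france = Node("France", [ile_de_france, poitou_charentes])

label(france)

print(is_within(paris, france))            # True
print(is_within(paris, poitou_charentes))  # False
```

Because the labels are computed once up front, this corresponds to the eager style described in the notes: the transitive "is located in" relation is never materialized as deduced facts, and each query reduces to comparing two pairs of numbers, which a relational database can evaluate with ordinary range predicates.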

