Distributed Systems CS 757
Popular in Course
Popular in ComputerScienence
verified elite notetaker
This 49 page Class Notes was uploaded by Abe Jones on Saturday September 12, 2015. The Class Notes belongs to CS 757 at West Virginia University taught by Goseva-Popstojanova in Fall. Since its upload, it has received 16 views. For similar materials see /class/202748/cs-757-west-virginia-university in ComputerScienence at West Virginia University.
Reviews for Distributed Systems
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 09/12/15
Distributing Computations on PeertoPeer Networks By Mayra A Sacanamboy Outline 0 Motivation 0 Unstructured Peer to Peer with Random Walks and RS sets o Pastry Overlay and JVM o Balanced Tree overlay Layer Architecture 0 Chord overlay and Markets model 0 Conclusion Motivation 0 Large fraction of PCs are unused for a large fraction of time 0 Increasing computational power 0 Exploitation for parallel processing 0 Peer to peer architectures can take profit of idle CPU cycles Unstructured Peer to Peer with Random Walks and RS sets Unstructured PeertoPeer with Random Walks and RS sets 0 Random Walk is a Markov chain defined over a state space and a given state transition matrix 0 RS set is a communication fabric where each node contains 1NlogN pointers to random selected peers 0 Jobs are identified by unique identifiers instead of network addresses Job Allocation o A node joins the P2P network and creates its RSset asynchronously o A resource provider publishes its resource to its RSset o A node project owner looking for resources sends a query to its own RS set 0 Consumer can access the resource if and only if ProducerRSset n ConsumerRS set Job Allocation 0 Job allocation with redundancy o Replication at initiation o Multistep Replication o Computed results are routed to the node project owner using content based addressing rather than network node address Replication at Initiation NodeOWNER wants to submit a set of batches oftasks K with a replication factor r ach batch is identified by its associated metainformation job identifier and a unique name for i1 K do K is the total number of batches forjob J NodeBrandomWaIkNodeOWNER NodeOWNERsubmitBatchi NodeB r Parent NodeB while rgt0 do rr1 NodecrandomWakParent NodeBsubmitBatchi Nodec r NodecsendACK Parent if NodecMonitorLifeParentfai NodecabortTransaction end if end while if NodeBReceivedMessagesr1 NodeBsendActivationMessage NodesWithBatch end if end for Replication at Initiation o Drawback 0 High overheads when nodes fail during job submissions 0 Advantages o Reduced cost ofjob submission 0 Several jobs can be initiated concurrently MultiStep Replication p required processors hnetwork delay 02variance of the require time to complete a task Nodes n tasks p required mean time to complete a task odeOWNER wants to submit a set of batches of tasks K with a replication factor r1 ach batch is identified by its associated metainformation job identifier and a unique name or i1 K do K is the total number of batches forjob J NodeBrandomWakNodeOWNER NodeOWNERsubmitBatchi NodeB r end for IdeaITimehnNH Z yn K039 2K10gp NodeOW ERwaius IdeaITime NodeOWNERcollectcompetedTasks r392 while NodeOWNERIncompletedTasks gt 0 do NodeOWNERresubmitIncompletedTasks r nf NodeOWNERMissingTasks IdeaITimemaXrnfN 1p NodeOWNERwaits IdeaITime NodeOWNERcollectcompetedTasks rr1 end while 10 MultiStep Replication o Drawback 0 Overhead in bandwidth usage in the node owner by introducing jobs 0 Advantage o Achieves 100 ofjob completion 11 Job Monitoring and Aggregation n10 RSsetn1 RSsetn6 12 Pastry Overlay and JVM 13 Pastry Overlay o Represented by 128 bit randomly chosen nodeId Hash of IP or public key 0 NodeId is in base 2b b is a configuration parameter b typical value 2 or 4 o Evenly distributed nodeIds along the circular namespace 0 2128 1 space 0 Node state contains Leaf set 5MAL R LARJER 10233033 10239321 10233122 Leaf Set L 10233230 Routing table lII Routlng table R vii 14301233 104341203 10432102 4 393 302 l 423 2 3 1027171302 0 Big or e E 2 4 1 39 w Neighborhood set I source ROWStron et a 6 14 Pastry Leaf set o Serves as a fall back for routing table and contains o L2 numerically closest and larger nodeIds o L2 numerically closest and smaller nodIds 0 Size of L is typically 2b or 2 X 2b 0 Nodes in L are numerically close could be geographically diverse 15 Pastry Neighborhood set M 0 Contains the IP addresses and nodeIds of closest nodes according to proximity metric 0 Size of M is typically 2b or 2X2b 0 Not used in routing but instead for maintaining locality properties 16 Pastry Routing Table 0 Matrix of Logzb N rows and 2b 1 columns N is the number of nodes in the network 0 Entries in row n match the first n digits of current nodeId AND o Column number follows matched digits Format matched digits column number rest of ID o Logzb N populated on average 17 Pastry Node10233102 b 2 L 8 2 3 22301203 31203203 12230203 13021022 10323302 10031203 10132102 10200230 10211302 10230322 10231000 10233001 1022302 10232121 10233232 10233120 Source Rowstron et al 6 18 Pastry Routing o If message with key D is within range of leaf set forward to numerically closest leaf 0 Else lookup the routing table and forward to node that shares a longer prefix with D than current nodeId o If no such node exists forward to node that shares at least as many digits with D as current nodeId but numerically nearer than current nodeId 19 Job Allocation o All nodes are organized into Pastry Overlay o Periodically each node propagates its resource availability and characteristics using the routing table 0 Each node caches the previous info for its matchmaking between jobs amp resources 20 Job Allocation Ns submitted a jobj to Ne Ne s VM amp dynamic compiler gt instrumentation Ns queries the reporting module If Ns finds thatj is making progressgt Ns issues a credit to Ne resource info Matchmaking node 3c i39 92 Network 139 1 is TS f 39nfov 3quot rquot 39v remote job probing resource info annou ments lnstru merited code modi ed JVM with progress probe cred generator Submission Execution node Ns Ne Source Butt et al 3 Neighbor V set quot maintenance 39 matchmaking re sour2 21 Compensation via Credits o A distributed feedback database is built on top of Pastry 0 Any node can access this information and decide whether to allow dealings with a 39 requesting node Source Butt et al 3 22 Balanced Tree Overlay 23 Architecture o Nodes are located in the tree by their IP address 0 Every node has between m and 2m siblings o Fault tolerance every node has the addresses of k predecessors and k successors at the same level 24 Architecture fl ii u 39 E vAdjacent1i1 lt l Parent node B Child node 3 Brother of Cousin Node 25 Architecture o A father node has a pair of values that represent the interval of descendents addresses 0 Framework organized in three layers 0 Connectivity Protocol 0 Availability Protocol 0 Discovery Protocol 26 Connectivity Protocol o Maintains the overlay links in the network 0 States how the tree is kept balanced o How nodes join and leave the network 0 And finally how node failures are dealt with to rebuild the structure 27 Joining the network o A node requests an insertion which is routed up looking for a node whose interval has the address of the new node 0 Then it goes down until it reaches the node with the nearest address to the new node s address 0 Finally the nodes become brothers the new node updates its references list 28 Joining the network 0 Father node is notified and has to check if the number of its child nodes are between m and 2m 29 Leaving the network 0 Leaving node with children 0 Selects a leaf node as a new father that will not violate the balance property 0 Leaving node without children 0 Notifies its siblings and its father that it is going to leave so they will update their reference lists 0 The leaving node s father has to check the balance property 30 Availability Protocol 0 states how to maintain branch information without flooding the network a maximum computing power and 0 minimum number of hops with free nodes 0 Policies for deciding availability values 0 Optimistic o Conservative 31 Discovery Protocol o Tries to allocate the best free nodes 0 With high computing power o Are physically closed to the requester node 0 A node nW receives a message with t pending tasks 0 if nW is able to accept a new task nW will take it from the message 32 Discovery Protocol 0 Otherwise nW distributes tasks between its child nodes 0 of free nodes in each branch o Gives priorities to branches with high computing power or less hops o If children branches could not handle all the tasks 0 New message with the remaining tasks is sent to the nW s father 33 Discovery Protocol o If the new message reaches the root and cannot be sent to another branch that message is sent to the originating node meaning that there are no free nodes left 34 Chord Overlay and Markets Model 35 Founda on 0 Computing resources located in a chord overlay network 0 A market is a place for trade homogenous amounts of computing resources cycles disk space etc o A market owner MO is matchmaker between sellers Producer nodes and buyers Job owners 36 Founda on o Nodes with different but close enough idle processing capacities trade in the same market 0 A single physical node can be a MO of various markets 0 Estimation of future available computing power in the next Ttime units is done by each node based on current and past loads 37 Job Allocation 1 User submits task to the service layer 2 Service layer looks up the markets with the required resources 3 MO is queried about available sellers 4 MO finds the best sellers The IP address of a selected seller is sent to the service layer 6 The service layer submits the task to the seller 7 Seller receives com ensation after finis ing the task 8 The user receives the results Selecls an appropriate VFgt v Market uwner nodes lo 7 quot39 do malchmaking KI l r 39339 balween sellers and 9 L buyers I 1 1 I la xv N V a Service fl Layer l l l l 1 4 1 l Submig l i Pmlowl lorcleating I C e39 2 resource markets request to Iha I 3 l I servioelayer x4quot I39 f quoter l39 wl x t 39 O l39 V W T K Source Gupta R et al 5 38 Architecture 0 Creation and lookup for markets 0 Single overlay scheme o Processor overlay scheme 0 Resource Pricing o FiXEd 0 variable 39 Single Overlay Scheme o The estimated value of computing power by a seller acts as the Chord ID 0 Compute power markets are searched using the Chord lookup protocol 0 If computing power values are in a very narrow range then most of the markets would map few distinct physical nodes 40 Processor Overlay Scheme 0 An additional overlay keeps the information about computing power 0 The ID space is bounded by 2C 1 c represents max value of idle CPU cyclessec 0 Each CPU Market ID CMID is represented by a different node in the processor overlay 41 Resource Pricing Fixed o A market owner charges the same price to the sellers in that market 0 the sellers should pay the same listing price even if they earn different profits from the market where they are in o Marginal cost is the price of providing a computing resource and it is charge by sellers 42 Resource Pricing Fixed 0 Reverse Vickrey auction 0 Buyer selects the seller with the least marginal cost o Buyer has to pay the second lowest marginal cost listed in the market 0 This strategy provides 0 nonzero profit to the selected seller o ensure that sellers state their correct marginal cost to the MO 43 Resource Pricing Variable o A market owner charges a percentage of the selling price to the sellers in that market 0 Max min payoff strategy o PayOffMOHighestMarginalCost LowestMarginalCost HighestMarginalCost2 o PayoffsenerMarginalCost 1 its marginal cost is the lowest marginal cost in the network 44 Conclusions 0 Currently frameworks have bound lookup operations to log N with the aim of being highly scalable Unstructured Structured Random Pastry amp Balanced Chord amp Walks amp RS JVM Tree overlay Markets sets Model Node W log N 2m Log N state lookup log N log N IogmN Log N 45 Conclusions2 Some frameworks have implemented 0 Crossvalidation o Validity of results trough a voting scheme however it will not work well if the scheme is corrupted by malicious nodes 0 Monitoring job progress o Trustworthiness in the progress of tasks nevertheless adding some overhead 0 Topologyaware routing o Minimizes the distance that messages have to travel in the underlying network o Provides a fast and easy insertion method for joining nodes in the system 46 Conclusions 3 o faulttolerant mechanisms o Checkpointing o Redundant computations o Scalability problems o No mechanism to limit the number of nodes in market or to balance load between markets is presented in Chord overlay and Markets Model o High network maintenance traffic in the balance tree overlay since the tree has to be rebalanced if the numbers of siblings per node are not between m and 2m 47 References 1 Androutsellis Theotokis S and Spinellis D A Survey of Peer to Peer Content Distribution Technologiesquot ACM Computing Surveys Vol 36 No 4 2004 Awan A Ferreira R Jagannathan S and Grama A Unstructured Peer to Peer Networks for Sharing Processor Cyclesquot Parallel Computing Elsevier Science Vol 32 No 2 2006 Butt A Fang X Hu Y and Midkiff S Peer to Peer and Accountability Building Blocks for Distributed Cycle sharing Proceedings of the 3rd Virtual Machine Research and Technology Symposium VM 04 San Jose CA 2004 48 References 4 Celaya J and Arronategui U Scalable Architecture for Allocation of Idle CPUs in a P2P Networkquot Lecture Notes in Computer Science Springer Verlag 2006 5 Gupta R Sekhri V and Somani A CompuP2P An Architecture for Internet Computing Using Peer to Peer Networksquot IEEE Transactions on Parallel and Distributed Systems Vol 17 No 11 November 2006 6 Rowstron A amp Druschel P 2001 Pastry Scalable distributed object location and routing for large scale peer to peer systemsquot in IFIPACM International Conference on Distributed Systems Platforms Middleware 2001 pp 329 350 49
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'