Rethinking the Service Model: Scaling Ethernet to a Million Nodes

Andy Myers†, T. S. Eugene Ng‡, Hui Zhang†
†Carnegie Mellon University

ABSTRACT

Ethernet has been a cornerstone networking technology for over 30 years. During this time, Ethernet has been extended from a shared-channel broadcast network to include support for sophisticated packet switching. Its plug-and-play setup, easy management, and self-configuration capability are the keys that make it compelling for enterprise applications. Looking ahead at enterprise networking requirements in the coming years, we examine the relevance and feasibility of scaling Ethernet to one million end systems. Unfortunately, Ethernet technologies today have neither the scalability nor the reliability needed to achieve this goal. We take the position that the fundamental problem lies in Ethernet's outdated service model, which it inherited from the original broadcast network design. This paper presents arguments to support our position and proposes changing Ethernet's service model by eliminating broadcast and station location learning.

1 INTRODUCTION

Today, very large enterprise networks are often built using layer 3 (i.e., IP) technologies. However, Ethernet, being a layer 2 technology, has several advantages that make it a highly attractive alternative. First of all, Ethernet is truly plug-and-play and requires minimal management. In contrast, IP networking requires subnets to be created, routers to be configured, and address assignment to be managed, none of which are easy tasks in a large enterprise. Secondly, many layer 3 enterprise network services such as IPX and AppleTalk persist in the enterprise, coexisting with IP. An enterprise-wide Ethernet would greatly simplify the operation of these layer 3 services. Thirdly, Ethernet equipment is extremely cost effective. For example, a recent price quote revealed that Cisco's 10 Gbps Ethernet line card sells for one third as much as Cisco's 2.5 Gbps Resilient Packet Ring (RPR) line card and contains twice as many ports.
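The quoted prices are relative, so the per-port gap can be checked with a quick back-of-the-envelope calculation. The sketch below normalizes the RPR card's price and port count to 1; the absolute figures are hypothetical, since the text gives only ratios.

```python
# Back-of-the-envelope comparison using only the ratios quoted above.
# Absolute prices and port counts are unknown; normalize the RPR line
# card to price 1.0 and 1 port (hypothetical baseline, not real figures).
rpr_price, rpr_ports, rpr_gbps = 1.0, 1, 2.5
eth_price, eth_ports, eth_gbps = 1.0 / 3, 2, 10.0   # one third the price, twice the ports

rpr_cost_per_port = rpr_price / rpr_ports
eth_cost_per_port = eth_price / eth_ports

print(rpr_cost_per_port / eth_cost_per_port)   # Ethernet is roughly 6x cheaper per port
print(eth_gbps / rpr_gbps)                     # at 4x the line rate
```

Under these ratios, the Ethernet card works out to about one sixth the cost per port while carrying four times the bandwidth, which is the sense in which the text calls Ethernet "extremely cost effective."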
Finally, Ethernets are already ubiquitous in the enterprise environment. Growing existing Ethernets into a multi-site enterprise network is a more natural path than alternatives such as building a layer 3 IP VPN. Already, many service providers [1, 2] are offering metropolitan-area and wide-area layer 2 Ethernet VPN connectivity to support this emerging business need. Looking ahead at enterprise networking requirements in the coming years, we ask whether Ethernet can be scaled to one million end systems.

[This research was sponsored by the NSF under ITR Awards ANI-0085920 and ANI-0331653 and by the Texas Advanced Research Program under grant No. 00360400782003. Views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of NSF, the state of Texas, or the US government.]
‡Rice University

Ethernet's non-hierarchical layer 2 MAC addressing is often blamed as its scalability bottleneck, because its flat addressing scheme makes aggregation in forwarding tables essentially impossible. While this may have been the case in the past, improvements in silicon technologies have removed flat addressing as the main obstacle: bridges that can handle more than 500,000 entries already ship today [3].

We argue that the fundamental problem limiting Ethernet's scale is in fact its outdated service model. To make our position clear, it is helpful to briefly review the history of Ethernet. Ethernet as invented in 1973 was a shared-channel broadcast network technology. The service model was therefore extremely simple: hosts could be attached and reattached at any location on the network, no manual configuration was required, and any host could reach all other hosts on the network with a single broadcast message. Over the years, as Ethernet has been almost completely transformed, this service model has remained remarkably unchanged. It is the need to support broadcast as a first-class service in today's
switched environment that plagues Ethernet's scalability and reliability.

The broadcast service is essential in the Ethernet service model. Since end system locations are not explicitly known in this service model, in normal communication, packets addressed to a destination system that has not spoken must be sent via broadcast, or "flooded," throughout the network in order to reach the destination system. This is the normal behavior of a shared-channel network but is extremely dangerous in a switched network: any forwarding loop in the network can create an exponentially increasing number of duplicate packets from a single broadcast packet.

Implementing the broadcast service model requires that the forwarding topology always be loop-free. This is implemented by the Rapid Spanning Tree Protocol (RSTP) [4], which computes a spanning tree forwarding topology to ensure loop freedom. Unfortunately, based on our analysis (see Section 2), RSTP is not scalable and cannot recover from bridge failure quickly. Note that ensuring loop freedom has also been a primary concern in much research aimed at improving Ethernet's scalability [5, 6, 7]. Ultimately, ensuring that a network always remains loop-free is a hard problem.

The manner in which the broadcast service is being used by higher layer protocols and applications makes the problem even worse. Today, many protocols, such as ARP [8] and DHCP [9], liberally use the broadcast service as a discovery or bootstrapping mechanism. For instance, in ARP, to map an IP address onto an Ethernet MAC address, a query message is broadcast throughout the network in order to reach the end system with the IP address of interest. While this approach is simple and highly convenient, flooding the entire network when the network has one million end systems is clearly unscalable.

In summary, the need to support broadcast as a first-class service plagues Ethernet's scalability and reliability. Moreover, giving end systems the capability to actively flood the entire network invites unscalable protocol
designs and is highly questionable from a security perspective. To completely address these problems, we believe the right solution is to eliminate the broadcast service, enabling the introduction of a new control plane that is more scalable and resilient and can support new services such as traffic engineering.

In the next section, we expose the scalability and reliability problems of today's Ethernet. In Section 3, we propose eliminating the broadcast service and discuss changes to the control plane that make this possible. We discuss related work in Section 4 and conclude in Section 5.

2 PROBLEMS WITH TODAY'S ETHERNET

In this section, we present evidence that Ethernet today is neither scalable enough to support one million end systems nor fault-resilient enough for mission-critical applications. The problems we discuss here all arise as a result of the broadcast service model supported by Ethernet. To the best of our knowledge, this is also the first study to evaluate the behavior and performance of RSTP. Our results strongly contradict popular beliefs about RSTP's benefits.

2.1 Poor RSTP Convergence

In order to safely support the broadcast service in a switched Ethernet, a loop-free spanning tree forwarding topology is computed by a distributed protocol, and all data packets are forwarded along this topology. The speed at which a new forwarding topology can be computed after a network component failure determines the availability of the network. The Rapid Spanning Tree Protocol (RSTP) [4] is a change to the original Ethernet Spanning Tree Protocol (STP) [10], introduced to decrease the amount of time required to react to a link or bridge failure. Where STP would take 30 to 50 seconds to repair a topology, RSTP is expected to take roughly three times the worst case delay across the network [11].

We now provide a simplified description of RSTP which suffices for the purpose of our discussion. RSTP computes a spanning tree using distance vector-style advertisements of cost to the root bridge of
the tree. Each bridge sends its neighbors a BPDU (bridge protocol data unit) packet containing a priority vector: the root bridge's identifier and the path cost to the root bridge. Each bridge then looks at all the priority vectors it has received from neighbors and chooses the neighbor with the best priority vector as its path to the root. One priority vector is superior to another if it has a smaller root identifier or, if the two root identifiers are equal, if it has a smaller root path cost.

The port on a bridge that is on the path toward the root bridge is called the root port. A port on a bridge that is connected to a bridge that is further from the root is called a designated port. Note that each non-root bridge has just one root port but can have any number of designated ports; the root bridge has no root port. To eliminate loops from the forwarding topology, if a bridge has multiple root port candidates, it drops data traffic on all candidates but the root port. Ports that drop data traffic are said to be in the blocking state.

We have built a simulator for RSTP and have evaluated its behavior on ring and mesh topologies varying in size from 4 to 20 nodes. We have found that, contrary to expected behavior, in some circumstances RSTP requires multiple seconds to converge on a new spanning tree. Slow convergence happens most often when a root bridge fails, but it can also be triggered when a link fails. In this section, we explore two significant causes of delayed convergence: count to infinity and port role negotiation problems.

2.1.1 Count to Infinity

Figure 1 shows the convergence time for a fully connected mesh when the root bridge crashes. We define the convergence time to be the time from the crash until all bridges agree on a new spanning tree topology. Even the quickest convergence time (5 seconds) is far longer than the expected convergence time on such a topology (less than 1 ms). The problem is that RSTP frequently exhibits count-to-infinity behavior if the root bridge should
crash and the remaining topology has a cycle. When the root bridge crashes and the remaining topology includes a cycle, old BPDUs for the crashed bridge can persist in the network, racing around the cycle. During this period, the spanning tree topology also includes the cycle, so data traffic can persist in the network, traversing the cycle continuously. The loop terminates when the old root's BPDU's MessageAge reaches MaxAge, which happens after the BPDU has traversed MaxAge hops in the network.

Note that the wide variation in convergence time for a given mesh size is indicative of RSTP's sensitivity to the synchronization between different bridges' internal clocks. In our simulations, we varied the offset of each bridge's clock, which leads to the wide range of values for each topology.

Figure 3 shows a typical occurrence of count to infinity in a four-bridge topology. The topology is fully connected, and each bridge's priority has been set up so that bridge 1 is the first choice as the root bridge and bridge 2 is the second choice. At time t1, bridge 1 crashes. Bridge 2 will then elect itself root because it knows of no other bridge with superior priority. Simultaneously, bridges 3 and 4 both have cached information saying that bridge 2 has a path to the root with cost 20, so both adopt their links to bridge 2 as their root ports. Note that bridges 3 and 4 both still believe that bridge 1 is root, and each sends BPDUs announcing that it has a cost 40 path to bridge 1. At t2, bridge 4 sees bridge 2's BPDU announcing that bridge 2 is the root. Bridge 4 switches its root port to its link to bridge 3, which it still believes has a path to bridge 1. Through the rest of the time steps, BPDUs for bridges 1 and 2 chase each other around the cycle in the topology. RSTP attempts to maintain least cost paths to the root bridge, but unfortunately that mechanism breaks down when a bridge crashes. The result is neither quick convergence nor a loop-free forwarding topology.

2.1.2 Hop-by-Hop Negotiation

Figure 2 shows RSTP's convergence times on a ring topol
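The priority vector ordering described above (smaller root identifier wins; on a tie, smaller root path cost wins) can be sketched as follows. This is a minimal illustration using only the two fields discussed in the text; the real 802.1w priority vector carries additional tie-breaker fields, and the port names and costs here are made up for the example.

```python
# Minimal sketch of RSTP-style priority vector comparison and root port
# selection (assumption: only root identifier and root path cost, the two
# fields discussed in the text; the full protocol uses more tie-breakers).
from dataclasses import dataclass

@dataclass(frozen=True)
class PriorityVector:
    root_id: int         # identifier of the bridge believed to be the root
    root_path_cost: int  # advertised cost of the path to that root

    def is_superior_to(self, other: "PriorityVector") -> bool:
        # Smaller root identifier wins; on a tie, smaller path cost wins.
        return (self.root_id, self.root_path_cost) < (other.root_id, other.root_path_cost)

def best_neighbor(vectors: dict[str, PriorityVector]) -> str:
    """Pick the port whose neighbor advertised the best priority vector,
    i.e., the port the bridge would choose as its root port."""
    return min(vectors, key=lambda p: (vectors[p].root_id, vectors[p].root_path_cost))

# Hypothetical bridge with three ports: two paths toward root bridge 1
# and one neighbor that (wrongly) claims bridge 2 is the root. A cheap
# path to a worse root still loses, because root identity is compared first.
advertised = {
    "port_a": PriorityVector(root_id=1, root_path_cost=10),
    "port_b": PriorityVector(root_id=1, root_path_cost=40),
    "port_c": PriorityVector(root_id=2, root_path_cost=0),
}
print(best_neighbor(advertised))  # port_a
```

The count-to-infinity episode in Figure 3 arises precisely because this comparison trusts whatever root identifier a stale BPDU carries: after bridge 1 crashes, cached vectors naming bridge 1 as root remain superior to bridge 2's honest advertisements until they age out at MaxAge.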