Advanced Networking CS 557
This 148-page set of class notes was uploaded by Betty Kertzmann on Tuesday, September 22, 2015. The notes are for CS 557 at Colorado State University, taught by Daniel Massey in the fall. Since its upload, it has received 26 views.
Date Created: 09/22/15
CS 557, Lecture 29: Supporting Real-Time Applications in an Integrated Services Packet Network: Architecture and Mechanism (D. Clark, S. Shenker, and L. Zhang, 1992). Fall 2009.

The Story So Far
- Essential apps: email, web, phone; DNS (naming) and NTP (time)
- Transport layer: end-to-end communication, multiplexing, reliability, congestion control, flow control
- Network layer: addressing, fragmentation, dynamic routing, best-effort forwarding
- Data/link layer: Ethernet, PPP, CSMA; async, SONET; copper, fiber, radio -- a richly connected network, many paths built from many types of unreliable links

Main Points
- Objective: design an architecture to support both real-time traffic and traditional data packets
- Approach: characterize traffic demands and network requirements; provide isolation between classes and sharing within classes; manage jitter in a hierarchical architecture
- Contributions: a model for how one might support real-time traffic

What Causes Delay in a Network (review from undergrad networking)
- Propagation delay
- Store-and-forward delay
- Queuing delay

Real-Time Traffic Is Sensitive to Delay
- Concern over the delay between packets (jitter)
- If we play data immediately, jitter is a problem: a jumping picture, unfamiliar pauses and bursts in speech

Real-Time Traffic: Playback Model for Applications
- The application buffers some data and replays it after some playback point
- Data is needed before the playback point; data arriving after the playback point is useless

Managing Jitter
- Jitter: delay between packets
- The model above tolerates jitter that is lower than the playback point
- Jitter larger than the playback point causes data loss

Characterizing Traffic
- Rigid vs. adaptive playback points
  - Rigid: assumes an a priori bound on delay and sets a fixed playback point
  - Adaptive: estimates the bound post facto and adapts the playback point based on past behavior
- Tolerant vs. intolerant applications
  - Intolerant applications behave extremely badly when jitter exceeds the playback point
  - Tolerant applications see a drop in service when jitter exceeds the playback point, but can make do with some temporary bad periods
- Natural combinations of the above: intolerant rigid applications, and adaptive tolerant applications

Service Commitments
- Guaranteed service
  - The application provides its traffic commitment; the network offers a commitment of throughput and delay
  - The network commitment is guaranteed unless hardware fails in the network or the application violates its commitment
- Predicted service
  - If the past is a guide to the future, then the network will meet its commitment
  - There is no guarantee that past behavior predicts the future; thus it is not suited for intolerant, rigid applications
  - But it can be far more efficient for adaptive applications

Guaranteed Service
- Appropriate (necessary) for rigid, intolerant apps
- Requires isolation between flows: the network guarantees the commitment will be met regardless of other traffic's behavior
- Solvable using variations of fair queuing (recall last lecture's system), adding a weighting system to balance demands rather than enforce fairness
- Requires resources based on worst-case behavior: the application likely will not always run at peak rate, but the network guarantees the app can run at peak rate whenever it wants; set a priori bounds based on worst-case analysis

Predicted Service with Weighted Fair Queuing
- An application burst increases only that app's queue: a large increase in jitter for that application, no increase in jitter for other applications
- This achieves isolation of flows -- but is that always desirable?

But Suppose Applications Are Adaptive
- We could spread the extra load across all applications: no single application receives a high penalty, and all applications adapt a small amount
- Works well unless everyone bursts simultaneously; but if the past predicts the future, not all apps will change at once
- Best solution: spread the load out evenly

Some Results
[Figure: simulation results comparing the scheduling algorithms]
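The adaptive playback point described above (estimate the delay bound from post facto observations and adapt) can be sketched as follows; the class name, window length, and safety margin are illustrative assumptions, not values from the paper:

```python
from collections import deque

class AdaptivePlayback:
    """Toy adaptive playback point: track recent one-way delays and
    place the playback point just beyond the largest recent delay."""
    def __init__(self, window=100, margin=1.1):
        self.recent = deque(maxlen=window)   # recent one-way delays (s)
        self.margin = margin                 # safety factor (assumed)

    def observe(self, delay):
        self.recent.append(delay)

    def playback_point(self):
        # post-facto estimate: largest recently seen delay, plus margin
        return max(self.recent) * self.margin if self.recent else 0.0

    def usable(self, delay):
        # data arriving after the playback point is useless
        return delay <= self.playback_point()
```

A rigid application would instead fix `playback_point()` to an a priori bound and never adapt it.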
FIFO Across Multiple Hops
- FIFO works well on a single hop
- But jitter increases dramatically over multiple hops: each hop provides an independent source of delay

FIFO+
- Address the multiple-hop case by maintaining an average delay per class
- Compare each packet's delay with the average delay for its class
- Schedule packets using the basic FIFO rule, but pretend a packet with higher-than-average delay arrived earlier
- Net result of FIFO+: packets are scheduled using FIFO rules according to the time a packet should have arrived

Some Results
[Figure: jitter results for FIFO+ across multiple hops]

Total Architecture
- Classes of traffic with isolation: guaranteed, predicted, datagram
- Sharing within a traffic class: priority levels, with jitter from higher levels distributed down
- But it also requires other features: admission/policing at the entrance, and pricing for classes/priorities (otherwise, why choose the predicted class?)

CS 557, Lecture 19: Congestion Avoidance -- Congestion Avoidance and Control (Jacobson and Karels, 1988). Spring 2009.

The Story So Far
- Essential apps: DNS (naming) and NTP (time)
- Transport layer: end-to-end communication, multiplexing, reliability, congestion control, flow control
- Network layer: addressing, fragmentation, dynamic routing, best-effort forwarding
- Data/link layer: Ethernet, PPP; copper, fiber, radio -- a richly connected network, many paths with many types of unreliable links

Main Points
- Objective: techniques for dealing with network congestion
- Approach: slow start; adaptive timers; additive increase / multiplicative decrease
- Contributions: the essential points of TCP congestion control
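The window dynamics named in the approach (slow start plus additive increase / multiplicative decrease) can be sketched as a single step function; units are MSS and fast recovery is omitted, so this is a simplification, not Jacobson and Karels' exact algorithm:

```python
def next_cwnd(cwnd, ssthresh, event):
    """One step of simplified TCP window evolution (units: MSS).
    event is 'ack' (a full window acknowledged), 'dupack3'
    (triple duplicate ACK), or 'timeout'."""
    if event == 'ack':
        if cwnd < ssthresh:
            return cwnd * 2, ssthresh          # slow start: double per RTT
        return cwnd + 1, ssthresh              # congestion avoidance: +1 MSS per RTT
    if event == 'dupack3':                     # multiplicative decrease
        half = max(cwnd // 2, 1)
        return half, half
    if event == 'timeout':                     # drop back into slow start
        return 1, max(cwnd // 2, 1)
    raise ValueError(event)
```

Starting from cwnd = 1 with ssthresh = 8, four RTTs of ACKs grow the window 1, 2, 4, 8, 9 (exponential, then linear); a triple duplicate ACK then halves it.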
Motivation and Context
- The network was collapsing due to congestion: throughput dropped from 32 Kbps to 40 bps
- One conclusion: packet switching failed. This paper says we can fix the problem

Conservation of Packets
- There can be no collapse if packets entering the network = packets leaving the network
- Can we achieve conservation of packets?

TCP Review (RFCs 793, 1122, 1323, 2018, 2581)
- Point-to-point: one sender, one receiver
- Reliable, in-order byte stream: no message boundaries
- Pipelined: TCP congestion and flow control set the window size
- Send and receive buffers at each end
- Full duplex: bidirectional data flow in the same connection; MSS = maximum segment size
- Connection-oriented: handshaking (an exchange of control messages) initializes sender and receiver state before data exchange
- Flow controlled: the sender will not overwhelm the receiver

TCP Flow Control (1/2)
- The receive side of a TCP connection has a receive buffer; the application process may be slow at reading from it
- Flow control: the sender won't overflow the receiver's buffer by transmitting too much, too fast
- A speed-matching service: match the send rate to the receiving app's drain rate

TCP Flow Control (2/2)
- Suppose the TCP receiver discards out-of-order segments
- Spare room in the buffer: RcvWindow = RcvBuffer - (LastByteRcvd - LastByteRead)
- The receiver advertises the spare room by including the value of RcvWindow in segments
- The sender limits unACKed data to RcvWindow; this guarantees the receive buffer doesn't overflow

TCP Sequence Numbers and ACKs
- Seq #: the byte-stream number of the first byte in a segment's data
- ACK #: the seq number of the next byte expected from the other side (cumulative ACK)
- Q: how does the receiver handle out-of-order segments? The TCP spec doesn't say -- it's up to the implementation
- Simple telnet scenario: the user types 'C' (Seq=42, ACK=79); the host ACKs receipt and echoes back 'C' (Seq=79, ACK=43); the first host ACKs receipt of the echoed 'C'

Challenges to Conservation
- The connection never reaches equilibrium: too many initial packets drive the network into congestion, and then it is hard to recover
- The sender adds packets before one leaves: a poor timer causes retransmission of packets that are still in flight on the network
- Equilibrium can't be reached due to resource limits on the path: assume packet loss is due to congestion, and back off by a multiplicative factor

Slow Start
- TCP is self-clocking: receipt of an ACK triggers a new packet -- good if the network is in a stable state
- How to ramp up at the start? Start slow (1 packet); each ACK triggers two packets
- This quickly ramps the window up to the correct size
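As a side note, the flow-control bookkeeping from the TCP review above reduces to two one-line rules (variable names follow the slides):

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room the receiver advertises:
    RcvWindow = RcvBuffer - (LastByteRcvd - LastByteRead)."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def sender_may_send(last_byte_sent, last_byte_acked, rcv_window):
    """The sender keeps unACKed data within the advertised window,
    so the receive buffer cannot overflow."""
    return (last_byte_sent - last_byte_acked) < rcv_window
```

With a 4096-byte buffer, 3000 bytes received and 1000 read, the receiver advertises 2096 bytes of spare room.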
TCP Retransmission Scenarios
[Figures: lost-ACK scenario (Seq=92 is retransmitted after a timeout) and cumulative-ACK scenario (SendBase advances from 100 to 120, so no retransmission is needed)]

TCP Timeout Values
- Q: how to set the TCP timeout value? Longer than the RTT -- but the RTT varies
- Too short: premature timeouts and unnecessary retransmissions. Too long: slow reaction to segment loss
- Q: how to estimate the RTT? SampleRTT: the measured time from segment transmission until ACK receipt (ignoring retransmissions)
- SampleRTT will vary; we want the estimated RTT smoother: average several recent measurements, not just the current SampleRTT

TCP Round-Trip Time Estimation
- EstimatedRTT = (1 - a) * EstimatedRTT + a * SampleRTT
- This is an exponential weighted moving average: the influence of a past sample decreases exponentially fast; a typical value is a = 0.125
[Figure: example RTT estimation, gaia.cs.umass.edu to fantasia.eurecom.fr -- SampleRTT vs. EstimatedRTT]

Setting the Timeout
- EstimatedRTT plus a safety margin: large variation in EstimatedRTT -> larger safety margin
- First estimate how much SampleRTT deviates from EstimatedRTT:
  DevRTT = (1 - b) * DevRTT + b * |SampleRTT - EstimatedRTT|, typically b = 0.25
- Then set the timeout interval: TimeoutInterval = EstimatedRTT + 4 * DevRTT
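The estimator above can be written directly from the formulas; following RFC 6298, DevRTT is updated from the pre-update EstimatedRTT:

```python
def update_rtt(estimated_rtt, dev_rtt, sample_rtt, alpha=0.125, beta=0.25):
    """One update of the slides' RTT estimator.
    Returns (EstimatedRTT, DevRTT, TimeoutInterval)."""
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - estimated_rtt)
    estimated_rtt = (1 - alpha) * estimated_rtt + alpha * sample_rtt
    timeout = estimated_rtt + 4 * dev_rtt
    return estimated_rtt, dev_rtt, timeout
```

For example, with EstimatedRTT = 100 ms, DevRTT = 50 ms, and a 300 ms sample, the new estimate is 125 ms, the deviation 87.5 ms, and the timeout 475 ms.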
TCP Window Size Over Time
[Figure: congestion window of a long-lived TCP connection, a sawtooth between 8, 16, and 24 Kbytes]

TCP Congestion Control Review
- When CongWin is below Threshold, the sender is in the slow-start phase and the window grows exponentially
- When CongWin is above Threshold, the sender is in the congestion-avoidance phase and the window grows linearly
- When a triple duplicate ACK occurs: Threshold is set to CongWin/2 and CongWin is set to Threshold
- When a timeout occurs: Threshold is set to CongWin/2 and CongWin is set to 1 MSS

TCP ACK Generation (RFC 1122, RFC 2581)
- Arrival of an in-order segment with the expected seq #, all data up to it already ACKed: delayed ACK -- wait up to 500 ms for the next segment; if there is no next segment, send an ACK
- Arrival of an in-order segment with the expected seq #, one other segment with an ACK pending: immediately send a single cumulative ACK, ACKing both in-order segments
- Arrival of an out-of-order segment with a higher-than-expected seq # (gap detected): immediately send a duplicate ACK, indicating the seq # of the next expected byte
- Arrival of a segment that partially or completely fills a gap: immediately send an ACK, provided the segment starts at the lower end of the gap

TCP Sender Actions
- ACK for previously unACKed data, in slow start (SS): CongWin = CongWin + MSS; if CongWin > Threshold, set state to Congestion Avoidance. Result: CongWin doubles every RTT
- ACK for previously unACKed data, in congestion avoidance (CA): CongWin = CongWin + MSS * (MSS/CongWin). Additive increase: CongWin grows by 1 MSS every RTT
- Loss event detected by triple duplicate ACK (SS or CA): Threshold = CongWin/2, CongWin = Threshold, set state to Congestion Avoidance. Fast recovery, implementing multiplicative decrease; CongWin will not drop below 1 MSS
- Timeout (SS or CA): Threshold = CongWin/2, CongWin = 1 MSS, set state to Slow Start. Enter slow start
- Duplicate ACK (SS or CA): increment the duplicate-ACK count for the segment being ACKed; CongWin and Threshold do not change

Fast Retransmit
- The timeout period is often relatively long: a long delay before resending a lost packet
- Detect lost segments via duplicate ACKs: the sender often sends many segments back-to-back, so if a segment is lost there will likely be many duplicate ACKs
- If the sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost
- Fast retransmit: resend the segment before the timer expires

Fast retransmit algorithm -- on an ACK received with ACK field value y:
  if y > SendBase:
      SendBase = y
      if there are currently not-yet-acknowledged segments:
          start timer
  else:  # a duplicate ACK for already-ACKed data
      increment count of duplicate ACKs received for y
      if count of duplicate ACKs received for y == 3:
          resend segment with sequence number y   # fast retransmit
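A minimal runnable sketch of the fast-retransmit rule above (the actual transmission is stubbed out; class and field names are illustrative):

```python
class FastRetransmitSender:
    """A third duplicate ACK for the same byte triggers a retransmission
    before the retransmission timer fires."""
    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = 0
        self.retransmitted = []          # record of fast retransmits (stub)

    def on_ack(self, y):
        if y > self.send_base:           # new data acknowledged
            self.send_base = y
            self.dup_acks = 0
        else:                            # duplicate ACK for send_base
            self.dup_acks += 1
            if self.dup_acks == 3:
                self.retransmitted.append(y)   # resend segment y early
```

Three duplicate ACKs for byte 100 trigger one retransmission of segment 100; a later ACK for 200 advances SendBase and resets the duplicate count.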
TCP Fairness
- Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K
[Figure: TCP connections 1 and 2 sharing a bottleneck router of capacity R]

Why Is TCP Fair?
- Consider two competing sessions: additive increase gives a slope of 1 as throughput increases; multiplicative decrease drops throughput proportionally
- Each loss halves the window; congestion avoidance then increases it additively -- the two connections converge toward an equal bandwidth share
[Figure: the two connections' throughputs spiraling toward the equal-share line]

TCP Throughput
- What is the average throughput of TCP as a function of window size and RTT? (Ignore slow start)
- When the window is W, throughput is W/RTT
- Just after a loss, the window drops to W/2 and throughput to W/(2*RTT)
- Average throughput: 0.75 * W/RTT, where W is the window size when loss occurs

Window Size and Loss Rate
- For 10 Gbps throughput, the required window size is W = 83,333 in-flight segments
- TCP assumes every loss is due to congestion -- generally a safe assumption for reasonable window sizes
- Throughput in terms of loss rate L: throughput = 1.22 * MSS / (RTT * sqrt(L))
- Sustaining 10 Gbps requires L = 2 * 10^-10 -- a vanishingly small loss rate. New versions of TCP are needed for high speeds
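The window-size and loss-rate numbers above can be reproduced directly:

```python
MSS = 1500 * 8        # bits per segment (1500-byte segments)
RTT = 0.100           # seconds
TARGET = 10e9         # target throughput in bits/s (10 Gbps)

# Window size so that W * MSS / RTT reaches the target rate
W = TARGET * RTT / MSS                     # about 83,333 in-flight segments

# Loss rate that sustains the target, inverting
#   throughput = 1.22 * MSS / (RTT * sqrt(L))
L = (1.22 * MSS / (RTT * TARGET)) ** 2     # about 2e-10
```

Both results match the slides: an 83,333-segment window, and a loss rate on the order of 2 * 10^-10.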
Fairness (continued)
- Fairness and UDP: multimedia apps often do not use TCP -- they do not want their rate throttled by congestion control. Instead they use UDP: pump audio/video at a constant rate and tolerate packet loss. (Research area: TCP-friendly unreliable transport)
- Fairness and parallel TCP connections: nothing prevents an app from opening parallel connections between 2 hosts; web browsers do this
- Example: a link of rate R supports 9 connections. A new app asking for 1 TCP gets rate R/10; a new app asking for 11 TCPs gets R/2

TCP Delay Modeling
- Q: how long does it take to receive an object from a web server after sending a request?
- Ignoring congestion, delay is influenced by: TCP connection establishment, data transmission delay, and slow start
- First assume a fixed congestion window; then model a dynamic window (slow start)

Calculating TCP Delay
- Simple case: a client requests a web page from a server; assume no congestion and a fixed window size
- Define: O = object size (bits); W = window size (segments); S = MSS (maximum segment size); R = bandwidth; RTT = round-trip time (2 * propagation delay)
- Assume we always send segments of size S; ACKs and the HTTP request are very, very small
- What is the best-case scenario?

Fixed Congestion Window (1)
- Best case: W*S/R >= RTT + S/R -- the ACK for the first segment in the window returns before a window's worth of data has been sent
- delay = 2*RTT + O/R

Fixed Congestion Window (2)
- Empty-pipe case: W*S/R < RTT + S/R -- the sender waits for an ACK after sending each window's worth of data
- With K = number of windows needed to cover the object:
  delay = 2*RTT + O/R + (K - 1) * (S/R + RTT - W*S/R)

TCP Delay Modeling: Slow Start (1)
- Now suppose the window grows according to slow start. The delay is:
  delay = 2*RTT + O/R + P * (RTT + S/R) - (2^P - 1) * S/R
  where P = min(Q, K - 1) is the number of times TCP idles at the server
- Q is the number of times the server would idle if the object were of infinite size, and K is the number of windows that cover the object

TCP Delay Modeling: Slow Start (2)
- Delay components: 2*RTT for connection establishment and the request; O/R to transmit the object; plus the time the server idles due to slow start
- The server idles P = min(K - 1, Q) times; the k-th window takes 2^(k-1) * S/R to transmit
- Example: O/S = 15 segments, K = 4 windows, Q = 2, so P = min(K - 1, Q) = 2 -- the server idles twice
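Both delay formulas above can be checked numerically; these helpers assume O/S and O/(W*S) are integral, as in the slides' examples:

```python
from math import ceil, log2

def fixed_window_delay(O, W, S, R, RTT):
    """Fixed-window latency. O: object (bits), W: window (segments),
    S: segment (bits), R: rate (bits/s), RTT: seconds.
    Case 1 (W*S/R >= RTT + S/R): delay = 2*RTT + O/R.
    Case 2: add a stall of S/R + RTT - W*S/R after every window but the last."""
    stall = S / R + RTT - W * S / R
    K = O / (W * S)                      # number of windows
    return 2 * RTT + O / R + max(0.0, stall) * (K - 1)

def slow_start_delay(O, S, R, RTT):
    """Slow-start latency, summing the idle time after each window k
    (window k holds 2**(k-1) segments); this equals the closed form
    2*RTT + O/R + P*(RTT + S/R) - (2**P - 1)*S/R."""
    K = ceil(log2(O / S + 1))            # windows that cover the object
    idle = sum(max(0.0, S / R + RTT - 2 ** (k - 1) * S / R)
               for k in range(1, K))
    return 2 * RTT + O / R + idle
```

With unit segment time (S = R = 1), RTT = 2, and a 15-segment object, the server idles after windows 1 and 2 (P = 2), for a total delay of 2*2 + 15 + 3 = 22 time units.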
TCP Delay Modeling (3)
- After starting the k-th window, the server needs S/R + RTT to send a segment and receive its ACK, but has only 2^(k-1) * S/R of data to transmit
- Idle time after the k-th window: max(0, S/R + RTT - 2^(k-1) * S/R)
- Summing over the P windows where the server idles:
  delay = 2*RTT + O/R + sum over k=1..P of (S/R + RTT - 2^(k-1) * S/R)
        = 2*RTT + O/R + P * (RTT + S/R) - (2^P - 1) * S/R

TCP Delay Modeling (4)
- Recall K = the number of windows that cover the object. How do we calculate K?
  K = min{k : 2^0 + 2^1 + ... + 2^(k-1) >= O/S}
    = min{k : 2^k - 1 >= O/S}
    = min{k : k >= log2(O/S + 1)}
    = ceil(log2(O/S + 1))
- The calculation of Q, the number of idles for an infinite-size object, is similar (see HW)

TCP Evolution Continues
- Consider the impact of high-speed links: 1500-byte segments, 100 ms RTT, 10 Gbps throughput
- What is the required window size? Throughput ~ 0.75 * W/RTT (probably a good formula to remember); this requires a window of W = 83,333 in-flight segments

Summary: Essential Components of TCP
- Slow start; AIMD; timer estimates

CS 557, Lecture 25: IP Forwarding -- Routing With a Clue (Yehuda Afek, Anat Bremler-Barr, and Sariel Har-Peled, 1999). Spring 2009.

Forwarding IP Packets
- Objective: a fast table lookup for forwarding IP packets
- Approach: include a "clue" in each IP packet. The previous router includes its longest match as the clue, which allows the receiving router to start its lookup from that clue
- Contributions: an interesting alternative to tag switching, MPLS, etc.

Datagram Networks
- No call setup at the network layer
- Routers keep no state about end-to-end connections; there is no network-level concept of a connection
- Packets are forwarded using the destination host address; packets between the same source/destination pair may take different paths

Forwarding Table (4 billion possible entries)
  Destination Address Range                                     Link Interface
  11001000 00010111 00010000 00000000
    through 11001000 00010111 00010111 11111111                       0
  11001000 00010111 00011000 00000000
    through 11001000 00010111 00011000 11111111                       1
  11001000 00010111 00011001 00000000
    through 11001000 00010111 00011111 11111111                       2
  otherwise                                                           3

Longest Prefix Matching
  Prefix Match                       Link Interface
  11001000 00010111 00010                  0
  11001000 00010111 00011000               1
  11001000 00010111 00011                  2
  otherwise                                3
- Examples:
  DA = 11001000 00010111 00010110 10100001 -- which interface? (0)
  DA = 11001000 00010111 00011000 10101010 -- which interface? (1, the longest match)
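The longest-prefix-match examples above can be checked with a minimal binary trie; the dict-of-dicts representation is an illustration, not the paper's data structure:

```python
class Trie:
    """Minimal binary trie for longest-prefix matching.
    Prefixes and addresses are bit strings."""
    def __init__(self):
        self.root = {}

    def insert(self, prefix, next_hop):
        node = self.root
        for bit in prefix:
            node = node.setdefault(bit, {})
        node['hop'] = next_hop

    def longest_match(self, addr):
        node, best = self.root, None
        for bit in addr:
            if 'hop' in node:          # remember the longest prefix seen so far
                best = node['hop']
            if bit not in node:
                break
            node = node[bit]
        else:
            if 'hop' in node:
                best = node['hop']
        return best
```

Loading the three prefixes from the slide's table reproduces the two example answers: interface 0 for the first address, interface 1 (the longer of two matches) for the second.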
Virtual Circuits: Signaling
- Signaling protocols are used to set up, maintain, and tear down VCs
- Used in ATM, frame relay, and X.25; periodically proposed as a new Internet solution
- Call setup sequence: (1) initiate call, (2) incoming call, (3) accept call, (4) call connected, (5) data flow begins, (6) receive data

Forwarding Table with VC Numbers (in the northwest router)
  Incoming interface | Incoming VC # | Outgoing interface | Outgoing VC #
          1          |      12       |         2          |      22
          2          |      63       |         1          |      18
          3          |       7       |         2          |      17
          1          |      97       |         3          |      87
- Routers maintain connection state information

Datagram or VC Network?
- IP forwarding must search a long forwarding table for the longest match; searching this list can require multiple memory accesses. A simple trie structure requires O(W) memory accesses, where W is the address size
- VC switching provides a fast lookup: simply store a table mapping VC number to interface -- exactly 1 memory access. But it requires tags/circuits for each flow or pair, and some sort of setup
- Objective: find the longest match faster

Intuition Behind a Clue
- A packet destined to 129.82.x.x arrives at R1. R1 finds its best match, say 129.82/16; the entry says forward to R2
- It is likely that the best match at router R2 is either exactly 129.82/16 or something more specific, like 129.82.0.0/24

Sending a Clue
- R1 sends its best match as a clue, which allows R2 to start its search from that point

Simple Clues
- The neighbor router sends a clue: its best-match prefix
- The receiving router looks for the clue in its forwarding table:
  - Case 1: the clue is in the trie and has descendants below it. Start the search from the clue; no final decision yet -- set Ptr to the clue
  - Case 2: the clue is in the trie and has no descendants below it. The clue is also the best match in this table; set Final Decision (FD) = next hop of the clue, Ptr empty
  - Case 3: the clue is not in the table. Find the best match for the clue; set the Final Decision to this best match, Ptr empty
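The three cases can be sketched over a set of bit-string prefixes (a simplified stand-in for the trie; the Ptr bookkeeping is omitted):

```python
def classify_clue(clue, prefixes):
    """Classify an incoming clue against this router's prefix set.
    Returns ('case1', clue) when the clue is present with descendants
    (the search must continue below it), ('case2', clue) when the clue
    is present and leaf-most (it IS the best match), or ('case3', best)
    where best is this table's longest prefix of the clue."""
    if clue in prefixes:
        has_desc = any(p != clue and p.startswith(clue) for p in prefixes)
        return ('case1', clue) if has_desc else ('case2', clue)
    best = ''
    for p in prefixes:
        if clue.startswith(p) and len(p) > len(best):
            best = p
    return ('case3', best)
```

With prefixes {'10', '1010', '101001'}: clue '1010' has a descendant (case 1), clue '101001' is leaf-most (case 2), and clue '1011' is absent, so its best match '10' becomes the final decision (case 3).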
Example of Simple Clues
- A neighbor router sends a packet: destination 129.82.100.1, clue 129.82/16
- The receiving router looks for the clue 129.82/16 in its forwarding table:
  - Case 1: 129.82/16 is in the table, and entries such as 129.82.10.0/24, 129.82.24.0/18, and so forth are also present. The best match may be 129.82/16 or one of these more specific prefixes; start the search at 129.82/16
  - Case 2: 129.82/16 is in the table and has no descendants below it. The best match for this prefix is 129.82/16
  - Case 3: 129.82/16 is not in the table. Some less specific prefix, like 129.8, must be in the table; find this best match and use it

Why Does This Help?
- The receiving router builds a clue table
- The first time you see a clue, you need to look it up. The second time you see the clue, you can:
  - Case 1: search from a better starting point
  - Case 2: immediately map the clue to a final decision -- one memory access to look up the clue in the clue table, then forward the packet to the FD entry. Just like tag switching
  - Case 3: immediately map the clue to a final decision -- one memory access, just like tag switching
- Claim: cases 2 and 3 are common. Can we reduce the number of case-1 events?

Advanced Clues
- Cases 2 and 3 are the same as before
- Case 1: compare the sending router's table with your table. If the sender always has a better match at the same time or before you do, then the clue is a best match for you as well
- Detailed and proved in Claim 1 of the paper -- the most important part to understand; try an example as a starting point for understanding this claim

Example of Advanced Clues
- A neighbor router sends a packet: destination 129.82.192.1, clue 129.82/16
- The receiving router has 129.82/16, 129.82.24.0/18, and 129.82.10.0/24; the best match must be one of these three
- The sending router has a forwarding table of 129.82/16, 129.82.24.0/17, and 129.82.10.0/24
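A sketch of the clue table from "Why Does This Help?": it assumes the cached clues are case-2/case-3 clues, whose final decision is safe to reuse, and `full_lookup` stands in for the router's normal longest-match search:

```python
class ClueCache:
    """After the first (full) lookup, a clue that yielded a final
    decision maps straight to a next hop in one table access."""
    def __init__(self, full_lookup):
        self.full_lookup = full_lookup
        self.table = {}              # clue -> final decision (next hop)
        self.slow_lookups = 0        # counts full longest-match searches

    def forward(self, clue):
        if clue in self.table:       # second time: one access, tag-switch style
            return self.table[clue]
        self.slow_lookups += 1
        fd = self.full_lookup(clue)  # first time: full search
        self.table[clue] = fd        # cache the final decision
        return fd
```

Repeated packets carrying the same clue then cost a single clue-table access, which is why the paper's measured average approaches one memory access per packet.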
Intuition behind the advanced method:
- If the packet matched 129.82.24.0/18 at the receiver, it would also have matched 129.82.24.0/17 at the sender -- so it can't be 129.82.24.0/18
- If the packet matched 129.82.10.0/24 at the receiver, it would also have matched 129.82.10.0/24 at the sender -- so the packet must match 129.82/16. No need to search!
- More generally: for every possible better match at the receiver, there was always a better or equal match at the sender

Example of Using Clues
[Figure: trie walk showing how the advanced rule eliminates candidate matches -- anything longer would have matched better at the sender, so the lookup goes directly to the clue's vertex in the forwarding trie]

Experimental Results
[Tables: lookup measurements for the simple and advanced clue methods; note the average is nearly 1 memory access for the advanced method]

Summary
- A very clever technique to speed up IP forwarding: send a clue to your neighbor indicating the best match you found
- Often this is the same best match at your neighbor
- It can be incrementally deployed, works with a variety of lookup schemes, etc.

CS 557, Lecture 16: Route Damping -- Route Flap Damping Exacerbates Internet Routing Convergence (Z. M. Mao, R. Govindan, G. Varghese, and R. Katz, 2002) and Timer Interaction in Route Flap Damping (B. Zhang, D. Pei, D. Massey, and L. Zhang, 2005). Spring 2009.

Understanding BGP Damping
- Objective: document an unexpected interaction in BGP damping
- Approach: analysis, simulation, and measurement of damping behavior
- Contributions: a new look at BGP damping; the interaction between convergence and damping

Simple Route Flapping
- Unstable links, configuration, policy, etc. can all lead to persistent route changes
- Up / Down / Up / Down / Up / Down -> Announce / Withdraw / Announce / Withdraw -- potentially propagates to the entire Internet!

MRAI Timer
- The MRAI timer masks the effect of very frequent changes
- Example: 0 Up, 1 Down, 2 Up, 3 Down, 4 Up, 5 Down -> send "0: Announce", "1: Withdraw", then no more updates until the MRAI expires (typically 30 seconds)
- But MRAI is not sufficient for longer time scales: flaps every 15 seconds (0 Up, 15 Down, 30 Up, 45 Down, 60 Up, 75 Down) still produce an update stream (0: Announce, 15: Withdraw, 30: Announce, 45: Withdraw, ...)

BGP Route Flap Damping
- Apply a penalty value for each update; the penalty is calculated per (prefix, peer)
- Ignore (treat as no route) updates when the penalty is above some cutoff
- The penalty decays over time; reuse the route if the penalty drops below some reuse threshold

Damping Parameters
[Table: vendor default damping parameters]

Damping in Network Operation
- A key mechanism in maintaining global routing stability: proposed by router vendors in the mid-90s, specified in RFC 2439, and implemented by all major vendors
- The Zhang et al. survey shows that it is enabled by some ISPs, but not all; a major reason is customers' complaints of unexpectedly long convergence times

The Intended Behavior
- Occasional flaps pass through without extra delay: short convergence time
- Persistent flaps are suppressed: fewer routing updates in the network, longer convergence time

Path Exploration and Damping
- Z's candidate paths to the destination: ZBA, ZCBA, ZEDBA, ZIHGFA
- Updates sent by Z include ZBA, then ZCBA, then ZEDBA, then finally ZIHGFA
- Z's neighbor imposes a damping penalty on this burst of updates and ignores the (stable) ZIHGFA route

Resulting Convergence Time
[Figure: convergence time with and without damping]

False Damping in the Internet
[Table: measured withdrawal and announcement events that triggered false damping]

Damping Story So Far
- A key mechanism in maintaining global routing stability
- Path exploration causes damping to apply in unexpected ways
Understanding BGP Damping (Zhang et al., 2005)
- Objective: analyze damping timers and propose a solution
- Approach: use root cause notification (RCN) to apply the damping penalty to events rather than updates
- Contributions: a better understanding of how damping works; an RCN-based solution to improve the damping algorithm

Route Flap Damping
- BGP's mechanism to guard against persistent flapping -- to limit the global impact of local instability
- Suppress updates of unstable routes; reuse after the route stabilizes
- Design tradeoff: adaptability vs. stability, convergence time vs. update count
- Specified in RFC 2439, implemented by major vendors, enabled by some ISPs

The Damping Design
- Use a penalty to track route instability: increase it upon receiving an update; otherwise it decays exponentially
- Suppress the route if the penalty exceeds the cutoff threshold; reuse it when the penalty drops below the reuse threshold
[Figure: penalty over time, rising past the cutoff threshold (around 3000) and decaying back below the reuse threshold]

The Intended Behavior of Damping
- Occasional flaps pass through without extra delay: short convergence time
- Persistent flaps are suppressed: fewer routing updates, longer convergence time
- The convergence time is controlled by parameter settings at the adjacent router

Simulation Setting
- Pulse: a withdrawal followed by an announcement, with one minute between the two updates
- Two types of topologies (mesh and Internet-derived) with different sizes; Cisco default parameters
[Figures: update count and convergence time vs. number of pulses, for no damping and full damping, simulation (mesh and Internet) and calculation]
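A sketch of the penalty mechanics described above; the constants are assumed defaults in the spirit of RFC 2439, not necessarily the exact values used in the paper's simulations:

```python
from math import exp, log

class FlapDamping:
    """Per-(prefix, peer) flap damping: fixed penalty per update,
    exponential decay with a half-life, suppress above the cutoff,
    reuse below the reuse threshold."""
    PENALTY  = 1000.0     # added per update (assumed)
    CUTOFF   = 3000.0     # suppress threshold (assumed)
    REUSE    = 750.0      # reuse threshold (assumed)
    HALFLIFE = 900.0      # decay half-life in seconds (assumed)

    def __init__(self):
        self.penalty = 0.0
        self.last = 0.0
        self.suppressed = False

    def _decay(self, now):
        self.penalty *= exp(-log(2) * (now - self.last) / self.HALFLIFE)
        self.last = now

    def flap(self, now):
        self._decay(now)
        self.penalty += self.PENALTY
        if self.penalty >= self.CUTOFF:
            self.suppressed = True

    def usable(self, now):
        self._decay(now)
        if self.penalty < self.REUSE:
            self.suppressed = False
        return not self.suppressed
```

Four rapid flaps push the penalty past the cutoff and suppress the route; roughly an hour later the decayed penalty falls below the reuse threshold and the route becomes usable again.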
Path Exploration Causes False Suppression
- BGP explores alternate paths after a route change
- A flap at the origin can be amplified into multiple updates on remote links
- Even a single flap can trigger route suppression somewhere in the network (Mao et al., 2002)

Secondary Charging Prolongs Convergence Time
- Route reuse at different routers happens at different times
- Updates triggered by route reuse are treated as further flaps by other routers
[Figure: penalty example over time, crossing the cutoff (3000) and reuse (750) thresholds]

Muffling Effect (1)
- When flapping is persistent, route suppression is triggered at B
- Further flaps only increase B's penalty; therefore B will have a higher penalty than all the other routers

Muffling Effect (2)
- When B reuses the route, there are no damped routes left in the network, and thus no secondary charging
- The convergence time is determined solely by when B reuses its route -- which is the intended behavior
[Figure: convergence time vs. number of pulses, simulation and calculation]

The Damping Problem
- The damping penalty should increase only after a real flap, not after every update
- Path exploration: multiple updates due to the same flap
- Secondary charging: reuse of a previously suppressed route
- Potentially others
- Solution: explicit root cause notification

Root Cause Notification (RCN)
- The root cause is the original network event that triggers the update
- Attach the root cause to every update: rcn = (location, status, sequence number)
- Updates triggered by the same event carry the same rcn, e.g., rcn = (AB, down, 1)

CS 557, Lecture 26: Domain Name System -- Development of the Domain Name System (Mockapetris and Dunlap, 1988) and Impact of Configuration Errors on DNS Robustness (V. Pappas, Z. Xu, S. Lu, D. Massey, A. Terzis, and L. Zhang, 2004). Spring 2009.

The Story So Far
- Essential apps: DNS (naming) and NTP (time)
- Transport layer: end-to-end communication, multiplexing, reliability, congestion control, flow control
- Network layer: addressing, fragmentation, dynamic routing, best-effort forwarding
- Data/link layer: Ethernet, PPP; copper, fiber, radio -- a richly connected network, many paths with many types of unreliable links
Slides adopted from the SIGCOMM 2004 presentation.

Motivation
- DNS is part of the Internet's core infrastructure; applications include the web, email, E.164, and CDNs
- DNS is considered a very reliable system -- it works almost always
- Question: is DNS a robust system? Is user-perceived robustness the same as system robustness?

Motivation: Short Answer
- "Microsoft's websites were offline for up to 23 hours... the most dramatic snafu to date on the Internet... because of an equipment misconfiguration" (Wired News, Jan. 2001)
- Thousands or even millions of users affected -- all due to a single DNS configuration error

Related Work
- Traffic and implementation-error studies: Danzig et al. (SIGCOMM '92, bugs); CAIDA (traffic and bugs)
- Performance studies: Jung et al. (IMW '01, caching); Cohen et al. (SAINT '01, proactive caching); Liston et al. (IMW '02, diversity)
- Server availability: to appear, OSDI '04 and IMC '04

Our Work: Study DNS Robustness
- Classify DNS operational errors: study known errors and identify new types of errors
- Measure their pervasiveness
- Quantify their impact on DNS availability and performance

Outline
- DNS overview; measurement methodology; DNS configuration errors (example cases); measurement results; discussion and summary

Background
- Example zone data for bar.foo.com:
    bar.foo.com      NS  ns1.bar.foo.com
    bar.foo.com      NS  ns2.bar.foo.com
    bar.foo.com      MX  mail.bar.foo.com
    www.bar.foo.com  A   10.10.10.10
- Resolution: the client asks a caching server for www.bar.foo.com; the caching server follows referrals from the root zone (com NS RRs) to the com zone (foo NS RRs) to the foo zone (bar NS RRs) to the bar zone, and returns the answer www.bar.foo.com A 10.10.10.10

Infrastructure Resource Records
- NS resource record: provides the names of a zone's authoritative servers; stored both at the parent and at the child zone
    foo.com      NS  ns1.foo.com
    foo.com      NS  ns2.foo.com
- A resource record, associated with an NS resource record:
    ns1.foo.com  A   1.1.1.1
  ns2.foo.com A 2.2.2.2
- When associated with an NS resource record, this is a "glue" A record, stored at the parent zone:
  foo.com NS ns1.foo.com
  foo.com NS ns2.foo.com
  foo.com NS ns3.foo.com
  ns1.foo.com A 1.1.1.1
  ns2.foo.com A 2.2.2.2
  ns3.foo.com A 3.3.3.3

What Affects DNS Availability?
- Name servers: software failures, network failures, scheduled maintenance tasks.
- Infrastructure resource records: availability of these records; configuration errors (the focus of our work).

Classification of Measured Errors
- Inconsistency: the configuration of infrastructure RRs does not correspond to the actual authoritative name servers. Example: lame delegation.
- Dependency: more than one name server shares a common point of failure. Examples: diminished server redundancy, cyclic zone dependency.

What is Measured
- Frequency of configuration errors, against system parameters: TLD, DNS level, zone size (i.e., the number of delegations).
- Impact on availability: number of servers lost due to these errors; a zone's availability (probability of resolving a name).
- Impact on performance: total time to resolve a query, starting from the query issue time and finishing at the query's final answer.

Measurement Methodology
- Error frequency and availability impact: 3 sets of active measurements: a random set of 50K zones; 20K zones that allow zone transfers; 500 popular zones.
- Performance impact: 2 sets of passive measurements (1-week DNS packet traces).

Lame Delegation
  foo.com NS A.foo.com
  foo.com NS B.foo.com
  A.foo.com A 1.1.1.1
  B.foo.com A 2.2.2.2
Cases:
1. Non-existing server: ~3 seconds performance penalty.
2. DNS error code: ~1 RTT performance penalty.
3. Useless referral: ~1 RTT performance penalty.
4. Non-authoritative answer (cached).

Lame Delegation Results
[Figure: percentage of lame delegations, broken down per TLD.]
- Error frequency: 15% of the zones (8% for the 500 most popular zones); independent of the zone's size; varies a lot per TLD.
- Impact: 70% of the zones with errors lose half or more of their authoritative servers; 8% of queries experience increased response times, up to an order of magnitude, due
to lame delegation.

Diminished Server Redundancy
  foo.com NS A.foo.com
  foo.com NS B.foo.com
  A.foo.com A 1.1.1.1
  B.foo.com A 2.2.2.2
- (a) Network level: the servers belong to the same subnet.
- (b) Autonomous system level: the servers belong to the same AS.
- (c) Geographic location level: the servers belong to the same city.

Diminished Server Redundancy Results
- Error frequency: 45% of all zones have all servers in the same /24 subnet; 75% of all zones have all servers in the same AS; large & popular zones have better AS and geographic diversity.
- Impact: less than 99.9% availability when all servers are in the same /24 subnet; more than 99.99% availability with 3 servers at different ASes or different cities.

Cyclic Zone Dependency (1)
  foo.com NS A.foo.com
  foo.com NS B.foo.com
  A.foo.com A 1.1.1.1
- The glue A RR for B.foo.com is missing, so B.foo.com depends on A.foo.com: if A.foo.com is unavailable, then B.foo.com is too.
- With, in addition:
  bar.com NS A.bar.com
  bar.com NS B.foo.com
  A.bar.com A 2.2.2.2
  if A.foo.com and A.bar.com are unavailable, the B addresses are unresolvable.

Cyclic Zone Dependency (2)
  foo.com NS A.foo.com
  foo.com NS B.bar.com
  A.foo.com A 1.1.1.1
- The B servers depend on the A servers; the combination of the foo.com and bar.com zones is wrongly configured.

Cyclic Zone Dependency Results
- Error frequency: 2% of the zones; none of the 500 most popular zones.
- Impact: 90% of the zones with cyclic dependency errors lose 25% or even more of their servers; 2 or 4 zones are involved in most errors.

Discussion: User-Perceived vs. System Robustness
- User-perceived robustness: data replication (only one server is needed); data caching (temporarily masks infrastructure failures); popular zones have fewer configuration errors.
- System robustness: fewer available servers due to inconsistency errors; fewer redundant servers due to dependency errors.

Discussion: Why So Many Errors?
- Superficially, they are due to operators: unaware of these errors; lack of coordination (parent/child zones, secondary-server hosting).
- Fundamentally, they are due to protocol design: a lack of mechanisms to handle these errors
proactively or reactively, and design choices that embrace some of them: name servers are identified by names; glue (NS & A) records are necessary to set up the DNS tree.

Summary
- DNS operational errors are widespread.
- DNS operational errors affect availability: 50% of the servers lost; less than 99.9% availability.
- DNS operational errors affect performance: by 1 or even 2 orders of magnitude.
- DNS system robustness is lower than user perception, due to protocol design, not just operator errors.

Ongoing Work
- Reactive mechanisms: DNS troubleshooting (NetTs '04).
- Proactive mechanisms: enhancing DNS replication & caching.

Thank you.

CS 557 Lecture 12: BGP Convergence
- "Improved BGP Convergence via Ghost Flushing," Bremler-Barr, Afek, Schwarz, 2003.
- "BGP-RCN: Improving BGP Convergence Through Root Cause Notification," Pei, Azuma, Massey, Zhang, 2005.
Spring 2009

BGP Path Exploration
- Z's candidate paths to dest: the obsolete paths C B A and E D B A are explored before converging on the valid path I H G F A.

Potential to Explore N! Paths
- Paths explored by A: A C S; link C-S fails; A B C S; A D C S; A B D C S; A D B C S; no route.
- Theoretically, a router can explore on the order of N! paths before concluding "no route."

Some Routing Terminology
- Tup: a route to a previously unreachable prefix is announced.
- Tdown: the route to a currently reachable prefix is withdrawn and no replacement exists.
- Tshort: the route to a currently reachable prefix switches to a shorter path.
- Tlong: the route to a currently reachable prefix switches to a longer path.
- Other terminology: Tdown = "fail-down"; Tlong = "fail-over."

BGP MRAI Timer and Convergence
- Minimum Route Advertisement Interval (MRAI) timer: within M (~30) seconds, at most one announcement from A to B; it does not apply to the first announcement, nor to withdrawals.
- Impact: (a) suppresses transient changes; (b) delays convergence.
- Example: A's path changes P1, W, P2, P3, P4, P5 between time 0, 30, and 60; the messages from A to B are only P1, W, P4, P5.

[BAS03] Improving BGP Convergence
- Objective: improve convergence time after a legitimate route change.
- Approach: flush out "ghost information" that is blocked by the MRAI timer (P4 in the previous example is ghost information).
- Contributions:
a simple, easily deployed, and clever approach to improving convergence; a theoretical understanding of convergence behavior; improves on the 2002 result from Pei et al.

Basic Model
- Each AS is treated as one node (though not strictly required for ghost flushing).
- Routers use a shortest-path routing policy (helps with the analysis but not strictly required).
- SPVP (simple path vector protocol) approximates BGP.
- MRAI timer between updates (Minimum Route Advertisement Interval): two consecutive updates must be at least MRAI time apart.

Ghost Information
- Obsolete path information stored at a node; it could be the preferred route or a backup route.
- The MRAI timer can block removal of ghost information: a router cannot announce its current choice of path because it recently announced a different path.
- The typical MRAI value is 30 seconds.
- Ghost information can lead to increased convergence time and an increased chance of selecting ghost paths.

Ghost Flushing
- A very simple rule for BGP routers: when the route to P is updated to a worse path and the MRAI timer is delaying the path announcement, send withdraw(P) ("no route to P").

Path Length and Time
- Assume a Tdown event; let h = message passing time.
- Claim: at time K·h, every message or node has an AS path of length > K.
- By induction: true at time h, since the neighboring routers received the withdrawal. Assume it is true at time K·h (all paths longer than K). Suppose a path of length K or less exists at time (K+1)·h. It must have come from some peer P with a path of length K; that path must have been removed prior to time K·h, so a withdrawal or a longer path was announced prior to time K·h and received prior to time (K+1)·h: contradiction.

Implications of the Time-Length Bound
- We have shown that at time K·h every message or node has an AS path of length > K.
- The longest possible path has length N, so at time N·h all paths are longer than the longest possible path; by time N·h all routers know that the path is withdrawn.
- Convergence time is N·h, reduced from N·MRAI.

Message Complexity
- Claim: at most 2 messages are sent during each MRAI timer interval.
- Resulting complexity: the number of MRAI rounds is N·h/MRAI; updates per round: 2E. The complexity is
O(2·E·N·h/MRAI); BGP's complexity is O(E·N).

Tlong (Fail-over) Complexity
- Expect good results, but no theoretical results are presented here.
- Simulations show solid improvement; other simulations (not shown here) show some surprises.
- Theoretical results were later determined by Pei et al. (covered next week).

[PA05] Improving BGP Convergence
- Objective: improve convergence time after a legitimate route change.
- Approach: signal the cause of the path failure.
- Contributions: a dramatic reduction in convergence time, plus the ability to improve other parts of BGP; a theoretical understanding of convergence behavior.

BGP Path Exploration Revisited
- Z's candidate paths to dest include obsolete paths through B A (e.g., I H G F A).
- Observation: if Z knew that link B-A failed, it could have avoided the obsolete paths.

Root Cause Notification
- The node that detects the failure attaches the root cause to the message; other nodes copy the root cause into outgoing messages.
- The first message is enough for Z to remove all the obsolete paths.

Overlapping Events
- Another topology change happens before the previous change's convergence finishes (e.g., a B-A failure followed by a B-A recovery).
- Propagation along the lower path is slower than along the upper path.
- Observation: we need to order the relative timing of the root causes.

Solution: Add a Sequence Number
- Node B maintains a sequence number for link B-A, incremented each time the link status changes: (B-A, failure, seqnum=1), then (B-A, recovery, seqnum=2); a path C B A then carries "seqnum of B-A = 2."
- The sequence number orders the relative timing of the root causes.

CS 557 Lecture 8 (2/12/09): The Domain Name System (DNS) and the DNS Security Extensions (DNSSEC)

The Domain Name System
- Virtually every application uses the Domain Name System (DNS).
- The DNS database maps names to IP addresses (www.darpa.mil -> 128.9.176.20), and many other mappings: mail servers, IPv6, reverse lookups.
- Data is organized as a tree structure; each zone is authoritative for its local data.

Common DNS Queries and Types
- NS record lookups: "give me the name servers for a zone."
- A record lookups: "give me IPv4 addresses." The responses to these queries usually also contain NS records.
- MX record lookups: "give me the names of your zone's mail servers."

DNS Query and Response
- The end user's resolver sends "www.darpa.mil A?" to a caching DNS server, which queries the darpa.mil DNS server and returns "www.darpa.mil A 128.9.128.127."
- Actually, www.darpa.mil is 192.5.18.195; but how would you determine this?

DNS Vulnerabilities
- The original DNS design focused on data availability: DNS zone data is replicated at multiple servers; a DNS zone works as long as one server is available; a DDoS attack against the root must take out 13 root servers.
- But the DNS design included no authentication: any DNS response is generally believed; there is no attempt to distinguish valid data from invalid; just one false root server could disrupt the entire DNS.

A Simple DNS Attack
- Easy to observe: a UDP DNS query sent to a well-known server on a well-known port.
- An attacker (Joe's laptop) races the real darpa.mil server with a forged www.darpa.mil A record; the first response wins, and the second response is silently dropped on the floor.

A More Complex Attack
- The attacker controls attacker.com (attacker.com NS ns.attacker.com) and lures the caching server into querying www.attacker.com.
- The response includes extra records: www.attacker.com A 128.9.128.127 together with www.google.com A 128.9.128.127; the caching server may cache the bogus www.google.com record.

The Problem in a Nutshell
- A resolver cannot distinguish between valid and invalid data in a response.
- The idea is to add source authentication: verify that the data received in a response equals the data entered by the zone administrator; it must work across caches and views, and must maintain a working DNS for old clients.

(Slides adapted from "Quantifying the Operational Status of the DNSSEC Deployment with SecSpider," IMC 2008.)

Outline
- How DNSSEC works
- How to quantify DNSSEC's deployment
- Our dataset
- What we found
- Conclusion

DNSSEC
- DNSSEC provides origin authenticity, data integrity, and secure denial of existence by using public key cryptography.
- Origin authenticity: resolvers can verify that data originated from authoritative sources.
- Data integrity: resolvers can also verify that responses were not modified in flight.
- Secure denial of existence: when there is no data for a query, authoritative servers can provide a response that proves no data exists.

How DNSSEC Works
- DNSSEC zones create public/private key pairs; the public portion goes in the DNSSEC record type DNSKEY.
- Zones sign all RRsets, and resolvers use DNSKEYs to verify them; each RRset has a signature attached to it (RRSIG).
- So once a resolver has a zone's DNSKEYs, it can verify that RRsets are intact by verifying their RRSIGs.

Signing Example
- Using a zone's key on a standard RRset, e.g. the NS set of secspider.cs.ucla.edu.
- The signature (RRSIG) will only verify with the DNSKEY if no data was modified.
[Slide shows the secspider.cs.ucla.edu NS RRset with its RRSIG and base64 signature data.]

Getting the Keys
- Until a resolver gets the DNSKEYs for a zone, data can be spoofed.
- How can a resolver know that the DNSKEYs themselves are not being spoofed?
- DNSSEC zones securely delegate from parents to children; zones that have no secure parents are called trust anchors.
- When a zone is a trust anchor, the zones that it delegates to, together with itself, form an "island of security."

Why Monitoring is Important
- Monitoring is important so data sources can poll their own status.
- Additional problems are visible when polling from different vantage points.
- Availability can depend on
each client's location, and man-in-the-middle attacks become visible.
- In DNSSEC: zone operators need to know their data is being served properly; resolvers need to know whether their problems are local or global.
- Monitoring answers these immediate questions and can offer aggregate and historical information.

But Then What?
- The raw data of such large systems is generally too voluminous to easily discern meaning.
- More can be gained by making monitoring systems explicit components of the designs of Internet-scale systems.
- Beyond discovering problems, feedback from such distributed monitoring can augment existing systems like DNSSEC.

Quantifying DNSSEC's Deployment
- Since DNSSEC is an operational system, quantifying it allows us to concisely describe its status. We define 3 measures of overall system status:
- Availability: resolvers must be able to get data served by zones; do different polling points have varying trouble getting data? Lightweight DNS resolvers serve as "pollers" from different vantages.
- Verifiability: there must be a way for resolvers to verify that the cryptographic keys and signatures are genuine (as opposed to spoofed by an attacker).
- Validity: the data covered by cryptographic protections must be valid; a verifiable RSA signature does not mean an RRset's data is correct.

Quantifying Availability
- We define availability A(p, z, t) ∈ {0, 1} as a given poller p's ability to retrieve records for a zone z at time t.
- We define dispersion, disp(z, t), as cases where some pollers can get data when others cannot: roughly, the gap between the most successful poller and the average success over all pollers P.
- An EWMA is used to track this over time and overcome transient failures; taking the complement gives availability.

Quantifying Verifiability
- Beyond availability, resolvers should not blindly trust keys they get from queries; DNSSEC's verification comes from tracing a chain of trust from a trust anchor Ta.
- Therefore we define DNSSEC's verifiability as the fraction of the zones Zs covered by any trust anchor.

Quantifying Validity
- There is a difference between verifying crypto
signatures and having data that is actually valid. Problems occur when:
- Signatures verify but the data is wrong (false positive).
- Signatures do not verify but the data is valid (false negative).
In this work we take specific examples of these that we observed.

Validity Metric
- False negatives can result from cases where parent zones issue lame secure (DS) delegations: Vdeleg is the fraction of secure delegations that are not lame.
- False positives can result from cases where old RRsets have valid signatures but whose data is stale (has changed): Vfresh is the fraction of RRsets that are not stale.
- Vd = Vdeleg × Vfresh.

Our Dataset
- The data is taken from the first DNSSEC monitoring project, SecSpider (http://secspider.cs.ucla.edu).
- We take user submissions, crawl various sources, do NSEC walking, etc. (submit your zones early and often).
- We track all zones in our corpus from a set of distributed pollers; we currently track over 15,000 zones.
- Most of these are test zones, and their behaviors are often quite different from production zones.

Production Zones
- We focus our analysis on "production" zones; the behaviors of test zones skew analyses.
- Zones are classified as production iff they: are under .arpa (reverse mappings, e.g. 160.96.179.131.in-addr.arpa); or are a TLD (com, edu, se, de, bg); or maintain a reachable web server at a "www" domain; or have a mail server listed under an MX record.
- Soon this will include gTLDs like gov.

Distributed Polling
- We poll from distributed locations in Asia, North America, and Europe.
- We use measurements from these pollers as indications of phenomena; clearly, more pollers are necessary to precisely model DNSSEC's state.

The State of DNSSEC
- At the time of this paper's submission: 871 production zones; 662 islands of security; 97.4% of the islands are of size 1.
- Since the writing of this paper, the Kaminsky effect has happened: the number of production zones has more than doubled, from 871 to 2,209; we saw roughly 30% growth in DNSSEC in August 2008 alone.

Availability Dispersion
- Evaluated by querying for DNSKEYs, a precondition for verifying all zone
data.
- Roughly 20% of the production zones we observed suffered availability dispersion.
[Figure: availability dispersion across roughly 900 DNSSEC production zones.]
- Some resolvers were unable to get their data based solely on where they (the resolvers) are; there is currently no way for operators to know this.

Why is There Dispersion?
- This metric quantifies that there is a real problem worth investigating.
- Many have speculated that DNSSEC's larger packets may exacerbate PMTU problems: DNSSEC puts large payloads, such as DNSKEYs and RRSIGs, in packets that traditionally needed only a few hundred bytes.
- Middle boxes may regard oversized DNS packets as invalid, etc.
- Several studies have shown that SOHO routers, for example, can have trouble with DNSSEC.

We Suspected PMTU
- SecSpider performed PMTU walks whenever DNSKEY requests either resulted in a truncated response or failed completely.
- All requests began by advertising buffer sizes of 4096 bytes; a recent measurement at the f-root server found roughly 50% of query traffic specified this buffer size (https://www.dns-oarc.net/node/146).

How We Measured PMTU
- Our PMTU walk began by trying TCP as a fallback; this always worked from our polling locations.
- We note, however, that various anecdotal evidence suggests that this could be a concern to some operators (http://www.nanog.org/mtg-0410/pdf/toyama.pdf).
- At the same time, a binary search was done to determine the maximum size a UDP request can advertise before a drop or truncation. It turns out:
[Figure: success rate of PMTU exploration per poller; number of PMTU walks attempted vs. succeeded.]
- Some polling locations have noticeably more PMTU problems than others; for example, poller 2 attempts, and fails, many more PMTU walks than the others.
- Attempting a walk means there was a minor
problem with false advertising; failing a walk means there is a fundamental disconnect: the resolver cannot advertise a size that the server can use without truncating data (it just won't fit).

State of Verifiability: Trust Anchor Coverage
[Figure: CDF of the degree of coverage of all zones vs. the number of configured trust anchors.]
- By configuring a few large trust anchors, there is an initial gain of roughly 25%.
- This shows an underdeveloped delegation hierarchy: the root is not signed, and zones under com can't join an island.
- Overall, this metric grew from 0.223 to 0.241 during the last part of 2007. As absolute values these are hard to interpret, but the trend shows an increasing coalescence of secure zones.

Dealing with Low Verifiability
- DNSSEC's low verifiability is easy to see with SecSpider.
- Alternate approaches may be necessary while the hierarchy mends itself, so SecSpider acts as a distributed monitor that facilitates trust anchor lookups.
- Keys are served after passing distributed consistency checks.

State of Validity
- Taking 2 measurements in 2007: Vdeleg: 0.893 -> 0.909; Vfresh: 0.468 -> 0.802.
- Clearly the absolute values are subject to debate; we take note of the trend: the delegation status hasn't changed much; note the staleness trend.

What Has Validity Done for Me Lately?
- As a distributed monitor, SecSpider has already let operators see their own false positives.
- After we initially publicized that stale RRsets were apparent, a sharp decrease was seen after January 2007.

Conclusion
- Without a monitoring system, how can zone administrators know there is a problem? With SecSpider, zone admins can see issues and correct them.
- How can resolvers know why they are having a problem? With SecSpider, people can try to see whether their problems are local or global.
- Monitoring must be part of the lifecycle of this class of Internet-scale systems.
- The availability metric showed us where to look; SecSpider can be a solution for the verifiability problem; SecSpider has already
provided feedback about validity.

Thank you.

[Backup slides: island size vs. number of separate administrative domains; CDF of PMTU explorations per zone, per poller.]

CS 557 Lecture 18 (3/31/09): Denial of Capabilities
- "Portcullis: Protecting Connection Setup from Denial-of-Capability Attacks," B. Parno, D. Wendlandt, E. Shi, A. Perrig, B. Maggs, and Y.-C. Hu, 2005.
Spring 2009

Phase 1: Request Capabilities
- The source requests a capability from the destination; an attacker can interfere with this request channel.

Denial of Capabilities (DoC)
- Exhaust the request channel's resources, so legitimate senders cannot obtain capabilities from the receiver.
- In capability architectures the destination decides, but it is too late to decide at the destination for request packets.
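Portcullis counters denial-of-capability floods with sender-side computational puzzles. The sketch below is a minimal illustration of that kind of proof-of-work puzzle, assuming SHA-1 and a "last ℓ bits of the hash are zero" test over a simple byte concatenation; the exact packet encoding is not pinned down in these notes, and `solve_puzzle`/`verify_puzzle` are illustrative names, not the paper's API.

```python
import hashlib

def _digest(x: int, rand: bytes, seed: bytes, dest: bytes, level: int) -> int:
    """Hash(x, r, h_i, dest, level) as one SHA-1 over a concatenation.
    The field layout and hash choice are assumptions for illustration."""
    data = x.to_bytes(8, "big") + rand + seed + dest + bytes([level])
    return int.from_bytes(hashlib.sha1(data).digest(), "big")

def verify_puzzle(x: int, rand: bytes, seed: bytes, dest: bytes, level: int) -> bool:
    # A solution is valid when the last `level` bits of the digest are zero.
    return _digest(x, rand, seed, dest, level) & ((1 << level) - 1) == 0

def solve_puzzle(rand: bytes, seed: bytes, dest: bytes, level: int) -> int:
    # Brute-force search for x; expected work doubles with each extra level.
    x = 0
    while not verify_puzzle(x, rand, seed, dest, level):
        x += 1
    return x
```

Each extra level halves the fraction of valid x values, so a level-ℓ puzzle is on average twice as hard as a level ℓ−1 puzzle, which is the difficulty rule the lecture describes.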
Defending DoC Attacks
- The network can't tell legitimate requests apart from attacker requests.
- Simple solution: fair-queue all requests.
- Notation: nm, ng = the number of attackers / legitimate senders; rm, rg = the sending rate of an attacker / a legitimate sender; γ = the capacity of the bottleneck link allocated towards request packets.
- A requester should receive γ/(nm + ng), i.e., the sending rate should not matter.

Identity-Based Fairness
- Per-source fairness: r'g = min(rg, γ/(ng + nm)); issues with NATs and source address spoofing.
- Per-path fairness: allocate by path identifiers; coarse granularity, and an attacker may insert bogus path information.
- Per-destination fairness: an attacker may flood all destinations near the victim, and can actually amplify the attack on a destination.

Proof-of-Work Fairness
- Per-bandwidth fairness: r'g = min(rg, γ·k/K); large variation in the bandwidth capacity of senders, and it does not protect network links.
- Per-computation fairness: r'g = min(rg, γ·cg/C); equivalent to per-source fairness without the problems associated with spoofing, but computational disparities remain.

Portcullis Principle
- Achieves per-computation fairness using computational puzzles.
- Senders are willing to pay the extra cost for setup packets; a legitimate sender only needs to get one request across to succeed.

Computation Puzzles
- Find x such that p = Hash(x, r, hi, dest, ℓ) and the last ℓ bits of p are zeros; finding a solution requires brute-force search.
- ℓ is the puzzle difficulty level: a level-ℓ puzzle is twice as hard as a level ℓ−1 puzzle.
- hi is the puzzle seed: it guarantees the freshness of puzzle solutions and is distributed through a seed distribution service.

Puzzle Seed Generation & Distribution
- Puzzle seed generation: a hash chain hn ... h0, with anchor SIGN(hn).
- The seed generator releases the hash-chain anchor SIGN(hn) once a year, distributed to all routers in the network using routing protocols.
- Every t minutes, the seed generator injects the current seed hi into DNS; clients pull the current seed from DNS to compute the puzzle solution.

Portcullis Operation
1. Pull the puzzle seed hi from DNS.
2. Compute the puzzle solution and send the request with
the solution.
3. Each router hashes hi to check it against the anchor hn, verifies the puzzle solution, updates to the new anchor hi, and sends the request to the downstream router.
4. Repeat step 3 at each hop.

Portcullis Operation Under DoC Attack
- A legitimate source computes a puzzle solution at level 1 and sends the request; at the same time, attackers send requests with puzzle difficulty level 1, and the legitimate request is dropped.
- The source tries again, increasing the puzzle solution to level 2, while the attackers keep flooding with requests at level 1.
- The router forwards the legitimate (higher-level) packet, which reaches the destination.

Puzzle Sharing & Re-use
- An attacker may share puzzle solutions; an attacker may re-use puzzle solutions.
- Routers store puzzle solutions for a time period t; there is no need to store them longer than t, since hi is refreshed in the next interval.
- Routers use Bloom filters to store puzzle solutions.

Main Results
- A legitimate sender using Portcullis succeeds in O(nm) time regardless of the attacker strategy.
- The matching bound for a legitimate sender to successfully transmit a request is Ω(nm).

Evaluation
- Topology based on a CAIDA skitter map, with the victim destination at the center.
- 1000 legitimate senders; 1000 to 20,000 attackers flooding at their full uplink capacity (1 Mbps).
- Measure the time it takes the legitimate senders to establish capabilities.

Simulation Results
[Figure: CDF of capability setup time for legitimate senders, with 1000 and with 20,000 attackers.]

Summary
- Denial of capabilities is an important problem for capability architectures.
- Portcullis leverages proof-of-work techniques to help senders establish capabilities; using Portcullis, a legitimate sender can establish capabilities within O(nm).
- Computational disparities do exist, but they are smaller (38×) compared to bandwidth disparities (1500×); memory-bound techniques might work better.

CS 557 Lecture 15: Convergence and Packet Delivery
- "An Analysis of Convergence Delay in Path Vector Routing Protocols," Dan Pei, Beichuan Zhang, Dan Massey, and Lixia Zhang, 2003.
- "A
Study of Packet Delivery Performance during Routing Convergence," Dan Pei, Lan Wang, Dan Massey, S. Felix Wu, and Lixia Zhang, 2003.
Spring 2009

BGP Path Exploration Revisited
- Z's candidate paths to dest; observation: if Z knew that link B-A failed, it could have avoided the obsolete paths.

Root Cause Notification
- The node that detects the failure attaches the root cause to the message; other nodes copy the root cause into outgoing messages.
- The first message is enough for Z to remove all the obsolete paths.

Overlapping Events
- Another topology change happens before the previous change's convergence finishes (e.g., a B-A failure followed by a B-A recovery); propagation along the lower path is slower than along the upper path.
- Observation: we need to order the relative timing of the root causes.
- Solution: add a sequence number. Node B maintains a sequence number for link B-A, incremented each time the link status changes: (B-A, failure, seqnum=1), then (B-A, recovery, seqnum=2); a path C B A then carries "seqnum of B-A = 2." The sequence number orders the relative timing of the root causes.

Convergence Improvements
- MRAI timer: require a minimum time between updates (typically 30 seconds).
- Assertion checking (Pei et al.): signal a policy or topological failure; discard routes that include the failed subpath.
- Ghost flushing (Afek et al.): when the MRAI timer delays an update, send a withdrawal.
- Root Cause Notification (Pei et al.): explicitly list the cause of the failure.

Convergence Review
- Assume P5 is longer than P4; the MRAI timer (a) suppresses transient changes and (b) delays convergence. A's path changes P1, W, P2, P3, P4, P5 between time 0, 30, and 60.
- MRAI, A to B: P1, W, P4, P5.
- Ghost flushing, A to B: P1, W, P4, W, P5.
- RCN, A to B: P1, W+RCN, P4+RCN, P5+RCN.

Types of Routing Events
- Tup: a route to a previously unreachable prefix is announced.
- Tdown: the route to a currently reachable prefix is withdrawn and no replacement exists.
- Tshort: the route to a currently
reachable prefix switches to a shorter path.
- Tlong: the route to a currently reachable prefix switches to a longer path.
- Other terminology: Tdown = "fail-down"; Tlong = "fail-over."

Analysis of Tdown Convergence Delay Bounds
- Shortest path: it takes at most d·h.
- BGP: (N−1)·(M+h).
- Assertion checking (Pei, Infocom '02): (N − degree(dest))·(M+h).
- Ghost flushing (Bremler-Barr, Infocom '03): (N−1)·h.
- Root Cause Notification: d·h.
- Where: h = processing delay at a node (~2 seconds, Bremler-Barr Infocom '03); d = network diameter (~10, Huston BGP Report '04); N = number of nodes in the network (~19K); M = MRAI delay, the required delay between 2 consecutive updates (30 seconds; not applicable to withdrawals).

Q Model
- The previous results assume the uniform (U) model: time h to process a message; the Q model assumes messages are processed in FIFO order.
- RCN's performance is even better in the Q model; BGP has E in the formula.

Packet Delivery Performance
[Figure: packet delivery during convergence for BGP, Assertion, Ghost Flushing, and RCN (simulation and calculation); packet delivery is the ultimate goal.]

Packet Delivery during Routing Convergence
- Failures do occur in the Internet, and routing convergence after a failure takes time (on the order of tens of seconds within an ISP, and minutes for BGP).
- Packets can still be delivered during convergence.

When Does Packet Delivery Fail?
[Figure: example where a router's next hop holds a stale route during convergence, causing loops or blackholes.]

Goal of This Paper
- How to maximize packet delivery during routing convergence; previous work focused on preventing loops, minimizing convergence time, and routing overhead.
- Topological connectivity's impact; studying RIP (Distributed Bellman-Ford, DBF) and BGP.
- This problem becomes more important with larger networks and more frequent changes: richer connectivity (Huston '01) potentially helps with more alternate paths; higher bandwidth means more packets are sent during convergence.

CS 557 Lecture 6 (1/28/09): More Distance Vector Routing
- "A Path Finding Algorithm for Loop-Free Routing," J.J. Garcia-Luna-Aceves and S. Murthy, 1997.
- "The Synchronization of Periodic Routing Messages," Sally Floyd and Van
Jacobson, 1993.
Spring 2009

Distance Vector Routing
- Objective: compute the shortest path to every destination in the network.
- Approach: each node maintains a table listing (Destination, Distance, NextHop) and periodically advertises (Destination, Distance) to its neighbors.
- Note: the resulting memory is O(N).

Distance Vector Soft State
- Send updates every 30 seconds, even if nothing has changed.
- Time out stale distances: if 90 seconds elapse with no update from NextHop(I,D), then set Dist(I,D) = 16 and NextHop(I,D) = none.
- Triggered updates: if a route changes, send an update immediately, but also ensure 5-second spacing between triggered updates.

Distance Vector Fault Detection
- Split horizon with poison reverse: if Next(I,D) = J, then the update to J lists Dist(I,D) = 16.
- Apply a triangle check before accepting an update: distances should obey the triangle rule; if an update fails the triangle rule, send probes to check the distance, with rules to limit probing.

Distance Vector Net Result (example)
1. Link D-X fails (previously Dist(C,X) = 2 via D): Dist(D,X) = 16.
2. The update Dist(A,X) = 4 arrives at D: Dist(D,X) = 11, NextHop(D,X) = A.
3. D sends an update to C: Dist(D,X) = 11, so Dist(C,X) = 12.
(Initial state: Dist(A,X) = 4, Next(A,X) = B; Dist(D,X) = 1, Next(D,X) = X.)

Path Finding Algorithm (1/2)
- Objective: compute the shortest path to every destination while maintaining an O(N) table, O(1) update entries, and the same message complexity.
- Approach: each node maintains a table listing (Destination, Distance, NextHop, LastHop); note this changes the table size from 3N to 4N.
- Periodically advertise distances and last hops to your neighbors: (Destination, Distance, LastHop); note this changes the update entry size from 2 to 3 fields.
- Same message sending rules, so no change in message complexity.

Path Finding Algorithm (2/2)
- Upon receiving the update (D, Dist(J,D), LastHop(J,D)) from node J, node I applies the rule:
  if NextHop(I,D) = J, then Dist(I,D) = Dist(I,J) + Dist(J,D);
  else if Dist(I,J) + Dist(J,D) < Dist(I,D), then Dist(I,D) = Dist(I,J) + Dist(J,D) and NextHop(I,D) = J.
- Use the LastHop entries to find potential loops.

Path Finding (example)
- Dist(B,X) = 3, Dist(C,X) = 2; Next(B,X) = C, Next(C,X) = D; Last(B,X) = D, Last(C,X) = D. Find the path from A to X.
- Start: path = (A, ..., D, X). What is Last(A,D)? Path = (A, ..., Last(A,D), D, X); now recurse until the whole path has
been found.
(Initially: Dist(A,X) = 4, Dist(D,X) = 1; Next(A,X) = B, Next(D,X) = X; Last(A,X) = D, Last(D,X) = D)

Path Finding Summary
- Keeps the same complexity as the original DV: table size is 4N rather than 3N, message sizes are 3N rather than 2N
- But also allows any node to infer the complete path
- Relies on the subpath property of shortest paths
- An interesting addition, and roughly the state of the art in distance vector routing
- Most deployed DV protocols are still variations of basic RIP, kept simple: no path finding, etc.

Update Synchronization [FJ93b]
- Objective: examine synchronization of weakly coupled systems
- Approach: use a Markov chain model to explore synchronization and explain network behavior
- Contributions: random jitter to break synchronization is now a fundamental part of protocols; explains why there is synchronization and how to choose the random jitter amount

[Figure: ping round-trip times; appears to have periodic losses. Why?]
[Figure: packet losses in a RIP network; appears to match the RIP update frequency]

Synchronization
- Periodic messages are common in many protocols; a RIP router sends updates every 30 seconds
- The system begins in an unsynchronized state: RIP routers start at different times
- Might expect updates are not synchronized. Example:
  - Router R1 starts at time 1 and sends updates at times 1, 31, 61, etc.
  - Router R2 starts at time 3 and sends updates at times 3, 33, 63, etc.
  - Router R3 starts at time 5 and sends updates at times 5, 35, 65, etc.

Synchronization in Reality
- Periodic messages are common in many protocols; a RIP router sends updates every 30 seconds
- The system begins in an unsynchronized state: RIP routers start at different times
- But the initial state does not matter
  - Some systems start out in any state, synchronized or not, and become unsynchronized over time
  - Other systems start out in any state, synchronized or not, and become synchronized
- The transition from unsynchronized to synchronized, or vice versa, is abrupt

Periodic Message Model
1. Router R prepares and sends its own outgoing message
2. If R receives an
update, R also processes the incoming message
3. Router R sets a new periodic message interval in the range [Tp - Tr, Tp + Tr]
   (in RIP, Tp = 30 seconds; Tr = random jitter amount)
4. If an incoming message is received, it can trigger an update: in other words, go directly to step 1 without waiting for the timer to expire

Creating and Breaking Clusters
[Figure: router timelines drifting into and out of synchronized clusters]

Cluster Size Over Time
[Figure: cluster size vs. time; note all routers in the same cluster are synchronized]

Markov Chain Model
- Model the system using a Markov chain
- Each state lists the size of the largest cluster
- Assume the largest cluster either grows or shrinks by 1
- Some simplifications from the real model

Cluster Behavior
- Given a cluster of size i, consider when the timers for nodes in the cluster expire
- A node has an interval of time Tp, waits i*Tc to process updates, and includes some random amount up to Tr
- Assume the random amount is uniform over the interval
- Expectation for when the first timer expires is roughly Tp + i*Tc + Tr/(i+1): in other words, the periodic interval time, plus the time to process other cluster updates, plus the expected random variance

Cluster Breakup
- Typical cluster breakup scenario: one node breaks off from the cluster, resulting in one cluster of size 1 and one of size i-1
- Let node A be the node whose timer expires first
- To break from the cluster, node A must compute and send its message before the second node in the cluster sends its message: in other words, node A sends its message and resets its timer before receiving any other message from cluster nodes
- Mathematically: let L = spacing between messages one and two, and Tc = compute-and-send time for the first node; we want L > Tc, which happens with probability roughly (1 - 2*Tc/Tr)^i

Cluster Addition
- Conservative cluster addition scenario: one node joins the cluster
- What it means to join the cluster: the timer offset for the node moves to within Tc of the timer offset for some other node or cluster of size
1; in other words, node A sends its message, and updates from the cluster arrive before it is done
- Mathematically, this gives a join probability in terms of N, i, Tp, Tc, and Tr

Understanding the System
- Given a Markov chain and probabilities for state transitions, we can determine expected behavior
  - State N: the system is completely synchronized
  - State 1: the system is completely unsynchronized
- Function f(i): rounds until the chain first enters state i
[Figure: f(i) as a function of the state]

Understanding the System (2)
- Function g(i): rounds until the chain first enters state i, starting from state N
[Figure: g(i) as a function of the state]

The Impact of Randomness (Tr)
[Figure: synchronization behavior for different values of the random jitter Tr]

Conclusions
- There is a natural tendency for the system to synchronize
- The random jitter amount Tr breaks clusters and unsynchronizes the system
- Selecting Tr > Tp/2 should eliminate synchronization
- Selecting Tr > 10*Tc should quickly break up any synchronization
- A fundamental part of all modern protocols is a random jitter on any periodic timer

CS 557 Lecture 24: RED
- Random Early Detection Gateways for Congestion Avoidance, S. Floyd and V. Jacobson, 1993
Spring 2009

The Story So Far
- Some essential apps: DNS (naming) and NTP (time)
- Transport layer: end-to-end communication, multiplexing, reliability, congestion control, flow control
- Network layer: addressing, fragmentation, dynamic routing, best-effort forwarding
- Data layer: a richly connected network, many paths with many types of unreliable links (Ethernet, PPP, async, SONET; copper, fiber, radio)

Congestion Control Options
- Control congestion at the source: typically adjust congestion windows, but we can't assume everyone will do this (faulty implementation of the control algorithm, or intentional ignorance of it)
- Dynamic routing: route packets around congestion; has problems with jitter, convergence, etc.
- Managing the queues at the routers: the topic of this paper

RED Main Points
- Objective: a technique to notify senders of congestion and achieve high throughput with low average queue sizes
- Approach: gateways mark or drop packets; monitor the average queue size; randomly mark/drop some packets above a given min threshold; mark/drop all packets
above a max threshold
- Contributions: a well-known technique, implemented in production routers

Congestion at Gateways
- An end system can only infer congestion; it lacks a global understanding of traffic and a sufficiently long time period
- The gateway is best suited to signal congestion
- But it should also tolerate bursty traffic; how to do this?

The Basic Idea
- The gateway signals congestion by marking some packets or dropping some packets
- Maintain an average queue value: want to tolerate bursts, and don't overreact to transient congestion
- Drop/mark some packets when the average queue exceeds a min threshold (minth)
- Drop/mark all packets when the average queue exceeds a max threshold (maxth)

RED Algorithm
For each packet arrival:
  calculate the average queue size avg
  if minth < avg < maxth:
    calculate probability pa
    with probability pa, mark the packet
  else if avg > maxth:
    mark the packet

Calculating the Average
- Weighted average function: avg = (1 - wq)*avg + wq*q

How to Select wq
- If wq is too large, the average will not filter out transient congestion
- To allow a burst of L packets, want avg < minth, i.e. L + 1 + ((1 - wq)^(L+1) - 1)/wq < minth
- Example: minth = 5, L = 50 -> wq < 0.0042
- If wq is too small, avg doesn't reflect the actual queue size

Calculating the Drop Probability
[Figure: marking probability as a function of the average queue size]
- Marking probability: pb = maxp * (avg - minth) / (maxth - minth)
- Method 1: each packet is marked with probability pb; result: Prob[X = n] = pb * (1 - pb)^(n-1)
- Method 2: keep a count of packets since the last mark; each packet is marked with probability pb / (1 - count*pb)

Why is Randomness Important?
- Need to avoid global synchronization: don't want every connection to slow down at the same time
- The fraction of marked packets should be roughly proportional to a connection's share of the bandwidth

The Impact of RED on Queuing
[Figure: queue size (solid line) and average queue size (dashed line) over time]
- Note the average queue value avg in the above: the queue size is bursty
- The average tolerates transient bursts of packets
- The average also tracks
the general trend in queue size

Comparing RED and DropTail
[Figure: simulation comparing RED and drop-tail queue behavior]

RED Summary
- Effective in helping manage queue sizes
- Does not assume a particular transport protocol, but is well suited to TCP-style backoff
- Tolerates transient bursts: uses a weighted average of queue size rather than the current queue state
- Avoids global synchronization: the probability of drops increases with congestion
- High degree of operational flexibility: can select appropriate wq, maxp, minth, maxth, etc.; some guidelines exist for selecting values, though optimal values are not known

eFIT: Future Internet Architecture
Dan Massey (Colorado State University), Lan Wang (University of Memphis), Beichuan Zhang (University of Arizona), Lixia Zhang (UCLA)

The Real Value of the Internet
- Lots of great fundamental science: protocols, design principles
  - Packet vs. circuit switching; TCP reliability and congestion control; the end-to-end principle; and many more great research contributions
- But most people think of the Internet in terms of applications: Delicious, YouTube ("Broadcast Yourself"), Google, ...

eFIT: Future Internet Architecture
- Internet success is driven by user innovations
- Architectural objective: enable and facilitate future innovations
- Architectural requirements: provide universal connectivity; edge independence (the core doesn't block apps); scalability in both apps/edges and the core; availability in spite of faults and attacks

Today's Limitations
- Users are handicapped by lack of ISP support for new applications, lack of globally unique addresses, and lack of DDoS protection
- ISPs are handicapped by lack of routing scalability, lack of manageability, and lack of routing system security
- Users and ISPs are tightly coupled together

The eFIT Transit Wire
[Figure: customer sites A and B connect across the transit core through a mapping service]
- Enables users to treat the transit core as a wire connecting to all other Internet hosts
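The transit-wire figure above hinges on a mapping step that later slides describe as "map and encap": a user-space destination address is resolved to a transit-core router address, and the packet is encapsulated with that core address before entering the transit wire. Here is a minimal sketch of that idea; all names in it (MappingService, the address strings, core-router labels) are hypothetical illustrations, not part of the actual eFIT design or code.

```python
# Minimal map-and-encap sketch (hypothetical names; not the eFIT implementation).
# A mapping service resolves a user-space destination prefix to a transit-wire
# (provider) router address; the border router encapsulates the user packet
# with that core address, and the egress router strips it back off.

from dataclasses import dataclass


@dataclass
class Packet:
    src: str
    dst: str
    payload: str


@dataclass
class EncapPacket:
    core_src: str   # transit-wire address of the encapsulating border router
    core_dst: str   # transit-wire address of the destination's provider router
    inner: Packet   # original user-space packet, carried unchanged


class MappingService:
    """Maps user address prefixes to transit-wire router addresses."""

    def __init__(self):
        self.table = {}

    def register(self, user_prefix, core_router):
        self.table[user_prefix] = core_router

    def lookup(self, user_addr):
        # Longest-prefix match over registered user prefixes.
        best = None
        for prefix, core in self.table.items():
            if user_addr.startswith(prefix) and (best is None or len(prefix) > len(best[0])):
                best = (prefix, core)
        if best is None:
            raise KeyError("no mapping for " + user_addr)
        return best[1]


def encapsulate(pkt, mapping, local_core_addr):
    """Border router: wrap a user packet in a transit-wire header."""
    return EncapPacket(core_src=local_core_addr,
                       core_dst=mapping.lookup(pkt.dst),
                       inner=pkt)


def decapsulate(epkt):
    """Egress router: strip the transit-wire header."""
    return epkt.inner


mapping = MappingService()
mapping.register("userB.", "core-router-7")   # customer B reachable via core router 7

pkt = Packet(src="userA.host1", dst="userB.host9", payload="hello")
epkt = encapsulate(pkt, mapping, local_core_addr="core-router-3")
print(epkt.core_dst)           # core-router-7
print(decapsulate(epkt).dst)   # userB.host9
```

Note how the core only ever routes on `core_dst`: the user address spaces stay invisible to it, which is exactly the separation the following slides argue for.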
Building the Transit Wire: Separation
- Put users and ISPs in different address and routing spaces
- Enable the two parties to evolve independently
- Enable scalable routing system design
- Raise the bar against attacks targeted at the global routing infrastructure

Building the Transit Wire: Mapping
- Mapping is needed to bridge the two spaces: user address -> provider router address (indicates where the user attaches to the Internet)
- The spaces are insulated from one another
- Provides a layer of indirection for traffic engineering and DDoS prevention

Design Challenges: Mapping
- Ensure authenticity and availability despite faults (crashes, link failures, and so forth) and intentional attacks
- Scale to the growth in user sites
- Minimize packet losses and delays (transient failures invalidate mapping data)
- Flexible interface to support plug-in of new services and functions

Design Challenges: Transit Wire
- New transit-wire addresses for traffic engineering, management, and routing scalability
- New routing protocol for traditional concerns: convergence delay and so forth
- Economic models and policies
- New physical layer changes: optical networks

Basic Premise: Map and Encap
- A mapping service bridges the two spaces: map a user address to a transit-wire router address
- Insulates the changes of one space from the other and provides a layer of indirection
- Packets are encapsulated with a transit-wire destination address when entering the transit wire

CS 557 Lecture 7: Landmark Routing
- The Landmark Hierarchy: A New Hierarchy for Routing in Very Large Networks, Paul Tsuchiya, 1988
Spring 2009

Landmark Routing
- Objective: reduce routing table size without increasing path length by a large amount
- Approach: select some routers as landmarks; construct a hierarchy of landmarks using an increasing radius at each level; name hosts/routers according to their proximity to landmarks

Hierarchical Routing
- Distance vector and link state routing don't scale to very large networks
- Hierarchical approach to routing: break the flat network into several pieces
- A routing algorithm
runs within each piece; another routing algorithm runs between different pieces
- Can recursively break into more pieces to create more levels in the hierarchy

Traditional Link-State Hierarchy
- Divide the network into regions
- A router knows the full topology of its own region
- Example: a router in region 2.2 knows that region 2.1, region 1, and region 3 exist, and how to reach them via border routers

The Basic Trade-Off
- Each node knows only a limited amount of topological information
- A benefit for reducing routing table size, routing computation, number of updates, etc.
- Lack of full information results in some non-shortest paths: routers simply don't know a shorter path exists, because the hierarchy limits the topology information stored at a given node

Routing in the Hierarchy
- Routing follows the hierarchy and results in a 7-hop path
- The shortest path, if all routers knew the full topology, is 4 hops

The Landmark Hierarchy
- Associate a level and a radius with every router
- Level 0 router: every router within radius r0 knows how to reach this router, and it can reach at least one level 1 router
- Similar for level 1, level 2, ..., level H routers
- In other words, rH >= the network diameter

Example Landmarks
[Figure: example topology with landmarks at several levels and their radii]

Addressing in the Landmark Hierarchy
- Associate each node with a sequence of landmarks that lead to the node
  - Start with a level H landmark; next select a level H-1 landmark; next a level H-2 landmark; ...; finally select a level 0 landmark
- Require that the source is within the radius of its level 0 landmark, the level 0 landmark is within the radius of its level 1 landmark, etc.

Landmark Example
[Figure: example landmark addresses]

Landmark Example: Routing Table at G
[Figure: routing table at G; the path from G to the destination T follows the hierarchy (G, F, E, D, ..., T), while the shortest path from G to T is shorter]

Some Advantages of Landmarks
- Limits the amount of storage space required at each router (comparable to link-state areas)
- Can dynamically elect landmarks (harder to dynamically select link-state areas)
- The path to a destination does not go through each landmark (link-state hierarchies always use border routers)

Performance
Definitions
- ri = landmark radius: the distance out to which a level-i landmark can be seen
- di = distance from a router to the nearest level-i landmark

Observations
- Increasing the radii increases routing table size (more landmarks visible in the table) but decreases path lengths (switch to the next level faster)
- More landmarks at level i -> routers are closer to a landmark at level i -> di is smaller
- r(i+1) must be big enough to cover the nearest level-i landmark; a smaller di -> can decrease r(i+1)

Performance Results
[Figure: performance depends on the ratio ri/di]

Toward Global Internet Routing
- The global Internet is a two-level hierarchy
- Divide the network into autonomous systems: CSU is an AS, AT&T is an AS, etc.
- Within an AS: typically distance vector or link state routing; the choice is entirely up to the local AS
- Between different autonomous systems: distance vector? link state? landmarks? something else?

CS 557 Lecture 20: Congestion and Complexity
- Observations on the Dynamics of a Congestion Control Algorithm: The Effects of Two-Way Traffic, Zhang, Shenker, and Clark, 1991
Spring 2009

The Story So Far
- Some essential apps: DNS (naming) and NTP (time)
- Transport layer: end-to-end communication, multiplexing, reliability, congestion control, flow control
- Network layer: addressing, fragmentation, dynamic routing, best-effort forwarding
- Data layer: a richly connected network, many paths with many types of unreliable links (Ethernet, PPP, async, SONET; copper, fiber, radio)
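Returning to the landmark hierarchy for a moment: the definitions of ri and di can be made concrete with a small visibility computation. Each level-i landmark is advertised out to ri hops, and every router records the landmarks whose radius covers it; the ri/di trade-off in the performance slides is about how large those visible sets get. The sketch below uses a hypothetical five-node line topology with made-up landmark assignments and radii (none of this is from Tsuchiya's paper; it only illustrates the visibility rule).

```python
# Sketch of landmark visibility on a toy topology (illustrative only).
# A level-i landmark is visible to every router within r[i] hops of it;
# each router's routing table holds exactly the landmarks it can see.

from collections import deque


def bfs_distances(adj, src):
    """Hop distances from src to every reachable node (unweighted BFS)."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist


def visible_landmarks(adj, landmarks, radii):
    """For each router, the set of landmarks whose radius covers it."""
    seen = {node: set() for node in adj}
    for lm, level in landmarks.items():
        dist = bfs_distances(adj, lm)
        for node, d in dist.items():
            if d <= radii[level]:
                seen[node].add(lm)
    return seen


# Toy line topology A-B-C-D-E; D is a level-0 landmark, A is a level-1 landmark.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
landmarks = {"D": 0, "A": 1}
radii = {0: 1, 1: 4}   # r0 = 1 hop; r1 covers the network diameter

seen = visible_landmarks(adj, landmarks, radii)
print(sorted(seen["E"]))   # ['A', 'D']: E is 1 hop from D and 4 hops from A
print(sorted(seen["B"]))   # ['A']: B is 2 hops from D, outside r0
```

Growing r0 here would add D to more routers' tables (bigger tables, shorter paths), which is exactly the trade-off the observations above describe.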