Phys Design Automat
Phys Design Automat ECE 6133
Popular in Course
Popular in ELECTRICAL AND COMPUTER ENGINEERING
This 0 page Class Notes was uploaded by Cassidy Effertz on Monday November 2, 2015. The Class Notes belongs to ECE 6133 at Georgia Institute of Technology - Main Campus taught by Sung Lim in Fall. Since its upload, it has received 11 views. For similar materials see /class/233856/ece-6133-georgia-institute-of-technology-main-campus in ELECTRICAL AND COMPUTER ENGINEERING at Georgia Institute of Technology - Main Campus.
Reviews for Phys Design Automat
Report this Material
What is Karma?
Karma is the currency of StudySoup.
Date Created: 11/02/15
Steiner Routing ECE6133 Physical Design Automation of VLSI Systems Prof Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology Routing placement O Generates a quotloosequot route for each net gtltgt y lt I OAss1gns a 11st of rout1ng reg1ons to each net W1thout I D 1 specifying the actual layout of Wires global routing Global routing detailed routing O Finds the actual geometric layout of each net Within the assigned routing regions compction Detailed routing Routing constraints 0 100 routing completion area minimization under a set of constraints Placement constraint usually based on fixed placement Number of routing layers Geometrical constraints must satisfy design rules Timing constraints performance driven routing must satisfy delay constraints Crosstalk Process variations l lgl Geometrical constraint T woilayer routing Graph Models for Global Routing Grid Graph 0 Each cell is represented by a vertex 0 Two vertices arejoined by an edge if the corresponding cells are adjacent to each other 0 The occupied cells are represented as filled circles whereas the others are as clealr circles Q lw Graph Model Channel Intersection Graph Channels are represented as edges Channel intersections are represented as vertices Edge weight represents channel capacity Extended channel intersection graph terminals are also represented as ve rtices o 0 channel l intersection dgt dgt gt l graph LEE 5 extended I C channel 6 b O b O intersection lil I graph ampltLJgt GlobalRouting Problem 0 Given a netlist NN1N2Nn a routing graph G VE find a Steiner tree Ti for each net Ni 1 g 2 g n such that U j g cej Vej e E and 21LTi is minimized where Cej capacity of edge j new 2 1 if ej is in T1 W 0 otherwise U j 2171 xij of wires that pass through the channel corre sponding to edge ej LTi total wirelength of Steiner tree Ti 0 For high performance the maximum wirelength max1LTz is mini mized or the longest path between two points in Ti is minimized Classification of GlobalRouting Algorithm 0 Sequential approach Assigns priority to nets routes one net at a time based on its priority net ordering o Concurrent approach All nets are considered at the same time com plexity global routing algorithm sequential approach linesearch multiterminal I Steinertree based concurrent approach I hierarchical integer programming I Data Structures and Basic Algorithms Spanning Tree Problem Formulation Given a graph G 2 V7 E7 select a subset V g V7 such that V has property 79 Minimum Spanning Tree Problem Formulation Given an edge weighted graph G 2 V7 E7 select a subset of edges E g E such that E induces a tree and the total cost of edges EeieElw ei is minimum over all such trees7 where wte is the cost or weight of the edge ei Used in routing applications 3 6 Algorithms for VLSI Physical Design Automation Data Structures and Basic Algorithms Steiner Trees 1 Problem formulation Given an edge weighted graph G 2 V7 E and a subset D g V7 select a subset V g V7 such that D g V and V induces a tree of minimum cost over all such trees The set D is referred to as the set of demand points and the set V D is referred to as Steiner points 0 Used in the global routing of multiterminal nets 6 Demand Point a b 1 Algorithms for VLSI Physical Design Automation Elm Data Structures and Basic Algorithms Underlying Grid Graph The underlying grid graph is de ned by the intersections of the horizontal and vertical lines drawn through the demand points 14 Algorithms for VLSI Physical Design Automation Elm Data Structures and Basic Algorithms Different Steiner trees constructed from a MST e 1 Algorithms for VLSI Physical Design Automation Elm I The 1Steiner Problem I Definition We denote the minimum spanning tree over a point set P by MSTP and use cMSTP to denote the cost of the MST on point set P Given a point set P p1 pquot a ISteiner point is any point x such that cMSTP U x is minimized with cMSTP U x lt cMSTP A ISteiner tree is the minimum spanning tree overP U x Routing Practical Problems in VLSI Physical CAD lSteiner Algorithms I Why 1Steiner Insertion I Can Reduce Wirelength J 1 1 Fig 3 Execution of iterated 1Steiner on a fourpoint example Routing Practical Problems in VLSI Physical CAD 1Steiner Algorithms lSteiner by KahngRobins I Iterative 1 Steiner Insertion Algorithm El Keep adding lSteiner point onebyone until no more gain By the result of Hanan we can nd a lSteiner point by constructing a new MST on n 1 points for each ele ment in the Steiner candidate set then picking the can didate which results in the shortest MST I Nai39ve implementation On2 X n log n X n I Sophisticated implementation 0n3 Routing Practical Problems in VLSI Physical CAD lSteiner Algorithms l 1Steiner by Borah Owens Irwin l Interesting Observation Our edgebased algorithm is based on connecting a node to the nearest point on the rectangular layout of an edge in the tree and removing the longest edge in the loop thus formed a b C Routing Practical Problems in VLSI Physical CAD 1Steiner Algorithms I Gain Computation I Things to do 1 Add node p 2 Remove edge 61 3 Remove edge 92 4 Add edge connecting p to pl 5 Add edge connectng 3 to pg 6 Add edge connecting p to 133 I Thus the gain is gain B gth 2 lengthpp1 Routing Practical Problems in VLSI Physical CAD 1Steiner Algorithms l Overall Algorithm I Multipass Heuristic El Entire algorithm can be repeated Algorithm EdgebasedSteinero Begin 1Compute the rectilinear minimum spanning tree of the set of nodes 2Compute all possible ltnode edgegt pairs that give positive gain 3Sort all the pairs in descending order of gain 4While there are pairs with positive gain do If the two edges to be replaced exist in the tree then Replace the pair of edges with three new edges and a new node Endif Endwhile End Routing Practical Problems in VLSI Physical CAD 1Steiner Algorithms I Comparison l Kahng Robins vs Borah Owens Irwin D Kahng Robins tends to give better results D Borah Owens Irwin runs much faster 014 log 11 vs 0112 Routing Prncu39cnl Problems in VISI Physical CAD 1Steiner Algorithms l lSteiner Routing by KahngRobins I Perform 1Steiner Routing by KahngRobins 39 Need an initial MST wirelength is 20 39 16 locations for Steiner points C a b WL20 c quot1 ll 9 Practical Problems in VLSI Physical Design 1Steiner Algorithm 117 1 First lSteiner Point Insertion I There are siX lSteiner points 39 Two best solutions we choose 0 randomly C C C before 1nsert10n a WL1g h WL19 c WL18 1 3 1 1 l1 Practical Problems in VLSI Physical Design lSteiner Algorithm 217 1 First 1Steiner Point Insertion cont isle O f C before insertion a WL18 e WL19 r WL19 4 k Practical Problems in VLSI Physical Design 1Steiner Algorithm 317 1 Second 1Steiner Point Insertion I Need to break tie again 39 Note that a and b do not contain any more 1Steiner point so we choose c C f C C before Insemon a WL1 b WL17 c WL1 1 3 3 1 Practical Problems in VLSI Physical Design 1Steiner Algorithm 417 I 1 Third lSteiner Point Insertion I Tree completed all edges are rectilinearized 39 Overall wirelength reduction 20 16 4 C before insertion WL15 1 3 3 1 1 l1 9 Practical Problems in VLSI Physical Design 1Steiner Algorithm 517 l lSteiner Routing by BorahOwensIrwin I Perform a single pass of BorahOwensIrwin 39 Initial MST has 5 edges with wirelength of 20 39 Need to compute the maxgain node edge pair for each edge in this MST 1 9 Practical Problems in VLSI Physical Design 1Steiner Algorithm 617 1 g g 1 Best Pair for a c We rst let 1 b and 61 a Next we compute the shortest Manhattan distance between 1 and a quotrectilinear layoutquot of 01 which is 2 in this case The node 1 is the nearest point on this rectilinear layout of 391 to p1 Next we look for 2 the longest edge on pl to n path which is 392 b 0 Thus gainb a l6 t1 lt 2 lengthpp1 l 2 2 I Cl l M iii Practical Problems in VLSI Physical Design 1Steiner Algorithm 717 1 Best Pair for 190 I Three nodes can pair up with bc gama b 0 lengthm 0 lengthp a 4 2 2 gamd b 0 0119th7 d lengtMp d 5 4 1 gaine b7 0 length0 e lengthp e 4 3 1 lac lpa 4 2 55b Practical Problems in VLSI Physical Desigi 1Steiner Algorithm 817 1 Best Pair for 196 cont I All three pairs have the same gain 39 Break ties randomly lbd lpd 5 7 4 08 lpe 4 7 3 1 x1 f 145 1 9 Practical Problems in VLSI Physical Design 1Steiner Algorithm 917 1 Best Pair for M I Two nodes can pair up with bd 39 both pairs have the same gain lbc lpc 4 7 3 lbc lpe 4 7 3 14 Practical Problems in VLSI Physical Design 1Steiner Algorithm 1017 4 a r 1 Best Pair for 66 I Three nodes can pair up with c e lbc 1pb 4 i 3 lbd 1pd 5 7 4 f I Practical Problems in VLSI Physical Design 1Steiner Algorithm 1117 4 1 r 1 Best Pair for 66 cont 55b 1 4 1 gain1 Practical Problems in VLSI Physical Design lef lpr32 1Steiner Algorithm 1217 1 Best Pair for ef I Can merge with c only 08 lpc 4 7 3 Practical Problems in VLSI Physical Design 1Steiner Algorithm 1317 41 1 51 451 1 Summary I Maxgain pair table 39 8011 based on gain value pair gain 6 1 62 3 a 2 a c b c a b7 2 b7 0 10 071 l b 1 b C I we 1 m bx e1 f I e f ale 1 I 3 1 1 4 Practical Problems in VLSI Physical Design 1Steiner Algorithm 1417 1 First lSteiner Point Insertion I Choose 1 ac maxgain pair 39 Mark 61 610 62 b c 39 Skip 51 b c c bd b c e since their ele2 are already marked 39 Wirelength reduces from 20 to 18 pan gain 81 62 I a 2 a C b7 C 1 b 2 I C a C C b 1 l b1 b C b l C 6 37 C a f 1 e f c e before after C xi gr 1 1 9 Practical Problems in VLSI Physical Design 1Steiner Algorithm 1517 1 Second 1Steiner Point Insertion I Choose 0 ef last one remaining 39 Wirelength reduces from 18 to 17 pair ga1n e 62 9 a 2 a C b7 C a7 1 C 2 b C aC c 1111 1 b d 127 C 1 C 6 l C C b C a e f 1 e f a e before 4 1 14 i1 9 Practical Problems in VLSI Physical Design 1Steiner Algorithm 1617 1 Comparison I KahngRobins vs BorahOwensIrwin 39 KhangRobins has better wirelength 16 vs 17 but is slower kah nglrobins bora hlowenirwin 1 l1 9 Practical Problems in VLSI Physical Design lSteiner Algorithm 1717 4 xi r Bounded Radius Routing I Why Radius El Longest sourcesink path length among all sinks El Smaller path resistance better performance I Both Radius and Cost El Cost Wirelength El Radius R and Wirelength C are both important for RC delay reduction I Bounded PRIM vs Bounded Radius Cost El Cong A B Kahng G Robins M Sarrafzadeh and C K Wong quotProvably good performancedriven global routingquot TCAD 1992 Routing Practical Problems in VLSI Physical CAD BRBC Algorithm i Radius VS Wirelength als Fig 1 An example where the cost of a shonest path tree right is 9I N l times larger than the cost of a minimum spanning tree Iefr 2 2 2 3 2 3 s 1 a a 3 3 3 2 2 2 2 a b c Fig 2 An example in the Manhattan plane of how increasing the value of 6 may result in decreased tree cost but increased radius rT a s 0 costT 17 nT 6 b e l costT 15 nT 10 c e 00 costT l4 rT 14 Routing Practical Problems in VLSI Physical CAD BRBC Algorithm Bounded PRIM Algorithm I Variation of PRIM S MST algorithm T ram MN while V lt INI Select two terminals 7 E V and y E N V minimizing dis x y if distTsx dis x y S 1 c R then 3 139 else nd the rst terminal 22 along the path in T from a to s such that distTsx distz39y g R V 2 V U 17 E39 Equot U 3330 Fig 4 Algorithm BPRIM computing a bounded radius spanning tree T for a given set of terminals N with source 5 E N and radius R using parameter 6 Routing Practical Problems in VLSI Physical CAD BRBC Algorithm I Bounded PRIM Algorithm I Comparison 6 0 05 in nity D Radius bound value increase D Wirelength decreases Routing Practical Problems in VLSI Physical CAD BRBC Algorithm Bounded Radius Spanning Tree I Shallow Light Algorithm compute M STG and SPTG Q MSTG L depth rst tour of MSTG S 0 fori1t0L1 S S costLLil ifS Z 639 distds L341 then Q Q U minpathdsy 13m S 0 T shortest path tree of Q Fig 9 Computing 21 bounded radius spanning tree Tfor G V E with sources 6 Vand radius R using parameter e T will have radius at most 1 e R and cost at most 1 25 cosIMSTG Routing Practical Problems in VLSI Physical CAD BRBC Algorithm I Bounded Radius Spanning Tree IB 4quot 439 I 39 TreeF Augmentation of Q added edges shown in dotted lines D Final BRMST is SPT on Q Romng Prncu39cnl Problems in V39LSI Physical CAD BRBC Algorithm Why BRBC Works Theorem 2 For any weighted graph G and parameter e the routing tree T constructed by our algorithm has ra dius rT s l e R Proof For any 11 e V let vi be the last node be fore v on the MST traversal L for which we added minpathds 12 l to Q in the algorithm as shown in Fig 10 By the construction of the algorithm we know that distLv l o s 5 R We then have diStTS U S diStTS Ufil diStLUil U A distGsv639R RE Rl639R D IA Routing Practical Problems in VLSI Physical CAD BRBC Algorithm Why BRBC Works minpathGsvi L MST tour d1 LVi1 V gt Vii V Fig 10 Depiction of the bounded radius construction Routing Practical Problems in VLSI Physical CAD BRBC Algorithm l Bounded Radius Routing I Perform bounded PRIM algorithm 39 Unders0s05andsoo 39 Compare radius and wirelength 39 Radius 12 for this not d quot a a e i O i O a i o 9 SI 1 a 1 3 f xi 3 A 1 41 1 Practical Problems in VLSI Physical Design Bounded Radius Routing 116 lBPRIM Under a 0 I Example 39 Edges connecting to nearest neighbors cd and ce 39 We choose ad based on lexicographical order 39 stod path length along T 125 gt 12 radius bound 39 First appropriate edge found sd c 9 gr 1 9 Practical Problems in VLSI Physical Design Bounded Radius Routing 216 lBPRIM Under 8 0 cont I Radius bound 12 edges connecting to stoy path length rst feasible nearest neighbors along T appredge dist s 1 dismay y appropriate min dismay y chosen of chosen edge edge 9711 s a 0 5 a b at b 5 4 b7 C b c 9 3 C l e l 12 5 I 36 6111 C 6 12 5 16 y 69 114 Sig 112 0 IL 39f g f d IL 11 5 h elf 94 62f 115 inf ties broken should be 5 12 C39 lexicographically otherwise 1 appropriate used 1 1 9 Practical Problems in VLSI Physical Design Bounded Radius Routing 316 lBPRIM Under 8 0 cont 4quot k Practical Problems in VLSI Physical Design Bounded Radius Routing 416 55b lBPRIM Under 8 0 cont 55b 1 1 fdi b V h c a e l g 3 9 S f i e Z fd 7 J9 7 lh Q p 39 quot E C I oquot a e quot vg39 f g Practical Problems in VLSI Physical Design Bounded Radius Routing 516 lBPRIM Under a 05 I Radius bound 18 edges connecting to stoy path length rst feasible nearest neighbors along T appredge disrT Su 1 dis r y appropriate min thisHI y Chosen of chosen edge edge u 1 O 5 7 LL 1 1 b 5 l 1 I a 9 3 C I F e rad 12 5 35 111 r e 12 5 9K MI 174 39 111Phghefgf dI 175 rah 639 g 5 17 5 ties broken should be S 18 should be S 12 1 lexicographically otherwise 1 appropriate used 1 1 9 Practical Problems in VLSI Physical Design Bounded Radius Routing 616 lBPRIM Under 8 05 cont Bounded Radius Routing 716 lBPRIM Under 8 05 cont 4quot k Practical Problems in VLSI Physical Design Bounded Radius Routing 816 55b lBPRIM Under a oo RadiquoundZOO dI I regularPRIM b j h b h Bounded Radius Routing 916 1 C N9 Practical Problems in VLSI Physical Design lBPRIM Under 8 00 cont quot3 i h 3quot S 4quot k Practical Problems in VLSI Physical Design Bounded Radius Routing 1016 55b 1 Comparison I As the bound increases 12 gt 18 gt oo 39 Radius value increases 12 gt17 gt 22 39 Wirelength decreases 56 gt 49 gt 36 1 a r 1 Practical Problems in VLSI Physical Design Bounded Radius Routing 1116 l Bounded Radius Bounded Cost I Perform BRBC under 8 05 39 8 de nes both radius and wirelength bound 39 Perform DFS on rootedMST 39 Node ordering L s a b c ef e g e c d h d c b a s 39 We start with Q MST 9x4 f i s gr 5 DJ l 9 Practical Problems in VLSI Physical Design Bounded Radius Routing 1216 l MST Augmentation I Example Visit a Via sa 39 Running total of the length of Visited edges S 5 39 Rectilinear distance between source and a distsa 5 39 We see that 8 distsa 05 5 lt S 39 Thus we reset S and add sa to Q note sa is already in Q N lt e x f i s gr l 9 Practical Problems in VLSI Physical Design Bounded Radius Routing 1316 l MST Augmentation cont edge Li 6 dis s Li S reset S s u L 05 5 5 yes 1 b b 0 5 4 5quot Visit nodes based on L b c C 05 j 3 yes 0 e 05 7 5 yes f f 05 6 5 yes f e 05 7 5 yes a g g 05 9 4 no 9 639 e 05 7 8 yes sac C 056 5 yes 0 I d 05 11 5 no cl 1 h 05 12 10 yes h l l 05 11 5 no 10 L 05 6 10 yes 11 b 05 7 395 no I a a 05 5 7 yes quot 5 S 0395 39 O 5 yes dotted edges are added k Practical Problems in VLSI Physical Design Bounded Radius Routing 1416 55b lLast Step SPT Computation I Compute rooted shortest path tree on augmented Q 9 Practical Problems in VLSI Physical Design Bounded Radius Routing 1516 4 1 14 i1 l BPRIM vs BRBC I Under the same 8 05 39 BPRIM radius 18 wirelength 49 39 BRBC radius 12 wirelength 52 39 BRBC signi cantly shorter radius at slight wirelength increase l 1 9 Practical Problems in VLSI Physical Design Bounded Radius Routing 1616 g 9 gr I Elmore Routing Algorithms I Elmore Delay El W C Elmore The transient response of damped linear network with particular regard to wideband ampli ers Journal of Applied Physics 1948 El Computes delay in RC tree C u mm mono Z 0 e Epathn0n El Delay is quadratic function of wirelength El Reasonably accurate easy to calculate Routing Practical Problems in VLSI Physical CAD ERTSERT Algorithms Delay Calculation Need to consider a more accurate delay model for general circuits RC Delay The delay caused by wires is due to their capacitance and resistance R 0c mi C oc wl Lumped circuit approximations for distributed RC lines 7239 model T mode1 L model R RZ R2 R TAMT W A WVT 02 02 C C h l 1 a lumped wire quotITmodel T model L model C C C RC delay A to B DAB Rl1C2 C3 Bro C DEC RZ72 BloD DBD 2R373 R2 C C2 2 2 R1 2 w I IIH M 9 gt M DU Ml IIH IIH Elmore Delay Calculation I Example 1 2 rd C nan 05609711 Ca T mab 05Ca7b 2b quotrd CSya Za Cayb 21 7157a 05Csva Za Cab 21 aw 050mi 25 Routing Practical Problems in VLSI Physical CAD ERTSERT Algorithms I Elmore Delay Calculation I Example 2 a What is the Elmode delay equation for node d Ranting Przc czleblems in VLSI Physical CAD ERTSERT AlgrriLhms ERT Algorithm I Similar to Prim s MST El Build Elmoreminimal MST El Maximum Elmore delay among all sinks is minimized I Tackle delay directly instead of targeting radius E T Algorithm Input signal net N with source no G N Output routing tree T over N 139 T V 2 Hindi 2 While IVI lt INI do 3 Find u e V and u E V which minimize the maximum Elmore delay from no to any sink in the tree V U 1 EU uv 4 V V U 1 5 E EUuv 6 Output resulting spanning tree T V E Routing Practical Problems in VLSI Physical CAD ERTSERT Algorithms SERT Algorithm I Extended ERT for Steiner Routing We apply the ERT approach to Steiner routing by allowing the new pin to connect to an edge or the source of the existing tree possibly inducing a Steiner node on this edge at the point that is closest to the new pin d c d C scrd pair uses Steiner node 1 Routing Practical Problems in VLSI Physical CAD ERTSERT Algorithms How to Grow Tree I ERT Algorithm 1 Visit each node in the tree and find its closest neighboring node not in the tree 1 Choose the node pair that minimizes max Elmore delay I SERT Algorithm 1 Visit each node not in the tree a Each node can connect to each edge in the tree or the source a Nodeedge connection may use a Steiner point 1 Find the nodeedge pair that minimizes max Elmore delay Routing Practical Problems in VLSI Physical CAD ERTSERT Algorithms l Elmore Routing Tree ERT Algorithm I Perform ERT algorithm under 65mm technology 39 Unit length resistance r 04 Qym 39 Unit length capacitance c 02 f Fym 39 Driver output resistance rd 250 Q 39 Sink input capacitance r 50 f F 1 9 Practical Problems in VLSI Physical Design ERTSERT Algorithln 116 1 g g 1 Adding First Edge I Simply add the nearest neighbor to the source 39 Add sa Practical Problems in VLSI Physical Design ERTSERT Algorithln 216 4 1 1 4 i1 1 Adding Second Edge I Rule each node in T can connect to its nearest neighbor 39 Two edges to consider ab sc 39 Elmore delay calculations shown on next slides 9 Practical Problems in VLSI Physical Design ERTSERT Algorithln 316 4 1 1 4 J Elmore Delay Calculation I Case 1 edge ab 2 rd CS rsa05c5a Ca T W b O5Ca7b 21 rd Csa Za from 3b T1541 O395Csa Za Cab 2b mm 05Cab 312 025600 50 1200 50 12300 50 1200 50 24600 50 3955ps 1 q l 14 Practical Problems in VLSI Physical Desigi ERTSERT Algorithm 416 l Elmore Delay Calculation cont I Case 2 edge sc 39 It is easy to see that tc gt ta 39 Elmore delay is tc 2035ps 39 Thus we add sc to minimize maximum Elmore delay l 9 Practical Problems in VLSI Physical Design ERTSERT Algorithln 516 1 Adding Third Edge I Three edges to consider ab sd ad 39 Elmore delay tb 42675ps td 29375ps td 59175ps 39 Add tc 29375ps 9 Practical Problems in VLSI Physical Design ERTSERT Algorithln 616 4 1 1 7 i1 1 Adding Fourth Edge I Four edges to consider ab sb cb 6119 39 In all these cases delay to b is the maximum 39 tb 4630ps 4720ps 10720ps 8310ps respectively 39 Add ab a C Ll tb 4630ps b a 1 1 l1 9 Practical Problems in VLSI Physical Design ERTSERT Algorithln 716 1 Final ERT Result I Maximum Elmore delay is 11 4630ps 39 No Steiner node used 39 Star shaped topology 9 Practical Problems in VLSI Physical Design ERTSERT Algorithln 816 4 1 14 i1 l Steiner Elmore Routing Tree SERT I Perform SERT algorithm under 12um technology 39 Unit length resistance r 0073 Qym 39 Unit length capacitance c 0083 f Fym 39 Driver output resistance rd 212 Q 39 Sink input capacitance r 71 f F 1 9 Practical Problems in VLSI Physical Design ERTSERT Algorithln 916 4 g g 1 First Iteration I Simply add the nearest neighbor to the source 39 Add sa Practical Problems in VLSI Physical Design ERTSERT Algorithm 1016 4 1 5145 1 Second Iteration I Rule each node not in T can connect to each edge in T using a Steiner point or directly to source 39 6 edges to consider ab sb sc 39 Nodep is a Steiner node Practical Problems in VLSI Physical Design ERTSERT Algorithm 1116 4 1 1 4 i1 1 Second Iteration cont I Case 6 results in minimum delay 10 2686ps 39 Add tc 2686ps dl d e ifquot Practical Problems in VLSI Physical Design ERTSERT Algorithm 1216 4 1 1 4quot 1 Third Iteration I 7 edges to consider d C C td 4133ps Cl i e A so A Practical Problems in VLSI Physical Design ERTSERT Algorithm 1316 1 Fourth Iteration I 6 edges to consider tb 6063ps td 5573ps Practical Problems in VLSI Physical Design 1 g 51 451 ERTSERT Algorithm 1416 1 Final SERT Result I Maximum Elmore delay is tb 6063ps 39 Two Steiner nodes used 9 Practical Problems in VLSI Physical Design ERTSERT Algorithm 1516 4 1 1 7 i1 Clustering ECE6133 Physical Design Automation of VLSI Systems Prof Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology 1 Circuit Clustering I Grouping cells to form bigger cells 39 Why do we do this Cluster A with its Update the closest neighbor circuit netlist 1 9 Practical Problems in VLSI Physical Design a 9 gr 1 Circuit Clustering I Motivation 39 Reduce the size of at netlists 39 Identify natural circuit hierarchy I Objectives 39 Maximize the connectivity of each cluster 39 Minimize the size delay and density of clustered circuits 1 l 9 Practical Problems in VLSI Physical Design 1 a r l Clustering vs Partitioning I Differences and similarities 39 Divide cells into groups under area constraint A 39 Clustering if A is small partitioning otherwise 39 Clustering preprocess of partitioning I Clustering Metrics 39 Absorption Density Rent Parameter Ratio Cut Closeness Connectivity etc I Partitioning Metrics 39 Cutsize and delay 11 ll 1 Practical Problems in VLSI Physical Design g j l 1 Density Metric I Desire high density in each cluster 39 Applied to a single cluster DENC1 weVE2C1SV Sm Sm sv3 1 I 9 Practical Problems in VLSI Physical Design 1 g g 1 Previous Works nish Cutsizeoriented 39 K IconnectiVity algorithms GarberPromelSteger 1990 39 Randomwalk based algorithm Cong et al 1991 HagenKahng 1992 39 MulticommodityFlow based algorithm Yeh ChengLin 1992 39 Clique based algorithm Bui 1989 CongSmith 1993 39 Multilevel clustering KarypisKumar DAC97 CongLim ASPDAC OO Delayoriented 39 For combinational circuits LawlerLevittTurner 1969 Murgai Brayton Sanjiovanni 1991 Rajaraman Wong 1995 CongDing 1992 39 For sequential circuits Pan et al TCAD 99 Cong et al DAC 99 39 Signal ow based clustering CongDing DAC 93 Cong et al ICCAD 97 l 9 Practical Problems in VLSI Physical Design lLawler s Labeling Algorithm I Assumption 39 Cluster size S K intracluster delay 0 intercluster delay 1 Objective Find a clustering of minimum delay Phase 1 Label all nodes in topological order 39 For each PI node v Lv 0 39 For each nonPI node v 39 p maximum label of predecessors of v 39 Xp set of predecessors of v with label p 39 if lXp lt KthenLv p else Lv pl I Phase 2 Form clusters 39 Start from P0 to generate necessary clusters 39 Nodes with the same label form a cluster I 1 ll 9 Practical Problems in VLSI Physical Design l Raj aramanWong Algorithm bikb First optimal algorithm that solves delayoriented clustering problem under general delay model Given 39 DAG cluster size limit Find 39 Optimal clustering that minimizes maximum PIPO path delay Delay model 39 Node delay d intracluster delay 0 intercluster delay D 39 Better than unit delay model used in Lawler Node duplication is allowed ll 9 Practical Problems in VLSI Physical Design l Raj aramanWong Algorithm I Initialization phase 39 Compute n X 11 matrix Axv allpair maxdelay value from output of x to output of v using node delay only 39 Set labelPI delayPI labelnonPI 0 I Labeling Phase 39 Compute label based on topological order of the nodes 39 Label denotes max delay from any P1 to the node 39 Clustering info is also computed during labeling I Clustering Phase 39 Actual grouping and duplication occur 39 Done based on reserve topological order 1 i 9 Practical Problems in VLSI Physical Design 1 a l l Labeling for Node V N Lu 4 U1 0 We compute the subgraph rooted at t39 denoted G that includes all the predecessors of v We compute Mgr for each node T E G71r where 11 1 l Ar r 139 denotes the current label for t39 and AM u is an entry of the A matrix mentioned above We sort all nodes in GUz in decreasing order of their lU values and put them into a set S We remove a node from S one by one in the sorted order and add it to the cluster for 7 denoted clusterw until the size constraint is violate We compute two values ll and lg If l Ilst i rlt contains any PI nodes the maximum l value among these PI nodes becomes 1 If S is not empty after lling up quotlll f6 7 l the maximum 11 D among the nodes remaining in 5 becomes 12 where D is the intercluster delay The new label for L39 is the maximum between 1 and 2 Practical Problems in VLSI Physical Design 1 What is going on 1 By the de nition of iv EU u is a lower bound on the delay along any path from a primary input to 1 that passes through u The greater the value of El n the more the need to include it in clusterv Hence we try to cluster 1 with as many high Evvalued nodes as the capacity constraint permits After building clusterv v is labeled by considering all possible paths from an input to the output of 1 All of the paths can be divided into two categon39es 1 Paths that lie entirely in clusteT CLr Such paths start from a primary input that is in clustern and never exit the cluster The maximum delay along any such path is 11 max 1u u E clusterv 0 PI 1 2 Paths that cross the boundary of dusterv Among these paths the maximum delay is 2 342 r max 5100 D u E Gvcluste39rv 2 171 9 Practical Problems in VLSI Physical Design l Clustering Phase We rst put all PO nodes in a set L We then remove a node from L and form its cluster N Given a node 1 we form a cluster by grouping all nodes in 39711st 7 139 which was computed during the labeling phase L We then compute inpu l the set ofinput nodes of Instr71v 4 We remove a node 439 from 13972putt39 one by one and add it to L if we have not formed the cluster for r yet Ln We repeat the entire process until L becomes empty l k Practical Problems in VLSI Physical Design 55b l Rajaraman Wong Algorithm I Perform RW Clustering on the following di graph 39 Inter cluster delay 3 node delay 1 39 Size limit 4 39 Topological order T defghijkl not unique 1 9 gr 1 9 Practical Problems in VLSI Physical Design Rajaraman Wong Algorithln 18 1 Max Delay Matrix I All pair delay matrix Ax y 39 Max delay from output of the PIS to output of destination fgh l l l 000 020 3 1 001020 0 2 0 2 l l 1 000000000 0000000000 0 000000000 000000000000 L b000 100000000 1 l Practical Problems in VLSI Physical Design Rajaraman Wong Algorithln 28 J Label and Clustering Computation sh kmmMeK amMMwwA First Gd 2 1 b d By de nition 1a b 1 Thus 112 112 Ma 1a Am mmuwaw ampamp Then we have S a b recall that S contains Gdd with their d values sorted in a decreasing order Since both a and b can be clustered together with d while not Violating the size constraint of 4 we form clusterd 1 I d Since both a and b are PI nodes we see that 1 maXlda ldb 2 Since 5 is empty after clusten ng 12 remains zero Thus d maxl1l2 2 v 7 l quot 74 Practical Problems in VLSI Physical Design RajaramanWong Algorithm 38 JLabelConunua on I Compute 139 and clusterz39 node 139 Ci 2 a7bc d ef7gi see Figure 13 Thus Ma aAai 12 Mb 11 Abi 134 2a 21c Aci 1 4 39 lidldAdi21 5 momoAmw224 umm Mf lfAfyi 21 39 liglgA97i3 l 1 Ilti7 S gecbadf and we form cluster igecl Note that c is PI so 1 Me 4 Since S bi ad7 f 75 0 after Clustering we have 2 11015 D 119 D 4 3 7 Cl recall that mS is the node in S with the maximum value of li value Thus1z maXl1l2 7 Cl39jLP l lvj Practical Problems in VLSI Physical Design RajaramanWong Algorithm 48 l Labeling Summary I Labeling phase generates the following information 39 Max label max delay 8 node label clustering a l a b l b C l d 2 1 b7 1 e 2 1 C e f 2 at f g 3 b c e g h 2 a h i 7 6 e g i j 7 b3 67 9 J k 8 9 2313 k z 8 a g j I 9 Practical Problems in VLSI Physical Design Rajaraman Wong Algorithln 58 4 9 5T1 i1 l Clustering Phase I Initially L POs k I remove k from L and add clk to S clk According to Table 11 we see that clk gij Then Iclk f7deh as illustrated in Figure 14 Since S does not contain clusters rooted at f d e and h we have L l U d7 6 h 1 f d7 87 h 2 clushte rk a 4 4 quot l Practical Problems in VLSI Physical Desigi Rajaraman Wong Algorithm 68 l Clustering Summary I Clustering phase generates 8 Clusters 39 8 nodes are duplicated root elements If g Lj 1c 1 gja 1 f CL f d 1 241 e 1 C e h h b b C C 4 1 14 l1 9 Practical Problems in VLSI Physical Design Rajaraman Wong Algorithln 78 1 Final Clustering Result I Path ceg ik has delay 8 max label clusterf 5 clusteik clusterl 9 Practical Problems in VLSI Physical Design Rajaraman Wong Algorithln 88 4 1 1 7 l1 Probing Further I Raj aramanWong Algorithm 39 Wang and Wong 1994 nds set of nodes to be replicated so that cutsize is minimized 39 Vaishnav and Pedram 1995 minimizes power under delay optimal clustering properties 39 Wang and Wong 1997 performed delayoptimal clustering under area andor pin constraint Pan et at 1998 performed delayoptimal clustering with retiming for sequential circuits 39 Cong and Romesis 2001 developed heuristic for twolevel delayoriented clustering problem 1 I 9 Practical Problems in VLSI Physical Design 1 a l Multilevel Paradigm Combination of Bottomup and Topdown Methods From coarsegrain into nergrain optimization Successfully used in partial differential equations image processing combinatorial optimization etc and circuit partitioning DEB DEB coarsening Q Uncoarsenin EB g Initial Partitioning General Framework Step 1 Coarsening Generate hierarchical representation of the netlist Step 2 Initial Solution Generation Obtain initial solution for the toplevel clusters Reduced problem size converge fast Step 3 Uncoarsening and Re nement Project solution to the next lowerlevel uncoarsening Perturb solution to improve quality re nement Step 4 Vcycle Additional improvement possible from new clustering Iterate Step 1 with variation Step 3 until no further gain Vcycle Refinement Motivation Postre nement scheme for multilevel methods Different clustering can give additional improvement Restricted Coarsening Require initial partitioning Do not merge clusters in different partition Maintain cutline cutsize degradation is not possible Two Strategies Vcycle vs vcycle Vcycle start from the bottomlevel vcycle start from some middlelevel Tradeoff between quality vs runtime Application in Partitioning Multilevel Partitioning Coarsening engine bottomup Unrestricted and restricted coarsening 0 Any bottomup clustering algorithm can be used Cutsize oriented MHEC ESC vs delay oriented PRIME Initial partitioning engine Movebased methods are commonly used Re nement engine topdown Movebased methods are commonly used Cutsize oriented FM LR vs delay oriented xLR Stateof theart Algorithms hMetis DAC97 and hMetisKway DAC99 hMetis Algorithm Best Bipartitioning Algorithm DAC97 Contribution 3 new coarsening schemes for hypergraphs Original Graph Edge Coarsening Edge Coarsening heavyedge maximal matching 1 Visit vertices randomly 2 Compute edgeweights 1n1 for all unmatched neighbors 3 Match with an unmatched neighbor via max edgeweight hMetis Algorithm cont Best Bipartitioning Algorithm DAC97 Contribution 3 new coarsening schemes for hypergraphs O 80 O Hyperedge Coarsening Modified Hyperedge Coarsening Hyperedge Coarsening independent hyperedge merging 1 Sort hyperedges in nondecreasing order of their size 2 Pick an hyperedge with no merged vertices and merge Modified Hyperedge Coarsening Hyeredge Coarsening post process 1 Perform Hyperedge Coarsening 2 Pick a nonmerged hyperedge and merge its nonmerged vertices hMetisKway Algorithm Multiway Partitioning Algorithm DAC99 New coarsening First Choice variant of Edge Coarsening Can match with either unmatched 0r matched neighbors 6 Original Graph First Choice Greedy re nement Onthe y gain computation N0 bucket not necessarily the maxgain cell moves Save time and space requirements hMetis Results Bipartitioning 0n ISPD98 Benchmark Suite Scaled Cutsize FM LR LRESC hMetis hMetisKway Results Multiway Partitioning 0n ISPD98 Benchmark Suite I hMetisKway l KPMLR LRESCPM Scaled Cutsize Zway 8way 16way 32way lMultilevel Coarsening Algorithm I Perform Edge Coarsening EC 39 Visit nodes and break ties in alphabetical order 39 Explicit cliquebased graph model is not necessary l 1 9 Practical Problems in VLSI Physical Design Multilevel Coarsening 111 1 9 gr JEdge Coarsening a Visit a Note that a is contained in m only So neighbo a 07 e The weight of 10 1n1 1 05 The weight of 16 1 m l 05 Thus we break the lie based on alphabetical order So a merges with c We form 01 1 c and mark a and c b visit Z Note that bis contained in n2 only So neighborb 2 07d Since G is already marked 19 merges with d We form 2 2 197 d and mark I and d c since 0 and d are marked we skip them cluster nodes Cl 1 C C2 2 d C13 8 9 1 14 f h a l rk Practical Problems in VLSI Physical Desigi Multilevel Coarsening 211 1 Edge Coarsening cont d Visit 6 the unmarked neighbors of e are 9 and f We see that we g 1 and wef 05 So 6 merges with 9 We form 03 69 and mark 6 and g e Visit f Node f is contained in n3 n4 and n6 So neighbor 0 d e g h But the only unmarked neighbor is h So f merges with h We form 04 f h and mark f and h f since 9 and h are marked we skip them cluster nodes C1 17 1 C2 1 d 3 g C74 h 5b 7 l39 L Practical Problems in VLSI Physical Desigi Multilevel Coarsenjng 311 l Obtaining Clusteredlevel Netlist I of nodeshyperedges reduced 4 nodes 5 hyperedges net gatelevel clusterlevel nal cluster nodes 711 CL C 6 6391 CV3 CV1 6 3 1 17 c 712 C d 12 C39ll 63 6392 C39Q 773 Q 67 6 6393 6394 C11 CV3 6393 en 714 dl f C2 C4 7 C4 C4 f 12 H5 919 C3 C3 w no ffgsl39l 0490304 C3 C4 9 Practical Problems in VLSI Physical Design Multilevel Coarsening 411 4 9 5T1 i1 lHyperedge Coarsening I Initial setup 39 Sort hyperedges in increasing size n4 n5 n1 n2 n3 n6 39 Unmark all nodes Multilevel Coarsening 511 4 9 5T1 i1 9 Practical Problems in VLSI Physical Design lHyperedge Coarsening a Visit 714 l f since I and f are not marked yet we form 71 dif and markd and f b visit n5 9 9 since e and g are not marked yet we form 02 6 g and mark 6 and g C Visit 711 1 c e since 6 is already marked we skip m 14 Practical Problems in VLSI Physical Design 4 g r cluster nodes a g a b C r h Multilevel Coarsening 611 lHyperedge Coarsening d Visit ng I c 1 since dis already marked we skip 722 6 visit 71 0 e f since 6 and f are already marked we skip 713 f visit m f g h since f and g are already marked we skip 726 cluster nodes Cl dv 32 6quot 9 C3 Ci b 05 CG 14 Practical Problems in VLSI Physical Design Multilevel Coarsening 711 4 j r l Obtaining Clusteredlevel Netlist I of nodeshyperedges reduced 6 nodes 4 hyperedges net gate level cluster level nal cluster nodes In Q7 C7 6 C737 CV5 612 13 1395 6392 CV1 d 712 C 614 15611 04 2156391 12 g T23 3 6 625 12 111 153 1392 31 1393 714 1 611 11 0 C14 quot5 6 g C2 C2 0 05 C n6 f g 11 Ch 12625 01 C2 C6 12 9 Practical Problems in VLSI Physical Design Multilevel Coarsening 811 4 9 5T1 i1 lModi ed Hyperedge Coarsening I Revisit skipped nets during hyperedge coarsening 39 We skipped n1 n2 n3 n6 39 Coarsen uncoarsened nodes in each net l 1 9 Practical Problems in VLSI Physical Design Multilevel Coarsening 911 1 9 gr 1 Modi ed Hyperedge Coarsening a v Visit n1 2 an 0 6 since e is already marked during HEC we group the remaining unmarked nodes a and c We form 03 a c and mark a and c b Visit n2 2 I c d since dis marked during HEC and c during MHEC as above we form C4 b and mark I c Visit 713 1 67 f all nodes are already marked so we skip 723 d visit me 2 f7 97 h since f and g are already marked we form 05 h and mark h cluster nodes 61 if 32 87 9 C3 as C C4 1 6395 4 1 q a g Practical Problems in VLSI Physical Desigi Multilevel Coarsenjng 1011 Placement ECE6133 Physical Design Automation of VLSI Systems Prof Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology Placement o The process of arranging the circuit components on a layout surface o Inputs A set of fixed modules a netlist 0 Goal Find the best position for each module on the chip according to appropriate cost functions Considerations mutabilitychannel density wirelength cut size performance thermal issues IO pads Density 2 2 tracks required mung MEN Shorter wirelength 3 tracks required wirelength 10 wirelength 12 Estimation of Wirelength Semi perimeter method Half the perimeter of the bounding rectangle that encloses all the pins of the net to be connected Most widely used approximation Complete graph Since edges in a complete graph 12 is gx of tree edges n 1 wirelength w gzayjmnetdist Minimum chain Start from one vertex and connect to the closest one and then to the next closest etc Source to sink connection Connect one pin to all other pins of the net Not accurate for uncongested chips Steiner tree approximation Computationally expensive Minimum spanning tree LLLLLLLLE Lkakkkkk oo N FL xi Liiii amp LLLLL DL LLLLLL Lx LLLLLLLLM LJJJJJAAJJ LLCLCLLJT CCCCECEE QLLLLL WnH LHLH 00 as L LLLLM g Wrmrrr 1 mxx LLLLLK LL LLLLLL xs LLLLLL LL t H u Pains H LLQL iii amp L iii iii LJJJAJAJiJg JJJAJJJJJ Etii iaitim L LLLLLL L4 LLLL LLLJ xxwxxxx E g EELxLLLw iLmLLL I x 3 xxxx 45 EELCBELU rm W LIJJJJ LCCCCW LLLLLLCILJE LLLLLLLLL LJJJJJJJJJ 3 LJJJJJJJJJ I3 Steiner tree len 12 Spanning tree len sourceitoisink len I 7 Placement Methods Constructive methods Cluster growth algorithm Forcedirected method Algorithm by Goto Mincut based method Iterative improvement methods Pairwise exchange Simulated annealing Timberwolf Genetic algorithm Analytical methods Gordian GordianL MinCut Placement o Breuer A class of min cut placement algorithms DAC 77 o Quadrature suitable for circuits with high density in the center a Bisection good for standard cell placement o SliceBisection good for cells with high interconnection on the periphery 3a 2a 3b 1 30 217 3d 3a IQM39KWNN 661561617 4 605b6d 10519611017810 10d n2 quot2 112 k ci2nk quadrature bisection slicebisection Algorithm for MinCut Placement Algorithm MinCutPlacementN n C N the layout surface n of cells to be placed no of cells in a slot C the connectivity matrix 1 begin 2 if ngno then PlaceCellsNnC39 3 else 4 N17N2 lt CutSurfaceN n1C391 n2C392 lt PartitionnC39 Call MinCutPlacementN1n17C391 Call MinCutPlacement N27 n2 C2 end OOIO O39 Quadrature Placement Example o Apply K L heuristic to partition Quadrature Placement Cost Cl 4 12L 02R 2 etc Z i7 amp121114 L169 QILIiI6 CI 01 4a C2 02 C417 MinCut Placement with Terminal Propagation o Dunlop amp Kernighan A procedure for placement of standard cell VLSI circuits o Drawback of the original min cut placement IEEE TCAD Jan 1985 positions of terminal pins that enter a region What happens if we swap 1 3 6 9 and 2 4 5 7 in the previous example L1 L2 S 39 i prefer to have them in R1 S L1 L2 Does not consider the R2 Terminal Propagation 0 We should use the fact that s is in L1 center dummy cell w L2 P Lower cost R2 L1 L2 x higher cost P will stay in R1 for the rest ofpartitioning R 0 When not to use p to bias partitioning Net 3 has cells in many groups minimum rectilinear Steiner tree P P h h3 h M Don t use p to bias the solution in either direction Use p p2 p3 G Terminal Propagation Example 0 Partitioning must be done breadth first not depth first W i C C C C a a u P1 RIM L R L C I o a 9 MA da R2 g unbiased partition 0 with terminal propagation without terminal propagation Creating Rows Terminal propagation reduce overall area by 30 Creating rows Choose 1 and 3 preferably to balance row to balance row length during rearrangement ROW 1 ROW 2 cells in C1 gt rowl Row 3 cells 1n C3 rowl a Row4 cells in C2 CZ B Rowl Row2 B1 Creating Rows Example Partitioning of circuit into 32 groups Each group is either assigned to a single row or divided into 2 rows a fourrow standard cell design Experimental Results CMOS Chip with 453 nets and 412 cells Manual solution track density147 feedthroughs184 Automated solution without terminal propagation td313 ft591 td reduced to 235 by iterative interchanges with terminal propagation td186 ft182 td reduced to 152 by iterative interchanges Iterative Interchange Re nement is helpful The program is in production use as part of an automatic placement system in ATampT Bell Lab Solutions within 10 of the best hand layout Remarks on Mincut Placement Also implemented FM partitioning method Much faster but solutions appeared to be not as good as K L Use Simulated Annealing to do partitioning Much slower If restricted to a reasonable CPU time solutions are of similar quality of those by FM method Easy to implement Seeking an elegant way to force some cells to be in particular positions Investigate other algorithms for terminal propagation Terminal propagation is the bottleneck of CPU time lMincut Placement 1 a r l I 9 Practical Problems in VLSI Physical Design I Perform quadrature mincut onto 4 X 4 grid 39 Start with vertical cut rst 771 61f 712 1161 713 hf 1 774 2 691 715 11 h m 17j 777 2 r 718 ifJ39s 7L9 LOPi 7110 I 1110 7111 17711 7712 17177 undirected graph model w kclique weighting thin edges weight 05 thick edges weight 1 7713 2110 Mincut Placement 112 lCut 1 and 2 I First out has mincutsize of 3 not unique 39 Both cuts 1 and 2 divide the entire chip e 6 6 0 a cut 1 b cut 2 1stIevel quadrants formed 13le 1 9 Practical Problems in VLSI Physical Design Mincut Placement 212 lCut 3 and 4 I Each cut minimizes cutsize 39 Helps reduce overall wirelength C out 3 d cut 4 quotI 9 Practical Problems in VLSI Physical Design Mincut Placement 312 553 lCut 5 and 6 I 16 partitions generated by 6 cuts 39 HPBB wirelength 27 M e cut 5 f cut 6 2ndIevel quadrants formed quot1 9 Practical Problems in VLSI Physical Design Mincut Placement 412 553 l Recursive Bisection I Start With vertical cut 39 Perform terminal propagation with middle third Window a CUt 1 b cut 2 quotI 9 Practical Problems in VLSI Physical Design Mincut Placement 512 55 1 Cut 3 Terminal Propagation I Two terminals are propagated and are pulling nodes 39 Node k and 0 connect to n and j 191 propagated outside window 39 Node g connect to j f and b 192 propagated outside window 39 Terminal 1 pulls kog to top partition and p2 pulls g to bottom 1 Cut 4 Terminal Propagation I One terminal propagated 39 Node n and j connect to okg p1 propagated 39 Node 139 and j connect to e a no propagation inside window 39 Terminal 1 pulls n and j to right partition 1 Cut 5 Terminal Propagation I Three terminals propagated 39 Node 139 propagated to 191 j to 192 and g to 193 39 Terminal 1 pulls e and a to left partition 39 Terminal 2 and p3 pull be to right partition 4 window q a 57 39 1 ll 9 Practical Problems in VLSI Physical Design Mincut Placement 812 1 Cut 6 Terminal Propagation I One terminal propagated 39 Node n and j are propagated to 191 39 Terminal 1 pulls 0 and k to left partition window 4 Mincut Placement 912 1 Cut 7 Terminal Propagation I Three terminals propagated 39 Node j b propagated to 191 ok to 192 and hp to 193 39 Terminal 1 and p2 pull g and l to left partition 39 Terminal 3 pull 1 and d to right partition q a 57 39 1 ll 9 Practical Problems in VLSI Physical Design Mincut Placement 1012 lCutsto 15 I 16 partitions generated by 15 cuts 39 HPBB wirelength 23 l 1 9 Practical Problems in VLSI Physical Design Mincut Placement 1112 1 9 gr 1 Comparison I Quadrature vs recursive bisection terminal propagation 39 Number of cuts 6 vs 15 39 Wirelength 27 vs 23 l quadrature bisection 1 7 Practical Problems in VLSI Physical Desigi Mincut Placement 1212 Analytical Placement Gordian package GORDIAN Gordian VLSI Placement by Quadratic Programming and slicing Optimization J M Kleinhans GSigl FM Johannes KJ Antreich IEEE TCAD 1991 GORDIANL Analytical Placement A Linear or a Quadratic Objective Function G Sigl K Doll FM Johannes DAC91 Gordian A Quadratic Placement Approach Global optimization solves a sequence of quadratic programming problems Partitioning enforces the nonoverlap constraints Overview of Gordian Package Procedure Gordian l1 globaloptimizel while there exists Mlgtk for each r e Rl partitionr r r l setupconstraintsD globaloptimizem repartitionD nalplacementm endprocedure Problem Definition connection to y other modules net node v n n 39n 39n 39 pin v xuv yuv am bvu offset from center of u xv y v xu Squared wire length of net v 2 2 Lv xv yuv yv ueMv xuvxuavu9yuvyubvu Cost Function Minimize the following Zvav veN xyXTCXdeYTCYdyTY xXTCXdTX Constraints The center of gravity constraints At level 1 chip is divided into q S 2 regions For region p the center coordinates up vp M P set of modules in region p Matrix from for all regions constraint Z Fuxu u p 2F ueMp ueMp 12 Zn if i 6 MP AlX ul am ieMp 0 otherwise Problem Formulation up vp A B C D E F G quot395 39quot F z B 7 0 0 MOM 39 s 39 39 Linearly constrained Quadratic Programming problem LQP minCDx XTCX dTX such that A X u xeR Solution Method Algebraic Manipulation Aqu D 391qu qgtltmq X d X d is dependent variable DB u where o o o X l X 1 1s 1ndependent varlable X D39lBXl D39lu DquotB 1 1 X I Xi 0 ZXx0 Unconstrained Quadratic Programming problem Solved by ConjugateGradient method UQP min Pxl szTCZX CTXl where CT CX0 d xieRmq Partitioning Recursive partitioning is needed to resolve module overlap in global placement global placement problem will be solved again with two additional centerofgravity constraints CPO Mp MpMp 40 x S x u39e Mp and uquote Mp 30 a ZFuZFuz05 20 u39eMP ueMP 10 cut value C p a 2 WV 0 VENC I I 00 025 05 075 10 Repartitioning Module exchange after each cut to improve cut size terminal propagation using global placement positions Repartitioning to undo the mistake made at the previous level Procedure repartitionl if overlap exists for each reRl 1 mergeregionsr r r partitionr r r setupconstraintsD globaloptimizel endif Summary of Gordian module coordinates Global Optimization minimization of Wire length Partitioning of module set and dissection of lacement region position constraints Regions With S k modules module coordinates Final Placement adoption of style dependent constraints Complexity space Om time Oml395 logzm Final placement standard cell macrocell amp SOG Experimental Results Comparison of Results for Standard Cell Blocks Area After Routingmm2 Circuit GORDIAN MinCut Annealing scbl 27 31 26 scb2 58 53 50 scb3 157 256 91 scb4 140 169 132 scb5 106 113 109 scb6 113 127 128 scb7 164 202 198 scb8 517 892 595 scb9 540 986 800 CPUtime scb8 120s 366s 39851s CPUtime scb9 135s 440s 34709s ratio 1 3 300 l GORDIAN Placement I Perform GORDIAN placement 39 Uniform area and net weight area balance factor 05 39 Undirected graph model each edge in kclique gets weight 2k c g g quot1 1 9 Practical Problems in VLSI Physical Design GORDIAN Placement 121 110 Placement I Necessary for GORDIAN to work y L 4 W4 3IW3 2Iw2 I24 1 w1 I23 21 22 0 l V l 2 3 4 X 171 Practical Problems in VLSI Physical Design GORDIAN Placement 221 4 g r l Adj acency Matrix l Shows connections among movable nodes 39 Among nodes a to j O O O l 0 O 0 0 0 0 0 0 0 0 23 O 12 1 93 0 76 0 0 0 0 172770 O 0 12 O 76 0 2323 0 2730 12120 1 0 O 0 930 0 O 023023O O 232323 0 93 1930 0930 0 93 0 0 93900 12 O 232323 0 0120 1 0 00000 000 7612 O O 0 0001 g 41 GORDIAN Placement 321 4 quot 4T a xivj Practical Problems in VLSI Physical Design J Pin Connection Matrix l Shows connections between movable nodes and 10 39 Rows movable nodes columns 10 xed 00 000 000000 00g0000 00g0000 000 000 0000000 00000000 00000100 00000010 CI 00000001 9 Practical Problems in VLSI Physical Design GORDIAN Placement 421 J Degree Matrix l Based on both adjacency and pin connection matrices 39 Sum of entries in the same row node degree CD 3 W o o o o o o o o 0 any 0 o o o o o o o cgt o o o o o o o mlg o o o o o o o o a 33 o o o o o o o 0 mm o o o o o o o 0 Gal 0 o o o o o o c wloo o o o o o o o o 94 o o o o o o o o w o o o o o o o o 045 O o o o O o o o O jab Hi C l 1C1 Practical Problems in VLSI Physical Design GORDIAN Placement 521 l Laplacian Matrix I Degree matrix minus adjacency matrix 0 0 g 0 0 0 0 0 1 0 0 0 0 o 0 o o o 0 g 0 0 0 0 0 1 i 0 O2 0 2 O2 02 O 5 1 s 0 5 F 3 0 o o 0 0 0 0 0 1 0 13 0 0 0000 0 0 0 1 0 0 0 1 41 1 lt1 H w Practical Problems in VLSI Physical Design GORDIAN Placement 621 1 Fixed Pin Vectors I Based on pin connection matrix anle location Each entry 39139 in II denoted In is computed as follows 117 Zlh j IOU j where in denotes the entry of the pin connection matrix and 11 j is the ICOOl dlnale of the corresponding 10 pin j 39 Ydirection is de ned similarly 171 Practical Problems in VLSI Physical Design GORDIAN Placement 721 la a ll J Fixed Pin Vectors cont 2 2 1 1M 30g0000152030404z 1 1 z 1 By examining the remaining 9 movable cells we get d 1 0 1 1 0 3 4 1 g g 0 0 0 0 0 g 0 0 0 0 0 0 4 0 0 g 0 0 0 0 W4 0 O 0 0 O 0 3IW3 0 g 0 0 0 0 0 2Iw2 I24 0 0 0 0 0 0 0 00000000 1m 23 0 0 0 0 0 1 0 0 21 22 CI 00000010 034x j 0 0 0 0 0 0 0 1 7 Avg Practical Problems in VLSI Physical Design GORDIAN Placement 821 JF1xed Pm Vectors cont 2 2 g 1 ly l 1203O450000102 2 g a 4 By examining the remaining 9 movable cells we get dj2 4 6 i 0 0 0 1 2 o 3 g g 0 0 0 0 0 g 0 0 0 0 0 0 4 0 0 g 0 0 0 0 WA 0 O 0 0 O 0 3IW3 0 g 0 0 0 0 0 2Iw2 I24 0 0 0 0 0 0 0 00000000 lIW1 23 0 0 0 0 0 1 0 0 21 22 CI 00000010 034x j 0 0 0 0 0 0 0 1 7 Avg Practical Problems in VLSI Physical Design GORDIAN Placement 921 1 Level 0 QP Formulation I No constraint necessary Minimize 1 Q39lf EITCLI39 dprgl and 1 y 2 51ij 3911 dfy We use MOSEK and obtain the following solution rT 095 092 121 132 132 161 198 213 259 251 yT 127 183 248 261 116 145 184 092 141 203 if practical Problems in VLSI Physical Design GORDIAN Placement 1021 12 xi r 1 Level 0 Placement I Cells with real dimension will overlap W4 W3 I d C I w2 I b g I 24 o f 9 0 l W1 I a e o I 23 h C 21 22 1 3 3 1 1 1 9 Practical Problems in VLSI Physical Design GORDIAN Placement 1121 1 Level 1 Partitioning I Perform level 1 partitioning 39 Obtain center locations for centerof gravity constraints W4 W4 W3 I d W3 I c 39 W2 I b g I Z4 W2 I X X I Z4 I f 9 W1 a 39e h I I 23 W1 23 Z1 22 Z1 Z2 xi m 14 1 9 Practical Problems in VLSI Physical Design GORDIAN Placement 1221 1 Level 1 Constraint We rst sort the nodes based on their 17coordinates I a C 2 L g hj We perform paititioning under a 05 SP1 b1r1 d Sp fly11211 The center location vectors are 1 i 1 lt1gti 2 gt 3 2 We build the matrix A for the centerof gravity constraint at level I 1 J l l J 1 Cl Alt1gt3g3g300009 0 0 0 0 0 I g 1 l 3 14 Practical Problems in VLSI Physical Design GORDIAN Placement 1321 lLevel 1 LQP Formulation We now solve the following Linearly constrained QP LQP to obtain the new placement for the movable nodes 1 r Minimize GI l fC l Ilyfr subject to A r 119 1 Mlnlmlze My SgTC y yTy subject to A y 11 The solutions are as follows rT070 071 117 121 122 217 310 234 356 333 yT134 191 266 276 130 183 245 132 191 249 Practical Problems in VLSI Physical Design GORDIAN Placement 1421 1 11 51 171 1 Level 1 Placement W4 wal d c0 9 wzl bgtlt X 9 I24 f 39 o o W1I 3 e h I23 Z1 22 14 Practical Problems in VLSI Physical Design 1 a r GORDIAN Placement 1521 1 Veri cation I Verify that the constraints are satis ed in the left partition The following cells are partitioned to the left a070 131 b071 19 1 117 266 1121 276 and l22 130 Thus the center of gravity is located at 070 071 117 121 122 r 100 d 134 104 266 276 130 r 200 0 WA This agrees with the center location 1 2 wal d c39 939 391 W2I boX X o 392 W1 I a 5 h 23 4 1 2 a ll 171 Practical Problems in VLSI Physical Design GORDIAN Placement 1621 1 Level 2 Partitioning I Add two more cutlines 39 This results inpicdp2abepsgJp4 hi W4 W4 l P1 P3 W3 I d W3 I X X c g j 2 4 W2I boX X o 3924 W2Ip p I24 f 39 o o W1 I a e h I Z3 W1 I X X 23 I Z1 22 21 22 11 1 Practical Problems in VLSI Physical Design GORDIAN Placement 1721 sq xi gr 1 Level 2 Constraint The center location vectors are 1 32 lt2 1 2 12 I i 5 y i 32 3 12 Next we build the matrix A for the center of gravity constraint at level I 2 Recall that p1 a I 2 1163 glj p4 h Thus 00 000000 42 00 00000 000000 00 00000 0 0 Where the rows denote the partitions 1 through p4 and the columns denote the cells a through j A Practical Problems in VLSI Physical Design GORDIAN Placement 1821 1 Level 2 LQP Formulation We now solve the following LQP to obtain the placement of the movable nodes 12 1 zTC39J 1Tr subject to A 1 J Minimize gr i 1 Minimize My SyTCy IyTy subject to Am y ufygl The solutions are as follows zT 053 078 100 100 139 228 289 300 366 311 F 101 178 300 332 052 111 315 059 157 322 Practical Problems in VLSI Physical Design GORDIAN Placement 1921 11 11 441 39 1 Level 2 Placement I Cliquebased Wiring is shown W4 Cd W3 I 2 at W2 I I 74 b i f W1 I 5x X I Z3 8 h Z1 22 4 1 r 9 W2l I24 f i 23 W1I a e I h Z1 22 111 9 Practical Problems in VLSI Physical Design GORDIAN Placement 2021 1 Summary I Centerof gravity constraint 39 Helps spread the cells evenly while monitoring wirelength 39 Removes overlaps among the cells with real dimension 0d 2 it I d I d I C Q 0 CC C g39J I b Q J I I b f I I I b t I of I f I39 II a 39e 39 I I a e h I I o I h a e h ii I Practical Problems in VLSI Physical Design GORDIAN Placement 2121 la a ll Linear vs Quadratic Objective gg xed movable xed xed movable xed Quadratic objective function Linear objective function q lll2l 2l ly2l 2 q 41 12l 0 gt1 31 1 lal 37 llal ly l 3 Linear vs Quadratic Objective Quadratic objective function tends to make very long net shorter than linear objective function lets short nets become slightly longer LL I A rowl I I rowl I I row2 I A row2 1 row3 I B row3 I r0W4 I I I row4 I Linear objective function Quadratic objective function Optimizing Linear Objective Global Placement with linear objective function q Z Z xuv xv 2 quadratic objective function veNueMv IZZ veNueMv Trick use quadratic programming to minimize linear objective function IZ Z xuv xv 2 2 xuv xv veNueMv xuv xv veNueMv guv guv 9gvz ueMv x xv linear objective function xuv xv xuv xv Analytical Placement Results wire length mm 400 circuit primary 2 300 III Gordian I GordianL 200 100 0 l lullIlllrll lrllmlrllrIIFlrlFII Ilrll 391 Dz 6 Q o g Qz number of pms of a net Figure Sum of wire lengths versus pins Analytical Placement Results ion Linear objective function ITITIIC39T39 I 3 I l quotwtT23quot 31 ll 4 LII NW 39 in a Global placement with 1 region 9 IllIlll lllll Analytical Placement Results Quadratic objective function Linear objective function gaming umuu mu V Illll T717 1 ll ll Illllllll IquotI m m llu lll lllllll ll lllllllllll b Global placement with 4 regions Analytical Placement Results Quadratic objective function rillmugllgumilivnuml Ln1 3 39 I I 7 311 m l llll 39lx l N Linear objective function I A 139 V 2 am mum Ill Ill c Final placements Placement by Simulated Annealing o Sechen and Sangiovanni Vincentelli The TimberWoIf placement and routing package IEEE J Solid State Circuits Feb 1985 TimberWoIf 32 A new standard cell placement and global routing package DAC 86 o TimberWoIf Stage 1 Modules are moved between different rows as well as within the same row Modules overlaps are allowed When the temperature is reached below a certain value stage 2 begins o TimberWoIf Stage 2 Remove overlaps Annealing process continues but only interchanges adjacent modules within the same row Solution Space amp Neighborhood Structure 0 Solution Space All possible arrangements of the modules into rows possibly with overlaps o Neighborhood Structure 3 types of moves M1 Displace a module to a new location M2 Interchange two modules M3 Change the orientation of a module IIII HIII ZIIEE IIZ IIE o I ll I I ll I 4 3 overlap M1 M2 M3 Neighborhood Structure TimberWolf first tries to select a move between M1 and M2 ProbMl 08 ProbM2 02 If a move of type M1 is chosen and it is rejected then a move of type M3 for the same module will be chosen with probability 01 Restrictions 1 what row for a module can be displaced 2 what pairs of modules can be interchanged Key Range Limiter At the beginning WTHT is very large big enough to contain the whole chip Window size shrinks slowly as the temperature decreases Height and width oc logT Stage 2 begins when window size is so small that no inter row module interchanges are possible Cost Function 0 Cost function C C1 C2 C3 0 01 total estimated wirelength Cl ZiENets aiwi 04147514 are horizontal and vertical weights respectively 04 L i 1 gt gtlt perime ter of the bounding box of Net 239 Critical nets Increase both 04 and i If vertical wirings are cheaper than horizontal wirings use smaller vertical weights 51 lt 04 penalty function for module overlaps C2 VZWE quoty penalty weight Oij amount of overlaps in the ac dimension between modules 239 and j penalty function that controls the row length C2 62746130 L7a DT 6 penalty weight D74 desired row length LT sum of the widths of the modules in row 7quot Annealing Schedule Tk 9 ka1 123 rk increases from 08 to max value 094 and then decreases to 08 At each temperature a total of nP attempts is made n of modules P user specified constant Termination T lt 01 l TimberWolf 70 Placement I Perform TimberWolf placement 39 Based on the given standard cell placement 39 Initial HPBB wirelength 23 392 0 713 1L C k n 714 1 hat 715 1771 71 L 717 a e f h7 IL 713 1 l 719 bg 21 In 6 7110 1134 a a 3 quot1 1 9 Practical Problems in VLSI Physical Design TimberWolf Placement 116 J First Swap 5b 4 Cr Practical Problems in VLSI Physical Design I Swap node I and e We shift node h on the shorter side of the receiving row Node 9 included in nets 113 119 and e in 111 117 711 a7 69 712 o n3 56 Ian 714 1 hi 715 M7771 716 617 1W 717 0 e f h7 71 n8 d7 7L9 bgim n10 a7 1510 TimberWolf Placement 216 1 Computing AW I AW Wirelength change om swap Let 39wr and iv 2 respectively denote the Wirelength before and after the swap Then Thus AUquot 2 A1L3 A019 A011 A017 2 19 I Practical Problems in VLSI Physical Design TimberWolf Placement 316 4 g r 1 Estimating AWs I AWs Wirelength change from shifting 39 h is shifted and included in H4 dhi and n7 cefhn 39 h is on the tight boundary of H4 gradiemh 39 h is not on any boundary of H7 no further change on gradienth 1 g 51 451 Practical Problems in VLSI Physical Design TimberWolf Placement 416 1 Estimating AWs cont Thus gradien L 1 Since I is shifted to the right by 5iftJl nIO39llJIIUI 1 Thus A1175 gradien z Sliftll loIHZTUZ 1 1 1 Based on the calculation of AW and AWS we get AC 2 AW A1175 2 19 1 20 Practical Problems in VLSI Physical Design TimberWolf Placement 516 1 1 411 1 Accuracy of AWs Estimation I How accurate is AWs estimation 39 Node h is included in H4 dim and n7 cefhn 39 Real change is also 1 accurate estimation iv124 7 11024 lL 77 7 15017 20719 28 i 28 1 I Practical Problems in VLSI Physical Design TimberWolf Placement 616 4 a r 1 Estimation Model B I Based on piecewise linear graph 39 Shifting h causes the wirelength of H4 to increase by l 19 to 20 and no change on n7 stay at 28 Wn C n4 j Mold hlnew ii 1 1 9 Practical Problems in VLSI Physical Design TimberWolf Placement 716 J Second Swap I Swap node m and 0 39 We shift node d and g on the shorter side of the receiving row 39 Node m included in nets 115 119 and 0 in 112 1110 711 a7 69 712 o n3 56 Ian 714 1 hi 715 M7771 716 617 1W 717 0 e f h7 71 n8 d7 7L9 bgim n10 ch 1510 A CI til Practical Problems in VLSI Physical Design TimberWolf Placement 816 1 Computing AW I AW Wirelength change om swap A125 w39n5 7111125 12 711 1 129 Livn9 i wng 22 i 26 7 1 A022 LUllg Mm 7 14 7 A0110 iii7210 Lvn10 23 23 0 2 Thus ATV A015 3129 A112 A0710 10 I Practical Problems in VLSI Physical Design TimberWolf Placement 916 4 g r 1 Estimating AWs I Cell 6 and g are shifted 39 dis included in H4 dhi n6 dkj and n8 dl 39 d is on the tight boundary of HG and n8 39 So gradientd 2 1 n4 n6 r 111 9 Practical Problems in VLSI Physical Design TilnberWolf Placement 1016 1 Estimating AWs cont I Cell 6 and g are shifted 39 g is included in n1 aeg and n9 bgim 39 g is on the tight boundary of ml and n9 39 So gradientg 2 9 441 Practical Problems in VLSI Physical Design TilnberWolf Placement 1116 1 Estimating AWs cont Both cell 2 and g are shifted to the right by 2 Thus All9 g39rad iE39ntUZ Shiftamoun d gram lien g shiftamoum g 2 2 2 2 8 Based on the calculation of All7 and All395 we get AC 2 AW Alli 10 8 2 A Practical Problems in VLSI Physical Design TilnberWolf Placement 1216 J Third Swap I Swap node k and m 39 We shift node 6 on the shorter side of the receiving row 39 Node k included in nets 113 116 1110 and m in 115 119 711 a7 69 712 o 7L3 50 L371 714 1 hi 715 J 17 m 716 617 1W 717 0 e f h7 n 718 611 7L9 bgim 7110 1 1510 4 r Practical Problems in VLSI Physical Design TimberWolf Placement 1316 1 Computing AW I AW Wirelength change om swap Thus Allv A013 A010 A0210 A025 A019 2 7 Practical Problems in VLSI Physical Design TilnberWolf Placement 1416 1 g 51 451 1 Estimating AWs I Cell c is shifted 39 c is included in H3 bckn and n7 cefhn 39 c is on the left boundary of H3 39 So gradientc l Practical Problems in VLSI Physical Design TilnberWolf Placement 1516 41 1 51 451 1 Estimating AWs cont Since is shifted to the left by 1 5hiftamountc 1 Lastly Allfg gradienik shichnnOlmtc 1 1 1 Based on the calculation of Allquot and Allfg we get AC AW AWS 7397 1 76 Practical Problems in VLSI Physical Design TilnberWolf Placement 1616 1 g 51 4711 Multi net Routing ECE6133 Physical Design Automation of VLSI Systems Prof Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology Net Ordering 0 Net ordering greatly affects routing solutions 0 In the example we should route net b before net a route net at before net I route net 1 before net at Net Ordering cont d o Order the nets in the ascending order of the of pins within their bounding boxes o Order the nets in the ascending or descending order of their lengths o Order the nets based on their timing criticality routing ordering a 0 7gt b I 7gt 612 7gt c 6 o A mutually intervening case a prevents routing of b b prevents routing Ufa a feasible routing RipUp and Rerouting Rip up and re routing is required if a global or detailed router fails in routing all nets Approaches the manual approach the automatic procedure Two steps in rip up and re routing 1 Identify bottleneck regions rip off some already routed nets 2 Route the blocked connections and re route the ripped up connec tions Repeat the above steps until all connections are routed or a time limit is exceeded 1 Routing Multiple Nets I Onebyone method 39 Need to update the usage after each net 39 Order is important affects routability signi cantly 1 dz 2 3 net 233v144l dzlnr2 1 2 3 4 Fig I An instance of global routing in a twodimensional arraysi 171 Practical Problems in VLSI Physical Design 1 g r l Steiner MinMax Tree I De nition 39 Given an undirected edgeweighted graph GVE and a net n that contains a subset of nodes D the Steiner MinMax Tree SMMT of n is a Steiner tree of n where the maximum weight among all edges in the tree not the total wirelength is minimized I Rationale 39 If the edge weight indicates the routing usage congestion of the edge minimizing the maximum edge weight in the Steiner tree means avoiding congested spots I Complexity 639 39 SMMT problem solved optimally with ALGSMMT j 39 MSMMT SMMT with minimum wirelength NPHard I 1 ll 9 Practical Problems in VLSI Physical Design 1 Algorithm I Two phase algorithm 39 SMMTphase build SMMT for each net onebyone 39 If the wirelength is above the threshold we discard 39 Repeated multiple times if desired 39 SPphase reroute each net using the shortest path tree 39 Start with the geometric center 39 Add one node at a time using shortest path resembles Prim s MST 39 Repeated multiple times if desired 39 Which net order 39 Increasing order of HPBB size 11 ll 1 Practical Problems in VLSI Physical Design 1 a l l SlV VIT Phase I Wirelength threshold during pass j 39 P1 HPBB ofneti SMMTphase Pass390 S Jquot Assign cj current constant fori l to n do begin Ni current net as dictated by H ifLN so then begin temp ALG SMMTN length ofNi by ALG SMMT tftemp s cjp then LNlt temp accepting routing ofNi end Cl end a r 111 9 Practical Problems in VLSI Physical Design 1 SP Phase I ALG SP resembles Prim s MST 39 Connect a node to T using minweight path SPphase PaSSjJ S J fori 1 to n do begin N current net as dictated by 11 temp ALGSPi length ofNiby ALGSP iftemp lt LN then LN temp changing the routing of previously routed nets will make some edges avail able for unrouted nets end 111 9 Practical Problems in VLSI Physical Design 4 g r l ALGSMMT Algorithm Procedure ALGSMMTG V E D begin T An MST ofG while there exists a degree I Steiner vertex5 in T do begin 11 a degree 1 Steiner vertex of T a Steiner leaf Remove 0 and the corresponding edge from T end Y 399 OUTPUTT 2 5331 end 1 1 5 o 1 o a 10 1 9 1 C 0 1 net n5 MST A v Practical Problems in VLSI Physical Design Results TABLE 1 DIFFICULT quotND RANDOM DATA RESULTS Description Rest lts Name No of Muitip Bounding Maxt Total Time Nets plicity Length Cap Length sea Dif cult4 8 2 32 2 35 007 Di 39icuILB 3392 2 256 4 296 115 Dif cult15 128 2 2048 9 2214 275 Random4 5 4 15 2 16 010 Random8 18 4 154 3 160 027 Random16 72 6 1148 6 1229 150 1 51 multiplicity net size k Practical Problems in VLSI Physical Design 16 23 1 i ll 2 8 3O 24 7 239 5 Q l IQ J h 1 39 1 l 6 l4 5 22 98 2 lb 3 3H4H H ll 18 l 10 21 6 3 31 2 g 19 2 27 25 16 8 24 29 Fig 6 An instance ofdii culIS l Steiner MinMax Tree Routing I Route the nets onto 5 X5 grid 39 Edge capacity is 3 39 Route nets in the given order 39 Two phases SMMTphase use c 20 and SPphase quot1 130 03 32 34 quot2 032 30 43 quot3 131 22 40 44 quot4 030 21 13 41 24 quot5 230 04 42 33 1 g g 1 9 Practical Problems in VLSI Physical Design Steiner MinMax Tree 115 l SiV VITPhase I Route rst net 39 Net n1 HPBB 7 edge weights 0 no edge usage yet 39 MST is not unique 39 SMMT maxweight 0 wirelength 9 lt 20 cl X 7 1 9 gr 1 1 9 Practical Problems in VLSI Physical Design Steiner MinMax Tree 215 l SlV VITPhase I Route second net 39 Net n2 HPBB 7 edge weights re ect routing of n1 39 SMMT maxweight 0 wirelength 10 lt 20 X 7 SMMT l 1 9 Practical Problems in VLSI Physical Design Steiner MinMax Tree 315 1 9 gr l SlV VITPhase I Route third net 39 Net n3 HPBB 7 edge weights re ect routing of ml and n2 39 SMMT maxweight 1 wirelength 15 gt 20 X 7 39 So we reject this SMMT routing failed SMMT quot1 1 9 Practical Problems in VLSI Physical Design Steiner MinMax Tree 415 A l SlV VITPhase I Route fourth net 39 Net n4 HPBB 8 edge weights re ect routing of ml and n2 39 SMMT maxweight 1 wirelength 15 lt 20 X 8 l 1 9 Practical Problems in VLSI Physical Design Steiner MinMax Tree 515 1 9 gr l SlV VITPhase 1 9 gr 1 1 9 Practical Problems in VLSI Physical Design I Route fth net 39 Net n5 HPBB 8 edge weights re ect routing of n1 n2 n4 39 SMMT maxweight 1 wirelength 12 lt 20 X 8 T1 T 1 4 1 E W 1 3 1 Ween 6 M lt2 4 1f1 y1 dj zj 1011 2 0 41 1 6 1 0 014 I o 5 netn5 MST SMMT Steiner MinMax Tree 615 1 Summary of SlV VITPhase 3 13 1 lt13 2 1 1 14 a 143 112 a a L14 nal routing graph Practical Problems in VLSI Physical Design Steiner MinMax Tree 715 a 2 l SMMTn4 SMMTn5 A 439 1 l SPPhase I Reroute rst net SMMTn1 wirelength 9 Source node s 32 arrow geometric center among terminals Sinks are added to s in this order 34 03 10 SPn1 wirelength 8 1 1 MM 13 after ripping up SMMTn1 SMMTn1 SPn1 Steiner MinMax Tree 815 4111 Practical Problems in VLSI Physical Design l SPPhase I Reroute second net 39 SMMTn2 wirelength 10 39 Routing graph re ects rerouting of 111 ie SPn1 39 Source node s 30 sinks are added 43 02 39 SPn2 wirelength 7 W 1 0 119 1 1 1 2 2 9 2 1 1 2H 2 1 2 1 9 1 0 Q 0 1 1 1 2 1 1 U 1 O Q G 1 9 1 O 2 1 2 Cl 0 1 o o 1 o 1 o I SMMTn2 after ripping up SMMTn2 SPn2 1 411 Practical Problems in VLSI Physical Design Steiner MinMax Tree 915 l SPPhase I Reroute third net 39 SMMTn3 does not exist due to routing failure 39 Routing graph re ects rerouting of 111 and n2 39 Source node s 22 sinks are added 11 40 44 39 SPn3 wirelength 9 4142 241 4 4 412624113 1 1 1 2 3 1 1 2 3 1 2 2 1 2 2 Q G lt3 it 1 3 a ft m 3 1 1 1 o o 1 o 2 2 2 2 Cl 1 1 1 o o 1 0 j SMMTn3 empty after ripping up SMMTn3 SPn3 1 3 3 1 4111 Practical Problems in VLSI Physical Design Steiner MinMax Tree 1015 l SPPhase I Reroute fourth net 39 SMMTn4 wirelength 15 Routing graph re ects rerouting of n1 n2 n3 Source node s 21 sinks are added 41 00 13 24 39 SPn4 wirelength 9 1 SMMTn4 after ripping up SMMTn4 SPn4 1 1 1 9 Practical Problems in VLSI Physical Design Steiner MinMax Tree 1115 l SPPhase I Reroute fth net 39 SMMTn5 wirelength 12 Routing graph re ects rerouting of n1 n2 n3 n 4 39 Source node s 20 sinks are added 42 33 04 39 SPn5 wirelength 9 1 SMMTn5 after ripping up SMMTn5 4 I 9 Practical Problems in VLSI Physical Design Steiner MinMax Tree 1215 1 Summary of SPPhase 1111 Q 3 1 1 22 1 1 3 2 1 1 2J 1 1 3 1 Q1 14 1 1 3 1 111 6 1 8 final routing graph Q 1 1 1 14H 1 2 l SPnS SPn4 51 1 Practical Problems in VLSI Physical Design Steiner MinMax Tree 1315 l SlV VIT vs SP I SMMT promotes 39 Even usage of the edges less congestion 39 Not a fair compatison since n3 is missing in SMMT 39 Still SMMT tends to minimize congestion if 3 o 1 1lt1 1lt2 2 o 1 o1 5 1 5 1J I o 1 6 1 6 1 6 1 o SMMT SP 2 H3 1 I 9 Practical Problems in VLSI Physical Design Steiner MinMax Tree 1415 1 g r l SiV VIT vs SP cont I SP promotes 39 Shorter wirelength higher weight more congestion 39 Congestion vs wirelength tradeoff exists SMMT phase SP phase net max ewgt wirelength max e wgt wirelength n1 2 9 2 8 n g 2 IO 2 7 713 routing failed 3 9 724 2 15 3 9 n 5 2 l 2 3 9 1 r e 2 14 1 9 Practical Problems in VLSI Physical Design Steiner MinMax Tree 1515 Routing Models o Grid based model A grid is super imposed on the r0uting region Wires follow paths along the grid lines o Gridless model Any model that does not follow this gridded approach grid b ased gridless Models for MultiLayer Routing o Unreserved layer model Any net segment is allowed to be placed in any layer 0 Reserved layer model Certain type of segments are restricted to par ticular ayers Two layer HV horizontal Vertical VH Th ree Iayer HVH VHV E 6 track 2 e traCk 3 lt77 track 2 track 1 track 1 unreserved layer model HVH m 0 d 61 VHV model 3 types of 3 layer models Terminology for Channel Routing Problems terminals 9 upper boundary 1451670491 quotelmquot 23535268987 9 lower boundary 2 3 5 3 5 2 6 8 9 8 7 local 43 2 damn13554333 terminals m4 is ax I i i i 7 E dogleg V branches T lt lower boundary trunks upper boundary 0 Local density at column 239 total of nets that crosses column 239 0 Channel density maximum local density of horizontal tracks required 2 channel density Channel Routing Problem Assignments of horizontal segments of nets to tracks Assignments of vertical segments to connect horizontal segments of the same net in different tracks and the terminals of the net to horizontal segments of the net Horizontal and vertical constraints must not be violated Horizontal constraints between two nets The horizontal span of two nets overlaps each other Vertical constraints between two nets There exists a column such that the terminal on top of the column belongs to one net and the terminal on bottom of the column belongs to the other net Objective Channel height is minimized ie channel area is mini mized Horizontal Constraint Graph HCG o HCG G V E is undirected graph where V mlm represents a net m E vi0j a horizontal constraint exists between m and nj o For graph G vertices 4 nets edge 2 j 4 net 2 overlaps net j 15202110340 5 I H O O O 30125340023 2 A routing problem and its HCG 3 Vertical Constraint Graph VCG o VCG G V E is directed graph where V mlm represents a net m E UiUj a vertical constraint exists between m and nj o For graph G vertices 4 nets edge i gtj 4 net 2 must be above net j 5 1 15202110340 o o o o o o o o o o o O H O H O O O O 30125340023 A routing problem and its VCG 3 2L Channel Routing Basic LeftEdge Algorithm Hashimoto amp Stevens Wire routing by optimizing channel assignment within large apertures DAC 71 No vertical constraint HV layer model is used Doglegs are not allowed Treat each net as an interval Intervals are sorted according to their left end x coordinates Intervals nets are routed one by one according to the order For a net tracks are scanned from top to bottom and the first track that can accommodate the net is assigned to the net Optimality produces a routing solution with the minimum of tracks if no vertical constraint Basic LeftEdge Algorithm Algorithm BasicLeft EdgeUtrackjD U set of unassigned intervals nets 117In Ij Sj76j2 interval j with leftend xcoordinate sj and rightend ej trackj track to which net j is assigned 1 begin 2 Ult 11I2In 3 tlt O 4 while 1720 do 5 t lt t 1 6 watermark lt O 7 while there is an Ij E U st Sj gtwatermark d0 8 Pick the interval Ij E U with Sj gt watermark nearest watermark 9 trackj lt t 10 watermark lt j 11 U lt U 1 12 end Basic LeftEdge Example 0 U 11IQI6 1 1312 2613 48 I4 510 5 711 6 912 0 t 1 Route 1 watermark 3 Route 3 watermark 8 Route 6 watermark 12 o t 2 Route 2 watermark 6 Route 5 watermark 11 o t3 Route I4 0 column I 10 II 12 2 I 0 T l 39Qw Q k Dem 1IkOo I 1IO D 1It Q N Q 0 u me Q 0 w Q0 w 90 w Q0 ox w Q0 m 90 N Q0 MN N density Constrained LeftEdge Algorithm Algorithm ConstrainedLeft EdgeUtrackUD U set of unassigned intervals nets 117In Ij Sj76j2 interval j with leftend xcoordinate sj and rightend ej trackj track to which net j is assigned 1 begin 2 Ult 11I2In 3 tlt O 4 while 1720 do 5 t lt t 1 6 watermark lt O 7 while there is an unconstrained Ij E U st Sj gt watermark do 8 Pick the interval Ij E U that is unconstrained with Sj gt watermark nearest watermark 9 trackj lt t 10 watermark lt j 11 U lt U 1 12 end Track 1 Track 2 Track 3 Track 4 Constrained LeftEdge Example 1 1312 2 15 3 68 4 1011 5 26 6 2 79 Route 11 cannot route 13 Route 16 Route I4 Route 12 cannot route 3 Route 15 Route 13 I4 0 00 39 track 1 track 2 U01 Q0 0 track 3 Q0 4 track 4 Doglegs in Channel Routing Doglegs may reduce the longest path in VCG aab c ab c c1 ab cdd ab c dd a a c2 b gt g c d c1 d Doglegs break cycles in VCG O b a b1 E a i b a 1 a b a b b2 Doglegs in Channel Routingmonm Restricted Dogleg vs unrestricted dogleg a a a 3 Detailed Routing Dogleg Router 0 Drawback of LEA the entire net is on a single track o Doglegs are used to place parts of a net on different tracks7 thereby minimizing Channel height 2 1 1 2 3 ski 2 a 1 1 2 3 2 2 3 Using a dogleg to reduce Channel height Algorithms for VLSI Physical Design Automation Sherwani 92 Detailed Routing 1 Dogleg Router 0 Each Multi terminal net is broken into a set of two terminal nets 0 Two parameters are used to control routing 1 range Determine the number of consecutive two terminal subnets of the same net that can be placed on the same track 2 routing sequence Speci es the starting position and the direction of routing along the channel 0 Modi ed LEA is applied to each subnet 0 l 2 2 4 3 0 0 I I I I I I I I I I I I I I H Hie 01224300 I IIII quot quotquot 12033044 12 3044 a b Example of Dogleg Router I I I I 0 3 Algorithms for VLWWg gctlDDWgn Automation Dogleg Router Example Decompose multiterminal nets into twoterminal nets 1 1 2 0 2 3 1 0 0 23 2b 0 6 0 1 I 2 0 2 3 33 3b 0L0 2 4 3a 3b 2 3 o 3 4 b a 2n 4 07 u L 0 2 O A 4 Final solution 0 Characterizing Channel Routing Problem 0 1 4 5 1 6 7 0 4 9 10 10 l1394ll739 w Vertical constraint graph G 9 Horizontal constraint graph The channel routing problem is completely characterized by the vertical constraint graph and the horizontal constraint graph Zone Representation of Horizontal Segments 10 10 4 9 0 0 91 V 0 W 791 7009 47009 V 4700 467 v 246 V 1245 12345 12345 v 1 Zone Zone Representation of Horizontal SegmentSCont d Zone representation Si set of nets intersect column i 8 we only need to consider those sis which are maximal 8 Zone gt maximal clique in the horizontal constraint graph Merging 0f Nets and can be merged GD 19 63 Updated graph and zone rep Net i and netj can be merged if a there is no path directed connecting them in VCG b the two nets do not overlap Change of VCGS cw track 1 a track 2 track 3 9 track 5 d or 4 How to choose two feasible nets to merge gtDetermine the quality of the solutions Proc s the Zones Sequentially 0 bthA LEFT135 RIGHT6 LEFT123 RIGHT7 LEFT2356 RIGHT89 Process the Zones Sequentially Cont d 17 LEFT2384 RIGHT10 38 569 LEFT17 2 38 410 569 RIGHT First Approach 8 Merge LEFT and RIGHT so as to minimize the increase of the longest path length in the VCG E Heuristic rule to select nets to merge sequentially duv maxdudV 39 A v d23 C9 u22 G Longest lower path l 3 o u 1o3 69quot Longest upper path uV maXUUUV 1 What to Choose from PQ The purpose here is to minimize the length of the longest path after merger However it will be too time consuming to nd an exact minimum merger hence a heuristic merging algorithm will be given Let us introduce some basic intuitive ideas First a node In E Q is chosen which lies on the longest path before merger furthermore it is farthest away from either s or t Next a node n E P is chosen such that the increase of the longest path after merger is minimum If there are two or more nodes which will result in a minimum increase we choose n such that 1101 dn is maximum or nearly maximum and that the condition umdm undn is satis ed or nearly satis ed These can be implemented by introducing the following l Heuristic Merging Algorithm given P Q begin while Q is not empty do begin among Q nd m which maximizes fm a1 a2 a3 among P nd 11 which minimizes gn m and which is neither ancestor nor descendant of m a4 merge n and m 35 remove n and m from P and Q respectively end end Practical Problems in VLSI Physical Design