### Create a StudySoup account

#### Be part of our community, it's free to join!

Already have a StudySoup account? Login here

# Foundations of Fine CS 560

CSU

GPA 3.51

### View Full Document

## 5

## 0

## Popular in Course

## Popular in ComputerScienence

This 27 page Class Notes was uploaded by Betty Kertzmann on Tuesday September 22, 2015. The Class Notes belongs to CS 560 at Colorado State University taught by Sanjay Rajopadhye in Fall. Since its upload, it has received 5 views. For similar materials see /class/210182/cs-560-colorado-state-university in ComputerScienence at Colorado State University.

## Reviews for Foundations of Fine

### What is Karma?

#### Karma is the currency of StudySoup.

#### You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!

Date Created: 09/22/15

HighPerformance Embedded Systems39onaChip Sanjay Rajopadhye Computer Soiemzea Colorado State University Lecture 4 Systolic Synthesis oontdo Convolution Inwalspecmca on 12 1 w Wm 5w Serialization amp Alignment Replace unbounded fanin Z by sequence of binary additions Align input and output vars LocalizationUhiformization Remove unbounded fanout ie Iong dependences 9139 0 jn 11WijXij jltn 1ijn WijXij Y jZOZi jgtOXi 1j 1 7 j iyj j 0 O igt Scheduling amp Allocation L 189 I 39l t m jn 1 Scheduling amp Allocation Scheduling amp Allocation VA n 239quotT Mun tram I I tij2i jn 1 ai7j II 008 Transformation isj 2jn lxj gtlt m jn1 5 CoB Transformation K II iajagijFnl 222jn 1 A V II N A 5w 6 V II m k 3393 ll DOM OH H HODi 3K1 DI H Graph of Transformed SURE Transformed SURE jn1 Yzj jltn 1zlib 133 WZIJXZ X j j0 jgt0 Xz39 1j 1 w in 1 jzn WM 9 zgtn 1 3Wh Z Gory Details p n 1event Wtgtp Xtgtp p lt n 1 event p n 1Yt 1p 1 WtpXtp pgt0 Xt 1p 1 tzn l p wp t tgt7 L 1 p3 event p n1IWt 2P Yt p Xtp p0event n 1 13154344 W p HighPerformance Embedded SystemsonaChip Sanjay Rajopadhye Computer Science Colorado State University Lecture 3 Systolic Arrays amp Systolic Synthesis gt IRISA Systolic Synthesis gt IRISA Systolic Synthesis Nth mathematical speci caticn SARE reductions do the following not necessarily in order quot39 IRISA Systolic Synthesis Vl th mathematical speci cation SARE reductions do the following not necessarily in order 1 Serialize reductions and Align inputs and outputs 2 Localize dependences 3 Schedule the SARE 4 Allocate the computation to processors 5 Transform the SARE 6 Generate the HDL gt I R I s A Example Convolution Inmalspecmca on 71 1 31 E wj33 j j gt gtIRISA Serialization amp Alignment Replace unbounded fanin Z by sequence of binary additions Align input and output vars jZOijij Yiiyj jgtO Yz39j 1 wgtiltw gt gtIRISA LocalizationUhiformization Remove unbounded fanout ie Iong dependences y n 1 0 I WijXij gt0 Yz j 1 0 0 gt0 WUJ Xij j jgtOXi 1j 1 J J L YUJ XUJ WUJ gt gtIRISA Scheduling amp Allocation gt gtIRISA Scheduling amp Allocation Adate tz j i j Scheduling amp Allocation gt gt IRISA 7 gtgtIRISA Geometric Transformation Executing Compiling Equations C8560 Classs Notes Alan LaMielle amp Andy Stone scribes for Sanjay Rajopadhye Consider a formula for the Fibonacci sequence ilt1 1 blzl igt1 bz39ell bli 2l 1 Out b 2 de ned over the domains D b l 0 S i S N and DOM l ie the domain of Out is the empty domain Z0i In other words Out is a scalar variable How would one write a program that implements1 this system of recurrence equations An obvious approach is through recursioni The following pseudocode is one possibility Algorithm 1 fibint 1 if 239 lt 1 then 2 return 1 3 else 4 return fibi71fibi7 2 5 end if However there are obvious limitations to this naiive implementation It has exponential running time with numerous duplicated calculations The following tree isllustrates this fact n 7 1 7 n n 7 7 in 1Note that there is a closed form solution to Fibonacci w so no one would really want to write a program to compute Fibonacci except for pedagogical purposes b6 b5 W4 bm3 we b 2 b 1 b 1 b a b a b1 b1 b2 b1 bl1 W1 bl1 Notice the repeated computations For example b3 is called three times and b2 is called ve times Thus the purely recursive approach is impractical Assuming we are working with the recursive approach we can certainly do better Notice from the above Fibonacci call graph that even though we are repeating computations only n distinct arguments are ever used for bn and thus only n distinct computations are actually needed Utilizing this fact we can employ a common technique called memoizationi Memoization is the act of remembering or caching already computed values for use at a later time and is so named because we are keeping a memo of already computed values Here is pseudoc code that utilizes this technique initialize an array of specified size where all elements are set to initialVal except elements 0 and 1 which are set to 1 int initializeFibArrayint size int initialVal int newArray newArray intmallocsizeofint size forint i 2 i lt size i newArrayEi initialVal newArrayEO newArray 1 1 return newArray memoized fibanocci int fibint i static const int NUTCUMPUTED 1 static int cache initializeFibArrayMAX NUTCUMPUTED ifcachei NUTCUMPUTED cachei fibi1 fibi2 return cachei The basic structure of this code is as follows Initialize an array of n values to a special value signifying that each value has not yet been computed ln bn check to see if the value for n has been computed If it has return the stored value and otherwise compute the value store it and return it We can use this strategy for compiling equationsl NN1 Allquot 0 2 xi p Figure 1 Forward Substitution Slices As a second example7 consider the all to familiar forward substitution formula nd 1 given L and b in il 2121 13971 igtl 11217 ELIjzj j1 I737 6 How would we go about compiling this equation We expect to do 0n2 work since we are computing n values of I over a summation of an average of g values See Figure l for a graphical representation of this computationi Each vertical slice of the triangle consists of a number of multiplications Llgt111 All of these must be summed The result of the reduction is stored at Iii What might the code that performs this computation look like to 1 initialize an array of specified size where all elements are set to initialVal except elements 0 and 1 which are set int initializeFSArrayint size int initialVal int newArray newArray intmallocsizeofint size forint i 2 i lt size i newArrayEi return newArray initialVal memoized Forward Substitution int computexint iint L int b static const int NUTCUMPUTED 1 static int c che initializeFSArrayMAX NOTCUMPUTED ifcachei NUTCUMPUTED ifi1 cache i bi else int res0 forint j1jltij resLij computexj Lb resbi res cache i res return cacheEi An addition to this may be to mark in the cache array that the value is currently being computed This allows us to detect circular dependencies and let the programmer know This works for nite domains but what about in nite domains such as those that occur in signal processing We cannot allocate a cache array in this case MAX is in nite In this case do not memoize and incur the cost of repeated evaluations There may be ways to improve this naive fallback and we may see them later What are the problems with the code generation strategy that utilizes memoization 1 It is slow The speed loss comes from a nnumber of reasons 0 The conditionals in the loop are one source of slowdown o Recursion is expensive stack frames maintenance is not free and potential context switching comes at a price as well Note that this slowdown is within a constant factor of the total computation it is manageable better than an exponential time program for a linear work algorithm but usually still unacceptable 2 Memory Hog The memory hog problem could be really severe if the language was not rich enough to expresscompile reductions this was the case with the original de nition of Alpha Moreover if some of the later analysis questions are dif cultimpossible for programs with reductions then we may be forced to somehow remove reductions MMA pha has a transformation called serializeReduce that does just this 3 No parallelism To better understand the memory problem let s see how we would write the forward substitution program without reductions De ne a new variable Temp de ned over the domain of the computation jgt 0 TempLj1 Lijzj Ii il bi 5 igt1 12 7 Temp 1 Temp O 0 4 The standard memoized program for this will require quadratic storage for the entire domain of Temp Further Optimizations How do we make this computation faster If we can gure out the direction ie a schedule for the computation such that dependences in the original program are respected ie the schedule is legal can we increase performance Yes if our program execution order follows the schedule we do not have to check if something was previously computed We are optimizing away the demanddriven computation Once we have a schedule we can gure out when a value is no longer needed based on its lifetime do a reuse analysis and only allocate space for the maximum number of simultaneously live values rather than the whole range of values Different analyses include scheduling value reuse and memory reuse analysis

### BOOM! Enjoy Your Free Notes!

We've added these Notes to your profile, click here to view them now.

### You're already Subscribed!

Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'

## Why people love StudySoup

#### "I was shooting for a perfect 4.0 GPA this semester. Having StudySoup as a study aid was critical to helping me achieve my goal...and I nailed it!"

#### "When you're taking detailed notes and trying to help everyone else out in the class, it really helps you learn and understand the material...plus I made $280 on my first study guide!"

#### "Knowing I can count on the Elite Notetaker in my class allows me to focus on what the professor is saying instead of just scribbling notes the whole time and falling behind."

#### "Their 'Elite Notetakers' are making over $1,200/month in sales by creating high quality content that helps their classmates in a time of need."

### Refund Policy

#### STUDYSOUP CANCELLATION POLICY

All subscriptions to StudySoup are paid in full at the time of subscribing. To change your credit card information or to cancel your subscription, go to "Edit Settings". All credit card information will be available there. If you should decide to cancel your subscription, it will continue to be valid until the next payment period, as all payments for the current period were made in advance. For special circumstances, please email support@studysoup.com

#### STUDYSOUP REFUND POLICY

StudySoup has more than 1 million course-specific study resources to help students study smarter. If you’re having trouble finding what you’re looking for, our customer support team can help you find what you need! Feel free to contact them here: support@studysoup.com

Recurring Subscriptions: If you have canceled your recurring subscription on the day of renewal and have not downloaded any documents, you may request a refund by submitting an email to support@studysoup.com

Satisfaction Guarantee: If you’re not satisfied with your subscription, you can contact us for further help. Contact must be made within 3 business days of your subscription purchase and your refund request will be subject for review.

Please Note: Refunds can never be provided more than 30 days after the initial purchase date regardless of your activity on the site.