# 630 Class Note for STAT 51400 at Purdue

These 13 pages of class notes were uploaded by an elite notetaker on Friday, February 6, 2015. The notes belong to a course at Purdue University taught by a professor in Fall. Since the upload, they have received 16 views.

Date Created: 02/06/15

## Statistics 514: Design of Experiments, Topic 4

### Topic Overview

This topic will cover:

- Fundamentals/Model of Experimental Design
- Introduction to Randomization
  - Permutation Test
- Blocking

### Experimental Design

Treatments, units, and assignment methods specify an experimental design.

- Underlying this class is the belief that experiments are different. Different from what? Different how?
- Experimental design (XD) uses careful problem solving that requires technical assumptions having to do with the nature of the data product.
- Since statistics is usually about analysis, understanding of design relies heavily on having good data analysis techniques. The nature of the data analysis technique will dictate the questions you can ask.
- Very often, assumed models seek only to show effects, not measure them. As such, there are many different models that could suffice. Thus, the experiment doesn't help in distinguishing between them.

### Desirable Criteria for Experimental Design

- The design points should exert equal influence on the determination of the regression coefficients and effect estimates.
- The design should be able to detect the need for nonlinear terms.
- The design should be robust to model misspecification, since all models are wrong.
- Designs in the early stage of the use of a sequential set of designs should be constructed with an eye toward providing appropriate information for the follow-up experiments.

### Assumed Mechanism

[Figure: diagram of a process or system (a "black box") that takes fixed inputs and produces responses, acted on by controllable factors and by uncontrollable factors (nuisance factors, inherent noise).]

### Why Statistical Experimental Design?

Because $X_1 = X_2$ does not imply $Y_1 = Y_2$. Maybe:

1. $Y = f(X) + \epsilon$, with random $\epsilon$ having mean 0, OR,
2. $Y = f(X, Z)$, where $Z$ records other variables.

Puzzler: which model is more general/more useful?

### Strategies of Experimentation

- Best-guess experiments
  - Used a lot.
  - More successful than you might suspect, but there are disadvantages.
- One-factor-at-a-time experiments
  - Sometimes associated with the "scientific" or "engineering" method.
  - Devastated by interaction, also very inefficient.
- Statistically designed experiments
  - Based on Fisher's factorial concept: test all possible combinations.

A good design must:

- avoid systematic error,
- be precise,
- allow estimation of error,
- have broad validity.

Statistical expertise can help by fixing up some common mistakes, chiefly confounding (more later).

### Failed Experiment

Did not answer question, "not proved," answer we didn't want.

"To call in the statistician after the experiment is done may be no more than asking him to perform a postmortem examination: he may be able to say what the experiment died of." (R. A. Fisher)

Caveat: Even if you see a statistician, your experiment might still die.

Common things that go wrong:

- not a large enough sample
- drop-outs
- ethical challenges
- political/social/legal resistance
- not enough money

Ultimately, randomized experiments are intrusive, thus setting up an artificial situation that has high internal validity but perhaps reduced external validity.

What can go right:

- It is not unusual for a well-designed experiment to "analyze itself."
- "You can see a lot by just looking." (Yogi Berra)

### Terminology

- Measurement unit: actual object on which the response is measured. Example: leaf of a plant.
- Experimental unit: batch of things (material, animal, person, machine) to which a treatment is applied. Example: plot of land.

Usually it is better to have more experimental units and fewer measurement units. Criterion: experimental units should be independent.

- Replication: Each treatment is applied to experimental units that are representative of the population of units to which the conclusions of the experiment will apply.
- Repetition: Like replication, except that measurement is done on the same experimental unit.
- Blinding: Evaluators of a response do not know which treatment was given to which unit.
- Double-blinding: Both evaluators of the response and the experimental units do not know the assignment of treatment to units.
- Control: Treatment is a "standard" treatment that is used as a baseline or base of comparison for other treatments.
- Placebo: A null treatment used when the act of applying a treatment (any treatment) has an effect.

### Three Sources of Variability

1. Variability due to conditions of interest (wanted).
2. Variability in measurement process (unwanted).
3. Variability in experimental material (unwanted).

Good design lets you estimate the amount of variability due to each source.

### Three Kinds of Variability

1. Planned, systematic variability: the kind we want.
2. Chance-like variability: the kind we can live with.
3. Unplanned, systematic variability: the kind that THREATENS DISASTER.

### Confounding and Selection Bias

Two influences on a response are confounded if the design makes it impossible to isolate the effects of one from the effects of another. Selection bias occurs in observational studies when the process of selecting groups to be compared confounds the effects of interest with other effects.

Example: How long does it take for a car's brakes to stop it from, say, 50 miles per hour?

- Blatant confounding: Compare Mercedes and minivans. Do 10 Mercedes trials on wet pavement. Do 10 minivan trials on dry pavement. May see differences but can't tell why.
- Subtle confounding: Compare wet and dry pavement for minivans. While one driver does 10 trials on wet pavement, another does 10 trials on dry pavement.
- More subtle confounding: Compare wet and dry pavement for minivans with one driver. First do 10 trials on dry pavement, then do 10 trials on wet pavement. Could be confounded with run order.

### Basic Principles of Experimental Design

- Intervention: if factors are not assigned, can't validly predict effect or even show that there is an effect after intervention. Common example: Cannot randomly assign people to smoke or not. Thus there is little strictly valid evidence that smoking is harmful.
- Randomization: running trials in an experiment in random order
  - to avoid confounding with a hidden factor, confound treatment assignment or run order with a random variable that is generated to be independent of response
  - protection: averages out unknown lurking factors
  - independence of trials: avoids bias
  - randomization test ≈ ANOVA F test
- Replication: decrease uncertainty by averaging out experimental variability
  - improves precision of effect estimation
  - estimation of error or background noise
- Blocking: decrease uncertainty by adjusting for/controlling specific nuisance factors
  - accounts for variability, but does not stem from identifiable agent since not randomly assigned
- Balance/Completeness: guarantees that there is no ambiguity as to where the effect is coming from.
- Random factors: even if we don't see every level of a factor, can infer that factor has some effect. In this case, get inference but no real predictive power.

### Example of Randomization

2 groups of tomatoes. Assign varieties A, B of tomatoes to plots, measure yield.

A A A A A B B B B B

Maybe the land isn't uniform; try

A B A B A B A B A B

or a random allocation

A B B B A A B A A B

if you're worried about periodic effects. A strong effect is unlikely to match a random allocation, although there are no guarantees.

### Randomization Principle

Whenever possible, any assigning or sampling should be done using a chance device.

- Typically all allocations should be possible.
- Typically all allocations should be equally likely.

### How Do We Randomize?

Randomizing run order: Write out treatments in any order, apply random permutation.

Ranking method:

1. Generate $U_i \sim U(0, 1)$, $i = 1, \ldots, n$ (no ties).
2. Rank the $U_i$ among $U_1, \ldots, U_n$.
3. Set $\pi_i = \mathrm{rank}(U_i)$.

Sampling method:

1. Draw $\pi_1$ from $\{1, \ldots, n\}$.
2. Draw $\pi_2$ from $\{1, \ldots, n\} \setminus \{\pi_1\}$.
3. Draw $\pi_3$ from $\{1, \ldots, n\} \setminus \{\pi_1, \pi_2\}$.
4. Continue in the same way; for $\pi_n$, only one choice is left.

Example:

| Raw | Trt | $U_i$ | Rank | Run | Trt |
|---|---|---|---|---|---|
| 1 | 1 | 0.1398928 | 1 | 1 | 1 |
| 2 | 1 | 0.4903066 | 6 | 2 | 2 |
| 3 | 1 | 0.8459779 | 9 | 3 | 3 |
| 4 | 1 | 0.8692369 | 11 | 4 | 3 |
| 5 | 2 | 0.6389887 | 8 | 5 | 2 |
| 6 | 2 | 0.3783782 | 4 | 6 | 1 |
| 7 | 2 | 0.4057894 | 5 | 7 | 2 |
| 8 | 2 | 0.8906754 | 12 | 8 | 3 |
| 9 | 3 | 0.6366516 | 7 | 9 | 2 |
| 10 | 3 | 0.3087094 | 3 | 10 | 1 |
| 11 | 3 | 0.8491306 | 10 | 11 | 3 |
| 12 | 3 | 0.2690837 | 2 | 12 | 1 |

### Should We Randomize?

1. Protects against unforeseen error patterns. More likely to get genuine replicates.
2. Allows randomization analysis.
   - Can we do the analysis without randomization?
   - Of course, statistical analysis does not check for randomness, but
   - the resulting conclusions tend to be overly optimistic.
3. Might cost too much; cheaper alternatives come from sampling theory.

### What Can Go Wrong Without Randomization

Patterns in errors:

| Obs | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Des1 | A | A | A | A | A | B | B | B | B | B |
| Des2 | A | B | B | B | A | A | B | A | A | B |

$\bar{Y}_1 - \bar{Y}_2$ estimates $\mu_1 - \mu_2$. Writing each observation as its treatment mean plus an error $\epsilon_i$, the error left in the estimate is proportional to a signed sum of the $\epsilon_i$:

- Des1: $\epsilon_1 + \epsilon_2 + \epsilon_3 + \epsilon_4 + \epsilon_5 - \epsilon_6 - \epsilon_7 - \epsilon_8 - \epsilon_9 - \epsilon_{10}$
- Des2: $\epsilon_1 - \epsilon_2 - \epsilon_3 - \epsilon_4 + \epsilon_5 + \epsilon_6 - \epsilon_7 + \epsilon_8 + \epsilon_9 - \epsilon_{10}$

Trend: if $E(\epsilon_i) = c \times (i - 5.5)$ (linear trend), the trend adds $-25c$ to this sum under Des1 but only $3c$ under Des2.

Optimal versus linear trend:

A B B A A B B A A B

Autocorrelations:

$$\mathrm{Cor}(\epsilon_i, \epsilon_j) = \begin{cases} 1 & i = j \\ \rho & |i - j| = 1 \\ 0 & |i - j| > 1 \end{cases}$$

With common error variance $\sigma^2$:

$$\mathrm{Var}\left(\frac{\epsilon_1 + \epsilon_2 + \epsilon_3 + \epsilon_4 + \epsilon_5 - \epsilon_6 - \epsilon_7 - \epsilon_8 - \epsilon_9 - \epsilon_{10}}{5}\right) = \frac{2}{5}\sigma^2 + \frac{14}{25}\rho\sigma^2$$

$$\mathrm{Var}\left(\frac{\epsilon_1 - \epsilon_2 - \epsilon_3 - \epsilon_4 + \epsilon_5 + \epsilon_6 - \epsilon_7 + \epsilon_8 + \epsilon_9 - \epsilon_{10}}{5}\right) = \frac{2}{5}\sigma^2 - \frac{2}{25}\rho\sigma^2$$

$\rho \neq 0$ changes $\mathrm{Var}(\bar{Y}_1 - \bar{Y}_2)$. Estimates of $\sigma^2$ don't capture this. A random design mitigates.

- Balance is important. Keep the treatment group sizes equal or approximately so. There are versions of randomization that don't preserve balance.
- Randomization might occur in space/time or some other dimension.
- Randomization sometimes needs to be constrained. Example: two queues to two different evaluators.
- Haphazard is not randomized.

Example: 4 treatments, 16 units.

HAPHAZARD:

- Treatment A is assigned to the first four units we happen to encounter, treatment B to the next four units, and so on.
- As each unit is encountered, we assign treatment A, B, C, and D based on whether the "seconds" reading on the clock is between 1 and 15, 16 and 30, 31 and 45, or 46 and 60.

RANDOM: We use 16 identical slips of paper, four marked with A, four with B, and so on to D. We put the slips of paper into a basket and mix them thoroughly. For each unit, we draw a slip of paper from the basket and use the treatment marked on the slip.

### Randomization Inference/Permutation Test

Data: treatments A, A, B, B with responses 22, 40, 10, 16. The test statistic is $\bar{Y}_A - \bar{Y}_B = 18$. List all $\binom{n}{n_1} = \binom{4}{2} = 6$ orderings:

| Ordering | 22 | 40 | 10 | 16 | $\bar{Y}_A - \bar{Y}_B$ |
|---|---|---|---|---|---|
| P1 | A | A | B | B | $(22 + 40 - 10 - 16)/2 = 18$ |
| P2 | A | B | A | B | $(22 - 40 + 10 - 16)/2 = -12$ |
| P3 | A | B | B | A | $(22 - 40 - 10 + 16)/2 = -6$ |
| P4 | B | A | A | B | $(-22 + 40 + 10 - 16)/2 = 6$ |
| P5 | B | A | B | A | $(-22 + 40 - 10 + 16)/2 = 12$ |
| P6 | B | B | A | A | $(-22 - 40 + 10 + 16)/2 = -18$ |

Logic:

1. A, B identical (null hypothesis true) implies the distribution of $\bar{Y}_A - \bar{Y}_B$ is independent of observation labels (null distribution).
2. Make histogram of permuted $\bar{Y}_A - \bar{Y}_B$ values.
3. If actual value based on data is extreme, conclude groups differ.

What's the p-value in the example?

- For $n_1 = n_2 = 5$, there are $\binom{10}{5} = 252$ possible orders.
- Can do $\mathrm{median}_A - \mathrm{median}_B$.
- Replace $Y_{Ai}$ by $Y_{Ai} - \Delta$ to test $A = B + \Delta$.
- For large $n_1$, equivalent to t test.

### Example: Paired t Test/Randomization Paired Test

In a study of egg cell maturation, the eggs from each of four female frogs were divided into two batches and one batch was exposed to progesterone. After two minutes, the cAMP content was measured. It is believed that cAMP is a substance that can mediate cellular response to hormones.

| Frog | cAMP content, control | cAMP content, progesterone | Diff |
|---|---|---|---|
| 1 | 6 | 4 | 2 |
| 2 | 4 | 5 | -1 |
| 3 | 5 | 2 | 3 |
| 4 | 4 | 2 | 2 |

- t test: $d = (2, -1, 3, 2)$, so $\bar{d} = 1.5$ and $s_{\bar{d}} = 0.866$. The test statistic is $1.5 / 0.866 = 1.732$. Using Table II and 3 degrees of freedom, the p-value is between 0.05 and 0.10 one-sided (0.10 and 0.20 two-sided). The actual two-sided p-value is close to 0.18.
- Randomization: The result of each pair does not depend on the allocation of treatments. Thus there are $2^4 = 16$ possible outcomes. The observed outcome is $2 - 1 + 3 + 2 = 6$.

| $\lvert \sum d_i \rvert$ | Number of occurrences |
|---|---|
| 8 | 2 |
| 6 | 2 |
| 4 | 4 |
| 2 | 6 |
| 0 | 2 |

From the table, there are four of sixteen outcomes as "unlikely" or more simply due to chance. Thus the p-value is 0.25.

### Discussion

- We will be discussing the randomization/permutation components of future designs: > 2 levels, > 1 factor, mixed models, nested models, etc.
- Understanding where nonparametric methods come from is becoming more important. In the mechanics of inference, different methods affect how the reference distribution is generated or how big the confidence interval is.
- It's the independence of the randomization distribution from the experimental mechanism that makes the randomization hypothesis valid.

Poser: What do we do if we get, by chance,

| Obs | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Des1 | A | A | A | A | A | B | B | B | B | B |

or

| Obs | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Des2 | A | B | A | B | A | B | A | B | A | B |

This American Life: "It's part of the game." http://www.thislife.org/Radio_Episode.aspx?sched=887, or search on "Meet the Pros" at http://www.thislife.org. Story times: 20:45 to 47:00.

### Example of Problems with Randomization

In a trial on newborn infants with respiratory failure, the new treatment T was highly invasive extracorporeal membrane oxygenation (ECMO), while the control treatment C was conventional medical management. A randomized trial was set up which saw a binary response: success or failure of the treatment.

Adaptive urn scheme:

- At each subsequent trial, a treatment T or C was chosen as a ball from an urn.
- The initial trial has two balls, marked T and C, in the urn.
- Each time a success is observed, a ball marked by the successful treatment is added to the urn.

First trial:

| Trial | Treatment | Outcome |
|---|---|---|
| 1 | T | survival |
| 2 | C | death |
| 3 | T | survival |
| 4 | T | survival |
| 5 | T | survival |
| 6 | T | survival |
| 7 | T | survival |
| 8 | T | survival |
| 9 | T | survival |
| 10 | T | survival |
| 11 | T | survival |
| 12 | T | survival |

Analysis of the efficacy of T over C was considered to be inconclusive.

Second trial (Boston, 1986):

- Patients were randomized equally to T and C in blocks of size 4.
- Stopping rule: four deaths cumulative on one of either T or C.

Results:

- T: 9 units with no failures
- C: 10 units with 4 failures

Conclusions: Substantial, but not overwhelming, evidence in favor of ECMO.

Moral: One should be very careful when trying to do better than chance; sometimes there is no way to avoid being unlucky.

Upshot: Statistics lets us quantify our chances of being unlucky.

### Blocking

A nuisance factor is any possible source of variability other than the conditions you want to compare, that is, anything other than effects of interest that might affect the response.

- Randomizing turns any bias resulting from a nuisance influence into chance error. However, this increases the size of the chance error, making it harder to detect and measure the effects of interest.
- Blocking turns a nuisance factor into a factor of the design.

Goal: within-block variation << between-block variation. The more similar the units in a block, the more effective blocking will be.

Example:

- Goal: Study the effect of Vitamin B6 on premenstrual syndrome.
- Units: Human volunteers, sorted into pairs. One got B6, the other got a placebo.
- Grouping: Severity of symptoms as evaluated by a questionnaire.

Another nuisance influence: stress at different times of year. "For many students, the beginning of a semester tends to be less stressful than the end, when there are exams to take and papers to write." "For many people, major holidays are often stressful." As a result, December and January should be treated as different blocks of time in the study.

Maxim: Block what you can, and randomize what you cannot.
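
The ranking method from the How Do We Randomize section can be sketched in Python as follows. This is a minimal illustration, not the course's own code: the function names `ranking_permutation` and `randomized_run_order` and the seed are assumptions made here. The ranks are copied from the 12-run example table, and run i is given the treatment listed in row rank(U_i), which is the convention that reproduces the last column of that table.

```python
import random


def ranking_permutation(n, rng):
    """Ranking method: draw n uniforms and return them with their 1-based ranks."""
    u = [rng.random() for _ in range(n)]
    order = sorted(range(n), key=lambda i: u[i])  # indices from smallest to largest u
    ranks = [0] * n
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return u, ranks


def randomized_run_order(treatments, ranks):
    """Run i receives the treatment listed in row ranks[i] of the original layout."""
    return [treatments[r - 1] for r in ranks]


# Treatments written out in any order (here 1,1,1,1,2,2,2,2,3,3,3,3)
treatments = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]

# Ranks copied from the example table in the notes
table_ranks = [1, 6, 9, 11, 8, 4, 5, 12, 7, 3, 10, 2]
print(randomized_run_order(treatments, table_ranks))
# [1, 2, 3, 3, 2, 1, 2, 3, 2, 1, 3, 1]

# A fresh randomization; the seed is arbitrary (an assumption for repeatability)
rng = random.Random(514)
u, ranks = ranking_permutation(len(treatments), rng)
print(randomized_run_order(treatments, ranks))
```

Because the ranks form a permutation of 1 to n, the randomized run order always keeps the original group sizes, so balance is preserved.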
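
The two-sample permutation test from the Randomization Inference section can be enumerated exactly for the four observations 22, 40, 10, 16. The sketch below is one possible implementation under the assumption that the statistic is the difference of group means; the function name `permutation_test` and the choice to report both one-sided and two-sided p-values are illustrative, not taken from the notes.

```python
from itertools import combinations


def permutation_test(values, a_positions):
    """Exact two-sample permutation test for mean(A) - mean(B)."""
    n, n_a = len(values), len(a_positions)

    def diff(a_idx):
        a = [values[i] for i in a_idx]
        b = [values[i] for i in range(n) if i not in a_idx]
        return sum(a) / len(a) - sum(b) / len(b)

    observed = diff(a_positions)
    # Every way of labelling n_a of the n units as "A"
    null = [diff(idx) for idx in combinations(range(n), n_a)]
    one_sided = sum(d >= observed for d in null) / len(null)
    two_sided = sum(abs(d) >= abs(observed) for d in null) / len(null)
    return observed, null, one_sided, two_sided


# Data from the notes: A A B B with responses 22, 40, 10, 16
observed, null, p_one, p_two = permutation_test([22, 40, 10, 16], (0, 1))
print(observed)      # 18.0
print(sorted(null))  # [-18.0, -12.0, -6.0, 6.0, 12.0, 18.0]
print(p_one, p_two)  # 1/6 one-sided, 2/6 two-sided
```

With only six orderings, the smallest attainable one-sided p-value is 1/6, so no outcome of this tiny design could be declared significant at the usual 5% level. For $n_1 = n_2 = 5$ the same function would enumerate 252 orderings.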
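
The randomization version of the paired test in the frog example can be checked the same way, by flipping the sign of each within-frog difference. This is a hedged sketch: the function name `sign_flip_test` is hypothetical, and the statistic is assumed to be the sum of the differences, as in the occurrence table of the notes.

```python
from collections import Counter
from itertools import product


def sign_flip_test(diffs):
    """Exact paired randomization test: enumerate all 2^n sign assignments."""
    observed = sum(diffs)
    sums = [
        sum(sign * abs(d) for sign, d in zip(signs, diffs))
        for signs in product([1, -1], repeat=len(diffs))
    ]
    # Two-sided p-value: share of outcomes at least as extreme as the observed sum
    p_value = sum(abs(t) >= abs(observed) for t in sums) / len(sums)
    return sums, p_value


# Control minus progesterone for frogs 1 to 4, from the table in the notes
sums, p_value = sign_flip_test([2, -1, 3, 2])
print(len(sums), p_value)  # 16 0.25
print(sorted(Counter(abs(t) for t in sums).items(), reverse=True))
# [(8, 2), (6, 2), (4, 4), (2, 6), (0, 2)]
```

The counts match the occurrence table, and the p-value of 0.25 is somewhat larger than the two-sided t test value of about 0.18 quoted in the notes.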
