# STAT DESIGN&EXPERMT STAT 402

ISU

This 56 page Class Notes was uploaded by Giovani Ullrich PhD on Saturday September 26, 2015. The Class Notes belongs to STAT 402 at Iowa State University taught by Staff in Fall. Since its upload, it has received 13 views. For similar materials see /class/214414/stat-402-iowa-state-university in Statistics at Iowa State University.

Date Created: 09/26/15

## Introduction to SAS Procedures and basic Data step programming

Philip Dixon, 3 January 2007

This document provides general guidance for writing and running SAS programs and summarizes some of the useful procedures. Examples of these procedures in the context of an entire SAS program are provided on my STAT 402 web pages: www.public.iastate.edu/~pdixon/stat402/sasprog.html

The examples all use the insect data file, 'insect.txt', from the data files section of the 402 web page. This file has two measurements on each of 6 trees. The first line is a header line giving variable names. The remaining 6 lines each have data from one tree. There are three pieces of information: the tree number, the number of insect species found last year, and the number of insect species found this year. Values on a line are separated by one or more spaces. A lonely . indicates a missing value. SAS can also read files with tabs or commas between fields, but to do this, you need to add options to the INFILE command.

**Data steps**

**INFILE commands, CARDS statements, or DATALINES statements:** The easiest way to read data is from a text file. The data can be included with the SAS commands or kept separate. You use an INFILE command when you keep the data separate. You use a CARDS or a DATALINES statement when you include the data with the SAS commands.

The INFILE command specifies the name of the file containing the data. The file name goes in quotes. It can include whatever path information (e.g. 'g:\philip\insect.txt') is needed.

    data insect;
      infile 'C:\Documents and Settings\pdixon.IASTATE\My Documents\insect.txt';
      input tree last this;

To read data stored with the SAS commands, use CARDS or DATALINES:

    data insect;
      input tree last this;
      cards;
    1 42 37
    2 51 47
    ;

Notes: CARDS or DATALINES is the last command in the data step; the data ends with a ;

**INPUT commands:** The INPUT command tells SAS how to read your data. There are many options for reading complex data sets; we'll use the simplest, 'list input'. The input line lists the variables to be read from one line of data. Each
data value is separated by one or more spaces. So, the above input lines read three numbers from each line of data. The first is stored in the variable 'tree', the second in 'last', and the third in 'this'.

**Character variables:** By default, SAS considers all variables as numeric (i.e. containing 1, 2.1, ...). It is often helpful to read character variables (e.g. Dry, Moist, Fluctuating), especially for treatment or site names. Since "Dry" cannot be converted to a number, you need to tell SAS which variables contain character information. You do this by putting a $ after the variable name on the input line. You cannot do arithmetic operations (e.g. compute a mean) on character variables.

    data insect;
      input tree $ last this;
      cards;
    A 42 37
    B 51 47
    C 23 17
    D 62 .
    E . 32
    F 37 27
    ;

**Missing Data:** When you use list input, the order of the values matters. If a value is not recorded (e.g. the last value for tree E), you need to indicate the missing value. SAS uses a lone decimal point to indicate a missing value. You can use this in data sets (e.g. insect.txt) or in commands (see the commands for deleting observations, below).

**Saving paper:** SAS output (the Listing window) can fill many pages. Two ways to reduce the paper:

1. Add an option to the top of your file: `options formdlim='-';` Pages of output are separated by ----- instead of being put on new sheets of paper.
2. Use Microsoft Word or other software to print 2 or 4 pages on one physical page.

**Data step programming:** All transformations and computations on variables are done in the DATA step. Each operation is a separate command. Each goes after the INPUT command. Here are some of the common commands:

1. Log transformations (default is natural log): `loglast = log(last);`
2. Difference: `diff = this - last;`
3. Recoding, e.g. number to character: `if tree = 1 then newtree = 'A rubr'; else if tree = 2 then newtree = 'B nigr'; else if tree lt 5 then newtree = 'C occ'; else newtree = 'Q alba';`
4. Deleting observations: `if tree = 5 then delete;`
5. Keeping certain observations (no then clause): `if tree gt 3;`

An example:

    data insect;
      infile 'insect.txt';
      input tree last this;
      diff = this - last;
    proc print;
      title 'Insect data set with difference';
    run;

**PROC PRINT:** This proc prints out a data set (see example above). The title statement is optional.

**PROC PLOT:** This proc plots two variables. The PLOT command specifies the variables to be plotted. The syntax is `plot Y*X;`. The Y axis variable comes first, then the X axis variable.

    data insect;
      infile 'insect.txt';
      input tree last this;
      diff = this - last;
    proc plot;
      plot this*last;
      title 'Insect data set';
    run;

**PROC GPLOT:** Draws high resolution plots in a new window. Same syntax as PROC PLOT.

**PROC UNIVARIATE and PROC MEANS:** Both compute descriptive statistics (means, variances, other quantities). The VAR command specifies which variables are to be used. UNIVARIATE computes more descriptive statistics. MEANS gives you more compact output and is easier to use to produce a new data set. If you want descriptive statistics for subgroups (e.g. for each treatment), then you need to use the BY command. To use BY, the data need to be sorted by that variable first. The PLOT option to PROC UNIVARIATE draws (low resolution) box plots.

    data insect;
      infile 'insect.txt';
      input tree last this;
      length group $ 6;   /* without this, 'Second' is truncated to 'Secon' */
      if tree lt 4 then group = 'First';
      else group = 'Second';
    proc univariate;
      var last this;
    run;
    proc means;
      var last this;
    run;
    proc sort;
      by group;
    run;
    proc univariate plot;
      by group;
      var last this;
    run;

**PROC BOXPLOT:** Draws high resolution box plots. Only works with windowing SAS. The PLOT command describes the response variable and the group variable. The syntax is `plot response*group;`.

    data insect;
      infile 'insect.txt';
      input tree last this;
      length group $ 6;   /* without this, 'Second' is truncated to 'Secon' */
      if tree lt 4 then group = 'First';
      else group = 'Second';
    run;
    proc boxplot;
      plot last*group;
      title 'Box plot of last values for each group of trees';
    run;

**PROC IMPORT:** SAS can import data directly from an Excel spreadsheet. The format and types of data that can be read differ slightly in PC and Vincent versions. DATAFILE= specifies the data set and its type, indicated by the file extension (.xls is an Excel spreadsheet, .csv is a comma delimited file). OUT= specifies the SAS data set name. Here's the PC format. See me if
you need to read Vincent or Mac files. Again, you may need to use a different path to the file.

    proc import datafile='C:\Documents and Settings\pdixon.IASTATE\My Documents\insect.xls' out=insect;
    run;
    proc print;
      title 'Insect data set';
    run;

## Randomized Complete Block Designs

Adapted from Experimental Designs, 2nd ed. (1957) by Cochran and Cox, John Wiley & Sons, Inc.

Probably the most frequently used design is the randomized complete block design (RCBD). The fundamental idea of the RCBD is to group experimental material together into homogeneous blocks. The object of this grouping is to keep the errors within each group as small as possible. The randomized complete block design has several advantages:

- Using blocks of more homogeneous experimental material usually results in more accurate results than if a completely randomized design is used.
- Any number of treatments and any number of replicates (blocks) can be used. In the complete block design, every treatment will have the same number of replicates.
- Statistical analysis is straightforward.
- If the variance is larger for some treatments than others, an unbiased error for testing any specific combination of the treatment means can be obtained.

The only disadvantage to the RCBD comes when there are missing values. If an entire group (block) is missing, or if data on an entire treatment are missing, there is no difficulty with the analysis. However, when some individual units are missing there can be some problems. There is a missing plot technique that lets you use the available data to their fullest extent.

One other caution with the RCBD: if there are not real differences among the blocks, then blocking can actually cost you some precision in the estimate of error variability. This is because blocking takes away degrees of freedom from the estimate of the error variation.

## SPSS Output for Unbalanced Recall Data

**ANOVA** (Recall)

| | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 791.121 | 5 | 158.224 | 11.923 | .000 |
| Within Groups | 212.333 | 16 | 13.271 | | |
| Total | 1003.455 | 21 | | | |
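The F ratio in the one-way ANOVA table above can be checked by hand. The following is a minimal worked check, assuming the extracted digits carry SPSS's usual three decimal places (791.121, 212.333, and so on), with 6 treatment groups and 22 observations as reported in the descriptive statistics for these data:

```latex
\begin{align*}
df_{\text{between}} &= 6 - 1 = 5, \qquad df_{\text{within}} = 22 - 6 = 16,\\
MS_{\text{between}} &= \frac{SS_{\text{between}}}{df_{\text{between}}} = \frac{791.121}{5} \approx 158.224,\\
MS_{\text{within}} &= \frac{SS_{\text{within}}}{df_{\text{within}}} = \frac{212.333}{16} \approx 13.271,\\
F &= \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{158.224}{13.271} \approx 11.92 .
\end{align*}
```

The two sums of squares also add up to the total within rounding: 791.121 + 212.333 = 1003.454.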
**Multiple Comparisons** (Dependent Variable: Recall; Tukey HSD)

| (I) Treatment | (J) Treatment | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|
| N20 | N40 | -8.50000 | 2.57593 | .043 | -16.8000 | -.2000 |
| | N60 | 8.33333 | 2.78232 | .076 | -.6317 | 17.2984 |
| | V20 | 2.33333 | 2.78232 | .956 | -6.6317 | 11.2984 |
| | V40 | -3.50000 | 2.57593 | .750 | -11.8000 | 4.8000 |
| | V60 | -9.50000 | 2.57593 | .020 | -17.8000 | -1.2000 |
| N40 | N20 | 8.50000 | 2.57593 | .043 | .2000 | 16.8000 |
| | N60 | 16.83333 | 2.78232 | .000 | 7.8683 | 25.7984 |
| | V20 | 10.83333 | 2.78232 | .014 | 1.8683 | 19.7984 |
| | V40 | 5.00000 | 2.57593 | .415 | -3.3000 | 13.3000 |
| | V60 | -1.00000 | 2.57593 | .999 | -9.3000 | 7.3000 |
| N60 | N20 | -8.33333 | 2.78232 | .076 | -17.2984 | .6317 |
| | N40 | -16.83333 | 2.78232 | .000 | -25.7984 | -7.8683 |
| | V20 | -6.00000 | 2.97443 | .375 | -15.5841 | 3.5841 |
| | V40 | -11.83333 | 2.78232 | .007 | -20.7984 | -2.8683 |
| | V60 | -17.83333 | 2.78232 | .000 | -26.7984 | -8.8683 |
| V20 | N20 | -2.33333 | 2.78232 | .956 | -11.2984 | 6.6317 |
| | N40 | -10.83333 | 2.78232 | .014 | -19.7984 | -1.8683 |
| | N60 | 6.00000 | 2.97443 | .375 | -3.5841 | 15.5841 |
| | V40 | -5.83333 | 2.78232 | .337 | -14.7984 | 3.1317 |
| | V60 | -11.83333 | 2.78232 | .007 | -20.7984 | -2.8683 |
| V40 | N20 | 3.50000 | 2.57593 | .750 | -4.8000 | 11.8000 |
| | N40 | -5.00000 | 2.57593 | .415 | -13.3000 | 3.3000 |
| | N60 | 11.83333 | 2.78232 | .007 | 2.8683 | 20.7984 |
| | V20 | 5.83333 | 2.78232 | .337 | -3.1317 | 14.7984 |
| | V60 | -6.00000 | 2.57593 | .239 | -14.3000 | 2.3000 |
| V60 | N20 | 9.50000 | 2.57593 | .020 | 1.2000 | 17.8000 |
| | N40 | 1.00000 | 2.57593 | .999 | -7.3000 | 9.3000 |
| | N60 | 17.83333 | 2.78232 | .000 | 8.8683 | 26.7984 |
| | V20 | 11.83333 | 2.78232 | .007 | 2.8683 | 20.7984 |
| | V40 | 6.00000 | 2.57593 | .239 | -2.3000 | 14.3000 |

**Descriptive Statistics** (Dependent Variable: Recall)

| Reinforcement | Isolation Time | Mean | Std. Deviation | N |
|---|---|---|---|---|
| None | 20 minutes | 22.0000 | 3.65148 | 4 |
| | 40 minutes | 30.5000 | 3.69685 | 4 |
| | 60 minutes | 13.6667 | 5.50757 | 3 |
| | Total | 22.8182 | 7.94756 | 11 |
| Verbal | 20 minutes | 19.6667 | 3.05505 | 3 |
| | 40 minutes | 25.5000 | 2.38048 | 4 |
| | 60 minutes | 31.5000 | 3.41565 | 4 |
| | Total | 26.0909 | 5.59383 | 11 |
| Total | 20 minutes | 21.0000 | 3.36650 | 7 |
| | 40 minutes | 28.0000 | 3.92792 | 8 |
| | 60 minutes | 23.8571 | 10.33487 | 7 |
| | Total | 24.4545 | 6.91256 | 22 |

**Tests of Between-Subjects Effects** (Dependent Variable: Recall)

| Source | Type III Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Corrected Model | 791.121 (a) | 5 | 158.224 | 11.923 | .000 |
| Intercept | 12240.817 | 1 | 12240.817 | 922.385 | .000 |
| Reinforcement | 66.150 | 1 | 66.150 | 4.985 | .040 |
| IsolationTime | 210.509 | 2 | 105.254 | 7.931 | .004 |
| Reinforcement * IsolationTime | 553.937 | 2 | 276.969 | 20.870 | .000 |
| Error | 212.333 | 16 | 13.271 | | |
| Total | 14160.000 | 22 | | | |
| Corrected Total | 1003.455 | 21 | | | |

a. R Squared = .788 (Adjusted R Squared = .722)

**Multiple Comparisons** (Dependent Variable: Recall; LSD)

| (I) Isolation Time | (J) Isolation Time | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|
| 20 minutes | 40 minutes | -7.0000 | 1.88539 | .002 | -10.9968 | -3.0032 |
| | 60 minutes | -2.8571 | 1.94722 | .162 | -6.9851 | 1.2708 |
| 40 minutes | 20 minutes | 7.0000 | 1.88539 | .002 | 3.0032 | 10.9968 |
| | 60 minutes | 4.1429 | 1.88539 | .043 | .1460 | 8.1397 |
| 60 minutes | 20 minutes | 2.8571 | 1.94722 | .162 | -1.2708 | 6.9851 |
| | 40 minutes | -4.1429 | 1.88539 | .043 | -8.1397 | -.1460 |

Based on observed means. The mean difference is significant at the .05 level.

## Stat 402: More uses of contrasts in a 2 way factorial design

**Why I don't use any multiple comparison adjustment for an orthogonal (or nearly orthogonal) set of contrasts:** The usual tests in a 2 way factorial have no multiple comparison adjustment. Even when there are lots of tests, e.g. in a 4 way factorial, I've never seen any multiple testing adjustment. When the n's are equal, those tests are a set of orthogonal contrasts. So, be consistent. Don't adjust other sets.

**Usual structure, unusual questions:** Treatments have a 2 way structure, but the questions are not the usual questions. One example:

- crop genotype, 3 levels: GMO type A, GMO type B, isoline control (C)
- insecticide, 2 levels: none (N), standard practice (I)

Questions:

1. Compare the GMO type A without insecticide to control with standard practice.
2. Compare the GMO type B without insecticide to control with standard practice.

| | $\mu_{AN}$ | $\mu_{BN}$ | $\mu_{CN}$ | $\mu_{AI}$ | $\mu_{BI}$ | $\mu_{CI}$ |
|---|---|---|---|---|---|---|
| Q 1 | 1 | 0 | 0 | 0 | 0 | -1 |
| Q 2 | 0 | 1 | 0 | 0 | 0 | -1 |

Could also ask other questions, but these two are the most important. Use the contrasts above to answer.

**Augmented factorials** (not the only name used for this sort of design): Common in crop fertility studies; could easily happen elsewhere.

- Type of N fertilizer, 2 levels: organic or ammonium nitrate
- Rate of application, 4 levels: 0, 50, 75 or 100

Typical sort of cell means:

| Type | 0 | 50 | 75 | 100 |
|---|---|---|---|---|
| organic | 70 | 120 | 130 | 140 |
| AmNitr | 70 | 140 | 150 | 160 |

Usual 2 way ANOVA: strong evidence of an interaction. But, expect
that if type of fertilizer has any effect when applied: the two 0 rate cells are the same treatment.

Using contrasts among the 7 unique treatments:

| Contrast | control (0) | $\mu_{O,50}$ | $\mu_{O,75}$ | $\mu_{O,100}$ | $\mu_{AN,50}$ | $\mu_{AN,75}$ | $\mu_{AN,100}$ |
|---|---|---|---|---|---|---|---|
| Ave. Fertilizer | -1 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |
| Type, when present | 0 | 1/3 | 1/3 | 1/3 | -1/3 | -1/3 | -1/3 |
| Rate, when present | 0 | -1/2 | 0 | 1/2 | -1/2 | 0 | 1/2 |
| | 0 | 1/4 | -1/2 | 1/4 | 1/4 | -1/2 | 1/4 |
| T*R, when present | 0 | -1 | 0 | 1 | 1 | 0 | -1 |
| | 0 | 1 | -2 | 1 | -1 | 2 | -1 |

## JMP output for fuel economy experiment

**Oneway Analysis of mpg By Vehicle**

[Figure: mpg (y axis, about 20 to 40) plotted against Vehicle (M, O, P, V), with comparison circles for Each Pair, Student's t, 0.05]

**Oneway Anova: Summary of Fit**

| | |
|---|---|
| Rsquare | 0.993534 |
| Adj Rsquare | 0.992841 |
| Root Mean Square Error | 0.709628 |
| Mean of Response | 27.325 |
| Observations (or Sum Wgts) | 32 |

**Analysis of Variance**

| Source | DF | Sum of Squares | Mean Square | F Ratio | Prob > F |
|---|---|---|---|---|---|
| Vehicle | 3 | 2166.4600 | 722.153 | 1434.063 | <.0001 |
| Error | 28 | 14.1000 | 0.504 | | |
| C. Total | 31 | 2180.5600 | | | |

**Means for Oneway Anova**

| Level | Number | Mean | Std Error | Lower 95% | Upper 95% |
|---|---|---|---|---|---|
| M | 8 | 24.7000 | 0.25089 | 24.186 | 25.214 |
| O | 8 | 19.3000 | 0.25089 | 18.786 | 19.814 |
| P | 8 | 24.2000 | 0.25089 | 23.686 | 24.714 |
| V | 8 | 41.1000 | 0.25089 | 40.586 | 41.614 |

Std Error uses a pooled estimate of error variance.

**Means Comparisons: default output for Compare Means, Each Pair, Student's t**

Comparisons for each pair using Student's t: t = 2.04841, Alpha = 0.05

| Abs(Dif)-LSD | V | M | P | O |
|---|---|---|---|---|
| V | -0.727 | 15.673 | 16.173 | 21.073 |
| M | 15.673 | -0.727 | -0.227 | 4.673 |
| P | 16.173 | -0.227 | -0.727 | 4.173 |
| O | 21.073 | 4.673 | 4.173 | -0.727 |

| Level | | Mean |
|---|---|---|
| V | A | 41.100000 |
| M | B | 24.700000 |
| P | B | 24.200000 |
| O | C | 19.300000 |

Levels not connected by same letter are significantly different.

**Means Comparisons: Set Alpha Level at 0.01**

Comparisons for each pair using Student's t: t = 2.76326, Alpha = 0.01

| Abs(Dif)-LSD | V | M | P | O |
|---|---|---|---|---|
| V | -0.980 | 15.420 | 15.920 | 20.820 |
| M | 15.420 | -0.980 | -0.480 | 4.420 |
| P | 15.920 | -0.480 | -0.980 | 3.920 |
| O | 20.820 | 4.420 | 3.920 | -0.980 |

| Level | | Mean |
|---|---|---|
| V | A | 41.100000 |
| M | B | 24.700000 |
| P | B | 24.200000 |
| O | C | 19.300000 |

Levels not connected by same letter are significantly different.

**Means Comparisons: Comparisons for all pairs using Tukey-Kramer HSD**

q* = 2.73031, Alpha = 0.05

| Abs(Dif)-LSD | V | M | P | O |
|---|---|---|---|---|
| V | -0.969 | 15.431 | 15.931 | 20.831 |
| M | 15.431 | -0.969 | -0.469 | 4.431 |
| P | 15.931 | -0.469 | -0.969 | 3.931 |
| O | 20.831 | 4.431 | 3.931 | -0.969 |

| Level | | Mean |
|---|---|---|
| V | A | 41.100000 |
| M | B | 24.700000 |
| P | B | 24.200000 |
| O | C | 19.300000 |

Levels not connected by same letter are significantly different.

## SPSS Analysis of Vehicle MPG

[Screenshots: SPSS Analyze menu and the One-Way ANOVA Post Hoc Multiple Comparisons dialog]

**ANOVA** (MPG)

| | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 2166.460 | 3 | 722.153 | 1434.063 | .000 |
| Within Groups | 14.100 | 28 | .504 | | |
| Total | 2180.560 | 31 | | | |

**Multiple Comparisons** (Dependent Variable: MPG; LSD)

| (I) Vehicle | (J) Vehicle | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|
| Mercedes | Oldsmobile | 5.40000 | .35481 | .000 | 4.6732 | 6.1268 |
| | Peugeot | .50000 | .35481 | .170 | -.2268 | 1.2268 |
| | Volkswagon | -16.40000 | .35481 | .000 | -17.1268 | -15.6732 |
| Oldsmobile | Mercedes | -5.40000 | .35481 | .000 | -6.1268 | -4.6732 |
| | Peugeot | -4.90000 | .35481 | .000 | -5.6268 | -4.1732 |
| | Volkswagon | -21.80000 | .35481 | .000 | -22.5268 | -21.0732 |
| Peugeot | Mercedes | -.50000 | .35481 | .170 | -1.2268 | .2268 |
| | Oldsmobile | 4.90000 | .35481 | .000 | 4.1732 | 5.6268 |
| | Volkswagon | -16.90000 | .35481 | .000 | -17.6268 | -16.1732 |
| Volkswagon | Mercedes | 16.40000 | .35481 | .000 | 15.6732 | 17.1268 |
| | Oldsmobile | 21.80000 | .35481 | .000 | 21.0732 | 22.5268 |
| | Peugeot | 16.90000 | .35481 | .000 | 16.1732 | 17.6268 |

The mean difference is significant at the .05 level.

## Intro to SAS: lab activities

Stat 402A, Philip Dixon

This is intended to be used with the handout "Using SAS on PC's", available on the Stat 402 website. That handout explains the concepts behind these activities. This document directs you through running a SAS program, editing SAS programs, and how to get help. Activities are in bold. My comments are in regular text.

**Start up SAS.** The icon should be on the desktop. It is a grey triangle with a yellow arrow. The icon text should say SAS 9.1. Or, find SAS in the Start Menu/Programs list.

The SAS panel has 4 windows:
- Explorer: we won't use this at all (at least for this class).
- Editor: this is where you type or copy your SAS program.
- Log: this is where SAS puts errors or notes about your program.
- Output: this is where results go when the program works.

**Start up your favorite browser (IE, Firefox, Netscape). Navigate to the SAS programs part of the Stat 402A web site. Open tomato2.sas.**

**Select the program, copy it to the clipboard, and paste it into the SAS editor window** (ctrl-A, ctrl-C, then ctrl-V). You could also save the program on the hard disk or a flash drive, then open it in SAS.

**Run tomato2.sas.** You have three ways to run the program:

- click the submit icon. This is the running person on the SAS shortcut bar.
- click the submit menu item. Click on Run on the menu bar, then click on Submit.
- hit the F3 function key. This is the keyboard shortcut.

You should see new text in both the log and output windows. Ask for help if you don't see the output.

tomato2.sas reads data into a SAS data set, then draws side-by-side boxplots, does a t test, then fits a 1-way ANOVA. More information, especially about the data step and input statements, is in the Using SAS handout. More information about SAS procedures is in the handout on basic SAS procedures and data step programming.

Notice where the data go: after all the other commands in the data step, between the DATALINES command and the lonely ;

It can be difficult to debug SAS programs when you include the data with the program. These data sets are small; imagine having 1000 lines of data between the data step and the proc steps. It is easy to read data from an external file. The only problem is that SAS does not use the My Documents folder by default.

**Change the default directory to My Documents or a folder in My Documents.** Locate the text C:\Documents... on the bottom line of the SAS window and double click on that text. A Change Folder window will pop up. Navigate to My Documents or a folder in My Documents. I recommend a folder for all your Stat 402 work. If you don't like My Documents, you can use whatever location you
would like.

**Use your browser to open tomato.txt, then save it in your My Documents\Stat 402 folder. Save tomato.sas in your My Documents\Stat 402 folder.**

**Click on the Editor window so it is highlighted, then click the open icon or use File/Open. Navigate to the My Documents\Stat 402 folder and click on tomato.sas.** You should see a new window with the tomato.sas code in it.

**Close the editor window with tomato2.sas.**

If you submit the tomato.sas code, the output and log information will be appended to what is already there.

**Clear the output and log windows.** Highlight the window, then either click the CLEAR icon (the black X next to the submit icon), select Edit/Clear All from the menu bar, or type ctrl-E (the keyboard shortcut). Repeat with the other window. You can also clear the editor window this way. ctrl-Z (undo) will undo any of the above clears.

**Submit the tomato.sas code.** The output is identical; the only difference is that the data is not included with the SAS program.

**Your practice:** Download the shoil data from the class web page. Modify your tomato.sas or tomato2.sas code to run a t test comparing means of the two shoil treatments.

**Getting help:** The easiest ways are in office hours or to email the contents of the log window to me. If you are stuck for more than 15 minutes, ask for help. Other students in the computer lab may be able to help. If not, send me an email. The best way to use email is: clear the log window (see above), submit the program again, highlight the log window, select all the text, and paste it into the body of an email message. Send that message as a plain text file, not a styled text or html file. Use Stat 402 or something like that as the subject. I get lots of spam with HELP in the subject line.

If you can't remember an option, look back through previous SAS programs. They'll all be posted on the class web page. Or, look at the SAS help files. To find names of options, you need the Syntax part: click on the help icon (book with ? on cover); expand SAS Products; then (1) expand SAS/STAT (statistical procedures); expand SAS/STAT User's Guide;
scroll down and expand the procedure of interest (e.g. PROC TTEST); expand Syntax; click on the statement of interest. Or (2) expand Base SAS (data step, simple stats). For data step commands, then expand SAS Language Dictionary, expand Statements, and click on the statement of interest. For simple stats procedures and utility procedures, then expand Procedures and click on the procedure of interest. You will find proc means, proc print, proc plot, and proc import here. For proc corr, proc freq, or proc univariate, expand SAS Procedures: CORR, FREQ, UNIVARIATE and click on the procedure of interest (these procedures are also linked to the previous Procedures list).

## Stat 402A: 3 way factorial treatment designs

Concepts from 2 way extend to more (3, 4, or more) factors.

Complete factorial: all combinations of levels used as treatments. Can be many treatments. E.g. an experiment on pig feed composition: 4 levels of lysine, 3 levels of methionine, and 2 levels of protein give 24 treatments; 2 reps per treatment give 48 observations.

Effects are defined just as before. Marginal means are averages over "left out" factors and replicates:

- marginal means for lysine: averages over all levels of methionine, all levels of protein, and all replicates
- marginal means for lysine*protein: averages over all levels of methionine and all replicates
- cell means for M*L*P: averages over all replicates

Compute marginal means by averaging cell means (works for balanced or unbalanced data).

Skeleton ANOVA table: I x J x K factorial, n reps, in a CRD

| Source | df | df in general |
|---|---|---|
| Treatments | 23 | IJK - 1 |
| Lysine | 3 | I - 1 |
| Methionine | 2 | J - 1 |
| Protein | 1 | K - 1 |
| L*M | 6 | (I - 1)(J - 1) |
| L*P | 3 | (I - 1)(K - 1) |
| M*P | 2 | (J - 1)(K - 1) |
| something else | 6 | (I - 1)(J - 1)(K - 1) |
| Error | 24 | IJK(n - 1) |

What is the something else? Answer: the 3 way interaction of L*M*P.

Interpretation of 3 way interaction:

1. A problem (see below for possible solutions).
2. Answer to the question: Are the two way interaction effects the same for each level of the third factor?

Remember the 2 way interaction: Are the simple effects of A the same at every level of B? The 3 way interaction generalizes this: Are the 2-way AB interaction effects
the same for each level of C?

Example: row 3 on the graphs has no 3 way interaction. Why? The cell means:

| level of C | level of B | Cell mean for A1 | Cell mean for A2 |
|---|---|---|---|
| 1 | 1 | 6 | 6 |
| 1 | 2 | 1 | 3 |
| 2 | 1 | 2 | 4 |
| 2 | 2 | 1 | 5 |

When C = 1, the interaction effect for AB is (simple effect of A when B = 1) - (simple effect of A when B = 2) = (6 - 6) - (3 - 1) = -2. When C = 2, the interaction effect for AB is (4 - 2) - (5 - 1) = -2. These are the same, so there is no 3 way interaction.

4 way interaction (if four factors): Are the 3 way ABC interaction effects the same for each level of D?

Why is there no AB interaction in the fourth set of plots? The 2 way interaction is a comparison of averages over replicates and levels of C. The cell means and the averages of C1 and C2:

| level of C | level of B | Cell mean for A1 | Cell mean for A2 |
|---|---|---|---|
| 1 | 1 | 3 | 7 |
| 1 | 2 | 3 | 3 |
| 2 | 1 | 4 | 4 |
| 2 | 2 | 0 | 4 |
| ave | 1 | 3.5 | 5.5 |
| ave | 2 | 1.5 | 3.5 |

Effect of A when B = 1 is 5.5 - 3.5 = 2; when B = 2 it is 3.5 - 1.5 = 2. No AB interaction.

What to do if the 3 way interaction is significant:

1. Is an additive model appropriate? (multiplicative effects)
2. Split the data: describe L and M effects at each level of P, or P and M effects at each level of L, or P effects at each level of L and M. Use the 2 way interactions to decide how to split. The goal is a simple summary of treatment effects.
3. Ignore it: if the magnitude is small, you may decide to use the marginal mean as a single approximation.

Concepts extend to many factors. Commonly see 4 or 5 factor designs. Have seen 15. Watch out for the number of treatments.

Industrial screening studies: often 2 levels of each treatment. An 8 factor study has 256 treatments: 8 main effects and lots of information about lots of interactions (28 2-way interactions, 56 3-way interactions, ..., 1 8-way interaction).

Often (not always) the magnitude of main effects > that of 2 way interactions > 3 way > ... > 8 way interaction. Most interested in main effects and perhaps 2 way interactions. It may be reasonable to assume high order interactions (e.g. 4 way, 5 way, ..., 8 way) are zero. If so, you can reduce the number of treatments and still get good estimates of main effects and some interactions.

Fractional factorials: no details, see me if you want to consider one. Study 3 factors, 2
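levels each, or 8 treatments. A 1/2 fraction estimates main effects (no interactions) from 4 treatments.

| Trt | A | B | C |
|---|---|---|---|
| 1 | | | |
| 2 | | | |
| 3 | | | |
| 4 | | | |
| omit | | | |
| omit | | | |
| omit | | | |
| omit | | | |

Notice each treatment level occurs twice and is absent twice. Estimates of the main effects are more precise than if you used a 'one at a time' design (also four treatments):

| Trt | A | B | C |
|---|---|---|---|
| 1 | | | |
| 2 | | | |
| 3 | | | |
| 4 | | | |

## SPSS Output for Treatment of Agoraphobia Experiment

**Between-Subjects Factors**

| Factor | Value | Label | N |
|---|---|---|---|
| Drug | 1.00 | Prozac | 18 |
| | 2.00 | Elavil | 18 |
| | 3.00 | Xanax | 18 |
| | 4.00 | Placebo | 18 |
| Therapy | 1.00 | Psychodynamic | 24 |
| | 2.00 | CogBeh | 24 |
| | 3.00 | Group | 24 |
| Depression | 1.00 | No | 36 |
| | 2.00 | Yes | 36 |

**Tests of Between-Subjects Effects** (Dependent Variable: Severity)

| Source | Type III Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Corrected Model | 1009.278 (a) | 23 | 43.882 | 15.957 | .000 |
| Intercept | 12746.722 | 1 | 12746.722 | 4635.172 | .000 |
| Drug | 486.722 | 3 | 162.241 | 58.997 | .000 |
| Therapy | 33.361 | 2 | 16.681 | 6.066 | .004 |
| Depression | 40.500 | 1 | 40.500 | 14.727 | .000 |
| Drug * Therapy | 385.528 | 6 | 64.255 | 23.365 | .000 |
| Drug * Depression | 20.944 | 3 | 6.981 | 2.539 | .068 |
| Therapy * Depression | 31.083 | 2 | 15.542 | 5.652 | .006 |
| Drug * Therapy * Depression | 11.139 | 6 | 1.856 | .675 | .670 |
| Error | 132.000 | 48 | 2.750 | | |
| Total | 13888.000 | 72 | | | |
| Corrected Total | 1141.278 | 71 | | | |

a. R Squared = .884 (Adjusted R Squared = .829)

**Estimated means** (Dependent Variable: Severity)

| | Mean | Std. Error | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|
| 1. Grand Mean | 13.306 | .195 | 12.913 | 13.699 |
| 2. Drug: Prozac | 9.944 | .391 | 9.159 | 10.730 |
| Elavil | 11.667 | .391 | 10.881 | 12.453 |
| Xanax | 15.278 | .391 | 14.492 | 16.064 |
| Placebo | 16.333 | .391 | 15.547 | 17.119 |
| 3. Therapy: Psychodynamic | 14.125 | .339 | 13.444 | 14.806 |
| CogBeh | 12.458 | .339 | 11.778 | 13.139 |
| Group | 13.333 | .339 | 12.653 | 14.014 |
| 4. Depression: No | 14.056 | .276 | 13.500 | 14.611 |
| Yes | 12.556 | .276 | 12.000 | 13.111 |

The output continues with tables 5 (Drug * Therapy), 6 (Drug * Depression), 7 (Therapy * Depression), and 8 (Drug * Therapy * Depression).

**Multiple Comparisons: Drug.** The legible mean differences (absolute values) are Prozac vs Xanax 5.3333, Elavil vs Xanax 3.6111, Elavil vs Placebo 4.6667, and Xanax vs Placebo 1.0556. Based on observed means. The mean difference is significant at the .05 level.

**Severity: homogeneous subsets for Drug** (Tukey HSD a,b)

| Drug | N | Subset 1 | Subset 2 | Subset 3 |
|---|---|---|---|---|
| Prozac | 18 | 9.9444 | | |
| Elavil | 18 | | 11.6667 | |
| Xanax | 18 | | | 15.2778 |
| Placebo | 18 | | | 16.3333 |
| Sig. | | 1.000 | 1.000 | .238 |

Means for groups in homogeneous subsets are displayed. Based on Type III Sum of Squares. The error term is Mean Square(Error) = 2.750. a. Uses Harmonic Mean Sample Size = 18.000. b. Alpha = .05.

**Multiple Comparisons: Therapy.** The legible entries are a mean difference of .8750 with standard error .47871. Based on observed means. The mean difference is significant at the .05 level.

**Severity: homogeneous subsets for Therapy** (Tukey HSD a,b)

| Therapy | N | Subset 1 | Subset 2 |
|---|---|---|---|
| CogBeh | 24 | 12.4583 | |
| Group | 24 | 13.3333 | 13.3333 |
| Psychodynamic | 24 | | 14.1250 |
| Sig. | | .171 | .233 |

Means for groups in homogeneous subsets are displayed. Based on Type III Sum of Squares. The error term is Mean Square(Error) = 2.750. a. Uses Harmonic Mean Sample Size = 24.000. b. Alpha = .05.

## Incorrect Analysis of Reaction Time Data

Response: Reaction Time. Factor: Alcohol.

[Figure: Reaction Time (y axis, about 1 to 5) plotted against Alcohol (0, 2, 4)]

**Summary of Fit**

| | |
|---|---|
| Rsquare | 0.104701 |
| Adj Rsquare | 0.038382 |
| Root Mean Square Error | 0.923119 |
| Mean of Response | 2.226667 |
| Observations (or Sum Wgts) | 30 |

**Analysis of Variance**

| Source | DF | Sum of Squares | Mean Square | F Ratio | Prob > F |
|---|---|---|---|---|---|
| Alcohol | 2 | 2.690667 | 1.34533 | 1.5788 | 0.2247 |
| Error | 27 | 23.008000 | 0.85215 | | |
| C. Total | 29 | 25.698667 | | | |

**Means and Std Deviations**

| Level | Number | Mean | Std Dev |
|---|---|---|---|
| 0 | 10 | 1.94000 | 0.89592 |
| 2 | 10 | 2.10000 | 1.00333 |
| 4 | 10 | 2.64000 | 0.86436 |

[Figure: Plot of Residuals versus Alcohol Level (residuals from the oneway fit against Alcohol 0, 2, 4)]

[Figure: Distribution of Residuals, with Normal Quantile Plot]

## Stat 402A: models for correlations between repeated measures

Some covariance matrices for three repeated measures per e.u., with the corresponding correlation models given further below.

Independence (1 parameter: $\sigma^2$):

$$\begin{pmatrix} \sigma^2 & 0 & 0 \\ 0 & \sigma^2 & 0 \\ 0 & 0 & \sigma^2 \end{pmatrix}$$

Split plot / Compound Symmetry (2 parameters: $\sigma^2, \sigma_m^2$):

$$\begin{pmatrix} \sigma_m^2+\sigma^2 & \sigma_m^2 & \sigma_m^2 \\ \sigma_m^2 & \sigma_m^2+\sigma^2 & \sigma_m^2 \\ \sigma_m^2 & \sigma_m^2 & \sigma_m^2+\sigma^2 \end{pmatrix}$$

AR(1) (2 parameters: $\sigma^2, \rho$):

$$\sigma^2 \begin{pmatrix} 1 & \rho & \rho^2 \\ \rho & 1 & \rho \\ \rho^2 & \rho & 1 \end{pmatrix}$$

AR(1) + RE (3 parameters: $\sigma_m^2, \sigma^2, \rho$):

$$\begin{pmatrix} \sigma_m^2+\sigma^2 & \sigma_m^2+\rho\sigma^2 & \sigma_m^2+\rho^2\sigma^2 \\ \sigma_m^2+\rho\sigma^2 & \sigma_m^2+\sigma^2 & \sigma_m^2+\rho\sigma^2 \\ \sigma_m^2+\rho^2\sigma^2 & \sigma_m^2+\rho\sigma^2 & \sigma_m^2+\sigma^2 \end{pmatrix}$$

ARH(1) (4 parameters: $\sigma_1^2, \sigma_2^2, \sigma_3^2, \rho$):

$$\begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 & \rho^2\sigma_1\sigma_3 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 & \rho\sigma_2\sigma_3 \\ \rho^2\sigma_1\sigma_3 & \rho\sigma_2\sigma_3 & \sigma_3^2 \end{pmatrix}$$

ARH(1) + RE (5 parameters: $\sigma_1^2, \sigma_2^2, \sigma_3^2, \rho, \sigma_m^2$):

$$\begin{pmatrix} \sigma_1^2+\sigma_m^2 & \rho\sigma_1\sigma_2+\sigma_m^2 & \rho^2\sigma_1\sigma_3+\sigma_m^2 \\ \rho\sigma_1\sigma_2+\sigma_m^2 & \sigma_2^2+\sigma_m^2 & \rho\sigma_2\sigma_3+\sigma_m^2 \\ \rho^2\sigma_1\sigma_3+\sigma_m^2 & \rho\sigma_2\sigma_3+\sigma_m^2 & \sigma_3^2+\sigma_m^2 \end{pmatrix}$$

ANTE(1) (5 parameters: $\sigma_1^2, \sigma_2^2, \sigma_3^2, \rho_1, \rho_2$):

$$\begin{pmatrix} \sigma_1^2 & \rho_1\sigma_1\sigma_2 & \rho_1\rho_2\sigma_1\sigma_3 \\ \rho_1\sigma_1\sigma_2 & \sigma_2^2 & \rho_2\sigma_2\sigma_3 \\ \rho_1\rho_2\sigma_1\sigma_3 & \rho_2\sigma_2\sigma_3 & \sigma_3^2 \end{pmatrix}$$

UN (6 parameters: $\sigma_1^2, \sigma_2^2, \sigma_3^2, \sigma_{12}, \sigma_{23}, \sigma_{13}$):

$$\begin{pmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} \\ \sigma_{12} & \sigma_2^2 & \sigma_{23} \\ \sigma_{13} & \sigma_{23} & \sigma_3^2 \end{pmatrix}$$

Corresponding correlation models:

Independence and Split plot:

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad \begin{pmatrix} 1 & \rho & \rho \\ \rho & 1 & \rho \\ \rho & \rho & 1 \end{pmatrix}$$

AR(1) and AR(1) + RE, where matching the covariance matrix above gives $f = \sigma_m^2/(\sigma_m^2+\sigma^2)$:

$$\begin{pmatrix} 1 & \rho & \rho^2 \\ \rho & 1 & \rho \\ \rho^2 & \rho & 1 \end{pmatrix} \qquad \begin{pmatrix} 1 & f+(1-f)\rho & f+(1-f)\rho^2 \\ f+(1-f)\rho & 1 & f+(1-f)\rho \\ f+(1-f)\rho^2 & f+(1-f)\rho & 1 \end{pmatrix}$$

ARH(1) has the same correlation matrix as AR(1). The ARH(1) + RE correlation matrix is not nice.

ANTE(1) and UN:

$$\begin{pmatrix} 1 & \rho_1 & \rho_1\rho_2 \\ \rho_1 & 1 & \rho_2 \\ \rho_1\rho_2 & \rho_2 & 1 \end{pmatrix} \qquad \begin{pmatrix} 1 & \rho_{12} & \rho_{13} \\ \rho_{12} & 1 & \rho_{23} \\ \rho_{13} & \rho_{23} & 1 \end{pmatrix}$$

Summary of characteristics of correlation models:

| Model | Variances | Correlations |
|---|---|---|
| Independence | constant | none |
| Split plot | constant | same for all pairs |
| AR(1) | constant | decline with separation, eventually zero; same for 1 to 2 and 2 to 3 |
| AR(1) + RE | constant | decline to non-zero |
| ARH(1) | different | decline with separation |
| ARH(1) + RE | different | decline to non-zero |
| ANTE(1) | different | decline to zero; different for 1 to 2 and 2 to 3 |
| UN | different | every pair different |

Choosing an appropriate correlation structure:

| Model | # param. | AIC | AICc | BIC |
|---|---|---|---|---|
| Independence | 1 | 342.0 | 342.1 | 343.5 |
| Split plot | 2 | 337.0 | 337.4 | 338.5 |
| AR(1) | 2 | 333.1 | 333.5 | 334.6 |
| AR(1) + RE | 3 | $\hat\sigma_m^2 = 0$, so same as AR(1) | | |
| ARH(1) | 4 | 335.8 | 337.2 | 338.2 |
| ARH(1) + RE | 5 | $\hat\sigma_m^2 = 0$, so same as ARH(1) | | |
| ANTE(1) | 5 | 335.7 | 338.9 | 340.3 |
| UN | 6 | 335.7 | 338.9 | 340.3 |

Effect of correlation structure on results for the asparagus study. The se columns keep the extracted digits; their decimal point is not recoverable from the source.

| Model | Trt A - Trt B: se | df | Year 2 - Year 3: se | df |
|---|---|---|---|---|
| Independence | 1209 | 33 | 1046 | 33 |
| Split plot | 1784 | 9 | 782 | 24 |
| AR(1) | 1834 | 9.3 | 698 | 23.1 |
| ARH(1) | 1801 | 8.53 | 621 | 20.6 |
| ANTE(1) | 1848 | 8.74 | 764 | 12 |
| UN | 1862 | 8.29 | 749 | 12 |

Assuming independence over time produces clearly different results (clearly wrong). The choice of model for correlated data has much less impact on results. Models with lots of parameters, e.g. UN and ANTE(1), have smaller df and larger se. This is a general pattern. It results in loss of power. The multivariate approach corresponds to UN: also lower power. Two rep. meas.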
models with similar AIC and BIC values will usually, but not always, produce similar (but not the same) conclusions about treatment effects: F statistics, p-values, se, df.

My advice: use AIC and biological knowledge to choose the correlation model. Try to get close to a reasonable model. Don't worry too much about getting the model exactly right.

Expl Agric. (1992), volume 28, pp. 375-383. Printed in Great Britain

DATA ANALYSIS IN AGRICULTURAL EXPERIMENTATION. II. SOME STANDARD CONTRASTS

By S. C. PEARCE

Applied Statistics Research Unit, University of Kent at Canterbury, England

(Accepted 22 February 1992)

SUMMARY

In the preceding paper in this series (Pearce, 1992) it was explained how an experimenter can ask specific questions about the treatment responses and can obtain answers to them by using contrasts of interest. Here two standard cases are examined: one in which treatments are quantitative in nature, like the amount of fertilizer applied or the dates on which spraying takes place, and the other in which the treatment set is formed from all combinations of two or more other sets. Such a factorial set might be formed from several kinds of herbicide being used with a range of cultivations. Finally, a non-standard example is examined in which a well considered set of treatments provided information directly relevant to the subject of enquiry. The subject is considered with some emphasis on testing, but in many instances estimation would be better.

RESUMEN (Data analysis in agricultural experimentation. II)

The previous article in this series (Pearce, 1992) explains how an experimenter can pose specific questions about the treatment responses and can obtain answers to them by using contrasts of interest. This article examines two standard cases: one in which the treatments are quantitative in nature, such as the amount of fertilizer applied or the dates on which spraying is carried out, and another in which the treatment set is made up of all combinations of two or more distinct sets. Such a factorial set could be formed from several types of herbicide used over a series of cultivations. Finally, a non-standard example is examined in which a well considered set of treatments provided information directly relevant to the subject of enquiry. The subject is considered with some emphasis on testing, but in many cases estimation would be better.

INTRODUCTION

It will be assumed that the reader is already conversant with the terminology and notation of the first paper. However, the reader is reminded that in writing a contrast only the relative values of the coefficients matter, not the absolute values. For example, $(1, -2, 1)$ is the same as any constant multiple of it, such as $(\tfrac{1}{2}, -1, \tfrac{1}{2})$. In what follows contrasts are written using only whole numbers.

QUANTITATIVE TREATMENTS

One of the most important standard cases of structure among treatments arises when they are quantitative in nature, e.g. when fertilizer is applied at levels 20, 25, 30 and 35, or an operation is carried out at 14, 18 or 22 days after germination. Unless otherwise stated, it is here assumed that the steps are all equal in size, though that is not essential. Actually, equality of spacing is a matter of scale, e.g. doses of 1, 2, 4 and 8 can be accepted as equally spaced if dosage acts logarithmically. The problem is to discern the form of the graph when response is plotted against dose. The whole topic has become rather stylized, so care is needed if contrasts are to be relevant and not merely conventional, a point well argued by Dawkins (1983).

If there are three levels it is usual to take two contrasts. First, there is the linear effect $(-1, 0, 1)$, which measures the general slope. Next there is the quadratic effect $(1, -2, 1)$, used to show whether the response is straight or not, measuring as it does the extent to which the middle level gives a response that differs from the mean of the extremes, i.e. $(\tfrac{1}{2}, -1, \tfrac{1}{2})$, which is equivalent. To take an example, if the three levels
gave responses of 10, 14 and 16 respectively, the linear effect would equal 6 (= 16 - 10) and the quadratic -2 (= 10 - 2 × 14 + 16). It will be noted that the linear and quadratic effects are orthogonal, using the word in the sense of the first paper. Hence adding their sums of squares will give the correct total with two degrees of freedom.

Although the method is much used and generally effective, there are several comments that can usefully be made.

Although the quadratic effect is usually considered after the linear, it should really come first. If it shows that the graph is not a straight line, there is no general slope for the linear effect to measure; it will vary from segment to segment.

The quadratic effect will usually be smaller than the linear and its standard error larger. Accordingly, an experiment capable of establishing the second may be inadequate for judging the first.

There are various forms of curvature. If, for example, the response curve is expected to rise asymptotically to an upper bound, it may well depart most markedly from a straight line somewhere near the middle, but what if it is expected to be sigmoidal? The straight line and the sigmoid might be close together at the halfway point, the difference between the two being greater nearer the ends.

The curve implied by the contrasts will be polynomial in character and therefore unrealistic in a biological context. It may indeed fit the known responses well, but it could have implications that are unlikely or even absurd if extrapolated outside the range of doses actually studied. It should not be taken as a law of nature.

CHOICE OF LEVELS

These points are not made to discredit the method but to emphasize the need for thought when using it, especially when it comes to choosing levels to estimate characteristics of a response curve.

For four equally spaced levels there is a convention corresponding to the one just given for three. First there is the linear effect
measured by $(-3, -1, 1, 3)$, and little special needs to be said about that. The quadratic effect is given by $(1, -1, -1, 1)$. Here there is a difference. If the greatest departure of the graph from a straight line is expected halfway between the extremes, a point is needed there. Two points, one on either side of the midpoint, would not be as good. On the other hand, two points could be better than one for detecting the general area of greatest departure. Also, by providing increased replication near the middle, they could go some way to meeting the point previously made about the larger standard error usually associated with the quadratic effect.

The remaining degree of freedom is for the cubic effect. Its contrast is $(-1, 3, -3, 1)$ and merits some attention. Basically it measures the tendency of the graph to inflect, to curve first one way and then the other. It takes the overall slope over the total range, $(-1, 0, 0, 1)$, and compares it with the slope in the middle third between the two intermediate doses, $(0, -1, 1, 0)$, the latter multiplied by three to allow for the reduced range.

There are occasions when five or more levels are needed, and no one should object if there is a good reason. Sometimes, for example, the need is to see the general shape of the curve. Where the shape can be foretold, however, usually only a few points are needed to quantify it.

SOME ALTERNATIVE PROCEDURES FOR QUANTITATIVE TREATMENTS

There is no need to adhere rigidly to convention. For example, with three levels it could be that the maximum departure from straightness, supposing it exists, is most likely to occur about one-third of the way from the lowest level to the highest. If that is so, there is no objection to seeking evidence of curvature by putting the intermediate point there and taking the quadratic effect as $(2, -3, 1)$. It is true that this contrast is not orthogonal to $(-1, 0, 1)$, but the linear effect should now be $(-4, -1, 5)$. In general, if the intermediate point is placed at $x$, the end points being at 0 and 1, the coefficients of the linear contrast are $-(1 + x)$, $(2x - 1)$ and
$(2 - x)$ respectively. Those for the quadratic effect are $(1 - x)$, $-1$ and $x$ respectively. Either set of coefficients can be multiplied by a constant if that would simplify the contrast. It is really a matter, when searching for deviation from a straight line, of seeking it where it is most expected. A more general approach to unequal spacing has been presented elsewhere (Pearce et al., 1988).

Furthermore, the difficulty already mentioned of the larger standard error associated with the quadratic effect can be serious. With four levels the situation can be eased, though it will remain unsatisfactory. With three levels it may be advisable to duplicate the middle one to improve the precision of the quadratic effect at the expense of the linear. For example, given levels 1, 2 and 3, an experimenter might use three blocks of 1, 2, 2, 3 rather than four of 1, 2, 3. In general there will be $r$ replicates of 1 and 3 but $2r$ of 2.

The resulting analysis of variance may look forbidding to those accustomed to equal replication, but there are several ways of dealing with it. The simplest is to take each block in turn and to choose arbitrarily one of the plots receiving the middle level for allocation to Treatment 2a, the other going to 2b. That gives Treatments 1, 2a, 2b and 3, in that order. The linear effect is now given by $(-1, 0, 0, 1)$ and the quadratic by $(1, -1, -1, 1)$. That leaves $(0, 1, -1, 0)$ as the third orthogonal contrast, but it represents the difference between two identical treatments. Consequently its sum of squares, together with its one degree of freedom, should be removed from treatments and transferred to error (or residual variation, or whatever other name is favoured).

If the experimenter has opted for a completely randomized design, the plots assigned to the middle level can similarly be allocated in equal numbers to two treatments, 2a and 2b, in any convenient manner. With a Latin square there should be one plot of 2a and one of 2b in each row and column.

STUDYING A SUCCESSION OF CONTRASTS

A previous comment concerned the
order in which contrasts should be studied. If the experimenter believes that there is an appreciable quadratic effect, that conviction may affect attitudes to the linear. Of course, it may be known beforehand that the graph will or will not be curved.

Too strict an adherence to testing procedures can lead to problems. They arise from the value of the quadratic effect being small relative to its standard error. To return to the example in the first section, the three responses were 10, 14 and 16, which many would think indicated curvature although it might not be easily demonstrated. If the experiment was designed to show up the linear effect $(-1, 0, 1)$, which equals 6, it may not be adequate to demonstrate the quadratic effect $(1, -2, 1)$, which equals -2. In fact the standard errors will be in the ratio of $\sqrt{2}$ to $\sqrt{6}$ (Pearce, 1992), so the smaller value will also be less well determined. That is the reason for doubling the replication of the middle level, illustrated in the previous section.

In general, if an experiment is needed to study the characteristics of a response curve, it will have to be larger than one intended merely to show that there is some sort of effect of the substance being assessed. If that is the sole objective, it is enough to use a large dose in comparison with a small one, or a long exposure in comparison with a short one, but the form of the response curve may call for a much more extensive investigation. The casual addition of intermediate levels, in the hope that they will provide valuable information at minimal cost, is a temptation to be avoided. The topic has been examined in more detail elsewhere (Pearce, 1989).

At the inception of the experiment there must have been a recognition that the graph might be curved, or there would have been no justification for the middle level. How high was that expectancy? Perhaps there was a firm belief that the graph would not be straight. In that case there is no need to show that the presupposition was correct, though a test of the quadratic effect might be
politic to convince sceptics. Perhaps, however, a test was always envisaged but for some reason (for example, the experiment proved to be less sensitive than hoped) significance has not been achieved. It should here be recalled that a non-significant result does not prove non-existence. If there is any reasonable doubt in the matter, it may be better to accept the curvature even though its existence has not been formally demonstrated.

These remarks apply with greater force to cubic and higher effects, which are usually even smaller and are often estimated with poor precision.

FACTORIAL TREATMENT SETS

In a factorial experiment the treatments are formed from all combinations of two or more factors. Thus, an experimenter might want to try out the effect of four spray substances applied on three dates, giving 12 treatment combinations in all.

To take the simplest possible case, let there be two factors each at two levels. For example, someone may have decided to estimate the effect of applying more nitrogen than usual and to do so at two levels of phosphate. That gives four treatment combinations, (1), N, P, NP, where (1) is the basic treatment while N and P denote higher levels of nitrogen and phosphate respectively.

Two questions are obvious. First, does the increased nitrogen have any effect? Its contrast is $(-1, 1, -1, 1)$. Then the same question might be asked about phosphate, which leads to $(-1, -1, 1, 1)$, but that provides only two degrees of freedom between four treatments. If all is to be orthogonal, what is the third? The answer is $(1, -1, -1, 1)$, known as the interaction of the two factors. It may be noted that the coefficients of the interaction contrast can be found by multiplying corresponding coefficients in the contrasts for the two main effects: $(-1) \times (-1) = 1$, $1 \times (-1) = -1$, $(-1) \times 1 = -1$ and $1 \times 1 = 1$. The rule applies generally.

At first sight the interaction is of only minor importance, a sort of adornment, but in fact it dominates the whole analysis. If there is no added phosphorus the effect
of nitrogen is given by $(-1, 1, 0, 0)$, but if there is, it is given by $(0, 0, -1, 1)$. The contrast for interaction represents the difference between the two, that is, it asks whether added phosphate affects the response to nitrogen. Equally, the interaction can be taken as the difference between the contrasts $(-1, 0, 1, 0)$ and $(0, -1, 0, 1)$, that is, it asks whether the effect of phosphate depends upon the level of nitrogen.

Clearly the answer to these equivalent questions determines what follows. If it is concluded that there is an interaction large enough to be important, there is no point in looking at the 'main effect' of nitrogen $(-1, 1, -1, 1)$, because it represents the mean of what happens in two different cases without applying in either case. Attention should be directed rather to the 'particular effects' $(-1, 1, 0, 0)$ and $(0, 0, -1, 1)$. The main effect of phosphate must similarly give way to two particular effects. On the other hand, if the experimenter is satisfied that the interaction is too small to matter, it is sensible to merge the two particular effects to find the main effect, which will be better determined on account of the increased (hidden) replication.

This transference of interest is usually effected easily because the contrast for a particular effect can be derived from those for the corresponding main effect and the interaction. For example,

$$(-1, 1, 0, 0) = \tfrac{1}{2}\left[(-1, 1, -1, 1) - (1, -1, -1, 1)\right]$$

$$(0, 0, -1, 1) = \tfrac{1}{2}\left[(-1, 1, -1, 1) + (1, -1, -1, 1)\right]$$

Since the existence of the interaction has been established, it must have been possible to estimate it with some precision. If the main effect also can be estimated precisely, there is no difficulty with the particular effects, but if the main effect has been relegated to a main-plot analysis, difficulties can arise.

SUCCESSIVE EXAMINATION OF EFFECTS

With quantitative levels, difficulties arose from the quadratic effect being smaller in relation to its expected standard error than the linear. Much the same applies to interactions if testing is used too rigorously. For example, if there are two factors, A and B, each at two levels
and the means are (1) 10, A 12, B 13, AB 17, many would take that as evidence that the combined effect of A and B, 7 (= 17 - 10), was greater than the sum of the separate effects, 2 (= 12 - 10) for A and 3 (= 13 - 10) for B. Here it will be convenient to write the main effects as $(-1, 1, -1, 1)$ and $(-1, -1, 1, 1)$ and the interaction as $(1, -1, -1, 1)$. All will have the same standard error. The values of the two main effects are 6 and 8 respectively, but that of the interaction is 2. It could therefore be missed by an experiment geared to establishing the main effects, yet it is sufficiently large to affect the way the data are interpreted. In a biological context interactions are common, so it is better to play safe and regard any appreciable interaction as real, whether it is significant or not.

THE PURPOSE OF INDIVIDUAL FACTORS

In many circumstances it is important to look ahead and ask which set of particular effects will be called for if an interaction is found. Here a distinction needs to be made. Sometimes a factor is included because it is itself an object of study, sometimes to see if it will provoke an interaction with a factor of the first sort. For example, an experimenter might want to study some fertilizer treatments but recognizes the possibility that their effects might well be different on irrigated land. Accordingly, presence and absence of irrigation is added as another factor. Effectively the fertilizer experiment is now being conducted twice, once in each set of conditions. When it comes to interpretation there will be little interest in the effect of irrigation, which is well known and anyway is not under study, but there will be great interest in the fertilizer treatments and the interaction. In such circumstances it is convenient to speak of 'substantive' and 'provocative' factors.

The distinction affects both design and analysis. Often provocative factors relate to farming systems and are difficult to apply to small areas. If that is so, there are advantages in a split-plot design.
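The arithmetic of the two-factor example above (means 10, 12, 13 and 17 for (1), A, B and AB) can be checked with a short sketch. It is only an illustration: the helper name `contrast_value` and the treatment ordering (1), A, B, AB are assumptions made here, not part of the paper.

```python
# Illustrative sketch only; contrast_value is a hypothetical helper.
# Treatment order assumed throughout: (1), A, B, AB.
means = [10, 12, 13, 17]


def contrast_value(coefficients, values):
    """Return the value of a contrast: sum of coefficient * mean."""
    return sum(c * v for c, v in zip(coefficients, values))


main_a = [-1, 1, -1, 1]   # main effect of A
main_b = [-1, -1, 1, 1]   # main effect of B

# The interaction coefficients are the products of the corresponding
# main-effect coefficients.
interaction = [a * b for a, b in zip(main_a, main_b)]

effect_a = contrast_value(main_a, means)
effect_b = contrast_value(main_b, means)
effect_ab = contrast_value(interaction, means)

# Particular effects of A: in the absence of B and in the presence of B.
a_without_b = contrast_value([-1, 1, 0, 0], means)
a_with_b = contrast_value([0, 0, -1, 1], means)

print(interaction)                    # [1, -1, -1, 1]
print(effect_a, effect_b, effect_ab)  # 6 8 2
print(a_without_b, a_with_b)          # 2 4
```

The interaction value (2) equals the difference between the two particular effects of A (4 - 2), which is the sense in which the interaction asks whether B changes the response to A.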
It is true that any effect of the main-plot factor will be poorly estimated, but if no one is much interested that does not matter. Also, in analysis, if there is clearly an interaction, substantive factors need to be expressed by their particular effects, but provocative factors can receive more cursory attention. In the example just given, the interest lies in the effects of the fertilizers. If they are different under irrigated and unirrigated conditions, the next step will be to look at their particular effects, and that will cause no difficulty. If irrigation has been applied to main plots, the main effect of fertilizers and the interaction will appear together in the sub-plot analysis.

MORE EXTENSIVE FACTORIAL SETS

An experimenter is not restricted to using only two levels for each factor; there can be as many as the situation requires. Suppose, for example, that there are two factors, A and B, where the first is simply the absence or presence of A but the second is quantitative, being formed from three equally spaced doses of B: 1, 2 and 3. That gives six treatment combinations, namely 1, 2, 3, 1A, 2A and 3A.

The main effects are easily written as contrasts. That of A is given by

$$C_1 = (-1, -1, -1, 1, 1, 1)$$

while the two degrees of freedom for the quantitative factor, B, can be written as

$$C_2 = (-1, 0, 1, -1, 0, 1)$$

$$C_3 = (1, -2, 1, 1, -2, 1)$$

Multiplying out the main effect of A ($C_1$) with the linear effect of B ($C_2$) gives

$$C_4 = C_1 \times C_2 = (1, 0, -1, -1, 0, 1)$$

Using the quadratic effect ($C_3$) instead of the linear gives

$$C_5 = C_1 \times C_3 = (-1, 2, -1, 1, -2, 1)$$

These five contrasts correspond to the five degrees of freedom between six treatment combinations.

For a factorial set, interpretation begins with the interaction, and for a quantitative set, with the highest-order polynomial. Putting those considerations together indicates $C_5$ as the starting point. If it suggests that the quadratic effect depends upon the presence or absence of A, the fact should be noted and taken into account in what follows. If $C_5$ can be ignored, the next contrasts to be examined should be $C_3$ and $C_4$, and so on, following a logical
scheme.

There is no reason to stop at two factors. Indeed, experiments with more are common and often very effective. The contrasts for interactions of higher order are found by multiplying out the coefficients of all relevant main effects, but often there is no need to do so. Just as two-factor interactions are usually smaller than main effects and are therefore difficult to detect, so three-factor interactions are even more difficult. They can still be important, but often little is gained by going into details about them. Interactions of higher order are usually better ignored. Sometimes they are estimated and then merged into a single composite effect, sometimes they are added to error (a device that is useful if degrees of freedom for error are few) and sometimes they are confounded, an important and useful technique that will not be considered here. The importance of the two-factor interactions, on the other hand, can scarcely be overstated.

A NON-STANDARD SET OF CONTRASTS

Although most sets of contrasts are of the kinds considered above, there is no objection to any other, provided it sets out the questions that the experiment is intended to answer. A good example is presented in the book by Little and Hills (1978). The experiment they describe well illustrates what can result when experimenter and statistician collaborate, each respecting the skill and insight of the other.

The experiment, which was on sugar beet, was intended to compare the efficacy of five forms of nitrogenous fertilizer, namely A, ammonium sulphate; B, ammonium nitrate; C, urea; D, calcium nitrate; and E, sodium nitrate, together with F, no added nitrogen. At first sight there is little pattern to the set of treatments, but in fact they correspond well to the questions under study. The authors propose five contrasts:

$$C_1 = (1, 1, 1, 1, 1, -5)$$

$$C_2 = (1, 1, -4, 1, 1, 0)$$

$$C_3 = (1, 1, 0, -1, -1, 0)$$

$$C_4 = (1, -1, 0, 0, 0, 0)$$

$$C_5 = (0, 0, 0, 1, -1, 0)$$

It will be noted that all pairs of contrasts are orthogonal, so the initial partition of the sum of squares for treatments, which has five degrees
of freedom, should add up correctly. That need not preclude the study of other contrasts as the analysis proceeds.

It is best to interpret the contrasts in reverse order. First, $C_5$ studies the difference between the two mineral nitrates, D and E. If it proves to be non-significant, henceforward the two can be regarded as similar in their effect. Similarly, $C_4$ compares two ammonium salts, A and B, one of them the nitrate; again, if it shows nothing, it may be assumed that the two treatments differ very little in their effect and they also could be merged. The next contrast, $C_3$, compares the nitrates with the ammonium salts, while $C_2$ enquires whether the organic compound differs from the four inorganic ones. Using the yield data in the book, it emerges that all these contrasts, with the possible exception of one, gave completely negative results, leading to the conclusion that one form of nitrogen must have had much the same effect as any other. Finally, it emerged that most of the treatment sum of squares (185137 with five degrees of freedom) was due to $C_1$ with its one degree of freedom (18020), thus showing that nitrogen, however supplied, increased yield.

A critic might complain that the experimenter was lucky. Suppose that $C_5$ had given a marked effect, what would have happened then? The answer is that anyone interpreting data must follow where they lead. If the nitrates of calcium and sodium lead to different yields of sugar beet, it would not be correct to form them into a group, though it might be sensible to take the better and use that alone, making $C_3$ into $(1, 1, 0, -2, 0, 0)$ or $(1, 1, 0, 0, -2, 0)$ according to the outcome of $C_5$. Of course, a non-significant result does not prove that the effect of the contrast is necessarily zero, and some other course might be chosen. The point is that the five degrees of freedom between the six treatments should not be examined collectively as if they were an undifferentiated mass; neither should they be subjected to an automated procedure that pays no
regard to purpose or to the relationships between them. As with the standard cases, the aim should be to follow through the questions being asked and to provide answers to each.

Acknowledgements. The author thanks Dr D. J. Pike for many helpful comments on an earlier draft.

REFERENCES

Dawkins, H. C. (1983). Multiple comparisons misused: why so frequently in response-curve studies? Biometrics 39:789-790.

Little, T. M. & Hills, F. J. (1978). Agricultural Experimentation: Design and Analysis. New York: John Wiley and Sons.

Pearce, S. C. (1989). The size of a comparative experiment. Journal of Applied Statistics 16.

Pearce, S. C. (1992). Data analysis in agricultural experimentation. I. Contrasts of interest. Experimental Agriculture 28:245-253.

Pearce, S. C., Clarke, G. M., Dyke, G. V. & Kempson, R. E. (1988). A Manual of Crop Experimentation. London: Charles Griffin; New York: Oxford University Press.

In a study of reaction time under the influence of alcohol, individuals are given varying amounts of alcohol and their time, in seconds, to correctly type in a code flashed on a computer screen is measured. Because individuals' reaction times may vary, individuals are used as blocks by reusing each individual. The order of the amount of alcohol given is randomized for each individual, and enough time is allowed between trials so that any residual alcohol is out of an individual's system by the time the next amount is administered. Ten individuals take part in the experiment.

Response: Reaction time (seconds)

Conditions: Amount of alcohol

Material: Ten individuals

Control of Outside Variables: Amount of liquid (alcohol + 7-Up) is set at 8 oz. Time to drink beverage is set at 10 minutes. Time between drinking and reaction time task is set at 15 minutes. Length and complexity of code is the same for each individual.

Randomization: Each individual will experience all of the conditions. The order will be randomly assigned for each individual. This is a randomized complete block design where blocks are formed by reusing individuals.

Replication: There are 10 individuals in the
experiment.

Data

SPSS Output for Recall Experiment

Between-Subjects Factors

| Factor | Value | Value Label | N |
|---|---|---|---|
| Reinforcement | 1.00 | None | 12 |
| Reinforcement | 2.00 | Verbal | 12 |
| IsolationTime | 1.00 | 20 minutes | 8 |
| IsolationTime | 2.00 | 40 minutes | 8 |
| IsolationTime | 3.00 | 60 minutes | 8 |

Descriptive Statistics (Dependent Variable: Recall)

| Reinforcement | IsolationTime | Mean | Std. Deviation | N |
|---|---|---|---|---|
| None | 20 minutes | 22.0000 | 3.65148 | 4 |
| None | 40 minutes | 30.5000 | 3.69685 | 4 |
| None | 60 minutes | 13.5000 | 4.50925 | 4 |
| None | Total | 22.0000 | 8.09040 | 12 |
| Verbal | 20 minutes | 21.0000 | 3.65148 | 4 |
| Verbal | 40 minutes | 25.5000 | 2.38048 | 4 |
| Verbal | 60 minutes | 31.5000 | 3.41565 | 4 |
| Verbal | Total | 26.0000 | 5.34279 | 12 |
| Total | 20 minutes | 21.5000 | 3.42261 | 8 |
| Total | 40 minutes | 28.0000 | 3.92792 | 8 |
| Total | 60 minutes | 22.5000 | 10.30950 | 8 |
| Total | Total | 24.0000 | 7.00931 | 24 |

Tests of Between-Subjects Effects (Dependent Variable: Recall)

| Source | Type III Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Corrected Model | 896.000 (a) | 5 | 179.200 | 13.785 | .000 |
| Intercept | 13824.000 | 1 | 13824.000 | 1063.385 | .000 |
| Reinforcement | 96.000 | 1 | 96.000 | 7.385 | .014 |
| IsolationTime | 196.000 | 2 | 98.000 | 7.538 | .004 |
| Reinforcement * IsolationTime | 604.000 | 2 | 302.000 | 23.231 | .000 |
| Error | 234.000 | 18 | 13.000 | | |
| Total | 14954.000 | 24 | | | |
| Corrected Total | 1130.000 | 23 | | | |

(a) R Squared = .793 (Adjusted R Squared = .735)

Multiple Comparisons (Dependent Variable: Recall; t = 2.10092)

| (I) IsolationTime | (J) IsolationTime | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|
| 20 minutes | 40 minutes | -6.5000* | 1.80278 | .002 | -10.2875 | -2.7125 |
| 20 minutes | 60 minutes | -1.0000 | 1.80278 | .586 | -4.7875 | 2.7875 |
| 40 minutes | 20 minutes | 6.5000* | 1.80278 | .002 | 2.7125 | 10.2875 |
| 40 minutes | 60 minutes | 5.5000* | 1.80278 | .007 | 1.7125 | 9.2875 |
| 60 minutes | 20 minutes | 1.0000 | 1.80278 | .586 | -2.7875 | 4.7875 |
| 60 minutes | 40 minutes | -5.5000* | 1.80278 | .007 | -9.2875 | -1.7125 |

Based on observed means. * The mean difference is significant at the .05 level.

[Figure: histogram of Residual for Recall (vertical axis: Frequency)]

[Figure: Normal Q-Q Plot of Residual for Recall (axes: Observed Value, Expected Normal Value)]

ANOVA (Dependent Variable: Recall)

| | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 896.000 | 5 | 179.200 | 13.785 | .000 |
| Within Groups | 234.000 | 18 | 13.000 | | |
| Total | 1130.000 | 23 | | | |

Multiple Comparisons (Tukey HSD, q = 3.17803)

| (I) Treatment | (J) Treatment | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|
| N20 | N40 | -8.50000* | 2.54951 | .037 | -16.6024 | -.3976 |
| N20 | N60 | 8.50000* | 2.54951 | .037 | .3976 | 16.6024 |
| N20 | V20 | 1.00000 | 2.54951 | .999 | -7.1024 | 9.1024 |
| N20 | V40 | -3.50000 | 2.54951 | .742 | -11.6024 | 4.6024 |
| N20 | V60 | -9.50000* | 2.54951 | .016 | -17.6024 | -1.3976 |
| N40 | N20 | 8.50000* | 2.54951 | .037 | .3976 | 16.6024 |
| N40 | N60 | 17.00000* | 2.54951 | .000 | 8.8976 | 25.1024 |
| N40 | V20 | 9.50000* | 2.54951 | .016 | 1.3976 | 17.6024 |
| N40 | V40 | 5.00000 | 2.54951 | .400 | -3.1024 | 13.1024 |
| N40 | V60 | -1.00000 | 2.54951 | .999 | -9.1024 | 7.1024 |
| N60 | N20 | -8.50000* | 2.54951 | .037 | -16.6024 | -.3976 |
| N60 | N40 | -17.00000* | 2.54951 | .000 | -25.1024 | -8.8976 |
| N60 | V20 | -7.50000 | 2.54951 | .079 | -15.6024 | .6024 |
| N60 | V40 | -12.00000* | 2.54951 | .002 | -20.1024 | -3.8976 |
| N60 | V60 | -18.00000* | 2.54951 | .000 | -26.1024 | -9.8976 |
| V20 | N20 | -1.00000 | 2.54951 | .999 | -9.1024 | 7.1024 |
| V20 | N40 | -9.50000* | 2.54951 | .016 | -17.6024 | -1.3976 |
| V20 | N60 | 7.50000 | 2.54951 | .079 | -.6024 | 15.6024 |
| V20 | V40 | -4.50000 | 2.54951 | .510 | -12.6024 | 3.6024 |
| V20 | V60 | -10.50000* | 2.54951 | .007 | -18.6024 | -2.3976 |
| V40 | N20 | 3.50000 | 2.54951 | .742 | -4.6024 | 11.6024 |
| V40 | N40 | -5.00000 | 2.54951 | .400 | -13.1024 | 3.1024 |
| V40 | N60 | 12.00000* | 2.54951 | .002 | 3.8976 | 20.1024 |
| V40 | V20 | 4.50000 | 2.54951 | .510 | -3.6024 | 12.6024 |
| V40 | V60 | -6.00000 | 2.54951 | .224 | -14.1024 | 2.1024 |
| V60 | N20 | 9.50000* | 2.54951 | .016 | 1.3976 | 17.6024 |
| V60 | N40 | 1.00000 | 2.54951 | .999 | -7.1024 | 9.1024 |
| V60 | N60 | 18.00000* | 2.54951 | .000 | 9.8976 | 26.1024 |
| V60 | V20 | 10.50000* | 2.54951 | .007 | 2.3976 | 18.6024 |
| V60 | V40 | 6.00000 | 2.54951 | .224 | -2.1024 | 14.1024 |

* The mean difference is significant at the .05 level.

Ten common misuses of statistics in agronomic research and reporting

Larry A. Nelson and John O. Rawlings

ABSTRACT

Ten common misuses of statistics in agronomic research and reporting are discussed. Some of these are a result of changes in statistical philosophy over the years to which biologists in general, and agronomists in particular, have not responded in terms of their data analytic and interpretational techniques. Others have been created by an overdependence on computers without careful study of the basic data pattern or without careful consideration of the calculations which the computer is performing. The importance of planning experiments properly with a view toward subsequent analysis is emphasized. Careful, well-controlled experimental technique is also recommended. Proper planning usually [...] multiple comparison. The matter of misuse of multiple comparison procedures such as Duncan's Multiple Range Test is also discussed. It is pointed out that in cases where logical structure doesn't exist in the treatments [...] the use of multiple comparison procedures is [...].

Additional index words: [...]

Modern applied statistics has been utilized extensively in agronomic research in the United States and elsewhere for the past 40 years. During this period some of our concepts have changed. For example, there now is much less emphasis on hypothesis testing and far more emphasis on estimation of the magnitudes of differences and other effects. No longer are we expecting an experiment to provide the last word based upon the result of some mechanical process such as a hypothesis test at a stated level of significance. Now we are looking for indications of effects, and we rely on the data to provide us estimates of the magnitudes of these effects. There has also been some revision in our thinking about some concepts as a result of extensive application of statistical techniques to biological problems. For example, formerly we recommended the use of large plots in field experiments because the variance of large plots is small. Now we recommend the use of small plots with a compensating increase in number of replications to use the available resources.

Contribution from the Inst. of Statistics, North Carolina State Univ., Raleigh, NC 27650. The authors are professors of statistics, North Carolina State Univ., Raleigh, NC.

The ready availability of efficient electronic computers has had both positive and negative effects. On the positive side, we can process data very efficiently and accurately at low cost. There are analyses which can be run which would have been impossible before the advent of modern computing techniques. On the other hand, ready access to statistical packages by researchers with limited statistical background has
increased the misuse of statistical procedures.

Today there is abundant evidence of the misuse of statistics by research agronomists. A current issue of any one of the plant science journals will provide ample cases in point. Agronomy Journal is attempting to improve the situation, but authors are asking for more assistance in deciding which statistical techniques they should use and how they should be used.

The purpose of this paper is to point out and discuss 10 common misuses of statistics in agronomic applications. Some of these are more a lack of use of statistics rather than misuses.

MISUSE IN PLANNING EXPERIMENTS

Misuse 1. Failing to Involve Statistical Considerations at the Planning Stage of the Experiment

Statistical considerations should be involved at the conceptual stage of the experiment. This perhaps is secondary only to the need for a good researchable idea. The reason for involving statistical principles early is to insure that one obtains quality data which bear upon the problem being studied. There are sampling considerations with respect to the populations to which the results of the experiment will be extrapolated: the populations of environmental conditions, of experimental material, and of treatments. There are experimental design considerations, such as which experimental design should be used, what treatments should be studied, and how many replications are needed. There are a number of practical considerations relating to the orientation, size, and shape of plots and blocks.

[Fig. 1. Example of an aerial spray experiment for plant disease control in a South American country.]

Involvement of a statistician at an early stage in planning of an experiment, particularly if the researcher is not well trained in statistics, is helpful in focusing on the specific questions to be answered and the relevant statistical methods for estimation and/or hypothesis testing. He also can provide
assistance in choosing treatments in such a way that the treatment comparisons can be made efficiently at the data analysis stage.

Misuse 2. Failing to Use an Appropriate Design

The most popular design in agronomic research is the randomized complete block design. It is a simple design and is reasonably efficient in most cases if the blocking has been constructed appropriately. The second most popular design is some version of the split-plot design. There are many situations in which a split-plot design is used where clearly a randomized complete block design with factorial arrangement of treatments would be preferred. In cases where there are only a few levels of the whole-plot factor or there are few replications, the whole-plot error in the split-plot analysis of variance will be estimated with low precision and there will not be a good test on the whole-plot factor. A factorial arrangement of treatments within the framework of a randomized complete block design will provide equal and adequate precision for all effects, i.e., both sets of main effects and interactions.

There are many instances in which a split-plot design arises out of an inappropriate handling of what was intended to be a randomized complete block. At some point in the conduct of the experiment certain subsets of the treatments are handled in groups, such as in [...], exposing to treatment factors, harvesting by maturity, etc. Such nonrandom handling of the experimental units may introduce positive interplot correlations among the units handled as groups, generating a "split-plot" design. Failure to recognize this can lead to inappropriate analyses and erroneous experimental error variance estimates.

A proper experimental design can be destroyed by failing to recognize what constitutes the experimental unit. For example, an individual may try to provide replication by subdividing larger plots to which treatments have been applied (Fig. 1). In an aerial spraying experiment for disease control, three chemicals were applied in long strips and then the strips were
subdivided to provide what the researcher considered were replications. Figure 1 is not a randomized complete block design, which the researcher had intended to use. Each strip is one experimental unit and the subdivisions are samples, not replicates. A randomized complete block design (Fig. 2) was then provided to the researcher as an appropriate alternative. In this case there are nine experimental units in three blocks of three each. Randomization was carried out as required in the randomized complete block design and the aerial spraying was performed according to this revised plan. The danger that chemical effect estimates would be confounded with land conditions was no longer a problem, and proper replication provided an estimate of experimental error.

[Running head: NELSON & RAWLINGS: MISUSES OF STATISTICS IN AGRONOMY]

[Fig. 2. An alternative procedure provided by the statistician: three blocks, each containing Chemicals A, B, and C.]

Another example of misuse of a proper design is the placement of the wrong factor at the whole-plot level in a split-plot design. Recently one of the authors encountered a greenhouse experiment involving a study of the response of 12 different soils to lime. There were two lime levels, none and 2 t ha⁻¹. The objective was to see if soils differed in their response to lime. A split-plot design was used with lime being the whole-plot factor, which was arranged in randomized blocks. Soil was the subplot factor. Although the overall test of the lime × soil interaction was precise, comparisons of the two lime levels within a given soil were imprecise. It would have been better to assign the soils to the whole-plots and place the lime factor at the subplot level. Comparisons of two lime levels within a soil would then be quite precise because the subplot error, which is based upon a relatively large number of degrees of freedom, would be used for testing.

Misuse 3: Failing to Use Proper Randomization Procedures

Randomization is used to ensure that we will have unbiased estimates of treatment effects and experimental error. Failure to use proper randomization technique could
cause certain treatments to be favored or hampered due to the position in which they are placed in the experimental area, and cause differences in degrees of precision for different comparisons. In our extensive consulting with plant science researchers during the past [...] we have encountered cases in which the researcher did not take the need for randomization seriously and consequently compromised the quality of the experiment and the results therefrom.

Methods of proper randomization are discussed extensively in statistical methods texts. There are also computing routines available for this purpose in some of the statistical computing packages.

It is common for randomization to be considered relevant only at the time of assignment of the treatments to the experimental units. It is important that the researcher take care during all phases of the research that spurious correlations are not introduced among experimental units receiving the same treatments, or any other correlations not accounted for by the design (see Misuse 2). Such might occur, for example, if potted plants from different replicates are grouped for easy administration of a daily nutrient supplement.

[Running head: JOURNAL OF AGRONOMIC EDUCATION, VOL. 12, 1983]

When experiments are conducted in series over sites and/or years, it is necessary to provide a different randomization for each experiment. This will reduce biases which might result from two adjacent treatments interacting and will tend to equalize the precision of all comparisons.

There are also experiments in which the entire experimental process is a chain of individual steps. For example, plants might be grown according to an experimental design in the greenhouse and then transferred to the field and used in a second-stage experiment under field environmental conditions. Or, more likely, plants may be grown in a haphazard arrangement in the greenhouse for part of their growth cycle and then put into a designed experiment at a specified stage of growth. Proper randomization at that stage would avoid
biasing treatment effects and experimental error arising from environmental variation during the earlier stage of growth, but a more efficient experimental design would incorporate provisions for error control at all stages of plant growth.

Misuse 4: Using an Improper Size of an Experiment

It is important to use an appropriate number of replications in an experiment. Underreplication could result in very imprecise estimates, whereas overreplication can be costly. Agronomists probably err on the side of underreplication more often than on the other side. One still sees self-contained field experiments where the researcher is attempting to achieve adequate precision with only two replications. There are very few field situations where this number of replications would be adequate.

Table 2.1 in Cochran and Cox (1957) provides a useful guide to the determination of the proper number of replications if the approximate variability of the experimental material (coefficient of variation) is known and the researcher is willing to estimate differences between means of a given percent. It is also necessary to assume a probability level for the test (α) and a probability of rejecting a false null hypothesis.

Equally important in agronomic research is for the replication to adequately sample the reference population of environmental conditions. This is usually accomplished by growing the test over several years at several locations within the geographical area of interest. It is very unlikely that a test at one site in 1 year will provide a reliable inference to any except the most restricted reference population of environments.

Misuse 5: Using Improper Experimental Technique

The precision of data from an experiment depends to a large degree upon the experimental technique used. Because statistics deals with variability and methods for dealing with it, experimental technique does fall within the realm of a statistician's concern. In fact, statisticians have perhaps made one of their more
important contributions to agronomic research in asking questions of the researcher about his experimental technique with a view to its improvement.

We find that many agronomists do not know the meaning of effective blocking. In many cases they are blocking just to provide replication, not error control. Others attempt to run experiments before they have adequately become familiar with the experimental process, perhaps through small pilot studies, and so they do not use the best technique. Some lack care in controlling variation. We say that their experimental technique is out of control. Others do not oversee the experimental process adequately, or they do not take note of unusual events which took place at the experimental site. Consequently, when these unusual events have generated "outliers" in the experimental data, there is no basis for rejection of the extreme datum points from the data set.

Overall precision can be increased by using a uniform experimental technique throughout the series of experiments. Some ways of standardizing technique are to (1) write out procedures for conducting various phases of the experiments and a time schedule for their execution; (2) make all personnel dealing with the treatments, plots, and data aware of the various sources of error and the need for good technique; (3) apply the treatments uniformly; (4) exercise sufficient control over external influences so that every treatment produces its effect under controlled, comparable conditions; (5) devise suitable unbiased measures of the effects of treatments; and (6) prevent gross errors.

There are many aspects to technique, such as choice of proper plot size and shape, choice of dosages in quantitative controlled-variable experiments, and proper timing of operations. It is very important from a statistical point of view that individuals who lay out the experiments are trained in the subject matter discipline as well as in field plot technique. Otherwise it is impossible to provide credentials to imprecise data
once they have been collected under dubious sets of circumstances.

MISUSES IN ANALYSIS AND INTERPRETATION OF DATA

Misuse 6: Using Inappropriate Error Terms for Testing or for Calculating Standard Errors

Use of an inappropriate error term for testing or for providing standard errors has been a problem for many years but has increased recently with the common use of statistical computing packages which use a default error term. By this is meant that all terms not included in the linear model spelled out in the instructions to the computer are pooled into an error term which the computer uses for tests and estimates of precision. In a very large proportion of the cases the tests of significance automatically provided by computer packages are incorrect. Each user of a computer package is responsible for the correctness of his or her analysis.

The analysis of variance of data from a randomized complete block design in which each plot has been sampled (Table 1) has a sampling error in addition to the usual experimental error. The appropriate error term for testing treatments is experimental error, not sampling error. The F of 2, based upon forming the ratio of Treatment MS to Error MS with 9 and 27 degrees of freedom, is not significant at the 0.05 level. If one placed only Blocks and Treatments in the linear model used in fitting by the computer, the pooled error would be a degrees-of-freedom-weighted average of the experimental error and sampling error mean squares [numerical expression illegible]. Using this inappropriate pooled error, the computer would use a denominator of [illegible] rather than [illegible] in the F ratio, and the denominator degrees of freedom would be [illegible] rather than the correct value of 27. The resulting inference would be incorrect, i.e., treatments would now be significant at the [illegible] level. Or, if the model is specified so that "error" is separated from "sampling error", some computer packages will use the sampling error in testing the treatments category, resulting in a large upward bias in the Treatments F ratio.

[Table 1. Analysis of variance for a randomized complete block design with sampling. Columns: Source, df, MS. Rows include Sampling error. Values illegible.]

There are other cases where the researcher desires to use the appropriate error term but it is difficult due to the nature of the design constraints. A common example is the design of growth chamber experiments. It is difficult to provide enough chambers for replication because they are often shared by a number of researchers, and it is also difficult to change the temperature/moisture settings for a given chamber. Replication of the factor combinations within the chamber is achieved more readily, but the error term resulting from this second type of replication is not appropriate for testing temperature-humidity treatments. The answer to this problem usually is to provide multiple runs of the experiment using the same temperature settings within a chamber from run to run, but with new randomizations of the factor combinations within a chamber for the various runs. The run then serves as a block in a randomized complete block whole-plot arrangement of temperatures, and the plots within the chambers within a run are considered as subplots.

In short, the researcher needs to be sure that the appropriate error term is being used, whether the analysis of variance is being conducted by a desk calculator under his personal supervision or by an electronic computer. The computer is in no position to determine the appropriate error term with which to test, so one should not blindly assume that significance tests or error estimates obtained from the computer were obtained using the proper error. The procedure of writing out expectations of mean squares, which is described in statistics methods texts, is useful in guiding one to the appropriate error term for testing, especially in [illegible] plot sizes or levels of error within each [illegible].

Misuse 7: Failing to Study Patterns in Data

With modern computers there is a tendency to routinely process data through standard data analysis routines without careful study of the patterns of variation in the data. As a result, researchers are much less familiar with the behavior of their experimental data than when analyses were done by hand, and it is easy for badly behaved data to escape detection. The presence of a single outlier can grossly inflate variances without being detected if, for example, only routine analyses of variance are run. Heterogeneous variances and inadequacies of the model and model assumptions will seldom be detected without a careful study of the data. If the data set is too large for a careful study by hand, various computer programs for editing, residual analysis, tests of normality, etc., are available for assisting a complete analysis of the data. If causes for the outliers can be identified, it is possible to replace them by missing-plot estimates. In some cases entire sections of the data, or even entire data sets, are rendered invalid by effects of uncontrolled factors or by improper experimental design or layout. It is important that such cases be recognized and dealt with appropriately.

Detection of portions of the data where the variance differs from that of other parts of the data may be accomplished by comparing the ranges among the replications for each treatment or by analysis of residuals. There are several courses of action once it has been determined that the errors in the data set are heterogeneous. In some cases a transformation such as the log transformation may be used. Another approach is to partition the data into sets which have homogeneous variance and conduct separate analyses of variance for each set.

A careful study of the data patterns will also help to determine if the a priori biological model is adequate or if the patterns show that some other model would be more appropriate.

Misuse 8: Depending Excessively on One Class of Statistical Analyses

In agronomic research the analysis of variance has almost become the universal method of analysis. This is the statistical procedure emphasized in all basic statistical methods courses and it is familiar
to most agronomists. The analysis of variance is a powerful technique for understanding variational patterns in the data and deserves to be a primary tool. However, excessive dependence on the analysis of variance, or any other single statistical method of analysis, is a handicap to the researcher. There are two problems associated with this dependence. First, the particular statistical analysis simply may be inappropriate for the problem, either because basic assumptions required for the analysis are not satisfied or because other more powerful methods may be available. Secondly, other methods of analysis may be more revealing of the basic structure of the data. For example, a principal component analysis or other multivariate methods can be very informative of the correlational structure among several variables. The researcher should always be critical of the standard method of analysis and seek statistical assistance if he suspects his method is inadequate in any sense.

MISUSES IN REPORTING EXPERIMENTAL RESULTS

Misuse 9: Misapplying Multiple Comparison Procedures Such as Duncan's New Multiple Range Test

Perhaps the most frequently occurring of the misuses in agronomic reporting is misuse of multiple comparison procedures. There is a recent tendency to overuse and misuse mechanical comparison procedures such as Duncan's New Multiple Range Test in interpreting and reporting research results. The plant science journals have faced this problem for a number of years. Unfortunately, the misuse of these procedures is recorded permanently in some of the papers of these journals.

In approaching the problem of drawing inferences about treatment effects, it is important to review the plans and the major goals set forth by the researcher before the experiment was initiated. It is also important to review the experimental and treatment designs and an account of what has taken place at the research site throughout the period of the experiment. The treatments are
then studied in detail to see what comparisons are logical from the structure of the treatments.

As an example, consider the following set of five treatments: the 2 × 2 factorial set of treatments consisting of varieties A and B with and without fertilization, plus a check treatment consisting of the standard variety and the standard fertilization. The comparisons which would be logical for these five treatments are: Check vs. other four treatments; Var. A vs. Var. B; Fertilizer vs. No fertilizer; and (Var. A vs. Var. B) × (Fertilizer vs. No fertilizer). One should consult a statistics methods text for the methodology to construct contrasts and to calculate the sum of squares associated with each comparison. It should be pointed out that multiple comparison procedures should not be used for comparing the five means, because there are specific comparisons suggested by the factorial structure of the data. We want the power of the tests to be focused on these particular comparisons.

[Table 2. Brief analysis of variance table of the kind to be included in a report. Columns: Source, df, MS. Rows: Check vs. others, Variety, Rate of N, V × N, Error. Values illegible.]

For situations in which the experiment involves a quantitative controlled factor, e.g., an N-rate experiment involving treatments of 0, 50, 100, 150, and 200 kg/ha N, the appropriate statistical analysis is to fit a response curve. This may be reported in a graph showing the response relationship together with the equation and measures of precision. The preceding approach may be extended to more than one fertilizer variable; a response surface in two or more variables would be fitted to the yield data. Adoption of a response curve or surface to represent the behavior of the data completes the tests of significance of effects of changes in the independent variables. It would be inappropriate, for example, to ask: Is the change in Y from X = 10 to X = 20 statistically significant? Or: How much does X have to be increased above X = 10 before a statistically significant
increase in Y is achieved? Questions of this type reflect a general overemphasis on hypothesis testing at the expense of estimation. The statistical test of significance is an aid for deciding whether changes in Y are real or are an artifact of the random noise in the experiment. Adoption of the response curve, presumably using appropriate tests of significance, implies that we have rejected random noise as the explanation for the changes in Y. Having adopted the response curve, the above questions should be rephrased as estimation questions, e.g., "What is the estimated change in Y as X is changed from 10 to 20, and what is the standard error of the estimated change?" or "How much does X have to change to produce a biologically meaningful increase in Y of 5 units, and what is the standard error of this estimated change in X?"

It is difficult to lay down hard and fast rules for interpreting data from factorial experiments. Each experiment will present a different interpretational pattern. Usually, if interactions are negligible, it is possible to use the techniques described above on the main effect means. If interaction is sizeable, the above techniques are used for one factor at each level of the other factor(s).

Where treatments have structure, the inclusion of a brief analysis of variance table in the RESULTS section of a report or paper will often be quite useful to the reader (Table 2). The check mean was different from the mean of all of the other treatments. Also, the Var. A mean was different from the Var. B mean. One can then report the check mean, the average of the other four means, and the Var. A and Var. B means. This gives the information which exists in this set of data.

The one situation in which the use of multiple comparisons is justified is where there does not appear to be a logical treatment structure. A mechanical comparison of all pairs of means using a criterion such as the Waller-Duncan procedure (Steel and Torrie, 1980) seems appropriate. However, it should be pointed out that
in most experiments there is at least some structure in the treatments.

Misuse 10: Failing to Report in the MATERIALS AND METHODS Section of the Research Report the Experimental Design and Statistical Procedures Used

One of the common failures in reporting of research results is to inadequately describe, in the MATERIALS AND METHODS section of the research report or paper, the experimental design and statistical procedures used. This information is essential to the reader's proper interpretation of the results reported. A detailed account of what procedures were used will also allay any fears on the part of the reviewer, the editor of the journal, and the reader that improper design or statistical procedures were used.

An example of a statement which would be used to describe the experimental design and statistical procedures is as follows: "The experiment was conducted according to a split-plot design with a randomized complete block arrangement of the whole-plot factor, Varieties. There were four blocks. The subplot factor was N-fertilizer rate (0, 50, 100, 150 kg/ha of N supplied as anhydrous ammonia). Because there was a significant Variety × N-rate interaction, separate quadratic N-rate response curves were fitted for each of the varieties. Economic optimal rates were computed according to the methods of Heady et al. (1955), assuming a cost/price ratio of 0.00244 (N fertilizer in kg, corn yield in quintals)."

CONCLUSIONS

Misuses of statistics in agronomic applications are far more prevalent than most agronomists realize. The statisticians' views of statistical concepts have changed considerably since the early days of statistical application, e.g., when Duncan's New Multiple Range Test was in vogue. Agronomists apparently are not fully aware of these changes. Widespread use of computers, together with the ready availability of "user friendly" software packages, has also resulted in a number of misuses of statistics. Agronomists should
recognize their need for statistical assistance in planning experiments and in analyzing and interpreting experimental data. The responsibility for improving the statistical practice in agronomic research rests jointly with the agronomist, the statistician, the agronomic journal reviewers, and editors. Such improvement should result in well-planned agronomic experiments which focus upon the problems being researched, the results from which are clear cut, and the conclusions from which are scientifically valid and relevant to the underlying problem.

REFERENCES

Cochran, W. G., and G. M. Cox. 1957. Experimental designs. 2nd ed. John Wiley and Sons, Inc., New York.

Heady, E. O., J. T. Pesek, and W. G. Brown. 1955. Crop response surfaces and economic optima in fertilizer use. Iowa Agric. Exp. Stn. Res. Bull. 424.

Steel, R. G. D., and J. H. Torrie. 1980. Principles and procedures of statistics. 2nd ed. McGraw-Hill Book Company, New York.

[SPSS screenshot: Analyze menu (Reports, Descriptive Statistics, Generalized Linear Models, Mixed Models, Data Reduction, Scale, Nonparametric Tests, Time Series, Survival, Multiple Response, Quality Control) and output tables "Group Statistics" and "Independent Samples Test" for Blood Pressure, groups 1 and 2. Values illegible.]

JMP Output for Treatment of Agoraphobia Experiment

Response: Severity

Summary of Fit

| Statistic | Value |
|---|---|
| RSquare | 0.88434 |
| RSquare Adj | 0.82892 |
| Root Mean Square Error | 1.658312 |
| Mean of Response | 13.30556 |
| Observations (or Sum Wgts) | 72 |

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Ratio | Prob > F |
|---|---|---|---|---|---|
| Model | 23 | 1009.2778 | 43.8816 | 15.9570 | <.0001 |
| Error | 48 | 132.0000 | 2.7500 | | |
| C Total | 71 | 1141.2778 | | | |

Effect Tests

| Source | Nparm | DF | Sum of Squares | F Ratio | Prob > F |
|---|---|---|---|---|---|
| Drug | 3 | 3 | 486.72222 | 58.9966 | <.0001 |
| Therapy | 2 | 2 | 33.36111 | 6.0657 | 0.0045 |
| Drug*Therapy | 6 | 6 | 385.52778 | 23.3653 | <.0001 |
| Depression | 1 | 1 | 40.50000 | 14.7273 | 0.0004 |
| Drug*Depression | 3 | 3 | 20.94444 | 2.5387 | 0.0675 |
| Therapy*Depression | 2 | 2 | 31.08333 | 5.6515 | 0.0063 |
| Drug*Therapy*Depression | 6 | 6 | 11.13889 | 0.6751 | 0.6703 |

Effect Details

Drug: Least Squares Means Table

| Level | Least Sq Mean | Std Error | Mean |
|---|---|---|---|
| Elavil | 11.666667 | 0.39086798 | 11.6667 |
| Placebo | 16.333333 | 0.39086798 | 16.3333 |
| Prozac | 9.944444 | 0.39086798 | 9.9444 |
| Xanax | 15.277778 | 0.39086798 | 15.2778 |

[LS Means plot: Severity LS Means by Drug (Elavil, Placebo, Prozac, Xanax).]

Drug: LSMeans Differences Tukey HSD (α = 0.050, Q = 2.66137)

| LSMean[i] | LSMean[j] | Mean[i]-Mean[j] | Std Err Dif | Lower CL Dif | Upper CL Dif |
|---|---|---|---|---|---|
| Elavil | Placebo | -4.6667 | 0.55277 | -6.1378 | -3.1955 |
| Elavil | Prozac | 1.72222 | 0.55277 | 0.25109 | 3.19335 |
| Elavil | Xanax | -3.6111 | 0.55277 | -5.0822 | -2.14 |
| Placebo | Elavil | 4.66667 | 0.55277 | 3.19554 | 6.1378 |
| Placebo | Prozac | 6.38889 | 0.55277 | 4.91776 | 7.86002 |
| Placebo | Xanax | 1.05556 | 0.55277 | -0.4156 | 2.52668 |
| Prozac | Elavil | -1.7222 | 0.55277 | -3.1934 | -0.2511 |
| Prozac | Placebo | -6.3889 | 0.55277 | -7.86 | -4.9178 |
| Prozac | Xanax | -5.3333 | 0.55277 | -6.8045 | -3.8622 |
| Xanax | Elavil | 3.61111 | 0.55277 | 2.13998 | 5.08224 |
| Xanax | Placebo | -1.0556 | 0.55277 | -2.5267 | 0.41557 |
| Xanax | Prozac | 5.33333 | 0.55277 | 3.8622 | 6.80446 |

| Level | Letter | Least Sq Mean |
|---|---|---|
| Placebo | A | 16.333333 |
| Xanax | A | 15.277778 |
| Elavil | B | 11.666667 |
| Prozac | C | 9.944444 |

Levels not connected by same letter are significantly different.

Therapy: Least Squares Means Table

| Level | Least Sq Mean | Std Error | Mean |
|---|---|---|---|
| CogBeh | 12.458333 | 0.33850160 | 12.4583 |
| Group | 13.333333 | 0.33850160 | 13.3333 |
| Psychodynamic | 14.125000 | 0.33850160 | 14.1250 |

[LS Means plot: Severity LS Means by Therapy (CogBeh, Group, Psychodynamic).]

Therapy: LSMeans Differences Student's t (α = 0.050, t = 2.01063). Std Err Dif = 0.47871 for every pair; the rest of the difference matrix is illegible.

| Level | Letter | Least Sq Mean |
|---|---|---|
| Psychodynamic | A | 14.125000 |
| Group | A B | 13.333333 |
| CogBeh | B | 12.458333 |

Levels not connected by same letter are significantly different.

Drug*Therapy

[LS Means plot: Severity LS Means by Therapy (CogBeh, Group, Psychodynamic), one line per Drug (Elavil, Placebo, Prozac, Xanax).]

Depression: Least Squares Means Table

| Level | Least Sq Mean | Std Error | Mean |
|---|---|---|---|
| No | 14.055556 | 0.27638540 | 14.0556 |
| Yes | 12.555556 | 0.27638540 | 12.5556 |

[LS Means plot: Severity LS Means by Depression (No, Yes).]

Drug*Depression: Least Squares Means Table

| Level | Least Sq Mean | Std Error |
|---|---|---|
| Elavil,No | 11.777778 | 0.55277080 |
| Elavil,Yes | 11.555556 | 0.55277080 |
| Placebo,No | 17.333333 | 0.55277080 |
| Placebo,Yes | 15.333333 | 0.55277080 |
| Prozac,No | 10.333333 | 0.55277080 |
| Prozac,Yes | 9.555556 | 0.55277080 |
| Xanax,No | 16.777778 | 0.55277080 |
| Xanax,Yes | 13.777778 | 0.55277080 |

[LS Means plot: Severity LS Means by Depression (No, Yes), one line per Drug (Elavil, Placebo, Prozac, Xanax).]

Therapy*Depression: Least Squares Means Table

| Level | Least Sq Mean | Std Error |
|---|---|---|
| CogBeh,No | 12.333333 | 0.47871355 |
| CogBeh,Yes | 12.583333 | 0.47871355 |
| Group,No | 14.250000 | 0.47871355 |
| Group,Yes | 12.416667 | 0.47871355 |
| Psychodynamic,No | 15.583333 | 0.47871355 |
| Psychodynamic,Yes | 12.666667 | 0.47871355 |

[LS Means plot: Severity LS Means by Depression (No, Yes), one line per Therapy (CogBeh, Group, Psychodynamic).]

Residuals

[Residual plots: Residual Severity by Drug, by Therapy, and by Depression; residual histogram (Count) with Normal Quantile Plot.]
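The Summary of Fit, F ratios, and least squares mean standard errors in the JMP output above all follow from the sums of squares and degrees of freedom in the Analysis of Variance table. The sketch below recomputes them as a cross-check. It is an illustration only: the decimal placement of the input values is an assumption (the extracted text lost the decimal points), the function names are hypothetical, and the count of 18 observations per Drug level assumes a balanced design (72 observations over 4 drugs).

```python
import math

# Values read from the JMP "Analysis of Variance" and "Effect Tests" tables.
# Decimal placement is an assumption; the extracted text dropped the points.
SS_MODEL, DF_MODEL = 1009.2778, 23
SS_ERROR, DF_ERROR = 132.0000, 48
SS_DRUG, DF_DRUG = 486.72222, 3
N_PER_DRUG = 18  # assumption: balanced design, 72 observations / 4 drugs


def anova_summary(ss_model, df_model, ss_error, df_error):
    """Derive the summary-of-fit quantities from an ANOVA table."""
    ss_total = ss_model + ss_error
    df_total = df_model + df_error
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return {
        "ms_model": ms_model,
        "ms_error": ms_error,
        "f_ratio": ms_model / ms_error,
        "r_square": ss_model / ss_total,
        # Adjusted R-square compares error MS with total MS.
        "r_square_adj": 1.0 - ms_error / (ss_total / df_total),
        "root_mse": math.sqrt(ms_error),
    }


def effect_f_ratio(ss_effect, df_effect, ms_error):
    """F ratio for one effect, tested against the residual mean square."""
    return (ss_effect / df_effect) / ms_error


def lsmean_std_error(ms_error, n_per_mean):
    """Standard error of a least squares mean in a balanced design."""
    return math.sqrt(ms_error / n_per_mean)


summary = anova_summary(SS_MODEL, DF_MODEL, SS_ERROR, DF_ERROR)
drug_f = effect_f_ratio(SS_DRUG, DF_DRUG, summary["ms_error"])
drug_se = lsmean_std_error(summary["ms_error"], N_PER_DRUG)

for name, value in summary.items():
    print(f"{name}: {value:.5f}")
print(f"drug F ratio: {drug_f:.4f}")
print(f"drug LS mean std error: {drug_se:.6f}")
```

Agreement between these recomputed values and the reported RSquare, Root Mean Square Error, F ratios, and Std Error column is what supports the restored decimal points. Note that every effect here is tested against the single residual mean square, which is appropriate only when that residual is the correct error term for the design (see Misuse 6).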

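The letter groupings in the Tukey HSD report for Drug can be reproduced from three reported quantities: the least squares means, the standard error of a difference, and the critical value Q. The sketch below is a minimal illustration, assuming the restored decimal points are correct and assuming, as the reported confidence limits suggest, that JMP's margin is simply Q times Std Err Dif. The function name `tukey_pairs` is hypothetical.

```python
from itertools import combinations

# Values read from the JMP Tukey HSD report for Drug; decimal placement is
# an assumption because the extracted text dropped the decimal points.
Q_CRIT = 2.66137   # JMP's reported "Q" at alpha = 0.05
SE_DIFF = 0.55277  # reported Std Err Dif, the same for every pair
DRUG_LSMEANS = {
    "Elavil": 11.666667,
    "Placebo": 16.333333,
    "Prozac": 9.944444,
    "Xanax": 15.277778,
}


def tukey_pairs(means, q_crit, se_diff):
    """Compare every pair of means against the margin q_crit * se_diff.

    Returns tuples (higher, lower, diff, lower_cl, upper_cl, significant),
    with levels taken in order of decreasing mean.
    """
    margin = q_crit * se_diff
    ordered = sorted(means, key=means.get, reverse=True)
    results = []
    for high, low in combinations(ordered, 2):
        diff = means[high] - means[low]
        results.append(
            (high, low, diff, diff - margin, diff + margin, diff > margin)
        )
    return results


pairs = tukey_pairs(DRUG_LSMEANS, Q_CRIT, SE_DIFF)
not_significant = {(a, b) for a, b, _, _, _, sig in pairs if not sig}

for high, low, diff, lo_cl, up_cl, sig in pairs:
    flag = "different" if sig else "not different"
    print(f"{high} - {low}: {diff:.5f} [{lo_cl:.5f}, {up_cl:.5f}] {flag}")
```

Under these assumptions only Placebo and Xanax fail to differ, which matches the reported letters (Placebo A, Xanax A, Elavil B, Prozac C). This is the kind of mechanical all-pairs comparison that Misuse 9 cautions against when the treatments have structure; here the Drug levels have no obvious structure, but the large Drug*Therapy interaction suggests that comparisons of drugs within each therapy may be more informative than comparisons of the main effect means.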