by: Jordane Kemmer


This 40 page Class Notes was uploaded by Jordane Kemmer on Thursday October 15, 2015. The Class Notes belongs to ST 302 at North Carolina State University taught by Staff in Fall. Since its upload, it has received 55 views. For similar materials see /class/223977/st-302-north-carolina-state-university in Statistics at North Carolina State University.


Date Created: 10/15/15
CHAPTER 15 Nonparametric Tests

15.1 The Wilcoxon Rank Sum Test
15.2 The Wilcoxon Signed Rank Test
15.3 The Kruskal-Wallis Test

Introduction

The most commonly used methods for inference about the means of quantitative response variables assume that the variables in question have normal distributions in the population or populations from which we draw our data. In practice, of course, no distribution is exactly normal. Fortunately, our usual methods for inference about population means (the one-sample and two-sample t procedures and analysis of variance) are quite robust. That is, the results of inference are not very sensitive to moderate lack of normality, especially when the samples are reasonably large. Some practical guidelines for taking advantage of the robustness of these methods appear in Chapter 7.

What can we do if plots suggest that the data are clearly not normal, especially when we have only a few observations? This is not a simple question. Here are the basic options:

1. If lack of normality is due to outliers, it may be legitimate to remove the outliers. An outlier is an observation that may not come from the same population as the others. Equipment failure that produced a bad measurement, for example, entitles you to remove the outlier and analyze the remaining data. If the outlier appears to be "real data," you can base inference on statistics that are more resistant than $\bar{x}$ and $s$. Options 4 and 5 below allow this.

2. Sometimes we can transform our data so that their distribution is more nearly normal. Transformations such as the logarithm that pull in the long tail of right-skewed distributions are particularly helpful. Example 7.10 (page 466) illustrates use of the logarithm. A detailed discussion of transformations appears in the extra material entitled Transforming Relationships, available on the text CD and Web site.

3. In
some settings, other standard distributions replace the normal distributions as models for the overall pattern in the population. We mentioned in Chapter 5 (page 367) that the Weibull distributions are common models for the lifetimes in service of equipment in statistical studies of reliability. There are inference procedures for the parameters of these distributions that replace the t procedures when we use specific nonnormal models.

4. Modern bootstrap methods and permutation tests do not require normality or any other specific form of sampling distribution. Moreover, you can base inference on resistant statistics such as the trimmed mean. We recommend these methods unless the sample is so small that it may not represent the population well. Chapter 14 gives a full discussion.

5. Finally, there are other nonparametric methods that do not require any specific form for the distribution of the population. Unlike the bootstrap and permutation methods, common nonparametric methods do not make use of the actual values of the observations. The sign test (page 468) works with counts of observations. This chapter presents rank tests based on the rank (place in order) of each observation in the set of all the data.

Setting                       Normal test                         Rank test
One sample                    One-sample t test (Section 7.1)     Wilcoxon signed rank test (Section 15.2)
Matched pairs                 Apply the one-sample test to differences within pairs
Two independent samples       Two-sample t test (Section 7.2)     Wilcoxon rank sum test (Section 15.1)
Several independent samples   One-way ANOVA F test (Section 12)   Kruskal-Wallis test (Section 15.3)

FIGURE 15.1 Comparison of tests based on normal distributions with nonparametric tests for similar settings.

This chapter concerns rank tests that are designed to replace the t tests and one-way analysis of variance when the normality conditions for those tests are not met. Figure 15.1 presents an outline of the standard tests (based on normal distributions) and the rank tests that
compete with them. All of these tests require that the population or populations have continuous distributions. That is, each distribution must be described by a density curve that allows observations to take any value in some interval of outcomes. The normal curves are one shape of density curve. Rank tests allow curves of any shape.

The rank tests we will study concern the center of a population or populations. When a population has at least roughly a normal distribution, we describe its center by the mean. The "normal tests" in Figure 15.1 all test hypotheses about population means. When distributions are strongly skewed, we often prefer the median to the mean as a measure of center. In simplest form, the hypotheses for rank tests just replace mean by median.

We devote a section of this chapter to each of the rank procedures. Section 15.1, which discusses the most common of these tests, also contains general information about rank tests. The kind of assumptions required, the nature of the hypotheses tested, the big idea of using ranks, and the contrast between exact distributions for use with small samples and approximations for use with larger samples are common to all rank tests. Sections 15.2 and 15.3 more briefly describe other rank tests.

15.1 The Wilcoxon Rank Sum Test

Two-sample problems (see Section 7.2) are among the most common in statistics. The most useful nonparametric significance test compares two distributions. Following is an example of this setting.

FIGURE 15.2 Normal quantile plots of corn yields from plots with no weeds (left) and with 3 weeds per meter of row (right), for Example 15.1.

EXAMPLE 15.1 Does the presence of small numbers of weeds reduce the yield of corn? Lamb's-quarter is a common weed in corn fields. A researcher planted corn at the same rate in 8 small plots of ground, then weeded the corn rows by hand to allow no weeds in 4 randomly selected plots and exactly 3 lamb's-quarter plants per meter of row in
the other 4 plots. Here are the yields of corn (bushels per acre) in each of the plots:¹

Weeds per meter   Yield (bu/acre)
0                 166.7   172.2   165.0   176.9
3                 158.6   176.4   153.1   156.0

Normal quantile plots (Figure 15.2) suggest that the data may be right-skewed. The samples are too small to assess normality adequately or to rely on the robustness of the two-sample t test. We may prefer to use a test that does not require normality.

The rank transformation

We first rank all 8 observations together. To do this, arrange them in order from smallest to largest:

153.1   156.0   158.6   165.0*   166.7*   172.2*   176.4   176.9*

The starred entries in the list are the yields with no weeds present. We see that four of the five highest yields come from that group, suggesting that yields are higher with no weeds. The idea of rank tests is to look just at position in this ordered list. To do this, replace each observation by its order, from 1 (smallest) to 8 (largest). These numbers are the ranks:

Yield   153.1   156.0   158.6   165.0   166.7   172.2   176.4   176.9
Rank    1       2       3       4       5       6       7       8

RANKS
To rank observations, first arrange them in order from smallest to largest. The rank of each observation is its position in this ordered list, starting with rank 1 for the smallest observation.

Moving from the original observations to their ranks is a transformation of the data, like moving from the observations to their logarithms. The rank transformation retains only the ordering of the observations and makes no other use of their numerical values. Working with ranks allows us to dispense with specific assumptions about the shape of the distribution, such as normality.

The Wilcoxon rank sum test

If the presence of weeds reduces corn yields, we expect the ranks of the yields from plots with weeds to be smaller as a group than the ranks from plots without weeds. We might compare the sums of the ranks from the two treatments:

Treatment   Sum of ranks
No weeds    23
Weeds       13

These sums measure how much the ranks of the weed-free plots as a group exceed those of the weedy
plots. In fact, the sum of the ranks from 1 to 8 is always equal to 36, so it is enough to report the sum for one of the two groups. If the sum of the ranks for the weed-free group is 23, the ranks for the other group must add to 13 because 23 + 13 = 36. If the weeds have no effect, we would expect the sum of the ranks in either group to be 18 (half of 36). Here are the facts we need in a more general form that takes account of the fact that our two samples need not be the same size.

THE WILCOXON RANK SUM TEST
Draw an SRS of size $n_1$ from one population and draw an independent SRS of size $n_2$ from a second population. There are $N$ observations in all, where $N = n_1 + n_2$. Rank all $N$ observations. The sum $W$ of the ranks for the first sample is the Wilcoxon rank sum statistic. If the two populations have the same continuous distribution, then $W$ has mean

$$\mu_W = \frac{n_1(N+1)}{2}$$

and standard deviation

$$\sigma_W = \sqrt{\frac{n_1 n_2 (N+1)}{12}}$$

The Wilcoxon rank sum test rejects the hypothesis that the two populations have identical distributions when the rank sum $W$ is far from its mean.

In the corn yield study of Example 15.1, we want to test

$H_0$: no difference in distribution of yields

against the one-sided alternative

$H_a$: yields are systematically higher in weed-free plots

Our test statistic is the rank sum W = 23 for the weed-free plots.

In Example 15.1, $n_1 = 4$, $n_2 = 4$, and there are $N = 8$ observations in all. The sum of ranks for the weed-free plots has mean

$$\mu_W = \frac{n_1(N+1)}{2} = \frac{(4)(9)}{2} = 18$$

and standard deviation

$$\sigma_W = \sqrt{\frac{n_1 n_2 (N+1)}{12}} = \sqrt{\frac{(4)(4)(9)}{12}} = \sqrt{12} = 3.464$$

Although the observed rank sum W = 23 is higher than the mean, it is only about 1.4 standard deviations higher. We now suspect that the data do not give strong evidence that yields are higher in the population of weed-free corn. The P-value for our one-sided alternative is $P(W \ge 23)$, the probability that $W$ is at least as large as the value for our data when $H_0$ is true.

To calculate the P-value $P(W \ge 23)$, we need to know the sampling distribution of the rank sum $W$ when the null hypothesis is true. This distribution
depends on the two sample sizes $n_1$ and $n_2$. Tables are therefore a bit unwieldy, though you can find them in handbooks of statistical tables. Most statistical software will give you P-values, as well as carry out the ranking and calculate $W$. However, some software gives only approximate P-values. You must learn what your software offers.

This test was invented by Frank Wilcoxon (1892-1965) in 1945. Wilcoxon was a chemist who met statistical problems in his work at the research laboratories of American Cyanamid Company.

        Exact Wilcoxon rank-sum test
    data: 0weeds and 3weeds
    rank-sum statistic W = 23, n = 4, m = 4, p-value = 0.100
    alternative hypothesis: true mu is greater than 0

FIGURE 15.3 Output from the S-PLUS statistical software for the data in Example 15.1. This program uses the exact distribution for W when the samples are small and there are no tied observations.

Figure 15.3 shows the output from software that calculates the exact sampling distribution of $W$. We see that the sum of the ranks in the weed-free group is W = 23, with P-value P = 0.10 against the one-sided alternative that weed-free plots have higher yields. There is some evidence that weeds reduce yield, considering that we have data from only four plots for each treatment. The evidence does not, however, reach the levels usually considered convincing.

It is worth noting that the two-sample t test gives essentially the same result as the Wilcoxon test in Example 15.3 (t = 1.554, P = 0.0937). A permutation test (Chapter 14) for the sample means gives P = 0.084. It is in fact somewhat unusual to find a strong disagreement between the conclusions reached by these tests.

The normal approximation

The rank sum statistic $W$ becomes approximately normal as the two sample sizes increase. We can then form yet another z statistic by standardizing $W$:

$$z = \frac{W - \mu_W}{\sigma_W} = \frac{W - n_1(N+1)/2}{\sqrt{n_1 n_2 (N+1)/12}}$$

Use standard normal probability calculations to find P-values for this statistic. Because $W$ takes only whole-number values, the continuity
correction improves the accuracy of the approximation.

EXAMPLE 15.4 The standardized rank sum statistic $W$ in our corn yield example is

$$z = \frac{W - \mu_W}{\sigma_W} = \frac{23 - 18}{3.464} = 1.44$$

We expect $W$ to be larger when the alternative hypothesis is true, so the approximate P-value is

$$P(Z \ge 1.44) = 0.0749$$

The continuity correction (page 347) acts as if the whole number 23 occupies the entire interval from 22.5 to 23.5. We calculate the P-value $P(W \ge 23)$ as $P(W \ge 22.5)$ because the value 23 is included in the range whose probability we want. Here is the calculation:

$$P(W \ge 22.5) = P\left(\frac{W - \mu_W}{\sigma_W} \ge \frac{22.5 - 18}{3.464}\right) = P(Z \ge 1.30) = 0.0968$$

The continuity correction gives a result closer to the exact value P = 0.10.

We recommend always using either the exact distribution (from software or tables) or the continuity correction for the rank sum statistic $W$. The exact distribution is safer for small samples. As Example 15.4 illustrates, however, the normal approximation with the continuity correction is often adequate.

Figure 15.4 shows the output for our data from two more statistical programs. Minitab offers only the normal approximation, and it refers to the Mann-Whitney test. This is an alternative form of the Wilcoxon rank sum test. SAS carries out both the exact and approximate tests. SAS calls the rank sum S rather than W and gives the mean 18 and standard deviation 3.464 as well as the z statistic 1.299 (using the continuity correction). SAS gives the approximate two-sided P-value as 0.1939, so the one-sided result is half this, P = 0.0970. This agrees with Minitab and (up to a small roundoff error) with our result in Example 15.4. This approximate P-value is close to the exact result P = 0.1000, given by SAS and in Figure 15.3.

What hypotheses does Wilcoxon test?

Our null hypothesis is that weeds do not affect yield. Our alternative hypothesis is that yields are lower when weeds are present. If we are willing to assume that yields are normally distributed, or if we have reasonably large samples, we use the
two-sample t test for means. Our hypotheses then become

$H_0$: $\mu_1 = \mu_2$
$H_a$: $\mu_1 > \mu_2$

When the distributions may not be normal, we might restate the hypotheses in terms of population medians rather than means:

$H_0$: median$_1$ = median$_2$
$H_a$: median$_1$ > median$_2$

[FIGURE 15.4 Output from Minitab and SAS for the corn yield data. The output is not legible in the source.]

[A passage is not legible in the source here. The text resumes mid-sentence with the general form of the hypotheses.] ... can state in words as

$H_0$: The two distributions are the same.
$H_a$: One distribution has values that are systematically larger.

Here is a more exact statement of the "systematically larger" alternative hypothesis. Take $X_1$ to be corn yield with no weeds and $X_2$ to be corn yield with 3 weeds per meter. These yields are random variables. That is, every time we plant a plot with no weeds, the yield is a value of the variable $X_1$. The probability that the yield is more than 160 bushels per acre when no weeds are present is $P(X_1 > 160)$. If weed-free yields are systematically larger than those with weeds, yields higher than 160 should be more likely with no weeds. That is, we should have

$$P(X_1 > 160) > P(X_2 > 160)$$

The alternative hypothesis says that this inequality holds not just for 160 but for any yield we care to specify. No weeds always puts more
probability "to the right" of whatever yield we are interested in.²

This exact statement of the hypotheses we are testing is a bit awkward. The hypotheses really are "nonparametric" because they do not involve any specific parameter such as the mean or median. If the two distributions do have the same shape, the general hypotheses reduce to comparing medians. Many texts and computer outputs state the hypotheses in terms of medians, sometimes ignoring the same-shape requirement. We recommend that you express the hypotheses in words rather than symbols. "Yields are systematically higher in weed-free plots" is easy to understand and is a good statement of the effect that the Wilcoxon test looks for.

Ties

The exact distribution for the Wilcoxon rank sum is obtained assuming that all observations in both samples take different values. This allows us to rank them all. In practice, however, we often find observations tied at the same value. What shall we do? The usual practice is to assign all tied values the average of the ranks they occupy. Here is an example with 6 observations:

Observation   153   155   158   158   161   164
Rank          1     2     3.5   3.5   5     6

The tied observations occupy the third and fourth places in the ordered list, so they share rank 3.5.

The exact distribution for the Wilcoxon rank sum $W$ changes if the data contain ties. Moreover, the standard deviation $\sigma_W$ must be adjusted if ties are present. The normal approximation can be used after the standard deviation is adjusted. Statistical software will detect ties, make the necessary adjustment, and switch to the normal approximation. In practice, software is required if you want to use rank tests when the data contain tied values.

It is sometimes useful to use rank tests on data that have very many ties because the scale of measurement has only a few values. Here is an example.

EXAMPLE 15.6 Food sold at outdoor fairs and festivals may be less safe than food sold in restaurants because it is prepared in temporary locations and often by volunteer
help. What do people who attend fairs think about the safety of the food served? One study asked this question of people at a number of fairs in the Midwest:

"How often do you think people become sick because of food they consume prepared at outdoor fairs and festivals?"

The possible responses were

1 = very rarely
2 = once in a while
3 = often
4 = more often than not
5 = always

In all, 303 people answered the question. Of these, 196 were women and 107 were men. Is there good evidence that men and women differ in their perceptions about food safety at fairs?³

We should first ask if the subjects in Example 15.6 are a random sample of people who attend fairs, at least in the Midwest. The researcher visited 11 different fairs. She stood near the entrance and stopped every 25th adult who passed. Because no personal choice was involved in choosing the subjects, we can reasonably treat the data as coming from a random sample. (As usual, there was some nonresponse, which could create bias.)

Here are the data, presented as a two-way table of counts:

                   Response
          1     2     3     4     5     Total
Female    13    108   50    23    2     196
Male      22    57    22    5     1     107
Total     35    165   72    28    3     303

Comparing row percents shows that the women in the sample are more concerned about food safety than the men:

                   Response
          1       2       3       4       5      Total
Female    6.6%    55.1%   25.5%   11.7%   1.0%   100%
Male      20.6%   53.3%   20.6%   4.7%    1.0%   100%

Is the difference between the genders statistically significant? We might apply the chi-square test (Chapter 9). It is highly significant ($X^2 = 16.120$, df = 4, P = 0.0029). Although the chi-square test answers our general question, it ignores the ordering of the responses and so does not use all of the available information. We would really like to know whether men or women are more concerned about the safety of the food served. This question depends on the ordering of responses from least concerned to most concerned. We can use the Wilcoxon test for the hypotheses

$H_0$: Men and women do not differ in their responses.
$H_a$: One of the two genders gives systematically larger
responses than the other.

The alternative hypothesis is two-sided. Because the responses can take only five values, there are very many ties. All 35 people who chose "very rarely" are tied at 1, and all 165 who chose "once in a while" are tied at 2.

Figure 15.5 gives computer output for the Wilcoxon test. The rank sum for men (using average ranks for ties) is W = 14,059.5. The standardized value is z = -3.33, with two-sided P-value P = 0.0009. There is very strong evidence of a difference. Women are more concerned than men about the safety of food served at fairs.

With more than 100 observations in each group and no outliers, we might use the two-sample t even though responses take only five values. In fact, the results for Example 15.6 are t = 3.3655 with P = 0.0009. The P-value for two-sample t is the same as that for the Wilcoxon test. There is, however, another reason to prefer the rank test in this example. The t statistic treats the response values 1 through 5 as meaningful numbers. In particular, the possible responses are treated as though they are equally spaced. The difference between "very rarely" and "once in a while" is the same as the difference between "once in a while" and "often." This may not make sense. The rank test, on the [the rest of this sentence is not legible in the source]

    Wilcoxon Scores (Rank Sums) for Variable SFAIR
    Classified by Variable GENDER

                      Sum of      Expected     Std Dev       Mean
    GENDER    N       Scores      Under H0     Under H0      Score
    Female    196     31996.5000  29792.0      661.161398    163.247449
    Male      107     14059.5000  16264.0      661.161398    131.397196

    Average Scores Were Used for Ties

    Wilcoxon Two-Sample Test
    Normal Approximation (with Continuity Correction of .5)
    Statistic S = 14059.5    Z = -3.33353    Pr > |Z| = 0.0009

FIGURE 15.5 Output from SAS for the food safety study of Example 15.7. The approximate two-sided P-value is 0.0009.

[Several passages are not legible in the source here: the end of the discussion above, a subsection on rank tests and permutation tests, the Section 15.1 Summary, and the opening of the first Section 15.1 exercise. The text resumes partway through that exercise.]

SECTION 15.1 Exercises

... and also illustrated with pictures. An expert listened to a recording of the children and assigned a score for certain uses of language. Here are the data:⁴

Child   Progress   Story 1 score   Story 2 score
1       high       0.55            0.80
2       high       0.57            0.82
3       high       0.72            0.54
4       high       0.70            0.79
5       high       0.84            0.89
6       low        0.40            0.77
7       low        0.72            0.49
8       low        0.00            0.66
9       low        0.36            0.28
10      low        0.55            0.38

Is there evidence that the scores of high-progress readers are higher than those of low-progress readers when they retell a story they have heard without pictures (Story 1)?

(a) Make normal quantile plots for the 5 responses in each group. Are any major deviations from normality apparent?

(b) Carry out a two-sample t test. State hypotheses and give the two sample means, the t statistic and its P-value, and your
conclusion.

(c) Carry out the Wilcoxon rank sum test. State hypotheses and give the rank sum W for high-progress readers, its P-value, and your conclusion. Do the t and Wilcoxon tests lead you to different conclusions?

15.2 Repeat the analysis of Exercise 15.1 for the scores when children retell a story they have heard and seen illustrated with pictures (Story 2).

15.3 Use the data in Exercise 15.1 for children telling Story 2 to carry out by hand the steps in the Wilcoxon rank sum test.

(a) Arrange the 10 observations in order and assign ranks. There are no ties.

(b) Find the rank sum W for the five high-progress readers. What are the mean and standard deviation of W under the null hypothesis that low-progress and high-progress readers do not differ?

(c) Standardize W to obtain a z statistic. Do a normal probability calculation with the continuity correction to obtain a one-sided P-value.

(d) The data for Story 1 contain tied observations. What ranks would you assign to the 10 scores for Story 1?

15.4 The corn yield study of Example 15.1 also examined yields in four plots having 9 lamb's-quarter plants per meter of row. The yields (bushels per acre) in these plots were

162.7   162.4   162.8   142.4

There is a clear outlier, but rechecking the results found that this is the correct yield for this plot. The outlier makes us hesitant to use t procedures because $\bar{x}$ and $s$ are not resistant.

(a) Is there evidence that 9 weeds per meter reduces corn yields when compared with weed-free corn? Use the Wilcoxon rank sum test with the data above and part of the data from Example 15.1 to answer this question.

(b) Compare the results from (a) with those from the two-sample t test for these data.

(c) Now remove the low outlier 142.4 from the data for 9 weeds per meter. Repeat both the Wilcoxon and t analyses. By how much did the outlier reduce the mean yield in its group? By how much did it increase the standard deviation? Did it have a practically important impact on your conclusions?

15.5 How quickly do synthetic fabrics such as
polyester decay in landfills? A researcher buried polyester strips in the soil for different lengths of time, then dug up the strips and measured the force required to break them. Breaking strength is easy to measure and is a good indicator of decay. Lower strength means the fabric has decayed. Part of the study involved burying 10 polyester strips in well-drained soil in the summer. Five of the strips, chosen at random, were dug up after 2 weeks; the other 5 were dug up after 16 weeks. Here are the breaking strengths in pounds:⁵

2 weeks    118   126   126   120   129
16 weeks   124   98    110   140   110

(a) Make a back-to-back stemplot. Does it appear reasonable to assume that the two distributions have the same shape?

(b) Is there evidence that breaking strengths are lower for strips buried longer?

15.6 A subliminal message is below our threshold of awareness but may nonetheless influence us. Can subliminal messages help students learn math? A group of students who had failed the mathematics part of the City University of New York Skills Assessment Test agreed to participate in a study to find out. All received a daily subliminal message, flashed on a screen too rapidly to be consciously read. The treatment group of 10 students was exposed to "Each day I am getting better in math." The control group of 8 students was exposed to a neutral message, "People are walking on the street." All students participated in a summer program designed to raise their math skills, and all took the assessment test again at the end of the program. The table below presents data on the subjects' scores before and after the program.⁶

Treatment group        Control group
Pretest   Posttest     Pretest   Posttest
18        24           18        29
18        25           24        29
21        33           20        24
18        29           18        26
18        33           24        38
20        36           22        27
23        34           15        22
23        36           19        31
21        34
17        27

(a) The study design was a randomized comparative experiment. Outline this design.

(b) Compare the gain in scores in the two groups, using a graph and numerical descriptions. Does it appear that the treatment group's scores rose more than the scores for the control group?

(c) Apply the Wilcoxon rank sum test to the posttest versus pretest differences. Note that there are some ties. What do you conclude?

15.7 "Conservationists have despaired over destruction of tropical rainforest by logging, clearing, and burning." These words begin a report on a statistical study of the effects of logging in Borneo.⁷ Here are data on the number of tree species in 12 unlogged forest plots and 9 similar plots logged 8 years earlier:

Unlogged   22   18   22   20   15   21   13   13   19   13   19   15
Logged     17   4    18   14   18   15   15   10   12

(a) Make a back-to-back stemplot of the data. Does there appear to be a difference in species counts for logged and unlogged plots?

(b) Does logging significantly reduce the number of species in a plot after 8 years? State hypotheses, do a Wilcoxon test, and state your conclusion.

15.8 Do new "directed reading activities" improve the reading ability of elementary school students, as measured by their Degree of Reading Power (DRP) score? A study assigns students at random to either the new method (treatment group, 21 students) or traditional teaching methods (control group, 23 students). The DRP scores at the end of the study appear in Table 15.1.⁸

TABLE 15.1 Degree of Reading Power scores for third-graders

Treatment group   Control group
24 61 59 46 43 44 42 33 46 37 43 41 52 43 58 67 62 57 10 42 55 19 17 55 71 49 54 43 53 57 26 54 60 28 62 20

For these data the two-sample t test (Example 7.14) gives P = 0.013, and a permutation test based on the difference of means (Example 14.12) gives P = 0.015. Both of these tests are based on the difference of sample means. Does the Wilcoxon test, based on rank sums rather than means, give a similar P-value?

15.9 Example 15.6 describes a study of the attitudes of people attending outdoor fairs about the safety of the food served at such locations. You can find the full data set on the text CD and Web site as the file 857151106 dat. It contains the responses of 303 people to several
questions. The variables in this data set are (in order)

subject   hfair   sfair   sfast   srest   gender

The variable "sfair" contains the responses described in the example concerning safety of food served at outdoor fairs and festivals. The variable "srest" contains responses to the same question asked about food served in restaurants. The variable "gender" contains 1 if the respondent is a woman, 2 if he is a man. We saw that women are more concerned than men about the safety of food served at fairs. Is this also true for restaurants?

15.10 The data file used in Example 15.6 and Exercise 15.9 contains 303 rows, one for each of the 303 respondents. Each row contains the responses of one person to several questions. We wonder if people are more concerned about the safety of food served at fairs than they are about the safety of food served at restaurants. Explain carefully why we cannot answer this question by applying the Wilcoxon rank sum test to the variables sfair and srest.

15.11 To study customers' attitudes toward secondhand stores, researchers interviewed samples of shoppers at two secondhand stores of the same chain in two cities. Here are data on the incomes of shoppers at the two stores, presented as a two-way table of counts:⁹

Income code   Income             City 1   City 2
1             Under 10,000       70       62
2             10,000 to 19,999   52       63
3             20,000 to 24,999   69       50
4             25,000 to 34,999   22       19
5             35,000 or more     28       24

(a) Is there a relationship between city and income? Use the chi-square test to answer this question.

(b) The chi-square test ignores the ordering of the income categories. The data file eXI5IIdaI contains data on the 459 shoppers in this study. The first variable is the city (City1 or City2), and the second is the income code as it appears in the table above (1 to 5). Is there good evidence that shoppers in one city have systematically higher incomes than in the other?

15.2 The Wilcoxon Signed Rank Test

We use the one-sample t procedures for inference about the mean of one population or for
inference about the mean difference in a matched pairs setting. The matched pairs setting is more important because good studies are generally comparative. We will now meet a rank test for this setting.

EXAMPLE 15.8 A study of early childhood education asked kindergarten students to retell two fairy tales that had been read to them earlier in the week. Each child told two stories. The first had been read to them, and the second had been read but also illustrated with pictures. An expert listened to a recording of the children and assigned a score for certain uses of language. Here are the data for five "low progress" readers in a pilot study:¹⁰

Child          1       2       3       4       5
Story 2      0.77    0.49    0.66    0.28    0.38
Story 1      0.40    0.72    0.00    0.36    0.55
Difference   0.37   -0.23    0.66   -0.08   -0.17

We wonder if illustrations improve how the children retell a story. We would like to test the hypotheses

$H_0$: Scores have the same distribution for both stories.
$H_a$: Scores are systematically higher for Story 2.

Because this is a matched pairs design, we base our inference on the differences. The matched pairs t test gives t = 0.635 with one-sided P-value P = 0.280. Displays of the data (Figure 15.6) suggest some lack of normality. We would therefore like to use a rank test.

FIGURE 15.6 Normal quantile plot and histogram for the five differences in Example 15.8.

Positive differences in Example 15.8 indicate that the child performed better telling Story 2. If scores are generally higher with illustrations, the positive differences should be farther from zero in the positive direction than the negative differences are in the negative direction. We therefore compare the absolute values of the differences, that is, their magnitudes without a sign. Here they are, with an asterisk marking the values that were originally positive:

0.37*   0.23   0.66*   0.08   0.17

Arrange these in increasing order and assign ranks, keeping track of which values were originally positive. Tied values receive the average of their
ranks. If there are zero differences, discard them before ranking.

Absolute value   0.08   0.17   0.23   0.37*   0.66*
Rank             1      2      3      4       5

The test statistic is the sum of the ranks of the positive differences (marked here with an asterisk). (We could equally well use the sum of the ranks of the negative differences.) This is the Wilcoxon signed rank statistic. Its value here is $W^+ = 4 + 5 = 9$.

THE WILCOXON SIGNED RANK TEST FOR MATCHED PAIRS
Draw an SRS of size $n$ from a population for a matched pairs study and take the differences in responses within pairs. Rank the absolute values of these differences. The sum $W^+$ of the ranks for the positive differences is the Wilcoxon signed rank statistic. If the distribution of the responses is not affected by the different treatments within pairs, then $W^+$ has mean

$$\mu_{W^+} = \frac{n(n+1)}{4}$$

and standard deviation

$$\sigma_{W^+} = \sqrt{\frac{n(n+1)(2n+1)}{24}}$$

The Wilcoxon signed rank test rejects the hypothesis that there are no systematic differences within pairs when the rank sum $W^+$ is far from its mean.

EXAMPLE 15.9 In the storytelling study of Example 15.8, $n = 5$. If the null hypothesis (no systematic effect of illustrations) is true, the mean of the signed rank statistic is

$$\mu_{W^+} = \frac{n(n+1)}{4} = \frac{(5)(6)}{4} = 7.5$$

Our observed value $W^+ = 9$ is only slightly larger than this mean. The one-sided P-value is $P(W^+ \ge 9)$.

(a) S-PLUS
Exact Wilcoxon Signed-Rank Test
data: Story2 - Story1
signed-rank statistic V = 9, n = 5, p-value = 0.4062
alternative hypothesis: true mu is greater than 0

(b) SPSS
Wilcoxon Signed-Rank Test: Story2 - Story1
N   Positive Ranks   Wilcoxon Statistic   z      Signif. (two-tailed)
5   2                9                    .405   .686

FIGURE 15.7 Output from (a) S-PLUS and (b) SPSS for the storytelling study of Example 15.9. S-PLUS reports the exact P-value, P = 0.4062. SPSS uses the normal approximation without the continuity correction and so gives a less accurate P-value, P = 0.343 (one-sided).

Figure 15.7 displays the output of two statistical programs. We see from Figure 15.7(a) that the one-sided P-value for the Wilcoxon signed rank test with n = 5 observations and $W^+ = 9$ is P = 0.4062. This result differs from the t test result P
= 0.280, but both tell us that this very small sample gives no evidence that seeing illustrations improves the storytelling of low-progress readers.

The normal approximation
The distribution of the signed rank statistic when the null hypothesis (no difference) is true becomes approximately normal as the sample size becomes large. We can then use normal probability calculations (with the continuity correction) to obtain approximate P-values for $W^+$. Let's see how this works in the storytelling example, even though $n = 5$ is certainly not a large sample.

EXAMPLE 15.10 For n = 5 observations, we saw in Example 15.9 that $\mu_{W^+} = 7.5$. The standard deviation of $W^+$ under the null hypothesis is

$$\sigma_{W^+} = \sqrt{\frac{n(n+1)(2n+1)}{24}} = \sqrt{\frac{(5)(6)(11)}{24}} = \sqrt{13.75} = 3.708$$

The continuity correction treats the observed value $W^+ = 9$ as occupying the interval from 8.5 to 9.5, so the approximate one-sided P-value is

$$P(W^+ \ge 8.5) = P\left(Z \ge \frac{8.5 - 7.5}{3.708}\right) = P(Z \ge 0.27) = 0.394$$

FIGURE 15.8 Normal quantile plot of the differences in scores for two rounds of a golf tournament, for Example 15.11.

The absolute values of the differences, with an asterisk marking those that were negative, are

5   5*   2   6*   5*   5*   5   16*   4   3   3*   1

Arrange these in increasing order and assign ranks, keeping track of which values were originally negative. Tied values receive the
average of their ranks.

Absolute value   1   2   3     3*    4   5   5   5*   5*   5*   6*    16*
Rank             1   2   3.5   3.5   5   8   8   8    8    8    11    12

The Wilcoxon signed rank statistic is the sum of the ranks of the negative differences (marked with an asterisk). (We could equally well use the sum of the ranks of the positive differences.) Its value is $W^+ = 50.5$.

EXAMPLE 15.12 Here are the two-sided P-values for the Wilcoxon signed rank test for the golf score data from several statistical programs:

Program   P-value
Minitab   P = 0.388
SAS       P = 0.388
S-PLUS    P = 0.384
SPSS      P = 0.363

All lead to the same practical conclusion: these data give no evidence for a systematic change in scores between rounds. However, the P-values reported differ a bit from program to program. The reason for the variations is that the programs use slightly different versions of the approximate calculations needed when ties are present. The exact result depends on which of these variations the programmer chooses to use.

For these data, the matched pairs t test gives t = 0.9314 with P = 0.3716. Once again, t and $W^+$ lead to the same conclusion.

SECTION 15.2 Summary

The Wilcoxon signed rank test applies to matched pairs studies. It tests the null hypothesis that there is no systematic difference within pairs against alternatives that assert a systematic difference (either one-sided or two-sided).

The test is based on the Wilcoxon signed rank statistic $W^+$, which is the sum of the ranks of the positive (or negative) differences when we rank the absolute values of the differences. The matched pairs t test and the sign test are alternative tests in this setting.

P-values for the signed rank test are based on the sampling distribution of $W^+$ when the null hypothesis is true. You can find P-values from special tables, software, or a normal approximation (with continuity correction).

SECTION 15.2 Exercises

Statistical software is very helpful in doing these exercises. If you do not have access to software, base your work on the normal approximation with continuity correction.

15.12 The concentration of carbon
dioxide (CO2) in the atmosphere is increasing rapidly due to our use of fossil fuels. Because plants use CO2 to fuel photosynthesis, more CO2 may cause trees and other plants to grow faster. An elaborate apparatus allows researchers to pipe extra CO2 to a 30-meter circle of forest. They set up three pairs of circles in different parts of a forest in North Carolina. One of each pair received extra CO2 for an entire growing season and the other received ambient air. The response variable is the average growth in base area for trees in a circle, as a fraction of the starting area. Here are the data for one growing season:¹¹

Pair   Control   Treatment
1      0.06528   0.08150
2      0.05232   0.06334
3      0.04329   0.05936

(a) Summarize the data. Does it appear that growth was faster in the treated plots?
(b) The researchers used a matched pairs t test to see if the data give good evidence of faster growth in the treated plots. State hypotheses, carry out the test, and state your conclusion.
(c) The sample is so small that we cannot assess normality. To be safe, we might use the Wilcoxon signed rank test. Carry out this test and report your result.
(d) The tests lead to very different conclusions. The primary reason is the lack of power of rank tests for very small samples. Explain to someone who knows no statistics what this means.

15.13 A student project asked subjects to step up and down for three minutes and measured their heart rates before and after the exercise. Here are data for five subjects and two treatments: stepping at a low rate (14 steps per minute) and at a medium rate (21 steps per minute). For each subject, we give the resting heart rate (beats per minute) and the heart rate at the end of the exercise.¹²

          Low Rate            Medium Rate
Subject   Resting   Final     Resting   Final
1         60        75        63         84
2         90        99        69         93
3         87        93        81         96
4         78        87        75         90
5         84        84        90        108

Does exercise at the low rate raise heart rate significantly? State hypotheses in terms of the median increase in heart rate and apply the Wilcoxon signed
rank test. What do you conclude?

15.14 Do the data from the previous exercise give good reason to think that stepping at the medium rate increases heart rates more than stepping at the low rate?
(a) State hypotheses in terms of comparing the median increases for the two treatments. What is the proper rank test for these hypotheses?
(b) Carry out your test and state a conclusion.

15.15 Can the full moon influence behavior? A study observed 15 nursing home patients with dementia. The number of incidents of aggressive behavior was recorded each day for 12 weeks. Call a day a "moon day" if it is the day of a full moon or the day before or after a full moon. Table 15.2 gives the average number of aggressive incidents for moon days and other days for each subject.¹³ The matched pairs t test (Example 7.7, page 459) gives P < 0.001 and a permutation test (Example 14.14, page 14-54) gives P = 0.0001. Does the Wilcoxon signed rank test, based on ranks rather than means, agree that there is strong evidence that there are more aggressive incidents on moon days?

TABLE 15.2 Aggressive behaviors of dementia patients
Patient   Moon days   Other days     Patient   Moon days   Other days
1         3.33        0.27           9         6.00        1.59
2         3.67        0.59           10        4.33        0.60
3         2.67        0.32           11        3.33        0.65
4         3.33        0.19           12        0.67        0.69
5         3.33        1.26           13        1.33        1.26
6         3.67        0.11           14        0.33        0.23
7         4.67        0.30           15        2.00        0.38
8         2.67        0.40

15.16 A matched pairs study of the effect of a summer language institute on the ability of teachers to comprehend spoken French had these improvements in scores between the pretest and the posttest for 20 teachers:

2 0 6 6 3 3 2 3 -6 6 6 6 3 0 1 1 0 2 3 3

(Exercise 7.41 applies the t test to these data. Exercise 14.53 applies a permutation test based on the means.) Show the assignment of ranks and the calculation of the signed rank statistic $W^+$ for these data. Remember that zeros are dropped from the data before ranking, so that n is the number of nonzero differences within pairs.

15.17 Example 15.6 describes a study of the attitudes of people attending outdoor fairs about the safety of the
food served at such locations. The full data set is available on the text CD and Web site as the file eg15006.dat. It contains the responses of 303 people to several questions. The variables in this data set are (in order)

subject hfair sfair sfast srest gender

The variable sfair contains responses to the safety question described in Example 15.6. The variable srest contains responses to the same question asked about food served in restaurants. We suspect that restaurant food will appear safer than food served outdoors at a fair. Do the data give good evidence for this suspicion? (Give descriptive measures, a test statistic and its P-value, and your conclusion.) Why might we hesitate to accept a small P-value as good evidence against $H_0$ for these data?

15.18 How often do nurses use latex gloves during procedures for which glove use is recommended? A matched pairs study observed nurses (without their knowledge) before and after a presentation on the importance of glove use. Here are the proportions of procedures for which each nurse wore gloves:¹⁴

Nurse   Before   After      Nurse   Before   After
1       0.500    0.857      8       0.000    1.000
2       0.500    0.833      9       0.000    0.667
3       1.000    1.000      10      0.167    1.000
4       0.000    1.000      11      0.000    0.750
5       0.000    1.000      12      0.000    1.000
6       0.000    1.000      13      0.000    1.000
7       1.000    1.000      14      1.000    1.000

Is there good evidence that glove use increased after the presentation?

15.19 Exercise 7.37 (page 481) reports readings from 12 home radon detectors exposed to 105 picocuries per liter of radon:

91.9 97.8 103.8 99.6 111.4 96.6 122.3 119.3 105.4 104.8 95.0 101.7

We wonder if the median reading differs significantly from the true value 105.
(a) Graph the data, and comment on skewness and outliers. A rank test is appropriate.
(b) We would like to test hypotheses about the median reading from home radon detectors:

$H_0$: median = 105
$H_a$: median ≠ 105

To do this, apply the Wilcoxon signed rank statistic to the differences between the observations and 105. (This is the one-sample version of the test.) What do you conclude?

15.20 Exercise 7.39 (page 482)
gives data on the vitamin C content of 27 bags of wheat soy blend at the factory and five months later in Haiti. We want to know if vitamin C has been lost during transportation and storage. Describe what the data show about this question. Then use a rank test to see whether there has been a significant loss.

15.21 Exercise 7.10 (page 474) presents these data on the weight gains (in kilograms) of adults who were fed an extra 1000 calories per day for 8 weeks:¹⁵

Subject          1      2      3      4      5      6      7      8
Weight before   55.7   54.9   59.6   62.3   74.2   75.6   70.7   53.3
Weight after    61.7   58.8   66.0   66.2   79.0   82.3   74.3   59.3

Subject          9      10     11     12     13     14     15     16
Weight before   73.3   63.4   68.1   73.7   91.7   55.9   61.7   57.8
Weight after    79.1   66.0   73.4   76.9   93.1   63.0   68.2   60.3

(a) Use a rank test to test the null hypothesis that the median weight gain is 16 pounds, as theory suggests. What do you conclude?
(b) If your software allows, give a 95% confidence interval for the median weight gain in the population.

15.3 The Kruskal-Wallis Test

(Because this test is an alternative to the one-way analysis of variance F test, you should first read Chapter 12.)

We have now considered alternatives to the matched pairs and two-sample t tests for comparing the magnitude of responses to two treatments. To compare more than two treatments, we use one-way analysis of variance (ANOVA) if the distributions of the responses to each treatment are at least roughly normal and have similar spreads. What can we do when these distribution requirements are violated?

EXAMPLE 15.13 Lamb's-quarter is a common weed that interferes with the growth of corn. A researcher planted corn at the same rate in 16 small plots of ground, then randomly assigned the plots to four groups. He weeded the plots by hand to allow a fixed number of lamb's-quarter plants to grow in each meter of corn row. These numbers were 0, 1, 3, and 9 in the four groups of plots. No other weeds were allowed to grow, and all plots received identical treatment except for the weeds. Here are the yields of corn (bushels per acre) in each of the plots:¹⁶

Weeds       Corn      Weeds       Corn      Weeds       Corn      Weeds       Corn
per meter   yield     per meter   yield     per meter   yield     per meter   yield
0           166.7     1           166.2     3           158.6     9           162.8
0           172.2     1           157.3     3           176.4     9           142.4
0           165.0     1           166.7     3           153.1     9           162.7
0           176.9     1           161.1     3           156.0     9           162.4

The summary statistics are

Weeds   n   Mean      Std. dev.
0       4   170.200    5.422
1       4   162.825    4.469
3       4   161.025   10.493
9       4   157.575   10.118

The sample standard deviations do not satisfy our rule of thumb that for safe use of ANOVA the largest should not exceed twice the smallest. Normal quantile plots (Figure 15.9) show that outliers are present in the yields for 3 and 9 weeds per meter. These are the correct yields for their plots, so we have no justification for removing them. We may want to use a rank test.

FIGURE 15.9 Normal quantile plots for the corn yields in the four treatment groups in Example 15.13. (Four panels: no weeds, 1 weed per meter, 3 weeds per meter, 9 weeds per meter. Each plots yield in bushels per acre against z-score.)

Hypotheses and assumptions
The ANOVA F test concerns the means of the several populations represented by our samples. For Example 15.13, the ANOVA hypotheses are

$H_0$: $\mu_0 = \mu_1 = \mu_3 = \mu_9$
$H_a$: not all four means are equal

For example, $\mu_0$ is the mean yield in the population of all corn planted under the conditions of the experiment with no weeds present. The data should consist of four independent random samples from the four populations, all normally distributed with the same standard deviation.

The Kruskal-Wallis test is a rank test that can replace the ANOVA F test. The assumption about data production (independent random samples from each population) remains important, but we can relax the normality assumption. We assume only that the response has a continuous distribution in each population.
The hypotheses tested in our example are

$H_0$: Yields have the same distribution in all groups.
$H_a$: Yields are systematically higher in some groups than in others.

If all of the population distributions have the same shape (normal or not), these hypotheses take a simpler form. The null hypothesis is that all four populations have the same median yield. The alternative hypothesis is that not all four median yields are equal.

The Kruskal-Wallis test
Recall the analysis of variance idea: we write the total observed variation in the responses as the sum of two parts, one measuring variation among the groups (sum of squares for groups, SSG) and one measuring variation among individual observations within the same group (sum of squares for error, SSE). The ANOVA F test rejects the null hypothesis that the mean responses are equal in all groups if SSG is large relative to SSE.

The idea of the Kruskal-Wallis rank test is to rank all the responses from all groups together and then apply one-way ANOVA to the ranks rather than to the original observations. If there are N observations in all, the ranks are always the whole numbers from 1 to N. The total sum of squares for the ranks is therefore a fixed number no matter what the data are. So we do not need to look at both SSG and SSE. Although it isn't obvious without some unpleasant algebra, the Kruskal-Wallis test statistic is essentially just SSG for the ranks. We give the formula, but you should rely on software to do the arithmetic. When SSG is large, that is evidence that the groups differ.

THE KRUSKAL-WALLIS TEST
Draw independent SRSs of sizes $n_1, n_2, \ldots, n_I$ from $I$ populations. There are $N$ observations in all. Rank all $N$ observations and let $R_i$ be the sum of the ranks for the $i$th sample. The Kruskal-Wallis statistic is

$$H = \frac{12}{N(N+1)} \sum \frac{R_i^2}{n_i} - 3(N+1)$$

When the sample sizes $n_i$ are large and all $I$ populations have the same continuous distribution, $H$ has approximately the chi-square distribution with $I - 1$ degrees of freedom. The Kruskal-Wallis test rejects the null
hypothesis that all populations have the same distribution when $H$ is large.

We now see that, like the Wilcoxon rank sum statistic, the Kruskal-Wallis statistic is based on the sums of the ranks for the groups we are comparing. The more different these sums are, the stronger is the evidence that responses are systematically larger in some groups than in others.

The exact distribution of the Kruskal-Wallis statistic $H$ under the null hypothesis depends on all the sample sizes $n_1$ to $n_I$, so tables are awkward. The calculation of the exact distribution is so time-consuming for all but the smallest problems that even most statistical software uses the chi-square approximation to obtain P-values. As usual, there is no usable exact distribution when there are ties among the responses. We again assign average ranks to tied observations.

EXAMPLE 15.14 In Example 15.13, there are $I = 4$ populations and $N = 16$ observations. The sample sizes are equal, $n_i = 4$. The 16 observations arranged in increasing order, with their ranks, are

Yield   142.4   153.1   156.0   157.3   158.6   161.1   162.4   162.7
Rank    1       2       3       4       5       6       7       8

Yield   162.8   165.0   166.2   166.7   166.7   172.2   176.4   176.9
Rank    9       10      11      12.5    12.5    14      15      16

There is one pair of tied observations. The ranks for each of the four treatments are

Weeds   Ranks                    Sum of ranks
0       10    12.5   14     16     52.5
1       4     6      11     12.5   33.5
3       2     3      5      15     25.0
9       1     7      8      9      25.0

The Kruskal-Wallis statistic is therefore

$$H = \frac{12}{N(N+1)} \sum \frac{R_i^2}{n_i} - 3(N+1) = \frac{12}{(16)(17)} \left( \frac{52.5^2}{4} + \frac{33.5^2}{4} + \frac{25^2}{4} + \frac{25^2}{4} \right) - (3)(17) = \frac{12}{272}(1282.125) - 51 = 5.56$$

Referring to the table of chi-square critical points (Table F) with df = 3, we find that the P-value lies in the interval 0.10 < P < 0.15. This small experiment suggests that more weeds decrease yield but does not provide convincing evidence that weeds have an effect.

Figure 15.10 displays the output from the SAS statistical software, which gives the results H = 5.5725 and P = 0.1344. The software makes a small adjustment for the presence of ties that accounts for the slightly larger value of H. The adjustment makes the chi-square approximation more accurate. It would be important if there were many ties.

Wilcoxon Scores (Rank Sums) for Variable YIELD
Classified by Variable WEEDS

               Sum of        Expected    Std Dev       Mean
WEEDS   N      Scores        Under H0    Under H0      Score
0       4      52.5000000    34.0        8.24014563    13.1250000
1       4      33.5000000    34.0        8.24014563     8.3750000
3       4      25.0000000    34.0        8.24014563     6.2500000
9       4      25.0000000    34.0        8.24014563     6.2500000
Average Scores Were Used for Ties

Kruskal-Wallis Test (Chi-Square Approximation)
CHISQ = 5.5725   DF = 3   Prob > CHISQ = 0.1344

FIGURE 15.10 Output from SAS for the Kruskal-Wallis test applied to the data in Example 15.14. SAS uses the chi-square approximation to obtain a P-value.

As an option, SAS will calculate the exact P-value for the Kruskal-Wallis test. The result for Example 15.14 is P = 0.1299. This result required more than an hour of computing time. Fortunately, the chi-square approximation is quite accurate. The ordinary ANOVA F test gives F = 1.73 with P = 0.2130. Although the practical conclusion is the same, ANOVA and Kruskal-Wallis do not agree closely in this example. The rank test is more reliable for these small samples with outliers.

SECTION 15.3 Summary

The Kruskal-Wallis test compares several populations on the basis of independent random samples from each population. This is the one-way analysis of variance setting.

The null hypothesis for the Kruskal-Wallis test is that the distribution of the response variable is the same in all the populations. The alternative hypothesis is that responses are systematically larger in some populations than in others.

The Kruskal-Wallis statistic $H$ can be viewed in two ways. It is essentially the result of applying one-way ANOVA to the ranks of the observations. It is also a comparison of the sums of the ranks for the several samples.

When the sample sizes are not too small and the null hypothesis is true, $H$ for comparing $I$ populations has approximately the chi-square distribution with $I - 1$ degrees of freedom. We use this approximate distribution to obtain P-values.
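The hand calculation of the Kruskal-Wallis statistic for the corn yield data can be checked with a short script. The following is a minimal sketch in plain Python, not the method used by SAS or any other package. The function names `average_ranks` and `kruskal_wallis` are hypothetical, chosen for illustration. The tie adjustment shown is the usual textbook correction, which divides $H$ by $1 - \sum (t^3 - t)/(N^3 - N)$ where $t$ is the size of each group of tied values; the text does not state the formula SAS uses, so treating it as the same adjustment is an assumption.

```python
from collections import Counter, defaultdict

# Corn yields (bushels per acre) from Example 15.13, keyed by weeds per meter.
yields = {
    0: [166.7, 172.2, 165.0, 176.9],
    1: [166.2, 157.3, 166.7, 161.1],
    3: [158.6, 176.4, 153.1, 156.0],
    9: [162.8, 142.4, 162.7, 162.4],
}


def average_ranks(values):
    """Map each distinct value to its rank, averaging the ranks of ties."""
    positions = defaultdict(list)
    for position, value in enumerate(sorted(values), start=1):
        positions[value].append(position)
    return {value: sum(p) / len(p) for value, p in positions.items()}


def kruskal_wallis(groups):
    """Return (H, tie-adjusted H, rank sums) for a dict of samples."""
    pooled = [y for ys in groups.values() for y in ys]
    n_total = len(pooled)
    rank_of = average_ranks(pooled)

    # R_i: sum of the pooled ranks that fall in each group.
    rank_sums = {g: sum(rank_of[y] for y in ys) for g, ys in groups.items()}

    # H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
    total = sum(rank_sums[g] ** 2 / len(ys) for g, ys in groups.items())
    h = 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

    # Assumed standard tie correction: 1 - sum(t^3 - t) / (N^3 - N).
    tie_counts = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in tie_counts) / (n_total ** 3 - n_total)
    return h, h / correction, rank_sums


h_stat, h_adjusted, rank_sums = kruskal_wallis(yields)
print(rank_sums)             # rank sums 52.5, 33.5, 25.0, 25.0
print(round(h_stat, 2))      # 5.56, as in the hand calculation
print(round(h_adjusted, 4))  # 5.5725, matching the SAS output
```

With a single tied pair among 16 observations the correction factor is very close to 1, which is why the adjusted and unadjusted values differ only in the second decimal place. A P-value would still come from the chi-square table with $I - 1 = 3$ degrees of freedom, as in the example.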
SECTION 15.3 Exercises

Statistical software is needed to do these exercises without unpleasant hand calculations. If you do not have access to software, find the Kruskal-Wallis statistic H by hand and use the chi-square table to get approximate P-values.

15.22 Does bread lose its vitamins when stored? Here are data on the vitamin C content (milligrams per hundred grams of flour) in bread baked from the same recipe and stored for 1, 3, 5, or 7 days.¹⁷ The 10 observations are from 10 different loaves of bread.

Condition                  Vitamin C (mg/100 g)
Immediately after baking   47.62   49.79
One day after baking       40.45   43.46
Three days after baking    21.25   22.34
Five days after baking     13.18   11.65
Seven days after baking     8.51    8.13

The loss of vitamin C over time is clear, but with only 2 loaves of bread for each storage time we wonder if the differences among the groups are significant.
(a) Use the Kruskal-Wallis test to assess significance, then write a brief summary of what the data show.
(b) Because there are only 2 observations per group, we suspect that the common chi-square approximation to the distribution of the Kruskal-Wallis statistic may not be accurate. The exact P-value (from the SAS software) is P = 0.0011. Compare this with your P-value from (a). Is the difference large enough to affect your conclusion?

15.23 Many studies suggest that exercise causes bones to get stronger. One study examined the effect of jumping on the bone density of growing rats. Ten rats were assigned to each of three treatments: a 60-centimeter "high jump," a 30-centimeter "low jump," and a control group with no jumping. Here are the bone densities (in milligrams per cubic centimeter) after 8 weeks of 10 jumps per day:¹⁸

Group       Bone density (mg/cm³)
Control     611 621 614 593 593 653 600 554 603 569
Low jump    635 605 638 594 599 632 631 588 607 596
High jump   650 622 626 626 631 622 643 674 643 650

(a) The study was a randomized comparative experiment. Outline the design of this experiment.
(b) Make side-by-side stemplots for the three groups, with the stems
lined up for easy comparison. The distributions are a bit irregular but not strongly nonnormal. We would usually use analysis of variance to assess the significance of the difference in group means.
(c) Do the Kruskal-Wallis test. Explain the distinction between the hypotheses tested by Kruskal-Wallis and ANOVA.
(d) Write a brief statement of your findings. Include a numerical comparison of the groups as well as your test result.

15.24 To detect the presence of harmful insects in farm fields, we can put up boards covered with a sticky material and examine the insects trapped on the boards. Which colors attract insects best? Experimenters placed six boards of each of four colors at random locations in a field of oats and measured the number of cereal leaf beetles trapped. Here are the data:¹⁹

Color          Insects trapped
Lemon yellow   45 59 48 46 38 47
White          21 12 14 17 13 17
Green          37 32 15 25 39 41
Blue           16 11 20 21 14  7

Because the samples are small, we will apply a nonparametric test.
(a) What hypotheses does ANOVA test? What hypotheses does Kruskal-Wallis test?
(b) Find the median number of beetles trapped by boards of each color. Which colors appear more effective? Use the Kruskal-Wallis test to see if there are significant differences among the colors. What do you conclude?

15.25 Exercise 15.24 gives data on the counts of insects attracted by boards of four different colors. Carry out the Kruskal-Wallis test by hand, following these steps.
(a) What are $I$, the $n_i$, and $N$?
(b) Arrange the counts in order and assign ranks. Be careful about ties. Find the sum of the ranks $R_i$ for each color.
(c) Calculate the Kruskal-Wallis statistic $H$. How many degrees of freedom should you use for the chi-square approximation to its null distribution? Use the chi-square table to give an approximate P-value.

15.26 Here are the breaking strengths (in pounds) of strips of polyester fabric buried in the ground for several lengths of time:²⁰

Time       Breaking strength
2 weeks    118  126  126  120  129
4 weeks    130  120  114  126  128
8 weeks    122  136  128  146  140
16 weeks   124   98  110  140  110

Breaking strength is a good measure of the extent to which the fabric has decayed.
(a) Find the standard deviations of the 4 samples. They do not meet our rule of thumb for applying ANOVA. In addition, the sample buried for 16 weeks contains an outlier. We will use the Kruskal-Wallis test.
(b) Find the medians of the four samples. What are the hypotheses for the Kruskal-Wallis test, expressed in terms of medians?
(c) Carry out the test and report your conclusion.

15.27 Example 15.6 describes a study of the attitudes of people attending outdoor fairs about the safety of the food served at such locations. The full data set is available on the text CD and Web site as the file eg15006.dat. It contains the responses of 303 people to several questions. The variables in this data set are (in order)

subject hfair sfair sfast srest gender

The variable sfair contains responses to the safety question described in Example 15.6. The variables srest and sfast contain responses to the same question asked about food served in restaurants and in fast-food chains. Explain carefully why we cannot use the Kruskal-Wallis test to see if there are systematic differences in perceptions of food safety in these three locations.

15.28 In Exercise 15.7 you compared the number of tree species in plots of land in a tropical rainforest that had never been logged with similar plots nearby that had been logged 8 years earlier. The researchers also counted species in plots that had been logged just 1 year earlier. Here are the counts of species:

Plot type            Species count
Unlogged             22 18 22 20 15 21 13 13 19 13 19 15
Logged 1 year ago    11 11 14  7 18 15 15 12 13  2 15  8
Logged 8 years ago   17  4 18 14 18 15 15 10 12

(a) Use side-by-side stemplots to compare the distributions of number of species per plot for the three groups of plots. Are there features that might prevent use of ANOVA? Also give the median number of species per plot in the three groups.
(b)
Use the Kruskal-Wallis test to compare the distributions of species counts. State hypotheses, the test statistic and its P-value, and your conclusions.

15.29 In a study of heart disease in male federal employees, researchers classified 356 volunteer subjects according to their socioeconomic status (SES) and their smoking habits. There were three categories of SES: high, middle, and low. Individuals were asked whether they were current smokers, former smokers, or had never smoked. Here are the data, as a two-way table of counts:²²

SES      Never (1)   Former (2)   Current (3)
High     68          92           51
Middle    9          21           22
Low      22          28           43

The data for all 356 subjects are stored in the file ex1529.dat on the text CD and Web site. Smoking behavior is stored numerically as 1, 2, or 3 using the codes given in the column headings above.
(a) Higher SES people in the United States smoke less as a group than lower SES people. Do these data show a relationship of this kind? Give percents that back your statements.
(b) Apply the chi-square test to see if there is a significant relationship between SES and smoking behavior.
(c) The chi-square test ignores the ordering of the responses. Use the Kruskal-Wallis test (with many ties) to test the hypothesis that some SES classes smoke systematically more than others.

CHAPTER 15 Exercises

15.30 Exercise 14.49 (page 14-59) presents data on the time required for the telephone company Verizon to respond to repair calls from its own customers and from customers of a CLEC, another phone company that pays Verizon to use its local lines. Here are the data, which are rounded to the nearest hour:

Verizon
1 1 1 1 2 2
1 1 1 1 2 2
1 1 1 1 2 2
1 1 1 1 2 3
1 1 1 1 2 3
1 1 1 1 2 3
1 1 1 1 2 3
1 1 1 1 2 3
1 1 1 1 2 3
1 1 1 1 2 4
1 1 1 1 2 5
1 1 1 1 2 5
1 1 1 1 2 6
1 1 1 1 2 8
1 1 1 1 2 15
1 1 1 2 2

CLEC
1 1 5 5 5 1 5 5 5 5

(a) Does Verizon appear to give CLEC customers the same level of service as its own customers? Compare the data using graphs and descriptive measures and express your opinion.
(b) We would
like to see if times are significantly longer for CLEC customers than for Verizon customers. Why would you hesitate to use a t test for this purpose? Carry out a rank test. What can you conclude?

15.31 Exercise 7.140 (page 532) reports data on the selling prices of 9 four-bedroom houses and 28 three-bedroom houses in West Lafayette, Indiana. We wonder if there is a difference between the average prices of three- and four-bedroom houses in this community.
[Illegible in the source scan: the final exercises of the chapter and the notes preceding Note 13.]

13. These data were collected as part of a larger study of dementia patients conducted by Nancy Edwards, School of Nursing, and Alan Beck, School of Veterinary Medicine, Purdue University.

14. L. Friedland et al., "Effect of educational program on compliance with glove use in a pediatric emergency department," American Journal of Diseases of Childhood, 146 (1992), pp. 1355–1358.

15. James A. Levine et al., "Role of nonexercise activity thermogenesis in resistance to fat gain in humans," Science, 283 (1999), pp. 212–214. Data for this study are available from the Science Web site, www.sciencemag.org.

16. See Note 1.

17. Data provided by Helen Park. See H. Park et al., "Fortifying bread with each of three antioxidants," Cereal Chemistry, 74 (1997), pp. 202–206.

18. Data provided by Jo Welch, Purdue University Department of Foods and Nutrition.

19. Modified from M. C. Wilson and R. E. Shade, "Relative attractiveness of various luminescent colors to the cereal leaf beetle and the meadow spittlebug," Journal of Economic Entomology, 60 (1967), pp. 578–580.

20. See Note 5.

21. See Note 7.

22. Ray H. Rosenman et al., "A 4-year prospective study of the relationship of different habitual vocational physical activity to risk and incidence of ischemic heart disease in volunteer male federal employees," in P. Milvey (ed.), The Marathon: Physiological, Medical, Epidemiological and Psychological Studies, New York Academy of Sciences, 301 (1977), pp. 627–641.

23. Michael Wayne Peugh, "Field investigation of ventilation and air quality in duck and turkey slaughter plants," M.S. thesis, Purdue University, 1996.

24. We thank Ethan J. Temeles of Amherst College for providing the data. His work is described in Ethan J. Temeles and W. John Kress, "Adaptation in a plant-hummingbird association," Science, 300 (2003), pp. 630–633.

25. Based on A. A. Adish et al., "Effect of consumption of food cooked in iron pots on iron status and growth of young children: a randomised trial," The Lancet, 353 (1999), pp. 712–716.

26. For more details on multiple comparisons, see M. Hollander and D. A. Wolfe, Nonparametric Statistical Methods, 2nd ed., Wiley, 1999. This book is a useful reference on applied aspects of nonparametric inference in general.
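
The exercise on SES and smoking habits asks for a Kruskal-Wallis test when the response takes only three values, so almost every observation is tied. The following is a minimal Python sketch of that calculation, working directly from the table of counts (High 68, 92, 51; Middle 9, 21, 22; Low 22, 28, 43). The function name `kruskal_wallis_from_counts` is a hypothetical helper introduced here for illustration, and the sketch assumes the usual midrank treatment of ties, the standard tie correction, and the chi-square approximation for the P-value. It is not taken from the text.

```python
import math

# Counts from the two-way table: rows are SES groups, columns are the
# smoking codes 1 (never), 2 (former), 3 (current).
counts = {
    "High":   [68, 92, 51],
    "Middle": [9, 21, 22],
    "Low":    [22, 28, 43],
}

def kruskal_wallis_from_counts(table):
    """Return (H corrected for ties, degrees of freedom) for ordered categories."""
    n_levels = len(next(iter(table.values())))
    level_totals = [sum(row[j] for row in table.values()) for j in range(n_levels)]
    n_total = sum(level_totals)

    # Every observation in a level shares the midrank of the ranks it occupies.
    midranks, start = [], 0
    for t in level_totals:
        midranks.append(start + (t + 1) / 2)
        start += t

    # H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
    total = 0.0
    for row in table.values():
        rank_sum = sum(c * r for c, r in zip(row, midranks))
        total += rank_sum ** 2 / sum(row)
    h = 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

    # Standard tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    correction = 1 - sum(t ** 3 - t for t in level_totals) / (n_total ** 3 - n_total)
    return h / correction, len(table) - 1

h_stat, df = kruskal_wallis_from_counts(counts)
# Assumption: with df = 2 the chi-square upper tail has the closed form exp(-x / 2).
p_value = math.exp(-h_stat / 2)
print(f"H = {h_stat:.2f}, df = {df}, approximate P = {p_value:.4f}")
```

The closed-form tail probability applies only because three groups give 2 degrees of freedom; with a different number of groups a general chi-square tail function would be needed instead.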



