# 215 Class Note for STAT 30100 with Professor Sorola at Purdue

This 14-page set of class notes was uploaded by an elite notetaker on Friday, February 6, 2015. The notes belong to a course at Purdue University taught by a professor in the fall. Since upload, they have received 15 views.

Date Created: 02/06/15

## Chapter 12: One-Way Analysis of Variance (ANOVA)

One-way analysis of variance is used when you want to compare more than two means. It is a technique that generalizes the two-sample t procedure, which compares two means. Like the two-sample t test, it is robust and useful.

Examples:

1. The presence of harmful insects in farm fields is detected by erecting boards covered with a sticky material and then examining the insects trapped on the boards. To investigate which colors are most attractive to cereal leaf beetles, researchers placed six boards of each of four colors in a field of oats in July.
2. An ecologist is interested in comparing the concentration of the pollutant cadmium in five streams. She collects 50 water specimens for each stream and measures the concentration of cadmium in each specimen.

Note: The first example is an experiment with four treatments (the colors), and the second example is an observational study in which the concentration of cadmium is compared among the five streams. In both cases we can use ANOVA to compare the mean responses.

We will use the F statistic to compare the variation among the means of several groups with the variation within the groups. In the ANOVA test, an SRS is drawn from each population, and the data are used to test the null hypothesis that the population means are all equal against the alternative that not all are equal. If we reject the null hypothesis, we need to perform some further analysis to draw conclusions about which population means differ.

### Assumptions of ANOVA

1. The data are normally distributed.
2. The population standard deviations are equal.

### The One-Way ANOVA Model

The one-way ANOVA model is

$$x_{ij} = \mu_i + \varepsilon_{ij}$$

for $i = 1, \dots, I$ and $j = 1, \dots, n_i$. The $\varepsilon_{ij}$ are assumed to be from an $N(0, \sigma)$ distribution. The parameters of the model are the population means $\mu_1, \mu_2, \dots, \mu_I$ and the common standard deviation $\sigma$.

Note: $I$ = the number of groups, $N$ = the total sample size, $n_i$ = the sample size for group $i$.

### Example 1

The strength of concrete depends upon the formula used to prepare it. One study compared five
different mixtures. Six batches of each mixture were prepared, and the strength of the concrete made from each batch was measured.

(a) What is the response variable?

(b) Give the values for the $n_i$ and $N$.

### Estimating the Population Parameters

The unknown parameters in the statistical model for ANOVA are the $I$ population means $\mu_i$ and the common standard deviation $\sigma$.

- To estimate $\mu_i$, we use the sample mean for the $i$th group, $\bar{x}_i$.
- To estimate $\sigma$, the common standard deviation, we use the pooled estimator $s_p$ described below.

Our second assumption in the ANOVA model was that the population standard deviations are all equal. A formal test is not recommended, so we use the following rule of thumb: if the largest sample standard deviation is less than twice the smallest sample standard deviation, we can use methods based on the assumption of equal standard deviations, and our results will still be approximately correct.

### Pooled Estimator of $\sigma$

If we assume all the population standard deviations are equal, each $s_i$ is an estimate of $\sigma$. We combine these into a pooled estimator of $\sigma$:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_I - 1)s_I^2}{(n_1 - 1) + (n_2 - 1) + \cdots + (n_I - 1)}$$

In the SPSS ANOVA output, $s_p = \sqrt{MSE}$.

The best way to approach a problem that involves comparing more than two groups is as follows:

1. Find an estimate for the mean and standard deviation for each group and plot the means on a graph.
2. Find the five-number summary for each group and do side-by-side boxplots to see how much overlap there is between the groups.
3. Run ANOVA.

### Discussion of ANOVA

#### Testing Hypotheses in One-Way ANOVA

- State the null and alternative hypotheses:
  $H_0: \mu_1 = \mu_2 = \cdots = \mu_I$
  $H_a$: not all the $\mu_i$ are equal (at least one is different).
- Find the test statistic $F = MSG / MSE$. The F statistic has the $F(I - 1, N - I)$ distribution.
- Find the P-value on the printout.
- Compare the P-value to the $\alpha$ level. If P-value $\le \alpha$, reject $H_0$. If P-value $> \alpha$, fail to reject $H_0$.
- State your conclusions in terms of the problem.

#### The ANOVA Output (see p. 736 for more detail)

| Source | Sum of Squares | Degrees of Freedom | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | SSG | DFG = I - 1 | MSG = SSG / DFG | MSG / MSE | P-value |
| Within Groups (Error) | SSE | DFE = N - I | MSE = SSE / DFE | | |
| Total | SST | DFT = N - 1 | MST = SST / DFT | | |

Note: $N$ is the total number of observations (the sum of all the $n_i$).

The coefficient of determination is $R^2 = SSG / SST$. $R^2$ is the percent of the variation in the data that is accounted for by the FIT part of the model.

For Example 1, answer the following questions:

(c) What are the degrees of freedom for the model, for error, and for the total?

(d) State the null and alternative hypotheses.

(e) Give the numerator and denominator degrees of freedom for the F statistic.

### Example 2 (from Moore and McCabe, 4th edition)

The presence of harmful insects in farm fields is detected by erecting boards covered with a sticky material and then examining the insects trapped on the boards. To investigate which colors are most attractive to cereal leaf beetles, researchers placed six boards of each of four colors in a field of oats in July. The table below gives data on the number of cereal leaf beetles trapped.

| Color | Insects trapped |
|---|---|
| Lemon yellow | 45, 59, 48, 46, 38, 47 |
| White | 21, 12, 14, 17, 13, 17 |
| Green | 37, 32, 15, 25, 39, 41 |
| Blue | 16, 11, 20, 21, 14, 7 |

Write hypotheses.

Using SPSS: Enter the data into SPSS in vertical columns with the following labels: color, number_trapped, and treatment. All the counts are listed in one long column (number_trapped). The color column is where you list lemon yellow, white, green, and blue. Treatment is a numerical way of describing your group: make lemon yellow 1, white 2, green 3, and blue 4. The One-Way ANOVA dialog needs a numerical column for the Factor box.

1. Identify the response variable, the $n_i$, $N$, and $I$ for this study.

2. Make a table giving the mean and standard deviation for each color group. Make a graph of the means. Is it reasonable to pool the variances?

Using SPSS: Analyze > Compare Means > Means. Move number_trapped into the Dependent List box. Move treatment into the Independent List box. Click the Options box to get your summary statistics. Click OK.

Note: lemon yellow = 1, white = 2, green = 3, and blue = 4.

Report: number trapped

| treatment | Mean | Std. Deviation | Minimum | Maximum | Median | N |
|---|---|---|---|---|---|---|
| 1 | 47.17 | 6.795 | 38 | 59 | 46.50 | 6 |
| 2 | 15.67 | 3.327 | 12 | 21 | 15.50 | 6 |
| 3 | 31.50 | 9.915 | 15 | 41 | 34.50 | 6 |
| 4 | 14.83 | 5.345 | 7 | 21 | 15.00 | 6 |
| Total | 27.29 | 14.948 | 7 | 59 | 21.00 | 24 |

Plot of the means: To get this plot, do a scatterplot of treatment against the means above. You can use the chart editor to draw the lines from point to point. Alternatively, you can get this plot when running the one-way ANOVA.

[Figure: plot of the mean number trapped for each color treatment (1 to 4)]

3. Do side-by-side boxplots for each group.

Using SPSS to get the boxplots: Graphs > Boxplot > Define. Move number_trapped into the Variable box. Move color into the Category Axis box. Click OK.

[Figure: side-by-side boxplots of number trapped by color]

4. Run the analysis of variance. Write the hypotheses, and report the F statistic and the P-value. Write your conclusions.

Using SPSS: Analyze > Compare Means > One-Way ANOVA. Move number_trapped into the Dependent List box. Move treatment into the Factor box. If you want a plot of the means, click Options and select Means Plot. Click OK.

ANOVA: number trapped

| | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 4218.458 | 3 | 1406.153 | 30.552 | .000 |
| Within Groups | 920.500 | 20 | 46.025 | | |
| Total | 5138.958 | 23 | | | |

5. What is the estimate for the population standard deviation?

6. What is $R^2$?

### Example 3 (from Moore and McCabe, 4th edition)

Recommendations regarding how long infants in developing countries should be breastfed are controversial. If the nutritional quality of the breast milk is inadequate because the mothers are malnourished, then there is risk of inadequate nutrition for the infant. On the other hand, the introduction of other foods carries the risk of infection from contamination. Further complicating the situation is the fact that companies that produce infant formulas and other foods benefit when these foods are consumed by large numbers of customers. One question related to this controversy concerns the amount of energy intake for infants who have other foods introduced into the diet at different ages. Part of one study compared the energy intakes
measured in kilocalories per day (kcal/d) for infants who were breastfed exclusively for 4, 5, or 6 months. The data are below.

| Breastfed for | Energy intake (kcal/d) |
|---|---|
| 4 months | 499, 620, 469, 485, 660, 588, 675, 517, 649, 209, 404, 738, 628, 609, 617, 704, 558, 653, 548 |
| 5 months | 490, 395, 402, 177, 475, 617, 616, 587, 528, 518, 370, 431, 518, 639, 368, 538, 519, 506 |
| 6 months | 585, 647, 477, 445, 485, 703, 528, 465 |

1. Identify the response variable, the $n_i$, $N$, and $I$ for this study.

2. Make a table giving the sample size, mean, and standard deviation for each group of infants. Is it reasonable to pool the variances?

Report: Energy

| Time | Mean | N | Std. Deviation | Median | Minimum | Maximum |
|---|---|---|---|---|---|---|
| BF4 | 570.00 | 19 | 122.958 | 609.00 | 209 | 738 |
| BF5 | 483.00 | 18 | 112.948 | 512.00 | 177 | 639 |
| BF6 | 541.88 | 8 | 93.963 | 506.50 | 445 | 703 |
| Total | 530.20 | 45 | 118.906 | 528.00 | 177 | 738 |

3. Show side-by-side boxplots for the 3 groups.

[Figure: side-by-side boxplots of energy intake by Time (BF4, BF5, BF6)]

4. Run the analysis of variance. Report the F statistic and P-value. Write the hypotheses for your test. What do you conclude?

ANOVA: Energy

| | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 71288.325 | 2 | 35644.163 | 2.718 | .078 |
| Within Groups | 550810.53 | 42 | 13114.545 | | |
| Total | 622099.2 | 44 | | | |

5. What is the estimate for the population standard deviation?

6. What is $R^2$?

### Multiple Comparisons

Multiple comparisons are used when:

- The means differ (the ANOVA's $H_0$ is rejected).
- We are unable to formulate specific questions in advance of the analysis.

To perform a multiple comparison procedure, compute the t statistic for all pairs of means using the formula

$$t_{ij} = \frac{\bar{x}_i - \bar{x}_j}{s_p \sqrt{\dfrac{1}{n_i} + \dfrac{1}{n_j}}}$$

If $|t_{ij}| \ge t^{**}$, we declare that the population means $\mu_i$ and $\mu_j$ are different. Otherwise, we conclude that the data do not distinguish between them. The value of $t^{**}$ depends on which multiple comparison procedure we choose.

We will use Bonferroni's multiple comparison procedure. SPSS will do this for you and give you the P-value. All pairs of means are compared using the formula above. For example, if you have three means, $\mu_1$ will be compared to $\mu_2$ and $\mu_3$, and $\mu_2$ will be compared to $\mu_3$.
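The pairwise t statistics can be checked by hand. The following is a minimal sketch, not the SPSS procedure itself: it applies the $t_{ij}$ formula above to the beetle counts from Example 2 using only the Python standard library. The names `groups`, `pooled_sd`, and `pairwise_t` are illustrative choices, and the critical value $t^{**}$ is deliberately not computed, since it depends on the multiple comparison procedure chosen.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

# Beetle counts from Example 2, keyed by board color.
groups = {
    "lemon yellow": [45, 59, 48, 46, 38, 47],
    "white": [21, 12, 14, 17, 13, 17],
    "green": [37, 32, 15, 25, 39, 41],
    "blue": [16, 11, 20, 21, 14, 7],
}


def pooled_sd(samples):
    """Pooled estimate s_p of the common standard deviation."""
    numerator = sum((len(x) - 1) * variance(x) for x in samples)
    denominator = sum(len(x) - 1 for x in samples)
    return sqrt(numerator / denominator)


def pairwise_t(named_samples):
    """t_ij = (xbar_i - xbar_j) / (s_p * sqrt(1/n_i + 1/n_j)) for every pair."""
    sp = pooled_sd(list(named_samples.values()))
    stats = {}
    for (a, xa), (b, xb) in combinations(named_samples.items(), 2):
        std_error = sp * sqrt(1 / len(xa) + 1 / len(xb))
        stats[(a, b)] = (mean(xa) - mean(xb)) / std_error
    return stats


t_stats = pairwise_t(groups)
for pair, t in t_stats.items():
    print(pair, round(t, 3))
```

Each $|t_{ij}|$ printed here would then be compared with the $t^{**}$ value for the chosen procedure. Because every group has six boards, the standard error is the same for every pair.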
Another approach is to compare means using simultaneous confidence intervals for the difference between means. Simultaneous confidence intervals for all differences $\mu_i - \mu_j$ between population means have the form

$$(\bar{x}_i - \bar{x}_j) \pm t^{**} s_p \sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$$

The critical values $t^{**}$ are the same as those used for the multiple comparison procedure chosen.

Note: If the confidence interval includes the value 0, then that pair of means will not be declared significantly different, and vice versa.

### Example 4

Going back to Example 2, use Bonferroni's multiple comparison procedure to determine which pairs of means differ significantly. Summarize your results in a short report.

Using SPSS: Analyze > Compare Means > One-Way ANOVA. Move number_trapped into the Dependent List box. Move treatment into the Factor box. Click the Post Hoc box. Click Bonferroni. Click Continue. Click OK.

Remember: lemon yellow = 1, white = 2, green = 3, and blue = 4.

Multiple Comparisons. Dependent variable: number trapped. Bonferroni.

| (I) treatment | (J) treatment | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|
| 1 | 2 | 31.500* | 3.917 | .000 | 20.03 | 42.97 |
| 1 | 3 | 15.667* | 3.917 | .004 | 4.20 | 27.13 |
| 1 | 4 | 32.333* | 3.917 | .000 | 20.87 | 43.80 |
| 2 | 1 | -31.500* | 3.917 | .000 | -42.97 | -20.03 |
| 2 | 3 | -15.833* | 3.917 | .004 | -27.30 | -4.37 |
| 2 | 4 | .833 | 3.917 | 1.000 | -10.63 | 12.30 |
| 3 | 1 | -15.667* | 3.917 | .004 | -27.13 | -4.20 |
| 3 | 2 | 15.833* | 3.917 | .004 | 4.37 | 27.30 |
| 3 | 4 | 16.667* | 3.917 | .002 | 5.20 | 28.13 |
| 4 | 1 | -32.333* | 3.917 | .000 | -43.80 | -20.87 |
| 4 | 2 | -.833 | 3.917 | 1.000 | -12.30 | 10.63 |
| 4 | 3 | -16.667* | 3.917 | .002 | -28.13 | -5.20 |

\* The mean difference is significant at the .05 level.

### Example 5

Going back to Example 3, explain why you do not need to use a multiple comparison procedure for these data.
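The entries of a one-way ANOVA table can also be recomputed directly from the raw data. The sketch below is an illustration under stated assumptions rather than a substitute for the SPSS output: it uses the beetle counts from Example 2, only the Python standard library, and a hypothetical helper named `one_way_anova`. It reports the sums of squares, degrees of freedom, F statistic, pooled standard deviation, $R^2$, and the ratio of the largest to the smallest group standard deviation used in the rule of thumb. It does not compute the P-value, which requires the F distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Beetle counts from Example 2, in treatment order 1 to 4.
samples = [
    [45, 59, 48, 46, 38, 47],  # lemon yellow
    [21, 12, 14, 17, 13, 17],  # white
    [37, 32, 15, 25, 39, 41],  # green
    [16, 11, 20, 21, 14, 7],   # blue
]


def one_way_anova(samples):
    """Return the main quantities of a one-way ANOVA table as a dict."""
    all_obs = [x for s in samples for x in s]
    grand_mean = mean(all_obs)
    n_groups, n_total = len(samples), len(all_obs)

    # Variation among group means (SSG) and within groups (SSE).
    ssg = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    sse = sum((x - mean(s)) ** 2 for s in samples for x in s)
    sst = ssg + sse

    dfg, dfe = n_groups - 1, n_total - n_groups
    msg, mse = ssg / dfg, sse / dfe
    sds = [stdev(s) for s in samples]

    return {
        "SSG": ssg, "SSE": sse, "SST": sst,
        "DFG": dfg, "DFE": dfe, "DFT": n_total - 1,
        "MSG": msg, "MSE": mse,
        "F": msg / mse,
        "s_p": sqrt(mse),              # pooled standard deviation
        "R2": ssg / sst,               # coefficient of determination
        "sd_ratio": max(sds) / min(sds),  # rule of thumb: want this below 2
    }


result = one_way_anova(samples)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The sums of squares, mean squares, and F statistic should agree with the SPSS ANOVA table for Example 2 up to rounding. If SciPy happens to be installed, `scipy.stats.f_oneway(*samples)` is one way to obtain the same F statistic together with a P-value.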
