# Statistical Methods and Computing 22S 105


These 8 pages of class notes were uploaded by Cullen Conn on Friday, October 23, 2015. They belong to 22S 105 at the University of Iowa, taught by Mary Cowles in Fall.


Date Created: 10/23/15

### 22S:105 Statistical Methods and Computing

Contingency Tables and the Chi-Square Test; Introduction to ANOVA

Lecture 21, Apr. 10, 2006

Kate Cowles, 374 SH, 335-0727, kcowles@stat.uiowa.edu

#### The chi-square test for differences among more than 2 proportions

We are interested in the independent-samples case.

Example:

- A study investigated the accuracy of death certificates by comparing the results of 575 autopsies to the causes of death listed on the certificates.
- Two hospitals participated in the study:
  - a community hospital, labeled A
  - a university hospital, labeled B
- Three possible cases:
  - death certificate confirmed accurate
  - death certificate contained inaccuracies but did not require recoding of the underlying cause of death
  - death certificate incorrect and required recoding of the underlying cause of death
- Question of interest: Are there differences between the two hospitals with respect to practices in completing death certificates?
- One way to address the question: test the null hypothesis that, within each category of death certificate status, the proportions of death certificates coming from Hospital A are the same.

#### Another multiple comparisons problem

Let $p_C$, $p_I$, and $p_R$ be the population proportions of certificates coming from Hospital A among confirmed-accurate (C), inaccurate-but-not-recoded (I), and incorrect-and-recoded (R) certificates.

$$H_0: p_C = p_I = p_R$$

$$H_a: p_C \ne p_I \ \text{or}\ p_C \ne p_R \ \text{or}\ p_I \ne p_R$$

- We will first test whether there are any significant differences.
- Only if we reject $H_0$ in the overall test will we do pairwise tests to find out which population proportions are different.

#### Results

| Status | Hospital A | Hospital B | Total |
|---|---|---|---|
| Confirmed accurate | 157 | 268 | 425 |
| Inaccurate, no recoding | 18 | 44 | 62 |
| Incorrect, recoding | 54 | 34 | 88 |
| Total | 229 | 346 | 575 |

The overall sample proportion of death certificates from Hospital A is

$$\frac{229}{575} = 0.398$$

If $H_0$ is true, we would expect this same proportion of Hospital A certificates in all three categories.

#### Observed and expected counts

| Status | Observed, Hospital A | Observed, Hospital B | Expected, Hospital A | Expected, Hospital B |
|---|---|---|---|---|
| Accurate | 157 | 268 | 169.3 | 255.7 |
| Inaccurate | 18 | 44 | 24.7 | 37.3 |
| Recode | 54 | 34 | 35.0 | 53.0 |

Each expected count is (row total × column total) / 575, and the chi-square statistic sums (observed − expected)² / expected over the six cells. Using the rounded expected counts above, the chi-square statistic is $X^2 = 21.62$.

- $r = 3$ rows
- $c = 2$ columns
- So the degrees of freedom is $(r - 1)(c - 1) = 2 \cdot 1 = 2$.

According to Table E, the .05 cutoff under a chi-square distribution with 2 df is 5.99. We can reject $H_0$ because $21.62 > 5.99$. The p-value is < 0.001.

We conclude that the proportions of death certificates from Hospital A are not the same for the three different categories of certificate status.

#### This chi-square test in SAS

The data are entered as one line per cell, with the cell count used as a weight:

    options linesize = 72 ;

    data dthcert ;
    input hosp $ status $ count ;
    datalines ;
    A C 157
    A I 18
    A R 54
    B C 268
    B I 44
    B R 34
    ;

    proc freq data = dthcert ;
    tables status * hosp / expected ;
    weight count ;
    run ;

    proc freq data = dthcert ;
    tables status * hosp / chisq ;
    weight count ;
    run ;

The first PROC FREQ step also prints the expected count in each cell. The remaining cell statistics are the same in both steps:

    TABLE OF STATUS BY HOSP

    STATUS     HOSP

    Frequency|
    Percent  |
    Row Pct  |
    Col Pct  |A       |B       |  Total
    ---------+--------+--------+
    C        |    157 |    268 |    425
             |  27.30 |  46.61 |  73.91
             |  36.94 |  63.06 |
             |  68.56 |  77.46 |
    ---------+--------+--------+
    I        |     18 |     44 |     62
             |   3.13 |   7.65 |  10.78
             |  29.03 |  70.97 |
             |   7.86 |  12.72 |
    ---------+--------+--------+
    R        |     54 |     34 |     88
             |   9.39 |   5.91 |  15.30
             |  61.36 |  38.64 |
             |  23.58 |   9.83 |
    ---------+--------+--------+
    Total         229      346      575
                39.83    60.17   100.00

    STATISTICS FOR TABLE OF STATUS BY HOSP

    Statistic                     DF     Value      Prob
    ------------------------------------------------------
    Chi-Square                     2    21.523     0.001
    Likelihood Ratio Chi-Square    2    21.189     0.001
    Mantel-Haenszel Chi-Square     1    12.864     0.001
    Phi Coefficient                      0.193
    Contingency Coefficient              0.190
    Cramer's V                           0.193

    Sample Size = 575

SAS reports 21.523 rather than 21.62 because it does not round the expected counts before computing the statistic.

The sample proportions are:

| Status | Hospital A | Hospital B | Proportion from Hospital A |
|---|---|---|---|
| Confirmed accurate | 157 | 268 | 0.369 |
| Inaccurate, no recoding | 18 | 44 | 0.290 |
| Incorrect, recoding | 54 | 34 | 0.614 |
| Total | 229 | 346 | 0.398 |

More advanced methods provide tests and confidence intervals to formalize analysis of which population proportions are significantly different.

#### Comparing more than two population means

Example: Does the presence of pets or friends affect responses to stress?

- Allen, Blascovich, Tomaka, and Kelsey (1988), Journal of Personality and Social Psychology
- subjects: 45 women who described themselves as dog lovers
- randomly assigned to three groups to do a stressful task:
  1. alone
  2. with a good friend present
  3. with their dog present
- Subjects' mean heart rate during the task was one measure of the effect of stress.

SAS descriptive statistics (GROUP is C for alone, F for friend, P for pet):

    Analysis Variable : BEATS

    ------------------------------ GROUP=C ------------------------------
     N          Mean       Std Dev       Minimum       Maximum
    15    82.5240667     9.2415747    62.6460000    99.0460000

    ------------------------------ GROUP=F ------------------------------
     N          Mean       Std Dev       Minimum       Maximum
    15    91.3251333     8.3411341    76.9080000   102.1540000

    ------------------------------ GROUP=P ------------------------------
     N          Mean       Std Dev       Minimum       Maximum
    15    73.4830667     9.9698202    58.6920000    97.5380000

Goal: to compare population means under three different treatments.

- a three-independent-sample problem
- Call the population mean heart rates $\mu_1$ for when pets are present, $\mu_2$ for when friends are present, and $\mu_3$ for when women perform the task alone. Then

$$H_0: \mu_1 = \mu_2 = \mu_3$$

$$H_a: \mu_1 \ne \mu_2 \ \text{or}\ \mu_1 \ne \mu_3 \ \text{or}\ \mu_2 \ne \mu_3$$

Note that $H_a$ is neither one-sided nor two-sided.

To infer about the three population means, we might use the two-independent-sample t test 3 times:

- Test $H_0: \mu_1 = \mu_2$ to see if mean heart rate when pet is present differs from mean when friend is present.
- Test $H_0: \mu_1 = \mu_3$ to see if mean heart rate when pet is present differs from mean when alone.
- Test $H_0: \mu_2 = \mu_3$ to see if mean heart rate when friend is present differs from mean when alone.

#### Problem with this approach

- 3 p-values for 3 different tests don't tell us how likely it is that three sample means are spread apart as far as these are.
- It might be that $\bar{x}_1 = 73.48$ and $\bar{x}_2 = 91.32$ are significantly different if we look at just 2 groups, but not significantly different if we know they are the smallest and largest means in 3 groups.
  - As more and more groups are considered, we expect the gap between the smallest and largest sample mean to get larger.
  - Imagine comparing heights of the shortest and tallest person in larger and larger groups of people.
- The probability of Type I error for the whole set of t-tests will be much bigger than the $\alpha$ level set for each one.

#### Multiple comparisons procedures in statistics

- Issue: how to do many comparisons at once with some overall measure of confidence in all our conclusions.
- Two steps:
  - overall test of whether there is good evidence of any differences among the parameters we wish to compare
  - follow-up analysis to decide which of the parameters differ and to estimate the size of the differences

#### Step one: One-Way Analysis of Variance (ANOVA)

- Step one: overall test for some difference among 3 or more population means.
- Uses an F test to compute a p-value.

Dogs, friends, and stress example:

    Analysis of Variance Procedure
    Class Level Information

    Class    Levels    Values
    GROUP         3    C F P

    Number of observations in data set = 45

    Analysis of Variance Procedure

    Dependent Variable: BEATS
                                  Sum of          Mean
    Source            DF         Squares        Square   F Value   Pr > F
    Model              2    2387.6889920  1193.8444960     14.08   0.0001
    Error             42    3561.2994916    84.79285
    Corrected Total   44    5948.9884836

    R-Square        C.V.     Root MSE    BEATS Mean
    0.401360    11.16915    9.2083030     82.444089

    Source            DF        Anova SS   Mean Square   F Value   Pr > F
    GROUP              2    2387.6889920  1193.8444960     14.08   0.0001

#### Main idea of ANOVA

What matters is how far apart the sample means are relative to the variability of individual observations.

- F statistic:

$$F = \frac{\text{variation among the sample means}}{\text{variation among individuals in the same sample}}$$

- Compare to a cutoff value in an F distribution.

Notation:

- $I$ = number of different populations whose means we are studying
- $n_i$ = number of observations in the sample from the $i$th population
- $N$ = total number of observations in all samples combined

#### F distributions

- There are many different F distributions, identified by two parameters:
  - numerator degrees of freedom: $I - 1$
  - denominator degrees of freedom: $N - I$

In the dogs, friends, and stress example, $I = 3$ and $N = 45$, which gives the 2 and 42 degrees of freedom shown in the SAS output.

#### Example

Do four varieties of tomato plant differ in mean yield? Agronomists grew 10 plants of each variety and recorded the yield of each plant in pounds of tomatoes.

What are:

- the populations of interest
- the variable of interest
- $I$
- each $n_i$
- the degrees of freedom for the ANOVA F statistic

#### Assumptions for One-Way ANOVA

- We have $I$ independent simple random samples, one from each of $I$ populations.
- Each population $i$ has a normal distribution with unknown mean $\mu_i$.
  - As with t-tests, if sample sizes are large enough in each sample, the Central Limit Theorem says inference based on sample means is OK even if population distributions are not exactly normal.
- All of the populations have the same standard deviation $\sigma$, unknown.
  - Unlike t-tests, there is no general procedure when population standard deviations are not assumed to be equal.
  - Rough rule of thumb: if the largest sample standard deviation is no more than twice the smallest sample standard deviation, then the population standard deviations probably are close enough to equal that the ANOVA procedure is OK.
  - This rule is very conservative.

#### Step two: individual t-tests with correction for multiple comparisons

This is the follow-up test.

- It should be carried out only if the F test from one-way ANOVA is significant at the chosen significance level.

Goal: to set the overall probability of committing a type I error at $\alpha$ when doing pairwise comparisons of $k$ different means.

- We will perform $\binom{k}{2}$ two-independent-sample t-tests.
- We will conduct each one at the significance level $\alpha / \binom{k}{2}$.
- This is called the Bonferroni correction.

Dogs, friends, and stress example:

- There are $k = 3$ samples, so there are $\binom{3}{2} = 3$ different pairs to compare.
- To get an overall significance level $\alpha = .05$ on all 3 tests considered together, we conduct each one at

$$\frac{.05}{3} = .0167$$

- That is, we would consider the difference between two population means to be significantly different from zero at the .05 level only if the p-value for the t-test for that pair was less than .0167.
- Equivalently, we could multiply the p-value from each t-test by 3. If the result was less than .05, we would consider the difference between two population means to be significantly different from zero at the .05 level.

SAS does the adjusting and prints a grouped list of the classes. Means with the same letter are not significantly different at the specified alpha level.

    Analysis of Variance Procedure

    Bonferroni (Dunn) T tests for variable: BEATS

    NOTE: This test controls the type I experimentwise error rate, but
          generally has a higher type II error rate than REGWQ.

    Alpha= 0.05  df= 42  MSE= 84.79285
    Critical Value of T= 2.49
    Minimum Significant Difference= 8.3847

    Means with the same letter are not significantly different.

    Bon Grouping       Mean     N   GROUP
               A     91.325    15   F
               B     82.524    15   C
               C     73.483    15   P

All three groups carry different letters, so every pair of means differs significantly at the overall .05 level.

#### One-way ANOVA in SAS

The full program reads the data, prints descriptive statistics by group, runs the overall F test, and then requests the Bonferroni comparisons:

    options linesize = 79 ;

    data pet ;
    infile 'temppetdat' ;
    input group $ beats ;
    run ;

    proc sort data = pet ; by group ;
    run ;

    proc means data = pet ; by group ;
    var beats ;
    run ;

    proc anova data = pet ;
    class group ;
    model beats = group ;
    run ;

    proc anova data = pet ;
    class group ;
    model beats = group ;
    means group / bon alpha = .05 ;
    run ;
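The headline numbers in this lecture can be checked outside SAS. The following is a minimal Python sketch, not part of the original notes, that recomputes the chi-square statistic for the death-certificate table, the ANOVA F statistic from the per-group summary statistics (N, mean, standard deviation) printed by SAS, and the Bonferroni per-test level. The function and variable names are illustrative assumptions, and the closed-form p-value used here holds only for a chi-square distribution with 2 degrees of freedom.

```python
import math

# Observed counts from the lecture: rows are certificate status,
# columns are (Hospital A, Hospital B).
observed = {
    "confirmed accurate": (157, 268),
    "inaccurate, no recoding": (18, 44),
    "incorrect, recoding": (54, 34),
}


def chi_square(table):
    """Return (statistic, df, expected counts) for an r x c table of counts."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(col_totals)
    expected = {}
    stat = 0.0
    for label, row in table.items():
        row_total = sum(row)
        # Expected count = row total * column total / grand total.
        exp_row = tuple(row_total * c / n for c in col_totals)
        expected[label] = exp_row
        stat += sum((o - e) ** 2 / e for o, e in zip(row, exp_row))
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df, expected


def anova_f(summary):
    """One-way ANOVA F statistic from (n, mean, std dev) per group."""
    n_total = sum(n for n, _, _ in summary.values())
    grand_mean = sum(n * m for n, m, _ in summary.values()) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in summary.values())
    ss_within = sum((n - 1) * s ** 2 for n, _, s in summary.values())
    df_between = len(summary) - 1
    df_within = n_total - len(summary)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


stat, df, expected = chi_square(observed)
# For df = 2 only, the chi-square upper-tail probability is exp(-x / 2).
p_value = math.exp(-stat / 2)

# Summary statistics copied from the SAS descriptive output (C = alone,
# F = friend present, P = pet present).
groups = {
    "C": (15, 82.5240667, 9.2415747),
    "F": (15, 91.3251333, 8.3411341),
    "P": (15, 73.4830667, 9.9698202),
}
f_stat, df_num, df_den = anova_f(groups)

# Bonferroni: split the overall alpha across all pairs of groups.
n_pairs = math.comb(len(groups), 2)
bonferroni_alpha = 0.05 / n_pairs

print(f"chi-square = {stat:.3f} on {df} df, p = {p_value:.2g}")
print(f"F = {f_stat:.2f} on ({df_num}, {df_den}) df")
print(f"{n_pairs} pairs, per-test level = {bonferroni_alpha:.4f}")
```

Working from summary statistics rather than the raw heart-rate data is a deliberate choice here, since the notes print only the summaries. The within-group sum of squares is recovered as the sum of $(n_i - 1)s_i^2$ over the three groups.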
