# 721 Note 17 for STAT 100 at PSU

Date Created: 02/06/15
Feb 18

## Statistic for the day

- Average annual beer consumption of American college students: almost 4 billion cans
- Estimated percentage of the freshman class nationwide that will drop out for alcohol-related reasons: 7%

## Assignment

Read Chapter 12, pp. 235-238. Exercises 9, 11, 15, 17.

## Exercise 1

Follow the 4 steps and answer the research question: Was there a relationship between sex and ownership of cell phones among STAT 100 students in 2004?

Data (rows: sex; columns: cell phone):

| | No | Yes | All |
|---|---|---|---|
| Female | 12 | 124 | 136 |
| Male | 14 | 87 | 101 |
| All | 26 | 211 | 237 |

Counts and percents, Spring 2004 (rows: sex; columns: cell phone):

| | No | Yes | All |
|---|---|---|---|
| Female (count) | 12 | 124 | 136 |
| Female (percent) | 8.82 | 91.18 | 100.00 |
| Male (count) | 14 | 87 | 101 |
| Male (percent) | 13.86 | 86.14 | 100.00 |

So 91.18% of women in the sample say yes, but only 86.14% of men in the sample say yes. Are they statistically significantly different?

## The strategy for determining statistical significance

- First, figure out what you expect to see if there is no difference between females and males.
- Second, figure out how far the data is from what is expected.
- Third, decide if the distance in the second step is large.
- Fourth, if large, then claim there is a statistically significant difference.

## Step 1

We must compute what the skeptic expects.

| | No | Yes | All |
|---|---|---|---|
| Women | A | B | 136 |
| Men | C | D | 101 |
| All | 26 | 211 | 237 |

$$A = \frac{136 \times 26}{237} = 14.92$$

Repeat for B, C, D.

Expected counts if the skeptic is correct, shown below the observed counts:

| | No | Yes | All |
|---|---|---|---|
| Female (observed) | 12 | 124 | 136 |
| Female (expected) | 14.92 | 121.08 | |
| Male (observed) | 14 | 87 | 101 |
| Male (expected) | 11.08 | 89.92 | |
| Total | 26 | 211 | 237 |

## Step 2

$$\frac{(12 - 14.92)^2}{14.92} = 0.571 \qquad \frac{(124 - 121.08)^2}{121.08} = 0.070$$

$$\frac{(14 - 11.08)^2}{11.08} = 0.769 \qquad \frac{(87 - 89.92)^2}{89.92} = 0.095$$

$$\text{ChiSq} = 0.571 + 0.070 + 0.769 + 0.095 = 1.506$$

## Step 3

Accepted definition of "large" for scientific purposes: something is large when it is in the outer 5% tail of the appropriate distribution.

Here the appropriate distribution is the chi-squared distribution with 1 degree of freedom. If the chi-squared statistic is larger than 3.84, it is declared large and the research advocate wins.

[Figure: chi-squared distribution with 1 degree of freedom, cutoff at 3.84; 95% of the area on one side, 5% on the other.]

Our chi-squared value: 1.506.

## Step 4

No statistically significant difference.

Hence the difference, 91.18% of women versus 86.14% of men, is not statistically significant in this case.

Note: sample size has been automatically considered.

## Fall 2001

Counts and percents, Fall 2001 (rows: sex; columns: cell phone):

| | No | Yes | All |
|---|---|---|---|
| Female (count) | 26 | 51 | 77 |
| Female (percent) | 33.77 | 66.23 | 100.00 |
| Male (count) | 19 | 16 | 35 |
| Male (percent) | 54.29 | 45.71 | 100.00 |

So 66.23% of women in the sample say yes, but only 45.71% of men in the sample say yes. Are they statistically significantly different?

Fall 2001 results. Expected counts are below observed counts:

| | No | Yes | Total |
|---|---|---|---|
| Female (observed) | 26 | 51 | 77 |
| Female (expected) | 30.94 | 46.06 | |
| Male (observed) | 19 | 16 | 35 |
| Male (expected) | 14.06 | 20.94 | |
| Total | 45 | 67 | 112 |

$$\text{ChiSq} = 0.788 + 0.529 + 1.734 + 1.164 = 4.215$$

It is large this time.

[Figure: chi-squared distribution with 1 degree of freedom. The area above 3.84 is .05; if chi-squared falls in that 5% region, it is declared large and the research advocate wins. The remaining 95% lies below 3.84.]

Our chi-squared is 4.215, so the research advocate wins! There was a statistically significant difference in 2001.

## Change over time

Cell phone ownership for samples of STAT 100 students:

| Semester | Women | Men | Significant difference? |
|---|---|---|---|
| Fall 2001 | 66.2% of 77 | 45.7% of 35 | Yes |
| Spring 2004 | 91.2% of 136 | 86.1% of 101 | No |
| Spring 2005 | 97.4% of 117 | 94.2% of 103 | No |

## Spring 2005: A cautionary tale

Rows: sex; columns: cell phone. Expected counts are below observed counts:

| | No | Yes | All |
|---|---|---|---|
| Female (observed) | 3 | 114 | 117 |
| Female (expected) | 4.79 | 112.21 | |
| Male (observed) | 6 | 97 | 103 |
| Male (expected) | 4.21 | 98.79 | |
| All | 9 | 211 | 220 |

Note that two of the expected counts are smaller than 5. This can make our results somewhat iffy.

The best approach in this case: report the result (no significant difference), but point out the small expected counts of 4.79 and 4.21.

## Why 1 degree of freedom?

| | No | Yes | All |
|---|---|---|---|
| Women | | | 136 |
| Men | | | 101 |
| All | 26 | 211 | 237 |

Note that one box (shaded gray on the slide) is the ONLY one we can fill in arbitrarily. Once that box is filled, all others are determined by the margins.

How many degrees of freedom here? (Hypothetical 2×3 table.)

| | Always | Sometimes | Never |
|---|---|---|---|
| Women | One df | Two df | |
| Men | | | |

Degrees of freedom (df) always equal

$$(\text{Number of rows} - 1) \times (\text{Number of columns} - 1)$$

## Exercise 2

Follow the 4 steps and answer the research question: Is there a statistically significant difference in calories between small and large sandwiches?

Data (explanatory: size; response: calories):

| | Low | High | Total |
|---|---|---|---|
| Small | 5 | 2 | 7 |
| Large | 2 | 5 | 7 |
| Total | 7 | 7 | 14 |

Solution. Expected counts are below observed counts:

| | Low | High | Total |
|---|---|---|---|
| Small (observed) | 5 | 2 | 7 |
| Small (expected) | 3.50 | 3.50 | |
| Large (observed) | 2 | 5 | 7 |
| Large (expected) | 3.50 | 3.50 | |
| Total | 7 | 7 | 14 |

$$\text{ChiSq} = 0.643 + 0.643 + 0.643 + 0.643 = 2.571$$

In this case the skeptic wins and the research advocate loses. So we cannot claim that there is a relationship between size and calories. But note the small expected counts.
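
The four steps used throughout these notes can be checked with a short script. What follows is only a minimal sketch in plain Python, not part of the original course material: the function names (`expected_counts`, `chi_squared`, `degrees_of_freedom`, `smallest_expected`) are hypothetical helpers introduced for illustration, and the cutoff 3.84 is the 1-degree-of-freedom value quoted in the notes, so the verdict line applies to 2×2 tables only. The Fall 2001 female "no" count is taken as 26, which is what the row total of 77 and the percentages imply.

```python
# 5% cutoff for the chi-squared distribution with 1 degree of freedom,
# as quoted in the notes. Valid for 2x2 tables only.
CRITICAL_VALUE_1DF = 3.84


def expected_counts(observed):
    """Step 1: expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared(observed):
    """Step 2: sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(observed):
    """(number of rows - 1) * (number of columns - 1)."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


def smallest_expected(observed):
    """Smallest expected count, used to flag the 'cautionary tale' case."""
    return min(min(row) for row in expected_counts(observed))


# Observed counts from the notes; each row is [no, yes] or [low, high].
tables = {
    "Spring 2004 cell phones": [[12, 124], [14, 87]],
    "Fall 2001 cell phones": [[26, 51], [19, 16]],  # 26 = 77 - 51 (assumption, see lead-in)
    "Spring 2005 cell phones": [[3, 114], [6, 97]],
    "Sandwich calories": [[5, 2], [2, 5]],
}

for name, table in tables.items():
    stat = chi_squared(table)
    # Steps 3 and 4: compare against the cutoff and report the verdict.
    verdict = "significant" if stat > CRITICAL_VALUE_1DF else "not significant"
    # Flag tables where some expected count falls below 5.
    caution = " (small expected counts)" if smallest_expected(table) < 5 else ""
    print(f"{name}: ChiSq = {stat:.3f}, df = {degrees_of_freedom(table)}, {verdict}{caution}")
```

Run as written, the script should reproduce the chi-squared values worked out by hand in the notes (1.506, 4.215, and 2.571) and flag the Spring 2005 and sandwich tables for small expected counts. For tables larger than 2×2 the cutoff would have to come from the chi-squared distribution with the matching degrees of freedom, which this sketch does not attempt.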

