# Basic Applied Statistics STAT0200

Pitt

GPA 3.52

This 87 page Class Notes was uploaded by Josefa Cartwright Jr. on Monday October 26, 2015. The Class Notes belongs to STAT0200 at University of Pittsburgh taught by Staff in Fall. Since its upload, it has received 8 views. For similar materials see /class/229432/stat0200-university-of-pittsburgh in Statistics at University of Pittsburgh.

Date Created: 10/26/15

(C) 2007 Nancy Pfenning, Elementary Statistics: Looking at the Big Picture

## Lecture: Binomial Random Variables

- Definition
- What if Events are Dependent?
- Center, Spread, Shape of Counts, Proportions
- Normal Approximation

### Looking Back: Review

4 Stages of Statistics:

- Data Production (discussed in Lectures 1-4)
- Displaying and Summarizing (Lectures 5-12)
- Probability
  - Finding Probabilities (discussed in Lectures 13-14)
  - Random Variables (introduced in Lecture 15): Binomial
  - Sampling Distributions
- Statistical Inference

### Definition (Review)

Discrete Random Variable: one whose possible values are finite or countably infinite (like the numbers 1, 2, 3, ...).

Looking Ahead: To perform inference about categorical variables, we need to understand the behavior of the sample proportion. A first step is to understand the behavior of sample counts. We will eventually shift from discrete counts to a normal approximation, which is continuous.

### Definition

Binomial Random Variable: counts sampled individuals falling into a particular category.

- Sample size $n$ is fixed
- Each selection is independent of the others
- Just 2 possible values for each individual
- Each has the same probability $p$ of falling in the category of interest

### Example: A Simple Binomial Random Variable

- Background: The random variable $X$ is the count of tails in two flips of a coin.
- Questions: Why is $X$ binomial? What are $n$ and $p$? How do we display $X$?
- Responses:
  - Sample size $n$ fixed? ____
  - Each selection independent of others? ____
  - Just 2 possible values for each? ____
  - Each has same probability $p$? ____
  - Display with: ____

[Figure: probability histogram of $X$ = number of tails (0, 1, 2)]

Looking Back: We already discussed this random variable when learning about probability distributions.

### Example: Determining if R.V. is Binomial

Case 1

- Background: Consider the following R.V.: Pick a card from a deck of 52, replace it, pick another, and so on. $X$ = no. of cards picked until you get an ace.
- Question: Is $X$ binomial?
- Response: ____

Case 2

- Background: Pick 16 cards without replacement from a deck of 52. $X$ = no. of red cards picked.
- Question: Is $X$ binomial?
- Response: ____

Case 3

- Background: Pick 16 cards with replacement from a deck of 52. $W$ = no. of clubs, $X$ = no. of diamonds, $Y$ = no. of hearts, $Z$ = no. of spades.
- Question: Are $W$, $X$, $Y$, $Z$ binomial?
- Response: ____

Case 4

- Background: Pick with replacement from a German deck of 32 (doesn't include numbers 2-6), then from a deck of 52, back to the deck of 32, etc., for 16 selections altogether. $X$ = no. of aces picked.
- Question: Is $X$ binomial?
- Response: ____

Case 5

- Background: Pick 16 cards with replacement from a deck of 52. $X$ = no. of hearts picked.
- Question: Is $X$ binomial?
- Response:
  - fixed $n = 16$
  - selections independent (with replacement)
  - just 2 possible values (heart or not)
  - same $p = 0.25$ for all selections

### Requirement of Independence

- Snag:
  - Binomial theory requires independence.
  - Actual sampling is done without replacement, so selections are dependent.
- Resolution: When sampling without replacement, selections are approximately independent if the population is at least $10n$.

### Example: A Binomial Probability Problem

- Background: The proportion of Americans who are left-handed is 0.10. Of 44 presidents, 7 have been left-handed (proportion 0.16).
- Question: How can we establish if being left-handed predisposes someone to be president?
- Response: Determine if 7 out of 44 (0.16) is ____ when sampling at random from a population where 0.10 fall in the category of interest.

### Solving Binomial Probability Problems

- Use binomial formula or tables: only practical for small sample sizes.
- Use software: won't take this approach until later.
- Use normal approximation for count $X$: not quite; we are more interested in proportions.
- Use normal approximation for proportion $\hat{p}$: need mean and standard deviation.

### Example: Mean of Binomial Count, Proportion

- Background: Based on long-run observed outcomes, the probability of being left-handed is approx. 0.1. Randomly sample 100 people.
- Questions: On average, what should be the
  - count of lefties?
  - proportion of lefties?
- Responses: On average, we should get
  - count of lefties ____
  - proportion of lefties ____

### Mean and S.D. of Counts, Proportions

Count $X$ binomial with parameters $n$, $p$ has

- Mean $np$
- Standard deviation $\sqrt{np(1-p)}$

Sample proportion $\hat{p} = X/n$ has

- Mean $p$
- Standard deviation $\sqrt{\dfrac{p(1-p)}{n}}$

Looking Back: Formulas for s.d. require independence: population at least $10n$.

### Example: Standard Deviation of Sample Count

- Background: Probability of being left-handed is approx. 0.1. Randomly sample 100 people. Sample count has mean $100(0.1) = 10$, standard deviation $\sqrt{100(0.1)(1-0.1)} = 3$.
- Question: How do we interpret these?
- Response: On average, expect sample count ____ lefties. Counts vary; typical distance from ____ is ____.

### Example: S.D. of Sample Proportion

- Background: Probability of being left-handed is approx. 0.1. Randomly sample 100 people. Sample proportion has mean 0.1, standard deviation $\sqrt{\dfrac{0.1(1-0.1)}{100}} = 0.03$.
- Question: How do we interpret these?
- Response: On average, expect sample proportion ____ lefties. Proportions vary; typical distance from ____ is ____.

### Example: Role of Sample Size in Spread

- Background: Consider the proportion of tails in various sample sizes $n$ of coinflips.
- Questions: What is the standard deviation for
  - $n = 1$?
  - $n = 4$?
  - $n = 16$?
- Responses:
  - $n = 1$: s.d. ____
  - $n = 4$: s.d. ____
  - $n = 16$: s.d. ____

Because of $n$ in the denominator of the formula for standard deviation, the spread of the sample proportion ____ as $n$ increases.

### Shape of Distribution of Count, Proportion

Binomial count $X$ or proportion $\hat{p}$ for repeated random samples has a shape that is approximately normal if samples are large enough to offset underlying skewness (Central Limit Theorem).

For a given sample size $n$, shapes are identical for count and proportion.

### Example: Underlying Coinflip Distribution

- Background: Distribution of count or proportion of tails in $n = 1$ coinflip ($p = 0.5$).
- Question: What are the distributions' shapes?

[Figure: probability histograms of count and proportion of tails in 1 coinflip; mean 0.5, standard deviation 0.5]

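The mean and standard deviation formulas above are easy to check numerically. The following is a minimal sketch, not part of the original slides; the function names `count_mean_sd` and `proportion_mean_sd` are made up for illustration. It applies the formulas to the left-handed example ($n = 100$, $p = 0.1$) and to coinflips with $n = 1, 4, 16$.

```python
import math


def count_mean_sd(n, p):
    """Mean and standard deviation of a binomial count X with parameters n, p."""
    return n * p, math.sqrt(n * p * (1 - p))


def proportion_mean_sd(n, p):
    """Mean and standard deviation of the sample proportion X / n."""
    return p, math.sqrt(p * (1 - p) / n)


# Left-handed example from the slides: n = 100, p = 0.1
count_mean, count_sd = count_mean_sd(100, 0.1)
prop_mean, prop_sd = proportion_mean_sd(100, 0.1)
print(f"count:      mean = {count_mean:.2f}, sd = {count_sd:.2f}")
print(f"proportion: mean = {prop_mean:.2f}, sd = {prop_sd:.2f}")

# Role of sample size: proportion of tails (p = 0.5) for n = 1, 4, 16
coin_sds = {n: proportion_mean_sd(n, 0.5)[1] for n in (1, 4, 16)}
for n, sd in coin_sds.items():
    print(f"n = {n:2d}: sd of sample proportion = {sd}")
```

Each time $n$ is multiplied by 4, the standard deviation of the sample proportion is cut in half, because $n$ sits under a square root in the denominator.
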
### Example: Underlying Coinflip Distribution (continued)

- Background: Distribution of count or proportion of tails in $n = 1$ coinflip ($p = 0.5$).
- Response: ____

### Example: Distribution for 4 Coinflips

- Background: Distribution of count or proportion of tails in $n = 4$ coinflips ($p = 0.5$).
- Question: What are the distributions' shapes?
- Response: ____

[Figure: probability histograms of $X$ = count of tails in 4 coinflips and $\hat{p}$ = proportion of tails in 4 coinflips; the proportion axis is marked 0, 0.25, 0.5 (mean), 0.75, 1]

### Shift from Counts to Proportions

- Binomial theory begins with counts.
- Inference will be about proportions.

### Example: Distribution of $\hat{p}$ for 16 Coinflips

- Background: Distribution of proportion of tails in $n = 16$ coinflips ($p = 0.5$).
- Response: ____

[Figure: probability histogram of $\hat{p}$ = proportion of tails in 16 coinflips]

### Example: Underlying Distribution of Lefties

- Background: Distribution of proportion of lefties ($p = 0.1$) for samples of $n = 1$.
- Question: What is the shape?
- Response: ____

[Figure: probability histogram of $\hat{p}$ = sample proportion left-handed, $n = 1$]

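The probability histograms in these examples can be regenerated from the binomial formula $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$. The sketch below is an illustration added here, not material from the slides; `binomial_pmf` is a hypothetical helper name. It lists the probabilities for the count of tails in 4 coinflips and for a single draw from a population with 10% lefties.

```python
import math


def binomial_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for a binomial count X with parameters n, p."""
    return [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


# Count of tails in n = 4 coinflips, p = 0.5: symmetric about the mean
coin_4 = binomial_pmf(4, 0.5)
for k, prob in enumerate(coin_4):
    # The count k corresponds to the sample proportion k / 4
    print(f"count = {k}, proportion = {k / 4:.2f}, probability = {prob:.4f}")

# Proportion of lefties in a sample of n = 1, p = 0.1: only two bars, very lopsided
lefty_1 = binomial_pmf(1, 0.1)
print(lefty_1)
```

The same probabilities apply to the count $k$ and to the proportion $k/n$, which is why the two histograms have identical shapes for a given $n$.
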
### Example: Dist. of $\hat{p}$ of Lefties for $n = 16$

- Background: Distribution of proportion of lefties ($p = 0.1$) for $n = 16$.
- Question: What is the shape?
- Response: ____

[Figure: probability histogram of $\hat{p}$ = sample proportion left-handed, $n = 16$]

### Example: Dist. of $\hat{p}$ of Lefties for $n = 100$

- Background: Distribution of proportion of lefties ($p = 0.1$) for $n = 100$.
- Question: What is the shape?
- Response: ____

[Figure: probability histogram of $\hat{p}$ = sample proportion left-handed, $n = 100$]

### Rule of Thumb: Sample Proportion Approximately Normal

The distribution of $\hat{p}$ is approximately normal if the sample size $n$ is large enough relative to the shape, which is determined by the population proportion $p$.

Require $np \ge 10$ and $n(1-p) \ge 10$.

Together, these require us to have a larger $n$ for $p$ close to 0 or 1 (underlying distribution skewed right or left).

### Example: Applying Rule of Thumb

- Background: Consider the distribution of the sample proportion for various $n$ and $p$:
  - $n = 4$, $p = 0.5$
  - $n = 20$, $p = 0.5$
  - $n = 20$, $p = 0.1$
  - $n = 20$, $p = 0.9$
  - $n = 100$, $p = 0.1$
- Question: Is the shape approximately normal?
- Response: Normal?
  - $n = 4$, $p = 0.5$: No; $np = 4(0.5) = 2 < 10$
  - $n = 20$, $p = 0.5$: Yes; $np = 20(0.5) = 10 = n(1-p)$
  - $n = 20$, $p = 0.1$: No; $np = 20(0.1) = 2 < 10$
  - $n = 20$, $p = 0.9$: No; $n(1-p) = 20(0.1) = 2 < 10$
  - $n = 100$, $p = 0.1$: Yes; $np = 100(0.1) = 10$ and $n(1-p) = 100(0.9) = 90$ are both $\ge 10$

### Example: Left-handed Presidents Problem

- Background: The proportion of Americans who are left-handed is 0.1. We consider $P(\hat{p} \ge 7/44 = 0.16)$ for a sample of 44 presidents.
- Question: Can we use a normal approximation to find the probability that at least 7 of 44 (0.16) are left-handed?

### Example: Solving the Left-handed Problem

- Background: The proportion of Americans who are left-handed is 0.1. We consider $P(\hat{p} \ge 7/44 = 0.16)$ for a sample of 44 presidents.
- Response: ____ ; approx. is poor.

[Figure: probability histogram of $\hat{p}$ with normal curve; approximated probability is 0.10]

### Example: From Count to Proportion and Vice Versa

- Background: Consider these reports:
  - In a sample of 87 assaults on police, 23 used weapons.
  - 0.44 in a sample of 25 bankruptcies were due to med bills.
- Question: In each case, what are $n$, $X$, and $\hat{p}$?
- Response:
  - First has $n$ = ____, $X$ = ____, $\hat{p}$ = ____
  - Second has $n$ = ____, $\hat{p}$ = ____, $X$ = ____

### Lecture Summary (Binomial Random Variables)

- Definition: 4 requirements for binomial
- R.V.s that do or don't conform to requirements
- Relaxing requirement of independence
- Binomial counts, proportions
  - Mean
  - Standard deviation
  - Shape
- Normal approximation to binomial

## Lecture 5: Displaying and Summarizing Single Variables, Focus on Categorical Variables

- Displays and Summaries
- Data Production Issues
- Looking Ahead to Inference
- Details about Displays and Summaries

### Looking Back: Review

4 Stages of Statistics:

- Data Production (discussed in Lectures 1-4)
- Displaying and Summarizing
  - Single variables: 1 categorical, 1 quantitative
  - Relationships between 2 variables
- Probability
- Statistical Inference

### Four Processes of Statistics

[Figure: diagram of the four processes, starting from the population with "1. PRODUCE DATA"]

### Handling Single Categorical Variables

- Display:
  - Pie chart
  - Bar graph
- Summary:
  - Count
  - Percent
  - Proportion

### Definitions, Notation

- Statistic: number summarizing sample
  - $\hat{p}$ ("p-hat"): sample proportion (a statistic)
- Parameter: number summarizing population
  - $p$: population proportion (a parameter)

### Example: Issues to Consider

- Background: 246 of 446 students at a certain university had eaten breakfast on survey day.
- Questions:
  - Are intro stat students representative of all students at that university?
  - Would they respond without bias?
  - How to display and summarize the info?
  - Can we conclude that a majority of all students at that university eat breakfast?
- Responses:
  - Representative? ____
  - Unbiased? ____
  - Display: ____
  - Summary: ____ ate breakfast
  - Can't yet say if a majority eat breakfast overall.

[Figure: pie chart of Breakfast; no: 200 (44.8%), yes: 246 (55.2%)]

### Example: Statistics vs. Parameters

- Background: 246 of 446 students at a certain university had eaten breakfast on survey day.
- Questions:
  - Is 246/446 = 0.55 a statistic or a parameter? How do we denote it?
  - Is the proportion of all students eating breakfast a statistic or a parameter? How do we denote it?

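The three summaries named above (count, percent, proportion) can be produced from raw category counts in a few lines. This is a small sketch added for illustration; `summarize_counts` is a hypothetical helper, and the only data used are the breakfast counts from the pie chart (246 yes, 200 no).

```python
def summarize_counts(counts):
    """Map each category to its count, proportion, and percent of the total."""
    total = sum(counts.values())
    return {
        category: {
            "count": count,
            "proportion": count / total,
            "percent": 100 * count / total,
        }
        for category, count in counts.items()
    }


# Breakfast survey from the slides: 246 of 446 students ate breakfast
breakfast = summarize_counts({"yes": 246, "no": 200})
for category, summary in breakfast.items():
    print(
        f"{category}: count = {summary['count']}, "
        f"proportion = {summary['proportion']:.2f}, "
        f"percent = {summary['percent']:.1f}%"
    )
```

The printed percents match the pie chart labels, and the proportion for "yes" rounds to the 0.55 quoted in the example.
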
### Example: Statistics vs. Parameters (continued)

- Background: 246 of 446 students at a certain university had eaten breakfast on survey day.
- Responses:
  - 246/446 = 0.55 is a ____ denoted ____
  - Proportion of all students eating breakfast is a ____ denoted ____

### Example: Summary Issues

- Background: Location (state) for all 1,696 TV series in 2004 with known settings:
  - 601 in California (601/1696 = 0.35)
  - 412 in New York (412/1696 = 0.24)
  - 683 in other states (683/1696 = 0.40)
- Questions:
  - 0.35 + 0.24 + 0.40 = 0.99: mistake?
  - Why is it not appropriate to use this info to draw conclusions about a larger population in 2004?
- Responses: ____

### Example: Notation

- Background: In a study of 20 antarctic prions (birds), 17 correctly chose the one of two bags that had contained their mate.
- Questions: How do we denote the sample and population proportions? Are they statistics or parameters?
- Responses:
  - sample proportion ____ is a ____
  - population proportion ____ is a ____ (not known in this case)

### Definitions

- Mode: most common value
- Majority: more common of two possible values (same as mode)
- Minority: less common of two possible values

### Example: Role of Sample Size

- Background: In a study of 20 antarctic prions (birds), 17 correctly chose the one of two bags that had contained their mate.
- Question: Would we be more convinced that a majority of all prions would choose correctly if 170 out of 200 were correct?
- Response: ____

### Example: Sampling Design

- Background: In a study of 20 antarctic prions (birds), 17 correctly chose the one of two bags that had contained their mate.
- Question: Is the sample biased?
- Response: ____

### Example: Study Design

Background: Antarctic prions were presented with a Y-shaped maze, a bag at the end of each arm. One bag had contained the mate, the other not.

- Question: What were researchers attempting to show?
  - Response: ____
- Question: Why use bags, and not the birds themselves?
  - Response: ____
- Question: Why "had contained" (bird no longer in bag)?
  - Response: ____
- Question: OK to always place the correct bag on the right?
  - Response: ____
- Question: Should the other be just any empty bag?
  - Response: ____

Looking Ahead: Researchers were careful to avoid bias in their ...

### Two or More Possible Values

Looking Ahead: In Probability and Inference, most categorical variables discussed have just two possibilities. Still, we often summarize and display categorical data with more than two possibilities.

### Example: Proportions in Three Categories

- Background: A student wondered if she should resist changing answers in multiple choice tests. "Ask Marilyn" replied:
  - 50% of changes go from wrong to right
  - 25% of changes go from right to wrong
  - 25% of changes go from wrong to wrong
- Question: How to display the information?

Nancy Pfenning Example Proportions in Three Categories El Background Student wondered if she should resist changing answers in multiple choice tests Ask Marilyn replied I 50 ofchanges go from Wrong to right I 25 ofchanges go from right to Wrong I 25 of changes go from Wrong to Wrong El Response 2mm mnwmm amnuwsmsm mm tithe swim ma De nition El Bar graph shows counts percents or proportions in various categories marked on horizontal axis with bars of corresponding heights 2mm mm mm ammw Stalslics mm um aw mm L547 Example Bar Graph El Background Statistics instructor can survey students to determine proportion in each year 15 2 3rd 4 Other El Questions I How to display the information I What to look for in display amnuwsmsm mm tithe swim ms Example Bar Graph Elementary Statistics Looking at the Big Picture El Background Statistics instructor can survey students to determine proportion in each year 15 2 3rd 4 Other El Responses I Construct a bar graph El 77 ion horizontal axis El 77 7 graphed vertically I Look for 7 tallest bar to tell What year is most common compare heights of all 5 bars 2mm mm mm ammw Stalslics mm um aw mm 155 C 2007 Nancy Pfenning 3 L Overlapping Categories Example Overlapping Categories If more than two categorical variables are considered at once we must note the possibility that categories overlap I Background Report by ResumeDoctorcom on over 160000 resumes I 13 said applicant had communication skills Looking Ahead In Probability we will need to distinguish I 7 said applicant was a team player between situations where categories do and do not overlap II Question Can we conclude that 20 claimed communication skills or team player 0 2mm Nancy Pfenrllrlg Elementary Statistles Luuklng attne Big F39lcture L5 52 e 2mm Nancy Prenan Elementary statistles Leean attne Big Flcture L5 53 Example Overlapping Categories Processing Raw Categorical Data I Background Report by ResumeDoctor com Small categorical data sets are easily handled on over 160000 
resumes Wlthout software I 13 said applicant had communication skills I 7 said applicant was a team player I Response e mi Naney Pfenrllrlg Elementary Statistles Luuklng attne Big F39lcture L5 55 e mi Nancy Prenan Elementary Statistles Leean attne Big Flcture L5 5B Elementary Statistics Looking at the Big Picture 10 C 2007 Nancy Pfenning Example Proportion from Raw Data Example Proportion from Raw Data ll Background Harvard study claimed 44 of El Background Harvard study claimed 44 of college students are binge drinkers Agree on survey college students are binge drinkers Agree on survey design and have students selfreport on one design and have students selfreport on one occasion in past month alcoholic drinks more than 5 occasion in past month alcoholic drinks more than 5 males or 4 females Or use this data set males or 4 females Or use this data set yes no yes no no yes yes no yes no no yes no yes yes no yes no no yes yes no yes no yes yes no no yes yes yes yes no no yes yes yes I10 yes yes no no yes FIO yes yes no no yes no yes yes yes yes yes no yes yes yes yes no no yes no yes no no no yes no yes no 710 yes n0 quot0 yes I10 HO yes no 0 yes no no no no no yes yes no no no no yes yes yes no no no no no yes no no no no no 2 if C 23 23 32 s 22 3 C 23 33 3 El Question Are data consistent with claim of 44 El Response c2uu7 Naney Pfennan Elementary Statlstlcs Leukan attne elg Pletere L5 57 c2uu7 Nancy Pfennan Elementary Statlstlcs Lunklng attne elg Pletere L5 an Example Proportion from Raw Data Lecture Summary Categorical Variables ll Background Harvard study claimed 44 of El Display pie chart bar graph Zonege giants aredbinge dlrtinkers Agree on SUI VCY ll Summarize count percent proportion 651g an ave Stu ems 56 39report on one I Sampling data unbiased representative occas10n in past month alcoholic drinks more than 5 I males or 4 females Or use this data set D DeSIgIL pmdllced unblased summary 0f data I Inference Will we ultimately draw conclus1on about 0 ulation 
based on sample

Looking Ahead: How different would the sample percentage have to be to convince you that your sample is significantly different from Harvard's? This is an inference question.

- Mode, Majority: most common values
- Larger samples provide more info
- Other issues: Two or more possibilities? Categories overlap? How to handle raw data?

Lecture 23
Inference for Categorical Variable: More About Hypothesis Tests

- Examples of Tests With 3 Forms of Alternative
- How Form of Alternative Affects Test
- When P-Value is Small: Statistical Significance
- Hypothesis Tests in Long-Run
- Relating Test Results to Confidence Interval

Looking Back: Review
4 Stages of Statistics
- Data Production (discussed in Lectures 1-4)
- Displaying and Summarizing (Lectures 5-12)
- Probability (discussed in Lectures 13-20)
- Statistical Inference
  - 1 categorical: confidence intervals, hypothesis tests
  - 1 quantitative
  - categorical and quantitative
  - 2 categorical
  - 2 quantitative

Three Types of Inference Problem (Review)
In a sample of 446 students, 0.55 ate breakfast.
1. What is our best guess for the proportion of all students who eat breakfast? Point Estimate
2. What interval should contain the proportion of all students who eat breakfast? Confidence Interval
3. Do more than half (50%) of all students eat breakfast? Hypothesis Test

Hypothesis Test About p (Review)
State null and alternative hypotheses $H_0$ and $H_a$: null is "status quo", alternative "rocks the boat".
$H_0: p = p_0$ vs. $H_a: p > p_0$, $p < p_0$, or $p \ne p_0$
1. Consider sampling and study design.
2. Summarize with $\hat{p}$, standardize to $z$, assuming that $H_0: p = p_0$ is true; consider if $z$ is "large".
3. Find P-value = probability of $z$ this far above/below/away from 0; consider if it is "small".
4. Based on size of P-value, choose $H_0$ or $H_a$.

Checking Sample Size: CI vs. Test
- Confidence Interval: require observed counts in and out of category of interest to be at least 10: $n\hat{p} = X \ge 10$ and $n(1-\hat{p}) = n - X \ge 10$
- Hypothesis Test: require expected counts in and out of category of interest to be at least 10 (assume $p = p_0$): $np_0 \ge 10$ and $n(1-p_0) \ge 10$

Example: Checking Sample Size in Test
- Background: 30/400 = 0.075 students picked #7 "at random" from 1 to 20. Want to test $H_0: p = 0.05$ vs. $H_a: p > 0.05$.
- Question: Is n large enough to justify finding P-value based on normal probabilities?
- Response: $np_0$ = ____, $n(1-p_0)$ = ____
- Looking Back: For confidence interval, checked 30 and 370 both at least 10.

Example: Test with "Greater Than" Alternative (Review)
Note: Step 1 requires 3 checks:
- Is sample unbiased? (Sample proportion has mean 0.05?)
- Is population $\ge 10n$? (Formula for s.d. correct?)
- Are $np_0$ and $n(1-p_0)$ both at least 10? (Find or estimate P-value based on normal probabilities?)

1. Students are "typical" humans; bias is the issue at hand.
2. If $p = 0.05$, the s.d. of $\hat{p}$ is $\sqrt{0.05(1-0.05)/400}$ and $z = \dfrac{0.075 - 0.05}{\sqrt{0.05(1-0.05)/400}} = 2.29$.
3. P-value $= P(Z \ge 2.29)$ is small: just over 0.01.
4. Reject $H_0$, conclude $H_a$: picks were biased for #7.

Example: Test with "Less Than" Alternative
- Background: 111/230 of surveyed commuters at a university walked to school.
- Question: Do fewer than half of the university's commuters walk to school?
- Response: First write $H_0$: ____ vs. $H_a$: ____
  1. Students need to be representative in terms of year.
  2. Output → $\hat{p}$ = ____, $z$ = ____
  3. P-value = ____
  4. Reject $H_0$? ____

Test and CI for One Proportion: Test of p = 0.5 vs p < 0.5

| Sample | X | N | Sample p | 95% Upper Bound | Z-Value | P-Value |
|---|---|---|---|---|---|---|
| 1 | 111 | 230 | 0.482609 | 0.536805 | -0.53 | 0.299 |

Note: P-value is a left-tailed probability because alternative was "less than".

Example: Test with "Not Equal" Alternative
- Background: 43% of Florida's community college students are disadvantaged.
- Question: Is % disadvantaged at Florida Keys Community College (169/356 = 47.5%) unusual?
- Response: First write $H_0$: ____ vs. $H_a$: ____
  1. 356(0.43) and 356(1 - 0.43) both $\ge 10$; population $\ge 10(356)$.
  2. $\hat{p}$ = ____, $z$ = ____
  3. P-value = ____
  4. Reject $H_0$? ____

Test and CI for One Proportion: Test of p = 0.43 vs p not = 0.43

| Sample | X | N | Sample p | 95% CI | Z-Value | P-Value |
|---|---|---|---|---|---|---|
| 1 | 169 | 356 | 0.474719 | (0.422847, 0.526592) | 1.70 | 0.088 |

Note: P-value is a two-tailed probability because alternative was "not equal".
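The four-step test above can be checked numerically. The following is a minimal sketch, not the software that produced the output tables: it assumes the normal approximation described in the notes, and the function names and the alternative labels ("greater", "less", "not equal") are hypothetical choices made for illustration.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def expected_counts_ok(n, p0):
    """Sample-size check for a test: n*p0 and n*(1 - p0) both at least 10."""
    return n * p0 >= 10 and n * (1 - p0) >= 10


def one_proportion_z_test(x, n, p0, alternative):
    """Return (p_hat, z, p_value) for H0: p = p0.

    `alternative` is one of "greater", "less", "not equal"
    (hypothetical labels chosen for this sketch).
    """
    p_hat = x / n
    sd = sqrt(p0 * (1.0 - p0) / n)  # s.d. of p_hat, assuming H0 is true
    z = (p_hat - p0) / sd
    if alternative == "greater":
        p_value = 1.0 - normal_cdf(z)               # right tail
    elif alternative == "less":
        p_value = normal_cdf(z)                     # left tail
    elif alternative == "not equal":
        p_value = 2.0 * (1.0 - normal_cdf(abs(z)))  # both tails
    else:
        raise ValueError("unknown alternative: " + alternative)
    return p_hat, z, p_value


# The three examples from this lecture: (x, n, p0, alternative)
examples = [
    (30, 400, 0.05, "greater"),     # picks of #7
    (111, 230, 0.50, "less"),       # commuters who walk
    (169, 356, 0.43, "not equal"),  # % disadvantaged at Florida Keys
]

for x, n, p0, alt in examples:
    p_hat, z, p_value = one_proportion_z_test(x, n, p0, alt)
    print(f"{x}/{n} vs p0={p0} ({alt}): "
          f"p_hat={p_hat:.3f}, z={z:.2f}, P-value={p_value:.3f}, "
          f"counts ok={expected_counts_ok(n, p0)}")
```

The z statistics and P-values printed here should agree with the values quoted in the examples (2.29, -0.53 and 1.70), up to rounding.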
90-95-98-99 Rule: Outside Probabilities
[Figure: standard normal curve showing tail areas outside z: area 0.05 in each tail beyond ±1.645, 0.025 beyond ±1.960, 0.01 beyond ±2.326, 0.005 beyond ±2.576]

One-sided or Two-sided Alternative
- Form of alternative hypothesis impacts P-value.
- P-value is the deciding factor in test.
- Alternative should be based on what researchers hope/fear/suspect is true before "snooping" at the data.
- If < or > is not obvious, use two-sided alternative (more conservative).

Example: How Form of Alternative Affects Test
- Background: 43% of Florida's community college students are disadvantaged.
- Question: Is % disadvantaged at Florida Keys (169/356 = 47.5%) unusually high?
- Response: Now write $H_0$: ____ vs. $H_a$: ____
  1. Same checks of data production as before.
  2. Same $\hat{p} = 0.475$, $z = 1.70$.
  3. Now P-value = ____
  4. Reject $H_0$? ____

Test of p = 0.43 vs p > 0.43

| Sample | X | N | Sample p | 95% Lower Bound | Z-Value | P-Value |
|---|---|---|---|---|---|---|
| 1 | 169 | 356 | 0.474719 | 0.431186 | 1.70 | 0.044 |

P-value for One or Two-Sided Alternative
- P-value for one-sided alternative is half P-value for two-sided alternative.
- P-value for two-sided alternative is twice P-value for one-sided alternative.
- For this reason, two-sided alternative is more conservative (larger P-value, harder to reject $H_0$).

Thinking About Data
Before getting caught up in details of test, consider evidence at hand.
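To see how the form of the alternative changes the P-value for the same data, here is a short sketch using the Florida Keys counts (169 of 356, $p_0 = 0.43$). It assumes the normal approximation used throughout this lecture; the helper name `normal_cdf` is an illustrative choice, not something from the notes.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


x, n, p0 = 169, 356, 0.43
z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)

p_greater = 1.0 - normal_cdf(z)                # Ha: p > 0.43 (right tail)
p_less = normal_cdf(z)                         # Ha: p < 0.43 (left tail)
p_not_equal = 2.0 * (1.0 - normal_cdf(abs(z)))  # Ha: p != 0.43 (two tails)

print(f"z = {z:.2f}")
print(f"P-value, greater-than alternative: {p_greater:.3f}")
print(f"P-value, less-than alternative:    {p_less:.3f}")
print(f"P-value, not-equal alternative:    {p_not_equal:.3f}")
```

The two-sided P-value comes out as exactly twice the right-tailed one. The left-tailed P-value is very large because the sample proportion lies above 0.43, which is the kind of thing worth noticing in the data before running any formal test.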
Example: Thinking About Data at Hand
- Background: 43% of Florida's community college students are disadvantaged. At Florida Keys, the rate is 47.5%.
- Question: Is the rate at Florida Keys significantly lower?
- Response: ____

Definition
$\alpha$ (alpha): cutoff level which signifies a P-value is small enough to reject $H_0$.

How Small is a Small P-Value?
- Avoid blind adherence to cutoff $\alpha = 0.05$.
- Take into account:
  1. Past considerations: is $H_0$ "written in stone" or easily subject to debate?
  2. Future considerations: What would be the consequences of either type of error?
     - Rejecting $H_0$ even though it's true
     - Failing to reject $H_0$ even though it's false
- Consider decisions encountered so far.

Example: Reviewing P-values and Conclusions
- Background: Consider our prototypical examples:
  - Are random number selections biased? P-value = 0.011
  - Do fewer than half of commuters walk? P-value = 0.299
  - Is % disadvantaged significantly different? P-value = 0.088
  - Is % disadvantaged significantly higher? P-value = 0.044
- Question: What conclusions did we draw, based on those P-values?
- Response: (consistent with 0.05 as cutoff $\alpha$)
  - P-value = 0.011 → Reject $H_0$? ____
  - P-value = 0.299 → Reject $H_0$? ____
  - P-value = 0.088 → Reject $H_0$? ____
  - P-value = 0.044 → Reject $H_0$? ____
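The decision rule in the review example is mechanical once a cutoff is fixed. The sketch below applies it to the four P-values listed above; the function name `decide` and the dictionary keys are hypothetical, and the second loop (cutoff 0.10) is added only to illustrate why blind adherence to 0.05 is discouraged.

```python
ALPHA = 0.05  # the conventional cutoff used in the review example

# P-values quoted in the four prototypical examples
p_values = {
    "random number selections biased?": 0.011,
    "fewer than half of commuters walk?": 0.299,
    "% disadvantaged different?": 0.088,
    "% disadvantaged higher?": 0.044,
}


def decide(p_value, alpha=ALPHA):
    """Reject H0 when the P-value is smaller than the cutoff alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


for question, p in p_values.items():
    print(f"{question:40s} P-value={p:.3f} -> {decide(p)}")

# The same data can lead to a different decision under a different cutoff.
for question, p in p_values.items():
    print(f"{question:40s} at alpha=0.10 -> {decide(p, alpha=0.10)}")
```

With cutoff 0.05 the first and last tests reject while the middle two do not; with cutoff 0.10 the two-sided test (P-value 0.088) would also reject.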
Example: Cut-Offs for "Small" P-Value
- Background: Bookstore chain will open new store in a city if there's evidence that its proportion of college grads is higher than 0.26, the national rate.
- Question: Choose cutoff (0.10, 0.05, 0.01):
  - if no other info is provided
  - if chain is enjoying considerable profits; owners are eager to pursue new ventures
  - if chain is in financial difficulties; can't afford losses if unsuccessful due to too few grads
- Response: Choose cutoff (0.10, 0.05, 0.01):
  - if no other info is provided → use ____
  - if chain is enjoying considerable profits; owners are eager to pursue new ventures → use ____
  - if chain is in financial difficulties; can't afford loss if unsuccessful due to too few grads → use ____

Definition
Statistically significant data produce P-value small enough to reject $H_0$.

$z$ plays a role: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$

Reject $H_0$ if P-value small; if $|z|$ large; if:
- Sample proportion $\hat{p}$ far from $p_0$
- Sample size n large
- Standard deviation small (if $p_0$ is close to 0 or 1)

Role of Sample Size n
- Large n: may reject $H_0$ even though observed proportion isn't very far from $p_0$, from a practical standpoint. Very small P-value → strong evidence against $H_0$, but $p$ not necessarily very far from $p_0$.
- Small n: may fail to reject $H_0$ even though it is false. Failing to reject false $H_0$ is the 2nd type of error.

Definition
- Type I Error: reject null hypothesis even though it is true (false positive). Probability is cutoff $\alpha$.
- Type II Error: fail to reject null hypothesis even though it's false (false negative).

Hypothesis Test and Long-Run Behavior
Repeatedly carry out hypothesis tests of $p = 0.5$, based on 20 coin flips, using cutoff 0.05. In the long run, 5% of the tests will reject $H_0: p = 0.5$, even though it's true.

[Figure: repeated sets of 20 coin flips, each testing $H_0: p = 0.5$. Examples shown: proportion of heads 0.45 gives z = -0.45, P-value 0.655 (do not reject $H_0$); proportion of heads 0.40 gives z = -0.89, P-value 0.371 (do not reject $H_0$); proportion of heads 0.75 gives z = 2.24, P-value 0.025 (reject $H_0$). In the long run 95% of tests do not reject $H_0$ and 5% of tests reject $H_0$.]

Confidence Interval and Hypothesis Test Results
- Confidence Interval: range of plausible values
- Hypothesis Test: decides if a value is plausible
- Informally:
  - If $p_0$ is in confidence interval, don't reject $H_0: p = p_0$.
  - If $p_0$ is outside confidence interval, reject $H_0: p = p_0$.

[Figure: relationship between 95% confidence interval for population proportion and two-sided test with 0.05 as cutoff for P-value. If $p_0$ is outside the interval, reject $H_0: p = p_0$; if $p_0$ is inside the interval, do not reject $H_0: p = p_0$.]

Example: Test Results Based on CI
- Background: A 95% confidence interval for proportion of all students choosing #7 "at random" from numbers 1 to 20 is (0.055, 0.095).
- Question: Would we expect a hypothesis test to reject the claim $p = 0.05$ in favor of the claim $p > 0.05$?
- Response: ____

Example: CI Results Based on Test
- Background: A hypothesis test did not reject $H_0: p = 0.5$ in favor of the alternative $H_a: p < 0.5$.
- Question: Do we expect 0.5 to be contained in a confidence interval for p?
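The long-run behavior described earlier in this lecture (tests of $p = 0.5$ on 20 coin flips with cutoff 0.05) can be examined without simulation. The sketch below assumes a two-sided test based on the normal approximation, as in the figure, and adds up the exact binomial probability of every head count that would lead to rejection; the function names are hypothetical.

```python
from math import comb, erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def two_sided_p_value(heads, flips=20, p0=0.5):
    """Normal-approximation P-value for H0: p = p0 vs Ha: p != p0."""
    z = (heads / flips - p0) / sqrt(p0 * (1 - p0) / flips)
    return 2.0 * (1.0 - normal_cdf(abs(z)))


def long_run_rejection_rate(flips=20, p0=0.5, alpha=0.05):
    """Probability, when H0 is true, that a test rejects H0 (Type I error)."""
    rate = 0.0
    for heads in range(flips + 1):
        if two_sided_p_value(heads, flips, p0) < alpha:
            # exact binomial probability of seeing this many heads under H0
            rate += comb(flips, heads) * p0**heads * (1 - p0)**(flips - heads)
    return rate


for heads in (9, 8, 15):
    print(f"{heads}/20 heads: P-value = {two_sided_p_value(heads):.3f}")

print(f"long-run rejection rate: {long_run_rejection_rate():.4f}")
```

Because head counts are discrete, the exact rate under these assumptions is a little below the nominal 5%: only 5 or fewer heads, or 15 or more, lead to rejection. The 5% figure in the notes is best read as an approximation.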
Example: CI Results Based on Test
- Background: A hypothesis test did not reject $H_0: p = 0.5$ in favor of the alternative $H_a: p < 0.5$.
- Response: ____

Lecture Summary (More Hypothesis Tests for Proportions)
- Examples with 3 forms of alternative hypothesis
- Form of alternative hypothesis
  - Effect on test results
  - When data render formal test unnecessary
  - P-value for 1-sided vs. 2-sided alternative
- Cutoff for "small" P-value
- Statistical significance; role of n
- Type I or II Error
- Hypothesis tests in long-run
- Relating tests and confidence intervals

Lecture 32
Two Categorical Variables: Chi-Square

- Formulating Hypotheses to Test Relationship
- Test based on Proportions or on Counts
- Chi-square Test
- Confidence Intervals

Looking Back: Review
4 Stages of Statistics
- Data Production (discussed in Lectures 1-4)
- Displaying and Summarizing (Lectures 5-12)
- Probability (discussed in Lectures 13-20)
- Statistical Inference
  - 1 categorical (discussed in Lectures 21-23)
  - 1 quantitative (discussed in Lectures 24-27)
  - cat and quan: paired, 2-sample, several-sample (Lectures 28-31)
  - 2 categorical
  - 2 quantitative

Inference for Relationship (Review)
- $H_0$ and $H_a$ about variables: not related or related
  - Applies to all three: C→Q, C→C, Q→Q
- $H_0$ and $H_a$ about parameters: equality or not
  - C→Q: pop. means equal
  - C→C: pop. proportions equal
  - Q→Q: pop. slope equals zero

Example: 2 Categorical Variables: Hypotheses
- Background: We are interested in whether or not smoking plays a role in alcoholism.
- Question: How would $H_0$ and $H_a$ be written
  - in terms of variables?
  - in terms of parameters?
- Response:
  - in terms of variables
    - $H_0$: smoking and alcoholism ____ related
    - $H_a$: smoking and alcoholism ____ related
  - in terms of parameters
    - $H_0$: Pop. proportions alcoholic ____ for smokers, non-smokers
    - $H_a$: Pop. proportions alcoholic ____ for smokers, non-smokers
- The word "not" appears in $H_0$ about variables, in $H_a$ about parameters.

Example: Summarizing with Proportions
- Background: Research Question: Does smoking play a role in alcoholism?
- Question: What statistics from this table should we examine to answer the research question?

| | Alcoholic | Not Alcoholic | Total |
|---|---|---|---|
| Smoker | 30 | 200 | 230 |
| Nonsmoker | 10 | 760 | 770 |
| Total | 40 | 960 | 1000 |

- Response: Compare proportions (response) for (explanatory): $\hat{p}_1 = 30/230 = 0.130$, $\hat{p}_2 = 10/770 = 0.013$

Example: Test Statistic for Proportions
- Background: One approach to the question of whether smoking and alcoholism are related is to compare proportions (same table as above).
- Question: What would be the next step, if we've summarized the situation with the difference between sample proportions 0.130 - 0.013?
- Response: ____ the difference between sample proportions 0.130 - 0.013. (In fact, the standardized difference is normal: $z$ = ____.)

Advantage of z Inference for 2 Proportions
Can test against one-sided alternative.

Disadvantage of z Inference for 2 Proportions
- 2-by-2 table: comparing proportions straightforward
- Larger table: comparing proportions complicated; can't just standardize one difference $\hat{p}_1 - \hat{p}_2$

Another Comparison in Considering Categorical Relationships (Review)
Instead of considering how different are the proportions in a two-way table, we may consider how different the counts are from what we'd expect if the explanatory and response variables were in fact unrelated.

Compared observed, expected counts in wasp study:

| Obs | A | NA | T |
|---|---|---|---|
| B | 16 | 15 | 31 |
| U | 24 | 7 | 31 |
| T | 40 | 22 | 62 |

| Exp | A | NA | T |
|---|---|---|---|
| B | 20 | 11 | 31 |
| U | 20 | 11 | 31 |
| T | 40 | 22 | 62 |

Inference Based on Counts
To test hypotheses about relationship in r-by-c table, compare counts observed to counts expected if $H_0$ (equal proportions in response of interest) were true.

Example: Table of Expected Counts
- Background: Data on smoking and alcoholism (table above).
- Question: What counts are expected if $H_0$ is true?
- Response: Overall proportion alcoholic is 40/1000 = 0.04. If proportions alcoholic were same for S and NS, expect
  - (40/1000)(230) = ____ smokers to be alcoholic
  - (40/1000)(770) = ____ non-smokers to be alcoholic; also
  - (960/1000)(230) = ____ smokers not alcoholic
  - (960/1000)(770) = ____ non-smokers not alcoholic
- Note: Each expected count is $\dfrac{\text{Column total} \times \text{Row total}}{\text{Table total}}$. Expect
  - 40(230)/1000 = ____ smokers to be alcoholic
  - 40(770)/1000 = ____ non-smokers to be alcoholic; also
  - 960(230)/1000 = ____ smokers not alcoholic
  - 960(770)/1000 = ____ non-smokers not alcoholic

Chi-Square Statistic
- Components to compare observed and expected counts, one table cell at a time: $\text{component} = \dfrac{(\text{observed} - \text{expected})^2}{\text{expected}}$. Components are individual standardized squared differences.
- Chi-square test statistic $\chi^2$ combines all components by summing them up: $\text{chi-square} = \text{sum of } \dfrac{(\text{observed} - \text{expected})^2}{\text{expected}}$. Chi-square is sum of standardized squared differences.

Example: Chi-Square Statistic
- Background: Observed and Expected Tables.
- Question: What is chi-square = sum of $\dfrac{(\text{observed} - \text{expected})^2}{\text{expected}}$?
- Response: Find chi-square = sum of $\dfrac{(\text{observed} - \text{expected})^2}{\text{expected}}$ = ____

Example: Assessing Chi-Square Statistic
- Background: We found chi-square = 64.
- Question: Is the chi-square statistic (64) large?
- Response: ____
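The expected-count rule and the chi-square sum described above can be carried out directly on the smoking and alcoholism table. This is a minimal sketch with hypothetical function names. The last function uses the usual pooled standard error for the difference of two sample proportions, which is an assumption here because the notes do not spell out that formula.

```python
from math import sqrt

# Observed counts: rows = smoker, nonsmoker; columns = alcoholic, not alcoholic
observed = [[30, 200],
            [10, 760]]


def expected_counts(table):
    """Expected count per cell = (row total * column total) / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))


def two_proportion_z(table):
    """z for a 2x2 table, using a pooled standard error (assumed formula)."""
    (a, b), (c, d) = table
    n1, n2 = a + b, c + d
    pooled = (a + c) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (a / n1 - c / n2) / se


print("expected counts:", expected_counts(observed))
print(f"chi-square = {chi_square(observed):.1f}")
print(f"z = {two_proportion_z(observed):.2f}, z^2 = {two_proportion_z(observed) ** 2:.1f}")
```

The chi-square value comes out just under 64, matching the rounded figure used in the notes, and the square of the pooled z statistic agrees with it.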
Chi-Square Distribution
chi-square = sum of $\dfrac{(\text{observed} - \text{expected})^2}{\text{expected}}$ follows predictable pattern (assuming $H_0$ is true) known as chi-square distribution with df = (r - 1) × (c - 1)
- r = number of rows (possible explanatory values)
- c = number of columns (possible response values)

Properties of chi-square:
- Non-negative (based on squares)
- Mean = df [= 1 for smallest (2×2) table]
- Spread depends on df
- Skewed right

Chi-Square Density Curve
For chi-square with 1 df, $P(\chi^2 \ge 3.84) = 0.05$ → if $\chi^2$ is more than 3.84, P-value is less than 0.05.
[Figure: chi-square density curve with 1 df; right-tail area 0.05 beyond 3.84]

Example: Assessing Chi-Square (Continued)
- Background: In testing for relationship between smoking and alcoholism in 2×2 table, found $\chi^2 = 64$.
- Question: Is there evidence of a relationship in general between smoking and alcoholism (not just in the sample)?
- Response: For df = (2 - 1) × (2 - 1) = 1, chi-square considered "large" if greater than 3.84 → chi-square = 64 large → P-value small → evidence of a relationship between smoking and alcoholism.

Inference for 2 Categorical Variables: z or $\chi^2$
For 2×2 table, $z^2 = \chi^2$
- z statistic (comparing proportions) → combined tail probability = 0.05 for $z = 1.96$
- chi-square statistic (comparing counts) → right-tail prob = 0.05 for $\chi^2 = 3.84$

Example: Relating Chi-Square & z
- Background: We found chi-square = 64 for the 2-by-2 table relating smoking and alcoholism.
- Question: What would be the z statistic for a test comparing proportions alcoholic for smokers vs. non-smokers?
- Response: ____

Assessing Size of Test Statistics (Summary)
When test statistic is "large":
- z: greater than 1.96 (about 2)
- t: depends on df; greater than about 2 or 3
- F: depends on DFG, DFE
- $\chi^2$: depends on df = (r - 1) × (c - 1); greater than 3.84 (about 4) if df = 1

Explanatory/Response: 2 Categorical Variables
- Roles impact what summaries to report.
- Roles do not impact $\chi^2$ statistic or P-value.

Example: Summaries Impacted by Roles
- Background: Compared proportions alcoholic (resp.) for smokers and non-smokers (expl.): $\hat{p}_1 = 30/230 = 0.130$, $\hat{p}_2 = 10/770 = 0.013$

| | Alcoholic | Not Alcoholic | Total |
|---|---|---|---|
| Smoker | 30 | 200 | 230 |
| Nonsmoker | 10 | 760 | 770 |
| Total | 40 | 960 | 1000 |

- Question: What summaries would be appropriate if alcoholism is explanatory variable?
- Response: Compare proportions ____ (resp.) for ____ (expl.)
- Note: we can summarize by saying
  - alcoholics are 3 to 4 times as likely to be smokers
  - smokers are 10 times as likely to be alcoholics

Guidelines for Use of Chi-Square Procedure
- Need random samples taken independently from several populations.
- Confounding variables should be separated out.
- Sample sizes must be large enough to offset non-normality of distributions.
- Need populations at least 10 times sample sizes.

Rule of Thumb for Sample Size in Chi-Square
- Sample sizes must be large enough to offset non-normality of distributions.
- Require expected counts all at least 5 in 2×2 table. (Requirement adjusted for larger tables.)
- Looking Back: Chi-square statistic only follows chi-square distribution if individual counts vary normally. Our requirement is an extension of the requirement for single categorical variables, $np \ge 10$ and $n(1-p) \ge 10$, with 10 replaced by 5 because of summing several components.

Example: Role of Sample Size
- Background: Suppose counts in smoking and alcohol two-way table were 1/10th the originals.
- Question: Find chi-square; what do we conclude?
- Response: Observed counts 1/10th → expected counts 1/10th → chi-square 1/10th (6.4 instead of 64). However, the statistic does not follow $\chi^2$ distribution because expected counts (0.92, 22.08, 3.08, 73.92) are not all at least 5; individual distributions are not normal.

Confidence Intervals for 2 Categorical Variables
Evidence of relationship → to what extent does explanatory variable affect response?
Focus on proportions: 2 approaches
- Compare confidence intervals for population proportion in response of interest (one interval for each explanatory group).
- Set up confidence interval for difference between population proportions in response of interest, 1st group minus 2nd group.

Example: Confidence Intervals for 2 Proportions
- Background:
  - Non-smokers: 95% CI for pop. prop. alcoholic (0.005, 0.021)
  - Smokers: 95% CI for pop. prop. alcoholic (0.09, 0.17)
- Question: What do the intervals suggest about relationship between smoking and alcoholism?
- Response: Overlap? ____ → Relationship between smoking and alcoholism? ____ → ____ likely to be alcoholic if a smoker.

Example: Difference between 2 Proportions (CI)
- Background: 95% CI for difference between population proportions alcoholic, smokers minus non-smokers, is (0.088, 0.146).
- Question: What does the interval suggest about relationship between smoking and alcoholism?
- Response: Entire interval above zero → suggests smokers significantly more likely to be alcoholic → there is a relationship.

Lecture Summary (Inference for Cat→Cat: Chi-Square)
- Hypotheses in terms of variables or parameters
- Inference based on proportions or counts
- Chi-square test
  - Table of expected counts
  - Chi-square statistic, chi-square distribution
  - Relating z and chi-square for 2×2 table
  - Relative size of chi-square statistic
  - Explanatory/response roles in chi-square test
- Guidelines for use of chi-square
- Role of sample size
- Confidence intervals for 2 categorical variables

Lecture 6
Quantitative Variables: Displays, Begin Summaries

- Summarize with Shape, Center, Spread
- Displays: Stemplots, Histograms
- Five Number Summary, Outliers, Boxplots

Looking Back: Review
4 Stages of Statistics
- Data Production (discussed in Lectures 1-4)
- Displaying and Summarizing
  - Single variables: 1 cat. (Lecture 5)
  - Relationships between 2 variables
- Probability
- Statistical Inference

Definition
Distribution: tells all possible values of a variable and how frequently they occur.

Example: Issues to Consider
- Background: Intro stat students' earnings the year before:

| Name | Earned |
|---|---|
| Brittany | 3 |
| Dominique | 7 |
| Adam | 1 |

- Questions:
  - What population do the data represent?
  - Are responses unbiased?
  - How to summarize?
  - Sample average 3776 → population average < 5000?
- Responses:
  - Data represent all students at that university if ____
  - Responses unbiased if ____
  - How to summarize? Task at hand.
  - Sample average 3776 → population average < 5000? Looking Ahead: This is an inference question.

Definitions
Summarize values of a quantitative variable by telling shape, center, spread.
- Shape: tells which values tend to be more or less common
- Center: measure of what is typical in the distribution of a quantitative variable
- Spread: measure of how much the distribution's values vary

Definitions
- Symmetric distribution: balanced on either side of center
- Skewed distribution: unbalanced (lopsided)
  - Skewed left: has a few relatively low values
  - Skewed right: has a few relatively high values
- Outliers: values noticeably far from the rest
- Unimodal: single-peaked
- Bimodal: two-peaked
- Uniform: all values equally common (flat shape)
- Normal: a particular symmetric bell-shape

Displays of a Quantitative Variable
Displays help see the shape of the distribution.
- Stemplot
  - Advantage: most detail
  - Disadvantage: impractical for large data sets
- Histogram
  - Advantage: works well for any size data set
  - Disadvantage: some detail lost
- Boxplot
  - Advantage: shows outliers, makes comparisons C→Q
  - Disadvantage: much detail lost

Definition
Stemplot: vertical list of stems, each followed by horizontal list of one-digit leaves.

Example: Constructing a Stemplot
- Background: Masses (in 1000 kg) of 20 dinosaurs: 0.0 0.0 0.1 0.2 0.4 0.6 0.7 0.7 1.0 1.1 1.1 1.2 1.5 1.7 1.7 1.8 2.9 3.2 5.0 5.6
- Question: Display with stemplot. What does it tell us about the shape?
- Response: Do not skip the 4 stem. Why?
  - Long tail (skewed)
  - 1 peak
  - Most below 2000 kg, a few unusually heavy

Modifications to Stemplots
- Too few stems? Split
  - Split in 2: 1st stem gets leaves 0-4, 2nd gets 5-9
  - Split in 5: 1st stem gets leaves 0-1, 2nd gets 2-3, etc.
  - Split in 10: 1st gets 0, ..., 10th gets 9
- Too many stems? Truncate last digit(s)

Example: Splitting Stems
- Background: Credits taken by 14 "other" students: 4 7 11 11 11 13 13 14 14 15 17 17 17 18
- Questions:
  - What shape would we guess for "other" (non-traditional) students?
  - How to construct stemplot to make shape clear?
- Responses:
  - Expect shape skewed ____ due to ____
  - Stemplot: 1st attempt has too few stems:
    - 0 | 4 7
    - 1 | 1 1 1 3 3 4 4 5 7 7 7 8

Example: Truncating Digits
- Background: Minutes spent on computer day before: 0 10 20 30 30 30 30 45 45 60 60 60 67 90 100 120 200 240 300 420
- Question: How to construct stemplot to make shape clear?
- Response: Stems 0 to 42 are too many: truncate last digit, work with 100's (stems) and 10's (leaves). Skewed: most times less than 100 minutes, but a few had unusually long times.

Displays of a Quantitative Variable
- Stemplot
- Histogram
- Boxplot

Definition
Histogram: to display quantitative values
- Divide range of data into intervals of equal width.
- Find count or percent or proportion in each.
- Use horizontal axis for range of data values, vertical axis for count/percent/proportion in each.

Example: Constructing a Histogram
- Background: Prices of 12 used upright pianos: 100 450 500 650 695 1100 1200 1200 1600 2100 2200 2300
- Question: Construct a histogram for the data. What does it tell us about the shape?
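The counting step in the histogram definition can be sketched in a few lines for the piano prices. The interval width of 500, the starting point of 0, and the convention that each interval includes its left endpoint are assumptions made for this illustration, and the function name is hypothetical.

```python
prices = [100, 450, 500, 650, 695, 1100, 1200, 1200, 1600, 2100, 2200, 2300]


def histogram_counts(values, width, start=0):
    """Count values in equal-width intervals [left, left + width).

    Each interval includes its left endpoint and excludes its right one,
    so a value on a boundary always goes into the higher interval.
    """
    n_bins = (max(values) - start) // width + 1
    counts = [0] * n_bins
    for v in values:
        counts[(v - start) // width] += 1
    return counts


counts = histogram_counts(prices, width=500)
for i, count in enumerate(counts):
    left = i * 500
    # one '#' per piano, as a rough text version of the bars
    print(f"{left:>4} to under {left + 500:<4} | {'#' * count} ({count})")
```

Whatever boundary convention is chosen, the important point is to apply it consistently to every value that falls exactly on an interval endpoint.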
Background: Prices of 12 used upright pianos: 100 450 650 695 1100 1200 1200 1600 2100 2200 2300
Response: We opted to put 500 as left endpoint of 2nd interval; be consistent: a price of 1000 would go in 3rd interval, not 2nd.

Definitions (Review)
- Shape: tells which values tend to be more or less common
- Center: measure of what is typical in the distribution of a quantitative variable
- Spread: measure of how much the distribution's values vary

Definitions
- Median: a measure of center
  - the middle for odd number of values
  - average of middle two for even number of values
- Quartiles: measures of spread
  - 1st Quartile (Q1) has one-fourth of data values at or below it (middle of smaller half)
  - 3rd Quartile (Q3) has three-fourths of data values at or below it (middle of larger half)
By hand, for odd number of values, omit median to find quartiles.

Definitions
- Percentile: value at or below which a given percentage of a distribution's values fall
  A Closer Look: Q1 is 25th percentile, Q3 is 75th percentile
- Range: difference between maximum and minimum values
- Interquartile range: tells spread of middle half of data values, written IQR = Q3 - Q1

Ways to Measure Center and Spread
- Five Number Summary: 1. Minimum 2. Q1 3. Median 4. Q3 5. Maximum
- Mean and Standard Deviation (more useful but less straightforward to find)

Example: Finding 5 Number Summary and IQR
- Background: Credits taken by 14 nontraditional students: 4 7 11 11 11 13 13 14 14 15 17 17 17 18
- Question: What are the Five Number Summary, range, and IQR?
- Response: 1. Minimum 4 2. Q1 11 3. Median 13.5 4. Q3 17 5. Maximum 18
  Range is 18 - 4 = 14; IQR is 17 - 11 = 6.

Displays of a Quantitative Variable
- Stemplot
- Histogram
- Boxplot

Definition
The 1.5-Times-IQR Rule identifies outliers:
- below Q1 - 1.5(IQR) considered low outlier
- above Q3 + 1.5(IQR) considered high outlier
[Figure: 1.5-Times-IQR Rule to Identify Outliers; number line marking Q1 and Q3, with low outliers more than 1.5 × IQR below Q1 and high outliers more than 1.5 × IQR above Q3]

Definition
A boxplot displays median, quartiles, and extreme values, with special treatment for outliers:
1. Bottom whisker to minimum non-outlier
2. Bottom of box at Q1
3. Line through box at median
4. Top of box at Q3
5. Top whisker to maximum non-outlier
Outliers are marked individually beyond the whiskers.

Example: Constructing Boxplot
- Background: Credits taken by 14 nontraditional students had 5 No. Summary 4, 11, 13.5, 17, 18
- Questions: Are there outliers? How do we construct a boxplot?
- Response:
  - IQR = Q3 - Q1 = 17 - 11 = 6, so 1.5 × IQR = 9
  - Q1 - 1.5 × IQR = 11 - 9 = 2: no low outliers (minimum is 4)
  - Q3 + 1.5 × IQR = 17 + 9 = 26: no high outliers (maximum is 18)
  - Boxplot: Minimum 4, Q1 11, Median 13.5, Q3 17, Maximum 18
  - Credits are typically about 13.5; the middle half lies between 11 and 17; shape is left-skewed.
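The by-hand method above (median of each half, then the 1.5-Times-IQR Rule) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, and the quartile convention is the by-hand one described in these notes (omit the median when the number of values is odd), which may differ slightly from what statistical software reports.

```python
def median(sorted_values):
    """Middle value for odd n, average of the middle two for even n."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the by-hand rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]              # smaller half
    upper = data[half + (n % 2):]    # larger half (median omitted if n is odd)
    return (data[0], median(lower), median(data), median(upper), data[-1])


def outlier_fences(q1, q3):
    """Cutoffs from the 1.5-Times-IQR Rule."""
    iqr = q3 - q1
    return (q1 - 1.5 * iqr, q3 + 1.5 * iqr)


# Credits taken by the 14 nontraditional students in the example.
credits = [4, 7, 11, 11, 11, 13, 13, 14, 14, 15, 17, 17, 17, 18]
summary = five_number_summary(credits)
minimum, q1, med, q3, maximum = summary
low_fence, high_fence = outlier_fences(q1, q3)
outliers = [c for c in credits if c < low_fence or c > high_fence]

print("Five Number Summary:", summary)
print("Range:", maximum - minimum, "IQR:", q3 - q1)
print("Outlier cutoffs:", low_fence, high_fence, "Outliers:", outliers)
```

For the credits data this reproduces the summary 4, 11, 13.5, 17, 18 and finds no values beyond the cutoffs, so both whiskers extend to the actual minimum and maximum.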
Lecture Summary (Quantitative Displays, Begin Summaries)
- Display: stemplot, histogram
- Shape: Symmetric or skewed? Unimodal? Normal?
- Center and Spread
  - median and range, IQR
  - identify outliers
  - display with boxplot

Lecture 35
Two Quantitative Variables: Interval Estimates
- PI for Individual Response, CI for Mean Response
- Explanatory Value Close to or Far from Mean
- Approximating Intervals by Hand
- Width of PI vs. CI
- Guidelines for Regression Inference

Looking Back: Review
4 Stages of Statistics
- Data Production (discussed in Lectures 1-4)
- Displaying and Summarizing (Lectures 5-12)
- Probability (discussed in Lectures 13-20)
- Statistical Inference
  - 1 categorical (discussed in Lectures 21-23)
  - 1 quantitative (discussed in Lectures 24-27)
  - cat and quan: paired, 2-sample, several-sample (Lectures 28-31)
  - 2 categorical (discussed in Lectures 32-33)
  - 2 quantitative

Correlation and Regression (Review)
Relationship between 2 quantitative variables
- Display with scatterplot
- Summarize:
  - Form: linear or curved
  - Direction: positive or negative
  - Strength: strong, moderate, weak
If form is linear, correlation r tells direction and strength.
Also, equation of least-squares regression line lets us predict a response $\hat{y}$ for any explanatory value x.

Population Model; Parameters and Estimates
Summarize linear relationship between sampled x and y values with line $\hat{y} = b_0 + b_1 x$, minimizing sum of squared residuals $y_i - \hat{y}_i$. Typical residual size is
$s = \sqrt{\dfrac{(y_1 - \hat{y}_1)^2 + \cdots + (y_n - \hat{y}_n)^2}{n - 2}}$
Model for population relationship is $\mu_y = \beta_0 + \beta_1 x$, and responses vary normally with standard deviation $\sigma$.
- Use $b_0$ to estimate $\beta_0$
- Use $b_1$ to estimate $\beta_1$
- Use $s$ to estimate $\sigma$
Regression Null Hypothesis (Review)
- $H_0: \beta_1 = 0$: no population relationship between x and y
- Test statistic $t = \dfrac{b_1 - 0}{SE_{b_1}}$
- P-value is probability of t this extreme if $H_0$ true (where t has n-2 df)

Confidence Interval for Slope (Review)
Confidence interval for $\beta_1$ is $b_1 \pm \text{multiplier}(SE_{b_1})$, where multiplier is from t dist. with n-2 df.
If n is large, 95% confidence interval is $b_1 \pm 2(SE_{b_1})$.
If CI does not contain 0, reject $H_0$, conclude x and y are related.

Interval Estimates in Regression
Seek Prediction and Confidence Intervals for
- Individual response to given x value (PI)
  - For large n, approx. 95% PI: $\hat{y} \pm 2s$
- Mean response to subpopulation with given x value (CI)
  - For large n, approx. 95% CI: $\hat{y} \pm 2\dfrac{s}{\sqrt{n}}$
Both intervals centered at predicted y-value $\hat{y}$.
These approximations may be poor if n is small or if given x value is far from average x value.

Example: An Interval Estimate
- Background: Property owner thinks reassessed value 40,000 of his 4,000 sq. ft. lot is too high. Sizes for random sample of 29 local lots have mean 5,619 sq. ft., values have mean 34,624, r = 0.927, regression equation $\hat{y} = 1551 + 5.885x$, s = 6,682.
- Question: Is there evidence that his value is significantly higher than usual?
- Response: First note his lot is smaller than average but valued higher than average: some cause for concern because the relationship is strong and positive. But it's not perfect, so we seek statistical evidence of an unusually high value for the lot's size.
[Figure: scatterplot of lot value versus lot size for the sampled lots]
- Response: Predict $\hat{y} = 1551 + 5.885(4000) = 25{,}091$. Approximate range of plausible values for individual 4,000 sq. ft. lot is $25{,}091 \pm 2(6{,}682) = (11{,}727,\ 38{,}455)$. Suggests 40,000 is too high.
- Response: Software gives precise prediction interval:

  Predicted Values for New Observations
  New Obs   Fit     SE Fit   95.0% CI          95.0% PI
  1         25094   1446     (22127, 28060)    (11066, 39121)

  Values of Predictors for New Observations
  New Obs   Size
  1         4000

  Note: Fit is predicted value for size = 4000. Relevant here is PI (for individual), not CI (for mean).

Prediction Interval vs. Confidence Interval
- Prediction interval corresponds to 68-95-99.7 Rule for data: where an individual is likely to be.
  - PI is wider: individuals vary a great deal
- Confidence interval is inference about mean: range of plausible values for mean of subpopulation.
  - CI is narrower: can estimate mean with more precision
- Both PI and CI in regression utilize info about x to be more precise about y (PI) or mean y (CI).

Examples: Series of Estimation Problems
- Based on sample of male weights, estimate (no regression needed)
  - weight of individual male
  - mean weight of all males
- Based on sample of male hts and weights, estimate
  - weight of individual male, 71 inches tall
  - mean weight of all 71-inch-tall males
  - weight of individual male, 76 inches tall
  - mean weight of all 76-inch-tall males
Examples use data from sample of college males.

Example: Estimate Individual Wt, No Ht Info
- Background: A sample of male weights have mean 170.8, standard deviation 33.1. Shape of distribution is close to normal.
- Question: What interval should contain the weight of an individual male?
- Response: Need to know distribution of weights is approximately normal to apply 68-95-99.7 Rule. Approx. 95% of individual male weights in interval $170.8 \pm 2(33.1) = (104.6,\ 237.0)$.
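As a quick check on the arithmetic, the 68-95-99.7 Rule interval for an individual weight can be computed directly. This is a minimal sketch; the helper name `approx_95_interval` is invented for illustration, and the rule only applies under the stated assumption that the weights are approximately normal.

```python
def approx_95_interval(center, spread):
    """Approximate 95% interval: center plus or minus 2 spreads."""
    return (center - 2 * spread, center + 2 * spread)


# Sample mean and standard deviation of male weights (lbs) from the example.
mean_wt = 170.8
sd_wt = 33.1

low, high = approx_95_interval(mean_wt, sd_wt)
print(f"About 95% of individual male weights fall in ({low:.1f}, {high:.1f})")
```

The same helper works for any approximately normal variable once a center and a spread are supplied.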
Example: Estimate Mean Wt, No Ht Info
- Background: A sample of 162 male weights have mean 170.8, standard deviation 33.1.
- Questions: What interval should contain the mean weight of all males? How does it compare to the interval for an individual male's weight, $170.8 \pm 2(33.1) = (104.6,\ 237.0)$?
- Responses: Need to know sample size n to construct approximate 95% confidence interval for mean: $170.8 \pm 2\dfrac{33.1}{\sqrt{162}} = 170.8 \pm 5.2 = (165.6,\ 176.0)$. Interval for mean involves division by square root of n, so it is narrower than interval for individual.

Examples: Series of Estimation Problems
- Based on sample of male weights, estimate
  - weight of individual male
  - mean weight of all males
- Based on sample of male heights and weights, est. (need regression)
  - weight of individual male, 71 inches tall
  - mean weight of all 71-inch-tall males
  - weight of individual male, 76 inches tall
  - mean weight of all 76-inch-tall males
[Figure: scatterplot of male weight (WT, about 100 to 300 lbs) versus height (HT, 65 to 80 inches)]

Example: Find Individual Wt, Given Average Ht
- Background: Regression of male weight on height has r = +0.45, p = 0.000: strong evidence of moderate positive relationship. Reg. line $\hat{y} = -188 + 5.08x$ and s = 29.6 lbs. ($s_y$ = 33.1 lbs., mean ht. about 71.) Got interval estimates for x = 71:

  New Obs   Fit      SE Fit   95.0% CI            95.0% PI
  1         172.83   2.35     (168.20, 177.47)    (114.20, 231.47)

- Questions:
  - How much heavier is a sampled male, for each additional inch in height?
  - Why is $s < s_y$?
  - What interval should contain the weight of an individual 71-inch-tall male?
- Responses:
  - For each additional inch in height, a male weighs about 5 lbs. more (slope).
  - $s < s_y$ because wts. vary less about line than about mean.
  - Software: PI for x = 71 is (114.20, 231.47).

Example: Est. Individual Wt, Given Average Ht
- Background: as above; s = 29.6 lbs. and software PI for x = 71 is (114.20, 231.47).
- Questions: How do we approximate interval estimate for wt. of an individual 71-inch-tall male by hand? Is our approximation close to the true interval?
- Responses: Predict $\hat{y}$ for x = 71; approximate PI is $\hat{y} \pm 2s = 172.8 \pm 2(29.6) = (113.6,\ 232.0)$. Close to the software interval.

Example: Est. Mean Wt, Given Average Ht
- Background: Regression of 162 male wts. on hts. has r = +0.45, p = 0.000: strong evidence of moderate positive relationship. Reg. line $\hat{y} = -188 + 5.08x$ and s = 29.6 lbs. Software CI for x = 71 is (168.20, 177.47).
- Questions: What interval should contain mean weight of all 71-inch-tall males? How do we approximate the interval by hand? Is it close?
- Responses: Need to know sample size to get margin of error for mean.
  - Software: CI for x = 71 is (168.20, 177.47).
  - Predict $\hat{y}$ for x = 71; approximate CI is $\hat{y} \pm 2\dfrac{s}{\sqrt{n}} = 172.8 \pm 2\dfrac{29.6}{\sqrt{162}} = 172.8 \pm 4.7 = (168.1,\ 177.5)$.
  - Close to the software interval.

Example: Est. Wt, Given Tall vs. Average Ht
- Background: Regression of male wt. on ht. produced equation $\hat{y} = -188 + 5.08x$. For height 71 inches, estimated weight is $\hat{y} = -188 + 5.08(71) = 172.7$.
- Question: How much heavier will our estimate be for height 76 inches?
- Response: Since slope is about 5, predict 5 more lbs. for each additional inch: 25 more lbs. for 76, which is 5 additional inches. Instead of weight about 173, estimate weight about 198.
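The by-hand approximations in these examples can be collected into a short Python sketch. The function names are hypothetical, and the coefficients are the rounded values quoted on the slides, so the fitted value comes out as 172.7 rather than the software's 172.83. As the notes caution, the "2s" and "2s over root n" shortcuts are only rough, and are most trustworthy for large n and for x near the mean height (about 71 inches).

```python
import math

# Rounded values quoted in the weight-on-height example.
INTERCEPT = -188.0   # b0
SLOPE = 5.08         # b1, lbs per additional inch of height
S = 29.6             # typical residual size s, in lbs
N = 162              # sample size


def predict_weight(height_in):
    """Predicted weight from the least-squares line."""
    return INTERCEPT + SLOPE * height_in


def approx_prediction_interval(height_in):
    """Rough 95% PI for an individual: fit plus or minus 2s."""
    fit = predict_weight(height_in)
    return (fit - 2 * S, fit + 2 * S)


def approx_confidence_interval(height_in):
    """Rough 95% CI for the mean response: fit plus or minus 2s/sqrt(n)."""
    fit = predict_weight(height_in)
    margin = 2 * S / math.sqrt(N)
    return (fit - margin, fit + margin)


fit_71 = predict_weight(71)
pi_71 = approx_prediction_interval(71)
ci_71 = approx_confidence_interval(71)
extra_for_76 = predict_weight(76) - predict_weight(71)

print(f"Predicted weight at 71 in: {fit_71:.1f}")
print(f"Approx. 95% PI at 71 in: ({pi_71[0]:.1f}, {pi_71[1]:.1f})")
print(f"Approx. 95% CI at 71 in: ({ci_71[0]:.1f}, {ci_71[1]:.1f})")
print(f"Extra predicted weight at 76 in vs. 71 in: {extra_for_76:.1f}")
```

Comparing the printed intervals with the software output for x = 71, (114.20, 231.47) for the PI and (168.20, 177.47) for the CI, shows how close the shortcuts come when the height is near average.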
Example: Find Individual Wt, Given Tall Ht
- Background: Regression of male weight on height has r = +0.45, p = 0.000: strong evidence of moderate positive relationship. Reg. line $\hat{y} = -188 + 5.08x$ and s = 29.6 lbs. Got interval estimates for x = 76:

  New Obs   Fit      SE Fit   95.0% CI            95.0% PI
  1         198.21   4.88     (188.58, 207.84)    (138.97, 257.45)

- Questions:
  - What interval should contain the weight of an individual male, 76 inches tall?
  - How does the interval compare to the one for ht. = 71?

  New Obs   Fit      SE Fit   95.0% CI            95.0% PI
  1         172.83   2.35     (168.20, 177.47)    (114.20, 231.47)

- Responses:
  - Software: PI for x = 76 is (138.97, 257.45).
  - Predicted wt. is about 25 lbs. more for x = 76 than for 71.

Example: Est. Individual Wt, Given Tall Ht
- Background: as above; software PI for x = 76 is (138.97, 257.45).
- Questions: How do we approximate the prediction interval by hand? Is it close to the true interval?
- Responses: Predict $\hat{y}$ for x = 76; approximate PI is $\hat{y} \pm 2s = 198.2 \pm 2(29.6) = (139.0,\ 257.4)$. Close to the software interval.

Example: Est. Mean Wt, Given Tall Ht
- Background: Regression of 162 male wts. on hts. has r = +0.45, p = 0.000: strong evidence of moderate positive relationship. Reg. line $\hat{y} = -188 + 5.08x$ and s = 29.6 lbs. Software CI for x = 76 is (188.58, 207.84).
- Questions: What interval should contain mean weight of all males who are 76 inches tall? How do we approximate the interval by hand? Is it close?
- Responses:
  - Refer to CI: (188.58, 207.84).
  - Predict $\hat{y}$ for x = 76; approximate CI is $\hat{y} \pm 2\dfrac{s}{\sqrt{n}} = 198.2 \pm 2\dfrac{29.6}{\sqrt{162}} = 198.2 \pm 4.7 = (193.5,\ 202.9)$.
  - Not close: the true CI is noticeably wider, because 76 inches is far from the mean height.

Interval Estimates in Regression (Review)
Seek interval estimates for
- Individual response to given x value (PI)
  - For large n, approx. 95% PI: $\hat{y} \pm 2s$
- Mean response to subpopulation with given x value (CI)
  - For large n, approx. 95% CI: $\hat{y} \pm 2\dfrac{s}{\sqrt{n}}$
- Intervals approximately correct only for x values close to mean; otherwise wider.
- Especially CI much wider for x far from mean.

PI and CI for x Close to or Far From Mean
[Figure: scatterplot of WT (male) versus HT (male), heights 65 to 80 inches, with regression line, 95% CI band, and 95% PI band]
- Margin of error in PI for individual is approximately 2s.
- Margin of error in CI for mean is more than $2\dfrac{s}{\sqrt{n}}$ for heights far from mean.

Summary of Example Intervals
[Figure: six intervals drawn on a weight axis from 100 to 260 lbs]
- No height info used: 95% prediction interval for individual and 95% confidence interval for mean; intervals centered at sample mean weight 170.83
- Height 71 inches: 95% prediction interval for individual and 95% confidence interval for mean; intervals centered at weight 172.84 predicted for height 71
- Height 76 inches: 95% prediction interval for individual and 95% confidence interval for mean; intervals centered at weight 198.21 predicted for height 76
- CI always narrower than PI
Summary of Example Intervals (continued)
- CI and PI can be narrower if x info given
- CI and PI centered at heavier wt. for taller ht.
- CI and PI wider for wt. at ht. far from average

Guidelines for Regression Inference
- Relationship must be linear
- Need random sample of independent observations
- Sample size must be large enough to offset non-normality
- Need population at least 10 times sample size
- Constant spread about regression line
- Outliers/influential observations may impact results
- Confounding variables should be separated out

Lecture 19
Sampling Distributions: Proportions
- Typical Inference Problem
- Definition of Sampling Distribution
- 3 Approaches to Understanding Sampling Dist.
- Applying 68-95-99.7 Rule

Looking Back: Review
4 Stages of Statistics
- Data Production (discussed in Lectures 1-4)
- Displaying and Summarizing (Lectures 5-12)
- Probability
  - Finding Probabilities (discussed in Lectures 13-14)
  - Random Variables (discussed in Lectures 15-18)
  - Sampling Distributions: Proportions, Means
- Statistical Inference

Typical Inference Problem
If sample of 100 students has 0.13 left-handed, can you believe population proportion is 0.10?
Solution Method: Assume (temporarily) that population proportion is 0.10, find probability of sample proportion as high as 0.13. If it's too improbable, we won't believe population proportion is 0.10.

Key to Solving Inference Problems
For a given population proportion p and sample size n, need to find probability of sample proportion $\hat{p}$ in a certain range: need to know sampling distribution of $\hat{p}$.
Note: $\hat{p}$ can denote a single statistic or a random variable.

Definition
Sampling distribution of sample statistic tells probability distribution of values taken by the statistic in repeated random samples of a given size.
Looking Back: We summarize a probability distribution by reporting its center, spread, shape.

Behavior of Sample Proportion (Review)
For random sample of size n from population with p in category of interest, sample proportion $\hat{p}$ has
- mean p
- standard deviation $\sqrt{\dfrac{p(1-p)}{n}}$
- shape approximately normal for large enough n
Looking Back: Can find normal probabilities using 68-95-99.7 Rule, etc.

Rules of Thumb (Review)
- Population at least 10 times sample size n (formula for standard deviation of $\hat{p}$ approximately correct even if sampled without replacement)
- np and n(1-p) both at least 10 (guarantees $\hat{p}$ approximately normal)
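The standard deviation formula and the two rules of thumb translate directly into code. The sketch below uses invented function names and plugs in the left-handedness setting from the Typical Inference Problem (assumed p = 0.10, n = 100); the population size of 30,000 is a made-up figure used only to exercise the "10 times sample size" check.

```python
import math


def sample_proportion_sd(p, n):
    """Standard deviation of the sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)


def normal_approx_ok(p, n):
    """Rule of thumb: np and n(1-p) both at least 10.

    A tiny tolerance guards against floating-point rounding when a
    product lands exactly on 10.
    """
    tolerance = 1e-9
    return n * p >= 10 - tolerance and n * (1 - p) >= 10 - tolerance


def population_large_enough(population_size, n):
    """Rule of thumb: population at least 10 times the sample size."""
    return population_size >= 10 * n


p_assumed = 0.10   # temporarily assumed population proportion
n_students = 100   # sample size

sd = sample_proportion_sd(p_assumed, n_students)
print(f"Mean of sample proportion: {p_assumed}")
print(f"Standard deviation of sample proportion: {sd:.3f}")
print("Normal approximation reasonable:", normal_approx_ok(p_assumed, n_students))
# Hypothetical population of 30,000 students, for illustration only.
print("Population large enough:", population_large_enough(30000, n_students))
```

With these inputs the standard deviation is 0.03, so an observed sample proportion of 0.13 sits one standard deviation above the assumed mean of 0.10.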
Understanding Dist. of Sample Proportion
3 Approaches:
1. Intuition
2. Hands-on Experimentation
3. Theoretical Results
Looking Ahead: We'll find that our intuition is consistent with experimental results, and both are confirmed by mathematical theory.

Example: Intuit Behavior of Sample Proportion
- Background: Population proportion of blue M&M's is p = 1/6 = 0.17.
- Question: How does sample proportion $\hat{p}$ behave for repeated random samples of size n = 25 (a teaspoon)?
- Experiment: sample teaspoons of M&Ms, record sample proportion of blues on sheet and in notes (need a calculator).
- Looking Ahead: The shape of the underlying distribution will play a role in the shape of $\hat{p}$.
  [Figure: underlying distribution for n = 1, with probabilities 5/6 and 1/6]
- Response: For repeated random samples of size 25, $\hat{p}$ is a quan. RV; summarize with
  - Center: Some $\hat{p}$'s more than 0.17, others less; should balance out so mean of $\hat{p}$'s is p = 0.17.
  - Spread of $\hat{p}$'s (s.d.):
    - For n = 6, could easily get $\hat{p}$ as low as __, as high as __.
    - For n = 25, unlikely to get $\hat{p}$ as low as __, as high as __.
  - Shape: $\hat{p}$ close to 0.17 most common, far from 0.17 in either direction increasingly less likely.
Example: Intuit Behavior of Sample Proportion
- Background: Population proportion of blue M&M's is p = 0.17.
- Question: How does sample proportion $\hat{p}$ behave for repeated random samples of size
  - n = 25 (a teaspoon)?
  - n = 75 (a Tablespoon)?
- Experiment: sample Tablespoons of M&Ms, record sample proportion of blues on sheet and in notes.
- Response: For repeated random samples of size 75, $\hat{p}$ is a quan. RV; summarize with
  - Center: Some $\hat{p}$'s more than 0.17, others less; should balance out so mean of $\hat{p}$'s is p = 0.17.
  - Spread: Compared to spread of samples of 25, $\hat{p}$ for samples of size 75 will have smaller standard deviation.
  - Shape: $\hat{p}$'s clumped near 0.17, taper at tails.

Understanding Sample Proportion
3 Approaches:
1. Intuition
2. Hands-on Experimentation
3. Theoretical Results

Central Limit Theorem
Approximate normality of sample statistic for repeated random samples of a large enough size is cornerstone of inference theory.
- Makes intuitive sense.
- Can be verified with experimentation.
- Proof requires higher-level mathematics; result called Central Limit Theorem.
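The hands-on experiment with teaspoons and Tablespoons can also be imitated on a computer. The following is only a simulation sketch: it assumes each candy is independently blue with probability 0.17, uses Python's standard `random` module with a fixed seed so the run is repeatable, and the function name and the choice of 5,000 repetitions are arbitrary illustration choices rather than anything prescribed in the lecture.

```python
import random
import statistics


def simulate_sample_proportions(p, n, repetitions, seed=0):
    """Draw many random samples of size n and return each sample proportion."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(repetitions):
        blues = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(blues / n)
    return proportions


p_blue = 0.17
props_25 = simulate_sample_proportions(p_blue, 25, 5000)   # "teaspoons"
props_75 = simulate_sample_proportions(p_blue, 75, 5000)   # "Tablespoons"

for label, props in (("n = 25", props_25), ("n = 75", props_75)):
    center = statistics.mean(props)
    spread = statistics.pstdev(props)
    print(f"{label}: mean of sample proportions = {center:.3f}, s.d. = {spread:.3f}")
```

In a run like this the simulated means should land near 0.17 for both sample sizes, with visibly less spread for n = 75 than for n = 25, which is the pattern intuition suggests and theory confirms. A histogram of each list would show the shape as well.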
2mm Nmnmmv Eiemenuwsuusucs mm tithe gimme HQle F0 sample of size n from population w1t p in category of interest sample proportion 15 has I mean p 9 13 is unbiased estimator of p sample must be random 2mm mm minim ammw mm mm um aw mm Hg 25 Elementary Statistics Looking at the Big Picture Behavior of Sample Proportion Implications C 2007 Nancy Pfenning proportion 13 g has I meanp 1 p p C ZEIEI7 Nancy Pfenning Elementary Statistics Luuking althe Big Picture population size must be at least 10n For random sample of size n fro with p in category of interest samp e 39 Standard deVlatlon n in denominator 9 13 has less spread for larger samples LiBZE mi Behavior of Sample Proportion Implications For random sample of size n from population with p in category of interest sample proportion 13 has I mean p I standard deviation p in p I shape approx normal for large enough n 9can find probability that sample proportion takes value in given interval C 2mm Nancy Pfenning Elementary statisties Luuking atthe Big Picture Li a 27 i 2 Example Behavior of Sample Proportion MampM s is p0 l 7 n25 how does 13 behave C ZEIEI7 Nancy Pfenning Elementary Statisties Luuking atthe Big Picture I Background Population proportion of blue I Question For repeated random samples of LiBZE Elementary Statistics Looking at the Big Picture Example Behavior of Sample Proportion I Background Population proportion of blue MampM s isp0l7 El Response For repeated random samples of n25 13 has I Center mean I Spread standard deviation I Shape not really normal because C ZEIEI7 Nancy Pfenning Elementary Statistics Luuking atthe Big Picture Li a an s Example Sample Proportion for Larger n I Background Population proportion of blue MampM s is p0 l 7 III Question For repeated random samples of 1175 how does 13 behave e mi Nancy Pfenning Eiementary Statistics Looking althe Big Picture Li a at C 2007 Nancy Pfenning gt Example Sample Proportion for Larger n I Background Population proportion of blue MampM s isp017 II 
- Response: For repeated random samples of $n = 75$, $\hat{p}$ has
  - Center: mean $p = 0.17$
  - Spread: standard deviation $\sqrt{0.17(1-0.17)/75} \approx 0.043$
  - Shape: approximately normal, because $np = 75(0.17) = 12.75$ and $n(1-p) = 75(0.83) = 62.25$ are both at least 10

68-95-99.7 Rule for Normal R.V. (Review)

Sample at random from a normal population; for the sampled value $X$ (a R.V.), the probability is
- 68% that $X$ is within 1 standard deviation of the mean
- 95% that $X$ is within 2 standard deviations of the mean
- 99.7% that $X$ is within 3 standard deviations of the mean

68-95-99.7 Rule for Sample Proportion

For sample proportions $\hat{p}$ taken at random from a large population with underlying $p$, the probability is
- 68% that $\hat{p}$ is within $1\sqrt{p(1-p)/n}$ of $p$
- 95% that $\hat{p}$ is within $2\sqrt{p(1-p)/n}$ of $p$
- 99.7% that $\hat{p}$ is within $3\sqrt{p(1-p)/n}$ of $p$

Example: Sample Proportion for n = 75, p = 0.17
- Background: Population proportion of blue M&Ms is $p = 0.17$. For random samples of $n = 75$, $\hat{p}$ is approximately normal with mean 0.17 and s.d. $\sqrt{0.17(1-0.17)/75} = 0.043$.
- Question: What does the 68-95-99.7 Rule tell us about the behavior of $\hat{p}$?
- Response: The probability is approximately
  - 0.68 that $\hat{p}$ is within 1(0.043) of 0.17: in (0.127, 0.213)
  - 0.95 that $\hat{p}$ is within 2(0.043) of 0.17: in (0.084, 0.256)
  - 0.997 that $\hat{p}$ is within 3(0.043) of 0.17: in (0.041, 0.299)

90-95-98-99 Rule (Review)

For standard normal $Z$, the probability is
- 0.90 that $Z$ takes a value in the interval $(-1.645, +1.645)$
- 0.95 that $Z$ takes a value in the interval $(-1.960, +1.960)$
- 0.98 that $Z$ takes a value in the interval $(-2.326, +2.326)$
- 0.99 that $Z$ takes a value in the interval $(-2.576, +2.576)$

Example: 90-95-98-99 Rule for n = 75, p = 0.17
- Background: Population proportion of blue M&Ms is $p = 0.17$. For random samples of $n = 75$, $\hat{p}$ is approximately normal with mean 0.17 and s.d. $\sqrt{0.17(1-0.17)/75} = 0.043$.
- Question: What does the 90-95-98-99 Rule tell us about the behavior of $\hat{p}$?
- Response: The probability is approximately
  - 0.90 that $\hat{p}$ is within 1.645(0.043) of 0.17: in (0.099, 0.241)
  - 0.95 that $\hat{p}$ is within 1.960(0.043) of 0.17: in (0.086, 0.254)
  - 0.98 that $\hat{p}$ is within 2.326(0.043) of 0.17: in (0.070, 0.270)
  - 0.99 that $\hat{p}$ is within 2.576(0.043) of 0.17: in (0.059, 0.281)

Typical Inference Problem (Review)

If a sample of 100 students has 0.13 left-handed, can you believe the population proportion is 0.10?

Solution Method: Assume (temporarily) that the population proportion is 0.10, and find the probability of a sample proportion as high as 0.13. If it's too improbable, we won't believe the population proportion is 0.10.

Example: Testing Assumption About p
- Background: Earlier we asked, "If a sample of 100 students has 0.13 left-handed, can you believe the population proportion is 0.10?"
- Response: If $p = 0.10$, $\hat{p}$ for $n = 100$ has mean 0.10, s.d. $\sqrt{0.10(1-0.10)/100} = 0.03$, and shape approximately normal since $100(0.10)$ and $100(1-0.10)$ are both $\geq 10$. According to the 68-95-99.7 Rule, the probability is $(1-0.68)/2 = 0.16$ that $\hat{p}$ would take a value of 0.13 (1 s.d. above the mean) or more. Since this isn't so improbable, we can believe $p = 0.10$.

Lecture Summary (Distribution of Sample Proportion)
- Typical inference problem
- Sampling distribution: definition
- 3 approaches to understanding sampling distribution
  - Intuition
  - Hands-on experiment
  - Theory
- Center, spread, shape of sampling distribution
  - Central Limit Theorem
  - Role of sample size
  - Applying 68-95-99.7 Rule

Lecture 25
Inference for Quantitative Variable: Hypothesis Tests
- z Test about Population Mean: 4 Steps
- Examples: 1-Sided or 2-Sided Alternative
- Relating Test and Confidence Interval
- Factors in Rejecting Null Hypothesis
- Inference Based on t vs. z

Looking Back: Review
- 4 Stages of Statistics
  - Data Production (discussed in Lectures 1-4)
  - Displaying and Summarizing (Lectures 5-12)
  - Probability (discussed in Lectures 13-20)
  - Statistical Inference
    - 1 categorical (discussed in Lectures 21-23)
    - 1 quantitative: confidence intervals, hypothesis tests
    - categorical and quantitative
    - 2 categorical
    - 2 quantitative

Three Types of Inference Problem

Mean yearly earnings for a sample of 446 students at a particular university was \$3,776.
1. What is our best guess for the mean earnings of all students at that university? (Point Estimate)
2. What interval should contain mean earnings for all the students? (Confidence Interval)
3. Is this convincing evidence that mean earnings for all the students is less than \$5,000? (Hypothesis Test)

Behavior of Sample Mean (Review)

For a random sample of size $n$ from a population with mean $\mu$ and standard deviation $\sigma$, the sample mean $\bar{X}$ has
- mean $\mu$
- standard deviation $\sigma/\sqrt{n}$
- shape approximately normal for large enough $n$

If $\sigma$ is known, standardized $\bar{X}$ follows the $z$ (standard normal) distribution.

Hypothesis Test About $\mu$ (with z)

Problem Statement: $H_0: \mu = \mu_0$ vs. $H_a: \mu > \mu_0$, $\mu < \mu_0$, or $\mu \neq \mu_0$
1. Consider sampling and study design.
2. Summarize with $\bar{x}$, standardize to $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ assuming $H_0: \mu = \mu_0$ is true; is $z$ large?
3. Find P-value = probability of $Z$ this far above/below/away from 0; is it small?
4. Based on size of P-value, choose $H_0$ or $H_a$.

Hypothesis Test About $\mu$ (with z): Details
1. Consider sampling and study design.
   - If the sample is biased, the mean of $\bar{X}$ is not $\mu_0$.
   - If the population is smaller than $10n$, the s.d. of $\bar{X}$ is not $\sigma/\sqrt{n}$.
   - If $n$ is too small, the distribution of $\bar{X}$ is not normal and won't standardize to $z$ (graph the data, see guidelines).
2. Summarize with $\bar{x}$, standardize to $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ assuming $H_0: \mu = \mu_0$ is true; is $z$ large?
3. Find P-value = probability of $Z$ this far above/below/away from 0; consider if it is small.
   - Assess the P-value based on the form of the alternative hypothesis (greater, less, or not equal).
4. Based on size of P-value, choose $H_0$ or $H_a$.

The form of the alternative determines which tail gives the P-value:
- $H_0: \mu = \mu_0$ vs. $H_a: \mu > \mu_0$: P-value is a right-tailed probability.
- $H_0: \mu = \mu_0$ vs. $H_a: \mu < \mu_0$: P-value is a left-tailed probability.
- $H_0: \mu = \mu_0$ vs. $H_a: \mu \neq \mu_0$: P-value is a two-tailed probability.

[Figures: normal curves centered at the hypothesized population mean $\mu_0$, with the P-value shaded beyond the observed sample mean $\bar{x}$; for the two-sided alternative, the observed sample mean may fall on either side of $\mu_0$.]
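The way the alternative hypothesis decides which tail gives the P-value can be sketched in a few lines of Python. This is an illustrative sketch, not the course's software: the function name `z_test` and the strings `"less"`, `"greater"`, and `"not equal"` are our own labels, and the numbers in the usage loop are hypothetical. It relies only on `statistics.NormalDist` from the standard library.

```python
import math
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alternative):
    """Return (z, p_value) for a z test about a population mean.

    alternative must be "less", "greater", or "not equal".
    """
    # Step 2: standardize the sample mean, assuming H0: mu = mu0 is true.
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Step 3: the tail(s) used depend on the form of the alternative.
    if alternative == "less":
        p_value = std_normal.cdf(z)                  # left tail
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(z)              # right tail
    elif alternative == "not equal":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))   # both tails
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'not equal'")
    return z, p_value

# Hypothetical numbers, chosen only to illustrate the three alternatives.
for alt in ("less", "greater", "not equal"):
    z, p = z_test(xbar=52, mu0=50, sigma=8, n=64, alternative=alt)
    print(f"{alt:>9}: z = {z:.2f}, P-value = {p:.4f}")
```

The same $z$ gives three different P-values, which is why step 3 asks us to assess the P-value based on the form of the alternative.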
Example: Test with One-Sided Alternative
- Background: Earnings of 446 surveyed university students had mean \$3,776. Assume population s.d. \$6,500.
- Question: Are we convinced that $\mu < 5{,}000$?
- Response: State $H_0: \mu = 5{,}000$ vs. $H_a: \mu < 5{,}000$.
  1. Are the students representative in terms of earnings?
  2. Output shows sample mean 3.776 (in thousands) and $z = -3.98$.
  3. P-value $= P(Z \leq -3.98) = 0.000$. Small? Yes.
  4. Reject $H_0$? Yes. Conclude $\mu < 5{,}000$.
- Note: The P-value is a left-tailed probability because the alternative was "less than".

    One-Sample Z: Earned
    Test of mu = 5 vs mu < 5
    The assumed sigma = 6.5

    Variable    N   Mean  StDev  SE Mean
    Earned    446  3.776  6.503    0.308

    Variable  95.0% Upper Bound      Z      P
    Earned                4.282  -3.98  0.000

Looking Ahead: In real-life problems, we rarely know the value of the population standard deviation $\sigma$. Eventually we'll learn how to proceed when all we know is the sample standard deviation $s$.

Example: Notation
- Background: Want to test if the mean of all male shoe sizes could be 11.0, based on a sample mean 11.222 from 9 male students. Assume population s.d. 1.5.
- Question: How do we denote the numbers given?
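One way to keep the four given numbers straight is to name each after the symbol it stands for and then standardize. The sketch below is illustrative only: the variable names `mu0`, `xbar`, `n`, and `sigma` are our own labels for the shoe-size values in the Background.

```python
import math

mu0 = 11.0      # proposed value of the population mean
xbar = 11.222   # observed sample mean
n = 9           # sample size
sigma = 1.5     # assumed population standard deviation

se = sigma / math.sqrt(n)   # standard deviation of the sample mean
z = (xbar - mu0) / se       # standardized sample mean, assuming H0 is true

print(f"SE = {se:.3f}, z = {z:.2f}")
```

With only 9 students the standard error is a full half shoe size, so a sample mean of 11.222 sits well under one standard error above 11.0.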
- Response:
  - 11.0 is the proposed value of the population mean: $\mu_0$
  - 11.222 is the sample mean: $\bar{x}$
  - 9 is the sample size: $n$
  - 1.5 is the population standard deviation: $\sigma$

Example: Intuition Before Formal Test
- Background: Want to test if the mean of all male shoe sizes could be 11.0, based on a sample mean 11.222 from 9 male students. Assume population s.d. 1.5.
- Question: What conclusion do we anticipate, by "eyeballing" the data?
- Response: Sample mean 11.222 seems close to proposed $\mu_0 = 11.0$; sample size 9 is small; s.d. 1.5 is not very small. Anticipate standardized sample mean $z$ not large, so P-value not small, so conclude the population mean may be 11.0.

Example: Test with Two-Sided Alternative
- Background: Want to test if the mean of all male shoe sizes could be 11.0, based on a sample mean 11.222 from 9 male students. Assume population s.d. 1.5.
- Question: What do we conclude from the output?
- Response: $z = 0.44$. Large? No. P-value (two-tailed) $= 0.657$. Small? No. Conclude the population mean may be 11.0.
- Note: The P-value is the probability of a sample mean as far from 11.0 as 11.222 is; this is the same as the probability of a standardized sample mean $z$ as far from 0 as 0.44 is.

    One-Sample Z: Shoe
    Test of mu = 11 vs mu not = 11
    The assumed sigma = 1.5

    Variable  N    Mean  StDev  SE Mean
    Shoe      9  11.222  1.698    0.500

    Variable          95.0% CI     Z      P
    Shoe      (10.242, 12.202)  0.44  0.657

- P-value as probability of a sample mean as far from 11.0 (in either direction) as 11.222: P-value $= 2P(\bar{X} \geq 11.222)$ = combined area = 0.657.
- P-value as probability of a standardized sample mean $z$ as far from 0 (in either direction) as 0.44: P-value $= 2P(Z \geq 0.44)$ = combined area = 0.657.
- Same area under the curve, just different scales on the horizontal axis (due to standardizing).

[Figures: sampling distribution of the sample mean centered at 11, and the standard normal curve centered at 0, each with both tails shaded; $H_0: \mu = 11$ vs. $H_a: \mu \neq 11$.]

Example: Test Results and Confidence Interval
- Background: Tested if the mean of all male shoe sizes could be 11.0, based on a sample mean 11.222 from 9 male students. Assumed population s.d. 1.5. P-value was 0.657; did not reject $H_0$.
- Question: Would we expect 11.0 to be contained in a confidence interval for $\mu$?
- Response: Yes; the 95% confidence interval in the output, (10.242, 12.202), contains 11.0.

Example: Test Results and Confidence Interval
- Background: Tested if mean earnings of all students at a university could be \$5,000, based on a sample mean \$3,776 for $n = 446$. Assumed population s.d. \$6,500. P-value was 0.000; rejected $H_0$.
- Question: Would \$5,000 be contained in the confidence interval for $\mu$?
- Response: No; since the test rejected $H_0$, we expect 5,000 to fall outside the interval (the 95% upper bound in the earlier output was 4.282, in thousands).

Factors That Lead to Rejecting $H_0$

Statistically significant data produce a P-value small enough to reject $H_0$. $z$ plays a role:

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{(\bar{x} - \mu_0)\sqrt{n}}{\sigma}$$

Reject $H_0$ if the P-value is small; if $|z|$ is large; if
- sample mean $\bar{x}$ is far from $\mu_0$
- sample size $n$ is large
- standard deviation $\sigma$ is small

Role of Sample Size n
- Large $n$: may reject $H_0$ even if the sample mean is not far from the proposed population mean, from a practical standpoint. A very small P-value is strong evidence against $H_0$, but $\bar{x}$ is not necessarily very far from $\mu_0$.
- Small $n$: may fail to reject $H_0$ even though it is false. Failing to reject a false $H_0$ is the 2nd type of error.

Definition (Review)
- Type I Error: reject the null hypothesis even though it is true (false positive)
- Type II Error: fail to reject the null hypothesis even though it's false (false negative)

Test conclusions determine possible error:
- Do not reject $H_0$: correct or Type II
- Reject $H_0$: correct or Type I

Example: Errors in a Medical Context
- Background: A medical test is carried out for a disease (HIV).
- Questions:
  - What does $H_0$ claim?
  - What are the implications of a Type I Error?
  - What are the implications of a Type II Error?
  - Which type of error is more worrisome?

Example: Errors in a Legal Context
- Background: A defendant is on trial.
- Questions:
  - What does $H_0$ claim?
  - What are the implications of a Type I Error?
  - What are the implications of a Type II Error?
  - Which type of error is more worrisome?

Behavior of Sample Mean (Review)

For a random sample of size $n$ from a population with mean $\mu$ and standard deviation $\sigma$, the sample mean $\bar{X}$ has
- mean $\mu$
- standard deviation $\sigma/\sqrt{n}$
- shape approximately normal for large enough $n$

Sample Mean: Standardizing to z
- If $\sigma$ is known, standardized $\bar{X}$ follows the $z$ (standard normal) distribution: $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} = z$.
- If $\sigma$ is unknown but $n$ is large enough (20 or 30), then $s \approx \sigma$ and $\dfrac{\bar{X} - \mu}{s/\sqrt{n}} \approx z$.
- Can use $z$ if $\sigma$ is known or $n$ is large. What if $\sigma$ is unknown and $n$ is small?

Sample Mean: Standardizing to t

For $\sigma$ unknown and $n$ small, $\dfrac{\bar{X} - \mu}{s/\sqrt{n}} = t$.
- $t$ (like $z$) is centered at 0, since $\bar{X}$ is centered at $\mu$
- $t$ (like $z$) is symmetric and bell-shaped if $\bar{X}$ is normal
- $t$ is more spread than $z$ (s.d. > 1), because $s$ gives less information than $\sigma$
- $t$ has "$n - 1$ degrees of freedom" (spread depends on $n$)

Inference About Mean Based on z or t
- $\sigma$ known: standardized $\bar{x}$ is $z$
- $\sigma$ unknown: standardized $\bar{x}$ is $t$ (may use $z$ if $\sigma$ is unknown but $n$ is large)

Inference by Hand Based on z or t
- Small sample ($n < 30$): $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ if $\sigma$ is known; $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$ if $\sigma$ is unknown
- Large sample ($n \geq 30$): $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ if $\sigma$ is known; $z \approx \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$ if $\sigma$ is unknown
- $z$ is used if $\sigma$ is known or $n$ is large; $t$ is used if $\sigma$ is unknown and $n$ is small.
- $z$ or $t$: standardized difference between the sample mean and the proposed population mean.

z Distribution (Review) and t Distribution
- Standardizing with $\sigma$ results in the $z$ distribution: $\dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ is $z$, with standard deviation 1.
- Standardizing with $s$ results in the $t$ distribution: $\dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ is $t$, with standard deviation > 1 (depends on $n$).

Example: Distribution of t vs. z
- Background: For $n = 9$, $\dfrac{\bar{x} - \mu}{s/\sqrt{n}} = t$ has 8 df.
- Question: How does $P(t > 2)$ compare to $P(z > 2)$?
- Response: $P(t > 2)$ is between 0.025 and 0.05; $P(z > 2)$ is between 0.01 and 0.025.

[Figures: $t$ curve for 8 df with tail areas 0.05, 0.025, 0.01, 0.005 beyond $\pm 1.86$, $\pm 2.31$, $\pm 2.90$, $\pm 3.36$; $z$ curve with the same tail areas beyond $\pm 1.645$, $\pm 1.960$, $\pm 2.326$, $\pm 2.576$.]

Lecture Summary (Inference for Means: Hypothesis Tests; t Distribution)
- z test about population mean: 4 steps
- Examples: 1-sided and 2-sided alternatives
- Relating test and confidence interval
- Factors in rejecting null hypothesis
  - Sample mean far from proposed population mean
  - Sample size large
  - Standard deviation small
- Inference based on z or t
  - Population s.d. known: standardize to z
  - Population s.d. unknown: standardize to t
- Comparing z and t distributions
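The last summary point, comparing z and t distributions, can be checked numerically for the $P(t > 2)$ versus $P(z > 2)$ example with 8 degrees of freedom. The sketch below is one possible approach, not the method used in the lecture (which reads tail areas off the curves): the helper names `t_pdf` and `t_upper_tail` are our own, and the $t$ tail area is approximated with Simpson's rule so that only the Python standard library is needed.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0.

    Integrates the density over [0, t] with Simpson's rule (steps must be
    even) and uses symmetry about 0: P(T > t) = 0.5 - P(0 < T < t).
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    return 0.5 - total * h / 3

p_t = t_upper_tail(2.0, df=8)        # n = 9 gives 8 degrees of freedom
p_z = 1 - NormalDist().cdf(2.0)      # standard normal upper tail

print(f"P(t > 2) with 8 df = {p_t:.3f}")
print(f"P(z > 2)           = {p_z:.3f}")
```

The $t$ tail should come out larger than the $z$ tail, consistent with $t$ being more spread out than $z$ for small samples.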
