These 4-page class notes were uploaded by Orval Funk on Monday, September 28, 2015. They belong to STAT111 at the University of Pennsylvania, taught by Staff in Fall. Since upload, they have received 14 views.
Date Created: 09/28/15
Statistics 111 - Lecture 3
Collecting Data: Surveys and Sampling
Sept. 17, 2009

Administrative Notes
- First recitation this Friday

Outline for Lecture
- Another Example of Confounding
- Introduction to Sampling
- Voluntary Response Samples
- Simple Random Samples
- Sources of Sampling Bias
- More complicated sampling schemes
- Preview of Inference
  - Bias versus Variability

Confounding Example: Simpson's Paradox
- The relationship between two variables can change dramatically after considering a third (confounding) variable
- Example: observational study of UC Berkeley graduate admissions in 1973
- [Table: admission rates for male applicants and female applicants]
- Seems to indicate a sex bias towards males

Confounding Example: Simpson's Paradox (continued)
- The apparent bias seems to reverse when we subdivide applicants into their different departments
- [Table: admission rates by department]
- A greater proportion of females were applying to departments with lower acceptance rates
- Association between gender and admission rates does not imply causation

Survey Definitions
- Population: entire group of objects or people about which information is sought
- Census: survey of an entire population
- Sample: survey that examines only a portion of the population
- Parameter: a numerical characteristic of the population
- Statistic: a numerical characteristic of the sample

Why Sample?
- Expense: cheaper than a census
  - Nielsen ratings are based on 5,000 out of an estimated 105.5 million US households with TVs
- Time: quicker than a census
  - Exit polls give news agencies valuable information on election day, so they can project the election before all votes (the census) are counted
- Sampled units must sometimes be destroyed or changed to measure characteristics
  - Reliability studies: testing lifetime of light bulbs, strength of windshields, etc.

Sampling Bias
- Systematic errors that result in a sample that is not representative of the overall population of interest
- Just like in experiments, we must be cautious of potential sources of bias in our sampling results

Voluntary Response Samples
- People choose to be included in the sample themselves by responding to a general appeal
- E.g., Amazon consumer ratings
- Results are often biased because people with strong opinions (usually negative) are more likely to respond and be included in the sample

Hite Report: Women and Love (1987)
- Hite mailed 100,000 questionnaires to groups of women: professionals, counseling centers, church societies, senior citizens centers
- Only 4.5% were returned
- "84% of women are not satisfied emotionally with their relationships" (p. 804)
- Women married five or more years "are having sex outside of their marriages" (p. 856)
- "95% of women report forms of emotional and psychological harassment from men with whom they are in love relationships" (p. 8)
- "84% of women report forms of condescension from the men in their love relationships" (p. 809)

Simple Random Sampling (SRS)
- Just as an experiment can be improved by randomization, so can sampling
- Each individual in the population has an equal chance of being included in the sample
- Does not allow self-response or evaluators to influence the makeup of the survey (kind of like double-blinding in experiments)

Some History: Presidential Elections
- In 1912, Literary Digest began using surveys to predict US presidential elections
  - "The poll represents 30 years' constant evolution and perfection"
- In the 1936 Roosevelt vs. Landon election, they polled 10 million voters
  - 1,293,669 said they would vote for Landon
  - 972,897 said they would vote for Roosevelt
- Reality: landslide victory, 61% to 37% for FDR
- What went wrong?

Biases in Random Samples
- Randomization doesn't correct for certain problems with sampling
- Bias 1: Undercoverage. Some groups in the population are left out of the process of choosing the sample
- Bias 2: Nonresponse. Sampled individuals cannot be contacted or do not cooperate
- E.g., 1936 presidential polls
  - Low response rate: less than 25% of questionnaires returned
  - Undercoverage of poorer demographics: the sample of voters relied heavily on lists of automobile and telephone owners, who were generally more affluent voters
- Well, at least we learned from those mistakes, right?

Recent Presidential Elections
- Using exit polls, several networks reported early that Gore won Florida in the 2000 election
- Using exit polls, several pundits predicted Kerry would win Ohio in the 2004 election
- In general we have gotten better, but we can still make mistakes, especially when the difference itself is so small

More Potential Problems with Surveys
- Response bias: respondents may not answer survey questions truthfully
  - Illegal or unpopular behaviour, such as drug usage
  - Controversial topics, such as teen sexual activity
  - Race or gender of the interviewer can influence answers about race- or gender-related questions
- Respondents often have trouble remembering past events, e.g. yearly nutrition and health surveys

More Potential Problems with Surveys (continued)
- Wording of questions can be confusing or can intentionally lead the respondent
  - "Do you favor a ban on disposable diapers?"
  - "It is estimated that disposable diapers account for less than 2% of the trash in today's landfills. In contrast, beverage containers, third-class mail and yard wastes account for 21% of the trash in landfills. Given this, would it be fair to ban disposable diapers?"
- Complicated multipart forms that require lots of skipped questions lead to a drop-off in response

More Complicated Random Surveys
- Weakness of simple random sampling is that you cannot use extra information about the population (similar to blocking in experiments)
  - E.g., a small group of people in one part of the city are wealthier
- Stratified random sampling: individuals are divided into groups called strata
  - Simple random sampling is done within each stratum
- National surveys can be even more complicated, using multistage sampling (cheaper)

Dinner and Drugs Study
- Study by CASA that linked frequent family dining to reduced risk of substance abuse
  - "There is no more important thing that a parent can do"
- Some problems with the study relate to what we know about surveys and observational studies
- Problem 1: undercoverage of minority groups
  - Survey not representative of the teen population

Dinner and Drugs Study II
- Problem 2: high level of nonresponse in the survey
  - Many households declined to answer, didn't complete the survey, or denied permission
- Problem 3: observational study with lots of potential confounding variables
  - Drug use itself wasn't measured, but rather a risk score for drug use
  - Study isn't adjusted for age, which is also associated with drug use
- "Most statisticians would be really cautious interpreting a study such as this one"

Next Class: Lecture 4
- Exploring Data: graphical summaries of a single variable
- Moore and McCabe, Section 1.1
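
The lecture's contrast between a simple random sample and an undercovered sample can be sketched as a small simulation. This is only an illustration under made-up assumptions: the electorate of 100,000 voters, the 30% affluent share, and the 35% and 70% support rates are hypothetical and are not data from the 1936 election or any real poll. The names `share_for_a`, `frame`, `srs_estimate`, and `biased_estimate` are likewise invented for this example.

```python
import random

rng = random.Random(111)  # fixed seed so the run is reproducible

# Hypothetical electorate of 100,000 voters. Each voter is a pair
# (is_affluent, votes_for_a). All rates below are made-up assumptions.
AFFLUENT_SHARE = 0.30
SUPPORT_IF_AFFLUENT = 0.35
SUPPORT_OTHERWISE = 0.70

population = []
for _ in range(100_000):
    is_affluent = rng.random() < AFFLUENT_SHARE
    support = SUPPORT_IF_AFFLUENT if is_affluent else SUPPORT_OTHERWISE
    population.append((is_affluent, rng.random() < support))


def share_for_a(voters):
    """Proportion of the given voters who support candidate A."""
    return sum(vote for _, vote in voters) / len(voters)


# Parameter: the true share in the whole population (what a census would find).
truth = share_for_a(population)

# Statistic from a simple random sample: every voter is equally likely
# to be chosen, so the estimate varies but is not systematically off.
srs_estimate = share_for_a(rng.sample(population, 1_000))

# Undercoverage: the sampling frame lists only affluent voters (think of
# lists of automobile and telephone owners), so everyone else is left out.
frame = [voter for voter in population if voter[0]]
biased_estimate = share_for_a(rng.sample(frame, 20_000))

print(f"true share for A:           {truth:.3f}")
print(f"SRS of 1,000 voters:        {srs_estimate:.3f}")
print(f"biased frame, 20,000 voters: {biased_estimate:.3f}")
```

Under these assumptions, the much larger sample drawn from the incomplete frame should land far from the true share, while the small simple random sample should land close to it. A bigger sample reduces variability but does nothing about bias, which is the distinction the outline previews under "Bias versus Variability".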