University of South Carolina
Engineering
Introduction to Statistical Reasoning
Professor: Wilma Sims
Fall 2016
STAT 110 - Week 1 Notes
Notes for chapter 1 and chapter 2
Uploaded: 01/28/2017
Where Do Data Come From?

The field of statistics is applicable in every discipline. To give an idea of how widespread its use is, consider the following examples illustrating the scope of applications:

•  On ratemyprofessors.com, STAT 110 instructors often get comments that the practice test looks like the real exam. Should you believe this claim? Here are some interesting stats from Exam 2 last semester.

Summary statistics:

Column   n    Mean      Variance   Std. Dev.  Std. Err.  Median  Range  Min  Max  Q1  Q3
Exam 2   379  78.51715  318.37735  17.843132  0.9165401  82      102    0    102  70  90
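Several columns in the summary above are tied together by simple formulas: the standard deviation is the square root of the variance, the standard error is the standard deviation divided by the square root of n, and the range is the maximum minus the minimum. The short Python sketch below recomputes those three columns from the others as a consistency check. The variable names are illustrative, and it assumes the software that produced the table used these standard definitions.

```python
import math

# Values copied from the Exam 2 summary row
n = 379
variance = 318.37735
minimum, maximum = 0, 102

std_dev = math.sqrt(variance)      # Std. Dev. = square root of Variance
std_err = std_dev / math.sqrt(n)   # Std. Err. = Std. Dev. / sqrt(n)
score_range = maximum - minimum    # Range = Max - Min

print(round(std_dev, 4))   # 17.8431
print(round(std_err, 4))   # 0.9165
print(score_range)         # 102
```

The recomputed values agree with the Std. Dev., Std. Err., and Range columns reported in the table.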


•  A claim found on MedPageToday:
   http://www.medpagetoday.com/Pediatrics/GeneralPediatrics/36848?utm_content=&utm_medium=email&utm_campaign=DailyHeadlines&utm_source=WC&xid=NL_DHE_2013-01-15&eun=g387768d0r&userid=387768&email=leslieahendrix@gmail.com&mu_id=5380723
   “Fast Foods Tied to Allergies, Eczema in Kids”
   Can we believe this claim? Let’s look at the article…

•  Another claim found on MedPageToday:
   http://www.medpagetoday.com/MeetingCoverage/AAIC/33780?utm_source=breaking-news&utm_medium=email&utm_campaign=breaking news
   “IVIG Stops Alzheimer’s in Its Tracks”

•  51% of adults age 18 and older are married today, compared with 72% in 1960.
   http://www.pewsocialtrends.org/2011/12/14/barely-half-of-u-s-adults-are-married-a-record-low/?src=sdt-carousel

Statistics is the science (or art) of data.

Individuals are the objects described by a set of data. Individuals may be people, but they may also be animals or things.

A variable is any characteristic of an individual. A variable can take different values for different individuals.

The actual measurements recorded for individuals are called values.

Example 1

Name            Major  Points  Grade
Advani, Sura    Comm   397     B
Barton, David   Hist   323     C
Brown, Annette  Lit    446     A
Chiu, Sun       Psyc   405     B
Cortez, Maria   Psyc   461     A

What are the individuals? Students enrolled
What are the variables? Major, points, grade

Example 2

Make   Vehicle type  Transmission  Cylinders  City MPG  Highway MPG
BMW    Subcompact    Automatic     6          19        27
BMW    Subcompact    Manual        6          20        29
Buick  Midsize       Automatic     6          20        30
Chevy  SUV (2WD)     Automatic     6          16        21

What are the individuals? Cars
What are the variables?
-make/model
-vehicle type
-transmission type
-cylinders
-city MPG / highway MPG

Example 3

In an agricultural study in Kansas, researchers want to know which of three fertilizer compounds produces the highest wheat yield (in kg/plot). An experimenter uses 15 plots of land. Each fertilizer is applied to 5 plots of land. After harvest, the resulting yield is measured.

Fertilizer 1  Fertilizer 2  Fertilizer 3
64.8          56.5          65.8
60.5          53.8          73.2
63.4          59.4          59.5
48.2          61.1          66.3
55.5          58.8          70.2

Individuals? Plots of land
Variables?
-fertilizer type (independent)
-wheat yield (dependent)

Ways to Gather Data

1. Observational Study

An observational study observes individuals and measures variables of interest but does not attempt to influence the responses.

A response variable is a variable that measures an outcome or result of a study.

The purpose of an observational study is to describe some group or situation.

The population for a statistical study is the entire group of individuals about which we want information.

A sample is the part of the population from which we actually collect information and is used to draw conclusions about the whole.

2. Sample Survey

A sample survey is a type of observational study that surveys a group of individuals by studying only some of its members (selected because they represent the larger group of individuals).

-It is a survey because the individuals provide their own responses.
-It is a sample survey because the individuals participating in the survey are a sample of the population.

A census is a sample survey that attempts to include the entire population as the sample.

The US Census is required by the Constitution every 10 years.

You can see 2010 Census data and info here: http://www.census.gov/2010census/
Lots of really cool data can be found here. Let’s take a look at some in class.
We’ll always miss some people in the census count…
http://www.cnsnews.com/news/article/dozens-us-cities-line-contest-2010-censu

Example 4

The University of Pennsylvania’s National Annenberg Election Survey conducted a poll from July 30 to August 5, 2004. They asked: Do you favor or oppose Federal funding of research on diseases like Alzheimer’s using stem cells taken from human embryos? The survey reported that the poll consisted of 1345 randomly selected adults in the United States.

Population? Adults in the US
Sample? 1345 randomly selected adults in the US

Example 5

The American Community Survey (ACS) contacts 3 million households, including some in every county in the US. This new Census Bureau survey asks each household questions about their housing, economic, and social status.

Population? All US households
Sample? 3 million households

Example 6

Video adapter cables have pins that plug into slots in a computer monitor. The cable will not work if pins are bent or broken. A store chooses 5 cables from each lot and inspects the pins. If any of the cables have bent or broken pins, the entire lot is sent back.

Population? Lot of adapter cables
Sample? 5 cables

Example 7

A sociologist wants to know the opinions of employed adult women about government funding for day care. She obtains a list of the 580 members of a women’s club and mails a questionnaire to 100 of these women selected at random. Only 41 questionnaires are returned.

Population? Employed adult women
Sample? 41 women who returned the survey
What percentage of the women contacted responded? 41/100 = 41%

3. Experiment

An experiment deliberately imposes some treatment on individuals in order to observe their responses.

The purpose of an experiment is to study whether the treatment causes a change in the response.

Virtually all scientific research involves conducting well-designed experiments.
Researchers hope the results from these experiments support a research hypothesis.

Example 8

Salmonella bacteria are widespread in human and animal populations, and there are over 2,000 known serotypes. The reported incidence of salmonella illnesses in humans is about 17 cases per 100,000 people. A food scientist wants to see how withholding feed from pigs prior to slaughter can reduce the size of gastrointestinal tract lacerations during the actual slaughtering process. This is an important issue since pigs infected with salmonella may contaminate the food supply through these lacerations (among other routes, including fecal matter and meat juices). He chose 45 pigs from 3 farms.

Individuals = pigs prior to slaughter
Population = all pigs prior to slaughter
Sample = 45 pigs from 3 farms

Three treatments (we’ll give a formal definition of “treatment” in a later chapter):

Treatment 1: no food withheld prior to transport
Treatment 2: food withheld 12 hours prior to transport
Treatment 3: food withheld 24 hours prior to transport

Data were measured on many variables: body temperature prior to slaughter, weight prior to slaughter, treatment assignment, the farm from which each pig originated, number of lacerations recorded, and size of laceration (cm).

How should we assign pigs to one of the three treatments? Randomly

Why would one want to use animals from three farms? Different environments: dirty farms, infection, different practices...

Why might body temperature or prior weight be of interest? Could indicate existing illness and weight

Example 9

Classify the data collection type for the following questions of interest:

- Is your school’s football team called for fewer penalties in home games than away games?
  observational study
- Do college students perform better on exams when Mozart is playing softly in the background than when no music is playing?
  experiment
- Are college students satisfied with the quality of education they are receiving?
  sample survey

Word of Caution: Statistical conclusions hold “on average” for groups of individuals. They don’t tell us much about one individual.

Samples, Good and Bad

Goal of Sampling

We want to make a statement about a large group of individuals (the population), but oftentimes it is not practical or even possible to measure each individual in the population. In this case, we choose a sample of individuals that is (hopefully) representative of the population. What happens when our sample is not representative of the population?

How to Sample Badly

The design of a statistical study is biased if it systematically favors certain outcomes.

A voluntary response sample chooses itself by responding to a general appeal.
-individuals volunteer themselves to be in the sample
-also called a self-selection sample

Selection of whichever individuals are easiest to reach is called a convenience sample.
-researcher chooses who to ask to participate
-individuals can still choose not to participate

Convenience samples and voluntary response samples are often biased.

Example 1

Ann Landers once asked the readers of her nationally syndicated newspaper advice column, “If you had it to do over again, would you have children?” She received nearly 10,000 responses, almost 70% saying “no.” Is it true that 70% of parents regret having children?

NO

Problems:
-voluntary response survey
-people with strong feelings are the ones who respond
-the % of parents who wouldn't have kids is much higher in a bad sample than in the population at large

Example 2

A student at the university is conducting a survey to find the opinion of her fellow students on the availability of student parking on campus. She stands outside of a dorm and polls fellow students as they leave the dorm.
Which bad sampling method is this? Convenience sample

Problems:
-not random
-missed commuting students

Example 3

The popular Ace & TJ radio show recently asked fans to vote on their website on the following question:

“A nurse at KATE MIDDLETON'S hospital who was pranked by two Australian DJs last week was found DEAD in her home on Friday. Police suspect SUICIDE. The DJs are off the air until further notice...a decision they made along with their radio station. Should the radio DJ's be fired?”

This is an example of which type of sampling? Voluntary response sample

The most basic good sampling method is known as the Simple Random Sample. The simple random sample is at the heart of all good sampling schemes.

A simple random sample (SRS) of size n individuals from the population is chosen in such a way that:

- Every set of n individuals has an equal chance to be the sample actually selected
- Every individual has an equal chance of being chosen for the sample

The easiest way to do this is to place names in a hat (the population) and draw out a handful (the sample).

Step 1: Label. Assign a numerical label to every individual in the population. Be sure that all labels have the same number of digits if you plan to use a table of random digits.

Step 2: Software or Table. Use random digits to select labels at random. Use software whenever possible; tables are old-fashioned!

http://bcs.whfreeman.com/scc7e/ - Choose “Statistical Applets”, then “Simple Random Sample”. There are lots of other computer generators available: www.randomizer.org; TI-83, 84, and 89 calculators; www.dougshaw.com/sampling; statistical packages like R, SAS, Minitab, etc.

If you are using a table of random digits…
- Population labels must each contain the same number of digits.
- Spaces in the random digits table have no meaning (they are just placeholders).
- You can start anywhere you like in the table (across rows, up a column, down a column, …).
- Some people start their population labels at 0 and some start them at 1 (be aware).
- Skip repeated codes and those outside the range of labels. Duplicate? Skip it.

Example 4

Take a Simple Random Sample (SRS) of 3 people.

Step 1: Label your “population” elements.

01 Bautista   11 Nemeth
02 Bolen      12 Podboy
03 Clottey    13 Ray
04 Counts     14 Schumacher
05 Draper     15 Tower
06 Hoffman    16 Walters
07 Kumar      17 Wang
08 Li         18 Weimer
09 Lovesky    19 Yu
10 Marin      20 Zhang

Step 2: (Using random sampling generator) Obtain the sample.

Step 2: (Using random digits table) Obtain the sample.

Since we are using the pesky table of random digits, be sure each label (code) has the same number of digits!

Use the following line from a random digits table. Note: In practice you would choose any line you want, but in class we will use the same line so we learn how to use the table.

05497 12005 13659 81273

Example 5

Take a Simple Random Sample (SRS)

You are reporting on apartments in Columbia. You decide to select 5 complexes at random for in-depth interviews with residents.

01-Abbott Arms   08-Claire Tower     15-Keswick
02-Asbury Arms   09-Colony East      16-Landmark
03-Ashland       10-Cornell Arms     17-Paces Run
04-Bent Tree     11-Fairways         18-Ravenwood
05-Briargate     12-Fox Run          19-Riverview
06-Brook Pines   13-Green Oaks       20-Stone Ridge
07-Cedarwood     14-Hunter’s Green   21-Whaley’s Mill

Use the following portion of Table A at line 140 (read across the row) to sample 5 complexes.

12975 13218 13048 45144 72321 21940 00360 02428 96767 35964 23822 96012

Some final thoughts - Can you Trust a Sample?

We can’t trust results from convenience and voluntary response samples, because they are chosen in ways that invite bias.
We have more confidence in results from an SRS, because it avoids bias.

The first question to ask of any sample is whether it was chosen at random.

Clearly, the SRS is a handy tool for getting a random sample, but it is not sophisticated enough to deliver the kind of information we want in many cases. We need more sampling options… coming up soon.
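Returning to the random digits table procedure used in Example 5 (label, read fixed-width codes across the row, skip codes outside the label range, skip repeats), a minimal Python sketch might look like the following. The function names `srs_from_digit_table` and `srs_with_software`, and the seed value, are illustrative assumptions and not part of the course materials. The second function shows the "use software" route with Python's standard `random` module instead of a table.

```python
import random


def srs_from_digit_table(digit_row, labels, n):
    """Read fixed-width codes left to right; keep in-range, unrepeated ones."""
    width = len(next(iter(labels)))      # every label has the same number of digits
    stream = "".join(digit_row.split())  # spaces in the table are only placeholders
    chosen = []
    for start in range(0, len(stream) - width + 1, width):
        code = stream[start:start + width]
        if code in labels and code not in chosen:  # skip out-of-range and repeats
            chosen.append(code)
            if len(chosen) == n:
                return chosen
    raise ValueError("ran out of digits; continue on the next table line")


def srs_with_software(labels, n, seed=None):
    """Let a random number generator draw n distinct labels instead."""
    rng = random.Random(seed)            # seed is optional, for reproducibility
    return rng.sample(sorted(labels), n)


# Step 1: label the population (the 21 complexes from Example 5)
complexes = [
    "Abbott Arms", "Asbury Arms", "Ashland", "Bent Tree", "Briargate",
    "Brook Pines", "Cedarwood", "Claire Tower", "Colony East", "Cornell Arms",
    "Fairways", "Fox Run", "Green Oaks", "Hunter's Green", "Keswick",
    "Landmark", "Paces Run", "Ravenwood", "Riverview", "Stone Ridge",
    "Whaley's Mill",
]
labels = {f"{i:02d}": name for i, name in enumerate(complexes, start=1)}

# Step 2 (table): line 140 of Table A, read across the row
line_140 = ("12975 13218 13048 45144 72321 21940 "
            "00360 02428 96767 35964 23822 96012")
table_sample = srs_from_digit_table(line_140, labels, 5)
print(table_sample)  # ['12', '18', '13', '04', '19']
print([labels[code] for code in table_sample])

# Step 2 (software): a different but equally valid SRS
print(srs_with_software(labels, 5, seed=110))
```

Reading line 140 in two-digit codes gives 12, 97, 51, 32, 18, 13, 04, 84, 51, 44, 72, 32, 12, 19, and so on. Codes such as 97 and 51 fall outside 01 to 21 and are skipped, the second 12 is a repeat and is skipped, so the table sample is Fox Run, Ravenwood, Green Oaks, Bent Tree, and Riverview. The same function applied to the Example 4 line with labels 01 to 20 and n = 3 selects 05, 20, and 13.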
