INTRO STATISTICS STAT 2000
These 51 pages of class notes were uploaded by Ethel Hermiston on Saturday, September 12, 2015. The notes belong to STAT 2000 at the University of Georgia, taught by Staff in Fall.
Date Created: 09/12/15
Displaying Data
Professor Kaplan, STAT 2000, 22 Aug 2011

On the first day of class I asked you two questions:
- Why are you taking this course?
- What do you hope to get out of this course?

I took about 400 pieces of paper back to my office. They needed to be organized so I could learn something from them. I need to be able to read the story.

Goals for Chapter 2 (Aug 22-29)
Students will be able to:
- Identify variables in a study and determine whether a variable is categorical, discrete quantitative, or continuous quantitative.
- Make, describe, and read pie and bar charts, histograms, dot plots, box plots, and stem-and-leaf displays.
- Describe numerically the center and variability of the distribution of a quantitative variable by finding the 5-number summary or the mean and standard deviation.
- Know which of the two sets of measures is appropriate for a given data set.

3 Rules of Data Analysis
1. Make a picture. To help you think clearly about the patterns and relationships hiding in your data.
2. Make a picture. To show the important features and unexpected values or patterns in your data.
3. Make a picture. To tell others what your data reveal.

Major points so far
- First step in organizing data: draw a picture.
- Appropriate pictures for categorical data: pie chart, bar graph.

Why are you taking the course? (Number of Students; most counts did not survive extraction)
- Requirement for major/graduation/grad school: 287
- I enjoy math
- It was recommended
- Statistics is useful/important
- Statistics is better than the other options: 11
- To learn Statistics

Student Expectations
I made a list of categories. Each time a student wrote something from a category, I made a tick mark. I expanded the categories based on what students wrote. Some students wrote about more than one category and were counted multiple times.

Expectation (Number)
- To gain knowledge, skills, or understanding: 177
- To get a good grade or earn credit: 161
- For fun or to meet new people: 10
- To learn the secret of life: 4

Which type of graph would be best for the expectations data?
A. A bar chart is better, but
a pie chart would be fine.
B. A pie chart is better, but a bar chart would be fine.
C. It doesn't matter, either would be fine.
D. Use a bar chart, a pie chart is inappropriate.
E. Use a pie chart, a bar chart is inappropriate.

[Figure: Expectation Results chart; one legible label is Grade/credit]

On the first day of class at MSU I posed the following multiple choice question. Students could choose only one response.

Why are YOU taking this course?
A. To gain knowledge, understanding, and/or skills
B. (option garbled in the source; it ends with "pre-requisite")
C. To gain a foundation for my profession, major, or future class
D. To get a good grade
E. Other

Which type of graph would be best for the MSU expectations data?
A. A bar chart is better, but a pie chart would be fine.
B. A pie chart is better, but a bar chart would be fine.
C. It doesn't matter, either would be fine.
D. Use a bar chart, a pie chart is inappropriate.
E. Use a pie chart, a bar chart is inappropriate.

[Figure: MSU Expectation Results, with categories Knowledge, Credit, Foundation, Good Grade, Other]

I know what a histogram is:
- Of course, who doesn't?
- I'm pretty sure I do.
- I think so, but I could use a review.
- I know I've heard of them, but I don't know what they are.
- I've never heard of a histogram before.

How and why are the Expectation Results ... (rest of slide lost in extraction)

Data sets listed on the slide:
- Scores on a fairly easy exam
- SAT scores of a group of college students
- Heights of college students
- Number of months required to achieve pregnancy for a sample of women trying to get pregnant
- Number of medals won by countries in the 1992 Summer Olympics

STAT 2000 Franklin: Test 2 Study Guide

Test 2 is scheduled for Tuesday, October 4 and Wednesday, October 5. Practice problems with answers are on eLC. Review sessions are Monday and Tuesday night, 6-8 pm, Fine Arts 300. Review handout is on eLC.

Probability: a measure of how likely an event of interest is to occur. It is a proportion: the number of ways a particular outcome can occur divided by the total number of possible outcomes. Probability is therefore always between 0 and 1 inclusive (the endpoints 0 and 1 are allowed). The total of all probabilities equals 1.

Outcome: a particular event of interest.

Law of Large Numbers: the more you repeat an experiment, the more likely you are to get an observed proportion that is close to the actual probability of the event occurring.
- Long run: a large number of experiments
- Short run: a small number of experiments

Two Types of Probability
1. Classical Probability (what we would expect in the long run) or Relative Frequency Probability (the observed proportion of successful events). An outcome that has the same trait as the event of interest is called a "success". The probability of the event is the number of successes divided by the number of possible outcomes:
   P(A) = (number of ways event A can occur) / (number of possible outcomes)
2. Subjective Probability: probability based on subjective/personal judgment.

How to find probabilities: list all possible outcomes.

Sample space: the set of all possible outcomes of an experiment.
Event: a subset of a sample space.
- Two events are independent if the chance of one event occurring has no impact on whether the other event occurs.
- If 2 events are not independent, they are dependent on each other.

Complement ($A^c$): the complement of event A is all outcomes in the sample space that are not in event A.
P($A^c$) = 1 - P(A). If P(A) were 0.62, the complement would be 1 - 0.62 = 0.38.

AND Probabilities: events A AND B occur at the same time. The probability of event A AND B consists of the outcomes that are in both A and B.
OR Probabilities: at least A OR B occurs (or both). The probability of event A OR B consists of the outcomes that are in event A or B.

                         Low Blood Pressure   High Blood Pressure   Total
Under 50 (Middle Age)            64                   51             115
50 and Over (Old)                31                   73             104
Total                            95                  124             219

If we are given a contingency table, we can find the AND/OR probabilities easily.
- For a randomly selected person, what is the probability they are middle-aged and have low blood pressure? 64/219 = 0.2922
- What is the probability that a randomly selected person is old or has high blood pressure?
  104/219 (total of old people) + 124/219 (total with high blood pressure) - 73/219 (old people with high blood pressure) = 155/219 = 0.7078
  Note: you have to subtract 73/219 (old people with high blood pressure) because they were already counted in both of the first 2 proportions in the totals.

Conditional Probability: the probability of A given that B has already occurred.
If you roll a die and you know you rolled an even number, what is the probability you rolled a six? 1/3 = 0.3333 (there are 6 numbers on a die, 3 are even, and 6 is one of these even numbers).

            Not Working   Working   Total
Type AA          60          700      760
Type C           40          660      700
Total           100         1360     1460

We can use a contingency table to find conditional probabilities.
- Given that a battery selected is AA, what is the probability of it working? 700/760 = 0.9211
- What is the probability of randomly selecting a C battery? 700/1460 = 0.4795

Probability Distributions
Random variable: a unique numerical value for each outcome of a random phenomenon.
Discrete random variable: a countable number of possible values. Ex: the number of heads in 3 flips of a coin.
Continuous random variable: an uncountable, infinite number of possible values; they are on an interval. Ex: time taken to complete a marathon, in minutes.

Requirements for a Discrete Variable Distribution
1. For each x, the probability P(x) is between 0 and 1.
2. The sum of all probabilities for all possible values = 1.

The mean of a probability distribution for a discrete random variable is $\mu = \sum x \cdot P(x)$, also called a weighted average. The mean of a random variable is called the expected value. Each possible value x is multiplied by its probability, and the products are added together.

Probabilities for continuous random variables are found by finding areas under a curve, here the normal curve. All probabilities are between 0 and 1, and the total area under the curve is 1.

Finding probabilities for bell-shaped distributions
Remember the Empirical Rule:
- 68% of the observations fall within 1 standard deviation of the mean
- 95% of the observations fall within 2 standard deviations of the mean
- All or nearly all
observations fall within 3 standard deviations of the mean.

If you are given a distribution with $\mu = 50$ and $\sigma = 10$, what is the probability below 65? This is where StatCrunch is your best friend.
STAT > CALCULATORS > NORMAL
For the mean enter 50, and for the standard deviation enter 10. For Prob(X ...), select ≤ because we want to know the probability below 65, then enter 65 and hit calculate. Your answer is 0.9332.

[StatCrunch Normal calculator: Mean 50, Std. Dev. 10, Prob(X ≤ 65) = 0.9332]

We can also use the z-score to find this. It's just as easy.
Find the z-score: (65 - 50)/10 = 1.5
For the mean enter 0, and for the standard deviation enter 1. For Prob(X ...), select ≤ because we want to know the probability below 65, then enter 1.5 in the next box and hit calculate. Your answer is the same: 0.9332.

[StatCrunch Normal calculator: Mean 0, Std. Dev. 1, Prob(X ≤ 1.5) = 0.9332]

Always set the mean to 0 and the standard deviation to 1 when using the z-score to find a probability.

How to find a percentile using the z-score
Example: find the z-score that is at the 90th percentile (meaning you scored higher than 90 percent of people).
Set the mean to 0 and the standard deviation to 1. For Prob(X ...), select ≤ because we want the value with probability 0.90 (90%) below it. Do not enter anything in the next box, because this is where your z-score will show up. In the box after that enter 0.90 and hit calculate. The z-score is about 1.28.

[StatCrunch Normal calculator: Mean 0, Std. Dev. 1, Prob(X ≤ 1.2816) = 0.90]

Sampling Distribution
3 different distributions:
- Population Distribution: the entire distribution from which we take the sample.
- Sample Data Distribution: the distribution of the sample data for a particular given sample. The shape of the sample mirrors that of the population.
- Sampling Distribution: the probability distribution of a sample statistic, such as a sample mean. It is a distribution of all the possible values for the sample statistic. Its shape will be approximately normal under the conditions which we will soon present.

The Expected Mean of the Sampling
Distribution of the Sample Mean: the mean of all possible sample means we could obtain in random sampling will be the same as the overall population mean, also denoted $\mu$.

Standard Deviation: the standard deviation of a sampling distribution is called the standard error. It's just another type of standard deviation; it measures the variability of a sample statistic like the sample mean.
Standard error = $\sigma / \sqrt{n}$ (so important to know!)

Shape: one of the following has to be true to make sure the distribution is bell-shaped or approximately normal.
- If the population is normally distributed, then the sampling distribution of the sample mean is normally distributed as well, regardless of sample size.
- For a large sample ($n \ge 30$), the sampling distribution of the sample mean is approximately normal regardless of the distribution of the population. This is called the Central Limit Theorem.

Example from a HW question that I thought was really helpful to show this:
- Samples of n = 16 are selected from a population with mean 80 and standard deviation 8.
  Mean: 80 (given). Standard deviation: 2 ($8/\sqrt{16}$). Shape: doesn't say, can't determine.
- Samples of n = 100 are selected from a population with mean 80 and standard deviation 8.
  Mean: 80 (given). Standard deviation: 0.80 ($8/\sqrt{100}$). Shape: approximately normal (Central Limit Theorem, $n \ge 30$).
- Samples of n = 16 are selected from a normal population with mean 80 and standard deviation 8.
  Mean: 80 (given). Standard deviation: 2 ($8/\sqrt{16}$). Shape: normal (given).

It might be important to identify graphs and differentiate between standard deviation and standard error.
The population is normal with mean 50 and standard deviation 4, and we select a sample of 625.
- Which graph represents the population? B
- Which graph represents the sampling distribution of the sample mean? D (find the standard error: $4/\sqrt{625} = 0.16$; this is the new standard deviation value)

[Figure: lettered normal distribution graphs referenced in the two questions above]
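The standard-error arithmetic in the examples above can be checked with a few lines of Python. This is only a sketch using the standard library; the function names `standard_error` and `sampling_shape` are made up for illustration, and the notes themselves use StatCrunch rather than code. The last two lines redo the earlier normal-calculator steps (mean 50, standard deviation 10, probability below 65, and the 90th-percentile z-score) with `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sampling distribution of the sample mean."""
    return sigma / sqrt(n)


def sampling_shape(n: int, population_is_normal: bool) -> str:
    """Apply the two shape conditions listed in the notes."""
    if population_is_normal:
        return "normal"
    if n >= 30:
        return "approximately normal (Central Limit Theorem)"
    return "cannot be determined"


# Homework examples: population standard deviation 8, samples of 16 and 100.
se_16 = standard_error(8, 16)
se_100 = standard_error(8, 100)

# Graph example: population standard deviation 4, sample of 625.
se_625 = standard_error(4, 625)

# Normal-calculator equivalents of the StatCrunch steps.
p_below_65 = NormalDist(mu=50, sigma=10).cdf(65)  # area below 65
z_90 = NormalDist().inv_cdf(0.90)                 # z-score at the 90th percentile

print(se_16, se_100, se_625)
print(sampling_shape(16, population_is_normal=False))
print(round(p_below_65, 4), round(z_90, 2))
```

Keeping the shape rule in its own function mirrors how the notes treat it: the mean and standard error are always computable, while the shape depends on extra information about the population or the sample size.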
Example: Suppose the test scores of Test 1 have mean $\mu = 82$ and standard deviation $\sigma = 10$. We took a sample of n = 25 students.
- If we took many samples of 25 students and found the sample mean, what would the standard deviation of these sample mean test scores be, and what is it called? It is called the standard error. The standard deviation of the sample mean would be $10/\sqrt{25} = 2$.
- Suppose the population is left skewed as well. What is the probability that the sample mean of test scores for a sample of size n = 25 is higher than 83? 0.3085
  In StatCrunch, set mean = 82 and st. dev. = 2 (the standard error, because we want the probability for the sample mean), and for Prob(X ...) select > because we want to know the probability of scoring above 83. Enter 83 in the next box and hit calculate.

[StatCrunch Normal calculator: Mean 82, Std. Dev. 2, Prob(X > 83) = 0.3085]

Example from a HW question: The average temperature in households is 67.6 °F. The standard deviation is 4.2 °F. A random sample of 51 households is selected.
- What is the probability that the average of this sample will be above 68.8 °F? 0.0207
  First find the standard error, because we are using the sample: $4.2/\sqrt{51} = 0.58812$.
  In StatCrunch, set mean = 67.6 and st. dev. = 0.58812. For Prob(X ...) select > because we want to know the probability above 68.8. Enter 68.8 in the next box and hit calculate.

[StatCrunch Normal calculator: Mean 67.6, Std. Dev. 0.58812, Prob(X > 68.8) = 0.0207]

- What is the probability that the average of this sample will be within 1.4 degrees of the population mean?
  67.6 + 1.4 = 69 and 67.6 - 1.4 = 66.2; we want to find the probability between these 2 numbers.
  In StatCrunch, set mean = 67.6 and st. dev. = 0.58812. For Prob(X ...) select < because we want to know the probability below 69. Enter 69 in the next box and hit calculate. We get 0.99135. Keep < the same and enter 66.2 in the next box. We get 0.00864. Subtract this number from 0.99135: 0.99135 - 0.00864 = 0.9827.
- What is the probability that the average of this sample will be within 1.4 standard errors of the population mean? Note: standard
error units are z-scores. In StatCrunch, set mean = 0 and st. dev. = 1 since we are using the z-score.
  P(-1.4 ≤ z ≤ 1.4): P(z ≤ -1.4) = 0.0808 and P(z ≤ 1.4) = 0.9192, so 0.9192 - 0.0808 = 0.8384.

Notation
Term                 Population   Sample      Sampling Distribution
Mean                 $\mu$        $\bar{x}$   $\mu$
Standard Deviation   $\sigma$     $s$         $\sigma/\sqrt{n}$

Sampling Distributions of the Sample Proportion
Population proportion: $p$. Sample proportion: $\hat{p} = x/n$.
The mean is equal to the population proportion $p$ (not the sample proportion).
The standard deviation of the sampling distribution (the standard error) is $\sqrt{p(1-p)/n}$, and that's all under the square root.
- Be careful! This is not the same as the standard error for the sampling distribution of the sample mean, which is $\sigma/\sqrt{n}$.
- Make sure to identify whether the problem is about a sampling distribution of means or proportions FIRST.

Example: Consider a very large population of adults where approximately 45% of adults enjoy playing Dance Dance Revolution. Suppose samples of size 275 are selected from this population, and the value of $\hat{p}$ is recorded for each sample.
- Is the sampling distribution of $\hat{p}$ approximately normal? YES. To check this:
  $np \ge 15$: 275(0.45) = 123.75 ≥ 15
  $n(1-p) \ge 15$: 275(0.55) = 151.25 ≥ 15
- What are the mean and standard error?
  Mean of $\hat{p}$ = $p$ = 0.45
  Standard error = $\sqrt{p(1-p)/n} = \sqrt{(0.45)(0.55)/275} = 0.03$
- Using the Empirical Rule, about 68% of the sample proportion values will be between 0.42 and 0.48.
  0.45 + 0.03 = 0.48 and 0.45 - 0.03 = 0.42. Remember, 68% of the sample proportions fall within 1 standard deviation of the mean.
- Using the Empirical Rule, almost all of the sample proportion values will be between 0.36 and 0.54.
  0.45 + 0.09 = 0.54 and 0.45 - 0.09 = 0.36. Remember, almost all of the sample proportions fall within 3 standard deviations of the mean.
- What is the probability of getting a sample proportion of 0.50 or higher from a random sample of 275 people?
  In StatCrunch, set mean = 0.45 and standard deviation = 0.03, find the probability above (≥) 0.50, then hit calculate. Probability of a sample proportion of 0.50 or higher: 0.04779.

STAT 2000 Franklin: STUDY GUIDE FOR TEST 1

Check eLC for Review Questions w/ answers. Really helpful practice! Review session Sept 12th from 6-8 in Fine Arts Building 300. Check
eLC for more details on review sessions. There will be a handout posted on eLC for this review session. Work the problems! Read the handout on eLC about what to do on Test Day.

What is statistics? The science of designing studies and analyzing the data that those studies produce. It is the science of learning from data.

Section 1.2: We learn about populations using samples
- population: the total set of subjects in which we are interested. Ex: the entire voting public
- sample: a subset of the population for whom we have data. Ex: 200 randomly selected voters
- subject: the entities that we measure in a study. Ex: each voter in the sample
- parameter: a numerical value summarizing the population data. Ex: percentage of voters for candidate A in the entire population
- statistic: a numerical value summarizing the sample data. Ex: percentage of voters voting for candidate A in our sample (the 200 randomly selected voters)
Know the difference between a parameter and a statistic!

Notation: different symbols are used to differentiate between the mean of a sample and the mean of a population.
- population mean (parameter): $\mu$ (mu)
- sample mean (statistic): $\bar{x}$ (x-bar)
- population proportion (parameter): $p$
- sample proportion (statistic): $\hat{p}$ (p-hat)

A statistic is descriptive if it summarizes the actual data in the sample. A statistic that makes a conclusion about the population is inferential.

Different Types of Samples (be able to identify the type of sample used)
- Simple Random Sample (SRS): each possible sample of size n is equally likely to be chosen.
  - Example: writing names of students on pieces of paper, putting them into a hat, and drawing names
  - Advantage: tends to be a good reflection of the population
- Systematic Sample: using a sampling frame or list, generate a starting point at random for the list and select every kth (every other, every 3rd, etc.) subject of the list.
  - Example: selecting every other person on a list
  - Advantage: easy to conduct and a more even sample
- Stratified Random Sample: divide the population into groups (strata) and take a simple random sample
from each group.
  - Example: Freshmen, Sophomores, Juniors, Seniors
  - Advantage: there will be enough subjects in each group that are being compared
- Cluster Random Sample: identify clusters of subjects and take a simple random sample of the clusters; the subjects in the chosen clusters make up the sample.
  - Example: clusters of different majors
  - Advantage: there does not need to be a sampling frame of subjects
- Convenience Sampling: subjects are selected at the convenience of the researcher, with no pattern or attempt at accurate representation.
  - Example: choosing the first 50 people in line to go into a store
  - Advantage: convenient and simple

Know how to use a random number table:
First pick a random starting point anywhere on the table. Go left to right, assigning random digits to your sample (can be single digits, double digits, etc.). Skip repeats.

Section 2.1: What are the types of data?
2 types of variables (know the difference!)
- If the variable of interest can be summarized as a word or category, it is categorical.
  - Ex: a person's eye color
- If the variable of interest can be summarized as a number, it is quantitative.
  - Ex: the oven temperature needed for a recipe
A quantitative variable can be discrete or continuous.
- A quantitative variable is discrete if it can only take on a countable number of values (usually whole numbers). There can be no numbers in between.
  - Ex: the number of living grandparents a person has (you can't have 2 and a half grandparents)
- A quantitative variable is continuous if it can take any value in an interval, with any number of decimals.
  - Ex: a person's height

Proportions and Percentages
Frequency: number of occurrences.
Frequency table: lists the number of observations for each category of data.
Relative frequency: the proportion or percent of observations within a category.
- relative frequency = frequency / total of frequencies; to make this a percentage, multiply by 100
- Proportion: 0.30. Percentage: 30%.

Section 2.2: How to describe data using graphs (know the different types, what they look like, when to use them, etc.)

Categorical Data Graphs
Pie Chart: each "slice" is a category. The
largest piece represents the largest category, and the smallest slice represents the smallest category.
[Figure: pie chart with continent labels such as N. America, Asia, Australia]

Bar Graph: plots the categories side by side on the horizontal axis. The highest bar is the largest category. A bar graph is clearer about which group is the biggest.
[Figure: bar graph by continent]

Pareto Graph: a bar graph that is arranged from highest to lowest frequencies.
[Figure: Pareto graph by continent]

Quantitative Data Graphs
Dot Plot: along the horizontal axis, above each number is a dot to represent how many times that number occurs.
[Figure: dot plot with axis from 90 to 100]

Histogram: the quantitative equivalent of a bar graph. The "categories" on the horizontal axis are numerical values.
[Figure: histogram of IQs of 7th graders, with bins 90-99, 100-109, 110-119, 120-129]

Stem and Leaf Plot: the ones-place digits (farthest right digit) are placed on the right side of a "vertical bar chart". These are the "leaves" along the stem. Numbers must go in ascending order.

19 | 9
20 |
21 | 0 0
22 | 3 5 5 5 8
23 | 2 5

So the first number on the stem and leaf plot represents 199.

Shape of a Graph
- Symmetrical (Normal): the distribution on either side of the middle is equal.
- Skewed left: the left "tail" is stretched out longer than the right tail. I try to remember this by knowing that if most of the data is on the right side of the graph, it's skewed left, so it's the opposite of what you would expect it to be.
- Skewed right: the right "tail" is stretched out longer than the left tail. Again, the opposite of what you would expect. If most of the data is on the left side of the graph, it's skewed right.

Section 2.3: How do we describe the center of quantitative data?
Mean: the average of the data set.
$\bar{x} = \frac{\sum x}{n}$ (sum of all numbers divided by the sample size)
Median: the value of the data that occupies the middle position when the data are ranked in ascending order. Separates the top and bottom 50% of the data.
Mode: the value that occurs most often in the data set (the highest frequency).
These can all be found using StatCrunch. After you enter your
data: STAT > Summary Stats > Columns (cool stuff, and also really helpful to know).

Outlier: a data point that is ridiculously far away from the other data points. Outliers can mess with data.
- The mean, range, and standard deviation are affected by outliers, so they are not resistant.
- The mode and median are not much affected by outliers, so they are resistant.

Section 2.4: How can we describe the spread of quantitative data?
Range: the difference between the largest and smallest observations (max - min).
Deviation from the mean: the difference between the value of x and the mean, $x - \bar{x}$.
Sample Variance ($s^2$): found by summing all the squared deviations and dividing by n - 1.
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Standard Deviation: measures roughly the average distance of an observation in a distribution from the mean; the square root of the sample variance.
$s = \sqrt{s^2}$
Z-score: measures the number of standard deviations that an observation falls from the mean.
$z = \frac{\text{observation} - \text{mean}}{\text{standard deviation}}$

Empirical Rule: if a distribution is bell shaped, we can approximate the percentage of data that lie within 1, 2, and 3 standard deviations of the mean.
- 68% of the observations fall within 1 standard deviation of the mean
- 95% of the observations fall within 2 standard deviations of the mean
- All or nearly all observations fall within 3 standard deviations of the mean
[Figure: bell curve illustrating the Empirical Rule]

Section 2.5: How can we describe the position of values in quantitative data?
Percentiles: the pth percentile is the value such that p% of observations in the data fall below or at that value (it tells you approximately what percent of the data are less than that value).
- The other (100 - p)% of the observations in the data are larger than that value.
- Ex: if the value lies at the 30th percentile, approximately 30% of the data values are less than that value, and approximately 70% of the data values are higher than that value.

Five number summary of position
- The minimum value, median, and maximum value are measures of position.
- Can also use the first
quartile value (Q1) and third quartile value (Q3).
- Successful Students: Q1 = 34.5, Q3 = 39, Min = 29, Med = 36, Max = 40
- Unsuccessful Students: Q1 = 31, Q3 = 36, Min = 29, Med = 35, Max = 39
- How to graphically display the 5 number summary: Box and Whisker Plot

Box and Whisker Plot
- A box goes from Q1 to Q3.
- A line is drawn in the box at the median.
- A line is drawn from the box to the smallest observation that is not a potential outlier, and another line is drawn from the box to the largest observation that is not a potential outlier (the whiskers). Potential outliers are shown separately.

Identifying Potential Outliers: 1.5 x interquartile range (IQR)
- Find IQR = Q3 - Q1
- Find 1.5 x IQR
- Find the lower boundary: Q1 - 1.5 x IQR
- Find the upper boundary: Q3 + 1.5 x IQR
- If an observation is less than the lower boundary or greater than the upper boundary, the observation is classified as a potential outlier.

Chapter 3: Association: Contingency, Correlation, and Regression
Response Variable: a variable that can be explained by, or is determined by, another variable. This is the y-variable, the variable that goes on the vertical axis of a graph.
Explanatory Variable: explains or affects the response variable. This is the x-variable, the variable that goes on the horizontal axis of a graph.
Association: an association exists between 2 variables if a particular value for one variable is more likely to occur with certain values of the other variable.
Lurking Variable: related to the response or explanatory variable, but is not the variable being studied.

Section 3.2: How can we explore the association between two quantitative variables?
Scatterplot: a graphical display for 2 quantitative variables. The explanatory variable is on the horizontal axis and the response variable is on the vertical axis.
[Figure: data table and scatter plot of x versus y]
- Positive association: as X increases, Y increases
- Negative association: as X increases, Y decreases
- No association: as X increases, there is no definite shift in the values of Y

We can calculate the
correlation to determine if there is a linear relationship between the variables.
Linear correlation: when the data tend to follow a straight-line path. Can be positive or negative, strong or weak.
No correlation: as X increases, there is no definite shift in the values of Y.
[Figure: example scatterplots showing positive correlation, negative correlation, no correlation, strong positive correlation, and weak negative correlation]

Correlation ($r$): the numerical measure of the strength of the linear relation between X and Y.
$r = \frac{1}{n-1} \sum \left(\frac{x - \bar{x}}{s_x}\right)\left(\frac{y - \bar{y}}{s_y}\right)$
This can be done with StatCrunch too. After entering your data: STAT > Summary Stats > Correlation
1. $r$ must always be between -1 and 1: $-1 \le r \le 1$.
2. $r > 0$ indicates a positive linear relationship. If $r = 1$, there is perfect positive correlation.
3. $r < 0$ indicates a negative linear relationship. If $r = -1$, there is perfect negative correlation.
4. If $r = 0$, there is no linear relation between the 2 variables.
5. A value of $r$ close to -1 or 1 indicates a strong linear relationship, while a value of $r$ close to zero represents a weak linear relationship.
6. The sign of the correlation coincides with the sign of the slope in the regression equation (coming up next).
   a. If the correlation is positive, then the regression line has a positive slope.
   b. If the correlation is negative, then the regression line has a negative slope.
7. Correlation is unitless.
8. Correlation only measures the degree of linear association.

Section 3.3 and 3.4: How to predict the outcome of a variable (know the equations!)
Regression Line: predicts the value for the response variable Y as a straight-line function of the value x of the explanatory variable.
$\hat{y} = a + bx$
- $\hat{y}$: the predicted response
- $a$: the intercept
- $b$: the slope
- $x$: the data point
Residual: the difference between the observed value of y and the predicted value of y.
residual $= y - \hat{y}$ = observed - predicted
StatCrunch can compute the regression equation. After entering data: STAT > Regression > Simple Linear

Interpretation: Recall that the
regression equation is $\hat{y} = a + bx$.
- The slope $b$ tells us that a one-unit increase in x changes the predicted response by $b$ units. The predicted response increases when the slope is positive, and it decreases for a negative slope.
- The intercept $a$ tells us what the predicted response would be when x is equal to 0 units. Depending on the context, it may or may not have a practical interpretation.
Remember what these look like:
[Figure: lines with positive slope and negative slope]

Practice Problems for Test 2

1. Chapter 5: A university held a blood pressure screening clinic for its professors. The results are summarized in the table below by age group and blood pressure level. (The table did not survive extraction.)
a) What is the probability that a randomly selected professor is under 50 years old?
b) What is the probability that a randomly selected professor is under 50 and has low blood pressure?
c) What is the probability that a randomly selected professor is under 50 or has low blood pressure?
d) What is the probability that a randomly selected professor is 50 and over, given that the professor has high blood pressure?
e) What is the probability that a randomly selected professor has high blood pressure, given that the professor is 50 and over?

4. Section 6.2: Your height is 72 inches. The average height for Georgia students could be modeled by a normal distribution with mean 68 and standard deviation 6.
a) What is the percentile corresponding to your height?
b) What proportion of students will have heights between 60 and 70 inches?
c) What height is 1.5 standard deviations below the mean?

2. Section 6.1: According to a study of book reviews in American history, geography, and area studies published in Choice magazine (Library Acquisitions: Practice and Theory, Vol. 19, 1995), the overall rating stated in each review was ascertained and recorded as follows: 1 = would not recommend, 2 = cautious or very little recommendation, 3 = little or no preference, 4 = favorable/recommended, 5 = outstanding/significant contribution.
[Table: probability for each rating 1 through 5; the values are garbled in the source]
a) Is this a valid probability distribution? Is this a discrete or
continuous random variable?
b) What is the probability that a book reviewed in Choice has a rating of 2 or 3?
c) Find the expected (or average) book rating.

3. Section 6.1: You play a game in which the numbers 1 through 10 are placed in a bag. You randomly pick a number from the bag. If you pick the number 10, you win 800. Otherwise, you get nothing. What is your expected gain in this game?

5. Section 6.2: For a normal distribution:
a) Find the probability below $\mu$ and 1.65$\sigma$ combined (the sign between them is lost in the source).
b) Find the probability within 2.35 standard deviations of the mean.
c) Find the z-score corresponding to the 72nd percentile.

Surveys and Vocabulary
Professor Kaplan, STAT 2000, 17 Aug 2011

Weekly Goals, Aug 15-19 (Chapter 1, Section 4.2)
Students will be able to:
- Identify the sample and population in a study.
- Explain the concept of selecting a random sample from a population.
- Determine whether a numerical measure is a parameter or a statistic.
- Use software or a table of random numbers to select a sample.
- Explain simple random sampling, systematic random sampling, stratified random sampling, and cluster sampling.
- Detect bias due to sampling and question wording.

Vocabulary: sample, population, study, random sample, parameter, statistic, simple random sampling, systematic random sampling, stratified random sampling, cluster sampling, bias

- sample: a piece or subgroup of the population; the group of people from whom data have been collected
- population: everyone in the group about which you are interested
- study: experimental or observational studies; surveys are a subset of observational studies

Bias is any systematic failure of a sample to represent its population.
[Figure: estimates of Professor Kaplan's height and age, compared with the true values]

Bias is any systematic failure of a sample or
estimate to represent its population or true value.
[Figure: estimates of Prof. K's height.]
[Figure: estimates of Prof. K's age.] The age estimates are biased because they tend to be too high.

More Vocabulary
Every 10 years the US government takes a ______ of the population of the US and finds the values of certain parameters, like average family size or income. But a census is costly, so usually, if we want to know something about the population, we ______ a sample of the population, find the values of statistics like average family size or income, and use the statistics as estimates of the parameters.

Examples: SIRS, Nielsen ratings, exit polling, Consumer Reports ratings, public opinion polls.

By using a survey, we get to learn something about the population by asking only a sample.
[Figure: a sample is drawn from the population; the statistic computed from the sample estimates the population parameter.]

Example 1: Americans' evaluation of the job Congress is doing is the worst Gallup has ever measured, with 13% approving, tying the all-time low measured in December 2010. Disapproval of Congress is at 84%, a percentage point higher than last December's previous high rating. Results for this Gallup poll are based on telephone interviews conducted Aug. 11-14, 2011, with a random sample of 1,008 adults aged 18 and older living in all 50 US states and the District of Columbia.
Identify the population, parameter, sample, and statistic.

Example 2: The amount of television usage by children reached an eight-year high, with kids ages 2 to 5 watching the screen for more than 32 hours a week on average and those ages 6 to 11 watching more than 28 hours. The data are based on Nielsen's national sample, which includes 6,700 kids ages 2 to 11. (2009 report based on data from the fourth quarter of 2008.)
Identify the population, parameter, sample, and statistic.

Vocabulary: sample,
population, study, random sample, parameter, statistic, simple random sampling, systematic random sampling, stratified random sampling, cluster sampling, bias.

"Sometimes I say random things." What does random mean in that sentence?
"A group of participants was selected at random for the survey." What does random mean in that sentence?

"Sometimes I say random things."
Choose the best meaning for random in the above sentence:
A. Haphazard, weird, out of the ordinary
B. Without order or pattern
C. Without prior knowledge, criteria, or method
D. By chance
E. Without bias

"A group of participants was selected at random for the survey."
Choose the best meaning for random in the above sentence:
A. The choice was unexpected or unpredictable.
B. People were chosen without order or reason.
C. The choice was fair, representative, and/or without bias.
D. People were chosen by chance.
E. The choices were based on probability, and everyone had a chance of being chosen.

Meanings of RANDOM
Colloquial: "Sometimes I say random things."
A. Haphazard, weird, out of the ordinary
Statistical: "A group of participants was selected at random for the survey."
E. The choices were based on probability, and everyone had a chance of being chosen.

Vocabulary (updated): sample, population, study, random sample, parameter, statistic, simple random sampling, systematic random sampling, stratified random sampling, cluster sampling, bias, random.

By using a survey, we get to learn something about the population by asking only a sample. Well, assuming we do it correctly.

Opinion
Poll Clip

Bias is any systematic failure of a sample to represent its population. We want to eliminate possible sources of bias from our surveys.
[Figure: measured values compared against the true value.]

The class polls and the video clip are two examples of response bias.
Other types of bias in surveys:
- Voluntary response
- Nonresponse
- Undercoverage

8/19/2011

Professor Kaplan, STAT 2000, 19 Aug 2011

Results from a previous class
Thought question: in the previous slide, how did I know what options to provide?

Results from a previous class: Gettysburg Address
The actual average is about 4.3 letters.
- Student guesses tend to be centered around the true value.
- Estimates based on student choices tend to be high: BIASED.
- Random samples tend to be centered around the true average and have a smaller range than the estimates made without samples.
[Figure: plots of the estimates on a common axis running from 2.5 to 6.5.]

Types of samples to represent the population of UGA students:
- Simple random sample (SRS)
- Systematic random
- Stratified random
- Cluster random

[Text box partly hidden in the source:] The computer chose the 10 numbers, being told to make each number equally likely. When something is random, we cannot predict a particular outcome, [but we know] with what probability the outcomes happen, and the outcomes do NOT have to have equal probabilities. So we can make predictions about random events in [...].

Dimension 1: The Investigative Cycle (PPDAC)
Problem: grasping system dynamics; defining the problem.
Plan: measurement system; sampling design; data management; piloting and analysis.
Data: data collection; data management; data cleaning.
Analysis: data exploration; planned analyses; unplanned analyses; hypothesis generation.
Conclusions: interpretation; conclusions; new ideas; communication.

[Figure: the research cycle: Research Question, Collect Data, Inference, Conclusions.]
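
Worked example: selecting samples with software

One of the weekly goals is to use software to select a sample, and the notes list four designs (simple random, systematic random, stratified random, cluster random). The sketch below shows one way each design could be carried out in Python using only the standard library. It is an illustration, not the method used in class: the function names, the ID numbers 1 to 100, the two strata, and the five "sections" are all made up for the example.

```python
import random


def simple_random_sample(population, n, rng):
    """SRS: every set of n units is equally likely to be chosen."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Systematic: random start, then every k-th unit from the list.

    Assumes n <= len(population). If the list length is not a multiple
    of n, a few units at the end of the list can never be chosen.
    """
    k = len(population) // n          # sampling interval
    start = rng.randrange(k)          # random starting position in [0, k)
    return [population[start + i * k] for i in range(n)]


def stratified_sample(strata, n_per_stratum, rng):
    """Stratified: take a separate SRS inside every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Cluster: randomly pick whole clusters and keep everyone in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


if __name__ == "__main__":
    rng = random.Random(2011)         # fixed seed so the run is repeatable

    # Hypothetical population: student ID numbers 1..100.
    students = list(range(1, 101))

    # Hypothetical strata and clusters, invented for this illustration.
    strata = {"first-year": students[:50], "upper-level": students[50:]}
    sections = {f"section-{i + 1}": students[i * 20:(i + 1) * 20]
                for i in range(5)}

    print("SRS:       ", simple_random_sample(students, 10, rng))
    print("Systematic:", systematic_sample(students, 10, rng))
    print("Stratified:", stratified_sample(strata, 5, rng))
    print("Cluster:   ", cluster_sample(sections, 2, rng))
```

In every design the units are chosen by a chance mechanism rather than by a person's judgment, which is the statistical meaning of "random" from the vocabulary slide. The designs differ only in what is randomized: individual units (SRS), the starting point (systematic), units within each group (stratified), or whole groups (cluster).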