Elements of Statistics 1 MATH 130

Date Created: 10/15/15

## Data Collection

MATH 130, Elements of Statistics I
J. Robert Buchanan, Department of Mathematics, Spring 2008

### Big Definition

Definition: Statistics is the science of

- collecting,
- organizing,
- summarizing, and
- analyzing

information to draw conclusions or answer questions.

### Process of Statistics

1. Identify the research objective (the question to be answered). The group to be studied is called a population, which is composed of individuals.
2. Collect the data needed to answer the question. Data are collected from a subset of the population called a sample.
3. Organize and summarize the information (called descriptive statistics).
4. Draw conclusions about the population from the information (called inferential statistics).

### Variables

Definition: A variable is a characteristic of interest about the individuals in a population. Examples:

- height
- weight
- hair color
- income
- zip code

### Qualitative vs. Quantitative Variables

Definition: Qualitative (or categorical) variables allow for classification of individuals based on some attribute or characteristic.

Definition: Quantitative variables provide numerical measures of individuals.

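The two definitions above can be applied to the five example variables from the slides. The following is a minimal sketch, not part of the original lecture: the labels are one reading of the definitions, and the names `VARIABLE_TYPES` and `is_quantitative` are hypothetical. Zip code is the instructive case, since it is written with digits but only labels a location.

```python
# Illustrative labels for the example variables listed in the slides.
# These labels are an assumption based on the definitions, not slide content.
VARIABLE_TYPES = {
    "height": "quantitative",
    "weight": "quantitative",
    "hair color": "qualitative",
    "income": "quantitative",
    # A zip code is made of digits but only classifies a location;
    # adding or averaging zip codes is meaningless.
    "zip code": "qualitative",
}


def is_quantitative(name: str) -> bool:
    """Return True if the named example variable is a numerical measure."""
    return VARIABLE_TYPES[name] == "quantitative"


for name, kind in VARIABLE_TYPES.items():
    print(f"{name}: {kind}")
```

A useful check when classifying a variable is to ask whether arithmetic on its values (a sum or an average) has a sensible interpretation.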
Identify each of the following as an example of a qualitative or quantitative variable.

1. The breaking strength of a piece of string.
2. The number of stop signs in towns of less than 500 people.
3. The hair color of children auditioning for a play.
4. Whether or not a faucet is defective.
5. The number of questions answered correctly on a standardized test.
6. The length of time spent on hold to have a question answered by the help desk via telephone.

### Continuous vs. Discrete Variables

Definition: A discrete variable is a quantitative variable that has either a finite number of possible values or a countable number of possible values.

Definition: A continuous variable is a quantitative variable that has an infinite number of possible values that are not countable.

Identify each of the following as an example of a discrete or continuous variable.

1. A poll of registered voters about which candidate they will support.
2. The length of time required for a wound to heal after a bandage is applied.
3. The number of telephone calls received by a help desk in a 10-minute period.
4. The distance freshmen can kick a football.
5. The number of pages in term papers written for ENGL 110.
6. The kind of tree used as a Christmas tree.

### Sources of Data

- Census: a list of all individuals in a population along with certain characteristics of each individual
- Existing sources
- Survey sampling

Definition: An observational study measures the characteristics of a population by studying individuals in a sample, but does not attempt to manipulate or
influence the variables of interest.

- Designed experiments

Definition: A designed experiment applies a treatment to individuals (referred to as experimental subjects or units) and attempts to isolate the effects of treatment on a response variable.

### Causation vs. Association

Remark: observational studies are useful for determining whether there is a relation between two variables, but a designed experiment is required to isolate the cause of the relation.

Studies have observed that women taking hormone replacement therapy (HRT) seem to have a lower risk of heart disease. The researchers observed women who decided for themselves whether or not to take HRT. Perhaps women who choose to take HRT are healthier to begin with and thus are already at a lower risk of heart disease.

### Random Sampling

Definition: A sample of size $n$ from a population of size $N$ is obtained through simple random sampling if every possible sample of size $n$ has an equally likely chance of occurring. The sample is then called a simple random sample.

- Draw names from a hat.
- Assign a number to each individual and select numbers randomly.

### Stratified Sampling

Definition: A stratified sample is obtained by separating the population into nonoverlapping groups called strata and then obtaining a simple random sample from each stratum. The individuals within each stratum should be similar in some way.

### Systematic Sampling

Definition: A systematic sample is obtained by selecting every $k$th individual from the population. The first individual selected corresponds to a random number between
1 and $k$.

### Cluster Sampling

Definition: A cluster sample is obtained by selecting all the individuals within a randomly selected collection or group of individuals.

### Convenience Sampling

Definition: A convenience sample is a sample in which the individuals are easily obtained.

Often a convenience sample is obtained via self-reporting or voluntary response.

### Sources of Error in Sampling

Definition: Nonsampling errors are errors that result from the survey process. They are due to the

- nonresponse of individuals selected to be in the survey,
- inaccurate responses,
- poorly worded questions, or
- bias in the selection of individuals to be given the survey.

Definition: Sampling error is the error that results from using sampling to estimate the information regarding a population. This type of error occurs because a sample gives incomplete information about the population.

### Homework

- Read Sections 1.1–1.4
- Pages 9–13: 15 38 51 55
- Pages 19–21: 9 19
- Pages 30–32: 9 23
- Pages 37–39: 11 23

## Applications of the Normal Distribution

### Finding the Area Under Any Normal Curve

Recall: we have used Table IV to find the area under the standard normal curve.

Question: what about normal curves with $\mu \neq 0$ and/or $\sigma \neq 1$?

For a normally distributed random variable $X$ with mean $\mu$ and standard deviation $\sigma$:

1. Calculate the standard normal random variable $Z = \dfrac{X - \mu}{\sigma}$.
2. Draw a standard normal
curve and shade the area to be found.
3. Find the area using the standard normal curve and the values in Table IV (pages A-10 and A-11).

The magnitude of earthquakes recorded in a certain region is normally distributed with a mean $\mu = 5.1$ and a standard deviation $\sigma = 0.4$.

1. What is the probability that an earthquake has a magnitude of less than 5.8?
2. What is the probability that an earthquake has a magnitude of more than 4.5?
3. What is the probability that an earthquake has a magnitude between 4.0 and 6.0?
4. Would an earthquake of magnitude 6.3 or higher be unusual?
5. What is the percentile rank of an earthquake of magnitude 4.8?

In a study of human pregnancy, the mean length of pregnancy was 272 days with a standard deviation of 9 days (Singapore Med J 2006; 47(12): 1044–1048).

1. What percentage of pregnancies last more than 275 days?
2. What is $Q_1$ for pregnancies?
3. What is the probability that a pregnancy lasts no more than 280 days?
4. A preterm baby is one whose pregnancy is less than 245 days. What percentage of babies are preterm?

The mean body temperature of healthy men is 36.7°C with a standard deviation of 0.7°C (JAMA 1992; 268(12): 1578–1580). Body temperature is normally distributed.

1. Find the temperature at $P_{20}$.
2. Find the temperatures bounding the middle 96% of the data.
3. Find the temperature at $Q_3$.
4. Would a temperature of 35.0°C be unusual for a healthy man?

### Homework

- Read Section 7.3
- Pages 350–352: 3, 7, 11, 15, 19, 23, 27, 31

## Hypothesis Tests for a Population Proportion

Hypothesis testing about the population proportion is carried out very similarly to the familiar method for hypothesis testing.

Assumptions:

1. A simple random sample of size $n \le 0.05N$ is collected.
2. If $p_0$ is the assumed value of the population proportion, then $n p_0 (1 - p_0) \ge 10$.
3. The test statistic will be calculated as $z_0 = \dfrac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0)/n}}$.

### Example (Classical Approach)

An insurance company states that 90% of its claims are settled within 30 days. A consumer group selected a simple
random sample of 75 of the company's claims to test this statement. The consumer group found that 55 of the claims were settled within 30 days. At the 0.05 significance level, test the company's claim that 90% of its claims are settled within 30 days.

- $H_0: p = 0.90$
- $H_1: p < 0.90$ (left-tailed test)
- $\alpha = 0.05$, $-z_{\alpha} = -1.645$

Test statistic:

$$z_0 = \frac{55/75 - 0.90}{\sqrt{\dfrac{0.90(1 - 0.90)}{75}}} = -4.811$$

The test statistic $z_0 = -4.811$ lies to the left of the critical value $-z_{0.05} = -1.645$.

Decision: reject $H_0$.

Conclusion: the sample data warrant rejection of the claim that 90% of the company's insurance claims are settled within 30 days.

### Example (P-Value Approach)

A politician claims that she will receive 60% of the votes in an upcoming election. The results of a simple random sample of 100 voters showed that 50 of those sampled will vote for her. Test the politician's claim at the 0.05 level of significance.

- $H_0: p = 0.60$
- $H_1: p \neq 0.60$ (two-tailed test)
- $\alpha = 0.05$, $\alpha/2 = 0.025$, $\pm z_{\alpha/2} = \pm 1.960$

Test statistic:

$$z_0 = \frac{0.50 - 0.60}{\sqrt{\dfrac{0.60(1 - 0.60)}{100}}} = -2.04$$

P-value $= 2P(Z > |-2.04|) = 0.0414 < 0.05 = \alpha$

Decision: reject $H_0$.

Conclusion: the sample data warrant rejection of the claim that the politician will receive 60% of the vote.

### Example (Classical Approach)

The full-time student body of a college is 50% men and 50% women. Suppose an introductory chemistry class contains 30 men and 20 women. Does this sample provide sufficient evidence at the 0.05 significance level to reject the hypothesis that the proportions of male and female students who take this course are the same as in the general student body?

- $H_0: p = 0.50$
- $H_1: p \neq 0.50$ (two-tailed test)
- $\alpha = 0.05$, $\alpha/2 = 0.025$, $\pm z_{\alpha/2} = \pm 1.96$

Test statistic:

$$z_0 = \frac{30/50 - 0.50}{\sqrt{\dfrac{0.50(1 - 0.50)}{50}}} = 1.41$$

The test statistic lies between the critical values $-z_{\alpha/2} = -1.96$ and $z_{\alpha/2} = 1.96$.

Decision: fail to reject $H_0$.

Conclusion: the sample data do not warrant rejection of the claim that the proportions of male and female students in the introductory chemistry class are the same as in the general student body.

### Example (P-Value Approach)

The popularity of personal watercraft (also known as jet skis) continues to increase despite the apparent danger associated
with their use. A sample of 54 watercraft accidents reported to the Nebraska Game and Parks Commission in 1997 revealed that 85% of them involved personal watercraft, even though only 8% of the motorized boats registered in the state are personal watercraft. Suppose the national average proportion of watercraft accidents in 1997 involving personal watercraft was 78%. Does the watercraft accident rate in Nebraska exceed the rate in the nation? Use the 0.01 level of significance.

- $H_0: p = 0.78$
- $H_1: p > 0.78$ (right-tailed test)
- $\alpha = 0.01$, $z_{\alpha} = 2.326$

Test statistic:

$$z_0 = \frac{0.85 - 0.78}{\sqrt{\dfrac{0.78(1 - 0.78)}{54}}} = 1.24$$

P-value $= P(Z > 1.24) = 1 - P(Z < 1.24) = 1 - 0.8925 = 0.1075 > 0.01 = \alpha$

Decision: fail to reject $H_0$.

Conclusion: the sample data do not support the claim that the watercraft accident rate in Nebraska exceeds the rate in the nation.

### Homework

- Read Section 10.4
- Pages 498–500: 3–9 odd, 13, 17, 21

## Distribution of the Sample Mean

### Experiment

Suppose we have the following population:

4, 8, 1, 2, 3, 4, 9, 1, 0, 4, 3, 5, 6, 8, 9, 3, 0, 0, 7, 2, 1, 0, 3, 7, 9, 3, 2, 5, 7, 2, 4, 7, 5, 7, 7, 6, 4, 1, 3, 0

If we take random samples of size 10 we might obtain

- $s_1 = \{2, 8, 3, 1, 5, 7, 4, 1, 3, 9\}$
- $s_2 = \{8, 7, 6, 7, 4, 4, 0, 2, 2, 0\}$
- $s_3 = \{6, 0, 0, 7, 4, 0, 7, 0, 6, 1\}$

and their corresponding means were $\bar{x}_1 = 4.3$, $\bar{x}_2 = 4.0$, $\bar{x}_3 = 3.1$.

Observation: since the samples are chosen randomly, the mean calculated from the sample is a random variable. What is the distribution of this random variable?

### Example

Suppose we have a population $\{1, 2, 3, 4, 5, 6\}$. We can list all the samples of size $n = 2$ with their corresponding sample means.

| Sample | $\bar{x}$ | Sample | $\bar{x}$ | Sample | $\bar{x}$ |
|--------|-----------|--------|-----------|--------|-----------|
| 1, 2 | 1.5 | 1, 3 | 2.0 | 1, 4 | 2.5 |
| 1, 5 | 3.0 | 1, 6 | 3.5 | 2, 3 | 2.5 |
| 2, 4 | 3.0 | 2, 5 | 3.5 | 2, 6 | 4.0 |
| 3, 4 | 3.5 | 3, 5 | 4.0 | 3, 6 | 4.5 |
| 4, 5 | 4.5 | 4, 6 | 5.0 | 5, 6 | 5.5 |

The relative frequency distribution of the sample
means is below.

| $\bar{x}$ | Rel. Freq. | $\bar{x}$ | Rel. Freq. | $\bar{x}$ | Rel. Freq. |
|-----------|------------|-----------|------------|-----------|------------|
| 1.5 | 1/15 | 2.0 | 1/15 | 2.5 | 2/15 |
| 3.0 | 2/15 | 3.5 | 3/15 | 4.0 | 2/15 |
| 4.5 | 2/15 | 5.0 | 1/15 | 5.5 | 1/15 |

Question: what is the probability of a random sample of size $n = 2$ having a sample mean $2.5 \le \bar{x} \le 4.0$?

Suppose a sample of 100 people yields the following ages:

14, 16, 26, 58, 31, 41, 12, 50, 35, 15, …

### Population Age Distribution

[Figure: histogram of the age distribution for the population, with $\mu = 39.3$ and $\sigma = 20.9$.]

### Samples of Size 30

If we create 100 random samples of size $n = 30$ and calculate the sample means, we get:

36.4 35.5 36.8 36.7 36.1 36.9 37.4 36.3 34.6 35.9
37.5 36.6 35.7 35.6 36.5 36.2 35.5 36.6 35.7 34.8
36.4 37.2 37.3 37.5 37.8 37.1 36.8 35.5 36.3 36.4
38.2 37.3 36.7 38.1 36.0 36.5 37.5 37.3 36.4 35.8
37.5 37.4 37.5 36.6 37.4 37.3 37.0 36.6 36.5 36.6
36.5 35.7 36.3 36.3 36.0 35.4 37.1 35.6 34.8 35.4
36.6 36.6 36.1 38.0 36.8 36.3 35.9 35.7 35.7 35.2
35.7 35.7 36.4 36.4 36.1 38.2 37.4 34.8 34.8 35.5
35.8 35.4 37.0 35.5 36.5 38.1 36.6 34.9 34.5 36.1
36.3 36.4 37.6 36.7 35.9 36.7 37.5 35.4 35.5 36.7

### Distribution of the Sample Means

[Figure: distribution of $\bar{x}$; the horizontal axis runs from 35 to 38.]

### Increasing the Sample Size $n$

As the sample size increases, the mean of the sample means approaches the population mean.

[Figure: mean of the sample means plotted against the sample size $n$, for $n$ from 20 to 80.]

### Law of Large Numbers

Law of Large Numbers: As additional observations are added to a sample, the sample mean $\bar{x}$ approaches the population mean $\mu$.

### What Happens to the Standard Deviation?

As the sample size increases, the standard deviation of the sample means decreases (but does not disappear).

[Figure: standard deviation of the sample means plotted against the sample size.]

Suppose that a simple random sample of size $n$ is drawn from a large population with a mean $\mu$ and a standard deviation $\sigma$. The sampling distribution of $\bar{x}$ will have mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$.

The standard deviation of the sampling distribution of $\bar{x}$ is called the standard error of the mean and is denoted $\sigma_{\bar{x}}$.

For the population of 100 ages, $\mu = 39.3$ and $\sigma = 20.9$.

1. What is the standard error of the mean for samples of size $n = 20$?
2. What is the standard error of the mean for samples of size $n = 25$?
3. What is the standard error of the mean
for samples of size $n = 30$?

If a random variable $X$ is normally distributed, the distribution of the sample mean $\bar{x}$ is normally distributed.

### Central Limit Theorem

Central Limit Theorem: Regardless of the shape of the population distribution, the sampling distribution of $\bar{x}$ becomes approximately normal as the sample size increases.

Remark: if the sample size is greater than 30, generally the distribution of $\bar{x}$ can be treated as normal.

### Illustration

The random variable $X$ is not normally distributed, but the means of samples of size 30 randomly sampled from this distribution are nearly normally distributed.

### Applications

The average speed of winds in Honolulu, HI is $\mu = 11.3$ mph. The standard deviation of the wind speeds is $\sigma = 3.5$ mph. Assuming that the wind speeds are normally distributed,

1. find the probability that a single wind speed reading will exceed 13.9 mph,
2. describe the sampling distribution of the means for samples of size $n = 9$,
3. find the probability that the mean of 9 wind speed readings will exceed 13.9 mph.

The average salary for a registered nurse is \$45,900 and the standard deviation in salaries is \$7,790. Suppose a sample of salaries of 50 registered nurses is collected. What is the probability that the sample mean is between \$43,000 and \$47,000?

### Homework

- Read Section 8.1
- Pages 389–392: 11, 15, 19, 23, 27, 31
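The standard error formula and the Applications problems above can be checked numerically. This is a rough sketch rather than the method used in the lecture, which works from Table IV: it assumes Python 3.8 or later for `statistics.NormalDist`, and the helper names `standard_error` and `prob_mean_between` are hypothetical. For the nurse salaries it relies on the Central Limit Theorem remark (sample size above 30) to treat the sample mean as normal.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def prob_mean_between(mu: float, sigma: float, n: int,
                      lo: float, hi: float) -> float:
    """P(lo <= sample mean <= hi), treating the sample mean as
    Normal(mu, sigma / sqrt(n))."""
    dist = NormalDist(mu, standard_error(sigma, n))
    return dist.cdf(hi) - dist.cdf(lo)


# Standard error for the population of ages (sigma = 20.9).
for n in (20, 25, 30):
    print(f"n = {n}: standard error = {standard_error(20.9, n):.2f}")

# Wind speeds (mu = 11.3, sigma = 3.5): a single reading versus
# the mean of 9 readings exceeding 13.9 mph.
single = 1 - NormalDist(11.3, 3.5).cdf(13.9)
mean_of_9 = 1 - NormalDist(11.3, standard_error(3.5, 9)).cdf(13.9)
print(f"P(single reading > 13.9) = {single:.4f}")
print(f"P(mean of 9 > 13.9)      = {mean_of_9:.4f}")

# Nurse salaries (mu = 45900, sigma = 7790, n = 50).
nurses = prob_mean_between(45900, 7790, 50, 43000, 47000)
print(f"P(43000 <= mean <= 47000) = {nurses:.4f}")
```

Because the software evaluates the normal curve directly instead of rounding $z$ to two decimals, its results may differ from table-based answers in the third or fourth decimal place.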

