by: Celestino Bergnaum




About this Document

Class Notes
These 13-page class notes were uploaded by Celestino Bergnaum on Wednesday, October 21, 2015. The notes belong to STAT 201 at Texas A&M University, taught by Staff in Fall.





Date Created: 10/21/15
Normal Distributions

Sometimes the overall pattern of a large number of observations is so regular that it can be described by a smooth curve.

[Figure 122: Four examples of histograms which demonstrate a symmetric, bell-shaped pattern. Recoverable labels: Mississippi River discharge (cubic kilometers), with frequency as count of years, 1954 to 2001; rainwater pH values; vocabulary, by grade. Source: Figure 1.8(a) and related figures, Introduction to the Practice of Statistics, Fifth Edition, © 2005 W. H. Freeman and Company.]

Normal distributions can take on many different means and standard deviations. Only the general bell shape must remain the same.

[Figure 1.26, Introduction to the Practice of Statistics, Fifth Edition.]

Properties of the Normal Distribution

1. Symmetric, bell shaped.
2. Mean $\mu$ and standard deviation $\sigma$.
3. Area under the curve is 1.

Because we will mention normal distributions often, a short notation is helpful. We abbreviate the normal distribution with mean $\mu$ and standard deviation $\sigma$ as $N(\mu, \sigma)$. For example: $N(0, 1)$, $N(3, 2)$, $N(2, 0.5)$.

Q: Why are the normal distributions important in statistics?

Here are two reasons, among many others:

1. Normal distributions are good descriptions for some distributions of real data (if you don't believe me, take a look at the figure above!).
2. Many statistical inference procedures based on normal distributions work well for other roughly symmetric distributions (this is an advanced point which will be discussed later in the course, if time permits).

The 68-95-99.7 rule (aka the empirical rule)

Although there are many normal curves, they all have common properties. Here is one of the most important.

THE 68-95-99.7 RULE
In the normal distribution with mean $\mu$ and standard deviation $\sigma$:
• Approximately 68% of the observations fall within $\sigma$ of the mean $\mu$.
• Approximately 95% of the observations fall within $2\sigma$ of $\mu$.
• Approximately 99.7% of the observations fall within $3\sigma$ of $\mu$.
(Definition, pg 71, Introduction to the Practice of Statistics, Fifth Edition.)

[Figure: normal curve marking 68% of data, 95% of data, and 99.7% of data.]

Example 1.3.1 Bigger animals tend to carry their young longer before birth. The length of horse pregnancies from conception to birth varies according to a roughly normal distribution with mean 336 days and standard deviation 3 days. Use the 68-95-99.7 rule to answer the following questions.

1. Sketch a picture of the distribution of the length of horse pregnancies from conception to birth. Also use the short notation discussed earlier to write an abbreviated representation of this distribution.
2. What percent of horse pregnancies fall between 333 and 339 days? What proportion fall between 330 and 342 days?
3. Almost all (99.7%) of horse pregnancies fall in what range of lengths?
4. What percent of horse pregnancies are shorter than 339 days?

Standardizing observations

As the 68-95-99.7 rule suggests, all normal distributions share many properties. In fact, all normal distributions are the same if we measure in units of size $\sigma$ about the mean $\mu$ as center. Changing to these units is called standardizing.

STANDARDIZING AND Z-SCORES
If $x$ is an observation from a distribution that has mean $\mu$ and standard deviation $\sigma$, the standardized value of $x$ is
$$z = \frac{x - \mu}{\sigma}$$
A standardized value is often called a z-score.
(Definition, pg 73, Introduction to the Practice of Statistics, Fifth Edition.)

• A z-score tells us how many standard deviations the original observation falls away from the mean, and in which direction. Observations larger than the mean are positive when standardized, and observations smaller than the mean are negative.
• z-scores allow us to directly compare observations from two different groups.

Example 1.3.2 Scores on the Wechsler Adult Intelligence Scale for the 20 to 34 age group are approximately normally distributed with mean 110 and standard deviation 25. Scores for the 60 to 64 age group are approximately normally distributed with mean 90 and standard deviation 25. Sarah, who is 30, scores 135 on this test. Sarah's mother, who is 60, also takes the test and scores 120. Who scored higher relative to her age group, Sarah or her mother? Who has the higher absolute level of the variable measured by the test? At what percentile of their age groups are Sarah and her mother? That is, what percent of the age group has lower scores?

THE STANDARD NORMAL DISTRIBUTION
The standard normal distribution is the normal distribution $N(0, 1)$ with mean 0 and standard deviation 1. If a variable $X$ has any normal distribution $N(\mu, \sigma)$ with mean $\mu$ and standard deviation $\sigma$, then the standardized variable
$$Z = \frac{X - \mu}{\sigma}$$
has the standard normal distribution.
(Definition, pg 74, Introduction to the Practice of Statistics, Fifth Edition.)

[Figure: the standard normal distribution, $N(0, 1)$.]

Using the standard normal table

Example 1.3.3 Assume that $Z$ follows the standard normal distribution. Consider the following questions.

1. What percent of observations fall below 1, i.e., what is the relative frequency of the event $Z < 1$?
2. What percent of observations fall below 2, i.e., what is the relative frequency of the event $Z < 2$?
3. What percent of observations fall below 1.47, i.e., what is the relative frequency of the event $Z < 1.47$?

Issue: Note that we cannot answer the last question by simply using the 68-95-99.7 rule. To answer such a question more generally, we need to refer to a standard normal table. This table is given in the front cover of the textbook and in the appendix (Table A).

[Figure: standard normal curve with shaded area; table entry (area) 0.9292.]

Example 1.3.4 Find the relative frequency of each of the following events in a standard normal distribution. In each case, sketch a standard normal curve with the area representing the relative frequency shaded.

(a) $Z \le -2.25$
(b) $Z \ge -2.25$
(c) $Z > 1.77$
(d) $-2.25 < Z < 1.77$

Normal distribution calculations

Example 1.3.5 Using the z-score formula given earlier, along with our approach to using the standard normal table, we can answer the following types of questions.

1. Suppose $X \sim N(3, 2)$. Find the proportion of the observations that are less than 4.
2. Suppose $X \sim N(2, 3)$. Find the proportion of the observations that are less than 4.
3. Suppose $X \sim N(2, 3)$. Find the proportion of the observations that are greater than 5.
4. Suppose $X \sim N(2, 3)$. Find the proportion of the observations that are between 4 and 8.

Example 1.3.6 Too much cholesterol in the blood increases the risk of heart disease. Young women are generally less afflicted with high cholesterol than other groups. The cholesterol levels for women aged 20 to 34 follow an approximately normal distribution with mean 185 milligrams per deciliter (mg/dL) and standard deviation 39 mg/dL.

(a) Cholesterol levels above 240 mg/dL demand medical attention. What percent of young women have levels above 240 mg/dL?
(b) Levels above 200 mg/dL are considered borderline high. What percent of young women have blood cholesterol between 200 and 240 mg/dL?

Example 1.3.7 The scores of a reference population on the Wechsler Intelligence Scale for Children (WISC) are normally distributed with $\mu = 100$ and $\sigma = 15$.

(a) What percent of this population have WISC scores below 100?
(b) Below 80?
(c) Above 140?
(d) Between 100 and 120?

Normal quantile plots

It is important to be able to recognize when we have normal data. The best possible visual tool for determining normal data is the normal quantile plot. Though we will not discuss how to make a normal quantile plot, it is important to understand how to interpret such a plot.

USE OF NORMAL QUANTILE PLOTS
If the points on a normal quantile plot lie close to a straight line, the plot indicates that the data are normal. Systematic deviations from a straight line indicate a nonnormal distribution. Outliers appear as points that are far away from the overall pattern of the plot.
(Definition, pg 81, Introduction to the Practice of Statistics, Fifth Edition.)

[Figure: four normal quantile plots, each plotted against z-score; one vertical axis is labeled "Dollars spent". Panel captions:
• Ideal case: data appear to be from a normal distribution.
• Data appear to be close to normal, but not exactly (i.e., roughly normal data).
• Roughly normal data with outliers.
• Skewed data (i.e., data are not normal).]

Example 1.3.10 Try to classify each of the following plots as normal, roughly normal, or nonnormal. Also identify plots with possible outliers.

[Figure 1.32: normal quantile plot of breaking strength (pounds) against z-score.]
[Figure 1.33: normal quantile plot of survival time (days) against z-score.]
[Figure 1.34: normal quantile plot of rainwater pH value against z-score.]
(Figures 1.32 to 1.34: Introduction to the Practice of Statistics, Fifth Edition.)
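As a numerical check of the 68-95-99.7 rule and of the Table A entry 0.9292 quoted in these notes, the areas can be computed directly. This is only a sketch: the notes themselves use the printed table, and the function names `standard_normal_cdf` and `area_within` are illustrative choices, not anything from the course or textbook.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """Area under the N(0, 1) curve to the left of z (what Table A tabulates)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_within(k: float) -> float:
    """Proportion of any normal distribution within k standard deviations of its mean."""
    return standard_normal_cdf(k) - standard_normal_cdf(-k)


# The three areas behind the 68-95-99.7 rule.
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")

# The table lookup for Z < 1.47.
print(f"P(Z < 1.47) = {standard_normal_cdf(1.47):.4f}")
```

The printed areas are close to, but not exactly, 68%, 95%, and 99.7%, which is why the rule is stated with "approximately".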

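The comparison of Sarah and her mother in the Wechsler example comes down to two z-scores. A minimal sketch, assuming the means and standard deviations stated in the example; `z_score` and `standard_normal_cdf` are hypothetical helper names introduced here for illustration.

```python
import math


def z_score(x: float, mean: float, sd: float) -> float:
    """Standardized value: how many standard deviations x lies from the mean."""
    return (x - mean) / sd


def standard_normal_cdf(z: float) -> float:
    """Area under the N(0, 1) curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Sarah: score 135 in the 20 to 34 group, N(110, 25).
sarah_z = z_score(135, 110, 25)
# Her mother: score 120 in the 60 to 64 group, N(90, 25).
mother_z = z_score(120, 90, 25)

for name, z in (("Sarah", sarah_z), ("Mother", mother_z)):
    percentile = 100 * standard_normal_cdf(z)
    print(f"{name}: z = {z:.2f}, percentile = {percentile:.1f}")
```

Sarah has the higher raw score, but her mother has the larger z-score, so the mother scored higher relative to her own age group.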

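The cholesterol example follows the general recipe for normal calculations: standardize each cutoff, then take areas under the standard normal curve. The sketch below assumes the stated distribution, mean 185 mg/dL and standard deviation 39 mg/dL; `normal_cdf` is an illustrative helper, and its results can differ slightly from Table A answers because the table rounds z to two decimal places.

```python
import math


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Proportion of an N(mean, sd) distribution that lies below x."""
    z = (x - mean) / sd  # standardize first
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


MEAN, SD = 185.0, 39.0  # cholesterol, women aged 20 to 34 (mg/dL)

# (a) Proportion above 240 mg/dL: one minus the area to the left of 240.
p_above_240 = 1.0 - normal_cdf(240, MEAN, SD)

# (b) Proportion between 200 and 240 mg/dL: difference of two left areas.
p_200_to_240 = normal_cdf(240, MEAN, SD) - normal_cdf(200, MEAN, SD)

print(f"above 240 mg/dL: {100 * p_above_240:.1f}%")
print(f"between 200 and 240 mg/dL: {100 * p_200_to_240:.1f}%")
```

The same two patterns, "one minus a left area" and "difference of two left areas", cover the greater-than and between questions in the other examples.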
