# Econometric Methods EC 228

This 48-page set of class notes was uploaded by Jayda Beahan Jr. on Saturday, October 3, 2015. The notes belong to EC 228 at Boston College, taught by Staff in Fall. Since its upload, it has received 16 views. For similar materials see /class/218056/ec-228-boston-college in Economics at Boston College.


## Wooldridge, Introductory Econometrics, 4th ed., Appendix C: Fundamentals of mathematical statistics

A short review of the principles of mathematical statistics. Econometrics is concerned with statistical inference: learning about the characteristics of a population from a sample of that population. The population is a well-defined group of subjects, and it is important to define the population of interest. Are we trying to study the unemployment rate of all labor force participants, or only teenaged workers, or only AHANA workers? Given a population, we may define an economic model that contains parameters of interest: coefficients or elasticities that express the effects of changes in one variable upon another.

Let Y be a random variable (rv) representing a population with probability density function (pdf) f(y; θ), with θ a scalar parameter. We assume that we know f but do not know the value of θ. Let a random sample from the population be (Y₁, ..., Y_N), with each Yᵢ an independent random variable drawn from f(y; θ). We speak of the Yᵢ as being i.i.d.: independently and identically distributed. We often assume that random samples are drawn from the Bernoulli distribution. For instance, if I pick a student randomly from my class list, what is the probability that she is female? That probability is ρ, where ρ is the fraction of students who are female, so P(Yᵢ = 1) = ρ and P(Yᵢ = 0) = 1 − ρ. For many other applications we will assume that samples are drawn from the Normal distribution. In that case the pdf is characterized by two parameters, μ and σ², expressing the mean and spread of the distribution, respectively.

## Finite-sample properties of estimators

The finite-sample properties (as opposed to asymptotic properties) apply to all sample sizes, large or small. These are of great relevance when we are dealing with samples of limited size and are unable to conduct a survey to generate a larger sample. How well will estimators perform in this context? First, we must distinguish between estimators and estimates. An estimator is a rule, or algorithm, that specifies how the sample information should be manipulated in order to generate a numerical estimate. Estimators have properties: they may be reliable in some sense to be defined, and they may be easy or difficult to calculate; that difficulty may itself be a function of sample size. For instance, a test that involves measuring the distances between every pair of observations of a variable involves an order of calculations that grows more than linearly with sample size. An estimator with which we are all familiar is the sample average, or arithmetic mean, of N numbers: add them up and divide by N. That estimator has certain properties, and its application to a sample produces an estimate. We will often call this a point estimate, since it yields a single number, as opposed to an interval estimate, which produces a range of values associated with a particular level of confidence. For instance, an election poll may state that 55% are expected to vote for candidate A, with a margin of error of ±4%. If we trust those results, it is likely that candidate A will win, with between 51% and 59% of the vote.

We are concerned with the sampling distributions of estimators: that is, how the estimates they generate will vary when the estimator is applied to repeated samples. What finite-sample properties might we be able to establish for a given estimator and its sampling distribution? First of all, we are concerned with unbiasedness. An estimator W of θ is said to be unbiased if E(W) = θ for all possible values of θ. If an estimator is unbiased, then its probability distribution has an expected value equal to the population parameter it is estimating. Unbiasedness does not mean that a given estimate is equal to θ, or even very close to θ; it means that if we drew an infinite number of samples from the population and averaged the W estimates, we would obtain θ. An estimator that is biased exhibits Bias(W) = E(W) − θ. The magnitude of the bias will depend on the distribution of the Yᵢ and on the function that transforms the Yᵢ into W, that is, on the estimator itself.
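The unbiasedness property, E(W) = θ across repeated samples, can be illustrated by simulation. The following is a minimal sketch, not part of the original notes: the population proportion 0.6, the sample size, and the replication count are all illustrative, using the Bernoulli (fraction-female) example above.

```python
# Simulate repeated sampling from a Bernoulli population and check that
# the sample average is an unbiased estimator of the population proportion.
import numpy as np

rng = np.random.default_rng(42)
p = 0.6               # hypothetical population proportion (e.g., fraction female)
n, reps = 30, 20_000  # sample size and number of repeated samples

# Each row is one random sample of size n; each row mean is one estimate W.
samples = rng.binomial(1, p, size=(reps, n))
estimates = samples.mean(axis=1)

# Averaging the estimates across many samples recovers p (approximately).
print(estimates.mean())
```

No single estimate need be close to p, but their average across samples is: this is exactly the "infinite number of samples" thought experiment in the text.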
In some cases we can demonstrate unbiasedness (or show that the bias is zero) regardless of the distribution of Y. For instance, consider the sample average Ȳ, which is an unbiased estimator of the population mean μ:

E(Ȳ) = E[(1/n) Σᵢ Yᵢ] = (1/n) Σᵢ E(Yᵢ) = (1/n)(nμ) = μ.

Any hypothesis test on the mean will require an estimate of the variance σ² from a population with mean μ. Since we do not know μ but must estimate it with Ȳ, the sample variance is defined as

S² = [1/(n − 1)] Σᵢ (Yᵢ − Ȳ)²,

with one degree of freedom lost by the replacement of the population parameter μ with its sample estimate Ȳ. This is an unbiased estimator of the population variance, whereas the counterpart with a divisor of n will be biased unless we know μ. Of course, the degree of this bias will depend on the difference between n/(n − 1) and unity, which disappears as n → ∞.

There are two difficulties with unbiasedness as a criterion for an estimator: some quite reasonable estimators are unavoidably biased but useful, and, more seriously, many unbiased estimators are quite poor. For instance, picking the first value in a sample as an estimate of the population mean and discarding the remaining n − 1 values yields an unbiased estimator of μ, since E(Y₁) = μ, but this is a very imprecise estimator. What additional information do we need to evaluate estimators? We are concerned with the precision of the estimator as well as its bias. An unbiased estimator with a smaller sampling variance will dominate its counterpart with a larger sampling variance; for example, we can demonstrate that the estimator that uses only the first observation to estimate μ has a much larger sampling variance than the sample average for nontrivial n. What is the sampling variance of the sample average?

Var(Ȳ) = Var[(1/n) Σᵢ Yᵢ] = (1/n²) Σᵢ Var(Yᵢ) = (1/n²)(nσ²) = σ²/n,

so that the precision of the sample average depends on the sample size as well as the unknown variance of the underlying distribution of Y. Using the same logic, we can derive the sampling variance of the estimator that uses only the first observation of a sample: it is σ².
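Both claims in this passage, the unbiasedness of the (n − 1)-divisor variance estimator and Var(Ȳ) = σ²/n, can be checked numerically. A sketch under illustrative parameters (μ = 5, σ = 2, n = 10; none of these come from the notes):

```python
# Compare the n-1 divisor (unbiased) and n divisor (biased) variance
# estimators, and check the sampling variance of the sample mean.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 5.0, 2.0, 10, 50_000

samples = rng.normal(mu, sigma, size=(reps, n))

s2_unbiased = samples.var(axis=1, ddof=1)  # divisor n-1
s2_biased = samples.var(axis=1, ddof=0)    # divisor n

print(s2_unbiased.mean())  # close to sigma^2 = 4
print(s2_biased.mean())    # close to ((n-1)/n) * sigma^2 = 3.6

# Sampling variance of the sample average: sigma^2 / n = 0.4
print(samples.mean(axis=1).var())
```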
Even for a sample of size 2, the sample mean will be twice as precise. This leads us to the concept of efficiency: given two unbiased estimators of θ, an estimator W₁ is efficient relative to W₂ when Var(W₁) ≤ Var(W₂) for all θ, with strict inequality for at least one θ. A relatively efficient unbiased estimator dominates its less efficient counterpart. We can compare two estimators, even if one or both is biased, by comparing mean squared error (MSE): MSE(W) = E[(W − θ)²]. This expression can be shown to equal the variance of the estimator plus the square of the bias; thus it equals the variance for an unbiased estimator.

## Large-sample (asymptotic) properties of estimators

We can compare estimators, and evaluate their relative usefulness, by appealing to their large-sample, or asymptotic, properties: that is, how do they behave as the sample size goes to infinity? We have seen that the sample average has a sampling variance with a limiting value of zero as n → ∞. The first asymptotic property is consistency. If W_n is an estimator of θ based on a sample (Y₁, ..., Y_n) of size n, W_n is said to be a consistent estimator of θ if, for every ε > 0,

P(|W_n − θ| > ε) → 0 as n → ∞.

Intuitively, a consistent estimator becomes more accurate as the sample size increases without bound. If an estimator does not possess this property, it is said to be inconsistent. In that case it does not matter how much data we have: the recipe that tells us how to use the data to estimate θ is flawed. If an estimator is unbiased and its variance shrinks to zero as n → ∞, then the estimator is consistent. A consistent estimator has probability limit, or plim, equal to the population parameter: plim(Ȳ) = μ.

Some mechanics of plims: let θ be a parameter and g(·) a continuous function, so that γ = g(θ). Suppose plim(W_n) = θ and we devise an estimator of γ, G_n = g(W_n). Then plim(G_n) = γ; that is, plim g(W_n) = g(plim W_n). This allows us to establish the consistency of estimators that can be shown to be transformations of other consistent estimators. For instance, we can demonstrate that the estimator of the population variance given above is not only unbiased but consistent.
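The relative efficiency argument can be made concrete: both the sample mean and the first-observation rule are unbiased for μ, so their MSEs reduce to their sampling variances, σ²/n versus σ². A simulation sketch, with all parameters illustrative rather than taken from the notes:

```python
# Compare two unbiased estimators of mu by simulated mean squared error.
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 0.0, 1.0, 25, 40_000

samples = rng.normal(mu, sigma, size=(reps, n))
w1 = samples.mean(axis=1)  # sample average: sampling variance sigma^2/n
w2 = samples[:, 0]         # first observation only: sampling variance sigma^2

mse1 = np.mean((w1 - mu) ** 2)  # roughly 1/25 = 0.04
mse2 = np.mean((w2 - mu) ** 2)  # roughly 1.0
print(mse1, mse2)
```

Since both estimators are unbiased, the simulated MSEs are essentially pure sampling variance, and the sample mean dominates by a factor of n.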
The standard deviation is the square root of the variance, a nonlinear function that is continuous for positive arguments; thus the standard deviation S is a consistent estimator of the population standard deviation. Some additional properties of plims: if plim(T_n) = α and plim(U_n) = β, then

plim(T_n + U_n) = α + β,
plim(T_n · U_n) = αβ,
plim(T_n / U_n) = α/β, provided β ≠ 0.

Consistency is a property of point estimators: the distribution of the estimator collapses around the population parameter in the limit, but that says nothing about the shape of the distribution for a given sample size. To work with interval estimators and hypothesis tests, we need a way to approximate the distribution of the estimators. Most estimators used in econometrics have distributions that are reasonably approximated by the Normal distribution for large samples, leading to the concept of asymptotic normality:

P(Z_n ≤ z) → Φ(z) as n → ∞,

where Φ is the standard normal cumulative distribution function (cdf). We will often say Z_n ~ N(0, 1) asymptotically, or that Z_n is asymptotically normal. This relates to one form of the central limit theorem (CLT): if (Y₁, ..., Y_n) is a random sample with mean μ and variance σ², then

Z_n = (Ȳ − μ) / (σ/√n)

has an asymptotic standard normal distribution. Regardless of the population distribution of Y, this standardized version of Ȳ will be asymptotically normal, and the entire distribution of Z_n will become arbitrarily close to the standard normal as n → ∞. Since many of the estimators we will derive in econometrics can be viewed as sample averages, the law of large numbers and the central limit theorem can be combined to show that these estimators will be asymptotically normal. Indeed, the statistic above will be asymptotically normal even if we replace σ with a consistent estimator of that parameter, S.

## General approaches to parameter estimation

What general strategies will provide us with estimators having desirable properties such as unbiasedness, consistency, and efficiency? One of the most fundamental strategies for estimation is the method of moments, in which we replace population moments with their sample counterparts.
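The CLT claim, that the standardized mean of even a highly non-normal variable is approximately N(0, 1), can be checked against Φ directly. A sketch using a Bernoulli population; the parameters are illustrative, and scipy supplies the normal cdf:

```python
# Standardized sample means from a Bernoulli population, compared with
# the standard normal cdf at a few evaluation points.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)
p, n, reps = 0.3, 200, 50_000
mu, sigma = p, np.sqrt(p * (1 - p))  # Bernoulli mean and standard deviation

ybar = rng.binomial(1, p, size=(reps, n)).mean(axis=1)
z = (ybar - mu) / (sigma / np.sqrt(n))  # Z_n = (Ybar - mu) / (sigma/sqrt(n))

for c in (-1.0, 0.0, 1.0):
    # Empirical P(Z_n <= c) should be close to Phi(c).
    print(np.mean(z <= c), norm.cdf(c))
```

The small remaining discrepancy comes from the discreteness of the binomial counts, and it shrinks as n grows.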
We have seen this above, where a consistent estimator of the population variance is defined by replacing the unknown population μ with a consistent estimate thereof, Ȳ. A second widely employed strategy is the principle of maximum likelihood, where we choose an estimator of the population parameter θ by finding the value that maximizes the likelihood of observing the sample data. We will not focus on maximum likelihood estimators in this course, but note their importance in econometrics. Most of our work here is based on the least squares principle: to find an estimate of a population parameter, we should solve a minimization problem. We can readily show that the sample average is a method of moments estimator, and in fact a maximum likelihood estimator as well. We demonstrate now that the sample average is a least squares estimator: the problem

min over m of Σᵢ (Yᵢ − m)²

yields an estimator m̂ that is identical to Ȳ. We may show that the value m̂ = Ȳ minimizes the sum of squared deviations about the sample mean, and that any other value m would have a larger sum, i.e., would not be least squares. Standard regression techniques, to which we will devote much of the course, are often called OLS: ordinary least squares.

## Interval estimation and confidence intervals

Since an estimator will yield a point estimate as well as a sampling variance, we may generally form a confidence interval around the point estimate in order to make probability statements about a population parameter. For instance, the fraction of Firestone tires involved in fatal accidents is surely not exactly 0.0005 of those sold. Any number of samples would yield estimates of that fraction differing from that number, and for a continuous random variable, the probability of a point is zero. But we can test the hypothesis that 0.0005 of the tires are involved in fatal accidents if we can generate both a point and an interval estimate for that parameter, and if the interval estimate cannot reject 0.0005 as a plausible value.
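That the sample average solves min over m of Σ(Yᵢ − m)² is easy to verify numerically: evaluate the sum of squared deviations on a fine grid and confirm the minimizer coincides with the arithmetic mean. A sketch with simulated data, all numbers illustrative:

```python
# Verify numerically that the minimizer of sum (Y_i - m)^2 is the mean.
import numpy as np

rng = np.random.default_rng(3)
y = rng.normal(10.0, 3.0, size=50)

grid = np.linspace(y.min(), y.max(), 40_001)
ssd = ((y[:, None] - grid[None, :]) ** 2).sum(axis=0)  # sum of squared deviations
m_star = grid[ssd.argmin()]

print(m_star, y.mean())  # the two agree up to the grid resolution
```

The analytic argument is one line: the first-order condition −2 Σ(Yᵢ − m) = 0 gives m = Ȳ; the grid search is just a numerical confirmation.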
This is the concept of a confidence interval, which is defined with regard to a given level of confidence, or level of probability. For a standard normal N(0, 1) variable, the statistic

Z = (Ȳ − μ) / (σ/√n)

defines the interval estimate

[Ȳ − 1.96 σ/√n, Ȳ + 1.96 σ/√n].

We do not conclude from this that the probability that μ lies in the interval is 0.95: the population parameter either lies in the interval or it does not. The proper way to consider the confidence interval is that if we construct a large number of random samples from the population, 95% of the resulting intervals will contain μ. Thus, if a hypothesized value for μ lies outside the confidence interval for a single sample, that would occur by chance only 5% of the time:

P(−1.96 < Z < 1.96) = 0.95.

But what if we do not have a standard normal variate for which we know the variance equals unity? If we have a variable X that we conclude is distributed as N(μ, σ²), we arrive at the difficulty that we do not know σ² and thus cannot specify the confidence interval. Via the method of moments, we replace the unknown σ² with a consistent estimate S² to form the transformed statistic

t = (Ȳ − μ) / (S/√n) ~ t(n − 1),

denoting that its distribution is no longer standard normal but Student's t with n − 1 degrees of freedom. The t distribution has fatter tails than the normal; above 20 or 25 degrees of freedom, it is approximated quite well by the normal. Thus confidence intervals constructed with the t distribution will be wider for small n, since the critical value will be larger than 1.96. A 95% confidence interval, given the symmetry of the t distribution, will leave 2.5% of probability in each tail (a two-tailed test). If c_{α/2} is the 100(1 − α/2) percentile of the t distribution, a 100(1 − α)% confidence interval for the mean will be defined as

[ȳ − c_{α/2} s/√n, ȳ + c_{α/2} s/√n],

where s is the estimated standard deviation of Y. We often refer to s/√n as the standard error of the parameter: in this case, the standard error of our estimate of μ. Note well the difference between the concepts of the standard deviation of the underlying distribution (estimated by s) and the standard error, or precision, of our estimate of the mean μ.
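The interval [ȳ − c_{α/2} s/√n, ȳ + c_{α/2} s/√n] can be computed directly, with scipy's t distribution supplying the critical value. A sketch with simulated data; the sample and its parameters are illustrative:

```python
# A 95% confidence interval for the mean using the t(n-1) distribution.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
y = rng.normal(2.0, 1.5, size=20)
n = len(y)

ybar = y.mean()
se = y.std(ddof=1) / np.sqrt(n)   # standard error of the sample mean
c = stats.t.ppf(0.975, df=n - 1)  # critical value; exceeds 1.96 for small n

lo, hi = ybar - c * se, ybar + c * se
print(lo, hi)
```

Note that c here exceeds 1.96, reflecting the fatter tails of the t distribution for small n, exactly as described in the text.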
We will return to this distinction when we consider regression parameters. A simple rule of thumb for large samples is that a 95% confidence interval is roughly two standard errors on either side of the point estimate, the counterpart of a |t| of 2 denoting significance of a parameter. If an estimated parameter is more than two standard errors from zero, a test of the hypothesis that it equals zero in the population will likely be rejected.

## Hypothesis testing

We want to test a specific hypothesis about the value of a population parameter θ. We may believe that the parameter equals 0.42, so that we state the null and alternative hypotheses:

H₀: θ = 0.42,  H_A: θ ≠ 0.42.

In this case we have a two-sided alternative: we will reject the null if our point estimate is significantly below 0.42 or if it is significantly above 0.42. In other cases we may specify the alternative as one-sided. For instance, in a quality control study, our null might be that the proportion of rejects from the assembly line is no more than 0.03, versus the alternative that it is greater than 0.03. A rejection of the null would lead to a shutdown of the production process, whereas a smaller proportion of rejects would not be cause for concern.

Using the principles of the scientific method, we set up the hypothesis and consider whether there is sufficient evidence against the null to reject it. Like the principle that a finding of guilt must be associated with evidence beyond a reasonable doubt, the null will stand unless sufficient evidence is found to reject it as unlikely. Just as in the courts, there are two potential errors of judgment: we may find an innocent person guilty and reject a null even when it is true; this is Type I error. We may also fail to convict a guilty person, that is, fail to
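A two-sided test of H₀: μ = 0.42 can be run as a one-sample t test: the statistic is (ȳ − 0.42)/(s/√n) and the p-value comes from the t(n − 1) distribution. A sketch in which the data are simulated (here under the null, purely for illustration), with scipy's `ttest_1samp` doing the bookkeeping:

```python
# Two-sided test of H0: mu = 0.42 against HA: mu != 0.42.
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
y = rng.normal(0.42, 0.1, size=40)  # illustrative data, generated under H0

t_stat, p_value = stats.ttest_1samp(y, popmean=0.42)
print(t_stat, p_value)

alpha = 0.05  # level of the test: tolerated probability of Type I error
print("reject H0" if p_value < alpha else "fail to reject H0")
```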
especially considering the con sequences of punishing the innocent or even putting them to death we must be concerned with the magnitude of these two sources of er ror in statistical inference We construct hy pothesis tests so as to make the probability of a Type I error fairly small this is the level of the test and is usually denoted as or For instance if we operate at a 95 level of con fidence then the level of the test is or 005 When we set or we are expressing our tolerance for committing a Type I error and rejecting a true null Given a we would like to minimize the probability of a Type II error or equiva lently maximize the power of the test which is just one minus the probability of committing a Type II error and failing to reject a false null We must balance the level of the test and the risk of falsely rejecting the truth with the power of the test and failing to reject a false null When we use a computer program to calculate point and interval estimates we are given the information that will allow us to reject or fail to reject a particular null This is usually phrased in terms of p values which are the tail prob abilities associated with a test statistic If the p value is less than the level of the test then it leads to a rejection a p value of 0035 allows us to reject the null at the level of 005 One must be careful to avoid the misinterpretation of a p value of say 094 which is indicative of the massive failure to reject that null One should also note the duality between con fidence intervals and hypothesis tests They utilize the same information the point esti mate the precision as expressed in the stan dard error and a value taken from the under lying distribution of the test statistic such as 196 If the boundary of the 95 confidence interval contains a value 6 then a hypothesis test that the population parameter equals 6 will be on the borderline of acceptance and rejec tion at the 5 level We can consider these quantities as either defining an 
interval esti mate for the parameter or alternatively sup porting an hypothesis test for the parameter Woodridge Introductory Econometrics 3d ed Appendix C Fundamentals of mathemati cal statistics A short review of the principles of mathemati cal statistics Econometrics is concerned with statistical inference learning about the char acteristics of a population from a sample of the population The population is a well defined group of subjects and it is important to de fine the population of interest Are we trying to study the unemployment rate of all labor force participants or only teenaged workers or only AHANA workers Given a population we may define an economic model that contains parameters of interest coefficients or elastic ities which express the effects of changes in one variable upon another Let Y be a random variable rv representing a population with probability density function pdf fy6 with 0 a scalar parameter We assume that we know fbut do not know the value of 6 Let a random sample from the pop ulation be Y1YN with Y being an inde pendent random variable drawn from fy6 We speak of Y1 being lid independently and identically distributed We often assume that random samples are drawn from the Bernoulli distribution for instance that ifI pick a stu dent randomly from my class list what is the probability that she is female That probabil ity is 7 where 7 of the students are female so PYZ 1 7 and PYZ O 1 7 For many other applications we will assume that samples are drawn from the Normal distribu tion In that case the pdf is characterized by two parameters u and 02 expressing the mean and spread of the distribution respectively Finite sample properties Of estimators The finite sample properties as opposed to asymptotic properties apply to all sample sizes large or small These are of great relevance when we are dealing with samples of limited size and unable to conduct a survey to gener ate a larger sample How well will estimators perform in this context First 
we must distin guish between estimators and estimates An estimator is a rule or algorithm that speci fies how the sample information should be ma nipulated in order to generate a numerical es timate Estimators have properties they may be reliable in some sense to be defined they may be easy or difficult to calculate that dif ficulty may itself be a function of sample size For instance a test which involves measuring the distances between every observation of a variable involves an order of calculations which grows more than linearly with sample size An estimator with which we are all familiar is the sample average or arithmetic mean of N num bers add them up and divide by N That es timator has certain properties and its applica tion to a sample produces an estimate We will often call this a point estimate since it yields a single number as opposed to an inter val estimate which produces a range of val ues associated with a particular level of confi dence For instance an election poll may state that 55 are expected to vote for candidate A with a margin of error of 14 If we trust those results it is likely that candidate A will win with between 51 and 59 of the vote We are concerned with the sampling distribu tions of estimators that is how the estimates they generate will vary when the estimator is applied to repeated samples What are the finite sample properties which we might be able to establish for a given estimator and its sampling distribution First of all we are concerned with unbiasedness An estima tor W of 0 is said to be unbiased if EW 0 for all possible values of 6 If an estimator is unbiased then its probability distribution has an expected value equal to the population pa rameter it is estimating Unbiasedness does not mean that a given estimate is equal to 0 or even very close to 0 it means that if we drew an infinite number of samples from the population and averaged the W estimates we would obtain 6 An estimator that is biased exhibits BiasW EW 6 The 
magnitude of the bias will depend on the distribution of the Y and the function that transforms Y into W that is the estimator In some cases we can demonstrate unbiasedness or show that biasO irregardless of the distribution of Y for instance consider the sample average Y which is an unbiased estimate of the popula tion mean u E M 5 t m is lI i lI l lI l lI l 3 s 7 ll 7 Any hypothesis tests on the mean will require an estimate of the variance 02 from a popu lation with mean u Since we do not know u but must estimate it with Y the estimate of sample variance is defined as 1 n Eng W quot 12 21 32 with one degree of freedom lost by the replace ment of the population statistic u with its sam ple estimate Y This is an unbiased estimate of the population variance whereas the counter part with a divisor of n will be biased unless we know u Of course the degree of this bias will depend on the difference between and unity which disappears as n gt 00 Two difficulties with unbiasedness as a crite rion for an estimator some quite reasonable estimators are unavoidably biased but useful and more seriously many unbiased estimators are quite poor For instance picking the first value in a sample as an estimate of the popula tion mean and discarding the remaining n l values yields an unbiased estimator of u since EY1 u but this is a very imprecise estima tor What additional information do we need to evaluate estimators We are concerned with the precision of the estimator as well as its bias An unbiased estimator with a smaller sampling variance will dominate its counter part with a larger sampling variance eg we can demonstrate that the estimator that uses only the first observation to estimate M has a much larger sampling variance than the sample average for nontrivial n What is the sampling variance of the sample average VarO7 Var 1 in YIL i1 1 n 1 n 2 na so that the precision of the sample average de pends on the sample size as well as the un known variance of the underlying 
distribution of Y Using the same logic we can derive the sampling variance of the estimator that uses only the first observation of a sample as 02 Even for a sample of size 2 the sample mean will be twice as precise This leads us to the concept of efficiency given two unbiased estimators of 0 an estima tor W1 is efficient relative to W2 when VarW1 g VarW2 V9 with strict inequality for at least one 6 A relatively efficient unbiased estimator dominates its less efficient counterpart We can compare two estimators even if one or both is biased by comparing mean squared er ror MSE MSEW E W m2 This ex pression can be shown to equal the variance of the estimator plus the square of the bias thus it equals the variance for an unbiased estima tor Large sample asymptotic properties of estimators We can compare estimators and evaluate their relative usefulness by appealing to their large sample properties or asymptotic properties That is how do they behave as sample size goes to infinity We see that the sample aver age has a sampling variance with limiting value of zero as n gt 00 The first asymptotic prop erty is that of consistency If W is an estimate of 0 based on a sample Y1Yn of size n W is said to be a consistent estimator of 6 if for every 6 gt O PWn 9gte gtOasn gtoo Intuitively a consistent estimator becomes more accurate as the sample size increases without bound If an estimator does not possess this property it is said to be inconsistent In that case it does not matter how much data we have the recipe that tells us how to use the data to estimate 0 is flawed If an estimator is biased but its variance shrinks as n gt 00 then the estimator is consistent A consistent estimator has probability limit or plim equal to the population parameter plimY u Some mechanics of plims let 0 be a parameter and g a continuous func tion so that 7 96 Suppose plimWn 0 and we devise an estimator of 7 Gn 9Wn Then plimGn 7 or plim 9Wn g plimWn This allows us to establish the consistency of 
estimators which can be shown to be transfor mations of other consistent estimators For in stance we can demonstrate that the estimator given above of the population variance is not only unbiased but consistent The standard deviation is the square root of the variance a nonlinear function continuous for positive ar guments thus the standard deviation 8 is a consistent estimator of the population stan dard deviation Some additional properties of plims if plimTn a and plimUn plim Tn I Um or I plim TnUn oz Plim TnUh 0457 3 7 0 Consistency is a property of point estimators the distribution of the estimator collapses around the population parameter in the limit but that says nothing about the shape of the distribu tion for a given sample size To work with in terval estimators and hypothesis tests we need a way to approximate the distribution of the es timators Most estimators used in economet rics have distributions that are reasonably ap proximated by the Normal distribution for large samples leading to the concept of asymptotic normality PZn z gtltDz asn gtoo where ltlgt is the standard normal cumulative distribution function cdf We will often say Zn NO 1 or Z is asy N This relates to one form of the central limit theorem CLT If Y1Yn is a random sample with mean u and variance 02 Zn 2 Y IAH has an asymptotic standard normal distribu tion Regardless of the population distribu tion of Y this standardized version of Y will be asy N and the entire distribution of Z will become arbitrarily close to the standard nor mal as n gt 00 Since many of the estimators we will derive in econometrics can be viewed as sample averages the law of large numbers and the central limit theorem can be combined to show that these estimators will be asy N In deed the above estimator will be asy N even if we replace a with a consistent estimator of that parameter S General approaches to parameter estima tion What general strategies will provide us with es timators with desirable properties such 
as un biasedness consistency and efficiency One of the most fundamental strategies for estimation is the method of moments in which we re place population moments with their sample counterparts We have seen this above where a consistent estimator of sample variance is defined by replacing the unknown population u with a consistent estimate thereof Y A sec ond widely employed strategy is the principle of maximum likelihood where we choose an es timator of the population parameter 0 by find ing the value that maximizes the likelihood of observing the sample data We will not fo cus on maximum likelihood estimators in this course but note their importance in econo metrics Most of our work here is based on the least squares principle that to find an esti mate of the population parameter we should solve a minimization problem We can readily show that the sample average is a method of moments estimator and is in fact a maximum likelihood estimator as well We demonstrate now that the sample average is a least squares estimator 7 min Y m2 1 will yield an estimator m which is identical to that defined as 7 We may show that the value m minimizes the sum of squared devi ations about the sample mean and that any other value m would have a larger sum or would not be least squares Standard re gression techniques to which we will devote much of the course are often called OLS ordinary least squares Interval estimation and confidence inter vals Since an estimator will yield a value or point estimate as well as a sampling variance we may generally form a confidence interval around the point estimate in order to make proba bility statements about a population param eter For instance the fraction of Firestone tires involved in fatal accidents is surely not 00005 of those sold Any number of samples would yield estimates of that mean differing from that number and for a continuous ran dom variable the probability of a point is zero But we can test the hypothesis that 00005 of the tires 
are involved with fatal accidents if we can generate both a point and interval esti mate for that parameter and if the interval estimate cannot reject 00005 as a plausible value This is the concept of a confidence in terval which is defined with regard to a given level of confidence or level of probability For a standard normal NO1 variable 37 M 1N5 which defines the interval estimate 7 17956 7 17956 We do not conclude from this that the probability that u lies in the inter val is 095 the population parameter either lies in the interval or it does not The proper way to consider the confidence interval is that if we construct a large number of random sam ples from the population 95 of them will contain u Thus if a hypothesized value for u lies outside the confidence interval for a single sample that would occur by chance only 5 of the time P 196 lt lt 196 095 But what if we do not have a standard normal variate for which we know the variance equals unity If we have a variable X which we con clude is distributed as Nu02gt we arrive at the difficulty that we do not know 02 and thus cannot specify the confidence interval Via the method of moments we replace the unknown 02 with a consistent estimate 82 to form the transformed statistic Y u tn S denoting that its distribution is no longer stan dard normal but student s t with n degrees of freedom The 75 distribution has fatter tails than does the normal above 20 or 25 degrees of freedom it is approximated quite well by the normal Thus confidence intervals con structed with the 75 distribution will be wider for small n since the value will be larger than 196 A 95 confidence interval given the symmetry of the 75 distribution will leave 25 of probability in each tail a two tailed t test If ca is the 1001 a percentile in the t distribu tion a 1001 a confidence interval for the mean will be defined as 8 8 y Cap a y Cor2 where s is the estimated standard deviation of Y We often refer to in as the standard er ror of the 
parameter, in this case the standard error of our estimate of $\mu$. Note well the difference between the concepts of the standard deviation of the underlying distribution (an estimate of $\sigma$) and the standard error, or precision, of our estimate of the mean $\mu$. We will return to this distinction when we consider regression parameters. A simple rule of thumb for large samples is that a 95% confidence interval is roughly two standard errors on either side of the point estimate: the counterpart of a "$t$ of 2" denoting significance of a parameter. If an estimated parameter is more than two standard errors from zero, a test of the hypothesis that it equals zero in the population will likely be rejected.

Hypothesis testing

We want to test a specific hypothesis about the value of a population parameter $\theta$. We may believe that the parameter equals 0.42, so that we state the null and alternative hypotheses:

$$H_0: \theta = 0.42, \qquad H_A: \theta \neq 0.42.$$

In this case we have a two-sided alternative: we will reject the null if our point estimate is "significantly" below 0.42, or if it is "significantly" above 0.42. In other cases, we may specify the alternative as one-sided. For instance, in a quality control study our null might be that the proportion of rejects from the assembly line is no more than 0.03, versus the alternative that it is greater than 0.03. A rejection of the null would lead to a shutdown of the production process, whereas a smaller proportion of rejects would not be cause for concern. Using the principles of the scientific method, we set up the hypothesis and consider whether there is sufficient evidence against the null to reject it. Like the principle that a finding of guilt must be associated with evidence "beyond a reasonable doubt," the null will stand unless sufficient evidence is found to reject it as unlikely. Just as in the courts, there are two potential errors of judgment: we may find an innocent person guilty, and reject a null even when it is true; this is Type I error. We may also fail to convict a guilty person, or
fail to reject a false null; this is Type II error. Just as the judicial system tries to balance those two types of error, especially considering the consequences of punishing the innocent (or even putting them to death), we must be concerned with the magnitude of these two sources of error in statistical inference. We construct hypothesis tests so as to make the probability of a Type I error fairly small; this is the level of the test, and is usually denoted as $\alpha$. For instance, if we operate at a 95% level of confidence, then the level of the test is $\alpha = 0.05$. When we set $\alpha$, we are expressing our tolerance for committing a Type I error and rejecting a true null. Given $\alpha$, we would like to minimize the probability of a Type II error, or equivalently, maximize the power of the test, which is just one minus the probability of committing a Type II error and failing to reject a false null. We must balance the level of the test (the risk of falsely rejecting the truth) against the power of the test (the risk of failing to reject a false null). When we use a computer program to calculate point and interval estimates, we are given the information that will allow us to reject or fail to reject a particular null. This is usually phrased in terms of p-values, which are the tail probabilities associated with a test statistic. If the p-value is less than the level of the test, then it leads to a rejection: a p-value of 0.035 allows us to reject the null at the level of 0.05. One must be careful to avoid the misinterpretation of a p-value of, say, 0.94, which is indicative of a massive failure to reject that null. One should also note the duality between confidence intervals and hypothesis tests. They utilize the same information: the point estimate, the precision as expressed in the standard error, and a value taken from the underlying distribution of the test statistic, such as 1.96. If the boundary of the 95% confidence interval contains a value $\theta_0$, then a hypothesis test that the population parameter equals $\theta_0$ will be on the
borderline of acceptance and rejection at the 5% level. We can consider these quantities as either defining an interval estimate for the parameter, or alternatively, as supporting a hypothesis test for the parameter.
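The least squares property of the sample average described above can be checked numerically. The sketch below uses a small hypothetical sample and compares the sum of squared deviations at the sample mean against nearby candidate values of m:

```python
# Verify numerically that the sample mean minimizes sum((Y_i - m)^2).
ys = [2.0, 3.5, 1.0, 4.5, 3.0]  # hypothetical sample
ybar = sum(ys) / len(ys)

def ssd(m, data):
    """Sum of squared deviations of the data about m."""
    return sum((y - m) ** 2 for y in data)

# Any other candidate m yields a larger sum than the sample mean does.
for m in [ybar - 1.0, ybar - 0.1, ybar, ybar + 0.1, ybar + 1.0]:
    print(f"m = {m:.2f}  SSD = {ssd(m, ys):.4f}")
```

The minimum is attained exactly at m = Ȳ, which is what the first-order condition of the minimization problem delivers analytically.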
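The t-based confidence interval formula above can be sketched in standard-library Python. The data are hypothetical, and the critical value 2.262 (the 97.5th percentile of the t distribution with 9 degrees of freedom) is taken from a t table:

```python
import statistics
from math import sqrt

# Hypothetical sample of n = 10 observations.
ys = [5.1, 4.8, 5.6, 4.9, 5.3, 5.0, 4.7, 5.4, 5.2, 5.0]
n = len(ys)
ybar = statistics.mean(ys)
s = statistics.stdev(ys)   # estimated standard deviation of Y
se = s / sqrt(n)           # standard error of the sample mean

# 97.5th percentile of the t distribution with n-1 = 9 df (from a t table).
c = 2.262
lower, upper = ybar - c * se, ybar + c * se
print(f"95% CI for mu: [{lower:.3f}, {upper:.3f}]")
```

Note that `statistics.stdev` uses the n−1 divisor, matching the sample standard deviation s in the formula.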
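The repeated-sampling interpretation of the confidence interval (95% of intervals constructed from random samples contain μ) can be illustrated by simulation. This sketch assumes a known σ for simplicity, so the 1.96 normal critical value applies exactly:

```python
import random
from math import sqrt

random.seed(42)

# Draw many samples from N(mu, sigma^2) with known sigma, and count how
# often the interval [ybar - 1.96*sigma/sqrt(n), ybar + 1.96*sigma/sqrt(n)]
# contains the true mean mu.
mu, sigma, n, reps = 10.0, 2.0, 50, 2000
covered = 0
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    ybar = sum(sample) / n
    half = 1.96 * sigma / sqrt(n)
    if ybar - half <= mu <= ybar + half:
        covered += 1
print(f"coverage: {covered / reps:.3f}")  # close to 0.95
```

The empirical coverage fluctuates around 0.95 across simulation runs, which is precisely the probability statement the confidence level makes: it is a property of the procedure, not of any single interval.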
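The two-sided test of H0: θ = 0.42 discussed above can be sketched as follows. The summary statistics are hypothetical, and for a large sample the normal approximation to the t distribution is used to compute the p-value:

```python
from math import erf, sqrt

# Hypothetical summary statistics from a large sample.
n, ybar, s = 400, 0.45, 0.20
mu0 = 0.42                   # hypothesized value under H0

se = s / sqrt(n)             # standard error of the sample mean
t_stat = (ybar - mu0) / se   # test statistic

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Two-sided p-value: probability in both tails beyond |t| (normal
# approximation, fine for large n).
p_value = 2.0 * (1.0 - norm_cdf(abs(t_stat)))
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# Reject H0 at the 5% level only if p < 0.05.
```

Here t = 3.0 and the p-value is well below 0.05, so the null of 0.42 would be rejected; equivalently, 0.42 lies outside the 95% confidence interval around the point estimate, illustrating the duality between tests and interval estimates.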
