TOPICS IN STATISTICS (MATH 218)
These six pages of class notes were uploaded by Donald Gusikowski on Monday, October 5, 2015. The notes are for MATH 218 at Clark University, taught by David Joyce in the fall.
Summary of basic probability theory, part 2
D. Joyce, Clark University
Math 218, Mathematical Statistics, Jan 2008

Expectation. The expected value E(X), also called the expectation or mean μ_X, of a random variable X is defined differently for the discrete and continuous cases. For a discrete random variable it is a weighted average defined in terms of the probability mass function f as

  E(X) = μ_X = ∑_x x f(x).

For a continuous random variable it is defined in terms of the probability density function f as

  E(X) = μ_X = ∫_{−∞}^{∞} x f(x) dx.

There is a physical interpretation where this mean is interpreted as a center of gravity.

Expectation is a linear operator. That means that the expectation of a sum is the sum of the expectations,

  E(X + Y) = E(X) + E(Y),

and that's true whether or not X and Y are independent; also E(cX) = c E(X), where c is any constant. From these two properties it follows that E(X − Y) = E(X) − E(Y), and more generally that expectation preserves linear combinations. Furthermore, when X and Y are independent, E(XY) = E(X) E(Y), but that equation doesn't usually hold when X and Y are not independent.

Variance and standard deviation. The variance of a random variable X is defined as

  Var(X) = σ_X² = E((X − μ_X)²) = E(X²) − μ_X²,

where the last equality is provable. The standard deviation σ_X is defined as the square root of the variance.

Here are a couple of properties of variance. First, if you multiply a random variable X by a constant c to get cX, the variance changes by a factor of the square of c:

  Var(cX) = c² Var(X).

That's the main reason why we take the square root of variance to normalize it: the standard deviation of cX is |c| times the standard deviation of X. Also, variance is translation invariant; that is, if you add a constant to a random variable, the variance doesn't change:

  Var(X + c) = Var(X).

In general, the variance of the sum of two random variables is not the sum of the variances of the two random variables, but it is when the two random variables are independent.
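The two definitions above can be sketched in code for the discrete case. This is a minimal illustration, not part of the notes; the dict-based pmf representation and the fair-die example are assumptions made for the sketch.

```python
# Illustrative sketch: E(X) and Var(X) computed directly from a pmf,
# stored as a dict mapping each outcome x to its probability f(x).

def mean(pmf):
    # E(X) = sum over x of x * f(x)
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    # Var(X) = E(X^2) - E(X)^2, using the "provable equality" above
    mu = mean(pmf)
    return sum(x * x * p for x, p in pmf.items()) - mu * mu

die = {x: 1 / 6 for x in range(1, 7)}  # uniform pmf on {1, ..., 6}

mu = mean(die)        # ≈ 3.5
var = variance(die)   # ≈ 35/12 ≈ 2.9167
```

Computing E((X − μ)²) directly gives the same value as E(X²) − μ², which is one way to check the identity numerically.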
Moments, central moments, skewness, and kurtosis. The kth moment of a random variable X is defined as μ_k′ = E(X^k). Thus the mean is the first moment, μ = μ_1′, and the variance can be found from the first and second moments,

  σ² = μ_2′ − (μ_1′)².

The kth central moment is defined as E((X − μ)^k); thus the variance is the second central moment.

The third central moment of the standardized random variable X* = (X − μ)/σ,

  E((X*)³) = E((X − μ)³)/σ³,

is called the skewness of X. A distribution that's symmetric about its mean has 0 skewness; in fact, all the odd central moments are 0 for a symmetric distribution. But if it has a long tail to the right and a short one to the left, then it has a positive skewness, and a negative skewness in the opposite situation.

The fourth central moment of X*,

  E((X*)⁴) = E((X − μ)⁴)/σ⁴,

is called the kurtosis. A fairly flat distribution with long tails has a high kurtosis, while a short-tailed distribution has a low kurtosis. A distribution concentrated near two points, the extreme bimodal case, has a very low kurtosis. A normal distribution has a kurtosis of 3. The word kurtosis was coined in the early 20th century from the Greek word for curvature.
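For a discrete distribution, skewness and kurtosis can be computed straight from the central moments. A small sketch (illustrative only; the pmf representation and die example are assumptions):

```python
# Illustrative sketch: standardized central moments of a discrete pmf.

def central_moment(pmf, k):
    # E((X - mu)^k)
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** k * p for x, p in pmf.items())

def skewness(pmf):
    # third central moment divided by sigma^3
    return central_moment(pmf, 3) / central_moment(pmf, 2) ** 1.5

def kurtosis(pmf):
    # fourth central moment divided by sigma^4
    return central_moment(pmf, 4) / central_moment(pmf, 2) ** 2

die = {x: 1 / 6 for x in range(1, 7)}

s = skewness(die)  # ≈ 0: the pmf is symmetric about its mean, 3.5
k = kurtosis(die)  # ≈ 1.73, below 3: a flat, short-tailed distribution
```

The die's kurtosis below 3 matches the remark above that flat, short-tailed distributions have low kurtosis.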
The moment generating function. There is a clever way of organizing all the moments into one mathematical object, and that object is called the moment generating function. It's a function m(t) of a new variable t defined by

  m(t) = E(e^{tX}).

Since the exponential function e^t has the power series

  e^t = ∑_{k=0}^∞ t^k/k! = 1 + t + t²/2! + ⋯ + t^k/k! + ⋯,

we can rewrite m(t) as follows:

  m(t) = E(e^{tX}) = 1 + μ_1′ t + μ_2′ t²/2! + ⋯ + μ_k′ t^k/k! + ⋯.

That implies that m^(k)(0), the kth derivative of m(t) evaluated at t = 0, equals the kth moment μ_k′ of X. In other words, the moment generating function generates the moments of X by differentiation.

For discrete distributions we can also compute the moment generating function directly in terms of the probability mass function f(x) = P(X = x):

  m(t) = E(e^{tX}) = ∑_x e^{tx} f(x).

For continuous distributions, the moment generating function can be expressed in terms of the probability density function f as

  m(t) = E(e^{tX}) = ∫_{−∞}^{∞} e^{tx} f(x) dx.

The moment generating function enjoys the following properties.

Translation. If Y = X + a, then m_Y(t) = e^{at} m_X(t).

Scaling. If Y = bX, then m_Y(t) = m_X(bt).

Standardizing. From the last two properties, if X* = (X − μ)/σ is the standardized random variable for X, then m_{X*}(t) = e^{−μt/σ} m_X(t/σ).

Convolution. If X and Y are independent variables and Z = X + Y, then m_Z(t) = m_X(t) m_Y(t).

The primary use of moment generating functions is to develop the theory of probability. For instance, the easiest way to prove the central limit theorem is to use moment generating functions.

The median, quartiles, quantiles, and percentiles. The median of a distribution X, sometimes denoted x̃, is the value such that P(X ≤ x̃) = 1/2. Whereas some distributions, like the Cauchy distribution, don't have means, all continuous distributions have medians.

If p is a number between 0 and 1, then the pth quantile is defined to be the number φ_p such that

  P(X ≤ φ_p) = F(φ_p) = p.

Quantiles are often expressed as percentiles: the pth quantile is also called the 100p-th percentile. Thus the median is the 0.5 quantile, also called the 50th percentile. The first quartile is another name for φ_{0.25}, the 25th percentile, while the third quartile is another name for φ_{0.75}, the 75th percentile.
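Since F(φ_p) = p and F is increasing, a quantile can be found numerically by inverting the cdf. A sketch under stated assumptions: the bisection helper and the choice of the exponential cdf F(x) = 1 − e^{−x} are illustrative, not from the notes.

```python
import math

# Illustrative sketch: find the p-th quantile phi_p by bisection,
# solving F(phi_p) = p for an increasing cdf F.

def quantile(F, p, lo=0.0, hi=100.0, tol=1e-9):
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

F = lambda x: 1 - math.exp(-x)  # exponential(1) cdf, an assumed example

median = quantile(F, 0.5)   # ≈ ln 2 ≈ 0.693
q1 = quantile(F, 0.25)      # first quartile, phi_{0.25}
q3 = quantile(F, 0.75)      # third quartile, phi_{0.75}
```

Bisection works here precisely because a cdf is nondecreasing; for the exponential, the computed median agrees with the closed form ln 2.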
Summary of basic probability theory, part 1
D. Joyce, Clark University
Math 218, Mathematical Statistics, Jan 2008

Sample space. A sample space consists of an underlying set S, whose elements are called outcomes; a collection of subsets of S, called events; and a function P on the set of events, called a probability function, satisfying the following axioms.

1. The probability of any event is a number in the interval [0, 1].
2. The entire set S is an event, with probability P(S) = 1.
3. The union and intersection of any finite or countably infinite set of events are events, and the complement of an event is an event.
4. The probability of a disjoint union of a finite or countably infinite set of events is the sum of the probabilities of those events:

  P(∪_i E_i) = ∑_i P(E_i).

From these axioms a number of other properties can be derived, including these.

5. The complement E^c = S − E of an event E is an event, and P(E^c) = 1 − P(E).
6. The empty set ∅ is an event, with probability P(∅) = 0.
7. For any two events E and F,

  P(E ∪ F) = P(E) + P(F) − P(E ∩ F),

and therefore P(E ∪ F) ≤ P(E) + P(F).
8. For any two events E and F, P(E) = P(E ∩ F) + P(E ∩ F^c).
9. If event E is a subset of event F, then P(E) ≤ P(F).
10. Statement 7 above is called the principle of inclusion and exclusion. It generalizes to more than two events:

  P(E_1 ∪ ⋯ ∪ E_n) = ∑_i P(E_i) − ∑_{i<j} P(E_i ∩ E_j) + ∑_{i<j<k} P(E_i ∩ E_j ∩ E_k) − ⋯ + (−1)^{n+1} P(E_1 ∩ E_2 ∩ ⋯ ∩ E_n).

In words: to find the probability of a union of n events, first sum their individual probabilities, then subtract the sum of the probabilities of all their pairwise intersections, then add back the sum of the probabilities of all their 3-way intersections, then subtract the 4-way intersections, and continue adding and subtracting k-way intersections until you finally stop with the probability of the n-way intersection.
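The alternating sum in statement 10 can be checked on a small finite sample space. A sketch (illustrative; the two-dice events below are assumptions chosen for the example):

```python
from itertools import combinations

# Illustrative sketch: inclusion-exclusion on the sample space of two
# fair dice, 36 equally likely outcomes.

def prob(event, p):
    return sum(p[x] for x in event)

def union_prob(events, p):
    # alternating sum over all k-way intersections, k = 1..n
    total = 0.0
    for k in range(1, len(events) + 1):
        for combo in combinations(events, k):
            inter = set.intersection(*combo)
            total += (-1) ** (k + 1) * prob(inter, p)
    return total

p = {(i, j): 1 / 36 for i in range(1, 7) for j in range(1, 7)}
A = {x for x in p if x[0] == 6}          # first die shows 6
B = {x for x in p if x[0] + x[1] == 7}   # the dice sum to 7
C = {x for x in p if x[0] == x[1]}       # doubles

# inclusion-exclusion agrees with the direct probability of the union
lhs = union_prob([A, B, C], p)   # ≈ 16/36
rhs = prob(A | B | C, p)         # ≈ 16/36
```

Note that B ∩ C is empty (doubles can't sum to 7), so its term contributes 0, exactly as the formula allows.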
Random variables: notation. In order to describe a sample space, we frequently introduce a symbol X, called a random variable, for the sample space. With this notation we can replace the probability of an event, P(E), by the notation P(X ∈ E), which by itself doesn't do much. But many events are built from the set operations of complement, union, and intersection, and with the random-variable notation we can replace those by the logical operations 'not', 'or', and 'and'. For instance, the probability P(E ∩ F^c) can be written as P(X ∈ E but X ∉ F).

Also, probabilities of finite events can be written in terms of equality. For instance, the probability of a singleton, P({a}), can be written as P(X = a), and that for a doubleton, P({a, b}) = P(X = a or X = b).

One of the main purposes of the random-variable notation arises when we have two uses for the same sample space. For instance, if you have a fair die, the sample space is S = {1, 2, 3, 4, 5, 6}, where the probability of any singleton is 1/6. If you have two fair dice, you can use two random variables X and Y to refer to the two dice, but each has the same sample space. Soon we'll look at the joint distribution (X, Y), which has a sample space defined on S × S.

Random variables and cumulative distribution functions. A sample space can have any set as its underlying set, but usually it's related to numbers. Often the sample space is the set of real numbers R, and sometimes a power of the real numbers, Rⁿ.

The most common sample space has only two elements; that is, there are only two outcomes. For instance, flipping a coin has two outcomes, Heads and Tails; many experiments have two outcomes, Success and Failure; and polls often have two outcomes, For and Against. Even though these outcomes aren't numbers, it's useful to replace them by the numbers 0 and 1, so that Heads, Success, and For are identified with 1, while Tails, Failure, and Against are identified with 0. Then the sample space can have R as its underlying set.

When the sample space does have R as its underlying set, the random variable X is called a real random variable. With it, the probability of an interval like [a, b], which is P([a, b]), can be described as P(a ≤ X ≤ b). Unions of intervals can also be described; for instance, P((−∞, 3) ∪ [4, 5]) can be written as P(X < 3 or 4 ≤ X ≤ 5).

When the sample space is R, the probability function P is determined by a cumulative distribution function (cdf) F, as follows. The function F: R → R is defined by

  F(x) = P(X ≤ x) = P((−∞, x]).

Then from F the probability of a half-open interval can be found as

  P((a, b]) = F(b) − F(a).

Also, the probability of a singleton {b} can be found as a limit:

  P({b}) = lim_{a→b⁻} (F(b) − F(a)).

From these, probabilities of unions of intervals can be computed. Sometimes the cdf is simply called the distribution, and the sample space is identified with this distribution.

Discrete distributions. Many sample distributions are determined entirely by the probabilities of their outcomes; that is, the probability of an event E is

  P(E) = ∑_{x∈E} P(X = x) = ∑_{x∈E} P({x}).

The sum here, of course, is either a finite or countably infinite sum. Such a distribution is called a discrete distribution, and when there are only finitely many outcomes x with nonzero probabilities, it is called a finite distribution. A discrete distribution is usually described in terms of a probability mass function (pmf) f defined by

  f(x) = P(X = x) = P({x}).

This pmf is enough to determine the distribution, since by the definition of a discrete distribution the probability of an event E is

  P(E) = ∑_{x∈E} f(x).
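The relation F(b) − F(a) = P(a < X ≤ b) is easy to see in the discrete case. A sketch (illustrative; the dict-based pmf and die example are assumptions):

```python
# Illustrative sketch: the cdf F(x) = P(X <= x) of a discrete
# distribution, built by summing the pmf over outcomes v <= x.

def cdf(pmf, x):
    return sum(p for v, p in pmf.items() if v <= x)

die = {x: 1 / 6 for x in range(1, 7)}

# F(3) = P(X <= 3) = 1/2 for a fair die
half = cdf(die, 3)

# P(2 < X <= 5) = F(5) - F(2) = 3/6
interval = cdf(die, 5) - cdf(die, 2)
```

For a discrete distribution this F is a step function; the jump at each outcome b is exactly P({b}), matching the limit formula above.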
In many applications a finite distribution is uniform; that is, the probabilities of its outcomes are all the same, 1/n, where n is the number of outcomes with nonzero probabilities. When that is the case, the field of combinatorics is useful for finding probabilities of events. Combinatorics includes various principles of counting, such as the multiplication principle, permutations, and combinations.

Continuous distributions. When the cumulative distribution function F for a distribution is a differentiable function, we say it's a continuous distribution. Such a distribution is determined by a probability density function f. The relation between F and f is that f is the derivative F′ of F, and F is the integral of f:

  F(x) = ∫_{−∞}^{x} f(t) dt.

Conditional probability and independence. If E and F are two events with P(F) ≠ 0, then the conditional probability of E given F is defined to be

  P(E | F) = P(E ∩ F) / P(F).

Two events E and F, neither with probability 0, are said to be independent (or mutually independent) if any of the following three logically equivalent conditions holds:

  P(E ∩ F) = P(E) P(F)
  P(E | F) = P(E)
  P(F | E) = P(F)

Bayes' formula is useful for inverting conditional probabilities. It says

  P(F | E) = P(E | F) P(F) / P(E)
           = P(E | F) P(F) / (P(E | F) P(F) + P(E | F^c) P(F^c)),

where the second form is often more useful in practice.
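The second form of Bayes' formula can be made concrete with a small sketch. The screening-test numbers below are hypothetical, chosen only to illustrate the inversion; they are not from the notes.

```python
# Illustrative sketch: the second form of Bayes' formula,
# P(F|E) = P(E|F)P(F) / (P(E|F)P(F) + P(E|F^c)P(F^c)).

def bayes(p_e_given_f, p_f, p_e_given_not_f):
    num = p_e_given_f * p_f
    return num / (num + p_e_given_not_f * (1 - p_f))

# F = "has the condition", E = "test is positive" (hypothetical rates):
# P(E|F) = 0.99, P(F) = 0.01, P(E|F^c) = 0.05
posterior = bayes(0.99, 0.01, 0.05)   # ≈ 1/6 ≈ 0.167
```

The second form is the one used here because P(E) is rarely given directly; it is rebuilt from the two conditional probabilities, which is exactly why that form is "often more useful in practice".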