INTRO STAT THEORY II STAT 703
These 8-page class notes were uploaded by Shane Marks on Monday, October 26, 2015. They belong to STAT 703 at the University of South Carolina - Columbia, taught by Staff in Fall.
Date Created: 10/26/15
STAT 703/J703, April 17th, 2007, Lecture 25
Instructor: Brian Habing
Department of Statistics, LeConte 203
Telephone: 803-777-3578
Email: habing@stat.sc.edu
STAT 703/J703, B. Habing, Univ. of SC

Today: Methods Based on the CDF
    The Empirical Distribution Function
    Some Statistical Properties
    Kolmogorov-Smirnov Test
    The Nonparametric Bootstrap

Recall that the definition of the cumulative distribution function (CDF) is

$$F_X(x) = P(X \le x)$$

Note that
    $F_X(x)$ is nondecreasing
    $F_X(x) \to 1$ as $x \to \infty$
    $F_X(x) \to 0$ as $x \to -\infty$

The advantage of the CDF is that every random variable has one, and it has the same definition for both discrete and continuous random variables.

CDF of a standard normal:

    x <- (-500:500)/100
    plot(x, pnorm(x), type="l")

CDF of a binomial:

    plot(x, pbinom(x, 5, 0.4), type="l")

The empirical distribution function (EDF), or empirical cumulative distribution function, is defined as

$$F_n(x) = \frac{\#\{X_i \le x\}}{n}$$

Unlike a histogram, there is only one way to plot an EDF.

    edf <- function(y) {
      x <- sort(y)
      n <- length(x)
      # empty plotting region that extends one unit past the data on each side
      plot(c(min(x) - 1, max(x) + 1), c(0, 1), type="n", xlab="x", ylab="P")
      # flat segment at 0 to the left of the smallest observation, then the first jump
      lines(c(x[1] - 1, x[1]), c(0, 0), lty=1)
      lines(c(x[1], x[1]), c(0, 1/n), lty=2)
      # flat segment at height i/n, then the jump up to (i+1)/n
      for (i in seq_len(n - 1)) {
        lines(c(x[i], x[i+1]), c(i/n, i/n), lty=1)
        lines(c(x[i+1], x[i+1]), c(i/n, (i+1)/n), lty=2)
      }
      # flat segment at 1 to the right of the largest observation
      lines(c(x[n], x[n] + 1), c(1, 1), lty=1)
    }

    edf(rnorm(10))
    lines(x, pnorm(x))    # x is the grid defined earlier

Statistical properties of the EDF

Note that we could write the EDF as

$$F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I_{(-\infty, x]}(X_i)$$

This leads directly to the fact that $F_n(x) \to F(x)$ as $n \to \infty$ for each $x$.

With more theory we could prove that

$$\sup_x |F_n(x) - F(x)| \to 0 \quad \text{as } n \to \infty$$

The Kolmogorov-Smirnov test uses this quantity to construct a test of the null hypothesis that the data are drawn from a population with CDF $F$. The test statistic is

$$\sup_x |F_n(x) - F(x)|$$

The command in R is ks.test:

    ks.test(x, "pnorm", 0, 1)

It is interesting that the distribution of the Kolmogorov-Smirnov statistic does not depend on $F$ (under the null hypothesis, when $F$ is continuous)!

Nonparametric Bootstrap

We previously examined the parametric bootstrap for the case when we assumed the data came from some distribution $F_\theta$ with unknown parameter $\theta$.

Estimating $\theta$ by $\hat\theta$, we then generated bootstrap samples from the distribution $F_{\hat\theta}$. The statistic $\hat\theta^*$ is then calculated for each sample.

We then use the analogy that the sampling distribution of $\hat\theta^*$ is to $\hat\theta$ as the sampling distribution of $\hat\theta$ is to $\theta$.

The nonparametric bootstrap uses the same basic analogy, except that we don't have a specific distribution in mind for $F$.

Because of this, the parameter $\theta$ that we are focusing on is usually something like the mean, variance, or median that is universally defined.

Example: Estimate the variance and bias of the sample standard deviation $s$ for the sample

    3.95 3.79 3.75 2.71 5.52 6.12 1.74 6.05 3.92 5.69

Generated using x <- 10*rbeta(10,3,4), so the population has mean $30/7 \approx 4.29$ and variance $150/49 \approx 3.06$, so $\sigma \approx 1.75$.

    sdboot <- function(x, nboots=10000) {
      sampsize <- length(x)
      # each row is one bootstrap sample drawn with replacement from x
      bootsamps <- matrix(sample(x, sampsize*nboots, replace=TRUE), ncol=sampsize)
      bootstats <- apply(bootsamps, 1, sd)
      estbias <- mean(bootstats) - sd(x)   # bootstrap estimate of the bias of s
      estse <- sd(bootstats)               # bootstrap estimate of the standard error of s
      c(estbias, estse)
    }

How well does it work?

When can it have trouble?
    Small sample sizes (but doesn't everything?)
    Statistic is not smooth
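The last point, that the nonparametric bootstrap can struggle when the statistic is not smooth, can be seen in a small simulation. The sketch below is an illustration added to these notes, not part of the lecture. It assumes the sample maximum as one commonly cited non-smooth statistic and compares it with the sample standard deviation from the example. The seed and the variable names (`boot_sd`, `boot_max`, `prop_at_max`) are arbitrary choices, and the data are redrawn from the same model as the example rather than copied from it.

```r
# Sketch: bootstrap distribution of a smooth statistic (sd) versus a
# non-smooth one (max). Illustrative assumption, not from the lecture.
set.seed(703)                       # arbitrary seed, for reproducibility only
x <- 10 * rbeta(10, 3, 4)           # same generating model as the example sample
n <- length(x)
nboots <- 10000

# Each row is one bootstrap sample drawn with replacement from x
bootsamps <- matrix(sample(x, n * nboots, replace = TRUE), ncol = n)

boot_sd  <- apply(bootsamps, 1, sd)
boot_max <- apply(bootsamps, 1, max)

# Fraction of bootstrap samples whose maximum equals the observed maximum,
# next to the exact probability that a resample contains the largest value
prop_at_max <- mean(boot_max == max(x))
theory <- 1 - (1 - 1 / n)^n

# Number of distinct values taken by each bootstrap statistic
n_distinct_sd  <- length(unique(round(boot_sd, 10)))
n_distinct_max <- length(unique(boot_max))

print(c(prop_at_max = prop_at_max, theory = theory))
print(c(distinct_sd = n_distinct_sd, distinct_max = n_distinct_max))
```

The bootstrap maximum can only take values already in the sample, and it equals the observed maximum in roughly two thirds of the resamples, so its bootstrap distribution is a poor stand-in for the sampling distribution of the maximum. The bootstrap standard deviation, by contrast, spreads over many distinct values.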