Psychological Statistics PSYC 2101
These 9 pages of class notes were uploaded by Lane Schuster on Sunday, October 11, 2015. The notes belong to PSYC 2101 (Psychological Statistics) at East Carolina University, taught by Karl Wuensch in Fall.
Date Created: 10/11/15
Skewness, Kurtosis, and the Normal Curve

Skewness

In everyday language, the terms "skewed" and "askew" are used to refer to something that is out of line or distorted on one side. When referring to the shape of frequency or probability distributions, "skewness" refers to asymmetry of the distribution. A distribution with an asymmetric tail extending out to the right is referred to as "positively skewed" or "skewed to the right," while a distribution with an asymmetric tail extending out to the left is referred to as "negatively skewed" or "skewed to the left." Skewness can range from minus infinity to positive infinity.

Karl Pearson (1895) first suggested measuring skewness by standardizing the difference between the mean and the mode, that is,

$$sk = \frac{\mu - \text{mode}}{\sigma}.$$

Population modes are not well estimated from sample modes, but one can estimate the difference between the mean and the mode as being three times the difference between the mean and the median (Stuart & Ord, 1994), leading to the following estimate of skewness:

$$sk_{est} = \frac{3(M - \text{median})}{s}.$$

Many statisticians use this measure but with the 3 eliminated, that is,

$$sk = \frac{M - \text{median}}{s}.$$

This statistic ranges from -1 to +1. Absolute values above 0.2 indicate great skewness (Hildebrand, 1986).

Skewness has also been defined with respect to the third moment about the mean,

$$\gamma_1 = \frac{\sum (X - \mu)^3}{n\sigma^3},$$

which is simply the expected value of the distribution of cubed z scores. Skewness measured in this way is sometimes referred to as "Fisher's skewness." When the deviations from the mean are greater in one direction than in the other direction, this statistic will deviate from zero in the direction of the larger deviations. From sample data, Fisher's skewness is most often estimated by

$$g_1 = \frac{n \sum z^3}{(n-1)(n-2)},$$

where $z = (X - M)/s$. For large sample sizes (n > 150), $g_1$ may be distributed approximately normally, with a standard error of approximately $\sqrt{6/n}$. While one could use this sampling distribution to construct confidence intervals for or tests of hypotheses about $\gamma_1$, there is rarely any value in doing so.

The most commonly used measures of skewness, those discussed
here, may produce surprising results, such as a negative value when the shape of the distribution appears skewed to the right. There may be superior alternative measures not commonly used (Groeneveld & Meeden, 1984).

Copyright 2007, Karl L. Wuensch. All rights reserved. (SkewKurt.doc)

It is important for behavioral researchers to notice skewness when it appears in their data. Great skewness may motivate the researcher to investigate outliers. When making decisions about which measure of location to report (means being drawn in the direction of the skew) and which inferential statistic to employ (one which assumes normality or one which does not), one should take into consideration the estimated skewness of the population. Normal distributions have zero skewness. Of course, a distribution can be perfectly symmetric but far from normal. Transformations commonly employed to reduce (positive) skewness include square root, log, and reciprocal transformations.

Also see "Skewness and the Relative Positions of Mean, Median, and Mode."

Kurtosis

Karl Pearson (1905) defined a distribution's degree of kurtosis as $\eta = \beta_2 - 3$, where

$$\beta_2 = \frac{\sum (X - \mu)^4}{n\sigma^4},$$

the expected value of the distribution of z scores which have been raised to the 4th power. $\beta_2$ is often referred to as "Pearson's kurtosis," and $\beta_2 - 3$ (often symbolized with $\gamma_2$) as "kurtosis excess" or "Fisher's kurtosis," even though it was Pearson who defined kurtosis as $\beta_2 - 3$. An unbiased estimator for $\gamma_2$ is

$$g_2 = \frac{n(n+1)\sum z^4}{(n-1)(n-2)(n-3)} - \frac{3(n-1)^2}{(n-2)(n-3)}.$$

For large sample sizes (n > 1000), $g_2$ may be distributed approximately normally, with a standard error of approximately $\sqrt{24/n}$ (Snedecor & Cochran, 1967). While one could use this sampling distribution to construct confidence intervals for or tests of hypotheses about $\gamma_2$, there is rarely any value in doing so.

Pearson (1905) introduced kurtosis as a measure of how flat the top of a symmetric distribution is when compared to a normal distribution of the same variance. He referred to more flat-topped distributions ($\gamma_2 < 0$) as "platykurtic," less flat-topped distributions ($\gamma_2 > 0$) as
"leptokurtic," and equally flat-topped distributions as "mesokurtic" ($\gamma_2 \approx 0$). Kurtosis is actually more influenced by scores in the tails of the distribution than by scores in the center of a distribution (DeCarlo, 1997). Accordingly, it is often appropriate to describe a leptokurtic distribution as "fat in the tails" and a platykurtic distribution as "thin in the tails."

Student (1927, Biometrika, 19, 160) published a cute description of kurtosis, which I quote here: "Platykurtic curves have shorter 'tails' than the normal curve of error and leptokurtic longer 'tails.' I myself bear in mind the meaning of the words by the above memoria technica, where the first figure represents platypus and the second kangaroos, noted for lepping." Please point your browser to http://members.aol.com/jeff570/k.html, scroll down to "kurtosis," and look at Student's drawings.

Moors (1986) demonstrated that $\beta_2 = \mathrm{Var}(Z^2) + 1$ (because $E(Z^2) = 1$, $\mathrm{Var}(Z^2) = E(Z^4) - 1$). Accordingly, it may be best to treat kurtosis as the extent to which scores are dispersed away from the shoulders of a distribution, where the shoulders are the points where $Z^2 = 1$, that is, $Z = \pm 1$. Balanda and MacGillivray (1988) wrote "it is best to define kurtosis vaguely as the location- and scale-free movement of probability mass from the shoulders of a distribution into its centre and tails." If one starts with a normal distribution and moves scores from the shoulders into the center and the tails, keeping variance constant, kurtosis is increased. The distribution will likely appear more peaked in the center and fatter in the tails, like a Laplace distribution ($\gamma_2 = 3$) or Student's t with few degrees of freedom ($\gamma_2 = 6/(df - 4)$).

Starting again with a normal distribution, moving scores from the tails and the center to the shoulders will decrease kurtosis. A uniform distribution certainly has a flat top, with $\gamma_2 = -1.2$, but $\gamma_2$ can reach a minimum value of -2 when two score values are equally probable and all other score values have probability zero (a rectangular U distribution, that is, a binomial distribution with n = 1, p = .5). One might object that the rectangular U
distribution has all of its scores in the tails, but closer inspection will reveal that it has no tails, and that all of its scores are in its shoulders, exactly one standard deviation from its mean. Values of $g_2$ less than that expected for a uniform distribution (-1.2) may suggest that the distribution is bimodal (Darlington, 1970), but bimodal distributions can have high kurtosis if the modes are distant from the shoulders.

One leptokurtic distribution we shall deal with is Student's t distribution. The kurtosis of t is infinite when df < 5, 6 when df = 5, and 3 when df = 6. Kurtosis decreases further (towards zero) as df increase and t approaches the normal distribution.

Kurtosis is usually of interest only when dealing with approximately symmetric distributions. Hopkins and Weeks (1990) state that skewed distributions are always leptokurtic; strictly, skewness only raises the lower bound on kurtosis ($\beta_2 \ge \gamma_1^2 + 1$), so a mildly skewed distribution can still be platykurtic. Among the several alternative measures of kurtosis that have been proposed (none of which has often been employed) is one which adjusts the measurement of kurtosis to remove the effect of skewness (Blest, 2003).

There is much confusion about how kurtosis is related to the shape of distributions. Many authors of textbooks have asserted that kurtosis is a measure of the peakedness of distributions, which is not strictly true.

It is easy to confuse low kurtosis with high variance, but distributions with identical kurtosis can differ in variance, and distributions with identical variances can differ in kurtosis. Here are some simple distributions that may help you appreciate that kurtosis is, in part, a measure of tail heaviness relative to the total variance in the distribution (remember the $\sigma^4$ in the denominator).

Table 1. Kurtosis for 7 Simple Distributions Also Differing in Variance

X           freq A  freq B  freq C  freq D  freq E  freq F  freq G
5               20      20      20      10       5       3       1
10               0      10      20      20      20      20      20
15              20      20      20      10       5       3       1
Kurtosis      -2.0   -1.75    -1.5    -1.0     0.0    1.33     8.0
Variance        25      20    16.6    12.5     8.3    5.77    2.27

Platykurtic (left) ... Leptokurtic (right)

When I presented these distributions to my colleagues and graduate students and asked them to
identify which had the least kurtosis and which the most, all said A has the most kurtosis, G the least (excepting those who refused to answer). But in fact A has the least kurtosis (-2 is the smallest possible value of kurtosis) and G the most. The trick is to do a mental frequency plot where the abscissa is in standard deviation units. In the maximally platykurtic distribution A, which initially appears to have all its scores in its tails, no score is more than one $\sigma$ away from the mean; that is, it has no tails! In the leptokurtic distribution G, which seems only to have a few scores in its tails, one must remember that those scores (5 and 15) are much farther away from the mean (3.3 $\sigma$) than are the 5's and 15's in distribution A. In fact, in G nine percent of the scores are more than three $\sigma$ from the mean, much more than you would expect in a mesokurtic distribution (like a normal distribution), thus G does indeed have fat tails.

If you were to ask SAS to compute kurtosis on the A scores in Table 1, you would get a value less than -2.0, less than the lowest possible population kurtosis. Why? SAS assumes your data are a sample and computes the $g_2$ estimate of population kurtosis, which can fall below -2.0.

Sune Karlsson, of the Stockholm School of Economics, has provided me with the following modified example, which holds the variance approximately constant, making it quite clear that a higher kurtosis implies that there are more extreme observations (or that the extreme observations are more extreme). It is also evident that a higher kurtosis implies that the distribution is more "single-peaked" (this would be even more evident if the sum of the frequencies was constant). The rows representing the shoulders of the distribution (X = 5 and X = 15, one standard deviation from the mean of 10) are marked with an asterisk so that you can see that the increase in kurtosis is associated with a movement of scores away from the shoulders.

Table 2. Kurtosis for Seven Simple Distributions Not Differing in Variance

X           Freq A  Freq B  Freq C  Freq D  Freq E  Freq F  Freq G
-6.6             0       0       0       0       0       0       1
-0.4             0       0       0       0       0       3       0
1.3              0       0       0       0       5       0       0
2.9              0       0       0      10       0       0       0
3.9              0       0      20       0       0       0       0
4.4              0      20       0       0       0       0       0
5 *             20       0       0       0       0       0       0
10               0      10      20      20      20      20      20
15 *            20       0       0       0       0       0       0
15.6             0      20       0       0       0       0       0
16.1             0       0      20       0       0       0       0
17.1             0       0       0      10       0       0       0
18.7             0       0       0       0       5       0       0
20.4             0       0       0       0       0       3       0
26.6             0       0       0       0       0       0       1
Kurtosis      -2.0   -1.75    -1.5    -1.0     0.0    1.33     8.0
Variance        25    25.1    24.8    25.2    25.2    25.0    25.1

While it is unlikely that a behavioral researcher will be interested in questions that focus on the kurtosis of a distribution, estimates of kurtosis, in combination with other information about the shape of a distribution, can be useful. DeCarlo (1997) described several uses for the $g_2$ statistic. When considering the shape of a distribution of scores, it is useful to have at hand measures of skewness and kurtosis, as well as graphical displays. These statistics can help one decide which estimators or tests should perform best with data distributed like those on hand. High kurtosis should alert the researcher to investigate outliers in one or both tails of the distribution.

Tests of Significance

Some statistical packages (including SPSS) provide both estimates of skewness and kurtosis and standard errors for those estimates. One can divide the estimate by its standard error to obtain a z test of the null hypothesis that the parameter is zero (as would be expected in a normal population), but I generally find such tests of little use. One may do an "eyeball test" on measures of skewness and kurtosis when deciding whether or not a sample is "normal enough" to use an inferential procedure that assumes normality of the population(s). If you wish to test the null hypothesis that the sample came from a normal population, you can use a chi-square goodness of fit test, comparing observed frequencies in ten or so intervals (from lowest to highest score) with the frequencies that would be expected in those intervals were the population normal. This test has very low power, especially with small sample sizes, where the normality assumption may be most critical. Thus you may think your data close enough to
normal (not significantly different from normal) to use a test statistic which assumes normality when in fact the data are too distinctly non-normal to employ such a test, the nonsignificance of the deviation from normality resulting only from low power (small sample sizes). SAS PROC UNIVARIATE will test such a null hypothesis for you using the more powerful Kolmogorov-Smirnov statistic (for larger samples) or the Shapiro-Wilk statistic (for smaller samples). These have very high power, especially with large sample sizes, in which case the normality assumption may be less critical for the test statistic whose normality assumption is being questioned. These tests may tell you that your sample differs significantly from normal even when the deviation from normality is not large enough to cause problems with the test statistic which assumes normality.

SAS Exercises

Go to my StatData page and download the file EDA.dat. Go to my SAS Programs page and download the program file g1g2.sas. Edit the program so that the INFILE statement points correctly to the folder where you located EDA.dat and then run the program, which illustrates the computation of $g_1$ and $g_2$. Look at the program. The raw data are read from EDA.dat and PROC MEANS is then used to compute $g_1$ and $g_2$. The next portion of the program uses PROC STANDARD to convert the data to z scores. PROC MEANS is then used to compute $g_1$ and $g_2$ on the z scores. Note that standardization of the scores has not changed the values of $g_1$ and $g_2$. The next portion of the program creates a data set with the z scores raised to the 3rd and the 4th powers. The final step of the program uses these powers of z to compute $g_1$ and $g_2$ using the formulas presented earlier in this handout. Note that the values of $g_1$ and $g_2$ are the same as obtained earlier from PROC MEANS.

Go to my SAS Programs page and download and run the file Kurtosis-Uniform.sas. Look at the program. A DO loop and the UNIFORM function are used to create a sample of 500,000 scores drawn from a uniform population which ranges
from 0 to 1. PROC MEANS then computes mean, standard deviation, skewness, and kurtosis. Look at the output. Compare the obtained statistics to the expected values for the following parameters of a uniform distribution that ranges from a to b.

Parameter             Expected Value
Mean                  $(a + b)/2$
Standard Deviation    $\sqrt{(b - a)^2 / 12}$
Skewness              0
Kurtosis              -1.2

Go to my SAS Programs page and download and run the file KurtosisT.sas, which demonstrates the effect of sample size (degrees of freedom) on the kurtosis of the t distribution. Look at the program. Within each section of the program a DO loop is used to create 500,000 samples of N scores (where N is 10, 11, 17, or 29), each drawn from a normal population with mean 0 and standard deviation 1. PROC MEANS is then used to compute Student's t for each sample, outputting these t scores into a new data set. We shall treat this new data set as the sampling distribution of t. PROC MEANS is then used to compute the mean, standard deviation, and kurtosis of the sampling distributions of t. For each value of degrees of freedom, compare the obtained statistics with their expected values.

Parameter             Expected Value
Mean                  0
Standard Deviation    $\sqrt{df/(df - 2)}$
Kurtosis              $6/(df - 4)$

Download and run my program KurtosisBeta2.sas. Look at the program. Each section of the program creates one of the distributions from Table 1 above and then converts the data to z scores, raises the z scores to the fourth power, and computes $\beta_2$ as the mean of $z^4$. Subtract 3 from each value of $\beta_2$ and then compare the resulting $\gamma_2$ to the value given in Table 1.

Download and run my program KurtosisNormal.sas. Look at the program. DO loops and the NORMAL function are used to create 10,000 samples, each with 1,000 scores drawn from a normal population with mean 0 and standard deviation 1. PROC MEANS creates a new data set with the $g_1$ and the $g_2$ statistics for each sample. PROC MEANS then computes the mean and standard deviation (standard error) for skewness and kurtosis. Compare the values obtained with those expected: 0 for the means, and $\sqrt{6/n}$ and $\sqrt{24/n}$ for the standard
errors.

References

Balanda & MacGillivray. (1988). Kurtosis: A critical review. American Statistician, 42, 111-119.

Blest, D. C. (2003). A new measure of kurtosis adjusted for skewness. Australian & New Zealand Journal of Statistics, 45, 175-179.

Darlington, R. B. (1970). Is kurtosis really "peakedness?" The American Statistician, 24(2), 19-22.

DeCarlo, L. T. (1997). On the meaning and use of kurtosis. Psychological Methods, 2, 292-307.

Groeneveld, R. A., & Meeden, G. (1984). Measuring skewness and kurtosis. The Statistician, 33, 391-399.

Hildebrand, D. K. (1986). Statistical thinking for behavioral scientists. Boston: Duxbury.

Hopkins, K. D., & Weeks, D. L. (1990). Tests for normality and measures of skewness and kurtosis: Their place in research reporting. Educational and Psychological Measurement, 50, 717-729.

Loether, H. L., & McTavish, D. G. (1988). Descriptive and inferential statistics: An introduction (3rd ed.). Boston: Allyn & Bacon.

Moors, J. J. A. (1986). The meaning of kurtosis: Darlington reexamined. The American Statistician, 40, 283-284.

Pearson, K. (1895). Contributions to the mathematical theory of evolution, II: Skew variation in homogeneous material. Philosophical Transactions of the Royal Society of London, 186, 343-414.

Pearson, K. (1905). Das Fehlergesetz und seine Verallgemeinerungen durch Fechner und Pearson. A rejoinder. Biometrika, 4, 169-212.

Snedecor, G. W., & Cochran, W. G. (1967). Statistical methods (6th ed.). Ames, Iowa: Iowa State University Press.

Stuart, A., & Ord, J. K. (1994). Kendall's advanced theory of statistics. Volume 1. Distribution theory (6th ed.). London: Edward Arnold.

Wuensch, K. L. (2005). Kurtosis. In B. S. Everitt & D. C. Howell (Eds.), Encyclopedia of statistics in behavioral science (pp. 1028-1029). Chichester, UK: Wiley.

Wuensch, K. L. (2005). Skewness. In B. S. Everitt & D. C. Howell (Eds.), Encyclopedia of statistics in behavioral science (pp. 1855-1856). Chichester, UK: Wiley.

Links

- http://core.ecu.edu/psyc/wuenschk/StatHelp/KURTOSIS.txt : a log of email discussions on the topic of kurtosis, most of them from the EDSTAT list.
- http://core.ecu… : distribution of final grades in PSYC 2101 (undergrad stats), Spring 2007.
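
Readers without SAS can check the kurtosis and variance rows of Table 1, and the remark that the sample estimate for distribution A falls below -2.0, with a short script. What follows is only a minimal sketch in Python, which is not part of the original handout; the function names `population_shape` and `sample_g1_g2` are hypothetical. It assumes the z scores in the $g_1$ and $g_2$ formulas are computed with the sample standard deviation (n - 1 in the denominator), as PROC STANDARD would do.

```python
import math

# Table 1 from the handout: frequencies of the scores 5, 10, 15 in distributions A-G.
SCORES = [5, 10, 15]
TABLE_1 = {
    "A": [20, 0, 20],
    "B": [20, 10, 20],
    "C": [20, 20, 20],
    "D": [10, 20, 10],
    "E": [5, 20, 5],
    "F": [3, 20, 3],
    "G": [1, 20, 1],
}


def population_shape(scores, freqs):
    """Return (variance, gamma1, gamma2), treating the data as a whole population."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(scores, freqs)) / n
    # Central moments with n in the denominator (population formulas).
    m2 = sum(f * (x - mean) ** 2 for x, f in zip(scores, freqs)) / n
    m3 = sum(f * (x - mean) ** 3 for x, f in zip(scores, freqs)) / n
    m4 = sum(f * (x - mean) ** 4 for x, f in zip(scores, freqs)) / n
    gamma1 = m3 / m2 ** 1.5      # third standardized moment
    gamma2 = m4 / m2 ** 2 - 3    # beta2 - 3, the kurtosis excess
    return m2, gamma1, gamma2


def sample_g1_g2(scores, freqs):
    """Return (g1, g2), the sample estimates of gamma1 and gamma2."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(scores, freqs)) / n
    # Assumption: z scores use the sample standard deviation (n - 1 denominator).
    s = math.sqrt(sum(f * (x - mean) ** 2 for x, f in zip(scores, freqs)) / (n - 1))
    sum_z3 = sum(f * ((x - mean) / s) ** 3 for x, f in zip(scores, freqs))
    sum_z4 = sum(f * ((x - mean) / s) ** 4 for x, f in zip(scores, freqs))
    g1 = n * sum_z3 / ((n - 1) * (n - 2))
    g2 = (n * (n + 1) * sum_z4 / ((n - 1) * (n - 2) * (n - 3))
          - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return g1, g2


if __name__ == "__main__":
    for name, freqs in TABLE_1.items():
        variance, gamma1, gamma2 = population_shape(SCORES, freqs)
        g1, g2 = sample_g1_g2(SCORES, freqs)
        print(f"{name}: variance={variance:.2f}  gamma2={gamma2:.2f}  g2={g2:.2f}")
```

The population values should match the Kurtosis and Variance rows of Table 1, while the `g2` column shows what a package that treats the scores as a sample would report. For distribution A that estimate is below -2.0.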