# Statistics I JLCP 782

This 29-page set of class notes was uploaded by Raphael Lang on Monday, September 28, 2015. The notes belong to JLCP 782 at George Mason University, taught by David Wilson in Fall.

Date Created: 09/28/15

- Homework review
- Reading statistical equations (some basics)
- Understanding distributions
- Reading raw data into SPSS

## Reading Statistical Equations

- $n$: sample size
- $X$: a vector of data
- $\sum$: a summation sign
- $\sum X$: sum all X's
- A more complete version: $\sum_{i=1}^{n} X_i$

## Grouped versus Ungrouped Distributions

- Difference between grouped and ungrouped
- Grouped distribution rules
  - mutually exclusive
  - exhaustive
  - equal interval width
  - first class contains the lowest value, last class contains the highest
- Steps
  - Determine the number of class intervals
  - Determine the width of the interval, $W$ (roughly the range of the data divided by the number of intervals): $W \approx \frac{X_{\max} - X_{\min}}{\text{number of intervals}}$
  - Make the intervals

## Bar Charts

- For nominal or ordinal level data
- Bar chart bars are not connected
- All possible values are included
- Bar charts can be vertical or horizontal
- Advantages of bar charts over pie charts

## Bar Chart in SPSS

    graph /bar(simple) = DV by IV.

## Histograms

- Like a bar chart, but for interval or better data
- Might be grouped or ungrouped
  - Ungrouped shows the frequency for each value
  - Grouped shows the frequency for a class interval
- Might need to play with the class interval to find the optimal graphical display

## Line Graph

- A simple line graph is similar to a histogram
- Data should be interval or higher (ordinal might be okay)
- Handles a large number of values well
- Illustrate in SPSS

## Distribution Shape

- Normal
- Skewed (positive and negative)
- Kurtosis (leptokurtic and platykurtic)
- See histogram with normal distribution overlay

## Measures of Central Tendency

- Mean, median, and mode

## Mode

- Most common value
- There might be multiple modes (bimodal)
- Useful for nominal and ordinal data
- Advantages: simple, easy to calculate, very general
- Disadvantages: ignores information, might be misleading

## Median

- Middle value if there is an odd number of numbers
- Mean of the two middle-most numbers if there is an even number of numbers
- Not affected by extreme values
- Useful for skewed interval/ratio data (e.g., income)
- Also appropriate for ordinal data
- The median observation is found by $\frac{n + 1}{2}$
- The median is the 50th percentile
  - What is a percentile?
  - How is it different from a percent?
  - Can someone be at the 100th percentile?
- Advantages
  - only one median
  - intuitive appeal
  - not influenced by extreme values (skew, outliers)

## Mean

- The sum of all the numbers divided by the number of numbers: $\bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n}$
- Balance point in the distribution
- As such, the mean is pulled by extreme values

## Computation of the Mean

For the slide's example of $n = 11$ values that sum to 56:

$$\bar{Y} = \frac{56}{11} \approx 5.09$$

## Characteristics of the Mean

- Least squares property: the sum of squared deviations is smaller around the mean than around any other value
- Balance point
- Deviations from the mean sum to 0: $\sum (Y_i - \bar{Y}) = 0$

## Advantages and Disadvantages of the Mean

- intuitively appealing
- uses all of the data
- statistically efficient
- distorted by outliers or skewed data

## How Do They Compare?

How do the mean, median, and mode relate to one another?

- In a unimodal, symmetric distribution they are the same
- Positively skewed distribution: mode < median < mean
- Negatively skewed distribution: mean < median < mode

## Which to Use?

- Mode: you are interested in the most common value. Have nominal or ordinal data.
- Mean: you are interested in the average and the distribution is not seriously skewed. Have interval or higher data.
- Median: you are interested in the average and the distribution is seriously skewed. Have interval or higher data.
- Median: you are interested in the midpoint.

## Measures of Dispersion (Variability)

- Variance ratio
- Range, minimum, and maximum
- Interquartile range
- Variance and the standard deviation

## Variance Ratio (VR)

- For nominal or ordinal data
- The larger the VR, the more dispersion
- Equals the proportion of cases not in the modal category: $VR = 1 - \frac{f_{\text{modal}}}{n}$
- VR is at its maximum when all categories have the same frequency
- If there are 4 categories and 25% of the cases are in each, then VR is .75

## Range

- The minimum and maximum values in the data
- Simply the difference between the maximum and minimum

## Interquartile Range

- Difference between the 25th and 75th percentiles
- Identifies the middle 50% of the cases
- SPSS can find these values for you
- Doing it by hand is not always easy
- If $n$ equals 157 (157 data points), then:
  - The 25th percentile is the 39.5th case: 25th rank $= \frac{n + 1}{4} = \frac{157 + 1}{4} = 39.5$
  - The 75th percentile is the 118.5th case: 75th rank $= \frac{3(n + 1)}{4} = \frac{3(158)}{4} = 118.5$
- Average the 39th and 40th ranked values to get the 25th percentile, and the 118th and 119th ranked values to get the 75th percentile. The difference between those two averages is the IQR.

## Variance and Standard Deviation

- Building block for a lot of data analysis
- Used to standardize measures
- Based on the squared distances between the individual values and the mean
- Illustrate averaging deviations around the mean
- Population versus sample values

## Definitional Formulas

Definitional formula (sample):

$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} \qquad s = \sqrt{s^2}$$

Computational formula:

$$s^2 = \frac{\sum X_i^2 - \frac{\left(\sum X_i\right)^2}{n}}{n - 1}$$

## Advantages of the Computational Formula

- Easier to compute (really)
- No rounding problems: more accurate when done by hand

## Interpreting the Standard Deviation

- Roughly 1/5th of the range
- In a normal distribution, 68% of the cases fall within 1 standard deviation of the mean (above and below)
- In a normal distribution, 95% (plus a little) fall within 2 standard deviations of the mean (above and below)

## SPSS Syntax

    desc var = VARNAME
      /statistics = mean stddev variance range min max.

    examine var = VARNAME
      /plot = none
      /percentiles(5,10,25,50,75,90,95)
      /statistics = descriptives.
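To check that the definitional and computational formulas for the sample variance agree, the two can be sketched side by side in Python. This is only an illustration: the function names and the `scores` data are made up for this example and do not come from the course materials.

```python
import math


def variance_definitional(xs):
    """Sample variance: squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def variance_computational(xs):
    """Sample variance from raw sums: (sum of X^2 - (sum of X)^2 / n) / (n - 1)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x_squared = sum(x * x for x in xs)
    return (sum_x_squared - sum_x ** 2 / n) / (n - 1)


# Hypothetical scores, chosen only so the arithmetic is easy to follow by hand.
scores = [2, 4, 4, 5, 7, 8]

s2_def = variance_definitional(scores)
s2_comp = variance_computational(scores)
s = math.sqrt(s2_def)  # the standard deviation is the square root of the variance

print(s2_def, s2_comp, round(s, 2))
```

For these six scores the mean is 5, the squared deviations sum to 24, and both formulas give 24 / 5 = 4.8, so the standard deviation is about 2.19.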

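The by-hand interquartile range procedure for n = 157 can be sketched in Python as follows. The helper names are hypothetical, and the data (the values 1 through 157) are invented so that each ranked value equals its rank. Fractional ranks are handled by interpolating between the two neighboring cases, which for a rank ending in .5 is the same as averaging them.

```python
def percentile_rank(n, p):
    """Rank of the p-th percentile case using the (n + 1) rule."""
    return p * (n + 1) / 100


def value_at_rank(sorted_values, rank):
    """Value at a 1-based rank, interpolating between neighbors if fractional."""
    lower = int(rank)
    fraction = rank - lower
    if fraction == 0:
        return sorted_values[lower - 1]
    below = sorted_values[lower - 1]  # the case ranked just below
    above = sorted_values[lower]      # the case ranked just above
    return below + fraction * (above - below)


# Hypothetical data: 157 cases with values 1..157, already in rank order.
data = sorted(range(1, 158))
n = len(data)

q1_rank = percentile_rank(n, 25)  # 25 * 158 / 100 = 39.5
q3_rank = percentile_rank(n, 75)  # 75 * 158 / 100 = 118.5

q1 = value_at_rank(data, q1_rank)  # average of the 39th and 40th values
q3 = value_at_rank(data, q3_rank)  # average of the 118th and 119th values
iqr = q3 - q1

print(q1_rank, q3_rank, iqr)
```

Statistical packages use several slightly different percentile definitions, so results from SPSS or other software may differ a little from this hand method.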

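The variance ratio for nominal or ordinal data is simple enough to sketch in a few lines of Python. The function name and both example samples are assumptions made for illustration. The first sample reproduces the case of 4 categories with 25% of the cases in each.

```python
from collections import Counter


def variance_ratio(categories):
    """VR = 1 - f_modal / n: the proportion of cases outside the modal category."""
    counts = Counter(categories)
    modal_frequency = max(counts.values())
    return 1 - modal_frequency / len(categories)


# Hypothetical samples of 100 cases each.
even = ["a"] * 25 + ["b"] * 25 + ["c"] * 25 + ["d"] * 25      # equal frequencies
lopsided = ["a"] * 70 + ["b"] * 10 + ["c"] * 10 + ["d"] * 10  # one dominant category

vr_even = variance_ratio(even)          # 1 - 25/100 = 0.75, the maximum for 4 categories
vr_lopsided = variance_ratio(lopsided)  # about 0.30, much less dispersion

print(vr_even, vr_lopsided)
```

With four categories the VR cannot exceed .75, so the evenly spread sample sits at the maximum while the lopsided sample shows far less dispersion.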