# Week 3 Stats Notes 220

JMU

These 4 pages of class notes were uploaded by Christian Anthony on Monday, February 1, 2016. The notes belong to course 220 at James Madison University, taught by Mr. Greg Jansen in Winter 2016.

Date Created: 02/01/16

STATS NOTES 3 (1/25, 1/27, 1/29)

For symmetric distributions, which have a decent center, the easiest number to calculate with is the mean. For skewed distributions, the median is a more stable number to calculate with. (In class, no specifics were given on exactly what we would be calculating using these.. sorry.)

MEASURES OF SPREAD

Range- The maximum number in a data set minus the minimum number in a data set; shows how wide a distribution is.

Variance- An average of the squared deviations of each data point from the collective mean (the sample version divides by n - 1 instead of n). The formula for the sample variance (which we'll likely never really use, but is good to know) is here:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Standard Deviation- The square root of the variance. Formula below.

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Here x̄ is the sample mean and n is the number of data points.

\* σ = population standard deviation, which is ALWAYS given. S stands for sample standard deviation.

PROPERTIES OF STANDARD DEVIATION

-- Measures the spread around the mean. (Standard deviation goes with the mean ONLY.)
-- S is never negative (S ≥ 0), and it can only be 0 when there is NO spread, meaning all the data points are the same.
-- S has the same units as the observations (if dealing with $, S will be in $).
-- S is sensitive to extreme values (an outlier can change it a lot, just as it changes the mean).

UNITS

In this class, the professor pointed out how he wants units done. If a question asks you for 1 or 2 values, include units. For 3 or more numbers, units may be omitted.

Z SCORES

A Z score is a standardized value that tells us how many standard deviations a data value is away from the mean. Formula for Z score is below:

$$z = \frac{x - \mu}{\sigma}$$

In this, x equals the data value. The other symbols are in this guide or the previous one. Check 'em out.

EX: From a population with μ = 4 and σ = 22, find the number of standard deviations that 2 is from the mean. Z = (2 - 4)/22, which equals -2/22 ≈ -0.091. Putting this in a sentence, you could say: The data value of 2 is about 0.091 standard deviations below the mean.

PERCENTILES

Percentile- The percentage of data that lies below a certain given data value.

Quartiles- The 25th, 50th, and 75th percentiles.
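Going back to the variance, standard deviation, and Z score definitions above, here is a minimal Python sketch of the arithmetic. It assumes the sample formulas (dividing by n - 1), and it assumes σ = 22 for the Z score example, since that is the number the example divides by. The function names and the `scores` list are made up for illustration and are not from class.

```python
import math

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1 (needs n >= 2)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

def sample_std(data):
    """Sample standard deviation S: the square root of the sample variance."""
    return math.sqrt(sample_variance(data))

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical data set, chosen only to make the arithmetic easy to follow.
scores = [2, 4, 4, 6]
print(sample_variance(scores))  # mean is 4, squared deviations sum to 8, 8 / 3
print(sample_std(scores))

# The worked example from the notes: x = 2, mean = 4, sigma assumed to be 22.
print(round(z_score(2, 4, 22), 3))  # -0.091

# No spread at all gives S = 0.
print(sample_std([5, 5, 5]))  # 0.0
```

A negative Z score just means the data value sits below the mean; the size of the number is the distance in standard deviations.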
The three quartiles compare as follows:

| 25th Percentile | 50th Percentile | 75th Percentile |
| --- | --- | --- |
| First Quartile (Q1) | Second Quartile (M) | Third Quartile (Q3) |
| 25% of data is below this value. | Median of data (M). | Median of the upper half of the data. |
| Median of the lower half of the data. | 50% of data below AND above. | 75% of data below this point. |

EX: 3 8 10 12 18 21

Q1: 8 (median of the lower half 3, 8, 10)
M: 11 (halfway between 10 and 12)
Q3: 18 (median of the upper half 12, 18, 21)

FIVE NUMBER SUMMARY

The five number summary consists of the minimum number, the first quartile, the median, the third quartile, and the maximum number. These numbers must be written in order, as follows:

Min: 3 Q1: 8 M: 11 Q3: 18 Max: 21

BOXPLOTS

Boxplots- Graphs that show the five number summary.

*Before drawing one up, be sure you use consistent number labeling.

1. Draw vertical lines that represent each value in the five number summary.
2. Connect the Q1 to the Q3 on the top and bottom (making a box).
3. Connect the min number to the Q1 and the max to the Q3.

OUTLIERS

The InterQuartile Range, or IQR, is Q3 - Q1. This will be used to find outliers. There are two different ways to find outliers. The first is the IQR Method, the second is the 3 Standard Deviations Rule. (The professor said he'll ask you to use one or the other.)

1) The IQR Method:

● Q1 - (1.5*IQR) = the lower bound for outliers. Anything below the number you get with this is an outlier.
● Q3 + (1.5*IQR) = the upper bound for outliers. Anything above this number is an outlier.

2) The 3 Standard Deviations Rule:

● x̄ - 3SD = the lower bound. Anything below this number is an outlier.
● x̄ + 3SD = the upper bound. Anything above this number is an outlier.
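The five number summary and both outlier rules above can be sketched in Python like this. It uses the "median of each half" way of finding Q1 and Q3 from these notes, and it assumes that with an odd number of data points the middle value is left out of both halves (the notes don't say, so treat that as a guess). All function names are made up for illustration.

```python
def median(values):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

def five_number_summary(data):
    """Return (min, Q1, M, Q3, max); needs at least 2 data points."""
    s = sorted(data)
    n = len(s)
    half = n // 2
    lower = s[:half]
    # Assumption: for odd n, the median itself is excluded from both halves.
    upper = s[half:] if n % 2 == 0 else s[half + 1:]
    return (s[0], median(lower), median(s), median(upper), s[-1])

def iqr_bounds(q1, q3):
    """IQR Method: anything below the first or above the second value is an outlier."""
    iqr = q3 - q1
    return (q1 - 1.5 * iqr, q3 + 1.5 * iqr)

def three_sd_bounds(mean, sd):
    """3 Standard Deviations Rule: mean minus and plus 3 standard deviations."""
    return (mean - 3 * sd, mean + 3 * sd)

# The example data set from the notes.
data = [3, 8, 10, 12, 18, 21]
summary = five_number_summary(data)
print(summary)  # (3, 8, 11.0, 18, 21)

low, high = iqr_bounds(summary[1], summary[3])
print(low, high)  # -7.0 33.0

outliers = [x for x in data if x < low or x > high]
print(outliers)  # []
```

For the example data the IQR is 18 - 8 = 10, so the bounds are 8 - 15 = -7 and 18 + 15 = 33, and nothing in the data set counts as an outlier.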

