by: Wendy Liu

# Statistics 401, Week 3 01:960:401


## About this Document

Covers distribution of data: standard deviation, variance, normal distributions, etc.
- Course: Basic Statistics for Research
- Professor: Hei-ki Dong
- Type: Class Notes
- Pages: 2
- Concepts: Statistics

These 2-page class notes were uploaded by Wendy Liu on Thursday, September 22, 2016. They belong to 01:960:401 at Rutgers University, taught by Hei-ki Dong in Fall 2016.

Date Created: 09/22/16
## Week 3: Distribution of Data (rest of Ch. 2)

20 September 2016. Basic Statistics for Research, Professor HK Dong. Notes by Wendy Liu.

### Summary of Standard Notation

| Variable | Sample | Population |
|---|---|---|
| Mean | $\bar{x}$ | $\mu$ |
| Standard deviation | $s$ | $\sigma$ |
| Variance | $s^2$ | $\sigma^2$ |
| Data set size | $n$ | $N$ |

### 5-Number Summary

The 5-number summary forms the box-and-whisker plot.

- Min: smallest value in the data set
- $Q_1$: 25th percentile; 25% of the data is below it
  - aka the median of the lower half (min to $Q_2$)
- $Q_2$: aka the median; splits the data in half at the 50% mark
- $Q_3$: 75th percentile; 75% of the data is below it
  - aka the median of the upper half ($Q_2$ to max)
- Max: largest value in the data set

Outlier data: marked with an asterisk (*) on the box-and-whisker plot.

- $Q_3 + 1.5 \cdot \mathrm{IQR}$ = upper limit: data points above the upper limit are outliers
- $Q_1 - 1.5 \cdot \mathrm{IQR}$ = lower limit: data points below the lower limit are outliers

### Measures of Variation

- Deviation from the mean: $x - \bar{x}$
  - Measure of variation for one data point (not the entire data set)
  - Total deviation for any data set: $\sum (x - \bar{x}) = 0$
    - Positive and negative deviations of different data points cancel out
  - Average deviation from the mean: $\frac{\sum (x - \bar{x})}{n} = 0$, since the total is 0
- Interquartile range: $\mathrm{IQR} = Q_3 - Q_1$
- Standard deviation: roughly the average distance of scores in a distribution from their mean
  - Sample standard deviation: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
  - Population standard deviation: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
- Variance: standard deviation squared
  - Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
  - Population variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
- Bessel's correction: divide by $n - 1$ for samples, for the $n - 1$ degrees of freedom
  - Samples generally won't have as many outliers as the population (if any at all)
  - Dividing by a smaller value ($n - 1$) results in a larger standard deviation, which will be more similar to the true population standard deviation
- Sum of squares: $SS = \sum (x - \bar{x})^2$

### Shape of Data Distributions

- Normal curve
  - Bell shaped
  - Symmetric about the center (the mean; $z = 0$)
  - Area under the entire curve = 1 = 100%
  - Mean = median = mode (or very close)
- Uniform
  - Nearly equal frequency of all values
  - Flat-topped
- Skewed left / negatively skewed
  - Few data points to the left (more negative) of the majority
- Skewed right / positively skewed
  - Few data points to the right (more positive) of the majority

### z-score

A z-score is a data point's distance from the mean, measured in units of standard deviation.

- Allows for comparison across data sets
- z-score of the mean: $z = 0$
- Sample: $z = \frac{x - \bar{x}}{s}$
- Population: $z = \frac{x - \mu}{\sigma}$

### Empirical Rule (for normal distributions)

- About 68% of the data will be within 1 standard deviation of the mean
- About 95% of the data will be within 2 standard deviations of the mean
- About 99.7% of the data will be within 3 standard deviations of the mean
- Data points more than 2 standard deviations away from the mean are considered outliers

### Chebyshev's Rule (any type of distribution)

- No useful info for $z = \pm 1$
- At least 75% of the data will be within $z = \pm 2$
- At least 89% of the data will be within $z = \pm 3$
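The quantities in these notes can be checked numerically. The sketch below is not from the lecture: the data set is made up, every function name is hypothetical, and the quartiles use the "median of each half" method described above (other software may use a different quartile convention and give slightly different values). The Chebyshev bound is the standard $1 - 1/k^2$ formula, which gives the 75% and 89% figures for $k = 2$ and $k = 3$.

```python
import statistics


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method.

    Assumes at least two data points. For an odd-sized data set the
    middle value is left out of both halves.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])


def outlier_limits(q1, q3):
    """Return (lower limit, upper limit) from the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def z_scores(data, sample=True):
    """z = (x - mean) / sd, with the sample (n - 1) or population (N) sd."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data) if sample else statistics.pstdev(data)
    return [(x - mean) / sd for x in data]


def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations (k > 1)."""
    return 1 - 1 / k ** 2


# Made-up data set, for illustration only.
data = [2, 4, 4, 5, 6, 7, 9, 30]

summary = five_number_summary(data)   # (2, 4.0, 5.5, 8.0, 30)
low, high = outlier_limits(summary[1], summary[3])   # (-2.0, 14.0)
outliers = [x for x in data if x < low or x > high]  # [30]

# Deviations from the mean cancel out (up to floating-point rounding).
mean = statistics.mean(data)
total_deviation = sum(x - mean for x in data)

# Bessel's correction: dividing by n - 1 gives the larger value.
sample_sd = statistics.stdev(data)
population_sd = statistics.pstdev(data)

print(summary, (low, high), outliers)
print(sample_sd > population_sd, chebyshev_lower_bound(2))
```

Here `statistics.stdev` divides by $n - 1$ and `statistics.pstdev` divides by $N$, which matches the sample and population formulas above.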

