Statistics 401, Week 3
These 2-page class notes were uploaded by Wendy Liu on Thursday, September 22, 2016. They belong to 01:960:401 (Basic Statistics for Research) at Rutgers University, taught by Hei-ki Dong in Fall 2016.
Date Created: 09/22/16
Week 3: Distribution of Data (rest of Ch. 2)
20 September 2016
Basic Statistics for Research
Professor HK Dong
Wendy Liu

Summary of Standard Notation:

Variable              Sample        Population
Mean                  $\bar{x}$     $\mu$
Standard deviation    $s$           $\sigma$
Variance              $s^2$         $\sigma^2$
Data set size         $n$           $N$

5 number summary – forms the box & whisker plot
• Min – smallest value in the data set
• Q1 – 25th percentile; 25% of the data is below it
  o aka the median of min–Q2
• Q2 – aka the median; splits the data in half at the 50% mark
• Q3 – 75th percentile; 75% of the data is below it
  o aka the median of Q2–max
• Max – largest value in the data set

Outlier data – marked as an asterisk (*) on the box & whisker plot
• Q3 + 1.5 IQR = upper limit: data points above the upper limit are outliers
• Q1 – 1.5 IQR = lower limit: data points below the lower limit are outliers

Measures of variation:
• Deviation from the mean: $x - \bar{x}$
  o Measure of variation for one data point (not the entire data set)
  o Total deviation for any data set: $\sum (x - \bar{x}) = 0$
    ▪ Positive and negative deviations of different data points always cancel out
  o Average deviation from the mean: because the signed deviations sum to 0, their plain average is also 0, so it says nothing about spread
• Interquartile range: IQR = Q3 – Q1
• Standard deviation – roughly the average distance of scores in a distribution from their mean
  o Sample standard deviation: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
  o Population standard deviation: $\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$
• Variance – standard deviation squared
  o Sample variance: $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$
  o Population variance: $\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$
• Bessel's correction: divide by n – 1 for samples, for the n – 1 degrees of freedom
  o A sample generally won't contain as many outliers as the population (if any at all), so its raw spread tends to understate the population's
  o Dividing by a smaller value (n – 1) results in a larger st. dev., which will be more similar to the true population st. dev.
• Sum of squares: $SS = \sum (x - \bar{x})^2$

Shape of data distributions:
• Normal curve
  o Bell shaped
  o Symmetric about the center (mean; z = 0)
  o Area under the entire curve = 1 = 100%
  o Mean = median = mode (or very close)
• Uniform
  o Nearly equal frequency of all values
  o Flat-topped
• Skewed left/negatively skewed
  o Few data points to the left (more negative) of the majority
• Skewed right/positively skewed
  o Few data points to the right (more positive) of the majority

z-score – a data point's distance away from the mean, measured in units of standard deviation
• Allows for comparison across data sets
• z-score of the mean: z = 0
• Sample: $z = \dfrac{x - \bar{x}}{s}$
• Population: $z = \dfrac{x - \mu}{\sigma}$
• Empirical Rule – for normal distributions
  o 68% of the data will be within 1 standard deviation of the mean
  o 95% of the data will be within 2 standard deviations of the mean
  o 99.7% of the data will be within 3 standard deviations of the mean
    ▪ Data points more than 2 st. devs. away from the mean are considered outliers
• Chebyshev's Rule – any type of distribution
  o No useful info for z = ±1
  o At least 75% of data will be within z = ±2
  o At least 89% of data will be within z = ±3
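The five-number summary, the 1.5 IQR outlier limits, and the sample vs. population standard deviation can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module. The two data sets and the helper names `five_number_summary` and `outlier_limits` are made up for illustration and are not from the lecture. Q1 and Q3 are computed as the medians of the lower and upper halves, as described in the notes; other textbooks and software may use slightly different quartile rules.

```python
import statistics


def five_number_summary(data):
    """Return (min, Q1, Q2, Q3, max), with Q1/Q3 as medians of the two halves."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]           # values before the median position
    upper = xs[half + n % 2:]   # values after it (middle value skipped if n is odd)
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])


def outlier_limits(q1, q3):
    """Return (lower limit, upper limit) = (Q1 - 1.5 IQR, Q3 + 1.5 IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Hypothetical data set with one extreme value
data = [2, 4, 4, 5, 6, 7, 8, 9, 30]
summary = five_number_summary(data)
low, high = outlier_limits(summary[1], summary[3])
outliers = [x for x in data if x < low or x > high]

# Hypothetical data set for the measures of variation
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(scores)
deviations = [x - mean for x in scores]
total_deviation = sum(deviations)                 # signed deviations cancel out
sum_of_squares = sum(d ** 2 for d in deviations)  # SS
sample_sd = statistics.stdev(scores)              # divides SS by n - 1
population_sd = statistics.pstdev(scores)         # divides SS by N

print("five-number summary:", summary)
print("limits:", low, high, "outliers:", outliers)
print("total deviation:", total_deviation, "SS:", sum_of_squares)
print("sample sd:", sample_sd, "population sd:", population_sd)
```

For the first data set, Q1 = 4, Q2 = 6, and Q3 = 8.5, so the limits are -2.75 and 15.25 and only the value 30 is flagged. For the second, the deviations sum to 0, SS = 32, the population standard deviation is 2, and the sample standard deviation is a bit larger (about 2.14) because of the n – 1 divisor.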
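The z-score formula and the two coverage rules can also be sketched in a few lines. This is only an illustration: the exam scores, means, and standard deviations below are invented, and the function names are hypothetical. The Chebyshev figures in the notes match the standard bound $1 - 1/k^2$ for the fraction of data within $k$ standard deviations, which the sketch assumes; the normal-curve percentages are computed with `statistics.NormalDist` from the Python standard library (Python 3.8+).

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Distance of x from the mean, in units of standard deviation."""
    return (x - mean) / sd


def chebyshev_bound(k):
    """Minimum fraction of data within k standard deviations (any distribution)."""
    return 1 - 1 / k ** 2


def normal_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)


# Comparing scores from two hypothetical exams with different scales
z_a = z_score(85, mean=70, sd=10)   # exam A
z_b = z_score(80, mean=60, sd=16)   # exam B
print("z on exam A:", z_a, "z on exam B:", z_b)

for k in (1, 2, 3):
    print(k, "sd:",
          "Chebyshev at least", round(chebyshev_bound(k), 3),
          "| normal curve", round(normal_within(k), 4))
```

In this made-up comparison the exam A score (z = 1.5) sits further above its mean than the exam B score (z = 1.25), even though the raw gap is larger on exam B. The loop shows why Chebyshev's Rule gives no useful information at z = ±1 (the bound is 0), while the normal-curve fractions round to the 68%, 95%, and 99.7% of the Empirical Rule.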