Chapter 2 Notes

• Deviation—Given a value y and a data point x, x-y represents how far x deviates from y.

o Ex. Given x=10 and y=5, x-y=5, so 10 deviates from 5 by 5.

• Bar Charts—Display where the length of a bar corresponds to the frequency or number of observations in a category

• Pie Charts—Each slice is proportional to the amount in its category

• Relative Frequency—The proportion of observations in a class: relative frequency = (# in the class)/(# total).

• Cumulative Frequency—Sum of frequencies of a particular class and all preceding classes.

• Cumulative Relative Frequency—The cumulative frequency divided by the total number of observations.
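
The frequency definitions above can be sketched in a few lines of Python. The class labels and counts are made up for illustration, and the variable names are only suggestions.

```python
from itertools import accumulate

# Hypothetical class counts (illustrative data, not from the notes).
frequencies = {"A": 4, "B": 6, "C": 10}

total = sum(frequencies.values())

# Relative frequency: count in the class divided by the total count.
relative = {label: f / total for label, f in frequencies.items()}

# Cumulative frequency: running sum over this class and all preceding classes.
cumulative = dict(zip(frequencies, accumulate(frequencies.values())))

# Cumulative relative frequency: cumulative frequency divided by the total.
cumulative_relative = {label: c / total for label, c in cumulative.items()}

for label in frequencies:
    print(label, frequencies[label], relative[label],
          cumulative[label], cumulative_relative[label])
```

The last class always has a cumulative relative frequency of 1, which is a quick sanity check on a hand-built table.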

• Histogram—A bar graph of frequency or relative frequency

• Algorithm

1. Determine the number of classes

2. Find the smallest and largest value

3. Class width = (largest – smallest) / number of classes

4. The first class usually contains the smallest value and starts at a multiple of the class width

5. Find the class boundaries, which are the midpoints between the upper limit of one class and the lower limit of the next

a. Lower class boundary < xi < upper class boundary

6. Calculate the frequency or relative frequency of each class

7. Create a bar graph!

***Class boundaries are sometimes called cut points. Classes are sometimes called bins.
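
A rough Python sketch of the histogram steps above. It simplifies two things, so treat it as an assumption-laden illustration rather than the exact procedure in these notes: the first class starts at the smallest value (not at a multiple of the class width), and classes are half-open intervals instead of midpoint boundaries. The function name and the data are hypothetical, and the data is assumed not to be all one value (otherwise the width would be zero).

```python
def histogram_counts(data, num_classes):
    """Count how many values fall in each of num_classes equal-width classes."""
    smallest, largest = min(data), max(data)
    width = (largest - smallest) / num_classes
    counts = [0] * num_classes
    for x in data:
        # Classes are [lower, upper); the largest value is put in the last class.
        index = min(int((x - smallest) / width), num_classes - 1)
        counts[index] += 1
    return width, counts


# Hypothetical data set for illustration.
data = [1, 2, 2, 3, 5, 6, 8, 9, 9, 10]
width, counts = histogram_counts(data, 3)

# Text version of the bar graph: one '#' per observation in the class.
for i, count in enumerate(counts):
    lower = min(data) + i * width
    print(f"{lower:5.1f} to {lower + width:5.1f} | {'#' * count}")
```

With three classes the width is (10 - 1)/3 = 3, and the counts sum to the number of data points.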

• Ordered Array—List of all data points in order

o Rank order: increasing order

o Reverse rank order: decreasing order

• Dot Plot—Graph where each data point is a point above a horizontal axis (usually a number line); if multiple entries have the same value, they are stacked

• Probability Distribution—Assigns a probability to a set of possible outcomes

• Symmetric Distribution—If one were to draw a line down the middle of the distribution, the two sides would mirror each other

• Skewed (asymmetrical) Distribution—Not symmetric; a group of observations that is not equal on both sides

1. Left Skewed—Left side (tail) longer

2. Right Skewed—Right side (tail) longer

• Unimodal—Distribution has exactly one “peak”

• Bimodal—Has exactly two “peaks”

• Multimodal—Has more than one “peak”

• Mode—The value that occurs most frequently, not necessarily unique

• Mean—The average

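o Sample Mean—x̄ = (x1 + x2 + … + xn)/n, where xi is the ith data point in a sample of size n

o Population Mean—μ = (x1 + x2 + … + xN)/N, where N is the number of elements in the population. For a finite population.

o Weighted Mean—Suppose the ith observation xi is given a weight wi; the weighted mean is (w1x1 + … + wnxn)/(w1 + … + wn)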

o Trimmed Mean—Ignores equal percentages of the highest and lowest data points.
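
The kinds of mean listed above can be sketched as follows. The function names, the trimming convention (drop a whole number of points from each end), and the data are assumptions for illustration only.

```python
def mean(values):
    """Ordinary average: sum of the values divided by how many there are."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Each value counts in proportion to its weight."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def trimmed_mean(values, proportion):
    """Drop the given proportion of points from each end, then average the rest."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of points dropped per end
    return mean(ordered[k:len(ordered) - k])


# Hypothetical data with one very large value.
data = [2, 4, 4, 5, 100]

print(mean(data))                        # pulled upward by 100
print(weighted_mean([80, 90], [1, 3]))   # 90 counts three times as much as 80
print(trimmed_mean(data, 0.2))           # drops 2 and 100, averages 4, 4, 5
```

The trimmed mean here ignores the extreme value 100, which is why it lands much closer to the bulk of the data than the ordinary mean.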

• Median—Data value in the center of an ordered list

o Ex. 1,3,4,6,3,4,5 in order is 1,3,3,4,4,5,6. 4 is the median.
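
The median example above can be checked in Python. The second call uses a made-up list with an even number of points; in that case the standard library's `statistics.median` averages the two middle values, a convention these notes do not state.

```python
import statistics

data = [1, 3, 4, 6, 3, 4, 5]

# Sort first, then take the value in the center position.
ordered = sorted(data)
middle = ordered[len(ordered) // 2]
print(ordered, middle)

# The standard library gives the same answer for an odd number of points.
print(statistics.median(data))

# Hypothetical even-length list: the two middle values (3 and 4) are averaged.
print(statistics.median([1, 3, 4, 6]))
```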

• Outlier—Data points that are extremely small or large relative to the data set.

• Resistant—Statistics not affected by outliers are called resistant 
• Range—The difference between the largest and smallest value 
• Empirical Rule—Derived from a bell-curve (normal distribution)

o One-sigma Rule—About 68% of the data lies within 1 standard deviation of the mean

o Two-sigma Rule—About 95% of the data lies within 2 standard deviations of the mean

o Three-sigma Rule—About 99.7% of the data lies within 3 standard deviations of the mean
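
The three percentages can be recovered from the normal distribution itself. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `proportion_within` is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```

The results are roughly 0.68, 0.95, and 0.997, matching the rule. They apply only to data that is approximately bell-shaped.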

• Chebyshev's Theorem—Proportion of any data set lying within k standard deviations of the mean is at least 1 - 1/k² for k > 1
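
A small sketch of the Chebyshev bound, checked against a made-up, strongly skewed data set. The function names and the data are hypothetical; the population standard deviation (`statistics.pstdev`) is used here as an assumption about which standard deviation is meant.

```python
from statistics import mean, pstdev


def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only stated for k > 1")
    return 1 - 1 / k**2


def proportion_within(data, k):
    """Observed proportion of data within k standard deviations of the mean."""
    center, spread = mean(data), pstdev(data)
    inside = [x for x in data if abs(x - center) <= k * spread]
    return len(inside) / len(data)


# Hypothetical skewed data: the bound holds even though it is not bell-shaped.
data = [1, 2, 2, 3, 3, 3, 4, 4, 20]

print(chebyshev_lower_bound(2))     # at least 75% within 2 standard deviations
print(proportion_within(data, 2))   # observed proportion for this data
```
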
• Percentiles—Given a set of data x1,…,xN, the Pth percentile is a value, say x, such that approximately P% of the data is less than or equal to x and approximately (100-P)% is greater than x

• Percentile of a Value—Percentile of x = (# of data pts. ≤ x)/(total # of data pts.) × 100
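
A minimal sketch of the percentile of a value, assuming the count is of data points less than or equal to x (consistent with the percentile definition above) and that the result is reported as a percent. The function name and the scores are hypothetical.

```python
def percentile_of(value, data):
    """Percent of data points that are less than or equal to value."""
    count = sum(1 for x in data if x <= value)
    return 100 * count / len(data)


# Hypothetical test scores.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

print(percentile_of(80, scores))   # 6 of the 10 scores are <= 80
```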

• Quartiles—The 25th, 50th, and 75th percentiles are the first, second, and third quartiles: Q1, Q2, Q3

• Interquartile Range (IQR)—Difference between the third and first quartiles: IQR = Q3 - Q1

• Outliers—A data point is considered an outlier if it is more than 1.5 times the IQR above Q3 or more than 1.5 times the IQR below Q1
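
The quartile-based outlier rule can be sketched with the standard library. Textbooks and software compute quartiles in slightly different ways; this example assumes the "inclusive" method of `statistics.quantiles`, which may not match the convention used in class. The data is made up.

```python
import statistics

# Hypothetical data with one suspiciously large value.
data = [1, 3, 3, 4, 4, 5, 6, 30]

# n=4 splits the data into quarters and returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Fences: anything beyond 1.5 * IQR from the quartiles is flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q2, q3, iqr)
print(lower_fence, upper_fence, outliers)
```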

• Z-score—The number of standard deviations x is away from the mean: z = (x - mean)/(standard deviation)
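
A short sketch of the z-score, computed as the distance from the mean divided by the standard deviation. The function name and data are hypothetical, and the population standard deviation is used as an assumption.

```python
from statistics import mean, pstdev


def z_score(x, center, spread):
    """Number of standard deviations x lies from the mean (negative if below)."""
    return (x - center) / spread


# Hypothetical data with mean 5 and population standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
center, spread = mean(data), pstdev(data)

print(z_score(9, center, spread))   # 9 is two standard deviations above the mean
print(z_score(3, center, spread))   # 3 is one standard deviation below the mean
```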

• Mean of Grouped Data—Data points might be binned in classes; the mean is then approximated using each class midpoint weighted by the class frequency

• Random Experiment—An activity or event where the outcome is uncertain

• Sample Space—The set of all distinct outcomes of an experiment 
• Relative Frequency—Relative frequency of A = (# of times A occurs)/(# of times the experiment is performed)
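
A minimal sketch of the relative frequency of an event, using a made-up record of ten die rolls. The function name and the recorded outcomes are hypothetical.

```python
def relative_frequency(outcomes, event):
    """Fraction of recorded outcomes that belong to the event."""
    occurrences = sum(1 for outcome in outcomes if outcome in event)
    return occurrences / len(outcomes)


# Hypothetical record of rolling a die 10 times.
rolls = [1, 6, 3, 2, 6, 4, 5, 2, 6, 1]

# Event A: the roll is even.
event_even = {2, 4, 6}

print(relative_frequency(rolls, event_even))   # 6 of the 10 rolls are even
```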

• Set Theory—A set is a list without repeats

o Compound event: A combination of two or more events

o Union of Events: The union of events A & B is the set of outcomes that are in A or B or both. Denoted A∪B

• Intersection—Intersection of events A & B is the set of all outcomes that are in both A & B.

• Complement of Event A—The set of all outcomes not in A

• Mutually Exclusive—The two sets A and B are mutually exclusive if they have no points in common
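
The set operations above map directly onto Python's built-in `set` type. The sample space (one roll of a die) and the events A, B, and C are made up for illustration.

```python
# Hypothetical sample space: one roll of a six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

A = {2, 4, 6}   # roll is even
B = {4, 5, 6}   # roll is at least 4
C = {1, 3}      # roll is 1 or 3

union = A | B                      # outcomes in A or B or both
intersection = A & B               # outcomes in both A and B
complement_A = sample_space - A    # outcomes not in A

print(union, intersection, complement_A)

# Mutually exclusive events share no outcomes.
print(A.isdisjoint(C))   # True: no even number is 1 or 3
print(A.isdisjoint(B))   # False: 4 and 6 are in both
```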