Business Statistics 277 Chapter 3b and 6 Notes
These 11-page class notes were uploaded by Kristin Koelewyn on Monday, February 1, 2016. They belong to BNAD277 at the University of Arizona, taught by Dr. S. Umashankar in Spring 2016.
Date Created: 02/01/16
Bnad277: Chapter 3b Notes
Descriptive Statistics: Numerical Measures
- Measures of Distribution Shape, Relative Location, and Detecting Outliers
  o z-Scores (focusing a lot on this)
  o Chebyshev’s Theorem
  o Empirical Rule
  o Detecting Outliers
- Five-Number Summaries and Box Plots
- Measures of Association Between Two Variables
- Data Dashboards: Adding Numerical Measures to Improve Effectiveness
- Distribution Shape: Skewness
  o (Never will be asked to compute skewness on test)
  o Skewness is an important measure of the shape of a distribution.
  o Formula for skewness: Skewness = [n / ((n - 1)(n - 2))] Σ((x_i - x̄)/s)^3
  o Skewness can easily be computed using statistical software (Excel).
  o Symmetric distribution: not skewed (skewness = 0, mean = median).
  o Distribution skewed moderately to the left: skewness is negative, mean < median.
  o Distribution skewed moderately to the right: skewness is positive, mean > median.
  o If the distribution is highly skewed to the right, skewness is positive (often greater than 1.0).
  o If the distribution is highly skewed to the left, skewness is negative (often less than -1.0).
- Z-scores:
  o The z-score is often called the standardized value.
  o It denotes the number of standard deviations a data value x is from the mean.
  o Excel’s STANDARDIZE function can be used to compute the z-score.
  o An observation’s z-score is a measure of the relative location of the observation in a data set.
  o A data value less than the sample mean will have a z-score less than zero.
  o A data value greater than the sample mean will have a z-score greater than zero.
  o A data value equal to the sample mean will have a z-score of zero.
- Chebyshev’s Theorem:
  o Good to use when the distribution shape is unknown.
  o At least (1 - 1/z^2) of the items in any data set will be within z standard deviations of the mean, where z is any value greater than 1.
  o Chebyshev’s theorem requires z > 1, but z need not be an integer.
    ▪ At least 75% of the data must be within z = 2 standard deviations of the mean.
    ▪ At least 89% of the data must be within z = 3 standard deviations of the mean.
    ▪ At least 94% of the data must be within z = 4 standard deviations of the mean.
- Empirical Rule:
  o Good to use when the distribution has a normal bell shape.
  o The empirical rule can be used to determine the percentage of data values that must be within a specified number of standard deviations of the mean.
  o The empirical rule is based on the normal distribution.
    ▪ 68.26% of the values of a normal random variable are within +/- 1 standard deviation of its mean.
    ▪ 95.44% of the values of a normal random variable are within +/- 2 standard deviations of its mean.
    ▪ 99.72% of the values of a normal random variable are within +/- 3 standard deviations of its mean.
- Detecting Outliers:
  o An outlier is an unusually small or unusually large value in a data set.
  o A data value with a z-score less than -3 or greater than +3 might be considered an outlier.
  o It might be:
    ▪ An incorrectly recorded data value
    ▪ A data value that was incorrectly included in the data set
    ▪ A correctly recorded data value that belongs in the data set
  o Example: For apartment rents, the most extreme z-scores are -1.20 and 2.27. Using |z| ≥ 3 as the criterion for an outlier, there are no outliers in the data set.
- Five-Number Summaries and Box Plots:
  o Summary statistics and easy-to-draw graphs can be used to quickly summarize large quantities of data.
  o Two tools that accomplish this are five-number summaries and box plots.
  o A five-number summary consists of:
    ▪ Smallest Value
    ▪ First Quartile
    ▪ Median
    ▪ Third Quartile
    ▪ Largest Value
  o Example: Smallest Value = 525, First Quartile = 545, Median = 575, Third Quartile = 625, Largest Value = 715
- Box Plot:
  o Used to identify outliers without finding z-scores.
  o A box plot is a graphical summary of data that is based on a five-number summary.
  o A key to the development of a box plot is the computation of the median and the quartiles Q1 and Q3.
    ▪ Example: Q1 = 545, Q2 (median) = 575, Q3 = 625
  o Limits are located (not drawn) using the interquartile range (IQR).
  o Data outside these limits are considered outliers.
  o The location of each outlier is shown with the symbol *.
    ▪ Example: Apartment Rents
  o Whiskers (dashed lines) are drawn from the ends of the box to the smallest and largest data values inside the limits.
- Measures of Association Between Two Variables:
  o Two descriptive measures of the relationship between two variables are the covariance and the correlation coefficient.
- Covariance:
  o The covariance is a measure of the linear association between two variables.
  o Positive values indicate a positive relationship (as x goes up, y goes up, and vice versa).
  o Negative values indicate a negative relationship (as x goes up, y goes down, and vice versa).
    ▪ Covariance for samples: s_xy = Σ(x_i - x̄)(y_i - ȳ) / (n - 1)
    ▪ Covariance for populations: σ_xy = Σ(x_i - μ_x)(y_i - μ_y) / N
- Correlation Coefficient:
  o Correlation is a measure of linear association and not necessarily causation.
  o It means that one variable is associated with the other, not that one is the cause of the other.
    ▪ The correlation coefficient is computed as: r_xy = s_xy / (s_x s_y) for samples, and ρ_xy = σ_xy / (σ_x σ_y) for populations.
  o The coefficient can take on values between -1 and +1.
  o Values near -1 indicate a strong negative linear relationship.
  o Values near +1 indicate a strong positive linear relationship.
  o The closer the correlation is to zero, the weaker the relationship.
    ▪ Example:
      • From the data above, we can get:
- Data Dashboards: Adding Numerical Measures to Improve Effectiveness
  o Data dashboards are not limited to graphical displays.
  o The addition of numerical measures, such as the mean and standard deviation of KPIs, to a data dashboard is often critical.
  o Dashboards are often interactive.
  o Drilling down refers to functionality in interactive dashboards that allows the user to access information and analyses at increasingly detailed levels.
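The Chapter 3b measures above (z-scores, Chebyshev's bound, box plot limits, covariance, and correlation) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the small `xs`/`ys` lists are made up rather than taken from the notes, and the 1.5 × IQR multiplier for the box plot limits is a common convention that the notes themselves do not state.

```python
import statistics


def z_scores(data):
    """Return the z-score (x - mean) / s for every value in data."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    return [(x - mean) / s for x in data]


def chebyshev_lower_bound(z):
    """Minimum fraction of any data set within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z**2


def box_plot_limits(q1, q3, k=1.5):
    """Lower and upper box plot limits; k = 1.5 is an assumed convention."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def sample_covariance(xs, ys):
    """Sample covariance: sum of (x - x_bar)(y - y_bar), divided by n - 1."""
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)


def sample_correlation(xs, ys):
    """Sample correlation: covariance over the product of the standard deviations."""
    return sample_covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))


# Hypothetical data, made up for illustration (not from the notes).
xs = [1, 2, 3]
ys = [2, 4, 6]

print(z_scores(xs))
print([round(chebyshev_lower_bound(z), 4) for z in (2, 3, 4)])
# Quartiles taken from the five-number summary in the notes.
print(box_plot_limits(545, 625))
print(sample_covariance(xs, ys), sample_correlation(xs, ys))
```

With the quartiles from the notes (Q1 = 545, Q3 = 625) and k = 1.5, the limits come out to 425 and 745, so the smallest value (525) and the largest value (715) both fall inside them.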
Bnad277: Chapter 6 Notes
Continuous Probability Distributions
- Uniform Probability Distribution
- Normal Probability Distribution
- Exponential Probability Distribution
- Continuous Probability Distributions:
  o A continuous random variable can assume any value in an interval on the real line or in a collection of intervals.
  o It is not possible to talk about the probability of the random variable assuming a particular value.
  o Instead, we talk about the probability of the random variable assuming a value within a given interval.
  o The probability of the random variable assuming a value within some given interval from x1 to x2 is defined to be the area under the graph of the probability density function between x1 and x2.
- Uniform Probability Distribution:
  o A random variable is uniformly distributed whenever the probability is proportional to the interval’s length.
  o The uniform probability density function is: f(x) = 1/(b - a) for a ≤ x ≤ b, and f(x) = 0 elsewhere
    ▪ Where a = smallest value the variable can assume and b = largest value the variable can assume
  o Expected Value of x: E(x) = (a + b)/2
  o Variance of x: Var(x) = (b - a)^2/12
  o Example: Slater’s Buffet
    ▪ Slater customers are charged for the amount of salad they take. Sampling suggests that the amount of salad taken is uniformly distributed between 5 ounces and 15 ounces.
    ▪ Use the uniform probability density function above, where x = salad plate filling weight: f(x) = 1/10 for 5 ≤ x ≤ 15, and f(x) = 0 elsewhere
    ▪ Expected Value of x: E(x) = (5 + 15)/2 = 10
    ▪ Variance of x: Var(x) = (15 - 5)^2/12 = 8.33
    ▪ Distribution for Salad Plate Filling Weight
- Area as a Measure of Probability:
  o The area under the graph of f(x) and probability are identical.
  o This is valid for all continuous random variables.
  o The probability that x takes on a value between some lower value x1 and some higher value x2 can be found by computing the area under the graph of f(x) over the interval from x1 to x2.
- Normal Probability Distribution:
  o The normal probability distribution is the most important distribution for describing a continuous random variable.
  o It is widely used in statistical inference.
  o It has been used in a wide variety of applications, including heights of people, rainfall amounts, test scores, and scientific measurements.
  o Abraham de Moivre, a French mathematician and author of The Doctrine of Chances, derived the normal distribution in 1733.
    ▪ Normal Probability Density Function: f(x) = [1/(σ√(2π))] e^(-(x - μ)^2/(2σ^2))
    ▪ Where: μ = mean, σ = standard deviation, π = 3.14159, e = 2.71828
  o Characteristics:
    ▪ The distribution is symmetric; its skewness measure is zero.
    ▪ The entire family of normal probability distributions is defined by its mean and its standard deviation.
    ▪ The highest point on the normal curve is at the mean, which is also the median and the mode.
    ▪ The mean can be any numerical value: negative, zero, or positive.
    ▪ The standard deviation determines the width of the curve: larger values result in wider, flatter curves.
    ▪ Probabilities for the normal random variable are given by areas under the curve. The total area under the curve is 1 (.5 to the left of the mean and .5 to the right of the mean).
    ▪ Basis for the empirical rule:
      • 68.26% of values of a normal random variable are within +/- 1 standard deviation of its mean.
      • 95.44% of values of a normal random variable are within +/- 2 standard deviations of its mean.
      • 99.72% of values of a normal random variable are within +/- 3 standard deviations of its mean.
- Standard Normal Probability Distribution:
  o Characteristics:
    ▪ A random variable having a normal distribution with a mean of 0 and a standard deviation of 1 is said to have a standard normal probability distribution.
    ▪ The letter z is used to designate the standard normal random variable.
    ▪ Converting to the Standard Normal Distribution: z = (x - μ)/σ
      • We can think of z as a measure of the number of standard deviations x is from μ.
- Using Excel to Compute Standard Normal Probabilities:
  o Excel has two functions for computing probabilities and z values for a standard normal distribution:
    ▪ NORM.S.DIST is used to compute the cumulative probability given a z value.
    ▪ NORM.S.INV is used to compute the z value given a cumulative probability.
      • The “S” in the function names reminds us that they relate to the standard normal probability distribution.
- Standard Normal Probability Distribution Continued:
  o Example: Pep Zone
    ▪ Pep Zone sells auto parts and supplies, including a popular multi-grade motor oil. When the stock of this oil drops to 20 gallons, a replenishment order is placed. The store manager is concerned that sales are being lost due to stockouts while waiting for a replenishment order.
    ▪ It has been determined that demand during replenishment lead-time is normally distributed with a mean of 15 gallons and a standard deviation of 6 gallons. The manager would like to know the probability of a stockout during replenishment lead-time. In other words, what is the probability that demand during lead-time will exceed 20 gallons? P(x > 20) = ?
      • Step 1: Convert x to the standard normal distribution: z = (x - μ)/σ = (20 - 15)/6 = .83
      • Step 2: Find the area under the standard normal curve to the left of z = .83: P(z ≤ .83) = .7967
      • Step 3: Compute the area under the standard normal curve to the right of z = .83: P(z > .83) = 1 - .7967 = .2033, so P(x > 20) is about .20
    ▪ The manager wants the probability of a stockout to be no more than .05. What should the reorder point be?
      • First find the z value that leaves an area of .05 in the upper tail, which is the z value with a cumulative area of 1 - .05 = .95: z.05 = 1.645
      • Then convert z.05 to the corresponding value of x: x = μ + z.05 σ = 15 + 1.645(6) = 24.87
    ▪ By raising the reorder point from 20 gallons to 25 gallons on hand, the probability of a stockout decreases from about .20 to .05. This is a significant decrease.
  o Using Excel to Compute Normal Probabilities:
    ▪ NORM.DIST is used to compute the cumulative probability given an x value.
    ▪ NORM.INV is used to compute the x value given a cumulative probability.
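The Pep Zone calculation above can be checked outside Excel. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) as a stand-in for NORM.DIST and NORM.INV; the variable names are illustrative and not from the notes.

```python
from statistics import NormalDist

# Demand during replenishment lead-time: mean 15 gallons, std dev 6 gallons.
demand = NormalDist(mu=15, sigma=6)

# Step 1: convert x = 20 to a z value.
z = (20 - demand.mean) / demand.stdev

# Steps 2 and 3: area to the left of z, then the upper-tail area.
area_left = NormalDist().cdf(z)   # standard normal, like NORM.S.DIST
p_stockout = 1 - area_left        # P(x > 20), roughly 0.20

# Reorder point that keeps the stockout probability at 0.05:
# the x value with cumulative probability 0.95, like NORM.INV.
reorder_point = demand.inv_cdf(0.95)

print(round(z, 2), round(p_stockout, 4), round(reorder_point, 2))
```

Because this works with the unrounded z value rather than z = .83 from a table, the stockout probability can differ slightly in the third decimal place from a hand calculation; it still rounds to about .20, and the reorder point still rounds up to 25 gallons.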
- Exponential Probability Distribution:
  o The exponential probability distribution is useful in describing the time it takes to complete a task.
  o Exponential random variables can be used to describe the time between vehicle arrivals at a tollbooth, the time required to complete a questionnaire, and the distance between major defects in a highway.
  o In waiting line applications, the exponential distribution is often used for service times.
  o A property of the exponential distribution is that the mean and standard deviation are equal.
  o The exponential distribution is skewed to the right. Its skewness measure is 2.
    ▪ Density Function: f(x) = (1/μ) e^(-x/μ) for x ≥ 0, where μ = expected value (mean)
    ▪ Cumulative Probabilities: P(x ≤ x0) = 1 - e^(-x0/μ), where x0 = some specific value of x
  o Example: Al’s full-service pump
    ▪ The time between arrivals of cars at Al’s full-service gas pump follows an exponential probability distribution with a mean time between arrivals of 3 minutes. Al would like to know the probability that the time between two successive arrivals will be 2 minutes or less.
    ▪ P(x ≤ 2) = 1 - e^(-2/3) = 1 - .5134 = .4866
  o Relationship Between the Poisson and Exponential Distributions:
    ▪ The Poisson distribution describes the number of occurrences per interval.
    ▪ The exponential distribution describes the length of the interval between occurrences.
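The Al's full-service pump example, and its link to the Poisson distribution, can be sketched numerically. The helper names below are hypothetical. The sketch assumes, as is standard, that a mean of 3 minutes between arrivals corresponds to a Poisson arrival rate of 1/3 per minute.

```python
import math


def exponential_cdf(x0, mean):
    """P(x <= x0) for an exponential distribution with the given mean."""
    if x0 < 0:
        return 0.0
    return 1 - math.exp(-x0 / mean)


def poisson_pmf(k, expected_count):
    """P(exactly k occurrences) when expected_count occurrences are expected."""
    return expected_count**k * math.exp(-expected_count) / math.factorial(k)


mean_minutes = 3  # mean time between arrivals, from the example

# Probability that the time between two successive arrivals is 2 minutes or less.
p_two_or_less = exponential_cdf(2, mean_minutes)

# Equivalent Poisson view: arrivals per hour implied by a 3-minute mean gap.
arrivals_per_hour = 60 / mean_minutes

# "No arrivals in 2 minutes" (Poisson) is the same event as
# "the next arrival takes longer than 2 minutes" (exponential).
p_no_arrivals_in_two = poisson_pmf(0, 2 / mean_minutes)

print(round(p_two_or_less, 4), arrivals_per_hour, round(p_no_arrivals_in_two, 4))
```

The two views agree: the Poisson probability of zero arrivals in a 2-minute window equals one minus the exponential probability computed for Al's question.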