Chapters 3 and 4 Exam 2
Chapters 3 and 4 Exam 2 BUAL 2600 - 001
This 5 page Bundle was uploaded by Kelly Crittenden on Wednesday March 2, 2016. The Bundle belongs to BUAL 2600 - 001 at Auburn University taught by Frances L H Svyantek in Fall 2015.
Date Created: 03/02/16
Kelly Crittenden
February 8, 2016 – February 22, 2016
Business Analytics 2600-001

Chapter 3 Notes: Displaying and Describing Quantitative Data

Histogram: a graph for a quantitative variable. Since there are no categories, we slice the range of possible values into bins and then count the number of cases that fall in each bin.

Relative Frequency Histogram: shows the percentage of cases in each bin instead of the count; the two graphs have the same shape.

Stem-and-Leaf Display: like a histogram, but it also gives the individual values. The first digit is the "stem" that names the bins (the stem is written on the left).

Note: The Quantitative Data Condition must be satisfied before making a histogram or stem-and-leaf display. The data must be values of a quantitative variable.

When describing a distribution, look at its shape, center, and spread.

Modes: peaks or bumps seen in a histogram.
  Unimodal: one main peak
  Bimodal: two peaks
  Multimodal: three or more peaks (three peaks is sometimes called trimodal)

Uniform: a distribution whose histogram doesn't appear to have any modes; the bars are all approximately the same height.

Kurtosis: Platykurtic, Mesokurtic, Leptokurtic

Symmetry: a distribution is symmetric if the halves on either side of the center look approximately like mirror images (skewness is zero or close to zero).

Tails: the thinner ends of a distribution.
Note: if one tail stretches out farther than the other, the distribution is skewed to the side of the longer tail.
  Positively skewed: longer tail on the right
  Negatively skewed: longer tail on the left

Outliers: values that stand away from the body of the distribution.
Note: Always be careful to point out the outliers in a distribution. Outliers:
  can affect every statistical method we will study
  can be the most informative part of your data
  may be an error in the data
  should be discussed in any conclusions drawn about the data

Mean: to find the mean of the variable y (or x), add all values of the variable and divide that sum by the number of data values, n. It is known as the balancing point of the distribution.

Median: used when the distribution is skewed or contains gaps or outliers.
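A minimal Python sketch of the mean and median, using made-up numbers (the data values and variable names are illustrative assumptions, not from the course). It suggests why the median is preferred when outliers are present: a single extreme value pulls the mean a long way but barely moves the median.

```python
from statistics import mean, median

# Hypothetical data: six typical values (assumed for illustration only).
values = [42, 45, 47, 50, 52, 55]

# The same data with one large outlier added.
with_outlier = values + [400]

# Mean: add all values and divide by n. Median: the middle value.
mean_clean = mean(values)            # 291 / 6 = 48.5
median_clean = median(values)        # average of 47 and 50 = 48.5

mean_outlier = mean(with_outlier)      # 691 / 7, roughly 98.7
median_outlier = median(with_outlier)  # middle of 7 sorted values = 50

print("without outlier:", mean_clean, median_clean)
print("with outlier:   ", round(mean_outlier, 1), median_outlier)
```

In this example the two measures agree on the roughly symmetric data, and only the mean changes noticeably once the outlier is added.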
The median is the center value that splits the histogram into two equal areas. It is said to be resistant because it isn't affected by unusual observations or by the shape of the distribution.
Note: the mean and median are almost the same when the distribution is symmetric.

Range: the difference between the extremes (max − min).

Quartiles: values that frame the middle 50% of the data.

Interquartile Range (IQR): the difference between the two quartile values (IQR = Q3 − Q1).

Five-Number Summary: reports a distribution's median, quartiles, and extremes (max and min).

Box plot: once we have the five-number summary of a variable, we can display it in a box plot.

Time Series Plot: a display of values against time.

Smooth Trace: use this to better understand the trend of a time series; it is typically created using a statistics software package.

Stationary: without a strong trend or change in variability; a stationary series can be summarized with a histogram.

Re-express or Transform: apply a simple function to all the data values to make a skewed distribution more symmetric.
  Example: Log Compensation, where the histogram of the logged values is much more symmetric.

Kelly Crittenden
February 24, 2016 – February 29, 2016
Business Analytics 2600-001

Chapter 4 Notes: Correlation & Linear Regression

Scatterplot: plots one quantitative variable against another; an effective display for finding trends, patterns, and relationships between two quantitative variables. Scatterplots are the ideal way to picture what we call associations.

1st, look at direction. The direction of the association is important. A pattern running from the upper left to the lower right is said to be negative. A pattern running from the lower left to the upper right is called positive.
2nd, look for form.
Linear: a straight-line relationship will appear as a cloud or swarm of points stretched out in a generally consistent, straight form.
3rd, look for strength (weak or strong).
4th, look for outliers.

Outlier: an unusual observation standing away from the overall pattern of the scatterplot.

Coordinates (x, y):
  x = explanatory/predictor variable (independent)
  y = response variable (dependent)

Correlation Coefficient (r): the ratio of the sum of the products z_x z_y for every point in the scatterplot to n − 1, that is, r = Σ(z_x z_y) / (n − 1).

Correlation measures the strength of the linear association between two quantitative variables. Check these conditions before using it:
  Quantitative Variables Condition: correlation applies only to quantitative variables.
  Linearity Condition: correlation measures the strength only of the linear association. If the underlying relationship is curved, summarizing its strength with a correlation would be misleading.
  Outlier Condition: unusual observations can distort the correlation. When you see an outlier, it's often a good idea to report the correlation both with and without the point.

Lurking variable: a variable, other than x and y, that simultaneously affects both of the variables you have observed.

Linear model: just an equation of a straight line through the data. A linear model can be written in the form y-hat = b0 + b1 x, where b0 and b1 are numbers estimated from the data and y-hat is the predicted value.

Residual: the difference between the observed value, y, and the predicted value, y-hat. It is denoted e, so e = y − y-hat.

Line of best fit: the line for which the sum of the squared residuals is smallest; often called the least squares line.
  Slope: b1, the predicted change in y for each one-unit increase in x
  Intercept: b0, the predicted value of y when x = 0

Least squares lines are commonly called regression lines.

Regression to the mean: each predicted y tends to be closer to its mean (in standard deviations) than its corresponding x was.

Use models only when specific assumptions are reasonable. Check these conditions:
1. Quantitative Data Condition: linear models only make sense for quantitative data, so don't be fooled by categorical data recorded as numbers.
2. Linearity Condition (checks the Linearity Assumption): the two variables must have a linear association, or a linear model won't mean a thing.
3. Outlier Condition: outliers can dramatically change a regression model.
4. Equal Spread Condition: check a residual plot for equal scatter at all x-values.

R²: the square of the correlation r; by tradition "r²" is written R² and called "R squared" (80–90% range). It gives the fraction of the variation in y accounted for by the model.

Extrapolation: predicting y from an x-value beyond the range of x in the data.
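The correlation and regression ideas in these Chapter 4 notes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up, the function names are hypothetical, and the slope is computed with the standard least squares result b1 = r · (s_y / s_x), with b0 = mean(y) − b1 · mean(x), which the notes do not spell out.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """r = sum of z_x * z_y over all points, divided by n - 1."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)  # sample standard deviations
    return sum(((x - mx) / sx) * ((y - my) / sy)
               for x, y in zip(xs, ys)) / (n - 1)

def least_squares(xs, ys):
    """Return (b0, b1) for the line y_hat = b0 + b1 * x."""
    b1 = correlation(xs, ys) * stdev(ys) / stdev(xs)  # slope
    b0 = mean(ys) - b1 * mean(xs)                     # intercept
    return b0, b1

# Hypothetical data, assumed for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
b0, b1 = least_squares(xs, ys)

# Residual e = observed y minus predicted y_hat, one per point.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# R squared is the square of the correlation.
r_squared = r ** 2

print(f"r = {r:.3f}, y_hat = {b0:.2f} + {b1:.2f} x, R^2 = {r_squared:.2f}")
```

For these numbers the fitted line is y-hat = 2.2 + 0.6x with R² = 0.6, and the residuals sum to zero (up to rounding), as they do for any least squares line with an intercept. Predicting at an x far outside 1 to 5 would be extrapolation.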