Intro Stat Notes Week 3
Intro Stat Notes Week 3 TMATH 110 C
University of Washington Tacoma
These 5-page class notes were uploaded by Qihua Wu on Sunday, October 18, 2015. They belong to TMATH 110 C (Intro Stat Applications) at University of Washington Tacoma, taught by KENNEDY, MAUREEN C. in Fall 2015.
Date Created: 10/18/15
Descriptive statistics: summaries of data sets in quantitative form.

Distributions can differ in their measure of center, meaning the center of the distribution shifts. To find the distance it shifts, subtract center 1 from center 2. Distributions can also differ in variability, meaning they have the same center but differ in how widely the data spread out (range).

Measure of center: the middle of the data set. It shows where the distribution is located along the x-axis. The mean is the most common measure of center.

Notation:
Sample mean: $\bar{x}$ (x with a bar on top). Population mean: $\mu$.
Sample size: $n$. Population size: $N$.
Sample individual observation: $x_i$. Population individual observation: $X_i$. The subscript $i$ identifies the individual observation.

Mean: add all the data values, then divide by the number of data values you have: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
The mean should have one more decimal place than the data. If the mean happens not to have an extra decimal place, add a 0 at the end if it is a decimal, or .0 if it is a whole number.
The mean uses all the data values, and it always has the same unit as the data. The unit is very important.
Extreme values can strongly affect the mean, since the mean is the average of all data values.
The sample mean is unbiased: in the long run, the sample means from repeated samples form a distribution (approximately normal for large samples) whose center is the population mean.
Since the mean is quantitative, it is not a summary for ordinal or nominal data, only for interval and ratio data.
Do not use histograms to display data with extreme values, because the histogram would be greatly influenced by them. Use dot plots to better show the pattern.
Using a frequency table to find the mean: multiply the frequency of each class by the class midpoint, add up these products, then divide by the total of the frequencies to get an estimate of the mean.

Median: another common measure of center. The median is the value in the middle of the data set, where half of the data lie above the median and half below.
To find the median, first sort the data set in order from smallest to largest, then find the middle value. If the number of data values $n$ is odd, the median is the value in the $(n + 1)/2$-th place. If $n$ is even, the median is the mean of the middle two values.
Like the mean, the median has the same unit as the data. Round to one more decimal place than the data; even if you get the number directly from the data set, still add a 0 at the end.
The median does not use all the data values. It uses only one or two data values, depending on whether the number of data values is odd or even.
The median is not strongly influenced by extreme values. All an extreme value does is shift the position of the median.

Mode: the value that occurs most often in the data set. A data set can be unimodal, bimodal, multimodal, or without a mode. In a frequency table, the mode is the value with the highest frequency.
If there are two modes with a gap between the two values, and both the median and the mean lie between them, it is better to analyze the distinct populations separately.

Measure of Central Tendency and Distribution
Normal (symmetric): mode, median, and mean are the same.
Uniform (symmetric): median and mean are the same; there is no mode.
Notice how for symmetric distributions the median and mean are always the same.
Right-skewed (unimodal): the mode is on the left side of the median and the mean is on the right side of the median. This is because the median is not affected by the extreme values as much as the mean is.
Left-skewed (unimodal): the mode is on the right side of the median and the mean is on the left side of the median.

When we measure the variability of a data set, we are trying to find the spread of the data, meaning the gaps the data points tend to have. The simplest measure of this is the range.

Range: largest data value − smallest data value.
The range uses only 2 data values, the extreme values, so it is greatly influenced by extreme values. Make sure you include units.

Variance: measures the spread of the data around the mean. If the spread around the mean is wide, the variance is high; if the data are close together, the variance is low.
To calculate the variance:
1. Calculate the mean.
2. Subtract the mean from every data value and square that difference for every data value.
3. Add all the squared differences. If the differences were not squared, the sum would always be 0, because the mean is the sum of all data values divided by the number of data values. The variance is not about the value of the mean but about the spread of distances around the mean.
4. Divide the sum by the number of data values minus 1 ($n - 1$). We subtract one to obtain an unbiased estimate of the population variance: if we do not subtract one, in the long run the result underestimates the population variance, because a sample tends to miss the extreme values of the population.
As a formula, the sample variance is $s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}$.
Sample variance: $s^2$. Population variance: $\sigma^2$.
The units of the variance are the squares of the units of the original data. This makes the variance less commonly used than the standard deviation, because squared units are hard to interpret.
The variance is influenced by extreme values, because we square the differences between the extreme values and the mean.
The variance is always greater than or equal to 0. It can never be negative, and it is 0 only if all the values in the data set are the same.
Round to one more decimal place than the data.

Standard deviation: the square root of the variance. It increases when more data values lie farther away from the mean.
Sample standard deviation: $s$. Population standard deviation: $\sigma$. Notice how the symbols for the population tend to be Greek letters.
It has the same unit as the data.
It is always greater than or equal to 0 and never negative, since it is the square root of the variance and the variance is either positive or 0.
It is influenced by extreme values.
It is mostly used for comparison, because it is a biased estimate of the population standard deviation. Use the variance for estimation.
For every distribution, at least 75% of the data lie within 2 standard deviations of the mean (mean ± 2 standard deviations). Those data are called usual, and the data outside this range (at most 25%) are called unusual. For a normal distribution, about 95% of the data are usual.

Coefficient of variation (CV): expresses the standard deviation relative to the mean, making it possible to compare variability for data with different measurements or different units.
CV = (standard deviation / mean) × 100%
Round the percentage to one decimal place.

Mean Absolute Deviation (MAD): another measure of variability. Compared to the variance, instead of squaring the difference between each data point and the mean to make it positive, take the absolute value of the difference.
$\text{MAD} = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$
It is not commonly used, because the variance uses the sum of squared differences, which is commonly used in other analyses.
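The measures of center described in these notes can be checked with a short Python sketch. The data set, its unit (cm), and the class boundaries below are hypothetical values chosen for illustration and are not from the lecture. The helper `median_by_position` is also an illustrative name; it follows the sorting rule from the notes, while `statistics.mean`, `statistics.median`, and `statistics.multimode` come from the Python standard library.

```python
from statistics import mean, median, multimode

# Hypothetical sample (say, lengths in cm); not taken from the notes.
data = [2, 3, 3, 5, 7, 8, 14]

def median_by_position(values):
    """Median via the rule in the notes: sort, then take the middle value(s)."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the value in the (n + 1)/2-th place (1-based position).
        return ordered[(n + 1) // 2 - 1]
    # Even n: the mean of the middle two values.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

x_bar = mean(data)        # sum of all data values divided by n
med = median(data)        # library median, for comparison with the helper
modes = multimode(data)   # list of every most frequent value

# Estimate of the mean from a frequency table:
# sum(frequency * class midpoint) / sum(frequencies).
# Hypothetical classes 0-4, 5-9, 10-14 with midpoints 2, 7, 12.
midpoints = [2, 7, 12]
frequencies = [3, 3, 1]
freq_mean = sum(f * m for f, m in zip(frequencies, midpoints)) / sum(frequencies)

# Report with one more decimal place than the whole-number data.
print(f"mean = {x_bar:.1f} cm, median = {med:.1f} cm, modes = {modes}")
print(f"median by position = {median_by_position(data):.1f} cm")
print(f"frequency-table estimate of the mean = {freq_mean:.1f} cm")
```

In this made-up sample the single large value pulls the mean above the median while the mode stays below it, which is the right-skewed ordering the notes describe. The frequency-table result is only an estimate, since each value is replaced by its class midpoint.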
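The measures of variability in these notes (range, variance, standard deviation, coefficient of variation, and mean absolute deviation) can be traced step by step in the same way. This is a minimal sketch under stated assumptions: the data set is hypothetical, and all variable names are chosen for illustration only.

```python
from math import sqrt

# Hypothetical sample (say, lengths in cm); not taken from the notes.
data = [2, 3, 3, 5, 7, 8, 14]
n = len(data)

# Range: largest data value minus smallest data value.
data_range = max(data) - min(data)

# Variance, following the four steps in the notes.
x_bar = sum(data) / n                           # step 1: the mean
deviations = [x - x_bar for x in data]          # step 2a: subtract the mean
squared = [d ** 2 for d in deviations]          # step 2b: square each difference
sum_of_squares = sum(squared)                   # step 3: add the squared differences
sample_variance = sum_of_squares / (n - 1)      # step 4: divide by n - 1

# Without squaring, the differences cancel out to 0 (up to rounding error).
sum_of_deviations = sum(deviations)

# Standard deviation: square root of the variance (same unit as the data).
sample_sd = sqrt(sample_variance)

# Coefficient of variation: standard deviation relative to the mean, in percent.
cv_percent = sample_sd / x_bar * 100

# Mean absolute deviation: absolute differences instead of squared ones.
mad = sum(abs(d) for d in deviations) / n

# "Usual" values: within 2 standard deviations of the mean.
low, high = x_bar - 2 * sample_sd, x_bar + 2 * sample_sd
usual = [x for x in data if low <= x <= high]
share_usual = len(usual) / n

print(f"range = {data_range} cm")
print(f"variance = {sample_variance:.1f} cm^2, standard deviation = {sample_sd:.1f} cm")
print(f"CV = {cv_percent:.1f}%, MAD = {mad:.1f} cm")
print(f"share of usual values = {share_usual:.0%}")
```

The share of usual values printed at the end should never fall below 75%, whatever data set is substituted, which matches the rule stated in the notes for every distribution.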