# BTM8107-8 Week 2 Activity Understanding and Exploring Assumptions Rated A+

This 14 page Study Guide was uploaded by kimwood Notetaker on Friday November 13, 2015. The Study Guide belongs to a course at a university taught by a professor in Fall. Since its upload, it has received 62 views.

Date Created: 11/13/15

1. Assumptions are needed to draw accurate conclusions about reality. Different statistical models make different assumptions, and for a model to reflect reality accurately its assumptions need to be true. If the assumptions are broken, accurate conclusions cannot be drawn about the data. Therefore, part of the analysis process involves checking that your data do not violate these assumptions.

2.

3.

4. Answer:

For the variable “Day 1”: The histogram of Day 1 looks like a normal distribution, but its peak rises slightly above the normal curve, so it is leptokurtic. Because the normal curve and the data distribution curve are approximately the same, the distribution is approximately symmetric. Hence Day 1 is approximately symmetric and normal, with leptokurtic kurtosis. In the P-P plot the data points lie close to the straight line, which also indicates a normal distribution.

For the variable “Day 2”: The histogram of Day 2 looks like a normal distribution, but most of the data points are concentrated on the left side of the normal curve, so the distribution is positively skewed. Its peak rises slightly above the normal curve, so it is leptokurtic. Because the normal curve and the data distribution curve are approximately the same, Day 2 is approximately normal, leptokurtic and positively skewed. In the P-P plot the data points lie close to the straight line, which also indicates a normal distribution.
For the variable “Day 3”: The histogram of Day 3 looks like a normal distribution, but most of the data points are concentrated on the left side of the normal curve, so the distribution is positively skewed. Its peak rises slightly above the normal curve, so it is leptokurtic. Because the normal curve and the data distribution curve are approximately the same, Day 3 is approximately normal, leptokurtic and positively skewed. In the P-P plot the data points lie close to the straight line, which also indicates a normal distribution.

5. Answer: The descriptive statistics of the three variables:

| Statistics | Hygiene (Day 1 of Download Festival) | Hygiene (Day 2 of Download Festival) | Hygiene (Day 3 of Download Festival) |
|---|---|---|---|
| N Valid | 810 | 264 | 123 |
| N Missing | 0 | 546 | 687 |
| Mean | 1.7934 | .9609 | .9765 |
| Median | 1.7900 | .7900 | .7600 |
| Mode | 2.00 | .23 | .44a |
| Std. Deviation | .94449 | .72078 | .71028 |
| Variance | .892 | .520 | .504 |
| Skewness | 8.865 | 1.095 | 1.033 |
| Std. Error of Skewness | .086 | .150 | .218 |
| Kurtosis | 170.450 | .822 | .732 |
| Std. Error of Kurtosis | .172 | .299 | .433 |
| Range | 20.00 | 3.44 | 3.39 |

a. Multiple modes exist. The smallest value is shown.

From the descriptive statistics we see that Day 1 has no missing observations: all 810 observations are valid, so n1 = 810. Day 2 has 264 valid and 546 missing observations, so the file still contains 264 + 546 = 810 cases, of which n2 = 264 are usable. Day 3 has 123 valid and 687 missing observations, again 123 + 687 = 810 cases in total, of which n3 = 123 are usable.
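One way to put the skewness and kurtosis figures reported above in context is to divide each by its standard error, which gives an approximate z-score. The sketch below uses only the values from the descriptive statistics; the helper name `z_score` and the 1.96 cutoff (a conventional two-tailed 5% threshold that does not appear in the original answer) are assumptions for illustration.

```python
# Skewness and kurtosis with their standard errors, as reported above.
reported = {
    "Day 1": {"skew": 8.865, "se_skew": 0.086, "kurt": 170.450, "se_kurt": 0.172},
    "Day 2": {"skew": 1.095, "se_skew": 0.150, "kurt": 0.822, "se_kurt": 0.299},
    "Day 3": {"skew": 1.033, "se_skew": 0.218, "kurt": 0.732, "se_kurt": 0.433},
}

CUTOFF = 1.96  # assumed threshold (two-tailed 5%); not taken from the original


def z_score(statistic, standard_error):
    """Hypothetical helper: a statistic divided by its standard error."""
    return statistic / standard_error


z_scores = {}
for day, s in reported.items():
    z_skew = z_score(s["skew"], s["se_skew"])
    z_kurt = z_score(s["kurt"], s["se_kurt"])
    z_scores[day] = (z_skew, z_kurt)
    print(
        f"{day}: z(skewness) = {z_skew:.2f}, z(kurtosis) = {z_kurt:.2f}, "
        f"skewness beyond cutoff: {abs(z_skew) > CUTOFF}, "
        f"kurtosis beyond cutoff: {abs(z_kurt) > CUTOFF}"
    )
```

Under this assumed cutoff, the skewness z-scores for all three days fall beyond 1.96, while the Day 3 kurtosis z-score does not. With samples as large as these, such z-scores tend to flag even small departures from normality, so they are best read alongside the histograms and P-P plots rather than instead of them.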
For the variable Day 1, the median (1.7900) lies just to the left of the mean (1.7934); the two are approximately equal. The skewness of 8.865 nevertheless indicates that the distribution is positively skewed, and the kurtosis of 170.450 > 0 indicates that the curve is more peaked than the normal distribution, as the histogram of Day 1 shows. Values this large, together with a range of 20.00 against a standard deviation of 0.94, suggest that one or more extreme scores may be driving these statistics.

For the variable Day 2, the median (0.7900) lies to the left of the mean (0.9609), which indicates that the distribution is positively skewed; the skewness of 1.095 confirms this. The kurtosis of 0.822 > 0 indicates that the curve is more peaked than the normal distribution, as the histogram of Day 2 shows.

For the variable Day 3, the median (0.7600) lies to the left of the mean (0.9765), which indicates that the distribution is positively skewed; the skewness of 1.033 confirms this. The kurtosis of 0.732 > 0 indicates that the curve is more peaked than the normal distribution, as the histogram of Day 3 shows.

Yes: the histograms of Day 1, Day 2 and Day 3 show that the distributions follow an approximately, though not exactly, normal curve. Yes, the statistics match the visual impression from the plots.

6. Answer: The descriptive statistics:

| Descriptive Statistics | Percentage on SPSS exam | Percentage of lectures attended | Computer literacy | Numeracy |
|---|---|---|---|---|
| N Valid | 100 | 100 | 100 | 100 |
| N Missing | 0 | 0 | 0 | 0 |
| Mean | 58.10 | 59.765 | 50.71 | 4.85 |
| Median | 60.00 | 62.000 | 51.50 | 4.00 |
| Mode | 72a | 48.5a | 54 | 4 |
| Std. Deviation | 21.316 | 21.6848 | 8.260 | 2.706 |
| Variance | 454.354 | 470.230 | 68.228 | 7.321 |
| Skewness | -.107 | -.422 | -.174 | .961 |
| Std. Error of Skewness | .241 | .241 | .241 | .241 |
| Kurtosis | -1.105 | -.179 | .364 | .946 |
| Std. Error of Kurtosis | .478 | .478 | .478 | .478 |
| Range | 84 | 92.0 | 46 | 13 |

a. Multiple modes exist.
The smallest value is shown.

From the histogram of each variable we see that the distributions are not exactly normal but are approximately normal. All the variables except Numeracy are negatively skewed, while Numeracy is positively skewed, so the normality assumption appears to hold approximately for these data.

7. Answer: Test of Homogeneity of Variances

| | Levene Statistic | df1 | df2 | Sig. |
|---|---|---|---|---|
| Percentage of lectures attended | 1.731 | 1 | 98 | .191 |
| Computer literacy | .064 | 1 | 98 | .801 |

From Levene's test we clearly see that the p-values for percentage of lectures attended and computer literacy are 0.191 and 0.801. Neither is less than 0.05, so at the 5% level of significance there is no evidence that the variances differ, and they can be treated as equal.

8. Answer: As Levene's test shows, the p-values for percentage of lectures attended and computer literacy (0.191 and 0.801) are not less than 0.05 at the 5% level of significance, so the variances can be treated as equal. The assumption is not violated, so no corrective action is needed. Technically, one flaw in the analysis above is that all the variables were treated as independent variables, whereas the SPSS exam score depends on lecture attendance, computer literacy and numeracy. That treatment is slightly wrong, so the assumption is, technically, violated to a small degree.
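To make the Levene statistic behind answers 7 and 8 more concrete, here is a minimal pure-Python sketch of the mean-based version of the test. The function name `levene_statistic` and the two small groups are made up for illustration and are not the course data. The p-value, which SPSS reports as Sig., would come from an F distribution with (df1, df2) degrees of freedom and is not computed here.

```python
from statistics import mean


def levene_statistic(*groups):
    """Hypothetical helper: mean-based Levene W statistic with (df1, df2)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of each score from its own group mean.
    deviations = [[abs(x - mean(g)) for x in g] for g in groups]
    group_means = [mean(d) for d in deviations]
    grand_mean = sum(sum(d) for d in deviations) / n_total
    # Between-group and within-group sums of squares of the deviations.
    between = sum(
        len(d) * (m - grand_mean) ** 2 for d, m in zip(deviations, group_means)
    )
    within = sum(
        (z - m) ** 2 for d, m in zip(deviations, group_means) for z in d
    )
    df1, df2 = k - 1, n_total - k
    w = (df2 / df1) * between / within
    return w, df1, df2


# Made-up toy scores (NOT the course data), chosen so the spreads differ.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]

w, df1, df2 = levene_statistic(group_a, group_b)
print(f"W = {w:.3f}, df1 = {df1}, df2 = {df2}")
```

The same degrees-of-freedom rule is consistent with the table in answer 7: 100 cases split into two groups give df1 = 2 - 1 = 1 and df2 = 100 - 2 = 98.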

