# Elementary Statistical Methods STAT 30100

Purdue

This 25 page Class Notes was uploaded by Bailey Macejkovic on Saturday September 19, 2015. The Class Notes belongs to STAT 30100 at Purdue University taught by Celeste Furtner in Fall. Since its upload, it has received 212 views. For similar materials see /class/207935/stat-30100-purdue-university in Statistics at Purdue University.

Date Created: 09/19/15

**Stat 301 Review: Final**

The final will be broken down as follows:

- Approximately 50% new material from chapters 2, 10, 11, 8, and 9
- Approximately 50% old material from chapters 1, 2, 3, 7, 12, and 13

Here is a checklist broken down by section.

**Graphs**

- Know which graph to use given a word problem.
- Know how to describe your data based on a given graph:
  - Are there any outliers or gaps?
  - Is it symmetric, skewed left, or skewed right?
  - Is it unimodal or bimodal?
  - Where is the center of the distribution?

**Numerical Summaries**

- Know which numerical summaries are most useful based on the shape of the distribution of your data.
- Know which numerical summaries work best together.
- Understand the concept of a resistant measure: know the definition as well as the measures which are resistant.

**Data collection**

Vocabulary and concepts:

- Anecdotal evidence
- Available data
- Unit
- Population
- Sample
- Census
- Observational study versus experiment
- Experimental unit
- Subjects
- Treatments
- Factors
- Factor levels
- Placebo
- Control group
- Statistical significance
- Three principles of experimental design
- Know how to randomize
- Problems versus advantages of experiments
- Non-random sampling
- Random sampling
- Sampling bias
- Undercoverage
- Nonresponse
- Response bias
- Parameter
- Statistic
- Sampling variability
- Sampling distribution of a statistic
- How population size affects the sampling variability of a statistic
- Ethics of doing experiments with humans and animals

**Experimental Designs**

Do not just study the definitions of these three designs. You will need to be able to read a problem and determine which type of design was used. You will also need to know how to diagram the design.

- Completely randomized design
- Randomized block design
- Matched pairs

**Sampling Designs**

Do not just study the definitions of these designs. You will need to be able to read a problem and determine which type of sampling was used.

- Voluntary response sample
- Simple random sample
- Stratified random sample
- Multistage sample
- Capture-recapture sample

**Ch 7**
- What kind of stories and graphs go with a t-test or confidence interval for the one-sample mean, matched pairs, and two-sample comparison of means.
- When it is better to calculate a confidence interval versus conduct a hypothesis test.

**Ch 12**

- What kind of stories and graphs go with a one-way ANOVA problem.

**Ch 13**

- What kind of stories and graphs go with a two-way ANOVA problem.

**Ch 8**

- Know how to do confidence intervals for both one- and two-sample proportion problems.
- Know how to do hypothesis tests for both one- and two-sample proportion problems.
- Know when it is appropriate to use the formulas in these chapters.

**Ch 9 and Section 2.5**

- Given a two-way table, find the joint distribution of categorical variables.
- Given a two-way table, find the marginal distribution of categorical variables.
- Given a two-way table, find the conditional distribution of categorical variables.
- Given a two-way table, find the joint, marginal, and conditional probabilities.
- Relationship between a $\chi^2$ test and a two-sample proportion test.
- Do a hypothesis test for a $\chi^2$ test.
- Know when it is appropriate to use a $\chi^2$ test.

**Ch 2 and 10**

- Know how to interpret a Normal probability plot, scatterplot, and residual plot.
- Use SPSS output to find the following: least-squares regression line, correlation, $r^2$, and estimate for $\sigma$.
- Find the predicted response and residual for one of the sets of data.
- Use SPSS to find the prediction interval.
- Hypothesis test for the regression slope: state the null and alternative hypotheses, obtain the test statistic and P-value from SPSS output, and state your conclusions in terms of the problem.
- Test for zero population correlation: state the null and alternative hypotheses, calculate the test statistic, find the P-value, and state your conclusions in terms of the problem.
- Outlier versus influential variables
- Common response versus confounding
- Causation

**Ch 11**

- Use SPSS output to find the following: least-squares regression line, correlation, $r^2$, and estimate for $\sigma$.
- Use the least-squares regression line for prediction.
- The F test: state the null and alternative hypotheses, calculate the test statistic, find the P-value from the SPSS output, and state your conclusions in terms of the problem.
- Know how to determine which explanatory variables should be included in a model (significance tests for $\beta_i$).

**Ch 6-13**

ALL hypothesis tests and confidence intervals give you information about the POPULATION parameter. When writing your conclusion to a hypothesis test, be sure to include the word "population".

**1-sample proportion** (one percent or proportion; categorical data)

- Hypotheses: $H_0: p = p_0$ versus $H_a: p > p_0$, $H_a: p < p_0$, or $H_a: p \neq p_0$
- Test statistic: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$, where $\hat{p} = X/n$
- To find the $z^*$ value, look at the last row of the t table.
- P-value: for $H_a: p > p_0$ use $P(Z > z)$; for $H_a: p < p_0$ use $P(Z < z)$; for $H_a: p \neq p_0$ use $2P(Z > |z|)$. Look up P-values on the Normal table.

**2-sample proportion** (two percents or proportions are compared; categorical data)

- Hypotheses: $H_0: p_1 = p_2$ versus $H_a: p_1 > p_2$, $H_a: p_1 < p_2$, or $H_a: p_1 \neq p_2$
- Test statistic: $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $\hat{p}_1 = X_1/n_1$ and $\hat{p}_2 = X_2/n_2$. Note: $\hat{p} = \dfrac{X_1 + X_2}{n_1 + n_2}$.
- To find the $z^*$ value, look at the last row of the t table.
- P-value: for $H_a: p_1 > p_2$ use $P(Z > z)$; for $H_a: p_1 < p_2$ use $P(Z < z)$; for $H_a: p_1 \neq p_2$ use $2P(Z > |z|)$. Look up P-values on Table A.

**$\chi^2$ test** (two categorical variables are compared; categorical data)

- Know how to calculate marginal, joint, and conditional distributions/percentages.
- Know when it is appropriate to use the chi-square test (check the footnote below the output).
- Hypotheses: $H_0$: There is no relationship between A and B. $H_a$: There is a relationship between A and B.
- Test statistic: read the $\chi^2$ value from the printout.
- P-value: read the P-value from the printout.

The problems below have been taken from old finals.

**MATCHING:** For problems 1-10, write the letter of the most appropriate statistical analysis technique next to the story. Note: each answer choice may be used once, more than once, or not at all.

- A. Mean and/or standard deviation
- B. Five-number summary
- C. Simple linear regression
- D. Multiple linear regression
- E. 1-sample mean t-test
- F. Matched pairs t-test
- G. 2-sample comparison of means t-test
- H. 1-sample proportion Z-test
- I. 2-sample proportion Z-test
- J. Chi-squared test
- K. One-way ANOVA
- L. Two-way ANOVA

1. Is there a significant average difference between Wednesday and Saturday gas prices if we check these 20 stations on both days?
2. What is the median gas price for Lafayette gas stations?
3. Does the number of insurgent attacks in the war in Iraq affect gas prices on a weekly basis?
4. Will the percentage of people traveling by plane be higher on Memorial Day weekend or Labor Day weekend?
5. Do region of the country and weather forecast (sunny, cloudy, rainy) have an effect on the population average grocery bill for households on Memorial Day weekend?
6. Are region of the country and size of vehicle (small car, large car, truck, SUV) associated?
7. Is there a significant difference between the average Indiana gas price and the average California gas price today if 20 stations in each state are sampled?
8. Is there a difference in the average number of times a month a driver fills up his tank for drivers of small cars, large cars, trucks, and SUVs?
9. I want to predict the number of people who will travel on Memorial Day this year by looking at gas prices, temperatures, unemployment rates, consumer price indices, and presidential approval percentages over the past 30 years.
10. Is the average gas price for Indiana stations last Wednesday less than $2.15?

For questions 11-15, choose the letter for the graph listed below which would be appropriate for answering the questions. Each letter may be used once, more than once, or not at all.

- A. Scatterplot
- B. Side-by-side boxplots
- C. Histogram
- D. Pie chart

11. What is the percentage of Indiana vehicles which are small passenger cars, large passenger cars, trucks, SUVs, and other?
12. Is there much difference between the gas mileage of small passenger cars, large passenger cars, trucks, and SUVs?
13. Are gas prices and daily high temperature independent?
14. Is there a negative association between the number of hybrid cars registered to a state and the number of people who voted for George W. Bush in the election?
15. Is the distribution of people per state who own hybrid cars symmetric or skewed?

**16.** Alex is a homeowner and is concerned about heating costs. He feels the outside temperature has an impact on the amount of gas used to heat his house. So he looks on the website www.weather.com, finds the temperatures for each day, and determines the average degree-days per month. He finds his heating bill and records the gas consumption for each month. Below is a record of the results and the output after he entered the data into SPSS.

| Month | Oct | Nov | Dec | Jan | Feb | Mar | Apr | May | June |
|---|---|---|---|---|---|---|---|---|---|
| Degree-days | 16.1 | 26.2 | 37.0 | 40.9 | 30.6 | 15.5 | 10.8 | 7.9 | 0.0 |
| Gas consumption | 5.0 | 6.1 | 8.4 | 10.1 | 8.0 | 4.3 | 3 | 2.5 | 1.1 |

[Scatterplot: Gas consumption versus Degree-days]

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .991(a) | .983 | .980 | .4162 |

ANOVA(b)

| Model 1 | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 68.990 | 1 | 68.990 | 398.345 | .000(a) |
| Residual | 1.212 | 7 | .173 | | |
| Total | 70.202 | 8 | | | |

a. Predictors: (Constant), Degree-days
b. Dependent Variable: Gas consumption

Coefficients(a)

| Model 1 | B | Std. Error | Beta | t | Sig. | 95% CI for B, Lower Bound | 95% CI for B, Upper Bound |
|---|---|---|---|---|---|---|---|
| (Constant) | 1.094 | .258 | | 4.235 | .004 | .483 | 1.705 |
| Degree-days | .212 | .011 | .991 | 19.959 | .000 | .187 | .237 |

a. Dependent Variable: Gas consumption

- a. What is the explanatory variable?
- b. What is the response variable?
- c. Describe the form, strength, and direction of the relationship.
- d. What is the equation of the least-squares regression line for the heating season?
- e. What is the predicted gas consumption when degree-days is 30.6?
- f. Find the residual value when degree-days is 30.6.
- g. How much of the variation in gas consumption is explained by the least-squares regression?
- h. Do a test to determine if there is a linear relationship between degree-days and gas consumption. State your hypotheses, test statistic, P-value, and your conclusion in terms of the story.

**17.** As an avid supporter of Purdue's football team, Pete wants to do a little analysis.

[SPSS output (Model Summary, ANOVA, Coefficients) for using Attendance at Game to predict Points Purdue Scored; the values are illegible in the source.]

- a. What is the explanatory variable?
- b. What is the response variable?
- c. Describe the form, strength, and direction of the relationship.
- d. What is the equation of the least-squares regression line for the number of points scored?
- e. What is the predicted number of points scored when the attendance is 56,400?
- f. When the attendance was 56,400, Purdue scored 31 points. What is its residual?
- g. How much of the variation in number of points scored by Purdue is explained by the least-squares regression?
- h. Do a test to determine if there is a negative linear relationship between attendance at games and number of points scored by Purdue. State your hypotheses, test statistic, P-value, and your conclusion in terms of the story.

**18.** After thinking some more, Pete thought there could be other variables that might affect the number of points Purdue scored. One variable of interest is the number of points the opponent scores. He added this variable to his analysis and did a multiple regression.

- a. Using the output on the next four pages, what is the best equation of a line for predicting the number of points Purdue scored in a game? (Use $\alpha$ = 01; the decimal point is not legible in the source.)
- b. Give 4 reasons for why you made that choice.

[Correlations table for Points Purdue Scored, Attendance at Game, and Points Opponents Scored, with rows for Sig. (2-tailed) and N; the values are illegible in the source. Footnote: Correlation is significant at the 0.05 level (2-tailed).]

[Scatterplots with horizontal axes Attendance at Game and Points Opponents Scored; the plotted points are not recoverable.]

SPSS output for using POINTS OPPONENTS SCORED and ATTENDANCE AT GAME to predict POINTS PURDUE SCORED:
Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .611(a) | .374 | .269 | 11.474 |

a. Predictors: (Constant), Points Opponents Scored, Attendance at Game

ANOVA(b)

| Model 1 | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 943.174 | 2 | 471.587 | 3.582 | .060(a) |
| Residual | 1579.759 | 12 | 131.647 | | |
| Total | 2522.933 | 14 | | | |

a. Predictors: (Constant), Points Opponents Scored, Attendance at Game
b. Dependent Variable: Points Purdue Scored

Coefficients(a) (signs and the Sig. column are not legible in the source)

| Model 1 | B | Std. Error | Beta | t |
|---|---|---|---|---|
| (Constant) | 55.997 | 13.704 | | 4.086 |
| Attendance at Game | 3.99E-04 | .000 | .614 | 2.656 |
| Points Opponents Scored | 2.65E-02 | .286 | .021 | .093 |

a. Dependent Variable: Points Purdue Scored

[SPSS output for using just ATTENDANCE AT GAME to predict POINTS PURDUE SCORED; the values are illegible in the source.]

SPSS output for using just POINTS OPPONENTS SCORED to predict POINTS PURDUE SCORED:

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .075(a) | .006 | -.071 | 13.892 |

a. Predictors: (Constant), Points Opponents Scored

ANOVA(b)

| Model 1 | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 14.204 | 1 | 14.204 | .074 | .790(a) |
| Residual | 2508.729 | 13 | 192.979 | | |
| Total | 2522.933 | 14 | | | |

a. Predictors: (Constant), Points Opponents Scored
b. Dependent Variable: Points Purdue Scored

Coefficients(a)

| Model 1 | B | Std. Error | Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | 24.931 | 8.649 | | 2.882 | .013 |
| Points Opponents Scored | 9.284E-02 | .342 | .075 | .271 | .790 |

a. Dependent Variable: Points Purdue Scored

**19.** An environmental health professor conducted a study to see whether fast-food workers wearing gloves actually lowers the chance that customers will come down with food poisoning. The scientists purchased 371 tortillas from several local fast-food restaurants, noting whether the workers were wearing gloves or not. 190 of the tortillas came from bare-hands restaurants; 181 of the tortillas came from glove-wearing restaurants. The scientists then tested the tortillas purchased for microbe growth. They found that the bare-hands restaurants' tortillas gave rise to microbe growth on 18 tortillas, and the glove-wearing restaurants' tortillas gave rise to microbe growth on only 8 tortillas. Is the glove-wearing restaurants' tortilla microbe growth significantly lower than the bare-hands restaurants' microbe growth at the 5% significance level?

1. State your hypotheses for this test.
2. Calculate your test statistic.
3. Find your P-value.
4. State your conclusion in terms of the story.

**20.** In a 1984 survey of licensed drivers in Wisconsin, 214 of 1200 men said that they did not drink alcohol. Construct a 95% confidence interval for the proportion of men who said that they did not drink alcohol. Is your confidence interval calculation reasonable? Why?

**21.** Below is the SPSS output for a study of alcohol and nicotine consumption among 452 pregnant women. Nicotine consumption is divided into 3 categories and alcohol consumption is divided into 4 categories. Answer the questions below based on the output that follows.

- a. What proportion of the non-alcohol-consuming women do not smoke during pregnancy? Is this a joint, marginal, or conditional probability?
- b. What proportion of women do not smoke and do not consume alcohol during pregnancy? Is this a joint, marginal, or conditional probability?
- c. Find the marginal distribution for alcohol consumption during pregnancy.
- d. State the null and alternative hypotheses to test whether there is a relationship between alcohol consumption and smoking during pregnancy.
- e. What are the test statistic and P-value used to test the hypotheses in part d?
- f. State your conclusions in terms of the original problem.
- g. Are your results for the above test valid? Explain your answer.

Alcohol * Nicotine Crosstabulation (Count)

| Alcohol | Nicotine 1-15 | Nicotine 16 or more | Nicotine None | Total |
|---|---|---|---|---|
| 0.01-0.10 | 5 | 13 | 58 | 76 |
| 0.11-0.99 | 37 | 42 | 84 | 163 |
| 1.0 or more | 16 | 17 | 57 | 90 |
| None | 7 | 11 | 105 | 123 |
| Total | 65 | 83 | 304 | 452 |

Note: Nicotine is measured in milligrams/day and alcohol in ounces per day.

Alcohol * Nicotine Crosstabulation

| Alcohol | | Nicotine 1-15 | Nicotine 16 or more | Nicotine None | Total |
|---|---|---|---|---|---|
| 0.01-0.10 | Count | 5 | 13 | 58 | 76 |
| | Expected Count | 10.9 | 14.0 | 51.1 | 76.0 |
| | % of Total | 1.1% | 2.9% | 12.8% | 16.8% |
| 0.11-0.99 | Count | 37 | 42 | 84 | 163 |
| | Expected Count | 23.4 | 29.9 | 109.6 | 163.0 |
| | % of Total | 8.2% | 9.3% | 18.6% | 36.1% |
| 1.0 or more | Count | 16 | 17 | 57 | 90 |
| | Expected Count | 12.9 | 16.5 | 60.5 | 90.0 |
| | % of Total | 3.5% | 3.8% | 12.6% | 19.9% |
| None | Count | 7 | 11 | 105 | 123 |
| | Expected Count | 17.7 | 22.6 | 82.7 | 123.0 |
| | % of Total | 1.5% | 2.4% | 23.2% | 27.2% |
| Total | Count | 65 | 83 | 304 | 452 |
| | Expected Count | 65.0 | 83.0 | 304.0 | 452.0 |
| | % of Total | 14.4% | 18.4% | 67.3% | 100.0% |

Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 42.252(a) | 6 | .000 |
| Likelihood Ratio | 44.653 | 6 | .000 |
| N of Valid Cases | 452 | | |

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 10.93.

**Multiple Choice:** Circle the letter of the correct answer and write its letter in the blank next to each story.

22. Does bread lose its vitamins when stored? Twenty small loaves of bread were randomly assigned to one of four storage times: one, two, three, or four days. After the bread had been stored for its respective amount of days, its vitamin C content was measured. This is an example of a
    - A. simple random sample
    - B. completely randomized design
    - C. randomized block design
    - D. matched pairs design
    - E. stratified random sample
23. The department of health wanted to know how many people received flu shots this year. They thought that females were more likely to get a shot, so they randomly selected 500 males and 500 females in Lafayette and West Lafayette to survey. This is an example of a
    - A. simple random sample
    - B. completely randomized design
    - C. randomized block design
    - D. matched pairs design
    - E. stratified random sample
24. Which of the following is a potential way to reduce sampling variability?
    - A. Increase your sample size
    - B. Decrease your sample size
    - C. Increase your population size
    - D. Decrease your population size

For questions 25-27, choose the letter for the type of bias listed below which is a problem in the story.

- A. Undercoverage
- B. Nonresponse
- C. Response bias

25. John wanted to find out people's opinions regarding Greater Lafayette Health Services' desire to build a new hospital. Consequently, he took a simple random sample of 500 Lafayette and West Lafayette residents listed in the phone book. He is concerned, however, that those not listed in the phone book may have different views. What type of bias is he concerned about?
26. When John attempted to collect data from those who made it into his sample, he was unable to contact some of them and others refused to answer his survey questions. What type of bias could this produce?
27. John was pleased with the unanimous response to his survey question, which read: "Do you believe that building a new hospital is a waste of resources and will leave two perfectly good buildings vacant?" What type of bias could his survey question be producing?

For questions 28-31, choose the letter for the graph listed below which would be appropriate for answering the questions. Each letter may be used once, more than once, or not at all.

- A. Scatterplot
- B. Side-by-side boxplot
- C. Histogram
- D. Bar graph

28. Compare the percentage of Lafayette residents who feel that a new hospital should be built with the percentage that don't feel that a new hospital should be built and the percentage who don't care.
29. Is the distribution of people's ages who feel a new hospital should be built in Lafayette symmetric or skewed?
30. Is there a positive association between the age and number of times a Lafayette resident visits one of the hospitals in a year?
31. Is there a difference in the average number of hospital visits per year between Lafayette residents that would like to see a new hospital built and those who would not or don't care?

**MATCHING:** For problems 32-41, write the letter of the most appropriate statistical analysis technique next to the story. Note: each answer choice may be used once, more than once, or not at all.

- A. Mean and/or standard deviation
- B. Five-number summary
- C. Simple linear regression
- D. Multiple linear regression
- E. 1-sample mean t-test
- F. Matched pairs t-test
- G. 2-sample comparison of means t-test
- H. 1-sample proportion Z-test
- I. 2-sample proportion Z-test
- J. Chi-squared test
- K. One-way ANOVA
- L. Two-way ANOVA

32. As the outdoor temperature (in degrees) increases, do ice cream sales (in dollars) increase at the Silver Dipper?
33. Is there a significant average difference between soft-serve and hard-packed ice cream if we check the prices of both at 20 different ice cream parlors?
34. Do high school students spend more money on ice cream on average than college students?
35. Is the average number of scoops of ice cream a person eats in a summer week less than 5?
36. Does a person's favorite flavor (triple chocolate, chunky monkey, or vanilla) or residential proximity to an ice cream parlor (reported only as less than 1 mile, between 1 and 5 miles, or more than 5 miles) or their interaction have an effect on the amount of money a person spends on ice cream in a summer?
37. Can a person's age, residential proximity to an ice cream parlor (reported in miles), and IQ do a good job of predicting how many ice cream cones that person will eat in a summer?
38. Is there a significant difference between how many ice cream cones a year, on average, freshmen, sophomores, juniors, and seniors eat?
39. What is the maximum price for ice cream cones if I look at prices of single-scoop cones from 25 different stores?
40. Is there a relationship between a person's favorite flavor of ice cream (triple chocolate, chunky monkey, or vanilla) and their gender?
41. Is the percentage of men who like triple chocolate ice cream the best higher than the percentage of women who like triple chocolate ice cream the best?

**42.** [The problem statement is illegible in the source.]

- a. State the hypotheses for this test.
- b. Calculate the test statistic.
- c. Find the P-value.
- d. State your conclusion in terms of the story.
- e. On the curve below, insert and clearly label [illegible] and the P-value for the hypothesis test in parts a through d above.

**43.** Is there a relationship between cigarette smoking and drinking? With the recent proposed ordinance for West Lafayette piquing his interest, an interested citizen selected an SRS of Purdue students. [The rest of the problem statement is illegible in the source.]

Alcoholic drinks per week * Cigarettes smoked per day Crosstabulation (Count)

| Alcoholic drinks per week | Cigarettes None | Cigarettes 1-10 | Cigarettes 11-20 | Cigarettes 21+ | Total |
|---|---|---|---|---|---|
| None | 185 | 90 | 98 | 43 | 416 |
| 1-2 | 64 | 50 | 45 | 19 | 178 |
| 3-5 | 57 | 37 | 40 | 25 | 159 |
| 6+ | 89 | 57 | 61 | 40 | 247 |
| Total | 395 | 234 | 244 | 127 | 1000 |

Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 12.845(a) | 9 | .170 |
| Likelihood Ratio | 12.598 | 9 | .182 |
| N of Valid Cases | 1000 | | |

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 20.19.

- a. If a student has no drinks per week, what is the probability he/she smokes no cigarettes? Is this a joint, marginal, or conditional probability?
- b. Determine if the amount of drinking and cigarette smoking are related. State your hypotheses, your test statistic, your P-value, and your conclusion in terms of the story.
- c. Was it appropriate to do the test in part b? Justify your answer.

**STAT 301 REVIEW FOR EXAM 1: ANSWERS**

The problem numbers are not legible in the source; the answers are listed in their original order.

- a. Stemplot: 2 | 5 7 8 8 8 9 9 9; 3 | 0 1 2 4 6 9; 4 | 2 5; 5 | 2. b. Skewed right and unimodal. c. 25, 28, 30, 37.5, 52. d. Yes, 52 is an outlier. e. Modified boxplot. f. Median and IQR, since they are resistant to outliers and work best with skewed distributions.
- a. Sports cars. b. Small cars.
- a. Stemplot with split stems: 1 | 0 2; 1 | 7 9 9; 2 | 0 1 2 4; 2 | 5 6 6; 3 | 0 4. b. Symmetric. c. M = 21.5, IQR = 7.
- a. A sophomore from Harrison High School. b. All 800 sophomores at Harrison High School. c. The 150 selected sophomores from Harrison High School. d. GPA (numeric); whether the student took the SAT as a sophomore (categorical). e. SRS.
- No, you do not know the severity/types of the surgeries performed at each hospital.
- c, e
- a. All preschool children. b. The 68 preschool children. c. The amount of improvement; integers ranging from 4 to 9. d. Experiment: a treatment (6 months of piano lessons or 6 months of computer lessons) was imposed on the subjects. e. Completely randomized design: random assignment into Group 1 (34 kids, Treatment 1: piano lessons) and Group 2 (34 kids, Treatment 2: computer lessons), then compare.
- No, the total number of people voting in the elections was greater than in the past. It would be more informative to compare the % of votes won by presidential candidates.
- a. 1. b. 3. c. [illegible]
- a. 121 to 205. b. Almost zero. c. 0.0392. d. 18988 pounds.
- a. 1151. b. 1820 hours. c. 0.9948.
- e. Statistical quality control only pays attention to the internal state of the process, whereas capability refers to the ability of the process to meet or exceed the external requirements placed on it.
- Center line 29, LCL 28.53, UCL 29.47.
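The two tests that recur in the practice problems can be checked by hand with a few lines of code. The following is a minimal sketch, not part of the original handout: it applies the pooled two-sample proportion z statistic to the tortilla counts from Problem 19 (18 of 190 bare-hands, 8 of 181 glove-wearing) and recomputes the Pearson chi-square statistic for the drinking and smoking table from Problem 43. The function names are hypothetical, and the Normal tail probability is computed with `math.erfc` instead of being looked up in Table A.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2 (illustrative helper)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # p-hat = (X1 + X2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se


def upper_tail_p(z):
    """P(Z > z) for a standard Normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def chi_square_statistic(table):
    """Pearson chi-square: sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    # Problem 19: Ha is p_bare > p_glove, so the P-value is the upper tail.
    z = two_proportion_z(18, 190, 8, 181)
    print(f"z = {z:.3f}, one-sided P-value = {upper_tail_p(z):.4f}")

    # Problem 43: rows are drinks per week, columns are cigarettes per day.
    smoking_by_drinking = [
        [185, 90, 98, 43],
        [64, 50, 45, 19],
        [57, 37, 40, 25],
        [89, 57, 61, 40],
    ]
    print(f"chi-square = {chi_square_statistic(smoking_by_drinking):.3f}")
```

Run as a script, this gives a z statistic of about 1.91 with a one-sided P-value of about 0.028 for Problem 19, and a chi-square value that agrees with the 12.845 printed in the SPSS output for Problem 43. The chi-square P-value is left to the printout, as the formula sheet instructs.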
