# 206 Review Sheet for STAT 30100 with Professor Wang at Purdue

This 24 page Class Notes was uploaded by an elite notetaker on Friday February 6, 2015. The Class Notes belongs to a course at Purdue University taught by a professor in Fall. Since its upload, it has received 16 views.

Date Created: 02/06/15

Stat 301 Review: Final

The final will be broken down as follows:

- Approximately 50% new material
- Approximately 50% old material from chapters 1, 2, 3, 7, 8, and 9

Here is a checklist broken down by section.

**Graphs**

- Know which graph to use given a word problem.
- Know how to describe your data based on a given graph. Are there any outliers or gaps? Is it symmetric, skewed left, or skewed right? Is it unimodal or bimodal? Where is the center of the distribution?

**Numerical Summaries**

- Know which numerical summaries are most useful based on the shape of the distribution of your data.
- Know which numerical summaries work best together.
- Understand the concept of a resistant measure: know the definition as well as the measures which are resistant.

**Data collection**

Vocabulary/concepts: anecdotal evidence; available data; unit; population; sample; census; observational study versus experiment; experimental unit; subjects; treatments; factors; factor levels; placebo; control group; statistical significance; three principles of experimental design; know how to randomize; problems versus advantages of experiments; non-random sampling; random sampling; sampling bias; undercoverage; nonresponse; response bias; parameter; statistics; sampling variability; sampling distribution of a statistic; unbiased estimator; how population size affects the sampling variability of a statistic.

**Experimental Designs**

Do not just study the definitions of these three designs. You will need to be able to read a problem and determine which type of design was used. You will also need to know how to diagram the design.

- Completely randomized design
- Randomized block design
- Matched pairs

**Sampling Designs**

Do not just study the definitions of these designs. You will need to be able to read a problem and determine which type of sampling was used.

- Voluntary response sample
- Simple random sample
- Stratified random sample
- Multistage sample

**Ch. 7**

- What kind of stories and graphs go with a t-test/confidence interval for the
one-sample mean, matched pairs, and 2-sample comparison of means?
- When it is better to calculate a confidence interval versus conduct a hypothesis test.

**Ch. 12**

- What kind of stories and graphs go with a one-way ANOVA problem?

**Ch. 13**

- What kind of stories and graphs go with a two-way ANOVA problem?

**Ch. 8**

- Know how to do confidence intervals for both one- and two-sample proportion problems.
- Know how to do hypothesis tests for both one- and two-sample proportion problems.
- Know when it is appropriate to use the formulas in these chapters.

**Ch. 9 and Section 2.5**

- Given a two-way table, find the joint distribution of categorical variables.
- Given a two-way table, find the marginal distribution of categorical variables.
- Given a two-way table, find the conditional distribution of categorical variables.
- Given a two-way table, find the joint, marginal, and conditional probabilities.
- Relationship between a $\chi^2$ test and a two-sample proportion test.
- Do a hypothesis test for a $\chi^2$ test.
- Know when it is appropriate to use a $\chi^2$ test.

**Ch. 2 and 10**

- Know how to interpret a normal probability plot, scatterplot, and residual plot.
- Use SPSS output to find the following: least-squares regression line, correlation, $r^2$, and estimate for $\sigma$.
- Find the residual for one of the sets of data.
- Use SPSS to find the confidence interval for the regression slope and intercept.
- Hypothesis test for the regression slope: state the null and alternative hypotheses, obtain the test statistic and P-value from SPSS output, and state your conclusions in terms of the problem.
- Test for zero population correlation: state the null and alternative hypotheses, calculate the test statistic and find the P-value, and state your conclusions in terms of the problem.
- Outlier versus influential variables
- Common response versus confounding
- Causation

**Ch. 11**

- Use SPSS output to find the following: least-squares regression line, correlation, $r^2$, and estimate for $\sigma$.
- Use the least-squares regression line for prediction.
- The F test: state the null and alternative hypotheses, calculate the test statistic and
find the P-value from the SPSS output, and state your conclusions in terms of the problem.
- Know how to determine which explanatory variables should be included in a model (significance tests for $\beta_j$).
- Know how to find the confidence interval for $\beta_i$.

**1-sample proportion**

- One percent or proportion
- Categorical data
- Confidence interval: uses $\hat{p} = x/n$. To find the $z^*$ value, look at the last row of the t table.
- Hypotheses: $H_0: p = p_0$ versus $H_a: p > p_0$, $H_a: p < p_0$, or $H_a: p \ne p_0$
- P-value: for $H_a: p > p_0$ use $P(Z > z)$; for $H_a: p < p_0$ use $P(Z < z)$; for $H_a: p \ne p_0$ use $2P(Z > |z|)$. Look up P-values on the Normal table.

**2-sample proportion**

- Two percents or proportions are compared
- Categorical data
- Confidence interval: uses $\hat{p}_1 = x_1/n_1$ and $\hat{p}_2 = x_2/n_2$. To find the $z^*$ value, look at the last row of the t table.
- Hypotheses: $H_0: p_1 = p_2$ versus $H_a: p_1 > p_2$, $H_a: p_1 < p_2$, or $H_a: p_1 \ne p_2$
- Test statistic note: $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$
- P-value: for $H_a: p_1 > p_2$ use $P(Z > z)$; for $H_a: p_1 < p_2$ use $P(Z < z)$; for $H_a: p_1 \ne p_2$ use $2P(Z > |z|)$. Look up P-values on Table A.

**$\chi^2$ test**

- Two categorical variables are compared
- Categorical data
- Confidence interval: none
- Hypotheses: $H_0$: There is no relationship between A and B. $H_a$: There is a relationship between A and B.
- Test statistic: read the $\chi^2$ value from the printout.
- P-value: read the P-value from the printout.

The problems below have been taken from old finals.

MATCHING: For problems 1-10, write the letter of the most appropriate statistical analysis technique next to the story. Note: each answer choice may be used once, more than once, or not at all.

- A. Mean and/or standard deviation
- B. Five number summary
- C. Simple linear regression
- D. Multiple linear regression
- E. 1-sample mean t-test
- F. Matched pairs t-test
- G. 2-sample comparison of means t-test
- H. 1-sample proportion Z-test
- I. 2-sample proportion Z-test
- J. Chi-squared test
- K. One-way ANOVA
- L. Two-way ANOVA

1. Is there a significant average difference between Wednesday and Saturday gas prices if we check these 20 stations on both days?
2. What is the median gas price for Lafayette gas stations?
3. Does the number of insurgent attacks in the war in Iraq affect gas prices on a weekly basis?
4. Will the percentage of people traveling by plane be higher on Memorial Day weekend or Labor Day weekend?
5. Do region of the country and size of vehicle (small car, large car, truck, SUV) have an effect on the number of people traveling over Memorial Day weekend?
6. Are region of the country and size of vehicle (small car, large car, truck, SUV) associated?
7. Is there a significant difference between the average Indiana gas price and the average California gas price today if 20 stations in each state are sampled?
8. Is there a difference in the average number of times a month a driver fills up his tank for drivers of small cars, large cars, trucks, and SUVs?
9. I want to predict the number of people who will travel on Memorial Day this year by looking at gas prices, temperatures, unemployment rates, consumer price indices, and presidential approval percentages over the past 30 years.
10. Is the average gas price for Indiana stations last Wednesday less than \$2.15?

For questions 11-15, choose the letter for the graph listed below which would be appropriate for answering the questions. Each letter may be used once, more than once, or not at all.

- A. Scatterplot
- B. Side-by-side boxplots
- C. Histogram
- D. Pie chart

11. What is the percentage of Indiana vehicles which are small passenger cars, large passenger cars, trucks, SUVs, and other?
12. Is there much difference between the gas mileage of small passenger cars, large passenger cars, trucks, and SUVs?
13. Are gas prices and daily high temperature independent?
14. Is there a negative association between the number of hybrid cars registered to a state and the number of people who voted for George W. Bush in the election?
15. Is the distribution of people per state who own hybrid cars symmetric or skewed?

16. Alex is a homeowner and is concerned about heating costs. He feels the outside temperature has an impact on the amount of gas used to heat his house. So he looks on the website www.weather.com and finds the temperatures for each day and determines the average degree-days per month. He finds his heating bill and records the gas consumption for each month. Below is a record of the results and the output
after he entered the data into SPSS.

| Month | Oct | Nov | Dec | Jan | Feb | Mar | Apr | May | June |
|---|---|---|---|---|---|---|---|---|---|
| Degree-days | 16.1 | 26.2 | 37.0 | 40.9 | 30.6 | 15.5 | 10.8 | 7.9 | 0.0 |
| Gas consumption | 5.0 | 6.1 | 8.4 | 10.1 | 8.0 | 4.3 | 3.5 | 2.5 | 1.1 |

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .991 (a) | .983 | .980 | .4162 |

ANOVA (b)

| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | 68.990 | 1 | 68.990 | 398.345 | .000 (a) |
| | Residual | 1.212 | 7 | .173 | | |
| | Total | 70.202 | 8 | | | |

a. Predictors: (Constant), Degree-days. b. Dependent Variable: Gas consumption.

Coefficients (a)

| Model | Term | B | Std. Error | Beta | t | Sig. | 95% CI for B, Lower Bound | 95% CI for B, Upper Bound |
|---|---|---|---|---|---|---|---|---|
| 1 | (Constant) | 1.094 | .258 | | 4.235 | .004 | .483 | 1.705 |
| | Degree-days | .212 | .011 | .991 | 19.959 | .000 | .187 | .237 |

a. Dependent Variable: Gas consumption.

   - (a) What is the explanatory variable?
   - (b) What is the response variable?
   - (c) Describe the form, strength, and direction of the relationship.
   - (d) What is the equation of the least-squares regression line for the heating season?
   - (e) What is the predicted gas consumption when degree-days is 30.6?
   - (f) Find the residual value when degree-days is 30.6.
   - (g) How much of the variation in gas consumption is explained by the least-squares regression?
   - (h) What is the 95% confidence interval for the regression coefficient of degree-days?
   - (i) What is the 99% confidence interval for the regression coefficient of degree-days?
   - (j) Do a test to determine if there is a linear relationship between degree-days and gas consumption. State your hypotheses, test statistic, P-value, and your conclusion in terms of the story.

17. As a supporter of Purdue's football team, Pete wants to do a little analysis. He took a random sample of 15 games from the last three seasons. He thinks that the number of fans at each game may affect the number of points Purdue scores. The output from his analysis is below.

[SPSS output (Model Summary, ANOVA, and Coefficients tables) for predicting Points Purdue Scored from Attendance at Game; most values are illegible. Legible values: R = .611, R Square = .373, Adjusted R Square = .325, residual df = 13, total sum of squares = 2522.933 with df = 14, (Constant) B = 55.213 with t = 5.254.]

   - (a) What is the explanatory variable?
   - (b) What is the response variable?
   - (c) Describe the form, strength, and direction of the relationship.
   - (d) What is the equation of the least-squares regression line for the number of points scored?
   - (e) What is the predicted number of points scored when the attendance is 56400?
   - (f) When the attendance was 56400, Purdue scored 31 points. What is its residual?
   - (g) How much of the variation in number of points scored by Purdue is explained by the least-squares regression?
   - (h) What is the 95% confidence interval for the regression coefficient of attendance at games?
   - (i) Do a test to determine if there is a negative linear relationship between attendance at games and number of points scored by Purdue. State your hypotheses, test statistic, P-value, and your conclusion in terms of the story.

18. After thinking some more, Pete thought there could be other variables that might affect the number of points Purdue scored. One variable of interest is the number of points the opponent scores. He added this variable to his analysis and did a multiple regression.

   - (a) Using the output on the next four pages, what is the best equation of a line for predicting the number of points Purdue scored in a game (use $\alpha = 0.1$)?
   - (b) Give 4 reasons for why you made that choice.

Correlations

| | | Points Purdue Scored | Attendance at Game | Points Opponents Scored |
|---|---|---|---|---|
| Points Purdue Scored | Pearson Correlation | 1 | -.611* | .075 |
| | Sig. (2-tailed) | | .016 | .790 |
| | N | 15 | 15 | 15 |
| Attendance at Game | Pearson Correlation | -.611* | 1 | .157 |
| | Sig. (2-tailed) | .016 | | .576 |
| | N | 15 | 15 | 15 |
| Points Opponents Scored | Pearson Correlation | .075 | .157 | 1 |
| | Sig. (2-tailed) | .790 | .576 | |
| | N | 15 | 15 | 15 |

\*. Correlation is significant at the 0.05 level (2-tailed).

[Scatterplots: Points Purdue Scored versus Attendance at Game; Points Purdue Scored versus Points Opponents Scored; Attendance at Game versus Points Opponents Scored.]

SPSS output for using POINTS OPPONENTS SCORED and ATTENDANCE AT GAME to predict POINTS PURDUE SCORED

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .611 (a) | .374 | .269 | 11.474 |

a. Predictors: (Constant), Points Opponents Scored, Attendance at Game.

ANOVA (b)

| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | 943.174 | 2 | 471.587 | 3.582 | .060 (a) |
| | Residual | 1579.759 | 12 | 131.647 | | |
| | Total | 2522.933 | 14 | | | |

a. Predictors: (Constant), Points Opponents Scored, Attendance at Game. b. Dependent Variable: Points Purdue Scored.

Coefficients (a)

| Model | Term | B | Std. Error | Beta | t | Sig. |
|---|---|---|---|---|---|---|
| 1 | (Constant) | 55.997 | 13.704 | | 4.086 | .002 |
| | Attendance at Game | -3.99E-04 | .000 | -.614 | -2.656 | .021 |
| | Points Opponents Scored | 2.65E-02 | .286 | .021 | .093 | .928 |

a. Dependent Variable: Points Purdue Scored.

SPSS output for using just ATTENDANCE AT GAME to predict POINTS PURDUE SCORED

[Model Summary, ANOVA, and Coefficients tables for the same model as the output in problem 17; most values are illegible.]

SPSS output for using just POINTS OPPONENTS SCORED to predict POINTS PURDUE SCORED

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .075 (a) | .006 | -.071 | 13.892 |

a. Predictors: (Constant), Points Opponents Scored.

ANOVA (b)

| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | 14.204 | 1 | 14.204 | .074 | .790 (a) |
| | Residual | 2508.729 | 13 | 192.979 | | |
| | Total | 2522.933 | 14 | | | |

a. Predictors: (Constant), Points Opponents Scored. b. Dependent Variable: Points Purdue Scored.

Coefficients (a)

| Model | Term | B | Std. Error | Beta | t | Sig. |
|---|---|---|---|---|---|---|
| 1 | (Constant) | 24.931 | 8.649 | | 2.882 | .013 |
| | Points Opponents Scored | 9.284E-02 | .342 | .075 | .271 | .790 |

a. Dependent Variable: Points Purdue Scored.

19. An environmental health professor conducted a study to see whether fast-food workers wearing gloves actually lowers the chance that customers will come down with food poisoning. The scientists purchased 371 tortillas from several local fast-food restaurants, noting whether the workers were wearing gloves or not. 190 of the tortillas came from bare-hands restaurants; 181 of the tortillas came from glove-wearing restaurants. The scientists then tested the tortillas purchased for microbe growth. They found that the bare-hands restaurants' tortillas gave rise to microbe growth on 18 tortillas, and the glove-wearing restaurants' tortillas gave rise to microbe growth on only 8 tortillas. Is the glove-wearing restaurants' tortilla microbe growth significantly lower than the bare-hands restaurants' microbe growth at the 5% significance level?

   1. State your hypotheses for this test.
   2. Calculate your test statistic.
   3. Find your P-value.
   4. State your conclusion in terms of the story.

20. In a 1984 survey of licensed drivers in Wisconsin, 214 of 1200 men said that they did not drink alcohol. Construct a 95% confidence interval for the proportion of men who said that they did not drink alcohol. Is your confidence interval calculation reasonable? Why?

21. On the next page is the SPSS output for a study of alcohol and nicotine consumption among 452 pregnant women. Nicotine consumption is divided into 3 categories and alcohol consumption is divided into 4 categories. Answer the questions below based on the output that follows.

   - (a) What proportion of the non-alcohol-consuming women do not smoke during pregnancy? Is this a joint, marginal, or conditional probability?
   - (b) What proportion of women do not smoke and do not consume alcohol during pregnancy? Is this a joint, marginal, or conditional
probability?
   - (c) Find the marginal distribution for alcohol consumption during pregnancy.
   - (d) State the null and alternative hypotheses to test whether there is a relationship between alcohol consumption and smoking during pregnancy.
   - (e) What are the test statistic and P-value used to test the hypotheses in part d?
   - (f) State your conclusions in terms of the original problem.
   - (g) Are your results for the above test valid? Explain your answer.

Alcohol * Nicotine Crosstabulation (Count)

| Alcohol | Nicotine 1-15 | Nicotine 16 or more | Nicotine None | Total |
|---|---|---|---|---|
| 0.01-0.10 | 5 | 13 | 58 | 76 |
| 0.11-0.99 | 37 | 42 | 84 | 163 |
| 1.0 or more | 16 | 17 | 57 | 90 |
| None | 7 | 11 | 105 | 123 |
| Total | 65 | 83 | 304 | 452 |

Note: Nicotine is measured in milligrams/day and alcohol in ounces per day.

Alcohol * Nicotine Crosstabulation

| Alcohol | | Nicotine 1-15 | Nicotine 16 or more | Nicotine None | Total |
|---|---|---|---|---|---|
| 0.01-0.10 | Count | 5 | 13 | 58 | 76 |
| | Expected Count | 10.9 | 14.0 | 51.1 | 76.0 |
| | % of Total | 1.1% | 2.9% | 12.8% | 16.8% |
| 0.11-0.99 | Count | 37 | 42 | 84 | 163 |
| | Expected Count | 23.4 | 29.9 | 109.6 | 163.0 |
| | % of Total | 8.2% | 9.3% | 18.6% | 36.1% |
| 1.0 or more | Count | 16 | 17 | 57 | 90 |
| | Expected Count | 12.9 | 16.5 | 60.5 | 90.0 |
| | % of Total | 3.5% | 3.8% | 12.6% | 19.9% |
| None | Count | 7 | 11 | 105 | 123 |
| | Expected Count | 17.7 | 22.6 | 82.7 | 123.0 |
| | % of Total | 1.5% | 2.4% | 23.2% | 27.2% |
| Total | Count | 65 | 83 | 304 | 452 |
| | Expected Count | 65.0 | 83.0 | 304.0 | 452.0 |
| | % of Total | 14.4% | 18.4% | 67.3% | 100.0% |

Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 42.252 (a) | 6 | .000 |
| Likelihood Ratio | 44.653 | 6 | .000 |
| N of Valid Cases | 452 | | |

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 10.93.

Multiple Choice: Circle the letter of the correct answer and write its letter in the blank next to each story.

22. Does bread lose its vitamins when stored? Twenty small loaves of bread were randomly assigned to one of four storage times: one, two, three, or four days. After the bread had been stored for its respective amount of days, its vitamin C content was measured. This is an example of a

   - A. simple random sample
   - B. completely randomized design
   - C. randomized block design
   - D. matched pairs design
   - E. stratified random sample

23. The department of health wanted to know how many people received flu shots this year. They thought that females were more likely to
get a shot, so they randomly selected 500 males and 500 females in Lafayette and West Lafayette to survey. This is an example of a

   - A. simple random sample
   - B. completely randomized design
   - C. randomized block design
   - D. matched pairs design
   - E. stratified random sample

24. Which of the following is a potential way to reduce sampling variability?

   - A. Increase your sample size
   - B. Decrease your sample size
   - C. Increase your population size
   - D. Decrease your population size

For questions 25-27, choose the letter for the type of bias listed below which is a problem in the story.

- A. Undercoverage
- B. Nonresponse
- C. Response bias

25. John wanted to find out people's opinions regarding Greater Lafayette Health Services' desire to build a new hospital. Consequently, he took a simple random sample of 500 Lafayette and West Lafayette residents listed in the phone book. He is concerned, however, that those not listed in the phone book may have different views. What type of bias is he concerned about?
26. When John attempted to collect data from those who made it into his sample, he was unable to contact some of them, and others refused to answer his survey questions. What type of bias could this produce?
27. John was pleased with the unanimous response to his survey question, which read, "Do you believe that building a new hospital is a waste of resources and will leave two perfectly good buildings vacant?" What type of bias could his survey question be producing?

For questions 28-31, choose the letter for the graph listed below which would be appropriate for answering the questions. Each letter may be used once, more than once, or not at all.

- A. Scatterplot
- B. Side-by-side boxplot
- C. Histogram
- D. Bar graph

28. Compare the percentage of Lafayette residents who feel that a new hospital should be built with the percentage that don't feel that a new hospital should be built and the percentage who don't care.
29. Is the distribution of people's ages who feel a new hospital should be built in Lafayette symmetric or skewed?
30. Is there a positive
association between the age and number of times a Lafayette resident visits one of the hospitals in a year?
31. Is there a difference in the average number of hospital visits per year between Lafayette residents that would like to see a new hospital built and those who would not or don't care?

MATCHING: For problems 32-41, write the letter of the most appropriate statistical analysis technique next to the story. Note: each answer choice may be used once, more than once, or not at all.

32. As the outdoor temperature in degrees increases, do ice cream sales in dollars increase at the Silver Dipper?
33. Is there a significant average difference between soft-serve and hard-packed ice cream if we check the prices of both at 20 different ice cream parlors?
34. Do high school students spend more money on ice cream on average than college students?
35. Is the average number of scoops of ice cream a person eats in a summer week less than 5?
36. Does a person's favorite flavor (triple chocolate, chunky monkey, or vanilla) or residential proximity to an ice cream parlor (reported only as less than 1 mile, between 1 and 5 miles, or more than 5 miles) or their interaction have an effect on the amount of money a person spends on ice cream in a summer?
37. Can a person's age, residential proximity to an ice cream parlor (reported in miles), and IQ do a good job of predicting how many ice cream cones that person will eat in a summer?
38. Is there a significant difference between how many ice cream cones a year, on average, freshmen, sophomores, juniors, and seniors eat?
39. What is the maximum price for ice cream cones if I look at prices of single scoop cones from 25 different stores?
40. Is there a relationship between a person's favorite flavor of ice cream (triple chocolate, chunky monkey, or vanilla) and their gender?
41. Is the percentage of men who like triple chocolate ice cream the best higher than the percentage of women who like triple chocolate ice cream the best?

- A. Mean and/or standard deviation
- B. Five number summary
- C. Simple linear regression
- D. Multiple linear regression
- E. 1-sample mean t-test
- F. Matched pairs t-test
- G. 2-sample comparison of means t-test
- H. 1-sample proportion Z-test
- I. 2-sample proportion Z-test
- J. Chi-squared test
- K. One-way ANOVA
- L. Two-way ANOVA

42. A local news station reported that 72% of all people push the snooze button at least once before waking up in the morning. Pete, an engineering student at Purdue, thought that since engineering students are usually up late studying, a higher percentage of engineers would push the snooze button. He decided to take a sample of [illegible] engineering students and found that 39 said they push the snooze button each morning. Is the true percentage of people who push the snooze button at least once in the morning significantly higher than the news station's report (use $\alpha$ = [illegible])?

   - (a) State the hypotheses for this test.
   - (b) Calculate the test statistic.
   - (c) Find the P-value.
   - (d) State your conclusion in terms of the story.
   - (e) Construct a [illegible] confidence interval for the proportion of engineering students who push the snooze button.
   - (f) On the curve below, insert and clearly label the $p_0$, the $\hat{p}$, and the P-value for the hypothesis test in parts a through d above.

43. Is there a relationship between cigarette smoking and drinking? With the recent proposed ordinance for West Lafayette piquing his interest, an interested citizen selected an SRS of Purdue students, surveyed those students, and got the results summarized below.

Alcoholic drinks per week * Cigarettes smoked per day Crosstabulation (Count)

| Alcoholic drinks per week | Cigarettes None | Cigarettes 1-10 | Cigarettes 11-20 | Cigarettes 21+ | Total |
|---|---|---|---|---|---|
| None | 185 | 90 | 98 | 43 | 416 |
| 1-2 | 64 | 50 | 45 | 19 | 178 |
| 3-5 | 57 | 37 | 40 | 25 | 159 |
| 6+ | 89 | 57 | 61 | 40 | 247 |
| Total | 395 | 234 | 244 | 127 | 1000 |

Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 12.845 (a) | 9 | .170 |
| Likelihood Ratio | 12.598 | 9 | .182 |
| Linear-by-Linear Association | 7.527 | 1 | .006 |
| N of Valid Cases | 1000 | | |

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 20.19.

   - (a) If a student has no drinks per week, what is the probability he/she smokes no cigarettes? Is this a joint, marginal, or conditional probability?
   - (b) Determine if the amount of
drinking and cigarette smoking are related. State your hypotheses, your test statistic, your P-value, and your conclusion in terms of the story.
   - (c) Was it appropriate to do the test in part b? Justify your answer.
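
The proportion and chi-square procedures on this sheet can be checked numerically. The sketch below is not part of the original review sheet: it assumes the standard large-sample z formulas for proportions (the sheet's own formulas did not survive extraction), and the function names `one_prop_ci`, `two_prop_z`, and `chi_square` are hypothetical. It uses the counts from problems 20, 19, and 21.

```python
import math


def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def one_prop_ci(x, n, z_star=1.96):
    """Large-sample confidence interval for one proportion (assumed formula).

    z_star is the critical value from the last row of the t table.
    """
    p_hat = x / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def two_prop_z(x1, n1, x2, n2):
    """Pooled two-sample z statistic and one-sided P-value for Ha: p1 > p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # the pooled p-hat from the sheet's note
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 1 - normal_cdf(z)


def chi_square(table):
    """Chi-square statistic and expected counts for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    return stat, expected


# Problem 20: 214 of 1200 men said they did not drink alcohol.
ci_low, ci_high = one_prop_ci(214, 1200)

# Problem 19: bare hands 18 of 190 versus gloves 8 of 181, Ha: p_bare > p_glove.
z, p_value = two_prop_z(18, 190, 8, 181)

# Problem 21: alcohol (rows) by nicotine (columns) counts.
counts = [[5, 13, 58], [37, 42, 84], [16, 17, 57], [7, 11, 105]]
chi2_stat, expected = chi_square(counts)

print(f"95% CI: ({ci_low:.4f}, {ci_high:.4f})")
print(f"z = {z:.3f}, one-sided P-value = {p_value:.4f}")
print(f"chi-square = {chi2_stat:.3f}")
```

The chi-square P-value is left out on purpose, since the sheet says to read it from the SPSS printout; the statistic and the smallest expected count can be compared against that printout.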
