Statistics ECO 105

These 13 pages of class notes were uploaded by Khalid Hagenes on Sunday, October 11, 2015. The notes belong to ECO 105 at Davidson College, taught by Staff in Fall.

Date Created: 10/11/15

Name

Statistics
Davidson College, Economics 105
Jan-May 2005
Mark C. Foley

Review 3

Directions: This review is closed-book, closed-notes except for your formula sheet, to be taken in one sitting. You may use a calculator and/or Excel. Perform your calculations to 3 decimal places where necessary. It does not need to be taken in one sitting. There are 150 points on the exam. The problems are worth 25, 45, 30, 40, and 10 points, respectively. You must show all your work to receive full credit. Any assumptions you make and intermediate steps should be clearly indicated. Do not simply write down a final answer to the problems without an explanation. Please turn in your formula sheet with your exam. And please turn in a printout of your Excel work. Think clearly and work efficiently.

Honor Pledge:
Start times:
End times:

Problem 1

Consider the following sample data. There is a copy of this data in the file P:\Economics\Eco 105 Statistics\exam3data.xls

Lawn Care Expenditures   Lot Size     Income         Occupation
last month (in dollars)  (in acres)   (in dollars)
15                       0.5          51750          Baker
67                       1.8          105449         Architect
57                       2.2          114064         Carpenter
76                       2            177000         Baker
113                      3.6          55640          Architect
62                       1.4          48555          Architect
101                      2.4          117243         Architect
39                       1.8          99419          Carpenter
149                      3.3          169129         Architect
80                       0.4          110999         Baker
45                       3.5          65718          Architect
64                       3            68090          Baker
55                       2            156341         Carpenter
47                       3.2          58125          Architect
46                       2            67683          Architect
110                      3.4          158148         Architect
41                       0.9          165217         Carpenter
97                       3.2          83610          Architect
89                       1.3          127231         Baker
100                      2.8          174594         Architect
49                       0.5          126452         Carpenter

(a) Test the null hypothesis that the population correlation coefficient between Lawn Care Expenditures and Lot Size is 0 versus the alternative that it is positive. Use the 2 percent significance level. Conduct your test on the test statistic scale. Use Excel to calculate the critical value and test statistic to 5 decimal places.

[Handwritten answer illegible in scan.]

(b) What is the p-value?

[Handwritten answer illegible in scan.]
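For part (a) of Problem 1, the test statistic for a zero population correlation is t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. The sketch below computes r and t from the sample using only the Python standard library. It assumes the Lot Size values lost their decimal points in the scan (so "05" is read as 0.5, "18" as 1.8, and so on); the variable and function names are illustrative, not from the exam.

```python
import math

# Problem 1 sample. Lot Size decimals are restored by assumption
# (the scan shows "05", "18", ... which are read here as 0.5, 1.8, ...).
lawn_care = [15, 67, 57, 76, 113, 62, 101, 39, 149, 80, 45,
             64, 55, 47, 46, 110, 41, 97, 89, 100, 49]
lot_size = [0.5, 1.8, 2.2, 2.0, 3.6, 1.4, 2.4, 1.8, 3.3, 0.4, 3.5,
            3.0, 2.0, 3.2, 2.0, 3.4, 0.9, 3.2, 1.3, 2.8, 0.5]


def sample_correlation(x, y):
    """Pearson sample correlation from sums of squared deviations."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def correlation_t_stat(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


n = len(lawn_care)
r = sample_correlation(lot_size, lawn_care)
t_stat = correlation_t_stat(r, n)
print(f"n = {n}, r = {r:.5f}, t = {t_stat:.5f}, df = {n - 2}")
```

The statistic would then be compared with the one-tailed 2 percent critical value for 19 degrees of freedom. Excel's TINV is two-tailed, so TINV(0.04, 19) is the matching call, and TDIST(t, 19, 1) gives the one-tailed p-value for part (b).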
(c) What is the lowest significance level at which you can reject the null hypothesis that the population correlation coefficient between Lot Size and Income is zero in favor of the alternative that it is positive?

[Handwritten answer illegible in scan.]

Problem 2

(a) Using the data in problem 1, estimate the regression of Lawn Care Expenditures on Lot Size. Write down the sample regression function. You may use Excel to check your work, but you must write down the OLS formulas and write out a value for each term of the formulas for the estimated intercept (3 numbers) and slope (7 numbers) in the way I did the hand calculations in class. You may use Excel or a calculator to do the calculations.

[Handwritten answer illegible in scan.]

(b) Interpret the intercept and slope coefficients of the regression you estimated in part a.

[Handwritten answer illegible in scan.]

(c) Clearly evaluate the regression model's goodness of fit according to R^2 and the standard error of the model. Again, you may use Excel to check, but write down formulas and show me the values that make up each formula.

[Handwritten answer illegible in scan.]

(d) Calculate the standard error for the estimated slope coefficient. Again, you may use Excel to check, but write down formulas and show me the values that make up each formula.

[Handwritten answer illegible in scan.]
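Parts (a), (c), and (d) of Problem 2 all follow from a handful of sums of squared deviations. The sketch below is one way to check hand calculations, not the course's worked solution. It assumes the Lot Size decimals as restored from the scan ("05" read as 0.5, and so on) and uses illustrative variable names (b0, b1, ser, se_b1).

```python
import math

# Problem 1 sample. Lot Size decimals are restored by assumption.
lawn_care = [15, 67, 57, 76, 113, 62, 101, 39, 149, 80, 45,
             64, 55, 47, 46, 110, 41, 97, 89, 100, 49]
lot_size = [0.5, 1.8, 2.2, 2.0, 3.6, 1.4, 2.4, 1.8, 3.3, 0.4, 3.5,
            3.0, 2.0, 3.2, 2.0, 3.4, 0.9, 3.2, 1.3, 2.8, 0.5]

n = len(lot_size)
x_bar = sum(lot_size) / n
y_bar = sum(lawn_care) / n

# Sums of squared deviations and cross-products.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(lot_size, lawn_care))
s_xx = sum((x - x_bar) ** 2 for x in lot_size)
s_yy = sum((y - y_bar) ** 2 for y in lawn_care)

# (a) OLS slope and intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# (c) Goodness of fit: R^2 and the standard error of the regression.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(lot_size, lawn_care))
r_squared = 1 - sse / s_yy
ser = math.sqrt(sse / (n - 2))

# (d) Standard error of the slope.
se_b1 = ser / math.sqrt(s_xx)

print(f"fitted line: {b0:.3f} + {b1:.3f} * LotSize")
print(f"R^2 = {r_squared:.3f}, SER = {ser:.3f}, se(b1) = {se_b1:.3f}")
```

The ratio b1 / se_b1 equals the correlation test statistic from Problem 1(a), which is a useful cross-check on both calculations.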
(e) Calculate and interpret a 93% confidence interval for the population slope parameter. Again, you may use Excel to check, but write down formulas and show me the values that make up each formula.

[Handwritten answer illegible in scan.]

(f) Predict the lawn care expenditures for Abe Froman of Chicago, who has a 3-acre lot. Calculate the 95% prediction interval for the actual value of Mr. Froman's expenditures. Again, you may use Excel to check, but write down formulas and show me the values that make up each formula.

[Handwritten answer illegible in scan.]

(g) Consider the following null and alternative hypotheses: $H_0: \beta_1 = a$ and $H_1: \beta_1 > a$, where $\beta_1$ is the population slope coefficient on Lot Size. To 4 decimal places, calculate the highest value of a for which the null hypothesis can be rejected at the 5% level. Again, you may use Excel to check, but write down formulas and show me the values that make up each formula.

[Handwritten answer illegible in scan.]

Problem 3

(a) What does OLS stand for? Explain what the "LS" means in terms of estimating a sample regression line.

[Handwritten answer, partly legible: Ordinary Least Squares; the line is chosen to minimize the sum of squared errors.]

(b) What does BLUE stand for? Explain the "B" and "U" using relevant and properly-labeled graphs.

[Handwritten answer, partly legible: Best Linear Unbiased Estimator; "best" refers to minimum variance.]

(c) Does a sample regression line for a simple regression necessarily have to go through the sample mean point $(\bar{X}, \bar{Y})$? Support your answer, showing why or why not.

[Handwritten answer illegible in scan.]
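One way to support the answer to Problem 3(c) is to evaluate the fitted line at the sample mean of X. The short derivation below uses $b_0$ and $b_1$ for the OLS intercept and slope; that notation is an assumption here, since the handwritten answer in the scan is not legible.

```latex
\begin{align*}
\hat{Y}_i &= b_0 + b_1 X_i, \qquad b_0 = \bar{Y} - b_1 \bar{X} \\
\hat{Y}\,\big|_{X = \bar{X}} &= b_0 + b_1 \bar{X}
  = \left(\bar{Y} - b_1 \bar{X}\right) + b_1 \bar{X}
  = \bar{Y}
\end{align*}
```

Because the OLS intercept is defined as $\bar{Y} - b_1 \bar{X}$, the fitted value at $X = \bar{X}$ is always $\bar{Y}$, so the sample regression line passes through the point of sample means whenever an intercept is included.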
(d) Does the estimated slope parameter from a simple regression of Y on X necessarily have to have the same sign as the sample correlation coefficient between Y and X? Using appropriate [rest of question obscured by the handwritten answer]

[Handwritten answer illegible in scan.]

Problem 4

Consider the data from problem 1.

(a) Controlling for lot size and income, do architects spend more (i.e., one-sided alternative) on average than non-architects on lawn care at the 10% significance level? Yes or No? Run the appropriate regression to test this and calculate the relevant statistics to support your answer.

[Handwritten answer illegible in scan.]

(b) Write down the sample regression function from part a. Interpret the coefficient on income, but use units of 10,000 dollars.

[Handwritten answer illegible in scan.]

(c) Write down the null and alternative hypotheses for the whole-model F-test from the following regression:

$LawnCareExp_i = \beta_0 + \beta_1 LotSize_i + \beta_2 Income_i + \beta_3 Architect_i + \varepsilon_i$

Estimate this model and use your Excel output to conduct the test.

[Handwritten answer illegible in scan.]

(d) Argue for or against adding a quadratic effect of income in the model. Do not estimate the model; make a theoretical argument indicating the signs you'd expect on the income coefficients. Be specific. A properly-labeled graph may help.

[Handwritten answer illegible in scan.]
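The regression in Problem 4(a) adds Income and an Architect dummy to the Lot Size model. The exam expects Excel output; the sketch below is a standard-library alternative that solves the normal equations directly. It rests on several assumptions: Lot Size decimals are restored from the scan, Income is rescaled to units of 10,000 dollars (as part (b) asks), the dummy is coded 1 for architects and 0 otherwise, and the helper `solve` is a hypothetical name for plain Gaussian elimination.

```python
import math

# Problem 1 sample. Lot Size decimals are restored by assumption.
lawn_care = [15, 67, 57, 76, 113, 62, 101, 39, 149, 80, 45,
             64, 55, 47, 46, 110, 41, 97, 89, 100, 49]
lot_size = [0.5, 1.8, 2.2, 2.0, 3.6, 1.4, 2.4, 1.8, 3.3, 0.4, 3.5,
            3.0, 2.0, 3.2, 2.0, 3.4, 0.9, 3.2, 1.3, 2.8, 0.5]
income = [51750, 105449, 114064, 177000, 55640, 48555, 117243, 99419,
          169129, 110999, 65718, 68090, 156341, 58125, 67683, 158148,
          165217, 83610, 127231, 174594, 126452]
occupation = ["Baker", "Architect", "Carpenter", "Baker", "Architect",
              "Architect", "Architect", "Carpenter", "Architect", "Baker",
              "Architect", "Baker", "Carpenter", "Architect", "Architect",
              "Architect", "Carpenter", "Architect", "Baker", "Architect",
              "Carpenter"]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * size
    for i in range(size - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, size))
        x[i] = (m[i][size] - tail) / m[i][i]
    return x


# Design matrix rows: intercept, lot size, income in 10,000s, architect dummy.
rows = [[1.0, ls, inc / 10000.0, 1.0 if occ == "Architect" else 0.0]
        for ls, inc, occ in zip(lot_size, income, occupation)]
n, k = len(rows), len(rows[0])

# Normal equations (X'X) b = X'y.
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * y for r, y in zip(rows, lawn_care)) for i in range(k)]
coef = solve(xtx, xty)

residuals = [y - sum(b * x for b, x in zip(coef, r))
             for r, y in zip(rows, lawn_care)]
s2 = sum(e ** 2 for e in residuals) / (n - k)

# Variance of the architect coefficient: s^2 times the matching diagonal
# element of (X'X)^{-1}, obtained by solving (X'X) z = e_j.
j = k - 1
unit = [1.0 if i == j else 0.0 for i in range(k)]
se_architect = math.sqrt(s2 * solve(xtx, unit)[j])
t_architect = coef[j] / se_architect

print("coefficients (intercept, lot size, income/10000, architect):")
print([round(b, 4) for b in coef])
print(f"t on architect = {t_architect:.4f} with {n - k} degrees of freedom")
```

For the one-sided 10% test, the t statistic on the architect dummy would be compared with the upper-tail critical value for 17 degrees of freedom (in Excel, TINV(0.20, 17), since TINV is two-tailed). Rescaling income keeps the normal equations well conditioned and makes the income coefficient read directly as the change per 10,000 dollars.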
(e) Do carpenters and bakers spend significantly different amounts on lawn care, controlling for income and lot size? Use the 5% significance level. Run the appropriate regression to test this and calculate the relevant statistics to support your answer.

[Handwritten answer illegible in scan.]

(f) Consider the following model:

$LawnCareExp_i = \beta_0 + \beta_1 LotSize_i + \beta_2 Income_i + \beta_3 Architect_i + \beta_4 Carpenter_i + \varepsilon_i$

Test the null hypothesis that the population coefficients on Architect and Carpenter are simultaneously zero. Write down the null and alternative hypotheses.

[Handwritten answer illegible in scan.]

Problem 5

Attached is a copy of Table VIII from Dale, Stacy Berg, and Alan B. Krueger (2002), "Estimating the Payoff to Attending a More Selective College: An Application of Selection on Observables and Unobservables," Quarterly Journal of Economics, November 2002, pp. 1491-1527. This is the paper we discussed in class. As a reminder, the sample is a cohort of workers who entered college in 1976. Their earnings were then measured in 1995 (only those who were full-time and year-round). The variable of interest that we discussed in class was "school quality," which was measured as the average SAT score at the college or university the person attended; in these regressions the measure of school quality is Log(net tuition), defined specifically as the natural log of college tuition cost minus average student aid. Don't worry about the "predicted" term in front of
log(parental income); just consider "predicted" not to be there, it's not important for this question. Also, "log" means natural log. As the title of the table indicates, the dependent variable is the natural log of earnings. Interpret the 3 circled coefficients (you may use the term "ceteris paribus" here).

[Handwritten answer illegible in scan.]

Name

Statistics
Davidson College, Economics 105
Jan-May 2005
Mark C. Foley

Review 1

Directions: This review is untimed, closed-book, closed-notes except for your formula sheet. You may use a calculator. You may not use Excel. Perform your calculations to 3 decimal places unless otherwise directed. There are 100 points on the exam. Each problem is worth 20 points. You must show all your work to receive full credit. Any assumptions you make and intermediate steps should be clearly indicated. Do not simply write down a final answer to the problems without an explanation. Please turn in your formula sheet with your exam. Carpe diem.

Honor Pledge:
Start times:
End times:

Problem 1

Consider the following probability distribution.

[Probability table not recoverable from the scan.]

(a) Fill in the missing cells, assuming that those are the only possible values of X and this is a proper probability distribution function.

(b) Calculate the mean and variance of X.

[Handwritten answer illegible in scan.]
(c) A student needs to know the details of a class assignment due the next day and decides to call fellow classmates for the information. She believes that for any particular call the probability of obtaining the information is 40%. She decides to continue calling classmates until the information is obtained. Let the random variable X denote the number of calls needed to obtain the information. It's a big state university where students watch professors on closed-circuit television, so the class size is massive, say 487, but that isn't central to this question. Fill in the table below. That is, find the first four values of X, f(x), and F(x). Then provide a general formula for the pdf and cdf.

[Table with columns x, f(x), F(x); handwritten entries illegible in scan.]

Problem 2

(a) What is the difference between the probability a standard normal random variable Z is between -2 and 2 and the rule of thumb for the probability a normal random variable is within 2 standard deviations of its mean?

[Handwritten answer illegible in scan.]

(b) Let $X \sim N(61.1, 17.64)$. What is the value of k such that $P(59 < X < k) = 0.5400$?

[Handwritten answer illegible in scan.]

(c) The number of car accidents in a city in a given month is normally distributed. Thirty-three percent of the time more than 200 occur. Fewer than 50 occur 1 percent of the time. What are the mean and standard deviation of the number of accidents in a month? Carry 5 decimal places throughout this problem, including the answers.

[Handwritten answer illegible in scan.]

Problem 3

(a) You waltz into a local internet café. It's really dark and the manager tells you that there are 4 computers in the café and the probability that one of them is available at any given time is 0.40. Assume that this probability is the same for
each computer and the probability of one computer being occupied is statistically independent of any other. What is the probability at least one computer is occupied? Clearly define any random variables you use.

[Handwritten answer illegible in scan.]

(b) A bank grants mortgages to 87% of all applicants. After the applicant gets approval, the bank sends an appraiser to evaluate the value of the property. The bank pays the appraiser a salary of $2000 per month plus $200 for each appraisal. If the bank gets 10 loan applications next month, what is the variance of the amount of money the bank will have to pay its appraiser?

[Handwritten answer illegible in scan.]

Problem 4

(a) The life of a new type of light bulb is uniformly distributed between 1200 and 1600 hours. The probability is 70% that a randomly-selected light bulb will last at least how long?

[Handwritten answer illegible in scan.]

(b) What is the standard deviation of the number of hours a bulb lasts?

[Handwritten answer illegible in scan.]

(c) At the Asheboro Zoo, one sea lion swims by the underwater viewing window on average every ten minutes. Suppose that the distribution of the time between sea lion sightings follows an exponential distribution. A sea lion just went by the window. 25% of the time, the zoo visitors will have to wait how long or longer before the next sea lion shows up?

[Handwritten answer illegible in scan.]

(d) Draw a properly-labeled graph of the exponential distribution in part c and indicate the answer to part c on the graph.

[Handwritten graph illegible in scan.]

(e) Zookeeper Zachary likes to sit in front of the sea lion viewing window during his lunch break, which lasts precisely 45 minutes. What is the
probability he sees more than 2 sea lions on his lunch break?

[Handwritten answer illegible in scan.]
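The parts of Review 1, Problem 4 reduce to closed-form expressions for the uniform, exponential, and Poisson distributions. The sketch below works through them as a check on hand calculations. It assumes that exponential waiting times with a mean of 10 minutes imply a Poisson count with mean 45/10 = 4.5 over the 45-minute lunch break; the variable names are illustrative and do not come from the exam.

```python
import math

# (a) Uniform on [1200, 1600] hours: find k with P(X >= k) = 0.70,
#     i.e. (high - k) / (high - low) = 0.70.
low, high = 1200.0, 1600.0
k_70 = high - 0.70 * (high - low)

# (b) Standard deviation of a uniform distribution: (high - low) / sqrt(12).
sd_uniform = (high - low) / math.sqrt(12)

# (c) Exponential waiting time with mean 10 minutes:
#     P(T > t) = exp(-t / 10) = 0.25, so t = -10 * ln(0.25).
mean_wait = 10.0
t_25 = -mean_wait * math.log(0.25)

# (e) Number of sea lions in 45 minutes, modeled as Poisson with mean 4.5:
#     P(X > 2) = 1 - [P(0) + P(1) + P(2)].
lam = 45.0 / mean_wait
p_at_most_2 = sum(math.exp(-lam) * lam ** j / math.factorial(j) for j in range(3))
p_more_than_2 = 1.0 - p_at_most_2

print(f"(a) k = {k_70:.3f} hours")
print(f"(b) sd = {sd_uniform:.3f} hours")
print(f"(c) t = {t_25:.3f} minutes")
print(f"(e) P(X > 2) = {p_more_than_2:.3f}")
```

For the graph in part (d), the value from part (c) is the point on the horizontal axis to the right of which the area under the exponential density equals 0.25.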

