Research Methods in Economics
Research Methods in Economics ECO 4451
University of Central Florida
This 142-page set of class notes was uploaded by Toy Kertzmann on Thursday, October 22, 2015. The notes belong to ECO 4451 at University of Central Florida, taught by Mark Dickie in Fall. Since its upload, it has received 73 views. For similar materials see /class/227629/eco-4451-university-of-central-florida in Economics at University of Central Florida.
Date Created: 10/22/15
Research Methods in Economics ECO 4451
Introduction

What Is Research?
• Stating a question & seeking its answer
   - Finding an answer already known by others
   - Synthesizing existing knowledge to develop an answer independently
   - Discovering new knowledge

Your Research Project
• Assignment is to conduct an independent analysis that seeks to answer a research question
• What does this mean?
[The slides that answer this question, on conducting an independent analysis and on what makes a good research question, are not legible in the source.]

Research Answers
• Be clearly stated
• Supported by the analysis conducted
• Be generalizable

The Research Process
• Good research is empirical, can be replicated, and uses an appropriate design
[Other text on this slide is not legible in the source.]

Where Questions Come From
[Not legible in the source.]

[...] a Research Question
• What interests you
• Personal experience
• What have others done
• Data needed are available or can be acquired
• First idea may not be the best; it may even be dropped

Common Errors in Project Selection
• [...] is not a research question
• Project may degenerate into a report rather than an independent investigation
• You care too much to be objective

Common Errors in Project Selection (cont.)
[Not legible in the source.]

Some Common Kinds of Research Questions
• The following slides illustrate some questions using examples
• Does A increase or decrease B?
   - Does the crime in a neighborhood reduce housing values?
• What are the determinants of [...]?
   - What factors [...] election outcome?
• Should an agent take action A?
[Other question types on these slides are not legible in the source.]

Examples of Research Questions
• Does [...] reduce crime?
• Did NAFTA cause [...] wages to decline in the US?
• What determines the margin of victory in elections [...]?
• Do higher-quality schools raise housing prices in the area?
• Is [...] protection program effective?
• What are the costs and benefits of [...]?
[Other examples are not legible in the source.]

What Is Economics Research?
[Not legible in the source; the slides refer to microeconomics and macroeconomics.]

Make Sure Your Topic Is Economic
• What factors are responsible for the 50% increase in obesity among US adults since the late 1970s?
• Weight gain is determined by calories in vs. calories out
• What's the economics?
• Chou, Grossman and Saffer, "An economic analysis of adult obesity" [remainder of citation not legible]
• What were the economic changes?
• Read the paper. Note the focus on
   - Consumer demand
   - Testing hypotheses
• Compare with an economic analysis by a student: the Jacobs paper on assembling an NHL team under a specified hockey salary budget
• Read the Jacobs paper. Note the hypothesis to be tested
[Other text on these slides is not legible in the source.]

Research Methods in Economics ECO 4451
Sampling and Sample Designs

Sampling Terminology
• Population
   - The complete set of items of interest
• Population element
   - An individual member of the population
• Census
   - A complete enumeration of all elements in the population
• Sample
   - A subset of the population selected for investigation

Terminology (cont.)
• Frame
   - Population frame: list of all elements in the population
   - Sample frame: list of elements from which the sample will be drawn

Why Sample (Not Census)?
• Cost
• Sufficiently accurate for most purposes if well-designed probability sample
• Sometimes decrease in accuracy from attempt to make complete census
• Destruction of sample units

Why Sample (cont.)
• But sampling introduces error, in that it is virtually impossible for a sample to perfectly represent the population from which it was drawn
• Two categories of errors
   - Nonsampling error
   - Sampling error

Representative
• How well does the sample represent the population?
[Diagram: Population (parameters) and Sample (statistics), linked by estimation]

Population Parameter
• Variables in a population
• Measured characteristics of a population
• Assumed to have true values that are unknown to researcher
• Greek lowercase letters as notation

Sample Statistics
• Variables in a sample
• Measures computed from data
• Known once sample is drawn and statistics computed
• Used to make inferences about population parameters
• English letters for notation

What's a Population?
• Technically, the population is the complete set of elements of interest
   - For example, in a study of corporate profits, the population is the set of profits of all corporations
• We think of univariate, bivariate, or multivariate populations
   - If we are interested in whether profits are related to CEO compensation, we have a bivariate population where the elements are the set of pairs of profits & compensation

What's a Population?
• Technically, the population is not the set of all corporations but the set of all pairs of profits & compensation
   - It's profits and compensation that we want to be represented in our sample
• However, we usually proceed by selecting corporations into the sample and then getting info on profits & compensation for the corporations selected
• For this reason we speak of the corporations (or firms, or people, or whatever) as the population

What Makes a Good Sample?
• It must be representative of the population
   - Basically this means it must contain the same variations that exist in the population
• Estimators based on sample must be valid
• Validity depends on
   - Accuracy
   - Precision

Accuracy
• Is the degree to which bias is absent from the estimator
• To have accuracy
   - Overestimates and underestimates must balance out in repeated sampling

Precision
• Is low sampling error
• Different samples would yield similar estimates
• Is measured by the standard error of estimate, a type of standard deviation measurement we will discuss later

Errors from Investigating a Sample Rather Than a Census
• Nonsampling (systematic) error
   - Results from some imperfection in research design or mistakes in execution of design, such as
      - Sampling frame error
      - Nonresponse bias
      - Response or recording error

Systematic (Nonsampling) Errors
• Sampling frame error
   - Some population elements not represented in sampling frame
• Nonresponse error
   - When results are affected because some elements selected into sample do not respond or are not measured
• Response or recording error
   - Errors in making or recording responses or measurements

Errors from Sample Rather Than Census (cont.)
• Sampling (random) error
   - Difference between sample statistic and population parameter that results from chance variation in elements selected for inclusion in sample
• Two determinants of sampling error
   - Homogeneity (smaller sampling error) vs. heterogeneity (larger sampling error) of population
   - Sample size: larger sample reduces sampling error

Errors
[Diagram: Target Population -> (sampling frame error) -> Sampling Frame -> (sampling error) -> Planned Sample -> (nonresponse error) -> Actual Sample]

Stages in the Selection of a Sample
• Define the target population
• Determine if a probability or nonprobability sampling method will be chosen
• Select sampling units
[Other stages in this figure are not legible in the source.]

Sampling Units
• A single element or group of elements subject to selection in sample
• When sampling occurs in one stage, the elements selected in the sample are the sampling units
   - Example: simple random sample of college students
• In multistage sampling we distinguish
   - Primary Sampling Units (PSU): first or top level
   - Secondary Sampling Units (SSU): second level
   - Tertiary Sampling Units (TSU): third level

Sampling Units (cont.)
• Multistage sampling
   - Primary, secondary, tertiary sampling units
• Example: first select a region (PSU), then colleges within the region (SSU), then students at the colleges (TSU)

Two Major Categories of Sampling
• Probability sampling
   - Known nonzero probability for selecting any element from sampling frame
   - This probability may be same or different for different elements
   - Sampling error can be estimated
• Nonprobability sampling
   - Probability of selecting any particular element of population is unknown
   - Sampling error is unknown

Nonprobability Sampling
• Convenience
• Judgment
• Quota
• Snowball

Probability Sampling
• Simple random sample
• Systematic sample
• Stratified sample
• Cluster sample
• Multistage cluster sample

Convenience Sampling
• The sampling procedure of obtaining the people or units that are most conveniently available
   - Used to collect large number of observations quickly

Judgment (Purposive) Sampling
• An experienced individual selects the sample based on judgment about some appropriate characteristics required of sample members
   - Example: selecting test-market cities
   - Often used to select the site for a study, if not the individual sample elements

Quota Sampling
• Ensures that various subgroups in a population are represented to the exact extent that investigators desire
   - Example: aiming for a sample with a certain percentage of men and women, or a certain quota of people of particular ages or occupations
• Stratified sampling is a probability sampling method with a similar objective
• Quota sampling is a nonprobability method where the quotas typically are filled by convenience sampling

Snowball Sampling
• A variety of procedures
• Some respondents may be selected by probability methods
• Additional respondents are obtained from information provided by the initial respondents

Probability Sampling
• Simple random sample
• Systematic sample
• Stratified sample
• Cluster sample
• Multistage area sample

Simple Random Sampling
• An equal probability of selection method
   - Each element in the population has an equal chance of being included in the sample, and
   - each subset of elements of a given size has an equal chance of selection into the sample
• Simple to implement and easy to analyze resulting data; little advance knowledge of population required
• High cost; larger error for given size than stratified sampling

Systematic Sampling
• A random start, then systematic rule for drawing subsequent elements from sampling frame
   - Example: every kth name from the list
   - Equal probability of selection for individual elements but not for all subsets
   - Need to ensure that sampling interval is not related to any periodic ordering of sampling frame
• Properties similar to simple random sample

Stratified Sampling
• Define strata (groups) in population and select probability samples from within the different strata
   - e.g., simple random samples within each stratum
• Choose a stratification variable that defines groups
   - by age, gender, race or ethnicity, political affiliation, etc.

Stratified Sampling
• Criteria for a stratification variable
   - Known to be related to dependent variable
   - Increases homogeneity within groups
   - Increases heterogeneity between groups
• Example: dependent variable is attitude toward US policy in Iraq
   - A possible stratification variable is political party affiliation

Stratified Sampling
• Proportional: size of each subsample is proportional to size of stratum in population
• Disproportional: size of subsamples determined by needs of study rather than relative size of stratum in population
   - Often used to ensure that each subsample size is adequate for analysis separately from the full sample
   - Example: one population group may be small, or more important for analysis, or may have greater variability, implying that the subsample size must be larger for separate analysis

Stratified Sampling: Advantages
• Ensures adequate representation of all groups
• Supports analysis within each group and comparisons between groups
• Reduces sampling error for a given sample size
   - Because of homogeneity within strata, & homogeneity reduces sampling error

Stratified Sampling: Disadvantages
• Requires sampling frame for each stratum
• Can be costly relative to cluster sampling

Cluster Sampling
• PSU is not the individual element in the population but a larger cluster of elements that tend to occur together
• Main example is area sampling, where the clusters are geographic areas
• Ideally a cluster should be representative of the larger population
   - e.g., as heterogeneous as the population
   - But often have too much homogeneity within a cluster
   - So select a number of heterogeneous clusters

Cluster Sampling Examples
• Sample of church members
   - First select sample of churches, then members within churches
• Sample of school teachers or children
   - First select sample of schools, then teachers or children within schools
• Sample of registered voters
   - First select sample of areas (e.g., states or precincts or blocks), then voters within those areas

Multistage Sampling
• A complex procedure in which sampling units are selected in two or more steps by combining probability sampling methods
   - One or more of the stages usually involves cluster sampling
• Common example is multistage area sampling
   - Example: select states, then counties, then precincts within counties, then blocks within precincts, then individual households within blocks, then voters within households

Multistage Sampling
• So multistage sampling involves repeating two steps, listing & sampling
   - Listing: compose sampling frame (e.g., all states)
   - Sampling: choose sample from sampling frame (e.g., choose sample of states)
• Repeat until you get to ultimate units to be sampled
   - List counties within selected sample of states
   - Choose sample of counties
   - etc.
• The sampling step in any stage may itself be simple random, systematic, stratified, or even clustered

Cluster Sampling: Advantages
• The main advantage of cluster sampling is to sample economically while retaining the characteristics of a probability sample
   - May be impossible or costly to list all population elements
   - Requires a complete listing of clusters for selection at the first stage, but only a list of individual elements within the sampled clusters
   - Often sampling within clusters reduces costs, especially when clusters are geographically defined (area sampling)

Cluster Sampling: Disadvantages
• Mainly larger sampling error for given sample size than other probability samples
• Why 1: Cluster sampling is subject to two sampling errors
   - Selection of clusters
   - Selection of elements within clusters
• Why 2: Homogeneity within clusters fails to represent heterogeneity within population

Multistage Sampling: Controlling Sampling Error
• Consider example of single-stage cluster
• Total sample size = number of clusters x average number of elements within cluster
• Suppose overall sample size is fixed
   - Then tradeoff between number of clusters vs. number of elements within clusters
• Typically more homogeneous within clusters than between clusters
   - So as general rule have many clusters but select few elements from each cluster
   - E.g., for overall N = 2000: 5 households within each Census block, with 400 blocks

Multistage Sampling: Sampling Error vs. Cost
• The general rule (have many clusters but select few elements from each cluster)
   - Works in direct opposition to goal of reducing sampling cost
   - Need sampling frame for many clusters
   - Travel costs between many clusters in area sampling
• So an additional tradeoff arises between cost and sampling error
   - The cost vs. sampling error tradeoff arises in the same form with all sampling

What Is the Appropriate Sample Design?
• Degree of accuracy & precision
• Resources available, including time
• Advance knowledge of the population
• National versus local
• Need for statistical analysis

Estimating Required Sample Size
• Power approach
   - Control probability of making Type I and Type II errors
• Estimation approach
   - Control precision by controlling width of confidence interval around parameter of interest

Power Approach
• Type I error: reject a true null
   - \( \alpha \), or significance level = Prob(reject null | null is true)
   - \( 1 - \alpha \), or confidence level
• Type II error: do not reject null when alternative is true
   - \( \beta \) = Prob(do not reject null | alternative true)
   - \( 1 - \beta \) = power = probability of identifying a true alternative

Outline of Power Approach
• Specify significance level
• Specify how large the true departure from null would have to be in order for it to be important to reject null
• Specify standard deviation of sample statistics under null and alternative
• Specify desired power
• Compute required sample size

Estimation Approach
• Control precision by controlling width of confidence interval around parameter of interest
   - Simpler than power approach
• Determinants of required sample size
   - Variability in population
   - Magnitude of acceptable error (width of CI)
   - Confidence level

Determinants of Appropriate Sample Size
• Variance / standard deviation
• Magnitude of error
• Confidence level

Sample Size Formula
\( n = \left( \dfrac{ZS}{E} \right)^2 = \dfrac{Z^2 S^2}{E^2} \)

Sample Size Formula: Example
• Suppose a survey researcher estimating population mean monthly expenditure on cable wishes to have a 95 percent confidence level (Z) and a range of error (E) of less than $2.00
• The estimate of the standard deviation is $29.00

Sample Size Formula: Example
\( n = \left( \dfrac{(1.96)(29.00)}{2.00} \right)^2 = \left( \dfrac{56.84}{2.00} \right)^2 = (28.42)^2 = 808 \)

Sample Size Formula: Example
• Suppose, in the same example as the one before, the range of error (E) is acceptable at $4.00
• Sample size is reduced

Sample Size Formula: Example
\( n = \left( \dfrac{ZS}{E} \right)^2 = \left( \dfrac{(1.96)(29.00)}{4.00} \right)^2 = \left( \dfrac{56.84}{4.00} \right)^2 = (14.21)^2 = 202 \)

Calculating Sample Size: 99% Confidence
\( n = \left( \dfrac{(2.57)(29)}{2} \right)^2 = \left( \dfrac{74.53}{2} \right)^2 = (37.265)^2 = 1389 \)
\( n = \left( \dfrac{(2.57)(29)}{4} \right)^2 = \left( \dfrac{74.53}{4} \right)^2 = (18.6325)^2 = 347 \)

Standard Error of the Proportion
\( S_p = \sqrt{\dfrac{pq}{n}} \)

Confidence Interval for a Proportion
\( p \pm Z S_p \)

Sample Size for a Proportion
\( n = \dfrac{Z^2 p q}{E^2} \)
Where
• n = number of items in sample
• \( Z^2 \) = the square of the confidence level in standard error units
• p = estimated proportion of successes
• q = 1 - p, or estimated proportion of failures
• \( E^2 \) = the square of the maximum allowance for error between the true proportion and sample proportion, or \( Z S_p \) squared

Calculating Sample Size at the 95% Confidence Level
• You guess that 60% of likely voters will choose candidate A and 40% will choose B
• What sample size is needed to estimate the true proportion within 3.5 percentage points?
• p = .6, q = .4
\( n = \dfrac{(1.96)^2 (.6)(.4)}{(.035)^2} = \dfrac{(3.8416)(.24)}{.001225} = \dfrac{.922}{.001225} = 753 \)

Research Methods in Economics ECO 4451
Regression Part II

Extensions to Simple Regression
• Multiple regression (>1 independent variable)
   - Last time we saw how failing to control for important explanatory variables can lead us to draw evidently incorrect conclusions from a regression model
   - Relatively few research problems in economics are amenable to simple (1 indep. var.) regression
• Additional hypothesis tests
   - Multiple regression allows for a greater variety of hypotheses to be tested

Hypothesis Tests about a Single Coefficient
• We set up null and alternative hypotheses; suppose a two-sided alternative
   \( H_0: \beta_k = \beta_k^0 \)
   \( H_A: \beta_k \neq \beta_k^0 \)
• As in the simple regression case, we test this hypothesis with the t statistic

Testing a Single Coefficient
• By similar reasoning to that used for simple regression, if \( H_0 \) is true,
   \( t = \dfrac{b_k - \beta_k^0}{s_{b_k}} \)
  follows a t distribution with \( N - K \) degrees of freedom, where \( b_k \) is the estimated coefficient, \( \beta_k^0 \) is its hypothesized value, and \( s_{b_k} \) denotes the standard error of \( b_k \)
• We often are interested in the hypothesis that the slope equals zero. Then the computed t = estimated coeff. / std. error

What's New?
• From a practical standpoint of having a basic working understanding of multiple regression, there is not much new to learn after simple regression
• However, the multiple regression model opens up the possibility of testing some hypotheses that can't be considered in simple regression
   - Hypotheses involving linear combinations of coefficients

Hypothesis Tests about Several Coefficients
• Many hypotheses can be tested in the multiple regression model
• The most common hypothesis, other than testing a single coefficient, is that all slope coefficients are zero
   - Viewed as a test of overall significance of the regression
   - \( H_0 \): all slope coefficients are zero; \( H_A \): at least one is not zero
• Tested using the F statistic

Testing Joint Significance of All Slope Coefficients
• The F statistic is computed by comparing the variation explained by the regression (SSE) to the variation unexplained (SSR), adjusting for degrees of freedom
   \( F = \dfrac{SSE/(K-1)}{SSR/(N-K)} \)
• Under the null hypothesis this statistic follows an F distribution with \( K - 1 \) df in the numerator and \( N - K \) df in the denominator
• We reject the null if the computed F exceeds the critical F for our chosen level of statistical significance

Example
• Marianne Fowler, "Thin Wallets Have Fat Owners: The Inverse Relationship between Income and Obesity"
• What is the link between income and obesity?
   - Policy: public health, health care costs
   - Economics: effects of income differences

Example: Fowler Paper
• Conceptual framework (or theory)
   - Healthier food is more costly
   - Stresses of low income
• Prior evidence (lit review)
   - Missing: Chou, Grossman and Saffer, "An economic analysis of adult obesity," Journal of Health Economics [date not legible]

A Broader Economic Perspective
• Chou, Grossman and Saffer
• Highlight importance of economic incentives in obesity
   - Food prices
   - Fast food restaurants, at home
   - Cigarette and alcohol prices
   - Wages: value of time in food preparation
   - Female labor force participation

Fowler Data
• Cross-sectional data: 48 contiguous states in a single year [year not legible]
• BMI from [source not legible]: imperfect but standard measure of obesity
   - BMI > 30 is obese; later examines % of state population that is obese
• Per capita income from BEA
• Education and race from Census

Descriptive Statistics
[Table; numeric values are not legible in the source]
Columns: Mean, S.D.
Rows: PC Income, % Obese, % HS, % College, % White, % Black

Obesity and Income
[Regression output; numeric values are not legible in the source]
Columns: Coeffs, Std Err, t Stat, P-value
Rows: Intercept, PC Income
R Square
ANOVA (columns: df, SS, MS, F): Regression, Residual, Total

Interpretations
• A $1 increase in income is associated with?
• The null that the coeff. of income is zero would be?
• How about a $1000 increase in income?

Confounding
• Uncontrolled factors that may differ between states and be related to both income and obesity
• Solutions?

Approaches to Treat Confounding
• Adapt the diffs-in-diffs approach to this situation, which lacks random assignment
   - Within-state difference between time periods of obesity and income, and possibly other variables
• Multiple regression
   - Education, race used in paper
   - Other: food prices, fast food outlets, women's LFPR, wages

Fowler Obesity Regression
[Regression output; numeric values are not legible in the source]
Columns: Coeffs, Std Err, t Stat, P-value
Rows: Intercept, PC Income, HS, College, White, Black

Interpretations
• Income
• How to interpret coeff. of, say, college?
   - A 1% increase in population with college education, holding HS education etc. constant, would decrease obesity by?
• Interpret coefficient of % Black using similar reasoning
• Separate t tests of individual coefficients
• What happens to R-squared relative to simple regression?

Testing the "Regression"
• Test the null hypothesis that all slope coefficients (all coeffs. except intercept) are jointly zero
• Reminder: the F statistic is computed by comparing the variation explained by the regression (SSE) to the variation unexplained (SSR), adjusting for degrees of freedom
• We reject the null if the computed F exceeds the critical F for our chosen level of statistical significance
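
The reminder above can be turned into a small calculation. The sketch below is illustrative only: the function name `overall_f_statistic` and the sums of squares in the usage lines are hypothetical and are not taken from the Fowler paper. It follows the notes' notation, with SSE as the variation explained by the regression, SSR as the unexplained (residual) variation, N observations, and K estimated coefficients including the intercept.

```python
def overall_f_statistic(sse, ssr, n, k):
    """F statistic for H0: all slope coefficients are jointly zero.

    sse: variation explained by the regression (the notes' SSE)
    ssr: unexplained, residual variation (the notes' SSR)
    n:   number of observations
    k:   number of estimated coefficients, including the intercept
    """
    if k < 2 or n <= k:
        raise ValueError("need at least one slope coefficient and n > k")
    numerator = sse / (k - 1)    # explained variation per slope coefficient
    denominator = ssr / (n - k)  # unexplained variation per residual degree of freedom
    return numerator / denominator


# Hypothetical numbers chosen only to make the arithmetic easy to follow.
f_value = overall_f_statistic(sse=60.0, ssr=40.0, n=25, k=5)
print(f_value)  # (60 / 4) / (40 / 20) = 7.5
```

The computed value would then be compared with the critical F for K - 1 and N - K degrees of freedom at the chosen significance level.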
Testing All Slope Coefficients
• \( H_0 \): the coefficients of PC Income, HS, College, White, and Black are all zero
• \( H_A \): at least one of these is not zero
   \( F = \dfrac{SSE/(K-1)}{SSR/(N-K)} \)
  [computed value not legible in the source]
• The computed F far exceeds the 5% critical value of about 1.96, so we would reject the null hypothesis

Hypothesis Tests about a Subset of Coefficients
• We've seen how to test hypotheses about a single coefficient (t) or about them all (F)
• Is there a way to test an intermediate subset of coefficients?
• We often want to know whether a subset of independent variables (more than one variable but fewer than all of them) is linearly related to the dependent variable
   - Test the null hypothesis that all the coefficients on this subset of variables are jointly zero

Test Procedure for a Subset of Coefficients
• One way to test a subset of coefficients is to run two regressions
   - An unrestricted regression where the coefficients are free to take any estimated value
   - A regression where the coefficients are restricted to be zero (by excluding the pertinent variables)
• If the variables we are testing really matter, then the unrestricted regression should fit the data much better
   - The unrestricted regression should have much smaller residuals
• So we can compare the sum of squared residuals between these two regressions

F Test for a Subset of Coefficients
• The F statistic is computed by comparing the variation unexplained by the restricted regression (RSSR) to the variation unexplained by the unrestricted regression (USSR), adjusting for degrees of freedom
   \( F = \dfrac{(RSSR - USSR)/J}{USSR/(N-K)} \)
  where J equals the number of restrictions

Example
• Suppose we want to test, in the obesity example, whether all slope coefficients besides income are zero
• We want to test whether the four variables concerning education and race, taken together, have any linear relation to the dependent variable
   - \( H_0: \beta_{HS} = \beta_{College} = \beta_{White} = \beta_{Black} = 0 \)
   - \( H_A \): at least one of these is not zero
• There are four restrictions (J = 4)
• The restricted regression will exclude the four associated variables, leaving only income and the intercept
• The unrestricted regression will include the four variables along with anything else in the restricted regression

Restricted Regression
[Regression output; numeric values are not legible in the source]
Columns: Coeffs, Std Err, t Stat, P-value
Rows: Intercept, PC Income
ANOVA (columns: df, SS, MS, F): Regression, Residual, Total
RSSR = residual sum of squares from this regression

Unrestricted Regression
[Regression output; numeric values are not legible in the source]
Columns: Coeffs, Std Err, t Stat, P-value
Rows: Intercept, PC Income, HS, College, White, Black
ANOVA (columns: df, SS, MS, F): Regression, Residual, Total
USSR = residual sum of squares from this regression

Example: F Test for Subset of Coefficients
   \( F = \dfrac{(RSSR - USSR)/J}{USSR/(N-K)} \)
[Worked computation not legible in the source.]

Other Linear Restrictions
• In a similar way we can test any linear restriction on coefficients
• Example of a linear restriction: [hypotheses not legible in the source]
• That would not be an interesting hypothesis to test, but only illustrates a linear restriction
• Consult an econometrics book for applicable tests of hypotheses like this

Building a Regression Model
[Opening text of this slide is not legible in the source.]
• Not a question of math or statistics but of economics

Key Steps in Building a Regression Model
1. Choose relevant variables & how they should be measured
   - Dependent variable
      - Obesity study: use % obese, % overweight, mean BMI by state, or what?
   - Independent variables: what are the key economic incentives and other factors affecting the dependent variable?
      - Economic theory
      - Prior empirical evidence
      - Institutional knowledge
      - Common sense and experience
   - How to measure the independent variables, e.g., mean income in state, or % below poverty line, or what?

Key Steps in Building a Regression Model
2. Consider the effects of omitted variables
   - Are they correlated with the included variables of interest (confounding)?
   - How can this be addressed?
   - What is the experimental analogy?

Key Steps in Building a Regression Model
3. Match data to the model
   - Level of aggregation
      - Best is to match data to the hypothesis: is the hypothesis macroeconomic or microeconomic?
      - Macroeconomic aggregates
      - Microdata: observations of individual decision-making units (persons, households, firms)
      - Example: for the obesity study, the best match would be data on individuals
      - Intermediate levels of aggregation, e.g., data by state
      - Suppresses some variation

Key Steps in Building a Regression Model
3. Match data to the model (continued)
   - Source of variation
      - Cross-section
      - Time series
      - Panel data
   - Sample size considerations
   - Variables
      - Usually there is an imperfect link between the variables that economic or other behavioral theories emphasize and measurable variables
      - Do the best you can to match it up

Key Steps in Building a Regression Model
4. Choose functional form for regression model
   - Economic theory or other conceptual framework usually is helpful in choosing variables and determining their expected relationships, but usually does not specify the functional form
   - Prior research
   - Properties expected or desired (see Seth Jacobs hockey paper)
   - Convenience

Building a Regression Model
• Only after going through the preceding steps are you ready to estimate the regression and test the hypotheses

Qualitative/Discrete Explanatory Variables
• Sometimes we want to account for the influence of a factor that is not measured quantitatively but has only a limited range of categories
• For example, some people have argued that the traditional cooking of the US South contributes to the higher obesity rates there, independently of lower incomes and education levels
• We cannot measure "southness" quantitatively, but we can categorize states by region
• This means using categorical or so-called "dummy" variables

Dummy Variables: Example
• Define a variable that takes the value one if a state is in the South, as defined by the Census Bureau
• The variable takes the value zero if a state is not in the South
• This 0/1 variable or dummy variable then indicates whether or not an observation came from the South
• We can include this variable in the regression to test whether obesity is higher in the South, controlling for the other independent variables
• There are four Census regions and 11 states in the South

Census Regions
[Figure: map of Census regions]

Regression with Dummy Variable
[Regression output; no numeric values appear in the source]
Columns: Coeffs, Std Err, t Stat, P-value
Rows: Intercept, PC Income, HS, College, White, Black, South
R Square
ANOVA: Regression, Residual, Total
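
The 0/1 coding described above can be sketched in a few lines. This is a minimal illustration, not the data used in the notes: the four states and the `south_dummy` helper are hypothetical choices, and the region labels follow the Census Bureau's four-region grouping.

```python
# Hypothetical mini-sample of (state, Census region) pairs. Not the notes' data set.
states = [
    ("Florida", "South"),
    ("Ohio", "Midwest"),
    ("Texas", "South"),
    ("Oregon", "West"),
]


def south_dummy(region):
    """Return 1 if the observation is in the South, 0 otherwise."""
    return 1 if region == "South" else 0


# One extra regressor column: 1 for Southern states, 0 for all others.
south = [south_dummy(region) for _, region in states]
print(south)  # [1, 0, 1, 0]
```

The resulting column would be added to the other independent variables before the regression is estimated.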
Interpretations
• The coefficient of South is [value not legible], suggesting that the percentage of population that is obese is [value not legible] higher in the South than in other regions, after controlling for income, education and race differences between states
• However, the t statistic indicates that we could not reject the null hypothesis that the true coefficient is zero
• Note that the single dummy variable South distinguishes between states in the South and all other states, depending on whether it is 1 or 0
• We do not need a second dummy variable to identify non-South states

Dummy Variables
• If there are two possible categories (South/not South, male/female, married/unmarried, etc.), only one dummy variable is needed
• More generally, if there are M categories we need M - 1 dummy variables to account for all the differences between groups
• The coefficient of each dummy variable measures the impact of this category relative to the impact of the excluded category

Example
• Jamison Schweitzer, "Housing Prices: Does Your School's Grade Really Matter?"
• Other things equal, are houses worth more when located in zones for better schools?
• Economic framework: hedonic model for prices of goods that are bundles of characteristics
• Price of house viewed as a function of characteristics of the property (like size, whether it has a pool) and characteristics of the neighborhood (like quality of school)

Schweitzer Research Design
• Study site is Temple Terrace in the Tampa area
   - Small enough that many neighborhood characteristics might plausibly be assumed constant throughout, such as crime rates
   - Same property tax rates throughout, etc.
• Three elementary schools that all feed into the same middle and high schools
• Thus if school quality affects housing prices, it must be elementary school quality in this case

Schweitzer Housing Data
• Cross-sectional microdata: 195 houses, their sales prices, characteristics, and elementary school zone
• Housing characteristics: square footage, age, number of bedrooms, number of bathrooms, and dummy variables for
   - Whether house has a swimming pool
   - Whether house has a fireplace
   - Whether house is on waterfront
• Natural logarithm of sales price is dependent variable
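
The M - 1 rule stated earlier in this section can be sketched as follows. The helper `make_dummies` and the category labels are hypothetical, used only for illustration; the excluded category serves as the reference group against which each dummy's coefficient is interpreted.

```python
def make_dummies(values, excluded):
    """Build M - 1 dummy columns for a categorical variable with M categories.

    values:   list of category labels, one per observation
    excluded: the reference category, which gets no dummy of its own
    Returns a dict mapping each remaining category to a list of 0/1 values.
    """
    categories = sorted(set(values))
    if excluded not in categories:
        raise ValueError("excluded category does not appear in the data")
    return {
        cat: [1 if v == cat else 0 for v in values]
        for cat in categories
        if cat != excluded
    }


# Hypothetical three-category variable (M = 3), so two dummies are produced.
marital = ["married", "single", "widowed", "single"]
dummies = make_dummies(marital, excluded="married")
print(dummies)  # {'single': [0, 1, 0, 1], 'widowed': [0, 0, 1, 0]}
```

An observation in the excluded category has zeros in every dummy column, which is why no separate dummy is needed for it.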
Schweitzer: School Data
- School quality measured by the school's grade on the prior FCAT
- Schools are graded A to F
- The three schools in Temple Terrace earned one A, one B, and one C
- Consider FCAT grade a categorical variable with three categories, in this case a grade of A, B, or C
- We need two dummy variables

Descriptive Statistics: Schweitzer
[Table of means and standard deviations. Rows: Price, Sq Ft, Age, Bed, Bath, Pool, Fireplace, Waterfront, A grade, B grade, C grade. The numeric values are not legible in the source.]

Regression Results
[Regression output table. Columns: Coeffs, Std Err, t Stat, P-value. Rows: Intercept, Sq Ft, Age, Bed, Bath, Pool, Fireplace, Waterfront, and the school grade dummies. Followed by an ANOVA block with rows Regression, Residual, Total. The numeric values are not legible in the source.]

Interpretations
- Characteristics of house
- Waterfront
- School quality
  - Why are the coefficients negative?
  - Effects of a B or C school are measured relative to [illegible in source]
- Does school quality matter?
  - How to test the hypothesis that the coefficients of both school grade dummies are jointly zero, i.e., school grade has no linear relation to log price?

Does School Quality Matter?
- $H_0$: the coefficients of the two school grade dummies are both zero
- $H_A$: at least one of these coefficients is not zero
[Remaining text of this slide is not legible in the source.]

Restricted Regression
[Regression output table for the model without the school grade dummies. Columns: Coeffs, Std Err, t Stat, P-value. Rows: Intercept, Sq Ft, Age, Bed, Bath, Pool, Fireplace, Waterfront. Also reported: R Square and an ANOVA block with df, SS, MS, F, and Significance F. The numeric values are not legible in the source.]

Example: F-test for Subset of Coefficients
[Formula and computation are not legible in the source.]

Research Methods in Economics ECO 4451
Review of Basic Statistics

Review of Simple Statistics
- Here is a brief review of what you should have learned in the
prerequisites for this class
- Descriptive statistics
  - Using sample statistics, like measures of central tendency and dispersion, to describe a sample of observations
- Inferential statistics
  - Using sample statistics to make inferences about unknown population parameters
  - Estimation and hypothesis testing

Descriptive Statistics
- Measures of central tendency
  - Mean, median, mode
- Measures of dispersion
  - Variance or standard deviation, range
- Measures of frequency
  - Counts, proportions

Descriptive Statistics
- In papers using statistical analysis you should present appropriate descriptive statistics for important variables
  - E.g., means and standard deviations, or proportions
  - Often presented in a table
  - Possibly separately by different groups or sub-samples, particularly if your paper involves a comparison between groups

Descriptive Statistics
- Usually some brief discussion of the descriptive statistics is appropriate
  - Give the reader some idea about the type of units in the sample
  - Give the reader a feel for the scale of the data
  - Give information about the amount of variation
- Note: inspection of descriptive statistics often reveals the source of problems you may be having with statistical procedures

Inferential Statistics
- Now, instead of using statistics to describe a sample, we use sample statistics to make inferences about a population parameter
- For example, we use the sample mean to estimate the value of the population mean
- Then we may want to test some hypothesis about the population mean
- Look at the basic statistical tests

Review of Simple Statistical Tests
- Many research questions can be addressed with very simple statistical tests
  - Often a good research design leads to a simple test, while a bad design requires complex statistical procedures for analysis of the data
- Since many classic research questions imply a comparison between groups, the two-sample or multiple-sample tests are especially useful

Overview
- Tests concerning population means
  - One sample test
  - Two
independent samples test
  - K > 2 independent samples
  - Matched samples
- Tests concerning population proportions
  - One sample
  - More general tests

Overview (Cont.)
- In the context of discussing tests for differences between groups, we'll also look at a methodology called differences-in-differences
  - This often allows you to get more out of simple tests

Tests about Population Means
- Population model: $y_i = \mu + e_i$

Population Model
- Value of variable $y$ for population element $i$
- Viewed as the sum of two components
  - $\mu$: population mean of $y$, the same for all elements
  - $e_i$: deviation of element $i$ from the mean; varies over elements
  - $e_i$ is viewed as random variation arising from heterogeneity among sample elements

Population Model (Cont.)
- For $\mu$ to be the population mean of the $y$'s, the average or expected value of the $e_i$ in the population has to be zero:
  $y_i = \mu + e_i$, $E(e_i) = 0$

Additional Assumptions
- We also assume
  - The random deviations from means are statistically independent
  - Each deviation is drawn from the same probability distribution, so they all have a common variance
  $\mathrm{cov}(e_i, e_j) = 0$ for $i \neq j$, and $V(e_i) = \sigma^2 = V(e_j)$

The Sample
- We draw a simple random sample of $n$ observations from the population
- Compute the sample mean
- Since our sample is only one of many that might possibly have been drawn, we consider the properties of the sample mean in a repeated sampling context
- In repeated samples the sample mean would vary randomly across the samples, due to different random selections of elements from the population (different $e_i$)

Central Limit Theorem
- The CLT says that if the sample size $n$ is large
  - On average across repeated samples, the mean of sample means equals the population mean
  - The variance of the sample means across different samples equals the population variance divided by $n$
  - The distribution of sample means across different samples is approximately normal

Sampling Distribution of Sample Mean
- Thus, for the sample mean $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$:
  $E(\bar{y}) = \mu$, $V(\bar{y}) = \sigma^2/n$, $\bar{y} \sim N(\mu, \sigma^2/n)$

Standardizing
- Thus we standardize:
  $z = \frac{\bar{y} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$
- The mean of $z$ is zero because the mean of sample means equals the
population mean
- The variance of $z$ is one because the standard deviation of the sample means equals the standard deviation of $y$ divided by the square root of the sample size

Testing One Population Mean
- We want to use sample information to infer whether the true population mean equals some value, say $k$:
  $H_0: \mu = k$, $H_A: \mu \neq k$
- Now $\bar{y} \sim N(\mu, \sigma^2/n)$, so if $H_0$ is true ($\mu = k$) then $\bar{y} \sim N(k, \sigma^2/n)$ and
  $z = \frac{\bar{y} - k}{\sigma/\sqrt{n}} \sim N(0, 1)$
- So if the null hypothesis is true, the expected value of $z$ computed as above is zero

How Big a Difference is Big Enough?
- Of course the z-statistic will not likely equal zero exactly, even if the null hypothesis is true
  - This is because of sampling variation: the sample mean of the $y$ values will not exactly equal $k$
- So we have to ask how big a difference between the sample mean and the hypothesized population mean of $k$ is enough to conclude that the true population mean is not equal to $k$

Testing Using the Standardized Sample Mean
- Again, the z-statistic will not likely equal zero exactly, even if the null is true
- If $z$ is very different from zero, sample evidence is inconsistent with the null hypothesis and we reject
- But if $z$ is pretty close to zero, sample evidence is not inconsistent with the null hypothesis and we do not reject the null
- So how do you define "very different from zero" vs. "pretty close to zero" for $z$?

How Big is Big?
- How big is big? Two elements to the answer
  - First, we measure the difference between our sample mean and the hypothesized population mean relative to the expected amount of variation among the many possible samples we could have drawn. That's what we've done in computing $z$: the difference relative to the square root of expected sampling variation
  - Second, we ask how big a chance of being wrong we are willing to live with. Specifically, how big a chance of a Type I error is OK?

Classical Hypothesis Testing
- If the null is true but we reject it, we make an error
  - A Type I error
- There's always some chance of drawing an unusual sample that leads us to reject a true null
- How big a chance are we willing to take?
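The standardization and decision rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population standard deviation `sigma` is treated as known, the numbers in the call are made up, and `z_test_mean` is a hypothetical helper name. The two-sided p-value uses the standard-library identity P(|Z| >= |z|) = erfc(|z| / sqrt(2)) for a standard normal Z.

```python
import math

def z_test_mean(sample_mean, k, sigma, n, alpha=0.05):
    """Two-sided z test of H0: mu = k, assuming the population sd `sigma` is known."""
    standard_error = sigma / math.sqrt(n)          # sd of the sample mean
    z = (sample_mean - k) / standard_error         # difference in standard-error units
    # Chance of a |z| at least this large when H0 is true (two-sided p-value).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    # Reject only if that chance is below the Type I error rate we accept.
    reject = p_value < alpha
    return z, p_value, reject

# Made-up numbers: sample mean 51.5, hypothesized mean 50, sigma 10, n = 100.
z, p_value, reject = z_test_mean(sample_mean=51.5, k=50.0, sigma=10.0, n=100)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Here the sample mean is 1.5 standard errors from the hypothesized mean, which is less than the two-tail 5% critical value of 1.96, so the null is not rejected. Choosing a smaller `alpha` demands a larger |z| before rejecting.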
Statistical Significance
1. Are we willing to take a 10% chance of rejecting a true null (incorrectly rejecting a true null in 10 of 100 possible samples)? Then:
  - Significance level .10, confidence level 90%
2. Or are we willing to take only a 5% chance?
  - Significance level .05, confidence level 95%
  - We demand a larger $z$ before rejecting than in 1
3. Or are we willing to take only a 1% chance?
  - Significance level .01, confidence level 99%
  - We demand a still larger $z$ before rejecting than in 2

Testing One Population Mean
- $H_0: \mu = k$ tested against $H_A: \mu \neq k$:
  $z = \frac{\bar{y} - k}{\sigma/\sqrt{n}} \sim N(0, 1)$
- We know $k$ and $n$ and can compute $\bar{y}$, but we don't know $\sigma$. So we use the sample standard deviation instead and compute
  $\frac{\bar{y} - k}{s/\sqrt{n}}$, where $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2$

Example
- We test the null that the population mean is 30 against a two-sided alternative at 5 percent significance. The sample mean is 31, with a sample variance of 25, in a sample of 100 observations:
  $z = \frac{31 - 30}{5/10} = 2.0$
  - The sample mean is two standard errors away from the hypothesized population mean
  - At significance of .05 in a two-tail test the critical value of $z$ is 1.96
  - The computed test statistic exceeds the critical value, so we reject the null

Z or t?
- When we use the sample standard deviation instead of the population standard deviation in the formula for $z$, the resulting sample statistic no longer has a standard normal distribution but a t distribution
  - However, if $n$ is large the standard normal distribution closely approximates the t
  - But if $n$ is small we use the t distribution
- What's small?
  - Certainly anything less than $n = 30$
  - For $n > 30$ the standard normal approximation is fine

Review the Method
- The method of hypothesis testing just applied carries over to other situations
  - Temporarily assume that the null hypothesis is true
  - Determine what should happen in the sample if the null hypothesis is true
  - Check whether what should happen did happen
    - Yes: Do not reject the null. The sample evidence is consistent with the null
    - No: Reject the null. Although we may have by chance drawn an unusual sample
we choose to reject the null when the sample evidence is inconsistent with the hypothesis

Comparing Two Population Means
- Classic research questions involve comparisons between groups
- Consider two populations, A and B:
  $y_{iA} = \mu_A + e_{iA}$, $y_{iB} = \mu_B + e_{iB}$
- The groups might be men & women, or small & big firms, or states with & without capital punishment
- The set of assumptions made for a single population now applies to each population separately

Testing for a Difference between Two Population Means
- We want to know: are the two population means equal?
  $H_0: \mu_A = \mu_B$, $H_A: \mu_A \neq \mu_B$
- Draw a simple random sample from each population

Sampling Distribution of Two Sample Means
- Apply the Central Limit Theorem to each sample:
  $\bar{y}_A \sim N(\mu_A, \sigma_A^2/n_A)$, $\bar{y}_B \sim N(\mu_B, \sigma_B^2/n_B)$

Difference between Means
- For sample and population: $d = \bar{y}_A - \bar{y}_B$, $\delta = \mu_A - \mu_B$
- Using properties of means, the sample mean difference is on average, over repeated samples, equal to the population difference in means:
  $E(d) = E(\bar{y}_A) - E(\bar{y}_B) = \mu_A - \mu_B = \delta$

Difference between Means
- If samples are independent, then the variance of the difference equals the sum of the variances:
  $V(d) = V(\bar{y}_A) + V(\bar{y}_B) = \sigma_A^2/n_A + \sigma_B^2/n_B$
- Recall that independence implies zero covariance

Sampling Distribution of Difference between Means
- Using properties of the normal distribution, the difference between means is normally distributed. So
  $d \sim N(\delta, \sigma_A^2/n_A + \sigma_B^2/n_B)$ and $z_d = \frac{d - \delta}{\sqrt{\sigma_A^2/n_A + \sigma_B^2/n_B}} \sim N(0, 1)$

Testing Difference between Means
- $H_0: \delta = k$ tested against $H_A: \delta \neq k$:
  $z = \frac{d - k}{\sqrt{s_A^2/n_A + s_B^2/n_B}}$
- We use the sample variances if $n_A$ and $n_B$ are "large" and use the normal approximation

Testing Difference between Means
- But if $n_A$ and $n_B$ are "small" we assume equal variances in the two populations, estimate this common variance with the pooled sample variance, and use the t distribution:
  $t = \frac{d - k}{\sqrt{s_p^2\,(1/n_A + 1/n_B)}}$, where $s_p^2 = \frac{(n_A - 1)s_A^2 + (n_B - 1)s_B^2}{n_A + n_B - 2}$

Matched Samples
- Use matched samples to reduce heterogeneity by removing the heterogeneity that is common to all elements in the matched set
  - Get repeated observations of the same element
  - Get observations on two or more similar (matched) elements, e.g., twins, siblings
- Consider a
matched pair: $y_{1i} = \mu_1 + e_{1i}$, $y_{2i} = \mu_2 + e_{2i}$

Matched Samples
- You have $n$ of these pairs
  - Compute the difference between observations for each pair: $d_i = y_{1i} - y_{2i}$
  - Compute the sample mean and sample variance of the set of $n$ differences
- Then proceed to test a hypothesis about the mean difference between matched pairs just as you would test a hypothesis about any mean

Testing Differences between K > 2 Population Means
- A widely used statistical model is ANOVA
- Based on comparing the variation between the K samples to the variation within each sample
- The idea is that if the variation between samples is big compared to the variation within samples, then chances are the samples came from populations with different means

Testing Differences between K > 2 Population Means
- However, all inferences from ANOVA also follow from a regression model
- Regression is more flexible, makes it easier to add model extensions, and is the main statistical procedure used in economics
- So we will not separately cover ANOVA but will take up regression next week

Examples of Difference between Means Tests
- Consider "Does the Death Penalty Deter Murder?" by Tammra Hunt
- Compare murder rates with and without the death penalty
  - Cross-section of states: is the murder rate higher on average in states without executions?
  - Time series of states: did the murder rate fall in states implementing the death penalty when allowed by the Supreme Court?
  - Panel data allows an approach based on differences-in-differences

Between-State Differences, 2003
- Consider two populations: A with the death penalty and B without the death penalty
  $y_{iA} = \mu_A + e_{iA}$, $y_{iB} = \mu_B + e_{iB}$
  $H_0: \mu_A \geq \mu_B$, so $\delta = \mu_A - \mu_B \geq 0$
  $H_A: \mu_A < \mu_B$, so $\delta = \mu_A - \mu_B < 0$
- Note the alternative is one-sided because the research hypothesis is that the death penalty deters murder. Thus the t-statistic must be large and negative to reject the null in favor of the alternative

Testing the Null Hypothesis
- The test can be conducted by computing the t statistic (note one sample size is less than 30) manually
- Or it can be conducted
automatically using statistical software or Excel
- In Excel select Tools (old version) or Data (new version), then
  - Data Analysis
  - t-Test: Two-Sample Assuming Equal Variances

Between-State Differences, 2003
t-Test: Two-Sample Assuming Equal Variances

                              Death Penalty    No Death Penalty
                              Murder Rate      Murder Rate
Mean                          5.33             2.88
Variance                      6.27             2.61
Observations                  38               12
Pooled Variance               5.43
Hypothesized Mean Difference  0
df                            48
t Stat                        3.17
P(T<=t) one-tail              0.0013
t Critical one-tail           1.677
P(T<=t) two-tail              0.0027
t Critical two-tail           2.011

Comments
- Note that there is no real need to conduct the test, since the sample mean murder rate is higher with the death penalty than without
  - There's no way for that evidence to lead us to reject the null in favor of the alternative that the rate is lower with the death penalty
  - But if we do the test, the computed test statistic of 3.17 is not less than (i.e., is not more negative than) the 5% critical value for the one-tail test of -1.68
  - So we do not reject the null

Comments (Cont.)
- So, roughly speaking, at the 5% significance level we find no evidence that the murder rate is lower in states with the death penalty
  - That's a perfectly legitimate conclusion
- What happens when we go a step further: can we say that the death penalty evidently does not deter murder?
  - Heterogeneity between states
  - Reverse causation

Within-State Differences
- What if we consider within-state differences in murder rates?
  - The idea is that if the death penalty deters murder, the murder rate should fall after the penalty is implemented
- Take the states that adopted the death penalty after the Court allowed it
  - Get the mean murder rate for each of these states over the 4 years before and the 4 years after
  - Test whether the mean is lower after than it is before
- This is a matched pairs test
  - Each state's before period is matched to its after period

Testing the Within-State Difference
- The test can be conducted manually
  - Compute the After - Before difference for each state with the death penalty
  - Get
the sample mean and variance of these differences
  - Test the null hypothesis that the difference is zero against the alternative that it is negative
- Or it can be conducted automatically using statistical software or Excel
- In Excel select Tools (old version) or Data (new version), then
  - Data Analysis
  - t-Test: Paired Two Sample for Means

Within-State Differences
t-Test: Paired Two Sample for Means

                              Before 4-Year Mean   After 4-Year Mean
Mean                          [illegible]          [illegible]
Variance                      14.2                 15.2 (further digits illegible)
Observations                  38                   38
Pearson Correlation           [illegible]
Hypothesized Mean Difference  0
df                            37
t Stat                        0.64
P(T<=t) one-tail              0.26
t Critical one-tail           1.687
P(T<=t) two-tail              0.53
t Critical two-tail           2.026

Comments
- This time we see that the murder rate did decline slightly after reintroduction of the death penalty
  - So we do the test to determine whether the decline in the murder rate is statistically significant
  - I.e., is it larger than could plausibly be due to chance sampling variation?
  - The computed test statistic of 0.64 is not greater in absolute value than the 5% critical value for the one-tail test of 1.69
  - So we do not reject the null in favor of the alternative that the murder rate fell

Comments (Cont.)
- So, roughly speaking, at the 5% significance level we find no evidence that the murder rate is lower in states after reintroduction of the death penalty
  - That's a perfectly legitimate conclusion
- What happens when we go a step further: can we say that the death penalty evidently does not deter murder?
  - Trend
  - Other uncontrolled temporal influences

Differences in Differences
- Are there obvious problems with comparing murder rates between states with & without executions?
  - Many of these do not occur in within-state comparisons
- Are there obvious problems with comparing murder rates within states before and after the death penalty is enacted?
  - Many of these do not occur in between-state comparisons
- What if you could simultaneously use both within-state and between-state differences?
  - Would this compound the separate problems of the two tests
or would it eliminate many of the separate problems?

Differences in Differences
  - It turns out that differences-in-differences limits the scope for many of the problems of the two tests just considered
  - Taking within-state differences removes a lot of heterogeneity between states and limits the scope for reverse causality
  - Taking between-state differences in the within-state differences limits the harm from trend or other unobserved temporal factors, to the extent that these are national in scope

Differences in Differences
- Compute the within-state differences (before and after the Supreme Court allows the death penalty) for states that use executions and states that do not
  - Compute the mean of these differences in both groups of states
  - Take the difference between these two means: a difference-in-differences
- Test the null hypothesis that the mean difference-in-differences is zero

Testing the Difference in Differences
- To conduct the test
  - Compute the After - Before difference for each state, with and without the death penalty
  - Get the sample mean and variance of these differences separately for the with and without states
  - Test the null hypothesis that the with vs. without difference is zero against the alternative that it is negative: an ordinary difference between means test for two independent samples
- Do the first step manually and then the second two either manually or using statistical software or Excel

Testing the Difference-in-Differences
t-Test: Two-Sample Assuming Equal Variances
[Excel output comparing "Difference with death penalty" and "Difference without death penalty". Rows: Mean, Variance, Observations, Pooled Variance, Hypothesized Mean Difference, df, t Stat, P(T<=t) one-tail, t Critical one-tail, P(T<=t) two-tail, t Critical two-tail. The numeric values are not legible in the source.]

Conclusions

Discussion

Tests concerning
Population Proportions
- Often we want to make inferences about how frequently an event occurs, i.e., in what proportion of possible cases
  - Example: before an election we conduct a poll to estimate the proportion of likely voters who will vote for a particular candidate
- We may want to test this proportion for one population
  - E.g., is it greater than 50%?
- Or perhaps we want to compare between two populations
  - E.g., is it the same for men and women?

The Two-Outcome Case
- Start with the case of two possible outcomes
  - A startup fails or not
  - A student graduates in four years or not
  - A patient lives or dies
- $x = 0$ if the event does not occur, $x = 1$ if the event does occur
- Let $\theta$ denote the unknown true population probability that the event occurs. So in the population the proportion of cases where $x = 1$ is $\theta$

Sampling
- Draw a simple random sample of $n$ observations on $x$
  - The number of trials is $n$
  - The number of occurrences is $X = \sum_{i=1}^{n} x_i$
- This number of occurrences would vary randomly between different possible samples. In repeated sampling its mean and variance are
  $E(X) = n\theta$, $V(X) = n\theta(1 - \theta)$

The Sample Proportion
- Let $p = X/n$ be the proportion of cases in the sample in which the event occurred
- The sample proportion would vary randomly over the many possible samples that exist. In repeated samples its mean and variance are
  $E(p) = (1/n)E(X) = \theta$, $V(p) = (1/n^2)V(X) = \theta(1 - \theta)/n$
- These results follow immediately from the observation that $p$ is just the sample mean of the $x$'s

Sampling Distribution of Sample Proportion
- Once we recognize that the sample proportion is a sample mean, we can apply the CLT as long as $n$ is large
- For the sample proportion $p = \frac{1}{n}\sum_{i=1}^{n} x_i$:
  $E(p) = \theta$, $V(p) = \theta(1 - \theta)/n$, $p \sim N(\theta, \theta(1 - \theta)/n)$

Standardizing
- So we proceed as if testing a mean, recognizing that the standardized proportion will follow a standard normal distribution:
  $z_p = \frac{p - \theta}{\sqrt{\theta(1 - \theta)/n}} \sim N(0, 1)$

Testing One Population Proportion
- We want to use sample information to infer whether the true population proportion equals some value, say $k$:
  $H_0: \theta = k$, $H_A: \theta \neq k$
- If $H_0$ is true then $p \sim N(k, k(1 - k)/n)$ and
  $z = \frac{p - k}{\sqrt{k(1 - k)/n}} \sim N(0, 1)$
- So
if the null hypothesis is true, the $z$ computed as above should be zero except for sampling variation

A Dumb Example
- A 2004 poll of Orange County residents indicated that 70% believe building a new arena for the Orlando Magic is "very unimportant" for the county
- Can we conclude that at least 1/2 of all residents view the arena as very unimportant?

Example
- $H_0: \theta = 0.5$, $H_A: \theta > 0.5$
- Under $H_0$, $p \sim N(k, k(1 - k)/n)$ and
  $z = \frac{0.7 - 0.5}{\sqrt{0.5(1 - 0.5)/n}} = 5.466$ (the sample size $n$ is not legible in the source)
- We reject the null hypothesis at 5% significance, so we are confident that more than 50% of the population views a new arena as very unimportant

Comparing Two Population Proportions
- Classic research questions involve comparisons between groups
- Consider two populations, A and B: is the population proportion the same in each?
- There is a test for a two-sample difference in proportions
- But it is a special case of the more general chi-square test for contingency tables
  - So we'll consider the contingency table test only

Contingency Tables
- Contingency tables are a cross-tabulation of the frequency of occurrence of events
- Consider a 2 x 3 case
  - The row variable is gender, with 2 possible values
  - The column variable is vote, with 3 possible values: Bush, Kerry, Nader
- We draw a sample of size $n$, for example in an exit poll
  - Then tabulate how many people of each gender voted for each candidate

Building the Contingency Table
- Use the symbol $a$ to denote the actual sample frequency of votes in each cell of our table
- For example, $a_{FB}$ = number of females (F) voting for Bush (B), $a_{MK}$ = number of males (M) voting for Kerry (K), etc.
- This is a 2 x 3 contingency table cross-tabulating vote by gender

Contingency Table Tests
- The most common hypothesis to test is independence or homogeneity
  - Vote is independent of gender
- If the null hypothesis is true, then the proportion of, for example, females voting for Bush should equal what?

Expected Cell Frequencies
- Under the null hypothesis of independence, the proportion of females voting for Bush should be the same as the proportion of males voting for Bush
  - Both
proportions should equal the overall sample proportion of people voting for Bush
  - In fact, the proportion of either gender voting for any candidate should equal the overall sample proportion voting for that candidate
- We use this information to compute expected, as opposed to actual, cell frequencies

Building the Expected Contingency Table
- Use the symbol $e$ to denote the expected, as opposed to actual, sample frequency of votes in each cell of our table
- For example, $e_{FB}$ = expected number of females (F) voting for Bush (B), $e_{MK}$ = expected number of males (M) voting for Kerry (K), etc.

Expected Cell Frequencies
- Under the null hypothesis of independence, we expect that the proportion of females voting for Bush equals the proportion of the sample voting for Bush:
  $\frac{e_{FB}}{a_F} = \frac{a_B}{n}$, so that $e_{FB} = \frac{a_F \times a_B}{n}$
  where $a_F$ is the female row sum and $a_B$ is the Bush column sum
- In general, the expected frequency in row $i$, column $j$ of the table is given by
  $e_{ij} = \frac{(\text{row } i \text{ sum}) \times (\text{column } j \text{ sum})}{n}$

Testing Independence
- The test is based on comparing actual to expected cell frequencies
- Under the null of independence, $a_{ij}$ should equal $e_{ij}$ in every cell (all 6 in our 2 x 3 example), apart from sampling variation
- So our test basically is to add up the differences between actual and expected frequencies over all the cells
- But we have to account for
  1. Making sure positive differences do not cancel out negative differences, and
  2. Measuring the differences relative to something

Testing Independence
- Our measure of divergence between actual and expected cell frequencies, then, is $\frac{(a_{ij} - e_{ij})^2}{e_{ij}}$
- We add this over all cells to compute
  $\chi^2 = \sum_{i=1}^{M}\sum_{j=1}^{N}\frac{(a_{ij} - e_{ij})^2}{e_{ij}}$
  where there are $M$ rows and $N$ columns, and so $M \times N$ cells in the sum
- This is a chi-square test statistic with degrees of freedom equal to $(M - 1) \times (N - 1)$

Example of Contingency Table Test
- Consider "Election Economics: Is the Economy the Driving Issue behind Voters' Decisions?" by Jennifer Purdy
- Survey of $n = 150$ likely voters
  - Data include an indicator of the gender of the likely voter and the presidential candidate the voter plans to vote for

Example of Contingency Table Test

           Bush   Kerry   Nader   Row sum
Female       29      45       0        74
Male         41      35       0        76
Col sum      70      80       0       150

- This would be a 2 x 3 contingency table. But since no one in the sample plans to vote for Nader, we may as well delete him from the analysis, leaving a 2 x 2 table

Example of Contingency Table Test

ACTUAL     Bush   Kerry   Row sum
Female       29      45        74
Male         41      35        76
Col sum      70      80       150

EXPECTED   Bush    Kerry
Female     34.53   39.47
Male       35.47   40.53

- The expected female-for-Bush frequency is calculated as 70 x 74 / 150 = 34.533, and so on

Testing Independence of Vote and Gender
- Compute the square of (actual - expected) in each cell and divide each by the expected frequency. Then sum these four values to get the test statistic

           Bush      Kerry
Female     0.88662   0.7758
Male       0.86328   0.7554
Chi-Square 3.28106

- For the female-Bush entry, take $(29 - 34.533)^2 / 34.533 = 0.887$, and so on

Testing Independence of Vote and Gender
- The computed test statistic is 3.28
- The critical value of the chi-square with one degree of freedom, $1 = (2 - 1) \times (2 - 1)$, at a 5% significance level is 3.84
- The computed value is less than the critical value, so we do not reject the null hypothesis of independence
- In the sample, more males plan to vote for Bush and more females for Kerry, but the difference is not large enough to reject the null

Comments
- The null hypothesis is that planned presidential vote is independent of gender
  - We do not reject that hypothesis at 5% significance
- As noted earlier, the test for a difference between two proportions is a special case of the contingency table test
  - That case is specifically the 2 x 2 table case just considered
  - Our test is equivalent to a test of the null that the proportion of males voting for Bush (or Kerry) equals the proportion of females voting for Bush (or Kerry)
  - We do not reject equality of proportions at 5%

Comments (Cont.)
- Specifically, the test statistic for the difference in proportions test is the square root of our computed chi-square, which is a standard normal under the null
  - The square root of our 3.28 is 1.81
  - This is less than the critical value of $z$ for a two-tail test at
the 5% level of 1.96

Research Methods in Economics ECO 4451
Reviewing the Literature

Review
- Previously we discussed developing a research plan from a general area or question of interest
- Now we consider reviewing the literature related to your research question

Outline
- What's the literature?
- Why review the literature?
- How to review the literature
  - Finding sources
  - Critical reading
  - Writing the review

The Literature
[Text of this slide is not legible in the source.]

Why Review the Literature
- Educate yourself
  [Sub-points not legible in the source.]
- Educate your reader
  [Sub-points largely illegible; they refer to what is known and unknown and to a broader scheme of knowledge.]

Finding Sources
- Data (numerical): prices, wages, incomes
- Prior literature

Reviewing the Literature: Three Sources
- General sources: for the informed public
  - Popular press: newspapers, magazines [remainder illegible]
- Secondary sources: summaries or syntheses of primary research
  - Review articles, books [remainder illegible]
- Primary sources: first-hand accounts of actual research
  [Examples not legible in the source.]

General Sources
[Text largely illegible; it indicates that general sources help introduce you to a topic.]

Secondary Sources
- Secondary sources are extremely useful in bringing you up to speed and in identifying primary sources
  [List of journals not legible in the source.]

Primary Sources
- Primary sources are the most important
  [Sub-points largely illegible; they refer to journal articles and research reports as first-hand accounts of research.]

Primary Sources: How Scholarly Journals Work
- A researcher submits an article to the editor of the journal
- The editor of the journal decides whether to send the article out to experts who review it for publication [wording partly illegible]

The Peer Review Process
- The referees are peer reviewers [remainder of slide illegible]

The Peer Review Process (Cont.)
- Eventually the referees
and editor may agree that the paper should be published
- [Second point largely illegible; it concerns how the peer review process helps.]

Peer Review & Research Quality
- One reason why peer review is considered [illegible] involved in peer review
- Another reason is that you get the [illegible] limitations that are sometimes ignored in [illegible]

Peer Review & Quality (Cont.)
- Peer review alone cannot guarantee quality
  [Sub-points not legible in the source.]
- But peer review is an important standard [remainder illegible]

Finding the Relevant Literature
- Electronic search engines are a good start, especially EconLit
- Search on keywords [remainder illegible]

From One Source to Many
[Text of this slide is not legible in the source.]

Using Review Articles
- Another strategy is to find a good review article [remainder largely illegible; it notes that these are often comprehensive]

Other Approaches
- Examine the tables of contents of journals in the area
- [Second point illegible; it refers to journals and a web page.]

Literature Search Information
- The document "Searching the Literature" [description illegible]

Literature and Data Search Information
- The document "Searching for Research and Data" provides more information about databases that you may find useful
  [Sub-points largely illegible; one refers to data sources.]

Literature Search Example
- A student wanted to explain [illegible] factors
  [Remaining text of this slide is not legible in the source.]

Types of Research Articles in Economics
- Survey articles [remainder illegible]
- [Other article types not legible in the source.]

The Usual Setup for Empirical Research
[Text of this slide is not legible in the source.]
[Continuation of the preceding slide; not legible in the source apart from references to empirical results and a concluding discussion.]

The Argument
- The article is an argument from evidence [wording partly illegible]
- Read the article to [illegible] the argument

Understanding the Argument
- What is the research question?
  [Remaining points not legible in the source.]

Evaluating the Argument
[Text of this slide is not legible in the source.]

Understanding Technical [illegible]
- Mathematical reasoning [remainder illegible]
- Econometric [remainder illegible]

Writing the Literature Review
- Take the perspective that you are writing to an economist who is not a specialist in the area you are investigating
- Explain to the reader [remainder illegible]

Writing the Literature Review (Cont.)
- Explain to the reader
  - What are the problems, weaknesses, limitations of earlier work
  [Remaining points not legible in the source.]

Writing the Literature Review (Cont.)
[Text of this slide is not legible in the source; it distinguishes more important from less important related studies.]

Writing the Literature Review (Cont.)
- Do more than summarize: evaluate, compare [remainder illegible]
- Make the review reach some kind of conclusion
- Make the review point to your research as the logical next step

Writing the Literature Review (Cont.)
- Note that articles in peer-reviewed [remainder illegible]

Citing Literature
- To emphasize a reference, include the last name [remainder illegible]
  - Smith (1776) introduced [remainder illegible]

Citing Literature (Cont.)
[Text of this slide is not legible in the source.]

The Reference List
- [First point largely illegible; references are listed in alphabetical order by first author's last name.]
- The general idea is that a reader needs to [illegible] based on your reference

Common Errors in Lit
Reviews
- Not enough attention devoted to prior research published in peer-reviewed journals.
- Not enough information provided about the prior research.
- Writing a statement of opinion rather than a review of literature.

Common Errors in Lit Reviews (Cont.)
- Not reaching any conclusion.

Surveys
ECO 4451 Research Methods

Research Design Approach

Field Methods

Surveys
- Consumer preferences, attitudes, behaviors
- Business practices
- Other information

Strengths of Surveys
- Gather all kinds of information

Weaknesses of Surveys

- Personal interviewing
- Telephone interviewing
- Self-administered
- Computer administered
- Mail
- Internet

- Creation and selection of measurement questions
- Sampling issues and contact/callback procedures
- Instrument design
- Data collection processes

Personal Interview
- Two-way conversation initiated by interviewer to obtain information from the respondent

Interviewer vs. Respondent
- Strangers
- Interviewer controls topics and patterns of discussion
- Respondent:
  - Insignificant consequences
  - Asked to provide information
  - Does not receive benefits

Personal Interview Advantages
- Yield a high percentage of returns
- Can be made to yield a representative sample of the general population
- Depth, accuracy and detail of information greater since interviewer has more options, power and control
- Interview length does not affect refusal rate

Personal Interview Disadvantages
- Higher costs; slower response rates reported in some areas
- More time-consuming
- Human equation may distort results
- Interviewer bias
- General discomfort talking to strangers

Personal Interview Requirements for Success

Personal Interview Requirements for Success (Cont.)
3. Participant must receive adequate motivation to participate
  - Motivation is the responsibility of the interviewer
  - Respondents can be motivated to participate and even enjoy the experience

[Figure: Respondent Motivation (axis: direction of influence)]

Interview Technique
- Respondent must believe that the experience will be somewhat pleasant and satisfying
- Respondent must believe that answering the survey is an important use of his or her time
- Respondent must dismiss any mental reservations that he or she might have about participation

Interview Technique
- Introduction and first impression
- Busy or unavailable participants
- Establish good relationship
- Gathering the data by probing
- Recording the interview
- Selection and training of interviewers

Interview Technique: Probing
- Probing is the technique of stimulating complete and relevant answers
  - Phrase questions so responses are not biased
  - Avoid questions that make assumptions about your subjects
  - Avoid using words that might have different meanings to you and your subjects

Interview Technique: Probing
- Probing styles
  - Brief assertion of understanding and interest
  - Expectant pause
  - Repeating the question
  - Repeating the respondent's reply
  - Neutral question or comment
  - Question clarification

Interview Problems
1. Biased results
  - Sampling error
  - Nonresponse error
  - Response error

Nonresponse Error
- Cannot locate person whom you are supposed to study
- Unsuccessful in motivating person to participate
- Solutions:
  - Establish and implement callback procedures
  - Create nonresponse sample and weight results
  - Substitute another individual for the missing participant

Response Error
- Data reported differs from actual data
- Participant-initiated error
- Interviewer error

- Reported data might differ from the actual data under the following circumstances

Interview Problems
2. Costs

- Published phone directories
- Code/exchange number with random digit dialing
- Privately accumulated local directories

Advantages of Telephone Surveys
- Moderate cost
- Use of fewer skilled interviewers than personal interview
- Can use a much larger sample over a larger geographical area
- Computer aid
  - Computer-assisted telephone interviewing (CATI)
  - Computer-administered telephone interviewing

Disadvantages of Telephone Surveys

Self-Administered Surveys

Types of Self-Administered Surveys
- Examples of settings: restaurants, hotels, car dealerships, etc.

Advantages of Self-Administered Surveys

Web Surveys

Disadvantages of Self-Administered Surveys

Suggestions for Improvement
- Make surveys easy to read
- Offer clear response directions
- Include personalized communication
- Include informational cover letter
- Include researcher contacts to encourage response

Total Design Method
- Resemblance to advertising is avoided
- Cover page is eye-catching
- Last page used to invite comments
- Questions ordered from most interesting to more difficult
- First question is easy & interesting
- Format is consistent, easy to follow
- Cover letter & confidentiality
- Respected sponsor
- Postcard reminder/thank you 1
  week later
- New questionnaire in 3 weeks
- Final questionnaire in 7 weeks

Selecting an Optimal Survey Format

Nature of Measurement

Nature of Measurement (Cont.)
- Measurement is a three-part process:
  1. Selecting observable empirical events
  2. Developing a set of mapping rules
     - Mapping rules are a scheme for assigning numbers or symbols to represent aspects of the event being measured
  3. Applying the mapping rules to each observation of that event

Nature of Measurement (Cont.): What Is Measured
- Objects: things of ordinary experience (tables, people, books and cars)
- Intangibles: attitudes and peer group pressures

Data Types
- Four characteristics:
  - Classification: numbers are used to group or sort responses; no order exists
  - Order: numbers are ordered
  - Distance: the differences between numbers are ordered
  - Origin: number series has a unique origin indicated by the number zero

Data Types (Cont.)
- Nominal data

Data Types (Cont.)
- Ordinal

Data Types (Cont.)
- Interval
- Ratio

Error Sources
- Respondent
- Situational factors
- Measurer
- Instrument
  - Too confusing and ambiguous
  - Poor selection of content

Characteristics of Sound Measurement
- Three characteristics:
  - Reliability
  - Validity
  - Sensitivity

Reliability
- Reliable measurements are free from error
- Repeatability: stability of measures over time
  - Test-retest method
- Internal consistency
  - Split-half method: test half the measurement scale items against the other half

Validity
- Valid measurement instruments measure what they are intended
  to measure
- Face (content) validity
  - Subjective agreement among experts that scale appears to measure what it purports to measure
- Criterion (convergent) validity
  - Ability of one measure to correlate with other measures of same item

Validity

Sensitivity

Measurement
- Hourly wage
- Quarterly GDP
- Simple, but many measurement decisions are involved

- Convert nominal to real using a price index such as CPI
- Index number: a composite measure of separate items
- Price index: a weighted average of some set of prices

Converting Nominal to Real
- Real (year s dollars) = Nominal (year t dollars) / [CPI(year t) / CPI(year s)]

Converting Nominal to Real (Cont.)

Per Capita Measures
- Divide by relevant population
- Key is choosing relevant population for denominator
- E.g., unemployment rate: UE rate = number UE / number in LF

Growth Rates
- 2002 US GDP: $10,446.2 billion
- Change = New - Old = $364 billion
- Annual percentage change or growth rate:
  [(Year t2 / Year t1) - 1] x 100 = [(10,446.2 / 10,082) - 1] x 100 = 3.61

Annualized
- Convert quarterly growth to equivalent annual rate: [(Q_t / Q_(t-1))^4 - 1] x 100
- For monthly, exponent is 12

Ratios
- Examples: Cons/GDP, Saving/GDP
- Employment to Population Ratio

Exchange Rate Conversions
- Monetary values of different countries
- Convert to common value, e.g., US dollars
- Similar to inflation adjustment but using exchange rates
Measurement of a concept uses indicators of dimensions of the concept:
- Dimension is an aspect of concept, e.g., taking bribes and diverting public property to private gain would be examples of dimensions of corruption
- Many concepts have numerous dimensions
- Indicator shows presence/absence of dimension, or degree to which dimension is present/absent
- Combine indicators into:
  - Index: unweighted, e.g., sum
  - Scale: measure in which "more important" indicators receive greater weight

Levels of Measurement
- From lowest to highest level:
  - Classification: numbers are used to group or sort responses; no order exists
  - Order: numbers are ordered
  - Distance: the differences between numbers are ordered
  - Origin: the number series has a unique origin indicated by zero

Data Types (Cont.)
- Nominal or categorical data

Data Types (Cont.)
- Ordinal

Data Types (Cont.)
- Interval

Data Types (Cont.)
- Ratio

Measurement Quality
- Precision
- Reliability
- Validity

Precision
- Annual income $40,442.37 vs. annual income between $40,000 and $49,999

Reliability
- Test-retest method
- Split-half method: test half the measurement scale items against the other half

Validity
- Valid measurement instruments measure what they are intended to measure
- Face validity
  - Subjective agreement among experts that scale appears to measure what it purports to measure
- Criterion validity
  - Ability of one measure to correlate with external criterion such as behavioral outcome

Validity
- Criterion validity (Cont.)
  - Does SAT predict college
    success?
- Construct validity

Research Methods in Economics
ECO 4451
Experiments

Experiment
- A research method that controls conditions so that one or more variables can be manipulated to test a hypothesis
  - Manipulated variables are independent variables
  - Effects of independent variables on one or more dependent variables are measured
  - Randomization and control are used to remove effects of extraneous variables on the dependent variables
  - Thus effects of the independent variables are isolated
  - Allows evaluation of cause and effect

Experimental Economics
- Experimental economics is a field that uses controlled experiments to test economic hypotheses
  - Market experiments
  - Game (strategic interaction) experiments
  - Individual decision-making experiments

Experimental Economics (Cont.)
- Experimental economics emphasizes maintaining control over incentives that influence behavior
  - Often construct an incentive scheme that rewards subjects differentially based on their performance in the experiment

Experimental Economics (Cont.)
- Three key features of incentives in economic experiments:
  - Monotonicity: subjects must prefer more to less of the item used for reward
  - Salience: rewards must depend on actions, and subjects must understand how their rewards will depend on their actions
  - Dominance: the main incentive operating in the experiment must be the reward offered; other incentives must be negligible

Experimental Settings
- Laboratory experiments: conducted in a lab or other artificial setting to achieve maximum possible control over the experiment
- Field experiments: conducted in natural setting where behavior of interest normally occurs

  Laboratory Experiment              Field Experiment
  Artificial, low realism            Natural, high realism
  Few extraneous variables           Many extraneous variables
  High control                       Low control
  Low cost                           High cost
  Short duration                     Long duration
  Subjects aware of participation    Subjects may be unaware of participation

Experimental Design
- Select independent variables and
  decide how to manipulate
- Select dependent variable and decide how to measure
- Select and assign test units
- Remove effects of extraneous variables

Experimental Design - Select & Manipulate Independent Variables
- Independent variables are those factors whose effects you wish to measure
- You must be able to set the independent variable at different levels or in different categories
- An experimental treatment is a particular setting of the independent variables

Experimental Treatments
- Example: you want to test the effect on sales (dependent variable) of media advertising and price (independent variables)
- Suppose you plan to manipulate each of your independent variables at two levels
  - Price: high vs. low
  - Advertising: radio vs. newspaper
- Then there are four possible experimental treatments
  - High price/radio, low price/radio, high price/newspaper, low price/newspaper
- These four treatments might be administered in four separate test markets

Experimental Design - Select Dependent Variables & Decide How to Measure
- Dependent variables are the effects or outcomes you wish to measure
  - Productivity
  - Turnover
  - Sales
- Choose measurement: instrument, frequency, etc.

Experimental Design - Select & Assign Test Units
- Test units are subjects or entities whose response to experimental treatments is to be measured
  - People: consumers, workers, traders
  - Firms
  - Departments or other groups of people
- Test units must be selected into the experiment and assigned to experimental treatments

Select & Assign Test Units
- Ideally, selection into the experiment and assignment to treatments would be randomized
- Typically, selection into experiments is not random while assignment to treatments is
- Randomization: assignment of test units to experimental treatments by chance
- Matching: an alternative method of assignment that tries to match the subjects assigned different treatments according to pertinent characteristics
  - Classic example is a twins study
  - Other examples include matching by age or experience

Experimental Design - Remove
Effects of Extraneous Variables
- Control and Randomization
- Control: the ability to hold certain conditions constant while manipulating the independent variable
  - Defines basic difference between experimental and other research

Control: Effects of Extraneous Variables
- Sometimes can literally remove extraneous influences
  - Packaging, labels, other indicators of brand in a taste test
- Probably most common control is through constancy of conditions
  - Test units exposed to identical situations except for variation in experimental treatment

Control: Effects of Extraneous Variables (Cont.)
- Other controls used for particular types of extraneous factors include blinding and counterbalancing
- Blinding
  - Prevent subjects from knowing whether they have been assigned a particular treatment
  - Double blind: prevent subjects and experimenter from knowing
  - Placebo-controlled drug trial is classic example

Control: Effects of Extraneous Variables (Cont.)
- Counterbalancing
  - Eliminate order-of-presentation bias by varying order of treatments
  - Used when same subjects administered more than one experimental treatment

Experimental Design - Remove Effects of Extraneous Variables
- Control and Randomization
- Randomization: random assignment of subjects to experimental treatments
  - Different levels of extraneous variables are equally likely to occur for all treatments
  - With large number of subjects, extraneous variables are unrelated to treatments because of random assignment

Experimental Validity
- Two broad ways of judging experiments
- Internal validity: is the experimental treatment the true cause of the change in a dependent variable, or is the treatment confounded with some extraneous variable?
- External validity: can the results of the experiment be generalized to a broader population or institutional setting?
- Good experiments often are strongest on internal validity but may be lacking in external validity

Threats to Internal Validity
- History: uncontrolled events in environment
  that influence dependent variable
  - Key problem if occurs between measurements or if is correlated with assignment of experimental treatments
  - Special case is cohort effect, where different experimental groups have experienced different histories prior to experiment
- Maturation: changes in test units over time affect dependent variable
  - Learning or experience during experiment, fatigue, boredom
  - Key problem if correlated with assignment of treatments
    - E.g., would affect treatments assigned later in experiment

Threats to Internal Validity
- Testing: when posttest responses are affected by learning or perception changes caused by pretest
  - Key problem is if subjects act differently in posttest because of pretest
- Instrumentation: an effect on results caused by changing measurement methods
  - Measurement instrument may be changed to avoid testing effect above, but then difference in results may be due to different measurement method rather than to treatment

Threats to Internal Validity
- Selection: when experimental results are influenced by assignment of test units to experimental treatments
  - Key problem is when assignment creates correlation between treatments and extraneous factors
  - Self-selection into groups is classic example

Threats to Internal Validity
- Mortality or attrition: when some subjects withdraw from experiment before completion
  - Key problem if extraneous variables cause differential attrition from different experimental treatments
- Statistical regression: regression to mean
  - Key problem if unusually high or low scores used to assign treatments

Increasing Internal Validity
- Control group
- Random assignment
- Pretesting and posttesting
- Posttest only

Threats to External Validity
- Use of test units who are not representative of population
  - Poor sampling design
  - Excessive reliance on college students
- Limited set of experimental treatments may not represent interactions of untested influences that occur in real life

Increasing External Validity
- Draw
  representative sample of test units
- Conduct field experiments after initial work in lab experiments
- Test parallelism between lab and field results

Threats to Internal and External Validity
- Demand characteristics: experimental design procedures that unintentionally hint to subjects about experimenter's hypothesis or about desired outcomes
  - Experimenter bias: when experimenter influences actions of subjects
  - Guinea pig effect: when subjects change their behavior to influence outcome of experiment
  - Hawthorne effect: an effect caused by subjects just knowing that they are participants in an experiment

Experimental Design
- Pre-experimental designs
- True experimental designs
- Quasi-experiments

Pre-Experimental Designs
1. One-Shot Case Study
  - Perform treatment
  - Observe results
2. One-Group Pretest-Posttest Design
  - Perform pretest
  - Manipulate/treat/intervene
  - Posttest
3. Static Group Comparison
  - Select experimental group and control group
  - Administer treatment to one
  - Posttest both

One-Shot Case Study
  X (treatment)   O (posttest)
- Does not measure effect of treatment: no comparison
- Note:
  R = random assignment of subjects into groups
  O = measurement, observation or test
  X = treatment or manipulation

One-Group Pretest-Posttest
  O1 (pretest)   X (treatment)   O2 (posttest)
- Measures effect of treatment (O2 - O1) within subjects only
- No control group without treatment, thus effect could be due to any number of extraneous factors, including demand characteristics

Static Group Comparison
- Measures effect (O2 - O1) between subjects only
- No pretest: again, effect could be due to extraneous differences

Three True Experimental Designs
1. Pretest-Posttest Control Group Design
  - Random assignment
  - Pretest both groups
  - Treat/manipulate experiment group
  - Posttest both groups
2. Posttest-Only Control Group Design
  - Random assignment
  - Treat/manipulate the experiment group
  - Posttest both groups
3. Solomon Four-Group Design
  - Random assignment to four groups: 2 experimental and 2 control
  - Pretest experimental group 1 and control group 1
  - Do not
    pretest experimental group 2 and control group 2
  - Treat/manipulate both experimental, not control, groups
  - Posttest all groups

True Experimental Designs

  Pretest-Posttest Control Group     Posttest-Only Control Group
  R  O1  X  O2                       R  X  O1
  R  O3     O4                       R     O2

- Pretest-posttest control group: to find X's effect, compute (O2 - O1) - (O4 - O3), a difference in differences
- Posttest-only control group: to find X's effect, compute O1 - O2
  - OK if groups known to be equal in important extraneous factors
  - Sometimes pretest not possible or not advisable due to testing or instrumentation effects

Solomon Four-Group Design

  Experimental Group 1   R  O1  X  O2
  Control Group 1        R  O3     O4
  Experimental Group 2   R      X  O5
  Control Group 2        R         O6

- Isolates testing or instrumentation effect from treatment effect
- Can isolate other extraneous factors from treatment

Advanced Experimental Designs Are More Complex
- Completely randomized
- Randomized block design
- Latin square
- Factorial

Examples of True Experimental Design Extensions
- Completely Randomized Design
  - Simplest form of true experiment
  - Differing levels of one independent variable randomly assigned to multiple groups
- Randomized Block Design
  - Adds onto completely randomized design
  - Control used instead of randomization to isolate effects of a single extraneous variable
- Latin Square Design
  - Adds onto randomized block design
  - Multiple extraneous variables can be measured
- Factorial Design
  - Measures main and interactive effects of two or more factors simultaneously

Completely Randomized Designs
- Example: Generic brand cannery wants to compare their sales to Name Brand cannery and the factor that price plays in consumer purchases
- Generic tests the sales of their product when it is $0.07, $0.12 and $0.17 less than the Brand Name price
- They randomly assign stores to sell at their set prices and monitor for two months

  Independent variable: price gap
  Level 1    Level 2    Level 3
  Group A    Group B    Group C

Completely Randomized Design with a Posttest Only

  Group A   R  X1  O1
  Group B   R  X2  O2
  Group C   R  X3  O3

- How to measure treatment effects?

Completely Randomized Design with a Pretest-Posttest

  Group A   R  O1  X1  O2
  Group B   R  O3  X2  O4
  Group C   R  O5  X3  O6

- How to measure treatment effects?

Randomized Block Design
- Used when:
  1. not confident that randomization will remove effects of extraneous factor, e.g., due to small sample sizes
  2. want to measure whether treatment effects differ according to extraneous factor

Randomized Block Design Example
- Generic Cannery thinks income has an effect on the response of sales to price
- They add income as a blocking factor in their experiment
- Note: each block group (income group) must receive each treatment
- Note: you can measure effect of price gap separately by income group; thus effects of income are blocked
- You can measure average price effect over groups (main effect) and difference in price effect between groups (interaction effect)

Latin Square Design
- May be used when need to control effects of two extraneous factors
- Use the two extraneous blocking factors as the rows and columns of a square table to set up design
- Divide each blocking factor into an equal number of levels, so that each cell in the table represents a unique combination of the blocking variables
- Then randomly assign treatments to cells in the table so that each given treatment appears only once in each row and each column
- Number of levels of each blocking factor must equal number of treatment levels of the active factor

Latin Square Design Example
- Generic wants to add an additional extraneous variable to measure: store size

Factorial Design
- Allows testing main effects and interactive effects of multiple independent variables
- Main effect: influence of one independent variable on dependent variable
- Interaction effect: influence of two or more independent variables together on dependent variable

Factorial Design Example
- The Generic company also wants to test effect of putting a price-per-unit label above the cans

[Figure: Example dependent variable measurements in a 2 x 2 factorial design. Interaction between gender and advertising copy; dependent variable: believability.]
- Note: dependent variable is larger for ad A for women, larger for ad B for men

                                     Independent Variable 1
                                     Level 1      Level 2
  Independent Variable 2   Level 1   Group A      Group B
                           Level 2   Group C      Group D

2 x 2 Factorial Design with a Posttest Only

  Group A   R  X11  O1
  Group B   R  X21  O2
  Group C   R  X12  O3
  Group D   R  X22  O4

2 x 2 Factorial with a Pretest-Posttest

  Group A   R  O1  X11  O2
  Group B   R  O3  X21  O4
  Group C   R  O5  X12  O6
  Group D   R  O7  X22  O8

Quasi-Experiments
1. Nonequivalent Control Group Design
2. Separate Sample Pretest-Posttest Design
3. Time Series Design

Nonequivalent Control Group Design
- Pretest-posttest control group without random assignment
- Main forms:
  - Intact Equivalent Design: subjects naturally divided into groups
  - Self-Selected Experimental Group Design: volunteers are recruited for experimental group

  O1  X  O2
  O3     O4

- Comparison of pretest results is an indicator of the degree of equivalence between test and control groups

Separate Sample Pretest-Posttest Design
- Random assignment
- 1 group pretested
- Treatment, then 2nd group posttested

  R  O1
  R      X  O2

- X does not have to be performed on the first group

Time Series Design
- Repeated observations before, during and after treatment
- Advantages: track changes over longer time periods; distinguish permanent from transitory changes

Student Example of Field Experiment
- Dawn Russell, "The effect of mimicry on tip percentage"
- Do servers get a larger percentage tip when they mimic the order by repeating back the full order to the customers?

Russell Mimicry Experiment
- Five servers at one restaurant on one night
- Each server reports data for 10 tables
  - 5 tables: mimic the order
  - 5 tables: not
  - Alternate mimic, not mimic
- Dependent variable: tip as % of bill
- Independent variable: whether mimic
- Control: restaurant, time, day and server
  - Blocking by server permits within-subjects analysis
- Randomization: whether mimic or not to given table

Russell Experiment Data

  Variable   Mean          Std. Dev.       Minimum        Maximum   Cases
  SERVER     3.00000000    1.42857143      1.00000000     5.00000   50
  MIMIC      .500000000    .505076272      .000000000     1.00000   50
  GENDER     .440000000    .501426536      .000000000     1.00000   50
  RACE       [illegible]   .494871659      .000000000     1.00000   50
  AGE        2.50000000    1.34392055      .000000000     5.0000    50
  TIP        .136400000    .5623420E-01    .100000E-01    .25000    50

- Servers coded 1-5
- mimic = 1 if mimic, 0 if not
- gender = 1 if female pays, 0 if male
- race = 1 if non-white, 0 if white
- age = 0 if paying customer's age in years appears to be < 20, 1 if 21-30, etc., to 5 if > 60

Russell Results - No server-specific effects

  OLS without group dummy variables
  Ordinary least squares regression
  Dep. var. = TIP     Mean = .1364000000, S.D. = .5623420E-01
  Model size:  Observations = 50, Parameters = 2, Deg.Fr. = 48
  Residuals:   Sum of squares = .1394640000, Std.Dev. = .05390
  Fit:         R-squared = .099954, Adjusted R-squared = .08120
  Model test:  F[1, 48] = 5.33, Prob value = .02530

  Panel data analysis of TIP [one way]
  Unconditional ANOVA (no regressors)
  Source     Variation     Deg. Free.   Mean Square
  Between    .19372E-01    4            .484300E-02
  Residual   .135580       45           .301289E-02
  Total      .154952       49           .316229E-02

Russell Results - No server-specific effects

  Variable   Coefficient   Standard Error   t-ratio
  MIMIC      .35200E-01    .15245983E-01    2.309
  Constant   .11880        .10780538E-01    11.020

Russell Results - With server-specific effects

  Least squares with group dummy variables
  Ordinary least squares regression
  Dep. var. = TIP     Mean = .1364000000, S.D. = .5623420413E-01
  Model size:  Observations = 50, Parameters = 6, Deg.Fr. = 44
  Residuals:   Sum of squares = .1200920000, Std.Dev. = .05224
  Fit:         R-squared = .224973, Adjusted R-squared = .13690
  Model test:  F[5, 44] = 2.55, Prob value = .04097

Russell Results - With server-specific effects

  Variable   Coefficient   Standard Error   t-ratio
  MIMIC      .35200E-01    .14776E-01       2.382

Interpretations
- Why is the coefficient of the mimic variable identical to the case where server-specific effects are not controlled?

  F tests (models: 1 = constant only, 2 = server effects only, 3 = MIMIC only, 4 = MIMIC and server effects)
  Comparison    F       num. d.f.   denom. d.f.
  (2) vs (1)    1.607   4           45
  (3) vs (1)    5.331   1           48
  (4) vs (1)    2.554   5           44
  (4) vs (2)    5.675   1           44
  (4) vs (3)    1.774   4           44
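One way to explore the interpretation question above is to simulate a design with the same structure: 5 servers, 10 tables each, with mimic and non-mimic tables alternating so that every server has exactly 5 of each. The sketch below is only an illustration under stated assumptions. The tips are synthetic (they are NOT the Russell data), the effect sizes are made up, and the helper name `ols` is hypothetical. It suggests that in such a balanced design the mimic dummy is uncorrelated with the server dummies, so adding server effects leaves the mimic coefficient unchanged and alters only its standard error.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

n_servers, tables_per_server = 5, 10
server = np.repeat(np.arange(n_servers), tables_per_server)
# Alternate mimic / not mimic within each server: 5 tables of each (balanced).
mimic = np.tile(np.array([1, 0] * (tables_per_server // 2)), n_servers)

# Synthetic tips (assumed values, not the Russell data):
# baseline + server effect + mimic effect + noise.
server_effect = rng.normal(0.0, 0.02, n_servers)
tip = 0.12 + server_effect[server] + 0.035 * mimic + rng.normal(0.0, 0.05, server.size)


def ols(X, y):
    """Return OLS coefficients and their conventional standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, se


# Model without server effects: constant + mimic (2 parameters).
X_pooled = np.column_stack([np.ones(server.size), mimic])
# Model with server effects: mimic + one dummy per server (6 parameters).
dummies = (server[:, None] == np.arange(n_servers)).astype(float)
X_fe = np.column_stack([mimic, dummies])

b_pooled, se_pooled = ols(X_pooled, tip)
b_fe, se_fe = ols(X_fe, tip)

mimic_pooled, mimic_fe = b_pooled[1], b_fe[0]
print(f"mimic coefficient, no server effects:   {mimic_pooled:.6f} (se {se_pooled[1]:.6f})")
print(f"mimic coefficient, with server effects: {mimic_fe:.6f} (se {se_fe[0]:.6f})")
```

The two printed coefficients agree up to floating-point error, while the standard errors generally differ because the server dummies change the residual variance and the degrees of freedom. If the mimic and non-mimic tables were not split evenly within each server, the two coefficients would typically no longer coincide.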