These 5 pages of class notes were uploaded by Dr. Janiya Bernier on Thursday, October 22, 2015. The class notes belong to ECON 190 at University of California - Berkeley, taught by Staff in Fall.




Department of Economics, University of California, Berkeley
Economics 190, Spring 2006
Professor Martha Olney

Stata Lesson
Thursday, February 15, 2006

1. Where to find the data sets: http://socs.berkeley.edu/~olney/spring06/econ154
   There are four versions of the dataset there: txt, excel, stata, and sas transport file. Download all four to your desktop.

1.1 To transfer an excel data set to stata format:
   - start the translate program
   - choose input file type (format): here, excel or SAS transport
   - browse to find the data set
   - choose output file type (format): stata
   - it will automatically suggest a file name
   - choose all variables and all cases
   - select transfer

1.2 To open the data in txt (ascii) format:
   - start stata
   - know where your data set is
   - type: insheet using path\001.txt

1.3 To read in data that are in fixed format:
   - Read the manual.
   - Use infix.
   - Create a dictionary file that specifies the locations of the variables and their new names.

2. To save your work: create a log file
   - File > Log > Begin, or click on the icon that looks like a scroll
   - Choose either a formatted log file (.smcl), readable only within stata, or a text log file (.log), readable within any word processor

3. To look at the data:
   describe
   list
      Careful: this will go on and on; use the red X to stop.
   Tabulate tells you how many occurrences there are of each observed value of a variable. It is useful for variables with just a few possible values and not useful for continuous variables. Including two variables gives you a crosstab table.
      tab variable_name
         gives a list of all the values and their frequencies
      e.g. tab occup
         How many teachers, principals, superintendents?
      e.g. tab occup sex
         By gender, how many teachers, principals, superintendents?
   Summarize tells you a variety of summary statistics.
      summarize variable_name, or sum variable_name
         gives brief summary statistics
      sum variable_name, detail
         gives a longer list of summary statistics

3.1 To look at the dataset sorted by some variable (two steps):
   Step one: sort variable_name
      e.g. sort sex
   Step two: by variable_name: sum variable_name
      e.g. by sex: sum totear

4. To start naming the variables:
   label variable variable_name "variable label"
      e.g. label variable pob "Place of Birth"

5. To name the values of variables (two steps):
   Step one: label define labelname value "name" value "name"
      e.g. label define sexlabel 1 "male" 2 "female"
   Step two: label values variablename labelname
      e.g. label values sex sexlabel

6. To create new variables:
   generate newvariable = some function of existing variables, or
   gen newvariable = some function of existing variables
      e.g. gen num_months_paid = totearn/earnmo

7. To create dummy variables:
   generate dummyvar = (existing variable == value for dummy to equal 1)
      e.g. generate male = (sex == 1)

7.5 To replace values of existing variables:
   replace varname = newvalue if varname == oldvalue
      e.g. gen female = sex
           replace female = 0 if female == 1
           replace female = 1 if female == 2

8. To replace missing values of existing variables:
   replace varname = . if varname == missingvalue
      e.g. replace age = . if age == missingvalue

9. To graph the data:
   - Click Graphics along the top menu.
   - Choose the type of graph (start with easy graphs).
   - Choose a particular graph (try a scatter plot to start).
   - Specify the x (horizontal axis) variable and the y (vertical axis) variables.
   - Specify any "if" restrictions.
   - Click submit, and the graph will pop up in a couple of seconds.
   - Edit your specifications as you wish; click submit again.
   - When you have it as you like it, save your graph by right-clicking anywhere on the graph.

10. To run a linear regression:
   regress depvariable independentvariables
      e.g. regress totear male
   To run a probit:
   probit depvariable independentvariables
      e.g. generate ownhome = (ownhm == 2)
           probit ownhome male totear age
   To get probit results that are directly interpretable:
   dprobit depvariable independentvariables
      e.g. dprobit ownhome male totear age

11. To include only part of the data set, add "if" statements.
      e.g. to restrict the probit to include only teachers:
           dprobit ownhome male totear age if occ == value for teachers

12. Creating interaction variables
   Suppose you think that gender matters, but not as a constant shift factor. You think instead that the effect of gender is to alter the return to college education. In that case, you need to create an interaction term:
      gen male_grad = male*grad
   Now include this as a variable in your regression, along with just grad. The coefficient on male_grad tells you whether men have a different return to being a college grad than women do.

13. Now let's play with the data set (see also the data exercise page on the class webpage).
   Estimate a relationship between total earnings and its determining variables. Think first about:
   a. How will you take age into account?
   b. Does gender matter? Does it shift the earnings function, or change some of the returns, or both?
   c. Do you want separate equations for teachers, superintendents, and principals, or do you just want a shift factor for different occupations?

14. do files
   When you have a lot of commands, write a do file, which contains all of the commands, and then have stata run the do file. Save the do file in ascii format with the suffix .do, for instance monday.do. Then type
      do monday, nostop
   nostop is an option that tells stata to keep going even if it encounters an error.

A good place to look for help, data sets to play with, do files, etc.: www.aw.com/stock_watson
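
As an illustration of the do file idea, the commands from this lesson could be collected into one file along the following lines. This is only a sketch, not the course's own do file. The dataset file name teachers.dta is hypothetical, the coding of sex (1 = male, 2 = female) is taken from the label example, and the variables age, grad, occup, and totearn are assumed to exist in the dataset under those names.

```stata
* monday.do: a sketch that strings together commands from this lesson.
* ASSUMPTIONS: teachers.dta is a hypothetical file name; sex is coded
* 1 = male, 2 = female; age, grad, occup, and totearn exist as named.

log using monday.log, text replace     // text log, readable in any word processor

use teachers.dta, clear                // hypothetical file name

describe                               // list variables and storage types
tab occup sex                          // crosstab of occupation by gender

label define sexlabel 1 "male" 2 "female"
label values sex sexlabel

generate male = (sex == 1)             // dummy: 1 for men, 0 otherwise
generate male_grad = male * grad       // interaction: male college graduates

sort sex
by sex: summarize totearn              // summary statistics by gender

* Earnings regression with a gender shift (male) and a gender-specific
* return to college (male_grad).
regress totearn age male grad male_grad

log close
```

Saved as monday.do in the working directory, this would be run with "do monday", or with "do monday, nostop" to keep going past errors.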


