
# Scientific Study Politic POL 051

UCD


These 99 pages of class notes were uploaded by Pierre Huel on Tuesday, September 8, 2015. The notes belong to POL 051 at University of California - Davis, taught by Bradford Jones in Fall. Since the upload, they have received 19 views. For similar materials see /class/187571/pol-051-university-of-california-davis in Political Science at University of California - Davis.


Date Created: 09/08/15

## The Scientific Study of Politics (POL 51)

Professor B. Jones, University of California, Davis

### Fun With Numbers

- Some univariate statistics

### Learning to Describe Data

- Research is empirically based; therefore we must work with data.
- You just did, with your plots.
- There are no statistics in the plots, but they do summarize information usefully.

[Figure: example plot; labels not recoverable]

### Visualizing Data

- Often the first place to start is with visualization.
- It works best with continuous data, but we'll learn tricks for understanding data measured at other levels of measurement.
- Start with an example.

### Useful to Visualize Data: Histogram

[Figure: histogram of the data]

### Main Features

- Exhibits "right skew". Contrast this with "left skew".
- Some outlying data points.
- Question: are the outlying data points also influential data points on measures of central tendency?
- Let's check.

### The Mean

- Formally, the mean is given by

$$\bar{Y} = \frac{Y_1 + Y_2 + \cdots + Y_N}{N}$$

- Or, more compactly:

$$\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$$

### Our Data

- The mean is 260.67.
- Mechanically: $(263 + 73 + \cdots + 88)/67 = 260.67$

### Problems with the Mean

- No indication of dispersion or variability.
- That is, it is a single indicator of central tendency, but is it a good indicator?
- What about variability around the mean?

### Variance

$$s^2 = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}{N - 1}$$

- Why $N - 1$?

### Our Data

- Variance: 202431.8
- Mechanically: $[(263 - 260.67)^2 + (73 - 260.67)^2 + \cdots + (88 - 260.67)^2]/66$
- Interpretation: the average squared deviation around $\bar{Y}$ is 202431.8.
- Who thinks in terms of squared deviations? Answer: no one.
- That's why we have a standard deviation.

### Standard Deviation

- Take the square root of the variance and you get the standard deviation.
- Why we like this: the metric is now in the original units of Y.
- Interpretation: the SD gives the "average deviation" around the mean.
- It's a measure of dispersion in a metric that makes sense to us.

### Our Data

- The standard deviation is 449.92.
- Mechanically: $\sqrt{[(263 - 260.67)^2 + (73 - 260.67)^2 + \cdots + (88 - 260.67)^2]/66}$
- Interpretation: the average deviation around the mean of 260.67 is 449.92.
- Now suppose Y = votes: "The average number of votes is about 261 and the average deviation around this number is about 450 votes."
- The dispersion is very large.
- Imagine the opposite case: the mean test score is 85 percent and the average deviation is 5 percent.

### Revisiting Our Data: Histogram

[Figure: histogram of the data]

### Skewness and the Mean

- Data often exhibit skew. This is often true with political variables.
- We have a measure of central tendency and of deviation about this measure (mean, sd).
- However, are there other indicators of central tendency?
- How about the median?

### Median

- 50th percentile: the location at which 50 percent of the cases lie above and 50 percent lie below.
- Since it's a locational measure, you need to "locate it".
- Example data: 32, 5, 23, 99, 51. As is, this is not informative.

### Median

- Rank it: 5, 23, 32, 51, 99
- Median location $= (N + 1)/2$ when n is odd: $6/2 = 3$
- The location of the median is data point 3. This is 32.
- Hence M = 32 (not 3).
- Interpretation: 50 percent of the data lie above 32; 50 percent of the data lie below 32.
- What would the mean be? (42; the data are skewed.)

### Median

- When n is even: [first value illegible], 5, 23, 32, 54, 99
- M is usually taken to be the average of the two middle scores.
- $(N + 1)/2 = 7/2 = 3.5$
- The median location is 3.5, which is between 23 and 32.
- $M = (23 + 32)/2 = 27.5$
- All pretty straightforward stuff.

### Dispersion around the Median

[Slide text not recoverable]

### IQR and the 5 Number Summary

- 25th percentile = 5; 75th percentile = 54
- IQR: the difference between the 75th and 25th percentiles.
- Hence M = 27.5 and IQR = 49.
- 5 number summary: minimum, 25th percentile, median, 75th percentile, maximum.

### Finding Percentiles

[Slide text not recoverable]

### Example

- Data: [first value illegible], 5, 23, 32, 54, 99
- 25th percentile: $L_{25} = (25)(6)/100 = 1.5$. Round to 2. The 25th percentile is 5.
- 75th percentile: $L_{75} = (75)(6)/100 = 4.5$. Round to 5. The 75th percentile is 54.
- 50th percentile: $L_{50} = (50)(6)/100 = 3$. Take the average of locations 3 and 4. This is $(23 + 32)/2 = 27.5$.

### Our Data

- Median = 120 votes (i.e., location $(50)(67)/100$)
- 25th percentile = 46 votes
- 75th percentile = 289 votes
- IQR = 243 votes
- 5 number summary: [minimum illegible], 46, 120, 289, [maximum illegible]
- Massive dispersion.
- The mean was 260.67; the median is 120. The mean is much closer to the 75th percentile.
- That's SKEW in action.

### Revisiting Our Data: Odd Ball Cases

[Figure: histogram of the data]

### "Influential Observations"

- Two data points: [values illegible]
- Suppose we omit them (not recommended in applied research).
- The mean plummets to 200.69 (a drop of 60 votes).
- The sd is cut by more than half (203.92).
- Med = 114 (note it hardly changed).
- Let's look at a scatterplot.

### Useful to Visualize Data: Scatterplot

[Figure: scatterplot of Y against X]

### Main Features

- Y and X are positively related.
- There are clearly visible outliers.
- With respect to Y, which outlier worries you most?
- Influence

[Figure: image residue; not recoverable]

### [Slide title illegible]

- Palm Beach County
- [bullet illegible]
- The ballot created massive confusion.
- Margin of victory in Florida: 537 votes.

[Figure: scatterplot "Buchanan by Bush Vote in Florida"; x-axis: vote for Bush; y-axis: vote for Buchanan; labeled counties include Palm Beach, Pinellas, Hillsborough, and Duval]

[Figure: scatterplot "Buchanan by Gore Vote"; x-axis: vote for Gore; y-axis: vote for Buchanan]

### Univariate Statistics

- We can clearly learn a lot from very simple statistics.
- We will use a data file I've put on SmartSite:
  - Pew Hispanic Center data
  - Latino/a respondents
- Survey data are on attitudes of Latinos on a variety of issues.
- Contains both citizens and noncitizens.
- It's OBSERVATIONAL DATA. Why?

### Univariate Statistics: Mean vs. Median

R code (posted on SmartSite):

```r
summary(timeincountry, na.rm=TRUE)
```

```
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
   0.00    7.00   15.00   18.28   26.00   84.00 6199.00
```

### Interpretation

- "0" denotes < 1 year.
- The data are skewed (fill in the blank). Why?
- The five number summary is (fill in the blank).

### Univariate Statistics

- Graphical displays of data are useful.
- Histogram and boxplot:

```r
par(mfrow=c(2,2))
hist(timeincountry, xlab="Years in the US", ylab="Frequency",
     main="Years-in-Country for Latino/a Non-Citizens", col="blue")
box()
```

[Figure: histogram "Years-in-Country for Latino/a Non-Citizens"; x-axis: years in the US; y-axis: frequency]

### Histogram

- Any time you see a figure, here is the question you must ask: "What is the main feature of the plot?"
- If you can't answer that, it's probably not a good plot.
- So what is the main feature of THIS plot?

### Boxplot

```r
boxplot(timeincountry, ylab="Years in the US",
        main="Years-in-Country for Latino/a Non-Citizens", col="red")
box()
```

[Figure: boxplot "Years-in-Country for Latino/a Non-Citizens"; y-axis: years in the US]

### Box Plot

- Main features:
  - Graphical display of the 5-number summary.
  - Illustrates dispersion of the data as well as skewness.

### Analysis Between Groups

- It may be of interest to compare groups:
  - Democracies vs. non-democracies
  - IMF interventions vs. non-interventions
  - Republicans vs. Democrats
  - Citizens vs. noncitizens
- Some R code to do this (hint: this code will help you for HW2).

### Univariate Statistics

- The variable: a scale measuring beliefs in how much of a problem discrimination is "in general", "in the workplace", and "in schools".
- Scale of the variable: 0 to 1, where 0 denotes maximal belief that discrimination is NOT a problem and 1 denotes maximal belief that discrimination is VERY MUCH a problem.
- R code:

```r
mean(discrimscale, na.rm=TRUE)
# [1] 0.647533
sd(discrimscale, na.rm=TRUE)
# [1] 0.3538025
median(discrimscale, na.rm=TRUE)
# [1] 0.6666667
fivenum(discrimscale, na.rm=TRUE)
# [1] 0.0000000 0.5000000 0.6666667 1.0000000 1.0000000
```

### Controlling for Citizenship

- Let us account for whether or not the respondent is a citizen.
- We might want to look at the means and sd for those who are and who are not citizens.
- R code:

```
> tapply(discrimscale, citizen, mean, na.rm=TRUE)
        0         1
0.7343246 0.6264739
> tapply(discrimscale, citizen, sd, na.rm=TRUE)
        0         1
0.3358822 0.3426383
> tapply(discrimscale, citizen, median, na.rm=TRUE)
        0         1
1.0000000 0.6666667
```

### Interpretation

- What do these numbers tell us?
- Interpretation is the name of the game.
- Some other examples with R code.
- Data are from the 2008 election, on votes on Prop 8.

### Univariate Quantities in R

- Our data: Yes on Proposition 8, by county.
- Graphical displays of data:
  - Histogram
  - Dot chart
  - Box plots
  - Stem and leaf
  - Strip plot

### First, the Basic Statistics in R

- Mean by county: about 56.7 (code and exact output illegible)
- Standard deviation: 13.355 (remaining digits illegible)
- Five number summary: 23.5078, [illegible], 59.15364, 68.[illegible], 75.81076

[Figure: histogram of "Yes on Prop 8" by county; x-axis: percentage; y-axis: frequency]

[Figure: dot chart]

[Figure: box plot]

### Stem and Leaf

```
The decimal point is 1 digit(s) to the right of the |

2 | 459
3 | 4888
4 | 024445578
5 | 0113344566779
6 | 0000023344457889
7 | 0011123334455
```

[Figure: strip plot]

[R code for the previous figures and the combined figure: not recoverable]

## The Scientific Study of Politics (POL 51)

### Today

- Introduction
- Preliminary concepts

### Political Science

- Political science is the application of empirical principles to the study of phenomena that are political in nature.
- Two reasons to understand how to conduct empirical research:
  - Citizens are confronted with empirical research daily through political news and debate.
  - You can use empirical research techniques to improve your own work.

### Empirical Research

- Empirical research on political phenomena can be used to:
  - Improve understanding of, and find solutions to, difficult problems (applied research).
  - Satisfy your intellectual curiosity about the nature of political phenomena (theoretical research).
- What do I do?

### Empirical Research

- The empirical research process consists of deciding:
  - Which information will be used in an analysis.
  - Which method will be used to conduct the analysis.
  - Which statistic will be used to demonstrate the findings.

### Examples of Empirical Research

- Political scientists study a variety of questions:
  - Winners and losers in politics
  - Who votes and who does not
  - Repression of human rights
  - Public support for US foreign involvement
- What questions are you interested in studying?
- Find a problem.

### Is Political Science a Science?

- There are two general objections to classifying political science as a science:
  - Practical objections
  - Philosophical objections

### Is Political Science a Science?

- Practical objections:
  - Political behavior is extremely complex.
  - People can intentionally mislead researchers.
  - Measurement is often subjective.
  - Data can be difficult or impossible to obtain.
  - Data can be ugly or misleading.

### Political Science Discipline

- The discipline has changed over time.
- Traditional approach: in the period between 1930 and 1960, work primarily described the practice of government.
- Empirical approach: following early survey work in the 1950s, it led to the widespread application of statistical methods (explanatory research).

### Political Science Discipline

- The discipline has changed over time.
- Normative pushback: in response to empiricism, it focused on questions of morality and policy issues that are relevant to real-world political discussions.
- Debate between empirical and normative research has cooled since the 1980s.
- To engage in modern political science requires you to understand the scientific method, and therefore to have a basic understanding of statistical reasoning.
- So let's start.

### Causality

- Does the existence of "x" "cause" "y"?
- Why do we care? Policy choices, research questions, etc.
- Problems:
  - We often ascribe causality under conditions when we cannot do so.
  - HUH?
  - Causal claims gone wrong can lead to very misleading conclusions.
- Any sports fans here? Don't worry if you are not.

### Storks and Babies

- Do storks cause babies?
- Some data are taken from Robert Matthews (2000), "Storks Deliver Babies (p = 0.008)", Teaching Statistics 22(2): 36-38. Posted on SmartSite (required reading).
- R application also posted on SmartSite.

### Storks and Babies

| | storks | birth |
|---|---|---|
| 1 | 100 | 83 |
| 2 | 300 | 87 |
| 3 | 1 | 118 |
| 4 | 5000 | 117 |
| 5 | 9 | 59 |
| 6 | 140 | 774 |
| 7 | 3300 | 901 |
| 8 | 2500 | 106 |
| 9 | 4 | 188 |
| 10 | 5000 | 124 |
| 11 | 5 | 551 |
| 12 | 30000 | 610 |
| 13 | 1500 | 120 |
| 14 | 5000 | 367 |
| 15 | 8000 | 439 |
| 16 | 150 | 82 |
| 17 | 25000 | 1577 |

### Storks, Babies, and R

```r
# R Script: Babies and Storks
# Data are from "Storks Deliver Babies (p = 0.008)",
# Robert Matthews (2000), Teaching Statistics 22: 36-38

# Vectors of data
storks <- c(100, 300, 1, 5000, 9, 140, 3300, 2500, 4, 5000, 5,
            30000, 1500, 5000, 8000, 150, 25000)
birth <- c(83, 87, 118, 117, 59, 774, 901, 106, 188, 124, 551,
           610, 120, 367, 439, 82, 1577)
landarea <- c(28750, 83860, 30520, 111000, 43100, 544000, 357000,
              132000, 41900, 93000, 301280, 312680, 92390, 237500,
              504750, 41290, 779450)
data <- data.frame(cbind(storks, birth, landarea))

# Correlation
corcoef <- cor(birth, storks)

# Regression model
linearmodel <- lm(birth ~ storks)

# Scatterplot with the fitted regression line overlaid
plot(birth ~ storks, xlab="Stork Breeding Pairs", ylab="Birth Rates",
     main="Birth Rates and Storks")
lines(linearmodel$fitted ~ storks, type="o", lty="solid", col="red")
```

### Storks, Babies, and R

- Correlation: .62
- Regression model: Birth = 225.03 + .029 × Stork
- Slope-intercept formula: $y = mx + b$
  - b = constant = 225.03
  - m = slope = .029
  - y = dependent variable = birth rate
  - x = independent variable = stork pairs

```
> # Correlation
> corcoef <- cor(birth, storks)
>
> # Regression model
> linearmodel <- lm(birth ~ storks)
>
> corcoef
[1] 0.6203315
> summary(linearmodel)

Call:
lm(formula = birth ~ storks)

Residuals:
     Min       1Q   Median       3Q      Max
-479.295 -166.266 -144.888   -2.055  631.753

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.250e+02  9.360e+01   2.404  0.02959 *
storks      2.881e-02  9.405e-03   3.063  0.00789 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 332.3 on 15 degrees of freedom
Multiple R-squared: 0.3848, Adjusted R-squared: 0.3438
F-statistic: 9.383 on 1 and 15 DF,  p-value: 0.00789
```

### Storks and Babies

- The correlation (or "co-relation") between birth rate and breeding stork pairs is .62.
- The p-value for the correlation is .008.
- Assuming the null hypothesis were 0 (what does a 0 correlation imply?), the probability of observing a correlation this high is 1 in 125, i.e., 1/125 = .008.
- It is unlikely we could have gotten this result by random chance alone.

[Figure: scatterplot "Birth Rates and Storks"; x-axis: stork breeding pairs (0 to 30000); y-axis: birth rates (500 to 1500)]

### Storks and Babies

- This just in: Storks Cause Babies!
- Any objections?

[Figure: scatterplot "Storks and Land Area"; x-axis: land area (km^2); y-axis: stork breeding pairs]

### Storks, Babies, and Land Area

- Stork breeding pairs and land area have a correlation of .58.
- Birth rates and land area have a correlation of .92.
- Therefore, birth rates and stork breeding pairs will have to be correlated.
- The problem: x (stork pairs) appears causally related to y (birth rates), but only because of the presence of a third variable, z (land area).

### A Picture

- The nature of the problem

[Figure: diagram not recoverable]

### Z Is a Confounder

- It is a hidden factor that causes two measured variables to appear as being causally related.
- z is known as a confounding variable. It is sometimes also called a lurking (or "lurker") variable.
- The relationship is spurious.
- Sports Illustrated jinx
- Madden jinx

### Regression to the Mean

- Test-retest situations
- Imagine two tests where students randomly guess.
- Each test has 100 questions.
- On average, what score would we expect to observe on Test 1?
- "In reality", what do you imagine the data would look like?
- What about Test 2?

### Problems with Inference

- What is inference?
- Regression to the mean is but one of many issues.
- The presence of confounders is another issue: a big and important issue.

### The Baseball Manager's Dilemma

- Bottom of the ninth, down by 1 run.
- Two outs.
- Runners on second and third...
- ...and the pitcher is up.
- You have only two players left...
- ...and this is the National League.
- What will you do?

### The Choices

- Player 1: 110 hits from 500 at bats
- Player 2: 280 hits from 1200 at bats
- Their batting averages:
  - Player 1: 110/500 = .220
  - Player 2: 280/1200 = .233
- Who would you choose?
- On batting average, Player 2 > Player 1.

### But Wait

- Both players are switch-hitters: they can bat from the left or right side of the plate.
- We'll go "Moneyball" and play the best match-up.
- The data:

| | Player 1, from left | Player 2, from left |
|---|---|---|
| At bats | 100 | 800 |
| Hits | 26 | 200 |
| Average | .260 | .250 |

Subtracting these from the overall totals gives the right-side figures: Player 1 has 84 hits in 400 at bats (.210) and Player 2 has 80 hits in 400 at bats (.200).

### HUH?

- What happened?
- Not accounting for switch hitting, Player 2 is preferred to Player 1.
- When accounting for switch hitting, Player 1 is preferred to Player 2.
- Worse: from either side of the plate, we would conclude Player 1 is better than Player 2, even though Player 2's overall batting average is higher.

### College Admissions

- University admission statistics: 1000 women apply, 1000 men apply.
- Admission rate:
  - Women: 510/1000 = 51 percent
  - Men: 800/1000 = 80 percent
- Conclusion: evidence of gender bias?
- This was the basis of the UC Berkeley gender bias case in the 1970s.
- Source: [URL illegible]

### But Wait

- Two colleges: students apply to College A and College B.
- The admissions data:

| College | Female applied | Female accepted | Female rate (%) | Male applied | Male accepted | Male rate (%) |
|---|---|---|---|---|---|---|
| A | 980 | 490 | 50 | 200 | 80 | 40 |
| B | 20 | 20 | 100 | 800 | 720 | 90 |
| Total | 1000 | 510 | 51 | 1000 | 800 | 80 |

- Findings:
  - The admission rate for each college is higher for women than for men.
  - The overall admission rate is higher for men.

### Simpson's Paradox

- The two preceding examples illustrate Simpson's Paradox.
- Named for E. H. Simpson, based on a 1951 paper.
- The phenomenon has been known since at least 1899, and Yule (1903) published a paper on it.
- Why a paradox? The result is counterintuitive.

### Simpson's Paradox

- The paradox: a "reversal result".
- The relationship between two variables found within subgroups differs in direction when the subgroups are combined.
  - Batting averages on the left/right side vs. overall
  - Gender admissions by college vs. overall gender admission rate
- Consider the admissions data again.

### Admissions Data and the 1973 Berkeley Case

- Our example. The model: Admission Rate = f(Gender)
- Gender bias hypothesis: admission rates of women will be lower than those of men.
- Y = Admission Rate, X = Gender
- The data seem consistent with the hypothesis.
- The problem: there is a third variable (what is it?)
  - The college to which students applied (A vs. B)
  - Z = College

### Paradox Revealed

- The problem is simple:
  - A. There is a strong association between Y and Z: one college (B) is easier to get into than the other college (A).
  - B. There is a strong association between X and Z: women tend to apply to the harder college (A) at higher rates; men tend to apply to the easier college (B) at higher rates.
- Therefore, because of A and B, there is a strong connection between Y and X.
- This connection, however, is spurious.

### A Picture

- The nature of the problem

[Figure: diagram not recoverable]

### Paradox Resolved

- Beware the lurker variable.
- Similar problem to the stork-baby example.
- The problem: Z is a confounder. If we had accounted for Z, we would have arrived at different conclusions.

### Implications

- Berkeley 1973: gender bias was not found when accounting for departmental admission rates.
- Interestingly, it was found that women tended to apply to more difficult graduate programs than men. Across departments, graduate admission rates were higher for women.
- Not accounting for departmental differences, gender bias appeared.

### So What Constitutes Data?

- The experimental ideal
- Observational data
- The importance of control
- The frequent absence of control

## The Scientific Study of Politics (POL 51)

### Get Out the Vote

- Hypothetical: what is the effectiveness of a direct-mail campaign on encouraging people to vote?
- Assume it is nonpartisan and the goal of the campaign is to increase participation.
- The problem: direct mailing is very expensive and there are finite resources.

### Research Design

- We have a rudimentary research question here: does the direct-mailing campaign have a positive effect on increasing voter turnout?
- The hypothetical is now a hypothesis: the researcher has some belief that the campaign will produce the expected result, increased turnout.
- Think of other examples.
- The PROBLEM is how to assess whether or not the EFFECT exists.

### Research Design

- Classical hypothesis testing:
  - If the researcher expects an effect due to the campaign and the effect is not observed, the conclusion would be: no effect.
  - No effect implies the direct mailing would have a negligible or NULL effect on turnout.
- So the hypothesis for a null effect (equivalently, the case when the effect on turnout is ZERO) might be stated as:
  - H0: Mail campaign has no effect on increasing turnout.
  - H stands for hypothesis and 0 denotes a null effect.
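
The reversal in the College A / College B admissions example from the Simpson's Paradox slides can be checked with a few lines of R. This is a minimal sketch rather than code from the course: the data frame `admissions` and its column names are made up for illustration, and only the counts come from the admissions table.

```r
# Counts copied from the admissions table in the notes.
# The object name and column names are illustrative assumptions.
admissions <- data.frame(
  college  = c("A", "A", "B", "B"),
  gender   = c("Female", "Male", "Female", "Male"),
  applied  = c(980, 200, 20, 800),
  accepted = c(490, 80, 20, 720)
)

# Admission rate within each college (the subgroup comparison)
admissions$rate <- admissions$accepted / admissions$applied

# Admission rate by gender, ignoring college (the aggregate comparison)
totals <- aggregate(cbind(applied, accepted) ~ gender,
                    data = admissions, FUN = sum)
totals$rate <- totals$accepted / totals$applied

print(admissions)
print(totals)
```

Within each college the female rate is higher (0.50 vs. 0.40 in A, 1.00 vs. 0.90 in B), while the aggregate rate is lower for women (0.51 vs. 0.80). The direction flips because most women applied to the harder college.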
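
The univariate statistics from the first lecture (mean, variance, standard deviation, median, percentile locations) can be reproduced on the small five-value example. The sketch below is illustrative only: `percentile_by_location` is a hypothetical helper, not a base R function, and it assumes the slides' location rule L = P × N / 100 means "round a fractional location up, and average positions L and L + 1 when L is whole". R's built-in `quantile()` uses a different default rule and may return different values.

```r
# Five-value example from the median slides
y <- c(32, 5, 23, 99, 51)

mean(y)    # 42
median(y)  # 32
var(y)     # 1290; var() divides by N - 1
sd(y)      # square root of the variance, about 35.9

# Hypothetical helper implementing the location rule L = P * N / 100.
# Assumption: fractional L is rounded up; whole L averages positions L and L + 1.
# Intended for 0 < p < 100.
percentile_by_location <- function(x, p) {
  x <- sort(x)
  n <- length(x)
  loc <- p * n / 100
  if (loc == floor(loc)) {
    mean(x[c(loc, loc + 1)])
  } else {
    x[ceiling(loc)]
  }
}

q25 <- percentile_by_location(y, 25)  # location 1.25, rounded up to 2
q50 <- percentile_by_location(y, 50)  # location 2.5, rounded up to 3
q75 <- percentile_by_location(y, 75)  # location 3.75, rounded up to 4
iqr <- q75 - q25
```

Here the 50th percentile found by location agrees with `median(y)`, and the mean (42) sits above the median (32), which is the right skew the slides describe.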
