by: Lauren Jones

Research Methods Week 10, Pols 201

This only has Thursday, as there was no class Tuesday. Last week's reading is also attached!
These class notes (5 pages) were uploaded by Lauren Jones on Monday, April 4, 2016. They belong to Pols 201 at the University of Tennessee - Knoxville, taught by Adam Eckerd in Winter 2016.

Date Created: 04/04/16
31 March 2016
• Descriptive Statistics
• Regression
  o All based off the same premise
    - Explain why variables vary
    - Test theories
  o Mostly we have been working with variable x and variable y
    - Prediction
    - Measuring effect sizes
    - Perform valid comparisons
  o Get probabilities and look at the nature of relationships, e.g. see what predicted medical expenses would be; if they vary, why?
    - Y = mx + b is the equation of a line; in regression notation it is written Y = a + bX + e
    - X is the independent variable, b is the slope, a is the constant, Y is the dependent variable
  o b is the regression coefficient
    - Represents the amount of change expected in Y for a one-unit change in X
    - e is the error
  o Reason for Y hat
    - If the relationship is completely deterministic, then every point lies exactly on the line
    - But this is never the case
  o Points always fall above or below the line, even with strong correlations
    - Y hat is the predicted value of Y
    - Y hat does not equal Y
  o Y hat would be the value on the line for every given x
    - ei is the error for case i (Y minus Y hat)
  o Half the time it will be positive, and half the time negative
    - Minimize the distance between the points and the line
    - Ordinary Least Squares (OLS) minimizes the sum of the squared errors
  o Constant
    - Sometimes meaningless; decide whether it has a meaningful interpretation
• Regression
  o Estimates what happens in the population
    - Each point is a data point from the sample
• Regression and Correlation
  o Coefficient of determination
    - r squared is the proportion of variance in Y determined or explained by X
    - Tells us about variation; we've got lots of variation
  o r squared tells us the proportion; the closer to one, the more of the variation we have explained
    - You can interpret using the x line.
  o X as b
• Hypothesis test
  o A slope of 0 is a horizontal line
    - As x increases, y does not change
    - You can measure relationships better by using regression
  o Type I and Type II errors
    - Are a risk whenever you choose a level of significance
  o Specify the hypothesis
  o Determine the alpha value
    - Usually .05 (5%)
  o Calculate the sample statistic
  o Compare the statistic to the hypothesized parameter
  o Calculate the t value
  o Find a p value
  o Make a decision about H0
    - Then state the conclusion
    - The unlikeliness
    - We can infer that more education leads to an increase in income
• Confidence Interval Method
  o 95% confident that the sample was derived from a population where there is a positive, significant relationship
    - Specify the hypothesis
    - Calculate b
    - Calculate the confidence interval for b
    - Determine if 0 is within the interval
    - You get the same result as before
  o Don't care about the specific nature if we take any two communities
    - We expect higher per up
• Practical versus statistical significance
  o Ten extra dollars
    - 835
    - Actually look at the value of the slope
    - Only meaningful when it is.
  o Modifiable areal unit problem
    - Results depend on the area used and could change if you change the area
  o Simpson's Paradox
    - You can miss relationships in certain groups when aggregating
  o Regression Fallacy
    - Making decisions based on extreme observations
    - Results will tend to regress to the mean
    - Example of one used for everything
  o Ecological Fallacy
    - Based on an aggregated trend
    - At the individual level, it does not hold
  o At state level, it looks like one

Creating good charts
• General Principle of Graphic Display
  o Efficient display of meaningful and unambiguous data
    - Define what the numbers represent
  o Deceptive choices in design can distort the numbers and relationships the chart is trying to represent
    - Lie factor statistic: a numerical measure of the data distortion
    - A chart is meant to simplify numerical comparisons
  o Common errors
    - Including elements in the display that have nothing to do with the comparisons
  o Components of a chart
    - Chart titles: may state the conclusion; subtitles contain what the x and y mean
    - Axis titles and labels: repeat what is clear from the main title or labels
    - Axis scale and data labels: the value or magnitude is defined by these labels
    - Legends: used when the chart has more than one data series
    - Gridlines: as little ink as possible, so as not to overwhelm
    - Sources: specifying the source of the data is important for citation
    - Other chart elements: there can be unconventional things
• When Graphic Design Goes Badly
  o The most general standards of charting data are thus the following
    - Present meaningful data
    - Define the data unambiguously
    - Do not distort the data
    - Present data efficiently
• Types of Charts
  o Pie Charts
    - Represent the distribution of categorical components of a single variable
    - Poor representation of data
    - Don't use them, especially 3D ones
  o Bar Charts
    - Typically display the relationship between one or more categorical variables
    - Minimize the ink-to-data ratio
    - Do not use 3D
    - Legends go inside or below the plot area
  o Time Series Line Charts
    - Most efficient means of displaying large amounts of data
  o Boxplots
    - Plot the median and quartiles of data for an interval-level variable
    - Best used for comparing the distribution of the same variable for two or more groups or points in time
    - Display how a single case compares to a large number of other cases
  o Sparklines
    - Eliminate all nondata graphical elements of the chart, providing a simple display of variation in trend and enough numerical information to make sense of what is presented
• Voting
  o Voter turnout is an indicator of either the strength of a society's culture or the quality of its institutions
    - Low turnout in America is linked to individualism and distrust of the government
  o Political gerrymandering
    - Process by which elected officials choose voters rather than the other way around
  o Discrepancy between votes counted and people saying whether they voted
    - Sample mortality effect
    - People may not want to fully participate
    - People lie
  o Barriers
    - You have to register a month beforehand
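Going back to the regression notes from 31 March, the least-squares idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the education and income numbers are made up, and the names ols_fit and r_squared are hypothetical helpers, not anything from class. It fits Y hat = a + bX, computes the errors (Y minus Y hat), and finds r squared.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (a, b): the constant and slope that minimize the sum of squared errors."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx            # slope: expected change in Y per one-unit change in X
    a = y_bar - b * x_bar    # constant: the line passes through (x_bar, y_bar)
    return a, b


def r_squared(x, y, a, b):
    """Proportion of the variance in Y explained by X."""
    y_bar = mean(y)
    ss_error = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_error / ss_total


# Hypothetical sample (made up for illustration):
# years of education (x) and income in thousands of dollars (y).
education = [10, 12, 12, 14, 16, 16, 18, 20]
income = [25, 30, 34, 38, 45, 41, 52, 58]

a, b = ols_fit(education, income)
y_hat = [a + b * xi for xi in education]                # predicted values
residuals = [yi - yh for yi, yh in zip(income, y_hat)]  # errors, some + and some -
r2 = r_squared(education, income, a, b)

print(f"Y hat = {a:.2f} + {b:.2f} * X, r squared = {r2:.3f}")
```

The errors add up to zero (apart from rounding), which is one way to see that the points fall both above and below the fitted line.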

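The hypothesis-test steps and the confidence interval method from the notes can be sketched the same way. Assumptions here: the same made-up education and income data, a hypothetical helper named slope_inference, H0 that the slope is 0, alpha = .05 (two-tailed), and a critical t value of 2.447 read from a t table for n - 2 = 6 degrees of freedom. No p value is computed; comparing t with the critical value gives the same decision at alpha = .05.

```python
from math import sqrt
from statistics import mean


def slope_inference(x, y, t_crit):
    """Test H0: slope = 0 and build a confidence interval for b.

    t_crit is the two-tailed critical t value for n - 2 degrees of
    freedom, looked up in a t table by the caller.
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    ss_error = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = sqrt(ss_error / (n - 2)) / sqrt(sxx)   # standard error of the slope
    t = b / se_b                                   # distance of b from 0 in standard errors
    ci = (b - t_crit * se_b, b + t_crit * se_b)    # confidence interval for b
    reject_h0 = abs(t) > t_crit
    return b, se_b, t, ci, reject_h0


# Hypothetical sample (made up for illustration), n = 8 so df = 6.
education = [10, 12, 12, 14, 16, 16, 18, 20]
income = [25, 30, 34, 38, 45, 41, 52, 58]
T_CRIT_DF6 = 2.447  # two-tailed, alpha = .05, 6 degrees of freedom (from a t table)

b, se_b, t, ci, reject_h0 = slope_inference(education, income, T_CRIT_DF6)
zero_in_interval = ci[0] <= 0 <= ci[1]

print(f"b = {b:.2f}, t = {t:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print("Reject H0:", reject_h0, "| 0 inside interval:", zero_in_interval)
```

Both methods give the same answer: H0 is rejected exactly when 0 falls outside the interval. Practical significance is a separate question, so it is still worth looking at the value of b itself.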

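Simpson's Paradox from the notes is easier to see with numbers. The sketch below uses made-up data for two hypothetical groups: inside each group the slope is negative, but when the groups are pooled the slope turns positive. The function name slope and all the data points are assumptions for illustration only.

```python
from statistics import mean


def slope(points):
    """Least-squares slope for a list of (x, y) pairs."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical groups (made up): y falls as x rises within each one.
group_a = [(1, 10), (2, 9), (3, 8)]
group_b = [(6, 15), (7, 14), (8, 13)]
pooled = group_a + group_b  # aggregating hides the group structure

print("slope in group A:", slope(group_a))
print("slope in group B:", slope(group_b))
print("slope when pooled:", round(slope(pooled), 3))
```

Group B sits higher and further right than group A, so the pooled line runs upward even though the relationship inside each group runs downward. That is the kind of relationship you can miss when aggregating.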