# Final Study Guide 1350

This 1-page study guide was uploaded by Rachael Kroeger on Monday, May 18, 2015. The study guide belongs to 1350 at Ohio State University, taught by Strait in Spring 2015.

Date Created: 05/18/15

POPULATION: the entire group of individuals about which we want information.

SAMPLE: the part of the population that we actually examine to gather information, for the purpose of drawing conclusions about the whole population.

PARAMETER: a number that describes the population. Ex: average weight of the population of dogs in Cbus.

STATISTIC: a number that describes a sample. Ex: average weight of a sample of 500 dogs in Cbus. We estimate a parameter by a sample statistic.

Diagram: Population (parameter) → sampling → Sample (statistic) → inference → Population.

RANDOM PHENOMENON: choosing an SRS from the population and computing a statistic. Several outcomes are possible, and the outcome for any one sample is unknown.

SIMPLE RANDOM SAMPLE (SRS): each individual has an equal chance of being chosen.

STRATIFIED RANDOM SAMPLE: individuals in the same group have an equal chance of being chosen.

1. Divide the individuals of the population into groups based on some characteristic.
2. Take an SRS within each group.
3. Combine all of them into one big sample.

COMPLETELY RANDOMIZED DESIGN: all subjects are assigned randomly to different treatment groups without accounting for any other variable beforehand.

RANDOMIZED BLOCK DESIGN: a block is a group of experimental subjects that are known before the experiment to be similar in some way that is expected to affect the response to treatments. Blocks can be larger groups of many subjects or much smaller units. Random assignment of subjects to treatments is carried out separately within each block.

Diagram: Males → drug group vs. placebo group → compare; Females → drug group vs. placebo group → compare.

MATCHED PAIRS DESIGN: design to compare two different treatments.

1. Pair up subjects (ideally the two subjects are very similar to each other); each subject in a pair randomly receives one of the treatments. Blocks = pairs.
2. Each subject gets both treatments, but in different orders. Blocks = each subject individually.

HISTOGRAMS: distribution of a quantitative variable. Bars have equal width, and the height of each bar displays how many individuals fall within a specific range of values.

UNIMODAL: 1 peak. BIMODAL: 2 peaks. MULTIMODAL: more than 2 peaks.

BOXPLOTS GOOD: skewness vs. symmetry (shape), center (median), and spread (IQR); comparisons between groups.

BOXPLOTS BAD: number of peaks in a distribution, size of the data set, frequency of values within different intervals.

OUTLIERS: data points that deviate unusually far from the overall pattern. Drawn as circles, separately from the lined part of the boxplot.

HOW TO FIND:

1. Find Q1 and Q3.
2. Calculate IQR = Q3 − Q1 (the middle 50% of data points).
3. High outliers are > Q3 + 1.5 × IQR.
4. Low outliers are < Q1 − 1.5 × IQR.

If the distribution is skewed and/or has outliers, use: measure of center = median, spread = IQR.

If the distribution is symmetric with no outliers, use: measure of center = mean, spread = standard deviation.

RESISTANCE: how strong an impact outliers have on a particular measure of center or spread.

- Mean & standard deviation are NOT resistant to outliers: strongly impacted by outliers because every data value is used to calculate them.
- Median & IQR are resistant to outliers: not impacted by outliers because the median is the middle value, and the same reasoning applies to the IQR.

Z-SCORE: z = (observation − mean) / std dev, i.e. $z = (x - \mu)/\sigma$.

68-95-99.7 RULE: how many std devs from the mean the data fall within. For a Normal distribution, about 68% of the data fall within 1 std dev of the mean, 95% within 2, and 99.7% within 3.

Cth PERCENTILE: value such that c percent of the observations lie below it and the rest lie above. Q1 = 25th percentile, Q3 = 75th, Median = 50th.

SCATTERPLOTS:

- FORM: linear, nonlinear, no pattern (correlation = linear).
- DIRECTION: positive, negative, no association.
- STRENGTH: strong, moderate, weak pattern.

CORRELATION r: $-1 \le r \le 1$.

- Sign of r matches the association in the scatterplot (positive/negative).
- Outliers have an impact on r.
- r doesn't change value if you change the units of the x or y variables.
- r has no units.
- Absolute value of the correlation tells the strength of association:
  - 0.0 to 0.2: very weak to negligible
  - 0.2 to 0.4: weak (low)
  - 0.4 to 0.7: moderate
  - 0.7 to 0.9: strong (high)
  - 0.9 to 1.0: very strong

LEAST-SQUARES REGRESSION ("line of best fit"):

1. Create a scatterplot and describe it, including any outliers.
2. Compute the correlation coefficient r.
3. Obtain the equation $y = a + bx$.

- a = intercept: value of y when the x variable is 0; where the line crosses the y axis.
- b = slope: amount that the y variable changes when the x variable increases by 1 unit; larger b = steeper line.
- Slope and correlation ALWAYS have the same sign.

R²: percent of variation in the y variable that is explained by the regression line. To get it:

1. Find the correlation r.
2. Square it.
3. Multiply the value by 100.

R² is always between 0 and 100%. The closer to 100%, the stronger the linear relationship between x and y (good).

EXTRAPOLATION: using x values well outside the range of the original x values to predict y. BAD!!

OUTLIERS: big effect on the correlation and the regression line. Think about how removing the outlier will impact the correlation.

CAUSATION: x causes y. Can show by:

- strong and consistent association
- higher doses are associated with stronger responses
- alleged cause precedes effect
- alleged cause is plausible

Watch for confounding variables.

PROBABILITY (of an outcome of a random phenomenon): a number between 0 and 1 describing the proportion of times that the outcome would occur in a long series of repetitions. Probability = 0: the outcome NEVER occurs. Probability = 1: the outcome always occurs.

AVERAGES/PROPORTIONS: tend to get closer to the truth as repetitions increase. SUMS/COUNTS: differ from the truth by more and more.

RULES FOR PROBABILITY:

1. Any probability is a number between 0 and 1.
2. All possible outcomes together must have probability 1 (must be true for a model to be a probability model).
3. The probability that an event does not occur is 1 − the probability that the event does occur: P(sunny) = 1 − P(not sunny).
4. If two events have no outcomes in common, the probability that one or the other occurs is the sum of their individual probabilities. Ex: event 1 = landing on odd, event 2 = landing on even; P(odd or even) = P(odd) + P(even).

VALID PROB MODEL: all of the probabilities are between 0 and 1, and all of the probabilities add up to 1.

SAMPLING DISTRIBUTION: choose a random sample from the population and calculate a statistic; repeat the process and look at the distribution of the results (a probability model). It tells what values a sample statistic takes and how often it takes on those values when we look at repeated samples from the population. Described by shape, center, and spread (variability).

In short: what values the statistic takes, along with how often it takes on each value, if we take many samples. Bell curve.

CONFIDENCE INTERVAL: estimating a population proportion or population mean in the presence of random variability.

- Interval = sample stat ± MOE (sample stat = center, MOE = width).
- The statement is always about the population, not the sample.
- Higher confidence level → larger MOE.
- Larger sample size → smaller MOE.
- Larger population → MOE does not change.
- Ex: 1523 adults, 501 teens. The MOE of a 95% confidence statement about teens would be greater than for adults because the teen sample is smaller.

HYPOTHESES: H0 uses an = sign; Ha uses a >, <, or ≠ sign.

- 1-sided: > or < sign in the alternative hypothesis.
- 2-sided: ≠ sign in the alternative hypothesis.

P-VALUE: a probability computed assuming H0 is true. It is not the probability that H0 is true, and it is not the same as p. Smaller P-value → more evidence against H0 provided by the data.

- 1-sided (one-tailed): < or >.
- 2-sided: compute the z score, compute the probability in the closer tail, and double the answer to get the P-value.

SIGNIFICANCE LEVEL α: usually 0.05 unless stated.

- P-value ≥ α: don't reject H0 (not enough evidence to support the alternative statement).
- P-value < α: reject H0.
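The outlier steps in the guide (find Q1 and Q3, compute the IQR, then flag anything beyond 1.5 × IQR from the quartiles) can be sketched in Python. This is only an illustration: the guide does not say how quartiles are computed, so the median-of-halves method below is an assumption, and the function names and the `weights` data are made up for the example.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method (an assumption)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For an odd count, leave the middle value out of both halves.
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)

def find_outliers(values):
    """Flag values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical dog weights with one unusually heavy dog.
weights = [42, 45, 47, 50, 51, 53, 55, 58, 60, 120]
print(quartiles(weights))      # Q1 = 47, median = 52, Q3 = 58
print(find_outliers(weights))  # only 120 lies beyond the fences
```

Here IQR = 58 − 47 = 11, so the fences sit at 30.5 and 74.5. Other quartile conventions can give slightly different fences for the same data.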
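The z-score, correlation, regression line, and R² entries fit together, and a small Python sketch may make that concrete. The guide gives no formulas for r, a, or b, so the ones used here (r as the average product of z-scores with n − 1 in the denominator, slope b = r · s_y / s_x, intercept a = mean(y) − b · mean(x)) are the standard textbook versions brought in as an assumption. The data and function names are invented for illustration.

```python
from statistics import mean, stdev

def z_scores(values):
    """z = (observation - mean) / std dev, using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def correlation(xs, ys):
    """r as the sum of products of z-scores divided by n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

def least_squares(xs, ys):
    """Return (a, b, r) for the line y = a + b*x."""
    r = correlation(xs, ys)
    b = r * stdev(ys) / stdev(xs)    # slope has the same sign as r
    a = mean(ys) - b * mean(xs)      # intercept: predicted y when x = 0
    return a, b, r

# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b, r = least_squares(xs, ys)
r_squared_percent = 100 * r ** 2     # find r, square it, multiply by 100
print(f"y = {a:.2f} + {b:.2f}x, r = {r:.3f}, R^2 = {r_squared_percent:.0f}%")
```

For this data the line comes out to roughly y = 2.2 + 0.6x with R² near 60%. The slope and r are both positive, as the guide says they must be.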
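The adults-versus-teens example can be checked numerically. The guide does not state a margin-of-error formula, so this sketch assumes the common conservative 95% formula for a proportion, 1.96 · sqrt(0.25 / n). The function name `moe_95` is hypothetical.

```python
from math import sqrt

def moe_95(n, p_hat=0.5):
    """Approximate 95% margin of error for a sample proportion.

    Assumes the usual large-sample formula; p_hat = 0.5 gives the widest margin.
    """
    return 1.96 * sqrt(p_hat * (1 - p_hat) / n)

adults_moe = moe_95(1523)
teens_moe = moe_95(501)
print(f"adults: {adults_moe:.3f}, teens: {teens_moe:.3f}")
# The smaller teen sample gives the larger margin of error.
```

The population size does not appear anywhere in the formula, which matches the guide's note that a larger population leaves the MOE unchanged.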
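The two-sided P-value recipe (compute the z score, find the probability in the closer tail, double it) and the significance-level decision can be sketched with Python's standard library. This assumes the test statistic follows a standard Normal distribution. The function names are made up for the example.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability in the tail beyond |z|, doubled."""
    closer_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * closer_tail

def decide(p_value, alpha=0.05):
    """Reject H0 only when the P-value is below the significance level."""
    return "reject H0" if p_value < alpha else "do not reject H0"

for z in (1.0, 1.96, 2.5):
    p = two_sided_p_value(z)
    print(f"z = {z}: P-value = {p:.4f} -> {decide(p)}")
```

A z score of 1.0 gives a P-value of about 0.32, so H0 is not rejected at α = 0.05; a z score of 2.5 gives a P-value of about 0.012, so H0 is rejected.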

