# SPEC TOP RES MANGMT QERM 598

These 135 pages of class notes were uploaded by Ms. Bart Lind on Wednesday, September 9, 2015. The notes belong to QERM 598 at the University of Washington, taught by Staff in Fall.

Date Created: 09/09/15

## Introduction to Analysis of Variance

Eli Gurarie, QERM 598, Lecture 3, University of Washington, Seattle, January 22, 2008

### Historical roots of ANOVA

Sir Ronald Aylmer Fisher (1890-1962) was one of the greatest statisticians and population geneticists. He was interested in relating differences in phenotype to differences in genotype. [The rest of this slide is illegible in the source.]

### The Great Pie Zone-out Experiment

One Wednesday after the weekly QERM soup, an experiment was performed to test the effects of different desserts on student concentration. The twelve students were divided into three groups of four, each of which was to consume in its entirety an Apple pie, a Blueberry pie, or a Cherry pie. Later, from 1:30 to 3:00, all twelve students attended a statistics department seminar. All but one of the students zoned out at least once during the seminar, and the total zone-out duration (ZOD), in minutes, was carefully recorded by the experimenter. The results are tabulated below.

| Treatment | ZOD (min) | totals | means |
|---|---|---|---|
| Apple Pie | 0, 2, 0.5, 1.5 | 4 | 1.0 |
| Blueberry Pie | 1, 2, 3, 2 | 8 | 2.0 |
| Cherry Pie | 7, 5.5, 6.5, 5 | 24 | 6.0 |
| totals | | 36 | 3.0 |

### Visualization

[Figure: two panels, "Pie experiment results" and "Pie experiment: Boxplot"; x-axis: pie type (A, B, C).]

### Formulating a hypothesis

- Research question: Do different kinds of dessert have different effects on the concentration of QERM students?
- Null hypothesis (in words): Different pie treatments result in essentially similar ZODs.
- Alternative hypothesis (in words): Different pie treatments result in different ZODs.
- Null hypothesis (in math): $\mu_A = \mu_B = \mu_C$
- Alternative hypothesis (in math): $\mu_A \neq \mu_B$ OR $\mu_A \neq \mu_C$ OR $\mu_B \neq \mu_C$

### Comments on models and hypotheses

- So far, we've formulated scientific questions in terms of hypotheses and hypothesis tests, consistent with t-tests, randomization, Mann-Whitney, etc., to compare samples.
- When confronted with more complicated systems or datasets, hypothesis testing is a little narrow. It is more enlightening to think in terms of model assessment. We typically propose several possible statistical models and assess which has greater explanatory power, given the quality of the data. In practice, you will be looking at the results of large ANOVA tables, which can contain within them many implicit hypotheses that are all essentially assessed simultaneously as we select the best (or most parsimonious) model. The hypothesis test is best thought of as a tool in the model selection process.

This distinction is reflected in the nomenclature. Even for a very simple design like the pie experiment, where the analysis is just one step more complicated than a two-sample t-test, we use an ANALYSIS of variance, whereas the t-test is merely a TEST.

### Formulating a statistical model

- Model 1 (single mean): $X_{ij} = \mu + \epsilon_{ij}$
- Model 2 (unique means): $X_{ij} = \mu_i + \epsilon_{ij}$

Where:

- $X_{ij}$ uniquely identifies an individual measurement.
- $i \in \{1, 2, \ldots, a\}$ indexes the treatment. Here $a = 3$, representing pies A, B and C.
- $j \in \{1, 2, \ldots, n\}$ indexes the individual measurement within each treatment group. Here $n = 4$, and the total number of samples is $N = an$.
- $\mu$ is the true grand mean.
- $\mu_i$ is a true group mean within each treatment group.
- $\epsilon_{ij}$ is a random individual error term.

Very, very important assumption: the $\epsilon_{ij}$'s are iid $N(0, \sigma^2)$.

### Comparing Variances

The goal of ANOVA is to compare within-group variances $s_i^2$ and between-group variance $s^2$. If the within-group variance is somehow smaller than the between-group variance, then the treatment might have some explanatory power, i.e. a certain amount of the variance is accounted for by the treatment effect. Consider our data:

| Tr | ZOD (min) | means | variances |
|---|---|---|---|
| A | 0, 2, 0.5, 1.5 | 1.0 | 0.8333 |
| B | 1, 2, 3, 2 | 2.0 | 0.6667 |
| C | 7, 5.5, 6.5, 5 | 6.0 | 0.8333 |
| all | | 3.0 | 5.7272 |

The values for $s_i^2$ certainly look smaller than $s^2$. But how do we perform rigorous inference?

### Several Sums of Squares

The sum of squares is a total measure of variability. Consider the Total sum of squares:

$$SS_{total} = \sum_{i=1}^{a} \sum_{j=1}^{n} (X_{ij} - \bar{X}_{..})^2 \tag{1}$$

Note that $SS_{total}/(N - 1)$ is an unbiased estimate of the total variance of the data.

This sum can be decomposed into:

$$SS_{total} = n \sum_{i=1}^{a} (\bar{X}_{i.} - \bar{X}_{..})^2 + \sum_{i=1}^{a} \sum_{j=1}^{n} (X_{ij} - \bar{X}_{i.})^2 \tag{2}$$

$$SS_{total} = SS_{treatment} + SS_{error} \tag{3}$$

$SS_{error}$ is the Error sum of squares, i.e. the sum of all the little deviations from each of the local little means, while $SS_{treatment}$ is the Treatment sum of squares, i.e. the sum of the differences of the little means from the grand mean, weighted according to the number of measurements within each group.

### Expectations of Sums of Squares

Under the hypothesis that the errors are iid:

$$E(SS_{error}) = E\left[\sum_{i=1}^{a} \sum_{j=1}^{n} (X_{ij} - \bar{X}_{i.})^2\right] = a(n - 1)\sigma^2 = (N - a)\sigma^2$$

Similarly, when there is no treatment effect:

$$E(SS_{treatment}) = (a - 1)\sigma^2 \tag{4}$$

Thus:

- $SS_{error}/(N - a)$, called the Mean Squared Error (MSE), is an unbiased estimate of $\sigma^2$.
- $SS_{treatment}/(a - 1)$, called the Mean Squared Treatment (MST), is ALSO an unbiased estimate of $\sigma^2$ (under the null hypothesis).

### Getting Close

Anyways, under the null assumption (NO treatment effect) we know:

$$E(SS_{total}) = E\left[\sum_{i=1}^{a} \sum_{j=1}^{n} (X_{ij} - \bar{X}_{..})^2\right] = (N - 1)\sigma^2 \tag{5}$$

Thus, applying the chi-squared fact for sums of squares of iid normal variables:

$$\frac{SS_{total}}{\sigma^2} \sim \text{Chi-squared}(N - 1) \tag{6}$$

Since

$$\frac{SS_{total}}{\sigma^2} = \frac{SS_{error}}{\sigma^2} + \frac{SS_{treatment}}{\sigma^2},$$

then by Cochran's theorem, $SS_{error}/\sigma^2$ and $SS_{treatment}/\sigma^2$ must be independent Chi-squared random variables with $N - a$ and $a - 1$ degrees of freedom, respectively.

### Meet William Gemmell Cochran (1909-1980)

Cochran was a good statistician and later a great statistics department administrator. I was hoping to find something out of the ordinary in his biography, but absolutely nothing came up. Really, I put his picture into these notes to break up the stream of equations.

### Finally, we obtain the test statistic

$$F_0 = \frac{SS_{treatment}/(a - 1)}{SS_{error}/(N - a)} = \frac{MS_{treatment}}{MS_{error}} \tag{8}$$

Under $H_0$:

- $MS_{error}$ and $MS_{treatment}$ are both unbiased estimators of $\sigma^2$.
- $F_0 \sim F(a - 1, N - a)$ (Fact 4).

Under $H_1$, $MS_{treatment}$ will be greater than $MS_{error}$, and we can reject the null hypothesis based on comparing the $F_0$ test statistic to the $F(a - 1, N - a)$ distribution.

### ANOVA table

Typically, we construct a table to summarize our analysis:

| Source of Variation | Sum of Squares | Degrees of freedom | Mean Square | $F_0$ |
|---|---|---|---|---|
| Treatment | $SS_{treat}$ | $a - 1$ | $MS_{treat}$ | $MS_{treat}/MS_{error}$ |
| Residual | $SS_{error}$ | $N - a$ | $MS_{error}$ | |
| Total | $SS_{total}$ | $N - 1$ | | |

"Treatment" refers to variation explained by differences between group means. "Residuals" refers to differences within group means.

### Pie analysis

Pie experiment ANOVA table:

| Source | SS | df | MS | $F_0$ | p-value |
|---|---|---|---|---|---|
| Pie | 56 | 2 | 28 | 36 | 5.081e-05 |
| Residuals | 7 | 9 | 0.778 | | |
| Total | 63 | 11 | | | |

[Figure: plot of the $F(2, 9)$ density.]

Under the null hypothesis, the $F_0$ statistic will be not significantly different from 1. Ours clearly looks extreme. What is the probability of an even more extreme $F_0$ value emerging from a true null hypothesis?

$$\Pr(F_{2,9} > F_0) = 5.081 \times 10^{-5} \tag{9}$$

This is the p-value: the probability, under a true null hypothesis, of seeing a result at least this extreme (and hence the risk of a Type I error if we reject). It is clearly very, very small, so we reject the null hypothesis with great confidence.
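As a rough check on the pie table, the sums of squares, $F_0$ and the p-value can be recomputed in a few lines of Python. This is a sketch, not part of the original lecture. The names `one_way_anova` and `f_upper_tail_two_numerator_df` are made up for this illustration. The closed-form tail probability is an assumption that holds only when the treatment has 2 degrees of freedom, as it does here ($a = 3$). For other designs a library routine such as `scipy.stats.f.sf` would be the usual choice.

```python
from statistics import mean

# Zone-out durations (minutes) copied from the pie experiment table.
zod = {
    "Apple": [0.0, 2.0, 0.5, 1.5],
    "Blueberry": [1.0, 2.0, 3.0, 2.0],
    "Cherry": [7.0, 5.5, 6.5, 5.0],
}


def one_way_anova(groups):
    """Compute one-way ANOVA table entries for a dict of samples (hypothetical helper)."""
    values = [x for sample in groups.values() for x in sample]
    n_total, n_groups = len(values), len(groups)
    grand_mean = mean(values)

    # Treatment SS: squared distance of each group mean from the grand mean,
    # weighted by the number of measurements in the group.
    ss_treatment = sum(
        len(sample) * (mean(sample) - grand_mean) ** 2
        for sample in groups.values()
    )
    # Error SS: squared deviations of each value from its own group mean.
    ss_error = sum(
        (x - mean(sample)) ** 2 for sample in groups.values() for x in sample
    )
    ss_total = sum((x - grand_mean) ** 2 for x in values)

    df_treatment, df_error = n_groups - 1, n_total - n_groups
    ms_treatment = ss_treatment / df_treatment
    ms_error = ss_error / df_error
    return {
        "ss_treatment": ss_treatment,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "df_treatment": df_treatment,
        "df_error": df_error,
        "ms_treatment": ms_treatment,
        "ms_error": ms_error,
        "f0": ms_treatment / ms_error,
    }


def f_upper_tail_two_numerator_df(f0, df_error):
    """Pr(F > f0) for F(2, df_error). Valid ONLY when the numerator df is 2."""
    return (1.0 + 2.0 * f0 / df_error) ** (-df_error / 2.0)


table = one_way_anova(zod)
p_value = f_upper_tail_two_numerator_df(table["f0"], table["df_error"])
print(table)
print(p_value)
```

Run as is, this should reproduce the table: sums of squares of 56, 7 and 63, $F_0 = 36$, and a p-value of about $5.08 \times 10^{-5}$.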
### Model Specification

Recall our 2 models:

- Model 1 (single mean): $X_{ij} = \mu + \epsilon_{ij}$
- Model 2 (unique means): $X_{ij} = \mu_i + \epsilon_{ij}$

The ANOVA helped us pick out the most informative model: Model 2. It basically told us that accounting for the groups made our estimate for $\sigma^2$ significantly smaller than not accounting for the groups would have.

Now that we have selected a model, we can specify it. In the model we've selected, there are 4 parameters: $\mu_1$, $\mu_2$, $\mu_3$ and $\sigma^2$. The estimates for these parameters are:

| parameter | estimate | value |
|---|---|---|
| $\mu_1$ (Apple pie) | $\bar{X}_{1.}$ | 1 |
| $\mu_2$ (Blueberry pie) | $\bar{X}_{2.}$ | 2 |
| $\mu_3$ (Cherry pie) | $\bar{X}_{3.}$ | 6 |
| $\sigma^2$ | $MS_{error}$ | 0.778 |

### Hypotheses vs. Models

- Strictly speaking, the hypothesis test lets us say that: "There is at least one pair of means in our experiment that is not equal." This is a relatively crude result, but it can be stated with great certainty.
- In contrast, the model we have selected lets us say that: "Given the data collected, we can predict that the mean effects of Apple, Blueberry and Cherry pie dosage on QERM students are about 1, 2 and 6 minutes of zoning out, with some roughly normally distributed variability with variance around 0.8."
- This second statement is not, strictly speaking, true. Like all models, it is a reduction and simplification of reality. However, given the information that we have, it is probably the best description of reality. The hypothesis test was an aid in selecting this model.

### Hypotheses vs. Models: Final Comment

It is often said that all models are wrong, but some can be useful. Perhaps a corollary might be that hypothesis tests are always accurate (when performed correctly) and always useful, but only for the construction of models, which are all wrong but occasionally useful.

## Comparing two samples

Eli Gurarie, QERM 598, Lecture 2, University of Washington, Seattle, January 17, 2008

### Ants

- Seed ant (*Pogonomyrmex salinus*)
- Thatch ant (*Formica planipilis*)

### The Question

WHICH IS BIGGER?

### Step 1: Collect Data

[Table: weight (mg) and head width (mm) for 30 Seed ants and 30 Thatch ants; the individual values are illegible in the source.]

Note: data gratefully stolen from http://stat.ucla.edu/datasets

### Step 2: Visualize Data

[Figure: boxplots of weight (mg) and head width (mm) for Seed and Thatch ants.]

[Figure: histograms of weight (mg) and head width (mm), shown separately for Seed ants and Thatch ants.]

### Step 3: Summary statistics

| | N | Head width (mm): $\bar{X}$ | Head width (mm): $S$ |
|---|---|---|---|
| Seed Ant | 30 | 1.62 | 0.096 |
| Thatch Ant | 30 | 1.67 | 0.147 |

The weight (mg) entries of this table are illegible in the source.

### A tough question

- It certainly seems like Thatch ants might be bigger than Seed ants.
- But there are obviously some Seed ants that are bigger than some Thatch ants.
- What does the question "Which is Bigger?" actually mean?

The short answer:

- It doesn't mean anything!

### We must refine the question

E.g. what is the probability that any given Thatch Ant is bigger than any given Seed Ant?

[Figure: "Weight comparisons" and "Head size comparisons" panels; ordered Thatch ants against ordered Seed ants.]

$$C_W = \frac{1}{N_T N_S} \sum_{i=1}^{N_T} \sum_{j=1}^{N_S} I(W_{T,i} > W_{S,j}) = 0.92$$

$$C_H = \frac{1}{N_T N_S} \sum_{i=1}^{N_T} \sum_{j=1}^{N_S} I(H_{T,i} > H_{S,j}) = 0.62$$

Or we can just go way too fancy:

[Figure: "Weight differences" and "Head width differences" panels; ordered Thatch ants against ordered Seed ants.]

### Getting there, but more questions

- The statement that "A is bigger than B about X% of the time" is an improvement.
- But how do we know that this comparison isn't an artifact of random sampling?

The short answer:

- There is no short answer! It takes lots of really confusing statsy jargon to say anything about anything. Start getting used to it.
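The count statistics $C_W$ and $C_H$ are just the fraction of all (Thatch, Seed) pairs in which the Thatch ant is larger. A minimal Python sketch follows. The function name `count_statistic` and the four-ant samples are hypothetical, since the real ant measurements are not legible in these notes.

```python
def count_statistic(thatch, seed):
    """Fraction of (thatch, seed) pairs in which the thatch value is strictly larger."""
    wins = sum(1 for t in thatch for s in seed if t > s)
    return wins / (len(thatch) * len(seed))


# Hypothetical weights in mg, for illustration only (NOT the lecture data).
thatch_weights = [150.0, 132.0, 171.0, 118.0]
seed_weights = [120.0, 105.0, 140.0, 98.0]

c_w = count_statistic(thatch_weights, seed_weights)
print(c_w)  # 13 of the 16 pairs favor the Thatch ant
```

The same function applied to head widths would give $C_H$. Ties count as "not bigger" here, which is one of several reasonable conventions.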
### Hypothesis testing in 7 (or 6, or 8) easy (ha ha) steps

1. Since it can be tricky to even define what it is we want to know, we define its opposite, which is often simpler. This is called the null hypothesis, $H_0$.
2. What the null hypothesis isn't, we call the alternative hypothesis, $H_1$ or $H_A$.
3. We choose some summary of the data, called the test statistic: $T_0 \sim f(t)$.
4. We create a null distribution of the test statistic, i.e. the distribution we would expect of the test statistic if the null hypothesis were true.
5. We calculate the experimental value of the test statistic, $t_0$, and compare it to our distribution.
6. We set some criterion that defines the range of values within which we would fail to reject (not quite the same as accept) the null hypothesis. The complement of that range, where we do reject, is conventionally called the critical region. Here, two things can happen:
   - If $t_0$ is extreme (falls in the critical region), we reject the null hypothesis and accept the alternative hypothesis, humbly acknowledging that we might be wrong. Rejecting a null hypothesis that is actually true is a Type I error.
   - If $t_0$ is not extreme, we fail to reject the null hypothesis. Failing to reject a null hypothesis that is actually false is a Type II error.

### Example, Steps 1 & 2: Null and Alternative Hypotheses

- $H_0$: Seed and Thatch ants can be considered to come from the same population.
- $H_1$: They cannot (a two-sided alternative; see Step 6b).

### Example, Step 3: Choose test statistic

We could do something crazy like the count statistic:

$$C_W = \frac{1}{N_T N_S} \sum_{i=1}^{N_T} \sum_{j=1}^{N_S} I(W_{T,i} > W_{S,j}), \qquad C_H = \frac{1}{N_T N_S} \sum_{i=1}^{N_T} \sum_{j=1}^{N_S} I(H_{T,i} > H_{S,j})$$

But that's kind of crazy. How about something relatively straightforward, like the difference between the means:

$$t_W = \bar{W}_T - \bar{W}_S, \qquad t_H = \bar{H}_T - \bar{H}_S$$

### Example, Step 4: Obtain null distribution

One approach is to use Monte Carlo simulation to obtain a simulated null distribution of the test statistic.

- If the null hypothesis is true, then there is no difference between the two groups, which means we can resample them any which way.
- So:
  1. Shuffle all weights $W$.
  2. Split them up into two new vectors, $W_{S,sim}$ and $W_{T,sim}$.
  3. Obtain and store the statistic $T_{W,sim} = \bar{W}_{T,sim} - \bar{W}_{S,sim}$.
  4. Repeat steps 1-3 a bunch of times.
  5. Repeat steps 1-4 for head sizes $H$.

[Figure: histograms of the simulated null distributions, $T_{W,sim}$ for $\bar{W}_T - \bar{W}_S$ and $T_{H,sim}$ for $\bar{H}_T - \bar{H}_S$; y-axis: frequency.]

### Example, Step 5: Assess observed statistic

[Figure: the same histograms with the observed statistics marked.]

$$t_W = 30.13333, \qquad t_H = 0.0504667$$

### Example, Step 6a: Is this extreme enough?

$$\Pr(T_{W,sim} > t_W) = 0, \qquad \Pr(T_{H,sim} > t_H) = 0.0598$$

### Example, Step 6b: Is this extreme enough?

The measure of extremeness should reflect the fact that $H_1$ is two-sided:

$$\Pr(|T_{W,sim}| > |t_W|) = 0, \qquad \Pr(|T_{H,sim}| > |t_H|) = 0.1188$$
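Steps 4 through 6 of the example can be sketched in plain Python as follows. This is an illustration under assumptions, not the code used for the lecture: the function name `randomization_test`, the fixed seed, and the five-ant samples are all hypothetical. "More extreme" is taken as a strict inequality, matching the probabilities written in Steps 6a and 6b.

```python
import random


def mean(values):
    return sum(values) / len(values)


def randomization_test(group_t, group_s, n_sim=10000, seed=0):
    """Monte Carlo randomization test for a difference in means.

    Returns (observed difference, one-sided p-value, two-sided p-value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    t_obs = mean(group_t) - mean(group_s)
    pooled = list(group_t) + list(group_s)
    n_t = len(group_t)

    n_greater = 0        # simulations with T_sim > t_obs
    n_more_extreme = 0   # simulations with |T_sim| > |t_obs|
    for _ in range(n_sim):
        rng.shuffle(pooled)                                # step 1: shuffle
        sim_t, sim_s = pooled[:n_t], pooled[n_t:]          # step 2: split
        t_sim = mean(sim_t) - mean(sim_s)                  # step 3: statistic
        if t_sim > t_obs:
            n_greater += 1
        if abs(t_sim) > abs(t_obs):
            n_more_extreme += 1
    return t_obs, n_greater / n_sim, n_more_extreme / n_sim


# Hypothetical weights in mg, for illustration only (NOT the lecture data).
thatch_weights = [150.0, 132.0, 171.0, 118.0, 160.0]
seed_weights = [120.0, 105.0, 140.0, 98.0, 111.0]

t_obs, p_one_sided, p_two_sided = randomization_test(thatch_weights, seed_weights)
print(t_obs, p_one_sided, p_two_sided)
```

Shuffling the pooled values and re-splitting them is what encodes the null hypothesis: if both species come from one population, the group labels carry no information.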
### Example, Monte Carlo simulation, Step 7: Decide: is this extreme enough? (aka the hoodoo-voodoo step)

- After 10000 simulations of random samplings of Weight under the null hypothesis, there were exactly 0 whose mean difference was more extreme than our measured difference of 30.1 mg. Thus, we can reject the null hypothesis with high confidence.
- After 10000 simulations of random samplings of Head size under the null hypothesis, about 11% had values that were more extreme than the measured difference in means of 0.051 mm. We could still reject the null hypothesis, but not with very high confidence, since there's a 1 in 10 chance that a sampling from the null hypothesis will yield a more extreme result than our data. A typical significance level is 0.05, but this is partially a historical artifact from the days when everyone relied on tables. If we finagled our hypotheses to be one-sided ($H_0$: $\mu_T = \mu_S$; $H_1$: $\mu_T > \mu_S$), the p-value drops to 0.059. Is that good enough? It's not strictly below 0.05. It's the sort of result that might be classified as "marginally significant".
- Let's say we really feel that it's not extreme enough. Does that mean the null hypothesis is true? NO! It just means we lacked the power to reject it. We really, really wanted to, but we failed to reject $H_0$.
- See why we call this the hoodoo-voodoo step?

### T Tests

In statistics, lots and lots of magical things happen when you make a few assumptions. The biggies are:

- Independence (also necessary for Monte Carlo, randomization, etc.)
- Constant variance between the groups that are being compared
- Normality

### T tests: Assess normality

[Figure: normal Q-Q plots (sample quantiles against theoretical quantiles) in six panels: all weights, Thatch ant weights, Seed ant weights, all head sizes, Thatch ant head sizes, Seed ant head sizes.]

### T tests: some basic math facts

- If $X_1, X_2, \ldots, X_n$ are iid r.v.'s with distribution $N(\mu, \sigma^2)$, then

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right), \qquad \sum_{i=1}^{n} \left(\frac{X_i - \mu}{\sigma}\right)^2 \sim \text{Chi-squared}(n)$$

- If $Y \sim N(0, 1)$ and $Z \sim \text{Chi-squared}(n)$ are independent, then

$$\frac{Y}{\sqrt{Z/n}} \sim T(n) \tag{3}$$

  where $T(n)$ is Student's T distribution with $n$ degrees of freedom.

- Exercise: combine all these facts to show that, under the assumption that $\mu = 0$,

$$\frac{\bar{X}}{\sqrt{\frac{1}{n(n-1)} \sum_{i=1}^{n} (X_i - \bar{X})^2}} \sim T(n - 1)$$

### T tests: a little more math

- Consider $n_1$ measurements of $X_1 \sim N(\mu_1, \sigma_1^2)$ and $n_2$ measurements of $X_2 \sim N(\mu_2, \sigma_2^2)$.
- Assume $\sigma_1^2 = \sigma_2^2 = \sigma^2$.
- Under $H_0$: $\mu_1 - \mu_2 = 0$.
- With these conditions, we can derive the following result:

$$t_0 = \frac{\bar{X}_1 - \bar{X}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t(n_1 + n_2 - 2) \tag{5}$$

  where $S_p^2$ is called the pooled variance and is a weighted estimate of $\sigma^2$:

$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2} \tag{6}$$

### T tests: long story short

This beast,

$$t_0 = \frac{\bar{X}_1 - \bar{X}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t(n_1 + n_2 - 2),$$

is a test statistic with a known null distribution, the T distribution, which looks a lot like the standard normal distribution.

### T tests: Check our data

- For weight: $t_W = 7.0841$; assess against $T_{58}$: $\Pr(|T_{58}| > |t_W|) \approx 10^{-9}$, so REJECT $H_0$.
- For head size: $t_H = 1.57$; assess against $T_{58}$: $\Pr(|T_{58}| > |t_H|) = 0.1217$, so FAIL TO REJECT $H_0$.

### T tests: Compare methods

[Figure: histogram of the simulated null distribution $T_{H,sim}$ beside the $T_{58}$ distribution.]
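To make the t-test side of the comparison concrete, the pooled statistic and a two-sided p-value can be computed with the standard library alone. This is a sketch under assumptions: the function names are invented for this illustration, the four-ant samples are hypothetical, and the p-value comes from numerically integrating the Student T density with Simpson's rule rather than from a table. In practice a routine such as `scipy.stats.ttest_ind` would normally be used instead.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(x, y):
    """Two-sample t statistic with pooled variance; returns (t0, degrees of freedom)."""
    n1, n2 = len(x), len(y)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t0 = (mean(x) - mean(y)) / math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return t0, n1 + n2 - 2


def t_density(t, df):
    """Density of Student's T distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p_value(t0, df, n_intervals=2000):
    """Pr(|T_df| > |t0|), using Simpson's rule on [-|t0|, |t0|] (n_intervals must be even)."""
    b = abs(t0)
    if b == 0:
        return 1.0
    h = 2 * b / n_intervals
    total = t_density(-b, df) + t_density(b, df)
    for k in range(1, n_intervals):
        total += (4 if k % 2 else 2) * t_density(-b + k * h, df)
    return 1.0 - total * h / 3


# Hypothetical head widths in mm, for illustration only (NOT the lecture data).
thatch_heads = [1.7, 1.8, 1.6, 1.9]
seed_heads = [1.5, 1.6, 1.4, 1.7]

t0, df = pooled_t_statistic(thatch_heads, seed_heads)
p_value = two_sided_p_value(t0, df)
print(t0, df, p_value)
```

With the real data, `two_sided_p_value(t0, 58)` is the quantity that gets compared with the two-sided randomization result.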
For head size, the two methods give $\Pr(T_{H,\mathrm{sim}} > t_H) = 0.1188$ and $\Pr(T_{58} > t_H) = 0.1217$.

Meet Mr. William Sealy Gosset

William Gosset (June 13, 1876 to October 16, 1937), the inventor of the T test, was a bright mathematician who worked for the Guinness brewing company. Some time earlier, a scientist had inadvertently revealed important brewing secrets in a science journal, so the company decided to clamp down on publishing careers. Gosset did his statistics late at night and published pseudonymously as "Student", hence Student's T. He went through much work hand-checking estimates while working on the small-sample problem. Apparently the company was too stingy with its wares for him to perform experiments on large samples. Or perhaps he felt sorry for his liver.

Introduction to stochastic processes

Eli Gurarie, QERM 598, Lecture 6, February 14, 2008

A stochastic process is:

- Any process in which outcomes in some variable (usually time, sometimes space, sometimes something else) are uncertain and best modelled probabilistically.
- "Stochastic" is to "deterministic" as "random variable" is to "number".
- Biggest difference from what we've done so far: dependent data.

Examples include: just about everything.

In ecology/biology, "just about everything" includes:

- Weather/climate
- Population biology
  - Birth/death/reproduction/mortality
  - Migrations and movements
- Evolution
  - Population genetics: mutation/selection/drift
  - Gene sequences
- Epidemiology
  - Disease spread within a population (SIR models)
  - Disease spread within an organism
  - Development of resistance
- Tools for assessing models and estimating parameters
  - MCMC (Markov chain Monte Carlo)
  - Simulated annealing
- and much, much more

"Just about anything" ALSO includes your life:

- You drop out of school to make lots of money in the stock market.
- You lose all your money gambling (Bernoulli and Bernoulli, 1713).
- You eventually do or don't find your way home from some unknown establishment you've chosen to drink your sorrows away in (Pearson, 1905).
These are all classic problems in stochastic processes.

Historical aside on stochastic processes

Andrei Andreevich Markov (1856-1922) was a brilliant Russian mathematician who refused to believe that the Central Limit Theorem only applies to independent data, and consequently came up with the most widely used formalism and much of the theory for stochastic processes. Passionate about math pedagogy, he was a strong proponent of problem solving over seminar-style lectures. A political activist, he refused tsarist honors, requested that he be excommunicated from the Russian Orthodox Church out of solidarity with the recently excommunicated Leo Tolstoy, publicly renounced his membership in the electorate when Parliament was dissolved, and eventually left his teaching post when the government demanded that teachers spy on students with socialist sentiments. He said this of his most famous English colleague: "I can judge all work only from a strictly mathematical point of view, and from this viewpoint it is clear to me that Pearson has done nothing of any note." [1]

[1] From Basharin et al. (2005), The Life and Work of A. A. Markov.

Discrete state transitions

Consider $X = \{X_1, X_2, X_3, \ldots\}$, where $X$ is in some discrete state space $S$ (here $\{A, B, C\}$), with fixed probabilities of transitioning from one state to another.

[Figure: state diagram of A, B, and C with arrows labeled by the transition probabilities $p_{AA}, p_{AB}, p_{AC}, \ldots, p_{CC}$.]

Sample sequence: X = CCCBBCACCBABCBA...

This object is called a Markov chain.

Some definitions

$X_n$ has the Markov property if

$$\Pr(X_n = x_n \mid X_1 = x_1, \ldots, X_{n-1} = x_{n-1}) = \Pr(X_n = x_n \mid X_{n-1} = x_{n-1})$$

for all $n$ and all $x_1, \ldots, x_n$. In other words, any system whose future depends only on the present and not on the past has the Markov property, and any $X_n$ that is Markovian is called a Markov chain.

The $p_{ij}$'s of a Markov chain are its transition probabilities. If the $p_{ij}$'s are time invariant, $p_{ij}(n) = p_{ij}$, the chain is called time-homogeneous, or is said to have stationary transition probabilities.

Discrete state transitions

We express this process in terms of a probability transition matrix $M$, such that

$$M_{ij} = \Pr(X_{t+1} = j \mid X_t = i) = p_{ij} \qquad (1)$$

Note that

$$\sum_{j=1}^{N} p_{ij} = 1 \quad \text{BUT} \quad \sum_{i=1}^{N} p_{ij} \neq 1 \qquad (2)$$

in general, so that

$$\Pr(X_{t+1} = j) = \sum_{i=1}^{N} M_{ij} \Pr(X_t = i) \qquad (3)$$

which can be conveniently rewritten in matrix notation as

$$\pi_{t+1} = M \times \pi_t \qquad (4)$$

where $\pi_t$ is the distribution of the system over all states at time $t$. Here the product is read componentwise as in (3): $(M \times \pi_t)_j = \sum_i M_{ij}\,\pi_t(i)$.

Example 1: German children play catch

[Figure: passing diagram among four children, Antje, Alexandra, Joachim, and Stefan, from an online Markov chain picture (Markovchain.png).]

Let's give the ball to Antje and see what happens.

[Sequence of successive ball holders, written with the abbreviations An, Al, Jo, and St; the individual steps are garbled in the source.]

This is called a realization of a stochastic process.
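Both views of a Markov chain, a single random realization and the evolving distribution over states, can be sketched in plain Python. The three-state transition probabilities in `P` below are hypothetical values chosen for illustration (the lecture's own numbers are not recoverable), and the helper names `simulate` and `step` are likewise assumptions of this sketch.

```python
import random

STATES = ["A", "B", "C"]
# Hypothetical transition probabilities (not from the lecture):
# P[i][j] = Pr(X_{t+1} = STATES[j] | X_t = STATES[i]); each row sums to 1.
P = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.3, 0.5],
]


def simulate(start, n_steps, rng):
    """Draw one realization of the chain as a string of state labels."""
    i = STATES.index(start)
    path = [STATES[i]]
    for _ in range(n_steps):
        # The next state depends only on the current one (Markov property).
        i = rng.choices(range(len(STATES)), weights=P[i])[0]
        path.append(STATES[i])
    return "".join(path)


def step(pi):
    """One update of the distribution: new_pi[j] = sum_i P[i][j] * pi[i]."""
    n = len(pi)
    return [sum(P[i][j] * pi[i] for i in range(n)) for j in range(n)]


rng = random.Random(598)
print(simulate("C", 15, rng))

pi = [0.0, 0.0, 1.0]  # start in state C with certainty
for t in range(5):
    print(t, [round(p, 3) for p in pi])
    pi = step(pi)
```

Each call to `simulate` gives a different realization (unless the seed is fixed), whereas repeated calls to `step` are deterministic: they track how the probability of being in each state changes from one time step to the next.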
Consider the process probabilistically

Give the ball to Antje again. The distribution $\pi_t$ over the four children then evolves as follows:

| $t$ | Antje | Alexandra | Joachim | Stefan |
|---|---|---|---|---|
| 0 | 1 | 0 | 0 | 0 |
| 1 | 0 | 0.75 | 0 | 0.25 |
| 2 | 0 | 0 | 0.438 | 0.562 |
| 3 | 0.146 | 0.146 | 0.562 | 0.146 |
| 4 | 0.188 | 0.297 | 0.182 | 0.333 |
| 5 | 0.061 | 0.201 | 0.408 | 0.33 |
| 6 | 0.136 | 0.181 | 0.381 | 0.302 |
| 7 | 0.127 | 0.229 | 0.347 | 0.297 |
| 8 | 0.116 | 0.211 | 0.354 | 0.319 |
| 9 | 0.118 | 0.205 | 0.372 | 0.305 |
| 10 | 0.124 | 0.212 | 0.356 | 0.307 |
| 11 | 0.119 | 0.212 | 0.36 | 0.309 |
| 12 | 0.12 | 0.209 | 0.362 | 0.309 |
| 13 | 0.121 | 0.211 | 0.361 | 0.308 |
| 14 | 0.12 | 0.211 | 0.36 | 0.309 |
| 15 | 0.12 | 0.21 | 0.361 | 0.308 |

[Figure: bar chart of the current distribution over Antje, Alexandra, Joachim, and Stefan (vertical axis from 0.0 to 1.0).]

The stationary state

The state $\pi^* = (0.120, 0.210, 0.361, 0.308)$ is referred to as stationary. Note that:

- The name is a little bit misleading: the ball is not stationary, it is always moving around.
- The state can be solved for mathematically:

$$\pi^* = M \times \pi^* \qquad (5)$$

  This is a straightforward linear algebra problem and is usually easy to solve with Mathematica.
- All states have a value between 0 and 1 and have a finite probability of being revisited forever and ever, until the children's arms fall off. Such states are termed recurrent, persistent, or ergodic.

Example 2: Genetic drift

Consider a population of size $N$ that features 2 alleles, A and B, at a given locus.

Genetic drift: Fisher-Wright matrix

If the state $X$ is defined as the number of A alleles in the population, then for $N = 3$ each row $i$ of the transition matrix is a binomial distribution with 3 trials and success probability $i/3$ (rows and columns are indexed by 0, 1, 2, 3):

$$M = \begin{pmatrix} 1.000 & 0.000 & 0.000 & 0.000 \\ 0.296 & 0.444 & 0.222 & 0.037 \\ 0.037 & 0.222 & 0.444 & 0.296 \\ 0.000 & 0.000 & 0.000 & 1.000 \end{pmatrix}$$

[Animation: frequency distribution of the number of copies of allele A in the population over successive generations.]

Fixation and transience

General Fisher-Wright matrix:

$$M_{ij} = \binom{N}{j}\left(\frac{i}{N}\right)^{j}\left(1 - \frac{i}{N}\right)^{N-j}$$

Some properties of genetic drift:

- It always eventually fixates at 0 or $N$.
- The proportion of fixation depends on the initial proportion of a given allele.
- The rate of fixation depends inversely on $N$.
- Other states are called transient (contrasted with recurrent) because the process does not necessarily return to them.

The final moral:

- Genetic drift is a stochastic fluctuation in allele frequencies that leads inexorably to fixation for small populations, but is counteracted by mutation and migration for large populations.

Example 3: Blackjack

[Figure 2: Graphical depiction of the full state space. Each state represents an ordered triple $(L, M, H)$ denoting the number of low, medium, and high cards that have been played from the shoe.]

Blackjack state space analysis from http://cmc.rice.edu/docs/docs/Wak2004Ju1AMarkovCha.pdf
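Returning to the stationary state from Example 1: one simple way to find it numerically is to keep applying the update until the distribution stops changing. The sketch below does this in plain Python for a hypothetical three-state chain (the four-child transition matrix is not recoverable from the notes). It assumes the chain is irreducible and aperiodic, so that the iteration converges; the function name `stationary_distribution` is an assumption of this example, and repeated iteration is a stand-in for solving the linear system directly.

```python
def stationary_distribution(p, tol=1e-12, max_iter=10_000):
    """Iterate pi[j] <- sum_i p[i][j] * pi[i] until it stops changing."""
    n = len(p)
    pi = [1.0 / n] * n  # uniform start; assumed irrelevant for ergodic chains
    for _ in range(max_iter):
        new_pi = [sum(p[i][j] * pi[i] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new_pi, pi)) < tol:
            return new_pi
        pi = new_pi
    raise RuntimeError("did not converge; the chain may be periodic or reducible")


# Hypothetical 3-state chain (not the four-child example from the lecture).
# P[i][j] is the probability of moving from state i to state j.
P = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.3, 0.5],
]
pi_star = stationary_distribution(P)
print([round(x, 3) for x in pi_star])
```

For this particular matrix the exact answer can be checked by hand: solving the balance equations gives (11/49, 21/49, 17/49).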

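The genetic drift example can also be checked numerically. The sketch below builds the Fisher-Wright transition matrix from binomial probabilities and pushes an initial distribution through many generations; the function names `fisher_wright_matrix` and `evolve` and the choice of 200 generations are assumptions made for this illustration.

```python
from math import comb


def fisher_wright_matrix(n):
    """M[i][j] = Pr(j copies of A next generation | i copies now)."""
    return [
        [comb(n, j) * (i / n) ** j * (1 - i / n) ** (n - j) for j in range(n + 1)]
        for i in range(n + 1)
    ]


def evolve(pi, m, generations):
    """Apply new_pi[j] = sum_i m[i][j] * pi[i] once per generation."""
    size = len(pi)
    for _ in range(generations):
        pi = [sum(m[i][j] * pi[i] for i in range(size)) for j in range(size)]
    return pi


M = fisher_wright_matrix(3)
for row in M:
    print([round(p, 3) for p in row])

# Start with exactly one copy of A among N = 3 and let the chain run.
final = evolve([0.0, 1.0, 0.0, 0.0], M, 200)
print([round(p, 3) for p in final])
```

With N = 3 the rows reproduce the matrix quoted in the notes (for example, the row for one copy of A rounds to 0.296, 0.444, 0.222, 0.037). After many generations essentially all of the probability sits on the two absorbing states 0 and N, split according to the initial proportion of allele A.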