
Statistical Methods for Bioscience II

by: Imelda Casper


Statistical Methods for Bioscience II HORT 572


This 6 page Class Notes was uploaded by Imelda Casper on Thursday September 17, 2015. The Class Notes belongs to HORT 572 at University of Wisconsin - Madison taught by Staff in Fall. Since its upload, it has received 10 views. For similar materials see /class/205253/hort-572-university-of-wisconsin-madison in Agricultural & Resource Econ at University of Wisconsin - Madison.


Date Created: 09/17/15
Stat/For/Hort 572, Larget, March 11, 2008

Review 2

During this semester we have studied several types of models:

1. simple and multiple regression with quantitative and categorical explanatory variables;
2. logistic regression;
3. Poisson regression.

You should know when each type of model is appropriate. In particular,

• standard regression assumes that the outcome variable is normally distributed with a mean that is a linear combination of the predictors; sometimes a transformation of the outcome variable or inputs is needed for the model to fit;
• logistic regression assumes that each outcome is a zero or one; the probability of success is modeled as the inverse logit function, 1/(1 + exp(-η)), of a linear combination η of the predictors;
• Poisson regression assumes that the outcome variable has a Poisson distribution, which models counts without a fixed maximum; the mean of the distribution is the exponential of a linear combination of predictors.

Both logistic regression and Poisson regression are examples of generalized linear models. In both cases, they can be extended with family quasibinomial or quasipoisson to account for overdispersion.

You should be prepared for the following:

1. from a description of data and a biological scenario, be able to identify which type of model is appropriate;
2. given R output for each type of model, be able to make a numerical prediction of an outcome based on new input variable values;
3. be able to interpret the meaning of estimated coefficients.

There are also a number of concepts that we have discussed that you should be clear about.

• Simple linear regression model assumptions: (1) linear relationship between response and explanatory variables; (2) errors are independent of explanatory variables and each other; (3) constant variance; and (4) errors are normally distributed. Know how to identify possible violations, for example from residual plots.

• Alternative viewpoint of simple linear regression: Understand that if the input x is 2 standard deviations above its mean, then the predicted y is 2r standard deviations above its mean, where r is the correlation coefficient between x and y.

• Correlation: The correlation coefficient r measures the strength of the linear relationship between x and y on a scale between -1 and 1. Strong nonlinear relationships can have any correlation coefficient strictly within this range. The sign of the correlation coefficient is the same as the direction of the association. For a perfectly linear relationship between x and y, r is either -1 or 1 exactly.

• Transformations
  - In regression problems, transformations of the response or explanatory variables can make the linear model fit the data better.
  - Linear transformations of the explanatory variables do not change the goodness of fit of the model, but do change the interpretation of regression coefficients. Common linear transformations include centering, which subtracts a mean or some other central value, and standardizing, which subtracts the mean and divides by the standard deviation or some other statistic related to variability.
  - Log transformations do change the model and are often useful for addressing violations of the linear model assumptions.

• Model matrix representation for multiple regression: You are not responsible for the matrix algebra, but you should understand that a multiple regression model depends on a matrix where:
  1. There is a column of ones for the intercept.
  2. Each continuous variable is represented by a single column.
  3. Each factor with k levels is represented by k - 1 columns. There is no unique way to do this, but the typical parameterization is for each column to be filled with 0s and 1s, where one level is treated as a reference and each other level is associated with a single column that indicates whether the observation is in that level.
  4. Interactions between quantitative variables add a single column that contains the product of the two variables.
  5. Interactions between a quantitative variable and a factor with k levels add k - 1 columns to the matrix, each of which is the product of the column for the quantitative variable and one of the k - 1 columns for the factor.
  6. In general, interactions between any number of variables add columns for each product of columns associated with the main variables.
  Know that there is a single coefficient β for each column in the model matrix.

• Least squares versus maximum likelihood as model criteria: Simple and multiple regression models find the parameters that minimize the sum of squared residuals. This least squares criterion is equivalent to the maximum likelihood criterion, which estimates parameters to make the likelihood, or probability of the observed data, as large as possible. This equivalence follows from the normal distribution. Generalized linear models based on other distributions, such as the binomial for logistic regression or the Poisson, estimate parameters by maximum likelihood, which is different from least squares.

• Overdispersion
  - Both generalized linear models we have seen can be generalized with overdispersion.
  - These models allow for more variability than the binomial or Poisson model would predict.
  - These models can be understood as a two-stage hierarchical model where the parameter (p for binomial, μ for Poisson) is drawn first from some distribution and the response y depends on this parameter.

Exam format

1. The exam will be worth 100 points.
2. There will be eight True/False-and-explain questions worth 4 points each (32 points total).
   - These questions will each be designed to test your understanding of a single concept discussed above.
   - If the statement is true, there is no need to add explanation, but if the correct answer is false, explanation can be worth partial credit if it shows understanding. If the answer is false, you should briefly explain why.
3. There will be three short answer questions worth 6 points each (18 points total).
   - The answer to each of these questions should be brief, consisting of no more than a few sentences.
   - These questions will test conceptual understanding of major topics seen in the course.
4. There will be two data analysis questions worth 25 points each (50 points total). Each of these data analysis questions will have several parts.
   - Each problem will have a biological description of data similar to those in your homeworks. Each problem will include summaries of a data analysis, possibly including R output of numerical coefficient estimates as well as graphs of data.
   - Many parts will involve doing calculations associated with the model, such as making predictions. Some parts will ask you to interpret the coefficients or your prediction calculations in the context of the biological problem.

Sample Problems

1. Circle TRUE or FALSE. If FALSE, briefly explain why.
   After fitting a logistic regression model, a plot of residuals versus fitted values is useful for seeing if model assumptions are violated.

2. Circle TRUE or FALSE. If FALSE, briefly explain why.
   In a multiple regression problem, a quantitative input variable x is replaced by x - mean(x). The R² statistic for the fitted model will be the same.

3. Circle TRUE or FALSE. If FALSE, briefly explain why.
   In a multiple regression problem, a quantitative input variable x is replaced by x - mean(x). The coefficient β associated with x will have the same numerical value after the transformation that it had before.

4. SHORT ANSWER. In a multiple regression model that predicts the weight of a dairy cow one year after birth with inputs (1) the weight of its mother in kg and (2) a factor with k diets, state what the intercept term measures in the model, and briefly explain why the model should include the intercept term even though it has no useful biological interpretation.

5. PREDICTION. A regression model predicts the urine urea nitrogen (UUN) concentration (mg/dL) of Holstein dairy cattle on the basis of the weight of the cow and diet, which is one of four treatments, A, B, C, and D.

       > display(fit1, digits = 3)
       lm(formula = UUN ~ weight + diet)
                   coef.est  coef.se
       (Intercept)   -2.652  226.073
       weight         0.192    0.150
       dietB        178.087   40.553
       dietC        426.424   42.661
       dietD        627.502   42.401
       ---
       n = 20, k = 5
       residual sd = 63.937, R-Squared = 0.95

   (a) Predict the UUN concentration for a 1600 pound cow in each of the four treatment groups.

   (b) A second model includes an interaction between weight and diet.

       > display(fit2, digits = 3)
       lm(formula = UUN ~ weight * diet)
                     coef.est  coef.se
       (Intercept)    212.943  388.186
       weight           0.048    0.259
       dietB         -434.496  607.750
       dietC         -278.767  754.457
       dietD          742.005  583.112
       weight:dietB     0.406    0.402
       weight:dietC     0.453    0.483
       weight:dietD    -0.090    0.402
       ---
       n = 20, k = 8
       residual sd = 65.746, R-Squared = 0.96

       Predict the UUN concentration for a 1600 pound cow in each of the four treatment groups.

   (c) For each model, you could graph four lines showing the predicted UUN versus weight, with separate lines for each treatment group. Briefly describe how to distinguish between the graphs for each model.

   (d) For each model, what change in UUN concentration is predicted from an increase of 100 pounds in a cow in treatment group B?

Sample Problem Solutions

1. FALSE. A plot of residuals versus fitted values in logistic regression will show two curves, one for the 0 outcomes and one for the 1 outcomes. This plot will not be helpful in assessing the quality of fit of the model.

2. TRUE. Linear transformations do not affect the goodness of fit of linear models.

3. FALSE. Linear transformations do affect the values of coefficients.

4. The intercept is the predicted weight of a dairy cow whose mother weighed 0 kg when given the first diet. There is no biological relevance to this interpretation, as the mother will not have weighed 0 kg. However, the model should include an intercept so that the fitted line is not constrained to pass through the origin. We want to fit the best possible line within the range of the observed data.

5. (a) For the first model, the predicted UUN values are 305 for A, 483 for B, 731 for C, and 932 for D. These are found from -2.652 + 1600(0.192) + z, where z = 0, 178, 426.4, 627.5 for the four treatment groups.

   (b) The predictions are:
       A: 212.943 + 0.048(1600) = 290
       B: 212.943 - 434.5 + 0.048(1600) + 0.406(1600) = 505
       C: 212.943 - 278.8 + 0.048(1600) + 0.453(1600) = 736
       D: 212.943 + 742 + 0.048(1600) - 0.090(1600) = 888

   (c) The model in (a) will show four parallel lines. The model in part (b) will show four lines, but they will not be parallel.

   (d) In model (a), the slope is 0.192 mg/dL per pound, so the predicted increase in concentration is 19.2 mg/dL. In model (b), the slope is 0.048 + 0.406 = 0.454, so the predicted increase in UUN concentration would be 45.4 mg/dL.
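The prediction arithmetic in sample problem 5 can be checked with a short sketch. It is written in plain Python rather than R so that it runs without the original data or the fitted model objects. The dictionary layout and the function names `slope` and `predict_uun` are illustrative assumptions, not part of the handout. The coefficient values are read from the two `display()` outputs, with the signs implied by the worked solutions.

```python
# Coefficient estimates taken from the two display() outputs in problem 5.
# Diet A is the reference level, so its offsets are zero.
MODEL1 = {
    "intercept": -2.652,
    "weight": 0.192,
    "diet": {"A": 0.0, "B": 178.087, "C": 426.424, "D": 627.502},
}
MODEL2 = {
    "intercept": 212.943,
    "weight": 0.048,
    "diet": {"A": 0.0, "B": -434.496, "C": -278.767, "D": 742.005},
    # Interaction terms: adjustment to the weight slope for each diet.
    "weight_by_diet": {"A": 0.0, "B": 0.406, "C": 0.453, "D": -0.090},
}


def slope(model, diet):
    """Change in predicted UUN (mg/dL) per pound of weight for one diet."""
    return model["weight"] + model.get("weight_by_diet", {}).get(diet, 0.0)


def predict_uun(model, weight, diet):
    """Predicted UUN: intercept + diet offset + diet-specific slope * weight."""
    return model["intercept"] + model["diet"][diet] + slope(model, diet) * weight


for name, model in (("model 1", MODEL1), ("model 2", MODEL2)):
    predictions = {d: round(predict_uun(model, 1600, d)) for d in "ABCD"}
    change_b = round(100 * slope(model, "B"), 1)
    print(name, predictions, "change per 100 lb on diet B:", change_b)
```

Because the first model has no interaction terms, `slope` returns the same value for every diet, which is why its four fitted lines are parallel. In the second model the slope differs by diet, so the lines are not parallel.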
