This 5 page Class Notes was uploaded by Jose Parisian on Thursday October 29, 2015. The Class Notes belongs to ES714 at Wright State University taught by Staff in Fall.
Date Created: 10/29/15
Introduction 1
March 28, 2005

Chapter 1
Introduction

ES714 is an environmental statistics course for students who have had at least one previous course in statistics that covers the basics of statistical principles such as probability distributions, hypothesis testing, confidence intervals, t-tests, analysis of variance, and regression analysis. From this foundation we shall cover more advanced statistical topics that are commonly encountered in the environmental sciences. The field of statistics is vast, and the statistical applications pertinent to environmental statistics are also vast. It will be impossible to cover in great detail all the topics that we shall encounter in this course. In fact, entire courses can be devoted to each of the topics we will cover this quarter. These topics include:

- Sampling Designs
- Multiple Regression topics
- Logistic Regression and Generalized Linear Models
- Time Series Analysis
- Spatial Statistics

Sampling Designs

Environmental studies require data. Often the data is obtained in field studies where the investigator must obtain samples. In order to make valid statistical inferences from the data, the data must be representative of the population from which it is obtained. This typically involves randomly sampling the units in the study in some fashion. There are a wide variety of methods available. If one does not obtain a sample in a statistically valid fashion, then the resulting data may fail to be of much use due to inadvertent selection biases. Additionally, a prudent choice of a sampling design may lead to more efficient estimation of population parameters and save time and money in collecting data.

Multiple Regression Topics

Regression is one of the most used statistical tools. The reason for its high use is that regression allows us to model complicated relations between variables. We shall quickly review the fundamentals of multiple regression and then follow that up with topics involving regression with indicator variables, polynomial regression,
and nonlinear regression models.

Generalized Linear Models

Regression models can be generalized to handle response variables that are not normally distributed. The most commonly used type of generalized linear model is logistic regression. In logistic regression, the response is a Bernoulli (0 or 1) variable indicating success or failure. The estimated regression function gives the probability of success based on a covariate.

Time Series

Time series are very important in environmental studies. Any time data is collected over time, such as temperatures, the resulting data is a time series. Time series analysis allows us to study seasonal trends and overall trends in the data, as well as correlations between successive measurements.

Spatial Statistics

Environmental studies often involve data collected spatially, such as soil samples obtained in a large field. It is of interest in such studies to investigate whether clusters exist or whether the data is spatially correlated.

Some Basics of Experimental and Sampling Design

Experimental design is extremely important when embarking on any environmental study. Experimental design deals with designing experiments. Scientific results come from the analysis of data from such experiments. Here we introduce some basic terminology and ideas.

An experiment is a study in which researchers control the allocation of treatments to the experimental units. Thus, in an experiment, the researcher has direct control over the treatments received. The experimental unit is the smallest unit to which treatment combinations are applied. The observational unit is the unit upon which measurements are taken. Experimental units can be people, animals, beakers of liquids, plots of land, etc. Sometimes the experimental and observational units are one and the same, and other times they differ. For example, a beaker of water may contain several organisms, and the implementation of the experiment is to expose the organisms to a particular chemical. If the chemical is added to the water,
then the experimental unit is the beaker. If responses are recorded for individual organisms within the beaker, then the organisms are the observational units.

In observational designs, the researchers do not have control over the allocation of treatments, but instead observe a population of interest. The population is the collection of all possible sampling units. When identifying the population, the investigator must determine to which group the conclusions of the statistical inference will be applied. What constitutes a sampling unit may not always be well defined. For instance, if one is performing an observational study of the emerald ash borer in Ohio, the population may be defined as all ash trees in Ohio, with trees being the sampling units. Alternatively, sampling units could be defined as plots of land in this example.

Sometimes the population will be hypothetical. For instance, in the emerald ash borer example, the population may be all trees infested with the borer in Ohio. If the borer has not entered Ohio yet, then the population is hypothetical. Nonetheless, experiments could be conducted by infesting trees with the borer in order to study different treatments for eradicating the borer.

If the goal of an experiment is to study the effect of some treatment, say an environmental toxin, on a population of interest, then good experimental design requires eliminating, to the greatest extent possible, all other variables that may affect the outcome. For example, if we want to study the effect of a toxin on a particular organism, then one would want to control, to the greatest extent possible, the effect of any other variables on the organism. If an effect is discovered when analyzing the data, the experimenter would like to be able to attribute that effect to the toxin. However, this will not be possible if there are other variables that were not controlled and that could have caused the effect. This is sometimes referred to as local control. Experimental units
should be as similar to each other as possible in order to obtain local control. If there is a high degree of variability among the experimental units, then it may be difficult to detect effects because of the increased variability. Moreover, if high levels of variability exist among experimental units, then it may not be possible to tell whether an observed effect is due to the experimental conditions or to differences among the experimental units. If we want to observe the effect of a toxin on an organism, then we want to factor out confounding factors due to differences in temperature, differences in lab conditions and lab technicians, solution preparations, etc.

Blinding

Another aspect of local control is the notion of blinding. There are many examples of studies whose results came out wrong because the experimenter had reason to believe the experiment would come out in a particular way before the experiment was even conducted. This phenomenon occurs every day: if someone believes that something is a particular way, they find evidence that supports their view while discarding or ignoring evidence that does not support their view. Often this happens on a subconscious level. Unfortunately, scientists are not exempt from this sort of bias. For this reason, it is important for investigators to protect their research by blinding, or masking. For instance, if one wants to study the effect of an environmental toxin on an organism in an experiment using different levels of toxin exposure, then the investigator should not know which experimental units received which treatment. A single-blind study is one where the subjects do not know which treatment they are receiving. This may not be very relevant in a study on fish, say, but it can have a big impact in clinical trials involving humans. A double-blind study is one where neither the subjects nor the experimenter know which subjects are getting which treatments.

Controls

Another fundamental aspect of experimental design is that of a control group. If a study
of the effect of a toxin on an organism shows an effect, how can the experimenter know the effect is more than what could occur by chance alone? A control group in this situation may be a group of organisms that are as similar as possible to the organisms that are exposed to the toxin, except that the control group is not subject to exposure. Therefore, if a difference is observed between the treatment group and the control group that is too big to have occurred by chance alone, then it may be reasonable to attribute the difference to the toxin.

In studies of antidepressants, the control group is often a group of subjects who take a placebo. It is important that these studies be double-blind. It is well known that there is a high placebo response rate in studies of depression. In other words, many depressed subjects will report some improvement simply from knowing that they are taking something that is supposed to make them feel better, even if that something does not actually work. If a group of subjects receiving the actual antidepressant treatment do improve on average, the experimenters need to determine whether this improvement is greater than what would be observed from the placebo effect alone. Another way of incorporating controls into an experiment, besides a placebo control, is to have the control group receive a well-established standard treatment, which will then be compared to the treatment of interest.

When implementing an experiment, there will be a variable of interest, or perhaps several variables. We may record something as simple as whether or not an organism survives, or we may record how long an organism survives. The variable of interest may be a quantitative measurement such as the size of a plant or animal (length, width, etc.). The experiment will then control factors that are thought to affect the variable. For instance, in the toxicity study on fish, the dose of the toxin would be the factor, and we may perform the experiment at several different levels of the factor,
i.e., different doses. Oftentimes an experiment will incorporate other factors as well, for instance water temperature, pH level of the water, etc.

A blocking factor is one whose effect on the variable is not of interest but is known to affect the outcome, and therefore it must be controlled for. A blocking factor, then, is used to factor out extraneous sources of variability so that a more focused study of the effect of the treatment can be made. For instance, suppose interest lies in assessing the effect of pesticide runoff on the thickness of turtle eggshells. Suppose also that a turtle's diet has a very big effect on eggshell thickness. If a study is undertaken to study the effect of the pesticide on eggshell thickness but there is no control over the diet of the turtles, then it could be the case that most of the variability in the observed eggshell thicknesses is due to the difference in diets. If this variability is too great, then it may not be possible to detect a difference in mean eggshell thicknesses due to the pesticide. However, if diet is used as a blocking factor, then this source of variability can be controlled, leading to a more powerful statistical analysis, i.e., an analysis that is more likely to find an effect due to the pesticide if such an effect actually exists.

Randomization

We have discussed issues related to factoring out variability due to factors that are not of concern in an experiment. It must be noted, however, that there will always be some intrinsic variability between sampling units. When studying animals, no two animals are exactly the same, even after considering factors such as age, size, etc. In order to balance out uncontrolled systematic effects in an experiment, units are assigned to treatment combinations using randomization. Randomization is the process of assigning treatments to experimental units at random. If there exist systematic effects that we have not or could not control for, then the use of randomization will hopefully wash out
differences between treatments that are not due to the treatments themselves. Suppose in the fish toxicity example that some fish have a genetic predisposition to be immune to the harmful effects of the toxin, and the experimenters are unaware of this genetic effect. If the fish are randomized to different treatments based on varying doses of the toxin, then randomization makes it highly probable that the fish with and without this genetic predisposition will balance out somewhat between the treatments. Random number generators can be used to implement randomization in practice. Most statistical software packages have random number generators.

Non-random allocation of treatments to experimental units once again puts the experiment at risk of bias, even when it is unintentional.

Finally, remember that the most sophisticated statistical analysis available cannot salvage a poorly designed or implemented experiment.
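
As a minimal sketch of how a random number generator might be used to carry out the randomization described above, the following Python code assigns treatments to experimental units in two ways: completely at random, and at random within the levels of a blocking factor. The function names, the fish and turtle labels, the dose levels, and the seed are all hypothetical, chosen only for illustration; the notes do not prescribe any particular software or procedure.

```python
import random


def randomize_completely(units, treatments, seed=None):
    """Completely randomized design (illustrative sketch).

    Shuffle the units, then deal the treatments out in rotation so that
    group sizes are as equal as possible.
    """
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: treatments[i % len(treatments)]
            for i, unit in enumerate(shuffled)}


def randomize_within_blocks(blocks, treatments, seed=None):
    """Randomized block design (illustrative sketch).

    `blocks` maps each level of the blocking factor to its units.
    Treatments are randomized separately inside every block, so each
    treatment appears about equally often at each level of the block.
    """
    rng = random.Random(seed)
    assignment = {}
    for block_units in blocks.values():
        shuffled = list(block_units)
        rng.shuffle(shuffled)
        for i, unit in enumerate(shuffled):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment


# Hypothetical fish toxicity study: 12 fish, 4 dose levels.
fish = [f"fish_{i:02d}" for i in range(1, 13)]
doses = ["control", "low", "medium", "high"]
fish_plan = randomize_completely(fish, doses, seed=714)

# Hypothetical turtle eggshell study with diet as the blocking factor.
turtles_by_diet = {
    "diet_A": [f"turtle_A{i}" for i in range(1, 7)],
    "diet_B": [f"turtle_B{i}" for i in range(1, 7)],
}
exposures = ["pesticide", "no_pesticide"]
turtle_plan = randomize_within_blocks(turtles_by_diet, exposures, seed=714)

for unit in sorted(fish_plan):
    print(unit, fish_plan[unit])
```

Recording the seed alongside the allocation is one way to make the assignment auditable later. In the blocked version, every diet group contains the same number of exposed and unexposed turtles, which is what allows diet-related variability to be separated from the pesticide effect.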