Sampling and Analysis of Environmental Contaminants ENVS 541
This 64 page Class Notes was uploaded by Gerry Spinka on Friday October 23, 2015. The Class Notes belongs to ENVS 541 at University of Idaho taught by Staff in Fall. Since its upload, it has received 8 views. For similar materials see /class/227753/envs-541-university-of-idaho in Environmental Science at University of Idaho.
Date Created: 10/23/15
Conclusion: Pseudoreplication, Multiple Testing, Meta-Analysis, and Bayesian Methods
4/12/2002

Pseudoreplication
o Pseudoreplication is an issue when data points are correlated in space or time, or when the data collected are not representative of the entire population.
o In these cases you may have less information than you think.
o You should adjust your degrees of freedom downward.
o Example
  - In a study I was involved with, a graduate student was to collect data on lead contamination in homes in the Bunker Hill Superfund area.
  - The plan was to drive to a neighborhood and go door-to-door asking if residents would participate. If yes, then information and samples were collected, and the student went next door and continued. At the end of the week the data collection would end.
  - But homes in a neighborhood tend to be alike in age, value, and condition. People who live there also tend to be alike. Also, some neighborhoods would be expected to be more contaminated than others. So data points collected in this way are correlated.
  - Also, some neighborhoods would be well covered and others may be skipped.
  - The solution was to change the design: randomly pick a neighborhood, go door-to-door until someone agrees to participate, collect that data, then randomly pick another neighborhood.
  - The difference is that each data point is chosen at random from the entire population and should be uncorrelated with the others.

Multiple Testing
o This problem occurs when many tests of significance are carried out on a data set. Some are likely to be significant by chance.
o In fact, a proportion $\alpha$ of the true null hypotheses will be incorrectly rejected. This is, after all, the definition of $\alpha$.
o So some results will appear to be significant when they aren't.
o The solution is to
take the number of tests into account and adjust the procedure for rejecting the null hypothesis when multiple tests are performed.
o Beware of blindly running many, many tests on a data set searching for the one that is significant. This could lead to a declaration of COLD FUSION!

Meta-Analysis
o Meta-analysis involves combining the information from a number of studies to see if the studies as a group support or reject a hypothesis.
o There are many methods for doing this.
o We won't explore the details of these methods.

Bayesian Methods
o There are two types of statisticians in the world: Frequentists and Bayesians.
o Frequentists view probability as completely objective. They look at all statistical methods from the standpoint of what would happen in the long run if a sample were taken over and over. Statistics is often taught by them beginning with coin flips, pulling balls from an urn, or using a deck of cards.
o Bayesians, on the other hand, view probability as subjective.
o Probability can be expressed as a degree of belief that an event will occur.
o That belief could be based purely on the data in hand or could involve past experience, other data, expert judgement, theory, etc.
o Bayesians express their degree of belief about a parameter as a prior probability distribution.
o Then they incorporate new data into the analysis using a likelihood function: the likelihood that those data occurred given a particular parameter value.
o The result is another probability distribution, called a posterior distribution.
o The Reverend Thomas Bayes created this approach, called Bayes' Theorem.
o Let $\theta$ be the parameter being estimated. For simplicity, assume that $\theta$ can take on a small number of values $\theta_1, \theta_2, \ldots, \theta_n$.
o You as the investigator may have some guess as to the probabilities of these being the
best estimate of $\theta$.
o If you don't, then they are equally likely.
o These probabilities are your priors: $P(\theta_1), P(\theta_2), \ldots, P(\theta_n)$. Note: these probabilities must sum to 1.
o Then you collect some new data.
o You can determine the probability that those data occurred given that $\theta = \theta_i$.
o That's called a likelihood and is denoted by $P(\text{data} \mid \theta_i)$.
o Then your prior beliefs and the data are combined to give a new, posterior set of probabilities using Bayes' Theorem:

  $$P(\theta_i \mid \text{data}) = \frac{P(\text{data} \mid \theta_i)\,P(\theta_i)}{\sum_{k=1}^{n} P(\text{data} \mid \theta_k)\,P(\theta_k)}$$

o If, instead of a small number of possibilities for the parameter, it could fall anywhere in a range, then the summation is replaced by an integral.
o These methods are being used more and more in environmental science, since events are rarely reproducible, data sets are limited, and expert judgement often must be used.

Data Quality Objectives
4/12/2002, Module 2.7
o Data quality objectives are a planning tool to help ensure that data collected for a study are
  - the right amount
  - the right kind
  - the right quality
o Too often data are collected based on
  - what's been collected in the past
  - what's easy to collect
  - what's affordable
  - what's familiar
  rather than focusing on what needs to be done.
o The US Environmental Protection Agency developed the data quality objectives process to help organizations plan data collection activities to effectively and efficiently address environmental contamination issues.
o State the problem
  - Understand exactly what is being studied and why.
  - Often different stakeholders have different views of the problem. The time to come to a common understanding is prior to the data collection, not later.
o Identify the decision
  - Determine what decisions will be made based on the data.
o Identify inputs to the decision
  - Decide what data is needed to make the decisions that need to be made.
  - This involves thinking about what variables need to be measured.
o Define the study boundaries
  - What is the timeframe for the study?
  - What are the spatial boundaries of the study area? Three dimensions: length, width, depth.
o Develop a decision rule
  - Decide on the action limit by deciding how the data will be analyzed and which results will lead to which management actions.
o Specify limits on decision errors
  - Two types of decision errors can exist: do nothing when a problem exists, or do something when no problem exists.
  - Decide what probability of each type of error is acceptable.
o Optimize the design

Stratified Random Sampling
4/12/2002, Module 2.3
o Stratified random sampling involves splitting the population into sections, or strata, and choosing a random sample from each stratum.
o It is appropriate when population units are more similar within each stratum than they are across strata.
o Populations of people are often stratified by age, sex, geographic location, political party, or other important variables.
o Environmental samples are often stratified by land type, terrain, geography, geology, land use, zones of contamination, and so forth.
o Advantages of stratification
  - You can calculate separate estimates of the parameters for each stratum.
  - If the strata are different from one another on the
characteristic under study (contamination, for example), you may make different management decisions for different strata.
  - Different strata can be sampled more or less intensively depending on study goals and population characteristics. For example, areas expected to be more variable should be sampled more intensively.
  - The standard error of the mean will be smaller than for simple random sampling (SRS), particularly if the strata are quite different from one another.
o Disadvantages of stratification
  - You usually make decisions on stratification before the study is carried out, and these choices may turn out to be incorrect.
  - Stratification can complicate later data use.
  - Data analysis is more complicated.

Notation
o $K$ = number of strata
o $N_i$ = size of the $i$th stratum population
o $N$ = size of the total population
o $n_i$ = size of the $i$th stratum sample
o $n = \sum_{i=1}^{K} n_i$ = size of the total sample

Sample Statistics from Stratified Random Sampling
o $\mu_i$ = population mean of the $i$th stratum
o $\bar{y}_i$ = sample mean of the $i$th stratum
o $s_i$ = sample standard deviation of the $i$th stratum
o The sample stratum mean and standard deviation are calculated in the normal way.
o The overall mean is

  $$\bar{y}_s = \sum_{i=1}^{K} \frac{N_i}{N}\,\bar{y}_i = \sum_{i=1}^{K} w_i\,\bar{y}_i$$

  where $w_i = N_i/N$ is the proportion of the population in the $i$th stratum.
o $\bar{y}_s$ has sample standard error

  $$SE(\bar{y}_s) = \sqrt{\sum_{i=1}^{K} \left(\frac{N_i}{N}\right)^2 \frac{s_i^2}{n_i}\left(1 - \frac{n_i}{N_i}\right)}$$

o An approximate $100(1-\alpha)\%$ confidence interval for $\mu$ is

  $$\bar{y}_s \pm z_{\alpha/2}\,SE(\bar{y}_s)$$

Systematic Sampling
9/6/2005, Module 2.4
o Systematic sampling involves selecting sample units
according to a specified pattern in time or space.
o This ensures that the entire population is evenly covered.
o For this reason it is intuitively appealing and often used in environmental sampling.

One-Dimensional Procedure
o One-dimensional systematic sampling is useful for sampling in time or along a line such as a river or stream.
o Calculate $k = N/n$ as the spacing interval. Round $k$ to the nearest integer.
o Pick a starting position at random and sample it.
o Select the second sample point $k$ distance from it, the third $k$ distance from the second, and so on until $n$ samples have been defined.
o Notes
  - Often the last distance does not exactly equal $k$.
  - View the linear population as circular, so that when the end is reached you circle back to the beginning and continue.
o Example
  - Population: the days in July, $N = 31$.
  - Sample 6 days in the month: $k = 31/6 = 5.17 \approx 5$.
  - Randomly pick a number between 1 and 31: 17.
  - Sample days: 17, 22, 27, 1, 6, 11.
  - Note: day 1 was calculated by $27 + 5 = 32$, $32 - 31 = 1$.
  - Sort into order for reporting purposes: 1, 6, 11, 17, 22, 27.

Two-Dimensional Procedure
o Two-dimensional systematic sampling is useful for sampling the land surface.
o Split the population into $n$ equal areas.
o The areas could be squares, rectangles, triangles, or any other shape.
o Pick a point in each area to sample. Points can be
  - the middle of each area
  - randomly chosen once, and then that same point is sampled in each area
  - randomly chosen as a different point in each area

Data Analysis
o The simplest way to analyze the data is to treat it as though it was collected using simple random sampling. This probably overestimates the standard error, but that is better than underestimating it.
o Another way to analyze the data is to arbitrarily stratify into groupings of equal numbers of points and analyze using methods for
stratified samples.
o The best way is to analyze using methods for spatial data (Chapter 9).

Systematic Sampling
o If the population has natural cycles or trends and the periodicity of the sampling coincides with the cycle, then the sample will be biased.
o For example, temperature cycles based on 24-hour periods as well as 365-day periods.
o If $k$ = 12 hours, then samples will always be taken at the same times each day.
o If temperature affects the measurement, then the full range of temperature will not be experienced and the samples will be biased.
o Advantages
  - Easy to do
  - Intuitively appealing
o Disadvantages
  - Simple analysis overestimates the standard error
  - Cycles or patterns in the population may create problems

Observing Data vs. Performing Experiments
4/12/2002, Module 4.1
o Analyzing environmental data is often complicated by nonadherence to standard assumptions.
o Some issues that arise:
  - Rarely can experiments be carried out on the environment. It is difficult to control some variables while studying others.
  - There are many types of trends and patterns in nature at many different scales, all happening simultaneously, and these are not well understood by us.
  - Laboratory-scale experiments sometimes give different results than what is observed at a larger scale in nature.
  - Standard distributions may not apply.
  - Data can be missing or not measurable (censored).
  - Data points close together in time or space are more alike than those further apart (autocorrelation).
o A few of these issues will be discussed in this module.
o Observation and experimentation are the basis of
science.
o Observation is passive, while experimentation is active.
o Experimentation involves purposefully changing some conditions in a controlled way and observing the result.
o In order to be certain that changing variable A will affect variable B in a certain way, it is necessary to change A and observe B while holding everything else more or less constant.
o Other important considerations in experimentation are randomization, replication, and controls.
o Randomization involves making sure that sample units are randomly chosen to be measured or are randomly assigned to treatment groups.
o This ensures that differences between sample units on any number of unmeasured characteristics will not interfere with observing the effect of the variable under study.
o If careful randomization is not used, unintentional bias can result:
  - A researcher unconsciously picks the unhealthiest specimens to measure for effects of environmental contamination because that agrees with his beliefs.
  - A scientist with a new idea for treatment unintentionally selects the healthiest lab mice to test it on.
o Replication involves doing the experiment multiple times or collecting multiple data points.
o Without replication we don't know how much variability is normal, and so we can't tell if what we saw was natural variability or a real effect.
o Controls are sample units to which the treatment being studied is not given.
o Controls allow us to see what would have happened to the population if we had left it alone.
o Without controls we must make a judgement about this issue and will have little assurance of its validity.
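As a rough illustration of the randomization and control ideas above, the sketch below randomly assigns sample units to a treatment group and a control group. The function name `assign_groups`, the unit labels, the group sizes, and the seed are all hypothetical choices made for this illustration; none of them come from the course notes.

```python
import random


def assign_groups(unit_ids, n_treated, seed=None):
    """Randomly split sample units into treated and control groups.

    Hypothetical helper: every unit has the same chance of landing in
    either group, so the investigator's preferences cannot steer the split.
    """
    rng = random.Random(seed)      # a fixed seed makes the assignment reproducible
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)          # random ordering of all units
    treated = sorted(shuffled[:n_treated])
    control = sorted(shuffled[n_treated:])
    return treated, control


# Example: 20 lab mice labeled 1..20; 10 get the treatment, 10 serve as controls.
treated, control = assign_groups(range(1, 21), n_treated=10, seed=541)
print("treated:", treated)
print("control:", control)
```

Having several units in each group supplies the replication, and the untreated group supplies the controls. Recording the seed lets someone else reproduce the assignment.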
o The problem is that proper experimentation is rarely possible in the field of environmental science.
o So we often must rely on observational data.
o If we observe A and B (and other things), and A and B change together, then A may cause B, or vice versa, or they may both be caused by C.
o We can do laboratory experiments to check on relationships.
o We must also use our knowledge, experience, scientific theory, etc. to make judgements about the environment.
o Be careful to
  - Be a skeptic: double check.
  - Make sure observed relationships make sense theoretically.
  - Use experimentation where possible to validate observed relationships.
  - Don't assume that if a relationship holds in the laboratory it will definitely hold in nature; you must observe it on that scale.
  - Make as few assumptions as possible regarding the data.
  - Be as open-minded as possible when analyzing the data; look at what the data is telling you.
o In other words, being a good environmental statistician is being a good scientist.

Populations and Samples
4/12/2002, Module 2.1

Introduction
o According to the American Heritage Dictionary of the English Language, statistics is "The mathematics of the collection, organization, and interpretation of numerical data, especially the analysis of population characteristics by inference from sampling."
o Almost everything we understand about the way the world works is based on some sort of data. And that data, by its nature, is incomplete and imperfect.
o Incompleteness comes from our inability to study every single element of a situation under study.
o Imperfection includes things such as errors in measurement.
o Therefore we must make inferences
about the whole by only seeing a fuzzy picture of a part of the whole.
o All data must be analyzed and interpreted. Statistics is the science of doing this well.
o Quote: "He uses statistics as a drunken man uses lampposts: for support rather than illumination." Andrew Lang (1844-1912), Scottish author. Quoted in Alan L. Mackay, The Harvest of a Quiet Eye (1977).
o Fact: All statistical methods rely on some set of assumptions, some more difficult to meet than others. Often those who try to prove a preconceived point using statistics don't understand the foundation of the techniques, when they apply, and how to appropriately interpret the results. Or worse, they knowingly misapply techniques to achieve a particular result.
o Quote: "You can prove anything with statistics." Anonymous
o Fact: Data don't lie. However, it isn't always easy to determine how best to analyze and interpret data.
o Well-intentioned and careful data analysts working with the same data, but with different sets of assumptions, may obtain different results.
o Key questions to ask before analyzing a data set are:
  - What is the research question that you are trying to answer?
  - What data is needed to answer that question?
  - What assumptions are reasonable to make concerning that data?
o Quote: "There are three kinds of lies: lies, damned lies, and statistics." Benjamin Disraeli
o Most of the time, if you can answer the above questions and apply appropriate techniques based on the answers, then you will get a correct result, and the use of other, also appropriate, statistical techniques will give the same result.
o Analyzing data and drawing defensible inferences is more than mathematics. It is both an art and a science and requires more than merely learning a set of tools; it requires careful thought and analysis.
Populations and Samples
o The population is the entity that you want to understand. It could be a group of people, a group of animals living in a contaminated area of land, a body of water, the top 6 inches of soil in a particular area, or a body of air.
o The population is a collection of $N$ items of interest, where $N$ could be infinite.
o The sample units are all those items in the population that might be sampled.
o Sample units are the individual items in the population.
o If the population is land, air, or water, the sample units must be defined by size and by characteristics of space and time.
o The sample is the $n$ items that you actually collect and measure.
o If $n = N$, then a complete census of the population was done. This is almost never the case in practice.
o We use the sample to draw conclusions about the population. Since $n < N$ and our measurement techniques are imperfect, we have imperfect information about the population.
o How the sample is drawn from the population is crucial to how well we can make inferences about the population. This is the basis for studying sampling design.

Methods for Censored Data
8/15/2003, Module 10.1

Censored Data Analysis
o Censored data are the result of an analytical chemistry practice of deciding whether or not a measured value can be considered to be different from zero.
o If not, the data is reported as nondetect or as below a calculated limit of detection (LOD).
o Censored data values create statistical difficulties, since there are no numbers to use in statistical calculations.
o Some common censored data practices:
  - Ignore them (biases mean high).
  - Replace them with some constant:
      zero (biases mean low)
      limit of detection (biases mean high)
      half of the limit of detection
  - Replace them with randomly generated data from a uniform(0, LOD) distribution.
  - Assume a distributional type and use theoretical results to give maximum likelihood estimates of the distribution's parameters. Works very well if your assumption is right and poorly if it's wrong. Requires specialized software.
  - Regression on order statistics: Fit a regression line on a normal probability plot of the uncensored data, using censored values as place holders, and estimate the model parameters from the line coefficients.
  - Fill-in methods: Estimate the parameters of the distribution and then set the censored data equal to their expected values. Usually iterative.
  - Robust parametric method:
      Construct a normal probability plot. Transform to normality if necessary.
      Fit a line by linear regression.
      Use the line to calculate expected values for the censored points.
      Replace the censored data with their expected values.
      Use all of the data, including both uncensored and the new values, to calculate the statistics.

o Example (True Mean = 10, True SD = 4, LOD = 5)

  Original   Ignore   Replace   Replace   Replace    Random
  Data                w/ zero   w/ LOD    w/ LOD/2   Uniform
  11.7       11.7     11.7      11.7      11.7       11.7
  9.3        9.3      9.3       9.3       9.3        9.3
  10.4       10.4     10.4      10.4      10.4       10.4
  9.6        9.6      9.6       9.6       9.6        9.6
  <5                  0.0       5.0       2.5        1.9100
  7.3        7.3      7.3       7.3       7.3        7.3
  14.5       14.5     14.5      14.5      14.5       14.5
  10.9       10.9     10.9      10.9      10.9       10.9
  13.4       13.4     13.4      13.4      13.4       13.4
  8.0        8.0      8.0       8.0       8.0        8.0
  <5                  0.0       5.0       2.5        0.5034
  16.4       16.4     16.4      16.4      16.4       16.4
  11.9       11.9     11.9      11.9      11.9       11.9
  9.1        9.1      9.1       9.1       9.1        9.1
  12.0       12.0     12.0      12.0      12.0       12.0
  <5                  0.0       5.0       2.5        2.9824
  15.8       15.8     15.8      15.8      15.8       15.8
  <5                  0.0       5.0       2.5        4.4955
  8.3        8.3      8.3       8.3       8.3        8.3
  23.7       23.7     23.7      23.7      23.7       23.7

  Mean       12.01    9.61      10.61     10.11      10.10
  SD         4.13     6.14      4.66      5.36       5.41
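The regression on order statistics and robust parametric steps described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the procedure used to produce the course tables: it assumes a single detection limit with every nondetect ranked below every detected value, it assumes the plotting position (i - 0.375)/(n + 0.25), which is not stated in the notes, and it skips the optional transformation to normality. The function name `robust_parametric` is hypothetical, and the data are the example values rounded to one decimal, so the estimates should land close to the values in the notes (mean about 10.24, standard deviation about 5.33 from the fitted line) rather than match them exactly.

```python
from statistics import NormalDist, mean, stdev

# Example data from the notes, rounded to one decimal; None marks a "<5" nondetect.
data = [11.7, 9.3, 10.4, 9.6, None, 7.3, 14.5, 10.9, 13.4, 8.0,
        None, 16.4, 11.9, 9.1, 12.0, None, 15.8, None, 8.3, 23.7]


def robust_parametric(values):
    """Fit x = a + b*z to the detected values, then fill in the nondetects."""
    n = len(values)
    detected = sorted(v for v in values if v is not None)
    n_cens = n - len(detected)

    # Normal scores for ranks 1..n; nondetects occupy the lowest ranks.
    # Assumption: plotting position p_i = (i - 0.375) / (n + 0.25).
    z = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    z_cens, z_det = z[:n_cens], z[n_cens:]

    # Ordinary least squares of the detected values on their normal scores.
    z_bar, x_bar = mean(z_det), mean(detected)
    slope = (sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z_det, detected))
             / sum((zi - z_bar) ** 2 for zi in z_det))
    intercept = x_bar - slope * z_bar

    # Expected values for the censored points, read off the fitted line.
    filled = [intercept + slope * zi for zi in z_cens]
    return intercept, slope, filled + detected


intercept, slope, completed = robust_parametric(data)
# Regression on order statistics: intercept estimates the mean, slope the SD.
print(f"ROS estimates: mean = {intercept:.2f}, sd = {slope:.2f}")
# Robust parametric method: ordinary statistics on detected plus filled-in values.
print(f"Robust method: mean = {mean(completed):.2f}, sd = {stdev(completed):.2f}")
```

The intercept and slope of the fitted line are the regression on order statistics estimates. The robust parametric method uses the line only to fill in the censored points, so its summary statistics lean less heavily on the normality assumption.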
[Figure: Normal Probability Plot. Vertical axis: x; horizontal axis: Normal Score Z.]

Regression on Order Statistics

  SUMMARY OUTPUT
  Regression Statistics
  Multiple R          0.9589039
  R Square            0.9194967
  Adjusted R Square   0.9137465
  Standard Error      1.2123278
  Observations        16

  Estimate of Mean = 10.24
  Estimate of Stan Dev = 5.33

  ANOVA
               df   SS         MS         F        Significance F
  Regression   1    235.02     235.02     159.91   4.761E-09
  Residual     14   20.57634   1.469739
  Total        15   255.5963

               Coefficients   Standard Error   t Stat     P-value
  Intercept    10.24          0.333851         30.67893   3E-14
  X Variable   5.33           0.421111         12.64539   5E-09

[Figure: Regression on Order Statistics, Normal Probability Plot with fitted line. Horizontal axis: Normal Score Z.]

Calculations for Regression on Order Statistics and Robust Parametric Techniques

  i    Pi     zi      xi     Fitted
  1    0.03   -1.87   <5     0.29
  2    0.08   -1.40   <5     2.77
  3    0.13   -1.13   <5     4.23
  4    0.18   -0.92   <5     5.35
  5    0.23   -0.74   7.3    6.28
  6    0.28   -0.59   8.0    7.10
  7    0.33   -0.45   8.3    7.86
  8    0.38   -0.31   9.1    8.57
  9    0.43   -0.19   9.3    9.25
  10   0.48   -0.06   9.6    9.91
  11   0.52   0.06    10.4   10.57
  12   0.57   0.19    10.9   11.24
  13   0.62   0.31    11.7   11.92
  14   0.67   0.45    11.9   12.63
  15   0.72   0.59    12.0   13.38
  16   0.77   0.74    13.4   14.20
  17   0.82   0.92    14.5   15.14
  18   0.87   1.13    15.8   16.25
  19   0.92   1.40    16.4   17.72
  20   0.97   1.87    23.7   20.19

Robust Parametric (True Mean = 10, True SD = 4, LOD = 5)

  Original   Robust
  11.7       11.7
  9.3        9.3
  10.4       10.4
  9.6        9.6
  <5         0.2936
  7.3        7.3
  14.5       14.5
  10.9       10.9
  13.4       13.4
  8.0        8.0
  <5         4.2347
  16.4       16.4
  11.9       11.9
  9.1        9.1
  12.0       12.0
  <5         2.7688
  15.8       15.8
  <5         5.3477
  8.3        8.3
  23.7       23.7

  Mean       10.24
  SD         5.23

Other Solutions
o Use the median and other percentiles instead of the mean and standard deviation.
o The median can be calculated if less than 50% of the data values are censored.
o Other useful statistics are the minimum, maximum, and other percentiles.
o Five number summary
  - Minimum
  - 25th percentile
  - Median (50th percentile)
  - 75th percentile
  - Maximum
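A five number summary can still be computed when some values are censored, as long as the percentile of interest falls above the censored ranks. The sketch below is a hedged illustration: the helper names `percentile` and `five_number_summary` are hypothetical, the interpolation convention (position p(n - 1) between order statistics) is an assumption because the notes only say that linear interpolation was used, and the data are the example values rounded to one decimal. Quartiles computed this way may therefore differ slightly from the figures reported in the notes.

```python
# Example data from the notes, rounded to one decimal; None marks a "<5" nondetect.
LOD = 5.0
data = [11.7, 9.3, 10.4, 9.6, None, 7.3, 14.5, 10.9, 13.4, 8.0,
        None, 16.4, 11.9, 9.1, 12.0, None, 15.8, None, 8.3, 23.7]


def percentile(sorted_vals, p):
    """Percentile by linear interpolation between order statistics.

    Assumed convention: fractional position p * (n - 1), counting from 0.
    """
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def five_number_summary(values, lod):
    """Return min, quartiles, and max; censored results are reported as '<LOD'."""
    n_cens = sum(v is None for v in values)
    # Nondetects rank below every detected value; 0.0 is only a placeholder.
    ordered = sorted(0.0 if v is None else v for v in values)
    summary = {}
    for name, p in [("min", 0.0), ("q1", 0.25), ("median", 0.5),
                    ("q3", 0.75), ("max", 1.0)]:
        pos = p * (len(ordered) - 1)
        if int(pos) < n_cens:
            # The percentile touches a censored rank, so only "<LOD" is known.
            summary[name] = f"<{lod:g}"
        else:
            summary[name] = round(percentile(ordered, p), 2)
    return summary


for name, value in five_number_summary(data, LOD).items():
    print(f"{name:>6}: {value}")
```

Because the placeholder value is never used in an interpolation, its exact size does not matter. That is why percentiles above the censored portion are unaffected by how the nondetects are coded.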
o Five number summary example
  - Minimum: <5
  - 25th percentile: 7.84
  - Median (50th percentile): 10.00
  - 75th percentile: 12.38
  - Maximum: 23.7
o Linear interpolation was used to estimate the percentiles.

Conclusions
o Censored data pose some special statistical problems.
o If a very simple method must be used, replacement with half of the detection limit is probably the best of a bad lot.
o A better method is the robust parametric method:
  - Create a normal probability plot.
  - Fit a regression line.
  - Use the regression coefficients to estimate values for the censored data.
  - Use all of the data, including the estimated values, to calculate the statistics.
o Other solutions include:
  - Don't censor the values in the first place.
  - Use the median and percentiles instead of mean and standard deviation.

Detection of Changes
10/7/2005, Module 5.2
o Once the monitoring network is in place, the data can be analyzed for
  - long term means
  - long term variances
  - short term means over set periods of time (8 hour or 24 hour, for example)
  - distribution shape
o Once the data has been collected for a reasonable period of time, the data can also be analyzed for trends and for abrupt changes in the mean.
o There are a number of different techniques to do this:
  - Expected trends suggest regression analysis.
  - Expected changes in mean suggest paired t tests, two-sample t tests, or ANOVA.
  - A need to detect a change or a trend quickly suggests control and/or CUSUM charts.
  - A need to detect a change in distribution shape or location suggests chi-squared tests or more sophisticated tests for distribution type.
o ANOVA
  - Use when you are looking for a difference in means.
  - Advantages: Well known technique.
  - Disadvantages: You must know or estimate at what time the change occurred. That works if some known event happened, but usually that's not the case.
o Regression Analysis
  - Use when you are looking for trends in time.
  - Advantages: Well known technique.
  - Disadvantages: You have to have quite a bit of data to detect a trend, and often we have a need to detect these trends very quickly as they begin.
o Control Charts
  - Use when you have data in time and a need to detect a trend or change quickly in real time.
  - Advantages: Simple technique.
  - Disadvantages: You have to have a baseline of data to start with. Not widely used in environmental applications.
  - Changes in the mean are detected in the sample mean chart.
  - Changes in variability are detected using a range, variance, or standard deviation chart.
  - The idea is that you use the baseline data to lay out the charts, and then you plot new sample statistics on the charts and look for excursions beyond the bounds. Excursions indicate possible shifts or trends.

Control Chart Example
o Procedure laid out in Manly; there are other ways to construct these charts, but for simplicity we'll look at this way.
o A baseline set of data of $M$ samples, each of size $n$, is taken over a year.
o For each sample, calculate the sample mean and sample range.
o The overall mean is the mean of the sample means (if they are all of size $n$) or just the mean of all of the data.
o Notation
  - $\bar{x}_i$ = the sample mean of the $i$th sample
  - $R_i$ = the sample range of the $i$th sample
  - $\hat{\mu}$ = the overall mean of the means
  - $\bar{R}$ = the overall mean of the ranges
  - $\hat{\sigma}$ = the overall estimate of the standard deviation
o Mean control chart warning limits are set at $\hat{\mu} \pm 1.96\,\hat{\sigma}/\sqrt{n}$.
o Mean control chart action limits are set at $\hat{\mu} \pm 3.09\,\hat{\sigma}/\sqrt{n}$.
o Range control chart warning and action limits are set by multiplying $\bar{R}$ by the multipliers from Table 5.5 in Manly, page 144.
o When the warning limits are violated, it's an indication of a possible shift or trend. Since this will occur 5% of the time when a real change has not occurred, the only action generally taken is increased attention. Sometimes sampling frequency is increased.
o When the action limits are violated, search for a cause or take other appropriate action. This will occur only 0.2% of the time if a real change has not happened.

Baseline data

  Sample   Data                    Mean     Range
  1        108.5   103.6   111.2   107.77   7.6
  2        116.4   116.0   118.7   117.03   2.7
  3        99.1    108.8   115.5   107.80   16.4
  4        104.6   106.5   101.5   104.20   5.0
  5        100.8   105.1   106.1   104.00   5.3
  6        99.4    107.2   108.0   104.87   8.6
  7        110.7   108.2   108.4   109.10   2.5
  8        108.1   116.7   109.6   111.47   8.6
  9        109.1   107.4   119.9   112.13   12.5
  10       114.3   121.9   106.7   114.30   15.2

  Overall Mean = 109.27
  Mean Range = 8.44
  Stan Dev = 0.591 x 8.44 = 4.99
  Lower Warning Limit = 109.27 - 1.96 x 4.99/sqrt(3) = 103.62
  Upper Warning Limit = 109.27 + 1.96 x 4.99/sqrt(3) = 114.91
  Lower Action Limit = 109.27 - 3.09 x 4.99/sqrt(3) = 100.37
  Upper Action Limit = 109.27 + 3.09 x 4.99/sqrt(3) = 118.17

New sample data

  Sample   Data                    Mean     Range
  11       110.7   108.2   122.1   113.67   13.9
  12       105.9   115.8   110.0   110.57   9.9
  13       110.0   103.8   108.0   107.27   6.2
  14       108.6   111.8   105.9   108.77   5.9
  15       110.1   107.3   107.3   108.23   2.8
  16       108.8   112.1   119.2   113.37   10.4
  17       115.7   118.7   100.3   111.57   18.4
  18       104.5   109.7   104.2   106.13   5.5
  19       111.2   104.3   112.5   109.33   8.2

[Figure: Mean control chart. Vertical axis: Contaminant Concentration; horizontal axis: Time.]
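The mean chart calculations in the example above can be reproduced with a short script. This is a sketch under stated assumptions, not Manly's code: the function names `mean_chart_limits` and `classify` are hypothetical, the factor 0.591 is taken from the worked example and should only be relied on for samples of size 3, and the range chart is left out because its multipliers come from a table that is not reproduced in these notes.

```python
from math import sqrt
from statistics import mean

# Baseline samples (n = 3 each) and new samples, as tabulated in the notes.
baseline = [
    (108.5, 103.6, 111.2), (116.4, 116.0, 118.7), (99.1, 108.8, 115.5),
    (104.6, 106.5, 101.5), (100.8, 105.1, 106.1), (99.4, 107.2, 108.0),
    (110.7, 108.2, 108.4), (108.1, 116.7, 109.6), (109.1, 107.4, 119.9),
    (114.3, 121.9, 106.7),
]
new_samples = [
    (110.7, 108.2, 122.1), (105.9, 115.8, 110.0), (110.0, 103.8, 108.0),
    (108.6, 111.8, 105.9), (110.1, 107.3, 107.3), (108.8, 112.1, 119.2),
    (115.7, 118.7, 100.3), (104.5, 109.7, 104.2), (111.2, 104.3, 112.5),
]

SD_FACTOR = 0.591  # multiplier used in the notes' example for samples of size 3


def mean_chart_limits(samples, sd_factor=SD_FACTOR):
    """Warning and action limits for the sample-mean chart from baseline data."""
    n = len(samples[0])
    overall_mean = mean(mean(s) for s in samples)
    mean_range = mean(max(s) - min(s) for s in samples)
    sigma_hat = sd_factor * mean_range            # SD estimated from the mean range
    half_warn = 1.96 * sigma_hat / sqrt(n)
    half_act = 3.09 * sigma_hat / sqrt(n)
    return {
        "warning": (overall_mean - half_warn, overall_mean + half_warn),
        "action": (overall_mean - half_act, overall_mean + half_act),
    }


def classify(sample, limits):
    """Label one new sample by where its mean falls relative to the limits."""
    m = mean(sample)
    if not limits["action"][0] <= m <= limits["action"][1]:
        return "action"
    if not limits["warning"][0] <= m <= limits["warning"][1]:
        return "warning"
    return "in control"


limits = mean_chart_limits(baseline)
print("warning limits:", tuple(round(x, 2) for x in limits["warning"]))
print("action limits: ", tuple(round(x, 2) for x in limits["action"]))
for i, s in enumerate(new_samples, start=11):
    print(i, round(mean(s), 2), classify(s, limits))
```

With these data every new sample mean stays inside the warning limits, so the mean chart gives no signal. Small differences in the last digit relative to the tabulated limits come from the notes rounding the overall mean and standard deviation before computing them.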
Changes 4 p CUSUM Charts L r 39 Use when it is very important to detect 39 small changes in the mean quickly Advantages f 39 Very sensitive to trends or changes Q Disadvantages Not a particularly simple technique to construct or for managers to understand Q Not widely known or used in environmental 1 applications iquot 9 1072005 Module 52 18 39 i Detection of Changes Q MD ChiSquared Tests r Use when you have a need for a screening a K test for changes in the mean or in distribution shape Advantages Easy to use 0 Doesn t require assumptions on data 7 distribution type normal for example Q Disadvantages 7 Not widely used in environmental applications 1072005 Module 52 19 n X 51 Designs 0 Monitoring Designs for Monitoring o Monitoring networks are used in a variety of environmental applications Air quality monitors constantly track the condition of the ambient air and determine compliance with the Clean Air Act All public drinking water wells are routinely monitored for contaminants to ensure a safe drinking water supply The INEEL has a network of monitoring wells both onsite and offsite monitored by multiple entities testing for aquifer contamination Al ilfil i l 4122002 Module 2 amp Designs for Monitoring o The design of monitoring networks is complex and involves balancing various goals o Some issues involved in designing monitoring networks Should the monitors be placed in random 9 locations throughout the area or should be be located where contamination is expected to be worst 9 4122002 Module 3 agapvl yi amp amp Q DeSIgns for Monitoring o Some issues involved in designing monitoring networks How best can we achieve a good spatial coverage ofthe area and at the same time minimize the number of stations since each station is expensive How best can we balance the need to do the best science possible with the desire for simplicity I amp 4122002 Module 4 5949 h r Designs for Monitoring Q o Some issues involved in designing monitoring r 0 networks Should the monitors 
be placed permanently in Q N one location should they move around or be a combination If they move around with what frequency should they move 7 O39 Once a year Once a month Once a week Every day 4122002 Module 6 Designs for Monitoring 6 o Some issues involved in designing monitoring networks How do you gain the public s trust in the results of monitoring especially ifthey are inherently distrustful ofthe government a it s 7 r39 9 4122002 Module a at 9 95quot l h A 39 Designs for Monitoring da X 9 4122002 o Some issues involved in analyzing the ta from monitoring networks What is the best metric to use a time average 8 hour average for example or a maximum value achieved during a time interval Monitoring technology is continually being improved How best can we analyze a data set that has changing characteristics Module 7 amp 39 Designs for Monitoring amp 080 v of b L 39 y E 9 4122002 me issues involved in analyzing the data from monitoring networks How best to handle missing data values or those that are below our ability to measure censored Module Q 9 Q r l Q 9 Q f iQ a Designs for Monitoring o Designs that use optimization There are many ways to use statistical methodsto create a design that optimizes some characteristic of the design or data Unfortunately these create their own set of tradeoffs error of the mean may place all ofthe stations at For example a design that minimizes the standard the border ofthe area which violates our desire for good spatial coverage 4122002 Module amp 39 9i 6i amp AWVXA r 9 6 Designs for Monitoring o Designs that use optimization It isn t necessarily clear what to optimize Optimizing one thing may lower your ability to estimate something else Typically these sort of calculations require a lot of information about the situation Usually we don t have this information until afterthe network is designed and functioning 4122002 Module h 39 Designs for Monitoring a X X is Q 0 Q Q a 9 4122002 o So in a given situation judgements must be 
made about the various desirable characteristics ofthe network spatial coverage simplicity capture maximums Module randomization minimize cost estimate both long and short term trends public acceptability that their risk is monitored straightforward data analysis