STAT METH CLIMATE RESRCH (ATMO 632)
These 8 pages of class notes were uploaded by Demarcus Schaden V on Wednesday, October 21, 2015. The notes belong to ATMO 632 at Texas A&M University, taught by Staff in Fall.
Notes on Climate Signal Detection

Gerald R. North

December 6-7, 2004

1 Introduction

One often wishes to see if there is a deterministic signal buried in the natural variability of a climate data stream. How can we process the data stream in such a way as to estimate the strength of the signal embedded in the data, and assess the confidence interval of that estimate? We start with a simple example in which there is only one signal in the stream that we wish to detect. We accomplish our goal by building a statistical model of the process.

2 Detection of a Single Signal

Our statistical model can be expressed as follows:

    T_{\mathrm{data}}(x) = \alpha S(x) + N(x).   (1)

The variable $x$ refers to a point in space-time, say $(x, y, t)$; the term $T_{\mathrm{data}}(x)$ refers to the data evaluated at the point $x$. The function $S(x)$ is the signal we are trying to detect; in our case it is given and not random. The function $N(x)$ is the natural variability in the background field. $N(x)$ is a random field assumed to have ensemble average zero. It can be thought of as the "noise" in the problem, but note that this noise might be quite complicated, since it represents a realization of the climatic field in the absence of any signal. We have one realization of the data field $T_{\mathrm{data}}(x)$, and we want to assess the strength of the signal embedded in the field, $\alpha$. If the strength is statistically different from zero, we say we have detected the signal.

The first step in our procedure is to develop the natural variability into its EOFs:

    \int \langle N(x) N(x') \rangle\, \psi_n(x')\, dx' = \lambda_n \psi_n(x),   (2)

where the angular brackets denote ensemble average. The $\psi_n(x)$ are the EOFs, and they form a complete set, such that any of the fields can be expanded into them:

    N(x) = \sum_{n=1}^{\infty} N_n \psi_n(x),   (3)

    T_{\mathrm{data}}(x) = \sum_{n=1}^{\infty} T_n \psi_n(x),   (4)

    S(x) = \sum_{n=1}^{\infty} S_n \psi_n(x).   (5)

We can now expand each side of (1) in the EOFs and project out the components:

    T_n = \alpha S_n + N_n.   (6)

In this expression $N_n$ is a random variable, along with $T_n$, while $S_n$ is deterministic. Our task is to find the value of $\alpha$ which best makes the model fit the data. We also know that the $N_n$ satisfy

    \langle N_n N_m \rangle = \lambda_n \delta_{nm},   (7)

where $\lambda_n$ is the variance associated
with EOF $n$.

Notice that (6) is almost in the form of a standard least squares problem, except that the "error term" is not white noise. We can correct this by multiplying through by the prewhitening matrix

    W_{nm} = \frac{\delta_{nm}}{\sqrt{\lambda_n}}.   (8)

We then have

    \tilde{T}_n = \alpha \tilde{S}_n + \tilde{N}_n,   (9)

where

    \tilde{T}_n = \sum_{m=1}^{\infty} W_{nm} T_m, \qquad \tilde{S}_n = \sum_{m=1}^{\infty} W_{nm} S_m, \qquad \tilde{N}_n = \sum_{m=1}^{\infty} W_{nm} N_m.   (10)

Now (9) is in the form of a standard least squares problem, since the error term $\tilde{N}_n$ is white, i.e., its variance is independent of $n$. We proceed through the least squares regression step by forming the total mean squared error, $E^2 = \sum_{n=1}^{\infty} (\tilde{T}_n - \alpha \tilde{S}_n)^2$, and finding its minimum with respect to $\alpha$:

    \frac{dE^2}{d\alpha} = -2 \sum_{n=1}^{\infty} \tilde{S}_n (\tilde{T}_n - \hat{\alpha} \tilde{S}_n) = 0,   (11)

where we have put a hat on $\alpha$ to indicate that it is the value it attains when the derivative on the left-hand side is set to zero. Solving for $\hat{\alpha}$ yields

    \hat{\alpha} = \frac{\sum_{n=1}^{\infty} \tilde{S}_n \tilde{T}_n}{\sum_{n=1}^{\infty} \tilde{S}_n^2}   (12)

    = \frac{\sum_{n,m,m'} W_{nm} S_m\, W_{nm'} T_{m'}}{\sum_{n,m,m'} W_{nm} S_m\, W_{nm'} S_{m'}}   (13)

    = \frac{\sum_{n=1}^{\infty} S_n T_n / \lambda_n}{\sum_{n=1}^{\infty} S_n^2 / \lambda_n}.   (14)

Note that $\hat{\alpha}$ is itself a random variable from one realization of the data stream to another. This can be seen from the last equation, since $T_n$ is a random variable. We can determine whether the estimate of $\alpha$ is biased by inserting (6) into the last equation and taking the expectation value:

    \langle \hat{\alpha} \rangle = \alpha,   (15)

which tells us that this estimator of $\alpha$ is unbiased. This is not unexpected, since we know that the standard estimators in regression analysis are unbiased. What is somewhat surprising is that if we take a single term and use it to estimate $\alpha$, we would use the estimator

    \hat{\alpha}_m = \frac{T_m}{S_m}.   (16)

By the same procedure as before we find that this is an unbiased estimator. After a bit of work we can establish

    \mathrm{var}(\hat{\alpha}_m) = \frac{\lambda_m}{S_m^2}.   (17)

This last is reasonable, since if the component of $S$ along $\psi_m(x)$ is small, the error variance in $\hat{\alpha}_m$ will be large. Likewise, if $\lambda_m$ is large for that component, the error variance will be large.

2.1 Estimators with a Finite Number of Terms

We now have an unbiased estimator for $\alpha$ in (16). We could improve our estimate from a single EOF component by adding up a finite number of them, say in a set ${\cal M}$ consisting of $M$ terms, and dividing by the number in the sum:

    \hat{\alpha}_{\cal M} = \frac{1}{M} \sum_{m \in {\cal M}} \frac{T_m}{S_m}.   (18)
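The estimators above can be made concrete with a small numerical sketch in NumPy. This is not a computation from the notes: the noise spectrum $\lambda_n = 1/n$, the number of modes $M = 10$, the random signal components, and $\alpha = 0.5$ are all arbitrary illustrative choices. The sketch draws one realization of $T_n = \alpha S_n + N_n$ and applies both the plain average (18) and the inverse-variance-weighted regression estimator (14); the expression $1/\sum_n S_n^2/\lambda_n$ for the optimal estimator's error variance follows from (14) together with (7).

```python
import numpy as np

rng = np.random.default_rng(0)

M = 10                              # number of EOF modes retained (illustrative)
lam = 1.0 / np.arange(1, M + 1)     # noise variances lambda_n (illustrative spectrum)
S = rng.normal(size=M)              # signal components S_n (given, deterministic)
alpha_true = 0.5                    # true signal strength (illustrative)

def estimate_alpha(T, S, lam):
    """Inverse-variance-weighted regression estimator, eq. (14)."""
    return np.sum(S * T / lam) / np.sum(S**2 / lam)

def estimate_alpha_naive(T, S):
    """Plain average of the single-mode estimators T_m / S_m, eq. (18)."""
    return np.mean(T / S)

# One realization of the data in EOF space: T_n = alpha S_n + N_n, N_n ~ N(0, lambda_n).
T = alpha_true * S + rng.normal(scale=np.sqrt(lam))

a_opt = estimate_alpha(T, S, lam)
a_avg = estimate_alpha_naive(T, S)

# Error variance of the optimal estimator, from (14) and (7).
var_opt = 1.0 / np.sum(S**2 / lam)
print(a_opt, a_avg, var_opt)
```

Averaged over many realizations, both estimators are unbiased, but the weighted one attains a smaller spread whenever the $\lambda_n$ differ from mode to mode.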
But an optimal estimator based upon the individual unbiased estimators (16) consists of weighting the individual terms by the inverses of their variances:

    \hat{\alpha}_{\cal M} = \sum_{m \in {\cal M}} \theta_m \hat{\alpha}_m \Big/ \sum_{m \in {\cal M}} \theta_m.   (19)

The $\theta_m$ turn out to be just

    \theta_m = \frac{S_m^2}{\lambda_m};   (20)

when this is substituted into (19), we find that we are back to the original regression estimator (14) in the case where $M \to \infty$. We see that this optimal estimator works for a finite sum, and it is unbiased and has the least error variance for the set ${\cal M}$.

The sum

    I_{\cal M} = \sum_{n \in {\cal M}} \frac{S_n^2}{\lambda_n}   (21)

has an interesting interpretation. Each term is the ratio of the signal squared to the noise variance for an individual mode, and the inverse of this sum is the error variance for the optimal estimator of the coefficient $\alpha$. With each added term $I_{\cal M}$ grows and the error variance shrinks, though the signal-to-noise contribution of successive modes typically diminishes. An application of the Schwarz inequality shows that the sum $I_{\cal M}$ is an upper bound on the squared signal-to-noise ratio attainable from the modes in ${\cal M}$.

2.2 Statistically Independent Estimators

A simpler example might aid in understanding the results obtained above. Consider first a pair of thermometers that are to be used in estimating the temperature of a bath whose actual temperature is $T_0$. Each of the estimators is unbiased:

    \langle \hat{T}_1 \rangle = \langle \hat{T}_2 \rangle = T_0.   (22)

Our model of the temperature estimators is

    \hat{T}_1 = T_0 + \epsilon_1,   (23)

    \hat{T}_2 = T_0 + \epsilon_2,   (24)

with the Gaussian random error terms $\epsilon_i \sim N(0, \sigma_i^2)$ and $\langle \epsilon_i \epsilon_j \rangle = \sigma_i^2 \delta_{ij}$. This last says the error terms are uncorrelated. The estimators $\hat{T}_1$ and $\hat{T}_2$ are statistically independent, but the variances of their errors might be quite different from one another. The question we want to answer is: how can we employ some linear combination of the two instruments in such a way as to minimize the error in the estimate of the temperature of the bath? If we have two measurements, even though one might have a large error, there should be some means of making use of even this poor datum. We set up our problem as

    \hat{T} = \alpha \hat{T}_1 + (1 - \alpha) \hat{T}_2.   (25)

We wish to know the value of $\alpha$ such that the overall error is minimal. Note that the coefficient of $\hat{T}_2$ is such as to preserve the zero bias,
thus leaving only one unknown. Our task is to express the mean squared error in terms of $\alpha$ and then minimize it:

    E^2 = \langle (T_0 - \hat{T})^2 \rangle   (26)

    = \langle (\alpha \epsilon_1 + (1 - \alpha) \epsilon_2)^2 \rangle   (27)

    = \alpha^2 \sigma_1^2 + (1 - \alpha)^2 \sigma_2^2.   (28)

Now we can set the derivative of $E^2$ to zero and find

    \hat{\alpha} = \frac{1/\sigma_1^2}{1/\sigma_1^2 + 1/\sigma_2^2}.   (29)

The optimal estimator is then

    \hat{T} = \left( \frac{\hat{T}_1}{\sigma_1^2} + \frac{\hat{T}_2}{\sigma_2^2} \right) \left( \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2} \right)^{-1}.   (30)

The second factor is a normalization, merely to maintain the zero bias. Note that the individual estimators are weighted by the inverses of their error variances: estimators with large errors will be weighted less than those with smaller errors.

The problem is easily generalized to the case of $M$ thermometers, with the $i$-th instrument having error variance $\sigma_i^2$. In this case we write

    \hat{T} = \sum_{i=1}^{M} a_i \hat{T}_i,   (31)

and the constraint to ensure no bias is

    \sum_{i=1}^{M} a_i = 1.   (32)

We proceed as before with the use of a Lagrange multiplier, etc. The final result is

    \hat{T} = \frac{\sum_{i=1}^{M} \hat{T}_i / \sigma_i^2}{\sum_{i=1}^{M} 1/\sigma_i^2}.   (33)

We have solved the problem of how to linearly combine $M$ independent measurements in such a way as to minimize the error. If the instruments had some covariance between them, it would be possible to find linear combinations of them that would be independent (the PCs). This should shed some light on why it is important to go to independent coordinates in the estimation problem.

3 Several Signals

The problem of more than one simultaneous signal comes up frequently. This model can be formulated as

    T_{\mathrm{data}}(x) = \sum_{s=1}^{n_S} \alpha_s S^s(x) + N(x).   (34)

In this case we have $n_S$ different signals with strengths $\alpha_s$. As before, we expand into EOFs of the noise field:

    T_n = \sum_{s=1}^{n_S} \alpha_s S_n^s + N_n.   (35)

Again, $T_n$ and $N_n$ are random, while $S_n^s$ is a given deterministic shape in space-time. We need to prewhiten the terms in the equation by multiplying through by the matrix $W_{nm}$ of (8). The sum of squares of errors is given by

    E^2 = \sum_n \left( \tilde{T}_n - \sum_s \alpha_s \tilde{S}_n^s \right)^2,   (36)

where the variables with a tilde are defined as in the previous section. We seek to minimize $E^2$ with respect to the $\alpha_s$, $s = 1, \ldots, n_S$, simultaneously:

    \frac{\partial E^2}{\partial \alpha_s} = -2 \sum_n \tilde{S}_n^s \left( \tilde{T}_n - \sum_{s'} \hat{\alpha}_{s'} \tilde{S}_n^{s'} \right) = 0.   (37)

Rearranging,

    \sum_{s'} \left( \sum_n \tilde{S}_n^s \tilde{S}_n^{s'} \right) \hat{\alpha}_{s'} = \sum_n \tilde{S}_n^s \tilde{T}_n, \qquad s = 1, \ldots, n_S.   (38)

These last are a set of linear equations for the $\hat{\alpha}_s$, $s = 1, \ldots, n_S$. There are
several important issues we have to contend with. A major one involves the properties of the symmetric matrix $\mathbf{Q}$:

    Q_{ss'} = \sum_n \tilde{S}_n^s \tilde{S}_n^{s'}.   (39)

The rank of $\mathbf{Q}$ is the number of its nonzero eigenvalues. The rank will be $r_Q \le M$, where $M$ is the number of EOF modes in the sum over $n$. It follows that to determine all $n_S$ of the $\alpha_s$ we need to have $M \ge n_S$. So if there are four signals whose strengths are to be estimated, we need at least four EOF modes in the estimation process.

Besides the rank of $\mathbf{Q}$, we have to worry about collinearity. Here the issue is whether two or more columns (or rows) are nearly linearly dependent on one another. Physically this means that two of the signals have nearly the same shapes in $x$. Collinearity makes discrimination between these two signals very difficult, and this will show up in the next issue to be discussed.

The estimators $\hat{\alpha}_s$ are random variables, since the $\tilde{T}_n$ are random variables from one realization of the data stream to another. This problem is simple in the single-signal case, since $\hat{\alpha}$ is normally distributed about its true value. But in the case $n_S > 1$ there will be correlations between the different estimators $\hat{\alpha}_s$, and this leads to a more involved analysis as to whether some of the $\hat{\alpha}_s$ are significantly different from zero.
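The multi-signal procedure can be sketched numerically as follows. This is again a toy illustration rather than a computation from the notes: the spectrum $\lambda_n = 1/n$, $M = 20$ retained modes, $n_S = 3$ signals, and the Gaussian random signal shapes are all arbitrary assumptions. The sketch prewhitens with (8), forms the normal equations (38) with the matrix $\mathbf{Q}$ of (39), solves for the $\hat{\alpha}_s$, and inspects the eigenvalues of $\mathbf{Q}$, which expose the rank and collinearity issues discussed above.

```python
import numpy as np

rng = np.random.default_rng(1)

M, n_S = 20, 3                          # EOF modes retained and number of signals (M >= n_S)
lam = 1.0 / np.arange(1, M + 1)         # noise variances lambda_n (illustrative spectrum)
S = rng.normal(size=(n_S, M))           # signal shapes S^s_n, one row per signal (illustrative)
alpha_true = np.array([1.0, -0.5, 0.25])

# One realization of the data in EOF space: T_n = sum_s alpha_s S^s_n + N_n.
T = alpha_true @ S + rng.normal(scale=np.sqrt(lam))

# Prewhiten, eq. (8): divide each mode by sqrt(lambda_n).
St = S / np.sqrt(lam)
Tt = T / np.sqrt(lam)

# Normal equations (38): Q alpha_hat = b, with Q from (39).
Q = St @ St.T
b = St @ Tt
alpha_hat = np.linalg.solve(Q, b)

# The eigenvalues of Q diagnose whether the signals can be separated:
# near-zero eigenvalues (collinear signal shapes) make the estimates unstable.
eigvals = np.linalg.eigvalsh(Q)
print(alpha_hat, np.linalg.matrix_rank(Q), eigvals.min())
```

Because the signal shapes here are independent random vectors, $\mathbf{Q}$ is well conditioned; making two rows of `S` nearly parallel would drive the smallest eigenvalue of $\mathbf{Q}$ toward zero and inflate the spread of the corresponding $\hat{\alpha}_s$.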