DATA COLCTN & ANLYS
DATA COLCTN & ANLYS ESS 522
Popular in Course
Popular in Earth And Space Sciences
This 12 page Class Notes was uploaded by Miss Jeanette Keebler on Wednesday September 9, 2015. The Class Notes belongs to ESS 522 at University of Washington taught by Staff in Fall. Since its upload, it has received 18 views. For similar materials see /class/192667/ess-522-university-of-washington in Earth And Space Sciences at University of Washington.
Reviews for DATA COLCTN & ANLYS
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 09/09/15
E88 522 Spring 2007 6 Properties and Application of the Discrete Fourier Transform In this lecture we are going brie y review the different kinds of Fourier transforms and then discuss some practical aspects of using the discrete Fourier transform DFT Different Flavors of Transforms We have come across four types of transform which we will reView here 1 Continuous time cgt Continuous frequency Nonperiodic continuous functions in time and frequency are related by the integral transform Gf Igtexpi27r dt W 61 g0 IGfexpi2n df We can write the Fourier transform in terms of its amplitude and phase Gm lel expi9f 62 where Gf is the amplitude spectrum 60 is the phase spectrum and lGfl2 is the energy or power spectrum For many applications we are more interested in the amplitudepower spectrum than the phase 2 Continuous time cgt Discrete frequency Periodic continuous functions in time transform to form Fourier Series that are discrete but not periodic in frequency 1 T 13927 391 G Jgtexp T dt 0 gng 63 3 Discrete time cgt Continuous frequency From the symmetry of the Fourier transform pair we can infer functions that are periodic and continuous in frequency yield discrete but not periodic functions in time Gf GkAtexp i2nfkm gk A j Gfexpi27tkatdf 64 note that the integral is taken over one period in the periodic frequency function E88 522 Spring 2007 4 Discrete time cgt Discrete frequency When both the time and frequency functions are periodic then they are both discrete This yields the discrete Fourier transform DFT N 1 2m39kj Gk 2g exp j0 N 1 AH 27t39k39 65 1 gt Gex g N k pl N j The Fast Fourier Transform FFT From the expression for the discrete Fourier transform shown in equation 65 it is clear that calculating each term for a real time series requires N multiplications of a real number and a complex exponential or 2N multiplications Now all N terms require 2N2 operations which is reduced to N2 when we remember the symmetry properties of the Fourier transform However it turns that the Fourier transform can be computed in only N logN operations when the number of samples is a power of 2 ie N 2 where M is an integer We will not cover the details in class but we can explain it brie y as follows The forward transform of equation 65 can be written in as two series over the odd and even terms to yield M271 2717iij M271 Z ik 2jl Gk Z g2jexp l39 2 g2jlexp FD N FD N NH 27mg 2m39k NH 2mquotij ex ex ex 66 gag p NM p Njgagm p NZ 2m39k Pk exp N Qk We have expressed Gk in terms of a sum of two Fourier transforms Pk and Qk each for a time series of N2 samples The reason why this reduces the number of calculations is that because the periodicity of the Fourier transform we can write k 114 6397 so the calculation of all the terms inP and Q requires only half the number of calculations required for all the terms in G Now if N is a power of 2 we can keep on subdividing the Fourier series until we are le with series of twopoint sequences The book keeping becomes fairly complex to express but the net result is that the number of operations is reduced to N logN an enormous computational saving for long time series Without the FFT Fourier transforms would be much less prevalent in computational data analysis Even and Odd Number of Samples An N sample time series with a sample interval At will have a length or period of NAt The spacing of frequency samples is l T lNAt and the Nyquist frequency or folding frequency is l2At E88 522 Spring 2007 If the number of samples is even then the folding frequency will coincide with a frequency sample For a real time series the frequency sample that coincides with the Nyquist frequency will be real To understand this consider sine and cosine waves that are sampled at exactly twice their frequency starting with a sample at time 0 the sine wave the imaginary component is always sampled where it is zero If the number of samples in the time series is odd the Nyquist frequency falls between two frequency samples Even and Odd Symmetry Any function can be written as the sum of even symmetric about zero and odd antisymmetric parts 1 1 m 5mm ft 5W ft f r f x 68 We know that a real time series transforms to a frequency function whose real parts are even and imaginary parts are odd A real even time series transforms to a real even frequency function ie a sum of cosines The symmetry relationships of a Fourier transform can be summarized m Frequency Real cgt Real Even and Imaginary Odd Imaginary cgt Real Odd and Imaginary Even Real Even cgt Real Even Real Odd cgt Imaginary Odd Imaginary Even cgt Imaginary Even Imaginary Odd cgt Real Odd One implication of these symmetries is that it is possible to obtain the FFT of two real time series by combining them as the real and imaginary parts of a complex time series Spectral Leakage and Windowing Note this section will be covered in the exercise on the Discrete Fourier Transform anal Spectral Leakage that follows the lecture It may be hard to follow without doing the exercise Imagine that we obtain a discrete time series with a finite number of samples from a process that is continuous in time eg a year of temperature data or 10 seconds of data from a permanent seismic installation If we take a discrete time series with a finite number of samples and obtain its DFT we are making the tacit assumption that the time series is periodic even though it is not Any discontinuity between the lSt and last time sample ie a jump in the value or a change in the gradient will contribute to the frequency spectrum in a manner that is generally undesirable This effect is known as spectral leakage One can understand this effect by envisioning the discrete series as the result of truncating a longer waveform by multiplying with a boxcar This is equivalent to convolving the spectrum of 63 E88 522 Spring 2007 the untruncated time series with spectrum of a boxcar ie a sync function The side lobes in the sync function leak frequencies from one frequency to another during the convolution in the frequency domain see Question 1 in the 3rd homework on convolution The solution to this problem is to multiply the finite sampled time series by a windowing function or a taper that decreases smoothly to zero or near zero at its ends There are literally dozens of different choices of tapers see the Matlab window function many of which look very similar and some sophisticated tapering techniques see multitaper spectral analysis method in Chapter 10 of Time Series Analysis and Inverse Theory by David Gubbins Cambridge University Press 2004 In the exercise following this lecture we will consider one taper the Hanning window or raised cosine window the word window is used synonymously with taper which is defined for N points by 1 cos 27tAll12j wi i1 2N 69 Note that the indexing excludes the zero points at either end The effects of applying a taper are 1 To decrease the time series to zero at its start and end so that there is no discontinuity in the periodic time series 2 To change the weighting of samples in the time series so that those near the middle contribute more to the Fourier transform This is potentially a problem if the characteristics of signal are not stationary ie they do not change with time Much of the theory of spectral analysis was developed by engineers to look at stationary signals such as noise on an electrical circuit and in geosciences we often apply them to signals like seismic waveforms which clearly do vary with time There is nothing wrong with doing this but you need to be cognizant of the potential problems 3 To reduce the resolution of the spectrum by averaging adjacent samples We can understand this effect by looking at the amplitude spectrum of the windowing function and remembering that multiplication in the time domain is convolution in the frequency domain Ideally our windowing function would have a very narrow central lobe minimal averaging and loss or resolution but one of the tradeoff s in window design is that the narrower the central lobe the less well the window suppresses the effects of truncating the waveform This is manifested by sidelobes in the frequency domain that leak signal from other frequencies into each spectral sample spectral leakage E88 522 Spring 2007 Padding with time series with zeros Because of the computational savings of the Fast Fourier Transform it is most efficient to compute the Fourier transform of a time series that has 2M samples To accomplish this one can simply pad the end of the time series with zeros a er applying ataper to the original time series Sometimes it is desirable for display purposes to add more zeros than is necessary to increase the number of samples to the nearest power of 2 For example one might pad a 100 point time series with 412 zeros rather than 28 Adding zeros can also introduce a leakage effect even for a tapered time series function since the gradient changes abruptly to zero in the padded region Padding with zeros decreases the spacing of frequency samples but it does NOT increase the number of independent spectral values Instead the values at additional frequencies are interpolated and the resulting spectrum looks smoother Spectra of Long Time Series Although the FFT and modern computers allow Fourier transforms to be computed for long time series one will eventually reach a computational threshold Equally importantly there is often no need for the very fine frequency resolution that would accompany Fourier transform of a long series For instance if one was looking at the spectral amplitude characteristics of a year s worth of seismic noise sampled at 100 Hz something that I have done there would be nearly 232 points and even if an FFT was computationally feasible for such a long series it is hard to imagine why one would want frequency samples spaced at 3 X 10398 Hz One option would be to smooth the output spectrum with a running mean or better still a Gaussian Another practical approach is to divide long time series into shorter windows and take the mean spectral amplitudes or energies amplitudes squared of all the windows A taper will generally be applied to each window to minimize spectral leakage Since the tapering functions tend down weight samples near the end of each window it is common practice to allow an overlap of 50 between adjacent windows The method of using overlapping windows to calculate power spectra is commonly known as Welch s and is applied by the Matlab function pwelch psd in earlier versions E88 522 Spring 2008 10 Filtering Filters are commonly applied to data to eliminate unwanted frequencies and emphasize those of interest A good example is our application of the boxcar and Gaussian smoothing lters to the smooth temperature data in an earlier exercise An ideal lter would have an amplitude spectrum that is zero outside the pass band and unity within it The problem with sharp lters is that their implementation requires long time sequences remember that the uncertainty principle tells us that a sharp feature in one domain is broad in the other and truncating these sequences leads to a Gibb s phenomenon in the lter amplitude response Sharp causal lters also apply large strongly frequencydependent phase shifts to the data All lters involve a tradeoff between the sharpness of the lter and the length of the sequences required to implement them Butterworth Filter The Butterworth lter is a popular lter with a simple amplitude response it the lter that gives the attest response in the pass band for a given lter order The amplitude spectrum of the analog Butterworth lowpass filter is de ned by 1WJ I2 4 10 1 Zn 1 by l f where fC is the cutoff frequency and n is the order of the filter At the cutoff frequency the response is 12 the value at 0 frequency As n increases the sharpness of the cutolT increases Figure l The amplitude spectrum of a high pass filter is given by IGgfl1 IGWf 102 A pass band filter is created by simply shifting the filter along the low pass lter along the frequency axis 2 1 lemxband f 103 PHI 13 where is the center of the pass band and fC is the pass band half width The notch or band stop lter is simply leandsmp 1 leandpm In the exercise you are going to look at digital versions of the Butterworth 104 Filter Speci cations There are a variety of other lters available in Matlab eg Chebyshev type I and II Elliptical Bessel and a myriad of tools to help you design lters We will not consider these other than to describe the terminology used to describe filter characteristics which is shown in Figure 2 for a low pass lter Filter amplitude responses are frequently displayed on a plot with logarithmic scales lOl E88 522 Spring 2008 Amplitude differences are expressed in decibels For the amplitude response G the difference in decibels between two amplitude values is given by G 013 2010g10 105 lel or if one prefers to think in terms of the power spectrum 2 G dB1010g10 1392 106 lel The lter rolloff can be speci ed in decibels per octave where one octave corresponds to a doubling in frequency Zero Phase Filters A zerophase filter is one in which the phase are unchanged by ltering that is the filter phase lag is zero Zerophase lters are acausal that is they lead to arrivals before the lter onset To understand this consider a 5 function at time zero It is composed of equally weighted cosine functions at all frequencies If we remove some of these we will no longer have a delta function but the cosine functions that remain will be symmetric about time zero and the signal before time zero will be acausal In the frequency domain we can implement a zerophase filter simply by performing an FFT multiplying by the amplitude spectrum of the lter and performing an inverse Fourier transform In the time domain we can implement a zero phase filter by ltering twice First we convolve the filter with the data then we reverse the lter and filter again process Phase shifts applied by the first filter operation are the opposite of those from the second lter operation and the net phase shift is zero Mathematically we can write ytg rgrxr 107 Reversing a time series is equivalent to taking the complex conjugate in the frequency domain Thus in the frequency domain we get YfGfGfXfGf2Xf 108 Causal Filters Causal lters are generally applied in the time domain either as FIR or IIR lters N M y 21799 2 10 9 0 1 Remember for an FIR lter the coef cients in the second sum are zero IIR lters can generally achieve the same ltering characteristics with less terms than an FIR filter Since a minimum phase lter has its energy concentrated upfront it can typically be implemented with shorter sequences for a given accuracy and so minimum phase filters are generally preferred even if we are not interested in their phase characteristic However sometimes the maximum phase filters are preferred because they have a more linear phase response and thus while they apply a larger 102 E88 522 Spring 2008 time shift the time shift can be calculated from the slope of the phase response and the resulting waveform is looks more similar to the input waveform Applying Filters in Matlab In the exercise that follows this lecture you will learn how to apply an IIR approximation to the analog Butterworth filter in Matlab Butterworth low pass lters fc01 with orders of 1 2 4 and 8 39 Filter Amplitude 1 I I I v I I I I I 09 08 07 06 05 04 39quot n 03 a2 quot2 01 e z V f x n H o I I 39 I I I 0 005 01 015 02 025 03 035 04 045 7 05 r Frequency Hz Fuss buwwk ripplg 4 D 39 W 95 2 39 0 dz roll 0 f I J E I 0 06M 9 lt5 w 39 quot 5609b 1 I 5 30 I I 39 M Interpolation and Kriging A very common problem is geosciences involves interpolating data In this lecture we will first discuss some simple interpolation techniques available in Matlab before discussing more specialized techniques for interpolating sparse and irregular spatial data I have used Geostatistics Modeling Spatial Uncertainty by JeanPaul Chiles and Pierre Delfinger call QE332 882 C45 1999 as my primary source for this lecture One Dimensional Interpolation in Matlab l Nearestneighbor yi interplxyxi nearest 2 Linearyi interplxyxi linear 3 Cubic splineyi interplxyxi spline oryi splinexyxi Fit a cubic polynomial of the form yabxcx2dx3 through pairs of successive points Match y and dydx for pairs of splines at each point allyabc2 0 at end points 4 Spectral interpolation of evenly sampled data see equation 14 on page 52 of the Dirac Comb lecture function resample will do this Two Dimensional Interpolation in Matlab l Nearest neighbor zi griddatax y z xi yi nearest 2 Triangular based linear zi griddatax y z xi yi linear 3 Triangular based cubic zi griddatax y z xi yi cubic 4 Spectral interpolation of evenly sampled data There is no resampleZ in Matlab but it is possible to apply resample separately in each direction Note that if the data is on a rectangular but not necessarily uniformly spaced grid then interp2 can be used instead of griddata for 13 Weighted Linear Interpolation In the geosciences we often have to work with spatial data that is very unevenly sampled Although this can be interpolated using nearest point linear or cubic methods these are often unsatisfactory splines tend to overshoot with unevenly spaced data A simple linear interpolation scheme can be developed as follows 151 Suppose we wish to interpolate the quantity Z We have measured values Z at n points 139 l 2 n with coordinates x yi and we wanted to estimate the value Z0 at point x0 yo A general linear estimate can be written as Zx0ZL1Zx1 1 11 where the asterisk in Z0 is used to differentiate the estimate from the true value of Z0 and 1 are the weights One strategy for determining the 1 values is to used an inverse distance scheme 1 15 1 n 1 11 C My where h is the distance between x0y0 and xy that is Z Z aJ x a c is a small constant to stop the scheme blowing up when the interpolation point coincides one of the data points and w is an exponent which is typically between 1 and 3 Inverse distance schemes can work well for points that are surrounded by known values in that they produce a reasonable looking interpolation However they are rather ad hoc for example how does one chose the optimal value of w Kriging is a method that provides an objective way of determining the weights 1 based on an understanding of the statistical properties of the data Before describing Kriging we must first discuss the spatial characterization of data Characterizing Spatial Variations We are familiar with making scatter plots of two variables In the last lecture we used one to visualize the covariance and correlation when considering a straightline fit We can also make a scatter plot showing all possible pairs of data values whose location is separated by a certain distance in a particular direction That is we can plot of Zxh against Zx where h is a constant vector Clearly if h is small andor if Z varies only slowly with position the points will fall near a straight line with a slope of unity Figure 1 left If h is large and or Z varies quickly the plotted data will be more scattered Figure 1 right In thinking about methods to characterize the spatial variations in data it is useful to think in terms of a residual value that we can write as YxlZxl mxl 4 Here mx is a model of the large scale spatial trends in the data which might in the simplest case be just the population mean u ie no large scale spatial trends but might also be a more compleX model such as a loworder polynomial The spatial covariance can be defined as Chi Yxl hYxl 5 1 152
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'