COMPUTER VISION CAP 5415
University of Central Florida
Popular in Course
Popular in System Engineering
This 134 page Class Notes was uploaded by Khalil Conroy on Thursday October 22, 2015. The Class Notes belongs to CAP 5415 at University of Central Florida taught by Marshall Tappen in Fall. Since its upload, it has received 45 views. For similar materials see /class/227209/cap-5415-university-of-central-florida in System Engineering at University of Central Florida.
Reviews for COMPUTER VISION
Report this Material
What is Karma?
Karma is the currency of StudySoup.
You can buy or earn more Karma at anytime and redeem it for class notes, study guides, flashcards, and more!
Date Created: 10/22/15
CAP 5415 Computer Vision Lecture 3 Periodicity Sampling Practicalities Fall 2009 Announcements PS 1 has been released Foreword We will be doing some derivations The goal is to use the derivations to help you build intuition and understanding about why things behave the way they do If you aren39t understanding something please ask questions Don39t get too stressed out if something is still a bit fuzzy you are welcome to come to office hours What are we doing Essentially we are guring out how to build the image out of these sinusoids of different frequency Each pixel in the transformed image corresponds to the amount of a sinusoid of a particular sinusoid that is needed High Frequencies Low Frequencies What are High Frequencies What if we remove the high frequencies Old Spectrum New Spectrum How will the new image look What are High Frequencies Removing the high frequencies makes the image look blurry Old Spectrum New Spectrum Try building a sharp edge out of lowfrequency sinusoids What are Low Frequencies What if we remove the low frequencies Old Spectrum New Spectrum How will the new image look What are Lcw Frequencies hat if we remove the low frequencies Old Spectrum New Spectrum How will the new image look Working with the DFT Discrete Fourier Transform Is the complex part bothering you yet Let39s look at a different representation Every complex number can also be represented as z X jy ree r magnitude real number 9 Phase Phase and Magpfgcude Fourier transform of a all natural images real function is complex have about the difficult to plot Sarne magnitude visualize transform instead we can think hence Phase seems of the phase and to matter but magnitude of the magn39tUde largely transform doesn t Phase is the phase of the DemonstratIOn complex transform Take two pictures Magnitude is the SW39 the phase magnitude of the tranSfOrmS complex transform Compute the Inverse what does the result look like This is the This is the This is the magnitude transform of the zebra pic This is the phase transform ofthe zebra pic Reconstruction with zebra phase cheetah magnitude 395 xvquot 1 Reconstruction with cheetah phase zebra magnitude The Fourier Transform Helps Us Analyze Convolutions Notation fxy is the signal Fuv is the DFT If h f g lt Convolution Then Huv FuvGuv Convolution in the spatial domain is multiplication in the Fourier domain You39ll derive this in the problem set Back to averaging Remember 19 19 19 19 19 19 Back to averaging Filtering We pixelwise multiply the DFT of the input image by the DFT of the filter Frequencies where the magnitude of the response of the filter are near zero black in the images will be eliminated Take the log to rescale brightness Log Spectrum after 3X3 averaging Unfiltered Spectrum LogSpectrum after 7X7 averaging Back to this example With oset to 1 With Uset to 3 First the filter Magnitude of the DFT After Filtering Vocabulary Low Pass Filter 19 19 19 19 19 19 19 19 19 Vocabulary Bandpass Filter Vocabulary Highpass Filter Now let39s compute some DFT39s Problem 3 Calculate the DFT of Step 1 Write out the DFT Fu 27er 6 27TjN 6 27F N Notice that we can rewrite the last term Gave 13 gt 6 2m 2m What can we say about this term 6 2wj 2wj Remember periodicity So now we have 1 6 27TjTu 6 27Tj Now you can write this in terms of sin and cos Quick Exercise Show that 27Tj 6 27Tj 2 COS 27Tgt DFT of this signal Periodicity and the DFT Assume you39ve computed the DFT and you want to compute a value in the original signal Inverse DFT or Synthesis Equation N 1 k Z X Mg Wm k0 Periodicity and the DFT Inverse DFT or Synthesis Equation N 1 39kn m E X 62779 N 130 What if we try to compute xN1 Using a DFT representation assumes that the signal is periodic Sampling What39s the wrong way to do this What39s the right way Why Let39s talk about the wrong way first You want half as many pixels Take every other pixel Really fast aws Right Way Wrong Way Analyzing what39s going on Fortunately we have almost all of the pieces that we need to analyze this problem First worksheet time Convolution Solution We need to compute the DFT of him f wig in First Step write DFT Equation HM finiginie 2 j Write gn in terms of the Inverse DFT N l 1 N l Hiui Z finie 2WJTN Z GimieZW 710 m0 Do some arithmetic N l N l nu m Hiui Z Gimi Z fink 2 quot m0 n20 Rewrite as a convolution 1 N l N GmFu m m0 Sampling What is sampling Same thing as multiplying the signal image by a periodic impulse train M What does happens in the Fourier Domain The spectrum of the signal is convolved with an impulse train HM LAX What does happens in the Fourier Domain If the signal has components with too high a frequency then the copies will overlap Effectively you are introducing highfrequency content that doesn39t exist 1quotuuricr 39139run3l39urm L39Iugniludc quot Spectrum 39I I 53 quotmlquot quotmay and Sh in 521 m plczd linurim39 39 Signal quotI39m nsl39urm 351 le Emulrum quotLIL I1Lll by mLIIlipliualiun with hm ller Inacc mule1y Inverse Ruccunslruclcd Iquotuun39 CT 5 gnal 39i39e msfm m 39 Magnitude Span um IL Iuricr Ilm5mnl Mugni lLlL lC l Spectrum h l quot Sunlplc Upy and Shift Sampled It39uun39cr I Signal 39I ransl39uzn39m ul IILll by nmltipliuulinn with hm filler J LCClelILly IHVCT SC Remnsulruulcd 39LLrir blgmll 3913913931151 01 111 Magni luer Spccu um 1 In Aliasing This is called aliasing Bad high frequencies accidentally introduced by sampling Two ways to solve Take more samples Spreads out the impulse train in the Fourier Domain Make sure spectra don39t overlap Eliminate highfrequencies by blurring Practical Example of Aliasing httpwwwmichaelbachdeotmotstrobindexht ml In Images Aliasing is often called Jaggies Right Way Wrong Way Practical consequences from my own work Goal Zoom up lowresolution images Getting the blurry image To verify the method is working we generate the lowresolution image from the full resolution image Makes a big difference in the result Result from properlyantialiased image Result from improperlyantialiased image What39s going on The system is enhancing the unwanted aliasing Result from properlyantialiased image Result from improperlyantialiased image Practicalities of working with the DFT get Wait That39s doesn39t look like what I39ve been showing you A Remember periodicity What if you like lowfrequencies in the center Both MATLAB and Numerical Python provide functions called fftshift and ifftshift These shift the image so that the low frequencies are at the center Back to the image sharpening example Task Design a filter to make things look sharper How Remember highfrequencies correspond to the sharp stuff in images If we boost the high frequencies the image should look sharper As a reminder Operations 1 convolution 1 subtraction over the whole image As an equation Zgtkf2Z Zgtkf In terms of DFT This filter boosts the highfrequencies in the image Original Filtered Lecture 11 Meanshift and Normalized Cuts Segmentation CAP 5415 Fall 2009 MeanShift Like EM this algorithm is built on probabilistic intuitions To understand EM we had to understand mixture models To understand meanshift we need to understand kernel density estimation Take Pattern Recognition Basics of Kernel Density Estimation Let s say you have a bunch of points drawn from some distribution What s the distribution that generated these points Using a Parametric Model Could fit a parametric model like a Gaussian Why Can express distribution with a few number of parameters like mean and variance Why not Limited in flexibility NonParametric Methods We ll focus on kerneldensity estimates Basic Idea Use the data to define the distribution Intuition If I were to draw more samples from the same probability distribution then those points would probably be close to the points that l have already drawn Build distribution by putting a little mass of probability around each datapoint Example a 2000 Samplm b 20000 Samples Figure 2 2 Kernel density estimates of the density function shown in Figure Lilla Figure 5 shows the estimate amid with a relatively small number of samples It is uneven and does not approximate the true density well b Nith more samples the estimate of the density improves signi cantly From Tappen Thesis Formally A I i39j l P I fix a E39er Xi Kernel Most Common Kernel Gaussian or Normal Kernel Another way to think about it Make an image put 1or more wherever you have a sample Convolve with a Gaussian 11L 5525 What is MeanShift The density will have peaks also called modes lfwe started at point and did gradientascent we would end up at one of the modes Cluster based on which mode each point belongs to Gradient Ascent Actually no A set of iterative steps can be taken that will monotonically converge to a mode No worries about step sizes This is an adaptive gradient ascent Xixl 2 7i Zlilxisl l ZiilgOthX39H gt Yj1 y J Results 0 Oogo Results a Normalized Cuts Clustering approach based on graphs First some background Graphs A graph GVE is a triple consisting of a vertex set VG an edge set EG and a relation that associates with each edge two vertices called its end points From Slides by Khurram Shafique Connected and Disconnected Graphs A graph G is connected if there is a path from every vertex to every other vertex in G A graph G that is not connected is called disconnected graph 0 From Slides by Khurram Shafique Can represent a graph with a matrix a 0 1 0 0 1 b 1 0 0 0 0 c 0 0 0 0 1 e 0 0 0 0 1 d 1 0 1 1 0 BQSGROW Per Adjacency Matrix W Based on Slides by Khurram Shafique Can add weights to edges 0 1 3 00 oo 1 0 4 00 2 3 4 0 6 7 00 00 6 0 1 oo 2 7 1 0 Weight Matrix W Based on Slides by Khurram Shafique Minimum Cut A cut of a graph G is the set of edges 8 such that removal of S from G disconnects G 1 6 Minimum cut is the cut of minimum K weight where weight of cut ltABgt is given as wlt AB gt 2 WW wEAyEB Based on Slides by Khurram Shafique Minimum Cut There can be more than one minimum cut in a given graph All minimum cuts of a graph can be found in polynomial time1 1H Nagamochi K Nishimura and T Ibaraki Computing all small cuts in an undirected network SIAM J Discrete Math 10 1997 469481 Based on Slides by Khurram Shafique How does this relate to image segmentation When we compute the out we39ve divided the graph into two clusters To get a good segmentation the weight on the edges should represent pixels affinity for being in the same group 01 Z A e Images from Khurram Sha que Affinities for Image Segmentation HFmrFUJHE 4 Bn39ghtness Features 21in i if ilXU e XltJgtil2 lt 0 otherwise Interpretation High weight edges for pixels that Have similar intensity Are close to each other MinCut won39t work though The minimumcut will often choose a cut with one small cluster I 112 Q 0 Min cut 2 Q 0 I 7 u I 0 0 I I i O m Mincut I better cut I Fig l A case where minimum cut gives a bad parlilieh Image From Shi and Malik We need a better criterion Instead of mincut we can use the normalized cut rIit1B F1111B N 39 f ALB 7 u L stmixl V tissm39tB if assocA V riieA ieV wu t Basic Idea Big clusters will increase assocA thus decreasing NcutAB What is AssocAV rut B 217 11 B assorlzl V 035001 a V A teltl B assocAV is defined as assocA V cutA B assocA A assocAA is the sum of all weights in A assocAV is the sum of all weights associated with Aincluding the cut Why this is a better criterion 39039 C O C Consider this graph where all edges have weight 1 The Minimum Cut The minimum cut would have value 2 Only wants to delete two edges The Normalized Cut If we used the normalized cut criterion the cost of this cut becomes 22 2712857 The Normalized Cut Consider this cut instead The weight of this cut will be 353512 This is the better cut under this criterion Better balanced and better behaved Finding the Normalize Cut Can rewrite as T NcutA B with y e 1 b yTDl 0 y y o This is NPComplete Finding the Normalized Cut NPHard Problem in the binary case Can nd approximate continuous solution by nding the eigenvector with the secondsmallest eigenvalue of this generalized eigenvalue problem 1 1 1 D 2D WD 2zlez WI IGFGZZDZy That splits the data into two clusters Can recursively partition data to nd more clusters Code available on Jianbo Shi39s webpage Results Figure from Normalized cuts and image segmentation Shi and Malik 2000 So what if I want to segment my image Ncuts is a very common solution Meanshift is also very popular Results Figure from Normalized cuts and image segmentation Shi and Malik 2000 So what if I want to segment my image Ncuts is a very common solution Meanshift is also very popular Using Image Segmentation Much work is centered on producing good big segmentations From Ren and Malik ICCV 2003 Big is not always better Sometimes we want to oversegment the images From Ren and Malik ICCV 2003 Big is not always better These are called superpixels Nice way to group pixels into larger more coherent blocks From Ren and Malik ICCV 2003 Using SuperPixels Can build up image segments by merging pixels together Oulusion C ucs From Hoiem et al ICCV 2007 SuperPixels for Denoising Imagine breaking the image up into chunks then replacing each chunk with a smooth color From Liu et al PAMI 2007 Removing Noise 0 This provides an effective way for averaging out noise F7 c 10 AWGN d Denoised by the Odiorder model PSNRZ705 From Liu et al PAMI 2007 Comparison 10 AWGN PDE Wavelet GSM Ours lstiorder model Original From Liu et al PAMI 2007 Comparison 39 39 i x 10 AWGN PDE Wavelet GSM Ours lstiorder model Original From Liu et al PAMI 2007 Lecture 7 More on Classification Fall 2009 First Some Review Looking at the confidence in classification gt We can translate this response into a probability gt We will use the logistic function to transform the response into a probability of the correct classification gt We will assume that each point will be labeled with a label l l can take values 1 or 1 1 Pll 1le 1 eXpax by 0 Finding the line parameters gt Now for any set of line constants we can find out the probability assigned to the correct label of each item in the training set N N 1 11 Pllilxil H 1 l exp7liami l by 0 i1 gt We39ve inserted li into the exponent because if li 71 1 Pl 71x l l l 1expambyc gt We multiply the probabilities because we believe the points are drawn independently gt Note that for each item this number tells us how much the model defined by that particular line believes that the ground truth right answer is the actual right answer Finding the line N N 1 gPllilXil i1 1 expkmmi byi 6 gt This is also called the likelihood of the data gt Ideally this likelihood should be as close to 1 as possible gt We can find the parameters of the line by looking for the line parameters that maximize this likelihood gt In other words find the set of line parameters that make the right answers have as high a probability as possible Finding the line N N 1 P l X g l ll ll 1 expiliam by 0 gt Before going on we are going to do a quick math trick gt Because the log natural logarithm or In function is monotonically increasing ie it is always increasing the parameters that maximize the above equation will also maximize N N 10g Pllilxilgt Z 710g1 expiliami by l i1 i1 Finding the line N N 10g Emilm Z 71091 exp7liami byi c i1 i1 V Similarly we can multiply this equation by 71 to change from a maximization problem to a minimization problem V Leading to a function that we will call a loss function or cost function and can be denoted as L N L 2 log 1 expiliami l byi l i1 gt Our goal is to find the line parameters that minimize L Minimizing L gt We can minimize L by using its gradient VL 8L QLJ gt The gradient can be viewed as a vector that points in the direction of steepest ascent gt So the negative gradient points in the direction of steepest descent An algorithm for minimizing L gt This leads to a simple algorithm for optimizing the line parameters gt We39ll create a vector a p b C gt The steps are 1 Initialize p to some value po 2 Repeat these steps 21 Calculate VL using the current value of p The value of the gradient depends on p 22 P H P nVL gt We go in the direction of the negative gadient because that this the direction of steepest descent More on Loss Functions gt Let39s look at the loss function that we are minimizing Lm log1 e7 lacuna gt If the label is 1 then we are encouraging our linear classifier to return a positive value gt Loss grows approximately linearly as it gets more and more negative The Log Loss Lm log1 i e gt This can be thought of as a modification of the zero one oss gt The 0 1 loss says I only care ifl make a mistake The Exponential Loss gt This is an upper bound on the 0 1 loss Boosting Approach to Finding Classification Parameters gt In the logistic regression approach discussed in the previous lecture we gathered all of our features then optimized the weights gt What if we had millions of features We couldn39t load those into memory and optimize using gradient descent gt What if we added features greedily We could then consider tons of features gt This often called boosting Describing This Mathematically gt To begin with we will minimize the exponential loss gt We will also define a new term Nf W503 Za jlid j1 gt is the classifier The predicted label of any sample is signFf gt Each is a feature Each aj is a weight gt In the boosting algorithm we will build up one feature at a time Adding Features gt Let39s assume that we have defined forj features and have chosen gtj1 Nf N562 Z ai jlt gt j1 gt Now we need to choose 1H1 gt We will minimize the exponential loss for N training examples with respect to Lg1 N Laj1 Zexp JARED a1l jllt gtigt i1 Finding the parameter N LlaHl 2 exp lilFlfi aj1 jllfilll il gt We need to find Lg1 gt We could use gradient descent or we could do a Newton step gt In a Newton step we will use a Taylor series to approximate Laj1 with a quadratic function then solve that quadratic function to find 1H1 M x N mm a b V5quot z e b gt This is similar to Newton39s algorithm for minimizing functions Finding the parameter N Mam 2 exp liiF i ajl j1fii i1 N La l Z ili j1i i exp lii i i aj1 j1iii 1 N 1741 2 1 93139 exp JARED aj1 j1fii i1 gt Ifwe do the Taylor expansion around Lg1 0 then N Ham 3 Zexp HAMm aimlt0 37 i1 gt Differentiating this we can solve for Lg1 a 7 MO 31 1mm epozFWD 7 MO 2 gt 15i exp 4139 FWz i A Regression View w N Llaj1l 2 exp lilFlfill aj1L0 a1 2 i1 gt This can also be looked at from a regression point of view N LlaHl 2 exp lilFlfill j15iaj1 102 i1 gt We minimize this with respect to Lg1 gt This term exp can be thought of as a weight for each training example that will be updated at each round Boosting gt This is called the GentleBoost algorithm gt We have to choose gtj1 Usually choose that greedin by searching over a bunch of different features that have been thresholded Lecture 15 Basic MultiView Geometry Stereo lfl needed to find out how far point is away from me I could use triangulation and two views scene point 4 plane optical center Graphic from Khurram Shaffique To see this I have three unknowns xyz I have 4 equa ons P projects to u and v P projects to u and v Today For the rest of the lecture we will talk about the geometry of multiple views To begin we will talk about epipolar geometry Image from Forsyth and Ponce Epipoles The projection of the optical centers of each camera 390 epipole ep39pOIe Image from Forsyth and Ponce Epipoles and epipolar lines For a point P we may not know where it projects to p39 We do know that it lies on an epipolarline Image from Forsyth and Ponce When the cameras are calibrated The vectors a quot005 and O p are coplanar Can be expressed as 019 00 X O p 0 Image from Forsyth and Ponce How you can see this The vector returned by the cross product is perpendicular to the two vectors Can be thought of as a normal to a plane If lines in the plane it should also be perpendicular to that normal I quot quotr 5 0 Illlage TI 0 39 Ul sylll anu VUHCE Relating the two views First remember that we have have points in two coordinate systems Can express a rigid transformation translation and rotation between the two system R is a Rotation Matrix Op 8 73p i t T iS a Translation vector P Image from Forsyth and Ponce Now rewrite constraint in terms of the coordinates of the left camera gt a 079 00 X 019 0 p t X 3731 For simplicity 39 i p thp The essential matrix Starting with p gtlt A cross product can be rewritten as a matrix multiplication leading to the constraint pT5p g is called the essential matrix Properties of the essential matrix pT p 0 3x3 matrix Rank 2 Only relates the extrinsic parameters ofthe two views
Are you sure you want to buy this material for
You're already Subscribed!
Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'