Multimedia Information Systems
Samson Cheung, EE 639, Fall 2004

Lecture 1: Applications & Trends

Definition
- A computer hardware/software system used for acquiring and storing, indexing and searching, manipulating (editing and quality enhancement), distributing, and protecting large amounts of visual information: images, video, animations, and the associated metadata.
- Examples: web media search engines, mobile multimedia portals, digital libraries for large organizations.

Why it's important
- New content creation: multimedia authoring software; diagnostic multimedia in scientific applications.
- High-speed access networks.
- New services and devices: multimedia messaging, multimedia-enabled intelligent archives, media portals.
- Standards become available: digital TV, DVD, MPEG-4, MPEG-7, and IPMP.

Multimedia web search
- Examples: Google Image Search, Ditto, AltaVista, 3Dup, Singingfish, AllTheWeb, Kazaa Media Desktop, Lycos Pictures and Sounds, MIDI Explorer, Picsearch.
- Issues: search keywords, display interfaces, more advanced search tools.

Video cluster search
- Cluster near-duplicate web videos using low-level features.
- [slide figure: Berkeley video cluster search results — a search on "telephone" yields 8 videos with the keyword; average cluster size 6.9; most clusters contain a single video]
- Issue: search keywords carry no semantics.

Hermitage Museum
- Mixed-media search tools (keyword and visual); virtual tour using panoramic and 3D views; zoom-view gallery (multiresolution); invisible watermark protection; balanced use of aesthetic, informational, and technical components.
- Issues: acquisition, representation, user interfaces, search tools, copyright protection.

Blob World
- Apply image segmentation to partition images into "blobs"; weigh importance based on color, texture, location, and shape/size.
- Issue: image segmentation is hard for general pictures.

Specific domains
- Geological images: human settlement, forest, oil exploration. Medical imageries: compare simulation vs. experiments. Porn filtering.
- Issues: need domain experts; difficult to get data; but potentially very fruitful.
- Put the human in the loop: relevance feedback.

AnyTime TV (TiVo)
- Time-shifting with local storage (60 hours); skipping mode; instant record, live pause, simultaneous record/playback; highlight-trail viewing; search/retrieval, bookmarking, multi-source comparison/summarization; personal and multi-user profiles; targeted services (ads, consumer usage data); pay-per-choice, eCommerce.

AnyPlace TV: universal multimedia access
- Universal content access on the Internet: smart phones, set-top boxes, mobile devices, the home, broadcast content.
- Technical issues: scalable coding, skimming, bandwidth-restricted adaptive streaming.

Broadcast monitoring
- [slide figure: broadcast-monitoring pipeline; text largely unrecoverable]
- Issue: watermarking is expensive, and the watermark may not survive the production process (legacy content).

Virage news video indexing
- [slide figure: ABC News key frames ("Congress Considers Endorsing E-Signatures"), keyword-indexed search, and linking of news video to transcripts]
- [slide figure: Virage video browsing interface, e.g. the story "Stock Fraud Flourishes on ..."; remaining text unrecoverable]

Columbia's Sports Video Navigator and Streaming
- [slide figure: sports video navigation and streaming interface]

3D Search
- Hot topic: availability of 3D models; high-dimensional, unstructured features.
- Issues: representation, efficiency, imperfection in modeling, image-based rendering (IBR).

HP SpeechBot
- Gets sound clips from radio websites; creates a time-aligned transcript based on speech recognition; indexes audio segments based on words; over 14K hours.
- Issue: requires high speech quality with no music or noise; otherwise, poor recognition results.

Multimedia Information Systems
Samson Cheung, EE 639, Fall 2004

Lecture 20: Dimension Reduction — Feature Selection
(Based on Dr. Gutierrez-Osuna's lecture notes.)

Outline
- Why feature selection
- Search strategy and objectives
- Objective functions: filters vs. wrappers
- Sequential, exponential, and randomized search

Why Feature Selection — Definition
- Given a feature set X = {x_i | i = 1..N}, find a subset Y_M = {x_{i1}, ..., x_{iM}}, with M < N, that optimizes an objective function J(Y) — ideally the probability of correct classification.

Why feature subset selection (rather than the more general feature extraction methods, which simply project a high-dimensional feature vector onto a low-dimensional space)?
- Features may be expensive to obtain: you evaluate a large number of features (sensors) in the test bed and select only a few for the final implementation.
- You may want to extract meaningful rules from your classifier: when you transform or project, the measurement units of your features (length, weight, etc.) are lost.
- Features may not be numeric: a typical situation in the machine learning domain.
- In addition, fewer features mean fewer parameters for pattern recognition: improved generalization capability, reduced complexity and runtime.

Search strategy and objective function
- Feature subset selection (FSS) requires: a search strategy to select candidate subsets, and an objective function to evaluate these candidates.

Search strategy
- Exhaustive evaluation of feature subsets involves C(N, M) combinations for a fixed value of M, and 2^N combinations if M must be optimized as well.
- This number of combinations is unfeasible even for moderate values of M and N, so a search procedure must be used in practice. For example, exhaustive evaluation of 10 out of 20 features involves 184,756 feature subsets; 10 out of 100 involves more than 10^13 subsets (Devijver and Kittler, 1982).
- A search strategy is therefore needed to direct the FSS process as it explores the space of all possible combinations of features.
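A quick arithmetic check of the counts quoted above (a sketch; Python's exact integer arithmetic makes this trivial):

```python
from math import comb

# C(N, M) subsets for a fixed subset size M; 2**N if M is free.
print(comb(20, 10))    # 184756 subsets for "10 out of 20"
print(comb(100, 10))   # 17310309456440, i.e. more than 10**13, for "10 out of 100"
print(sum(comb(100, m) for m in range(101)) == 2**100)  # True (binomial theorem)
```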
Objective function
- The objective function evaluates candidate subsets and returns a measure of their goodness — a feedback signal used by the search strategy to select new candidates.
- Objective functions are divided into two groups:
  - Filters: evaluate feature subsets by their information content — typically interclass distance, statistical dependence, or information-theoretic measures.
  - Wrappers: the objective function is a pattern classifier, which evaluates feature subsets by their predictive accuracy (recognition rate on test data), estimated by statistical resampling or cross-validation.
- [slide figures: filter FSS (search loop driven by an information-content objective, with an ML algorithm trained on the final subset) vs. wrapper FSS (search loop driven by the PR algorithm's own predictive accuracy)]

Filter types
- Distance or separability measures: distance metrics that measure class separability, such as distances between classes (Euclidean, Mahalanobis, etc.) or the determinant of S_W^{-1} S_B (the LDA eigenvalues).
- Correlation and information-theoretic measures: based on the rationale that good feature subsets contain features highly correlated with (predictive of) the class, yet uncorrelated with (not predictive of) each other.
  - Linear relationships can be measured with the correlation coefficient:

      J(Y_M) = ( sum_{i=1..M} rho_ic ) / ( sum_{i=1..M} sum_{j=i+1..M} rho_ij )

    where rho_ic is the correlation between feature i and the class, and rho_ij the correlation between features i and j.
  - Correlation only captures linear dependence; a more powerful measure is the mutual information

      I(Y_M; C) = H(C) - H(C | Y_M) = sum_c INT p(Y_M, c) log[ p(Y_M, c) / (p(Y_M) P(c)) ] dY_M

    which measures the amount by which the uncertainty in the class, H(C), is decreased by knowledge of the feature vector (H is the entropy function). Estimating p(Y_M) and p(Y_M, c) is an ill-posed problem in high-dimensional spaces, so in practice (Battiti, 1994) the mutual information is replaced by a heuristic like

      J(Y_M) = sum_{m=1..M} I(x_im; C) - beta * sum_{m=1..M} sum_{n=m+1..M} I(x_im; x_in)

Filters
- Advantages:
  - Fast execution: filters generally involve a non-iterative computation on the dataset, which executes much faster than a classifier training session.
  - Generality: since filters evaluate intrinsic properties of the data rather than their interactions with a particular classifier, the solution will be "good" for a larger family of classifiers.
- Disadvantage: tendency to select large subsets — since filter objective functions are generally monotonic, the filter tends to select the full feature set as the optimal solution, forcing the user to pick an arbitrary cutoff on the number of features.

Wrappers
- Advantages:
  - Accuracy: wrappers generally achieve better recognition rates than filters, since they are tuned to the specific interactions between the classifier and the dataset.
  - Ability to generalize: wrappers have a mechanism to avoid overfitting, since they typically use cross-validated measures of predictive accuracy.
- Disadvantages:
  - Slow execution: the wrapper must train a classifier for each feature subset (or several, if cross-validation is used), which can become unfeasible for computationally intensive methods.
  - Lack of generality: the solution is tied to the bias of the classifier used in the evaluation function; the "optimal" feature subset is specific to that classifier.
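To make the filter idea concrete, here is a minimal numpy sketch of Battiti's heuristic J(Y_M); the histogram-based mutual-information estimate, the bin count, and beta = 0.5 are illustrative choices, not values from the lecture:

```python
import numpy as np

def mutual_info(a, b, bins=8):
    """Estimate I(A;B) from a 2-D histogram (coarse, for illustration)."""
    pab, _, _ = np.histogram2d(a, b, bins=bins)
    pab /= pab.sum()
    pa = pab.sum(axis=1, keepdims=True)   # marginal of A
    pb = pab.sum(axis=0, keepdims=True)   # marginal of B
    nz = pab > 0                          # avoid log(0) terms
    return float((pab[nz] * np.log(pab[nz] / (pa @ pb)[nz])).sum())

def battiti_score(X, c, subset, beta=0.5):
    """J(Y_M) = sum_m I(x_m; C) - beta * sum_{m<n} I(x_m; x_n)."""
    relevance = sum(mutual_info(X[:, m], c) for m in subset)
    redundancy = sum(mutual_info(X[:, m], X[:, n])
                     for i, m in enumerate(subset) for n in subset[i + 1:])
    return relevance - beta * redundancy
```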
Search strategies
- Exponential algorithms: slow (the search space grows exponentially) but accurate.
- Sequential algorithms: fast (add and remove features sequentially) but may be trapped in local minima.
- Randomized algorithms: sampling and genetic approaches.

Branch and Bound (1)
- The Branch and Bound algorithm (Narendra and Fukunaga, 1977) is guaranteed to find the optimal feature subset under the monotonicity assumption, which states that the addition of features can only increase the value of the objective function:

    J({x1}) <= J({x1, x2}) <= J({x1, x2, x3}) <= ... <= J({x1, ..., xN})

- Branch and Bound starts from the full set and removes features using a depth-first strategy. Nodes whose objective function is lower than the current best are not explored, since the monotonicity assumption ensures that their children cannot contain a better solution.

Branch and Bound (2)
- Algorithm (Fukunaga, 1990): the algorithm is better explained by considering the M' = N - M features already discarded, where N is the dimensionality of the feature space and M the desired number of features.
- Since the order of the features is irrelevant, we only consider increasing orderings i1 < i2 < ... < iM' of the discarded feature indices; this avoids exploring states that differ only in the ordering of their features.
- In the Branch and Bound tree for N = 6 and M = 2 (numbers indicate the features being removed), at the level directly below the root we only consider removing features 1, 2 or 3, since a higher number would not allow sequences i1 < i2 < i3 < i4 with four indices.

Branch and Bound (3) — the steps, reconstructed in outline
1. Initialize the bound alpha = -infinity, level k = 1.
2. Generate the successors of the current node and store them in LIST(k).
3. Select the node i_k in LIST(k) with the largest objective and delete it from LIST(k).
4. Check the bound: if the objective at the new node is below alpha, backtrack (go to 5); if k = M', we have discarded the desired number of features (go to 6); otherwise set k = k + 1 and go to 2.
5. Backtrack to the previous level: k = k - 1; if k = 0, terminate; otherwise go to 3.
6. Last level: set alpha to the objective of the current subset, record Y as the remaining features, and go to 5.

A naive sequential feature selection
- One may be tempted to evaluate each individual feature separately and select the M features with the highest scores. Unfortunately, this strategy will VERY RARELY work, since it does not account for feature dependence.
- An example: a 4-dimensional pattern recognition problem with 5 classes (features shown in pairs of 2D scatter plots); the objective is to select the best subset of 2 features. Any reasonable objective function will rank the features J(x1) > J(x2) ~ J(x3) > J(x4): x1 is without a doubt the best single feature (it clearly separates w1, w2 and w3); x2 and x3 have similar performance, separating the classes into three groups; x4 is the worst feature, since it can only separate w4 from w5, the rest of the classes having a heavy overlap.
- The optimal feature subset turns out to be {x1, x4}, because x4 provides the only information x1 needs: discrimination between classes w4 and w5. Choosing by the individual scores J(x_k) would pick x1 and either x2 or x3, leaving w4 and w5 non-separable. The naive strategy fails because it cannot exploit features with complementary information.

Sequential Forward Selection (SFS)
- The simplest greedy search algorithm: starting from the empty set, sequentially add the feature x that yields the highest objective J(Y_k + x) when combined with the features Y_k that have already been selected.
- Algorithm (see the sketch after this list):
  1. Start with the empty set, Y0 = {}.
  2. Select the next best feature, x* = argmax_{x not in Y_k} J(Y_k + x).
  3. Update Y_{k+1} = Y_k + x*; k = k + 1.
  4. Go to 2.
- Notes: SFS performs best when the optimal subset has a small number of features. Near the empty set a large number of states can be evaluated; toward the full set the region examined is narrower, since most of the features have already been selected — the search space is drawn like an ellipse to emphasize that there are fewer states toward the full or empty sets (for 4 features, the number of states is largest in the middle of the lattice). The main disadvantage of SFS is that it is unable to remove features that become obsolete after the addition of other features.
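A minimal SFS sketch; `J` stands for whichever objective (filter or wrapper) is in use, and is assumed to accept a tuple of feature indices and to depend only on the set of indices:

```python
def sfs(n_features, J, M):
    """Sequential Forward Selection: greedily grow a subset of size M."""
    Y = []
    while len(Y) < M:
        candidates = [x for x in range(n_features) if x not in Y]
        # Add the feature giving the highest objective when combined
        # with the features selected so far.
        best = max(candidates, key=lambda x: J(tuple(Y + [x])))
        Y.append(best)
    return Y
```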
Sequential Backward Selection (SBS)
- Works in the opposite direction of SFS: starting from the full set, sequentially remove the feature x whose removal causes the smallest decrease in the value of the objective function J(Y - x).
- Notice that removal of a feature may actually lead to an increase in the objective, J(Y_k - x) > J(Y_k); such objective functions are said to be non-monotonic (compare the monotonicity assumption in Branch and Bound).
- Algorithm:
  1. Start with the full set, Y0 = X.
  2. Remove the worst feature, x* = argmax_{x in Y_k} J(Y_k - x).
  3. Update Y_{k+1} = Y_k - x*; k = k + 1.
  4. Go to 2.
- Notes: SBS works best when the optimal feature subset has a large number of features, since SBS spends most of its time visiting large subsets. The main limitation of SBS is its inability to re-evaluate the usefulness of a feature after it has been discarded.

Bidirectional Search
- A parallel implementation of SFS and SBS: SFS is performed from the empty set, SBS from the full set.
- To guarantee that SFS and SBS converge to the same solution, we must ensure that features already selected by SFS are not removed by SBS, and features already removed by SBS are not selected by SFS. For example, before SFS attempts to add a new feature, it checks whether it has been removed by SBS; if it has, it attempts to add the second-best feature, and so on. SBS operates in a similar fashion.
- Algorithm:
  1. Start SFS with the empty set Y_F = {}.
  2. Start SBS with the full set Y_B = X.
  3. Select the best feature x* = argmax J(Y_F + x) over x not in Y_F with x in Y_B, and add it: Y_F = Y_F + x*. Remove the worst feature x* = argmax J(Y_B - x) over x in Y_B with x not in Y_F: Y_B = Y_B - x*; k = k + 1.
  4. Go to 3.

Sequential Floating Selection (SFFS / SFBS)
- An extension of the "plus L, take away R" (LRS) algorithms with flexible backtracking: rather than fixing the values of L and R, the floating methods determine them from the data, so the dimensionality of the subset "floats" up and down during the search.
- Two floating methods: Sequential Floating Forward Selection (SFFS) starts from the empty set and, after each forward step, performs backward steps as long as the objective function increases. Sequential Floating Backward Selection (SFBS) starts from the full set and, after each backward step, performs forward steps as long as the objective function increases.
- SFFS algorithm (SFBS is analogous; see the sketch below):
  1. Start with the empty set, Y = {}.
  2. Select the best feature x* = argmax_{x not in Y_k} J(Y_k + x); Y_k = Y_k + x*; k = k + 1.
  3. Select the worst feature x* = argmax_{x in Y_k} J(Y_k - x).
  4. If J(Y_k - x*) > J(Y_k), then Y_{k+1} = Y_k - x*, k = k - 1, and go to 3; otherwise go to 2.
  (Notice that you will need some bookkeeping to avoid infinite loops.)
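A sketch of SFFS on the same `J` interface; the `best_at_size` bookkeeping (only backtrack when the smaller subset strictly improves on the best score seen at that size) is one common way to implement the loop-avoidance the slide mentions:

```python
def sffs(n_features, J, M):
    """Sequential Floating Forward Selection (illustrative sketch)."""
    Y, best_at_size = [], {}
    while len(Y) < M:
        # Forward step: add the best feature, as in SFS.
        x = max((f for f in range(n_features) if f not in Y),
                key=lambda f: J(tuple(Y + [f])))
        Y.append(x)
        best_at_size[len(Y)] = max(best_at_size.get(len(Y), float("-inf")),
                                   J(tuple(Y)))
        # Conditional backward steps: drop a feature while it strictly helps.
        while len(Y) > 2:
            w = max(Y, key=lambda f: J(tuple(set(Y) - {f})))
            score = J(tuple(set(Y) - {w}))
            if score > best_at_size.get(len(Y) - 1, float("-inf")):
                Y.remove(w)
                best_at_size[len(Y)] = score
            else:
                break
    return Y
```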
Simulated Annealing (1)
- A stochastic optimization method that derives its name from the annealing process used to re-crystallize metals. During annealing, the alloy is cooled down slowly to allow its atoms to reach a configuration of minimum energy (a perfectly regular crystal). If the alloy is annealed too fast, such an organization cannot propagate throughout the material; the result is a material with regions of regular structure separated by boundaries, which are potential fault lines where fractures are most likely to occur when the material is stressed.
- The laws of thermodynamics state that, at temperature T, the probability of an increase in energy dE in the system is

    P(dE) = exp(-dE / (k T))

  where k is known as Boltzmann's constant.
- The SA algorithm is a straightforward implementation of these ideas:
  1. Determine an annealing schedule T(i).
  2. Create an initial solution Y(0).
  3. While T(i) > T_min:
     a. Generate a new solution Y(i+1), a neighbor of Y(i).
     b. Compute dE = J(Y(i+1)) - J(Y(i)). If dE < 0, always accept the move from Y(i) to Y(i+1); otherwise accept the move with probability P = exp(-dE / T(i)).

Simulated Annealing (2)
- The idea in one sentence: when optimizing a very large and complex system (i.e., a system with many degrees of freedom), instead of always going downhill, try to go downhill most of the time (Haykin, 1999).
- The formulation can be used for any minimization problem; it only requires:
  - A transform to generate a local neighbor of the current solution (e.g., add a random vector). For feature subset selection, the transform consists of adding or removing features, typically implemented as a random mutation with low probability.
  - An annealing schedule, typically T(i+1) = r * T(i) with r slightly below 1.
  - An initial temperature T(0).
- Selection of the annealing schedule is critical: if r is too large, the temperature decreases very slowly, allowing moves to higher-energy states to occur more frequently (slow convergence); if r is too small, the temperature decreases very fast, and the algorithm is likely to converge to a local minimum.

Simulated Annealing (3)
- A unique feature of simulated annealing is its adaptive nature: at high temperature the algorithm only sees the gross features of the optimization surface, while at low temperatures the finer details of the surface start to appear.
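A sketch of the acceptance rule applied to feature subsets, with the subset encoded as a bitmask and a single feature flip as the neighbor transform; the defaults for T0, r and Tmin are illustrative, and J must tolerate the empty subset:

```python
import math, random

def anneal_fss(n_features, J, T0=1.0, r=0.95, Tmin=1e-3):
    """Simulated annealing over feature bitmasks; energy is -J (we maximize J)."""
    Y = [random.random() < 0.5 for _ in range(n_features)]
    E = -J(tuple(i for i, b in enumerate(Y) if b))
    T = T0
    while T > Tmin:
        Yn = Y[:]
        Yn[random.randrange(n_features)] ^= True           # flip one feature
        En = -J(tuple(i for i, b in enumerate(Yn) if b))
        dE = En - E
        if dE < 0 or random.random() < math.exp(-dE / T):  # P = exp(-dE/T)
            Y, E = Yn, En
        T *= r                                             # T(i+1) = r * T(i)
    return [i for i, b in enumerate(Y) if b]
```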
Genetic Algorithms
- Optimization techniques that mimic the evolutionary process of "survival of the fittest": starting with an initial random population of solutions, evolve new populations by mating (crossover) pairs of solutions and mutating solutions according to their fitness (objective function). The better solutions are more likely to be selected for the mating and mutation operations, and therefore carry their "genetic code" from generation to generation.
- For feature subset selection, an individual solution is simply represented by a binary number: 1 if the given feature is selected, 0 otherwise — the original representation proposed by Holland in 1974.
- Algorithm:
  1. Create an initial random population and evaluate it.
  2. Repeat until convergence (or for a fixed number of generations):
     a. Select the fittest individuals in the population.
     b. Perform crossover on the selected individuals to create offspring.
     c. Perform mutation on the selected individuals.
     d. Create the new population from the old population and the offspring.
     e. Evaluate the new population.

Genetic operators (see the sketch after this section)
- Single-point crossover: select two individuals (parents) according to their fitness; with probability P_C (0.95 is reasonable), select a random crossover point and create two offspring by combining the parents, e.g. 010|010110 and 110|110000 produce the offspring 010|110000 and 110|010110.
- Binary mutation: select an individual according to its fitness; with probability P_M (0.01 is reasonable), mutate each one of its bits.

Selection methods
- The selection of individuals is based on their fitness (the value of the objective function). We describe geometric selection; several other methods are available (roulette wheel, tournament selection, etc.).
- Geometric selection: the probability of selecting the r-th best individual is given by the geometric probability mass function

    P(r) = q * (1 - q)^(r - 1)

  where q is the probability of selecting the best individual (0.05 is a reasonable value). The geometric distribution assigns higher probability to better-ranked individuals, but also allows unfit individuals to be selected.
- In addition, it is typical to carry the best individual of each population over to the next one — the "elitist model". [slide figure: selection probability vs. individual rank]

GA parameter choice
- The crossover rate P_C is important: you will want a value close to 1.0 to produce a large number of offspring; the optimal value of P_C is inversely proportional to the population size (Doak, 1992).
- The mutation rate P_M is very critical: an optimal choice allows the GA to explore the more promising regions while avoiding getting trapped in local minima. A large value (P_M > 0.25) will not allow the search to focus on the better regions, and the GA performs like random search; a small value (close to 0.0) will not allow the search to escape local minima.
- The choice of q, the probability of selecting the best individual, is also critical: an optimal value allows the GA to exploit the most promising solutions while providing sufficient diversity to avoid early convergence.
- In general, poorly selected control parameters result in sub-optimal solutions due to early convergence.

Search strategies summarized

  Strategy    | Accuracy                           | Complexity       | Advantages            | Disadvantages
  Exhaustive  | always finds the optimal solution  | exponential      | guaranteed optimum    | slow
  Sequential  | good if no backtracking is needed  | quadratic, O(N^2)| simple and fast       | cannot backtrack
  Randomized  | good with proper control parameters| generally low    | can escape local minima| must choose good parameters
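Minimal sketches of the three GA ingredients above, on bit-string individuals (the parameter defaults mirror the slide's suggested values):

```python
import random

def crossover(p1, p2, pc=0.95):
    """Single-point crossover of two equal-length bit strings."""
    if random.random() > pc or len(p1) < 2:
        return p1[:], p2[:]
    cut = random.randrange(1, len(p1))          # crossover point
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate(ind, pm=0.01):
    """Flip each bit independently with probability pm."""
    return [b ^ (random.random() < pm) for b in ind]

def geometric_select(ranked, q=0.05):
    """Pick the r-th best individual with P(r) = q * (1 - q)**(r - 1)."""
    r = 0
    while r < len(ranked) - 1 and random.random() > q:
        r += 1
    return ranked[r]
```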
Multimedia Information Systems
Samson Cheung, EE 639, Fall 2004

Lecture 11: Video Boundary Detection

Characteristics
- Video is a rich information source: links between frames (cuts, fades, dissolves); changes in color, shapes, and motion of both camera and objects; video acquisition data (shot angles, camera movements).
- Each type of video has its own characteristics: commercials, news, movies, sport.

Video structure
- Frame: typically 1/25 or 1/30 of a second.
- Shot: a sequence of similar frames — an unbroken sequence of frames from one camera.
- Scene: a collection of consecutive shots focusing on one or more objects of interest, one location, or one thematic concept or event.
- Episode / story: consecutive scenes (e.g., intro, news reporter, weather).
- Temporal segmentation: identify the boundaries between shots and between scenes (video segmentation).

Why shot/scene segmentation?
- Video browsing and visualization; archival and indexing; nonlinear editing; event detection; film understanding.

Production models
- Cut: abrupt shot change that typically occurs in one frame.
- Fade: in or out, between a bright/dark frame and the new shot.
- Dissolve: gradual transition between two shots. Wipes and others.
- Motion: camera motion (pan and zoom) and object motion are not shot changes; neither are flashlights and aperture changes.

Program types
- News: lighting. Sports: many cut changes. Movies: motion, similar locations. Television programs (e.g., music, documentary, sitcom). Commercials: short, fast-changing. Consumer videos: pan, zoom, noise, flash. Meetings and presentations.

Basic detection methods
- Features: pixels/color, edges, motion, compressed-domain data, audio.
- Representations: histograms, DCT coefficients, motion vectors, audio features and audio types.
- Time scale: from tens of milliseconds (~100 ms) to a few seconds.
- Decision: difference thresholding, statistical tests, model-based methods.

Factors
- Accuracy (detection and timing), speed, simplicity, generality.
- Typical problems: camera movement, flashes.

Pairwise pixel comparison
- Count the number of pixels changed from frame i to the next:

    DP_i(k, l) = 1 if |P_i(k, l) - P_{i+1}(k, l)| > t, and 0 otherwise

- A shot boundary is declared if more than a fraction T_b of the pixels have changed:

    (1 / (M * N)) * sum_{k,l} DP_i(k, l) > T_b

- Problem: sensitivity to camera/object motion and noise (many pixels change). There is a trivial extension to block-based comparison.

Histogram comparison
- Compare grey-level histograms of successive frames; e.g., frames 1 and 2 almost identical, a camera break between frames 2 and 3.
- Many techniques exist for computing differences between histograms; histograms are more robust against camera movement than pixel differences. [slide figure: histogram-difference traces around cuts and a dissolve]

Twin-comparison approach (see the sketch after this section)
- Two thresholds: T_b for camera-break detection, and T_s < T_b for special effects like dissolves (and motion).
- Compare consecutive frames (e.g., histograms). If the difference exceeds T_b: a cut. If the difference exceeds T_s: a potential gradual transition — start accumulating differences from that frame until the frame-to-frame difference drops below T_s, and declare a boundary if the accumulated difference exceeds T_b. [slide figure: twin comparison detecting a camera break and a special effect]

Edge-based detection
- Definitions: p_in = number of entering edge pixels; p_out = number of exiting edge pixels.
- Observations: fade-in — p_in much larger than p_out; fade-out — p_out much larger than p_in; dissolve — a predominance of p_in followed by a predominance of p_out (assuming the new image is faded in before the old one is faded out).
- Statistic: edge-change ratio (Lienhart; the paper's comment: not worth it).

Compressed-domain processing
- DCT coefficients: fastest, but coarse. DC terms from I-frames: fast, but better. DCT coefficient correlation: fast, but best. DC images of predicted frames must be estimated using the motion vectors (see Song & Yiu '99; Koprinska & Carrato 2001).
- Macroblock coding mode: many intra-coded blocks indicate a scene change. Motion vectors indicate camera movement. Bit usage: a scene change uses a lot of bits.

Selection of threshold
- Most methods use a fixed global threshold; adaptive thresholds work better.
- Sliding-window method (Yeo & Liu '95): look at a sliding window of histogram differences, and declare a hard cut if the maximum absolute difference is significantly larger than the second-largest absolute difference in the window.
- Shot-duration prior (Vasconcelos & Lippman '00): likelihood testing, with the threshold based on shot-duration modeling (model-based: no fixed threshold). The shot-duration histogram is fit by maximum likelihood with a Weibull distribution, and a statistical model of shot duration is built per genre. [slide figure: shot-duration histogram with the Weibull fit; the likelihood-ratio test itself is unrecoverable]
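A sketch of the twin-comparison logic on grey-level frames (numpy arrays with values 0-255); the bin count and the exact accumulation rule are illustrative choices:

```python
import numpy as np

def hist_diff(f1, f2, bins=64):
    """Normalized sum of absolute differences between grey-level histograms."""
    h1, _ = np.histogram(f1, bins=bins, range=(0, 256))
    h2, _ = np.histogram(f2, bins=bins, range=(0, 256))
    return np.abs(h1 - h2).sum() / f1.size

def twin_comparison(frames, Ts, Tb):
    """Cut when a single difference exceeds Tb; gradual transition when
    differences accumulated after exceeding Ts reach Tb before falling
    back below Ts."""
    boundaries, acc, start = [], 0.0, None
    for i in range(len(frames) - 1):
        d = hist_diff(frames[i], frames[i + 1])
        if d > Tb:
            boundaries.append(("cut", i + 1))
            acc, start = 0.0, None
        elif d > Ts:
            if start is None:
                start, acc = i, 0.0
            acc += d
            if acc > Tb:
                boundaries.append(("gradual", start, i + 1))
                acc, start = 0.0, None
        else:
            acc, start = 0.0, None
    return boundaries
```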
Model-based edit-effect detection
- General approach: (1) model the effect — design a function E(t) that describes the effect; the transition can be modeled as a function E(t) written as a combination of the two images X and Y; (2) design the detector — consider, for example, what remains constant during the effect, which then forms the basis for detecting that a frame is part of a special effect.

General model for fade-in/out and dissolve
- T1: start of the fade-out of X; L1: length of the fade-out of X.
- T2: start of the fade-in of Y; L2: length of the fade-in of Y.

    E(t) = a(t) * X(t) + b(t) * Y(t)

    a(t) = 1 for t < T1;  1 - (t - T1)/L1 for t in [T1, T1 + L1];  0 afterwards
    b(t) = 0 for t < T2;  (t - T2)/L2 for t in [T2, T2 + L2];      1 afterwards

Detector
- Differentiation: take the pixel-wise difference between the derivatives in subsequent frames.
- Observation: for the duration of the fade-in, this should be constant.
- Detection: try different values for L2 and check whether the relation holds for the frame at t + L2.
- Drawbacks: too many types of transition to model; not robust against motion.

Model-based technique (HMM)
- Features: histogram differences, audio cepstral differences, motion vectors.
- Train an HMM using labeled data and detect using the Viterbi algorithm — no threshold needed.
- Use two "cut" states, as a sharp cut may last for two frames in interlaced video.
- See Boreczky & Wilcox, "A Hidden Markov Model Framework for Video Segmentation Using Audio and Image Features".

Scene and story segmentation
- Much harder problems; domain information can help a great deal.
- Graph-based technique: the scene transition graph (more next lecture).

Video scene layout
- Let capital letters with a number indicate that shots are equivalent; e.g., B and C could denote two views of people in an interview, where D shows both of them in one view. [slide figure: scenes 1-3]

Scene transition graph (STG)
- Indicate shot changes by arrows; loosely defined scenes correspond to groups of shots that are highly connected.

Computation of the STG (see the clustering sketch below)
1. Detect shots using any algorithm.
2. Time-constrained clustering: group shots together if they are visually similar (e.g., using the Hausdorff distance) and their time difference is less than T.
3. Create the graph: connect two clusters by an arrow if they contain consecutive shots.
4. Partition the graph based on cut edges — edges whose removal leads to disconnected components.
- See Yeung, Yeo & Liu '96.
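A minimal sketch of step 2, time-constrained clustering; `similar` stands in for any visual-similarity predicate (e.g., a thresholded Hausdorff distance between key-frame feature sets), and single-link grouping is one simple choice among several:

```python
def time_constrained_clusters(shots, similar, T):
    """Group shots that are visually similar and close in time.

    `shots` is a list of (start_time, feature) pairs; T is the time window.
    """
    clusters = []
    for t, feat in shots:
        for c in clusters:
            # Join the first cluster holding a similar member within T seconds.
            if any(abs(t - t2) < T and similar(feat, f2) for t2, f2 in c):
                c.append((t, feat))
                break
        else:
            clusters.append([(t, feat)])
    return clusters
```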
TREC Video Retrieval
- Problem: rapidly growing quantities of digital video and increasing research in content-based retrieval from digital video — but no common basis for evaluating and comparing approaches.
- Approach: find as much video data as possible and make it available to the community of researchers; use the data to build an open, metrics-based evaluation.
- History: TREC (Text REtrieval Conference), sponsored by NIST and DoD, started in 1992 to study text IR and speech recognition; TREC Video Retrieval became a separate activity in 2003.
- Data: 133 hours of 1998 ABC/CNN news and C-SPAN.
- Four tasks: shot-boundary determination; high-level feature extraction; story segmentation and classification; search (manual and interactive).
- Participating groups: 24. [slide figure: precision/recall plot]

Story segmentation and classification
- Segmentation task: identify story boundaries in CNN and ABC news shows; evaluation based on precision and recall, where boundaries have to fall within a +/- 5 second interval around the ground-truth boundaries.
- News classification task: annotate stories as either news or non-news; evaluation based on the percentage of correctly identified news story footage.
- Three input categories: audio-visual features provided by CMU (AV); automatic speech recognizer output from LIMSI (ASR); ASR only.

Feature extraction
- Identify the following high-level features in the video: indoors; aircraft; news subject face (not the anchor person); news subject monologue; people (at least three humans); non-studio setting; building (walled structure with roof); sporting event; road; weather news; vegetation in its natural environment; zoom in; animal; physical violence; female speaking (visible and audible); Madeleine Albright (visible); car/truck/bus (exterior).

Search task
- TRECVID systems, including a human in the loop, are presented with a topic and are to return up to 1,000 shots which meet the need.
- Note: the unit of retrieval is the shot, not the news story.
- Two search modes, manual and interactive; both require a user interface from the participant.

Search types
- Manual: the human formulates the query based on the topic; the system takes the query as input and produces results without further human intervention on that invocation.
- Interactive: the human can inspect intermediate results and reformulate the query.

Topics (25 in total; "total relevant found" in parentheses)
- 100: shots with aerial views containing both one or more buildings and one or more roads (87)
- 101: shots of a basket being made — the basketball passes down through the hoop and net (104)
- 102: shots from behind the pitcher in a baseball game as he throws a ball that the batter swings at (183)
- 103: shots of Yasser Arafat (33)
- 104: shots of an airplane taking off (44)
- 105: shots of a helicopter in flight or on the ground (52)
- 106: shots of the Tomb of the Unknown Soldier at Arlington National Cemetery (31)
- 107: shots of a rocket or missile taking off; simulations are acceptable (62)
- 108: shots of the Mercedes logo (star) (34)
- 1xx: shots of an urban ... (188) [the remaining topics are unrecoverable]
Multimedia Information Systems
Samson Cheung, EE 639, Fall 2004

Lectures 3 & 4: Color, Video, and Fundamentals of Data Compression

Color science
- Light is an electromagnetic wave; its color is characterized by the wavelength content of the light. Visible range: 400 nm - 700 nm.
- [slide figure: spectral power distribution E(lambda) of a daylight signal, 400-700 nm]

Human vision
- Sensors: rods (night vision) and cones (red, green, blue).
- [slide figure: spectral sensitivities of the R, G, B cones]

Image formation: gamma correction (CRT)
- The light emitted from a CRT is roughly proportional to the voltage raised to a power, called gamma (gamma ~ 2.2).
- Gamma correction pre-distorts the signal so the displayed intensity comes out right. [slide figure: gamma-corrected vs. uncorrected images]

Different color coordinates
- Different numerical representations facilitate different applications: RGB (monitor), CMYK (printer), CIE chromaticity (XYZ), Lab and Luv, HSV/HSI/HLS, YIQ/YUV/YCbCr.

CIE 1931 chromaticity diagram
- The visible colors, plotted at normalized intensity.

Luv and Lab (L*a*b*) color spaces
- Nonlinear transformations of the XYZ tristimulus space.
- Designed to have a more uniform correspondence between geometric distances and perceptual distances between colors seen under the same reference illuminant.
- Luv for additive color; Lab for subtractive.

HSI / HLS / HSV
- Intuitive color: hue. HLS = Hue, Lightness, Saturation; HSI = Hue, Saturation, Intensity; HSV = Hue, Saturation, Value.
- Nonlinear transforms of RGB.

YUV / YCbCr / YIQ
- Linear transforms of RGB that achieve energy compaction.
- YUV: analog color space for television. YIQ: analog color space for NTSC composite (rotate UV by 33 degrees to further band-limit Q). YCbCr: discrete, scaled YUV.

Analog video signals
- 3 wires: component (RGB). 2 wires: S-video (one for luminance, one for chrominance). 1 wire: composite (everything frequency-multiplexed together).
- [slide figure: NTSC frequency spectrum]

Analog television (all interlaced signals)

  TV system | Frame rate (fps) | Total scan lines | Channel width (MHz) | Bandwidth (MHz): Y | I or U | Q or V
  NTSC      | 29.97            | 525              | 6.0                 | 4.2                | 1.6    | 0.6
  PAL       | 25               | 625              | 8.0                 | 5.5                | 1.8    | 1.8
  SECAM     | 25               | 625              | 8.0                 | 6.0                | 2.0    | 2.0

Digital video signals
- Chroma subsampling: 4:2:2 keeps chrominance at half the horizontal luminance resolution; 4:2:0 halves it in both directions. [slide figure: grids of pixels with only a Y value vs. pixels with Y and Cb, Cr values]

                          | CCIR 601 525/60 (NTSC) | CCIR 601 625/50 (PAL/SECAM) | CIF       | QCIF
  Luminance resolution    | 720 x 480              | 720 x 576                   | 352 x 288 | 176 x 144
  Chrominance resolution  | 360 x 480              | 360 x 576                   | 176 x 144 | 88 x 72
  Color subsampling       | 4:2:2                  | 4:2:2                       | 4:2:0     | 4:2:0
  Aspect ratio            | 4:3                    | 4:3                         | 4:3       | 4:3
  Fields/sec              | 60                     | 50                          | 30        | 30
  Interlaced              | Yes                    | Yes                         | No        | No

HDTV (advanced digital TV formats supported by ATSC)

  Active lines | Pixels per line | Aspect ratio | Picture rate
  1080         | 1920            | 16:9         | 60I, 30P, 24P
  720          | 1280            | 16:9         | 60P, 30P, 24P
  480          | 704             | 16:9 & 4:3   | 60I, 60P, 30P, 24P
  480          | 640             | 4:3          | 60I, 60P, 30P, 24P

- Larger frame size; non-interlaced, "progressive" (P) formats; 16:9 aspect ratio; standard by 2006, mandated by the FCC.

Fundamentals of compression
- Two types: lossless (files, high-quality media) and lossy (trade off quality against bit rate).
- Information content: average entropy.
- Shannon's source coding theorem: N i.i.d. random variables, each with entropy H(X), can be compressed into N*H(X) bits with negligible loss as N goes to infinity.

Lossless compression
- Memoryless sources: approximate the entropy using variable-length codes (Huffman codes, arithmetic codes).
- Sources with memory: exploit the redundancy (run-length codes, library-based codes, context-based methods, transform coding, linear prediction, others).

Huffman code
- Build a binary Huffman tree: leaf nodes represent symbols, each branch represents a 0 or a 1, and the traversal from root to leaf gives the codeword. [slide figure: symbol/count table and the resulting tree]

Huffman code (cont.)
- Advantages: prefix property (no decoding ambiguity); easy to decode (traverse down the tree).
- Disadvantages: each symbol must use at least one bit (remedy: blocking symbols together); changing statistics (remedy: update the Huffman tree to reflect the statistics).
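A compact, runnable sketch of the Huffman construction just described (the exact codewords depend on how ties between equal counts are broken):

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a Huffman code from a sequence of symbols."""
    counts = Counter(symbols)
    # Heap of (count, tiebreak, tree); a tree is a symbol or a (left, right) pair.
    heap = [(c, i, s) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)   # merge the two least frequent subtrees
        c2, i, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (c1 + c2, i, (t1, t2)))
    code = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):       # internal node: branch on 0 / 1
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            code[tree] = prefix or "0"    # degenerate single-symbol case
        return code
    return walk(heap[0][2], "")

print(huffman_code("aaaabbbccd"))  # e.g. {'a': '0', 'b': '10', 'c': '111', 'd': '110'}
```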
Arithmetic coding
- Represent the entire sequence of symbols as a single real number.
- Each symbol is mapped to an interval whose length equals its probability of occurrence: higher probability, longer interval, lower precision needed, fewer bits.
- High-level steps:
  1. Select the sub-interval that corresponds to the symbol.
  2. Send leading bits as soon as they are known.
  3. Renormalize the selected interval.
  4. Partition it again based on the symbol probabilities.
  5. Go back to 1.
- [slide figure: arithmetic-coding example, successively narrowing intervals (1.0, 0.5, 0.34, 0.334, 0.3322, ...) over a six-symbol alphabet A-F]

Arithmetic coding (cont.)
- Advantages: asymptotically optimal; can use less than one bit per symbol; easier than Huffman to update statistics (does not require rebuilding a tree).
- Disadvantages: higher implementation cost (needs multiplications for the normalization, except in the binary case); very sensitive to transmission errors.

Run-length code
- Commonly used in image and video coding: code (run, level) pairs.
- Example: the sequence 0000 1 00000 5 00 4 0000 8 00 9 0 becomes the pairs (5,1) (6,5) (3,4) (5,8) (3,9) (1,0), where each run length counts up to and including the terminating value.

Library-based codes
- Use fixed-length codewords to represent variable-length strings of symbols/characters that commonly occur together.
- The encoder and decoder build up the same dictionary dynamically while receiving the data.
- The codec places longer and longer repeated entries into a dictionary, and then emits the code for an element rather than the string itself if the element has already been placed in the dictionary.
- Examples: LZ77 (gzip); LZW (UNIX compress, GIF, V.42bis).

Context-based prediction
- Keep one PDF for each context; e.g. (numbers from the slide):

    context 000: P(0) = 0.4, P(1) = 0.6
    context 001: P(0) = 0.9, P(1) = 0.1
    context 010: P(0) = 0.1, P(1) = 0.9

- Even a simple context requires LOTS of tables.
- Two solutions: model the contexts (e.g., context-weighting trees; Willems, Shtarkov & Tjalkens '95) or hand-pick them (like in H.264).

Transform coding
- We need to code X1, ..., XN, and H(X1, ..., XN) <= sum_i H(Xi), with equality if and only if the Xi are independent.
- Goal: devise a transform that turns the X's into independent Y's. Two examples: (1) the Burrows-Wheeler transform; (2) decorrelating ("whitening") transforms.

Burrows-Wheeler transform
- Write down all rotations of the input, sort them, and keep the last column:

    BURROWS         BURROWS
    URROWSB         OWSBURR
    RROWSBU         ROWSBUR
    ROWSBUR  sort   RROWSBU
    OWSBURR  --->   SBURROW
    WSBURRO         URROWSB
    SBURROW         WSBURRO

    last column: S R R U W B O

- Follow with run-length and Huffman coding.
- Used in bzip/bzip2 (best text compression to date, per the slides); Burrows and Wheeler, 1994; a readable account by Mark Nelson, Dr. Dobb's Journal, 1996.
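A two-line version of the transform; the end-of-string sentinel is an implementation convenience (it makes the transform invertible) and slightly changes the output relative to the table above:

```python
def bwt(s, eos="\0"):
    """Burrows-Wheeler transform: sort all rotations, keep the last column."""
    s += eos  # unique end-of-string marker
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(row[-1] for row in rotations)

print(bwt("BURROWS"))  # 'S\x00RRUWBO' -- note the two Rs end up adjacent
```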
Linear prediction
- The prediction filter (e.g., taps [..., -1, 1, ...]) and all of its shifted versions form the transform.
- Many other transforms exist (DCT, wavelet), though these are typically not lossless because of their floating-point representation.

Multimedia Information Systems
Samson Cheung, EE 639, Fall 2004

Lecture 15: Statistical Pattern Recognition

Motivating tasks
- Retrieve all news stories that show Kerry and Bush in the same shot.
- Identify in a surveillance video all groups with more than three individuals.
- Separate all the instruments in a symphony recording.
- Find all the goal shots in a football match video.
- Block out a certain individual in a surveillance tape for privacy reasons.

What is pattern recognition?
- Two main tasks:
  - Supervised classification: the input pattern is classified into a number of predefined classes (e.g., speech recognition).
  - Unsupervised classification: the pattern is assigned to a hitherto unknown class (e.g., image segmentation).

Model for PR
- Testing path: pattern -> preprocessing -> feature measurement -> classification.
- Training path: pattern -> preprocessing -> feature extraction/selection -> learning.
- Not every PR problem has the training part.
- Feature extraction/selection refers to identifying the most important features for the PR task (covered later); the focus this week is classification and learning.

Classification
- Given a measurement x, find the corresponding class label y.
- Modeling: measurements are RANDOM VARIABLES whose DISTRIBUTION depends on the class. The class-conditional distributions p(x | w_i) — discrete or continuous — reflect both the (hidden) variability in the features and the noise.

Random variables review
- Joint PDF p(x, y); marginal p(x) = INT p(x, y) dy; covariance Sigma = E[(x - mu)(x - mu)^T].
- Conditional probability: knowing one value in a joint distribution constrains the remainder.
- Bayes rule: p(x | y) = p(y | x) p(x) / p(y) — inference can run in reverse; either term can be discrete or continuous.

Priors and posterior
- Bayesian inference can be interpreted as updating prior beliefs with new measurements x. Bayes rule:

    P(w_i | x) = p(x | w_i) P(w_i) / p(x),   with evidence p(x) = sum_j p(x | w_j) P(w_j)

- The posterior is the prior scaled by the likelihood and normalized by the evidence (so the posteriors sum to 1).
- Objection: priors are often unknown — but omitting them amounts to assuming they are all equal.

Classification criteria
1. Maximum likelihood (ML): y = argmax_w P(X = x | Y = w)
2. Maximum a posteriori (MAP): y = argmax_w P(Y = w | X = x) = argmax_w P(X = x | Y = w) P(Y = w)
3. Bayesian decision rule: y = argmin_w E[L(w) | X = x], where L(w) measures the loss when misclassifying the class as w.

Sources of error
- For two classes, P(err | x) = min{ P(w1 | x), P(w2 | x) }.
- Suboptimal thresholds / decision regions: bias error — use a Bayes classifier.
- Incorrect distributions: model error — use better distribution models and more training data.
- Misleading features: "Bayes error" — irreducible for a given feature set, regardless of the classification scheme.

Training
- Given labeled training data (x1, y1), (x2, y2), ..., (xN, yN), derive the classification rule.

Density-based vs. boundary-based approaches
- Density-based: estimate P(X = x | Y = w) directly.
  - Parametric: assume a specific PDF form; ML-estimate its parameters.
  - Nonparametric: e.g., histograms — but how many bins? Problematic in high dimensions; needs lots of data; a histogram with too many bins is a poor (overfit) estimate of the underlying pdf.
- Boundary-based: identify the class decision boundary directly; does not overfit easily, but is difficult to interpret.

Ontology for PR
- With the class-conditional densities unknown, supervised learning splits into parametric methods (plug-in density estimation, decision rules) and nonparametric methods (density estimation, decision-boundary construction) — density-based vs. geometric approaches; unsupervised learning leads to (nonparametric) cluster analysis.

Our focus
- Parametric models: Gaussian mixture models, hidden Markov models.
- Nonparametric models: neural networks, support vector machines, boosting.
- Dimension reduction: feature selection, PCA and randomized projection; applications to fast similarity search.

Gaussian distribution
- The easiest way to model a distribution is a parametric model: assume a known form and estimate a few parameters. The Gaussian model is simple and useful; in 1D:

    p(x) = (1 / (sqrt(2 pi) sigma)) exp( -(x - mu)^2 / (2 sigma^2) )

  (the leading factor normalizes the density to integrate to 1).

High-dimensional Gaussian
- Described by a d-dimensional mean vector mu and a d x d covariance matrix Sigma:

    p(x | w) = (2 pi)^(-d/2) |Sigma|^(-1/2) exp( -(1/2) (x - mu)^T Sigma^(-1) (x - mu) )

Gaussian mixture model (GMM)
- Single Gaussians cannot model distributions with multiple modes, or distributions with nonlinear correlation.
- What about a weighted sum? p(x) = sum_k c_k p(x | w_k), where the c_k are weights and the p(x | w_k) are Gaussian components — this can fit anything given enough components.
- Interpretation: each observation is generated by one of the Gaussians, chosen at random with priors c_k = P(w_k).
- [slide figure: a 4-component Gaussian mixture fit to data and the resulting density surface]
- Problem: we need to find the parameters of each component and the priors. Two cases: supervised classification and unsupervised clustering (see notes).
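A minimal sketch of the MAP rule with one Gaussian class-conditional per class (a 1-component special case of the GMM classifier; each class needs enough samples for a non-singular covariance estimate):

```python
import numpy as np
from scipy.stats import multivariate_normal

class GaussianMAP:
    """MAP classifier: y = argmax_w p(x | w) P(w), Gaussian p(x | w)."""

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.models, self.priors = {}, {}
        for w in self.classes:
            Xw = X[y == w]
            # ML estimates of the class mean and covariance.
            self.models[w] = multivariate_normal(Xw.mean(axis=0),
                                                 np.cov(Xw, rowvar=False))
            self.priors[w] = len(Xw) / len(X)
        return self

    def predict(self, X):
        # Work in log space: log p(x | w) + log P(w).
        scores = np.stack([self.models[w].logpdf(X) + np.log(self.priors[w])
                           for w in self.classes])
        return self.classes[np.argmax(scores, axis=0)]
```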
Multimedia Information Systems
Samson Cheung, EE 639, Fall 2004

Lecture 19: Data Structures for Similarity Search

Similarity search: finding the records in a large dataset closest to a given query.

Applications
- Content-based retrieval; data mining and pattern recognition; geographic information systems (GIS) — e.g., "where is the closest post office?".
- Our focus is on efficient data structures (indexing) and the associated signal processing algorithms: easy insertion and deletion, fast query time.

Query types
- Range query: a distance tolerance is specified; used at the earlier stage, and can be very loose.
- k-nearest-neighbour query: specifies the number of close matches to the given query point.
- Window query: an interval is given for each dimension of the feature space, and all records which fall inside this hypercube are retrieved.

Binary search for a 1D feature
- Sort the values once (e.g., 123, 45, 3, 5333, 6, 100, 94, 10, 34, 7256 becomes 3, 6, 10, 34, 45, 94, 100, 123, 5333, 7256); each query then takes O(log N) comparisons — much better than a sequential scan.
- [slide figure: 1D index structure; remaining text unrecoverable]

Multidimensional index structures
- Partition the higher-dimensional space. Some common structures: k-d trees, point quadtrees, MX quadtrees, R-trees (and variants such as R*, TV, and SS trees), and many others.

k-d trees
- A k-d tree is a multidimensional binary search tree.
- Each node consists of a record and two pointers; the pointers are either null or point to another node.
- Nodes have levels, and each level of the tree discriminates on one attribute; the partitioning attribute alternates (cycles) through the attributes of the n-dimensional search space.

k-d tree example
- Input sequence: A(65,50), B(60,70), C(70,60), D(75,25), E(50,90), F(90,65), G(10,30), H(80,85), I(95,75). [slide figure: the resulting tree and space partition]

k-d tree search algorithm
- Notation: Disc(L) is the discriminator at L's level; K_A(L) is the A-attribute value of L; Low(L) and High(L) are the left and right children of L.
- Algorithm (search for P = (K1, ..., Kn)): set Q = Root (Q will be used to navigate the tree). While not done: if Ki(P) = Ki(Q) for i = 1..n, we have located the node and we are done; otherwise, let A = Disc(Q), and if K_A(P) < K_A(Q) then Q = Low(Q), else Q = High(Q).

Point quadtrees
- Why limit ourselves to binary partitions? Each node of a k-dimensional point quadtree partitions the object space into 2^k quadrants.
- The partitioning is performed along all search dimensions and is data dependent, like k-d trees.
- Example: with A(50,50), D(35,85), E(5,25) and other points [slide figure], to find or insert P(55,75): since X_A < X_P and Y_A < Y_P, go to A's NE child (B); since X_B > X_P and Y_B > Y_P, go to B's SW child, which in this case is null.

MX quadtrees
- Both k-d trees and point quadtrees may divide the space unevenly, due to the uneven distribution of the data points.
- MX quadtrees are similar to point quadtrees except that they divide the embedding space: each split evenly divides a region.
- [slide figure: step-by-step construction while inserting A(65,50), C(70,60), B(60,70), D(75,25)]

R-trees
- All the previous structures are unsuitable for disk access, because each data object is accessed randomly; disks organize data in blocks ("pages"), and it is much more efficient to load one page at a time.
- The R-tree is similar to a B-tree in 1D: main memory stores a summary of the data; data are stored at leaf nodes; each leaf node contains multiple data objects.

R-trees (continued)
- Root and intermediate nodes correspond to the smallest rectangle that encloses their child nodes.
- Each node must be at least half full, to minimize the height of the tree.
- Leaf nodes contain pointers to the actual objects.
- A rectangle may be spatially contained in several nodes yet can be associated with only one node — so queries may need to search multiple branches!

Low-dimensional case
- R-tree NN search works well: keep the number of bounding boxes small (storage), and give each bounding box roughly the same number of points.

High-dimensional case
- Assume N uniformly distributed data points in [0,1]^d and M minimum bounding boxes (MBBs).
- For fixed N, the expected nearest-neighbour distance E[nn-dist] grows as O(sqrt(d)), approaching the length of the main diagonal of [0,1]^d, which is sqrt(d).
- If each bounding box is a d-cube of side s = 0.5, its volume is 0.5^d; for d = 100 this is about 8 x 10^-31, which forces M ~ N (about one point per box).
- To keep M small, lower the dimension of the individual bounding boxes to d' = O(log2(N/M)), so that a cube of side 0.5 has volume 0.5^(d') = M/N; the maximum distance from such a box is then l_max ~ 0.5 sqrt(log2(N/M)).
- When d >> log N, l_max < E[nn-dist]: the query ball intersects every bounding box, and the index degenerates to sequential search.
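To make the k-d tree discussion concrete, a minimal build-and-search sketch using the example points from the lecture (nearest-neighbour search with splitting-plane pruning, rather than the exact-match search above):

```python
def build_kdtree(points, depth=0):
    """k-d tree: each level discriminates on one attribute, cycling."""
    if not points:
        return None
    axis = depth % len(points[0])          # the discriminator for this level
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def nearest(node, q, depth=0, best=None):
    """Depth-first NN search, pruning subtrees behind the splitting plane."""
    if node is None:
        return best
    point, low, high = node
    axis = depth % len(q)
    dist = sum((a - b) ** 2 for a, b in zip(point, q))  # squared distance
    if best is None or dist < best[1]:
        best = (point, dist)
    near, far = (low, high) if q[axis] < point[axis] else (high, low)
    best = nearest(near, q, depth + 1, best)
    if (q[axis] - point[axis]) ** 2 < best[1]:  # the far side may hide a closer point
        best = nearest(far, q, depth + 1, best)
    return best

pts = [(65, 50), (60, 70), (70, 60), (75, 25), (50, 90),
       (90, 65), (10, 30), (80, 85), (95, 75)]
print(nearest(build_kdtree(pts), (55, 75)))  # ((60, 70), 50)
```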


Buy Material

Are you sure you want to buy this material for

25 Karma

Buy Material

BOOM! Enjoy Your Free Notes!

We've added these Notes to your profile, click here to view them now.


You're already Subscribed!

Looks like you've already subscribed to StudySoup, you won't need to purchase another subscription to get this material. To access this material simply click 'View Full Document'

Why people love StudySoup

Steve Martinelli UC Los Angeles

"There's no way I would have passed my Organic Chemistry class this semester without the notes and study guides I got from StudySoup."

Janice Dongeun University of Washington

"I used the money I made selling my notes & study guides to pay for spring break in Olympia, Washington...which was Sweet!"

Jim McGreen Ohio University

"Knowing I can count on the Elite Notetaker in my class allows me to focus on what the professor is saying instead of just scribbling notes the whole time and falling behind."

Parker Thompson 500 Startups

"It's a great way for students to improve their educational experience and it seemed like a product that everybody wants, so all the people participating are winning."

Become an Elite Notetaker and start selling your notes online!

Refund Policy


All subscriptions to StudySoup are paid in full at the time of subscribing. To change your credit card information or to cancel your subscription, go to "Edit Settings". All credit card information will be available there. If you should decide to cancel your subscription, it will continue to be valid until the next payment period, as all payments for the current period were made in advance. For special circumstances, please email


StudySoup has more than 1 million course-specific study resources to help students study smarter. If you’re having trouble finding what you’re looking for, our customer support team can help you find what you need! Feel free to contact them here:

Recurring Subscriptions: If you have canceled your recurring subscription on the day of renewal and have not downloaded any documents, you may request a refund by submitting an email to

Satisfaction Guarantee: If you’re not satisfied with your subscription, you can contact us for further help. Contact must be made within 3 business days of your subscription purchase and your refund request will be subject for review.

Please Note: Refunds can never be provided more than 30 days after the initial purchase date regardless of your activity on the site.