BIG (Business, Industry, Government) Statistics (STAT 597C)
These class notes for STAT 597C at Pennsylvania State University were uploaded by Hilbert Denesik on Sunday, November 1, 2015. The course was taught by Staff in Fall.
Adding a cluster neighboring structure (topology) to K-means: Self-Organizing Maps (SOMs)

A SOM-type clustering algorithm:

- Choose a distance function for points, say d(x1, x2), and choose a K.
- Specify a topology (neighboring structure) for the clusters. We do this by choosing a K x K matrix with non-negative entries that expresses distances among K latent "learning units", one for each cluster: delta(u_i, u_j), i, j = 1...K. The topology does not necessarily have to correspond to a physical similarity/dissimilarity or proximity; what it ends up mapping is the clusters' relatedness in the fitting process.
- Specify a learning propagation mechanism. We do this through a kernel function h(delta) that takes as its argument distances between learning units.
- Initialize K T-dimensional weight vectors, one for each learning unit; these are points in the space of the data and will function as centroids, e.g. K among the N data points selected at random: w_1(0)...w_K(0).
- For each x:
  1. Compute the distances from the current centroids, d(x, w_k), k = 1...K.
  2. Update the current cluster membership, identifying the closest ("winning") centroid: m(x) = argmin_{k=1...K} d(x, w_k).
  3. Update all centroids, based on the proximity of the corresponding units to that of the winner and on the learning propagation mechanism: w_k <- w_k + h(delta(u_m(x), u_k)) (x - w_k), k = 1...K.
- Repeat until cluster centroids (and thus memberships) have stabilized, possibly cycling through the observations several times.

Loosely speaking, this algorithm pursues a minimum of the objective function

  A(K, start) = sum_x sum_k h(delta(u_m(x), u_k)) d^2(x, w_k),

where the centroids are again the optimization variables, and the minimum we reach depends on the starting w_1(0)...w_K(0). In real SOMs the kernel further depends on a time parameter t that represents the number of iterations, h(delta, t); as t increases, the kernel becomes smaller all over and more concentrated about delta = 0 (see later). In the objective function, we have replaced an indicator with a function obtained by composing a postulated learning propagation mechanism (the kernel) with a postulated neighboring structure among the clusters (distance in the learning-units space).
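The update loop above can be sketched in code. The following is a minimal illustration, not part of the notes: the function name, the 2 x 2 grid topology, the Gaussian kernel, the fixed learning rate, and the deterministic initialization (first K data points instead of K random ones) are all assumptions of this sketch.

```python
import math

def som_cluster(data, grid=(2, 2), sigma=1.0, lr=0.5, passes=20):
    """Minimal sketch of the SOM-type clustering algorithm described above.

    Learning units live on a small grid (the postulated topology); the
    Gaussian kernel h is one possible choice of learning propagation."""
    units = [(i, j) for i in range(grid[0]) for j in range(grid[1])]  # u_1..u_K
    K = len(units)
    # Centroids initialized at the first K data points (the notes suggest
    # K data points chosen at random; fixed here for reproducibility).
    w = [list(map(float, data[k])) for k in range(K)]

    def d2(a, b):  # squared Euclidean distance d^2(x1, x2)
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    def delta2(ui, uj):  # squared distance between learning units on the grid
        return (ui[0] - uj[0]) ** 2 + (ui[1] - uj[1]) ** 2

    for _ in range(passes):  # cycle through the observations several times
        for x in data:
            m = min(range(K), key=lambda k: d2(x, w[k]))  # winner m(x)
            for k in range(K):  # update ALL centroids, weighted by h(delta)
                h = lr * math.exp(-delta2(units[m], units[k]) / (2 * sigma ** 2))
                w[k] = [wk + h * (xi - wk) for wk, xi in zip(w[k], x)]
    labels = [min(range(K), key=lambda k: d2(x, w[k])) for x in data]
    return w, labels
```

With a narrow kernel (small sigma) this behaves much like K-means; widening sigma couples the centroid updates through the grid topology.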
Whatever the learning topology may be, SOM reduces to an algorithm similar to K-means if there is no learning propagation, which corresponds to a kernel h(delta, t) that is positive at delta = 0 and zero elsewhere (only the winner is updated). Similarly, whatever the learning propagation mechanism may be, SOM reduces to an algorithm similar to K-means if we postulate no neighboring structure among the clusters, i.e. delta(u_i, u_j) so large for i != j that h(delta(u_i, u_j)) = 0 unless i = j.

Real SOM algorithms. These are slightly different, because they are devised to train a net for future recognition purposes, not explicitly for clustering:

- The data points are a collection of training data ("input patterns") that are fed to the net, so it may or may not make sense to present them more than once.
- The number of learning units K does not necessarily correspond, in any sense, to the number of groups one expects in the training data. Sometimes it is selected to be as large as the number of available input patterns, or even larger.
- The neighboring structure is automatically defined by the fact that learning units are located on a 1- or 2-dimensional grid (squared distances on the grid).
- In order to have the net stabilize as input patterns are presented to it, the learning mechanism (kernel) has a further parameter expressing time: as time goes by, the shape of h(delta, t) decreases in height and width, reducing both the learning effect and the learning propagation. (Note: with this change, one cannot rigorously write an objective function corresponding to the training of the net anymore.) Imagine a situation in which input patterns are presented only once, and one does not necessarily have a large enough number of them to guarantee convergence of the algorithm: tapering the learning has a stabilizing effect.
- The order in which the input patterns are presented may matter a great deal (need to randomize).

Geometrically speaking, the weight vectors of the adapted units are moved a bit towards the input pattern. The amount of weight-vector movement and the number of units that are affected by adaptation are determined by the kernel function, and both decrease in time.
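The tapering kernel h(delta, t) can be made concrete. The exponential-decay schedule below is one common choice, not the notes' prescription; the function name, constants, and decay form are all assumptions of this sketch.

```python
import math

def h(delta2, t, lr0=0.5, sigma0=2.0, tau=50.0):
    """Hypothetical time-decaying SOM kernel h(delta, t).

    Height (learning effect) and width (learning propagation) both shrink
    as t grows, concentrating the kernel about delta = 0.
    delta2 is the squared distance between learning units."""
    lr = lr0 * math.exp(-t / tau)        # kernel height decreases in time
    sigma = sigma0 * math.exp(-t / tau)  # kernel width decreases in time
    return lr * math.exp(-delta2 / (2 * sigma ** 2))
```

Note the K-means limit: as sigma0 shrinks toward 0, only the winner (delta = 0) receives a nonzero update.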
In time, each weight vector comes to be similar to a "type" (group) of input patterns, so each unit is more likely to "win" at future presentations of patterns of that type. Adapting not only the winner unit, but also a number of units in the neighborhood of the winner, leads to the following correspondence:

  close-by input pattern types (in data space, with its distance)
  <-> close-by weight vectors (in data space, with its distance)
  <-> close-by units (in the units' space, with its distance on the lattice).

The training process of the self-organizing map thus describes a topology-preserving mapping from a high-dimensional data space onto a low-dimensional unit space: patterns that are similar in terms of the data space are mapped to geographically close locations in the units' lattice.

SOMs for visualization: start with a number of learning units larger than, or equal to, the number of data points (input patterns). Each unit will come to "represent" a data point: a proximity-preserving nonlinear projection from the data space (T-dimensional, high) to a 2-D grid.

Example from http://www.ifs.tuwien.ac.at/~andi/somlib/som.html: a SOM consisting of a square arrangement of 7 by 7 learning units. The black circle indicates the unit that was selected as the winner for the presentation of input pattern x(t). The weight vector of the winner, w(t), is moved towards the input pattern, and thus w(t+1) is nearer to x(t) than w(t) was. Similar, yet less strong, adaptation is performed on a number of units in the vicinity of the winner; these units are marked as shaded circles. The degree of shading corresponds to the strength of adaptation: the weight vectors of units shown with darker shading are moved closer to x(t) than those of units shown with lighter shading. As t increases, the shading fades.

[Figure: architecture of a self-organizing map; the m's in the figure are the w's in our symbols.]

As a result of the training process, similar input data are mapped onto neighboring regions of the map.
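The topology-preserving mapping can be illustrated with a toy projection: each pattern is sent to the grid position of its winning unit. The function, the 2 x 2 map, and the hand-picked "trained" weights below are all made up for illustration.

```python
def project(data, units, weights):
    """Map each input pattern to the grid coordinates of its winning unit:
    a proximity-preserving projection from data space onto the 2-D lattice."""
    def d2(a, b):  # squared Euclidean distance
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return [units[min(range(len(weights)), key=lambda k: d2(x, weights[k]))]
            for x in data]

# A toy 2 x 2 map with hand-picked (already "trained") weight vectors:
units = [(0, 0), (0, 1), (1, 0), (1, 1)]
weights = [[0, 0], [0, 5], [5, 0], [5, 5]]
# Patterns that are similar in data space land in the same or adjacent cells:
coords = project([[0.2, 0.1], [0.1, 4.8], [4.9, 5.1]], units, weights)
```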
Fuzzy K-means

If the data are not "lumpy", i.e. if the variation in the data space is gradual instead of abrupt, disjoint clusters may provide a poor description, and an approach with fuzzy clusters seems more appropriate.

Objective function (one version):

  A(K, start) = sum_x sum_k m(x, k)^f d^2(x, c_k),

where the m(x, k) in (0, 1) are membership functions ("fuzzy memberships"); they sum to 1 over k, and to a strictly positive number over x. f >= 1 is the "fuzziness exponent", which controls the degree of fuzziness of the final solution (the degree of overlap between groups): with f = 1 the solution is a hard partition, and as f approaches infinity the solution approaches its highest degree of fuzziness.

Corresponding algorithm (one version):

- Choose a distance function for points, say d(x1, x2). Choose a K. Choose a fuzziness exponent f >= 1.
- Initialize K T-dimensional centroids, e.g. K among the N data points selected at random: c_1(0)...c_K(0). (Alternatively, initialize the memberships.)
- Iterate the following:

    m(x, k) <- [ sum_{j=1...K} ( d^2(x, c_k) / d^2(x, c_j) )^{1/(f-1)} ]^{-1},  for all x, k = 1...K
    c_k <- sum_x m(x, k)^f x / sum_x m(x, k)^f,  k = 1...K

- Repeat until cluster centroids (and thus memberships) have stabilized, possibly cycling through the observations several times.

Data points located between clusters will have fairly spread membership functions. How about outliers, i.e. points outside the main body of the data? "Fuzzy K-means with extragrades" adds an extra class for outliers (De Gruijter and McBratney, 1988).

References

Kohonen, T. (1997). Self-Organizing Maps, 2nd edition. Springer-Verlag.

Heskes, T., Spanjers, J., Wiegerinck, W. (1999). EM algorithms for self-organizing maps. Abstract: "Self-organizing maps are popular algorithms for unsupervised learning and data visualization. Exploiting the link between vector quantization and mixture modeling, we derive EM algorithms for self-organizing maps with and without missing values. We compare self-organizing maps with the elastic-net approach and explain why the former is better suited for the visualization of high-dimensional data. Several extensions and improvements are discussed."
http://citeseer.nj.nec.com/280386.html

Flexer, A. (1999). On the use of self-organizing maps for clustering and visualization. Principles of Data Mining and Knowledge Discovery. Abstract: "We will show that the number of output units used in a self-organizing map (SOM) influences its applicability for either clustering or visualization. By reviewing the appropriate literature and theory, as well as our own empirical results, we demonstrate that SOMs can be used for clustering or visualization separately, for simultaneous clustering and visualization, and even for clustering via visualization. For all these different kinds of application, SOM is compared to other statistical approaches. This will show SOM to be a very flexible tool which can be used for various forms of explorative data analysis, but it will also be made obvious that this flexibility comes with a price in terms..." http://citeseer.nj.nec.com/105424.html

Bezdek, J.C. (1981). Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York.

De Gruijter, J.J., McBratney, A.B. (1988). A modified fuzzy k-means for predictive classification. In Bock, H.H. (ed.), Classification and Related Methods of Data Analysis, pp. 97-104. Elsevier Science, Amsterdam.

Roubens, M. (1982). Fuzzy clustering algorithms and their cluster validity. European Journal of Operational Research, 10, 294-301.

Xie, X.L., Beni, G. (1991). A validity measure for fuzzy clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13, 841-847.
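Returning to the fuzzy K-means algorithm above, the two alternating updates can be sketched as follows. This is a minimal sketch, not a definitive implementation: squared Euclidean distance, deterministic initialization, the zero-distance guard, and the requirement f > 1 are assumptions made here.

```python
def fuzzy_kmeans(data, K, f=2.0, iters=50):
    """Minimal sketch of fuzzy K-means (one version, as described above).

    Uses squared Euclidean distance and requires f > 1. Centroids are
    initialized at the first K data points for reproducibility (the notes
    suggest K data points chosen at random)."""
    n, dim = len(data), len(data[0])
    c = [list(map(float, data[k])) for k in range(K)]

    def d2(a, b):  # squared Euclidean distance, guarded away from zero
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) or 1e-12

    for _ in range(iters):
        # Membership update: m(x,k) = 1 / sum_j (d2(x,c_k)/d2(x,c_j))^(1/(f-1))
        m = [[1.0 / sum((d2(x, c[k]) / d2(x, c[j])) ** (1.0 / (f - 1.0))
                        for j in range(K))
              for k in range(K)]
             for x in data]
        # Centroid update: c_k = sum_x m(x,k)^f x / sum_x m(x,k)^f
        for k in range(K):
            s = sum(m[i][k] ** f for i in range(n))
            c[k] = [sum(m[i][k] ** f * data[i][t] for i in range(n)) / s
                    for t in range(dim)]
    return c, m
```

Points between the two clusters receive spread-out memberships, while points deep inside a cluster receive memberships near 1 for that cluster.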