# Statistical Models Theory and Application STAT 215A

This 9 page Class Notes was uploaded by Floy Kub on Thursday October 22, 2015. The Class Notes belongs to STAT 215A at University of California - Berkeley taught by Staff in Fall.

## Statistics 215a, 10/2/03, D. R. Brillinger

**Residual analysis**

$$r_{ij} = y_{ij} - m - a_i - b_j = y_{ij} - \hat{y}_{ij}$$

Plot the residuals vs.

- the fitted values $\hat{y}_{ij}$
- the row values $a_i$
- the column values $b_j$
- the comparison values $a_i b_j / m$ (the diagnostic plot)

Look for patterns, outliers, surprises. In the diagnostic plot, a 45° line suggests a log transform.

Example: wildfire data.

**L1 approximation**

1. Location summary statistic: $\min_\theta \sum_i |y_i - \theta|$ is minimized by $y_{(m+1)}$ if $n = 2m+1$, and by any value in $[y_{(m)}, y_{(m+1)}]$ if $n = 2m$. Proof: perturb $\theta$ away from the median.
2. Linear function: $\min_\beta \sum_i |y_i - x_i^T \beta|$. Solved by linear programming; `l1fit`.
3. Two-way array:
   - (i) Least absolute residuals: $\min_{\mu,\alpha,\beta} \sum_{i,j} |y_{ij} - \mu - \alpha_i - \beta_j|$; `l1fit`. Diagnostic plot: residual $r_{ij}$ vs. $a_i b_j / m$.
   - (ii) Median polish: operate iteratively, removing row and column medians until each row and column has median 0 ("sweeping"). This gives the additive approximation $y_{ij} = m + a_i + b_j + r_{ij}$. Properties:
     - resistant
     - can be carried out by hand
     - missing values are OK
     - the answer depends on whether one begins with rows or with columns
     - approximates the L1 solution
     - if means are removed instead of medians, one gets the aov fit in one pass

**Two-way diagnostic plot**

Plot the residual $r_{ij}$ vs. $a_i b_j / m$ and add a resistant line. The plot can suggest a transformation to additivity, since

$$y_{ij} = m \left(1 + \frac{a_i}{m}\right)\left(1 + \frac{b_j}{m}\right) + r_{ij}$$

Example: acres of wildfires.

**Other criteria**

$$\min_{\mu,\alpha,\beta} \sum_{i,j} \rho(y_{ij} - \mu - \alpha_i - \beta_j)$$

Choices of $\rho$ include L1, the biweight, and the trimmed mean. Software: `twoway`, `medpolish` (eda). Compare OLS vs. the resistant line.

## Statistics 215a, 10/6/03, D. R. Brillinger

J. W. Tukey and M. B. Wilk (1966). Data analysis and statistics: an expository overview. AFIPS Conference Proceedings, Vol. 29. Also in The Collected Works of John W. Tukey, Vol. 3, and the Statistics 215a Reader.

**Introduction**

"The basic general intent of data analysis is simply stated: to seek through a body of data for interesting relationships and information and to exhibit the results in such a way as to make them recognizable to the data analyzer and recordable for posterity."

Four major influences act on data analysis today (1966):

1. The formal theories of statistics
2. Accelerating developments in computers and display devices
3. The challenge, in many fields, of more and ever larger bodies of data
4. The emphasis on quantification in an ever wider variety of disciplines

"Exposure, the effective laying open of the data to display the unanticipated, is to us a major portion of data analysis. Formal statistics has given almost no guidance to exposure."

**Data analysis is like doing experiments**

"Far too many people ... have persisted in regarding statistics, even data analysis, as a branch of probability theory ... within modern mathematics."

"Statistical data analysis is much more appropriately associated with the sciences and with the experimental process in general."

The general purposes of conducting experiments and analyzing data match, point by point. For experimentation, these purposes include:

1. more adequate description of experience and quantification of some areas of knowledge
2. discovery or invention of new phenomena and relations
3. confirmation, or labeling for change, of previous assumptions, expectations and hypotheses
4. generation of ideas for further useful experiments
5. keeping the experimenter relatively occupied while he thinks

Comparable objectives in data analysis are:

1. to achieve more specific description of what is loosely known or suspected
2. to find unanticipated aspects in the data, and to suggest unthought-of models for the data's summarization and exposure
3. to employ the data to assess the (always incomplete) adequacy of a contemplated model
4. to provide both incentives and guidance for further analysis of the data
5. to keep the investigator usefully stimulated while he absorbs the feeling of his data and what to do next
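
As a concrete illustration of the median polish procedure described earlier in these notes, here is a minimal Python sketch using NumPy. It is not code from the course: the function names `median_polish` and `comparison_values`, the iteration cap, and the tolerance are assumptions chosen for illustration, and starting with the row sweep is an arbitrary choice (the notes point out that the answer can depend on it). The sketch does not handle missing values.

```python
import numpy as np

def median_polish(y, max_iter=10, tol=1e-8):
    """Additive fit y[i, j] = m + a[i] + b[j] + r[i, j] by sweeping out medians.

    Illustrative sketch: rows are swept first; no missing-value handling.
    """
    r = np.array(y, dtype=float)      # residuals start as a copy of the data
    m = 0.0                           # overall (common) value
    a = np.zeros(r.shape[0])          # row effects
    b = np.zeros(r.shape[1])          # column effects
    for _ in range(max_iter):
        # Row sweep: move each row's median from the residuals into a.
        row_med = np.median(r, axis=1)
        r -= row_med[:, None]
        a += row_med
        # Re-centre the column effects so their median is 0.
        shift = np.median(b)
        b -= shift
        m += shift
        # Column sweep: move each column's median from the residuals into b.
        col_med = np.median(r, axis=0)
        r -= col_med[None, :]
        b += col_med
        # Re-centre the row effects so their median is 0.
        shift = np.median(a)
        a -= shift
        m += shift
        # Stop once a full pass changes nothing.
        if np.abs(row_med).max() < tol and np.abs(col_med).max() < tol:
            break
    return m, a, b, r

def comparison_values(m, a, b):
    """Comparison values a[i] * b[j] / m for the two-way diagnostic plot."""
    return np.outer(a, b) / m

# Hypothetical data: an exactly additive 3 x 3 table with one wild cell.
y = 10.0 + np.add.outer(np.array([-1.0, 0.0, 2.0]), np.array([-2.0, 0.0, 3.0]))
y[0, 0] += 100.0
m, a, b, r = median_polish(y)
cv = comparison_values(m, a, b)
print(m, a, b)
print(r)
```

On this made-up table the contaminated cell ends up entirely in the residuals while the fitted effects are those of the uncontaminated table, which is the resistance property noted above. Plotting `r.ravel()` against `cv.ravel()` and adding a resistant line gives the diagnostic plot.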

