# INTRODUCTORY PROBABILITY MA 320

These 5 pages of class notes were uploaded by Kennith Herman on Friday, October 23, 2015. The class notes belong to MA 320 at the University of Kentucky, taught by Robert Molzon in Fall.

Date Created: 10/23/15

## R/Rpad Reference Card

by Tom Short, EPRI Solutions, Inc., tshort@eprisolutions.com, 2005-07-12. Granted to the public domain. See www.Rpad.org for the source and latest version. Includes material from *R for Beginners* by Emmanuel Paradis (with permission).

### Help and basics

Most R functions have online documentation.

- `help(topic)`: documentation on `topic`
- `?topic`: the same
- `help.search("topic")`: search the help system
- `apropos("topic")`: the names of all objects in the search list matching the regular expression "topic"
- `help.start()`: start the HTML version of help
- `str(a)`: display the internal structure of an R object
- `summary(a)`: gives a "summary" of `a`, usually a statistical summary, but it is generic, meaning it has different operations for different classes of `a`
- `ls()`: show objects in the search path; specify `pat="pat"` to search on a pattern
- `ls.str()`: `str()` for each variable in the search path
- `dir()`: show files in the current directory
- `methods(a)`: shows S3 methods of `a`
- `methods(class=class(a))`: lists all the methods to handle objects of class `a`
- `options(...)`: set or examine many global options; common ones: `width`, `digits`, `error`
- `library(x)`: load add-on packages; `library(help=x)` lists datasets and functions in package `x`
- `attach(x)`: attach database `x` to the R search path; `x` can be a list, data frame, or R data file created with `save`. Use `search()` to show the search path.
- `detach(x)`: detach `x` from the R search path; `x` can be a name or character string of an object previously attached or a package

### Input and output

- `load()`: load the datasets written with `save`
- `data(x)`: loads specified data sets
- `read.table(file)`: reads a file in table format and creates a data frame from it; the default separator `sep=""` is any whitespace; use `header=TRUE` to read the first line as a header of column names; use `as.is=TRUE` to prevent character vectors from being converted to factors; use `comment.char=""` to prevent `"#"` from being interpreted as a comment; use `skip=n` to skip n lines before reading data; see the help for options on row naming, NA treatment, and others
- `read.csv("filename", header=TRUE)`: the same, but with defaults set for reading comma-delimited files
- `read.delim("filename", header=TRUE)`: the same, but with defaults set for reading tab-delimited files
- `read.fwf(file, widths, header=FALSE, sep="\t", as.is=FALSE)`: read a table of fixed-width formatted data into a data frame; `widths` is an integer vector giving the widths of the fixed-width fields
- `save(file, ...)`: saves the specified objects (...) in the XDR platform-independent binary format
- `save.image(file)`: saves all objects
- `cat(..., file="", sep=" ")`: prints the arguments after coercing to character; `sep` is the character separator between arguments
- `print(a, ...)`: prints its arguments; generic, meaning it can have different methods for different objects
- `format(x, ...)`: format an R object for pretty printing
- `write.table(x, file="", row.names=TRUE, col.names=TRUE, sep=" ")`: prints `x` after converting to a data frame; if `quote` is `TRUE`, character or factor columns are surrounded by quotes (`"`); `sep` is the field separator; `eol` is the end-of-line separator; `na` is the string for missing values; use `col.names=NA` to add a blank column header to get the column headers aligned correctly for spreadsheet input
- `sink(file)`: output to `file`, until `sink()`

Most of the I/O functions have a `file` argument. This can often be a character string naming a file or a connection. `file=""` means the standard input or output. Connections can include files, pipes, zipped files, and R variables.

On Windows, the file connection can also be used with `description = "clipboard"`. To read a table copied from Excel, use `x <- read.delim("clipboard")`. To write a table to the clipboard for Excel, use `write.table(x, "clipboard", sep="\t", col.names=NA)`.

For database interaction, see packages RODBC, DBI, RMySQL, RPgSQL, and ROracle. See packages XML, hdf5, netCDF for reading other file formats.

### Data creation

- `c(...)`: generic function to combine arguments with the default forming a vector; with `recursive=TRUE` descends through lists combining all elements into one vector
- `from:to`: generates a sequence; ":" has operator priority; `1:4 + 1` is 2, 3, 4, 5
- `seq(from, to)`: generates a sequence; `by=` specifies increment; `length=` specifies desired length
- `seq(along=x)`: generates 1, 2, ..., `length(x)`; useful for `for` loops
- `rep(x, times)`: replicate `x` `times` times; use `each=` to repeat each element of `x` `each` times; `rep(c(1,2,3), 2)` is 1 2 3 1 2 3; `rep(c(1,2,3), each=2)` is 1 1 2 2 3 3
- `data.frame(...)`: create a data frame of the named or unnamed arguments; `data.frame(v=1:4, ch=c("a","B","c","d"), n=10)`; shorter vectors are recycled to the length of the longest
- `list(...)`: create a list of the named or unnamed arguments; `list(a=c(1,2), b="hi", c=3i)`
- `array(x, dim=)`: array with data `x`; specify dimensions like `dim=c(3,4,2)`; elements of `x` recycle if `x` is not long enough
- `matrix(x, nrow=, ncol=)`: matrix; elements of `x` recycle
- `factor(x, levels=)`: encodes a vector `x` as a factor
- `gl(n, k, length=n*k, labels=1:n)`: generate levels (factors) by specifying the pattern of their levels; `n` is the number of levels, and `k` is the number of replications
- `expand.grid()`: a data frame from all combinations of the supplied vectors or factors
- `rbind(...)`: combine arguments by rows for matrices, data frames, and others
- `cbind(...)`: the same by columns

### Slicing and extracting data

Indexing lists:

- `x[n]`: list with elements `n`
- `x[[n]]`: nth element of the list
- `x[["name"]]`: element of the list named "name"
- `x$name`: the same

Indexing vectors:

- `x[n]`: nth element
- `x[-n]`: all but the nth element
- `x[1:n]`: first n elements
- `x[-(1:n)]`: elements from n+1 to the end
- `x[c(1,4,2)]`: specific elements
- `x["name"]`: element named "name"
- `x[x > 3]`: all elements greater than 3
- `x[x > 3 & x < 5]`: all elements between 3 and 5
- `x[x %in% c("a","and","the")]`: elements in the given set

Indexing matrices:

- `x[i,j]`: element at row i, column j
- `x[i,]`: row i
- `x[,j]`: column j
- `x[,c(1,3)]`: columns 1 and 3
- `x["name",]`: row named "name"

Indexing data frames (matrix indexing plus the following):

- `x[["name"]]`: column named "name"
- `x$name`: the same

### Variable conversion

- `as.array(x)`, `as.data.frame(x)`, `as.numeric(x)`, `as.logical(x)`, `as.complex(x)`, `as.character(x)`, ...: convert type; for a complete list, use `methods(as)`

### Variable information

- `is.na(x)`, `is.null(x)`, `is.array(x)`, `is.data.frame(x)`, `is.numeric(x)`, `is.complex(x)`, `is.character(x)`, ...: test for type; for a complete list, use `methods(is)`
- `length(x)`: number of elements in `x`
- `dim(x)`: retrieve or set the dimension of an object; `dim(x) <- c(3,2)`
- `dimnames(x)`: retrieve or set the dimension names of an object
- `nrow(x)`: number of rows; `NROW(x)` is the same but treats a vector as a one-column matrix
- `ncol(x)` and `NCOL(x)`: the same for columns
- `class(x)`: get or set the class of `x`; `class(x) <- "myclass"`
- `unclass(x)`: remove the class attribute of `x`
- `attr(x, which)`: get or set the attribute `which` of `x`
- `attributes(obj)`: get or set the list of attributes of `obj`

### Data selection and manipulation

- `which.max(x)`: returns the index of the greatest element of `x`
- `which.min(x)`: returns the index of the smallest element of `x`
- `rev(x)`: reverses the elements of `x`
- `sort(x)`: sorts the elements of `x` in increasing order; to sort in decreasing order: `rev(sort(x))`
- `cut(x, breaks)`: divides `x` into intervals (factors); `breaks` is the number of cut intervals or a vector of cut points
- `match(x, y)`: returns a vector of the same length as `x` giving the positions in `y` of the elements of `x` (NA otherwise)
- `which(x == a)`: returns a vector of the indices of `x` for which the comparison is true (TRUE), in this example the values of i for which `x[i] == a` (the argument of this function must be a variable of mode logical)
- `choose(n, k)`: computes the combinations of k events among n repetitions = n!/[(n-k)!k!]
- `na.omit(x)`: suppresses the observations with missing data (NA) (suppresses the corresponding line if `x` is a matrix or a data frame)
- `na.fail(x)`: returns an error message if `x` contains at least one NA
- `unique(x)`: if `x` is a vector or a data frame, returns a similar object but with the duplicate elements suppressed
- `table(x)`: returns a table with the numbers of the different values of `x` (typically for integers or factors)
- `subset(x, ...)`: returns a selection of `x` with respect to criteria (..., typically comparisons: `x$V1 < 10`); if `x` is a data frame, the option `select` gives the variables to be kept or dropped using a minus sign
- `sample(x, size)`: resample randomly and without replacement `size` elements in the vector `x`; the option `replace = TRUE` allows resampling with replacement
- `prop.table(x, margin=)`: table entries as fraction of marginal table

### Math

- `sin`, `cos`, `tan`, `asin`, `acos`, `atan`, `atan2`, `log`, `log10`, `exp`
- `max(x)`: maximum of the elements of `x`
- `min(x)`: minimum of the elements of `x`
- `range(x)`: the same as `c(min(x), max(x))`
- `sum(x)`: sum of the elements of `x`
- `diff(x)`: lagged and iterated differences of vector `x`
- `prod(x)`: product of the elements of `x`
- `mean(x)`: mean of the elements of `x`
- `median(x)`: median of the elements of `x`
- `quantile(x, probs=)`: sample quantiles corresponding to the given probabilities (defaults to 0, .25, .5, .75, 1)
- `weighted.mean(x, w)`: mean of `x` with weights `w`
- `rank(x)`: ranks of the elements of `x`
- `var(x)` or `cov(x)`: variance of the elements of `x` (calculated on n-1); if `x` is a matrix or a data frame, the variance-covariance matrix is calculated
- `sd(x)`: standard deviation of `x`
- `cor(x)`: correlation matrix of `x` if it is a matrix or a data frame (1 if `x` is a vector)
- `var(x, y)` or `cov(x, y)`: covariance between `x` and `y`, or between the columns of `x` and those of `y` if they are matrices or data frames
- `cor(x, y)`: linear correlation between `x` and `y`, or correlation matrix if they are matrices or data frames
- `round(x, n)`: rounds the elements of `x` to n decimals
- `log(x, base)`: computes the logarithm of `x` with base `base`
- `scale(x)`: if `x` is a matrix, centers and scales the data; to center only use the option `scale=FALSE`, to scale only `center=FALSE` (by default `center=TRUE, scale=TRUE`)
- `pmin(x, y, ...)`: a vector whose ith element is the minimum of `x[i]`, `y[i]`, ...
- `pmax(x, y, ...)`: the same for the maximum
- `cumsum(x)`: a vector whose ith element is the sum from `x[1]` to `x[i]`
- `cumprod(x)`: the same for the product
- `cummin(x)`: the same for the minimum
- `cummax(x)`: the same for the maximum
- `union(x, y)`, `intersect(x, y)`, `setdiff(x, y)`, `setequal(x, y)`, `is.element(el, set)`: "set" functions
- `Re(x)`: real part of a complex number
- `Im(x)`: imaginary part
- `Mod(x)`: modulus; `abs(x)` is the same
- `Arg(x)`: angle in radians of the complex number
- `Conj(x)`: complex conjugate
- `convolve(x, y)`: compute the several kinds of convolutions of two sequences
- `fft(x)`: Fast Fourier Transform of an array
- `mvfft(x)`: FFT of each column of a matrix
- `filter(x, filter)`: applies linear filtering to a univariate time series or to each series separately of a multivariate time series

Many math functions have a logical parameter `na.rm=FALSE` to specify missing data (NA) removal.
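As a brief illustration of the `na.rm` parameter and a few of the cumulative and element-wise functions from the Math section, here is a minimal sketch. The vectors `x` and `y` are made-up data, not taken from the reference card.

```r
# Small numeric vector with one missing value (illustrative data).
x <- c(4, 8, NA, 15, 16)

# Without na.rm, a single NA propagates to the result.
mean(x)                      # NA

# With na.rm = TRUE the missing value is dropped first.
mean(x, na.rm = TRUE)        # 10.75
range(x, na.rm = TRUE)       # 4 16

# Cumulative and element-wise functions on complete data.
y <- c(1, 2, 3, 4)
cumsum(y)                    # 1 3 6 10
cumprod(y)                   # 1 2 6 24
pmin(y, c(2, 1, 5, 0))       # 1 1 3 0

# With no probs argument, quantile() returns the minimum,
# the three quartiles, and the maximum.
quantile(y)
```

The default `na.rm = FALSE` is deliberately conservative: a missing value makes the summary missing too, so dropping data is always an explicit choice.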
### Matrices

- `t(x)`: transpose
- `diag(x)`: diagonal
- `%*%`: matrix multiplication
- `solve(a, b)`: solves `a %*% x = b` for `x`
- `solve(a)`: matrix inverse of `a`
- `rowSums(x)`: sum of each row for a matrix-like object (`rowsum(x, group)` instead sums rows within groups)
- `colSums(x)`: the same for columns
- `rowMeans(x)`: fast version of row means
- `colMeans(x)`: the same for columns

### Advanced data processing

- `apply(X, INDEX, FUN=)`: a vector or array or list of values obtained by applying a function `FUN` to margins (`INDEX`) of `X`
- `lapply(X, FUN)`: apply `FUN` to each element of the list `X`
- `tapply(X, INDEX, FUN=)`: apply `FUN` to each cell of a ragged array given by `X` with indexes `INDEX`
- `by(data, INDEX, FUN)`: apply `FUN` to data frame `data` subsetted by `INDEX`
- `ave(x, ..., FUN=mean)`: subsets of `x` are averaged (or other function specified by `FUN`), where each subset consists of those observations with the same factor levels
- `merge(a, b)`: merge two data frames by common columns or row names
- `xtabs(a ~ b, data=x)`: a contingency table from cross-classifying factors
- `aggregate(x, by, FUN)`: splits the data frame `x` into subsets, computes summary statistics for each, and returns the result in a convenient form; `by` is a list of grouping elements, each as long as the variables in `x`
- `stack(x, ...)`: transform data available as separate columns in a data frame or list into a single column
- `unstack(x, ...)`: inverse of `stack()`
- `reshape(x, ...)`: reshapes a data frame between "wide" format, with repeated measurements in separate columns of the same record, and "long" format, with the repeated measurements in separate records; use `direction="wide"` or `direction="long"`

### Strings

- `paste(...)`: concatenate vectors after converting to character; `sep=` is the string to separate terms (a single space is the default); `collapse=` is an optional string to separate "collapsed" results
- `substr(x, start, stop)`: substrings in a character vector; can also assign, as `substr(x, start, stop) <- value`
- `strsplit(x, split)`: split `x` according to the substring `split`
- `grep(pattern, x)`: searches for matches to `pattern` within `x`; see `?regex`
- `gsub(pattern, replacement, x)`: replacement of matches determined by regular expression matching; `sub()` is the same but only replaces the first occurrence
- `tolower(x)`: convert to lowercase
- `toupper(x)`: convert to uppercase
- `match(x, table)`: a vector of the positions of first matches for the elements of `x` among `table`
- `x %in% table`: the same but returns a logical vector
- `pmatch(x, table)`: partial matches for the elements of `x` among `table`
- `nchar(x)`: number of characters

### Dates and times

The class `Date` has dates without times. `POSIXct` has dates and times, including time zones. Comparisons (e.g. `>`), `seq()`, and `difftime()` are useful. `Date` also allows `+` and `-`. `?DateTimeClasses` gives more information. See also package chron.

- `as.Date(s)` and `as.POSIXct(s)`: convert to the respective class; `format(dt)` converts to a string representation. The default string format is "2001-02-21". These accept a second argument to specify a format for conversion.

Some common formats are:

- `%a`, `%A`: abbreviated and full weekday name
- `%b`, `%B`: abbreviated and full month name
- `%d`: day of the month (01-31)
- `%H`: hours (00-23)
- `%I`: hours (01-12)
- `%j`: day of year (001-366)
- `%m`: month (01-12)
- `%M`: minute (00-59)
- `%p`: AM/PM indicator
- `%S`: second as decimal number (00-61)
- `%U`: week (00-53); the first Sunday as day 1 of week 1
- `%w`: weekday (0-6, Sunday is 0)
- `%W`: week (00-53); the first Monday as day 1 of week 1
- `%y`: year without century (00-99). Don't use.
- `%Y`: year with century
- `%z`: (output only) offset from Greenwich; -0800 is 8 hours west of Greenwich
- `%Z`: (output only) time zone as a character string (empty if not available)

Where leading zeros are shown they will be used on output but are optional on input. See `?strftime`.

### Graphics devices

- `x11()`, `windows()`: open a graphics window
- `postscript(file)`: starts the graphics device driver for producing PostScript graphics; use `horizontal = FALSE, onefile = FALSE, paper = "special"` for EPS files; `family=` specifies the font (AvantGarde, Bookman, Courier, Helvetica, Helvetica-Narrow, NewCenturySchoolbook, Palatino, Times, or ComputerModern); `width=` and `height=` specify the size of the region in inches (for `paper="special"`, these specify the paper size)
- `ps.options()`: set and view (if called without arguments) default values for the arguments to `postscript`
- `pdf`, `png`, `jpeg`, `bitmap`, `xfig`, `pictex`: see `?Devices`
- `dev.off()`: shuts down the specified (default is the current) graphics device; see also `dev.cur`, `dev.set`

### Plotting

- `plot(x)`: plot of the values of `x` (on the y-axis) ordered on the x-axis
- `plot(x, y)`: bivariate plot of `x` (on the x-axis) and `y` (on the y-axis)
- `hist(x)`: histogram of the frequencies of `x`
- `barplot(x)`: bar plot of the values of `x`; use `horiz=TRUE` for horizontal bars
- `dotchart(x)`: if `x` is a data frame, plots a Cleveland dot plot (stacked plots line-by-line and column-by-column)
- `pie(x)`: circular pie chart
- `boxplot(x)`: "box-and-whiskers" plot
- `sunflowerplot(x, y)`: the same as `plot()` but the points with similar coordinates are drawn as flowers whose petal number represents the number of points
- `stripchart(x)`: plot of the values of `x` on a line (an alternative to `boxplot()` for small sample sizes)
- `coplot(x~y | z)`: bivariate plot of `x` and `y` for each value or interval of values of `z`
- `interaction.plot(f1, f2, y)`: if `f1` and `f2` are factors, plots the means of `y` (on the y-axis) with respect to the values of `f1` (on the x-axis) and of `f2` (different curves); the option `fun` allows choosing the summary statistic of `y` (by default `fun=mean`)
- `matplot(x, y)`: bivariate plot of the first column of `x` vs. the first one of `y`, the second one of `x` vs. the second one of `y`, etc.
- `fourfoldplot(x)`: visualizes, with quarters of circles, the association between two dichotomous variables for different populations (`x` must be an array with `dim=c(2, 2, k)`, or a matrix with `dim=c(2, 2)` if k = 1)
- `assocplot(x)`: Cohen-Friendly graph showing the deviations from independence of rows and columns in a two-dimensional contingency table
- `mosaicplot(x)`: "mosaic" graph of the residuals from a log-linear regression of a contingency table
- `pairs(x)`: if `x` is a matrix or a data frame, draws all possible bivariate plots between the columns of `x`
- `plot.ts(x)`: if `x` is an object of class "ts", plot of `x` with respect to time; `x` may be multivariate but the series must have the same frequency and dates
- `ts.plot(x)`: the same, but if `x` is multivariate the series may have different dates and must have the same frequency
- `qqnorm(x)`: quantiles of `x` with respect to the values expected under a normal law
- `qqplot(x, y)`: quantiles of `y` with respect to the quantiles of `x`
- `contour(x, y, z)`: contour plot (data are interpolated to draw the curves); `x` and `y` must be vectors and `z` must be a matrix so that `dim(z)=c(length(x), length(y))` (`x` and `y` may be omitted)
- `filled.contour(x, y, z)`: the same, but the areas between the contours are coloured, and a legend of the colours is drawn as well
- `image(x, y, z)`: the same but with colours (actual data are plotted)
- `persp(x, y, z)`: the same but in perspective (actual data are plotted)
- `stars(x)`: if `x` is a matrix or a data frame, draws a graph with segments or a star where each row of `x` is represented by a star and the columns are the lengths of the segments
- `symbols(x, y, ...)`: draws, at the coordinates given by `x` and `y`, symbols (circles, squares, rectangles, stars, thermometers, or "boxplots") whose sizes, colours, etc. are specified by supplementary arguments
- `termplot(mod.obj)`: plot of the (partial) effects of a regression model (`mod.obj`)

The following parameters are common to many plotting functions:

- `add=FALSE`: if `TRUE`, superposes the plot on the previous one (if it exists)
- `axes=TRUE`: if `FALSE`, does not draw the axes and the box
- `type="p"`: specifies the type of plot: `"p"` points, `"l"` lines, `"b"` points connected by lines, `"o"` the same but the lines are over the points, `"h"` vertical lines, `"s"` steps, the data are represented by the top of the vertical lines, `"S"` the same but the data are represented by the bottom of the vertical lines
- `xlim=`, `ylim=`: specifies the lower and upper limits of the axes, for example with `xlim=c(1, 10)` or `xlim=range(x)`
- `xlab=`, `ylab=`: annotates the axes; must be variables of mode character
- `main=`: main title; must be a variable of mode character
- `sub=`: subtitle (written in a smaller font)

### Low-level plotting commands

- `points(x, y)`: adds points (the option `type=` can be used)
- `lines(x, y)`: the same but with lines
- `text(x, y, labels, ...)`: adds text given by `labels` at coordinates (x, y); a typical use is `plot(x, y, type="n"); text(x, y, names)`
- `mtext(text, side=3, line=0, ...)`: adds text given by `text` in the margin specified by `side` (see `axis()` below); `line` specifies the line from the plotting area
- `segments(x0, y0, x1, y1)`: draws lines from points (x0, y0) to points (x1, y1)
- `arrows(x0, y0, x1, y1, angle=30, code=2)`: the same with arrows at points (x0, y0) if `code=1`, at points (x1, y1) if `code=2`, or both if `code=3`; `angle` controls the angle from the shaft of the arrow to the edge of the arrow head
- `abline(a, b)`: draws a line of slope `b` and intercept `a`
- `abline(h=y)`: draws a horizontal line at ordinate `y`
- `abline(v=x)`: draws a vertical line at abscissa `x`
- `abline(lm.obj)`: draws the regression line given by `lm.obj`
- `rect(x1, y1, x2, y2)`: draws a rectangle whose left, right, bottom, and top limits are `x1`, `x2`, `y1`, and `y2`, respectively
- `polygon(x, y)`: draws a polygon linking the points with coordinates given by `x` and `y`
- `legend(x, y, legend)`: adds the legend at the point (x, y) with the symbols given by `legend`
- `title()`: adds a title and optionally a subtitle
- `axis(side)`: adds an axis at the bottom (`side=1`), on the left (2), at the top (3), or on the right (4); `at=vect` (optional) gives the abscissae (or ordinates) where tick-marks are drawn
- `box()`: draws a box around the current plot
- `rug(x)`: draws the data `x` on the x-axis as small vertical lines
- `locator(n, type="n", ...)`: returns the coordinates (x, y) after the user has clicked n times on the plot with the mouse; also draws symbols (`type="p"`) or lines (`type="l"`) with respect to optional graphic parameters (...); by default nothing is drawn (`type="n"`)

### Graphical parameters

These can be set globally with `par(...)`; many can be passed as parameters to plotting commands.

- `adj`: controls text justification (0 left-justified, 0.5 centred, 1 right-justified)
- `bg`: specifies the colour of the background (e.g. `bg="red"`, `bg="blue"`; the list of the 657 available colours is displayed with `colors()`)
- `bty`: controls the type of box drawn around the plot; allowed values are `"o"`, `"l"`, `"7"`, `"c"`, `"u"`, or `"]"` (the box looks like the corresponding character); if `bty="n"` the box is not drawn
- `cex`: a value controlling the size of texts and symbols with respect to the default; the following parameters have the same control for numbers on the axes, `cex.axis`, the axis labels, `cex.lab`, the title, `cex.main`, and the subtitle, `cex.sub`
- `col`: controls the color of symbols and lines; use color names (`"red"`, `"blue"`; see `colors()`), or see `rgb()`, `hsv()`, `gray()`, and `rainbow()`; as for `cex` there are `col.axis`, `col.lab`, `col.main`, `col.sub`
- `font`: an integer which controls the style of text (1 normal, 2 bold, 3 italics, 4 bold italics); as for `cex` there are `font.axis`, `font.lab`, `font.main`, `font.sub`
- `las`: an integer which controls the orientation of the axis labels (0 parallel to the axes, 1 horizontal, 2 perpendicular to the axes, 3 vertical)
- `lty`: controls the type of lines; can be an integer or string (1 `"solid"`, 2 `"dashed"`, 3 `"dotted"`, 4 `"dotdash"`, 5 `"longdash"`, 6 `"twodash"`), or a string of up to eight characters (between "0" and "9") which specifies alternately the length, in points or pixels, of the drawn elements and the blanks; for example `lty="44"` will have the same effect as `lty=2`
- `lwd`: a numeric which controls the width of lines, default 1
- `mar`: a vector of 4 numeric values controlling the margins, in the form `c(bottom, left, top, right)`
- `tcl`: a value which specifies the length of tick-marks on the axes as a fraction of the height of a line of text (by default `tcl=-0.5`)
- `xaxs`, `yaxs`: style of axis interval calculation; default `"r"` for an extra space; `"i"` for no extra space
- `xaxt`: if `xaxt="n"` the x-axis is set but not drawn (useful in conjunction with `axis(side=1, ...)`)
- `yaxt`: if `yaxt="n"` the y-axis is set but not drawn (useful in conjunction with `axis(side=2, ...)`)

### Lattice (Trellis) graphics

- `xyplot(y~x)`: bivariate plots (with many functionalities)
- `barchart(y~x)`: histogram of the values of `y` with respect to those of `x`
- `dotplot(y~x)`: Cleveland dot plot (stacked plots line-by-line and column-by-column)
- `densityplot(~x)`: density functions plot
- `qqmath(~x)`: quantiles of `x` with respect to the values expected under a theoretical distribution
- `stripplot(y~x)`: single dimension plot; `x` must be numeric, `y` may be a factor
- `qq(y~x)`: quantiles to compare two distributions; `x` must be numeric, `y` may be numeric, character, or factor but must have two "levels"
- `splom(~x)`: matrix of bivariate plots
- `parallel(~x)`: parallel coordinates plot
- `levelplot(z~x*y|g1*g2)`: coloured plot of the values of `z` at the coordinates given by `x` and `y` (`x`, `y`, and `z` are all of the same length)
- `wireframe(z~x*y|g1*g2)`: 3d surface plot
- `cloud(z~x*y|g1*g2)`: 3d scatter plot

In the normal Lattice formula, `y~x|g1*g2` has combinations of optional conditioning variables `g1` and `g2` plotted on separate panels. Lattice functions take many of the same arguments as base graphics, plus also `data=` the data frame for the formula variables and `subset=` for subsetting. Use `panel=` to define a custom panel function (see `apropos("panel")` and `?llines`). Lattice functions return an object of class trellis and have to be printed to produce the graph. Use `print(xyplot(...))` inside functions where automatic printing doesn't work. Use `lattice.theme` and `lset` to change Lattice defaults.

### Optimization and model fitting

- `optim(par, fn, method = c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN"))`: general-purpose optimization; `par` is initial values, `fn` is the function to optimize (normally minimize)
- `nlm(f, p)`: minimize function `f` using a Newton-type algorithm with starting values `p`
- `lm(formula)`: fit linear models; `formula` is typically of the form `response ~ termA + termB + ...`; use `I(x*y) + I(x^2)` for terms made of nonlinear components
- `glm(formula, family=)`: fit generalized linear models, specified by giving a symbolic description of the linear predictor and a description of the error distribution; `family` is a description of the error distribution and link function to be used in the model; see `?family`
- `nls(formula)`: nonlinear least-squares estimates of the nonlinear model parameters
- `approx(x, y=)`: linearly interpolate given data points; `x` can be an xy plotting structure
- `spline(x, y=)`: cubic spline interpolation
- `loess(formula)`: fit a polynomial surface using local fitting

Many of the formula-based modeling functions have several common arguments: `data=` the data frame for the formula variables, `subset=` a subset of variables used in the fit, `na.action=` action for missing values: `"na.fail"`, `"na.omit"`, or a function.

The following generics often apply to model fitting functions:

- `predict(fit, ...)`: predictions from `fit` based on input data
- `df.residual(fit)`: returns the number of residual degrees of freedom
- `coef(fit)`: returns the estimated coefficients (sometimes with their standard errors)
- `residuals(fit)`: returns the residuals
- `deviance(fit)`: returns the deviance
- `fitted(fit)`: returns the fitted values
- `logLik(fit)`: computes the logarithm of the likelihood and the number of parameters
- `AIC(fit)`: computes the Akaike information criterion or AIC

### Statistics

- `aov(formula)`: analysis of variance model
- `anova(fit, ...)`: analysis of variance (or deviance) tables for one or more fitted model objects
- `density(x)`: kernel density estimates of `x`
- `binom.test()`, `pairwise.t.test()`, `power.t.test()`, `prop.test()`, `t.test()`, ...: use `help.search("test")`

### Distributions

- `rnorm(n, mean=0, sd=1)`: Gaussian (normal)
- `rexp(n, rate=1)`: exponential
- `rgamma(n, shape, scale=1)`: gamma
- `rpois(n, lambda)`: Poisson
- `rweibull(n, shape, scale=1)`: Weibull
- `rcauchy(n, location=0, scale=1)`: Cauchy
- `rbeta(n, shape1, shape2)`: beta
- `rt(n, df)`: Student (t)
- `rf(n, df1, df2)`: Fisher-Snedecor (F)
- `rchisq(n, df)`: Pearson (χ²)
- `rbinom(n, size, prob)`: binomial
- `rgeom(n, prob)`: geometric
- `rhyper(nn, m, n, k)`: hypergeometric
- `rlogis(n, location=0, scale=1)`: logistic
- `rlnorm(n, meanlog=0, sdlog=1)`: lognormal
- `rnbinom(n, size, prob)`: negative binomial
- `runif(n, min=0, max=1)`: uniform
- `rwilcox(nn, m, n)`, `rsignrank(nn, n)`: Wilcoxon's statistics

All these functions can be used by replacing the letter `r` with `d`, `p`, or `q` to get, respectively, the probability density (`dfunc(x, ...)`), the cumulative probability density (`pfunc(x, ...)`), and the value of the quantile (`qfunc(p, ...)`, with 0 < p < 1).

### Programming

- `function(arglist) expr`: function definition
- `return(value)`
- `if(cond) expr`
- `if(cond) cons.expr else alt.expr`
- `for(var in seq) expr`
- `while(cond) expr`
- `repeat expr`
- `break`
- `next`
- Use braces `{}` around statements
- `ifelse(test, yes, no)`: a value with the same shape as `test` filled with elements from either `yes` or `no`
- `do.call(funname, args)`: executes a function call from the name of the function and a list of arguments to be passed to it

### Rpad utilities

- `RpadURL(filename)`: returns the URL for the given filename
- `RpadBaseURL(filename)`: returns the base URL for the given filename
- `RpadBaseFile(filename)`: returns the file name relative to the base R directory
- `RpadIsLocal()`: returns `TRUE` if run locally rather than the client-server version

### Rpad HTML utilities

- `HTML(x)`: outputs an HTML representation of an object; uses package R2HTML
- `HTMLon()`: turn on HTML mode (the default is text)
- `HTMLoff()`: turn off HTML mode
- `HTMLtag(tagName, ...)`, `HTMLetag(tagName)`: make starting and ending tags for `tagName` with elements ... as tag parameters
- `HTMLh1(text)`, `HTMLh2(text)`: headings H1, H2
- `HTMLargs(...)`: a string with the arguments as `"a='arg1' b='arg2'"`
- [...]: a radio input with an R variable name `variableName`; `commonName` links it with other radio elements; `text` specifies an [...]
- `HTMLcheckbox(name, text, checked=FALSE, ...)`: a checkbox input with an R variable name `name`
- `HTMLselect(name, text, default=1, size=1, ...)`: a select box with an R variable name `name` with the options `text`
- [...] (`name`, `value`, `rpadType="Rvariable"`): an input with R variable name `name` and default `value`; `rpadType` can be `"Rstring"` or `"Rvariable"`
- `HTMLlink(url, text, ...)`: a link wrapped around `text`
- `HTMLimg(filename, ...)`: an img
- `HTMLembed(filename, width=600, height=600, ...)`: an embed, useful for pdf or svg

Most of these return a character string, and automatic printing sends the string to the output with the effect that the HTML is displayed in Rpad.

### Rpad plotting utilities

- `newgraph(name="", ...)`: sets up the graphics device; not needed unless you want to change parameters
- `showgraph()`: generates the HTML to show the graph and runs `newgraph()` to advance to the next graphics file; `link=TRUE` creates a link to the EPS file
- `graphoptions(...)`: changes the defaults for subsequent graphs
- `RpadPlotName()`: returns the name of the currently active plot

`newgraph` and `graphoptions` have the following options (with the defaults given): `type="pngalpha"`, `res` [...], `width=3.5`, `height` [...], `pointsize=10`, `sublines=0`, `toplines=.6`, `ratio=4/3`, `leftlines=0`, [...]`=.6`.
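The `r`/`d`/`p`/`q` naming convention described under Distributions can be sketched with the normal and binomial families. This is only an illustration; the variable names (`draws`, `d0`, `p196`, `q975`) and the seed are arbitrary choices, not part of the reference card.

```r
set.seed(1)  # fixed seed so the random draws are reproducible

# r: draw 5 random values from a standard normal distribution.
draws <- rnorm(5, mean = 0, sd = 1)

# d: density of the standard normal at 0, i.e. 1 / sqrt(2 * pi).
d0 <- dnorm(0)

# p: cumulative probability P(X <= 1.96), roughly 0.975.
p196 <- pnorm(1.96)

# q: the quantile function inverts p, so pnorm(q975) is 0.975.
q975 <- qnorm(0.975)

# The same prefixes work for discrete distributions, e.g. the binomial.
dbinom(2, size = 4, prob = 0.5)   # P(X = 2)  = 6/16  = 0.375
pbinom(2, size = 4, prob = 0.5)   # P(X <= 2) = 11/16 = 0.6875
```

For a course in introductory probability, the `p` and `q` forms are usually the most useful: they replace lookups in printed distribution tables.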

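The model-fitting generics listed under Optimization and model fitting (`coef`, `fitted`, `residuals`, `df.residual`, `predict`) can be tried on a tiny linear model. The data frame `df` below is hypothetical and constructed so that `y` is exactly `2 + 3 * x`, which makes the results easy to check by hand; real data would of course leave nonzero residuals.

```r
# Illustrative data: y is an exact linear function of x.
df <- data.frame(x = 1:5)
df$y <- 2 + 3 * df$x

# Fit a linear model; data= supplies the variables named in the formula.
fit <- lm(y ~ x, data = df)

coef(fit)          # intercept 2 and slope 3 (up to floating point)
fitted(fit)        # fitted values, equal to df$y here
residuals(fit)     # essentially zero for this exact fit
df.residual(fit)   # 5 observations - 2 coefficients = 3

# Predict at a new x value; newdata must use the same column name.
pred <- predict(fit, newdata = data.frame(x = 6))   # 2 + 3 * 6 = 20
```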

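The conversion formats listed under Dates and times can be tried out directly. The sketch below sticks to numeric format codes so that the results do not depend on the locale; the particular dates are arbitrary examples.

```r
# Parse a date given in the default "YYYY-MM-DD" format.
d <- as.Date("2001-02-21")

# Convert back to text using explicit format codes.
format(d, "%d/%m/%Y")   # "21/02/2001"
format(d, "%j")         # "052": day of year, 31 (January) + 21

# Parse a non-default layout by passing the format as second argument.
d2 <- as.Date("03/15/2001", format = "%m/%d/%Y")

# Dates support subtraction and addition of days.
d2 - d                  # Time difference of 22 days
d + 7                   # "2001-02-28"

# seq() generates regularly spaced dates.
seq(d, by = "week", length.out = 3)
```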