This 9 page Class Notes was uploaded by Dayne Reinger on Thursday October 15, 2015. The Class Notes belongs to STAT269 at Messiah College taught by SamuelWilcock in Fall. Since its upload, it has received 54 views. For similar materials see /class/223501/stat269-messiah-college in Statistics at Messiah College.
Date Created: 10/15/15
STAT 269 Introductory Statistics
Confidence Intervals

1. Parameter: In each of the intervals that we will look at, the interest is in finding a reliable estimate of some parameter. The first step in our process will be to clearly define what the parameter represents in the particular situation we are considering.
2. Assumptions: As with hypothesis testing, each method we discuss will make certain assumptions about the way we collected our data and the nature of the population, or populations, from which they came. These assumptions are just as important here as they are in hypothesis testing.
3. Point Estimate: Our intervals will each start with the most reasonable estimate we have, which will generally be a statistic calculated from our data that summarizes for the data the same characteristic that the parameter measures for the population.
4. Margin of Error: From our assumptions, we will use our data to determine how far out on each side of the point estimate we need to go to account for the variability, or error, that is a result of having only a sample of data, and not the whole population. In most cases the bound is the same value in both directions, but this is not always the case.
5. Confidence Interval: The interval is found by subtracting the lower margin of error from the point estimate and adding the upper margin of error to the point estimate. We then have a certain confidence that the true value of the parameter is within these values.
6. Conclusion: Once again, we will be careful to state the conclusion in terms of the problem. In the conclusion we will give the bounds, with the units of measurement, and state our level of confidence.

Confidence Interval Details for $\mu$ (large sample)

1. Parameter: $\mu$ is the mean ____ for all ____.
2. Assumptions: We have independent, random observations from some population, and the sample size is large enough that we can use the Central Limit Theorem.
3. Point Estimate: $\bar{x}$
4. Margin of Error: $z \cdot \frac{\sigma}{\sqrt{n}}$
5. Confidence Interval: (3) - (4) and (3) + (4)
6. Conclusion: We are ____ confident that the mean ____ for all ____ is between
(lower bound) and (upper bound).

Confidence Interval Details for $\mu$ (small sample)

1. Parameter: $\mu$ is the mean ____ for all ____.
2. Assumptions: We have independent, random observations from a normally distributed population, with unknown variance.
3. Point Estimate: $\bar{x}$
4. Margin of Error: $t \cdot \frac{s}{\sqrt{n}}$
5. Confidence Interval: (3) - (4) and (3) + (4)
6. Conclusion: We are ____ confident that the mean ____ for all ____ is between (lower bound) and (upper bound).

Confidence Interval Details for $p$

1. Parameter: $p$ is the proportion of all ____ that ____.
2. Assumptions: We have independent, random observations from a binomial experiment, and there are enough trials that we can use the Central Limit Theorem.
3. Point Estimate: $\hat{p}$
4. Margin of Error: $z \cdot \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
5. Confidence Interval: (3) - (4) and (3) + (4)
6. Conclusion: We are ____ confident that the proportion of all ____ that ____ is between (lower bound) and (upper bound).

NAME
STAT 269 Introductory Statistics
Key Concepts from Chapters 3 & 4

• Probability Experiment: an action through which results are obtained, with various possible results.
• Outcome: one of the possible results.
• Sample Space: all possible outcomes.
• Event: a collection of one or more possible outcomes.
• Probability: we will think of probabilities as closely related to the idea of relative frequency; that is, a probability expresses the long-term proportion of the time some event would occur if the probability experiment were run over and over again, even if this is not actually possible.
• Law of Large Numbers: as our sample size gets larger, our sample relative frequencies will become better and better approximations of the true probabilities.
• Limits: all probabilities must be between 0 and 1.
• Interpreting: a probability of 0 means an event is impossible, while 1 means it is certain.
• Complement: the complement of an event is the set of all outcomes not in the original event; thus the probability of an event and its complement must add up to 1.
• Independence: two events are independent if knowledge of whether one occurs does not affect the probability that the other occurs; if this is the case,
the probability that both events occur is found by multiplying the probabilities of each individual event.
• Mutually Exclusive: two events are called mutually exclusive, or disjoint, if they cannot occur together.
• Random Variable: a number whose value is determined by some random process whose behavior can be modelled using probabilities.
• Types of Distributions: random variables are generally broken down into discrete and continuous types.
• Features of Discrete Distribution: a function called a probability mass function (pmf) assigns probability to each of the possible values, such that the assigned probabilities add up to 1 when summed over all possible values.
• Common examples: common discrete distributions include the geometric, the hypergeometric, the Poisson, the Bernoulli, and the binomial, each of which is actually a collection of related distributions called a family; we will only comment on the binomial family here.
• Traits of a Binomial Distribution:
  - There are a fixed number of trials.
  - The trials are independent of each other.
  - There are two possible outcomes for each trial.
  - The random variable is the count of one of these outcomes (we'll call the outcome we're counting a success).
  - The probability of success is the same on each trial.

STAT 269 Introductory Statistics
Graphical Displays

Stem-and-leaf Plot
• Choose a spot to split the data into two pieces. The first part of each number is the stem, the remaining piece is the leaf.
• Make a column of all the possible stems, from the lowest that occurs in your data to the highest. Include all values, even if some do not occur in your data.
• Draw a vertical line next to the column of stems.
• Go through the dataset and attach the leaf for each observation to the right of the vertical line on the row for its stem. If the leaves are more than one digit, leave a space between leaves on any branch.
• After all the leaves are attached, it may be helpful to rewrite the plot with the leaves on each branch ordered from smallest to largest.
• If the
plot looks too compact, you may adjust the number of branches per stem from 1 to 2, 5, or 10.
• If the plot looks too spread out, split the data one digit further to the left, and redraw the plot with the new stems.
• If needed, you may use a 0 stem for numbers that would only have leaves, or add a "0" to whole numbers to be used as leaves.

Dotplot
• Draw a numberline that covers the range of the data.
• Divide the numberline up evenly.
• Place a dot above the numberline for each observation.
• If a particular value appears multiple times, stack the dots on top of each other. Keep the spacing even, so that two "stacks" that have three dots are the same height.
• NOTE: This method works best with discrete data.

Frequency Table
1. Choose the number of categories to use. The following is a general guide that we will use: if n is between 16 and 31 we will generally use 5 categories, and between 32 and 63, we will use 6. Larger than that we would not do by hand, and smaller will not be interesting.
2. Find the range, that is, the largest observation minus the smallest.
3. Divide the range by the number of classes, and truncate the answer to the same accuracy as the data. Then add one to the last digit of this number.
4. Start the first category 1/2 unit below the first (smallest) observation.
5. Complete the categories by adding the number found in step three to the number from step four to get the right boundary of the first category. Continue adding to complete the necessary number of categories.
6. Make a table with headings for the boundaries, the frequency, the relative frequency, the cumulative frequency, and the cumulative relative frequency.
• Frequency: Simply the count of the number of observations in the category.
• Relative Frequency: The frequency divided by the total number of observations, that is, the proportion of the data that is in the category. This may also be thought of as the size of the category relative to the whole data set.
• Cumulative Frequency: The count of the amount of data in this
category and any previous category on the table. This is essentially accumulating the observations as we move from smaller values to bigger ones.
• Cumulative Relative Frequency: The cumulative frequency divided by the sample size, that is, the proportion of the data that has been seen so far. This may also be calculated by adding up the values in the relative frequency column, but this is more likely to be affected by rounding error.

Histogram
• A Frequency Table with at least the columns for frequency and relative frequency must be constructed to be able to construct a histogram.
• Draw a set of axes and indicate the category boundaries along the x axis. The y axis is generally used to indicate the relative frequency.
• Bars are drawn for each category from one boundary to the other, with height determined by the relative frequency.

Boxplot
1. Find the median and both quartiles.
2. Find the interquartile range (iqr).
3. Find $L_1 = 1.5 \cdot iqr$ and $L_2 = 3 \cdot iqr$.
4. Find the following:
   Inner Fences: $f_1 = x_{.25} - L_1$ and $f_3 = x_{.75} + L_1$
   Outer Fences: $F_1 = x_{.25} - L_2$ and $F_3 = x_{.75} + L_2$
5. Find the adjacent values: $a_1$ is the smallest value in the dataset greater than or equal to $f_1$, and $a_3$ is the largest value less than or equal to $f_3$.
6. Draw a numberline covering the range of the data and locate $a_1$, $x_{.25}$, $x_{.5}$, $x_{.75}$, and $a_3$.
7. Draw a box from $x_{.25}$ to $x_{.75}$ with a line at the median.
8. Draw "whiskers" from each side of the box to the adjacent value.
9. Points between the inner and outer fences are indicated by a closed circle.
10. Points past the outer fence are indicated by an open circle.

STAT 269 Introductory Statistics
Regression and Correlation Examples

A realtor in a suburban area would like to be able to estimate the price of a house based on the square feet of living area, so that home buyers have a rough idea of what they may be able to afford. She randomly selects eight currently listed houses and obtains the square feet of living space and the asking price. The table below displays the data in hundreds of square feet and thousands of dollars.

Living Space    15   33   23   16   16   13   20   24
Selling Price  145  223  150  130  160  114  142  265

The summary values for this data set are: $S_{xx}$ 451375, $S_{yy}$ 139495, $S_{xy}$ 209725
• Find the correlation and fit a regression line to the data.
• Given the following, using a 0.05 level test, would you conclude that the amount of living space a house has helps to predict the selling price?
  3. Rejection Region: Reject $H_0$ if TS > 5.9874
  4. Test Statistic: TS = 6.337
  5. P-Value: P = 0.045

The Quick Sell car dealership has been using 1-minute spot ads on a local TV station. The ads always occur during the evening hours and advertise the different models and price ranges of cars on the lot that week. During a 10-week period the Quick Sell dealer kept a weekly record of the number of TV ads versus the number of cars sold. The results are given in the following table.

Ads Bought   6  20   0  14  25  16  28  18  10   8
Cars Sold   15  31  10  16  28  20  40  25  12  15

The summary values for this data set are: $S_{xx} = 682.5$, $S_{yy} = 825.6$, $S_{xy} = 690.0$
• Find the correlation and fit a regression line to the data.
• Given the following, using a 0.01 level test, would you conclude that the number of ads bought helps to predict the number of cars sold? Also, if the manager decides that they can only afford 12 spots per week, predict the number of cars they should expect to sell in an average week.
  3. Rejection Region: Reject $H_0$ if TS > 11.2586
  4. Test Statistic: TS = 43.60
  5. P-Value: P ≈ 0

NAME
STAT 269 Introductory Statistics
Basic Definitions

• Data: pieces of information to which meaning has been attached.
• Statistics: a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data.
• Basic Terms:
  - Population: the complete set of all individuals or units that are of interest in the study.
  - Sample: the subset of the population that is actually studied.
• Two Key Types of Numbers:
  - Parameter: a numerical measurement, usually unknown, describing some characteristic of a population.
  - Statistic: a numerical measurement describing some
characteristic of a sample.
• Branches of statistics:
  - Experimental Design: the branch of statistics that deals with planning experiments and obtaining data.
  - Descriptive Statistics: the branch of statistics that deals with organizing, summarizing, and presenting data.
  - Inferential Statistics: the branch of statistics that deals with analyzing, interpreting, and drawing conclusions based on the data.
• Three ways to classify data:
  - Quantitative vs. Qualitative
    * Quantitative Data: data that represents counts or measurements; answers the questions "how much?" or "how many?"; usually numerical.
    * Qualitative Data: data that separates units into categories by some non-numeric characteristic.
  - Discrete vs. Continuous
    * Discrete Data: any type of data where the possible values can be listed out completely.
    * Continuous Data: data where the possible values fall along an interval, and any list would miss many possible values.
  - Levels of Measurement
    * Nominal Level: names, labels, or categories that cannot be sorted.
    * Ordinal Level: data values can be arranged, but differences between values are meaningless, if they even exist.
    * Interval Level: differences make sense, but there is no natural zero, that is, the value 0 does not correspond to nothing.
    * Ratio Level: there is a natural zero, so differences and ratios make sense.
• General Method of Scientific Inquiry:
  1. Lay out a specific question to be answered.
  2. Collect data according to some design (randomness).
  3. Analyze data and draw conclusions.
• Types of Studies:
  - Census: a study of all the units in the population.
  - Observational Study: a study where results are simply observed and measured; there is no control.
  - Experiment: a study where some treatment is applied and the effect is observed; often this can be ethically problematic.
    * Completely Randomized Design (CRD): each subject is assigned to a treatment group completely at random, without regard to any other factors.
    * Confounding: this is what happens if the effects of two factors cannot be distinguished from each other.
    * Blocking: putting subjects randomly into different groups according to some important factor, so that each level of the treatment is represented equally across the levels of the blocking factor.
    * Placebo: a way of studying the perceived effect of a treatment; it looks, tastes, feels, etc. just like the real treatment.
    * Blinding: not letting the subject know whether they are in a treatment or placebo group. In medical studies, the patient's doctor is also not informed of which group the patient is in; this is called a double-blind experiment.
  - Simulation: using a model of the situation to generate data.
• A Biased Sample: a sample in which the units selected, or the data obtained, are not representative of the population as a whole.
• Sampling Methods:
  - Random Sample: each subject has the same chance of being selected.
  - Simple Random Sample (SRS): each sample of size n is equally likely.
  - Systematic Sampling: take every kth subject after some random starting point.
  - Stratified Sampling: subdivide the population into several large subgroups, or strata, and perform an SRS within each.
  - Cluster Sampling: subdivide the population into many small clusters, then randomly select some of the clusters and use all of the subjects within each selected cluster.
  - Convenience Sampling: use whichever subjects are easiest to get.
• Two Types of Errors Involved with Collecting Data:
  - Sampling Error: the difference between a sample result and the true population result that is due entirely to random variations.
  - Nonsampling Error: refers to all variations that are not due to random chance, like non-random samples, misrecorded data, subjects lying on surveys, poorly worded questions, etc.

NAME
STAT 269 Introductory Statistics
Summary of Regression and Correlation

• General Model: $y = \beta_0 + \beta_1 x + \varepsilon$
• Summary Values: $S_{xx} = \sum (x - \bar{x})^2$, $S_{yy} = \sum (y - \bar{y})^2$, $S_{xy} = \sum (x - \bar{x})(y - \bar{y})$
• From these we can get our estimates: $b_1 = \frac{S_{xy}}{S_{xx}}$ and $b_0 = \bar{y} - b_1 \bar{x}$
• Estimate the correlation between the variables: $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$
• Interpreting the correlation: depending on where $r$ falls between $-1$ and $1$, it implies a strong, moderate, or weak negative correlation ($r < 0$), or a weak, moderate, or strong positive correlation ($r > 0$).
• Highlights of the Hypothesis Test:
  1. Hypotheses: $H_0: \beta_1 = 0$ and $H_a: \beta_1 \neq 0$, where $\beta_1$ is the slope in the regression model.
  2. Assumptions: We are assuming that the simple linear regression model $y = \beta_0 + \beta_1 x + \varepsilon$ describes the relationship between the two factors, and that all errors are taken randomly from a population that follows a normal distribution with mean zero.
  6. Conclusion: State it in terms of the problem.
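
As a check on the Quick Sell example from the Regression and Correlation Examples handout, the following is a minimal Python sketch that computes the summary values, the correlation, the fitted line, and the prediction at 12 ads per week. The function name `fit_line` and the dictionary keys are hypothetical, and the formulas are the standard least-squares ones. Treating the handout's test statistic as the F statistic $(n - 2) r^2 / (1 - r^2)$ is an assumption; it is made because the result comes out at about 43.6, close to the TS = 43.60 quoted in the example.

```python
from math import sqrt


def fit_line(x, y):
    """Return summary values, least-squares estimates, and correlation."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Summary values: sums of squared deviations and cross-deviations.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx              # estimated slope
    b0 = y_bar - b1 * x_bar       # estimated intercept
    r = s_xy / sqrt(s_xx * s_yy)  # sample correlation
    # Assumption: the handout's TS is this F statistic with (1, n - 2) df.
    f_stat = (n - 2) * r ** 2 / (1 - r ** 2)
    return {"s_xx": s_xx, "s_yy": s_yy, "s_xy": s_xy,
            "b0": b0, "b1": b1, "r": r, "f_stat": f_stat}


# Data from the Quick Sell example.
ads_bought = [6, 20, 0, 14, 25, 16, 28, 18, 10, 8]
cars_sold = [15, 31, 10, 16, 28, 20, 40, 25, 12, 15]

fit = fit_line(ads_bought, cars_sold)
predicted_at_12 = fit["b0"] + fit["b1"] * 12  # expected cars sold with 12 ads

for name, value in fit.items():
    print(f"{name}: {value:.4f}")
print(f"predicted cars sold at 12 ads: {predicted_at_12:.2f}")
```

Working from deviations rather than the raw-sum shortcut formulas keeps the code close to the definitions of the summary values, at the cost of a second pass over the data.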
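
The six-step recipe in the Confidence Intervals handout can be sketched in Python for the two large-sample cases (the mean and the proportion). This is only an illustration under stated assumptions: the function names and the sample data are hypothetical, the sample standard deviation is used in place of $\sigma$ (a common large-sample substitution that the handout does not spell out), and the critical value comes from the standard library's `statistics.NormalDist`. The small-sample case is left out because the standard library has no t distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def z_critical(confidence):
    """Two-sided standard normal critical value, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def mean_interval(data, confidence=0.95):
    """Large-sample interval for a mean: point estimate -/+ margin of error."""
    n = len(data)
    x_bar = mean(data)                                       # step 3
    # Assumption: sample standard deviation stands in for sigma.
    margin = z_critical(confidence) * stdev(data) / sqrt(n)  # step 4
    return x_bar - margin, x_bar + margin                    # step 5


def proportion_interval(successes, n, confidence=0.95):
    """Large-sample interval for a proportion."""
    p_hat = successes / n                                           # step 3
    margin = z_critical(confidence) * sqrt(p_hat * (1 - p_hat) / n)  # step 4
    return p_hat - margin, p_hat + margin                           # step 5


# Hypothetical data: 40 made-up measurements and 40 successes in 100 trials.
measurements = [50 + (i % 7) for i in range(40)]
mean_low, mean_high = mean_interval(measurements)
prop_low, prop_high = proportion_interval(40, 100)

print(f"mean: between {mean_low:.2f} and {mean_high:.2f}")
print(f"proportion: between {prop_low:.3f} and {prop_high:.3f}")
```

Step 6 still has to be written by hand: the printed bounds only become a conclusion once they are stated in terms of the problem, with units and the confidence level.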
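
The fence calculations in the Boxplot steps of the Graphical Displays handout can also be sketched in a few lines of Python. The function name `boxplot_fences` and the data set are hypothetical. Quartile conventions differ between textbooks and software, so the use of `statistics.quantiles` with its default method is an assumption and may not match quartiles found by hand for every data set.

```python
from statistics import quantiles


def boxplot_fences(data):
    """Quartiles, fences, adjacent values, and flagged points for a boxplot."""
    # Assumption: quartiles come from statistics.quantiles (default method).
    q1, median, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    l1, l2 = 1.5 * iqr, 3 * iqr
    inner = (q1 - l1, q3 + l1)  # inner fences f1 and f3
    outer = (q1 - l2, q3 + l2)  # outer fences F1 and F3
    # Adjacent values: the most extreme observations still inside the inner fences.
    a1 = min(v for v in data if v >= inner[0])
    a3 = max(v for v in data if v <= inner[1])
    # Points between the inner and outer fences (drawn as closed circles).
    between = [v for v in data
               if outer[0] <= v < inner[0] or inner[1] < v <= outer[1]]
    # Points past the outer fences (drawn as open circles).
    beyond = [v for v in data if v < outer[0] or v > outer[1]]
    return {"quartiles": (q1, median, q3), "inner": inner, "outer": outer,
            "adjacent": (a1, a3), "between": between, "beyond": beyond}


# Hypothetical data set with two unusually large observations.
sample = [10, 12, 13, 14, 15, 16, 17, 18, 19, 30, 50]
summary = boxplot_fences(sample)

for name, value in summary.items():
    print(f"{name}: {value}")
```

The whiskers then run from the box to the two adjacent values, and only the flagged points are plotted individually.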