# Final Exam Study Guide STAT 200

Penn State

GPA 3.59

This 15-page Study Guide was uploaded by Kelsey Marr on Tuesday, May 5, 2015. The Study Guide belongs to STAT 200 at Pennsylvania State University, taught by Andrew Wiesner in Winter 2015. Since its upload, it has received 683 views. For similar materials, see Elementary Statistics in Statistics at Pennsylvania State University.

## Reviews for Final Exam Study Guide

I love that I can count on Kelsey for top notch notes! Especially around test time...

-*Mr. Otis Schmidt*

Date Created: 05/05/15

Stat 200 Elementary Statistics, Andrew Wiesner, Final Exam Study Guide (Sp15)

## Basics of Statistics

### Variable Types

- Categorical: if you'd answer with a "word"
  - Binary: two levels to the category. Ex: yes/no, sex/gender
  - Ordinal: some hierarchy or order to the levels. Ex: class standing
  - Nominal: no order or hierarchy. Ex: republican/democrat/independent, race, etc.
- Quantitative: the answer is a number
  - Discrete: something counted. Ex: the number of females in class
  - Continuous: something measured. Ex: heights of students, age of students

### Graphing Variables

- Categorical graph choices: pie chart, bar graph
- Quantitative graph choices: histogram, stem plot, dot plot, box plot

### Numerical Summaries

- Measures of center
  - Mode: most frequent observation in a set of data
  - Median: middle point, where about 50% of observed values in a set of data fall at or below
  - Mean: mathematical average of all observations (in class discussions, "average" refers to the mean)
- Spread of data
  - Range: maximum minus minimum
  - Standard deviation and variance: variance = SD², or SD = square root of variance
  - Interquartile range (IQR): third quartile minus first quartile
- Data shape
  - Symmetrical or bell shape
  - Right skewed or positively skewed
  - Left skewed or negatively skewed

[Figure: mean, median, and mode]

- Variance
  - Verbal formula: take the difference between each observation and the mean. These are called "deviations". We square the deviations, sum them, then divide by n − 1.
  - Math expression: $s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1}$
  - Sigma ($\sum$) = summation; $\bar{x}$ (x with a line on top) = sample mean; $x_i$ = all values of x we observe (the data)

### Outliers

- We can use quartiles to identify extreme observations that can be considered outliers.
- A common method is to build a fence around your data:
  - Lower fence = Q1 − 1.5 × IQR
  - Upper fence = Q3 + 1.5 × IQR
  - Any points outside the fence are considered outliers.

### Empirical Rule

- A special application to data that is bell shaped or approximately bell shaped
- Also known as the 68-95-99.7 Rule
- What it means: for bell-shaped data, about 68% of observations fall within one standard deviation of the mean, 95% fall within 2 standard deviations of the mean, and 99.7% fall within 3 standard deviations.
- Ex: heights of US adult males are approximately bell shaped with a mean of 69 inches and a standard deviation of 3.5 inches.
  - Approximately 68% of all adult males have heights within 69 ± 3.5 in (65.5, 72.5)
  - Approximately 95% fall within 69 ± 7 in (62, 76)
  - Approximately 99.7% fall within 69 ± 10.5 in (58.5, 79.5)

### Z-score

- Best used for bell-shaped data
- Gives the number of standard deviations an observation is from the mean
- Found by: z-score = (observed value − mean) / standard deviation

### Three Research Strategies

- Sample survey
- Observational study
- Experiment

### Two Ways to Get Data

- Nonprobability methods
  - Convenience or haphazard sample. Ex: use our class
  - Volunteer. Ex: visit a website and vote in some poll
- Probability methods
  - Simple random sample (SRS): each subject has an equal chance of being selected
  - Stratified random sample: the population or subjects are grouped by some distinction (race, class standing, etc.), then an SRS is taken from each stratum/group
  - Cluster sample: the random selection process is not one at a time but a group at a time. Ex: randomly select a flight and interview all passengers, or randomly select a section of a class and interview all students in that section
- The purpose of using probability methods is the use of randomness (the random selection). Using such techniques we gather a sample that represents some population of interest (all PSU students, etc.). Doing this then allows us to infer our sample results back to the population.
  - Ex: if we randomly selected 1000 PSU students and 62% said campus was safe at night, we can infer that about 62% of all PSU students think campus is safe at night.
- In random surveys we have a margin of error: a range within which the true proportion is expected to fall.
  - This margin of error can be approximated by $1/\sqrt{n}$, where n is the sample size. A common industry margin of error standard is 3%, which equates to a sample size of about 1100.

### Bias That Can Appear in Studies/Surveys

- Response bias: subjects don't respond honestly
- Nonresponse bias: a large percentage of the sample doesn't respond
- Selection bias: we select subjects that don't represent the population of interest

### Observational Studies and Experiments

- Observational study: we observe subjects in their "natural state"
- Experiment: we randomly assign some treatment
- The biggest difference between these two is that for an experiment we can conclude a cause/causal effect, but for an observational study only a relationship.
- In an experiment or observational study we have an explanatory variable and a response/outcome variable.
  - Ex: smoking vs. lung cancer. Explanatory variable: smoking. Response variable: lung cancer
  - Ex: height and weight. Explanatory: height. Response: weight
- Blinding of subjects is done so subjects don't know what treatment they are receiving. This is done to avoid the placebo effect.
- Blind experiment: the researcher does not know what treatment subjects receive. This is done to avoid the experimenter effect.
- If both subject and experimenter are blinded we have a double-blind study.

## Probability

- The keys
  - Identify events
  - Identify the probabilities of those events
  - Ex: you pass Monday's exam with an A, B, or C; what is the probability of passing? Let P be the event you pass; P(P) = probability of passing
- Complement: the event of all outcomes not in the event of interest. The complement to pass is fail or not pass (D and F).
- Rules
  1. P(event) + P(complement) = 1
  2. 0 ≤ P(A) ≤ 1

### Conditional Probability

- Formula: $P(A \mid B) = \dfrac{P(A \text{ and } B)}{P(B)}$
- "|" means the following event is given, not divide.
- Are 2 events independent? Yes, if P(A) × P(B) = P(A and B).

### Probability Distributions

- Discrete: something counted. Ex: number of females in class
- Continuous: something measured. Ex: heights of students, age of students
- Random variable: a numerical characteristic of each event in a sample space, or equivalently, each individual in a population
- Discrete random variable: countable set of distinct possible values. Ex: number of heads in 4 flips of a coin
- Continuous random variable: any value, to any number of decimal places, within some interval is a possible value. Ex: heights of individuals
- Expected value: the mean value in the long run (meaning for many repeats or a large sample size)

### Binomial Random Variables

- A binomial random variable counts how often a particular event occurs in a fixed number of tries or trials. For a variable to be a binomial random variable, all of the following conditions must hold:
  - There is a fixed number of trials (a fixed sample size)
  - On each trial, the event of interest either occurs or does not
  - The probability of occurrence (or not) is the same on each trial
  - The trials are independent of one another
- Ex: number of correct guesses on 30 true/false questions when you randomly guess
- n = number of trials; p = probability the event of interest occurs on any one trial
- The binomial is a special discrete random variable (discrete with 2 outcomes).
- If the following 4 conditions are all satisfied we have what is called a binomial experiment:
  - A fixed number of trials (we will call this n)
  - 2 possible outcomes for each trial: success and failure
  - The probability of success and failure is the same for each trial
  - The trials are independent
- For cumulative probability in binomial experiments we want to use software. We can use software for exact or cumulative probabilities.
  - Minitab: Calc > Probability Distributions > Binomial
  - Probability: use for P(X = x)
  - Cumulative: use for P(X ≤ x)
  - Ignore Inverse
  - Events: n; Inputs: x
- Deflategate example
  - Assume all those footballs are distributed between the 2 teams: 11 deflated, so n = 11
  - If randomly divided between the 2 teams, the probability of getting a "bad" football is 0.5, so p = 0.5
- Things to note
  - The sum of all these probabilities is one
  - The outcomes are mutually exclusive
  - Keep in mind that the equal sign means something: for a discrete variable, P(X ≤ x) is not the same as P(X < x)

### Continuous Random Variables (Specifically Normal Random Variables)

- For a normal or approximately normal set of data: z-score = (observed value x − population mean μ) / population SD
- For example, US adult heights are approximately normal with the following mean and SD:
  - Males: mean 69, SD 3
  - Females: mean 64, SD 2
  - If you were a male with a height of 71 inches, your z-score would be 0.667, because (71 − 69)/3 = 0.667
- How do we find probabilities for continuous distributions?
  - For a curve such as $x^2$, to find the area (probability) under the equation we'd have to integrate it: $x^3/3$. Integrating the normal curve is not fun.
  - Fortunately, we have software that will, and someone created a table for which we convert x to a z-score. This is the standard normal table, where the mean = 0 and SD = 1.
- To find P(X < x): convert x to a z-score and use the table or Minitab
- To find P(X > x): convert x to z, use the table or Minitab, and subtract from 1
- To find P(x₁ < X < x₂): convert both x's to z, find the probability for both using the table/Minitab, and take the difference between the probabilities
- To find P(X < x₁ or X > x₂): convert both x's to z, find the probability for the 1st x, find the probability for the 2nd and subtract it from 1, then add the 2
- Some things to watch/pay attention to
  - Probabilities range from 0 to 1
  - The equal sign doesn't matter: P(X < x) = P(X ≤ x)
  - Since the normal is bell shaped, the mean is approximately equal to the median; therefore, theoretically, z-scores also give distance from the median
  - Consider "normal" and "approximately normal" as equivalent
  - If given a z-score, there is no need to convert to z
  - If told the number of SDs, this is the z-score. Ex: if told in a problem you observed a score 2 SD below the mean, this refers to a z-score of −2
- In Minitab, much like last week, go to Calc > Probability Distributions > Normal
  - Cumulative probability: P(X < x)
  - Inverse: use to find x for a given probability
  - Input constant: z-score

## Sampling Distributions

1. Proportion
2. Mean

### Notation

- Proportions: p = population proportion (a parameter); p-hat ($\hat{p}$) = sample proportion (a statistic)
- Means: Greek letter mu (μ) = population mean (a parameter); x-bar ($\bar{x}$) = sample mean (a statistic)
- Parameters are summary measures of the population
- Statistics are summary measures from a sample
- Parameters are fixed; statistics vary
- Since these statistics vary, they can have distributions. Conveniently, under certain conditions the sample proportion and sample mean will follow a normal distribution. Thus we can use z-methods to calculate probabilities.

### Conditions

- Conditions for a proportion
  - np ≥ 10
  - n(1 − p) ≥ 10
  - If both are satisfied, then p-hat has an approximately normal distribution with a mean equal to p and a standard error equal to $\sqrt{p(1-p)/n}$
- Conditions for means
  - The population is normal, OR
  - The sample size is large enough (ex: n ≥ 30). What this implies: if the population distribution is unknown or not normal, then with a large enough sample the sample mean will be approximately normal. This is referred to as the central limit theorem.
  - If either is satisfied, then x-bar is approximately normal with a mean of μ
- "Standard error" is a measure of variability for a statistic. Think of it as the standard deviation for a sample statistic. "Standard deviation" is the SD for a sample.
- If the conditions we discussed are met, we can find probabilities for a sample proportion and sample mean using a z-score:
  - Proportion: $z = \dfrac{\hat{p} - p}{\sqrt{p(1-p)/n}}$
  - Sample mean: $z = \dfrac{\bar{x} - \mu}{SE}$, where $SE = SD/\sqrt{n}$
- In Minitab we follow the same process as last week.

## Hypothesis Testing or Significance Testing

### 5 Steps to Testing

1. Set up 2 competing hypotheses. One is the null hypothesis (symbol H₀). The second is the research or alternative hypothesis (symbol Hₐ).
2. Set some level of significance, called alpha. For our class we will set alpha at 5% or 0.05. This means that in order for us to reject the null hypothesis, the probability of getting evidence against the null, assuming the null was true, is less than 5% or 0.05 (i.e., very unlikely).
3. Gather data and calculate a test statistic.
   - Test for 1 proportion: if the conditions np₀ ≥ 10 and n(1 − p₀) ≥ 10 hold, the test statistic is $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$, where p₀ is the value used in the hypothesis.
   - Test for 1 mean: if either the data come from a normal distribution or n ≥ 30, the test statistic is $t = \dfrac{\bar{x} - \mu_0}{SD/\sqrt{n}}$, where μ₀ is the value used in the hypothesis.
   - What are the possible hypotheses? Only choose one alternative, based on the research question.
     - Proportion: H₀: p = p₀, with (i) Hₐ: p > p₀, (ii) Hₐ: p < p₀, or (iii) Hₐ: p ≠ p₀
     - Mean: H₀: μ = μ₀, with (i) Hₐ: μ > μ₀, (ii) Hₐ: μ < μ₀, or (iii) Hₐ: μ ≠ μ₀
4. Calculate the probability value (call it the p-value) associated with the test statistic, using the alternative to guide us. The p-value is the probability our sample would produce the result assuming H₀ is true. Calculating the p-value:
   - If Hₐ is one-sided (that is, we are using > or <), then the p-value for a proportion is found by P(Z > z stat), and for a mean by P(T > t stat). For a < alternative, use the lower tail instead.
   - If the alternative is two-sided (i.e., not equal), then the p-value is found by using the same steps above and then multiplying by 2 (i.e., doubling the result).
5. Compare the p-value to alpha, make a decision to either reject H₀ or fail to reject H₀, and write an overall conclusion.

### Decision Rule

- If the p-value is less than alpha (i.e., alpha = 0.05), we reject the null hypothesis.
- If the p-value is not less than alpha, we fail to reject the null.
- Statistically significant results are when we reject the null hypothesis.

### Example 1

- H₀: p = 0.6; Hₐ: p > 0.6
- n = 40; 28 of those promoted were male, so p-hat = 28/40 = 0.7
- np = 40(0.6) = 24 and n(1 − p) = 40(0.4) = 16
- After using the formula we find the p-value to be 0.0885.
- Is this less than 0.05? No, so we fail to reject the null H₀.
- Conclusion: we do not have enough statistical evidence to refute the company's fairness in promoting employees.

## Comparing Two Groups

### Sampling Distribution for the Difference in 2 Sample Proportions

- Used when 2 populations are formed by a categorical variable and a comparison of some feature of the two populations is wanted
- Estimate the value of the difference in 2 population proportions
- Test the hypothesis that the difference between 2 population proportions is 0. If the difference is 0, the 2 proportions are equal.
- Conditions
  - Must be 2 independent samples, randomly selected from the 2 populations
  - n₁p₁, n₁(1 − p₁), n₂p₂, and n₂(1 − p₂) must be at least 10

### Confidence Intervals for the Difference Between Population Proportions

- Used when there are 2 populations from which independent samples are available
- Conditions
  - Sample proportions must be independent or randomly selected
  - n₁p̂₁, n₁(1 − p̂₁), n₂p̂₂, and n₂(1 − p̂₂) must be at least 10

### Sampling Distribution for the Sample Mean of Paired Differences

- Matched pairs: 2 measurements from the same individual, measured under different conditions or at 2 different times
- Dependent samples: data collected as matched pairs, because the 2 observations are not statistically independent of each other
- Conditions
  - The population must be bell shaped and the sample random, OR
  - The population of interest has a large random sample (n is greater than or equal to 30)
- Paired data: data that have been observed in natural pairs. Use paired data to get rid of variation from pair to pair so you can observe variation between methods.
- Interpreting a CI for the mean of paired differences
  - If the CI does include 0, it is possible that the population means for the 2 measurements could be the same
  - If the CI does not include 0, we are fairly certain that the population means for the 2 variables are different

### Sampling Distribution for the Difference in 2 Sample Means

- Independent samples: individuals in one sample aren't coupled in any way with individuals in the other sample

### CI for the Difference in 2 Population Means

- Compare means of a quantitative variable for the two populations, or for two groups within a population
- 2-sample test: t = (sample statistic − null value) / standard error
- Equal variance assumption
  - Pooled/pooling variance: if the variances of the 2 independent groups (be it proportions or means) are about equal, we can strengthen our test by pooling the variances into one variance
  - The choice to pool or not to pool is an option in the software, with the default being not to pool
  - We will pool the two proportion variances if the value in our hypothesis to test is 0
  - For 2 independent means, we will pool the variance if the ratio of largest SD / smallest SD is less than or equal to 2

## Analysis of Variance (ANOVA)

We use this method to compare more than 2 independent means.

- Primary concerns
  - Setting up the correct hypotheses
  - Determining the correct degrees of freedom, and using this info to establish how many means are being compared and what the total sample size is
  - Checking assumptions: normality, equal variance
  - Determining which group mean or means differ when we reject the null hypothesis
- Hypotheses
  - H₀: all means are equal, or all group means are equal (H₀: μ₁ = μ₂ = μ₃)
  - Hₐ: not all group means are equal (Hₐ: at least 1 mean differs)
- The test for ANOVA uses an F-statistic or F-test. This F-statistic is a ratio of between-group variance to within-group variance.
  - The between-group variance is a measure of the difference between each group mean and the overall mean. Between-group DF = numerator DF = g − 1
  - The within-group variance is a measure of the difference between each observation and its group mean. Within-group DF = denominator DF = n − g
  - F stat = between-group variance / within-group variance
  - The p-value given by the software is the probability of getting this F-statistic or one more extreme

## Categorical Data

- Categorical variables are raw data made up of group or category names that don't necessarily have a logical order
- Contingency tables are used to display all possible combinations of 2 categorical variables
  - The row category is typically the explanatory variable
  - The column category is typically the response variable
- Inferential statistics: when sample evidence is used to infer something about the entire population
- Statistically significant relationship: it can be inferred that a relationship exists in the population

### 5 Steps to Determining Statistical Significance

1. Determine the null and alternative hypotheses
2. Summarize the data into the appropriate test statistic, after first verifying that the necessary data conditions are met
3. Find the p-value of the chi-squared statistic
4. Using the p-value, determine whether the result is statistically significant
5. Come up with a conclusion based on your findings in the previous steps

### Calculating the Chi-Squared Statistic

- $\chi^2 = \sum \dfrac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$
- Factors that affect statistical significance
  - As the difference in row percents increases, chi-square increases and the p-value decreases
  - As n increases, chi-square increases and the p-value decreases

### Risk and Relative Risk

- Risk = number in category / total number in group
- Relative risk = risk in category 1 / risk in category 2
- Odds: compare the chance of an event happening to the chance of the event not happening for a group
- Confounding variable: a variable that affects the response variable and is also related to the explanatory variable
- Lurking variable: a term used to describe a potential confounding variable that is not measured and is not considered in the interpretation of a study

## Correlation and Regression

- Scatterplot: 2D graph of the measurements for 2 numerical variables
  - Explanatory variable: x variable
  - Dependent variable: y variable
- Regression equation: $\hat{y} = b_0 + b_1 x$
  - Intercept coefficient: b₀
  - Slope coefficient: b₁
- Simple linear regression: analysis where you attempt to find the line that best estimates the relationship between two variables
- Regression analysis: used to examine the relationship between a quantitative response variable and one or more explanatory variables
- Deterministic relationship: a relationship where, if we know the value of one variable, we know exactly the value of the other variable
- Statistical relationship: there is variation from the average pattern. Most relationships are statistical.
- Prediction error: the difference between the observed value of y and the predicted value y-hat for an observed value of x
- Correlation: a number used to indicate the strength and direction of a straight-line relationship
  - Strength of relationship: closeness of the points to a straight line
  - Direction of relationship: indicates whether one variable generally increases or decreases as the other variable increases

## Practice Questions

1. Which of the following best describes the population in a statistical study?
   - a. All of the people in the United States
   - b. All of the people in the world
   - c. The group of people from which the sample should be taken
   - d. A subset of the sample
   - Answer: C
2. A measure calculated from a population is called:
   - a. An outlier
   - b. A parameter
   - c. A statistic
   - Answer: B
3. Select the answer choice where the Central Limit Theorem for sample means would NOT apply.
   - a. n = 20 and the population is normal
   - b. n = 60 and the population is normal
   - c. n = 15 and the population is skewed
   - d. n = 60 and the population is skewed
   - Answer: C
4. The distribution of teachers' salaries is skewed to the right. Based on this, select the true statement below.
   - a. As you increase the sample size, the distribution of the sample means would approach a right-skewed shape
   - b. As you increase the sample size, the distribution of the sample means would approach an approximately normal shape
   - c. As you increase the sample size, the distribution of the sample would approach an approximately normal shape
   - Answer: B
5. Which statement is true?
   - a. As the level of confidence increases, the interval gets narrower
   - b. As the level of confidence decreases, the interval gets wider
   - c. As the level of confidence decreases, the interval gets narrower
   - Answer: C
6. Which best describes the primary purpose of a confidence interval?
   - a. Estimate the accuracy of a sample statistic
   - b. Estimate the population parameter
   - c. Make decisions on differences
   - Answer: B
7. Determine the standard error if you have a random sample of 64 with a mean height of 70 inches and a standard deviation of 4 inches.
   - a. 0.33
   - b. 0.5
   - c. 25
   - d. 16
   - Answer: B
8. Which method would be appropriate for estimating the percent of students who drink more than two cups of coffee a day?
   - a. Confidence interval for a mean
   - b. Confidence interval for a proportion
   - Answer: B
9. Which test would be appropriate for the following question: is there a difference between two auto mechanics in their quotes for repairing damaged cars?
   - a. Test of one mean
   - b. Test of paired means
   - c. Test of two independent means
   - d. Test of two proportions
   - Answer: B
10. Which test would be appropriate for the following question: does the average student spend more than $100 a semester at McDonalds?
    - a. Test of one mean
    - b. Test of paired means
    - c. Test of two independent means
    - d. Test of two proportions
    - Answer: A
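The variance formula and the outlier fences from the Basics of Statistics part of the guide can be checked by hand with a short Python sketch. The data values below are made up for illustration, and the quartile convention (`method="inclusive"`) is an assumption: textbooks and software packages compute quartiles in slightly different ways, so fences may differ a little from Minitab's.

```python
import statistics

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data) / (len(data) - 1)

def outlier_fences(data):
    """Return (lower fence, upper fence): Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    # Assumption: the "inclusive" quartile method; other conventions exist.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

scores = [2, 4, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data
lower, upper = outlier_fences(scores)
outliers = [x for x in scores if x < lower or x > upper]

print("variance:", sample_variance(scores))
print("fences:", lower, upper)
print("outliers:", outliers)
```

Any point outside the two fences is flagged, which mirrors the "build a fence around your data" rule.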
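The Empirical Rule intervals and the z-score formula can be reproduced with a few lines of arithmetic. This sketch uses the guide's own numbers (male heights with mean 69 and SD 3.5 for the Empirical Rule; mean 69 and SD 3 for the z-score example); the function names are illustrative only.

```python
def empirical_rule(mean, sd):
    """Intervals holding about 68%, 95%, and 99.7% of bell-shaped data."""
    return {
        label: (mean - k * sd, mean + k * sd)
        for k, label in ((1, "68%"), (2, "95%"), (3, "99.7%"))
    }

def z_score(x, mean, sd):
    """Number of standard deviations x lies from the mean."""
    return (x - mean) / sd

intervals = empirical_rule(69, 3.5)
for label, (low, high) in intervals.items():
    print(label, low, high)

# A 71-inch male when heights have mean 69 and SD 3
print(round(z_score(71, 69, 3), 3))
```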
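The guide relies on Minitab for binomial probabilities. As a rough stand-in, the exact ("Probability") and cumulative options can be sketched in plain Python from the binomial formula. The setup uses the guide's Deflategate values n = 11 and p = 0.5; the specific questions asked (all 11, at most 2) are chosen here only for illustration.

```python
from math import comb

def binom_pmf(x, n, p):
    """Exact probability P(X = x) for a binomial(n, p) variable."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """Cumulative probability P(X <= x)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p = 11, 0.5
p_all_eleven = binom_pmf(11, n, p)      # P(X = 11)
p_at_most_two = binom_cdf(2, n, p)      # P(X <= 2)
p_fewer_than_two = binom_cdf(1, n, p)   # P(X < 2) = P(X <= 1)

print(p_all_eleven, p_at_most_two, p_fewer_than_two)
```

The last two values differ, which is the point of the note that the equal sign matters for a discrete variable.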
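The four normal-probability recipes (below, above, between, and outside) can be tried without a table using `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The sketch uses the guide's male-height distribution (mean 69, SD 3); the cut points 66 and 72 and the 90th percentile are chosen here only as examples.

```python
from statistics import NormalDist

male_heights = NormalDist(mu=69, sigma=3)

p_below_71 = male_heights.cdf(71)                            # P(X < 71)
p_above_71 = 1 - male_heights.cdf(71)                        # P(X > 71)
p_between = male_heights.cdf(72) - male_heights.cdf(66)      # P(66 < X < 72)
p_outside = male_heights.cdf(66) + (1 - male_heights.cdf(72))  # P(X < 66 or X > 72)

# "Inverse" direction: the height with 90% of the distribution below it
x_90th = male_heights.inv_cdf(0.90)

print(p_below_71, p_above_71, p_between, p_outside, x_90th)
```

Because 66 and 72 are one SD either side of the mean, `p_between` comes out close to the 68% of the Empirical Rule.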
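The five testing steps for one proportion can be walked through in code. This is a minimal sketch under the guide's normal-approximation formula, applied to Example 1 (28 of 40, p₀ = 0.6, Hₐ: p > p₀); the function name and the `alternative` argument are hypothetical, not part of Minitab or any library.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative="greater"):
    """Return (z, p_value) for a one-proportion z-test."""
    # Step 3 conditions: n*p0 and n*(1 - p0) must both be at least 10
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("normal approximation conditions not met")
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    upper_tail = 1 - NormalDist().cdf(z)
    if alternative == "greater":
        p_value = upper_tail
    elif alternative == "less":
        p_value = 1 - upper_tail
    else:  # two-sided: double the smaller tail
        p_value = 2 * min(upper_tail, 1 - upper_tail)
    return z, p_value

alpha = 0.05
z, p_value = one_proportion_z_test(28, 40, 0.6)
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(round(z, 3), round(p_value, 4), decision)
```

With these inputs the formula gives z of about 1.29 and a p-value near 0.098; the guide quotes 0.0885, and either value exceeds 0.05, so the decision to fail to reject H₀ is the same.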
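The chi-squared statistic and relative risk from the Categorical Data part can be computed directly from a contingency table. The 2×2 counts below are invented for illustration, and the expected counts use the usual row total × column total / n rule, which the guide does not spell out and is assumed here.

```python
def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Assumed rule: expected = row total * column total / n
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def relative_risk(table):
    """Risk of the first-column outcome in row 0 divided by that in row 1."""
    risk_1 = table[0][0] / sum(table[0])
    risk_2 = table[1][0] / sum(table[1])
    return risk_1 / risk_2

# Rows: explanatory groups; columns: response categories (hypothetical counts)
small = [[20, 30], [30, 20]]
large = [[40, 60], [60, 40]]  # same row percents, twice the sample size

print(chi_square_statistic(small), chi_square_statistic(large))
print(relative_risk(small))
```

Doubling every count keeps the row percents the same but doubles the statistic, which matches the note that chi-square increases as n increases.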
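The regression equation ŷ = b₀ + b₁x, the prediction errors, and the correlation can be sketched with the standard least-squares formulas. The guide does not give these formulas, so they are supplied here as an assumption, and the five (x, y) pairs are made up.

```python
from math import sqrt

def fit_line(xs, ys):
    """Return (b0, b1, r): intercept, slope, and correlation."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b1 = sxy / sxx             # slope coefficient
    b0 = y_bar - b1 * x_bar    # intercept coefficient
    r = sxy / sqrt(sxx * syy)  # strength and direction of the linear relationship
    return b0, b1, r

xs = [1, 2, 3, 4, 5]  # explanatory variable (hypothetical)
ys = [2, 4, 5, 4, 5]  # response variable (hypothetical)
b0, b1, r = fit_line(xs, ys)

# Prediction error: observed y minus predicted y-hat at each observed x
errors = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
print(b0, b1, r, errors)
```

A positive r goes with a positive slope here, and the prediction errors of a least-squares line sum to roughly zero.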
