Description

School: University of Texas at Dallas
Department: Statistics
Course: Introductory Statistics for Life Sciences
Professor: Chen
Term: Fall 2016
Name: STAT 2332: Exam 3 Study Guide
Description: Covers chapter 15 through the first half of chapter 23 for Exam 3. I would look over the probability distributions mostly!
Uploaded: 11/02/2016
Topics covered in Exam 3:
CH 15: Binomial Distribution, Geometric Distribution, Poisson Distribution, Exponential Distribution
CH 16: Law of Averages
CH 17: Expected Value and Standard Error
CH 18: Central Limit Theorem
CH 19: Sample Surveys
CH 20: Chance Errors
CH 21, 23: Confidence Interval for Population Percentage and Population Mean

Binomial Distribution:

Finding the probability of "success" (what we're looking for): "success" → p, "failure" → 1 − p

Binomial Assumptions/Rules:
1. Each observation falls into 1 of 2 categories
2. Fixed n
3. Observations are independent
4. Probability of "success" p stays constant for each trial

Binomial coefficient (this formula is the same as the combination formula):
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
Note: $(n-k)! \neq n! - k!$

Specifically,
k = # of successes
n = # of observations, AKA total # of trials
p = probability of success for each trial
1 − p = probability of failure for each trial

The probability of k successes out of n trials is
$$P(X = k) = \frac{n!}{k!\,(n-k)!}\,p^{k}(1-p)^{n-k}$$

Example of Binomial Probability: Hospital records show that of patients suffering from a certain disease, 75% die of it. What is the probability that of 6 randomly selected patients, 4 will recover?

Solution:
Success → the patient recovers; Failure → the patient dies
k = number who recover = 4, n = 6, p = 0.25, 1 − p = 0.75

(a) Probability that 4 will recover:
$$P(X = 4) = \frac{6!}{4!\,2!}(0.25)^{4}(0.75)^{2} = 0.0330$$
(b) Probability that less than 4 will recover:
$$P(X < 4) = P(X=0) + P(X=1) + P(X=2) + P(X=3)$$
(c) Probability that at most 4 will recover:
$$P(X \le 4) = P(X=0) + P(X=1) + \cdots + P(X=4)$$
(d) Probability that more than 4 will recover:
$$P(X > 4) = P(X=5) + P(X=6)$$
(e) Probability that at least 4 will recover:
$$P(X \ge 4) = P(X=4) + P(X=5) + P(X=6)$$
(f) Probability that at least 1 will recover:
$$P(X \ge 1) = P(X=1) + P(X=2) + \cdots + P(X=6)$$
or use the "at least one" rule → $1 - P(\text{none}) = 1 - P(X = 0)$
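To check the hospital example numerically, here is a minimal Python sketch (my own addition, not part of the original guide; the helper name binom_pmf is hypothetical) that computes the binomial probabilities above with math.comb:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a Binomial(n, p) variable."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 6, 0.25  # 6 patients, P(recover) = 0.25
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

print(f"(a) P(X = 4)  = {pmf[4]:.4f}")         # 0.0330
print(f"(b) P(X < 4)  = {sum(pmf[:4]):.4f}")
print(f"(c) P(X <= 4) = {sum(pmf[:5]):.4f}")
print(f"(d) P(X > 4)  = {pmf[5] + pmf[6]:.4f}")
print(f"(e) P(X >= 4) = {sum(pmf[4:]):.4f}")
print(f"(f) P(X >= 1) = {1 - pmf[0]:.4f}")     # "at least one" rule
```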
Binomial vs. Geometric:

binomial variable → X = k = # of successes
geometric variable → X = m = # of trials until the 1st success (the success happens at trial m)
Ex: Flip a coin until you get Tails; keep rolling a die until you get a 3

[Figure: Venn diagram comparing the Binomial and Geometric Distributions]
- Both: each observation is either a success or a failure; observations are independent; the probability of success p stays constant for each trial; the variable is discrete
- Binomial only: fixed n (total # of trials); k = # of successes (k starts at 0: k = 0, 1, 2, ...)
- Geometric only: no limit to the # of trials; m = # of trials (m starts at 1: m = 1, 2, 3, ...)

For the Geometric Distribution:
The probability that the first success occurs on the mth trial is
$$P(X = m) = (1-p)^{m-1}\,p$$
(You may also see this written P(m), but P(X = m) is more common.)
- The probability that at least m trials are needed to get the first success is
$$P(X \ge m) = (1-p)^{m-1}$$
- The probability that more than m trials are needed to get the first success is
$$P(X > m) = (1-p)^{m}$$

Lack of Memory Property:
- Information from the past/what happened before doesn't affect the probability; the process resets itself even after consecutive failures.

Example of Geometric Probability: A cereal manufacturer puts a special prize in 1/20 of the boxes.
(a) What is the probability that you have to purchase 3 boxes to get a prize?
$$P(X = 3) = (1 - 0.05)^{3-1}(0.05)$$
(b) What is the probability that you have to purchase at least 3 boxes to get a prize?
$$P(X \ge 3) = (1 - 0.05)^{3-1}$$
(c) What is the probability that you have to purchase more than 3 boxes to get a prize?
$$P(X > 3) = (1 - 0.05)^{3}$$
(d) What is the probability of getting a prize before purchasing 3 boxes?
$$P(X < 3) = P(X=1) + P(X=2) \quad \text{or} \quad P(X \le 2) = P(X=1) + P(X=2)$$
(e) Suppose you have already purchased 5 boxes and didn't get a prize. What is the probability that you have to purchase at least 3 more boxes before getting a prize?
This is a lack-of-memory problem (the previous purchases don't matter):
$$P(X \ge 3) = (1 - 0.05)^{3-1}$$

To recap: binomial → # of successes; geometric → # of trials. Now we have Poisson distributions.

Poisson Distribution:

Poisson → # of rare events (represented by X) in a span of time/space/place
Ex: the number of typing errors per page made by a typist; the number of phones exploding in a first-world country

A random variable X has a Poisson distribution if the probability that k events will occur is given by
$$P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \ldots$$
where P(X = k) = the probability that the event X occurs exactly k times, and λ = the average # of events per unit of time or area. Different values of λ give different Poisson models.

Ex: The average number of lions seen on a 1-day safari is 5.
(a) What is the probability that tourists will see four lions on the next 1-day safari?
X = # of lions on the next 1-day safari, k = 4, λ = 5
$$P(X = 4) = \frac{5^{4} e^{-5}}{4!} = 0.175$$
(b) What is the probability that tourists will see less than three lions on the next 1-day safari?
$$P(X < 3) = P(X=0) + P(X=1) + P(X=2)$$
(c) Find the probability that tourists will see at most three lions on the next 1-day safari.
$$P(X \le 3) = P(X=0) + P(X=1) + P(X=2) + P(X=3)$$
(d) Find the probability that tourists will see at least 3 lions on the next 1-day safari.
$$P(X \ge 3) = 1 - P(X \le 2), \quad \text{where } P(X \le 2) = P(X=0) + P(X=1) + P(X=2)$$
(e) Find the probability that tourists will see more than three lions.
$$P(X > 3) = 1 - P(X \le 3)$$

Relation between Poisson and Binomial:
- When p (success) is small (≤ 0.05) and n is large enough (≥ 20), successes are rare, and then Poisson(λ = np) ≈ Binomial(n, p). The Poisson distribution ends up as a close approximation of the binomial distribution: the Poisson equation (using λ = np for the average number) ≈ the binomial equation.

Averages:
- Average of a Geometric distribution: 1/p
- Average of a Poisson distribution: λ
- Average of a Binomial distribution: np

Exponential Distribution:
- Unlike discrete distributions like the binomial, geometric, and Poisson, exponential distributions are continuous.
- Usually used when talking about time and lifetimes. Ex: the time between marathon runners, the lifetime of a car

The probability that an exponential variable X will exceed a given value $x_0$ is
$$P(X > x_0) = e^{-x_0/\mu} \quad \text{or} \quad P(X < x_0) = 1 - e^{-x_0/\mu}$$
where µ = the average (expected) value (µ > 0). Each choice of µ gives a different exponential model.
*Note that there is no "equals" case because $P(X = x_0) = 0$.

Ex: Suppose that X = the time it takes to buy tickets at a cinema. On average, it takes a 5-minute wait until you get your tickets. Assume that X follows an exponential distribution.
- Find the probability that you have to wait for more than 6 minutes:
$$P(X > 6) = e^{-6/5} = 0.301$$
- Find the probability that you have to wait less than 7 minutes:
$$P(X < 7) = 1 - e^{-7/5} = 0.753$$

Exponential distributions also have a lack-of-memory (memory-free) property, just like geometric distributions.
Ex: The lifetime of a functional car follows an exponential model. Given that the car has lasted for 10 years, find the probability that the car's remaining life is more than 4 years:
$$P(X > 14 \mid X > 10) = P(X > 4)$$
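The three examples above (cereal, lions, cinema) can all be checked with a few lines of Python. This is my own sketch, not part of the original guide:

```python
from math import exp, factorial

# Geometric: cereal example with p = 1/20 = 0.05
p = 0.05
print(f"(a) P(X = 3)  = {(1 - p)**2 * p:.4f}")   # first prize on box 3
print(f"(b) P(X >= 3) = {(1 - p)**2:.4f}")       # at least 3 boxes needed
print(f"(c) P(X > 3)  = {(1 - p)**3:.4f}")       # more than 3 boxes needed

# Poisson: lion-safari example with lam = 5 lions per day
def poisson_pmf(k, lam):
    """P(X = k) for a Poisson(lam) variable."""
    return lam**k * exp(-lam) / factorial(k)

lam = 5
print(f"P(X = 4) = {poisson_pmf(4, lam):.3f}")                        # 0.175
print(f"P(X < 3) = {sum(poisson_pmf(k, lam) for k in range(3)):.3f}")

# Exponential: cinema example with average wait mu = 5 minutes
mu = 5
print(f"P(X > 6) = {exp(-6 / mu):.3f}")          # 0.301
print(f"P(X < 7) = {1 - exp(-7 / mu):.3f}")      # 0.753
```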
Law of Averages:

sampling/sample size: the number of observations in a sample. We take a sample from the population.

Law of Averages: averages and proportions vary less from the "expected" value as the sample size increases; it is the statistical tendency toward a fixed proportion in the results when an experiment is repeated a large number of times.
Ex: Toss a coin 100 times → the percentage of heads (not the # of heads) gets closer to 50%.
# of heads = half the # of tosses + chance error
- chance error: likely to become larger as the # of tosses increases, but likely to be small when compared to the total number of tosses.
- Increasing the sample size reduces variability = a smaller margin of error.

[Figure: a histogram depicting the Law of Averages; as the sample size increases, the proportions of each observation (the numbers 1-6 on a die) become approximately equal (less varied).]

Expected Value and Standard Error:

We use sample statistics to estimate the population parameters: use the statistic to estimate the parameter.
statistics: describe the sample VS. parameters: describe the population
$\bar{X}$ = mean of the sample (the sample average), called "X-bar"; we'll use this more than the mean of the population
$\mu$ = mean of the population (the population average)

The Population Average is the EXPECTED VALUE of the Sample Average. In relating the sample and the population (theoretically):
- Pop. Avg = Sample Avg
- Pop. SD = Sample SD
- The notation for the Expected Value of the Sample Average is $EV(\bar{X})$, also called the "Population Average."
- With chance error (because $\bar{X}$ will not be exactly equal to $\mu$):
$$\bar{X} = EV(\bar{X}) + \text{chance error}$$

Chance Error: measured by the Standard Error of the Sample Average, $SE(\bar{X})$:
$$SE(\bar{X}) = \frac{SD\ \text{of the population}}{\sqrt{\text{sample size}\ n}}$$
Also, as the sample size ↑, $SE(\bar{X})$ ↓, because larger samples → a more precise estimate of the true pop. avg.

Central Limit Theorem:

The Central Limit Theorem describes the sampling distribution: when the sample size n is large (≥ 25), the sampling distribution will be approximately normal (a normal bell curve) no matter what the original population distribution looks like.
- The sampling distribution is approx. normally distributed if n is large (≥ 25)
- avg of sample = avg of population
- The SD (Standard Deviation) is given by $SE(\bar{X})$
- Approximately:
  68% of sample averages will be within 1 $SE(\bar{X})$ of the pop. avg (just like 1 SD)
  95% will be within 2 $SE(\bar{X})$ (just like 2 SD)
  99.7% (nearly all sample averages) will be within 3 $SE(\bar{X})$ (just like 3 SD)
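To see the Central Limit Theorem and the $SE(\bar{X})$ formula in action, here is a minimal simulation sketch (my own illustration, not from the guide). It repeatedly draws samples of size n = 25 from a skewed exponential population and checks that the spread of the sample averages matches SD/√n, with about 68% of them within 1 SE of the population average:

```python
import random
import statistics

random.seed(1)
mu, n, reps = 5.0, 25, 10_000   # exponential population, samples of size n

# Draw many samples and record each sample average
sample_avgs = [
    statistics.mean(random.expovariate(1 / mu) for _ in range(n))
    for _ in range(reps)
]

# An exponential population with average mu also has SD = mu,
# so SE(X-bar) = mu / sqrt(n)
se_formula = mu / n**0.5
se_observed = statistics.stdev(sample_avgs)
print(f"SE from the formula:  {se_formula:.3f}")   # 1.000
print(f"SE from simulation:   {se_observed:.3f}")  # close to 1.000

# CLT check: about 68% of sample averages fall within 1 SE of mu
within_1se = sum(abs(x - mu) <= se_formula for x in sample_avgs) / reps
print(f"Fraction within 1 SE: {within_1se:.2f}")   # roughly 0.68
```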
Sample Surveys:

Quick Review:
Population → the entire group
Sample → a subset of the population, chosen by randomly selecting from the population
Parameter → describes the population; the pop. parameter is an unknown value
Statistic → describes the sample; we can calculate it from the sample
Use the statistic to estimate the parameter, as long as the sample represents the population.
- Planned introduction of chance: the best method of choosing a sample

Sources of bias and error:
1. Selection bias: a poor sampling plan
2. Nonresponse bias: people can't or won't respond; low response rates. Ex: people at work might miss a survey, people don't answer the phone, etc.
3. Interviewer/Questionnaire bias: leading questions and surveys commissioned by special interest groups
4. Response errors: people lie or give different answers to different interviewers
5. Chance (sampling errors): errors caused by the fact that we are taking a sample; control chance error by controlling the sample size. Ex: not looking at the entire population, undercoverage

Non-probability samples:
1. Convenience sample: made up of people who are easy to reach. Ex: Facebook polls
2. Quota sample: a sample "hand-picked" to resemble the population

Sampling Techniques for Probability Samples:
1. Simple Random Sample (SRS): a subset of individuals (a sample) chosen from a larger set (a population). Ex: choose 25 names of employees out of a hat of 100 names.
2. Stratified Random Sample: the researcher divides the entire population into different subgroups or strata (e.g., males vs. females).

Steps to calculate chances for sample averages:
1. Convert the data value to standard units (a z-score):
$$z = \frac{\text{value} - EV(\bar{X})}{SE(\bar{X})}, \quad \text{where } EV(\bar{X}) = \text{the population avg}$$
2. Draw a picture of the desired area under the normal curve.
3. Look up the z-score in the Normal Probability Table and find the area (%).

$SE(\bar{X})$ measures how close the sample avg is likely to be to the true population avg.
- What's the problem here? $SE(\bar{X})$ depends on the population SD, and we don't know the population SD.
- How could you approximate the population SD? Use the sample SD. A hat (^, called "hat") means an estimate or predicted value:
$$\widehat{SE}(\bar{X}) = \frac{SD(\text{sample})}{\sqrt{n}}$$

Confidence Interval (CI): a range of values (%) that catches the average/mean of the population; it stretches a margin of error on either side of the sample average.
- Interpretation of a Confidence Interval: __% of all samples will give an interval that captures the true mean of the population; the true population avg. is within the confidence interval.
- The higher the confidence %, the wider the CI (99% has a wider range than 90%).

Confidence Interval for a Population %:

Percents/proportions are a special case of averages where all the numbers in the original population are either 0 or 1.
- For percents, when we use a box model to simulate drawing from a population, the box must be a 0-1 box.
  - pop. avg (the avg of the box) = pop. proportion p
  - sample avg (the average of the draws) = sample proportion $\hat{p}$
  - pop. SD is the SD of the box, which is given by $SD_{\text{population}} = \sqrt{p(1-p)}$
- The sample % is approx. normal if n is large enough (Central Limit Theorem).

Finding chances about the sample %:
$$z = \frac{\text{value (of the sample \%)} - EV(\text{sample \%})}{SE(\text{sample \%})}$$
where
$$EV(\text{sample \%}) = p \times 100\%, \qquad SE(\text{sample \%}) = \sqrt{\frac{p(1-p)}{n}} \times 100\%$$
For confidence intervals, use $\hat{p}$ instead of p in the SE formula.

-end-
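To tie the last two pieces together, here is a minimal Python sketch of a confidence interval for a population percentage (my own worked example with made-up survey numbers, not from the guide). It uses $\hat{p}$ in the SE formula and 2 SEs for roughly 95% coverage, per the 68-95-99.7 rule above:

```python
from math import sqrt

# Hypothetical survey: 400 people sampled, 240 answer "yes"
n, yes = 400, 240
p_hat = yes / n                                # sample proportion = 0.60

# SE of the sample %, using p-hat in place of the unknown p
se_pct = sqrt(p_hat * (1 - p_hat) / n) * 100   # in percentage points

# ~95% CI: sample % plus or minus 2 SEs (68-95-99.7 rule)
sample_pct = p_hat * 100
margin = 2 * se_pct
print(f"sample %: {sample_pct:.1f}%")
print(f"SE:       {se_pct:.2f} percentage points")
print(f"95% CI:   {sample_pct - margin:.1f}% to {sample_pct + margin:.1f}%")
```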
