# Statistical Reasoning & Practice, Week 6 Notes 36-201 Stats Reason


These 6-page class notes were uploaded by Monica Chang on Tuesday, October 11, 2016. They belong to 36-201 Stats Reason at Carnegie Mellon University, taught by Weinberg in Fall 2016.

Date Created: 10/11/16
Week 6

Probability models of data: discrete random variables & the binomial counts distribution:

- Using intentionally imposed randomness in studies allows us to measure variation with probability
- The things that are probability-modeled are called random variables (we want to measure variation across samples)
- Random variable: an outcome of a random phenomenon that takes numerical values
  - Discrete random variable: a random variable that can take only separated, and usually finitely many, values
    - The probability distribution of a discrete random variable is usually represented by a one-variable probability table or a relative-frequency histogram
    - Ex. Sample count, $X$, is a discrete random variable (e.g. the count of survivors across different samples of animals in a population)
  - Continuous random variable: a random variable that takes an entire interval of numerical values
    - The probability distribution of a continuous random variable is usually represented by a density curve that models a relative-frequency histogram
    - Ex1. Sample mean, $\bar{X}$, is a continuous random variable (e.g. the average salary, which comes from quantitative data, across different samples of workers in a population). Note: lowercase $\bar{x}$ is for a single sample (called a "realization"), and the general term is capitalized
    - Ex2. Sample proportion, $\hat{P}$ (which comes from categorical data), is a continuous random variable (e.g. the sample proportion of Democratic voters across different samples of voters in a population). Note: p is for proportion, and the hat is for "computed from data"
- Example of a discrete random variable: car occupancy
  - Interested in the # of people per automobile on the road in a city
    1. Define the probability experiment: randomly select a car on the road
    2. Define the random variable: $X$ = count of # of people in the car
    3. Assign probabilities to $X$: in this case, probabilities are estimated from studies
  - Create a one-variable probability table or probability histogram to represent the probability distribution of $X$ (# of occupants per car) for the population of cars
  - Any single value of $X$ is unpredictable, but the probability distribution lets us model predictable features in the long run (in a large group or population of cars)
  - The predictable features of a discrete random variable are: shape, center, spread
  - These are the same features we used to describe a quantitative variable, but now they don't describe a variable measured on a sample of people; they describe the hypothetical/long-run features of the population
    - Shape: we can get that from the histogram
    - Center: mean of a discrete random variable (AKA expected value): $\mu_X = x_1 p_1 + x_2 p_2 + \dots + x_n p_n$
      - In the example, the mean number of car occupants is the weighted sum of the different occupant outcomes
      - Note: for the mean of a continuous random variable, you need to find an integral (calculus), so we will only cover special cases like normal distributions, where the mean is known due to symmetry
    - Spread: standard deviation of a discrete random variable: $\sigma_X = \sqrt{(x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + \dots + (x_n - \mu_X)^2 p_n}$
      - Note: finding the std. dev. of a continuous random variable requires integral calculus, so we won't be calculating these; they'll be given if necessary
- Special case of a discrete random variable: the Binomial Distribution
  - You get this special-case random variable by counting the # of successes in a binary measurement
  - Example question of interest: if you roll a six-sided die 18 times and a 4 appears only once, is there strong evidence that the die is loaded, or is it just random variation?
  - Binomial conditions:
    1. Fixed sample size for every run of the study
    2. Two outcomes (success or failure) for each observation
    3. Each observation is independent of the others (random sampling, and either selection with replacement or a population at least 10-20 times as large as the sample)
    4. Probability of success is the same for each observation in the sample (random sampling)
  - Example: 8 lab rats are vaccinated with a vaccine that has a prevention rate of 40%. The lab rats are then exposed to the disease.
    1. Fixed sample size: $n = 8$
    2. For each animal in the study, there is either "success" or "failure"
    3. The outcomes from animal to animal are independent because it's a random sample from a large population
    4. Fixed probability of success for each outcome because we are given $p = 0.40$
    - Random variable = count of rats that live out of $n = 8$
    - Because the study meets all the Binomial conditions, the count $X$ of successes in $n$ trials has a particular probability distribution
    - Ex. What is the chance that 3 will survive (3 successes)?
    - $P(X = 3) = \binom{8}{3}(0.4)^3(0.6)^5 = 56\,(0.4)^3(0.6)^5 \approx 0.2787$
    - In general, if we have $n$ items, each either a success or a failure, selected from a population with chance $p$ of success, and the Binomial conditions are met, then the probability of $x$ successes out of $n$ is: $P(x \text{ successes out of } n) = \binom{n}{x}\, p^x (1 - p)^{n - x}$
- Notes on notation:
  - Uppercase vs. lowercase
    - Uppercase denotes a random variable
    - Lowercase denotes a fixed number (i.e. a parameter, or a value/"realization" of a random variable)
  - P's
    - $\hat{P}$ denotes the sample proportion (e.g. binomial proportion of successes); it is a random variable because it changes from sample to sample (therefore uppercase); it is a sample statistic that estimates $p$
    - $p$ denotes the true population proportion; it is a fixed value (therefore lowercase); it is a population parameter
    - $P(\,)$ means "probability of"
  - $X$ vs. $\bar{X}$
    - $X$ denotes a sample count, from a categorical measurement (e.g. binomial)
    - $\bar{X}$ denotes a sample mean, from a quantitative measurement
- In general:
  - Discrete random variable
    - Represented by a 1-variable probability table
    - Probabilities come from studies or sometimes special formulas (e.g. binomial)
  - Continuous random variable
    - Represented by density curves
    - Probabilities computed using area under the density curve

More probability models of data: density curves for continuous variables, and the Normal Distribution:

- We naturally sketch a smooth curve over a histogram of data, called a density curve. It is an idealization of the distribution of a continuous variable
- For all density curves: area under curve = proportion of observations = relative frequency = probability
- Density curve:
  - Idealized curve that represents a frequency distribution
  - Always on or above the x-axis
  - Scaled to a total area of 1 (i.e. 100%), so area can be used for probability
- Notation:
  - Mean of a density curve is denoted by $\mu$ (lowercase mu)
  - Standard deviation of a density curve is denoted by $\sigma$ (lowercase sigma)
- Examples of density curves:
  - Skewed density curves
  - Uniform density curves (plateau)
  - *Normal curves (bell curves)

Normal density curve (in more detail):

- Why are they the most important type?
  - For data, because a lot of the time they model variables that come from mixing many independent factors (in biology, behavior, stocks, etc.)
  - For sampling, normal curves model certain features of randomly selected samples (central limit theorem)
- Features of normal distributions:
  - Symmetric, unimodal
  - Concave-down top and gradually concave-up, symmetric tails (the points of inflection are one standard deviation from the mean); the general shape is like the function $f(x) = e^{-x^2}$
  - Bell-shaped
  - Mean, $\mu$, measures center
  - Standard deviation, $\sigma$, measures spread
- 68-95-99.7 approximation (empirical rule) for normal distributions:
  - ~68% of individuals fall within 1 std. dev. of the mean
  - ~95% of individuals fall within 2 std. dev. of the mean
  - ~99.7% of individuals fall within 3 std. dev. of the mean
- Standardized Normal Distribution
  - Mean $\mu = 0$
  - Standard deviation $\sigma = 1$
  - Standard Normal curve: $N(0, 1)$
  - The standard normal curve serves as a reference for discussing probabilities of all normal distributions because, on any normal curve, the same z-score corresponds to the same area

How probability distributions are used for evaluating a claim:

1. Conduct the study.
2. Model the data (distribution), based on a claim about the population.
3. Use area under the model to determine the likelihood of a potential outcome.
4. Get more data to evaluate the claim. If the outcome is unlikely based on the model, then there is evidence against the claim.

Standardized score (Z score):

- $Z = \dfrac{\text{observation} - \text{mean}}{\text{standard deviation}}$

Probability table for standard normal probabilities:

- Table entries give the area to the left of z (probability)
- Rows give the z-score through the tenths place
- Columns give the hundredths decimal place of the z-score
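
The formulas in these notes can be checked numerically. Below is a minimal Python sketch, not part of the original notes: the function names are made up for illustration, and the standard normal area is computed from the error function rather than read from a printed table. It works through the rat-vaccine example ($n = 8$, $p = 0.40$), the loaded-die question (18 rolls, chance 1/6 of a 4), and the 68% part of the empirical rule.

```python
from math import comb, erf, sqrt


def binomial_pmf(x, n, p):
    """P(x successes out of n) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def discrete_mean(values, probs):
    """Mean (expected value): weighted sum x_1*p_1 + ... + x_n*p_n."""
    return sum(x * p for x, p in zip(values, probs))


def discrete_sd(values, probs):
    """Standard deviation: sqrt of the sum of (x_i - mu)^2 * p_i."""
    mu = discrete_mean(values, probs)
    return sqrt(sum((x - mu) ** 2 * p for x, p in zip(values, probs)))


def z_score(observation, mean, sd):
    """Standardized score: (observation - mean) / standard deviation."""
    return (observation - mean) / sd


def standard_normal_cdf(z):
    """Area to the left of z under N(0, 1), i.e. a z-table entry."""
    return 0.5 * (1 + erf(z / sqrt(2)))


# Rat example from the notes: n = 8 rats, p = 0.40 chance each survives.
n, p = 8, 0.40
values = list(range(n + 1))                      # possible counts 0..8
probs = [binomial_pmf(x, n, p) for x in values]  # one-variable probability table

p_three = binomial_pmf(3, n, p)      # chance exactly 3 survive
mu = discrete_mean(values, probs)    # center of the count distribution
sigma = discrete_sd(values, probs)   # spread of the count distribution

# Die question from the notes: chance of seeing a 4 at most once in 18 rolls
# of a fair die (a small value would count as evidence the die is loaded).
p_at_most_one_four = sum(binomial_pmf(x, 18, 1 / 6) for x in (0, 1))

# Empirical-rule check: area within 1 standard deviation of the mean.
within_one_sd = standard_normal_cdf(1) - standard_normal_cdf(-1)

print(f"P(X = 3)            = {p_three:.4f}")
print(f"mean, sd of X       = {mu:.3f}, {sigma:.3f}")
print(f"P(at most one 4)    = {p_at_most_one_four:.4f}")
print(f"area within 1 sd    = {within_one_sd:.4f}")
```

The probability table built in `probs` plays the role of the one-variable probability table described in the notes, so the same `discrete_mean` and `discrete_sd` functions would apply to any other table (for instance, car-occupancy probabilities estimated from a study).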
