Statistics 201 notes week 1
This 5 page Class Notes was uploaded by Jessica Namesnik on Thursday September 1, 2016. The Class Notes belongs to STAT-201 at Colorado State University taught by Kirk Ketelsen in Fall 2016. Since its upload, it has received 28 views. For similar materials see General Statistics in Statistics at Colorado State University.
Date Created: 09/01/16
Statistics 201 8/25/16

Statistics: how data is collected, analyzed, and interpreted.
Descriptive statistics: describes a dataset, the data itself; does not generalize facts about the dataset to a larger group.
Inferential statistics: generalizes from the data to a larger group (the generalizations are called inferences); inferences contain uncertainty.
Data: information; takes the form of observed measurements or descriptions.
Stats: how we apply data to the real world.
Variables: items of interest that can take on different values; the type of measurement being taken.
Population: the entire overall group we are interested in.
Population of interest / target population: e.g., if we want the average height of U.S. girls, the population of interest is U.S. girls. It can be large or small.
Parameter: a number pertaining to a population (e.g., the average height of U.S. girls).
Statistic: any number calculated from data to estimate a parameter.
Sample: the subset of the population we collect data on; the variable of interest is measured on its members.
Observation: a single member of a sample.
Census: measurements obtained from every member of the population.
Concerns: Is the sample large enough? Is the sample representative of the population of interest?

Statistics 201 8/30/16

Bias: when a statistic is produced in a way that makes it likely to differ from the population parameter it was meant to estimate.
Sampling bias: when the sample isn't representative of the population of interest.
Self-selection bias: when individuals select themselves, e.g., when voting for the most talented musician and the musician votes for themselves.
Nonresponse bias: when certain types of respondents are more or less likely to answer a survey, or to answer it honestly, e.g., high school kids raising their hands for a survey on virginity.
Simple random sample (SRS): a sample of a population in which each unit of the population has an equal opportunity to be selected. Why? Because it can help overcome self-selection bias and sampling bias.
Observational study: variable values are observed and recorded from already existing data.
Controlled experiment: the researcher assigns members of the study to different groups, which get different experimental conditions.
Treatment group: undergoes the procedure.
Control group: does not undergo the procedure.
Placebo effect: if a person believes a treatment will be beneficial, there is a chance they will experience the beneficial effect regardless of whether they are treated.
Correlation doesn't imply causation.
Confounding variable: a variable that helps explain the data but is not accounted for in the study.
Blinding: an attempt to eliminate bias by not telling the treatment and control groups which of them is getting the treatment.
Double-blinding: neither the study groups nor the researcher knows which group is the control group and which is the treatment group.

9/01/16

Location: where is the data set "located" on a number line? Where is its center?
Spread: how dispersed is the data?
Five-number summary: minimum, Q1, median, Q3, maximum.
Outliers: any unusual values in the data set.
Shape: what is the shape of the distribution of values in the dataset?
Center: mean and median.
Mean: the average; the sum of the data divided by the sample size; denoted by an x with a line above it (x̄).
Sample size: the number of observations in a sample, denoted "n".
mean = (sum of data) / (sample size)
Median: if you put the data in order from smallest to largest, the number in the middle is the median; it separates the upper 50% from the lower 50%.
To find the median, compute the rank (n+1)/2, which tells you which ordered observation is the median. If the rank is an integer (3, 5, etc.), go right to that position in the ordered data set; otherwise, average the two surrounding observations. E.g., if the rank = 5, the 5th ordered observation is the median.
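The mean formula and the rank rule for the median can be sketched in a few lines of Python. This is only an illustration of the steps in the notes; the function names mean and median_by_rank are made up for this example and are not from the course.

```python
def mean(data):
    """Sum of the data divided by the sample size n."""
    if not data:
        raise ValueError("mean of an empty sample is undefined")
    return sum(data) / len(data)


def median_by_rank(data):
    """Median found with the rank rule (n + 1) / 2 from the notes."""
    if not data:
        raise ValueError("median of an empty sample is undefined")
    ordered = sorted(data)          # smallest to largest
    n = len(ordered)
    rank = (n + 1) / 2              # 1-based position in the ordered data
    if rank.is_integer():
        # Integer rank: go straight to that ordered observation.
        return ordered[int(rank) - 1]
    # Non-integer rank (e.g. 2.5): average the two surrounding observations.
    lower = ordered[int(rank) - 1]  # observation just below the rank
    upper = ordered[int(rank)]      # observation just above the rank
    return (lower + upper) / 2


print(mean([2, 4, 6, 8]))                            # sum 20 divided by n = 4
print(median_by_rank([1, 2, 3, 4, 5, 6, 7, 8, 9]))   # rank 5, odd n
print(median_by_rank([4, 1, 3, 2]))                  # rank 2.5, even n
```

With n = 9 the rank is 5, so the function returns the 5th ordered value. With n = 4 the rank is 2.5, so it averages the 2nd and 3rd ordered values.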
Lower quartile (Q1): lies below the median; separates the lower 25% from the upper 75% of the data.
Upper quartile (Q3): lies above the median; separates the lower 75% from the top 25% of the data.
To calculate: put parentheses on either side of the median to separate the lower and upper halves of the data set.
E.g., 1,2,3,4,5,6,7,8,9: n = 9, rank = (9+1)/2 = 5, so the median is the 5th value, which is 5. This gives (1,2,3,4) 5 (6,7,8,9).
Q1 = the median of the lower half of the data (1,2,3,4) = 2.5
Q3 = the median of the upper half of the data (6,7,8,9) = 7.5
Extremes: minimum and maximum.
Box plot / box-and-whisker plot: drawn from Min, Q1, median, Q3, Max. Whiskers go to the min and max, or to the furthest outliers; 50% of the data is in the box, 25% below it, and 25% above it.
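The quartile method above (split the ordered data around the median, then take the median of each half) might look like this in Python. The notes only show an odd-sized data set, so splitting an even-sized data set into two equal halves is an assumption here, and the function name five_number_summary is made up for this example. Other software may use a different quartile convention and give slightly different values.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the split-at-the-median method."""
    if len(data) < 2:
        raise ValueError("need at least two observations")
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    if n % 2 == 1:
        # Odd n: the median itself is left out of both halves,
        # as in (1,2,3,4) 5 (6,7,8,9).
        upper_half = ordered[half + 1:]
    else:
        # Even n (assumption, not shown in the notes): split evenly.
        upper_half = ordered[half:]
    return (
        ordered[0],          # minimum
        median(lower_half),  # Q1
        median(ordered),     # median
        median(upper_half),  # Q3
        ordered[-1],         # maximum
    )


print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

For the data set 1 through 9 this reproduces the worked example: Q1 is the median of (1,2,3,4) and Q3 is the median of (6,7,8,9).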