
Intr Stat&Data Anlys

by: Easton Mayert

Intr Stat&Data Anlys STATS 250

Easton Mayert
GPA 3.93

Thomas Venable Jr



This 5 page Class Notes was uploaded by Easton Mayert on Thursday October 29, 2015. The Class Notes belongs to STATS 250 at University of Michigan taught by Thomas Venable Jr in Fall. Since its upload, it has received 19 views. For similar materials see /class/231658/stats-250-university-of-michigan in Statistics at University of Michigan.


Date Created: 10/29/15
Stats 250 Exam 1 Study Guide

Chapter 1
- Statistics are numbers measured for some purpose.
- Statistics is a collection of procedures and principles for gathering data and analyzing information in order to help people make decisions when faced with uncertainty.

Chapter 2
- Simple summaries of data can tell an interesting story and are easier to digest than long lists.
- Raw data corresponds to numbers and category labels that have been collected or measured but not yet been processed in any way.
- Data is always taken from a sample.
- Variable is a characteristic that differs from one individual to the next.
- Sample data are collected from a subset of a larger population.
- Population data are collected when all individuals in a population are measured.
- Statistic is a summary measure of sample data.
- Parameter is a summary measure of population data.
- A categorical variable places an individual or item into one of several groups or categories.
  - Ordinal variable is when the groups have an order or ranking.
    - Ex: small, medium, large, extra large
- A quantitative variable takes numerical values for which arithmetic operations such as adding and averaging make sense.
  - Also known as the measurement variable or numerical variable.
  - It is a discrete variable if you can only have it as a whole number.
    - Ex: number of CDs you own
  - It is a continuous variable if you can have decimals.
    - Ex: time, age, etc.
- Visual summaries give us a visual representation of the data we are examining.
  - Bar graph gives us a bar with height corresponding to the number of items in that group.
    - Used for categorical data
  - Pie chart helps us see what part of the whole each group forms.
  - Histograms give us a graph for quantitative data.
    - Show us a distribution of the quantitative variable
- To interpret histograms we examine shape, location, and spread.
  - Shape can be symmetric, skewed, bell shaped, or uniform.
    - Symmetric (bell shape)
    - Left skew
    - Right skew
    - Uniform
  - Location refers to the center or the average.
    - The mean is usually used as the average but is pulled by outliers. With a skewed graph the median will be more accurate.
  - Spread (variability) refers to how far scores vary from the mean. Use the standard deviation to calculate this.
- There are two basic measures of location or center.
  - Mean, which is the numerical average value.
    - $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
  - Median is the middle value when data is arranged from smallest to largest. This doesn't allow high or low observations to skew it.
  - Note: the mean is sensitive to extreme observations while the median is resistant to extreme observations.
- Range measures the spread over 100% of the data.
  - Range = high value - low value = maximum - minimum
- Percentiles: the pth percentile is the value such that p% of the observations fall at or below that point.
  - Common percentiles
    - Median: 50th
    - First quartile Q1: 25th
    - Third quartile Q3: 75th
- Five number summary lists the median, first and third quartiles, and the highest and lowest value to give a quick overview of the data values and information about the center and spread.
- Interquartile range measures the spread over the middle 50% of the data.
  - IQR = Q3 - Q1
- Boxplots are graphical representations of the five number summary.
  - Steps
    - Label an axis with values to cover the minimum and maximum of the data.
    - Make a box with ends at quartiles Q1 and Q3.
    - Draw a line in the box at the median M.
    - Check for possible outliers using the 1.5 × IQR rule and, if any, plot them individually.
    - Extend lines from the ends of the box to the smallest and largest observations that are not possible outliers.
  - Note: possible outliers are observations that are below Q1 - 1.5 × IQR or observations that are above Q3 + 1.5 × IQR.
  - Side by side boxplots are good for comparing data.
  - Watch out because points plotted individually are still part of the data.
    - They are just outliers.
  - Boxplots cannot confirm shape.
- Possible reasons for outliers and possible actions
  - The outlier is a legitimate data value and represents natural variability for the group and the variable measured.
    - Values may not be discarded, for they provide important information about location and spread.
  - A mistake was made while taking a measurement or entering it into the computer.
    - If this can be verified, the value should be corrected or discarded.
  - The individual in question belongs to a different group than the bulk of individuals measured.
    - Values may be discarded if a summary is desired and reported for the majority group only.
- Standard deviation is the measure of spread of the observations from the mean.
  - Roughly the average distance the observations fall from the mean.
  - The squared standard deviation is the variance.
  - Interpretation for standard deviation: on average, the [X variable] vary by about [standard deviation] from their mean of [mean].
    - Ex: on average, the weights of small orders of French fries vary by about 6 g from their mean weight of 73 g.
  - Notes
    - s = 0 means there is no variation and all the scores are the same.
    - Like the mean, s is sensitive to outliers.
- We use the mean and standard deviation for data that is bell shaped, and we use a five number summary for skewed data.
- The empirical rule states that for bell shaped curves, approximately
  - 68% of the values fall within 1 standard deviation of the mean
  - 95% of the values fall within 2 standard deviations of the mean
  - 99.7% of the values fall within 3 standard deviations of the mean

Chapter 5
- Descriptive statistics describe data using numerical summaries and graphical summaries.
- Inferential statistics use sample information to make conclusions about a larger group of items or individuals than just those in the sample.
- Population is the entire group of individuals that we want information about (about which inferences are to be made).
- Sample: the smaller group, the part of the population we actually examine in order to gather information.
- Variable: the characteristic of the items or individuals that we want to learn about.
- Fundamental rule for using data for inference: available data can be used to make inferences about a much larger group if the data can be considered to be representative with regard to the question of interest.
- Bias
  - Selection bias occurs if the method for selecting the participants produces a sample that does not represent the population of interest.
  - Nonparticipation (nonresponse) bias occurs when a representative sample is chosen for a survey but a subset cannot be contacted or does not respond.
  - Biased response occurs when participants respond differently from how they truly feel.
    - The way the questions are worded, the way the interviewer behaves, etc. can lead to individuals providing false information.
- Margin of error refers to how close the sample proportion comes to the truth for the entire population.
  - Conservative margin of error is $\frac{1}{\sqrt{n}}$, where n is the sample size.
- Most inference methods require the data to be considered a random sample.
- Independence means that the response you will obtain from one individual doesn't influence the response you will get from another individual.
- Identically distributed means all of the responses come from the same distribution.

Chapter 6
- Observational studies
  - The researchers simply observe or measure the participants and do not assign any treatments or conditions.
  - Participants are not asked to do anything differently.
- Experiments
  - The researchers manipulate something and measure the effect of the manipulations on some outcome of interest.
    - Participants are randomly assigned to different treatments.
- Explanatory variable is the variable whose effect we are interested in learning.
  - It has its effect on the outcome, or response, variable.
- Confounding variable is a variable that affects the response variable and is related to the explanatory variable.
  - The effect of a confounding variable cannot be separated from the effect of the explanatory variable.
  - Confounding variables might be measured and accounted for in the analysis, or they could be lurking variables.
- Randomized experiments control the influence of confounding variables.

Chapter 7
- Probability is the proportion of times an event occurs.
  - It applies to the population, not the sample.
- Two events are mutually exclusive if they do not contain any of the same outcomes, so their intersection is empty.
- Two events are independent if knowing that one will occur or has occurred does not change the probability that the other occurs.
  - Mutually exclusive does not indicate independent.
- A sample is drawn with replacement if individuals are returned to the eligible pool for each selection.
  - If they are not eligible for subsequent selection, it is a sample drawn without replacement.

Chapter 8
- Random variable assigns a number to each outcome of a random circumstance, or a random variable assigns a number to each unit in the population.
  - Discrete random variable can take one of a countable list of distinct values.
    - Two conditions: the sum of all individual probabilities must equal 1, and the individual probabilities must be between 0 and 1.
  - Continuous random variable can take any value in an interval or collection of intervals.
- Expected value of a random variable is the mean value of the variable in the sample space.
  - Can be interpreted as the mean value that would be obtained from an infinite number of observations on the random variable.
- Binomial random variables count the number of times a certain event occurs out of a particular number of observations or trials of a random experiment.
  - The conditions are
    - There are n trials, where n is determined in advance and is not a random value.
    - There are two possible outcomes on each trial: success and failure.
    - The outcomes are independent from one trial to the next.
    - The probability of a success remains the same from one trial to the next.
    - The probability of a failure remains 1 - p(success) for every trial.
  - A binomial random variable is defined as X = number of successes in the n trials of a binomial experiment.
- Probability distribution of a continuous random variable is described by a density curve.
  - The curve must lie on or above the horizontal axis.
  - The area under the curve is equal to 1.

Chapter 9
- The distribution of all possible values for a statistic for repeated samples of the same size from a population is called the sampling distribution of the statistic.

Chapter 10
- The sample estimate provides our best guess as to the value of the population parameter, but it is not 100% accurate.
- The value of the sample estimate will vary from one sample to the next.
  - Values often vary around the population parameter.
  - Standard deviation gives an idea about how far the sample estimates tend to be from the true population proportion on average.
  - Standard error of the sample estimate provides an idea of how far away it would tend to vary from the parameter value.
- The confidence level is the probability that the procedure that is used to determine the interval will provide an interval that includes the population parameter.
  - Applies to the procedure, not an individual interval.

Chapter 12
- The null hypothesis is a statement that there is nothing happening.
  - Generally the researcher hopes to reject this
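The Chapter 2 summaries (five number summary, IQR, and the 1.5 × IQR rule for possible outliers) can be sketched in Python as below. This is only an illustration under stated assumptions: the `weights` data are made up, the helper names are hypothetical, and the quartiles are taken as the medians of the lower and upper halves of the sorted data, which is one of several conventions and may differ slightly from what a given textbook or software package uses.

```python
from statistics import mean, median, stdev

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data (the overall median is excluded when n is odd).
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]              # observations below the median position
    upper = data[half + (n % 2):]    # observations above the median position
    return data[0], median(lower), median(data), median(upper), data[-1]

def possible_outliers(values):
    """Flag observations below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]

# Hypothetical data: weights in grams, with one unusually large value.
weights = [68, 70, 71, 72, 73, 74, 75, 76, 78, 110]

summary = five_number_summary(weights)   # (68, 71, 73.5, 76, 110)
outliers = possible_outliers(weights)    # [110]
center_mean = mean(weights)              # pulled upward by the large value
center_median = median(weights)          # resistant to the large value
spread = stdev(weights)                  # sample standard deviation s
```

In this made-up data set the single large value raises the mean above the median, which matches the note that the mean is sensitive to extreme observations while the median is resistant.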
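For the Chapter 5 conservative margin of error, a short worked example may help. It assumes the usual form 1/sqrt(n) for a sample proportion; the sample size of 1600 and the sample proportion of 0.55 are invented for illustration, and the function name is hypothetical.

```python
import math

def conservative_margin_of_error(n):
    """Conservative margin of error for a sample proportion: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)

n = 1600        # hypothetical sample size
p_hat = 0.55    # hypothetical sample proportion

moe = conservative_margin_of_error(n)    # 1 / 40 = 0.025
low, high = p_hat - moe, p_hat + moe     # roughly 0.525 to 0.575
```

Because the margin shrinks with the square root of n, quadrupling the sample size only halves the margin of error.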
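The Chapter 8 conditions for a discrete random variable and the idea of expected value can be checked numerically for a binomial random variable. The sketch below uses the standard binomial probability formula, which the notes do not write out, and the values n = 10 and p = 0.3 are chosen only for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X = number of successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3   # hypothetical number of trials and success probability

# Probability of each possible number of successes, 0 through n.
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Condition check: probabilities lie between 0 and 1 and sum to 1.
total = sum(probs)

# Expected value: each value weighted by its probability.
expected = sum(k * pk for k, pk in enumerate(probs))
```

Here `total` comes out to 1 (up to floating-point rounding) and `expected` agrees with the known binomial mean n × p = 3, the long-run average number of successes over many repetitions of the 10 trials.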

