# STAT 2004, Midterm 1 Study Guide

Virginia Tech

This 8 page Study Guide was uploaded by Mara DePena on Thursday February 25, 2016. The Study Guide belongs to STAT 2004 at Virginia Polytechnic Institute and State University taught by Metzger in Spring 2016. Since its upload, it has received 263 views. For similar materials see Introductory Statistics in Statistics at Virginia Polytechnic Institute and State University.

Date Created: 02/25/16

## STAT 2004 STUDY GUIDE: MIDTERM ONE

### TABLE OF CONTENTS

- Course Logistics/About the Midterm
- Populations and Samples
- Sampling Methods
- Experimental Design
- Visualizing Numerical Data
- Distribution
- Boxplots
- Robustness
- Probability
- Probability Distribution
- Symbols

### COURSE LOGISTICS/ABOUT THE MIDTERM

- You must bring a calculator to the midterm.
- The midterm will consist of multiple choice and short answer questions.
- Statistics is the study of how best to collect, analyze, and draw conclusions from data.
- Data consists of observations, and these observations form the backbone of a statistical investigation.

### POPULATIONS AND SAMPLES

- Population- Represents all people or things of interest.
- Sample- Observed/measured subset of a population.
- Summary statistic- A single number that summarizes a large amount of data.
- Variables- Measured or observed characteristics of data.
  - Categorical variable- Responses themselves are categories.
    - Nominal- Unordered levels.
    - Ordinal- Ordered levels.
  - Numerical variable- Counts/measures information. Can take a wide range of numerical values. Sensible to add, subtract, or average these values.
    - Discrete- Finite, countable scale. Can only take numerical values with jumps. (Ex: 1, 2, 3, 4…)
    - Continuous- Continuous scale. (Height, weight, etc.)
  - Associated/dependent/correlated- When two variables show some connection with one another.
  - Independent- When two variables are not associated.
  - Correlation is not causation.
  - Explanatory variable- In scientific terms, this is the independent variable.
  - Response variable- In scientific terms, this is the dependent variable.
  - Confounding variable- Variable that is correlated with both the explanatory and response variables.

### SAMPLING METHODS

- We seek to randomly select samples from a population.
- Bias- When a sample systematically misrepresents the population, for example because it is skewed toward a person’s interests.
  - Non-response bias- Can skew results when people do not respond.
- Sample frame- List/roster of all potential observations (numbered).
- Simple random sample- Most basic random sample. Equivalent to using a raffle. All observations have an equal chance of being chosen.
- Stratified random sample- Divide-and-conquer sampling strategy. Population is divided into groups called strata by demographics/subgroups. Similar cases are grouped together.
  - Random samples are drawn from each stratum.
- Cluster sample- Population is divided into clusters, often but not always by location.
  - Clusters are selected at random, and all members of the selected clusters are measured/given treatment.

### EXPERIMENTAL DESIGN

- Observational study- A study in which data is collected in a way that does not directly interfere with how the data arises.
- Experiment- Used to investigate the possibility of causation. Has an explanatory and a response variable. (Independent and dependent in scientific terms.)
  - Randomized- When individuals are randomly assigned to a group.
  - Placebo- Fake treatment.
- Prospective study- Identifies individuals and collects information as events unfold.
- Retrospective study- Collects data after events have taken place.
- Principles of Experiment Design:
  1. Randomization- Subjects sampled randomly, treatments/control assigned randomly.
  2. Replication- Large sample size based on cost/convenience.
  3. Error control- Eliminate/account for any differences in the sample.
     - Placebo
     - Blinding- Subjects do not know if they are in the treatment or control group.
     - Double-blinding- Researcher also doesn’t know who is treatment/control.
     - Blocking- Group subjects into blocks that share some other variable.
- Treatments are applied to experimental units.
- Response is measured on observational units.
  - Ex: If you modify the temperature in several fish tanks and record the heart rate of fish in different temperature tanks, the tanks are the experimental units while the fish are the observational units.

### VISUALIZING NUMERICAL DATA

- Dot plot
- Histogram
  - Sorts values into bins and provides a view of data density.
  - Right-skewed- Data trails off to the right.
  - Left-skewed- Data trails off to the left.
  - Unimodal- One prominent peak.
  - Bimodal- Two prominent peaks.
  - Multimodal- More than two prominent peaks.
- Scatterplot
  - Provides a case-by-case view of data for two numerical variables.
- Stem-and-leaf plot
  - Data set: {1, 2, 4, 7, 7, 7, 12, 15, 18, 22, 24}

| Stem (tens) | Leaves (ones) |
|---|---|
| 2 | 2, 4 |
| 1 | 2, 5, 8 |
| 0 | 1, 2, 4, 7, 7, 7 |

### DISTRIBUTION

- Describes shape, center, and spread/variation of data.
- [Figures: curves showing example distribution shapes. Imagine histograms that fit the depicted curves.]
- Sample standard deviation- Tells you how spread out your data is. It is the square root of the variance.

### BOXPLOTS

- A boxplot uses a five number summary consisting of the median ($Q_2$), minimum, maximum, and the 25th ($Q_1$) and 75th ($Q_3$) percentiles. It summarizes a data set while also plotting unusual observations known as outliers.
  - Outliers- Observations that are extreme relative to the rest of the data.
- [Figure: example boxplot depicting test scores.] Boxplots are usually vertical, but this one is depicted horizontally.
  - The first step is to draw the median. The second step is to draw a rectangle to represent the middle 50% of the data.
- Interquartile range (IQR)- $\text{IQR} = Q_3 - Q_1$. It is the length of the box in the boxplot.
- IQR Method
  - One of the many methods for identifying outliers.
  - Lower cutoff- $Q_1 - (1.5 \times \text{IQR})$
  - Upper cutoff- $Q_3 + (1.5 \times \text{IQR})$
  - These upper and lower cutoffs set the maximum reach of the whiskers attached to the box. Any points outside of the whisker range are considered outliers and are labeled with a dot.

### ROBUSTNESS

- Robust estimate- Strong/effective in all/most situations and conditions.
- Outliers do not change a robust estimate very much.
- The median and IQR are considered robust estimates.

### PROBABILITY

- Probability- The proportion of times an outcome would occur if the random process were repeated infinitely many times.
- Sample space- The set of all possible outcomes.
- Law of Large Numbers- As the number of trials increases, the estimate gets closer to the true probability. In other words, as a sample size increases, a statistic gets closer to the parameter it is estimating.
- Example problem one: Find P(7) (probability of rolling a sum of 7) with two fair independent dice.
  - How can a 7 be rolled? Of the 36 equally likely outcomes in the sample space, six give a sum of 7: {(1,6) or (2,5) or (3,4) or (6,1) or (5,2) or (4,3)}
  - Mutually exclusive/disjoint- Cannot both happen together. (Ex: Sanders and Bush cannot both be elected president.) In this example, you cannot get two of these results at the same time.
  - Addition Rule of Disjoint Outcomes- If A1 and A2 represent two disjoint outcomes, then the probability that one of them occurs is given by P(A1 or A2) = P(A1) + P(A2).
  - In this example we add the individual probabilities for each pair, and multiply the probabilities for each die.
  - P(7) = (1/6) x (1/6) + (1/6) x (1/6) + (1/6) x (1/6) + (1/6) x (1/6) + (1/6) x (1/6) + (1/6) x (1/6) = 6/36 or 1/6
- A summary of the rules…
  - If A and B are disjoint… P(A or B) = P(A) + P(B)
  - If A and B are independent (knowledge of one doesn’t affect knowledge of the other)… P(A and B) = P(A) x P(B)
  - If A and B are not disjoint… P(A or B) = P(A) + P(B) – P(A and B)
- Example problem two:
  - We have a 52 card deck. Each suit consists of the numbers 2-10 and a jack, queen, king, and ace. Suits are diamond and heart (red), and club and spade (black).
  - What is the probability of having a red King?
    - There are 52 cards, and two red Kings: the King of Diamonds and the King of Hearts. Therefore P(red King) = 2/52.
  - What is the probability of drawing a card that is red or a King?
    - P(red or King) = P(R) + P(K) – P(R and K) = (26/52) + (4/52) – (2/52) = 28/52
- Venn diagrams can be used as a visual aid when solving probability problems.
  - We can make a Venn diagram for the previous example. One circle represents the red cards, the other the Kings. The overlap represents Kings that are also red cards, and the rectangle represents the remainder of the deck. People who are more visual and less math-minded may prefer drawing a Venn diagram to working out the formulas.
  - [Figure: Venn diagram. Red cards that are not Kings: 24; red Kings (overlap): 2; Kings that are not red: 2; remainder of the deck: 24.]

### PROBABILITY DISTRIBUTION

- A table or graph showing all possible outcomes and their probabilities.
- Ex: Roll two dice. There are 11 possible sums.

| X (sum) | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| P(X) | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36 |

- Note that not all probability distributions will follow such a curve (here, a single peak at 7 with symmetric tails). Also note that all probabilities add up to one or to 100%.
- Possible short answer question for the midterm:
  - Describe the distribution of two fair coin flips. S = {HH, HT, TH, TT}

| Outcome | HH | HT | TH | TT |
|---|---|---|---|---|
| Probability | 1/4 | 1/4 | 1/4 | 1/4 |

### SYMBOLS

| Symbol | Meaning |
|---|---|
| $\tilde{x}$ | Median |
| $\bar{x}$ | Mean |
| $\mu$ | Population mean |
| ^ (hat over a symbol) | Estimation |
| $\sigma^2$ | Variance |
| $\sigma$ | Standard deviation |
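
The two-dice table and the red-or-King example from the probability sections can be checked by counting equally likely outcomes directly. Below is a minimal Python sketch of that idea. It is not part of the original guide, and the function names `two_dice_distribution` and `red_or_king_probability` are made up for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_distribution():
    """Return {sum: probability} for two fair, independent six-sided dice."""
    # All 36 equally likely (die1, die2) pairs form the sample space.
    outcomes = list(product(range(1, 7), repeat=2))
    # Count how many pairs produce each sum.
    counts = Counter(a + b for a, b in outcomes)
    return {total: Fraction(n, len(outcomes)) for total, n in sorted(counts.items())}


def red_or_king_probability():
    """Return P(red or King) by counting cards in a standard 52-card deck."""
    ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
    suit_colors = {"diamonds": "red", "hearts": "red",
                   "clubs": "black", "spades": "black"}
    deck = [(rank, suit) for rank in ranks for suit in suit_colors]
    # A card is favorable if it is red, a King, or both (counted once).
    favorable = [card for card in deck
                 if suit_colors[card[1]] == "red" or card[0] == "K"]
    return Fraction(len(favorable), len(deck))


dist = two_dice_distribution()
for total, p in dist.items():
    print(f"P(X = {total}) = {p}")
print("Sum of all probabilities:", sum(dist.values()))
print("P(red or King) =", red_or_king_probability())
```

`Fraction` reduces automatically, so 6/36 is printed as 1/6 and 28/52 as 7/13. Counting each favorable card once gives the same answer as the formula P(R) + P(K) – P(R and K), because the subtraction removes the two red Kings that would otherwise be counted twice.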
