Stat 110 Final Exam Study Guide
This 9 page Study Guide was uploaded by Kara Lyles on Wednesday April 27, 2016. The Study Guide belongs to STAT 110 - 002 at University of South Carolina taught by Gail Ward-Besser (P) in Spring 2016.
Date Created: 04/27/16
Studies
Sample survey: information is collected from only part of the population
Observational study: observes and measures but does not influence the response
Experimental study: a treatment is imposed in order to observe responses. Can give good evidence about what causes the response
Population: the entire group of individuals that we want information about
Sample: the part of the population that is actually measured and used to calculate the statistic
Parameter: number describing the population
Statistic: number describing the sample, used to estimate the parameter

Sampling methods
Simple random sample (SRS): basic sampling method
- All individuals have an equal chance of being chosen
- First label each individual in the population, then use a random number table to select labels at random
- Avoids BIAS
Stratified random sample:
- Stratify first (put individuals in groups)
- Then take a random sample within each stratum
- Ex. Look at chp. 4 notes on pg 4
Cluster sampling: deals with REDUCING COST and improving EFFICIENCY
Convenience sample:
- Easy and quick; the researcher asks whoever they want to participate
- Can't be trusted; can be biased
Voluntary response sample: AKA self-selection sample
- Individuals volunteer to be a part of the sample
- Can't be trusted; can be biased

Errors
Sampling error: variation between the statistic and the parameter
- Caused by chance
- Sampling error is the ONLY error accounted for in the margin of error
Bad sampling methods: VOLUNTARY RESPONSE and CONVENIENCE
Non-sampling error: processing errors, response errors, poorly worded questions, non-response

Variables
Response variable: measures the outcome of the study (dependent variable)
Explanatory variable: explains or causes change in the response variable (independent variable)
Lurking variable: has an important impact on the relationship among the variables of the study (often it is what's forgotten or not thought of)
Confounding variables: variables whose effects on the response can't be told apart. May be explanatory or lurking variables.
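The simple random sample and stratified random sample procedures described under the sampling methods above can be sketched in Python. This is only an illustration, not part of the class notes: the function names, the 20 made-up students, and the two strata are all hypothetical, and Python's random number generator stands in for the random number table.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Label every individual, then draw n labels at random (no repeats)."""
    rng = random.Random(seed)
    labels = list(range(len(population)))  # step 1: label each individual
    chosen = rng.sample(labels, n)         # step 2: select labels at random
    return [population[i] for i in chosen]

def stratified_random_sample(strata, n_per_stratum, seed=None):
    """Take a separate simple random sample within each stratum (group)."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population of 20 students (an assumption for illustration).
students = [f"student_{i}" for i in range(1, 21)]
srs = simple_random_sample(students, 5, seed=110)

# Hypothetical strata: first 10 are freshmen, last 10 are seniors.
strata = {"freshmen": students[:10], "seniors": students[10:]}
stratified = stratified_random_sample(strata, 2, seed=110)

print(srs)
print(stratified)
```

Every student has the same chance of landing in `srs`, while `stratified` is guaranteed to contain exactly 2 people from each group.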
Placebos
Placebo: dummy treatment
Double blind: neither the subjects nor the people working with them know which treatment each subject is receiving
Single blind: only one side (either the experimenter or the subject) knows the treatment being received

Designs
Matched pairs design: a type of block design
- Only compares 2 treatments
- Pairs of subjects are closely matched
- Randomly assign the treatments to the members of each pair
- Improvement on the SRS
Completely randomized design:
- Randomly assign subjects to treatments
- Compare treatments (usually against the control)
Block design:
- Random assignment of subjects is carried out separately within each block
- Able to draw conclusions within each block
Statistically significant: an observed difference large enough that it would rarely occur by chance

Data Ethics
Institutional Review Board: goal is to protect subjects from possible harm
Informed consent: usually a written form obtained beforehand, letting the subject know of the risks

Phase trials
Phase 1: small group; evaluates safety and side effects
Phase 2: larger group; evaluates safety and effectiveness
Phase 3: larger group; confirms effectiveness and compares with commonly used treatments
Phase 4: post-marketing studies

Variable: what you want to measure or know (Ex. length of table)
Instrument: what is used to make the measurement (Ex. ruler)
Unit of measure: unit used to take the measurement (Ex. cm, inch, lb., feet)
Measurement of the variable: numerical value assigned to the measurement (Ex. 4 ft)

Types of variables
Categorical: category, group
- Nominal: no order (named categories). Ex. gender
- Ordinal: natural order (has some type of order). Ex. education experience
Quantitative: quantity, numerical value
- Discrete: countable integers (whole numbers). Ex. number of friends
- Continuous: measurable (finer accuracy). Ex. height

Probability Rules
Rule 1: a probability must be a number between 0 and 1
Rule 2: the probabilities of all possible outcomes must add up to 1
* If an assignment of probabilities passes both rules 1 & 2, it's legitimate
Rule 3 (AKA complement rule): if the question asks for the probability that an event doesn't occur, use this formula: 1 - probability the event WILL occur
- Probability = numerical (has to be a number). Ex. The probability of the dog losing the race is .40; what is the probability of it winning? ANSWER: 1 - .40 = .60
- Complement = opposite (think opposite). Ex. What is the complement of the dog losing? ANSWER: the dog winning
Rule 4 (union rule): think "united", so there is addition. Events must be disjoint.
- Formula: P(A or B) = P(A) + P(B)
- * You may be given two events that are not disjoint, along with the probability of both occurring. If asked for the probability that either one occurs, use this formula: P(A or B) = P(A) + P(B) - P(A and B)
Rule 5 (multiplication rule): if events are INDEPENDENT of each other, multiply: P(A and B) = P(A) x P(B)

Law of Large Numbers
The mean of the observed outcomes eventually approaches the expected value as the number of observations grows

Central Limit Theorem
The sampling distribution of the statistic is approximately normal when you take many, many samples of sufficiently large sample size

Tree diagram
A way of presenting complicated probabilities
Different branches represent different stages
If asked for an intersection (and), you multiply. Ex: what is the probability of surviving AND the transplant being a success?
If asked for a union (or), you add. Ex: what is the probability of living OR dying?

Expected gain: formula given
Confidence intervals: formula given
- 68%: within 1 standard deviation above and below the mean
- 95%: within 2 standard deviations above and below the mean
- 99.7%: within 3 standard deviations above and below the mean
- Mean = p (in confidence interval)

Measurements
Valid: must relate to the subject of the matter to be valid
Reliable: gives similar results when the measurement is repeated
Bias: consistent, repeated deviation of the statistic from the parameter in the same direction
Variability: how spread out the values are
- To reduce variability, increase the sample size
Good sampling: has small bias and low variability
Percent change formula: given on the formula sheet
Margin of error (MOE): estimates the amount of uncertainty in the estimate
- MOE = 1 / √n
- Based on the sample size n

Bar graph:
- Can represent a categorical variable (bars can be shown in some order)
- Can compare quantitative variables
- Frequency bar graph uses COUNT
- Relative frequency bar graph uses PERCENTAGE
- There are SPACES between bars
Pie chart:
- Represents categorical variables (no ordering)
- Must represent a whole (100% of the data)
- Wedges represent parts
Pictograms: DON'T USE, very misleading
Line chart:
- Measures across time
- Points are connected by lines
- Includes trends (seasonal variation)
- Time is always plotted on the horizontal axis
- What's being measured is on the vertical axis
- Overall pattern is a trend that goes upward/downward over time
Histograms:
- Show a quantitative variable divided into groups (bins)
- All groups must cover the same amount of range
- Bars are equal widths and touch
- Height of each bin gives its frequency
- Frequency histograms show counts; relative frequency histograms show percentages
Stem plot:
- Used for quantitative data
- Keep duplicates
- Increasing numerical order
- Stems are in a vertical column
Percent change: formula will be provided
- Positive result: percent increase; negative result: percent decrease
Box and whisker plot:
- Five number summary: Min, Q1, Q2, Q3, Max
- Q1: median of the first half (left side) of the data
- Q2: median
- Q3: median of the second half (right side) of the data
- How to calculate quartiles: look at chp 12 pg 4

Interpreting Scatterplots
Look at the overall pattern
Look for outliers
Described by form, direction and strength
- Form: clustered, curved, linear
- Strength: how closely the points follow the form
- Direction: positive or negative
- Positive association: the pattern slopes upward as you move from left to right
- Negative association: the pattern slopes downward as you move from left to right

Correlation (r): describes the direction and strength of a straight-line relationship
- r = 0: no linear association
- r = 1 or -1: perfect straight line
- r has no units, which means it is not affected by a change in units
- Strongly affected by outliers (if outliers are taken away, the correlation usually gets stronger)
- Ignores the difference between explanatory and response variables

Measures of Central Tendency
Mean: the average, which is also the balancing point
Median: the midpoint of the distribution (M)
Mode: value that occurs the most often

Shapes of distribution
Skewed right: right tail extends further than the left tail (Ex. survival data)
Skewed left: left tail extends further than the right tail (Ex. test scores)
Symmetric: right and left sides mirror each other

Interpretation: must state the confidence, must refer to the population and not the sample, usually given as a percent
Decrease the sample size and variability increases; increase the sample size and variability decreases

Hypothesis Testing
Null hypothesis
- Uses the "=" sign
- Assume the null hypothesis is true until you find evidence to reject it
- Status quo, no change
- Mean of normal distribution
Alternative hypothesis
- Tells us what probability we are calculating under the curve
- Uses signs <, >, or ≠
- Experimental hypothesis
P-value
- A probability; a small p-value is evidence against the null
- "If the p-value is low, the null shall go": if the p-value is lower than the level of significance, you have enough evidence to reject the null
- The level of significance is the value we use to determine whether to reject or fail to reject the null hypothesis
- Rejecting the null DOESN'T mean you can accept Ha, but you can find in favor of Ha

Correlation
- Extra sensitive to outliers
- Never to be reported by itself (use scatterplots and other statistics)

Regression line
Regression line: describes the linear relationship between two variables (x explanatory variable, y response variable)
Equation of a line: y = mx + b
- m: the slope of the line
- b: the intercept of the line (the value of y when x = 0)
Least-squares regression line: "the line that makes the sum of the squared vertical distances to the line as small as possible" (in class notes)
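To make the regression line and correlation definitions above concrete, here is a minimal Python sketch that computes the slope m, the intercept b, and the correlation r using the standard least-squares formulas. The five (x, y) data points and the function names are made up for illustration and do not come from the class notes.

```python
import math

def least_squares_line(xs, ys):
    """Return (m, b) for the line y = m*x + b that minimizes the
    sum of squared vertical distances from the points to the line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx            # slope
    b = y_bar - m * x_bar    # intercept: value of y when x = 0
    return m, b

def correlation(xs, ys):
    """Return r, the direction and strength of the straight-line relationship."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: x is the explanatory variable, y is the response.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, b = least_squares_line(xs, ys)
r = correlation(xs, ys)
print(f"y = {m:.2f}x + {b:.2f}, r = {r:.2f}")
```

Multiplying every x by a constant (a change of units) changes m but leaves r the same, which is the "r has no units" property.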