Test 1 Topics
Test 1 Topics Stat 2013
This 11 page Study Guide was uploaded by Morgan Walker on Tuesday February 9, 2016. The Study Guide belongs to Stat 2013 at Oklahoma State University taught by Robert Adam Molnar in Winter 2016. Since its upload, it has received 105 views. For similar materials see Elementary Statistics in Statistics at Oklahoma State University.
Date Created: 02/09/16
Test 1 Topics

Chapter One

Population vs Sample
  - Population: all subjects under study
  - Sample: a smaller group selected from the population, so the study is more manageable

Descriptive vs Inferential statistics
  - Descriptive statistics: the collection, organization, and summarization of data
      - Sums up what you have collected and makes it easier to understand
  - Inferential statistics: generalizes from known sample data to the unknown population
      - Connects the sample data you have collected to the larger population, so you can draw conclusions about the population from the sample

Qualitative (Categorical) vs Quantitative variables
  - Qualitative: each value falls into a distinct category
      - Ex: rough texture vs. smooth texture
  - Quantitative: values that are countable or measurable
      - Countable variables are discrete
          a. Number of animals in a zoo
      - Measurable variables are continuous
          a. Length of different ribbons in inches

Nominal, Ordinal, Interval, and Ratio variables
  - Nominal: mutually exclusive categories with no ranking
      - One color is not better than another, but both are colors
  - Ordinal: mutually exclusive categories with a ranking, but the differences between ranks have no fixed meaning
  - Interval: quantitative, with rankings and fixed, meaningful differences, but no true zero, so division (ratios) is not meaningful
      - True interval variables are uncommon and are almost always constructed
          o ACT, SAT, temperature
  - Ratio: quantitative, with rankings, fixed meaningful differences, and a true zero, so division (ratios) is meaningful
      - Ratio variables are frequently grouped into ordinal categories with boundaries

Outcome (dependent) vs Predictor (independent) variables
  - Outcome (dependent) variable: the result of interest
  - Predictor (independent) variable: the variable that is manipulated in an experiment to see what happens to the dependent variable

Confounding (lurking) variables
  - Confounding (lurking) variable: a variable that is not the predictor variable but still influences the response

Experiment, Quasi-experiment, Observational study
  - Experiment: a study in which the researcher manipulates the predictor variable and assigns subjects to treatments at random
  - Quasi-experiment: an empirical study used to estimate the causal impact of an intervention on its target population, without random assignment
  - Observational study: the researcher only observes and has limited to no control over the test subjects, so cause and effect is hard to establish

Treatment group, control group, placebo
  - Treatment group: receives the new idea
  - Control group: gets the standard treatment
  - Placebo: a neutral treatment that seems to be the same as the actual treatment
      - Control groups often get a placebo because people respond mentally to the thought of being treated (the placebo effect)

Types of bias, including non-representative and non-response
  - Non-representative bias: bias that comes from an unrepresentative sample
      - Under-coverage: part of the population is left out of, or underrepresented in, the sample, so the sample does not represent the population well
  - Non-response bias: bias that arises when those who respond differ greatly from those who do not
  - Voluntary response bias: a volunteer sample over-represents individuals with strong opinions

Blinding: double-blind, single-blind
  - Blinding: keeping people from knowing which treatment is being given
  - Single-blind: either the subjects or the administrators are blinded, but not both; the analysts are not blinded
  - Double-blind: both subjects and administrators are blinded; this is preferred

Types of samples: random sample, systematic, stratified, cluster
  - Random (simple): every member of the population has an equal chance of being selected
      - Choosing a crayon from a bag filled with crayons
  - Systematic: selecting every nth member of the population from a list
      - Every tenth person is picked
  - Stratified: dividing the population into subgroups based on a characteristic of interest, then selecting subjects from each subgroup
      - Population: athletes. Subgroups: soccer and baseball. Subjects: team captains
  - Cluster: dividing the population into subgroups based on convenience, then selecting subjects from the chosen subgroups
      - Population: Oklahoma State University. Subgroup: all freshmen. Subjects: out-of-state freshmen

Blocking
  - Blocking: dividing subjects into groups based on a characteristic expected to affect the response, before assigning treatment and control

Hawthorne effect
  - Hawthorne effect: subjects change their behavior because they know they are being watched

Longitudinal studies
  - Longitudinal study: participants are followed over time, with multiple check-ups throughout the study period

Replication
  - Replication: assigning each treatment to many different test subjects

Chapter 2

Frequency Distributions: Class limits, class boundaries, class width, class midpoint
  - Frequency distribution: organization of raw data into a table with classes and frequency counts
      - Categorical: uses the defined categories as classes
      - Quantitative: you create the classes
          o Grouped frequency distribution
  - Class limits: the labels
      - Ex: 23-78
  - Class boundaries: the actual division between groups
      - Found by subtracting 0.5 (or 0.05, for data recorded to one decimal place) from the lower limit and adding the same amount to the upper limit
          o Ex. 23 - 0.5 = 22.5 (lower boundary), 78 + 0.5 = 78.5 (upper boundary)
  - Class width: the distance from the lower boundary to the upper boundary
      - Found by subtracting the lower boundary from the upper boundary
          o 78.5 - 22.5 = 56
  - Class midpoint: the mean of the upper and lower boundaries
      - Found by adding the two boundaries together and dividing by 2
          o (78.5 + 22.5) / 2 = 50.5

Rules for choosing classes
  - Each observation belongs to exactly one class
  - Exhaustive, to cover all data
  - Continuous, without gaps
  - Mutually exclusive, without overlap
  - Equal width, except for a possibly open-ended class on one or both ends
  - Aim for around 6-12 classes; more than 15 is uncommon and fewer than 6 is too few

Errors in graphs: proportionality, dimensionality, lack of context, chartjunk
  - Proportionality: the visual effect should match the size of the actual difference, neither too large nor too small
  - Dimensionality: expanding a one-dimensional difference into several dimensions, which exaggerates it
  - Lack of context: missing information around the change, including units
  - Chartjunk: focusing on the presentation of the chart more than on the data

Relative vs Absolute graphs
  - Absolute: actual numbers
  - Relative: values expressed relative to others, such as proportions or percentages computed from the absolute numbers

Distribution shapes: Bell-shaped, uniform, right-skewed, left-skewed, bimodal, U-shaped
  - Bell: bars rise and then fall; the highest bar is in the middle
    http://i.stack.imgur.com/HQKeF.jpg
  - Uniform: all bars are around the same height
    http://cdn.webservices.ufhealth.org/wp-content/blogs.dir/553/files/2012/07/images-mod1-histogram4.gif
  - J-shaped: the highest bar is on the right side
    http://images.tutorvista.com/cms/images/67/Cumulative-Histogram.PNG
  - Reverse J-shaped: the highest bar is on the left side
    https://o.quizlet.com/WWoB8.7hNPj64ne6cxEOSw_m.png
  - Right-skewed: the highest bars are on the left side, with a long tail to the right
    http://www.statcrunch.com/grabimage.php?image_id=355207
  - Left-skewed: the highest bars are on the right side, with a long tail to the left
    https://faculty.elgin.edu/dkernler/statistics/ch02/images/shape-left-skewed.jpg
  - Bimodal: two distinct peaks, one toward the left and one toward the right
    http://www.pqsystems.com/qualityadvisor/images/bimodal.gif
  - U-shaped: the highest bars are on the far right and far left
    http://www.pindling.org/Math/Statistics/Textbook/Chapter2_descript_stat/3d4ee15f.jpg

How to construct the following graphs: Pie Chart, Bar Graph, Histogram, Pareto Chart, Frequency Polygon, Ogive, Time Series, Dotplot, Stem and Leaf Plot
  - Pie chart: shows a single categorical variable
      - Few categories
      - Easy variables
      - Different colors
  - Bar graph: used for non-continuous categories; categories are usually ordered alphabetically; there are spaces between bars
  - Histogram: used for continuous categories; ordered from low to high by value of the variable; NO spaces between bars
  - Pareto chart: used for non-continuous categories; categories are ordered from high to low by frequency; there are spaces between bars
  - Frequency polygon: represents class counts with dots at the class midpoints, connected by a line
  - Ogive: line graph of a cumulative frequency distribution
  - Time series: data points collected over a period of time and plotted in time order
  - Dot plot: one dot per observation
  - Stem and leaf plot: displays data organized by place value

Chapter 3

Measures of central tendency: Mean, Weighted Mean, Median, Mode, Midrange
  - Mean: sum of the numbers in a set divided by how many numbers are in the set; the average of all the numbers
  - Weighted mean: used when each value has a weight; multiply each value by its weight, add the products, and divide by the sum of the weights
  - Median: the middle number of the ordered data
  - Mode: the value that shows up the most
  - Midrange: average of the lowest and highest numbers in a set

Measures of variation: Range, Variance, Standard Deviation
  - Range: highest number minus lowest number
  - Variance: the average squared deviation from the mean; has the best mathematical properties
    https://explorable.com/images/statistical-variance-calculation-population.jpg
  - Standard deviation: square root of the variance; the "typical" distance from the mean
    https://statistics.laerd.com/statistical-guides/img/standard-deviation-1.png

Coefficient of Variation
  - Coefficient of variation: used to compare deviations between groups
      - Standard deviation divided by the mean
          o Expressed as a percentage

Empirical Rule for Bell-shaped distributions
  - Empirical Rule: the graph MUST BE BELL-SHAPED
      - About 68% of values fall within one standard deviation of the mean
      - About 95% fall within two standard deviations
      - About 99.7% fall within three standard deviations
    https://dr282zn36sxxg.cloudfront.net/datastreams/f-d%3Aad71d96fedfd7d6f21c11e61cc61f0847303c863a13cd766ea649336%2BIMAGE_THUMB_POSTCARD%2BIMAGE_THUMB_POSTCARD.1

Measures of position: Percentile, Decile, Quartile. Know how to compute these.
  - Percentiles do not have units
      o Divide the data into 100 groups
      o Always round down; the result cannot go over
  - Deciles: the 10th, 20th, 30th, ... percentiles
      o The third decile is the 30th percentile
  - Quartiles: four equal-size groups
      o Can be computed as percentiles, but are often computed with multiple medians
      o There are different computation rules, but this one is more common:
          - Q1 is the median of the values less than the median
          - Q2 is the median of all the values
          - Q3 is the median of the values more than the median
          - For Q1 and Q3, DO NOT include the median observation in "less than" and "greater than"

Five Number Summary
  1. Minimum
  2. Q1
  3. Median (Q2)
  4. Q3
  5. Maximum

Interquartile Range
  - Interquartile Range (IQR) = Q3 - Q1

Outliers, including the numeric test for outliers
  - Outlier: an observation suspected to be outside the typical pattern of the distribution
      - Very high or very low
  - The numeric test labels a point as an outlier if the point is
      - More than Q3 + 1.5*(IQR), or
      - Less than Q1 - 1.5*(IQR)

Boxplot
  - An unmodified boxplot consists of a box from Q1 to Q3 with a central line at the median, then extensions to the minimum and maximum
    http://www.mathbootcamps.com/wp-content/uploads/boxplot-no-outliers.jpg

Standard Scores (also known as Z scores or standardized scores)
  - Standard score: the number of standard deviations an element lies from the mean
      - z = (value - mean) / standard deviation
      - 0 represents an element equal to the mean, 1 represents an element 1 standard deviation greater than the mean, 2 represents an element 2 standard deviations greater than the mean, and so on
    http://f.tqn.com/y/statistics/1/W/9/-/-/-/zscore.GIF

Chapter 4

Interpretations of probability: Theoretical, Classical, Empirical, Subjective
  - Probability is a way to assign a numerical measurement to chance; there are 3 ways to do this
      - Theoretical ("classical"): computed through mathematical definitions
      - Empirical: the proportion of the time that events of the same type will occur in the long run
      - Subjective: an assigned estimate of chance that considers data, experience, and personal belief

Probability Experiment
  - Probability experiment: a chance process that leads to exactly one of two or more defined results

Trial
  - Trial: a process of observation or measurement

Outcome
  - Outcome: the result of a trial

Event
  - Event: a set of one or more outcomes inside a sample space, denoted with a capital letter

Sample Space
  - Sample space (S): the set of all possible outcomes

Empty Set
  - Empty set: a set with no elements, denoted Ø

Complement of an event
  - Complement: the set of outcomes in the sample space that are not in the event
      - Ex: in a coin flip, heads is the complement of tails and vice versa

Probability rules
  1. The probability of any event (E) is a real number between and including 0 and 1
      a. 0 ≤ P(E) ≤ 1
  2. The sum of the probabilities of all outcomes in a sample space is 1
      a. P(S) = 1
  3. If an event cannot occur, its probability is 0
  4. If an event is certain to occur, its probability is 1

Tree diagrams and Venn diagrams: you will not have to draw them, but you should be able to recognize them from their pictures
  - Tree: https://www.mathsisfun.com/data/images/probability-tree-coin2.gif
  - Venn: http://3.bp.blogspot.com/-e4SRjbb7TBA/VZUEKiZFIkI/AAAAAAAAADY/qpYMjD3SpbA/s1600/venn-diagram.jpg

Mutually Exclusive events
  - Two events are mutually exclusive if they have no outcomes in common
  - Both events cannot occur in the same trial
  - If A and B are mutually exclusive, P(A and B) = 0

General addition rule
  - P(A or B) = P(A) + P(B) - P(A and B)

Conditional Probability
  - P(B | A) = P(A and B) / P(A)

General multiplication rule
  - P(A and B) = P(A) * P(B | A) = P(B) * P(A | B)

Independent Events, including testing to see if events are independent or dependent
  - General case: P(A and B) = P(A) * P(B | A) = P(B) * P(A | B)
  - P(B | A) = P(A and B) / P(A)
  - If and only if A and B are independent: P(A and B) = P(A) * P(B)
  - To test: compute P(A) * P(B) and compare it with P(A and B); if they are equal the events are independent, otherwise they are dependent
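
The Chapter 4 rules can be checked on a small example. The sketch below is only an illustration, not part of the course material: it assumes one roll of a fair six-sided die, and the events A ("roll is even") and B ("roll is greater than 3") are made up for the purpose. It uses classical probability (favorable outcomes divided by total outcomes) and exact fractions so that the equalities hold without rounding.

```python
from fractions import Fraction

# Assumed example (not from the study guide): one roll of a fair die.
S = {1, 2, 3, 4, 5, 6}   # sample space
A = {2, 4, 6}            # hypothetical event A: roll is even
B = {4, 5, 6}            # hypothetical event B: roll is greater than 3


def prob(event):
    """Classical probability: outcomes in the event / outcomes in S."""
    return Fraction(len(event & S), len(S))


p_a = prob(A)
p_b = prob(B)
p_a_and_b = prob(A & B)   # outcomes in both A and B
p_a_or_b = prob(A | B)    # outcomes in A, in B, or in both

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
assert p_a_or_b == p_a + p_b - p_a_and_b

# Conditional probability: P(B | A) = P(A and B) / P(A)
p_b_given_a = p_a_and_b / p_a

# General multiplication rule: P(A and B) = P(A) * P(B | A)
assert p_a_and_b == p_a * p_b_given_a

# Independence test: independent only if P(A and B) = P(A) * P(B)
independent = (p_a_and_b == p_a * p_b)

# Complement: outcomes in S that are not in A
p_not_a = prob(S - A)

print("P(A or B) =", p_a_or_b)
print("P(B | A)  =", p_b_given_a)
print("Independent?", independent)
```

Here P(A and B) = 1/3 while P(A) * P(B) = 1/4, so these two events are dependent: knowing the roll is even raises the chance that it is greater than 3 from 1/2 to 2/3.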