Description

School: University of Houston
Department: Math
Course: Statistics for the Sciences
Professor: C. Poliak
Term: Summer 2016
Tags: StatsforScience and Math
Cost: 50
Name: StatsforScience Exam 1 Study Guide
Description: This study guide covers a review from videos 1-4, vocab, and R studio commands.
Uploaded: 05/06/2016


Exam 1 Study Guide  





Types of Samples:  

- Probability sample- a sample in which each member of the population has a known, nonzero chance of being selected

- Simple Random Sample (SRS)- a sample of size n chosen so that every set of n individuals has an equal chance of being selected

- Stratified sampling- divide the population into at least two groups (strata) whose members share the same characteristics, then draw an SRS from each group

- Cluster sampling- divide the population area into clusters, randomly select clusters, then randomly select members from those clusters

- Systematic sampling- selecting every k^th member of the population for the sample

- Resampling- many samples are repeatedly drawn, with replacement, from the data points already collected; also called the bootstrap
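The sampling schemes above can be sketched in R with `sample()`. This is only an illustration: the population of 20 numbered individuals, the two strata, and the sample sizes are made-up assumptions, not course data.

```r
set.seed(1)  # make the random draws reproducible

# Hypothetical population: 20 individuals labeled 1..20
population <- 1:20
# Hypothetical strata: the first 10 are group "A", the last 10 are group "B"
stratum <- rep(c("A", "B"), each = 10)

# Simple random sample of size n = 4 (without replacement)
srs <- sample(population, size = 4)

# Stratified sample: draw an SRS of size 2 from each stratum
stratified <- unlist(lapply(split(population, stratum), sample, size = 2))

# Systematic sample: random start among the first k, then every k-th member
k <- 5
start <- sample(1:k, size = 1)
systematic <- population[seq(from = start, to = length(population), by = k)]

# Bootstrap resample: draw n values WITH replacement from the observed sample
boot <- sample(srs, size = length(srs), replace = TRUE)
```

Printing `srs`, `stratified`, `systematic`, and `boot` shows the selected labels; the exact values depend on the seed.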





Biased samples  

- Biased study- the sampling method systematically favors some outcomes over others

- Voluntary response sample- people choose whether to respond to a study/survey, so it is biased toward people who want to respond, usually those with strong opinions

- Convenience sampling- chooses the individuals that are easiest to reach

Concepts  

- Study (experiment)- we deliberately do something to the units and record the response; because we control some of the conditions, it can give evidence that the factors cause the response

- units = people (the individuals in the study)

- placebo = dummy treatment

- factors = explanatory variables

- placebo effect = subjects respond to the placebo treatment





- Random experiments- can be repeated an infinite number of times; outcomes vary from replication to replication even though conditions are the same; each replication is independent, meaning its outcome does not affect the outcomes of the others

- Random Variable- variable whose value is a numerical outcome of a random phenomenon

- Categorical Variable- represents data which can be divided into groups

- Quantitative variable- variable on a numerical scale

• Discrete- can count the values, usually whole numbers

• Continuous- value that is obtained by measuring; data can take on any value between two specified values, including decimals (ex: height)

- Sample space- a set of all possible outcomes  

• Notation for sample space: Ω= {1,2,3,4,5}  

- Parameter- a fixed number that describes the population  

- Statistic- a number that describes a sample; known once we have taken a sample

• Multiplying or dividing every data value by the same constant changes the SD by that same factor (mean and SD both change)

• Shifting every data value by the same constant (adding or subtracting) does not change the SD (the mean changes but the SD stays the same)

• If all data values are the same, then SD = 0
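The three SD rules above can be checked quickly in R. The data vector `x` is made up for illustration.

```r
x <- c(2, 4, 4, 6, 9)    # made-up data

sd(x * 3) / sd(x)        # scaling every value by 3 multiplies the SD by 3
sd(x + 10) - sd(x)       # shifting every value by 10 leaves the SD unchanged
mean(x + 10) - mean(x)   # ...but the mean shifts by 10
sd(rep(7, 5))            # all values equal, so the SD is 0
```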

- Finding standard deviation- roughly the average distance of each value from the mean

1. Square each data value

2. Add all squared data values = sum

3. (1/(n-1)) x [sum - (n x mean^2)] = s^2 = variance

4. √(s^2) = SD
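As a sketch, the hand calculation can be compared with R's built-in `var()` and `sd()`. It uses the shortcut form s^2 = (sum of squared values - n x mean^2)/(n - 1); the data vector is made up for illustration.

```r
x <- c(2, 4, 4, 6, 9)                      # made-up data
n <- length(x)

sum_sq <- sum(x^2)                         # steps 1-2: square each value, then add
s2 <- (sum_sq - n * mean(x)^2) / (n - 1)   # step 3: variance
s  <- sqrt(s2)                             # step 4: standard deviation

c(by_hand = s2, builtin = var(x))          # the two variances should agree
c(by_hand = s,  builtin = sd(x))           # the two SDs should agree
```

For this data the sum of squares is 153 and the mean is 5, so the variance works out to (153 - 125)/4 = 7.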

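- Coefficient of variation- ratio of SD to mean, used to compare variation between two groups

• cv = SD/mean

- p^th percentile- the value such that p percent of the observations fall at or below it

• location in the ordered data = nP + 0.5, where n is the number of observations and P is the percentile written as a decimal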

- Five number summary - min, Q1, median, Q3, max  

- Q1- 25% of the data are at that value or less

- Q2- 50% of the data are at that value or less (the median)

- Q3- 75% of the data are at that value or less

- Interquartile range- range of the middle 50% of the data

• IQR = Q3 - Q1

• A = Q1 - 1.5(IQR)

• B = Q3 + 1.5(IQR)

• Any value outside the interval [A, B] is an outlier
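A minimal sketch of the outlier check in R, using made-up data. Note that `quantile()` uses R's default interpolation rule, so Q1 and Q3 may differ slightly from values found by hand.

```r
x <- c(1, 3, 4, 5, 5, 6, 7, 8, 9, 30)   # made-up data with one large value

q <- quantile(x, c(0.25, 0.75))   # Q1 and Q3 (R's default quantile rule)
iqr <- q[[2]] - q[[1]]            # IQR = Q3 - Q1; IQR(x) gives the same number
A <- q[[1]] - 1.5 * iqr           # lower fence
B <- q[[2]] + 1.5 * iqr           # upper fence

x[x < A | x > B]                  # values outside [A, B] are flagged as outliers
```

With this data the fences are A = -1 and B = 13, so only the value 30 is flagged.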

- Range= largest value- smallest value  

- Mode- most frequently occurring value

- Mean- arithmetic average  

• sum of all values/# of values  

- Center- mean, median, mode

- Spread- range, SD, variance, IQR

Graphs  

Categorical variable  

- Bar graphs- each bar represents a category, and the height of each bar represents the count or percent

- Pie charts- help us see what part of the whole each group forms

Quantitative variable  

- Dot plot- dots represent each value over a number line  

- Stem plot- stems in a vertical column with the smallest at the top and leaves in the row to the  right of its stem in increasing order out from the stem  

- Histogram- like a bar graph for a quantitative variable; bars touch, the width of a bar represents a range of values, and the height of a bar represents the number of cases within that range

- Boxplot- represents the five number summary, useful for side by side comparison; asterisks or dots represent outliers

- Cumulative frequency polygon- y-axis has the percentile and x-axis has the variable; the graph shows the value at each percentile

- Distributions- tell us what values the variable takes and how often it takes these values; describe with COSS

• Center- mean/median

• Outliers- values outside the overall pattern

• Shape- symmetric, skewed right, skewed left, uniform

• Spread- largest value - smallest value

Probability  

- Random- a phenomenon is random if individual outcomes are uncertain

- Scatterplots- show association between two quantitative variables

• Direction- positive = increasing, negative = decreasing

• Form- if there is a straight line relationship = linear

• Strength- how much scatter the plot has: very strong, moderate, or weak association

- Set- collection of objects

- Elements- the individual objects in a set

• E = {…, …}

- Venn diagrams- show relationships between two sets  

- Classical method- all outcomes are equally likely

• assign probability 1/n to each of the n possible outcomes

- Relative frequency method- using data to estimate the proportion of the time the outcome will occur in the future

• P(A) = # of times A occurs / total # of observations

- Subjective method- assigning probabilities by judgment when the possible outcomes are not equally likely and little data is available
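To compare the classical and relative frequency methods, here is a small simulation sketch. The fair six-sided die, the event "roll is even", and the 10,000 rolls are illustrative assumptions.

```r
set.seed(42)                                          # reproducible simulation
rolls <- sample(1:6, size = 10000, replace = TRUE)    # simulated fair-die rolls

classical <- 3 / 6                   # classical: 3 even faces out of 6 equally likely
relative  <- mean(rolls %% 2 == 0)   # relative frequency: (# even rolls) / (total rolls)

c(classical = classical, relative_frequency = relative)
```

The relative frequency estimate should land close to 0.5, and it tends to get closer as the number of rolls grows.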

- Tree diagram- graphical representation for visualizing a multiple step experiment

- 0! = 1

- Permutations- count the number of outcomes where order does matter

• nPr = (n!)/((n-r)!)

• P = (n!)/[(r!)(s!)(t!)] for arrangements with repeating letters, where r, s, t are the numbers of times each repeated letter appears

• Circular permutations- (n-1)!

- Combinations- count the number of outcomes where order doesn't matter

• nCr = (n!)/(r!(n-r)!)
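The counting formulas above can be evaluated with R's `factorial()` and `choose()`. The values n = 5, r = 3 and the word "LEVEL" are illustrative choices only.

```r
n <- 5; r <- 3

nPr <- factorial(n) / factorial(n - r)   # ordered selections: 5!/2! = 60
nCr <- choose(n, r)                      # unordered selections: 5!/(3! 2!) = 10

# Arrangements of a word with repeated letters, e.g. "LEVEL":
# 5 letters, with L appearing 2 times and E appearing 2 times
level <- factorial(5) / (factorial(2) * factorial(2))   # 120/4 = 30

# Circular arrangements of n distinct objects
circular <- factorial(n - 1)             # 4! = 24

factorial(0)                             # 0! = 1
```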

- Probability Rules  

1. P(E) has to be between 0 and 1

2. P(Ω) = 1

3. If A and B share no elements (A∩B = Ø), they are disjoint, and then P(A∪B) = P(A) + P(B); events are pairwise disjoint when this holds for every pair

4. Complement Rule- P(~A) = 1 - P(A)

5. P(Ø) = 0 (the empty set has no elements)

6. Addition Rule- P(A∪B) = P(A) + P(B) - P(A∩B), used when finding the chance of event A or B happening

7. Multiplication Rule- P(A∩B) = P(A) x P(B given A) OR P(A∩B) = P(B) x P(A given B), used when finding the chance of events A and B both happening

8. Conditional probability- P(A given B) = P(A∩B)/P(B), provided P(B) > 0
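A small sketch that checks several of these rules on one roll of a fair die. The events `A` and `B` and the helper function `P` are hypothetical names introduced for this example; `P` uses the classical method (count of outcomes in the event divided by count in the sample space).

```r
omega <- 1:6                      # sample space for one roll of a fair die
A <- c(2, 4, 6)                   # event: roll is even
B <- c(4, 5, 6)                   # event: roll is greater than 3

# Classical probability of an event E (helper defined for this sketch)
P <- function(E) length(E) / length(omega)

P(union(A, B))                       # P(A or B), counted directly
P(A) + P(B) - P(intersect(A, B))     # addition rule gives the same value
1 - P(A)                             # complement rule: P(not A)
P(setdiff(omega, A))                 # same value, counted directly
P(intersect(A, B)) / P(B)            # conditional probability P(A given B)
```

Here A∩B = {4, 6}, so P(A given B) = (2/6)/(3/6) = 2/3.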

R studio  

- Type the following to make a list (vector) of values- dataname=c(data,data,data)

- To import data, go to Tools, click on Import Dataset, and upload by URL or local file

- If the data is already in R, type the data name and press Enter

- To find mean, median, variance, sd of the data

• if it has one variable- mean(dataname)

• if it has multiple variables- mean(dataname$variable)

• if it has one variable- median(dataname)

• if it has multiple variables- median(dataname$variable)

• if it has one variable- var(dataname)

• if it has multiple variables- var(dataname$variable)

• if it has one variable- sd(dataname)

• if it has multiple variables- sd(dataname$variable)

- To find percentile

• quantile(dataname, percentile as a decimal); wrap several percentiles in c(), ex: quantile(KidsFeet$variable, c(0.25,0.50))

- To find five number summary

• fivenum(dataname)
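As a worked sketch of the summary commands above, the following uses `mtcars`, a data frame that ships with R, and its numeric column `mpg`. The choice of dataset is an assumption made for illustration; the course examples use other data.

```r
# mtcars has multiple variables, so each call uses dataname$variable
mean(mtcars$mpg)
median(mtcars$mpg)
var(mtcars$mpg)
sd(mtcars$mpg)

quantile(mtcars$mpg, c(0.25, 0.50))   # 25th and 50th percentiles
fivenum(mtcars$mpg)                   # min, lower hinge, median, upper hinge, max
```

`fivenum()` reports Tukey's hinges, which can differ slightly from the Q1 and Q3 returned by `quantile()`.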

- Bar graph- plot(dataname$variable) if the variable is stored as a factor; otherwise barplot(table(dataname$variable))

- Pie chart-

• First type- counts <- table(dataname$variable) and press Enter

• Then type- pie(counts)

- Stem and Leaf plot- stem(dataname$variable)  

- Histogram- hist(dataname$variable)  

• To plot a histogram with a title and x-axis label

hist(dataname$variable, main="Title", xlab="x-axis label")

- Boxplot- boxplot(dataname$variable)

• To plot boxplots side by side- use tilde as shown below

• boxplot(dataname$quantitativevariable ~ dataname$categoricalvariable)

- Scatterplot- plot(explanatory, response)
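The plotting commands can be tried on the built-in `mtcars` data frame. Treating `cyl` (number of cylinders) as the categorical variable and `mpg` and `wt` as the quantitative variables is an assumption made for this sketch.

```r
counts <- table(mtcars$cyl)                   # number of cars per cylinder count
barplot(counts, main = "Cars by cylinders")   # bar graph of a categorical variable
pie(counts)                                   # pie chart of the same counts

stem(mtcars$mpg)                              # stem-and-leaf plot
hist(mtcars$mpg, main = "Fuel economy", xlab = "Miles per gallon")

boxplot(mtcars$mpg ~ mtcars$cyl)              # side by side: quantitative ~ categorical
plot(mtcars$wt, mtcars$mpg)                   # scatterplot: explanatory (wt), response (mpg)
```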

- For permutations- factorial(n) gives n!, so nPr is factorial(n)/factorial(n-r)

- For combinations- choose(n,r)
