Exam 1 Review
This 6 page Study Guide was uploaded by Heli Patel on Monday June 20, 2016. The Study Guide belongs to 3339 at University of Houston taught by Prof. C Poliak in Summer 2016. Since its upload, it has received 31 views. For similar materials see Statistics for the Sciences in Math at University of Houston.
Date Created: 06/20/16
EXAM 1 review: concepts, formulas, and R code
● Types of data
  ○ Population Data: consists of all possible values pertaining to a certain set of observations or an investigation.
    ■ Random Experiments
      ● We want each replication of the experiment to be independent: the outcomes of some replications do not affect the outcomes of others.
      ● The sample space (Greek capital letter Ω, omega) of a random experiment is the set of all possible outcomes.
  ○ Sample Data: a small section of the population taken for the purpose of investigation.
    ● Simple random sample (SRS): an SRS of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected.
    ● Stratified sampling: subdivide the population into at least two different subgroups (strata) that share the same characteristics (such as gender or age bracket), then draw a simple random sample from each stratum.
    ● Cluster sampling: divide the population area into sections (clusters), randomly select some of those clusters, and then choose all the members from the selected clusters.
    ● Systematic sampling: select every kth member of the population for the sample.
    ● Resampling: many samples are repeatedly drawn, with replacement, from the available data points. This technique is called the bootstrap.
    ■ Biased Sample: systematically favors certain outcomes.
      ● Voluntary Response Sample: consists of people who choose themselves by responding to a general appeal. This type of sample is biased because people with strong opinions, especially negative opinions, are most likely to respond.
      ● Convenience Sampling: chooses the individuals easiest to reach.
● Data
  ○ A variable is any characteristic of an individual or object.
    ■ Categorical variable (factor): places a case into one of several groups or categories.
    ■ Quantitative variable: takes numerical values.
      ● Discrete: a countable set of values.
      ● Continuous: any value within some interval.
  ○ Range = largest value − smallest value
  ○ Variance
  ○ Standard deviation
    ■ Roughly the typical distance of each observation from the mean. It is always greater than or equal to zero.
● Parameter and Statistics
  ○ A parameter is a number that describes the population.
  ○ A statistic is a number that describes a sample.
● Notation of Parameter and Statistics
  ○ mean: statistic $\bar{x}$, parameter $\mu$ (mu)
  ○ standard deviation: statistic $s$, parameter $\sigma$ (sigma)
  ○ correlation: statistic $r$, parameter $\rho$ (rho)
  ○ regression coefficient: statistic $b$, parameter $\beta$ (beta)
  ○ proportion: statistic $\hat{p}$, parameter $p$
● Population Variance
  ○ If $N$ is the number of values in a population with mean $\mu$, and $x_i$ represents each individual value in the population, then the population variance is $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
  ○ and the population standard deviation is the square root, $\sigma = \sqrt{\sigma^2}$.
  ○ Most of the time we are working with a sample instead of a population, so the sample variance is $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ and the sample standard deviation is the square root, $s = \sqrt{s^2}$, where $n$ is the number of observations, $x_i$ is the value of the $i$th observation, and $\bar{x}$ is the sample mean.
  ○ By hand: find the mean, square each score, compute $\frac{1}{n-1}\left(\sum x_i^2 - n\bar{x}^2\right)$, then take the square root of the answer to get the sd.
  ○ If we add or subtract a constant from every value in the data set, the mean changes but the sd and variance remain the same.
  ○ If we multiply or divide every value by a constant, the mean, sd, and variance all change.
● Mean and sd of a linear transformation of X
  ○ y = a + bx, where a and b are constants
  ○ mean(y) = a + b(mean(x))
  ○ sd(y) = |b|(sd(x))
  ○ var(y) = b^2(var(x))
    ■ Example: mean(x) = 3, sd(x) = 0.5
    ■ y = 3 + 2x, so mean(y) = 3 + 2(3) = 9
    ■ sd(y) = 2(0.5) = 1
● The function for the sample standard deviation in R is sd(datasetname$variablename)
● Coefficient of Variation
  ○ Used to compare the variation between two groups.
  ○ cv = sd/mean
  ○ A smaller ratio indicates less variation in the data relative to the mean.
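The variance, linear-transformation, and coefficient-of-variation formulas above can be checked numerically. The following is a minimal sketch in R, assuming a small made-up vector `x` (the data and the constants `a` and `b` are illustrative, not from the course); it only uses the base R functions `mean`, `var`, and `sd`.

```r
# Hypothetical scores, made up for illustration only
x <- c(2, 4, 4, 5, 7, 8)
n <- length(x)

# Sample variance from the definition: squared deviations summed, divided by n - 1
var_def <- sum((x - mean(x))^2) / (n - 1)

# Hand-computation shortcut: (sum of squares - n * mean^2) / (n - 1)
var_short <- (sum(x^2) - n * mean(x)^2) / (n - 1)

# Sample standard deviation is the square root of the sample variance
sd_def <- sqrt(var_def)

# Linear transformation y = a + b*x (a and b chosen arbitrarily)
a <- 3
b <- 2
y <- a + b * x

# Coefficient of variation: sd relative to the mean
cv <- sd(x) / mean(x)

print(c(var_def = var_def, var_short = var_short, var_builtin = var(x)))
print(c(mean_y = mean(y), sd_y = sd(y), var_y = var(y), cv_x = cv))
```

Both variance computations should agree with the built-in `var(x)`, and `sd(y)` should equal `abs(b) * sd(x)`, which matches the rule that adding a constant leaves the spread unchanged while multiplying rescales it.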
● Percentiles
  ○ The pth percentile of the data is the value such that p percent of the observations fall at or below it.
  ○ The 100P-th percentile is the measurement with rank (position in the sorted list) nP + 0.5, where n is the number of data values in the sample.
  ○ R code: fivenum(price) returns the five-number summary.
● IQR: Interquartile range
  ○ IQR = Q3 − Q1
● Outlier
  ○ An observation that is "distant" from the rest of the data.
  ○ Q1 − 1.5(IQR) = A
  ○ Q3 + 1.5(IQR) = B
  ○ Any observation below A or above B is considered an outlier.
● GRAPHS
  ○ R code
    ■ For a bar graph: plot(datasetname$variablename)
    ■ For a pie chart: counts <- table(shoes$Brand) followed by pie(counts)
  ○ Dotplots
    ■ Made by putting dots above the values listed on a number line.
  ○ Stem and leaf plot
    ■ Separate each observation into a stem, consisting of all but the final (rightmost) digit, and a leaf, the final digit.
    ■ R code: stem(datasetname$variablename)
  ○ Histograms
    ■ A bar graph for quantitative variables. Values of the variable are grouped together.
    ■ R code: hist(datasetname$variablename)
  ○ Boxplot
    ■ A graph of the five-number summary. Boxplots are most useful for side-by-side comparison of several distributions.
    ■ R code: boxplot(datasetname$variablename)
  ○ Distribution
    ■ 1. Shape: skewed to the right if the right side (higher values) has the longer tail, skewed to the left if the left side (lower values) has the longer tail, uniform if the graph is at the same height (frequency) throughout.
    ■ 2. Center: the values with roughly half the observations taking smaller values and half taking larger values.
    ■ 3. Spread: from the graphs we describe the spread of a distribution by giving the smallest and largest values.
    ■ 4. Outliers
● Relative frequency
  ○ A method that uses data to estimate the proportion of the time the outcome will occur in the future: P(A) = (# of times A occurs) / (total # of observations)
● Subjective method
  ○ Used to assign probabilities when the possible outcomes are known not to be equally likely and little data is known.
● Probability
  ○ The probability of any outcome of a random phenomenon is the proportion of times the outcome would occur in a very long series of repetitions.
  ○ If, under a given assumption, the probability of a particular observed event is extremely small, we conclude that the assumption is probably not correct.
    ■ The classical method is used when all the experimental outcomes are equally likely. If n experimental outcomes are possible, a probability of 1/n is assigned to each experimental outcome. Example: drawing a card from a standard deck of 52 cards. Each card has a 1/52 probability of being drawn.
    ■ The relative frequency method of assigning probabilities is appropriate when data are available to estimate the proportion of the time the experimental outcome will occur if the experiment is repeated a large number of times. That is, for any outcome A, the probability of A is P(A) = (# of times A occurs) / (total # of observations).
    ■ The subjective method of assigning probability is most appropriate when one cannot realistically assume that the experimental outcomes are equally likely and when little relevant data are available.
● Definition
  ○ A set is a collection of objects.
  ○ The items that are in a set are called elements.
  ○ The sample space of a random phenomenon is the set of all possible outcomes. Ω is used to denote the sample space.
  ○ Notation: Description
  ○ a ∈ A: The object a is an element of the set A.
  ○ A ⊆ B: Set A is a subset of set B. That is, every element in A is also in B.
  ○ A ⊂ B: Set A is a proper subset of set B. That is, every element that is in A is also in B, and there is at least one element in B that is not in A.
  ○ A ∪ B: The set of all elements that are in A or B.
  ○ A ∩ B: The set of all elements that are in both A and B.
  ○ Ω: Called the universal set, all elements we are interested in.
  ○ ∼A: The set of all elements that are in the universal set but not in set A.
  ○ $\bigcup_i E_i$: E1 ∪ E2 ∪ . . ., the union of multiple sets
  ○ $\bigcap_i E_i$: E1 ∩ E2 ∩ . . ., the intersection of multiple sets
● Permutations
  ○ The number of permutations of n objects taken r at a time is $P(n, r) = \frac{n!}{(n-r)!}$, where $n! = n(n-1)(n-2)\cdots(2)(1)$.
  ○ R code for n!: factorial(n)
● Combinations
  ○ The number of combinations of n objects taken r at a time is $C(n, r) = \frac{n!}{r!\,(n-r)!}$.
● Several Objects Alike
  ○ The number of permutations, P, of n objects taken n at a time with r objects alike, s of another kind alike, and t of another kind alike is $P = \frac{n!}{r!\,s!\,t!}$.
● Circular Permutations
  ○ The number of circular permutations of n objects is (n − 1)!.
● Basic Probability Rules
  ○ 1. 0 ≤ P(E) ≤ 1 for each event E.
  ○ 2. P(Ω) = 1
  ○ 3. If E1, E2, . . . is a finite or infinite sequence of events such that Ei ∩ Ej = ∅ for i ≠ j, then $P\left(\bigcup_i E_i\right) = \sum_i P(E_i)$. If Ei ∩ Ej = ∅ for all i ≠ j, we say that the events E1, E2, . . . are pairwise disjoint.
  ○ 4. Complement Rule: P(E ∩ ∼F) = P(E) − P(E ∩ F). In particular, P(∼E) = 1 − P(E).
  ○ 5. P(∅) = 0
  ○ 6. Addition Rule: P(E ∪ F) = P(E) + P(F) − P(E ∩ F).
  ○ 7. If E1 ⊆ E2 ⊆ . . . is an infinite sequence, then $P\left(\bigcup_i E_i\right) = \lim_{i \to \infty} P(E_i)$.
  ○ 8. If E1 ⊇ E2 ⊇ . . . is an infinite sequence, then $P\left(\bigcap_i E_i\right) = \lim_{i \to \infty} P(E_i)$.
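The counting formulas and the addition and complement rules can be tried out in R. This is a minimal sketch under stated assumptions: the values of `n` and `r`, the word BANANA, and the hearts-and-kings card example are illustrative choices, not from the original guide. It uses only the base R functions `factorial` and `choose`.

```r
# Illustrative values: n objects, r chosen at a time
n <- 5
r <- 2

# Ordered selections of r objects from n: n! / (n - r)!
perm <- factorial(n) / factorial(n - r)

# Unordered selections of r objects from n: n! / (r! (n - r)!)
comb <- choose(n, r)

# Arrangements with alike objects: letters of BANANA (3 A's, 2 N's, 1 B)
banana <- factorial(6) / (factorial(3) * factorial(2) * factorial(1))

# Circular arrangements of n objects: (n - 1)!
circ <- factorial(n - 1)

# Addition rule with a standard 52-card deck: heart or king
p_heart <- 13 / 52
p_king <- 4 / 52
p_both <- 1 / 52            # the king of hearts
p_either <- p_heart + p_king - p_both

# Complement rule: probability of not drawing a heart
p_not_heart <- 1 - p_heart

print(c(perm = perm, comb = comb, banana = banana, circ = circ))
print(c(p_either = p_either, p_not_heart = p_not_heart))
```

Note that `perm` is always `comb * factorial(r)`, since each unordered selection can be ordered in r! ways.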