Notes for Exam 1: Descriptive Statistics

statistic: a descriptive measure derived from a sample of n items (n = sample size)
parameter: a descriptive measure derived from a population (N = population size)

3 characteristics of numerical data:
1. central tendency: where are the data concentrated? what value seems to be typical?
2. dispersion: how much variation is there in the data? are there outliers?
3. shape: are the data values distributed symmetrically or are they skewed?

Measures of central tendency: mean, median, proportion (for binary data)
if the mean and median are about the same, the distribution is symmetric, and you can use either the mean or the median
if the mean and median are not close to each other, the shape is skewed left or right; either way, use the median as the measure of central tendency

Measures of dispersion:
standard deviation: in Excel use =stdev()
range
variance: in Excel use =var.s()

Relative variation: sometimes we want to compare the variation of 2 data sets with different means
coefficient of variation: CV = (s / x-bar) * 100, where s is the sample standard deviation and x-bar is the sample mean

Probabilities (normal distribution, using norm.dist with the cumulative argument set to TRUE): three types:
1. P(x < #): use norm.dist
2. P(x > #): use 1 - norm.dist
3. P(#1 < x < #2): use norm.dist(larger) - norm.dist(smaller)

Sample mean:
affected by every sample item
regardless of the shape of the distribution, the deviations of the data points from the mean (data point minus mean) always sum to zero
in Excel use =average()

Sample median:
insensitive to extreme values
if n is odd, the median is the middle observation in a sorted data array
if n is even, the median is the average of the middle 2 numbers in a sorted data array
in Excel use =median()

Standardized variable:
z = standardized variable
redefines each observation in terms of how many standard deviations it is away from the mean
the farther the z-score is from zero, the more unusual that piece of data is
z-score = (x - mean) / standard deviation
denoted z~N(0,1)

Empirical rule: with a normal distribution:
within one standard deviation of the mean on either side contains 68.26% of the data
within two standard deviations of the mean on either side contains 95.44% of the data
within three standard deviations of the mean on either side contains 99.73% of the data

discrete variable: each value of “x” has its own probability P(x)
continuous variable: events are intervals and probabilities are areas underneath smooth curves.
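
The descriptive measures, z-scores, and the three normal probability types from these notes can be checked outside Excel. What follows is a minimal sketch in Python using only the standard library; the data values, the mean of 100, and the standard deviation of 15 are hypothetical numbers chosen for illustration, and `NormalDist.cdf` is used here as a stand-in for a cumulative norm.dist call.

```python
from statistics import NormalDist, mean, median, stdev, variance

# Hypothetical sample, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

x_bar = mean(data)        # sample mean, like =average()
med = median(data)        # sample median, like =median()
s = stdev(data)           # sample standard deviation, like =stdev()
var = variance(data)      # sample variance, like =var.s()
cv = s / x_bar * 100      # coefficient of variation, in percent

# Signed deviations from the mean sum to zero.
deviation_sum = sum(x - x_bar for x in data)

# z-score: how many standard deviations each value is from the mean.
z_scores = [(x - x_bar) / s for x in data]

# Normal probabilities for a hypothetical X ~ N(mean=100, sd=15).
dist = NormalDist(mu=100, sigma=15)
p_below = dist.cdf(115)                     # P(x < 115)
p_above = 1 - dist.cdf(115)                 # P(x > 115)
p_between = dist.cdf(115) - dist.cdf(85)    # P(85 < x < 115)

print(f"mean={x_bar}, median={med}, s={s:.3f}, CV={cv:.1f}%")
print(f"P(85 < x < 115) = {p_between:.4f}")
```

The interval from 85 to 115 is one standard deviation on either side of the mean, so the last probability should land close to the 68.26% figure from the empirical rule.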
a single point has no probability

probability density function (PDF): for a continuous random variable, the PDF is an equation that shows the height of the curve f(x) at each possible “x” over the entire range
denoted f(x)
total area under the curve = 1

Characteristics of a Uniform Distribution:
if X is a random variable that is uniformly distributed between “a” and “b”, it is denoted X~U(a,b)
the PDF has a constant height of 1/(b - a)
total area = base * height, so (b - a) * 1/(b - a) = 1
the CDF increases linearly from 0 at a to 1 at b: (x - a)/(b - a)

Uniform distribution summary:
Parameters: a = lower limit, b = upper limit
PDF: f(x) = 1/(b - a)
CDF: F(x) = (x - a)/(b - a)
domain: a <= x <= b
mean: (a + b)/2
shape: symmetric with no mode

Characteristics of a Normal Distribution:
also called the Gaussian distribution (after Karl Gauss)
defined by 2 parameters, mean and standard deviation
denoted X~N(mean, standard deviation)
domain is negative infinity to infinity
Inverse in Excel: =norm.inv(prob, mean, standard deviation); you put in a probability and find the x value

Sampling Variation: how much a statistic varies from one sample to the next
depends on the sample size n
larger samples have less sampling variation

Sampling Error: since many statistics, such as the sample mean and the sample proportion p, are continuous random variables, the likelihood that they are exactly equal to the parameter is zero
there will ALWAYS be some sampling error

Bias: the difference between the expected (average) value of the statistic and the true parameter
on average, neither the sample mean nor p is biased

Bias vs Sampling Error: sampling error is random, whereas bias is systematic
an unbiased estimator avoids systematic error

Standard Error for mean: depends on the population standard deviation and the sample size
standard error for mean is: standard deviation / square root of n
Standard Error for proportion: depends on the population proportion as well as the sample size

Sampling distribution: the probability distribution of all possible values of a statistic when a random sample of size n is taken

Central Limit Theorem:
1. if the population is normal, the distribution of the sample mean is normal, regardless of sample size
2. as the sample size n increases, the distribution of sample means (or proportions) collapses toward the population mean (or proportion)
3. if the sample is large enough (n > 30), the sample mean will have an approximately normal distribution even if the population is not normal

Confidence Interval: we construct a confidence interval by adding and subtracting a margin of error, E
confidence level: how confident we are that the confidence interval contains the true population mean
margin of error: depends on the confidence level (zcrit or tcrit) and the standard error
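
To make the standard error and margin of error concrete, here is a minimal sketch in Python of a confidence interval for a mean. It assumes the population standard deviation is known, so it uses zcrit rather than tcrit; the function names `standard_error` and `z_confidence_interval`, the sample values, and sigma = 3.0 are all hypothetical and not part of the original notes. `NormalDist().inv_cdf` plays the role of norm.inv for the standard normal.

```python
from math import sqrt
from statistics import NormalDist, mean


def standard_error(sigma, n):
    """Standard error of the mean: standard deviation / square root of n."""
    return sigma / sqrt(n)


def z_confidence_interval(sample, sigma, confidence=0.95):
    """Return (low, high) = sample mean -/+ margin of error E.

    Assumes sigma is the known population standard deviation,
    so the critical value comes from the standard normal (zcrit).
    """
    n = len(sample)
    x_bar = mean(sample)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * standard_error(sigma, n)
    return x_bar - margin, x_bar + margin


# Hypothetical sample of n = 10 observations, for illustration only.
sample = [98, 102, 101, 97, 105, 99, 100, 103, 96, 99]
low, high = z_confidence_interval(sample, sigma=3.0)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

Because the standard error divides by the square root of n, calling `standard_error(3.0, 40)` gives half the value of `standard_error(3.0, 10)`: larger samples have less sampling variation and a narrower interval.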