Introductory Statistics, Study Guide Ch. 1-3
This 6 page Study Guide was uploaded by Kristen Manda on Tuesday September 13, 2016. The Study Guide belongs to MATH 2100 at University of Tennessee - Chattanooga taught by Dr. Aniekan Ebiefung in Fall 2016. Since its upload, it has received 66 views. For similar materials see Introductory Statistics in Math at University of Tennessee - Chattanooga.
Date Created: 09/13/16
Introduction to Statistics
Test 1: September 16th, 2016
Chapters 1-3 Study Guide

Chapter 1:
Statistics: a field of study in which you collect, organize, summarize, analyze, and interpret data.
Subject: the person, thing, or entity that we get the information from
Data: information obtained from subjects through experiments, observations, measurements, or surveys
Datum: one observational value
Raw Data: data recorded in the sequence in which they occur, before they are processed
Population: a particular group of interest in a study
Sample: a subset of the population
Census: data are obtained from every subject in the population

2 Major Areas of Statistics:
  Descriptive Statistics: collect, organize, summarize, and display data
  Inferential Statistics: using sample data to make predictions, generalizations, and inferences about the population

Critical Thinking: an unbiased assessment or analysis of information in order to reach a conclusion
  Elements of it: understand the problem, analyze, and conclude
Sampling Method to Collect Data: data must be collected in a way that is not biased
Voluntary Response: biased, because subjects decide for themselves whether to participate
Analyze: every analysis must include an appropriate graph that reveals outliers and shape, and must check for missing data and nonresponse bias.
Statistical Significance: the outcome is very unlikely to occur by chance
Practical Significance: considers whether the outcome makes enough of a difference in the real world to matter; a result can be statistically significant and still be too small to be of practical use
Parameter: a numerical measurement describing some characteristic of a population
Statistic: a numerical measurement describing some characteristic of a sample
Potential Pitfalls: misleading conclusions, small samples, loaded questions, question order, nonresponse, missing data, precise numbers, percentages over 100%. Be familiar with all of these ways your conclusion or data could be wrong.
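To make the parameter vs. statistic distinction concrete, here is a minimal Python sketch. The population of exam scores is made up for illustration (it is not from the course), and the sample size of 50 is an arbitrary choice.

```python
import random
import statistics

# Hypothetical population: exam scores for 1,000 students (made-up data).
random.seed(42)
population = [random.randint(40, 100) for _ in range(1000)]

# Parameter: a numerical measurement describing the whole population.
# Using every subject like this is what a census does.
population_mean = statistics.mean(population)

# Statistic: the same measurement computed from a sample (a subset).
# random.sample draws without replacement.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.2f}")
print(f"Statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values will usually be close but not equal. Inferential statistics is about using the statistic to say something about the parameter.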
Review Notes
Variability: the key to Statistics
Variable: a quantity or characteristic of interest that has different values for different subjects
Qualitative Variable: consists of attributes, labels, categories, or names and cannot be numerically measured
Quantitative Variable: consists of numerical measurements or counts
  Discrete: variable whose values are countable, like 0, 1, 2, 3
  Continuous: variable whose values are uncountable; they come from measurements on a continuous scale

Levels of Measurement:
  Nominal: lowest level of measurement. Data at this level consist of names, labels, or categories only
  Ordinal: data can be arranged in some order, but differences (by subtraction) between data values either cannot be determined or are meaningless
  Interval: data can be ordered, and differences between data values can be found and are meaningful, but there is no natural zero starting point
  Ratio: highest level of measurement. Data can be ordered, differences are meaningful, and there is a natural zero starting point, so ratios are meaningful too

Representative Sample: one that is fair and not biased, one that mirrors the population
Random Sample: each subject has an equal chance of being included in the sample
Simple Random Sample: stricter than a random sample; every possible sample of the same size has an equal chance of being chosen
Sampling with Replacement: a subject is returned to the population after selection
Systematic Sampling: select some starting point, then select every kth (such as every 50th) element in the population
Convenience Sampling: use results that are easy to get (for example, surveying whoever is readily available)
Stratified Sampling: divide the population into subgroups (strata) that share a characteristic, then draw a sample from each
Cluster Sampling: divide the population area into sections (clusters), randomly select some clusters, then use all members of the selected clusters

Types of Statistical Studies:
  Observational: observe and measure without modifying the subjects being studied
  Experimental: apply some treatment and observe its effect
  Meta-Analysis: previous studies are restudied as a group in order to obtain information not possible from individual studies

Types of Observational Studies:
  Cross-Sectional: data are observed, measured, and collected at one point in time
  Retrospective: data are collected from the past, from records or interviews
  Prospective: data are collected in the future from groups sharing common factors, called cohorts

Double-Blind: neither the subject nor the worker knows who is receiving the treatment
Single-Blind: the subject does not know whether they are receiving the treatment
Confounding: occurs when the experimenter is not able to distinguish between the effects of different factors

Errors
  Sampling Error: the difference between a sample result and the true population result
  Nonsampling Error: sample data incorrectly collected, recorded, or analyzed
  Nonrandom Sampling Error: due to using a sampling method that is not random, such as a convenience sample or a voluntary response sample

Chapter 2:
Characteristics of Data
  Center: a value that lies in the middle of the data set
  Variation: a measure of the amount that the data values vary
  Distribution: the shape of the data
  Outliers: values that are very far away from (much smaller or larger than) the other values
  Time: changing characteristics of the data over time

Frequency Distribution: shows how data are partitioned among several categories (classes) by listing each category with the number of data values in it.
Class: consists of data values that fall in a given interval, such as 50-59, where 50 and 59 are included
Frequency of a Data Value: the number of times that the value occurs in the data set
Frequency of a Class: the number of data values in the data set that belong to the class
Frequency Table: a table that lists data values together with the corresponding frequencies
Lower Class Limit: the smallest number that can belong to a class
Upper Class Limit: the largest number that can belong to a class
Class Boundaries: values that separate classes so that there is no gap between any 2 consecutive classes in the frequency table. Find the size of the gap between the upper limit of one class and the lower limit of the next, divide it by 2, then subtract that amount from each lower class limit and add it to each upper class limit.
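The class boundary rule (find the gap, divide by 2) can be sketched in Python as follows. The class limits and frequencies are invented for illustration; only the 50-59 class comes from the notes.

```python
# Hypothetical frequency table: (lower class limit, upper class limit, frequency).
classes = [(50, 59, 4), (60, 69, 7), (70, 79, 11), (80, 89, 6), (90, 99, 2)]

# Gap between the upper limit of one class and the lower limit of the next.
gap = classes[1][0] - classes[0][1]   # 60 - 59 = 1
half_gap = gap / 2                    # 0.5

# Push each lower limit down and each upper limit up by half the gap.
boundaries = [(lower - half_gap, upper + half_gap) for lower, upper, _ in classes]

for (lower, upper, freq), (lo_b, hi_b) in zip(classes, boundaries):
    print(f"{lower}-{upper}: boundaries {lo_b} to {hi_b}, frequency {freq}")
```

With the boundaries in place, the upper boundary of each class equals the lower boundary of the next one, so the gaps between classes disappear.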
Class Width: difference between the lower class limits (or lower class boundaries) of 2 consecutive classes
Class Midpoint: (lower class limit + upper class limit) / 2
Relative Frequency Distribution: each class frequency is replaced by its relative frequency, (frequency of class) / (sum of all frequencies)
Cumulative Frequency Distribution: the frequency of each class is the sum of frequencies for that class and all previous classes

Histogram: a graph of bars of equal width drawn adjacent to each other. The horizontal scale represents classes of quantitative data values and the vertical scale represents frequencies
Scatterplot: a plot of paired (x, y) quantitative data, with the independent variable on the horizontal axis and the dependent variable on the vertical axis
Time Series Graph: a graph of time series data, which are quantitative data that have been collected at different points in time.
Dotplot: a graph in which each data value is plotted as a point (or a dot) along a scale of values.
Stem plot: each data value is partitioned into 2 parts; the left portion is called the stem and the right portion is called the leaf.
Pareto Chart: a bar graph for qualitative data, with the bars arranged in descending order according to frequencies
Pie Chart: a graph depicting qualitative data as slices of a circle, in which the size of each slice is proportional to the frequency count.
Frequency Polygon: a line graph in which class frequencies are plotted against the corresponding class midpoints
Ogive: a line graph of cumulative frequencies against the corresponding upper class boundaries. Use it to find how many values lie above or below a specified value.

Chapter 3
Mean: sum of all values divided by the number of values.
Median: the measure of center that is the middle value when the original data values are arranged in order
Mode: the value that occurs with the greatest frequency
Midrange: the measure of center that is the value midway between the maximum and minimum values in the data set. (maximum data value + minimum data value) / 2
Mean from a Frequency Distribution: $\bar{x} = \sum (f \cdot x) / \sum f$, where x is a class midpoint and f is the frequency of that class
Weighted Mean: $\bar{x} = \sum (w \cdot x) / \sum w$, where w is the weight given to the value x
Range: (maximum value - minimum value)
Standard Deviation: a measure of how much data values deviate from the mean.
Variance: the square of the standard deviation
Empirical Rule: only for bell-shaped distributions. About 68% of values fall within 1 standard deviation of the mean, about 95% of values fall within 2 standard deviations of the mean, and about 99.7% of values fall within 3 standard deviations of the mean
Chebyshev’s Theorem: the proportion of any set of data lying within k standard deviations of the mean is at least $1 - 1/k^2$, where k > 1
Coefficient of Variation: the standard deviation expressed as a percentage of the mean; for a sample, $CV = (s / \bar{x}) \cdot 100\%$
Biased Estimator: an estimator whose values do not target the population value. For example, the values of the sample standard deviation s do not target the value of the population standard deviation.
Unbiased Estimator: an estimator whose values tend to target the population value instead of systematically tending to overestimate or underestimate it. For example, the values of the sample variance tend to target the value of the population variance.
z Score (or standardized value): the number of standard deviations that a given x value is above or below the mean.
Percentiles: measures of location, denoted P1, P2, ..., P99, which divide a set of data into 100 groups with about 1% of the values in each group.
Quartiles: measures of location, denoted Q1, Q2, Q3, which divide a set of data into four groups with about 25% of the values in each group. Review each quartile.
5-Number Summary: minimum, first quartile, second quartile (the median), third quartile, maximum
Box plot: a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile, the median, and the third quartile.
Modified Boxplot: a regular boxplot constructed with these modifications: 1) a special symbol is used to identify outliers, and 2) the solid horizontal line extends only as far as the minimum data value that is not an outlier and the maximum data value that is not an outlier.
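Several of the Chapter 3 formulas can be checked with a short Python sketch. The data set and the helper names (`weighted_mean`, `z_score`, `chebyshev_bound`) are made up for illustration; `statistics.stdev` and `statistics.variance` compute the sample versions (dividing by n - 1).

```python
import statistics

# Hypothetical sample data (made up for illustration).
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)       # sum of values / number of values
s = statistics.stdev(data)         # sample standard deviation
var = statistics.variance(data)    # sample variance, the square of s
cv = s / mean * 100                # coefficient of variation, in percent
data_range = max(data) - min(data)
midrange = (max(data) + min(data)) / 2


def weighted_mean(values, weights):
    """Sum of (w * x) divided by sum of w. With class midpoints as the
    values and class frequencies as the weights, this is also the mean
    from a frequency distribution."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def z_score(x, mean, s):
    """Number of standard deviations that x lies above or below the mean."""
    return (x - mean) / s


def chebyshev_bound(k):
    """Minimum proportion of data within k standard deviations (k > 1)."""
    return 1 - 1 / k**2


# Compare the actual proportion within 2 standard deviations to the bound.
k = 2
within = sum(1 for x in data if abs(x - mean) <= k * s) / len(data)

print(f"mean={mean}, s={s:.3f}, variance={var:.3f}, CV={cv:.1f}%")
print(f"range={data_range}, midrange={midrange}")
print(f"z score of 9: {z_score(9, mean, s):.3f}")
print(f"within {k} sd: {within:.2f} (Chebyshev guarantees at least {chebyshev_bound(k):.2f})")
```

Chebyshev's bound holds for any data set, so the actual proportion can be much higher than the guaranteed minimum, as it is here.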