UH - MATH 2311 - Class Notes - Week 1


Chapter 1: Exploring Univariate Data

Section 1.1: Types of Data

∙ Relevance of statistics
  - Statistics is used to gather and analyze data in any discipline.
  - Statistics is used to analyze surveys.
∙ What is Statistics?
  - Statistics is used to make intelligent decisions in a world full of uncertainty.
    “A knowledge of statistics provides the necessary tool to differentiate between sound statistical conclusions and questionable conclusions.”
  - Statistics is the science of collecting, organizing, and interpreting numerical facts, which we call data.
∙ What is “Data”?
  - Data are the facts and figures collected, analyzed, and summarized for presentation and interpretation. For example:
    Amount of your last purchase at a grocery store
    The number of times that you access a certain website
    Your name
∙ Types of Data:
  - Population data is everything or everyone we want information about.
    It is a set of data that consists of all possible values pertaining to a certain set of observations or an investigation.
  - Sample data is a subset of the population that we have information from.
    It is just a small section of the population taken for the purpose of investigation.
∙ Examples of Types of Data
  - Identify the population and the sample for each of the following:
    University of Houston is interested in how many students buy used books as opposed to new ones. They randomly choose 100 students at the student center to interview.
      o Population – All UH students
      o Sample – The 100 students interviewed
    An elementary school is creating a new lunch menu. They send questionnaires to students with last names that begin with the letters M through R.
      o Population – All students at this school
      o Sample – Students with last names that start with M through R
∙ A variable is a characteristic of an individual that can assume more than one value.
  - Variables can be classified as categorical (qualitative) or quantitative (numeric).
    Categorical variables – describe qualities or characteristics that data may have
      o They usually represent a “type of something,” such as a type of car.
    Quantitative variables – are measurements
      o These will be numeric values.
∙ Quantitative variables can be classified as either discrete or continuous.
  - Discrete quantitative variables – take a countable set of values
    For example: the number of lives given in a single play of a video game
  - Continuous quantitative variables – can take on any value within some interval
    For example: the amount of time you wait in line at the driver's license office
∙ Example: Classify the following variables as categorical or quantitative. If quantitative, state whether the variable is discrete or continuous.
    Political preference
      o Categorical
    Number of siblings
      o Quantitative – Discrete
    Blood type
      o Categorical
    Height of men on a professional basketball team
      o Quantitative – Continuous
    Time it takes to be on hold when calling the IRS at tax time
      o Quantitative – Continuous

Section 1.2: Mean and Median

∙ One question we want to answer about data is about its location, particularly the location of its center.
  - Mean – denoted with the Greek letter μ when referring to the population mean and with the symbol x̄ when referring to the sample mean
    Most common measure of center
    Arithmetic average
    We find the mean by adding up all the values and dividing by how many there are:
    \[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \mu = \frac{1}{N}\sum_{i=1}^{N} x_i \]
    where n is the size of the sample and N is the size of the population.
    Symbols for the mean: x̄ (sample) vs μ (population)
  - Median – M is the midpoint of a data set such that half of the observations are smaller and the other half are larger
    Arrange all observations in order of size, from smallest to largest
    Find the middle value of the arranged observations by counting (n + 1)/2 observations up from the bottom of the list
      o If the number of observations n is odd, the median M is the center observation in the ordered list.
      o If the number of observations n is even, the median M is the mean of the two center observations in the ordered list.
  - Mode – the value that appears most frequently
    The mode is used as a description of center for categorical data
    A data set can have one mode, or two or more modes
    A data set may not have any mode
∙ Examples:
  - 1. Twelve babies spoke for the first time at the following ages (in months):
    8 9 10 11 12 13 15 15 18 20 20 26
    a. What is the mean of the data?
       x̄ = (8 + 9 + 10 + 11 + 12 + 13 + 15 + 15 + 18 + 20 + 20 + 26)/12 = 14.75
    b. What is the median of the data?
       With n = 12, the two center observations are the 6th and 7th: Median = (13 + 15)/2 = 14
    c. What is the mode of the data?
       The data are bimodal; the modes are 15 and 20.
  - 2. Here are the weights (in pounds) of 20 steers on an experimental feed diet:
    174 142 131 145 175 150 176 151 110 162 133 163 135 178 178 154 166 146 156 167
    a. What is the mean of the data?
       x̄ = (174 + 142 + 131 + 145 + 175 + 150 + 176 + … + 167)/20 = 154.6
    b. What is the median of the data?
       Sorted: 110, 131, 133, 135, 142, 145, 146, 150, 151, 154, 156, 162, 163, 166, 167, 174, 175, 176, 178, 178
       Median = (154 + 156)/2 = 155
    c. What is the mode of the data?
       Mode = 178
  - 3. The test scores of a class of 20 students have a mean of 71.6 and the test scores of another class of 14 students have a mean of 78.4. Find the mean of the combined group.
    Mean = sum/n, so sum = n × mean
    Class 1: 71.6 = sum/20, so sum = 20(71.6) = 1432
    Class 2: 78.4 = sum/14, so sum = 14(78.4) = 1097.6
    Mean of combined classes = (1432 + 1097.6)/(20 + 14) = 2529.6/34 = 74.4
  - 4. Explain why the conclusion drawn is not valid: A businesswoman calculates that the median cost of the five business trips that she took in a month is $600 and concludes that the total cost must have been $3000.
    The median only tells us that the middle cost of the five trips is $600; it says nothing about the other four costs, so the total is not determined.
    If $600 were the mean, the conclusion would be correct, since 5 × $600 = $3000.

Section 1.3: Standard Deviation and Variance

∙ Another important question we want to answer about data is about its spread or dispersion.
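Before measuring spread, the center calculations in Section 1.2, examples 1 to 3, can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the helper name `combined_mean` is an illustrative assumption and does not come from the notes.

```python
from statistics import mean, median, multimode

# Example 1: ages (in months) at which twelve babies first spoke
ages = [8, 9, 10, 11, 12, 13, 15, 15, 18, 20, 20, 26]

# Example 2: weights (in pounds) of 20 steers
weights = [174, 142, 131, 145, 175, 150, 176, 151, 110, 162,
           133, 163, 135, 178, 178, 154, 166, 146, 156, 167]


def combined_mean(groups):
    """Mean of several groups pooled together.

    `groups` is a list of (size, mean) pairs. Each group's sum is
    recovered as size * mean, as in example 3. (Hypothetical helper.)
    """
    total = sum(size * group_mean for size, group_mean in groups)
    count = sum(size for size, _ in groups)
    return total / count


# multimode returns every value tied for the highest frequency
print("Example 1:", mean(ages), median(ages), multimode(ages))
print("Example 2:", mean(weights), median(weights), multimode(weights))
print("Example 3:", round(combined_mean([(20, 71.6), (14, 78.4)]), 2))
```

`multimode` is used rather than `mode` so that a bimodal data set such as example 1 reports both modes.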
  - Roughly speaking, the population standard deviation, σ, tells the average distance that data values fall from the mean.
  - The population standard deviation is the square root of the population variance, σ².
  - So, what is the variance?
  - The variance is the average of the squared differences of the data values from the mean.
∙ If N is the number of values in a population with mean μ, and x_i represents each individual value in the population, then the variance is found by:
  \[ \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2 \]
∙ And the population standard deviation is σ = √(σ²).
∙ Most of the time we are not working with the entire population.
  - Instead, we are working with a sample of size n with sample mean x̄. Note the divisor n – 1 in place of N.
    Sample variance –
    \[ s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 \]
    Sample standard deviation –
    \[ s = \sqrt{s^2} \]
∙ Example:
  - 1. A statistics teacher wants to decide whether or not to curve an exam. From her class of 300 students, she chose a sample of 10 students and their grades were:
    72, 88, 85, 81, 60, 54, 70, 72, 63, 43
    Find the mean, variance and standard deviation for this sample.
    x̄ = (72 + 88 + 85 + 81 + 60 + 54 + 70 + 72 + 63 + 43)/10 = 68.8
    s² = [(72 – 68.8)^2 + (88 – 68.8)^2 + (85 – 68.8)^2 + … + (43 – 68.8)^2]/(10 – 1) ≈ 199.7
    s = √199.7 ≈ 14.13
  - 2. Suppose the statistics teacher decides to curve the grades by adding 10 points to each score. What are the new mean, variance and standard deviation?
    New mean: 78.8 (old mean + 10, or 68.8 + 10)
    New s² ≈ 199.7 (the variance did not change)
    New s ≈ 14.13 (the standard deviation did not change)
    By adding 10 to each data point, the spread of the data does not change. This is why the variance and the standard deviation are unaffected by adding a value to each data point.
∙ We can see from example 2 that adding the same value to all elements does not affect the variance (or standard deviation) of a set of data.
∙ What about multiplying?
  - 3. Find the variance and the standard deviation for the following set of data (whose mean is 4.5):
    3, 6, 2, 7, 4, 5
    Now, multiply each value by 2. What is the new variance and the new standard deviation?
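One way to explore this question, and to check examples 1 and 2 along the way, is a short Python sketch with the standard `statistics` module. It assumes the sample formulas (divisor n – 1) are the ones intended for example 3; the notes do not say which version to use.

```python
from statistics import mean, variance, stdev

# Examples 1 and 2: sample of 10 grades, then the same grades curved by +10
grades = [72, 88, 85, 81, 60, 54, 70, 72, 63, 43]
curved = [g + 10 for g in grades]

# Example 3: original data and the same data multiplied by 2
data = [3, 6, 2, 7, 4, 5]
doubled = [2 * x for x in data]

# variance() and stdev() use the sample formulas (divide by n - 1)
for label, values in [("grades", grades), ("curved", curved),
                      ("data", data), ("doubled", doubled)]:
    print(f"{label:8s} mean={mean(values):.2f} "
          f"s^2={variance(values):.2f} s={stdev(values):.2f}")

# Ratios produced by doubling every value
print("variance ratio:", variance(doubled) / variance(data))
print("std dev ratio: ", stdev(doubled) / stdev(data))
```

Under this sketch, shifting every value leaves s² and s unchanged, while doubling every value multiplies the variance by 4 and the standard deviation by 2. Those two ratios are the same whether the divisor is n – 1 or N, because the divisor cancels.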
    Mean(x) = 4.5
    Var(x) = [(3 – 4.5)^2 + (6 – 4.5)^2 + …
∙ Sometimes we want to compare the variation between two groups.
  - The coefficient of variation can be used for this.
  - The coefficient of variation is the ratio of the standard deviation to the mean.
  - A smaller ratio will indicate less variation in the data.
∙ Example:
  - 4. The following statistics were collected on two different groups of stock prices:
|                           | Portfolio A | Portfolio B |
|---------------------------|-------------|-------------|
| Sample size               | 10          | 15          |
| Sample mean               | $52.65      | $49.80      |
| Sample standard deviation | $6.50       | $2.95       |
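
The notes do not include the question or the worked answer for example 4. As a hedged sketch of how the coefficient of variation defined above could be applied to this table, the following Python snippet computes the ratio for each portfolio; the function name `coefficient_of_variation` is an illustrative assumption.

```python
def coefficient_of_variation(std_dev, mean):
    """Ratio of the standard deviation to the mean (unitless)."""
    return std_dev / mean


# Values taken from the table: (sample standard deviation, sample mean)
cv_a = coefficient_of_variation(6.50, 52.65)
cv_b = coefficient_of_variation(2.95, 49.80)

print(f"Portfolio A: {cv_a:.3f}")
print(f"Portfolio B: {cv_b:.3f}")
```

The ratios come out to roughly 0.123 for Portfolio A and 0.059 for Portfolio B, so by the rule that a smaller ratio indicates less variation, Portfolio B varies less relative to its mean.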
