HS 370 Epidemiology Lesson 4
These 6-page class notes were uploaded by Cindy Cannon on Wednesday, February 10, 2016. They belong to HS 370 at Brigham Young University - Idaho, taught by Watson, Tyler A. in Fall 2016. Since their upload, they have received 19 views.
Date Created: 02/10/16
Epidemiology Lesson 4
Summarizing Data Part 2

Measures of Spread
Describe the dispersion (or variation) of values from the peak of the distribution.

• Range: the difference between a data set's largest (maximum) value and its smallest (minimum) value
• Percentiles: the Pth percentile (P ranging from 0 to 100) is the value that has P percent of the observations falling at or below it
• Quartiles: each quartile includes 25% of the data
  o The 25th, 50th, 75th, and 100th percentiles
• Interquartile Range: the central portion of the distribution, from the 25th percentile to the 75th percentile
  o A measure of spread used for data that are not normally distributed (skewed)
    - It does not include the extreme values
  o Method
    - Step 1: Arrange the observations in increasing order
    - Step 2: Find the positions of the 1st and 3rd quartiles with the following formulas, where n is the number of observations
      Position of 1st quartile (Q1) = 25th percentile = (n+1)/4
      Position of 3rd quartile (Q3) = 75th percentile = 3(n+1)/4 = 3 × position of Q1
    - Step 3: Identify the values of the 1st and 3rd quartiles
      If a quartile lies on an observation (its position is a whole number), the value of the quartile is the value of that observation. Example: if the position of a quartile is 20, its value is the value of the 20th observation.
      If a quartile lies between observations, the value of the quartile is the value of the lower observation plus the corresponding fraction of the difference between the two observations. Example: if the position of a quartile is 20¼, it lies between the 20th and 21st observations, and its value is the value of the 20th observation plus ¼ of the difference between the values of the 20th and 21st observations.
    - Step 4: Calculate the interquartile range as Q3 minus Q1
  o Example
    - Step 1: 0, 2, 3, 4, 5, 5, 6, 7, 8, 9, 9, 9, 10, 10, 10, 10, 10, 11, 12, 12, 12, 13, 14, 16, 18, 18, 19, 22, 27, 49
    - Step 2:
      Position of Q1 = (30+1)/4 = 7.75
      Position of Q3 = 3(30+1)/4 = 23.25
    - Step 3:
      The position of Q1 is 7¾, so the value of Q1 equals the value of the 7th observation plus ¾ of the difference between the values of the 7th and 8th observations
        Value of the 7th observation: 6
        Value of the 8th observation: 7
        Q1 = 6 + (¾ × (7-6)) = 6 + (¾ × 1) = 6 + 0.75 = 6.75
      The position of Q3 is 23¼, so the value of Q3 equals the value of the 23rd observation plus ¼ of the difference between the values of the 23rd and 24th observations
        Value of the 23rd observation: 14
        Value of the 24th observation: 16
        Q3 = 14 + (¼ × (16-14)) = 14 + (¼ × 2) = 14.5
    - Step 4: Interquartile range = Q3 - Q1 = 14.5 - 6.75 = 7.75
  o Calculating Q2 (median)
    - Step 1: Find the position of Q2: (position of Q1 + position of Q3)/2, or 2(n+1)/4
    - Step 2: Identify the value at that position, following the same steps used for Q1 and Q3
  o Example: 2, 6, 10, 33, 79, 104, 299, 315, 448, 818, 1005, 1402
    - Step 1:
      Position of Q1 = (12+1)/4 = 3.25
      Position of Q3 = 3 × 3.25 = 9.75
      Position of Q2 = (3.25+9.75)/2 = 6.5
    - Step 2: Q2 is the value of the 6th observation (104) plus 0.50 of the difference between the 7th and 6th observations
      Q2 = 104 + (0.50 × (299-104)) = 104 + (0.50 × 195) = 104 + 97.5 = 201.5
  o Q4 is simply the maximum of the data
• Standard Deviation: the measure of spread used with the arithmetic mean
  o Method
    - Step 1: Calculate the arithmetic mean
    - Step 2: Subtract the mean from each observation. Square the difference
    - Step 3: Sum the squared differences
    - Step 4: Divide the sum of the squared differences by n-1
    - Step 5: Take the square root of the value obtained in Step 4. The result is the standard deviation
  o The standard deviation conveys how widely or tightly the observations are distributed around the center
  o Standard deviation is usually calculated only when the data are more or less normally distributed
  o Example (27, 31, 15, 30, 22)
    - Step 1: Mean = (27+31+15+30+22)/5 = 125/5 = 25.0
    - Step 2:
      (27-25) squared = 4.0
      (31-25) squared = 36.0
      (15-25) squared = 100.0
      (30-25) squared = 25.0
      (22-25) squared = 9.0
    - Step 3: 4+36+100+25+9 = 174
    - Step 4: 174/(5-1) = 174/4 = 43.5 (this is called the variance)
    - Step 5: Standard deviation = square root of the variance (43.5) = 6.6
  o For normally distributed data, approximately 68.3% of the data fall within 1 standard deviation of the mean
  o About 95% (more precisely, 95.4%) of the data fall within 2 standard deviations of the mean
  o 99.7% of the data fall within 3 standard deviations of the mean
    - 95.0% of the data fall within 1.96 standard deviations of the mean
• Standard Error of the Mean: the variability we might expect in the arithmetic means of repeated samples taken from the same population
  o The standard error assumes that the data you have are actually a sample from a larger population
    - Thus the mean of your sample is just one of an infinite number of possible sample means, and the standard error quantifies the variation in those sample means
  o Method
    - Step 1: Calculate the standard deviation
    - Step 2: Divide the standard deviation by the square root of the number of observations (n)
  o Example (standard deviation = 9.188, n = 30)
    - Standard error of the mean = 9.188/(square root of 30) = 9.188/5.477 = 1.68
  o The primary use of the standard error of the mean is in calculating confidence intervals around the arithmetic mean
• Confidence Limits (Confidence Interval): used to make generalizations about the larger population from which the study subjects came
  o Method for calculating a 95% confidence interval for a mean
    - Step 1: Calculate the mean and its standard error
    - Step 2: Multiply the standard error by 1.96
    - Step 3:
      Lower limit of the 95% confidence interval = mean minus 1.96 × standard error
      Upper limit of the 95% confidence interval = mean plus 1.96 × standard error
  o Example
    - Step 1: Mean = 206; standard error of the mean = 3
    - Step 2: 3 × 1.96 = 5.88
    - Step 3:
      Lower limit = 206 - 5.88 = 200.12
      Upper limit = 206 + 5.88 = 211.88
  o Confidence intervals for means, proportions, risk ratios, odds ratios, and other measures are all calculated using different formulas
  o Regardless of the measure, the interpretation of a confidence interval is the same: the narrower the interval, the more precise the estimate, and the range of values in the interval is the range of population values most consistent with the data from the study

Choosing the Right Measure of Central Location and Spread
Measures of central location and spread are useful for summarizing a distribution of data. They also facilitate the comparison of 2 or more sets of data.

  o However, not every measure of central location and spread is well suited to every set of data
  o For example, because the bell-shaped curve is perfectly symmetrical, the mean, median, and mode all have the same value. Observed data rarely approach this ideal shape, however, so the mean, median, and mode usually differ
    - Normal distribution: measure of central location = arithmetic mean; measure of spread = standard deviation
    - Asymmetrical or skewed distribution: measure of central location = median; measure of spread = range or interquartile range
    - Exponential or logarithmic distribution: measure of central location = geometric mean; measure of spread = geometric standard deviation
  o In statistics, the arithmetic mean is the most commonly used measure of central location
    - One disadvantage of the mean is that it is affected by the presence of 1 or a few observations with extremely high or low values
    - You can tell the direction in which the data are skewed by comparing the values of the mean and the median
      The mean is higher than the median when the distribution of the data is skewed to the right
      The mean is lower than the median when the distribution of the data is skewed to the left
• The advantage of the median is that it is not affected by a few extremely high or low observations
  o 2 measures of spread can be used in conjunction with the median: the range and the interquartile range
  o Although many statistics books recommend the interquartile range as the preferred measure of spread, most practicing epidemiologists use the simpler range instead
• The mode is the least useful measure of central location
• The geometric mean is used for exponential or logarithmic data such as laboratory titers, and for environmental sampling data whose values can span several orders of magnitude
  o The geometric standard deviation is the matching measure of spread. Analogous to the geometric mean, it is the antilog of the standard deviation of the logs of the values
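
The quartile method in these notes (positions at (n+1)/4, 2(n+1)/4, and 3(n+1)/4, interpolating between neighboring observations when a position is fractional) can be sketched in Python as follows. This is a minimal illustration, not part of the original notes. The function and variable names are made up for the example, and the 30-value list is assumed to contain five 10s, which gives n = 30 and matches the 23rd and 24th observations quoted in the worked example.

```python
def value_at_position(sorted_values, position):
    """Return the value at a 1-based, possibly fractional, position."""
    lower = int(position)            # whole-number part of the position
    fraction = position - lower      # e.g. 0.75 for position 7.75
    lower_value = sorted_values[lower - 1]
    if fraction == 0:
        return lower_value           # quartile lies exactly on an observation
    upper_value = sorted_values[lower]
    # lower observation plus the fraction of the gap to the next observation
    return lower_value + fraction * (upper_value - lower_value)


def quartiles(values):
    """Q1, Q2 (median), Q3 and IQR using the (n + 1)/4 position rule."""
    data = sorted(values)            # Step 1: arrange in increasing order
    n = len(data)
    if n < 3:
        raise ValueError("this sketch assumes at least 3 observations")
    q1 = value_at_position(data, (n + 1) / 4)
    q2 = value_at_position(data, 2 * (n + 1) / 4)
    q3 = value_at_position(data, 3 * (n + 1) / 4)
    return {"Q1": q1, "Q2": q2, "Q3": q3, "IQR": q3 - q1}


# Assumed reconstruction of the 30-observation example (five 10s).
example_30 = [0, 2, 3, 4, 5, 5, 6, 7, 8, 9, 9, 9, 10, 10, 10, 10, 10, 11,
              12, 12, 12, 13, 14, 16, 18, 18, 19, 22, 27, 49]
# The 12-observation example used for Q2.
example_12 = [2, 6, 10, 33, 79, 104, 299, 315, 448, 818, 1005, 1402]

print(quartiles(example_30))
print(quartiles(example_12))
```

For the 30-value list this reproduces Q1 = 6.75, Q3 = 14.5, and an interquartile range of 7.75. For the 12-value list it gives Q2 = 201.5. Python's standard-library `statistics.quantiles(data, n=4)` uses the same (n+1)-based positions by default, so it can serve as a cross-check.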
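
The five-step standard deviation method can be sketched the same way. This is an illustrative Python version with hypothetical function names. It uses the n-1 denominator from Step 4 and the five observations from the worked example.

```python
import math


def variance_and_sd(values):
    """Sample variance and standard deviation, following Steps 1 to 5."""
    n = len(values)
    mean = sum(values) / n                                # Step 1
    squared_diffs = [(x - mean) ** 2 for x in values]     # Step 2
    total = sum(squared_diffs)                            # Step 3
    variance = total / (n - 1)                            # Step 4
    return variance, math.sqrt(variance)                  # Step 5


observations = [27, 31, 15, 30, 22]
variance, sd = variance_and_sd(observations)
print(sum(observations) / len(observations))  # mean
print(variance, round(sd, 1))
```

Running it on the example gives a mean of 25.0 (the observations sum to 125), a variance of 43.5, and a standard deviation of about 6.6.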
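
The percentages quoted for normally distributed data can be checked numerically. The sketch below relies on a standard result that is not stated in the notes: for a normal distribution, the fraction of values within k standard deviations of the mean is erf(k/√2). The function name is hypothetical.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 1.96, 2, 3):
    print(k, round(100 * fraction_within(k), 1))
```

This yields roughly 68.3% for 1 standard deviation, 95.0% for 1.96, 95.4% for 2, and 99.7% for 3.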
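
The standard error and 95% confidence interval methods chain together naturally. The following is a minimal Python sketch with hypothetical function names, using the 1.96 multiplier from the notes and the numbers from the two worked examples.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)


def confidence_interval_95(mean, se):
    """Lower and upper 95% confidence limits for a mean."""
    margin = 1.96 * se
    return mean - margin, mean + margin


# Standard error example: SD = 9.188, n = 30
print(round(standard_error(9.188, 30), 3))

# Confidence interval example: mean = 206, standard error = 3
lower, upper = confidence_interval_95(206, 3)
print(round(lower, 2), round(upper, 2))
```

The standard error comes out at about 1.677, and the interval runs from 200.12 to 211.88. The 1.96 multiplier is specific to a 95% interval for a mean. As the notes point out, other measures use different formulas.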
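
The geometric mean and geometric standard deviation can be sketched by working on the logs of the values and then taking the antilog. The notes give no worked example, so the titer values below are hypothetical, as are the function names. The n-1 denominator is an assumption carried over from the standard deviation method, and natural logs are used, although any base gives the same result as long as the matching antilog is applied.

```python
import math


def geometric_mean(values):
    """Antilog of the mean of the logs (values must be positive)."""
    logs = [math.log(x) for x in values]
    return math.exp(sum(logs) / len(logs))


def geometric_sd(values):
    """Antilog of the standard deviation of the logs (n - 1 denominator)."""
    logs = [math.log(x) for x in values]
    mean_log = sum(logs) / len(logs)
    variance = sum((v - mean_log) ** 2 for v in logs) / (len(logs) - 1)
    return math.exp(math.sqrt(variance))


# Hypothetical titers spanning several orders of magnitude
titers = [10, 100, 1000]
print(round(geometric_mean(titers), 6))
print(round(geometric_sd(titers), 6))
```

For these made-up titers the geometric mean is 100 and the geometric standard deviation is 10, whereas the arithmetic mean (370) is pulled toward the largest value.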