# Chapter 1: Displaying Data MATH 243

UO


Date Created: 10/19/15

We'll begin by familiarizing ourselves with different ways to display data. Along the way we will define some necessary terms.

## Definitions

- **Individuals** are the objects described by a set of data.
- A **variable** is any characteristic of an individual.
- A **categorical variable** places an individual into one of several groups.
- A **quantitative variable** takes numerical values for which arithmetic operations make sense. Often there is a unit of measurement.
- The **distribution** of a variable tells us what values it takes and how often it takes these values.
  - The values of a categorical variable are labels for the categories. The distribution lists the categories and gives the count or the percent of individuals who fall in each category.
  - The values of a quantitative variable are the possible numbers assigned to an individual.

There are different ways to graphically represent data:

- bar graph
- pie chart
- histogram
- stemplot

**Problem.** Young people are more likely than older folk to buy music online. Here are percents of people in several age groups who bought music online:

| Age group | Percent who bought online |
|---|---|
| 18-24 | 21 |
| 25-34 | 20 |
| 35-44 | 16 |
| 45-54 | 10 |
| 55-64 | 3 |
| 65 and older | 1 |

The variable recorded for each person (bought music online: yes or no) is categorical.

Why is it not correct to use a pie chart? The percents do not add up to 100. Each percent describes a different age group, so they are not parts of one whole.

**Histogram problem.** The table below gives the ratio of omega-3 to omega-6 in some food oils. Values greater than 1 show that an oil has more omega-3 than omega-6. Use a histogram to display this data using classes bounded by whole numbers from 0 to 6.

| Oil | Ratio | Oil | Ratio |
|---|---|---|---|
| Perilla | 5.33 | Flaxseed | 3.56 |
| Walnut | 0.20 | Canola | 0.46 |
| Wheat germ | 0.13 | Soybean | 0.13 |
| Mustard | 0.38 | Grape seed | 0.00 |
| Sardine | 2.16 | Menhaden | 1.96 |
| Salmon | 2.50 | Herring | 2.67 |
| Mayonnaise | 0.06 | Palm | 0.02 |
| Cod liver | 2.00 | Rice bran | 0.05 |
| Shortening (household) | 0.11 | Butter | 0.64 |
| Shortening (industrial) | 0.06 | Cocoa butter | 0.04 |
| Margarine | 0.05 | Corn | 0.01 |
| Olive | 0.08 | Sesame | 0.01 |
| Shea nut | 0.06 | Cottonseed | 0.00 |
| Sunflower (oleic) | 0.05 | Sunflower (linoleic) | 0.00 |

What is the shape of the distribution? How many foods have more omega-3 than omega-6?

## How to produce a histogram

1. Choose the classes.
2. Highlight the cells you want the counts to appear in.
3. Enter `=FREQUENCY(data_array, bins_array)`, where `bins_array` holds the numbers you want as the class boundaries.
4. Press Command+Enter.
5. Highlight the counts and make them into a chart with columns.

## Stemplots

Stemplots can also be used for quantitative variables. Use them when you have small sets of data. Suppose the list below is a list of final exam scores:

96 87 80 74 72 98 86 80 70 69 68 50 25 94 92 83 79 78

| Stem | Leaves |
|---|---|
| 2 | 5 |
| 3 | |
| 4 | |
| 5 | 0 |
| 6 | 8 9 |
| 7 | 0 2 4 8 9 |
| 8 | 0 0 3 6 7 |
| 9 | 2 4 6 8 |

## Shape, center, and spread

We can describe the overall pattern of a distribution by its shape, center, and spread.

- **Shape:** symmetric (bell-shaped curve) or skewed (for example salaries, housing prices, financing).
- **Center:** mean or median.
- **Spread:** standard deviation, or the five-number summary (min, $Q_1$, median, $Q_3$, max).

## Center

The mean is the average, denoted $\bar{x}$. If your $n$ values are $x_1, x_2, \ldots, x_n$, then the sample mean is

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

or, equivalently,

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i .$$

The population mean is denoted $\mu$.

The median is another measure of center, denoted $M$. It is the middle value of the ordered observations. If the number of observations is even, $M$ is midway between the two center observations.

**Mean and median**

- The mean and median are close together if the distribution is symmetric.
- In a skewed distribution, the mean is typically farther out on the extended side.
- So the mean is not a resistant measure, and the median is resistant.

**Example.** What are the mean and median of the worker commute times (in minutes) in North Carolina given below?

5 10 60 40 30 10 20 15 30 10 10 40 12 20 25

Mean: $337/15 \approx 22.5$ minutes. Median: 20 minutes.

**Example, continued.** If we were talking about commute times in the Washington, DC area, you might expect a 180-minute commute time in the list, so the list of times would be

5 10 180 40 30 10 20 15 30 10 10 40 12 20 25

Compute the mean and median.

Mean: $457/15 \approx 30.5$ minutes. Median: 20 minutes. The single long commute pulls the mean up but leaves the median unchanged.

## Spread

- **Five-number summary:** min, first quartile $Q_1$, median, third quartile $Q_3$, max. $Q_1$ is the median of the observations below the median, and $Q_3$ is the median of the observations above it.
- **Standard deviation** $s$.

**Boxplot.** Consider the NC commute times:

5 10 60 40 30 10 20 15 30 10 10 40 12 20 25

The five-number summary is

Min = 5, $Q_1$ = 10, $M$ = 20, $Q_3$ = 30, Max = 60.

A boxplot is a graph of the five-number summary.

The standard deviation $s$ is another way to measure spread. It is roughly the average distance of the observations from the mean.

$$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$$

or

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2} .$$

The sample standard deviation can be computed with `STDEV.S`.

- $s$ measures spread about the mean $\bar{x}$.
- $s$ and $\bar{x}$ are used for reasonably symmetric distributions.
- $s$ is zero when the spread is zero (all the numbers are the same).
- $s$ gets larger as the spread increases.
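
The center and spread calculations for the commute-time examples can be checked with a short Python sketch. It uses only the standard `statistics` module. The helper `five_number_summary` is a hypothetical name introduced here, and it takes the quartiles as the medians of the lower and upper halves of the sorted data. The Washington, DC list is an assumption: it is the NC list with the 60-minute time replaced by 180 minutes.

```python
import statistics

# North Carolina commute times in minutes, as listed in the boxplot example.
nc_times = [5, 10, 60, 40, 30, 10, 20, 15, 30, 10, 10, 40, 12, 20, 25]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Q1 and Q3 are the medians of the observations below and above the
    position of the overall median (one common textbook convention).
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]                    # observations below the median
    upper = data[half + len(data) % 2:]    # skip the middle value when n is odd
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )


nc_mean = statistics.mean(nc_times)
nc_median = statistics.median(nc_times)
nc_s = statistics.stdev(nc_times)  # sample standard deviation (divides by n - 1)

# Assumption: the DC list is the NC list with 60 replaced by 180.
dc_times = [180 if t == 60 else t for t in nc_times]
dc_mean = statistics.mean(dc_times)
dc_median = statistics.median(dc_times)

print("NC five-number summary:", five_number_summary(nc_times))
print("NC mean, median, s:", round(nc_mean, 2), nc_median, round(nc_s, 2))
print("DC mean, median:", round(dc_mean, 2), dc_median)
```

Under that assumption the single 180-minute value raises the mean by about 8 minutes while the median stays at 20, which illustrates why the median is called resistant. Spreadsheet quartile functions may use a different interpolation rule and return slightly different quartiles.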

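
For the histogram problem, the class counts can be sketched in Python before drawing any chart. The ratios below are read from the oil table with the decimal point assumed to follow the first digit (5.33, 3.56, and so on). The helper `class_counts` is a hypothetical name, and it assumes each class includes its lower boundary but not its upper one.

```python
# Omega-3 to omega-6 ratios from the food-oil table.
# Assumption: each value has two decimal places (e.g. "533" means 5.33).
ratios = {
    "Perilla": 5.33, "Flaxseed": 3.56, "Walnut": 0.20, "Canola": 0.46,
    "Wheat germ": 0.13, "Soybean": 0.13, "Mustard": 0.38, "Grape seed": 0.00,
    "Sardine": 2.16, "Menhaden": 1.96, "Salmon": 2.50, "Herring": 2.67,
    "Mayonnaise": 0.06, "Palm": 0.02, "Cod liver": 2.00, "Rice bran": 0.05,
    "Shortening (household)": 0.11, "Butter": 0.64,
    "Shortening (industrial)": 0.06, "Cocoa butter": 0.04,
    "Margarine": 0.05, "Corn": 0.01, "Olive": 0.08, "Sesame": 0.01,
    "Shea nut": 0.06, "Cottonseed": 0.00,
    "Sunflower (oleic)": 0.05, "Sunflower (linoleic)": 0.00,
}


def class_counts(values, lower=0, upper=6):
    """Count values in classes [lower, lower+1), ..., [upper-1, upper)."""
    counts = [0] * (upper - lower)
    for v in values:
        if not lower <= v < upper:
            raise ValueError(f"value {v} falls outside the classes")
        counts[int(v) - lower] += 1  # int() truncates, giving the class index
    return counts


counts = class_counts(ratios.values())
more_omega3 = sorted(name for name, r in ratios.items() if r > 1)

for k, c in enumerate(counts):
    print(f"{k} to {k + 1}: {'*' * c} ({c})")
print("More omega-3 than omega-6:", len(more_omega3), more_omega3)
```

With these conventions most oils land in the first class and only a few have ratios above 1, so the distribution is strongly right-skewed. A spreadsheet `FREQUENCY` formula treats each bin value as an inclusive upper bound, so a value sitting exactly on a boundary (such as 2.00) may be counted one class lower there.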

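
The stemplot of the final exam scores can be built mechanically: split each score into a stem (tens digit) and a leaf (ones digit), then list the leaves in increasing order next to each stem. A minimal Python sketch follows. The function name `stemplot` is hypothetical, and the sketch assumes non-negative whole-number scores.

```python
from collections import defaultdict

# Final exam scores from the stemplot example.
scores = [96, 87, 80, 74, 72, 98, 86, 80, 70, 69, 68, 50, 25, 94, 92, 83, 79, 78]


def stemplot(values):
    """Return stemplot rows with tens digits as stems and ones digits as leaves."""
    leaves = defaultdict(list)
    for v in sorted(values):          # sorting first keeps leaves in order
        leaves[v // 10].append(v % 10)
    rows = []
    # Include empty stems so gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        leaf_text = " ".join(str(d) for d in leaves.get(stem, []))
        rows.append(f"{stem} | {leaf_text}".rstrip())
    return rows


for row in stemplot(scores):
    print(row)
```

Keeping the empty stems 3 and 4 in the output makes the low score of 25 stand out as a possible outlier, which a plot that skipped empty stems would hide.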