Chapter 2 Notes
Feb. 6, 2017
By: Jill Huang
Bold Letters are Important!

Chapter 2: Displaying & Describing Categorical Data

◆ Area Principle – The area a chart gives to a value should match the size of that value: the bigger the count or percentage, the bigger the slice.

Exercise #1
The Wells Fargo/Gallup Small Business Index survey asked 604 small business owners about their cash flow over the next 12 months.

• 13% responded “Very good”
• 37% responded “Somewhat good”
• 21% responded “Neither good nor poor”
• 20% responded “Somewhat poor”
• 7% responded “Very poor”

Questions:
A) What do you notice about the percentages listed?
∙ 2% is missing.
∙ 13 + 37 + 21 + 20 + 7 = 98
B) Make a bar chart to display the results and label it clearly.
C) Would a pie chart be an effective way of communicating this information? Why or why not?
∙ No, because the percentages don’t add up to 100%. A pie chart shows parts of a whole, so the slices must total 100%.
D) Write a couple of sentences on the responses of small business owners about their cash flow in the next 12 months.
∙ 50% expect good cash flow (13% + 37%)
∙ 21% expect it to be ‘so-so’
∙ 27% expect poor cash flow (20% + 7%); the remaining 2% is unaccounted for

Exercise #2
A company started and managed by business students is selling campus calendars. The students have conducted a market survey with the various campus constituents to determine sales potential and identify which market segments should be targeted.

Questions:
A) What percentage of all respondents are alumni?
∙ Total Alumni / Total Respondents = 56 / 1415 = 3.96%
B) What percentage of these respondents are very likely to buy the calendar?
∙ Total ‘Very Likely’ / Total Respondents = 481 / 1415 = 33.99%
C) What percentage of the respondents who are very likely to buy the calendar are alumni?
∙ ‘Very likely in Alumni’ / Total ‘Very Likely’ = 18 / 481 = 3.74%
∙ The condition is “very likely to buy”, so divide by the 481 very likely respondents, not by all 1415.
D) Of the alumni, what percentage are very likely to buy the calendar?
∙ ‘Very likely in Alumni’ / Total Alumni = 18 / 56 = 32.14%
E) What is the marginal distribution of the campus constituents?
∙ Marginal Distribution – the distribution of one variable irrespective of the other ones. It can be found by looking at row/column totals.
◦ Total Students / Total Respondents = 905 / 1415 = 63.96%
◦ Total Faculty & Staff / Total Respondents = 338 / 1415 = 23.89%
◦ Total Alumni / Total Respondents = 56 / 1415 = 3.96%
◦ Total Town Residents / Total Respondents = 116 / 1415 = 8.20%
F) What is the conditional distribution of the campus constituents among those very likely to buy the calendar?
∙ Conditional Distribution – the distribution of one variable restricted to the cases that satisfy a condition on another variable (here, only the respondents who are very likely to buy). It can be found by looking at row/column percentages.
◦ ‘Very likely in Student’ / Total ‘Very likely’ = 320 / 481 = 66.53%
◦ ‘Very likely in Faculty & Staff’ / Total ‘Very likely’ = 98 / 481 = 20.37%
◦ ‘Very likely in Alumni’ / Total ‘Very likely’ = 18 / 481 = 3.74%
◦ ‘Very likely in Town Residents’ / Total ‘Very likely’ = 45 / 481 = 9.36%
G) Does this study present any evidence that this company should focus on selling to certain campus constituents?
∙ The company should focus on students, because they make up the largest share of the survey.
∙ 63.96% of all respondents are students, and 66.53% of the respondents who are very likely to buy are students.

Exercise #3
A study of a sample of 1057 houses in upstate New York reports the following percentages of houses falling into different Price and Size categories.

Questions:
A) Are these column, row, or total percentages? How do you know?
∙ This is a column percentage table.
∙ If you add up the numbers in any one column, you get 100%.
∙ Column = top to bottom
B) What percentage of the highest priced houses were small?
∙ 2.4%
∙ Look at ‘High’ & ‘Small’
C) From this table, can you determine what percentage of houses were in the low price category?
∙ No. Column percentages only show how the sizes are distributed within each price category. You need the counts (or total percentages) to determine the percentage of houses that were in the low price category.
D) Among the lowest priced houses, what percentage were small or medium small?
∙ 91.9% = 61.5% + 30.4%
∙ Look at ‘Low’ and ‘Small & Med Small’
E) Write a few sentences describing the association between Price and Size.
∙ (Ask Yourself) Is there an association between the size & price?
∙ Based on the graph (below), the variables are dependent: the distribution of Size changes from one Price category to another.
∙ The more you pay, the bigger the house you get. When Price increases, Size increases.
∙ Compare the conditional distributions of Size within each Price category to see this.

Exercise #4
An article in the magazine Science examined the graduate admissions process at Berkeley for evidence of gender bias. The following table shows the number of applicants accepted to each of four graduate programs.

Questions:
A) What percentage of total applicants were admitted?
∙ 1284 / 3014 = 42.6%
B) Overall, were a higher percentage of males or females admitted?
∙ Males
◦ Male = 1022 / 2165 = 47.2%
◦ Female = 262 / 849 = 30.86%
C) Compare the percentages of males and females admitted in each program.
∙ Female acceptance rates are higher than male rates when each program is counted individually.
◦ Program 1: (Male) 61.93% (Female) 82.4%
◦ Program 2: (Male) 62.86% (Female) 68%
◦ Program 3: (Male) 33.66% (Female) 35.2%
◦ Program 4: (Male) 5.9% (Female) 7.04%
∙ This is called Simpson’s Paradox – when a trend that appears within each group reverses once the groups are combined. Here females have the higher acceptance rate in every program, but males have the higher rate overall.
D) Which of the comparisons you made do you consider to be the most valid? Why?
∙ The second comparison (by program) is more valid.
∙ The percentages of males and females should be compared within the same program (as in Question C), because the programs have very different acceptance rates and receive different numbers of male and female applicants. Adding all of the numbers together hides these differences and makes the overall comparison misleading.
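The reversal in Exercise #4 can be checked with a short Python sketch. It uses only the overall counts and the per-program percentages quoted above (the per-program counts are not reproduced in these notes), and the variable names are illustrative assumptions rather than anything from the original exercise.

```python
# Overall admission counts quoted in Exercise #4: (admitted, applicants).
overall = {"Male": (1022, 2165), "Female": (262, 849)}

# Per-program acceptance rates (%) as quoted in the notes.
by_program = {
    "Program 1": {"Male": 61.93, "Female": 82.4},
    "Program 2": {"Male": 62.86, "Female": 68.0},
    "Program 3": {"Male": 33.66, "Female": 35.2},
    "Program 4": {"Male": 5.9, "Female": 7.04},
}


def rate(admitted, applicants):
    """Acceptance rate as a percentage."""
    return 100 * admitted / applicants


overall_rates = {sex: rate(*counts) for sex, counts in overall.items()}

# Which group has the higher rate overall, and which has it in each program?
overall_leader = max(overall_rates, key=overall_rates.get)
program_leaders = {
    program: max(rates, key=rates.get) for program, rates in by_program.items()
}

# Simpson's Paradox: every program agrees on one leader, but the combined
# data points to the other group.
leaders = set(program_leaders.values())
is_paradox = len(leaders) == 1 and overall_leader not in leaders

for sex, r in overall_rates.items():
    print(f"Overall {sex}: {r:.2f}%")
print("Leader in each program:", program_leaders)
print("Simpson's Paradox present:", is_paradox)
```

The check is deliberately simple: it only compares which group leads, which is enough to show that the overall comparison and the by-program comparison point in opposite directions.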

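Returning to Exercise #2, the marginal and conditional distributions can be computed the same way from the counts quoted in parts E and F. This is a minimal Python sketch; the helper name `distribution` and the dictionary names are assumptions made for illustration.

```python
# Counts quoted in Exercise #2, keyed by campus constituent.
totals = {
    "Students": 905,
    "Faculty & Staff": 338,
    "Alumni": 56,
    "Town Residents": 116,
}
very_likely = {
    "Students": 320,
    "Faculty & Staff": 98,
    "Alumni": 18,
    "Town Residents": 45,
}


def distribution(counts):
    """Turn a dict of counts into percentages of their own total."""
    total = sum(counts.values())
    return {group: 100 * n / total for group, n in counts.items()}


# Marginal distribution: each group's share of all respondents.
marginal = distribution(totals)

# Conditional distribution: each group's share of the 'very likely' respondents.
conditional = distribution(very_likely)

for group in totals:
    print(f"{group}: marginal {marginal[group]:.2f}%, "
          f"given very likely {conditional[group]:.2f}%")
```

Both distributions divide by the total of the group being described, so the only difference between them is which set of counts goes in: all respondents for the marginal, the very likely respondents for the conditional.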
