## Stat Chapter 2 Notes

by: Brandon Gearhart

These notes cover topics for exam 1

These 13 pages of class notes were uploaded by Brandon Gearhart on Monday, October 3, 2016. They belong to Stat 206 (Business Statistics) at the University of South Carolina, taught by Angela Ferguson in Fall 2016.

Date Created: 10/03/16
### STAT 206: Chapter 2 (Organizing and Visualizing Variables)

Methods to Organize and Visualize Variables

- For categorical variables:
  - Summary table; contingency table (2.1)
  - Bar chart, pie chart, Pareto chart, side-by-side bar chart (2.3)
- For numerical variables:
  - (Array), ordered array, frequency distribution, relative frequency distribution, percentage distribution, cumulative percentage distribution (2.2)
  - Stem-and-leaf display, histogram, polygon, cumulative percentage polygon (2.4)
- Other methods later…

#### 2.1 Organizing Categorical Variables

You must identify the variable type to determine the appropriate organization and visualization tools. Recall the variable types:

- Categorical (category)
  - Nominal: name of a category
  - Ordinal: has a natural ordering
- Numerical / quantitative (quantity)
  - Discrete: distinct cutoffs between values
  - Continuous: on a continuum

Definitions:

- Summary table: shows the values of the data categories for one variable and the frequencies (counts) or proportions/percentages for each category.
- Contingency table: shows the values of the data categories for more than one variable and the frequencies or proportions/percentages for each of the joint responses.
- Each response is counted/tallied into one and only one category/cell.

Example (Problem 2.2, p. 40): The following data represent the responses to two questions asked in a survey of 40 college students majoring in business:

- What is your gender? (M = male; F = female)
- What is your major? (A = Accounting; C = Computer Information; M = Marketing)

| Students | Gender | Major |
|---|---|---|
| 1 to 10 | M M M F M F F M F M | A C C M A C A A C C |
| 11 to 20 | F M M M M F F M F F | A A A M C M A A A C |
| 21 to 30 | M M M M F M F F M M | C C A A M M C A A A |
| 31 to 40 | F M M M M F M F M M | C C A A A A C C A C |

Summary table (Gender):

| Value | Frequency | Relative frequency | Percentage |
|---|---|---|---|
| Male (M) | 25 | 0.625 | 62.5 |
| Female (F) | 15 | 0.375 | 37.5 |
| TOTALS | 40 | 1.000 | 100.0 |

Summary table (Major):

| Value | Frequency | Relative frequency | Percentage |
|---|---|---|---|
| A (Accounting) | 20 | 0.500 | 50.0 |
| C (Computer) | 15 | 0.375 | 37.5 |
| M (Marketing) | 5 | 0.125 | 12.5 |
| TOTALS | 40 | 1.000 | 100.0 |

Now combine the two variables (Gender and Major) in a contingency table:

| Gender | A (Accounting) | C (Computer) | M (Marketing) | TOTALS |
|---|---|---|---|---|
| Male (M) | 14 | 9 | 2 | 25 |
| Female (F) | 6 | 6 | 3 | 15 |
| TOTALS | 20 | 15 | 5 | 40 |

Table based on total percentages (each cell divided by the grand total, 40):

| Gender | A (Accounting) | C (Computer) | M (Marketing) | TOTALS |
|---|---|---|---|---|
| Male (M) | 35% | 22.5% | 5% | 62.5% |
| Female (F) | 15% | 15% | 7.5% | 37.5% |
| TOTALS | 50% | 37.5% | 12.5% | 100% |

Table based on row percentages (each cell divided by its row total):

| Gender | A (Accounting) | C (Computer) | M (Marketing) | TOTALS |
|---|---|---|---|---|
| Male (M) | 56% | 36% | 8% | 100% |
| Female (F) | 40% | 40% | 20% | 100% |
| TOTALS | 50% | 37.5% | 12.5% | 100% |

Table based on column percentages (each cell divided by its column total):

| Gender | A (Accounting) | C (Computer) | M (Marketing) | TOTALS |
|---|---|---|---|---|
| Male (M) | 70% | 60% | 40% | 62.5% |
| Female (F) | 30% | 40% | 60% | 37.5% |
| TOTALS | 100% | 100% | 100% | 100% |

Questions:

- How many of the surveyed students were females majoring in Marketing? 3
- What percentage of the surveyed students were females majoring in Marketing? 7.5%
- What percentage of the male students surveyed were majoring in Computer? 36%
- Of the students majoring in Accounting, what percentage was male? 70%

#### 2.3 Visualizing Categorical Variables

- Pie chart: uses sections of a circle to represent the tallies/frequencies/percentages for each category.
- Bar chart: a series of bars, with each bar representing the tallies/frequencies/percentages for a single category.

Consider the summary table for Major from the previous example (A: 50.0%, C: 37.5%, M: 12.5%).

[Figure: pie chart, "Percentage by Major Category", with slices A (Accounting), C (Computer), M (Marketing)]

[Figure: bar chart, "Percentage by Major Category"]

Question: Which major has the lowest concentration of students? Marketing

Summary table of causes of incomplete ATM transactions:

| Cause | Frequency | Percentage |
|---|---|---|
| ATM malfunctions | 32 | 4.42% |
| ATM out of cash | 28 | 3.87% |
| Invalid amount requested | 23 | 3.18% |
| Lack of funds in account | 19 | 2.62% |
| Card unreadable | 234 | 32.32% |
| Warped card jammed | 365 | 50.41% |
| Wrong keystroke | 23 | 3.18% |
| TOTAL | 724 | 100.00% |

Discussion: Preference for type of chart? Bar chart

- Pareto chart: a series of vertical bars showing tallies/frequencies/percentages in descending order.
  - Example: [Figure: Pareto chart]
  - Discussion: How or why do you think that a Pareto chart would be useful in the business world? It helps separate the important "few" categories from the less important "many".
- Side-by-side bar chart: uses sets of bars to show the joint responses from two categorical variables.
  - Example: [Figure: side-by-side bar chart, "A4256 - Calibration Solution / Chips", vertical axis up to 300,000]
  - Discussion: What can you determine about product utilization from this side-by-side bar chart that you might not be able to tell otherwise?
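The summary and contingency tables for the Problem 2.2 survey data can be reproduced with a short script. The following is a minimal sketch using only Python's standard library; the function names (`summary_table`, `contingency_table`, `percentages`) are illustrative choices and not part of the course material.

```python
from collections import Counter

# Survey responses from Problem 2.2 (40 business students), in the order listed in the notes.
GENDER = "MMMFMFFMFM" "FMMMMFFMFF" "MMMMFMFFMM" "FMMMMFMFMM"
MAJOR = "ACCMACAACC" "AAAMCMAAAC" "CCAAMMCAAA" "CCAAAACCAC"


def summary_table(values):
    """Return {category: (frequency, relative frequency, percentage)}."""
    counts = Counter(values)
    n = len(values)
    return {cat: (k, k / n, 100 * k / n) for cat, k in counts.items()}


def contingency_table(rows, cols):
    """Count each (row, column) pair; every response lands in exactly one cell."""
    return Counter(zip(rows, cols))


def percentages(table, basis="total"):
    """Convert joint counts to percentages of the grand, row, or column total."""
    grand = sum(table.values())
    row_tot, col_tot = Counter(), Counter()
    for (r, c), k in table.items():
        row_tot[r] += k
        col_tot[c] += k
    denom = {
        "total": lambda r, c: grand,
        "row": lambda r, c: row_tot[r],
        "column": lambda r, c: col_tot[c],
    }[basis]
    return {(r, c): 100 * k / denom(r, c) for (r, c), k in table.items()}


table = contingency_table(GENDER, MAJOR)
print(table[("F", "M")])                          # females majoring in Marketing (count)
print(percentages(table, "total")[("F", "M")])    # as a percentage of all students
print(percentages(table, "row")[("M", "C")])      # percentage of males majoring in Computer
print(percentages(table, "column")[("M", "A")])   # percentage of Accounting majors who are male
```

The four printed values correspond to the four questions in Section 2.1: the choice of `basis` decides whether a cell is compared with all 40 students, with its gender row, or with its major column.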
#### 2.2 Organizing Numerical Variables

- Ordered array: arranges the values of a numerical variable in rank order (smallest value to largest value). Array → ordered array.
  - Example (Table 2.8 A & B, p. 42)
- Frequency distribution: tallies the values of a numerical variable into a set of numerically ordered classes; each class is called a class interval.
  - How many classes? At least 5, no more than 15.
  - Determine the interval width as follows: interval width = (highest value - lowest value) / number of classes
  - Using our Meal Cost data, we estimate that we want 10 classes, so the interval width is (80 - 25)/10 = 55/10 = 5.5.
  - But we really want to simplify to multiples of \$5, say \$5 or \$10. A width of \$5 produces 13 classes (more than we want), so we choose \$10 to produce 7 classes (notice this is sometimes more art than science…).
- Relative frequency distribution: presents the relative frequency, or proportion of the total, for each group.
  - The proportion, or relative frequency, in each class is equal to the number of values in that class divided by the total number of values.
  - Example:

| Meal Cost (\$) | City: Frequency | City: Relative Frequency | City: Percentage | Suburban: Frequency | Suburban: Relative Frequency | Suburban: Percentage |
|---|---|---|---|---|---|---|
| 20 but <30 | 4 | .08 | 8% | 4 | .08 | 8% |
| 30 but <40 | 10 | .20 | 20% | 17 | .34 | 34% |
| 40 but <50 | 12 | .24 | 24% | 13 | .26 | 26% |
| 50 but <60 | 11 | .22 | 22% | 10 | .20 | 20% |
| 60 but <70 | 7 | .14 | 14% | 4 | .08 | 8% |
| 70 but <80 | 5 | .10 | 10% | 2 | .04 | 4% |
| 80 but <90 | 1 | .02 | 2% | 0 | 0 | 0% |
| TOTALS | 50 | 1 | 100.00% | 50 | 1 | 100.00% |

- The TOTAL of the relative frequency column MUST BE 1.00.
- The TOTAL of the percentage column MUST BE 100.00.
- Cumulative percentage distribution: provides a way of presenting information about the percentage of values that are less than a specific amount.

City and Suburban combined:

| Meal Cost (\$) | Frequency | Relative Frequency | Percentage | < Lower Boundary | Cumulative Percentage < Lower Boundary |
|---|---|---|---|---|---|
| 20 but <30 | 8 | 0.08 | 8.0% | <20 | 0 (no meals cost less than \$20) |
| 30 but <40 | 27 | 0.27 | 27.0% | <30 | 8% = 0 + 8% |
| 40 but <50 | 25 | 0.25 | 25.0% | <40 | 35% = 0 + 8% + 27% |
| 50 but <60 | 21 | 0.21 | 21.0% | <50 | 60% = 0 + 8% + 27% + 25% |
| 60 but <70 | 11 | 0.11 | 11.0% | <60 | 81% = 0 + 8% + 27% + 25% + 21% |
| 70 but <80 | 7 | 0.07 | 7.0% | <70 | 92% = 0 + 8% + 27% + 25% + 21% + 11% |
| 80 but <90 | 1 | 0.01 | 1.0% | <80 | 99% = 0 + 8% + 27% + 25% + 21% + 11% + 7% |
| TOTALS | 100 | 1.00 | 100.0% | <90 | 100% = 0 + 8% + 27% + 25% + 21% + 11% + 7% + 1% |

Question: What percentage of meal costs was less than \$50? 60% (0 + 8% + 27% + 25%)

#### 2.4 Visualizing Numerical Variables

Stem-and-leaf display. How to create one:

1. Separate each observation into a stem (all but the final digit(s)) and a leaf (the final digit(s)).
2. Write the stems in a vertical column, smallest on top.
3. Write each leaf, in increasing numerical order, in the row next to the appropriate stem.

Example: for each state, the percentage (with one decimal place) of residents 65 and older.

[Figure: stem-and-leaf display of the state percentages]

- Notice that the stem "7" does not have a leaf, so we conclude there is no value of 7.x.
- There should be the same number of leaves as observations!
- Include ALL stems, even if they have no values/leaves.
- Leave a space holder if there is no leaf for a stem.
- No punctuation (i.e., no decimal points, no commas).
- Leaves should be lined up on top of one another to determine SHAPE.
- A simple way to deliver a lot of detailed information.
- FOR THIS EXAMPLE, read the data values as: 6.8; (no 7.x values); 8.8; 9.8, 9.9; 10.0, 10.8

Histogram: displays a quantitative variable across different groupings of values.

- Be careful when choosing how to group values together!
- Groupings must each cover a range of the same size, so all bars have equal width.
- Height is used to compare the frequency of each range of values.

Steps to create a frequency histogram:

- Create equal-width classes (groupings).
- Count the number of values in each class.
- Draw the histogram with a bar for each class.
- The height of a bar represents the count for that bar's class.
- Bars touch since there are NO GAPS between classes.

[Figure: histogram of Meal Cost]

Be careful:

- The number of categories can't be too large or too small.
- Don't skip any categories.
- Be clear about the contents of each category.

Histogram example, using Age at Time of First Oscar Award. The groupings chosen here are [20,25), [25,30), [30,35), [35,40), [40,45), [45,50), …, where "[" means the number is INCLUDED in the interval and ")" means the number is NOT included in the interval.

- Question: If Jack Nicholson won Best Actor at age 70, which category frequency would increase?
  - A. [60,65)
  - B. [65,70)
  - C. [70,75)
  - D. [75,80)
  - Answer: C, since 70 is included in [70,75) but excluded from [65,70).

Let's examine what it means to turn a frequency into a relative frequency by looking at the age at Oscar data (TOTAL: 76).

- A relative frequency histogram depicts the relative frequency (rather than the count) of each category.
- Do the shapes of the frequency and relative frequency histograms differ?

Percentage polygon: used for visualization when dividing the data of a numerical variable into two or more groups.

- Uses the midpoint of each class to represent the data in the class.
- Combines data from two groups to allow easier comparison.
- Conclusions?

Cumulative percentage polygon (ogive): uses the cumulative percentage distribution (discussed previously) to plot the cumulative percentages along the Y axis.

- The LOWER BOUNDS of the class intervals are plotted on the X axis.
- Conclusions?
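The class-interval and cumulative percentage rules from Sections 2.2 and 2.4 can be checked with a few lines of Python. This sketch is illustrative only: the helper names (`class_interval`, `frequency_distribution`, `cumulative_percentages`) are assumptions, not from the course. It starts from the combined meal cost class frequencies given in the notes, because the raw meal costs are not listed here.

```python
def class_interval(value, start, width):
    """Return the half-open class [lower, upper) that contains value."""
    k = int((value - start) // width)
    lower = start + k * width
    return (lower, lower + width)


def frequency_distribution(values, start, width, n_classes):
    """Tally values into n_classes equal-width classes [lower, upper)."""
    freqs = [0] * n_classes
    for v in values:
        k = int((v - start) // width)
        if not 0 <= k < n_classes:
            raise ValueError(f"{v} falls outside the classes")
        freqs[k] += 1
    return freqs


def cumulative_percentages(freqs):
    """Percentage of values below each class boundary, lowest boundary first."""
    total = sum(freqs)
    running = 0
    out = [0.0]  # nothing lies below the first lower boundary
    for f in freqs:
        running += f
        out.append(100 * running / total)
    return out


# Combined city + suburban frequencies from the notes (classes 20 but <30, ..., 80 but <90).
meal_freqs = [8, 27, 25, 21, 11, 7, 1]
boundaries = [20 + 10 * i for i in range(len(meal_freqs) + 1)]
for b, pct in zip(boundaries, cumulative_percentages(meal_freqs)):
    print(f"< ${b}: {pct:.0f}%")

# Classes of width 5 starting at 20, as in the Oscar age example.
print(class_interval(70, start=20, width=5))
```

The cumulative value at the \$50 boundary comes out as 60%, matching the answer in the notes, and a value of 70 lands in the class [70, 75) because the lower bound is included and the upper bound is not.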
