
# Class Note for STAT 528 at OSU 30


This 30-page set of class notes was uploaded by an elite notetaker on Friday, February 6, 2015. The class notes belong to a course at Ohio State University taught by a professor in Fall. Since the upload, the notes have received 18 views.


Date Created: 02/06/15

**Stat 528, Autumn 2008: Data Analysis I**

Elly Kaizar, Department of Statistics, 221 Cockins Hall, ekaizar@stat.osu.edu

- Course web page: http://www.stat.osu.edu/~ekaizar/courses/528
- Course gradebook: http://www.carmen.osu.edu

### Section Overview

Reading: Chapter 1 (Introduction), Section 1.1

- Introduction
- Looking at data
  - Individuals and variables
  - Categorical and quantitative variables
  - Variation
- Graphical summaries for data (displaying distributions)
  - Bar and pie charts
  - Stemplots
  - Histograms
  - Time series plots

### The language of statistics

- Statistics is a lot about learning the language and definitions. By learning the lingo, we can communicate and present our ideas to others.
- We will introduce many definitions. It will help you to learn these terms as we go through this term.
- When you write, be sure to use the terms you have learned to communicate clearly and precisely.

### Data

- Data is everywhere: collect it for yourself, newspapers, television, the internet, etc.
- Studying data gives insight, or can confuse ("lies, damn lies, and statistics").
- The way data is presented is very important.
- An individual is something that has a characteristic of interest that can be measured or observed (e.g., a person, animal, chemical process, stock index).
- The actual characteristic of interest that is recorded or obtained for each individual is called the variable. Variables can be classified as categorical or quantitative.

### Categorical and quantitative variables

- Categorical variables record an individual's group or characteristic. We cannot perform arithmetic on these variables. Instead, we often summarize these variables for a number of individuals at once.
- Quantitative variables describe a numeric characteristic. We can perform arithmetic on these variables.
  1. Discrete variables take on a countable number of values.
  2. Continuous variables take on a continuum of values.

### Presenting data

- Easy to think of data being recorded in a spreadsheet or worksheet (this is the case for MINITAB).
- The variables are the columns.
- The individuals are the rows.
- A case is defined as a collection of the values of the variables for one individual.

### Presenting data (cont.)

Example worksheet:

| Candidate | Party | Total Raised | Age | States Won |
|---|---|---|---|---|
| Clinton | Dem | 90,935,788 | 60 | 0 |
| Obama | Dem | 80,256,427 | 46 | 1 |
| Edwards | Dem | 30,329,152 | 54 | 0 |
| Romney | Rep | 62,829,069 | 60 | 0 |
| Huckabee | Rep | 2,345,798 | 51 | 1 |

- Number of variables?
- Number of individuals?
- Types of each variable?

### Thinking about data

- In order to analyze and present data, ask the following questions:
  1. Where does the data come from?
  2. Why was the data collected?
  3. Who collected the data? (Most good data sources have references.)

### Thinking about the structure of the data

- How much data was collected?
  - How many individuals are there? (Usually the more the better.)
  - How many variables?
- Are there variables missing that you think should have been collected?
- How do you define the variables?
  - How and under what conditions were the variables measured or collected?
  - What are the units of measurement?
- Is there missing data (i.e., are there individuals for which the variable was not recorded)? Why? Missing data can be a big problem in statistics.

### A sheepish example

- A vet is observing some vital statistics of a flock of 74 sheep at an agricultural research station. The vet records the following variables:
  - weight
  - body temperature
  - pulse rate
  - estrus (heat) cycle

### Making measurements

- Some questions:
  - How do we make measurements of the variables (what is the instrument used to make the measurement)?
  - What are the units of measurement?
  - Does the variable actually measure what we want to measure?
- Answering these questions often involves expert knowledge.
- Easier to measure physical quantities rather than social or behavioral quantities.

### Variation

- Example (cont.): Suppose we have measured the weight of 23 sheep. Their weights in pounds are:

  180 160 157 185 159 165 168 165 175 186 155 169 168 170 173 181 189 179 182 177 157 169 166

- The weights of the sheep vary.
- If we measure the weight of one sheep many times, the values will still vary. Why?

### Reasons for variation

- Reasons for variation include
  - natural variation (e.g., one sheep ate more than another sheep; male sheep are heavier in general than females)
  - measurement error (depends on the measuring instrument)
  - rounding/numerical error
- Key idea: We should account and adjust for variation in the data (e.g., the average weight of a number of sheep varies less than the individual measurements).

### Contrasting descriptive with inferential statistics

- Descriptive statistics: we summarize only the data we have collected.
- Inferential statistics: based on the data we have collected and some assumptions, we try to infer something about a more general (larger) group of individuals.

Example: CBS News poll of 665 registered voters taken Sept. 5-7

| Candidate | Number Who Favor | Percent Who Favor |
|---|---|---|
| McCain | 308 | 46 |
| Obama | 295 | 44 |
| Undecided | 62 | 9 |

- Descriptive:
- Inferential:

### Graphical summaries for univariate data

("Univariate" means one variable.)

- Easy to pick out patterns in graphical displays ("a picture is worth a thousand words").
- But can hide important information too.
- When we examine distributions using graphical displays:
  - Helpful to look at the pattern of values in the data.
  - Look for unusual features of the data (e.g., an outlier is a value which lies outside the overall pattern of the data).

### Plots for categorical variables

- Example: What percentage of the class usually take cream in their coffee?

| Answer | Number |
|---|---|
| Yes | |
| No | |
| Do not drink coffee | |

- We will present the data as a bar chart.

[Figure: bar chart of Percentage (0 to 100) by Answer: Don't drink coffee, No, Yes]

### More plots for categorical data

- Now present the data as a pie chart.
- Which display is more useful? Why?

### Stemplots (stem-and-leaf displays)

- Stemplots show the shape of data (the distribution).
- Good for small datasets.
- We illustrate with the sheep weight data.

Stem-and-leaf of sheep, N = 23, Leaf Unit = 1.0

| Stem | Leaf |
|---|---|
| 15 | 5779 |
| 16 | 05568899 |
| 17 | 03579 |
| 18 | 012569 |

- Using another scale:

Stem-and-leaf of sheep, N = 23, Leaf Unit = 1.0

| Stem | Leaf |
|---|---|
| 15 | 5779 |
| 16 | 0 |
| 16 | 5568899 |
| 17 | 03 |
| 17 | 579 |
| 18 | 012 |
| 18 | 569 |

### Stemplots: changing the scale

Stem-and-leaf of sheep, N = 23, Leaf Unit = 1.0

| Stem | Leaf |
|---|---|
| 15 | 5 |
| 15 | 77 |
| 15 | 9 |
| 16 | 0 |
| 16 | |
| 16 | 55 |
| 16 | 6 |
| 16 | 8899 |
| 17 | 0 |
| 17 | 3 |
| 17 | 5 |
| 17 | 7 |
| 17 | 9 |
| 18 | 01 |
| 18 | 2 |
| 18 | 5 |
| 18 | 6 |
| 18 | 9 |

- The last two stemplots here have split stems.
- Different scales can be useful for picking up different features of the data.

### Back-to-back stemplots

The female cuckoo lays her eggs into the nest of foster parents. The foster parents are usually deceived by the similarity of sizes of the eggs. Here are the lengths of cuckoo eggs (in mm) found in two species. The decimal point is at the colon.

| Sparrow | Stem | Robin |
|---:|:---:|:---|
| | 20 : | 9 |
| 81 | 21 : | 7 |
| 6443100 | 22 : | 08 |
| 930000 | 23 : | 00115889 |
| | 24 : | 0 |
| | 25 : | 0 |

How does the distribution of lengths differ for the two species?

### Frequency, relative frequency and density

Suppose we observe a discrete quantitative variable.

- The frequency of any particular value of that variable is the number of times that value occurs in the data.
- The relative frequency (or percent) of any particular value of the variable is the proportion of times that value occurs in the data.

### Frequency, relative frequency and density (continued)

Suppose we observe a continuous quantitative variable.

- We sort continuous variables into ranges of values called classes. Classes are usually all the same width.
- The frequency of a particular class is the number of times values fall in that class in the data.
- The relative frequency (or percent) of a particular class is the proportion of times values fall in that class in the data.
- The density of a particular class is the relative frequency divided by the width of the class:

$$\text{density} = \frac{\text{relative frequency}}{\text{width of class}}$$

### Histograms: a discrete example

Temperature transducers of a certain type are shipped in batches of 50. A sample of 60 batches is selected, and the number not conforming in each batch is determined:

    2 1 2 4 0 1 3 2 0 5 3 3 1 3 2
    4 7 0 2 3 0 4 2 1 3 1 1 3 4 1
    2 3 2 2 8 4 5 1 3 1 5 0 2 3 2
    1 0 6 4 2 1 6 0 3 3 3 6 1 2 3

We have:

| class | frequency | relative frequency |
|---|---|---|
| 0 | 7 | |
| 1 | 12 | |
| 2 | 13 | |
| 3 | 14 | |
| 4 | 6 | |
| 5 | 3 | |
| 6 | 3 | |
| 7 | 1 | |
| 8 | 1 | |

### Drawing the histogram

[Figure: two histograms of the transducers data, with vertical scales 0 to 15 and 0.0 to 0.2]

### Histograms: a continuous example

The following are losses for n = 38 hurricanes occurring over a period of 38 years. The losses are in units of millions of dollars and are adjusted for inflation. They appear here sorted.

    2.93 3.08 4.47 6.77 7.12 10.56 14.47 15.35 16.98 18.38
    19.03 25.30 29.11 30.15 33.73 40.60 41.41 47.91 49.40 52.60
    59.92 63.12 77.81 102.94 103.22 123.68 140.14 192.01 198.45 227.34
    329.51 361.20 421.68 513.59 545.78 750.39 863.88 1638.00

| class | frequency | relative frequency |
|---|---|---|
| 0-199 | 29 | 29/38 = 0.763 |
| 200-399 | 3 | 3/38 = 0.079 |
| 400-599 | 3 | 3/38 = 0.079 |
| 600-799 | 1 | 1/38 = 0.026 |
| 800-999 | 1 | 1/38 = 0.026 |
| 1000-1199 | 0 | 0 |
| 1200-1399 | 0 | 0 |
| 1400-1599 | 0 | 0 |
| 1600-1799 | 1 | 1/38 = 0.026 |

### Features of histograms

- Choice of the number of classes depends on the data.
  - Too many classes leads to a spread-out histogram.
  - Not enough classes gives a squashed histogram.
  - Somewhere in between is a good tradeoff.
- The area under a density histogram is one.
- If we round off relative frequencies, they may not add up to one; this is called round-off error.
- Often present relative frequencies as percentages.
- The heights of relative frequency histograms sum to 1.

### Hurricane histograms: changing the number of classes

[Figure: three density histograms of hurricane loss, each drawn with a different number of classes]

### Shapes of distributions

- unimodal, bimodal, multimodal

[Figure: three frequency histograms illustrating the shapes listed above]

### Shapes of distributions (cont.)

- symmetric, left skewed, and right skewed

[Figure: three frequency histograms illustrating the shapes listed above]

### Continuous distribution summaries

- shape
- center
- spread
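
The definitions of frequency, relative frequency, and density given earlier in these notes can be checked on the transducer data from the discrete histogram example. The following is a minimal sketch, not part of the original course material: the helper name `frequency_table` and its `class_width` parameter are hypothetical, and the code assumes integer data and an integer class width.

```python
from collections import Counter

# Number of nonconforming transducers in each of the 60 sampled batches,
# copied from the discrete histogram example (one digit per batch).
ROWS = [
    "212401320533132",
    "470230421311341",
    "232284513150232",
    "106421603336123",
]
counts_per_batch = [int(ch) for row in ROWS for ch in row]


def frequency_table(values, class_width=1):
    """Return rows of (class_start, frequency, relative_frequency, density).

    Hypothetical helper: each class covers class_start up to
    class_start + class_width - 1 (integer data only).
    """
    n = len(values)
    # Map every value to the start of the class it falls in, then count.
    freq = Counter((v // class_width) * class_width for v in values)
    table = []
    for start in range(min(freq), max(freq) + class_width, class_width):
        f = freq.get(start, 0)          # frequency: count in the class
        rel = f / n                     # relative frequency: proportion
        dens = rel / class_width        # density: rel. freq. / class width
        table.append((start, f, rel, dens))
    return table


table = frequency_table(counts_per_batch)
print("class frequency rel.freq  density")
for start, f, rel, dens in table:
    print(f"{start:>5} {f:>9} {rel:>8.3f} {dens:>8.3f}")
```

With a class width of 1, the density equals the relative frequency. Calling `frequency_table(counts_per_batch, class_width=3)` merges the values into three wider classes, and the densities times the class width still sum to one, which is the "area under a density histogram is one" property.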
