STP 231 week 3 notes

by: Andrej Sodoma

About this Document

These notes cover chapter sections 2.1-2.4.
Statistics for Biosciences
Dr. Ye Zhang
Class Notes

This 5-page set of class notes was uploaded by Andrej Sodoma on Friday, September 9, 2016. The notes belong to STP 231 at Arizona State University, taught by Dr. Ye Zhang in Fall 2016. Since its upload, it has received 10 views.



Date Created: 09/09/16
STP 231 lectures covering 2.1-2.4

I.) 2.1: section about variables. Variables are the characteristics in your study that can be defined and recorded. Upper case letters represent the variable itself, which could be height, gender, number of home runs, etc. Lower case letters represent values of the variable, which could be six feet tall, male, 200 home runs, etc.
  A. Numeric variable: the values are numbers; it is quantitative.
    i. Discrete: a numeric variable whose possible values can be counted or listed, like the number of seats in a stadium.
    ii. Continuous: a numeric variable that can take any value in an interval, like a measurement, because a measurement can in principle have any number of significant figures. For example, 100.000009 meters.
  B. Categorical variable: the values are categories rather than numbers; it is qualitative.
    i. Example: schooling rated as easy, medium, or hard.
    ii. Ordinal variable: a categorical variable whose categories are ranked. For example, how much school you have had: bachelor's, master's, PhD (doctorate).
  C. Discrete variable = a variable with a countable number of possible values. Example: number of people in a class.
  D. Continuous variable = a variable with an unlimited number of possible values within a range. Example: weight.

II.) 2.2 and 2.3: sections about organizing data.
  A. Frequency and frequency distribution
    i. Frequency: the number of times each category appears in a data set.
    ii. Frequency distribution: a table that lists each category (class) together with its frequency.
    iii. Relative frequency: the frequency of a category divided by the total number of observations.
      - It is useful when two or more data sets of different sizes are compared.
    iv. Relative frequency distribution: a table that lists each category with its relative frequency, as a proportion or a percentage.
  B. Bar chart: a graph that displays the frequency or relative frequency as a sequence of vertical bars.
    i. It must include all of the classes in the data set.
    ii. If relative frequency is used, then the sum of all of the relative frequencies must equal one.
    iii. One can convert from relative frequency to frequency by multiplying the relative frequencies by the total.
    iv. It can be used for qualitative and quantitative data.
  C. Dot plot
    i. A graph that shows all of the data.
    ii. Each dot represents one value in the data set.
  D. Cutpoint and single-value grouping
    i. Single-value grouping: each class is made up of one value.
      - Example: 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2.

        Class   Frequency   Relative frequency
        1       3           0.25
        2       3           0.25
        3       2           0.167
        4       2           0.167
        5       2           0.167

    ii. Cutpoint grouping: each class is an interval, which means each class has a lower and an upper cutpoint.
    iii. Rules: all classes have the same width, the grouping must cover all values, each interval has a lower and an upper class cutpoint, and each value must belong to exactly one class.
      - width = (maximum - minimum) / (number of classes)
    iv. Example: 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2. Number of classes = 2. Width = (5 - 1) / 2 = 2.

        Class   Frequency   Relative frequency
        1-3     6           0.50
        3-5     6           0.50

      - For these counts to hold, the shared cutpoint 3 is counted in the second class only, and the last class includes its upper cutpoint 5 so that the maximum belongs to a class.
  E. Stem-and-leaf plot
    i. A table that shows the data in an organized manner. It is not good for large data sets.
    ii. Rules: data must be in order from least to greatest, every plot must have a key, and for two-line plots the leaves must be divided evenly between the two lines (0-4 and 5-9).
    iii. One line per stem: each value is split into two parts, a stem and a leaf. Each leaf is a single digit, and each stem takes up one line.
    iv. Two lines per stem: each stem value takes up two lines.
      - Example: 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 65, 67, 70. Key: 7|0 = 70

        Stem | Leaf
        5    | 0 1 2 3 4
        5    | 5 6 7 8 9
        6    | 0 1 2
        6    | 5 7
        7    | 0
        7    |

  F. Histogram: a graph made up of vertical bars. The height of each bar represents the frequency or relative frequency, and the width represents the class.
    i. Fitted curve: a curve drawn by connecting the tops of the rectangles in your histogram to show its overall shape.
    ii. Choosing the class width is important because it can drastically affect the way your data is shown.
  G. Mean: the total of the values divided by the number of values:
     $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \dots + x_n}{n}$
    i. i = 1 means the sum starts with the first value in the data set.
    ii. n = number of observations.
    iii. x_n = final value in the data set.
    iv. It can be computed with summation notation or by summing up the entire data set and then dividing by the total number of observations.
    v. Not robust, because it changes when there are extreme values in the data set. It can only be used with numerical data sets.
  H. Median: the middle measurement of a data set.
    i. To find the median, the data set must first be put in ascending order.
    ii. For a data set with an odd number of values, the median is the middle value.
    iii. For a data set with an even number of values, the median is the average of the two middle values.
    iv. Robust, because it changes little when extreme values are present. It can only be used with numerical data sets.
  I. Modality: describes the distribution of a data set by the number of peaks in its curve.
    i. Unimodal distribution: a distribution with only one peak.
    ii. Bimodal distribution: has two peaks.
    iii. Multimodal distribution: has more than one peak; a bimodal distribution is the two-peak case.
  J. Mode: the measurement or measurements that occur most often in a data set.
    i. Robust, because it does not change with extreme values. It can be used with both categorical and numerical data sets.
  K. Skewness: describes how far a distribution is from symmetric.
    i. Symmetric distribution: the left and right sides mirror each other. Also, the mean = the median.
    ii. Left-skewed distribution: the left side is elongated but not the right. The mean is less than the median because the unusually small values in the left tail pull the mean down.
    iii. Right-skewed distribution: the right side is elongated but not the left. The mean is greater than the median because the unusually large values in the right tail pull the mean up.
    iv. Steps for finding quartiles (used in 2.4): first, arrange the data in ascending order. Second, find the median of the data set, which is Q2. Third, split the data set into two equal halves. Fourth, find the median of the first half and of the second half, which gives Q1 and Q3.

III.) 2.4: section covering quartiles, boxplots, and five-number summaries.
  A. Quartiles: divide the data set into quarters (Q1, Q2, Q3).
    i. Q2 is simply the median of the entire data set.
    ii. Q1 is the median of the first half of the data set.
    iii. Q3 is the median of the second half of the data set.
    iv. Interquartile range (IQR) = Q3 - Q1 = the spread of the middle 50% of the data set.
    v. Outlier: a value that is distant from the rest of the data set. It may be an error, so it should be checked. Values above the upper fence or below the lower fence are flagged as outliers.
      - Upper fence = Q3 + 1.5 × IQR.
      - Lower fence = Q1 - 1.5 × IQR.
  B. Five-number summary
    i. It consists of the quartiles Q1, Q2, and Q3, plus the minimum and maximum values of the data set.
    ii. These values make up a box-and-whisker plot.
  C. Box-and-whisker plots
    i. The box runs from Q1 to Q3, with a line inside it at the median (Q2).
    ii. The whiskers are lines extending from the box out to the minimum and the maximum of the data set.
    iii. They show outliers more clearly than histograms do.
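
The single-value grouping example in II.D can be checked with a few lines of Python. This is only a sketch: the function name `frequency_table` is made up for illustration, and the data is the example list from the notes.

```python
from collections import Counter

# Example data from the single-value grouping example in the notes.
data = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2]

def frequency_table(values):
    """Return (class, frequency, relative frequency) rows, sorted by class.

    Hypothetical helper name, not from the course material.
    """
    counts = Counter(values)      # frequency of each distinct value
    total = len(values)           # total number of observations
    return [(value, count, count / total)
            for value, count in sorted(counts.items())]

table = frequency_table(data)
for value, count, rel in table:
    print(f"{value:<6}{count:<10}{rel:.3f}")
```

Multiplying each relative frequency by the total (12) gives back the frequency, which is the conversion described under the bar chart rules.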
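
The cutpoint grouping example in II.D can be sketched the same way. The assumptions here are not spelled out in the notes: each class includes its lower cutpoint but not its upper one, except the last class, which also includes the maximum. The name `cutpoint_frequencies` is hypothetical, and the sketch assumes the data has at least two distinct values.

```python
def cutpoint_frequencies(values, num_classes):
    """Group values into equal-width classes and count each class.

    Assumption: a class covers [lower, upper), and the last class also
    includes the maximum so that every value belongs to exactly one class.
    """
    low, high = min(values), max(values)
    width = (high - low) / num_classes          # width = (max - min) / classes
    counts = [0] * num_classes
    for v in values:
        index = int((v - low) // width)         # which class v falls in
        index = min(index, num_classes - 1)     # put the maximum in the last class
        counts[index] += 1
    cutpoints = [low + i * width for i in range(num_classes + 1)]
    return cutpoints, counts

data = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2]
cutpoints, counts = cutpoint_frequencies(data, 2)
relative = [c / len(data) for c in counts]
print(cutpoints, counts, relative)
```

With two classes this reproduces the table in the notes: cutpoints 1, 3, 5 and a frequency of 6 (relative frequency 0.50) in each class.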
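
The two-lines-per-stem plot in II.E can also be generated from the example data. This is a minimal sketch that assumes non-negative whole numbers with the last digit as the leaf; the function name `stem_and_leaf_two_lines` is made up for illustration.

```python
def stem_and_leaf_two_lines(values):
    """Build a stem-and-leaf plot with two lines per stem.

    Assumption: values are non-negative integers; the leaf is the last
    digit, leaves 0-4 go on the first line and 5-9 on the second.
    """
    rows = {}
    for v in sorted(values):                    # data must be in ascending order
        stem, leaf = divmod(v, 10)
        half = 0 if leaf < 5 else 1
        rows.setdefault((stem, half), []).append(leaf)

    lines = []
    for stem in range(min(values) // 10, max(values) // 10 + 1):
        for half in (0, 1):
            leaves = " ".join(str(leaf) for leaf in rows.get((stem, half), []))
            lines.append(f"{stem} | {leaves}".rstrip())
    return lines

data = [50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 65, 67, 70]
plot = stem_and_leaf_two_lines(data)
print("Key: 7 | 0 = 70")
print("\n".join(plot))
```

Empty lines are kept (the second line for stem 7 has no leaves) so that gaps in the data stay visible.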
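
The claims in II.G through II.J about robustness can be illustrated with Python's standard `statistics` module. The small data set `scores` and the extreme value 500 are invented for this sketch and do not come from the lecture.

```python
import statistics

# Hypothetical data, chosen only to illustrate robustness.
scores = [50, 51, 52, 53, 54]
with_extreme = scores + [500]          # add one extreme value

mean_before = statistics.mean(scores)            # total / number of values
median_before = statistics.median(scores)        # middle value (odd count)
mean_after = statistics.mean(with_extreme)       # pulled far upward by 500
median_after = statistics.median(with_extreme)   # average of two middle values

# The mode also works for repeated values; multimode returns every
# value tied for the highest frequency.
modes = statistics.multimode([1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2])

print(mean_before, median_before)
print(mean_after, median_after)
print(modes)
```

The mean moves from 52 to about 126.7 when the extreme value is added, while the median only moves from 52 to 52.5, which is what "not robust" and "robust" mean in the notes.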
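
The quartile steps and the fence rules in section III can be sketched as follows. One assumption goes beyond the notes: when the number of values is odd, the median is left out of both halves. Software packages often use other quartile conventions, so their results may differ slightly. The function names are hypothetical, and the value 90 is invented to show an outlier.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, Q2, Q3, maximum) using the median-of-halves method.

    Assumption: for an odd number of values, the median itself is
    excluded from both halves. Needs at least two values.
    """
    data = sorted(values)                 # step 1: ascending order
    n = len(data)
    half = n // 2
    lower = data[:half]                   # first half
    upper = data[half:] if n % 2 == 0 else data[half + 1:]   # second half
    return data[0], median(lower), median(data), median(upper), data[-1]

def find_outliers(values):
    """Return the values beyond the lower or upper fence."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1                         # spread of the middle 50%
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

data = [50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 65, 67, 70]
summary = five_number_summary(data)
print(summary)
print(find_outliers(data))          # no value is beyond the fences
print(find_outliers(data + [90]))   # the invented value 90 is flagged
```

For the stem-and-leaf data, Q1 = 53.5 and Q3 = 61.5, so IQR = 8 and the fences are 41.5 and 73.5; every value lies inside them.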

