Day 2

by: Heli Patel



These 4 pages of class notes were uploaded by Heli Patel on Sunday, June 19, 2016. They belong to course 3339 (Statistics for the Sciences, Math) at the University of Houston, taught by Prof. C Poliak in Summer 2016.



Date Created: 06/19/16
● Population Variance
  ○ If N is the number of values in a population with mean µ, and $x_i$ represents each individual value in the population, then the population variance is found by:
  ○ $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
  ○ and the population standard deviation is the square root, $\sigma = \sqrt{\sigma^2}$.
  ○ Most of the time we are working with a sample instead of a population. So the sample variance is found by: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ and the sample standard deviation is the square root, $s = \sqrt{s^2}$, where n is the number of observations (samples), $x_i$ is the value for the i-th observation and $\bar{x}$ is the sample mean.
  ○ By hand: find the mean, square each score, compute 1/(n − 1) * (sum of the squared scores − n * mean^2) to get the variance, then take the square root of the answer to get the sd.
  ○ If we change the data set by adding or subtracting a constant, the mean changes and the sd and variance remain the same.
  ○ If every value is multiplied or divided by a constant, the mean, sd, and variance all change.
● Mean and sd of a linear transformation of X
  ○ y = a + bx, where a and b are constants
  ○ mean(y) = a + b(mean(x))
  ○ sd(y) = |b|(sd(x))
  ○ var(y) = b^2(var(x))
    ■ Example: mean(x) = 3, sd(x) = 0.5
    ■ y = 3 + 2x, so mean(y) = 3 + 2(3) = 9
    ■ sd(y) = 2(0.5) = 1
● The function for the sample standard deviation in R is sd(dataset name$variable name)
● Coefficient of Variation
  ○ This is used to compare the variation between two groups.
  ○ The coefficient of variation (cv) is the ratio of the standard deviation to the mean.
  ○ cv = sd/mean
  ○ A smaller ratio indicates less variation in the data.
● Percentiles
  ○ The pth percentile of data is the value such that p percent of the observations fall at or below it.
  ○ Percentiles are used to report spread when the median is our measure of center.
  ○ To find the measurement that has a desired percentile rank, the 100P-th percentile, take the measurement whose rank (position in the sorted list) is nP + 0.5, where n represents the number of data values in the sample.
  ○ R code for the five-number summary: > fivenum(price)
● IQR (interquartile range)
  ○ IQR = Q3 − Q1
● Outlier
  ○ An outlier is an observation that is "distant" from the rest of the data.
  ○ Outliers can occur by chance or by measurement errors. Any point that falls outside the interval from Q1 − 1.5(IQR) to Q3 + 1.5(IQR) is considered an outlier.
● GRAPHS
  ○ R code
    ■ For bar graph: plot(dataset name$variable name)
    ■ For pie chart:
      > counts <- table(shoes$Brand)
      > pie(counts)
  ○ Dotplots
    ■ Made by putting dots above the values listed on a number line.
  ○ Stem and leaf plot
    ■ 1. Separate each observation into a stem, consisting of all but the final (rightmost) digit, and a leaf, the final digit. Stems may have as many digits as needed, but each leaf contains only a single digit.
      2. Write the stems in a vertical column with the smallest at the top, and draw a vertical line at the right of this column.
      3. Write each leaf in the row to the right of its stem, in increasing order out from the stem.
    ■ R code: stem(dataset name$variable name)
  ○ Histograms
    ■ Bar graph for quantitative variables. Values of the variable are grouped together.
    ■ The width of the bar represents an interval of values (range of numbers) for that variable.
    ■ The height of the bar represents the number of cases within that range of values.
    ■ Making the classes:
      1. Divide the range of data into classes of equal width. For example, the prices of the basketball shoes range from $40 to $250. We can use a width of $20 for the classes. Thus the classes are: 40 ≤ price < 60, 60 ≤ price < 80, . . . , 240 ≤ price < 260. Be sure to specify the classes precisely so that each individual price falls into exactly one class and all of the prices are counted.
      2. Count the number of shoe prices in each class.
    ■ Drawing the histogram:
      1. Mark on the horizontal axis the scale for the variable whose distribution you are displaying.
      2. The vertical axis contains the scale of the counts.
      3. Each bar represents a class. The base of the bar covers the width of the class, and the bar height is the class count. There is no horizontal space between bars unless a class is empty, so that its bar has height zero.
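The class-making and counting steps above can be sketched in R as follows. This is only an illustration: the price vector is made-up data (not the shoes data set used in lecture), and the $20 class width is taken from the example in the notes. Setting right = FALSE in cut() gives classes of the form 40 ≤ price < 60.

```r
# Hypothetical shoe prices in dollars (illustrative values only)
price <- c(40, 55, 60, 79, 80, 120, 135, 150, 199, 250)

# Class boundaries 40, 60, ..., 260, i.e. classes of equal width 20
breaks <- seq(40, 260, by = 20)

# right = FALSE makes each class closed on the left and open on the right,
# so a price of 60 falls in [60,80) and every price lands in exactly one class
classes <- cut(price, breaks = breaks, right = FALSE)

# Count the number of prices in each class (empty classes get a count of 0)
counts <- table(classes)
print(counts)

# hist() with the same breaks uses these counts as bar heights;
# plot = FALSE returns the histogram object without drawing it
h <- hist(price, breaks = breaks, right = FALSE, plot = FALSE)
print(h$counts)
```

Checking that the counts add up to the number of observations is a quick way to confirm that no price was left out of the classes.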
    ■ R code: hist(dataset name$variable name)
  ○ Boxplot
    ■ A graph of the five-number summary. A central box spans the quartiles. A line inside the box marks the median. Lines extend from the box out to the smallest and largest observations. Asterisks represent any values that are considered to be outliers. Boxplots are most useful for side-by-side comparison of several distributions.
    ■ R code: boxplot(dataset name$variable name)
  ○ Cumulative Frequency Polygon
    ■ Plot a point above each upper class boundary at a height equal to the cumulative frequency of the class.
    ■ Connect the plotted points with line segments. A similar graph can be used with the cumulative percents.
● Distribution
  ○ The distribution of a variable tells us what values it takes and how often it takes these values based on the individuals.
  ○ The distribution of a variable can be shown through tables, graphs, and numerical summaries.
  ○ There are four main characteristics to describe a distribution:
    ■ 1. Shape
      ○ skewed to the right if the right side (higher values) of the graph extends much farther out than the left side.
      ○ skewed to the left if the left side (lower values) of the graph extends much farther out than the right side.
      ○ uniform if the graph is at the same height (frequency) from the lowest to the highest value of the variable.
    ■ 2. Center: the value with roughly half the observations taking smaller values and half taking larger values.
    ■ 3. Spread: from the graphs we describe the spread of a distribution by giving the smallest and largest values.
    ■ 4. Outliers: individual values that fall outside the overall pattern.
  ○ The distribution of a categorical variable lists the categories and gives either the count or the percent of cases that fall in each category.
  ○ One way to show it is a frequency table that displays the different categories and then the count or percent of cases that fall in each category. Then we look at the graphs (bar or pie) to determine the distribution of a categorical variable.
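The numerical summaries covered in these notes can be checked with a short R sketch. Everything here is illustrative: the vectors x and brand are made-up data, and a = 3, b = 2 are taken from the linear transformation example. Note that fivenum() returns hinges, which can differ slightly from quartiles computed by other rules, so the fences below are one reasonable version of the 1.5(IQR) rule rather than the only one.

```r
# Hypothetical sample (illustrative values only)
x <- c(2, 4, 4, 5, 7, 9, 10, 12, 30)
n <- length(x)

# Sample variance from the definition and from the by-hand shortcut
var_def      <- sum((x - mean(x))^2) / (n - 1)
var_shortcut <- (sum(x^2) - n * mean(x)^2) / (n - 1)
s <- sqrt(var_def)                 # agrees with sd(x)

# Coefficient of variation: sd divided by mean
cv <- sd(x) / mean(x)

# Linear transformation y = a + b*x
a <- 3
b <- 2
y <- a + b * x
mean_check <- a + b * mean(x)      # agrees with mean(y)
sd_check   <- abs(b) * sd(x)       # agrees with sd(y)

# Five-number summary: min, lower hinge, median, upper hinge, max
five <- fivenum(x)
q1 <- five[2]
q3 <- five[4]
iqr <- q3 - q1

# 1.5*IQR fences and the points that fall outside them
lower <- q1 - 1.5 * iqr
upper <- q3 + 1.5 * iqr
outliers <- x[x < lower | x > upper]
print(outliers)                    # 30

# Frequency table for a hypothetical categorical variable
brand <- c("Nike", "Adidas", "Nike", "Jordan", "Nike")
counts   <- table(brand)               # count in each category
percents <- 100 * prop.table(counts)   # percent in each category
```

Adding the constant a shifts the mean but leaves the sd unchanged, while multiplying by b scales both, which matches the rules stated in the notes.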

