

# How do you know if a distribution is symmetric or skewed?

##### Description: Week Two Notes

Stats 401 Week 2

## How do you know if a distribution is symmetric or skewed?

Relative frequency = number of times a category appears ÷ total number of items in the data set (like finding a percent without multiplying by 100 at the end)
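A minimal Python sketch of the relative frequency calculation. The eye color data and the function name `relative_frequencies` are made up for illustration and are not from the notes.

```python
from collections import Counter


def relative_frequencies(items):
    """Return {category: count / total number of observations}."""
    counts = Counter(items)
    total = len(items)
    return {category: count / total for category, count in counts.items()}


# Hypothetical sample: eye colors of 10 students.
eye_colors = ["brown"] * 5 + ["blue"] * 3 + ["green"] * 2
freqs = relative_frequencies(eye_colors)
print(freqs)  # {'brown': 0.5, 'blue': 0.3, 'green': 0.2}
```

The relative frequencies always add up to 1, which is a quick way to check the work.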

For the median, if there is an even number of values, add the two middle terms and then divide by 2.

EX: 1 2 3 4 5 6. The two middle terms are 3 and 4, so 3 + 4 = 7. Then 7/2 = 3.5 (the average of those two numbers).

If there isn’t a mode (1 2 3 4 5 6), write “NO MODE”, not “The mode is zero”, because that would imply that the number 0 is in the data and appears the most.
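The median and "NO MODE" rules can be sketched in Python as below. The helper `mode_or_none` is a hypothetical name, and it assumes "no mode" means every value appears exactly once, as in the example; some textbooks treat ties differently.

```python
from collections import Counter
from statistics import median


def mode_or_none(values):
    """Return the most frequent value(s), or None when nothing repeats.

    Assumes a non-empty list. None stands for "NO MODE".
    """
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return None
    return [value for value, count in counts.items() if count == highest]


data = [1, 2, 3, 4, 5, 6]
print(median(data))                # 3.5 (average of the two middle terms)
print(mode_or_none(data))          # None, i.e. NO MODE
print(mode_or_none([1, 2, 2, 3]))  # [2]
```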

When the mean = median = mode, the distribution is symmetric (bell shaped) and the peak of the curve is in the middle.

If median > mean, the distribution is left skewed.

If median < mean, the distribution is right skewed.
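A small Python sketch of the mean versus median comparison. The function name and the sample lists are invented for illustration, and real data rarely has the mean exactly equal to the median, so treat this as a rule of thumb rather than a formal test.

```python
from statistics import mean, median


def skew_direction(values):
    """Compare the median with the mean to guess the shape."""
    m = mean(values)
    med = median(values)
    if med > m:
        return "left skewed"
    if med < m:
        return "right skewed"
    return "symmetric"


print(skew_direction([1, 2, 3, 4, 5]))       # symmetric (mean 3, median 3)
print(skew_direction([1, 2, 3, 4, 100]))     # right skewed (mean 22, median 3)
print(skew_direction([1, 97, 98, 99, 100]))  # left skewed (mean 79, median 98)
```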

## Which measure of center is not resistant to extreme values?

APPROPRIATE MEASURE OF CENTER

1. Check whether the data is categorical or numerical.

a. Categorical

i. Use the mode.

b. Numerical

i. Check whether there are extremes (numbers that are very far from the rest of the data; they may or may not be outliers).

∙ If yes, use the median.

∙ If no, use the mean.

Don’t use the mode to find the center of numerical values because it doesn’t take all of the values into account. When there are extreme observations, don’t use the mean, because the average is pulled toward the extreme (which can be a very high or very low number).
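The decision steps for picking a measure of center could be sketched like this. The function and its `categorical` and `has_extremes` flags are hypothetical; the caller decides whether the data has extremes, since the notes don't give an exact cutoff.

```python
from statistics import mean, median, multimode


def measure_of_center(values, categorical=False, has_extremes=False):
    """Return (name of measure, value) following the checklist above."""
    if categorical:
        return ("mode", multimode(values))
    if has_extremes:
        return ("median", median(values))
    return ("mean", mean(values))


# Made-up examples.
print(measure_of_center(["red", "blue", "red"], categorical=True))
# ('mode', ['red'])
print(measure_of_center([30, 32, 35, 31, 400], has_extremes=True))
# ('median', 32)
print(measure_of_center([30, 32, 35, 31]))
# ('mean', 32)
```

In the second example the mean would be 105.6, far from most of the values, which is why the median is the better choice there.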

A percentile tells you where you stand relative to others. If a person is in the 80th percentile, then 80% of people are below them and they are in the top 20%. Percentiles are measured out of 100 because they are percents.
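A quick Python sketch of the percentile idea, assuming "percentile" here means the percent of values strictly below a given score (other definitions exist). The test scores are made up.

```python
def percentile_rank(values, score):
    """Percent of values that are strictly below the given score."""
    below = sum(1 for value in values if value < score)
    return 100 * below / len(values)


# Hypothetical test scores for 10 students.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
rank = percentile_rank(scores, 95)
print(rank)        # 80.0 -> 80% of scores are below 95
print(100 - rank)  # 20.0 -> a score of 95 is in the top 20%
```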

## What percent of data will lie within 2 standard deviation of the mean?


QUARTILES: the quartiles divide the data into four parts, each containing 25% of the data.

Q1 Q2 Q3

Lower Half Upper Half

Q2 is the median

The median splits the data into two parts, a lower half and an upper half. Q1 is the median of the lower half and Q3 is the median of the upper half.

Ex: 15, 18, 56, 78, 26, 43, 29

First rearrange the data from lowest to highest: 15, 18, 26, 29, 43, 56, 78. Then find the median: 29. The lower half is (15, 18, 26), whose median is 18, and the upper half is (43, 56, 78), whose median is 56. So Q1 = 18, Q2 = 29, and Q3 = 56.
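A minimal Python sketch of finding the quartiles, assuming the common textbook convention that Q1 and Q3 are the medians of the lower and upper halves, with the overall median left out of both halves when the count is odd. Calculators and software sometimes use other conventions and can give slightly different values. The function name `quartiles` is just for illustration.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so it is in neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)


q1, q2, q3 = quartiles([15, 18, 56, 78, 26, 43, 29])
print(q1, q2, q3)  # 18 29 56
```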

Five Number Summary

(Min, Q1, Q2, Q3, Max), in that order, are used to make a box plot (box-and-whisker plot).

First, draw an evenly scaled number line. Second, plot Q1, Q2, and Q3 (from above) along with the max and min. Then draw vertical lines through Q1, Q2, and Q3 and connect those lines to make a box. Finally, draw horizontal lines from the box out to the max and min.


[Figure: box plot drawn on a number line scaled from 0 to 80 in steps of 10]
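The five number summary can be sketched in Python as follows, assuming Q1 and Q3 are taken as the medians of the lower and upper halves (other quartile conventions exist) and that there are at least two data values. The function name is hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (Min, Q1, Q2, Q3, Max) using the median-of-halves method."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]   # values below the middle
    upper = data[-half:]  # values above the middle
    return (data[0], median(lower), median(data), median(upper), data[-1])


summary = five_number_summary([15, 18, 56, 78, 26, 43, 29])
print(summary)  # (15, 18, 29, 56, 78)
```

These five values are exactly the points plotted on the number line when drawing the box plot.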

Measures of Variation (spread of data)

Range is Max – Min

Standard Deviation

σ = population standard deviation; s = sample standard deviation

Outliers: How to Find an Outlier

Interquartile range (IQR) is Q3 − Q1. With Q1 = 18 and Q3 = 56 (the medians of the lower and upper halves of the example data 15, 18, 26, 29, 43, 56, 78): 56 − 18 = 38

Lower Limit = Q1 − (1.5)(IQR): 18 − (1.5)(38) = −39

Upper Limit = Q3 + (1.5)(IQR): 56 + (1.5)(38) = 113

Any value below the lower limit or above the upper limit is an outlier. Here the min (15) and max (78) are both inside the limits, so there are no outliers.
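A short Python sketch of the 1.5 × IQR outlier check. The function names are invented for illustration, and the quartiles are passed in rather than computed; the values Q1 = 18 and Q3 = 56 used here assume quartiles are the medians of the lower and upper halves of the example data.

```python
def outlier_limits(q1, q3):
    """Return (lower limit, upper limit) from the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Return every value that falls outside the limits."""
    low, high = outlier_limits(q1, q3)
    return [value for value in values if value < low or value > high]


data = [15, 18, 26, 29, 43, 56, 78]
low, high = outlier_limits(18, 56)
print(low, high)                         # -39.0 113.0
print(find_outliers(data, 18, 56))       # [] -> no outliers
# Hypothetical extra value of 200 added to show a detected outlier.
print(find_outliers(data + [200], 18, 56))  # [200]
```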

Standard Deviation

Variance

There are two equations used to find Variance

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

Sample variance (s²) = the sum of (x minus the mean)², divided by (the number of units minus 1).

| x | x − x̄ | (x − x̄)² |
|---|---|---|
| 1 | −1 | 1 |
| 2 | 0 | 0 |
| 3 | 1 | 1 |
| Total: 6 | 0 | 2 |

1. Add all the numbers in the x column.

2. Find the mean, which is 6/3 = 2.

3. Fill in the next column by subtracting the mean (2) from each x.

4. Square your answer from the previous column.

5. Plug the total of the last column (2) into the numerator of the equation and then divide by 2, because the denominator says to divide by the number of units minus 1 (3 − 1 = 2).

6. So s² = 1 and then s = 1, because the square root of 1 is just 1.

The second equation is:

$$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$$

Sample variance (s²) = the sum of (x squared) minus ((the sum of x)² divided by the number of units), then all divided by the number of units minus 1.

| x | x² |
|---|---|
| 2 | 4 |
| 4 | 16 |
| 6 | 36 |
| 8 | 64 |
| 10 | 100 |
| Total: 30 | 220 |

3. Plug in:

$$s^2 = \frac{220 - \frac{(30)^2}{5}}{5 - 1} = \frac{220 - 180}{4} = 10$$

So s² (sample variance) = 10 and s (sample standard deviation) = √10 ≈ 3.162.
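Both sample variance equations can be checked with a few lines of Python. This is only a sketch; the function names `variance_definitional` and `variance_shortcut` are made up here, and the two data sets are the ones from the worked examples.

```python
from math import sqrt


def variance_definitional(xs):
    """Sum of (x - mean)^2, divided by (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def variance_shortcut(xs):
    """(Sum of x^2 minus (sum of x)^2 / n), divided by (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)


print(variance_definitional([1, 2, 3]))  # 1.0
s2 = variance_shortcut([2, 4, 6, 8, 10])
print(s2)                                # 10.0
print(round(sqrt(s2), 3))                # 3.162 (sample standard deviation)
```

Both equations give the same answer for the same data, so either one can be used; the second just avoids computing each deviation from the mean.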

Interpretation of Data Using the Mean and Standard Deviation

Empirical Rule: for a bell-shaped distribution

1. About 68% of the data lie within one standard deviation to either side of the mean.

2. About 95% of the data lie within two standard deviations to either side of the mean.

3. About 99.7% of the data lie within three standard deviations to either side of the mean.

(https://saylordotorg.github.io/text_introductory-statistics/s06-05-the-empirical-rule-and-chebysh.html)
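A Python sketch of the Empirical Rule. The mean of 100 and standard deviation of 15 are made-up numbers, and the function names are hypothetical. The second function uses the standard normal curve from Python's `statistics.NormalDist` as the model of a bell shape to show where the 68%, 95%, and 99.7% figures come from.

```python
from statistics import NormalDist


def empirical_intervals(mean, sd):
    """Return {k: (mean - k*sd, mean + k*sd)} for k = 1, 2, 3."""
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}


def normal_share_within(k):
    """Share of a normal distribution within k standard deviations."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)


intervals = empirical_intervals(100, 15)
print(intervals)  # {1: (85, 115), 2: (70, 130), 3: (55, 145)}
for k in (1, 2, 3):
    print(k, round(normal_share_within(k), 4))  # about 0.6827, 0.9545, 0.9973
```

So with these made-up numbers, about 95% of the data would lie between 70 and 130.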

Chebyshev’s Theorem

For numerical data only (the distribution does not need to be bell shaped)

At least 75% of the data lie within two standard deviations to either side of the mean.

At least 89% of the data lie within three standard deviations to either side of the mean.

A z-score is a measure of how many standard deviations above or below the population mean a raw score is.
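A Python sketch of both ideas. The 75% and 89% figures match the usual statement of Chebyshev's theorem, at least 1 − 1/k² of the data within k standard deviations; that formula is not spelled out in these notes and is brought in here as the standard version. The z-score example uses a made-up mean of 100 and standard deviation of 15, and the function names are hypothetical.

```python
def chebyshev_lower_bound(k):
    """Minimum share of data within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


def z_score(x, mean, sd):
    """Number of standard deviations that x is above (+) or below (-) the mean."""
    return (x - mean) / sd


print(chebyshev_lower_bound(2))            # 0.75
print(round(chebyshev_lower_bound(3), 3))  # 0.889, i.e. about 89%
print(z_score(130, 100, 15))               # 2.0 (two deviations above the mean)
print(z_score(85, 100, 15))                # -1.0 (one deviation below the mean)
```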
