Data Confidence of Distributions with Mean and Standard Deviations

by: Kaiyana Dudley


About this Document

These notes go over how to compare data sets using the true value and variance. There are distribution test equations and charts to refer to.

This 8 page Class Notes was uploaded by Kaiyana Dudley on Wednesday September 21, 2016. The Class Notes belongs to CHEM 241 at University of Michigan taught by Stephen Maldonado in Fall 2016. Since its upload, it has received 5 views. For similar materials see Intro to Chemistry Analysis in Chemistry at University of Michigan.



Date Created: 09/21/16
Week 2: Data Confidence of Distributions Using Mean and Standard Deviations

What is measurement distribution, and why is it important?
• In analytical chemistry experiments, we are trying to find the true value of the quantity being observed
• A measurement distribution arises because a number is reported as a true value plus or minus some error
    o We can never know the true value exactly, but we can measure in a way that gets us close
    o The wider the range of measured values, the less precise the result
• All quantitative measurements are, in effect, games of chance
    o The goal is to get close enough to the true value that we can be confident in our data
• In this sense, a measurement distribution is a game of probability, like flipping coins
    o If we flip a coin several times, we can find the probability of getting a given number of heads by listing all of the possible combinations and counting how many of them contain that number of heads. The count divided by the total number of combinations is the probability of success.

Is there a more efficient way of finding the probability of the distribution?
• With measurement distributions, the chances of success and failure most likely won't be 50:50 as they are for a coin flip.
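The coin-flip counting described above can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the function name `heads_probability` and the unfair-coin probability of 0.25 are assumptions made up for the example. It lists every possible sequence of flips, keeps the ones with the wanted number of heads, and adds up their probabilities, which also shows what changes when the odds are not 50:50.

```python
from itertools import product


def heads_probability(n_flips: int, n_heads: int, p_head: float = 0.5) -> float:
    """Probability of exactly n_heads heads in n_flips flips, by listing every sequence."""
    total = 0.0
    for sequence in product("HT", repeat=n_flips):
        heads = sequence.count("H")
        if heads == n_heads:
            # Probability of this one specific sequence of heads and tails.
            total += p_head ** heads * (1 - p_head) ** (n_flips - heads)
    return total


if __name__ == "__main__":
    # Fair coin: 6 of the 16 possible sequences of 4 flips contain exactly 2 heads.
    print(heads_probability(4, 2))
    # Hypothetical unfair coin (p_head = 0.25 is an assumed value for illustration).
    print(heads_probability(4, 2, 0.25))
```

Listing every sequence becomes slow very quickly, since the number of sequences doubles with each extra flip, which is why a closed-form equation is preferred for larger numbers of trials.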
• So, for bigger distributions, we use this equation for the probability of success or failure:

$$P(x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x}$$

    o This is the binomial distribution. It gives the probability of observing one particular outcome x times in n trials or observations when each trial has the same probability p of producing that outcome
        § Probabilities correspond to areas under the distribution curve
        § The variable p is the probability of success in a single trial
        § The variable x is the number of successes (the value that we're looking for)
        § The variable n is the total number of trials
        § Recall that n!, when n = 4, is 4*3*2*1
    o The average and standard deviation of the binomial distribution are:

$$\mu = np, \qquad \sigma = \sqrt{np(1-p)}$$

• The Gaussian (or normal) distribution is the limiting form of the binomial distribution when n is very large and the probability of success p is not close to zero:

$$y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^{2}/(2\sigma^{2})}$$

    o It describes the same kind of probability as before (probability = area under the curve), but the factorial and power terms of the binomial equation are replaced by an exponential that depends only on the mean µ and the standard deviation σ
    o The central limit theorem says that, given sufficiently large random samples from a distribution with a finite variance, the averages of those samples will be approximately equal to the average of the whole distribution (we will find out more after the first example)
        § In other words, the sampling distribution of the average comes out normal, centered at the distribution average, if the samples are chosen randomly, even if the population itself is not normally distributed
    o Normal distributions are Gaussian distributions with a total area of 1
        § The curve is perfectly symmetrical around the average value
        § The total area is 1 because it covers all of the possibilities (100% of the distribution)
    o A chart of tabulated values lists the areas calculated from the Gaussian equation, measured from z = 0 out to a given z. (This is helpful because you can see which z value corresponds to which area without manual calculation!)
    o The following example uses an average temperature of 98.2 ± 0.7 and works at z = 3
        § Mean µ = 98.2, standard deviation σ = 0.7 (these should always be given)

$$z = \frac{x - \mu}{\sigma} = \frac{100.3 - 98.2}{0.7} = 3$$

        § The z equation lets us plug in the value we're looking for (here T = 100.3) and express it as a number of standard deviations from the mean. We then look up areas to find the probability. There are 3 ranges to go through, bounded by z = -infinity, z = 0, z = 3, and z = infinity:
            1. -infinity < z < 0 (area between z = -infinity and z = 0)
                i. This area is always 0.5, half of the total, because the curve is symmetrical about z = 0 (the chart reads 0 at z = 0 only because it measures area starting from the mean)
            2. 0 < z < 3 (area between z = 0 and z = 3)
                i. At z = 0, we are at the average value of T = 98.2
                ii. At z = 3, the area from the chart is 0.49865 (as z increases toward infinity, the area approaches 0.5)
                    a. Because the curve is symmetrical, the areas out to z = 0.7 and z = -0.7 are the same on either side of the axis
            3. 3 < z < infinity (area between z = 3 and z = infinity)
                i. Measured from z = 0 out to z = infinity, the area is 0.5
        § Now we find the area for z > 3 (from 3 < z < infinity). Subtract the first two areas from the total area to find the probability of a patient having T > 100.3: 1 - 0.5 - 0.49865 = 0.00135, which is about 0.135 out of 100 patients
            • The doctor should expect fewer than 1 patient in 100 to have a temperature greater than or equal to 100.3 degrees Fahrenheit
        § Although the mean and standard deviation of the whole distribution will always be given, we can also measure the mean and standard deviation of a sample of the distribution.
            • Here are the sample equations:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}{n-1}}$$

    o Remember, the Gaussian equation is really just a way to make a very good estimate of what the probability would be

So how can we be sure that our measurements are accurate?
• This is why we take a sample, rather than trying to account for the entire distribution
• Again, the central limit theorem says that, given sufficiently large random samples from a distribution with a finite variance, the averages of those samples will be approximately equal to the average of the whole distribution
    o In other words, the sampling distribution of the average comes out normal, centered at the distribution average, if the samples are chosen randomly, even if the population itself is not normally distributed
• Before looking at the data, we presume a null hypothesis: two samples are essentially the same, with no meaningful differences
    o Two values are the same within random variation (standard deviation)
        § The only differences are due to chance and random error
        § These random variations are treated as independent of one another
    o Basically, two samples should both sit at the average, but there may be some differences or deviation
    o We test this using the distribution equations
• We need to report the average of a small number of measurements
    o For a finite number of measurements, the validity of the Gaussian/normal distribution is not clear
    o The measured sample average and standard deviation may not reflect the true distribution average and standard deviation
• Confidence intervals estimate how closely the sample mean and standard deviation match the distribution mean and standard deviation
    o We use Student's t to figure this out:

$$\mu = \bar{x} \pm \frac{t\,s}{\sqrt{n}}$$

        § This applies the Gaussian idea to cases that don't have an infinite number of measurements
        § A chart of t values lists the confidence levels for each number of degrees of freedom (n - 1)
    o If we made n measurements to find the sample mean x̄ and standard deviation s, the interval that would include the true population mean µ (whose value we do not know) with 95% confidence (i.e. intervals built this way would contain µ 95% of the time) would be defined by t95
    o We're testing whether there is some systematic error causing differences (not why it exists, just whether it exists)
    o We reject the null hypothesis at the X% confidence level if there is less than a (100 - X)% chance that the measured difference could come from random error (standard deviation) alone. We can do this 3 different ways:

(a) Compare the value from one set of measurements with the average (single t test)
    Calculate t in relation to the mean (n = 5 measurements), then find the 98% confidence level on the table (n - 1 = 4 degrees of freedom)
    1. Blood cell example: n = 5 days of measurements
    2. Calculate the average of this sample, then plug it into the single t test to calculate t
    3. Compare this t value with the corresponding entry in the table (4.28 is closest to 4.303 at 98%)
    4. Use the degrees of freedom n - 1 = 4 (4th row of the table) at x = 98%
        a. Specify the x% confidence level: 98%
    5. If calculated t > table t, reject the null hypothesis
        a. If calculated t ≤ table t, accept the null hypothesis (because we have no proof otherwise)

(b) Compare replicate measurements to see if they agree with the true value (two separate sets of samples, without using a single shared average)
    1. Use the value and standard deviation from two sets of measurements, each with its own number of measurements n
        a. Compare the calculated t with the table value at the stated level of confidence
    2. If calculated t > table t, reject the null hypothesis
        a. If calculated t ≤ table t, accept the null hypothesis

(c) Compare individual measurements (two paired groups from the same population)
    You pair up the measurements from the two groups, one trial at a time, and calculate the t test on the differences, taking their mean into account
    1. Each of the n samples has a pair of outputs. The squared deviations of the paired differences from their mean are summed and divided by (n - 1), and the square root gives the standard deviation used in the test

What is the F Test?
• We use the F Test to see whether two different data sets (groups) have the same standard deviation (rather than the same average)
    o Fcalc = s1²/s2², with the larger standard deviation in the numerator so that Fcalc ≥ 1
• The null hypothesis says that the standard deviations of the groups are the same (Fcalc ≤ Ftable)
    o When Fcalc > Ftable, reject the null hypothesis
    o When Fcalc ≤ Ftable, accept the null hypothesis
• The F table lists critical values for selected degrees of freedom of s1 and s2
• The analysis of variance (ANOVA) tests for differences in the means of several (more than 2) normally distributed groups that all have the same standard deviation, by comparing variances
    o Under its null hypothesis, the groups have the same average

All measurements aren't a reflection of the average; what about outliers?
• The Grubbs Test and the Q Test determine whether it is appropriate to remove an outlier
    o Gcalc = |questionable value - x̄| / s, and Gtable comes from a table of critical values for n measurements
    o If Gcalc > Gtable (at x = 95%), reject the null hypothesis
        § It's okay at 95% confidence to remove the outlier
    o If Gcalc < Gtable (at x = 95%), accept the null hypothesis
        § You cannot say with 95% confidence that the point is an outlier (so don't remove it!)
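The Grubbs decision rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the lecture's own procedure: it takes Gcalc as the distance of the most extreme point from the sample mean divided by the sample standard deviation, the function names and data are hypothetical, and the critical value 1.672 (n = 5, 95%) is an assumed table entry that should be checked against the course table.

```python
from statistics import mean, stdev


def grubbs_statistic(values):
    """Return (suspect value, G_calc) for the point farthest from the sample mean."""
    x_bar = mean(values)
    s = stdev(values)  # sample standard deviation, with n - 1 in the denominator
    suspect = max(values, key=lambda v: abs(v - x_bar))
    return suspect, abs(suspect - x_bar) / s


def may_remove_outlier(values, g_table):
    """True when G_calc exceeds the tabulated critical value g_table."""
    _, g_calc = grubbs_statistic(values)
    return g_calc > g_table


if __name__ == "__main__":
    data = [10.1, 10.3, 10.2, 10.0, 12.5]   # hypothetical replicate measurements
    g_table_95 = 1.672                      # assumed table value for n = 5 at 95%
    suspect, g_calc = grubbs_statistic(data)
    print(suspect, round(g_calc, 3), may_remove_outlier(data, g_table_95))
```

Passing the critical value in as an argument keeps the sketch independent of any particular table, so the same function works for other sample sizes and confidence levels.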
• The Q Test allows us to examine whether one (and only one) observation from a small set of n replicate observations can be legitimately rejected
    o Basically the same as the Grubbs Test, but using different variables
• The Grubbs Test and the Q Test can be used interchangeably (Grubbs is more popular)

References: Stephen Maldonado, September 14 and 16
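For comparison with the Grubbs Test, here is a minimal sketch of a Q Test calculation. The notes do not give the Q equation, so this assumes the commonly used definition, Q = gap / range, where the gap is the distance from the suspect value to its nearest neighbor. The function name and data are hypothetical, and the critical value must be read from a Q table by the caller.

```python
def q_statistic(values):
    """Return (suspect value, Q_calc), assuming Q = gap / range for the most isolated end point."""
    if len(values) < 3:
        raise ValueError("the Q test needs at least 3 observations")
    data = sorted(values)
    spread = data[-1] - data[0]          # range of the whole data set
    if spread == 0:
        raise ValueError("all observations are identical")
    gap_low = data[1] - data[0]          # gap if the smallest value is the suspect
    gap_high = data[-1] - data[-2]       # gap if the largest value is the suspect
    if gap_high >= gap_low:
        return data[-1], gap_high / spread
    return data[0], gap_low / spread


if __name__ == "__main__":
    data = [10.1, 10.3, 10.2, 10.0, 12.5]   # hypothetical replicate measurements
    suspect, q_calc = q_statistic(data)
    print(suspect, round(q_calc, 2))
    # Reject the suspect value only if q_calc is greater than the table value
    # for this number of observations at the chosen confidence level.
```

Only the single most isolated end point is examined, which matches the note above that the Q Test applies to one (and only one) observation.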

