These 9 pages of class notes were uploaded by Chyna Hettinger on Saturday, October 3, 2015. The notes belong to STA120 at California State Polytechnic University, taught by Michael Nasab in Fall. Since upload, they have received 72 views.
Date Created: 10/03/15
Instructor: Prof. Mike Nasab
Chapter 2: Organizing and Summarizing Data

Raw data: When data are collected in original form, they are called raw data. The following are the scores on the first test of the statistics class:
[score list illegible in source]

Grouped data: When the raw data are organized into a frequency distribution.

Frequency distribution: The organizing of raw data in table form using classes and frequencies.
Class: The number of classes in the given table is 5.
Class limits: Represent the smallest and largest data values in each class.
Lower class limit: The lowest number in each class. In the table below, the lower class limit of the 1st class is 50, the lower class limit of the 2nd class is 60, etc.
Upper class limit: The highest number in each class. In the table below, the upper class limit of the 1st class is 59, the upper class limit of the 2nd class is 69, etc.
Class width: The difference between the lower class limit of a class and the lower class limit of the previous class. In the table below the class width is 10.

Cumulative Frequency & Relative Frequency

Class    Frequency   Cumulative Frequency   Relative Frequency
50-59        5                 5               5/45 = 0.11
60-69        8                13               8/45 = 0.18
70-79       12                25              12/45 = 0.27
80-89       13                38              13/45 = 0.29
90-99        7                45               7/45 = 0.16
Total       45 data points

The most commonly used graphs in statistics are:
- The histogram
- The frequency polygon
- The cumulative frequency graph
- Bar chart
- Pie chart
- Pareto charts
- Ogive graph

1. The histogram
Used for making decisions about a process, product, or procedure that could be improved after examining the variation.
Example: Should the school invest in a computer-based tutoring program for low-achieving students in Algebra I after examining the grade distribution?

2. The frequency polygon (line graph)
Used for making decisions about a process, product, or procedure that could be improved.
Example: A frequency polygon for 642 psychology test scores. [Figure not reproduced]

3. The cumulative frequency graph
Cumulative frequency is used to determine the number of observations that lie above or below a particular value.

4. The bar chart
Bar charts are useful for comparing classes or groups of data.

5. Pie chart
A pie chart is a way of summarizing a set of categorical data or displaying the different values of a given variable. Example: percentage distribution. Pie charts usually show
the component parts of a whole.

6. Pareto charts
A Pareto chart is a bar graph in which the bars are drawn in decreasing order of frequency or relative frequency. It is used to graphically summarize and display the relative importance of the differences between groups of data.
Examples: What are the largest issues facing our class? What 20% of sources are causing 80% of the problems? Where should we focus our efforts for the greatest improvement?
[Figure: Pareto chart, not reproduced]

7. Time series graph
[Figure: time series graph, price of AOL by month, not reproduced]

8. Stem-and-leaf plot
1. The stem-and-leaf plot summarizes the shape of a set of data (the distribution) and provides extra detail regarding individual values.
2. Stem-and-leaf plots are usually used when there are large amounts of numbers to analyze. Series of scores on sports teams, series of temperatures or rainfall over a period of time, and series of classroom test scores are examples of when stem-and-leaf plots could be used.

9. Ogive graph (read as "oh-jive")
A graph that represents the cumulative frequency or cumulative relative frequency for the class. It is constructed by plotting points whose x-coordinates are the upper class limits and whose y-coordinates are the cumulative frequencies or cumulative relative frequencies.
[Figure: frequency ogive, "Time Between Eruptions"; x-axis: Time (seconds), y-axis: Cumulative Frequency]

Example of a stem-and-leaf plot:
5 | 0 2 5 8 9
6 | 0 1 2 3 6 6 8 9
7 | 0 1 2 3 5 6 6 6 8 8 9 9
8 | 0 1 2 3 5 6 6 6 8 8 8 9 9
9 | 0 2 3 4 6 7 9
The first row reads 50, 52, 55, 58, and 59.

Types of Distributions
There are several different kinds of distributions, but the following are the most commonly used in statistics:
- Symmetric (normal or bell shape)
- Positively skewed (right tail, or skewed to the right side)
- Negatively skewed (left tail, or skewed to the left side)
- Uniform
[Figures: positively skewed, negatively skewed, uniform, and symmetric distributions, not reproduced]

Instructor: Prof. Mike Nasab
Chapter 1
1. Statistics: The science of collecting, organizing, summarizing, and analyzing numerical information called data (descriptive statistics) and drawing conclusions (inferential statistics).

2. Types of data
a. Qualitative variables: Variables that can be separated into different categories distinguished by some nonnumeric characteristic.
Example: Gender (male or female); Race (White, Black, Hispanic, etc.); Favorite color (blue, red, silver, etc.)
b. Quantitative variables: Variables consisting of numbers representing counts or measurements.
Example: Age, height, speed, test scores, the number of books in our bookstore, and salary.

3. Types of quantitative variables
a. Discrete variables: Data that can assume only values corresponding to isolated points along a line interval. In this type the data are counted.
Example:
1. Number of telephone calls made at the switchboard of our school every day
2. Number of car accidents on the 405 Fwy
3. Number of babies delivered at Long Beach Memorial hospital
b. Continuous variables: Data that can assume any value along a line interval, including every possible value between any two values. In this type the data are measured.
Example:
1. Height of boys born at UCLA hospital on the Fourth of July
2. Amount of rainfall in California in the year 2005
3. Amount (volume) of coffee consumed by Americans in one day

4. Variable: Property of an object or event that can take on different values. For example, college major is a variable that takes on values like mathematics, computer science, English, psychology, etc. Variables whose values are determined by chance are called random variables.

5. Data: List of observations a variable assumes.

6. Variable versus data
Example: Gender is a variable, whereas being male or female is the data.

7. Control group: A group of subjects in an experiment who are not given a particular treatment (e.g., a placebo).

8. Experimental group: A group of subjects in an experiment who are given a particular treatment (e.g., a new drug).

9. Double blind: Neither the doctor nor the patient knows whether
he or she is part of the experimental or control group.

10. Population: The complete collection of all elements to be studied (scores, people, measurements).

11. Sample: A sub-collection of elements drawn from a population.
Example 1: A quality control manager randomly selects 50 bottles of Coca-Cola that were filled on October 15 in order to assess the calibration of the filling machine. Determine the individuals, the population, and the sample.
Ans: The population consists of all bottles of Coca-Cola filled by that particular machine on October 15. The individuals are the individual bottles. The sample consists of the 50 bottles selected by the quality control manager.
Example 2: A researcher claims that the average age of women who graduate from the Engineering School at UCLA is about 28 years. To test his hypothesis, he randomly selects 300 female engineers who have graduated from the UCLA School of Engineering.
- Determine the population.
- Identify the variable of interest.
- Is the variable quantitative or qualitative?
- Is the variable discrete or continuous?
- Describe the sample.
- Describe the inference.

12. Parameter: Numerical measurement describing some characteristic of a population.
Example: μ = population mean, σ = population standard deviation, and p = population proportion.

13. Statistic: Numerical measurement describing some characteristic of a sample.
Example: x̄ = sample mean, s = sample standard deviation, and p̂ = sample proportion.

14. Census versus sample: A census is a collection of data from every element in a population, whereas a sample is a subset of a population.

15. Survey: We study a part of a larger population in order to understand the whole.

16. Survey sampling
a. Observational study (association: a relationship between two variables): Study in which we observe and measure specific characteristics but do not attempt to manipulate or modify the subjects being studied.
Example: The incidence of lung disease in a sample of workers in asbestos factories is compared to the incidence of lung disease in a sample
of college professors.
b. Designed experiment (causation): Study in which a treatment is applied to the experimental units (individuals) and the researcher attempts to manipulate or modify the subjects being studied. The advantage of an experiment over an observational study is that an experiment is controlled. An experiment is defined by the following types of variables: response and predictor.

17. Response variable: The variable that measures the response of units or subjects to the various treatments.
Example: A researcher is interested in determining whether one can predict the scores on a statistics exam from the amount of time spent studying for the exam. Identify the response variable.
Ans: The scores on the exam.
Example: A large study used records from Canada's national health care system to compare the effectiveness of two ways to treat prostate disease: traditional surgery and a new method that does not require surgery. You have 300 prostate patients who are willing to serve as subjects in an experiment to compare the two methods. What is the response variable in this experiment?
Ans: Existence of prostate disease is the response variable.
Example: A study is done to compare the lung capacity (measured by certain breathing tests) of coal miners to the lung capacity of farm workers. The researcher is able to study 200 workers of each type.
Ans: Lung capacity.

18. Predictor variable: The factor(s) that affect the response variable.

19. Levels of predictor variable: The values that a factor can take are called the levels of the factor. For example, a drug dosage (the factor) may be administered at three different levels.

20. Lurking variable: A factor that is related to the study but has not been identified.

21. Frame: The list of all individuals within the population.

Example: A school psychologist wants to test the effectiveness of a new method for teaching reading. She selects five hundred first-grade students in Long Beach District and randomly divides them into two groups. Group 1 is taught by means of the new method while Group 2 is taught via
traditional methods. The same teacher is assigned to teach both groups. At the end of the year an achievement test is administered and the results of the two groups are compared. Determine:
a. Population
b. Sample
c. Subjects (units)
d. Response variable
e. Treatment
f. Levels of the treatment
g. Predictors
h. Designed/observational study

Answers:
a. Population: First graders in District 203
b. Sample: 500 first-grade students in that district
c. Subjects (units): 500 students
d. Response variable: Test scores
e. Treatment: Method of teaching
f. Levels of the treatment: 2 (new versus traditional method of teaching)
g. Predictors: Grades, teachers, school district
h. Designed/observational study: Designed experiment

Sampling Techniques

1. Random sampling (simple random sample): In this technique each member of a population has an equal chance of being selected. Each member of the population is assigned a number. You can select a random sample of any population by using a calculator or computer to generate random numbers.
Example: A list of students in elementary statistics is obtained in which the individuals are numbered 1 to 65. A professor randomly selects 12 of the students.
IT IS DIFFICULT TO OBTAIN A SIMPLE RANDOM SAMPLE (SRS) IN PRACTICE.

2. Stratified sampling (separating the population into nonoverlapping groups): In this technique a population is divided into at least two different subsets, called strata, that share a similar characteristic. A sample is then randomly selected from each stratum, that is, an SRS is conducted separately within each stratum. The defining characteristic can be gender, age, or even political preference. Using a stratified sample ensures that each segment of a population is represented. For this reason stratified samples are usually preferred over simple random samples.
Example: A researcher segments the population of car owners into four groups: Ford, General Motors, Chrysler, and foreign. She obtains a random sample from each group and conducts a survey.

3. Cluster sampling: To select a cluster sample, divide a population into
groups called clusters, then select all of the members in one or more (but not all) of the clusters. This technique is often used because of practical or economic restrictions, but the data collected may be less reliable than when a random sample is used.
Example: A researcher randomly selects 5 of the 70 hospitals in the Long Beach area and then surveys all of the surgical doctors in each selected hospital.

4. Systematic sampling: In this technique a population is ordered in some way and then members of the population are selected at regular intervals. The selection process can start at any randomly chosen point. An advantage of a systematic sample is that it is easy to use.
Example: An interviewer in a mall is told to survey every fifth shopper, starting with the second.
Procedure for systematic sampling when the population size is known:
1. Let N = population size.
2. Let n = sample size.
3. Form N/n, round it down to the nearest integer, and call it k.
4. Randomly select a number between 1 and k and call this number p.
5. The sample will consist of the following individuals: p, p + k, p + 2k, ..., p + (n - 1)k.

5. Convenience sampling (self-selected individuals): In this technique simply use any members of the population that are readily available. This method is likely to produce biased results and usually contains the most bias.
Example:
I. A radio station asks its listeners to call in their opinion regarding the use of American forces in peacekeeping missions.
II. A professor would like to know how many hours per week college students spend watching television. She is teaching two large classes and uses all students in those classes as her sample.

Sampling error: The difference between a characteristic of the entire population and the same characteristic of a sample of that population.

Sources of Errors in Sampling
a. Sampling error: The error that results from using sampling to estimate information regarding a population.
Example: The size of the sample and the amount of variation that exists in the population, i.e., how different the members of the
population are from one another with regard to the variable being studied.
b. Nonsampling error
Example: Respondents lying, measurement errors, poorly worded questions, and the error due to people not responding.

Sources of bias
- Sampling bias: A systematic tendency to exclude one type of person from the sample. A large sample will not solve this problem.
- Nonresponse bias: This is when people who do not answer questions are different from people who do.
- Undercoverage bias: This is when some groups in the population are left out of the process of choosing the sample.
- Response bias: This is when individuals do not reply truthfully.
- Voluntary response bias: This is when the survey relies on individuals who volunteer to respond (e.g., Internet surveys). Such surveys are unscientific and unreliable.

The following can affect the validity of a study:
a. The method by which a sample was obtained
b. Wording of questions
c. The order in which questions are presented

Example: A magazine is conducting a study on the effects of infidelity in a marriage. The editor randomly selects 400 women whose husbands were unfaithful and asks, "Do you believe a marriage can survive when the husband destroys the trust that must exist between husband and wife?" What is wrong with the wording of the question that was asked?
Ans: The question is loaded; phrases such as "destroys the trust" push the respondent toward one answer. A more neutral wording: "Do you believe that a marriage can be maintained after an extramarital relationship?"

A key component of a well-designed experiment is RANDOMIZATION.
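
The simple random sample, the five-step systematic sampling procedure, and the random division of subjects into two groups described in these notes can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the course material: the function names, the fixed seed 2015, and the use of Python's standard random module are all assumptions made for the example. The numbers (65 students, a sample of 12, 500 first graders) come from the examples in the notes.

```python
import random


def simple_random_sample(population, n, rng):
    """Draw n distinct members; every member has the same chance of selection."""
    return rng.sample(population, n)


def systematic_sample(N, n, rng):
    """Systematic sample from a population numbered 1..N (population size known)."""
    k = N // n                 # form N/n and round down to the nearest integer
    p = rng.randint(1, k)      # random starting point between 1 and k, inclusive
    return [p + i * k for i in range(n)]   # p, p+k, p+2k, ..., p+(n-1)k


def randomize_into_two_groups(subjects, rng):
    """Randomly split the subjects into two equal-sized treatment groups."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical fixed seed so that repeated runs give the same selections.
rng = random.Random(2015)

# Professor selects 12 of the 65 numbered students.
students = list(range(1, 66))
srs = simple_random_sample(students, 12, rng)

# Same population, sampled systematically: k = 65 // 12 = 5.
sys_sample = systematic_sample(N=65, n=12, rng=rng)

# 500 first graders randomly divided between the new and traditional methods.
new_method, traditional = randomize_into_two_groups(range(1, 501), rng)

print("SRS:", sorted(srs))
print("Systematic:", sys_sample)
print("Group sizes:", len(new_method), len(traditional))
```

Because every systematic sample ends at p + (n - 1)k, which is at most nk and therefore at most N, the procedure never selects a number outside the population.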