# Notes MATH 1140

Tulane

GPA 3.5


This 30 page Class Notes was uploaded by Maya Notetaker on Monday November 23, 2015. The Class Notes belongs to MATH 1140 at Tulane University taught by Robert Herbert, in Summer 2015. Since its upload, it has received 13 views. For similar materials see Business Statistics in Math at Tulane University.


Date Created: 11/23/15

Maya Pelichet

Statistics for Business & Economics

I. Chapter 1: Statistics, Data, & Statistical Thinking
  a. 1.1 The Science of Statistics
    i. Statistics is the science of data. It involves collecting, classifying, summarizing, organizing, and interpreting numerical and categorical information.
    ii. Professional statisticians are trained in statistical science: collecting data, evaluating it, and drawing conclusions from it.
  b. 1.2 Types of Statistical Applications in Business
    i. Statistics are widely thought of as "numerical descriptions"
    ii. Sampling is collecting a smaller set of data from a larger set of data in order to estimate characteristics of the larger set
    iii. Statistics involves 2 processes
      1. Describing sets of data (descriptive statistics): looking for patterns in a data set, summarizing the information, and presenting it in a convenient form
      2. Drawing conclusions (inferential statistics): making estimates, decisions, predictions, or other generalizations about a larger set of data based on a sample
  c. 1.3 Fundamental Elements of Statistics
    i. Statistical methods are useful for studying, analyzing, and learning about populations of experimental units
      1. An experimental (or observational) unit is an object (e.g., person, thing, transaction, or event) upon which we collect data
      2. A population is the set of all experimental units (usually people, objects, transactions, or events) that we are interested in studying
        a. A variable is a characteristic or property of an individual experimental unit in the population (e.g., age, gender, income)
        b. Measurement is the process used to assign numbers to variables of individual population units (e.g., rating food on a scale of 1 to 10)
        c. If a population is small, measuring a variable for every experimental unit of the population is possible
          i. Measuring a variable for every unit is called a census
        d.
           Sometimes conducting a census of a very large population is too time consuming or expensive; consequently a sample (a subset of the units of a population) is used
          i. A statistical inference is an estimate, prediction, or other generalization about a population based on information contained in a sample
            1. Reliability: how good the statistical inference is
            2. Resource constraints force us to work with a sample, so we can never be certain an inference is correct; reliability must therefore always be assessed
            3. A measure of reliability is a statement (usually quantified) about the degree of uncertainty associated with a statistical inference
    ii. Four Elements of Descriptive Statistical Problems
      1. The population or sample of interest (the group or category you are interested in looking at)
      2. One or more variables (characteristics of the population or experimental units) that are to be investigated
      3. Tables, graphs, or numerical summary tools
      4. Identification of patterns in the data
    iii. Five Elements of Inferential Statistical Problems
      1. The population of interest (the group or category you are interested in looking at)
      2. One or more variables (characteristics of the experimental units, where each unit is one member of the population) that are to be investigated
      3. The sample of population units (used very often, because it is unlikely you can survey or investigate the whole population)
      4. The inference about the population, based on information contained in the sample (it treats what was true for the sample as true for the population)
      5. A measure of reliability for the inference, which tells you how sure you are that your inference about the population is correct
  d. 1.4 Processes
    i. Statistical methods are useful for making inferences about processes. A process is a series of actions or operations that transforms inputs to outputs; it produces or generates output over time (example: a math function f(x))
    ii. A process whose operations or actions are unknown (we don't know what the steps of the process are) or very complex (or both) is called a black box
      1.
         Frequently, inputs are not specified when a process is treated as a black box
      2. To study a process, we generally focus on one or more characteristics (variables) of the output
      3. To study a process that has a numerical output, the property represented by the numbers (GDP, sales, stock prices) is typically the variable of interest
      4. If the output is not numeric, measurement processes are used to assign numbers to the variables
      5. Samples are defined differently for processes: a sample is any set of output (objects or numbers) produced by a process
  e. 1.5 Types of Data
    i. All data can be classified as one of two general types: quantitative & qualitative
    ii. Quantitative Data
      1. Data measured on a naturally occurring numerical scale, e.g., the current unemployment rate (measured as a percentage)
    iii. Qualitative Data
      1. Measurements that cannot be made on a natural numerical scale; they can only be classified into one of a group of categories, e.g., Democratic, Republican, etc.
  f. 1.6 Collecting Data: Sampling and Related Issues
    i. Typically data is obtained in three different ways
      1. Published source: a book, journal, newspaper, or Web site (someone else found the data for you)
      2. Designed experiment: a researcher exerts strict control over the units (people, objects, or events) in the study
      3. Observational study (e.g., a survey): the researcher observes the experimental units in their natural setting and records the variable(s) of interest
        a. The most common type of observational study is a survey: a researcher samples a group of people, asks one or more questions, and records the responses (e.g., a political poll)
    ii. Regardless of how the data is collected, to apply inferential statistics you must obtain a representative sample, that is, one representative of the population as a whole
      1.
         The most common way to satisfy the representative sample requirement is to select a simple random sample, which ensures that every subset of fixed size in the population has the same chance of being included in the sample. A simple random sample of n experimental units is a sample selected from the population in such a way that every different sample of size n has an equal chance of selection
      2. The procedure for selecting a simple random sample typically relies on a random number generator
      3. There are more complex random sampling designs that can be employed
        a. Stratified Random Sampling: used when the experimental units associated with the population can be separated into two or more groups of units called strata, where the characteristics of the experimental units are more similar within strata than across strata
        b. Cluster Sampling: natural groups (clusters) of experimental units are sampled first, and data are then collected from the experimental units within them (i.e., choose a random sample of clusters to represent the population, then survey everyone in each chosen cluster or take a random sample within each cluster); this method of sampling is often used in biological sampling
        c. Systematic Sampling: involves systematically selecting every kth experimental unit from a list of all experimental units (e.g., survey every 10th person who comes into a grocery store)
        d. Randomized Response Sampling
      4. Selection bias (sample bias) occurs when some experimental units in the population have less chance of being included in the sample than others, resulting in samples that are not representative of the population (this can occur even when you are trying to create a representative sample)
      5. Nonresponse bias results when the researchers are unable to obtain data on all the experimental units selected for the sample
      6. Sometimes, even if your sample is representative of the population, the data collected may suffer from measurement error: inaccuracies in the values of the data collected.
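
As a brief illustration of the sampling designs listed above, the following Python sketch draws a simple random sample, a systematic sample, and a stratified random sample. The function names, the numbered list of units, the `seed` parameter, and the choice of an equal sample size per stratum are all assumptions made for illustration; none of them come from the notes.

```python
import random


def simple_random_sample(units, n, seed=None):
    """Draw n units so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # the random number generator the notes mention
    return rng.sample(units, n)


def systematic_sample(units, k, start=0):
    """Take every k-th unit from the list, beginning at index `start`."""
    return units[start::k]


def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of equal size from each stratum (an assumption)."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


# Hypothetical population: 100 shoppers numbered 1..100
shoppers = list(range(1, 101))
srs = simple_random_sample(shoppers, 10, seed=1)
every_tenth = systematic_sample(shoppers, 10, start=9)  # the 10th, 20th, ... shopper
by_group = stratified_sample({"weekday": shoppers[:60], "weekend": shoppers[60:]}, 5, seed=1)
```

The systematic sample here is fully determined by `k` and `start`, whereas the other two depend on the random number generator, so their contents change with the seed.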
         In surveys, this error may be due to ambiguous or leading questions and to the interviewer's effect on the respondent
  g. 1.7 Critical Thinking with Statistics
    i. Developments such as the growing governmental role in drug and product testing are evidence of the need for quantitative literacy: the ability to evaluate data intelligently, which helps you think critically using statistics
    ii. Statistical thinking involves applying rational thought and the science of statistics to critically assess data and inferences. Fundamental to the thought process is that variation exists in populations and process data.
    iii. Unethical statistical practice results when selection bias in a sample is intentional, with the sole purpose of misleading the public
    iv. Most problems with surveys result from the use of nonrandom samples, which are subject to potential errors such as selection bias, nonresponse bias, and measurement error
    v. The Role of Statistics in Managerial Decision Making
II. Chapter 2: Methods for Describing Sets of Data
  a. 2.1 Describing Qualitative Data
    i. Qualitative data are non-numerical; thus the values of a qualitative variable can be classified only into categories called classes (e.g., Republican, Democrat, Libertarian)
      1. Classes can be summarized in two numerical ways:
        a. Computing the class frequency: the number of observations in the data set that fall into each class; the sum of the class frequencies will always equal the sample size n
        b. Computing the class relative frequency: the proportion of the total number of observations falling into each class; the class relative frequency is the class frequency divided by the total number of observations in the data set
          i. $\text{Class relative frequency} = \frac{\text{class frequency}}{n}$
        c. The class percentage is the class relative frequency multiplied by 100
          i. $\text{Class percentage} = (\text{class relative frequency}) \times 100$
    ii. The most widely used graphical methods for describing qualitative data
      1.
         Bar graphs: the categories (classes) of the qualitative variable are represented by bars, where the height of each bar is either the class frequency, class relative frequency, or class percentage
      2. Pie chart: the categories (classes) of the qualitative variable are represented by slices of a pie (circle). The size of each slice is proportional to the class relative frequency
      3. Pareto diagram: a bar graph with the categories (classes) of the qualitative variable (i.e., the bars) arranged by height in descending order from left to right
  b. 2.2 Graphical Methods for Describing Quantitative Data
    i. For describing, summarizing, and detecting patterns in data recorded on a meaningful numerical scale, three methods can be used:
      1. Dot plots: the numerical value of each quantitative measurement in the data set is represented by a dot on a horizontal scale. When data values repeat, the dots are placed above one another vertically.
      2. Stem-and-leaf displays: the numerical value of the quantitative variable is partitioned into a "stem" and a "leaf." The possible stems are listed in order in a column. The leaf for each quantitative measurement in the data set is placed in the corresponding stem row. Leaves for observations with the same stem value are listed in increasing order horizontally
      3. Histograms: the possible numerical values of the quantitative variable are partitioned into class intervals, where each interval has the same width. These intervals form the scale of the horizontal axis. The frequency or relative frequency of observations in each class interval is determined. A vertical bar is placed over each class interval with height equal to either the class frequency or class relative frequency.
  c. 2.3 Numerical Measures of Central Tendency
    i.
       A data set is referred to as either a sample or a population; if statistical inference is the goal, the numerical descriptive measures of the sample are ultimately used to make inferences about the corresponding measures for the population
    ii. Most numerical methods measure one of two data characteristics
      1. The central tendency of the measurements: the tendency of the data to cluster, or center, about certain numerical values
      2. The variability of the set of measurements: the spread of the data
      3. The most popular measure of central tendency is the arithmetic mean: the mean of a set of quantitative data is the sum of the measurements divided by the number of measurements contained in the data set
        a. The mean of a sample of n measurements is denoted $\bar{x}$: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$. Note: $\sum_{i=1}^{n} x_i = (x_1 + x_2 + \dots + x_n)$
        b. Note: the textbook has adopted a policy of using Greek letters to represent population descriptive measures and Roman letters to represent the corresponding descriptive measures for the sample. The symbols are the following:
          i. $\bar{x}$ = sample mean
          ii. $\mu$ = population mean
          iii. $\eta$ = population median
          iv. $m$ = sample median
          v. $\sigma^2$ = population variance; $s^2$ = sample variance
          vi. $\sigma$ = population standard deviation; $s$ = sample standard deviation
            1. The population mean is calculated as $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$, where $N$ is the population size
            2. $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
          vii. Often $\bar{x}$ is used to estimate $\mu$
          viii. When using an estimator or an approximation for $\mu$ to make an inference, you need to know the reliability of your inference; the reliability of an inference depends on two factors:
            1. The size of the sample. The larger the sample, the more accurate the estimate will tend to be
            2. The variability, or spread, of the data. The more variable the data, the less accurate the estimate will tend to be
      4. Another important measure of central tendency is the median: the middle number when the measurements are arranged in ascending (or descending) order
        a. The median is of most value when describing large data sets
        b. If the data set is characterized by a relative frequency histogram, the median is the point on the x-axis such that half the area under the histogram lies above the median and half lies below
        c.
           The median of a sample is denoted by $m$ and the population median is denoted by $\eta$
        d. Calculating the sample median
          i. Arrange the n measurements from smallest to largest
          ii. If n is odd, m is the middle number
          iii. If n is even, m is the mean of the middle two numbers
      5. A data set is said to be skewed if one tail of the distribution has more extreme observations than the other tail
        a. If the data set is skewed to the right (rightward skewness), then typically the median is less than the mean
        b. If the data set is symmetric, the mean equals the median
        c. If the data set is skewed to the left (leftward skewness), then typically the mean is less than (to the left of) the median
      6. A third measure of central tendency is the mode of a set of measurements (the measurement that occurs most frequently in the data set)
        a. Because it emphasizes data concentration, the mode is used with quantitative data sets to locate the region in which much of the data is concentrated
        b. The mode may not be meaningful for some data sets because there may be more than one mode
      7. A more meaningful measure can be obtained from a relative frequency histogram for quantitative data
        a. The class interval containing the largest relative frequency is called the modal class
        b. The simplest way to define the mode within a modal class is to define it as the midpoint of the class
        c. *For most applications involving quantitative data, the mean and median provide more descriptive information
  d. 2.4 Numerical Measures of Variability
    i. Measures of central tendency provide only a partial description of a quantitative data set; the description is incomplete without a measure of variability, or spread, of the data set. Knowledge of the data's variability along with its center can help us visualize the shape of the data set as well as its extreme values
    ii. The range of a quantitative data set is equal to the largest measurement minus the smallest measurement
      1.
         The range is easy to compute but is an insensitive measure of variation, because two data sets can have the same range and be vastly different with respect to data variation
    iii. The sample variance for a sample of n measurements is equal to the sum of the squared deviations from the mean divided by $(n - 1)$. The symbol $s^2$ is used to represent the sample variance
      1. Formula: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$. Shortcut: $s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$
      2. The population variance, denoted by the symbol $\sigma^2$, is the average of the squared distances of the measurements on all units in the population from the mean, $\mu$, and $\sigma$ is the square root of this quantity. Because there is rarely access to the population data, we usually do not compute $\sigma^2$ or $\sigma$. We simply denote these two quantities by their respective symbols
        a. Formula ($N$ is the size of the population): $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
        b. Dividing by $n$ tends to produce an underestimate of the population variance, so $(n - 1)$ is used (and preferred) in the denominator to correct this tendency when computing the sample variance
    iv. The sample standard deviation, $s$, is defined as the positive square root of the sample variance, $s^2$. Thus $s = \sqrt{s^2}$
  e. 2.5 Using the Mean and Standard Deviation to Describe Data
    i. Chebyshev's Rule applies to any data set regardless of the shape of the frequency distribution
      1. At least 3/4 of the data lie within two standard deviations of the mean, that is, in the interval with endpoints $(\bar{x} - 2s, \bar{x} + 2s)$ for samples and with endpoints $(\mu - 2\sigma, \mu + 2\sigma)$ for populations;
      2. At least 8/9 of the data lie within three standard deviations of the mean, that is, in the interval with endpoints $(\bar{x} - 3s, \bar{x} + 3s)$ for samples and with endpoints $(\mu - 3\sigma, \mu + 3\sigma)$ for populations;
      3.
         At least $1 - 1/k^2$ of the data lie within $k$ standard deviations of the mean, that is, in the interval with endpoints $(\bar{x} - ks, \bar{x} + ks)$ for samples and with endpoints $(\mu - k\sigma, \mu + k\sigma)$ for populations, where $k$ is any positive whole number that is greater than 1.
    ii. The Empirical Rule is a rule of thumb that applies to data sets with frequency distributions that are mound-shaped and symmetric
      1. Approximately 68% of the data lie within one standard deviation of the mean, that is, in the interval with endpoints $(\bar{x} - s, \bar{x} + s)$ for samples and with endpoints $(\mu - \sigma, \mu + \sigma)$ for populations;
      2. Approximately 95% of the data lie within two standard deviations of the mean, that is, in the interval with endpoints $(\bar{x} - 2s, \bar{x} + 2s)$ for samples and with endpoints $(\mu - 2\sigma, \mu + 2\sigma)$ for populations; and
      3. Approximately 99.7% of the data lie within three standard deviations of the mean, that is, in the interval with endpoints $(\bar{x} - 3s, \bar{x} + 3s)$ for samples and with endpoints $(\mu - 3\sigma, \mu + 3\sigma)$ for populations.
  f. 2.6 Numerical Measures of Relative Standing
    i. Descriptive measures of the relationship of a measurement to the rest of the data are called measures of relative standing
    ii. One measure of the relative standing of a measurement is its percentile ranking, or percentile score
      1. Percentile rankings are of practical value only for larger data sets
      2. To find them, the measurements are ranked in order, and a rule is selected to define the location of each percentile
      3. Because we are primarily interested in interpreting the percentile rankings of measurements (rather than finding particular percentiles for a data set), we define the pth percentile of a data set as follows
        a. For any set of n measurements (arranged in ascending or descending order), the pth percentile is a number such that p% of the measurements fall below the pth percentile and (100 - p)% fall above it
      4.
         Percentiles that partition a data set into four categories, each containing exactly 25% of the measurements, are called quartiles
        a. The lower quartile ($Q_L$) is the 25th percentile, the middle quartile ($Q_M$) is the median or 50th percentile, and the upper quartile ($Q_U$) is the 75th percentile
    iii. Another measure of relative standing in popular use is the z-score. The z-score makes use of the mean and standard deviation of the data set in order to specify the relative location of a measurement; it represents the distance between a given measurement x and the mean, expressed in standard deviations
      1. Sample z-score: $z = \frac{x - \bar{x}}{s}$. Population z-score: $z = \frac{x - \mu}{\sigma}$
      2. Interpretation of z-Scores for Mound-Shaped Distributions of Data
        a. Approximately 68% of the measurements will have a z-score between -1 and 1
        b. Approximately 95% of the measurements will have a z-score between -2 and 2
        c. Approximately 99.7% (almost all) of the measurements will have a z-score between -3 and 3
          i. These interpretations of z-scores are identical to those given by the Empirical Rule for mound-shaped distributions. The statement that a measurement falls in the interval $\mu - \sigma$ to $\mu + \sigma$ is equivalent to the statement that the measurement has a population z-score between -1 and 1, because all measurements between $\mu - \sigma$ and $\mu + \sigma$ are within 1 standard deviation of $\mu$.
  g. 2.7 Methods for Detecting Outliers: Box Plots and z-scores
    i. An observation that is unusually large or small relative to the data values we want to describe is called an outlier; outliers are typically attributable to one of the following causes:
      1. The measurement is observed, recorded, or entered into the computer incorrectly
      2. The measurement comes from a different population
      3. The measurement is correct but represents a rare (chance) event
    ii. Two useful methods for detecting outliers, one graphical and one numerical, are box plots and z-scores
    iii.
         A box plot is based on the quartiles of a data set and, more specifically, on the interquartile range (IQR), the distance between the lower and upper quartiles: $IQR = Q_U - Q_L$
    iv. Elements of a Box Plot
      1. A rectangle (the box) is drawn with the ends (the hinges) drawn at the lower and upper quartiles. The median of the data is shown in the box, typically by a line
      2. The points at a distance of 1.5(IQR) from each hinge mark the inner fences of the data set. Lines (the whiskers) are drawn from each hinge to the most extreme measurement inside the inner fence
        a. Lower inner fence $= Q_L - 1.5(IQR)$
        b. Upper inner fence $= Q_U + 1.5(IQR)$
      3. A second pair of fences, the outer fences, appear at a distance of 3(IQR) from the hinges. One symbol (usually "*") is used to represent measurements falling between the inner and outer fences, and another (usually "0") is used to represent measurements that lie beyond the outer fences. Outer fences are not shown unless one or more measurements lie beyond them
        a. Lower outer fence $= Q_L - 3(IQR)$
        b. Upper outer fence $= Q_U + 3(IQR)$
      4. The symbols used to represent the median and the extreme data points (those beyond the fences) will vary depending on the software you use to construct the box plot
    v. Aids to the Interpretation of Box Plots
      1. The line (median) inside the box represents the "center" of the distribution of the data
      2. Examine the length of the box. The IQR is a measure of the sample's variability and is especially useful for the comparison of two samples
      3. Visually compare the lengths of the whiskers. If one is clearly longer, the distribution of the data is probably skewed in the direction of the longer whisker
      4. Analyze any measurements that lie beyond the fences. Fewer than 5% should fall beyond the inner fences, even for very skewed distributions. Measurements beyond the outer fences are probably outliers, with one of the following explanations
        a. The measurement is incorrect.
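
The fence formulas and the sample z-score defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the category labels, and the treatment of a value lying exactly on a fence (counted as inside it) are hypothetical choices, and the quartiles are passed in directly because the notes do not fix a rule for computing them.

```python
import statistics


def box_plot_fences(q_lower, q_upper):
    """Return (inner, outer) fence pairs from the lower and upper quartiles."""
    iqr = q_upper - q_lower  # IQR = Q_U - Q_L
    inner = (q_lower - 1.5 * iqr, q_upper + 1.5 * iqr)
    outer = (q_lower - 3 * iqr, q_upper + 3 * iqr)
    return inner, outer


def locate(x, q_lower, q_upper):
    """Say where a measurement x falls relative to the fences."""
    inner, outer = box_plot_fences(q_lower, q_upper)
    if x < outer[0] or x > outer[1]:
        return "beyond outer fences"
    if x < inner[0] or x > inner[1]:
        return "between inner and outer fences"
    return "inside inner fences"


def sample_z_scores(data):
    """z = (x - mean) / s, using the sample standard deviation (n - 1 denominator)."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    return [(x - mean) / s for x in data]


# Hypothetical quartiles Q_L = 10 and Q_U = 20, so IQR = 10
inner, outer = box_plot_fences(10, 20)
```

With these hypothetical quartiles, the inner fences fall at -5 and 35 and the outer fences at -20 and 50, so a measurement of 40 would be drawn with the "between fences" symbol and a measurement of 60 with the "beyond outer fences" symbol.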
           It may have been observed, recorded, or entered into the computer incorrectly
        b. The measurement belongs to a population different from the population the rest of the sample was drawn from
        c. The measurement is correct and from the same population as the rest. Generally we accept this explanation only after carefully ruling out the others
      5. Rules of Thumb for Detecting Outliers
        a. Box Plots: observations falling between the inner and outer fences are deemed suspect outliers. Observations falling beyond the outer fences are deemed highly suspect outliers
          i. Suspect outliers: between $Q_L - 1.5(IQR)$ and $Q_L - 3(IQR)$, or between $Q_U + 1.5(IQR)$ and $Q_U + 3(IQR)$
          ii. Highly suspect outliers: below $Q_L - 3(IQR)$ or above $Q_U + 3(IQR)$
        b. z-Scores: observations with z-scores greater than 3 in absolute value are considered outliers. For some highly skewed data sets, observations with z-scores greater than 2 in absolute value may be outliers
          i. Possible outliers: $|z| > 2$
          ii. Outliers: $|z| > 3$
  h. 2.8 Graphing Bivariate Relationships
    i. One way to describe the relationship between two variables (a bivariate relationship) is to plot the data in a scatterplot: a two-dimensional graph with one variable's values plotted along the vertical axis and the other variable's values plotted along the horizontal axis
  i. 2.9 The Time Series Plot
    i. Data that are produced and monitored over time are called time series data
    ii. When measurements are made over time, it is important to record both the numerical value and the time or time period associated with each measurement; with this information, a time series plot, or run chart, can be constructed to describe the time series data and to learn about the process that generated the data
    iii. A time series plot is a scatterplot with the measurements on the vertical axis and the time or order in which the measurements were made on the horizontal axis
    iv. Time series plots reveal the movement (trend) and changes (variation) in the variable being monitored
  j.
     2.10 Distorting the Truth with Descriptive Tactics
    i. A common way to distort a graph is to change the scale on the vertical axis, the horizontal axis, or both
    ii. Sometimes the distance between successive units on the vertical axis is changed, that is, the vertical axis is stretched by graphing only a few units per inch
    iii. Other times the width of the bars is made proportional to the height
    iv. The picture of a data set can be distorted when only a measure of central tendency is reported; both a measure of central tendency and a measure of variability are needed to obtain an accurate mental image of a data set
III. Chapter 3: Probability
  o In this chapter, unlike the others, the population is assumed to be known and is used to infer the probable nature of a sample
  o The probabilities of many observed sample results are not easy to evaluate intuitively, thus the assistance of a theory of probability is needed
  a. 3.1 Events, Sample Spaces, & Probability
    i. The result seen and recorded is called an observation or measurement
    ii. The act or process of observation that leads to a single outcome that cannot be predicted with certainty is called an experiment
    iii. Because observing the outcome of an experiment is similar to selecting a sample from a population, the basic possible outcomes of an experiment are called sample points
      1. The sample space of an experiment is the collection of all of its sample points
      2. When each sample point is represented by a solid dot (i.e., a "point") and labeled accordingly, these graphical representations are called Venn diagrams
      3. The probability of a sample point is a number between 0 & 1 inclusive that measures the likelihood that the outcome will occur when the experiment is performed
        a. This number is taken to be the relative frequency of the occurrence of a sample point in a very long series of repetitions of an experiment; this result derives from an axiom in probability theory
    iv.
         When there is little or no information on the relative frequency of occurrence of the sample points of an experiment, probabilities are assigned to the sample points based on general information about the experiment
      1. No matter how you assign the probabilities to sample points, the probabilities assigned must obey 2 rules (where $p_i$ represents the probability of sample point $i$):
        a. All sample point probabilities must lie between 0 & 1 ($0 \le p_i \le 1$)
        b. The probabilities of all the sample points must sum to 1 (i.e., $\sum p_i = 1$)
    v. An event is a specific collection of sample points. Furthermore, a simple event contains only a single sample point, while a compound event contains two or more sample points
      1. The probability of an event $A$ is calculated by summing the probabilities of the sample points in the sample space for $A$
      2. Steps for Calculating Probabilities of Events
        a. Define the experiment; that is, describe the process used to make an observation and the type of observation that will be recorded
        b. List the sample points
        c. Assign probabilities to the sample points
        d. Determine the collection of sample points contained in the event of interest
        e. Sum the sample point probabilities to get the event probability
    vi. One method of determining the number of sample points for a complex experiment is to develop a counting system, starting by examining a simple version of the experiment
    vii. A second method of determining the number of sample points for an experiment is to use combinatorial mathematics
      1. Combinations Rule
        a. A sample of $n$ elements is to be drawn from a set of $N$ elements. Then the number of different samples possible is denoted by $\binom{N}{n}$ and is equal to $\binom{N}{n} = \frac{N!}{n!(N - n)!}$, where the factorial symbol (!) means that $n! = n(n - 1)(n - 2) \cdots (3)(2)(1)$
        b. Note: the quantity $0!$ is defined to be equal to 1
  b. 3.2 Unions & Intersections
    i. Compound events can be formed (composed) in two ways
      1.
         The union of two events $A$ and $B$ is the event that occurs if either $A$ or $B$ or both occur on a single performance of the experiment. We denote the union of events $A$ and $B$ by the symbol $A \cup B$. $A \cup B$ consists of all the sample points that belong to $A$ or $B$ or both
      2. The intersection of two events $A$ and $B$ is the event that occurs if both $A$ and $B$ occur on a single performance of the experiment. We write $A \cap B$ for the intersection of $A$ and $B$. $A \cap B$ consists of all the sample points belonging to both $A$ and $B$
    ii. Unions and intersections can be defined for more than two events. For example, the event $A \cup B \cup C$ represents the union of three events: $A$, $B$, and $C$. This event, which includes the set of sample points in $A$, $B$, or $C$, will occur if any one (or more) of the events $A$, $B$, and $C$ occurs. Similarly, the intersection $A \cap B \cap C$ is the event that all three of the events $A$, $B$, and $C$ occur. Therefore $A \cap B \cap C$ is the set of sample points that are in all three of the events.
  c. 3.3 Complementary Events
    i. A very useful concept in the calculation of event probabilities is the notion of complementary events:
      1. The complement of an event $A$ is the event that $A$ does not occur, that is, the event consisting of all sample points that are not in event $A$. We denote the complement of $A$ by $A^c$
      2. Rule of complements
        a. The sum of the probabilities of complementary events equals 1: $P(A) + P(A^c) = 1$
      3. In many probability problems, calculating the probability of the complement of the event of interest is easier than calculating the probability of the event itself. Because $P(A) + P(A^c) = 1$, we can calculate $P(A)$ by using the relationship $P(A) = 1 - P(A^c)$
  d. 3.4 The Additive Rule and Mutually Exclusive Events
    i.
       It is also possible to obtain the probability of the union of two events by using the additive rule of probability: the probability of the union of events $A$ and $B$ is the sum of the probabilities of events $A$ and $B$ minus the probability of the intersection of events $A$ and $B$, that is, $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
      1. The union of two events will often contain many sample points, because the union occurs if either one or both of the events occur
    ii. When $A \cap B$ contains no sample points ($A$ and $B$ have no sample points in common), events $A$ and $B$ are mutually exclusive
      1. If two events $A$ and $B$ are mutually exclusive, the probability of the union of $A$ and $B$ equals the sum of the probabilities of $A$ and $B$; that is, $P(A \cup B) = P(A) + P(B)$
      2. If the events are not mutually exclusive, the formula above is NOT valid. For two non-mutually exclusive events, you must apply the general additive rule of probability
  e. 3.5 Conditional Probability
    i. Unconditional probabilities: event probabilities that give the relative frequencies of the occurrences of the events when the experiment is repeated a very large number of times; they are called this because no special conditions are assumed other than those that define the experiment
    ii. Conditional probability: an event probability computed when additional knowledge that affects the outcome of the experiment is available, causing the probability of the event of interest to be altered
      1. The symbol $P(A \mid B)$ is used to denote the probability that $A$ occurs given that event $B$ occurs
      2. Conditional probability formula: to find the conditional probability that event $A$ occurs given that event $B$ occurs, divide the probability that both $A$ and $B$ occur by the probability that $B$ occurs: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ (it is assumed that $P(B) \ne 0$)
        a. This formula adjusts the probability of $A \cap B$ from its original value in the complete sample space $S$ to a conditional probability in the reduced sample space $B$.
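
To make the rule of complements, the additive rule, and the conditional probability formula concrete, here is a minimal Python sketch. It assumes a hypothetical experiment (one toss of a fair die, so all six sample points are equally likely); the events `A` and `B` and the helper names are illustrative and do not come from the notes.

```python
from fractions import Fraction

# Hypothetical experiment: one toss of a fair die (equally likely sample points)
SAMPLE_SPACE = frozenset({1, 2, 3, 4, 5, 6})


def prob(event):
    """P(event) when every sample point is equally likely."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))


def conditional_prob(a, b):
    """P(A | B) = P(A and B) / P(B); P(B) is assumed to be nonzero."""
    if prob(b) == 0:
        raise ValueError("P(B) must be nonzero")
    return prob(a & b) / prob(b)


A = {2, 4, 6}  # the toss is even
B = {1, 2, 3}  # the toss is at most 3

p_union = prob(A | B)                             # counted directly from the sample points
p_additive = prob(A) + prob(B) - prob(A & B)      # additive rule
p_complement = 1 - prob(A)                        # rule of complements
p_a_given_b = conditional_prob(A, B)              # reduced sample space B
```

Here the union computed directly (5/6) agrees with the additive rule, and conditioning on `B` changes the probability of `A` from 1/2 to 1/3, since only one of the three sample points in `B` is even.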
           If the sample points in the complete sample space are equally likely, then the formula will assign equal probabilities to the sample points in the reduced sample space. If, on the other hand, the sample points have unequal probabilities, the formula will assign conditional probabilities proportional to the probabilities in the complete sample space.
  f. 3.6 The Multiplicative Rule and Independent Events
    i. The Multiplicative Rule of probability gives the probability of the intersection of events $A$ and $B$: $P(A \cap B) = P(A) P(B \mid A)$ or, equivalently, $P(A \cap B) = P(B) P(A \mid B)$
    ii. The formula for calculating intersection probabilities is invaluable when the intersection contains numerous sample points
    iii. When the assumption that event $B$ has occurred does not alter the probability of $A$, the events $A$ and $B$ are said to be independent events
      1. Events $A$ and $B$ are independent if $P(A \mid B) = P(A)$, or equivalently if $P(B \mid A) = P(B)$
      2. Events that are not independent are said to be dependent
    iv. 3 points about independence
      1. Independence, unlike the mutually exclusive property, cannot be shown on or gleaned from a Venn diagram
      2. Mutually exclusive events (each with nonzero probability) are dependent events, because $P(A) \ne P(A \mid B)$
      3. When $A$ and $B$ are independent, the probability of their intersection is easy to calculate
        a. If events $A$ and $B$ are independent, the probability of the intersection of $A$ and $B$ equals the product of the probabilities of $A$ and $B$; that is, $P(A \cap B) = P(A) P(B)$
          i. The converse also holds: if $P(A \cap B) = P(A) P(B)$, events $A$ and $B$ are independent
  g. 3.7 Bayes's Rule
    i. Bayesian statistical methods involve converting an unknown conditional probability, such as $P(B \mid A)$, to one involving a known conditional probability, such as $P(A \mid B)$
    ii.
         Bayes's Rule can be applied when an observed event $A$ occurs with any one of several mutually exclusive and exhaustive events $B_1, B_2, \dots, B_k$. Given $k$ mutually exclusive and exhaustive events $B_1, B_2, \dots, B_k$ such that $P(B_1) + P(B_2) + \dots + P(B_k) = 1$, and an observed event $A$, then $P(B_i \mid A) = \frac{P(B_i \cap A)}{P(A)} = \frac{P(B_i) P(A \mid B_i)}{P(B_1) P(A \mid B_1) + P(B_2) P(A \mid B_2) + \dots + P(B_k) P(A \mid B_k)}$
IV. Chapter 4: Random Variables and Probability Distributions
  a. 4.1 Two Types of Random Variables
    i. Random variables that can assume a countable number (finite or infinite) of values are called discrete; the values are countable if you can list the values of the random variable x, even if the list is never ending
      1. More Examples of Discrete Random Variables
        a. The number of sales made by a salesperson in a given week: x = 0, 1, 2, …
        b. The number of consumers in a sample of 500 who …
    ii. Random variables that can assume values corresponding to any of the points contained in one or more intervals (i.e., values that are infinite and uncountable) are called continuous
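
Returning to Bayes's Rule from section 3.7, the formula can be sketched as a short Python function. This is a minimal sketch under stated assumptions: the function name, the use of exact `Fraction` values, and the two-event example numbers are all hypothetical and chosen only so the arithmetic can be checked by hand.

```python
from fractions import Fraction


def bayes_posteriors(priors, likelihoods):
    """Return [P(B_i | A)] given priors [P(B_i)] and likelihoods [P(A | B_i)].

    The B_i are assumed to be mutually exclusive and exhaustive, so the
    priors must sum to exactly 1 (exact Fractions are assumed here).
    """
    if len(priors) != len(likelihoods):
        raise ValueError("need one likelihood per prior")
    if sum(priors) != 1:
        raise ValueError("priors must sum to 1")
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(B_i) P(A | B_i)
    p_a = sum(joint)                                      # denominator: P(A)
    if p_a == 0:
        raise ValueError("P(A) must be nonzero")
    return [j / p_a for j in joint]


# Hypothetical numbers: P(B_1) = 3/10, P(B_2) = 7/10,
# P(A | B_1) = 1/10, P(A | B_2) = 1/5
posteriors = bayes_posteriors(
    [Fraction(3, 10), Fraction(7, 10)],
    [Fraction(1, 10), Fraction(1, 5)],
)
```

With these inputs the denominator is $P(A) = 3/100 + 14/100 = 17/100$, so the posteriors are 3/17 and 14/17, which sum to 1 as they must for exhaustive events.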
