GEM 2901 Complete Notes
This 24-page bundle was uploaded by Shervin on Thursday, August 6, 2015. It belongs to GEM2901 (Reporting Statistics in the Media) at the National University of Singapore, taught in Spring 2015.
Date Created: 08/06/15
GEM2901 Reporting Statistics in the Media, AY 14/15 Semester 2 (Shervin Lim)

Chapter 1: Benefits and Risks of Using Statistics

Statistics

1. Definition
   - Collection of procedures and principles for gaining and analyzing information
   - Helps people make decisions when faced with uncertainty
   - Identifies patterns and relationships
2. Terminology
   - Sample: the participants studied
   - Population: the larger group from which the sample was chosen
3. Components of statistical studies
   i. Representative sample
      - Needed in order to generalize results to a population
   ii. Sizeable sample
      - Required size depends on the variability of responses
      - A larger sample helps to determine variances
   iii. Observational study or randomized experiment
4. Differences
   - Observational study
     - Merely observes the sample
     - Causation cannot be established, only relationships
   - Randomized experiment
     - Randomly assigns participants to various treatment groups
     - Includes a control group where variables are unchanged
5. Improper use of statistics
   - Representation: people who feel most strongly about an issue are more likely to respond
   - Inappropriate measurement units used

Chapter 2: Reading the News

Seven Critical Components (FRIESDE)

1. Funding and source of research
   - By government or private companies, to make wise policy decisions
   - By universities or institutes, to ask and answer questions
   - By companies, to convince consumers that their goods or services are better
   - Funding may influence the direction of research and the publishing of results
   - An independent research group is not necessarily an unbiased research group
   - Motives should be minimized
2. Researchers in contact
   - Exact particulars of the people involved in data collection
   - Blinding of interviewer and/or respondent is preferred
   - People can be influenced by the data collector
3. Individuals studied and selection process
   - Results extend only to those similar to the sample
   - Understand how participants were enlisted
   - Responses from volunteers can differ from those of non-volunteers
   - Volunteers often provide biased responses; self-reporting may not be accurate
   - Incentives may influence respondents
4. Exact nature of measurements made or questions asked
   - Some variables are difficult to measure
   - Exact definitions and wording must be known
   - Wording and ordering can influence responses
5. Setting in which measurements were taken
   - Where, when and how respondents were contacted or data collected
   - Affects who is in the sample and the proportion who respond
6. Differences compared, in addition to the factor of interest
   - If two or more groups are compared, consider other differences that might influence the results
7. Extent or size of any claimed effects or differences
   - Size of effect must be explicit
   - Practical importance should be assessed

Chapter 3: Measurement, Mistakes, Misunderstandings

Measurement

1. Key points
   - Concerned with the method of collection and the variable measured
     - Wording influences answers and the data gathered; the question of interest is important
     - Point of measurement is important; different points will give varying results
     - Some concepts are hard to define, e.g. stress and happiness
2. Pitfalls when asking questions
   - Deliberate bias to support a particular cause
     - e.g. "Do you agree that something should be illegal?"
   - Unintentional bias from wording that can be misinterpreted
     - e.g. "Do you use drugs?"
   - Respondents' desire to please
     - Understating undesirable habits
   - Asking the uninformed
     - People who have no clue about a topic
   - Unnecessary complexity
     - Multiple conditions within a single question
   - Ordering of questions
     - An earlier question supplies a factor before a later question asks about potential factors
   - Confidentiality and anonymity
     - Affect how people answer and how truthful their responses are
3. Open or closed questions
   - Open questions allow own-word responses
     - Too many options to choose from
     - Responses may not be related to the question
   - Closed questions provide options, sometimes including an 'other'
     - Options are not chosen by participants
     - Options are not standardized
4. Defining a common language
   - Variables
     - Categorical variables can be categorized
       - Variables that can be ordered are ordinal
       - Variables that cannot be ordered are nominal
     - Measurement variables can be measured
       - Numerical variables are quantitative
       - Can distinguish between interval (differences are meaningful) and ratio (has a true zero)
     - Continuous variables cannot be counted; they can take any value within a range
     - Discrete variables can be counted
   - Concerns
     - Validity: measures what it claims to measure, e.g. income cannot be used to measure happiness
     - Reliability: gives the same results time after time and can be repeated, e.g. precise instruments of measurement
     - Bias: systematically off the mark in the same direction, e.g. a clock that is fast
   - Variability
     - Variability: how two or more measurements differ from one another
     - Measurement error: amount by which each measurement varies from the true value, due to imprecise measurements
     - Natural variability: changes across time in the individual or system measured, and across individuals at any given time
     - The more variability, the harder it is to detect differences
   - Reducing and controlling natural variability and systematic bias
     - Random assignment to treatments
       - Reduces systematic biases due to confounding variables
       - Confounding variables may exist between treatment groups
     - Matched pairs, repeated measures and blocks
       - Reduce known sources of natural variability in the response variable
       - Differences due to the explanatory variable may be detected more easily

Chapters 4 and 5: Sampling, Experiments and Observational Studies

Data

1. Definition
   - Collection of numbers or other information to which meaning has not been attached
   - Meaning depends on how well the information was obtained and summarized
   - Can be numeric or non-numeric
2. Origin
   - Academic conferences or university media offices
   - Published journals or government and agency reports

Common Research Strategies

1. Sample surveys
   - A subgroup of a large population is questioned on a set of topics
   - No intervention; participants are simply required to answer
   - Faster to collect than a census if the population is large
   - Resources can be channelled into accuracy
   - Measurements do not destroy the unit being tested
2. Terminology
   - Unit: the single individual or object to be measured
     - Experimental units: the smallest basic objects to which different treatments are assigned in a randomized experiment
     - Observational units: the objects or people measured in any study
     - People are called participants or subjects; often passive or recruited volunteers
   - Population: the entire collection of units
   - Sample: the collection of units we measure
     - 1,500 adults is a sufficient sample for any population size
   - Sampling frame: the list of units from which the sample is chosen
   - Sample survey: measurements taken on a subset of units from the population
   - Census: a survey in which the entire population is measured
   - Margin of error: $1/\sqrt{n}$, where $n$ is the sample size
     - The sample proportion will differ from the population proportion by more than the margin of error less than 5% of the time
     - Reported as plus or minus the calculated percentage points
3. Randomized experiments
   - Measure the effect of manipulating the environment in some way
   - Randomized experiment: the manipulation is assigned on a random basis
   - Explanatory variable
     - The feature being manipulated
     - One that may explain or cause differences in the response (outcome) variable
   - Response variable: the outcome of interest
   - Treatment
     - One or a combination of categories of the explanatory variable
     - Assigned by the experimenter
   - Randomization
     - Makes groups approximately equal, with the exception of the explanatory variable
     - Create differences in the explanatory variable and examine the response variable
   - Designing a good experiment
     - Randomization
       - Random type of treatment
         - Each participant receives one treatment
         - Treatments are randomly assigned to experimental units
         - Protects against hidden or unknown bias
       - Random order of treatment
         - In some experiments all treatments are applied to each unit
         - Randomization determines the order in which treatments are applied
     - Control groups
       - Handled identically in all aspects
       - Do not receive the actual treatment
     - Placebos
       - People respond to placebos
       - Objects without the actual effect, merely a semblance of it
       - Patients are not informed of the placebo
     - Blinding
       - Double-blind: neither participant nor researcher knows the treatment received
       - Single-blind: either the participant or the researcher, but not both, knows the treatment assigned
     - Matched pairs
       - Two matched individuals, or the same individual receiving both treatments
       - Randomization assigns the order of treatment
     - Blocks
       - Extension of matched pairs to three or more treatments
     - Repeated measures
       - Block designs in which the same participants are measured repeatedly
4. Observational studies
   - Natural manipulation
     - Observe differences in the explanatory variable
     - Notice whether these are related to differences in the response variable
   - Cannot establish causation, only a relationship
   - Case-control studies attempt to include appropriate control groups
   - Results may extend more readily than those of experiments
   - Reasons for usage
     - It is unethical or impossible to assign people to receive or not receive a treatment
     - Certain explanatory variables are inherent traits and cannot be randomly assigned
   - Presence of confounding variables
     - Related to the explanatory variable and affect the response variable
     - The effect of a confounding variable cannot be separated from the effect of the explanatory variable, making it a bigger problem in observational studies
5. Designing a good observational study
   - Types
     - Case-control studies
       - Cases with a particular attribute or condition
       - Compared with 'controls'
       - Advantages: efficiency in terms of time and money; inclusion of sufficient cases; reduces potential confounding variables
     - Retrospective or prospective studies
       - Retrospective: participants are asked to recall past events
       - Prospective: participants are followed into the future and events are recorded; better, because recall can be inaccurate
   - Interactions between variables
     - Occur when the effect of one explanatory variable depends on what is happening to another explanatory variable
     - e.g. smoking and exercise during pregnancy and the child's IQ
   - Effect modifiers in health-related studies
     - The magnitude of the relationship differs for subgroups, e.g. gender
     - The subgroup variable becomes an effect modifier
     - It modifies the effect of the explanatory variable on the outcome
6. Meta-analysis
   - Quantitative review of a collection of studies on a similar topic
   - Combination can result in the emergence of patterns or effects
7. Random sample vs random assignment
   - Random sampling: a random sample can be impractical to obtain
     - How far results extend depends on representativeness
   - Random assignment evens out confounding variables
     - Prevents natural confounding variables from creating an apparent relationship

Types of Sampling

1. Simple random sampling
   - Every conceivable group of people of the required size has an equal chance of selection
   - Requirement
     - A complete list of units in the population, and a source of random numbers
   - Procedure
     - Obtain the list of units and number them
     - Obtain the required number of units using a random number generator
     - Locate and interview the people whose numbers were selected
2. Stratified random sampling
   - Divide the population into groups (strata)
   - Conduct simple random sampling within each stratum
   - Advantages
     - Individual estimates are provided for each stratum
     - More accurate estimates of population values if values within each stratum are more consistent
     - Cheaper to sample separately if strata are separated
     - Different interviewers may be used
3. Cluster sampling
   - Divide the population into clusters, e.g. college dormitories and floors
   - Select a cluster at random and measure only the units in that cluster
   - Advantages
     - Needs only a list of clusters, not a list of all individual units
4. Systematic sampling
   - Divide the population into as many consecutive segments as needed
   - Randomly choose a starting point within the segment
   - Sample the same point in each segment
5. Random digit dialling
   - Approximates a simple random sample of all households with telephones
6. Multistage sampling
   - Combination of methods used at different stages
   - Begin with stratification, followed by clusters, and systematic sampling within clusters

Difficulties in Sampling

1. Difficulties
   - A wrong sampling frame may include unwanted units or exclude desired units
   - Not reaching the individuals selected, even if a proper sample is selected
   - Low response rate: not everyone can be contacted, and not everyone who is contacted responds
2. Disasters
   - A volunteer or self-selected sample is a waste of time
   - A convenience or haphazard sample can produce misleading results
3. Difficulties and disasters of experiments
   - Confounding variables
     - Variables connected with the explanatory variable can distort results
     - They may be responsible for changes in the response
       - Treated with randomization, to spread the effects of confounding variables
   - Interacting variables
     - A second variable interacts with the explanatory variable
     - Results are reported without noticing it
       - Measure and report variables that may interact
   - Placebo, Hawthorne and experimenter effects
     - The placebo effect is the power of suggestion
     - The Hawthorne effect is the consciousness of an experiment going on
     - The experimenter effect refers to the researcher's cognitive bias
       - Treated with double blinding and a control group
       - Data entered automatically into a computer when collected
   - Ecological validity
     - Variables are measured in a lab or artificial setting
     - Results do not accurately reflect the impact in the real world
     - Results may not extend to a larger group
   - Generalizability
     - Design experiments to be performed in a natural setting
     - Take a random sample from the population of interest
     - Measure other variables to see if they are related to the response or explanatory variables
4. Difficulties and disasters of observational studies
   - Confounding variables and implication of causation
     - No way to establish causation since there is no randomization
       - Measure potential confounding variables
       - Choose controls as similar as possible to cases
   - Inappropriate extension of results
     - Convenience samples are not representative
       - Use the entire segment of the population of interest
   - Past as a source of data
     - Unreliable; confounding variables may differ
       - Use prospective studies if possible
       - Prefer authoritative sources over memory

Chapter 6: Getting the Big Picture

Seven Steps

1. Determine whether the research was a sample survey, an experiment, an observational study, a combination, or based on anecdotes.
2. Consider the Seven Critical Components to become familiar with the details of the research.
3. Review the difficulties and disasters inherent in the research type and determine if any apply.
4. Determine if the information is complete. Find the original source or contact the authors for missing information if necessary.
5. Determine if the results make sense in the larger scheme.
6. Decide if there is an alternative explanation for the results.
7. Determine if the results are meaningful enough to encourage a change in lifestyle, attitudes or beliefs.

Chapter 7: Summarizing and Displaying Measurement Data

Turning Data Into Information

1. Key points
   - Centre
   - Unusual values (outliers)
     - Values far removed from the rest of the data
   - Variability
     - Spread of the values
     - Range: maximum minus minimum
     - Interquartile range
     - Standard deviation
   - Shape
     - Pictures of data express the distribution of values
   - Numbers used
     - Lowest refers to the minimum
     - Highest refers to the maximum
     - Quartiles are the medians of the two halves
2. Mean, median and mode
   - Mean is the numerical average
   - Median is the middle number, or the average of the two middlemost numbers
   - Mode is the most common value
3. Visualizing data
   - Stemplots
     - Easy way to order numbers and get a picture of shape
   - Histograms
     - Better for large data sets; also give a picture of shape

   [Figure: example stemplot of ages]
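The summaries listed in points 1 and 2 above can be sketched in a few lines of Python. This is only an illustration: the `ages` list is made-up data, `five_number_summary` is a hypothetical helper name, and the choice to leave the middle value out of both halves when the count is odd is an assumption (the notes only say that quartiles are medians of the two halves).

```python
from statistics import mean, median, mode

def five_number_summary(values):
    """Return (lowest, lower quartile, median, upper quartile, highest)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # Assumption: with an odd count, the middle value belongs to neither half.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

ages = [32, 35, 35, 38, 41, 44, 47, 52, 60]  # made-up data for illustration

summary = five_number_summary(ages)
lowest, lower_q, mid, upper_q, highest = summary

data_range = highest - lowest   # range = maximum minus minimum
iqr = upper_q - lower_q         # interquartile range

print("mean:", mean(ages), "median:", median(ages), "mode:", mode(ages))
print("five-number summary:", summary)
print("range:", data_range, "IQR:", iqr)
```

Other conventions for quartiles exist (statistical packages often interpolate), so quartiles computed elsewhere may differ slightly from this sketch.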
4. Stemplots
   i. Create the stem
      - Divide the range of data into equal units used on the stem
      - Use equally spaced intervals
   ii. Attach the leaves
      - Each leaf represents one data point
      - Ordering the leaves on each branch is optional
   iii. Split stems
      - Reuse digits two or five times, i.e. leaves 0-4 and 5-9, or 0-1, 2-3, 4-5, 6-7 and 8-9
   iv. Obtain information
      - Determine shape
      - Identify outliers
      - Locate centre

   [Figure: example stemplots of pulse rates, oldest ages and median incomes, annotated with shape, centre and outliers]

5. Histogram
   - Divide the range of data into intervals
   - Count the number of values in each interval
   - Draw a bar over each interval with height in proportion to the count
6. Defining common language about shape
   - Symmetric
     - Mirrored on both sides, e.g. bell-shaped
   - Unimodal
     - Single prominent peak
   - Bimodal
     - Two prominent peaks
   - Skewed to the right
     - Higher values more spread out than lower values
   - Skewed to the left
     - Lower values more spread out and higher ones more clumped
7. Five-number summary

   [Figure: the five-number summary display, showing the median, the lower and upper quartiles, and the lowest and highest values]

8. Boxplots
   - Steps to drawing a boxplot
     i. Draw a horizontal or vertical line and label it with values
        - Lowest to highest in the data
     ii. Draw a rectangle (box) with ends at the quartiles
     iii. Draw a line in the box at the value of the median
     iv. Compute the interquartile range (IQR), the distance between the quartiles
     v. Compute 1.5 × IQR; outliers are values more than this distance beyond either end of the box
     vi. Draw lines (whiskers)
        - From each end of the box to the farthest data value that is not an outlier
        - If there are no outliers, the whiskers run to the minimum and the maximum
     vii. Draw an asterisk to indicate each outlier
   - Visual picture of the five-number summary
   - Interpreting boxplots
     - Data is divided into fourths
     - Outliers are easily identified
     - Useful for comparison
9. Traditional measures
   - Useful for symmetric sets of data with no outliers
   - Mean
     - Centre of the data
   - Variance
     - Standard deviation squared
   - Standard deviation
     - Spread or variability of the values
     - Roughly the average distance of the observed values from their mean
   - Computing the standard deviation
     - Find the mean
     - Find the deviation of each value from the mean: deviation = value − mean
     - Square the deviations and sum the results
     - Divide the sum by the number of values minus 1 ($n - 1$), giving the variance
     - The square root of the variance is the standard deviation
   - Difference between average and normal

Chapter 10: Relationships between Measurement Variables (Chapters 8 and 9 not included)

Statistical Relationships

[Figure: scatterplot with grade point average on one axis]

1. Definition
   - Correlation
     - Strength of a certain type of relationship between two measurement variables
   - Regression
     - Numerical method for trying to predict one measurement variable from another
   - Statistical relationship
     - Natural variability exists in both measurements
     - Useful for describing what happens to a population or aggregate
   - Deterministic relationship
     - If we know the value of one variable, we can determine the other's value exactly
     - e.g. volume and water
   - Statistical significance and strength
     - A relationship is statistically significant
       - If the chances of observing the relationship in the sample when nothing is actually going on in the population are less than 5%
       - If the relationship is stronger than 95% of the relationships we would expect to see by chance
     - Warning
       - A minor relationship can be statistically significant if the sample is large
       - Strong relationships may not be significant if the sample is very small
2. Correlation
   - Pearson product-moment correlation, or correlation coefficient
     - Represented by the letter $r$
     - Indicator of how closely the values fall to a straight line
     - Only measures linear relationships
       - How close the individual points in a scatter plot are to a straight line
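As a rough illustration of $r$, the sketch below first follows the standard-deviation steps from Chapter 7 and then builds the correlation from standardized values. The notes do not give a formula for $r$; the one used here, the sum of products of standardized values divided by $n - 1$, is the usual textbook definition and is an addition. The function names and the age data are hypothetical.

```python
from math import sqrt

def sample_std_dev(values):
    """Standard deviation following the steps listed in Chapter 7."""
    n = len(values)
    m = sum(values) / n                       # find the mean
    deviations = [v - m for v in values]      # deviation = value - mean
    total = sum(d ** 2 for d in deviations)   # square and sum
    variance = total / (n - 1)                # divide by n - 1
    return sqrt(variance)                     # square root of the variance

def pearson_r(xs, ys):
    """Correlation coefficient r (standard definition, not from the notes)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sd_x, sd_y = sample_std_dev(xs), sample_std_dev(ys)
    products = [((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

# Made-up ages for five couples, for illustration only.
husband_ages = [25, 31, 38, 47, 60]
wife_ages = [23, 30, 39, 44, 58]

print("r =", round(pearson_r(husband_ages, wife_ages), 3))
```

Points that lie exactly on a rising straight line give $r = 1$, and points exactly on a falling line give $r = -1$, which is a quick way to sanity-check the function.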
   - Features
     - A correlation of +1 indicates a perfect linear relationship between two variables
       - One increases with the other
       - All individuals fall on the same straight line
       - Deterministic linear relationship
       - Variables with positive correlation increase together
     - A correlation of −1 also indicates a perfect linear relationship, except that
       - As one increases, the other decreases
       - With negative correlation, one variable decreases as the other increases
     - A correlation of 0 could indicate no linear relationship, or that the best straight line through the data on a scatter plot is exactly horizontal
     - Correlations are unaffected if the units of measurement are changed
       - e.g. height in inches, feet or millimetres

   [Figure: scatterplot of husband's age against wife's age]

3. Regression
   - Find the straight line that comes as close as possible to the points in a scatter plot
     - The resulting line is called the regression line
   - The formula that describes the line is called the regression equation
   - The most common procedure produces the least squares regression line
   - Equation of the line: $y = a + bx$
     - $a$ = intercept, where the line crosses the vertical axis (when $x = 0$)
     - $b$ = slope, how much $y$ increases when $x$ increases by one unit
4. Extrapolation
   - Not useful for predicting values far outside the range where the original data fell
   - No guarantee that the relationship will continue beyond the range where we have data
     - Only suitable for a minor extrapolation beyond the range of the original data

Chapter 11: Relationships can be Deceiving

Illegitimate Correlations

1. Problems with correlations
   - Outliers can substantially inflate or deflate correlations
     - Outliers that are consistent with the trend will inflate the correlation
     - Outliers that are inconsistent can substantially decrease the correlation
     - 5% of all data points are corrupted when initially recorded or entered
   - Legitimate outliers and illegitimate correlation: be cautious
     - When presented with data in which outliers are highly likely
     - When correlations are presented for a small sample
   - Groups combined inappropriately may mask a relationship
     - See Simpson's Paradox
2. Causation
   - A legitimate correlation does not imply causation
     - Especially for observational studies; a third variable is possible
   - Confirmed through the use of randomized experiments
   - Or through non-statistical considerations
     - Reasonable explanations, varying conditions, a 'dose-response relationship'
3. Reasons for relationships between variables
   - The explanatory variable is the direct cause of the response variable
     - e.g. food consumed in the past hour and level of hunger
   - The response variable is causing a change in the explanatory variable
     - e.g. occupancy rate influencing advertising expenditure
   - The explanatory variable is a contributing but not sole cause of the response variable
     - e.g. a carcinogen contributing to cancer but not being the only cause
   - Confounding variables may exist
     - e.g. emotional support in relation to happiness and length of life
   - Both variables may result from a common cause
     - e.g. verbal SAT and GPA can be the result of the same cause
   - Both variables are changing over time
     - e.g. divorce rates and drug offences
   - The association may be a coincidence
     - e.g. new buildings and the rate of brain cancer

Chapter 12: Relationships between Categorical Variables

Displaying Relationships

1. Contingency tables
   - Row: explanatory variable
   - Column: response variable
   - Cell: combination of row and column
   - Count: number of individuals who fall into each combination of categories
   - Take into account conditional percentages and rates

                Heart Attack   No Heart Attack    Total    Heart Attacks (%)   Rate per 1,000
   Aspirin           104            10,933        11,037         0.94                9.4
   Placebo           189            10,845        11,034         1.71               17.1
   Total             293            21,778        22,071

Risk, Probability and Odds

1. Expressions of relationships
   - (Number with trait / total) × 100
     - Percentage, e.g. 40%
   - Number with trait / total
     - Proportion, e.g. 0.40
     - Probability, e.g. 0.40
     - Risk, e.g. 0.40
   - Number with trait / number without trait (to 1)
     - Odds, e.g. 4 to 6
2. Risk
   - Baseline risk
     - Risk without treatment or behaviour
     - Difficult to find
     - If a placebo group is included, the baseline risk is the risk for the placebo group
   - Relative risk
     - Ratio of the risks for the two categories of the explanatory variable
     - e.g. a relative risk of 3 means one group is 3 times more at risk than the other
   - Increased risk
     - (Change in risk / baseline risk) × 100%, or (relative risk − 1.0) × 100%
3. Odds ratio
   - Ratio of the odds for one group to the odds for another group
   - Often reported after adjustment to account for confounding variables
4. Misleading statistics
   - Common misrepresentations of risk
     - Missing baseline risk
       - e.g. 'three times more likely' does not explain much: a rise from 1 in 100,000 to 3 in 100,000 is very different from a rise from 1 in 10 to 3 in 10
     - Time period of risk not identified
       - e.g. lifetime risk or specific age groups; different population size as well
     - Reported risk does not necessarily apply to everyone
       - e.g. confounding factors such as environment and individual traits
5. Simpson's Paradox
   - Two or more groups
   - The relationship appears to be in different directions
     - Depending on whether a third variable is considered
     - Variables may be strongly correlated or not correlated
   - Dangerous to summarize information
   - e.g. severity of cases prior to treatment, racial biases in retrenchment

   [Tables: survival rates for standard and new treatments at Hospital A and Hospital B; risk of dying compared for standard and new treatments; estimating the overall reduction in risk]

Chapter 14: Long-Term Expectations

Probability

1. Two interpretations
   - Relative frequency
     - e.g. odds of winning the lottery
   - Personal probability
     - e.g. buying a home
2. Relative-frequency interpretation
   - Applies to situations that can be repeated
     - e.g. buying a lottery ticket every week
   - Long-run relative frequency
     - Proportion of time the outcome occurs over the long run
   - Summary
     - The situation can be repeated numerous times and the outcome observed each time
     - The relative frequency should settle down to a constant value over time, which becomes the probability
     - Does not apply to situations where outcomes can be influenced, past or present
     - Cannot be used on a single occasion; used to predict the long-term proportion
   - Determining the probability of an outcome
     - Make an assumption about the physical world
       - e.g. coin flipping
     - Observe the relative frequency
       - e.g. male births over a year as a proportion of all births in the same year

   [Table: try on which the outcome first happens, with its probability]

3. Personal-probability interpretation
   - Degree to which a given individual believes an event will happen
     - Between 0 and 1, and must be coherent
     - e.g. finding a parking space downtown on Saturday
   - Must still follow the rules of probability
4. Rules of probability
   - The probabilities of all possible outcomes must add to 1
   - Outcomes that cannot happen simultaneously are mutually exclusive
     - The total probability of either outcome is the sum of their individual probabilities
   - Events that do not influence one another are independent
     - The probability of two independent events both happening is their probabilities multiplied
   - The probability of a subset event cannot be higher than the probability of the event
   - If the probability of an outcome is $p$, the probability of it not happening is $1 - p$
   - Accumulated probability
     - The first occurrence not having happened by occasion $n$: $(1 - p)^n$
     - The first occurrence having happened by occasion $n$: $1 - (1 - p)^n$

   Notation: denote events or outcomes with capital letters $A$, $B$, and so on. If $A$ is one outcome, all other possible outcomes are part of the complement of $A$, written $A^C$. $P(A)$ is the probability that the event or outcome $A$ occurs. For any event $A$, $0 \le P(A) \le 1$.
   - Rule 1: $P(A) + P(A^C) = 1$
   - Rule 2: if $A$ and $B$ are mutually exclusive, then $P(A \text{ or } B) = P(A) + P(B)$
   - Rule 3: if $A$ and $B$ are independent, then $P(A \text{ and } B) = P(A)\,P(B)$
   - Rule 4: if the ways in which an event $B$ can occur are a subset of those for an event $A$, then $P(B) \le P(A)$

5. Long-term gains, losses and expectations
   - Long-term outcomes can be predicted
   - Expected value
     - If taken over a large group of individuals, it can be interpreted as the mean value
     - $EV = A_1 p_1 + A_2 p_2 + A_3 p_3 + \cdots + A_k p_k$, where $A_1, A_2, A_3, \ldots, A_k$ are the possible amounts and $p_1, p_2, p_3, \ldots, p_k$ are the associated probabilities

Chapters 16 and 17: Psychological Influences, Intuition and Relative Frequency

Personal Probability

1. Equivalent probabilities, different decisions
   - Certainty effect
     - People are more willing to pay to reduce a risk to zero than to simply reduce the risk
     - Even if the reduction amount is the same
   - Pseudocertainty effect
     - People are more willing to accept a complete reduction of risk on certain problems and no reduction on others
     - They are less willing to accept a reduced risk on all problems
2. Distortion
   - Availability heuristic
     - The probability of an event is judged by the ease with which it is brought to mind
       - e.g. homicide receiving more attention in the media than diabetes
   - Distorted personal probability
     - Detailed imagination
       - Risk is distorted by one's vivid imagination
       - Can amplify or diminish the actual risk
   - Anchoring
     - A reference point, or anchor, is provided from which people adjust
     - Most tend to stay close to the anchor
       - e.g. a chance and the percentage point provided
       - Low anchor < no anchor < high anchor
   - Representativeness heuristic
     - People assign higher probabilities than warranted
     - To scenarios that are representative of how we imagine things to happen
   - Conjunction fallacy
     - Detailed scenarios are given higher probability assessments
     - Less attention is paid to statements of one of the simple events alone
     - e.g. 'Linda is a bank teller' versus 'Linda is a bank teller and an active feminist'
       - The second is perceived to have a higher probability due to its detail
       - The first is more probable because it encompasses the second
   - Forgotten base rates
     - The representativeness heuristic leads people to ignore likelihood
       - It does not always correspond to the actual base rate
3. Optimism, reluctance to change and overconfidence
   - Optimism is the belief that one is probably better than average
   - Reluctance to change is the reluctance to change one's assessment
     - In spite of new evidence
   - Overconfidence
     - Tendency to place too much confidence in one's personal assessment
     - Overestimating the probability of being correct
4. Calibrating personal probabilities
   - Professionals often use personal probabilities to help others make decisions
   - Using relative frequency to check personal probabilities
     - Can assess whether probabilities are well calibrated only if there are sufficient repetitions
5. Tips to improve
   - Think of the big picture
   - Find out the baseline risk
   - Do not be fooled by highly detailed scenarios
   - List reasons that one might be wrong
   - Try to be realistic
   - Watch out for the anchoring effect
   - Break events into pieces

Revisiting Relative Frequency

1. Coincidences
   - Surprising concurrence of events, perceived as meaningfully related
   - No apparent causal connection
     - e.g. winning the lottery twice
   - Not improbable that the same circumstances can happen elsewhere
   - Birthday problem
     - Probability that none of 23 people share a birthday: $\frac{365}{365} \cdot \frac{364}{365} \cdot \frac{363}{365} \cdots \frac{343}{365} = 0.493$
     - Probability that at least 2 people share a birthday: $1 - 0.493 = 0.507$
   - Coincidences seem improbable if we ask for the probability of a specific event occurring at that time to us
2. Gambler's fallacy
   - Long-run frequency does not necessarily apply in the short run
   - Independent events have no memory
     - Does not apply to events with memory
3. Confusion of the inverse
   - Confusing two different probabilities
     - e.g. the probability of a positive test result given the disease, and the probability of having the disease given a positive result
   - Probability of false positives
     - If the base rate is low and the test is less than perfect, a positive result is highly likely to be a false positive
     - Requires knowledge of the base rate, and of the sensitivity and specificity of the test
     - Sensitivity refers to correct positive predictions
     - Specificity refers to accurate negative predictions

   If you were faced with the following alternatives, which would you choose? Note that you can choose either A or B, and either C or D.
   - A: A gift of $240, guaranteed
   - B: A 25% chance to win $1,000 and a 75% chance of getting nothing
   - C: A sure loss of $740
   - D: A 75% chance to lose $1,000 and a 25% chance to lose nothing

4. Expected values in decision-making
   - The expected value in B is higher, but A is preferred
   - The expected loss in D is higher, but D is preferred
   - A sure gain is valued, but people are willing to take a risk to prevent a loss
   - Dollar amounts are important in decision making
     - A sure loss of $5 is easier to deal with than a chance of losing $5,000
     - A chance of winning $5,000 is preferred to a sure gain of $5

   If you were faced with the following alternatives, which would you choose? Note that you can choose either A or B, and either C or D.
   - Alternative A: A 1 in 1,000 chance of winning $5,000
   - Alternative B: A sure gain of $5
   - Alternative C: A 1 in 1,000 chance of losing $5,000
   - Alternative D: A sure loss of $5

For Those Who Like Formulas

Conditional Probability

The conditional probability of event $A$, given knowledge that event $B$ happened, is denoted $P(A \mid B)$.

Bayes' Rule

Suppose $A_1$ and $A_2$ are complementary events with known probabilities. In other words, they are mutually exclusive and their probabilities sum to 1. For example, they might represent the presence and absence of a disease in a randomly chosen individual. Suppose $B$ is another event such that the conditional probabilities $P(B \mid A_1)$ and $P(B \mid A_2)$ are both known. For example, $B$ might be the event of testing positive for the disease. We do not need to know $P(B)$. Then Bayes' Rule determines the conditional probability in the other direction:

$$P(A_1 \mid B) = \frac{P(A_1)\,P(B \mid A_1)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)}$$

For example, Bayes' Rule can be used to determine the probability of having a disease given that the test is positive. The base rate, sensitivity and specificity would all need to be known. Bayes' Rule is easily extended to more than two mutually exclusive events, as long as the probability of each one is known and the probability of $B$ conditional on each one is known.

Copyright Brooks/Cole, a division of Thomson Learning, Inc.
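To make the disease-testing use of Bayes' Rule concrete, here is a minimal sketch. The function name and the numbers (a 1% base rate with 90% sensitivity and 90% specificity) are hypothetical and chosen only for illustration. It assumes $A_1$ is "has the disease", $A_2$ is "does not have the disease", and $B$ is "tests positive", so that $P(B \mid A_1)$ is the sensitivity and $P(B \mid A_2)$ is one minus the specificity.

```python
def probability_of_disease_given_positive(base_rate, sensitivity, specificity):
    """Bayes' Rule for two complementary events A1 (disease) and A2 (no disease)."""
    p_a1 = base_rate             # P(A1): has the disease
    p_a2 = 1 - base_rate         # P(A2): does not have the disease
    p_b_given_a1 = sensitivity   # P(B | A1): positive test when diseased
    p_b_given_a2 = 1 - specificity  # P(B | A2): false positive rate
    numerator = p_a1 * p_b_given_a1
    denominator = numerator + p_a2 * p_b_given_a2
    return numerator / denominator

# Hypothetical test: 1% base rate, 90% sensitivity, 90% specificity.
p = probability_of_disease_given_positive(0.01, 0.90, 0.90)
print(f"P(disease | positive test) = {p:.3f}")
```

With these made-up inputs, only about 1 positive result in 12 belongs to someone who has the disease, which illustrates the point under Confusion of the Inverse: with a low base rate and an imperfect test, most positives are false positives.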