All of Module One Notes
This 6 page Bundle was uploaded by Maddi Caudill on Wednesday February 3, 2016. The Bundle belongs to STA 210 at University of Kentucky taught by Dr. William S. Rayens in Spring 2016. Since its upload, it has received 30 views.
Date Created: 02/03/16
Basic Numeracy

Inference
• The process of reasoning from the known to the unknown
• E.g., a jury considering evidence and inferring about the guilt of a defendant
• E.g., a politician looking at the poll results and deciding whether she has a chance in the fall election

Human Inference
• An offhand phrase taken here to mean "inferences we make from statistical constructs like charts, graphs, numerical summaries."
• Think of it as how you operationalize statistical descriptions
• "Reflexive inferences"

Not Always Easy
• Lots of things can compromise seemingly simple inferences
• Obstacles range from basic math to subtle reasoning
• Goal is to learn to recognize and avoid some of the more common pitfalls

Decimal Dilemmas
• Did Plato inadvertently create the Atlantis myth?
• Claim made that over 4 million women in the U.S. are battered to death each year by a spouse or boyfriend
• Several Verizon customers were quoted data rates like ".015 cents per kilobyte"

One-sentence reflection: basic numeracy is a critical step toward the goal of correctly forming human inferences from statistical constructs

Computations and Benchmarks
• Be alert and be aware
• Be ready and willing to question what you read and hear
• Be comfortable with fractions and percentages
• Be well versed in some basic "benchmarks"

Benchmarks:
• U.S. population is just over 300 million
• Each year about 4 million babies are born in the U.S.
• About 2.4 million Americans die each year
• Roughly 1 in 4 die of heart disease
• Roughly 1 in 4 die of cancer
• About 35,000 die in traffic accidents
• About 17,000 deaths are homicides
• About 16,000 deaths are from AIDS
• There are about 40 million black Americans
• About 16% of Americans identify as Latino

Useful? How?
• Claim has been made that over 4 million women in the U.S. are battered to death each year by a spouse or boyfriend
• Is this possible?
• Only about 2.4 million die per year altogether
• So, not possible

Reflection: competence with fractions and percentages, and a knowledge of common benchmarks, go a long way toward the goal of correctly forming human inferences from statistical constructs

Confounding and Language of Experiments

Experiments
• Collecting data under controlled conditions with the goal of establishing something close to cause and effect
• Emphasis on control separates this way of collecting data from a survey
• Statistical science gets involved in a couple of ways, but notably through hypothesis testing

Confounding
• This purposeful control produces some of the purest data one can collect
• Confounding can compromise inferences about cause and effect even with data that are so carefully collected
• In the vernacular: bewilderment or confusion
• In statistics: confusion caused by a third variable distorting the association being studied between two other variables
• Practical upshot: compromises the case for cause and effect. Can't tell what is what

Example
• Does acupuncture help with pain?
• Why not do this?
• Ask patients to rate their pain before treatment
• Ask them to rate pain again after treatment
• How will you know that they don't genuinely feel better just because they are being treated?
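The worry in the acupuncture example can be made concrete with a small simulation. This is only a sketch under made-up assumptions: pain is rated on a roughly 0-10 scale, every treated patient reports about 1.5 points less pain just from being treated (the placebo effect), and the treatment itself does nothing. The function name `simulate` and all of the numbers are hypothetical, not from the course.

```python
import random
from statistics import mean


def simulate(n_per_group=200, placebo_drop=1.5, treatment_drop=0.0, seed=0):
    """Return the average drop in pain rating for each group.

    All numbers are invented for illustration. `placebo_drop` is the
    improvement from simply being treated; `treatment_drop` is the extra
    improvement caused by real acupuncture (0.0 means it does nothing).
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    results = {}
    for group, extra in (("acupuncture", treatment_drop), ("sham", 0.0)):
        # Pain before treatment: centered at 6 with some person-to-person spread.
        before = [rng.gauss(6.0, 1.0) for _ in range(n_per_group)]
        # Pain after: placebo effect, plus any real effect, plus random noise.
        after = [b - placebo_drop - extra + rng.gauss(0.0, 1.0) for b in before]
        results[group] = mean(before) - mean(after)
    return results


drops = simulate()

# Before/after design: only the acupuncture group is examined.
before_after_only = drops["acupuncture"]

# Controlled design: compare against a group that got an inert (sham) treatment.
controlled = drops["acupuncture"] - drops["sham"]

print(f"Before/after drop in pain:         {before_after_only:.2f}")
print(f"Drop beyond the sham (comparison): {controlled:.2f}")
```

Under these assumptions the before/after design reports a sizable drop in pain even though the treatment is inert, while the comparison against a sham group shows a difference near zero. Calling `simulate(treatment_drop=2.0)` instead gives a case where the comparison does pick up a real effect.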
There are many sources of confounding:
• Inadequate or improper comparison
• Lack of randomization

Language
Response variable: the primary variable you are taking measurements on for your experiment
Explanatory variable: what you are varying in your experiment (different treatments or treatment levels)
Subjects: who or what you are experimenting on
Lurking variable: another name for that third variable that can cause confounding
Placebo effect: real response from subjects to an inert treatment

Reflection: credible inferences from experimental data have to be free from confounding, which has two primary sources: lack of proper comparison and improper randomization

Experiments - Statistical Significance
• Even if confounding is not an issue…
• There is a larger challenge to inferences that are being made

Statistical significance: a difference between treatments that is sufficiently large that it is unlikely to have occurred by chance alone

Reflection: credible inferences from experimental data have to ultimately be held to a formal standard of statistical significance

Correlation & Scatterplots

Correlation:
• Tendency for certain values of one variable to be paired with certain other values of another variable
• "Correlation" and "association" are used interchangeably in the vernacular
• "Correlation" in statistics has a more particular meaning, as we will see

Example:
• Variables: height and weight
• Bigger values of height tend to be associated with bigger values of weight
• So height and weight are associated

Language of Association:
• Positive association: points have an upward trend from left to right
• Negative association: points have a downward trend from left to right
• Strength: how tightly points are clustered about some clear pattern (maybe a straight line)

Scatterplots:
• Visual way of assessing association, both direction and strength
• Makes sense as long as you have two variables you can display in a scatterplot

Reflection: simple scatterplots are an informal but useful visual means of addressing both the direction and strength of the relationship between two variables that are appropriate for this kind of plot

Correlation Coefficient
• Simply a numerical way of summarizing the association between two variables, provided those two can be represented in a scatterplot

Formula for "r"
• "r" measures the strength of the linear relationship between two variables "x" and "y"

Important to know…
• It is only appropriate to compute r if the scatterplot of y versus x exhibits a linear trend
• r will always be between -1 and 1
• r will be negative if the points in the scatterplot have a downward trend from left to right
• r will be positive if the points in the scatterplot have an upward trend from left to right
• The closer r is to 1 in absolute value, the tighter the cluster of points about the linear trend and the stronger the association between x and y
• If r is close to 0, then the association is weak

Reflection: the correlation coefficient is the most common numerical measure of the strength of a straight-line relationship between two variables that can be represented by a scatterplot

Correlation and Causation

Human Inference Point
• Two variables can be highly associated even when no causal link exists at all
• Human inferences from graphs and numbers that suggest association, or exhibit strong correlation, have to be made carefully

Correlation Does Not Imply Causation
• Need to be able to discern the strength of the association and the credibility of any implied causation
• Not unlike a discussion of confounding, you need to be on the lookout for a third variable responsible for the association you see between the original two variables

Reflection: association doesn't imply causation, because other variables may be influencing the relationship, but it could be evidence of causation
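As a rough illustration of the correlation coefficient described in these notes, the sketch below computes r by hand. It assumes the "Formula for r" heading refers to the usual Pearson formula (the sum of products of deviations from the means, divided by the square root of the product of the summed squared deviations); the notes themselves do not show the formula. The function name `pearson_r` and the height and weight values are invented for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired values xs and ys.

    Assumed formula (standard, but not shown in the notes):
        r = sum((x - mean_x) * (y - mean_y))
            / sqrt(sum((x - mean_x)**2) * sum((y - mean_y)**2))
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable never changes")
    return sxy / sqrt(sxx * syy)


# Made-up heights (inches) and weights (pounds) with an upward trend.
heights = [60, 62, 65, 68, 70, 72]
weights = [115, 120, 140, 155, 165, 180]

r = pearson_r(heights, weights)
print(f"r for height vs. weight: {r:.3f}")  # positive and close to 1

# Points on a perfectly straight downward line give r = -1 (up to rounding).
print(f"r for a perfect downward line: {pearson_r([1, 2, 3], [6, 4, 2]):.3f}")
```

Python 3.10 and later also provide `statistics.correlation`, which should give the same value for these inputs. As the notes say, the number is only meaningful when the scatterplot shows a linear trend, and a large r says nothing by itself about causation.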