STAT 113 Review: Units 1-7
This 15-page Study Guide was uploaded by Purdue 1 on Thursday, February 12, 2015. The Study Guide belongs to STAT 113 at Purdue University, taught by Ellen Gundlach in Winter 2015. Since its upload, it has received 276 views.
Date Created: 02/12/15
STAT 113: Statistics & Society
Ellen Gundlach

Unit 1: Sampling

Part 1: Population vs. Sample

Individuals: the objects described by a set of data.
Variable: any characteristic of an individual; it can take different values for different individuals.
Proportion: the fraction of a total that possesses a specific attribute.
Proportion = (number of successes) / (total sample size) = X / n
Example: 11 out of 50 children say chocolate is their favorite flavor of ice cream. Proportion = 11/50 = 0.22, or 22%.

Population: the entire group of individuals about which we want information.
Census: attempts to get information from every member of the population.
- Time consuming, expensive, and hard to do well.
- Examples: inventory, short version of the US Census.
Sample: a part of the population that we actually examine in order to gather information about the whole population.
- Cheaper, quicker, and easier than taking a census.
- Gives valid information if done well, especially if it is a random sample from the population.

Parameter: a number that is true for the whole population.
Statistic: a number that is true for the sample.

Possible problems with sample statistics
Bias: consistent, repeated deviation of the sample statistic from the population parameter in the same direction when we take many samples.
- Choosing a random sample will reduce bias.
Variability: how spread out the sampling distribution is for the statistic.
- Determined by the sampling design and the sample size n.
- Larger samples have smaller variability.
We want both small bias and small variability.

Types of Samples
Convenience sample (NOT RANDOM, NOT THE BEST)
- Selects whichever individuals are easiest to reach (shopping mall surveys).
Voluntary response sample (NOT RANDOM, NOT THE BEST; sometimes the only ethical choice for experiments using people)
- Consists of people who chose themselves by responding to a general appeal (call-in radio, web poll).
- Biased because people with strong opinions, especially negative ones, are most likely to respond.
Random sample (MUCH BETTER)
- Eliminates bias by allowing impersonal chance to do the choosing.
- Gives all individuals
an equal chance to be chosen.

Types of Random Sampling
Simple random sample of size n (SRS): start with a list of the whole population, then use a random method such as a random number table or software to select n of those individuals, with each individual having an equal chance of being chosen.
- Example: we could randomly select 120 people from a master list of all students taking STAT 113 this semester.
Stratified random sample: divide the individuals from the population into groups based on some characteristic (gender, year in school, major, or hometown, for example), then take a simple random sample within each of those groups and combine all those samples into one big sample.
- Example: we could sort the students by section and take an SRS of 10 students from each of the 20 recitation sections of STAT 113 to get a total sample size of 200.

Part 2: Sampling Problems and Surveys

Problems with samples
- Random sampling error (not a mistake)
- Undercoverage
- Response error/bias
- Nonresponse

Random sampling error: the deviation between the sample statistic and the population parameter caused by chance in selecting a random sample.
- Each time you take a random sample from the population, you will get a slightly different statistic simply due to random variability.
- Taking a larger sample will help reduce random sampling error, but only taking a census would get rid of it entirely.
- The margin of error in a confidence statement includes ONLY random sampling error.

Sampling errors
Undercoverage: occurs when some groups in the population are left out of the process of choosing the sample.
- Example: active duty
Sampling errors are caused by the act of taking a sample. They cause sample results to be different from the results of a census.

Nonsampling errors
Response error/bias: occurs when a subject gives an incorrect response (lying, remembering incorrectly, not understanding the question, etc.).
Nonresponse: the failure to obtain data from an individual selected for a sample.
- Usually happens because some subjects can't be contacted or
because those who are contacted refuse to cooperate.
Nonsampling errors are not related to the act of selecting a sample from the population. They can be present even in a census.

Unit 2: Experiments

Part 1: Data Collection, Features of Experiments

Ways to collect data
Anecdotal evidence: a story from one person or just a few people; not scientific.
- Examples: a Dateline NBC lead story, something that happened to your neighbor, often the 1st or last paragraph in a news story (to try to help you make a personal connection with the research).
Observational study: observes individuals and measures variables of interest but does not attempt to influence the responses.
- Stand back and watch.
- A survey is one type of observational study.
Experiment: deliberately imposes some treatment on individuals in order to observe their responses.
- Makes the individuals do something in particular.

Why are experiments better than observational studies?
Experiments can:
- Help minimize lurking variables. A lurking variable has an important effect on the relationship among the variables in a study but is not one of the explanatory variables studied.
- Possibly show a cause-and-effect relationship between the treatment and the response, under certain conditions.

Features of Experiments
Explanatory variable (independent variable): may explain or cause changes in the response variable.
Response variable (dependent variable): measures the outcome or result of a study.
Treatment: any specific experimental condition applied to the subjects. If an experiment has several explanatory variables, a treatment is a combination of specific values of the variables.

Simplest designed experiment: Individuals → Treatment → Results
Problems:
- Lurking variables: a variable we did not study that is still important.
- Placebo effect: where our brains trick our bodies into reacting to nothing.
- Bias by researchers.

3 principles of good experimental design
- Control group or comparison group: helps to avoid the placebo effect.
- Randomization to treatment groups: helps to reduce bias.
- Large sample size in each
treatment group: helps to reduce variability (balances out quirks in individuals).

Double-Blind Experiments
- Neither the researcher collecting the measurements nor the individuals being treated know which treatment each individual received.
- Someone else keeps track of matching up the treatments to the measurements after the data have been collected.
- Very best kind of experiment; reduces bias.

Placebo Effect vs. Nocebo Effect
Placebo effect: our expectation of feeling better can lead to real positive physiological change in our bodies.
Nocebo effect: our expectation of side effects can create negative physiological change in our bodies.

Part 2: Types of Experiments, Clinical Trials

Specific types of experiments we will use in STAT 113
- Completely randomized design
- Randomized block design
- Matched pairs design
All 3 types of experiments listed above use randomization.

Clinical trials
- Study the effectiveness of medical treatments on actual patients.
- Medical treatments can harm as well as heal.
- The interests of the individual are more important than the interests of society and science.
- Must use randomized comparative experiments.
- Require a balance between enough belief in the new treatment's potential to justify exposing half the subjects to it and sufficient doubt in the new treatment's efficacy to justify withholding it from the other half of the subjects, who might be assigned to placebos.

Why is it ethical to give a control group a placebo?
- Often placebos work, and they have no harmful side effects.
- It is possible that the placebo may be better than the treatment.

What can go wrong in clinical trials?
- Nonadherers: subjects who participate but do not follow the instructions.
- Refusals: subjects who do not agree to participate. Are these people systematically different from those who cooperate?
- Dropouts: subjects who begin the experiment but do not finish. Why? Because of the treatment, or for some other reason?

How a drug is approved
Creation: a drug company creates a product in its laboratory and conducts in-house tests with lab animals.
Application: the
company files an investigational new drug application with the FDA, asking permission to test the new drug on humans.
Phase 1 trials: healthy human testers receive the drug in small doses, then larger doses. Researchers look for side effects.
Phase 2 trials: a small number of patients with the ailment in question try various doses. Goal: finding a recommended dose that balances effectiveness and side effects.
Phase 3 trials: a large number of patients with the ailment take various doses. Researchers hone their data and capture side effects.
Review: FDA scientists review all the data (often enough to fill at least one 18-wheeler's load). This process can take 6 months or longer.
Decision: if the FDA gives its approval, the drug receives clearance to go on the market for general sales and prescriptions.
After approval, any drug can be pulled off the market for various reasons, including side effects that come to light only after the drug was put on the shelves and came into wide use.

Unit 3: Causation, Common Response, and Confounding

Causation: event x causes event y.
- Example: giving someone flowers causes them to express a true smile.
- Only a well-designed and controlled experiment can show causation, and even then it is difficult to show.
If you don't have causation, what else could be going on?

Common response: event z can cause both event x and event y.
- Example: researchers have found that gray-haired people die at a higher rate than people with other hair colors. Does this mean that gray hair causes death? Gray hair is x, higher death rate is y, and the lurking variable, age, is z.

Confounding: event x can cause event y, and event z can also cause y.
- Example: how much you study (x) helps you do better on the exam (y), but getting more sleep the night before the exam (z) can also help you do better.

Warnings about Causation
- A strong association alone is not enough. A carefully controlled experiment is needed to be able to show causation.
- Causation is usually not the whole story.
- Causal relations may not generalize to other settings.
- Real
life is messy.

How can you show causation?
- The association is strong.
- The association is consistent.
- Higher doses are associated with stronger responses.
- The alleged cause precedes the effect.
- The alleged cause is plausible.

Unit 4: Ethics of Experiments with Humans and Animals

Ethical Experiments with Humans
- Planned studies should be reviewed by a board to protect the subjects from harm.
- All subjects must give their informed consent before data are collected.
  Not required for behavioral studies in public spaces.
  Not required for educational research that is part of normal class activities.
- All individual data must be kept confidential. Only summaries can be made public.
- Anonymity is not the same as confidentiality. Anonymity is not required.
  Anonymity means even the researchers do not know who the participants are.
  Confidentiality means the researchers know who the participants are but can't share that information with the public.

Voluntary response vs. informed consent
- Voluntary response is a type of sampling that is not random and just uses a general call-out to get volunteers to participate.
- Informed consent should be used in EVERY type of sample. The researcher can choose the sample, but the people chosen for the sample still need to give informed consent. Just because you're chosen does not mean you have to participate.

Informed consent for kids (parents sign a different form)
- Tells the kids how long the study lasts and what they'll have to do (fill out a survey, play games, etc.).
- Reassures the kids that it is OK to skip questions or to stop participating altogether. No one will be mad at them for quitting.
- Warns the kids about any risks, even if it is just embarrassment.
- Lets the kids know if there are any potential benefits, even if it's just knowledge about how their bodies work, and that hopefully the results of the research will be helpful to others in the future.

Ethical Experiments with Animals
Replacement: use non-animal models such as microorganisms or cell culture techniques, computer simulations, or species lower on the
phylogenetic scale.
Reduction: reduce the number of animals needed by implementing careful experimental design.
Refinement: eliminate or reduce unnecessary pain and distress.

Unit 5: Evaluating Numbers and Claims

Part 1: Measurements and Numbers

Questions about the variables in any statistical study
- How exactly is the variable defined?
- Is the variable a valid way to describe the property it claims to measure?
- How accurate are the measurements?

Valid measurement: a measure of a property that is relevant or appropriate as a representation of that property.
- Example: you don't measure a dog's IQ by the number of fleas on his belly.

Measured value = true value + bias + random error
Bias: present in a measurement process if it systematically tends to overstate or understate the true value of the property it measures.
- Example: your clock is always 5 minutes faster than everybody else's clock.
Random error: present in a measurement process if repeated measurements on the same individual give different results.
- Example: each time you weigh your dog on the scale at the vet's office, you get slightly different results.

Reliable measurement: a measurement where the random error is small.
- No measuring process is perfectly reliable.
- Example: the digital scale in your chemistry lab gives a more reliable measurement of an object than your bathroom scale does.

Reliability and validity are not the same thing.
- Reliability has to do with how consistently the instrument measures.
- Validity has to do with whether that instrument was even the right choice to measure the variable you want to study.

Use averages to improve reliability. The more measurements you use for your average, the better.
"That was why statistics had to be invented, because people were so unstable and irrational taken one at a time." Raymond F. Jones (1915-1994)

Peer review is important for judging the value of experimental results.
- Peer review is a system used by scientists to decide which research results should be published in a scientific journal.
- Peer review subjects
scientific research papers to independent scrutiny by other qualified scientific experts (peers) before they are made public.
- Peer review can help you make sense of science stories, as it tells you that the research has passed the scrutiny of other scientists and is considered valid, significant, and original.
- Peer review means that statements made by scientists in scientific journals are critically different from other kinds of statements or claims, such as those made by politicians, newspaper columnists, or campaign groups. Science is therefore more than just another opinion.

Does the math make sense?
- A quantity can increase by any amount. A 100% increase means the original value has doubled.
- A quantity cannot decrease by more than 100%.
Percentage change = (amount of change / starting value) x 100
Example of percentage change: Bill scored 10 baskets in the game yesterday and 22 baskets in the game tonight. What is his percent change?
(Tonight - Yesterday) / Yesterday = (22 - 10) / 10 = 1.20, a 120% increase in score, which is more than doubling his previous score.

Unit 6: Government Statistics

What do citizens need from their government statistical agencies?
- Data that are accurate and timely and that keep up with changes in society and the economy.
- Freedom from political influence.

How big of a deal is the US Census?
- The Census is taken every 10 years; it is our only official head count.
- It is currently a 65 billion dollar project.
- Originally meant to allocate members of the House of Representatives to the states.
- Now used to help distribute $100 billion to the states based on their population and needs.

What happens if you don't fill out your census? Legal trouble.
- If you're over 18 and refuse to answer all or part of the Census, you can be fined up to $100.
- If you give false answers, you're subject to a fine of up to $500.
- If you offer suggestions or information with the intent to cause inaccurate enumeration of the population, you are subject to a fine of up to $1000, up to a year in prison, or both.

American Community Survey
- Conducted annually by the US Census Bureau.
- Replaces the long form of the census that comes out every 10 years.
- It is only sent to a random sample of the population.

Subjects included in the American Community Survey
- Demographic characteristics
- Economic characteristics
- Social characteristics
- Housing characteristics
- Financial characteristics

Why do we need the Census?
- To figure out how to spend our tax dollars in the most effective way.
- To figure out where our people are.
- To tell our leaders what the living conditions and needs of our people are.
- To tell businesses where the workers are.
- To tell businesses what types of customers they have and what their needs are.

Unit 7: Big Data

Big data: what it is and why it matters
Big data describes the exponential growth and availability of data, both structured and unstructured. Big data may be as important to business and society as the internet has become. Why?
- More data may lead to more accurate analyses.
- More accurate analyses may lead to more confident decision making.
- Better decisions can mean greater operational efficiencies, cost reductions, and reduced risk.

What big data can tell us
- Determine root causes of failures, issues, and defects in near-real time, potentially saving billions of dollars annually.
- Optimize routes for many thousands of package delivery vehicles while they are on the road.
- Analyze millions of SKUs to determine prices that maximize profit and clear inventory.
- Generate retail coupons at the point of sale based on the customer's current and past purchases.
- Send tailored recommendations to mobile devices while customers are in the right area to take advantage of offers.
- Recalculate entire risk portfolios in minutes.
- Quickly identify customers who matter the most.
- Use clickstream analysis and data mining to detect fraudulent behavior.

Data Brokers
- Companies that analyze and sell huge amounts of consumer data for marketing purposes.
- One data broker had 3000 data categories for nearly every individual American consumer.
- No way
for the consumer to correct errors.
- The Federal Trade Commission has very little control over them, partly because the consumer is not the customer but the product.

Big data is neither good nor bad. Big data is neither ethical nor unethical.
The point of this unit is not to scare or outrage you. The point of this unit is to:
- Introduce you to the idea of big data.
- Show you some of the great things big data can accomplish.
- Make you aware of how much data you produce every day.
- Empower you to ask questions about who collects your data, how they use it, and how they protect your privacy.
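
Review sketch: the proportion formula and the two random sampling designs from Unit 1, along with the percentage-change formula from Unit 5, can be tried out in a few lines of Python. This is only a minimal sketch for practice. The function names, the roster of 600 made-up student IDs, and the 30 students per section are assumptions chosen for illustration; they are not part of the course material.

```python
import random


def proportion(successes, n):
    """Sample proportion: number of successes divided by total sample size."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return successes / n


def percent_change(start, end):
    """Percentage change: (amount of change / starting value) x 100."""
    if start == 0:
        raise ValueError("starting value must be nonzero")
    return (end - start) / start * 100


def simple_random_sample(population, n, rng):
    """SRS: pick n individuals, each with an equal chance, without replacement."""
    return rng.sample(population, n)


def stratified_sample(strata, n_per_stratum, rng):
    """Stratified sample: take an SRS inside each group, then combine them."""
    combined = []
    for members in strata.values():
        combined.extend(rng.sample(members, n_per_stratum))
    return combined


if __name__ == "__main__":
    rng = random.Random(113)  # fixed seed so the run is repeatable

    # Ice cream example: 11 of 50 children prefer chocolate.
    print(f"proportion: {proportion(11, 50):.2f}")

    # Basketball example: 10 baskets yesterday, 22 tonight.
    print(f"percent change: {percent_change(10, 22):.0f}%")

    # Hypothetical master list of 600 students; draw an SRS of 120.
    roster = [f"student_{i}" for i in range(600)]
    print("SRS size:", len(simple_random_sample(roster, 120, rng)))

    # Hypothetical 20 sections of 30 students; SRS of 10 from each section.
    sections = {
        s: [f"section_{s}_student_{i}" for i in range(30)] for s in range(20)
    }
    print("stratified size:", len(stratified_sample(sections, 10, rng)))
```

Passing the random number generator in as an argument keeps the chance mechanism explicit, which mirrors the idea that impersonal chance, not the researcher, does the choosing.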