Intro to Epi 521
This 63 page Bundle was uploaded by Christy Taylor on Saturday January 31, 2015. The Bundle belongs to EPH 521 at University of Miami taught by Dr. Hlaing in Fall. Since its upload, it has received 168 views.
Date Created: 01/31/15
Intro to Epi, 8/25, Week 1

Public health (PH) has many different fields: 5 different disciplines. Epidemiology is the foundation of everything, especially research. From this class you should be able to assess and evaluate how good a study is. Some people publish without strong evidence, like how many eggs a day you should eat. Some findings are more clear cut, like smoking and lung disease and cancer. Epidemiology borrowed its methods from different fields like biostatistics, sociology, and psychology; it is a multidisciplinary field.

The word comes from Greek. The definition has 4 key words: frequency, distribution, determinants, and human population.

Human population: in epi we think of a problem at the population level, not the individual level, and it initially started with human populations. Epi has since expanded into many different fields, such as veterinary epidemiology, which studies groups of animals. We usually study disease phenomena, but the outcome is not always disease: it can be mortality (death), quality of life, infertility, or low birth weight. We may study risk factors before they become a disease, like high uric acid levels before gout; intermediate outcomes are studied as well.

Frequency: how often. If there are 2 people with acute respiratory infection in the class, the next question should be how many people are in the class. This is known as the population at risk. Epi is a science of denominators: we want to know the total number of people, and we like to quantify based on the denominator. We may start our study with a simple count, but we always have to describe the problem with a denominator. Proportion, ratio, and rate are different mathematically. A rate by definition is not a proportion, but prevalence is; there are still some misnomers such as "prevalence rate". Epi is very precise in certain terms. Frequency includes the pattern of disease by person, place, and time (PPT); this is the descriptive part of epidemiology. Distribution includes the frequency and the descriptive part.

Determinants: cause. Does x cause y? Does smoking cause lung cancer? Before we get to saying that x is a determinant of y, we have to do a preliminary study to
determine frequency and distribution. There are preventive and risk factors. Not all determinants are risk factors; some are protective against certain diseases. Determinants can be divided into many kinds of factors, such as immunologic, genetic, social, etc. Smallpox is an example of an immunologic factor (vaccine). Some people have more access to health facilities than others (social factor), or have a network of support, such as someone recovering from cancer. Behavioral factors: whether you smoke, drink, use condoms, etc. In certain parts of the world, some health factors are closely related to social, political, and economic factors.

The 4 key words of the definition of epidemiology help to answer the objectives of epidemiology. If we know what causes the disease, we can do prevention and control. There are 8 items under uses of epi; in the textbook there are 7.

What are the 2 main purposes of epidemiology? The ultimate application of epi is etiologic research, a step toward finding determinants (causes); we need to know the cause of disease. The other application is to monitor trends in community health; we should know the proportions of disease over the years. Why is tracking the disease rate important? For prevention, and so we can see the epidemiologic transition. In the 1900s it was infectious diseases; now it is chronic illness. Because of the development of technology, we were able to discover and name illnesses. People did not live long enough back then to get chronic diseases. As we improve in prevention, vaccination, and sanitation, we reduce the burden of infectious disease, but now chronic illnesses go up. Health services: what is the number of beds dedicated to the ICU, and how does it compare to last year? Monitoring past, present, and future is important.

For infectious disease we think of the epi triangle, whose apexes are host, environment, and agent. Now that we have chronic diseases, we cannot think of the triangle anymore; it is the web of causation now. If we think of CVD, there are so many factors, like physical inactivity, hypertension, and high cholesterol, known
as the multifactorial web of causation.

Prevention: if we know x causes y, we know how to prevent it. There are 3 levels of prevention. Friis covers the natural history of disease. Prepathogenesis is the phase before a person becomes diseased. The pathogenesis phase is a mild or moderate form that becomes advanced. The natural course of a disease once it enters the body: either you recover, the disease becomes chronic or has adverse consequences, or it kills you.

Primary prevention is when you try to stop the disease before it occurs; it coincides with the prepathogenesis phase. Vaccination and health promotion are examples. Primary prevention has 2 types: active and passive. Active is when individuals do it on their own, like getting into your car and putting on your seatbelt. Passive is something like fluoridation of water: you do not have to do it yourself; it is provided for you.

Secondary prevention occurs in the pathogenesis phase: the disease is already there but mild enough for you to stop its progression. Health screening is an example, like screening for high blood pressure. Friis says that cancer screening is secondary, but it is tricky. Colonoscopy: if they go in and find polyps, it is considered primary prevention because it is not yet colon cancer. On the other hand, if you have symptoms and the physician finds early, first-stage cancer, it is secondary. Reading the scenario is very important. A woman with a family history of breast cancer who did BRCA1 and BRCA2 screening and has no symptoms or lumps: primary. If the physician feels lumps and then tests, that's secondary.

Tertiary prevention is when the disease has already occurred and caused damage, like a stroke victim suffering from paralysis who needs rehab. The goal is restoring quality of life (QOL) and repairing damage. What about someone going to a halfway house while recovering from alcoholism? Damage has already occurred, so that is considered tertiary.

Language of epi. Exposure variable: independent variable, determinant, risk factor, or sometimes treatment in some experimental studies. Smoking would be an exposure; H. pylori and condom use are other examples. Outcome
variable: disease variable, dependent variable.

3 circles: the smallest is the study sample, the middle is the source population, and the biggest is the target population. Suppose I do a study on college students, their GRE scores, and their ability to do well in a graduate program. The exposure is the GRE score and the outcome is how well you do. We randomly select 100 students, which is the study sample, so n = 100; a study sample always has a sample size. They were selected randomly from 2 campuses among UM undergraduates. The source population is the population from which the sample was chosen, which would be UM. We want to be able to generalize results to all college students. The population about which you want to make an inference is the target population, which would be all students. Logistically we cannot survey all undergrad students, so we need to randomly select a sample that will be representative of all students.

Another study: condom use among Miami-Dade teenagers. Target population: all teenagers in Miami-Dade. Source population: we go to high schools to select them. Study sample: 200 teenagers, enough to study (the sample size).

A research question should always include exposure, outcome, and target population. It can be written as a statement, a null hypothesis, or a research question. In this example the exposure is condom use, the outcome is STDs, and the population is teenagers. Association does not mean causation; we have to do studies first to demonstrate causation, so we want to be conservative and say "association": condom use behavior is associated with STDs among Miami-Dade County teenagers. The null hypothesis (NH) is what a statistical test rejects or fails to reject; null means no association. NH: there is no association between condom use and STD risk for teenagers in Miami-Dade.

Example in class: the exposure is cigarette smoking; the outcome is CHD, both fatal and nonfatal; the study sample is 1394 men between ages 65 and 74; the source population is men in the Honolulu Heart Program; the target population is elderly men.

Epidemic and outbreak are synonyms. Recall the 2 applications; one is to monitor trends (past, present, and future). By knowing the past, you can determine that the Ebola virus in Africa is in excess of what normally
happens in that region. People with foodborne infections in the US is another example. An epidemic is an excess over the expected rate. Only by knowing the past rate can we say whether it was in excess or not.

Pandemic: an epidemic that goes global and crosses geographic lines. The Spanish Flu (1918 to 1920) is an example that spread all over Europe and beyond. More recently we had the H1N1 outbreak as an example.

Endemic is the usual, expected rate. At any given time there are people with the common cold; a couple of people always have that condition. Malaria in SE Asia and some Latin American countries is endemic. WHO determines what diseases are endemic.

Hyperendemic: the endemic level goes up but does not reach the epidemic threshold. It is normally controlled by environmental conditions, such as monsoons that cause a condition to go up a little.

James Lind: 1st clinical trial.

9/8, Week 3

There are 4 types of measures used to describe disease frequency: counts, ratios, proportions, and rates. Other topics: prevalence and incidence, population at risk, cumulative incidence, and incidence density.

The simplest is a count, or number of cases: in Miami during the summer there are 100 cases of x disease. When we quantify in relation to a denominator we have ratios, proportions, and rates.

A ratio is a fraction in which the numerator is not part of the denominator: the numerator (N) and denominator (D) are two separate and distinct quantities. They are normally measured in the same units.

A proportion is a fraction in which the numerator is part of the denominator; it indicates the magnitude of a part relative to the total: a/(a+b). Its value ranges from 0 to 1. Proportions are often multiplied by 100 and expressed as a percentage. Example: proportion of males = 10/(10+20) = 0.33, or 33%.

A rate differs from a proportion because the denominator involves a measure of time. The numerator expresses the number of events, like cases, and the denominator is the population at risk (PAR). PAR includes only those individuals present at the beginning of the period of interest who are susceptible to and free of the disease. For example, for ovarian cancer the population at risk
would be females, not males. Patients who already have the disease do not count either.

Rate = (# of cases of disease X in a given time period / PAR per unit time) × 1000
Or: rate = # of cases of disease X in a given time period / person-time at risk

The unit size (base) is presented as 100; 1,000; 10,000; or 100,000. If you are tracking 1000 people over a 10-year period and 10 of them get the disease, the rate would be 10 / (1000 × 10 years) = 0.001 per person-year; 0.001 × 1000 = 1, so the rate is 1 per 1000 people per year.

Why is time important? A rate considers time. It is meaningless to state only the probability of developing X disease in a population without the time. Say we have 2 groups, and 30% of group 1 and of group 2 will develop CVD. When you do not indicate the time, it is meaningless. In 10 years 30% may develop the disease, but in 20 years 50% will develop the disease. Eventually mortality will be 100%. A proportion needs a time frame. For a rate you always have to get the time, which puts more into the research and is more complicated. It is also difficult to interpret rates. Example: 10 out of 100 people develop X disease in 1 year, and 10 out of 10 people develop X disease in 10 years. Rate = 10/100 = 0.1 per year and 10/(10 × 10) = 0.1 per year. The rates are the same but the conditions are different; the denominator includes time.

Simple counts can be used in determining epidemics of an infectious disease such as tuberculosis: if the number of TB cases exceeds 10 in Miami-Dade, we identify it as an outbreak. Counts are also used in planning health services, like determining the number of cleanings, hospital beds, nurses, drugs, etc. that are required.

Ratios, proportions, and rates are required for comparing health status between populations or within a single population over time. Let's say we have 30 patients in county A and 50 with the same disease in county B. Can we say the incidence of disease is higher in county B? No, because we do not know the number of people in each county. To compare the disease in the 2 counties we need proportions. Simple counts cannot be used because the size of the population from which the
counts arose is not taken into account. Take 2 groups: in group B only 10% of people are over 60 years old, and in group A it is 20%. Group A may have a higher incidence of CVD because of its age distribution, so it will have a higher rate.

Prevalence is the presence of a disease or condition in a population; it includes previous and new cases. Incidence refers to the development of new cases in previously disease-free individuals. Disease prevalence is determined by the incidence and the duration of the disease:

Prevalence ≈ incidence × duration

Duration is affected by the rate of cure of old cases, death, and emigration. When you have programs that prevent death, the duration of the disease increases because people live longer with it, and the prevalence increases because of this.

Prevalence numerator: all cases counted during a single survey or examination of a group. Denominator: all individuals who are surveyed in the group, both those who have the disease and those who do not (everyone).

Point prevalence: the number of existing cases at a point in time (e.g., September 27, 2003) = # of persons with disease / total # of persons in the group at that time point.

Period prevalence: the number of existing cases during a period of time (e.g., the whole year 2003) = # of persons with disease / average population during the time period. Period prevalence = point prevalence + incident cases during the period.

Lifetime prevalence: the proportion of people in a population who have ever suffered from the condition of interest.

Incidence numerator: all new cases occurring during the follow-up period in a group initially free of disease. Denominator: all susceptible individuals at risk who are present at the beginning of the follow-up period (PAR). Time: the duration of the follow-up period.

Incidence = (# of new cases during a specified time period / PAR during the same time period) × K

If you do not know the year, pick a mid-year.

There are 2 types of population: fixed and dynamic. Fixed: all persons enter the cohort at once and are followed through time; there are no new members. Dynamic: observations are not restricted to any fixed group; people
move in and out. Membership is defined by birth, immigration, and aging in: people may not initially fit into the group because of age but can eventually become old enough to fit in. People can leave because of death, disease, emigration, or aging out.

There are 2 types of incidence: cumulative incidence (CI) and incidence density (ID). CI is a proportion and ID is a rate.

CI is the proportion of the candidate population that becomes affected over a specified period of time:

CI = (# of new cases in the population of interest during the time period / # in the population of interest at the beginning of the time period) × 100

In the absence of a PAR, a reference population (usually the mid-year population of the same time period) is used for the denominator:

CI = # of cases of disease X in a given time period (e.g., 2000) / reference population (e.g., year 2000)

There are 2 types of CI used with outbreaks: the attack rate (AR) and the secondary attack rate. The AR refers to a short period of observation, and the population at the start of the epidemic is the denominator: AR = # ill / (# ill + # well). AR is not a true rate; like CI, it is a proportion, and the denominator has no time in it. Example: 100 employees take part in a picnic and 20 develop diarrhea within 3 days of the picnic. AR = 20/100 = 20%.

Secondary AR is the frequency of new cases among contacts of known cases: secondary AR = # ill among contacts of primary cases during a period / total # of contacts. Example: the 20 employees with diarrhea have a total of 40 household contacts, and within 1 week 4 become ill. Secondary AR = 4/40 = 10%.

Incidence density is the number of new cases of disease occurring over a specified period of time in a PAR throughout the interval, and it is a rate: the number of new cases that occur per unit of population time.

ID = # of new cases in the population of interest during a specified time / total person-time of observation (which can be in months or years)

ID is usually used in cohort studies, where we follow people over time. Person-time is the amount of time contributed to observation from the time a person enters a group until that person is either diagnosed (becomes a new case), dies, or is lost to follow-up.

Prevalence and CI are proportions and range from 0 to 1.
ID is a rate, so its range is from 0 to infinity.
Prevalence includes existing and new cases.
CI and ID only include new cases.
Prevalence is often used to evaluate the burden of disease.
CI and ID are used for etiologic research.

Other measures of disease frequency: crude vs. specific, adjusted vs. standardized. Crude: you take a whole population and calculate the percentage of people who have a disease. Specific: you have different groups (e.g., age groups), calculate the proportions within the groups, and report the proportions according to the subgroup they belong to. Sometimes you have 2 populations with differences in the distribution of age and sex, which makes it hard to compare disease, so you have to standardize the proportions. Sometimes the above "rates" are not true rates but are referred to as that.

9/10, Week 4

We start with simple counts but do not like them because they do not tell us anything: no trends or patterns. Ratios are much better. In ratios the numerator is not part of the denominator. Proportions have the numerator as part of the denominator. In a rate the denominator includes time; a rate tells the occurrence of something in a given time. Incidence is new cases. Prevalence is new and existing cases.

Prevalence ≈ incidence × duration. When the duration is very short (approaching 0), as with motor vehicle accidents, incidence and prevalence approximate each other. Chronic diseases like hypertension have a long duration; as the duration approaches infinity, prevalence and incidence diverge. Pancreatic cancer is a rare disease with a low rate and is rapidly fatal, so prevalence will be very low and about equal to incidence. Chronic diseases have a high prevalence.

There are 2 prevalences: point and period. Lifetime prevalence is a type of point prevalence. You do a survey and ask if you currently smoke or have a disease: that is point prevalence. If you ask about something over the past year, that is period prevalence. "Do you use marijuana now?" 6207 people use it now, which is point prevalence: 6207/41837 = 14.8%; it is usually expressed as a percentage. "Did you use marijuana over the past
year?" This is a period prevalence: 8001/41837 = 19.12%. "Have you ever used marijuana?" This is a lifetime prevalence.

There are 2 types of incidence: cumulative incidence and incidence density. Cumulative incidence is a proportion, NOT a rate. Incidence density is a rate. You use CI for a fixed population: n remains the same throughout the duration of the study. We use ID when the population is dynamic (people come and go). A lot of our population studies are dynamic, but normally we do not have enough info to calculate ID even though it is more accurate. For dynamic populations we need to know how long people stay in the study (each individual's person-years of observation), which we normally do not collect, especially in a large cohort study.

CI = # of new cases / PAR (people who are present at the beginning of the study, disease free but susceptible to disease). Let's say in 5 years 100 out of 1000 people get the disease: CI = 100/1000. We assume all 1000 stay in the study for the duration of the 5 years. Suppose our disease of interest is ovarian cancer: people who have had their ovaries removed are not susceptible to the disease and cannot be part of the PAR.

ID = # of new cases / person-time at risk (how long each person stayed in the study).

Example: 420 cases of cancer X over one year, PAR = 6,280,000. We are calculating CI because we start with a fixed number of people: 420/6,280,000 = 0.00007. We need to use a base because there is no 0.00007 of a person. When comparing 2 populations, make sure to use the same base. 0.00007 × 100,000 = 7.0 cases per 100,000 over one year. For CI you need to include the duration of the study.

Example: 420 cases of cancer X over 3 years of study, with the same PAR as in the example before. Only 1,000,000 people stay in the study for all 3 years, which is 3,000,000 person-years (PY); 28,000 stay for 2 years = 56,000 PY; 5,252,000 stay for 1 year = 5,252,000 PY; the total is 8,308,000 PY. If we have all the numbers, then we can calculate ID = 420/8,308,000 PY = 5.1 cases per 100,000 PY. We do not need to say it is for 3 years because we count each person's observation time. ID is an average rate, so we do not need to say the length of the study.
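
The CI and ID arithmetic in the 420-case example above can be checked with a short script. This is only a sketch for study purposes: the function names and the follow_up list are made up for illustration and do not come from the lecture. Without intermediate rounding the CI comes out at about 6.7 per 100,000; the notes reach 7.0 because the proportion is rounded before the base is applied.

```python
# Hypothetical helper names; the numbers are the ones from the notes.

def cumulative_incidence(new_cases, population_at_risk, base=100_000):
    """CI: a proportion (new cases / PAR), scaled to a base."""
    return new_cases / population_at_risk * base


def incidence_density(new_cases, person_years, base=100_000):
    """ID: a rate (new cases / total person-time), scaled to a base."""
    return new_cases / person_years * base


# (number of people, years each of them stayed in the study)
follow_up = [(1_000_000, 3), (28_000, 2), (5_252_000, 1)]
new_cases = 420

par = sum(people for people, _ in follow_up)
person_years = sum(people * years for people, years in follow_up)

ci = cumulative_incidence(new_cases, par)
id_rate = incidence_density(new_cases, person_years)

print(f"PAR = {par:,}, person-years = {person_years:,}")
print(f"CI = {ci:.1f} per 100,000 (state the study length with it)")
print(f"ID = {id_rate:.1f} per 100,000 person-years")
```

Keeping the same base for both measures makes them easy to compare. The CI still needs the study length attached, while the ID already has time in its denominator.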
We take into account each individual in the study and the length of time they stayed in it.

Another example: 5 people are in the study and 2 people died during it. PAR = 5, so CI = 2/5 for a 4-year study, or 40 deaths per 100 people in a 4-year study. Person-years: 2 + 3 + 2 + 2 + 1 = 10 PY, so ID = 2/10 PY = 20 deaths per 100 PY. CI is a risk and is a proportion for a fixed population; ID is a rate for a dynamic population. CI has to state the length of the study.

Another example: in a 10-year study totaling 50 people, you are given info on how long each person is observed and stays in the study. 30 people stayed in the study for 10 years: 10 × 30 = 300 PY. Out of 50 people, 5 developed the disease. CI = 5/50 for 10 years of study = 10 cases per 100 persons in a 10-year period, or 1 case per 100 persons per year. B and C: the total PY of observation is 461, so ID = 5/461 PY = 1.08 per 100 person-years of observation.

There are 3 types of measures: crude, specific, and adjusted. They can be proportions or rates.

Crude means you are not dividing the population into any subgroups. Example: what is the crude death rate of the US? Crude measures are easier to get and are for the entire population, with no consideration of demographic distribution. Because it is a large population, the measure is statistically more stable. It would be wrong to use crude measures to compare time points or areas. Usually the denominator and numerator have to refer to the same time point, and we use the mid-year population size because the census info is more accurate there.

The second type is specific, by stratum (e.g., gender): you can look at females and males separately. These are stratum-specific measures. They give us more info because now we can compare within a specific homogeneous group. If you have a lot of demographic variables it can be cumbersome, but it gives us more information than crude measures. An example is death among 14 year olds due to pneumonia in year 2000 in Z county: an age- and cause-specific death measure.

Age can be a potential confounder. We have to adjust for age to make a fair comparison of groups A and B. This can be the case with multiple
variables such as gender. You can adjust in 2 ways: direct and indirect. The study population is a population that you wish to compare, in this case A and B. Another concept is the standard population, which can come either from the populations being compared or from outside them, so the standard population can be arbitrary. Adjusted measures are fictional measures: we force the populations to be equal in distribution. The magnitudes of these measures depend on the standard population you pick. That is the third type: adjusted.

The other 2 measures are years of potential life lost (YPLL) and disability-adjusted life years (DALYs).

YPLL (we need to know how to calculate this) is the years of life lost because of a disease in comparison to average life expectancy. If someone dies at age 30, their YPLL with a target age of 65 is 65 - 30 = 35 YPLL. What you pick as the target age is arbitrary. If our target age is 70 and someone dies at 71, their YPLL is -1; they have to die before the target age for it to be positive. If you want to compare 2 populations, you have to convert YPLL to a YPLL rate for it to be comparable.

The DALY calculation includes people who die of the disease and people who suffered from it but did not die (their quality of life is affected). The calculation is complicated, so we do not need to know it, but know what it means. It includes YPLL, a disability weight, and the years lived with the disease. A smaller DALY means a smaller burden of disease. The burden of disease due to communicable diseases is lower in the USA than in India, so the DALY is smaller for the US. When comparing chronic diseases such as cancer, the USA is worse, so there is a bigger burden of disease. DALYs can be presented as a percent or with a base (per 10,000, for example), but the interpretation is the same. Infant and maternal mortality are good indicators of how well a country is doing.

Use direct adjustment when you know the morbidity/mortality rates. Use indirect when you do not know the morbidity/mortality rates or when you have a small population. You can use one of the populations you want to compare as the standard population if you
have complete information and it does not involve retrieving census population data. Indirect: you do not have complete study population information, so you take the rates from the standard population.

An SMR of 100 means the study population and the standard population do not differ in death rates. Over 100 means the study population has a greater death rate than the standard. Less than 100 means it has a smaller death rate than the standard. An SMR has to be compared to the standard population. Let's say that the US population is used as the standard, the SMR of A is 110, and the SMR of B is 90. Interpretation: the death rate of A is 10% greater than that of the US population, and the death rate of B is 10% less. We can compare A and B qualitatively but not quantitatively.

9/22, Week 5

Crude is for whole populations and specific is for subgroups. Adjusted is for populations that do not have the same demographic distributions, to make the groups more comparable. Age adjustment is an example. We could adjust for gender or SES, for example. We want to adjust for the factor that makes the two populations different. A fictional measure is not necessarily drawn from the population you are interested in; the standard can be an outside population.

There are two standardizations: direct and indirect. When you have complete information about the population, you use direct. If the population is not stable, is too small, or you do not have all the info, you use indirect. SMR = 100 means the population does not differ from the standard population. SMR = 80: 20% less than the US population. SMR = 105: 5% more than the US population. We cannot state that one is 25% higher than the other, but we can say it is higher. We can only say it qualitatively, not quantitatively.

Primary data is data you actually collected; secondary data is data someone else collected. Say there is an association between cholesterol level and MI in men over the age of 60. If you collect this info yourself (collect the sample and decide on ways to measure cholesterol level, MI, and other related factors), this is an example of primary data. With primary data you start with a hypothesis. Secondary data is already available to us: public domain data collected with taxpayers' money. If you want to do a
study with secondary data, you start with a research question as well, then ask what data can answer the research question and explore the data to see if it is suitable. Primary data is expensive and data intensive, but you have control of what you collect and how you collect it. Secondary data is cheaper, but it may not have the answer to your research question (it may not include the variables that are important to you).

There are different levels: local, state, national, and international. The info we are interested in is population data, morbidity (sickness/disease) data, and mortality data. Whatever data set you use, you have to assess the quality of the data and whether it is usable. Do you as an investigator have access to the data you want (availability)? This is more difficult than it was 20 years ago because we have confidentiality issues now; HIPAA is an example. Even for secondary data you need IRB approval.

The nature of the data is important as well. What is the source population? For example, in Florida we have the Agency for Health Care Administration (AHCA), which compiles info for all Florida nonfederal hospitals. There is a 2-year lag because they verify that the info they get from the hospitals is correct. You need to know who collects the data, what hospitals are included, and what effects using the data has on analysis, generalizability, etc.

The Youth Risk Behavior Survey is for kids in grades 9-12. The sampling strategy is a complex design, so you cannot use a regular analysis or it will not be valid. You need to spend time understanding the data set you want to use. NHANES covers the US representative population that is non-institutionalized, so it does not include jails, for example, but it does include college dorms. They use a complex survey design, so you need to use a complex analysis.

Every study has strengths and limitations. Sometimes your research question cannot be answered by just one data set. What is the proportion of CABG operations, and what is its association with SES? You can get the info from Florida
hospitals for CABG, but the SES info has to be retrieved somewhere else. The best you could do is whether they have insurance or not. I have to get additional information through another data set. You may need to do a lot more work and collect your own data (primary) or check another data set. Hospital discharge records need surrogate data for SES, but a strength is that we get CABG data.

There are a few different kinds of data used: population (demographic) statistics, CDC surveillance, health and behavioral data, and admin data.

Population data is collected by the US Department of Commerce. The census data is collected by sending forms to people's houses and is collected every 10 years. The government provides intercensal data on population size for the years in between. As a strength, it collects info on everyone. The weakness is that there is an undercount, especially for minorities and older people. Homeless people will also not be included in the count. The census is important because it gives us our denominator data.

The second source is vital statistics data, collected by NVSS via CDC. They collect info on birth, death, marriage, and divorce. We care about birth and death. The strength is that it is extensive: every birth and death is recorded. Local hospitals report to local health departments, who report to the state, who report to the national level and WHO. We have accurate info in terms of numbers of births and deaths. There are still some inaccuracies in some fields. Birth certificates and fetal death certificates are separate. The weakness is that some info is inaccurate or unreliable. If your interest is congenital malformations, a birth certificate may not be good. Some of the info is physical measurements, like weight. Some is info that they get from moms, so there is some recall bias. There will be some unreliable and differential recall: moms with congenitally malformed babies will be more likely to remember their pregnancy. Some congenital malformations are diagnosed after birth, so they will not be updated on the birth certificate until later.

Death certificates are nearly complete. The problem with them is that you may not know the cause of death; not having all of the info may be a problem depending on who records it. You can only put one cause of death, although there are a few levels. One limitation is that the physician writing the death certificate may not know the cause. If the death is due to an accident or suicide, it can be the funeral director or coroner writing the info. If an elderly person dies they may have tons of morbidity, but only one thing is written. There is also the changing of ICD codes over time.

Surveillance data: we have it for infectious diseases and chronic diseases. The list of notifiable and reported infectious diseases varies by state because it depends on the priorities of the state (what is more pertinent). For chronic disease in Florida we monitor cancer and lead poisoning. SEER: monitoring data for cancer. BRFSS is for adults and asks about behavior and diseases, while YRBSS is for youth. Surveillance is important because we can monitor trends of disease in the past, present, and future so that we can make etiologic associations. Monitoring trends for diseases and risk factors is the importance of surveillance.

The government legally mandates surveillance, whereas a registry is more expensive; some registries are funded by federal agencies, private foundations (e.g., for rare diseases), universities, etc. Many states are starting immunization registries. A registry is more detailed compared to surveillance: it gets the surveillance information plus who gets the disease, the treatment they get, procedures done, survival rate, and follow-up info in detail. Surveillance is ongoing, but they have a system.

We can also get data from national surveys. Many states collect their own state-level surveys and report them to the national level, where they are compiled. They may not release everything, though; some info, like frozen blood info, may need to be purchased. You have to read the methodology of how it was collected. It is well documented, and you may be able to extract the local data.

Admin data is discharge and ER data. It is complete and you can get it electronically.
Hospital data is normally collected for reimbursement. Some diagnostic codes are reimbursed more than others, so some codes may be overcounted. Admin data is not really research data, but you can use it if you know the limitations. If you use a Florida hospital and your interest is exposure X and MI, a limitation may be that some people cannot get to a hospital and die at home, so you may not capture severe cases, which means an undercount. People may also visit somewhere other than Florida, have an MI, and die. Another disadvantage is that if you are hospitalized twice, it shows two discharges even though you are the same person. Some ICD codes are more common if there is more reimbursement.

There is absenteeism data, like for schools and employees. But some people take off when they are not sick or go to work when they are. Absentee data for schools may be important if you are looking at an outbreak. Insurance data may be a problem because the insured are working people and likely to be healthier than those who do not have insurance. School health data can be important for correlations, like with scores or tests. Armed forces data covers people who are joining.

An example is studying women who completed a substance abuse treatment program to see if they were less likely to be on welfare. You have to see who went through the program and then get info from the labor department and welfare department. For all sources, no one wanted to give the social security number. There are more challenges being faced even though there is more data available to us.

What is PH surveillance? It is for us to monitor the health of our communities. The info may be different depending on the priorities. The definition of surveillance is not much different from that of research. The difference is that surveillance is ongoing; that is how we get the past, present, and future info. This is a core PH function. There are 3 types: active, passive, and sentinel. Active means the health department personnel go out and get the info. Depending on the disease, they
have a priority and determine which ones are active, passive, and sentinel. They go to hospitals, offices, labs, etc. For passive, health professionals are legally mandated to report to the health department. Most diseases are passive. This way is less expensive. A limitation is that providers may not always report. Sentinel is something that you prearrange if you see that something is going to be a problem based on active and passive data. It is not ongoing, but only used when you see specific problems with certain diseases. SENSOR is an example: it was done in Minnesota, where a lot of injuries caused amputation in certain industries. Surveillance does not only have to be for infectious diseases. Very likely it is passive.

The use of surveillance data is to monitor trends and understand if our control measures are effective, like making the prevalence of a disease go down. Looking at the surveillance data can also stimulate research. It also helps with planning and what should be done.

The steps are data collection, analysis, interpretation, and dissemination. Whether it is active or passive, a certain disease is reported to be included in the data. You interview the person or review the data to see if it meets the case criteria. Some symptoms can fit multiple differential diagnoses. We use universal case definitions so that the definition is the same across states, for example, and we can make comparisons. For a specific outbreak we may use a unique definition, like a salmonella outbreak after a cruise, so that it includes person, place, and time. Once a case is reported, we collect info, make sure it is a disease we want to track, record treatment, and enter the data. We do descriptive epi: person, place, time, when it was diagnosed, etc. For interpretation we compare with previous periods. Then we think of the action necessary. For dissemination, surveillance data has to be timely. This is another difference from research data: with research you hope to publish right away, but this is not normally the case. For surveillance data you have to disseminate immediately
because it is a timely action.

The limitations: if the patient does not go to the doctor, the case will not be included. It depends on whether patients seek care, on the efficiency of the physician in diagnosing and reporting, and on lab testing errors. The physician may not want to compromise the physician-patient relationship, so they may not report it. Sensitivity is how good the system is at diagnosing and capturing cases and how well they are reported, and it depends on the disease.

How do we know if surveillance is effective or not? It depends on the case definition and how sensitive it is. With toxic shock syndrome in the 1980s, we did not know a lot about it or where it was coming from, so we needed a unique case definition. If it is a notifiable disease you have to include everyone that has the disease, so the representativeness of the population may be a problem. Can you make a qualitative difference: will prevention measures be put in place after this? Is it cost efficient and easy to use?

There are 2 types of epi: descriptive and analytic. Descriptive epi describes who gets the disease and when and where it occurs (person, place, and time). It describes patterns of disease and exposure; it only gives us clues and is useful for hypothesis generation, whereas analytic epi is for hypothesis testing. With hypothesis generation you do not know the exposure or its association with disease. Usually when a disease is very novel and you do not know much about it, you start with descriptive epi. The objectives are to monitor health status and trends and to evaluate a PH system and how we can plan for it: who, when, and where. For analytic studies we answer how and why. That is the second objective, etiologic research.

The null hypothesis says there is no difference, association, or effect. You use a statistical test to reject or fail to reject the null hypothesis. The alternative hypothesis is the opposite. A nondirectional alternative says there is an association, just the opposite of the null. A one-directional alternative states the direction, for example higher odds of getting the disease. There are a few methods for coming up with a hypothesis, and they are purely
theoretical.

Method of difference: when you do not know something, like with a totally new disease, you observe two groups of people through natural observation, describe the person, place, and time factors, and see how the groups are different. Suppose you look at two groups of workers who are the same except that the group with less MI exercises more. You use the difference to make an association.

Method of agreement: suppose you observe that injection drug users have a high prevalence of HIV infection and that hemophiliacs also have a high prevalence. The HIV virus must be present in the blood. You use something that agrees between groups.

Concomitant variation: one group smokes 5-10 cigarettes a day and has respiratory infections, another group smokes 10-20 and has COPD, and another group smokes more than 40 and has emphysema. You see variation of disease based on dosage.

Residue: chronic diseases are often multifactorial, like CVD, so you cannot pinpoint the cause to cholesterol levels, hypertension, or smoking exactly. If you add them up and they do not add to 100%, then there is residue, or other factors.

Analogy: a lot of viral and bacterial diseases can be airborne and cause respiratory disease. Suppose people at a conference became sick and it was found that the infection came through the air ducts. This method is not as scientific.

There are tons of person factors for descriptive epi. Female paradox: for certain diseases, more women are affected in terms of morbidity, but when it comes to death, more men die. A lot of paradoxical events are those that we have not teased out all the info for. Hispanic paradox: compared to non-Hispanic whites, Hispanics have more disease, but when it comes to dying, they die less. One theory is that when they are dying they go back to their country to die and are not counted, so they are statistically immortal.

We can also look at place differences, which can be very tricky. Sometimes the difference is not because of the place itself but
because of the political and social differences between places, like differences in culture and diet.

Time factors: there are a few things we can look at. Some diseases have cyclic fluctuations, like seasonal ones. Food outbreaks are point epidemics. Cyclic fluctuations are the changes that occur within a year. Flu is prevalent in the winter until February and is very low in June. A secular time trend is much longer, and changes occur over decades, like with chronic diseases such as CVD. CVD morbidity and mortality went down because people are smoking less and because of advancements in medicine, healthier eating, better coronary units, CABG technology, low-dose aspirin, preventive efforts, and treatment aspects. Some changes are behavioral and happen within decades. A cohort effect involves a group of people that experience the same thing together. Those who served in Vietnam together had certain exposures more than others. Space and time cluster: women getting postpartum depression after delivery is an example. You see a lot of cases of ID or chronic diseases. Usually descriptive epi is very important for outbreak investigation.

When you see certain changes, we need to figure out whether the change is real or artificial. A real change is when actual prevalence or incidence goes down or up. An artificial change comes from something like changes in the ICD coding. Medicaid had a severe problem once because there was an incentive to use codes with more reimbursement, so those codes were increasing as a result. Sometimes you have to dig deeper than descriptive epi to find out.

Case report: you see a case that is very unusual. Case series: a group of cases with similar symptoms. These are non-epi studies because there is no comparison group. They are just a description of unusual cases. They can lead to many important things, though, like cultural relevance. The study of HIV prevalence among homosexual men started out with 5 men in Los Angeles who were healthy but had sex with other men and got a specific type of pneumonia seen among elderly or
immunocompromised people.

9/29 Week 6

To determine the quality of data sources, you look at availability, nature, completeness, representativeness, strengths, and limitations. The census is nearly complete data on the population. It is useful because it gives us the denominator population. It undercounts people 60 and older and minority populations, and it does not include homeless people or illegal immigrants.

US death certificates: it is required to report every death, which is a strength. A limitation is that the cause of death may not be accurate, since the physician who certifies the death may not know the patient history. When someone dies of homicide or suicide, the coroner or funeral director may input it. An elderly person with multiple chronic conditions may have the wrong cause of death recorded. The immediate cause of death may be one thing and the underlying disease something else. The ICD code may be something different from the actual cause of death as well.

Birth certificates: all births must be recorded, which is a strength. Congenital malformations may not show up immediately and will not be recorded on the certificate. Sometimes not all moms remember info related to their pregnancy.

Surveillance is an ongoing core PH function, and research is only for a certain period of time. Active is when PH personnel go out and get the info, and passive involves people reporting to the personnel. Sentinel is prearranged surveillance: if cases of staph infection may be increasing in a hospital, you could monitor it before it becomes a problem. Surveillance is done by the government, and a registry can be funded by different entities, like a private foundation, nonprofit organization, or philanthropy. The difference between a cancer registry and surveillance is that the registry follows people, has follow-up info (what treatment, survival rate, etc.), and is more expensive.

Descriptive epi provides info on the pattern of diseases: person, place, and time. With analytic epi we answer how and why. For place info we compare rural versus urban areas and compare between states, nations, and hospitals. In terms of time, we compare cyclic trends over shorter periods of
time, like seasonal, and secular trends over longer time periods, like decades.

Hierarchy of study types: ecological is descriptive, and cross sectional is descriptive and analytic. A case report is one specific case or patient. A case series is more than one case. A case report can be something very interesting or dangerous, and case reports can lead to important epi studies. An example of a case series is HIV/AIDS: the patients presented with a special pneumonia that you usually see in elderly and immunocompromised people. Another was at Goodrich, a tire company, where they saw angiosarcoma (liver cancer) associated with a chemical used in that factory, vinyl chloride. There were 7 cases. Case series and reports are hypothesis generating and can lead to hypothesis testing. They are not epi studies because they do not have a comparison group.

Observational studies do not do anything to the population. In experimental studies you do some manipulation. If you are interested in studying hypertension and stroke with an observational cohort, you follow people and see who develops stroke. With an experimental study, you think of a new drug that treats high blood pressure and test its effectiveness before it gets to the market. You would give one group the new drug and the other a sugar pill or placebo; that is the manipulation. The exposure is the new drug.

With randomization you want to make sure people in the study have an equal chance of getting the drug. Random selection comes first: say out of 55 of us, 10 are selected via random selection with a hat drawing. Then 5 get drug A and 5 get B (placebo), and I randomly allocate which 5 people get the new drug. This second process is called randomization or random assignment. Randomization is only for experimental studies, whereas random selection can be used in observational and experimental studies.

Quasi experimental: only manipulation and no randomization. For example, if you want to compare long-acting and short-acting insulin, you ask patients to volunteer. You manipulate by giving long- or short-acting insulin, but instead of
randomizing it you ask the patients to volunteer. It is biased, though, because the patients who volunteer may be different from the start, but for ethical reasons this may be the only option.

In the hierarchy of study designs, as you move up, the strength of evidence for causality improves. Ecological and cross sectional studies are at the bottom of the hierarchy but are useful if a topic is new and an association has not been made yet. Cross sectional studies are quick, easy, and cheap to do. As you move to the right, causality info gets better. Eco is descriptive but can also be analytic.

How do we get info for epi studies? Usually we collect info about exposure and disease from the individuals in the study. Correlational and eco studies get info from groups instead of individuals, and the info is already-collected data. Our unit of analysis is called the group or aggregate. There are 2 types of eco studies: one is spatial and the other is time.

For space, we can call them geographic correlation studies: comparing between counties or between states, for example, or international comparisons. We do that by getting info that is already collected from each country or state on the entire population. Alcohol study: get info from each area about average CHD deaths and info on alcohol consumption. You normally get alcohol sales from the Federal Trade Commission and convert them to consumption, but this is a surrogate measure. With this you can do a study of large scope. You can go to CDC to get death info. Another example is alcohol sales in states and liver cirrhosis; states with higher taxes may have lower rates. The comparison is only by geographic area. Eco studies have become more popular in outcomes research, comparing quality of care in hospitals, schools, etc.

The second type of correlational study is the time trend or time series. Graph: how much we spend on healthcare in different countries versus wealth. Greece has a small expenditure and low mortality, Ireland is average, Germany is high, and Italy has medium expenditure but high mortality. This is an example of a geographic
correlational study. Time trends are normally for one area but over time. Graph 2: TB deaths from 1832-1970 in England and Wales. With lab and pharmacological advances in TB treatment, there was a decrease in TB deaths. Info is retrieved for one area but for each year.

You cannot make an interpretation or conclusion about individuals because the unit is the group. Doing this results in ecological fallacy, an erroneous conclusion. An example about CHD: suppose we have 20 cities and their average income and average CHD. You find from the eco study that CHD is higher in the richer cities than the poorer ones. Can we say that being rich increases the risk of CHD? No, because this is info from city-level data. From group-level data you cannot make conclusions about individuals. They did the individual-level studies and found there was no correlation. Another study was done on alcohol consumption and CVD. In the eco study they found an inverse association between alcohol consumption and CHD: the more you drink, the more CHD goes down. In an individual-level study we found something different. It was a J- or U-shaped curve, which means moderate drinking is protective against heart disease, whereas non-drinkers and heavy drinkers are more at risk.

Cross sectional study: the key to this study is that you collect info from individuals. The unit of analysis is the individual. Suppose you want to study smoking and lung cancer at JMH. You have a patient roster, randomly select 100 patients, and individually ask if they have ever smoked or been diagnosed with lung cancer. The key feature is simultaneous collection of exposure and disease. The National Health and Nutrition Examination Survey (NHANES) is an example. Cross sectional is descriptive and analytical. If you go out and collect info from the 100 individuals about smoking and lung cancer, you are getting the prevalence of lung cancer. The advantage is that you can do it really quickly; you are not following people and waiting for lung cancer to develop. You can also use CS for hypothesis generating and testing. The disadvantage is
that by collecting exposure and outcome at the same time, you do not know the temporal or directional relationship. If you are thinking of coffee drinking and MI, you do not know which comes first (the chicken-or-egg effect); the temporal relationship cannot be ascertained. With cross sectional you cannot say X causes Y. These 100 people are those who survived; we have no chance of selecting those who have died. This is selection bias. We chose those who came to the hospital. We should be able to pick a representative sample, but that is not the case for CS. It is also not good for studying rare conditions.

When do we use CS as descriptive and when as analytic? For descriptive: if I give a survey on whether you have had a flu vaccine this year, then I can see how many students had the vaccine; this is simply describing. Or I can go to a hospital to see how many beds are reserved for NICU. If I want to do a study on the association between coffee drinking and pancreatitis, that is an analytic study. Assessing prevalence or burden of disease is descriptive. Testing associations is analytic.

With CS we cannot say exposure comes before disease. There is a special circumstance where you can: if you have an exposure that does not change over time, like eye color, blood group, or ethnicity, then you can say which comes first.

Contingency table (2 by 2 table): the simplest way to analyze. We put the exposure, then the disease. Sometimes you measure exposure on a continuous scale, like blood pressure, and then divide it into 2 groups, high versus low BP. We set up the table so we have 4 cells. A is those who smoke and have lung cancer, B is those who smoke and do not develop it, C is those who do not smoke and develop it, and D is no exposure and no cancer. For the JMH example you can use the info to fill out the table. For an eco study we get info from areas, not individuals, so we have no numbers to fill out the cells. Suppose we do an eco study among teenagers on ever having taken driver's ed (exposure) and motor vehicle accidents (outcome), and we want to
compare all 50 states. How many exposure points do we have? 50, 1 from each state. Exposure info is retrieved from the motor vehicle department (DMV), and the accidents (outcomes) from police and hospital records. What we do not know is who had the accident and who had driver's ed, so we cannot fill out the cells, only the marginal totals. We only have state averages. We do not know which teenagers did and did not have driver's ed and accidents. With eco studies we can only say that states with a higher proportion of driver's ed have more motor vehicle accidents if we find a positive correlation, or vice versa. We can only talk about the states, not individuals.

Sampling methods: I go to JMH and there are 1000 patients but I only need 100. I have to figure out a way to sample. There are probability and nonprobability samples. The hat sample is probability sampling (simple random sampling). Probability means there is a known chance of being sampled; it does not mean equal. Nonprobability means that people do not necessarily have a chance of being sampled.

Simple random: each person has an equal chance.

The 2nd type is systematic sampling: there is some kind of rule to it. If you do a study and you pick patients who come to a particular clinic on Friday, or there is a roster of 15000 patients and you pick every 100th patient, you have a rule that predetermines who is picked.

Stratified sampling: you can have 2 hats, one for males and one for females. This gives an equal number of each gender.

Cluster sampling: uses either geographically or administratively defined groups. NHANES randomly selects states, which is the 1st cluster, then zip code areas (2nd cluster), then households (3rd cluster), and then from that household they pick one person. That is a 3 stage cluster. A survey of condom use in teenagers would go to high schools: first randomly select schools (1st cluster), then randomly select 9th-12th grade classrooms (2nd cluster), then random individual high school students (3rd cluster). If we had a list of all teenagers that would be easier, but that is normally not the case.

Nonprobability sampling is
convenience sampling. Sit in front of Publix, for example, and hand out surveys. You are not selecting randomly from the community but going to a specific place. Not everyone has a chance of being selected.

10/6 Week 7

Descriptive describes the study population in terms of person, place, and time: who, where, and when. Analytic asks how and why: what are the mechanisms? In observational studies you are just observing. Experimental studies involve manipulation and randomization, which are not present in observational studies. We manipulate the independent variable, or exposure. You as an investigator are giving a new drug or sugar pill. Quasi experimental is only manipulation, no randomization.

Random selection is putting your names in the hat and picking. Everyone has an equal chance. Randomization (random assignment) is only seen in experimental studies. The first step is random selection and the second is randomization. Everyone has a known chance of getting one of the treatments. You can do random selection in any study, including an experimental one, but only experimental studies do randomization.

How many types of sampling do we know? Probability and nonprobability. There are 4 types of probability sampling. One is simple random sampling, which gives everyone in the study an equal chance. The next type is stratified, when you divide the group into different subgroups of interest. The 3rd type is systematic, which is based on a rule that you specify before you start sampling people. The last is cluster, where groups are defined geographically, like different area codes, or administratively, like high schools or classrooms. Nonprobability sampling is when not everyone necessarily has the same chance of being picked. Convenience sampling is an example. If we want to know the prevalence of STDs in Dade colleges we should do random sampling; if instead we go to a student sample and pick the first 100 people who are willing to take the survey, that is not random.

There are 2 types of ecological study: spatial and time series. For the
paper, the main exposure was dietary practice. There were 2 outcomes: testicular and prostate cancer. The measures of frequency were incidence and mortality. They compared 42 countries on 5 continents. An advantage of this: you can cover large geographical areas in a relatively short amount of time and more cheaply. The unit of analysis was countries, or groups. The statistical result was a correlation coefficient. It is an ecological (aggregate, or correlational) study. They got the dietary practice information for the different countries from the Food and Agriculture Organization. The outcomes came from cancer registries and the International Agency for Research on Cancer. A limitation of this study: you cannot make a conclusion at the individual level.

We measure exposure and disease at the same time in an ecological study. In cross sectional studies we also measure the information at the same time. The difference between ecological and cross sectional studies is that we get information from individuals in cross sectional studies. In ecological studies we do not have information to put in the boxes of the 2 by 2 table, and you cannot make inferences at the individual level. The main limitation of cross sectional studies is that the temporal relationship cannot be determined. When we collect info from cross sectional studies, the measure of frequency is prevalence: all existing cases, reflective of those who have the disease and survive. Those who die cannot be included in the study, resulting in survivor bias.

2nd paper: there were 2 exposures, medication use and chronic diseases. Outcome: falls. Study sample: 4050 elderly women between the ages of 65 and 75. Source population: women in the British Heart and Health study. Data source: general practice clinics in 23 towns in Great Britain. Target population: elderly women. How did they define falls? Having fallen in the past 12 months and needing medical attention. 2 outcomes: severity and frequency of falls. Limitation: reverse causality.

Ecological is hypothesis generating; analytic can be hypothesis generating or testing. When the exposure variable is unalterable over time, then
you can establish temporality (which came first).

Case control study: the most commonly used study design and the most evolved. It is different from the first two study designs we learned in that we start with the disease or outcome first. Say we have people with pancreatic cancer: we pick the diseased people (cases), and the next step is picking non-diseased people (controls). After we assemble the 2 populations, we retrospectively ask whether they had the exposure or not. How are exposure and disease measured in CC studies? Start with the disease and go back in time. Sometimes a CCS is called a retrospective study because we look back for the exposure. It is also called a case reference study. We can make the 2 by 2 table with CCS as well. It is common because it is the natural process of how we think: start with the problem and look back at what you did. Because we have exposed and nonexposed cases and controls, we can calculate measures of association.

Applications of CCS: etiologic research (causation), vaccine effectiveness, and evaluation of prevention programs. The design feature is to pick cases first and then comparable control groups.

Advantages: you can study a small number of participants; it is very well suited for rare diseases, where you cannot do a long cohort study; there is no follow up, so it is cheap; you can evaluate several potential risk factors simultaneously and control for confounding (a 3rd factor, another risk factor, that can distort the association between exposure and disease).

Disadvantages: because this design is good for rare diseases or diseases where you need to seek medical care, you pick cases wherever they are. You cannot wait for new incident cases, so you have to pick prevalent cases. You cannot estimate risk from case control studies. You cannot tell which came first, exposure or disease. Selection and observation bias are problems. It is not good for rare exposures.

How do we do the study? The first design issue is who the cases are going to be. You have to have an operational case definition. This ensures that the cases entered into the study are the same; you need the same criteria. This has to be
defined at the beginning, and you stick with that definition. Whether we want to use broad or restrictive criteria depends on the disease of interest. Broad criteria let false cases in. But if the criteria are too strict you may miss people, like for a rare disease. For an outbreak we may include confirmed and possible cases, but it depends on the disease. MI criteria: symptoms of chest pain, irregular EKG, and a third criterion. This is a little too restrictive, and we may not have the money to do all 3.

Next we need to decide where we get the cases. Ideally for MI, for example, we would have a list of everyone diagnosed and then randomly pick people. We do not have that, though. Ideally we would want to pick incident cases. Incident cases are better than prevalent cases because they are more likely to recall their exposure. For CCS we go to where the cases are: usually a hospital, registered cases, or a defined geographical area. For cancer it is easier because we have a registry and surveillance. From a surveillance system you can pick incident cases. You want cases that represent the spectrum of disease, different stages of cancer for example. We often have to forego this, though. House red wine: CCS. Selection and info bias are more common with CCS.

How do we define controls? This is one of the most difficult steps. Ideally you should pick the controls from the population from which the cases arise. If you use a surveillance system to get cases, controls should be from there as well, from the same source population. Controls are free of the disease you are interested in studying, not necessarily disease free altogether. One of the arguments is that you should pick controls with another disease because they are more likely to recall their exposure. We try to make the groups more comparable and forego representativeness. If you pick cases from hospitals, you pick controls from the same hospital, for example, in order to make them more comparable. Sometimes you see a 1:4 case to control ratio. We want to increase the statistical power, so you can have more controls per case. You should not really
go over 4 controls per case. Controls should be at risk for the disease you are interested in. Ask about the exposure in the same manner and with the same accuracy for cases and controls. Match for confounding factors, like matching on age for stroke cases and controls. CCS can be unmatched or matched. Where do we get controls? We try to get them from the same place as the cases. You can also get more than one control group from two different areas for more comparison.

2 errors: systematic errors (bias) can come in at 2 stages. The first is the way you sample or select people in your study (selection bias); the second is the way you ask for and collect the data (information bias).

3 biases: selection bias is about how we select the population or study sample. One example is OC and venous thromboembolism: doctors are more likely to admit women if they use OC, making the exposure proportion in that hospital higher than in the general population. The 2nd bias is observer bias and the 3rd is subject recall bias. Interviewer bias: take alcohol consumption and MI, where you assign one interviewer to get the exposure info. If the interviewer is aware of the study hypothesis and has a certain way of thinking, he may probe cases more than controls or vice versa, when it should be done in the same manner. Recall bias comes from the subjects themselves. Pelvic radiation during pregnancy and congenital malformation is an example: cases are more likely to remember and give more accurate info than controls. In this case we should pick controls with other diseases so that they will remember as well.

Measures of frequency: incidence and prevalence. Measures of association link exposure and disease; for CCS there is the odds ratio. Analysis: contingency or 2 by 2 table. In CCS the diseased are cases and the non-diseased are controls. The comparison does not have to be against 0 exposure, though; it could be heavy versus light smokers, for example. Cell a is those who have the exposure and the disease, cell b is those who have the exposure and do not develop the disease, cell c is those with no exposure who get the disease, and cell d is those who do not have the exposure or
disease.

The OR is the ratio of 2 odds: one is the odds for the cases, the other for the controls. Each odds is a ratio of 2 probabilities or proportions. What is the proportion of cases that are exposed? a/(a+c). The proportion of cases not exposed is c/(a+c), so the odds of exposure among cases is a/c. The proportion of controls exposed is b/(b+d) and not exposed is d/(b+d), so the odds of exposure among controls is b/d. OR = (a/c)/(b/d) = ad/bc. If we have the cell values we can calculate the OR. We use this for unmatched data.

OR = 1 means there is no association between exposure and disease. If the OR is greater than 1, it means, for example, that current contraceptive use is a risk factor for MI. If the OR is less than 1, it means the exposure is a protective factor. In medical journals, an OR of 46 is read as: current contraceptive users are 46 times more likely to get MI compared to non-current OC users (common interpretation). We should really be comparing between cases and controls (disease and non-disease): the odds of being a current OC user are 46 times higher in those with MI than in those without MI (correct way). The common way is more intuitive and reads like a relative risk. Odds are very difficult to interpret and less intuitive, but more correct. OR of 29: the odds of using Rely brand tampons are 29 times higher in people with TSS compared to people without TSS.

With matched CCS there are 2 types: frequency and individual matching. Frequency matching: you assemble the cases, 30 of them, and find 40% are African American and 60% are Caucasian. You select controls with a similar frequency (1 variable). Individual matching: you have 30 cases and the first is a 40 year old white female, so you try to pick a control the same way (3 variables: age, gender, ethnicity). For frequency matching you calculate the OR just like unmatched. If you match 1 to 1 (pair match), you calculate it differently, with a matched analysis. Say the sample size is 10, so there are 5 pairs. We look at each pair. In the 1st pair both case and control are exposed, in the 2nd both are not exposed, in the 3rd pair the case is exposed and the control is not, in the 4th the case is exposed and the control is not, and in the 5th the case is not exposed and the control is. Matched pairs 1 and 2 are called concordant pairs: cases and controls are the same in terms of exposure. The
The other 3 pairs are called discordant pairs: one member of the pair is exposed and the other is not. We do not care about concordant pairs; they do not contribute to the OR. Cell b represents pairs where the case is exposed and the control is not exposed; cell c represents pairs where the case is not exposed and the control is. Matched OR = b/c; cell b is on top because it counts the exposed cases. OR = 2 means the odds of exposure X are 2 times higher in people with disease Y compared to people without disease Y.

10/13 Week 8

Odds ratio of 1 means no association. OR greater than 1 means it is a risk factor. OR less than 1 means it is a protective factor. The numerator is the odds of exposure among diseased individuals and the denominator is the odds of exposure among controls. Design feature of CC: you start with people who have the disease and select a comparable group of people without the disease. Exposure info is gathered retrospectively. Using vitamin supplements is protective against NTD in babies (1 - 0.37 = 0.63): vitamin supplement users have 63% protection from NTD.

Article: the disease of interest is MS; the exposure was sun exposure when they were younger (ages 6 to 15). How old were the patients with MS? 60 and under. NH: there is no association between sun exposure at a young age and MS among people in Tasmania. A hypothesis must include exposure, outcome, and target pop. 136 cases and 272 controls: the study sample.

Cohort studies: observational and analytic. Observational because there is no manipulation, but analytic because there is a hypothesis and the hypothesis is tested. A cohort study has 2 groups of people, exposed and nonexposed. Initially the groups have no disease, and you follow them over time to compare the incidence of the outcomes they develop during this time.

Characteristics: the objective is the same as CCS, to test the association between exposure and outcome. First you formulate a hypothesis: what are the exposure and the outcome? Then you select study groups according to the definition of exposure. You have the two groups; they can also be individuals with and without certain characteristics. When you have a chemical exposure or sun exposure, you can say these individuals are exposed or not exposed. Sometimes you want
to study the association between age or gender and a disease. Some people will have the characteristic and some will not. After selecting the study groups, you follow both groups and compare the incidence of disease or other outcome (death, injury, etc.). Then you look for a statistical association between exposure and disease.

Difference between CCS and cohort studies: in CCS you already have the cases and controls at the beginning. In cohort, none of the participants have the disease. In cohort you have the 2 groups, follow them forward, and see how they develop the outcome. In CC you already have the outcome and you look back to compare exposure status.

Assumptions in cohort studies: first, the exposed and nonexposed groups are representative of a well-defined pop, so you can make inferences about target pops. The absence of exposure should be well defined and maintained in the nonexposed group during the course of the study. Drinking green tea and immunity: define exposure (drinking tea, and how many times per day), but also define the nonexposed group (these people cannot drink tea). You have to define both exposed and nonexposed groups. Definitions of disease outcomes should be set at the beginning and should not be changed during the course of the study. There should be no bias in determining outcomes in either group. The SARS definition changed over 3 years as we learned more about it; the diagnostic criteria changed. Disease definitions can change, and a cohort runs over a long time period. Just maintain the definition you set prior to the study. Definitions of disease should be reliable and reproducible. Make sure to follow up with participants.

How to classify exposure: it can be a clear-cut yes or no, and some exposures are complicated, like secondhand smoke or dietary intake of fat. We may all have some exposure to it. You have to divide these people into several groups with different levels of exposure, like little or highly exposed. After exposure, disease does not occur until the induction period has passed, and sometimes there is a latent period: you have a disease but it is not
diagnosed. You have to allow time for disease to develop. Radiation exposure and leukemia is an example: if you let people enter the study group earlier than 3 years after exposure, the disease may not be due to the exposure that happened less than 3 years ago.

Types of exposure groups: use the general pop if you have common risk factors. Framingham: used town residents and collected exposures like smoking to see the development of heart disease. Nurses' Health Study: the study pop is the nurses. For rare risk factors there are special exposure groups, like the survivors of Nagasaki and pest control workers. You cannot always find the exposure in the general pop, so you have to use a special exposure group.

Comparison groups: internal, separate, and comparison with the general pop. Internal: everyone has the exposure at different levels, so the study group will all be exposed but at different levels for comparison, like the number of packs of cigarettes smoked. Separate: compare the development of CVD in a hypertensive group and a separate non-hypertensive group. Compare: compare with available pop rates.

Types of cohort (2): closed and open. Closed (fixed): no one can be added after the study begins, and the initial roster can dwindle because people can die, lose interest, or develop the outcome. In an open (dynamic) cohort you can take on new members as time passes, so you can maintain a stable population.

3 types of cohort by timing: pro, retro, and historical prospective (ambispective). Prospective: you have the baseline now and the 2 groups, and you follow them to see the development of disease in the future; at the start of the study you decide your 2 groups. The British doctors study is an example: in the beginning you do a survey of who is and who is not a smoker, divide them into 2 groups, and follow them over time to see if they develop lung cancer. Retrospective: you look back, but how you choose the study population is different from CCS. Although you look back, the exposure and disease happened before the study started; you still have the 2 groups first, and they are still followed forward through time to see if they develop disease. The follow-up time happened before the study began.
Ambispective: a combination of pro and retro; the investigator looks both back and forward. Exposure and nonexposure are determined in the past, and at present you may not have outcomes yet, so you still have to follow through time and wait for the outcome to happen.

Pro: determine exposure at present, follow in time, and wait for disease to happen. Retro: at present you already have the data on exposure and disease. Ambi: the exposure is assessed in the past, but you still have to wait for a period of time for disease to develop.

Pro is most accurate, but you have to follow over a long time, which can be hard. In retro you use secondhand data, so you do not have to wait, just divide and analyze; it is efficient but not as accurate. Ambi is a combo of the two: it is efficient and you can still control the study.

Advantages: a cohort measures incidence and gives a direct estimate of disease risk; in CCS you do not get incidence. Cohort does not rely on memory for exposure status, so there is no recall bias like in CCS. Cohort begins with no disease, so it gets rid of selection bias. It shows the temporal relationship between exposure and disease. It can yield info on several diseases from one exposure; in CCS you can test multiple exposures. In cohort you can follow the natural history of disease (different stages). Cohort is good for rare exposures; in CCS you may not have enough exposed people in the sample.

Disadvantages: it requires large samples; the same question can often be studied more efficiently with a CCS. It is inefficient for studies of rare disease, because you follow a large sample but only a few develop it. It is difficult because of how long you have to follow people. People may drop out of the study without notifying you and may be more likely to be in the exposed group, so you cannot count them. The incidence in that group will then be underestimated and the difference between the two groups will be diluted. Direct observation of participants may cause them to change their behavior, possibly due to stress. A change in criteria can result in bias in ascertainment of disease: a case may not be one anymore. It
is expensive and still has biases. Attrition: when current study participants selectively drop out once the study has begun. Misclassification: can happen in exposure and in disease. Some people may have the disease but it is not diagnosed, so you put them in the nondiseased group. Healthy worker effect: happens mostly in occupational studies. You want to study workers in a tire factory and exposure to chemicals in tires in relation to death. If the control group is not from the same factory, you compare the workers to the general pop, and the workers may be healthier than the general population (they can lose their job if they are not healthy). You will underestimate incidence because they are healthier.

MOF: measure incidence as CI (cumulative incidence) or ID (incidence density), plus SMR. CI is the proportion of new cases in the population at risk (PAR here means population at risk). ID is a rate: instead of the total number in the population at risk, the denominator is the total person-time of observation. SMR tells you whether the study population has a lesser or greater risk of death or disease than a standard population. Take the incidence by subgroup in the standard pop and the number of people in each subgroup of the study; multiply each number by the corresponding incidence in the general pop and add these together to get the number of expected cases. SMR = observed / expected.

Relative risk: the most frequently used MOA. RR: the study pop is divided into 4 categories by exposure status and disease status. RR is the ratio of the risk of disease or death in the exposed to that in the nonexposed. It gives the magnitude of association between exposure and disease. Incidence can be CI or ID. Calculate incidence in the exposed and nonexposed groups. In the example, incidence in the exposed is a/(a+b) = 485/16,692 and in the nonexposed 169/23,869, so RR = 4.1: women who had benign breast disease have 4.1 times the risk of developing carcinoma as compared with women who did not have benign breast disease. RR = 1: risk of disease is not different among exposed and nonexposed (no association). RR greater than 1 is a risk factor and RR less than 1 is a protective factor. Statistical significance: X2 (chi-square) test. A temporal relationship can be determined with cohort; MOF is incidence. In CCS we get measures of association but no MOF (no prevalence or incidence) because we determine how many cases and controls we take.
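As a quick check of the relative risk example, here is a minimal Python sketch. The function name is hypothetical, and the exposed-group denominator (16,692) is an assumption: that figure is garbled in the notes and was chosen because it reproduces the stated RR of 4.1.

```python
def relative_risk(a, b, c, d):
    """RR = CI in exposed / CI in nonexposed.

    a: exposed, disease      b: exposed, no disease
    c: nonexposed, disease   d: nonexposed, no disease
    """
    ci_exposed = a / (a + b)
    ci_nonexposed = c / (c + d)
    return ci_exposed / ci_nonexposed


# Benign breast disease example. Group totals: 16,692 exposed (assumed,
# see lead-in) and 23,869 nonexposed.
a, c = 485, 169
b = 16692 - a
d = 23869 - c

rr = relative_risk(a, b, c, d)
print(f"RR = {rr:.1f}")
```

The same function works for any closed (fixed) cohort; for a dynamic cohort the denominators would be person-time and the result a rate ratio instead.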
With cross-sectional studies you can estimate prevalence because you sample randomly from the pop. Cohort: we are able to estimate incidence, so we can estimate risk. Risk is a proportion, so it is a CI. If we have person-time info, our 2 by 2 table will be different: the denominator will be person-time of observation, not just a number of people. ID of exposed / ID of nonexposed = rate ratio; CI of exposed / CI of nonexposed = risk ratio. Biggest limitations of cohort: takes a long time, expensive, large sample, bias from attrition (loss to follow-up), cannot study rare diseases.

Pro: you start with disease-free individuals, divide the sample into 2 groups, follow them, and see if they develop disease or not. Retro: nondiseased individuals classified by exposure; there is still follow-up, but it was done before the start of the study, and you determine whether they developed disease. Ambi: still has all 4 steps and samples by exposure status. Retrospective: disease, exposure, and follow-up are already completed (make sure they did not have the disease of interest in the past); the investigator is at the tail end. Ambi: exposure occurred in the past, and you follow them to see if they develop disease in the future, so the investigator is in the middle.

Risk versus odds: from cohort you get risk. Both are ways of expressing chance or probability. 30 of you are rollerblading and 5 fall: what is the risk of a fall? CI = 5/30; odds of a fall = 5/25. Do not express odds as a percentage. For every 5 people that fall, 25 will not fall, so the interpretation is not as intuitive.

10/27 Week 10

Cohort comes in retro and prospective designs. Case reports and case series are not epi studies because there is no comparison group. Ecological studies, 2 types: spatial and temporal. Cross-sectional: the unit of analysis is the individual. Ecological: exposure and outcome are measured at the same time, and the unit of analysis is a group or aggregate. Eco is hypothesis generating; cross-sectional can be hypothesis generating and testing. Limitation of eco: group-level data cannot be used to make inferences about individuals (ecological fallacy). Cross-sectional: measures exposure and outcome at one point, so it cannot establish a temporal relationship. When do we do
an analytical cross-sectional study and when just a descriptive study? Descriptive is describing the pop based on who, when, and where; analytical answers how and why. Can a cross-sectional study ever be used to establish a temporal relationship? Yes, when the exposure is unalterable over time; eye color is an example.

Case control: start with the diseased (cases), then pick control groups, and then ask about exposure retrospectively; find exposure status and ask everyone the same way. Selection bias: how you select people into the study, like not picking appropriate cases or controls. You can get info bias from instruments that are not properly calibrated. Info bias (2): interviewer and recall bias. Biases with CCS: recall bias. CCS is different from cohort: in a cohort you select people by exposure, dividing them into exposed and nonexposed, and then you follow them to see if they develop the disease of interest.

Cohort types (3). Prospective: you are at the beginning of the study; you determine exposure and then follow the exposed and nonexposed. Retro: select exposed and nonexposed and go backwards, using other people's records to determine what happened. You start at the end. You start with exposure and follow them the same way to see if they develop disease, but exposure, disease, and follow-up have already occurred. In CCS you assemble people by disease status; in retrospective cohort you assemble them by exposure status. Ambispective: you look back at exposure and follow people to see if they develop disease; it has both parts. You collect exposure info from records of what occurred in the past, and then you follow people to see if they develop the disease or not. An example is people with head injuries in 1945: in 1960 you start to follow them for 30 years to see if they develop dementia.

Observational study: no manipulation or randomization. Quasi-experimental: just manipulation, no random allocation. Experimental: manipulation and randomization. Example in class: prospective cohort. Measure of frequency: CI. Relative risk = incidence in E / incidence in NE; it can also be called risk ratio and is the measure of association for a fixed population. Rate ratio
= ID in E / ID in NE; it is based on person-time. Use X2 to test if it is a statistically significant association.

Experimental (intervention) study: clinical and community trials. How is random allocation different from random selection? Random selection: there are 50 of you and I only want 20, so I randomly select 20 people from a hat. For a clinical trial I have to randomly allocate the 20 of you into 2 groups: one gets the drug and the other gets placebo. This second step is random allocation, or assignment. In terms of assessing an X-Y association as causal, clinical trials are at the top of the hierarchy. Not every trial belongs there, though, because it has to satisfy 3 methodological conditions. One is random allocation; the second is a control group (could be placebo control or something else); and the third is being at least double blind.

A clinical trial is a planned experiment involving humans to study the best treatment. When you do one to test a new drug, you test efficacy in the patients within the clinical trial. Our goal is sometimes to go beyond that, to multinational trials, to assess the effectiveness of the drug. Effectiveness applies to anyone who will use the medication in the future; efficacy is only for the people within the trial. Start with the population, then random selection. After random selection comes informed consent: the patient finds out that they will get either drug or placebo. Then you randomly allocate and observe the outcome, which can be anything: death, remission, or survival time, for example.

3 types of clinical trial. Therapeutic: testing a new drug (drug development), since some available drugs may not be suitable for everyone. Intervention: participants do not have the disease but may have the risk factor, like moms with kids with NTD who are given folic acid to prevent it in other pregnancies. Prevention: healthy people get a vaccination and are followed to see if they develop disease or not.

4 phases of a drug clinical trial. 1st phase: test the safety of the drug, usually in healthy people (med students or pharmaceutical employees). 2nd phase: work out the dosing, tested on a small group of patients. 3rd
phase: is it effective on the outcome being assessed? Then it needs to be repeated in many settings and people; this uses random selection and random allocation. When the finding is consistent, the drug becomes available for the general pop to use. 4th phase: post-marketing surveillance.

Why do we need a control group? Because sometimes disease can go into spontaneous remission, and without a control group we would not be able to tell if the drug is working or not. Sometimes patients change behavior, which happens a lot during interventions like in CVD, so we need control groups.

Allocation without randomization: use people from previously treated patients. One ethical issue: if you have a standard treatment already available, you should not give the control group placebo pills. The difference detected against placebo will be a lot higher than if you compare the new drug with standard treatment, but this is unethical. Sometimes when you look at a rare outcome you may not have enough people for a CT, so doctors may enroll all the people they see into the experimental arm and use patients they treated 10 years ago as the control group; those patients already received another treatment, which is probably the standard one (historical controls). They were selected for a different reason, so the criteria are different and the way they respond is probably different. This is very biased info. Another one is systematic assignment: anybody who comes on odd days gets treatment and anybody who comes on even days gets placebo. There is a pattern investigators can work out, which can create a problem.

Random allocation has to be tamper-proof. The investigator should not know what the patient is receiving. Random allocation avoids bias and is the best method. Example: take 90 eligible subjects and say anyone with an odd number gets treatment and anyone with an even number gets placebo; this is not tamper-proof. To make it random, it has to use a random number table or computer-generated numbers. You can do simple randomization or stratified randomization: you may have a moderate and a severe group and randomize within each stratum.

Blinding: single is when the patient does not know what treatment they get. Double: neither patient nor
observer knows the treatment. Triple: patient, observer, and data analysts do not know. A clinical trial has to be at least double blind. Surgical trials are harder to evaluate and depend on the precision of the surgeon.

Unit of intervention: the most common one is the individual, but if it is a community you can randomly allocate by community, like 2 schools or factories. In a community trial the unit is the group.

A special issue with randomized controlled trials is intention-to-treat analysis. Once you randomize patients, you must analyze them. Many investigators, if they see that patients are not compliant, want to exclude them from the trial, which you cannot do. Once randomized, you must analyze, no matter the compliance. Compliance: the extent to which patients follow medical advice. Lack of compliance in drug trials undermines the treatment comparison. What are the reasons for noncompliance? Side effects; people may think the instructions are too confusing; sometimes medication may run out; with a disease like cancer, participants may want to find an alternative; participants forget to take their meds. How do we assess compliance? Have patients keep a log (self-reported), or have them bring pill bottles back and count the remaining pills to see if they used them. Another problem is loss to follow-up (attrition), or participants may die.

Advantage: it is the gold standard if it meets the 3 conditions, and it eliminates bias in the treatment comparison through random allocation. Disadvantages: it may be less generalizable. If it is a new drug, you try to get the pop that will benefit, like ACE drugs for people with hypertension, so participants are not people from the general pop, which is why we have to do so many trials. It has a more complicated protocol compared to observational studies, and there may be clinical issues because of which you cannot do the study.

Basic premise of doing any research: it is bad (unethical) if you plan or execute the study badly. It is bad if the investigator is biased against or for one of the treatments. Clinical equipoise: the investigator and the clinical community involved in the trial should not already know that the new drug is better than the standard drug. It is also bad if you have too few
patients to draw meaningful conclusions. Risk to participants should not outweigh the benefits gained by the trial. Any clinical trial should have an independent committee, the DSMB (data and safety monitoring board), that does not include any of the study investigators. They look at your clinical trial data at the end and at interim points. If a patient has side effects, you have to report it and the DSMB has to look at it. You have to stick to the protocol but do not want to harm patients, which is why the committee is in place. Stopping the trial breaks the code, so they assess whether it is ok to do that or not. Termination of the study is in general allowed only if you have sufficient evidence of an extreme benefit or harm.

Informed consent: we should include all possible treatment options and the fact that participants will be in one of them (not sure which), the risks and benefits, the names of the investigators and who to contact, and that the data are kept under confidentiality and anonymity. Participants have the right to withdraw from the study at any time.

Community trial: the unit of analysis is the community (group) rather than the individual. We determine which communities are eligible, collect baseline data, randomly allocate interventions, implement the intervention, and monitor outcomes. Fluoridation of water in 2 NY communities is a famous example. They picked 2 communities comparable in demographics; one got fluoride in the water (Newburgh) and one did not (Kingston), and they measured the proportion of decayed and filled teeth. The one with fluoride in its water had a lower proportion of dental problems. There is manipulation of exposure and random allocation. The communities did not know. The problem with community trials comes when the intervention is related to behavior: there is a risk of contamination because you have no control over who tells what to whom. There is also the issue of population shift or migration. The methodology is similar to a clinical trial but the unit of analysis is different.

Pay attention to the DSMB and the CONSORT statement checklist, and the 3 criteria for a trial to be published (page 390). Do not need to know fixed and adaptive
randomization, just the ones mentioned in class. Do not need to know cross-over study design or how to evaluate a community trial. A clinical trial is most similar to a prospective cohort, so you can calculate relative risk. Quiz 2 includes all study designs.

11/3 Week 11

MOF: incidence (ID for dynamic and CI for fixed populations) and prevalence (period, point, and lifetime). MOA: combining two MOF to make a meaningful measurement. MOA in cohort: RR = incidence among E / incidence among NE. ID is a rate and CI is a proportion. If you have a dynamic pop, your denominator is person-time, so you calculate the rate ratio using ID.

CCS MOA: odds ratio. Why do we calculate odds and not risk? You take the cases available to you, which is good for rare diseases. You get whoever has the disease and pick them into the study, so you cannot estimate risk; you have to calculate the odds of exposure among people who have the disease. OR = ad/bc. RR (fixed) = [a/(a+b)] / [c/(c+d)]. If the disease is very rare, what happens to a and c? They will be small numbers. If they are small, we can ignore a and c in the denominators, so RR is about (a/b)/(c/d) = ad/bc, which means OR approximates RR.

You have a man that weighs 100 kg and a woman that weighs 50 kg and you have to compare their weights. Jack is 50 kg heavier than Jill; he is 100% heavier, or twice as heavy; Jill's weight is half that of Jack. Absolute comparison: a difference (subtraction), in which you retain the unit. Relative comparison: divide the 2 quantities; you have to pick a baseline, although it is arbitrary (simple ratio). Relative to baseline is another way; there are various ways to compare 2 groups in relative terms. Relative difference: the difference expressed as a percent of the baseline.

RR is a relative measure. Everyone has a baseline risk of getting a certain disease. Nonsmokers have a certain baseline risk and smokers have an excess risk because they smoke. Excess risk: subtract incidence among nonexposed from incidence among exposed, so it is an absolute measure. Attributable risk (AR) is this MOA. PAR (population attributable risk): incidence among the population minus incidence among the nonexposed. Incidence among the population is retrieved from government data.
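The rare-disease argument and the risk-versus-odds distinction can be checked numerically. The sketch below is illustrative only: the helper names and both 2 by 2 tables are hypothetical, and the rollerblading figures (5 falls among 30 people) come from the earlier example in these notes.

```python
def risk(events, total):
    """Risk (CI): events divided by everyone at risk."""
    return events / total


def odds(events, total):
    """Odds: events divided by non-events."""
    return events / (total - events)


def rr_and_or(a, b, c, d):
    """Return (RR, OR) for a table with
    a: exposed, disease      b: exposed, no disease
    c: nonexposed, disease   d: nonexposed, no disease
    """
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return rr, odds_ratio


# Rollerblading example: 5 of 30 fall.
print(risk(5, 30), odds(5, 30))

# Hypothetical tables with the same RR of 2.
common = rr_and_or(a=40, b=60, c=20, d=80)     # disease is common
rare = rr_and_or(a=4, b=9996, c=2, d=9998)     # disease is rare
print("common disease: RR, OR =", common)
print("rare disease:   RR, OR =", rare)
```

When the disease is common the OR overstates the RR; when it is rare, a and c are negligible next to b and d and the two measures nearly coincide.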
Ideally you should get it from the population from which the study pop comes, but we normally do not have that, so we get it from the study pop and assume this is a good estimate of incidence among the general pop. Incidence among pop = (a+c)/N: anyone who has the disease over the total population. If you want to estimate the incidence in the pop, you need to know how prevalent the exposure is. If you have info on how prevalent smoking is in Miami-Dade, you can get how prevalent nonsmoking is.

Proportion: (incidence among exposed - incidence among nonexposed) divided by incidence among exposed, times 100, gives the attributable risk percentage (AR%), which is a relative measure. What is the difference between AR and AR%? AR is the excess number of cases and AR% is the excess proportion. AR% = etiologic proportion. Sometimes, when the factor is protective, incidence among the nonexposed is bigger; we have to reverse the formula, and it is called the preventive fraction. Population attributable risk percent (PAR%): taking the absolute measure (PAR) in relation to the baseline incidence among the pop and making it into a proportion.

For cohort we have 5 MOA: RR, AR, PAR, AR%, and PAR%. If asked to calculate any of 2 to 5, the first thing that must be checked is the assumption of causality: if RR is not equal to 1, there is an association and you can calculate them. If RR = 1, there is no association and no causation, and there is no sense in calculating 2 to 5. Theoretical assumption: if our outcome is mortality, we are assuming that there are no competing causes of death. If we are looking at smoking and death, we assume the people are dying from lung cancer and nothing else.

Traditional CCS: you cannot get risk, only OR, so you cannot calculate the rest unless OR is an approximation of RR; then you can get the relative measures, not the absolute ones. Pop-based CCS: in this case we improve on the traditional CCS to be similar to a cohort, so you can calculate them all. If RR = 1, AR = 0.

How do we interpret AR? Exposure is benign breast disease and outcome is invasive breast cancer. How likely are those with benign breast disease to develop cancer?
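The four attributable measures defined above follow directly from a cohort 2 by 2 table. The sketch below is a hedged illustration: the function and key names are hypothetical, the population incidence is taken from the study pop as (a+c)/N as the notes suggest, and the counts are those of the benign breast disease cohort example with an assumed exposed-group total of 16,692 (that figure is garbled in the notes).

```python
def attributable_measures(a, b, c, d):
    """AR, AR%, PAR and PAR% from a cohort 2 by 2 table.

    a: exposed, disease      b: exposed, no disease
    c: nonexposed, disease   d: nonexposed, no disease
    """
    n = a + b + c + d
    inc_exposed = a / (a + b)
    inc_nonexposed = c / (c + d)
    inc_population = (a + c) / n          # study pop used as estimate

    ar = inc_exposed - inc_nonexposed     # absolute, among the exposed
    par = inc_population - inc_nonexposed # absolute, among the population
    return {
        "AR": ar,
        "AR%": ar / inc_exposed * 100,
        "PAR": par,
        "PAR%": par / inc_population * 100,
    }


# Benign breast disease cohort: 485 cases among 16,692 exposed (assumed
# total) and 169 cases among 23,869 nonexposed.
m = attributable_measures(a=485, b=16692 - 485, c=169, d=23869 - 169)
print(f"AR   = {m['AR'] * 1000:.0f} per 1,000")
print(f"AR%  = {m['AR%']:.1f}")
print(f"PAR  = {m['PAR'] * 1000:.0f} per 1,000")
print(f"PAR% = {m['PAR%']:.1f}")
```

AR and PAR keep the units of incidence (absolute measures), while AR% and PAR% are proportions of the exposed and population incidence respectively (relative measures).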
RR = 4.1, so those with benign breast disease are 4.1 times more likely to develop invasive breast cancer compared to those who do not have benign breast disease. AR = 0.029 - 0.007 = 0.0219 (from the unrounded incidences); x 1,000 = 22 per thousand excess. If women did not have the exposure, 22 per 1,000 would be saved from invasive breast cancer.

PAR: if it is lung cancer and smoking, you need the incidence in exposed and nonexposed and the prevalence of smoking and nonsmoking in the general pop. PAR%: the proportion of the incidence in the population attributable to the exposure. What is the difference between AR and PAR? They are both excess risks and absolute measures, but AR is the excess risk among the exposed and PAR is the excess risk among the population. Excess proportion among the exposed: AR%. Why do you care about these measures? When we want to change policy, people only care about PAR and PAR%, not the others.

Preventive fraction (PF): used when the exposure is a protective factor; the etiologic fraction is used when the exposure is a risk factor. PF = (incidence among NE - incidence among E) / incidence among NE (the nonexposed group is the baseline).

Traditional CCS: you cannot get absolute measures. We can get relative measures because the odds ratio estimates the relative risk. The formula is different from cohort. CCS AR% = (OR - 1)/OR x 100. Cohort: prevalence of exposure = (a+b)/(a+b+c+d). CCS PAR%: the prevalence of exposure in controls, b/(b+d), is a good estimate of the prevalence of exposure in the pop. Incidence in the study pop is a good measure of the incidence of disease among the general pop.

11/10/14 Week 12

If RR = 1, you do not perform the other MOA. For practice problems, calculate RR first, and if it is not 1 you have to do the other MOA. AR is among the exposed and PAR is for the population. Difference between AR and AR%: AR is absolute and AR% is relative. AR is the excess number of cases and AR% is the excess proportion. PAR is the excess number of cases in the population and PAR% is the excess proportion.

2 goals in epi: etiologic research and monitoring trends. When we do not know anything about x and y, we use epi reasoning. In the 40s we did not know
smoking caused lung cancer and CHD. How do we start evaluating whether x causes y? First, what is the frequency of lung cancer? What is the prevalence of smoking in the pop (MOF)? Second, descriptive epi: person, place, and time. Next, 3rd, find determinants (causation): we have to find out if smoking is associated with lung cancer. Develop a hypothesis (null and alternative), find a comparison group (that is where study design comes in), collect and analyze data, and then perform calculations. OR for smoking and lung cancer = 6 means the odds of smoking are 6 times higher in patients with lung cancer compared to people without lung cancer.

The OR could be big because of a real association or because of alternative explanations. It could be due to chance (random error), systematic error (bias), or confounding factors. Only after we rule the other 3 out does it mean it is a real association. Then the study is internally valid. Only after that can you evaluate whether smoking is a cause of lung cancer (Bradford Hill causal criteria).

Chance or random error means bad luck. Investigators do not intentionally cause this error. When we find something like OR = 6 in a study, we test whether it is statistically significant or not by using something like X2. We are testing our null hypothesis with a statistical test. NH: there is no association between smoking and lung cancer. Ho: OR = 1. Ha: OR not equal to 1. When you do a test, you either reject or fail to reject the NH. In truth, either the NH is true and the alternative hypothesis (AH) is false, or the NH is false and the AH is true. Fail to reject: no association. We have 4 options but only 2 are correct (a and d). The 2 incorrect options are type 1 and type 2 errors. Type 1 (alpha, probability or p value) is a false positive: rejecting the NH when it is true. Type 2 (beta) is a false negative: you are not rejecting when the NH is false. Taking 100% - beta gives the study power. Study power: if you have enough sample size, it is the probability that your study rejects the NH when a true association exists. Sometimes there is an association but the power is so small that you cannot detect it. Random
error or chance: tested by a statistical test on the NH only. Suppose we could study everyone and know the association; we would have a representative estimate. We study a variable or characteristic: without random error you find the distribution one certain way, but with random error the measured values scatter around the truth (true value). Random error only causes variation around the true value; it does not change the mean value.

Systematic error: actual flaws in your study; another name is bias. 2 types: the way people are selected into the study and the way you collect info (selection and info bias). Difference between random and systematic: systematic error changes the mean, the variability, and everything, because it is a flaw in the study. Let's say the true OR is 2.1 with n = 100. If we have random error, we might get 1.9; with systematic error, OR = 9 or 1.3. Because of systematic error you can over- or underestimate. One good thing about random error is that as we select more people, our results become more accurate and closer to the truth. Random error can be reduced by increasing sample size (increasing precision); this is not the case for systematic error. Study 1: n = 100, OR = 2.5; n = 1,000, OR = 2.3; n = 10,000, OR = 2.1.

Type 1 error: a p value equal to or less than 0.05 is statistically significant, and the corresponding confidence interval is the 95% CI. CI and p values are calculated simultaneously. OR = 2 with a p value of 0.05 is statistically significant, but this only rules out random error; assuming no other errors, it means there is a true association. A corresponding 95% CI of 1.7 to 2.4 is significant because 1 is not found in there. Suppose you have PR = 2 and a p value of 0.08 (not significant); the 95% CI could be 0.9 to 2.8, which is not significant because you find 1 in there, and 1 means no association. Statistical tests only rule out random error.

P value: do not be guided by this alone. If you test an antihypertensive drug in a clinical trial and find that the BP difference is 1 mmHg, it may be statistically significant but it does not make sense clinically. Many clinical trials have small sample sizes, so you may not find statistical significance between the 2 groups.
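To see how sample size shrinks random error and how "is 1 inside the 95% CI" works as a significance check, here is a small Python sketch. The notes do not say how the interval is computed; this sketch assumes the common log-based (Woolf) approximation, and the function names and counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR with an approximate 95% CI (log method, an assumption here).

    a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def excludes_one(lower, upper):
    """True when the interval does not contain 1 (no association)."""
    return not (lower <= 1 <= upper)


# Hypothetical small study, then the same proportions with 10x the people.
for scale in (1, 10):
    or_, lo, hi = odds_ratio_ci(20 * scale, 10 * scale, 30 * scale, 30 * scale)
    print(f"n = {90 * scale}: OR = {or_:.1f}, 95% CI {lo:.2f} to {hi:.2f}, "
          f"significant: {excludes_one(lo, hi)}")
```

The point estimate is the same in both runs; only the width of the interval changes, which is the sense in which a larger sample reduces random error but does nothing about bias.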
Systematic error: a flaw in the implementation or design phase of the study. It could cause over- or underestimation. How can it occur? Through the way you select people or the way you measure them. Case-control studies are very prone to selection bias. Why? We get cases from where they are, like hospitals; we do not have the luxury of going to the public. Hospital patients are systematically different from the rest of the pop. Inappropriate selection of cases and controls can cause systematic bias.

Once you have people in the study, you collect data by asking, measuring, or using questionnaires. For BP: the actual device is not calibrated (error due to the instrument); the data collector makes errors (observer error: someone who is not trained measures BP, or the time of day BP is measured varies); or the error is due to the subjects themselves. You have to train the people who collect BP, prepare subjects in a similar way, and constantly calibrate the instrument used. You have to use a calibrated random-zero sphygmomanometer to reduce error. Measurement has to be consistent to reduce info bias.

Bias cannot be controlled for afterward. You can assess it at the end of the study or try to minimize it. You have to estimate how likely the study is to have selection and info bias. If they exist, what happens to the MOA, like OR or RR: will you over- or underestimate it?

5 types of selection bias. Detection bias: menopausal estrogen and endometrial cancer, with cases taken from hospitalized cancer patients. When postmenopausal women come to the doctor with vaginal bleeding, they are asked if they use hormone replacement therapy, and if they do, the doctor is more likely to examine them closely, so they are hospitalized because of the exposure. In a CCS the exposure proportion among cases will be higher as a result if you pick the cancer cases from the hospital. In the 2 by 2 table, the a cell is going to be bigger than it should be. OR = ad/bc; the numerator is higher, so the OR will be overestimated. Berkson's bias: only for hospital-based CCS. An example is coffee drinking (exposure) and pancreatic cancer (outcome). Controls cannot have pancreatic cancer, but another disease is ok. The way you pick controls is
important too. They picked people with gallbladder disease as controls. People with gallbladder disease had been told by their GI doctor not to drink coffee, so exposure among controls is affected: the b cell will be lower than it should be. The denominator is smaller, so the OR will be overestimated. The first two biases are for CCS.
Response bias: people who are eligible but do not respond (eligible nonparticipants). Can occur in any study. Example: obesity and hypertension. What if the people who do not participate are more likely to be obese or hypertensive? The study result will be affected.
Follow-up or attrition bias: cohort and experimental studies (clinical trials). People who drop out may be more likely to develop the disease, which may affect the RR.
Prevalence-incidence bias (survival bias): CCS and cross-sectional studies. You can only pick people who are still alive, so those who died have no chance of being included in the study. Another reason is that once people have the disease they will likely change their behavior. If you ask about smoking in a cross-sectional study, behavior may already have changed because the person has COPD, for example, but you capture exposure info at only one point in time.
Recall bias: people with the disease are more likely to remember exposure info than those without it.
Observer or interviewer bias: if the interviewer knows who is a case and who is a control, they will be more or less likely to probe. Questions have to be asked the same way, or the result is unequal exposure info.
Misclassification bias: diseased people are misclassified as nondiseased and vice versa, or exposed people are classified as nonexposed. This can occur in any study design. In a CCS we ask cases and controls about exposure. Say we ask about DC use and give a questionnaire to both groups, but 2 or 3 questions were badly worded, so the instrument has an error. You could misclassify exposed and nonexposed, but the degree of error is the same in both groups: nondifferential (random) misclassification. Suppose we have 2 groups and want to measure BP, and in one group we measure 3 times and take an average; for the 2nd group we measure
it once, so the measurement error is different: differential (nonrandom) misclassification. Nondifferential misclassification will, as a rule, bias towards the null (underestimate). If it is differential, the estimate could be under- or overestimated, or equal to 1, so you cannot predict the direction or magnitude of the bias. Practical implication: make sure you get the info the same way from both groups you want to compare.
Confounding: not the primary exposure, not the outcome, but a 3rd variable that can distort the association between exposure and disease. Example: the association is around 6 but differs by gender; gender is the 3rd variable.
Theoretical definition: a confounder has to meet 4 criteria. It must be associated with the exposure, it must be associated with the disease, it cannot be in the causal pathway between exposure and disease, and it has to be differentially distributed.
Primary exposure: smoking. Outcome: disease. 3rd variable: yellow staining on the hands is associated with the exposure but does not cause heart disease. It is associated with the exposure only, not the disease, so it does not meet the criteria for confounding.
Potential 3rd variable: age. Is it associated with smoking and with lung cancer? Yes, so it meets criteria 1 and 2. At this point we call age a potential confounder; we must prove it with data.
A confounder cannot be in the causal link between exposure and disease. HRT and CHD: the 3rd variable is HDL cholesterol. If you take HRT, CHD lessens and HDL (good cholesterol) increases. HDL is in the causal pathway and does not meet the definition of a confounder. Age is not in the causal link, though.
A confounder has to be differentially distributed. Suppose the exposure is smoking, the outcome is lung cancer, and the potential confounder is age. Smokers and nonsmokers are the exposed and nonexposed groups, and age is split into 60 and up versus less than 60. If the age distribution is almost the same in both groups, then even though age is a confounder you will not be able to detect it. To detect it with data, it has to be differentially distributed.
We usually set up a 2x2 table with exposure and disease. How do we tell if the confounder is associated with
exposure or disease? Set up a table of confounder by exposure and then a table of confounder by disease. Some biological links are new and we may not know them yet, so you may treat the variable as a potential confounder.
The method of analyzing the data is stratified analysis. If in a sample of 100 people we find an association between exposure and disease, this is called the crude association. We think gender may be a potential confounder, so we stratify the data on gender. Of the 100 people, 60 are male and 40 are female, making two tables, one for each gender. We have 3 measures: crude, stratum-specific (SS) 1, and SS 2.
How do we know if gender is confounding? If SS 1 and SS 2 are homogeneous and they are different from the crude. Crude is 2.8, SS male is 1.5, and SS female is 1.38. The two SS values are similar, so they are homogeneous, but they are different from the crude, so gender is a confounder.
Effect modifier (EM): also a 3rd variable, one that changes the level of your MOA. Crude = 2.8, SS male = 1.3, SS female = 6.5. The SS values are heterogeneous and different from the crude. We call gender an effect modifier: the level of association changes with the level of the 3rd variable.
Crude = 2.8, SS male = 2.7, SS female = 2.9: all close to 2.8 and homogeneous. Gender is neither a confounder nor an effect modifier.
You have to set a percentage a priori, before the study, for what counts as a modifier.
EM is also called interaction. Your primary exposure is a calcium blocker and the outcome is reduced BP. Suppose you also give diuretics (the 3rd variable), which reduce BP. Combining them reduces BP further. This is called a synergistic interaction; effect modifier is the epi word for it. Now take a calcium blocker for hypertension together with cough medicine containing pseudoephedrine, which increases BP; they affect each other. In stats this is an antagonistic interaction, and in epi it is still EM. We have to set a priori what it means.
Potential confounder: assessed only by the conceptual definition, when we do not have the data yet.
Confounding, as opposed to bias, can be controlled. One way is through stratification in the analysis. There are many variables that have to be controlled simultaneously. That is multivariable
analysis. In the design phase we can use randomization. In clinical trials you randomize the groups: you assign them, and subjects do not pick what they get. Randomization helps equalize the groups. You cannot use randomization for cohort, CCS, or cross-sectional designs. The 2nd method is restriction (exclusion and inclusion criteria): if you think age will be a problem, you can restrict the study to certain age groups to control for age. Another method is matching (CCS).
EM is a biological phenomenon, so you cannot control for it; you just present it. Asbestos can be an EM, so you cannot control for it in your analysis; you just find the ORs for smoking and asbestos and show them.
With confounding, the stratum-specific estimates tend to fall on one side of the crude estimate, whereas with EM they tend to fall on either side.
Is the result true for the study population I am studying? That is internal validity (inside the study pop). Can I apply this conclusion to the general population? That is external validity (outside the study pop). Internal validity is more important; without it you cannot make generalizations.
Now we need to assess whether smoking is a cause of lung cancer. For this we use the Bradford Hill causal criteria, which are very subjective; there is no rule for the minimum number that must be met. You need to know the natural history of the disease.
9 criteria, including:
Specificity: one exposure causes one outcome. Difficult for chronic diseases; many times it is not met.
Temporal relationship: determined by study design.
Dose response: as exposure increases, the MOA increases. This is a linear relationship.
Biological plausibility: high cholesterol and CHD. There may be times it is not met because with our current knowledge we do not have that info yet.
Coherence: does not conflict with existing knowledge. Also hard to prove because we may not have any other info besides epi studies.
Strength and dose response: the only 2 criteria we can prove quantitatively.

11/17 Week 13
Screening: being able to detect a disease before the symptoms, before it is too late. The disease process is still at an early stage, so treatment may be more effective and less costly. Screening tests are not 100% accurate, which can result in misclassification.
Reliability: if
you measure something multiple times you get the same results. Validity: you are getting exactly what the truth is (quality). Say someone's BP is 140 and you measure it 3 times: 120, 120, and 119.5. That is reliable but not valid. If all three are close to 140, it is valid and reliable. If the 3 measurements are all over the place, it is neither reliable (consistency, repeatability) nor valid (accuracy). Random error affects reliability and systematic error affects validity. If you increase sample size you are more likely to be reliable.
With a valid test, if the person is truly diseased the test is positive, and vice versa. To see how good the screening is we have to test it. For screening tests we have 2 measures: sensitivity (correctly identifying truly diseased people) and specificity (correctly identifying truly nondiseased people).
Screening for cholesterol is an example of primary prevention. Secondary prevention is mammography to detect breast cancer early. Tertiary would be after disease arises.
What is the primary purpose of screening in PH? Prevention and control, monitoring over time for surveillance data, societal protection, and exclusions (for example to be an athlete or part of the military). Screening should be done during the asymptomatic period.
Criteria for a screening test: it has to be rapid and inexpensive, there should be a treatment, and it has to be acceptable to the community. That is why there are no screenings for rare diseases.
How do we assess how good a screening test is? Accuracy and validity. Only true positives and true negatives are real.
Sensitivity = a/(a + c): the conditional probability of getting a positive test given that the individual is truly diseased. The complement of sensitivity is the false negative proportion. If a test has 90% sensitivity, 10% will be false negatives.
Specificity = d/(d + b): the conditional probability of getting a negative test given that the person is truly nondiseased. The complement is the false positive proportion. Let's say you test 200 nondiseased people for VD and the test is 95% specific: 190 correctly test negative and 10 are false positives.
S and S are influenced by
what you use as a cutoff value. If a test is highly sensitive, the specificity goes down, and vice versa. Improving specificity decreases sensitivity. S and S have nothing to do with how prevalent a disease is, only with how good the test is, and they are influenced by cutoffs.
In the early phase of HIV/AIDS, when the disease was fatal, you want specificity to be better, so that fewer nondiseased people are wrongly told they have a fatal disease. What if you have a disease that is highly transmissible but preventable? You want sensitivity to be higher.
Positive and negative predictive values: when you go to the doctor you ask, if I have a positive test, how likely am I to truly have the disease? This is the reverse of S and S.
PPV = a/(a + b): as a conditional probability statement, if I have a positive test, how likely am I to have the disease?
NPV = d/(d + c): if I have a negative test, how likely am I to not have the disease?
PPV and NPV are affected by the prevalence of the disease.
How do we calculate prevalence from the 2 by 2 table? (a + c)/(a + b + c + d).
How do we calculate the accuracy of the test? (a + d)/total.
We always use one test as the gold standard. An EKG can be called the truth, and the people it detects are the true cases. You put the gold standard on top of the table.
In the example, sensitivity would be 57/107, specificity 86/93, PPV 57/64, NPV 86/136, prevalence 107/200, and accuracy of the test (57 + 86)/200.
Now test again with a lower prevalence: S and S remain the same because they are not affected by prevalence, but PPV and NPV change. When prevalence goes down, PPV goes down and NPV goes up. If the prevalence is close to 0 it is not worth doing a screening because you will not be able to detect the disease. NPV has an inverse relationship and PPV a direct relationship with prevalence.
Risks of screening: may not get health insurance, stress associated with results, costs associated with screening.
After mass screening in a population you have to assess how good the screening program is. You have to be aware of some of the biases associated with it.
Types of bias in screening: self-selection or volunteer bias: people who
are willing to come for the screening are usually healthier.
2nd bias: lead time bias. You want to do the screening test when the disease is detectable by screening and still asymptomatic. If you want to compare screened versus unscreened people, you have to consider lead time bias. Suppose Jack was detected by screening before developing symptoms and he now knows he has colon cancer. John comes to the doctor because he has symptoms, so he was detected only at that point.
Both die at age 65.
You may erroneously say that Jack lived longer than John even though they both died at age 65.
Jack was detected earlier but they died at the same age, so we cannot say that Jack lived longer; he was just detected earlier.
When comparing screened and unscreened populations you have to think about when they were detected.
Length bias depends on how aggressive the disease is.
Let's say pancreatic cancer in 2 populations: some have slow-growing cancer and some have aggressive cancer. Slow-growing cancer has a long window for detection while aggressive cancer has a short one.
If you're comparing the 2 populations and the disease can have slow- and fast-growing tumor types, you have to consider that.

12/1 Week 15
Moving the cutoff from 120 to 110 increases sensitivity and reduces specificity. Sensitivity: a truly diseased person getting a positive result. PPV: chance of having the disease if the test is positive.
What is an outbreak? Endemic: expected rate, commonly present, like the common cold and flu.
Epidemic: higher than normal. Pandemic: global outbreak that crosses geographic boundaries. Outbreak is a synonym for epidemic. Outbreak investigation is for infectious diseases. Cluster investigations: outbreaks of chronic diseases.
An outbreak is when there is an excess over the expected rate.
We know this because of monitoring trends.
When do we know it is an outbreak? Normally because of surveillance.
Patients or family report to health care professionals.
Passive surveillance: physicians, labs, and clinics are required to report. Active surveillance: health care professionals go out and get the info.
Sometimes the media gets the info before other people do.
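As a check on the screening measures reviewed at the start of Week 15, the 2 by 2 example from Week 13 can be run through the formulas. This is a minimal Python sketch, not part of the lecture: the cell counts a = 57, b = 7, c = 50, d = 86 are inferred from the fractions in the notes (57/107, 86/93, 107/200), the function names are made up for illustration, and the 10% prevalence used at the end is an arbitrary illustrative value.

```python
def screening_measures(a, b, c, d):
    """Screening measures from a 2x2 table with the gold standard on top.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    (Hypothetical helper; the name is not from the lecture.)
    """
    total = a + b + c + d
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "prevalence": (a + c) / total,
        "accuracy": (a + d) / total,
    }


def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV for a given prevalence, using Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Cell counts inferred from the fractions in the notes (an assumption).
measures = screening_measures(a=57, b=7, c=50, d=86)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")

# Same test in a population with lower prevalence.
# 0.10 is an illustrative value, not taken from the notes.
ppv_low, npv_low = predictive_values(
    measures["sensitivity"], measures["specificity"], prevalence=0.10
)
print(f"PPV at 10% prevalence: {ppv_low:.3f}")
print(f"NPV at 10% prevalence: {npv_low:.3f}")
```

Sensitivity and specificity are passed through unchanged in the second call, so only prevalence moves. This matches the point in the notes that PPV falls and NPV rises when prevalence drops.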
Why do we need to investigate outbreaks?
For control and prevention, and for the opportunity for research and training.
We need to reassure the public and minimize economic and social disruption, and we have a legal obligation to go out and investigate what the problem is.
What is the priority of investigation? If there are simultaneous outbreaks, they are prioritized based on causative agent and mode of transmission.
CDC 10 steps in outbreak investigation:
Prepare for field work: may involve a review of the literature. The CDC website lists all possible outbreaks in the US. A field study for outbreaks is different because you have to respond fast. Administration: preparing supplies and equipment, following people who got sick. Consultation: notifying labs of samples, securing labs and equipment.
Establish the existence of an outbreak using previous data. For chronic disease this is a cluster investigation. With a cluster you do not have to show more cases than expected, because many times we do not know what to expect.
Verify the diagnosis: clinical findings (signs and symptoms), lab findings, and interviews with cases and with people who were exposed and did not develop the disease.
Establish a case definition for the outbreak. The case definition should include person, place, and time. If cases come from the same source you do not have to wait for all of them to have confirmatory lab results; it's a way to cut costs.
Then we count cases and plot the epidemic curve, which shows the number of cases over time from the day of possible exposure. It could be a restricted population, like just checking kids who go to daycare.
Perform descriptive epi: person, place, and time.
Time info gives us a lot. The time from first exposure to developing symptoms is the incubation period; normally there is a range. By plotting the epidemic curve we can get an idea of the incubation period if we do not know what is causing the infection.
Point source: occurs within a very short time and one incubation period. If you see a spike in the histogram it could be a point source.
Continuous common source: the infectious agent is in the water and you might
not drink it, so cases usually occur over a long time. On a graph it would show more of a plateau. Example 1 is an example of a point source.
Intermittent common source: maybe blood from donors is contaminated and healthcare workers work with this blood, but not all the time. You would see a spike, then fewer cases, and then a spike again.
Propagated source: bar graph with red and blue. The source is closed but the infected people infect more people. The original people (index cases) later go on to infect others person to person.
Common and point sources: you just get it from food, water, or blood.
Place: helpful for chronic disease clusters. The problem with a spot map is that you only find the cases; you do not know the denominator needed for things like prevalence and incidence.
Person: look at the characteristics of people who do and don't get the disease. Do they differ?
Then we develop a hypothesis.
Test the hypothesis. Sometimes you may not need to test the hypothesis; it depends on the situation and whether you can narrow down the source. Usually the evidence is not obvious, so you have to carry out a study.
Types of study design for outbreak investigation: cohort and case-control.
Cohort: you need to be able to follow these people, so it is recommended if the population is well defined. MOA for cohort: relative risk, but calculate the attack rate first, which is a cumulative incidence (CI), a proportion.
Example in Oswego. NH: there is no association between eating vanilla ice cream and gastroenteritis. Attack rate among exposed = 43/54. Attack rate among nonexposed = 3/21. You can tell from the CI that those who ate it had an 80% chance of getting it and those who did not, 14%. RR = CI exposed / CI nonexposed = about 5.6. Interpretation: those who ate vanilla ice cream had about 5.6 times the risk of getting gastroenteritis compared with those who did not. You have to calculate chi-square (X2) to see if it is statistically significant.
Not well-defined population: case-control. Start with cases and select appropriate controls. NH: there is no association between shopping at Grocery Store A and legionellosis. Can we calculate an attack rate here? No, because we are not following people, so we cannot
calculate CI. OR = ad/bc = 11.2. Interpretation: the odds of shopping at Grocery Store A among people who have legionellosis are 11.2 times the odds among those who do not. Use chi-square (X2) to see if it is statistically significant.
Sometimes we have to do additional work because we cannot find a source or an association, so we have to do more studies.
Implement control and prevention ASAP.
Then we want to communicate findings. Tell local authorities and residents.
Examples of cluster investigations: thalidomide and congenital malformations in babies; the Love Canal tragedy.
Causation of chronic diseases is multifactorial, which presents a challenge. For cluster investigations we do the same as for infectious disease (ID) and do a case-control study.
Know incubation period and herd immunity; see pg 511 of Friis on other outbreaks.
Herd immunity: when the majority of people are immune through vaccination or prior infection, the rest of the population is indirectly protected from the disease.
Also: study design, target population, main outcome, chance and bias, anything after CCS, incidence and prevalence.
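The cohort arithmetic from the Oswego example can be sketched in a few lines of Python. This is only an illustration: the function names are made up, the cell counts come from the attack rates in the notes (43 ill of 54 who ate the ice cream, 3 ill of 21 who did not), and the chi-square shown is the plain Pearson formula for a 2x2 table without continuity correction, which may differ from the variant used in class. The notes do not record the cell counts for the Grocery Store A case-control example, so the odds ratio helper is defined but not applied to it.

```python
def attack_rate(ill, total):
    """Attack rate (cumulative incidence): ill / everyone in the group."""
    return ill / total


def relative_risk(a, b, c, d):
    """RR for a cohort 2x2 table.

    a = exposed ill, b = exposed well,
    c = nonexposed ill, d = nonexposed well.
    """
    return attack_rate(a, a + b) / attack_rate(c, c + d)


def odds_ratio(a, b, c, d):
    """OR = ad/bc, the MOA for a case-control study."""
    return (a * d) / (b * c)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table, no continuity correction."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Oswego counts taken from the attack rates in the notes.
a, b = 43, 54 - 43   # ate vanilla ice cream: ill, well
c, d = 3, 21 - 3     # did not eat it: ill, well

ar_exposed = attack_rate(a, a + b)
ar_nonexposed = attack_rate(c, c + d)
rr = relative_risk(a, b, c, d)
chi2 = chi_square_2x2(a, b, c, d)

print(f"Attack rate, exposed:    {ar_exposed:.3f}")
print(f"Attack rate, nonexposed: {ar_nonexposed:.3f}")
print(f"Relative risk:           {rr:.2f}")
print(f"Chi-square (1 df):       {chi2:.2f}")
```

A chi-square value above 3.84 (1 degree of freedom, alpha = 0.05) would lead to rejecting the NH of no association between the ice cream and illness.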