BIOSTATISTICS OF RESRCH DESIGN
BIOSTATISTICS OF RESRCH DESIGN DENS 580
Virginia Commonwealth University
This 8 page Class Notes was uploaded by Shanelle Schimmel on Wednesday October 28, 2015. The Class Notes belongs to DENS 580 at Virginia Commonwealth University taught by Alvin Best in Fall. Since its upload, it has received 16 views. For similar materials see /class/230600/dens-580-virginia-commonwealth-university in Dental Public Health Sciences at Virginia Commonwealth University.
Date Created: 10/28/15
Trials to assess equivalence: the importance of rigorous methods

BMJ 1996;313:36-39 (6 July)

B Jones, professor of medical statistics,1 P Jarvis, senior lecturer in medical statistics,1 J A Lewis, visiting professor in medical statistics,1 A F Ebbutt, director of European clinical statistics2

1 Department of Medical Statistics, School of Computing Sciences, De Montfort University, Leicester LE1 9BH
2 Glaxo Wellcome Ltd, Greenford, Middlesex UB6 0HE

Correspondence to: Professor J A Lewis, Medicines Control Agency, London SW8 5NQ

The aim of an equivalence trial is to show the therapeutic equivalence of two treatments, usually a new drug under development and an existing drug for the same disease used as a standard active comparator. Unfortunately the principles that govern the design, conduct, and analysis of equivalence trials are not as well understood as they should be. Consequently such trials often include too few patients or have intrinsic design biases which tend towards the conclusion of no difference. In addition, the application of hypothesis testing in analysing and interpreting data from such trials sometimes compounds the drawing of inappropriate conclusions, and the inclusion and exclusion of patients from analysis may be poorly managed. The design of equivalence trials should mirror that of earlier successful trials of the active comparator as closely as possible. Patient losses and other deviations from the protocol should be minimised; analysis strategies to deal with unavoidable problems should not centre on an intention to treat analysis but should seek to show the similarity of results from a range of approaches. Analysis should be based on confidence intervals, and this also carries implications for the estimation of the required numbers of patients at the design stage.

The gold standard in clinical research is the randomised placebo controlled double blind clinical trial. This design is favoured for confirmatory trials carried out as part of the phase III development of new
medicines. Because of the number and range of medicines already available, however, new medicines are increasingly being developed for indications in which a placebo control group would be unethical. In such situations one obvious solution is to use as an active comparator an existing drug already licensed and regularly used for the indications in question. Some authors have questioned whether placebo controlled trials are used excessively and unethically [1-3], and such views would reinforce the trend towards using active comparators. Others have proposed that, once licensed, new drugs should be compared with existing treatments for the same indication in order to examine their relative cost effectiveness and that large randomised trials are the appropriate tool.

When an active comparator is used the expectation may sometimes be that the new treatment will be better than the standard, and the objective is to demonstrate this fact unequivocally. This situation is similar to using a placebo control and poses no special methodological problems. More probably, however, the new treatment is simply expected to match the efficacy of the standard treatment but have advantages in safety, convenience, or cost; in some cases the new treatment may have no immediate advantage but may present an alternative or second line therapy. Under these circumstances the objective of the trial is to show equivalent efficacy: the so called equivalence trial. Such trials have been referred to as active control equivalence studies or positive control studies.

This paper describes the methodological issues that surround equivalence trials and explains their implications. We explain why equivalence trials generally need to be larger than their placebo controlled counterparts, why their standard of conduct needs to be especially high, why the handling of withdrawals, losses, and protocol deviations needs more care than usual, and why different approaches to analysis and interpretation are appropriate. A proper appreciation of these issues
ensures that, when equivalence trials are conducted, they reach the scientific standards necessary for reliable conclusions to be drawn.

There are two fundamental methodological features of equivalence trials which underlie the general approach to their design and analysis, and these will be addressed first. These features distinguish equivalence trials from trials whose aim is to detect a difference between two treatments and which are referred to here as comparative trials.

Confidence intervals and sample size

The first feature relates to the statistical methods used for analysis and the consequences for determining the required number of patients. In a comparative trial the standard analysis uses statistical significance tests to determine whether the null hypothesis of no difference may be rejected, together with confidence limits to place bounds on the possible size of the difference between the treatments. In an equivalence trial the conventional significance test has little relevance: failure to detect a difference does not imply equivalence; a difference which is detected may not have any clinical relevance and may correspond to practical equivalence. The relevance of the confidence interval, however, is easier to see. This defines a range for the possible true difference between the treatments, any point of which is reasonably compatible with the observed data. If every point within this range corresponds to a difference of no clinical importance then the treatments may be considered to be equivalent.

It is important to emphasise that absolute equivalence can never be demonstrated: it is possible only to assert that the true difference is unlikely to be outside a range which depends on the size of the trial, the results of the trial, and the specified probabilities of error. If we have predefined a range of equivalence as an interval from $-\Delta$ to $+\Delta$ we can then simply check whether the confidence interval centred on the observed difference lies entirely between $-\Delta$ and $+\Delta$. If it does
equivalence is demonstrated; if it does not, there is still room for doubt. Possible results of the comparison of a confidence interval with a predefined range of equivalence are shown in figure 1, and the importance of not basing conclusions on statistical significance can also be seen in this figure. Any confidence interval which does not overlap zero corresponds to a statistically significant difference.

[Fig 1: Examples of possible results of using the confidence interval approach. $-\Delta$ to $+\Delta$ is the prespecified range of equivalence; the horizontal lines correspond to possible trial outcomes expressed as confidence intervals, with the associated significance test result shown on the left; above each line is the decision concerning equivalence (equivalent, uncertain, or not equivalent). Horizontal axis: true difference, marked at $-\Delta$, 0, and $+\Delta$.]

This intuitive procedure of checking whether a confidence interval lies within a range of equivalence does in fact correspond to a significance testing procedure, but one in which the roles of the usual null and alternative hypotheses are reversed. In comparative trials the null hypothesis is that there is no difference between the treatments. The alternative hypothesis is that a difference exists. In equivalence testing the relevant null hypothesis is that a difference of at least $\Delta$ exists, and the trial is targeted at disproving this in favour of the alternative that no difference exists. This formulation is important in validating the intuitive confidence interval procedure, and it also helps in calculating sample sizes.

The formulas for calculating sample sizes for normally distributed and binary data are provided in the appendix. Values need to be specified for the range of equivalence, $\Delta$, and the probabilities of type I and II errors, $\alpha$ and $\beta$ respectively. An important point to note is that if a $100(1-2\alpha)\%$ confidence interval is used to decide on equivalence then the significance level is $\alpha$; that is, the probability of the type I error is $\alpha$. So, for example, if a 95% interval is used then $\alpha = 0.025$.

The choice of $\Delta$ is difficult and requires extensive debate with knowledgeable clinical experts, and the chosen $\Delta$ should generally be smaller than in a comparative trial. In comparative trials against placebo $\Delta$ is often set equal to a difference of undisputed clinical importance, and hence may be above the minimum difference of clinical interest by a factor of perhaps two or more; there may be scientific reasons to expect a treatment to have more than a minimal effect. However, when comparing a new agent with a standard comparator it is necessary to show that the new agent is sufficiently similar to the standard to be clinically indistinguishable. This entails using smaller values of $\Delta$ than were used to detect the effect of the standard relative to placebo. A factor of two does not seem inappropriate, leading to sample sizes roughly four times as large as those in similar comparative trials.

The selection of $\alpha$ and $\beta$ follows similar lines as for comparative trials. The use of a 95% confidence interval in an equivalence trial, as recommended by the European Committee for Proprietary Medicinal Products in its note for guidance on biostatistics, corresponds to a value for $\alpha$ of 0.025. However, $\beta$ is treated identically and is generally set to 0.1 to give a power of 0.90 or 0.2 to give a power of 0.8.

The distinction between one sided and two sided tests of statistical significance also carries over into the confidence interval approach. For a one sided test equivalence is declared if the lower one sided confidence limit exceeds $-\Delta$. This approach is indicated when the objective is to ensure that the new agent is not inferior to the standard. Equivalence or superiority are both regarded as positive outcomes.

Internal validity of trials

The second special feature affecting the equivalence trial is the lack of any natural internal check on its validity. In a comparative trial there
is a strong incentive to remove any sloppiness in design, conduct, and analysis because such sloppiness is likely to obscure any differences between the treatments. As a consequence, the detection of a treatment difference not only implies that a difference exists but also that the trial was of sufficient quality to detect it. Such an incentive and natural check on quality are lacking in an equivalence trial, where the finding of equivalence may arise either from true equivalence or from a trial with poor discriminatory power: a trial which was too small, for example, or one in which most patients were likely to improve spontaneously without medical intervention.

Box: Example of sample size calculation

Two inhalers used for the relief of asthma attacks are to be assessed for equivalence. They will be considered equivalent if the 95% two sided confidence interval for the treatment difference, measured using morning peak expiratory flow rate (l/min), falls wholly within the interval $\pm 15$ l/min (that is, $\Delta = 15$ and $\alpha = (1 - 0.95)/2 = 0.025$). From a previous trial the prior estimate of $\sigma^2$, the between subject variance of morning peak expiratory flow rate, is 1600 (l/min)$^2$. The sample size of each group is to be such that there is a power of 0.80 that the inhalers will be deemed equivalent if they are in fact identical. To use the formula for normally distributed data given in the appendix, we note that $z_{1-\alpha} = z_{0.975} = 1.96$ and $z_{1-\beta/2} = z_{0.90} = 1.28$ from tables of the normal distribution, so

$$n = 2 \times (40^2/15^2) \times (1.96 + 1.28)^2 = 149.3 \text{ (about 150)}$$

Each group should contain 150 patients.

The finding in a trial that two treatments are equivalent does not require that both treatments were effective; it is equally compatible with the alternative that neither was. In any equivalence trial, therefore, it is vitally important to have some means of confirming that both treatments were indeed effective. We need to be certain that, if a third (placebo) arm had been included, both active treatments would have been shown to be superior to placebo. The degree of certainty can
be increased only by paying careful attention to the design of the equivalence trial, by being strict about matters of conduct, and by making additional checks during analysis.

The active comparator is usually a licensed medicine which has been evaluated in controlled trials against placebo, perhaps during the phase III studies used to support its marketing application. If the equivalence trial mirrors as closely as possible the methods used in these earlier placebo controlled trials then confidence in its results will be increased, since the methods have been positively validated in a similar context. Important design features to follow as closely as possible are the inclusion and exclusion criteria defining the patient population, the dosing schedule of the standard treatment, the use of concomitant medication and other interventions, and the primary response variable and its schedule of measurements. During analysis it is valuable to show similarities between the equivalence trial and the earlier comparative trials in terms of patient compliance, the response during any run in period, and the scale of patient losses and the reasons for them.

The two major features covered so far provide the background for some brief comments on other considerations in the design, conduct, and analysis of equivalence trials.

Design and conduct

The amount of information available to plan an equivalence trial will generally exceed the amount available at the time of planning earlier trials of the active comparator. There should be little excuse, therefore, for poor design. Double blinding of medication may pose extra difficulties but is no less important than in comparative trials, and randomisation is equally important.

Inclusion and exclusion criteria must be carefully chosen on the basis of prior experience of the active comparator to ensure that the trial contains patients likely to respond to the active comparator and hence avoid a conclusion of equivalence through nonresponse. Care in this choice
should be mirrored in the response observed to the trial treatments. The level of success for success/failure outcomes should be similar to that seen in previous trials of the active comparator. For more quantitative endpoints, improvements from baseline in the course of the trial provide some assurance that the trial treatments have both been effective.

The dosing regimen and period of dosing of the active comparator should reflect the standard manner of use known to be effective on the basis of earlier clinical trials, and there should be a sound rationale for the choice of the potentially equivalent dosing regimen of the new medication. If the doses chosen for both agents are too high then patients may reach an upper threshold in response, leading to a conclusion of equivalence which may not carry over to the doses more likely to be used in practice. Unreasonably low doses may lead to similar false conclusions through lack of response.

It is sometimes necessary to check that all patients can tolerate one or both treatments in order to maintain patient numbers and hence power, and this should be done during a run in period before randomisation. The use in all patients of a standard dose of concomitant medication with known beneficial effects can also result in patients reaching their upper threshold of response and hence lead to the masking of treatment differences. Alternatively, if the use of concomitant medication is flexible, greater use in one arm of the trial may produce a bias towards equivalence. Similar biases towards equivalence can arise from the use of rescue medication in patients in whom treatment fails; that is, from patients who withdraw from randomised treatment because of lack of efficacy. These issues are closely connected with the means adopted for dealing with such patients in the analysis.

Box: Example: assessment of equivalence

Two inhalers, R and T, used for the relief of asthma attacks were compared in an equivalence trial using morning peak expiratory flow rate (l/min) as the primary measurement. The range of equivalence was set at $\pm 15$ l/min (that is, $\Delta = 15$). The results of the trial were as follows:

Mean morning peak expiratory flow rate on treatment: R = 420 l/min (150 patients); T = 417 l/min (150 patients)
Mean difference between R and T: $d = 3$
Estimated standard error of the mean difference: $SE(d) = 4$

The 95% confidence interval for the true difference ranges from $d - 1.96\,SE(d)$ to $d + 1.96\,SE(d)$, where 1.96 is the appropriate value from tables of the normal distribution (that is, $z_{0.975}$). This interval is $-4.8$ to $10.8$ and lies entirely within the range of equivalence of $-15$ l/min to $15$ l/min, and so equivalence is confirmed.

Analysis

The most difficult issue relating to the analysis of an equivalence trial concerns which patients, and which data from these patients, to include. The most common approaches to the analysis of randomised trials are intention to treat and per protocol analyses. A fuller discussion of intention to treat can be found in Lewis and Machin and a severe criticism in Salsburg.

In an intention to treat analysis patients are analysed according to their randomised treatment, irrespective of whether they actually received the treatment. Patients may fail to take a treatment altogether, may be given the wrong treatment, or may violate the protocol in some other way, but under an intention to treat analysis this does not affect matters. The strength claimed for such an analysis is that it is pragmatic; that is, that it mirrors what will happen when the treatment is used in practice. In a comparative trial, where the aim is to decide if two treatments are different, an intention to treat analysis is generally conservative: the inclusion of protocol violators and withdrawals will usually tend to make the results from the two treatment groups more similar. However, for an equivalence trial this effect is no longer conservative: any blurring of the difference between the treatment groups will increase the chance of declaring equivalence.

A per protocol analysis compares patients according to the treatment
actually received and includes only those patients who satisfied the entry criteria and properly followed the protocol. This approach might be expected to enhance any difference between the treatments rather than diminishing it, because of the removal of uninformative noise. Unfortunately, it is possible to envisage circumstances under which the exclusion of patients in a per protocol analysis might bias the results towards a conclusion of no difference, for example if patients not responding to one of the two treatments dropped out early. For this reason the subgroup of patients excluded from a per protocol analysis should be examined carefully to explore whether any biases of this nature might have occurred. Indeed, if the two treatments produce a different pattern of withdrawal for adverse events or lack of effect then this in itself is evidence that they are not entirely equivalent.

In an equivalence trial it is probably best to carry out both types of analysis and hope to show equivalence in either case. In preparation for this policy it is important to collect complete follow up data on all randomised patients as per protocol, irrespective of whether they are subsequently found to have failed entry criteria, withdraw from trial medication prematurely, or violate the protocol in some other way. Such a rigid approach to data collection allows maximum flexibility during later analysis and hence provides a more robust basis for decisions.

With respect to other aspects of analysis, equivalence trials are similar in nature to comparative trials. The result of the analysis of the primary endpoint should be one of the following:

• that the confidence interval for the difference between the two treatments lies entirely within the equivalence range, so that equivalence may be concluded with only a small probability of error;
• that the confidence interval covers at least some points which lie outside the equivalence range, so that differences of potential clinical importance remain a real possibility and equivalence cannot safely be concluded; and
• that the confidence interval is wholly outside the equivalence range (though this is likely to be rare).

Discussion

The most common failing of reported equivalence studies is that they are planned and analysed as if they were comparative studies, and the lack of a statistically significant difference is then taken as proof of equivalence. The material covered in this paper should make it clear that such an approach is likely to lead to wrong conclusions. Improvements to the standards of this type of research could be encouraged if journal editors and referees adopted a more critical attitude. The following is a suggested minimal set of criteria against which to judge reports of clinical trials in which the equivalence of two treatments is claimed:

• The size of the trial should be based on a null hypothesis of nonequivalence and an alternative hypothesis of equivalence.
• Conclusions should be drawn on the basis of an appropriate confidence interval, using the prespecified criteria of equivalence used in the sample size calculation.
• The results of both intention to treat and per protocol analyses should be presented.
• There should be adequate evidence on the rigour of the trial and of the similarity of important features of design to those of earlier comparative trials which showed useful clinical effects.
• The trial data should provide some evidence of the efficacy of the treatments; this might be success rates similar to those of previous trials or clinically important changes from baseline.

Some of these aspects could most easily be covered by insisting that papers submitted to journals referred to published trials of the standard comparator against placebo with similar methods. Referees should also be familiar with the special difficulties surrounding equivalence trials in the relevant clinical area.

Improving the standards of equivalence trials has consequences for the resources required. Such trials will become
larger, and their monitoring will become more labour intensive in order to ensure they are conducted in close accordance with the protocol, so minimising the occurrence of biases towards a conclusion of no difference.

Funding: BJ and PJ thank Glaxo Wellcome Ltd for providing a research grant.

Conflict of interest: None.

Appendix: Sample size and power formulas

NORMALLY DISTRIBUTED DATA (COMPARISON OF MEANS)

We assume that subjects are randomised into two treatment groups of equal size $n$, the groups being denoted by R (reference treatment) and T (test treatment). Let $\mu_R$ and $\mu_T$ denote the expected mean values of the normally distributed observations in groups R and T respectively, and let $s^2$ be an estimate of $\sigma^2$, the variance of the observations, assumed to be the same in the two groups.

In the confidence interval approach equivalence is concluded if the interval falls entirely within two prespecified tolerance limits, $-\Delta$ and $+\Delta$. If $\bar{x}_R$ and $\bar{x}_T$ denote the observed means of the reference and test groups respectively, then, provided $n$ is reasonably large, the two sided $100(1-2\alpha)\%$ confidence interval for $\mu_R - \mu_T$ is

$$(\bar{x}_R - \bar{x}_T) \pm z_{1-\alpha}\sqrt{2s^2/n}$$

where $z_{1-\alpha}$ is the $100(1-\alpha)\%$ point of the normal distribution. That is, if $\chi$ has the standard normal distribution with mean 0 and variance 1 then $\Pr(\chi < z_{1-\alpha}) = 1 - \alpha$.

When the confidence interval (or significance testing) approach is used to assess equivalence two sorts of mistake can occur: we can decide that the treatments are equivalent when they are not (the type I error, with probability $\alpha$), or we can decide the treatments are not equivalent when they are (the type II error, with probability $\beta$). These definitions are an exact switch of those applying to conventional significance testing. The values of $\alpha$ and $\beta$ depend on the size of the true difference between the treatment means, $\delta = \mu_R - \mu_T$. The value of $\alpha$ reaches a maximum on the boundary of the range of equivalence, that is when $|\delta| = \Delta$, and this is the value of $\alpha$ used in calculations. The
value of $\beta$ is usually calculated at the point of equivalence, that is at $\delta = 0$. The corresponding power of the trial, $1 - \beta$, is the probability of correctly declaring equivalence when $\delta = 0$.

The sample size and the power formulas for a $100(1-2\alpha)\%$ two sided interval are as follows. The null hypothesis is $H_0: |\mu_R - \mu_T| \ge \Delta$ (inequivalence). The alternative hypothesis is $H_1: -\Delta < \mu_R - \mu_T < \Delta$ (equivalence).

$$n = \frac{2s^2}{\Delta^2}\left(z_{1-\alpha} + z_{1-\beta/2}\right)^2$$

$$\text{Power} = 2\Phi\left(\frac{\Delta}{\sqrt{2s^2/n}} - z_{1-\alpha}\right) - 1$$

where $\Phi(x)$ denotes $\Pr(\chi < x)$ and $\chi$ has the standard normal distribution with mean 0 and variance 1.

For a $100(1-\alpha)\%$ one sided interval the corresponding formulas are $H_0: \mu_R - \mu_T \ge \Delta$ (inequivalence) and $H_1: \mu_R - \mu_T < \Delta$ (equivalence).

$$n = \frac{2s^2}{\Delta^2}\left(z_{1-\alpha} + z_{1-\beta}\right)^2$$

$$\text{Power} = \Phi\left(\frac{\Delta}{\sqrt{2s^2/n}} - z_{1-\alpha}\right)$$

BINARY DATA (COMPARISON OF PERCENTAGES)

Using notation found in Pocock, we define $p$ to be the overall percentage of successes to be expected if the treatments are equivalent and use $\Delta$ to define the range of equivalence for the difference in percentage success rates. Other notation is unchanged. The required size of each treatment group and the power can be calculated as follows.

Two sided case:

$$n = \frac{2p(100-p)}{\Delta^2}\left(z_{1-\alpha} + z_{1-\beta/2}\right)^2$$

$$\text{Power} = 2\Phi\left(\frac{\Delta}{\sqrt{2p(100-p)/n}} - z_{1-\alpha}\right) - 1$$

One sided case:

$$n = \frac{2p(100-p)}{\Delta^2}\left(z_{1-\alpha} + z_{1-\beta}\right)^2$$

$$\text{Power} = \Phi\left(\frac{\Delta}{\sqrt{2p(100-p)/n}} - z_{1-\alpha}\right)$$

References

1 Rothman KJ, Michels KB. The continuing unethical use of placebo controls. N Engl J Med 1994;331:394-8.
2 Taubes G. Use of placebo controls in clinical trials disputed. Science 1995;267:25-6.
3 The use of placebo controls. N Engl J Med 1995;332:60-2.
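The appendix formulas for normally distributed data, and the confidence interval check used in the two inhaler examples, can be sketched in Python as follows. This is only an illustrative sketch, not code from the paper: the function names are hypothetical, it relies on the standard library's `statistics.NormalDist` for normal quantiles, and it assumes the large sample normal approximation and equal group sizes described in the appendix.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Standard normal distribution (mean 0, variance 1), as used in the appendix.
STD_NORMAL = NormalDist()


def sample_size_two_sided(s2, delta_range, alpha, beta):
    """Patients per group for a two sided 100(1 - 2*alpha)% interval.

    Hypothetical helper for
    n = (2 * s2 / delta_range**2) * (z_{1-alpha} + z_{1-beta/2})**2,
    where delta_range is the half width of the range of equivalence.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1 - beta / 2)
    return 2 * s2 / delta_range**2 * (z_alpha + z_beta) ** 2


def power_two_sided(s2, delta_range, alpha, n):
    """Approximate power to declare equivalence when the true difference is 0."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    standard_error = sqrt(2 * s2 / n)
    power = 2 * STD_NORMAL.cdf(delta_range / standard_error - z_alpha) - 1
    # The approximation can fall below 0 for very small n; clamp it (an
    # assumption of this sketch, not part of the appendix formula).
    return max(0.0, power)


def assess_equivalence(diff, se, delta_range, alpha):
    """Compare the 100(1 - 2*alpha)% interval with (-delta_range, +delta_range)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    lower, upper = diff - z_alpha * se, diff + z_alpha * se
    if -delta_range < lower and upper < delta_range:
        verdict = "equivalent"        # interval entirely inside the range
    elif upper < -delta_range or lower > delta_range:
        verdict = "not equivalent"    # interval wholly outside the range
    else:
        verdict = "uncertain"         # interval covers points outside the range
    return lower, upper, verdict


if __name__ == "__main__":
    # Inputs taken from the inhaler examples: s2 = 1600, range of +/-15 l/min.
    n = sample_size_two_sided(s2=1600, delta_range=15, alpha=0.025, beta=0.20)
    print(f"n per group: {n:.1f}, rounded up to {ceil(n)}")
    print(f"power at n=150: {power_two_sided(1600, 15, 0.025, 150):.3f}")
    lower, upper, verdict = assess_equivalence(diff=3, se=4, delta_range=15, alpha=0.025)
    print(f"95% CI: {lower:.1f} to {upper:.1f} ({verdict})")
```

Because the sketch uses unrounded normal quantiles, the sample size comes out at about 149.4 per group rather than the 149.3 obtained in the worked example with 1.96 and 1.28; both round up to 150. The three verdicts mirror the three possible results of the primary analysis listed in the Analysis section.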