[Federal Register Volume 64, Number 185 (Friday, September 24, 1999)]
[Notices]
[Pages 51767-51780]
From the Federal Register Online via the Government Publishing Office [www.gpo.gov]
[FR Doc No: 99-24855]
-----------------------------------------------------------------------
DEPARTMENT OF HEALTH AND HUMAN SERVICES
Food and Drug Administration
[Docket No. 99D-3082]
International Conference on Harmonisation; Choice of Control
Group in Clinical Trials
AGENCY: Food and Drug Administration, HHS.
ACTION: Notice.
-----------------------------------------------------------------------
SUMMARY: The Food and Drug Administration (FDA) is publishing a draft
guidance entitled ``E10 Choice of Control Group in Clinical Trials.''
The draft guidance was prepared under the auspices of the International
Conference on Harmonisation of Technical Requirements for Registration
of Pharmaceuticals for Human Use (ICH). The draft guidance sets forth
general principles that are relevant to all controlled trials and are
especially pertinent to the major clinical trials intended to
demonstrate drug (including biological drug) efficacy. The draft
guidance describes the principal types of control groups and discusses
their appropriateness in particular situations. The draft guidance is
intended to assist sponsors and investigators in the choice of control
groups for clinical trials.
DATES: Written comments by December 23, 1999.
ADDRESSES: Submit written comments on the draft guidance to the Dockets
Management Branch (HFA-305), Food and Drug Administration, 5630 Fishers
Lane, rm. 1061, Rockville, MD 20852. Copies of the draft guidance are
available from the Drug Information Branch (HFD-210), Center for Drug
Evaluation and Research, Food and Drug Administration, 5600 Fishers
Lane, Rockville, MD 20857, 301-827-4573. Single copies of the guidance
may be obtained by mail from the Office of Communication, Training and
Manufacturers Assistance (HFM-40), Center for Biologics Evaluation and
Research (CBER), or by calling the CBER Voice Information System at
1-800-835-4709 or 301-827-1800. Copies may be obtained from CBER's FAX
Information System at 1-888-CBER-FAX or 301-827-3844.
FOR FURTHER INFORMATION CONTACT:
Regarding the guidance: Robert Temple, Center for Drug Evaluation
and Research (HFD-4), Food and Drug Administration, 5600 Fishers Lane,
Rockville, MD 20857, 301-594-6758.
Regarding the ICH: Janet J. Showalter, Office of Health Affairs
(HFY-20), Food and Drug Administration, 5600 Fishers Lane, Rockville,
MD 20857, 301-827-0864.
SUPPLEMENTARY INFORMATION: In recent years, many important initiatives
have been undertaken by regulatory authorities and industry
associations to promote international harmonization of regulatory
requirements. FDA has participated in many meetings designed to enhance
harmonization and is committed to seeking scientifically based
harmonized technical procedures for pharmaceutical development. One of
the goals of harmonization is to identify and then reduce differences
in technical requirements for drug development among regulatory
agencies.
ICH was organized to provide an opportunity for tripartite
harmonization initiatives to be developed with input from both
regulatory and industry representatives. FDA also seeks input from
consumer representatives and others. ICH is concerned with
harmonization of technical requirements for the registration of
pharmaceutical products among three regions: The European Union, Japan,
and the United States. The six ICH sponsors are the European
Commission, the European Federation of Pharmaceutical Industries
Associations, the Japanese Ministry of Health and Welfare, the Japanese
Pharmaceutical Manufacturers Association, the Centers for Drug
Evaluation and Research and Biologics Evaluation and Research, FDA, and
the Pharmaceutical Research and Manufacturers of America. The ICH
Secretariat, which coordinates the preparation of documentation, is
provided by the International Federation of Pharmaceutical
Manufacturers Associations (IFPMA).
The ICH Steering Committee includes representatives from each of
the ICH sponsors and the IFPMA, as well as observers from the World
Health Organization, the Canadian Health Protection Branch, and the
European Free Trade Association.
In May 1998, the ICH Steering Committee agreed that a draft
guidance entitled ``E10 Choice of Control Group in Clinical Trials''
should be made available for public comment. The draft guidance is the
product of the Efficacy Expert Working Group of the ICH. Comments about
this draft will be considered by FDA and the Efficacy Expert Working
Group.
In accordance with FDA's good guidance practices (62 FR 8961,
February 27, 1997), this document is now being called a guidance,
rather than a guideline.
The draft guidance sets forth general principles that are relevant
to all controlled trials and are especially pertinent to the major
clinical trials intended to demonstrate drug (including biological
drug) efficacy. The draft guidance includes a description of the five
principal types of controls, a discussion of two important purposes of
clinical trials, and an exploration of the critical issue of assay
sensitivity, i.e., whether a trial could have detected a difference
between treatments when there was a difference, a particularly
important issue in noninferiority/equivalence trials. In addition, the
draft guidance presents a detailed description of each type of control
and considers, for each: (1) Its ability to minimize bias, (2) ethical
and practical issues associated with its use, (3) its usefulness and
the quality of inference in particular situations, (4) modifications of
study design or combinations with other controls that can resolve
ethical, practical, or inferential concerns, and (5) its overall
advantages and disadvantages.
This draft guidance represents the agency's current thinking on
the choice of control group in clinical trials. It does not create or
confer any rights for or on any person and does not operate to bind FDA
or the public. An alternative approach may be used if such approach
satisfies the requirements of the applicable statute, regulations, or
both.
Interested persons may, on or before December 23, 1999, submit to
the Dockets Management Branch (address above) written comments on the
draft guidance. Two copies of any comments are to be submitted, except
that individuals may submit one copy. Comments are to be identified
with the docket number found in brackets in the heading of this
document. The draft guidance and received comments may be seen in the
office above between 9 a.m. and 4 p.m., Monday through Friday. An
electronic version of this guidance is available on the Internet at
``http://www.fda.gov/cder/guidance/index.htm'' or at CBER's World Wide
Web site at ``http://www.fda.gov/cber/publications.htm''.
The text of the draft guidance follows:
[[Page 51768]]
E10 Choice of Control Group in Clinical Trials\1\
---------------------------------------------------------------------------
\1\ This draft guidance represents the agency's current thinking
on the choice of control group in clinical trials. It does not create
or confer any rights for or on any person and does not operate to
bind FDA or the public. An alternative approach may be used if such
approach satisfies the requirements of the applicable statute,
regulations, or both.
---------------------------------------------------------------------------
1.0 Introduction
The choice of control group is always a critical decision in
designing a clinical trial. That choice affects the inferences that
can be drawn from the trial, the degree to which bias in conducting
and analyzing the study can be minimized, the types of subjects that
can be recruited and the pace of recruitment, the kind of endpoints
that can be studied, the public credibility of the results, the
acceptability of the results by regulating authorities, and many
other features of the study, its conduct, and its interpretation.
1.1 General Scheme and Purpose of Guidance
The general principles considered in this guidance are relevant
to all controlled trials. They are of especially critical importance
to the major clinical trials carried out during drug development to
demonstrate efficacy. This guidance does not address the regulatory
requirements in any region, but describes what studies using each
design can demonstrate. Although any of the control groups described
and discussed below may be useful and acceptable in studies serving
as the basis for registration in at least some circumstances, they
are not equally appropriate or useful in particular cases. After a
brief description of the five principal kinds of controls (see
section 1.3), a discussion of two important purposes of clinical
trials (see section 1.4), and an exploration of the critical issue
of assay sensitivity, i.e., whether a trial could have detected a
difference between treatments when there was one, an issue of
particular importance in noninferiority/equivalence trials (see
section 1.5), the guidance will describe each kind of control group
in more detail (see sections 2.0 through 2.5.7) and consider, for
each:
Its ability to minimize bias
Ethical and practical issues associated with its use
Its usefulness and the quality of inference in
particular situations
Modifications of study design or combinations with
other controls that can resolve ethical, practical, or inferential
concerns
Its overall advantages and disadvantages
Several other ICH guidances are particularly relevant to the
choice of control group:
E3: Structure and Content of Clinical Study Reports
E4: Dose-Response Information to Support Drug
Registration
E6: Good Clinical Practice: Consolidated Guideline
E8: General Considerations for Clinical Trials
E9: Statistical Principles for Clinical Trials
In this guidance, the terms ``test drug,'' ``study drug,'' and
``investigational drug'' are considered synonymous and are used
interchangeably; similarly, ``active control'' and ``positive
control,'' ``clinical trial'' and ``clinical study,'' ``control''
and ``control group,'' and ``treatment'' and ``drug'' are
essentially equivalent terms.
1.2 Purpose of Control Group
Control groups have one major purpose: to allow discrimination
of patient outcomes (changes in symptoms, signs, or other morbidity)
caused by the test drug from outcomes caused by other factors, such
as the natural progression of the disease, observer or patient
expectations, or other treatment. The control group experience tells
us what would have happened to patients if they had not received the
test treatment (or what would have happened with a different
treatment known to be effective).
If the course of a disease were uniform in a given patient
population, or predictable from patient characteristics such that
outcome could be predicted reliably for any given subject or group
of subjects, results of treatment could simply be compared with the
known outcome without treatment. For example, one could assume that
pain would have persisted for a defined time, blood pressure would
not have changed, depression would have lasted for a defined time,
tumors would have progressed, or the mortality after an acute
infarction would have been the same as previously seen. In unusual
cases, the course of illness is in fact predictable in a defined
population and it may be possible to use a similar group of patients
previously studied as a ``historical control'' (see section 1.3.5).
In most situations, however, a concurrent control group is needed
because it is not possible to predict outcome with adequate
accuracy.
A concurrent control group is one chosen from the same
population as the test group and treated in a defined way as part of
the same trial that studies the test drug. The test and control
groups should be similar with regard to all baseline and
on-treatment variables that could influence outcome other than the
study treatment. Failure to achieve this similarity can introduce a
bias into the study. Bias here (and as used in ICH E9) means the
systematic tendency of any aspects of the design, conduct, analysis,
and interpretation of the results of clinical trials to make the
estimate of a treatment effect deviate from its true value.
Randomization and blinding are the two techniques usually used to
prevent such bias and to ensure that the test treatment and control
groups are similar at the start of the study and are treated
similarly in the course of the study (see ICH E9). Whether a trial
design includes these features is a critical determinant of its
quality and persuasiveness.
1.2.1 Randomization
Assurance that subject populations are similar in test and
control groups is best attained by randomly dividing a single sample
population into groups that receive the test or control treatments.
Randomization avoids systematic differences between groups with
respect to variables that could affect outcome. The inability to
eliminate systematic differences is the principal problem of studies
without a concurrent randomized control (see external control
trials, section 1.3.5). Randomization also provides a sound basis
for statistical inference.
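
The random division described above can be sketched in a few lines of
Python. The example uses permuted blocks, which is only one common
allocation scheme and is an assumption made for illustration; the
guidance does not prescribe any particular method. The function name,
arm labels, block size, and seed are likewise hypothetical.

```python
import random


def permuted_block_assignments(n_subjects, arms=("test", "control"),
                               block_size=4, seed=None):
    """Return one arm label per subject using permuted-block randomization.

    Illustrative sketch only: arm labels, block size, and seed handling
    are assumptions, not requirements taken from the guidance.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)          # seeded generator so a schedule can be reproduced
    per_arm = block_size // len(arms)
    assignments = []
    while len(assignments) < n_subjects:
        block = list(arms) * per_arm   # equal allocation within each block
        rng.shuffle(block)             # random order within the block
        assignments.extend(block)
    return assignments[:n_subjects]    # the last block may be cut short


schedule = permuted_block_assignments(12, seed=2024)
print(schedule)
```

Because assignment depends only on the random sequence and not on any
subject characteristic, no systematic difference between the groups is
built into the allocation. Blocking additionally keeps the group sizes
balanced as enrollment proceeds.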
1.2.2 Blinding
The groups should not only be similar at baseline, but should
be treated and observed similarly during the trial, except for
receiving the test and control drug. Clinical trials are often
``double-blind'' (or ``double-masked''), meaning that both subjects
and investigators (including analysts of data, sponsors, other
clinical trial personnel) are unaware of each subject's assigned
treatment, to minimize the potential biases resulting from
differences in management, treatment, or assessment of patients, or
interpretation of results that could arise as a result of subject or
investigator knowledge of the assigned treatment. For example:
Subjects on active drug might report more favorable
outcomes because they expect a benefit or might be more likely to
stay in a study if they knew they were on active drug.
Observers might be less likely to identify and report
treatment responses in a no-treatment group or might be more
sensitive to a favorable outcome or adverse event in patients
receiving active drug.
Knowledge of treatment assignment could affect vigor
of attempts to obtain on-study or followup data.
Knowledge of treatment assignment could affect
decisions about whether a subject should remain on treatment or
receive concomitant medications or other ancillary therapy.
Knowledge of treatment assignment could affect
decisions as to whether a given subject's results should be included
in an analysis.
Knowledge of treatment assignment could affect choice
of statistical analysis.
Double-blinding is intended to ensure that subjective assessments
and decisions are not affected by knowledge of treatment assignment.
1.3 Types of Controls
Control groups in clinical trials can be classified on the
basis of two critical attributes: (1) The type of treatment received
and (2) the method of determining who will be in the control group.
The type of treatment may be any of the following four: (1) Placebo,
(2) no treatment, (3) different dose or regimen of the study
treatment, or (4) different active treatment. The principal methods
of determining who will be in the control group are by randomization
or by selection of a control population separate from the population
treated in the trial (external or historical control). This document
categorizes control groups into five types. The first four are
concurrently controlled (the control group and test groups are
chosen from the same population and treated concurrently), usually
with random assignment to treatment, and are distinguished by which
of the types of control treatments listed above are received.
External (historical) control groups, regardless of the comparator
treatment, are considered together as the fifth type because
of serious concerns about the ability to ensure comparability of
test and control groups in such trials and the ability to minimize
important biases, making this design usable only in exceptional
circumstances.
It is increasingly common to carry out studies that have more
than one kind of control group. Each kind of control is appropriate
in some circumstances, but none is usable or adequate in every
situation. The five kinds of control are:
1.3.1 Placebo Concurrent Control
In a placebo-controlled study, subjects are randomly assigned
to a test treatment or to an identical-appearing inactive treatment.
The treatments may be titrated to effect or tolerance, or may be
given at one or more fixed doses. Such trials are almost always
double-blind, with both subjects and investigators unaware of
treatment assignment. The name of the control suggests that its
purpose is to control for ``placebo'' effect (improvement in a
subject resulting from knowing that he or she is taking a drug), but
that is not its only or major benefit. Rather, the placebo
concurrent control design, by allowing blinding and randomization
and including a group that receives an inactive treatment, controls
for all potential influences on the actual or apparent course of the
disease
other than those arising from the pharmacologic action of the test
drug. These influences include spontaneous change (natural history
of the disease), subject or investigator expectations, use of other
therapy, and subjective elements of diagnosis or assessment.
Placebo-controlled trials seek to show a difference between
treatments when they are studying effectiveness, but may also seek
to show lack of difference (of specified size) in evaluating a
safety measurement.
1.3.2 No-Treatment Concurrent Control
In a no-treatment controlled study, subjects are randomly
assigned to test treatment or to no treatment (i.e., neither test nor
control therapy). The principal difference between this design and a
placebo-controlled trial is that subjects and investigators are not
blind to treatment assignment. Because of the advantages of
double-blind designs, this design is likely to be needed and suitable only
when it is difficult or impossible to double-blind (e.g., medical
versus surgical treatment, treatments with easily recognized
toxicity) and only when there is reasonable confidence that study
endpoints are objective and that the results of the study are
unlikely to be influenced by the factors listed in section 1.2.2.
Note that it is often possible to blind endpoint assessment, even if
the overall trial is not double-blind. This is a valuable approach
and should always be considered in studies that cannot be blinded,
but it does not solve the other problems associated with knowing the
treatment assignment (see section 1.2.2).
1.3.3 Dose-Response Concurrent Control
In a randomized, fixed-dose, dose-response study, subjects are
randomized to one of several fixed-dose groups. Subjects may either
be placed on their fixed dose initially or be raised to that dose
gradually, but the intended comparison is between the groups on
their final dose. Dose-response studies are usually double-blind.
They may include a placebo (zero dose) and/or active control. In a
concentration-controlled trial, treatment groups are titrated to
several fixed-concentration windows; this type of trial is
conceptually similar to a fixed-dose, dose-response trial.
1.3.4 Active (Positive) Concurrent Control
In an active-control (or positive control) study, subjects are
randomly assigned to the test treatment or to an active-control
drug. Such trials are usually double-blind, but this is not always
possible; many oncology studies, for example, are considered
impossible to blind because of different regimens, different routes
of administration (see section 1.3.2) and different toxicities.
Active-control trials can have two distinct objectives with respect
to showing efficacy: (1) To show efficacy of the test drug by
showing it is as good as (equivalent, not inferior to) a known
effective agent or (2) to show efficacy by showing superiority of
the test drug to the active control. They may also be used with the
primary objective of comparing the efficacy/safety of the two drugs
(see section 1.4). When this design is used to show
equivalence/noninferiority or to compare the drugs, it raises the critical
question of whether the trial was capable of distinguishing active
from inactive treatments (see section 1.5).
1.3.5 External Control (Including Historical Control)
An externally controlled study compares a group of subjects
receiving the test treatment with a group of patients external to
the study, rather than to an internal control group consisting of
patients from the same population assigned to a different treatment.
External controls can be a group of patients treated at an earlier
time (historical control) or during the same time period but in
another setting. The external control may be defined (a specific
group of patients) or nondefined (a comparator group based on
general medical knowledge of outcome). Use of this latter comparator
is particularly treacherous (such trials are sometimes called
uncontrolled) because general impressions are so often inaccurate.
Baseline-controlled studies, in which subjects' status on therapy is
compared with status before therapy (e.g., blood pressure, tumor
size), are a variation of this type of control. In this case, the
changes from baseline are often compared to a general impression of
what would have happened without intervention, rather than to a
specific historical experience, although a more defined experience
can also be used.
1.3.6 Multiple-Control Groups
As will be described further below (see section 1.5.1), it is
often possible and advantageous to use more than one kind of control
in a single study, e.g., use of both active drug and placebo.
Similarly, trials can use several doses of test drug and several
doses of active control, with or without placebo. This design may be
useful for active drug comparisons where the relative potency of the
two drugs is not well established, or where the purpose of the trial
is to establish relative potency.
1.4 Purposes of Clinical Trials
Two purposes of clinical trials should be distinguished: (1)
Assessment of the efficacy and/or safety of a treatment and (2)
assessment of the relative (comparative) efficacy, safety,
benefit/risk relationship or utility of two treatments.
1.4.1 Evidence of Efficacy
In some cases, the purpose of a trial is to demonstrate that a
test drug has any clinical effect (or an effect of some specified
size). A study using any of the control types may demonstrate
efficacy of the test drug by showing that it is superior to the
control (placebo, low dose, active drug). An active-control trial
may, in addition, demonstrate efficacy in some cases by showing the
new drug to be similar in efficacy to a known effective therapy. The
known efficacy of the control is then attributed to the new drug.
Clinical studies designed to demonstrate efficacy of a new drug by
showing that it is similar in efficacy to a standard agent have been
called ``equivalence'' trials. Because in this case the finding of
interest is one-sided, these are actually noninferiority trials,
attempting to show that the new drug is not less effective than the
control by more than a defined amount. As the fundamental assumption
of such studies is that showing noninferiority is evidence of
efficacy, the decision to utilize this trial design necessitates
attention to the question of whether the active control can be
relied upon to have an effect in the setting of the trial and
whether, as a result, the trial can be relied on not to find a truly
inferior drug to be noninferior (see section 1.5).
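
The one-sided comparison described above (the new drug is not less
effective than the control by more than a defined amount) is commonly
checked by comparing a confidence bound with the margin. The sketch
below is illustrative only. It assumes that higher outcome values are
better, that a normal approximation with a two-sided 95 percent
interval (z = 1.96) is acceptable, and that the margin has already been
justified; the function name, parameter names, and numbers are
hypothetical and are not taken from the guidance.

```python
def noninferiority_check(mean_test, mean_control, se_diff, margin, z=1.96):
    """Compare the confidence interval for (test - control) with -margin.

    Assumptions for illustration: higher outcome values are better, the
    difference is approximately normal, and `margin` is a positive,
    prespecified noninferiority margin.
    """
    if margin <= 0 or se_diff <= 0:
        raise ValueError("margin and se_diff must be positive")
    diff = mean_test - mean_control
    lower = diff - z * se_diff   # lower confidence bound for the difference
    upper = diff + z * se_diff   # upper confidence bound for the difference
    return {
        "diff": diff,
        "lower": lower,
        "upper": upper,
        "noninferior": lower > -margin,  # whole interval lies above -margin
        "superior": lower > 0,           # whole interval lies above zero
    }


# Same observed difference (-1.0), two levels of precision, margin of 3.0.
precise = noninferiority_check(9.0, 10.0, se_diff=0.8, margin=3.0)
imprecise = noninferiority_check(9.0, 10.0, se_diff=1.2, margin=3.0)
print(precise["noninferior"], imprecise["noninferior"])  # True False
```

A result of ``noninferior'' from such a calculation says nothing about
whether the active control had its expected effect in the trial; that
question is the subject of section 1.5.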
1.4.2 Comparative Efficacy and Safety
In some cases, the focus of the trial is the comparison with
another agent, not the efficacy of the test drug per se. Depending
on the therapeutic area, these trials may be seen as providing
information needed for relative benefit-risk assessment. The active
comparator(s) should be acceptable to the region for which the data
are meant. Depending on the situation, it may not be necessary to
show equivalence or noninferiority; for example, a less effective
drug could have safety advantages and thus be considered useful.
Even though the primary focus of such a trial is the comparison
of treatments rather than demonstration of efficacy, the cautions
described for conducting and interpreting noninferiority trials need
to be taken into account (see section 1.5). The ability of the
comparative trial to detect a difference between treatments when one
exists needs to be established because a trial incapable of
distinguishing between treatments that are in fact different cannot
provide useful comparative information.
In addition, for the comparative trial to be informative
concerning relative benefit and risk, the trial needs to be fair,
i.e., each drug should have an opportunity to perform well. In
practice, an active-control equivalence/noninferiority trial offered
as evidence of efficacy also almost always should provide a fair
comparison with the control, because any
doubt as to whether the control in the study had its usual effect
would undermine assurance that the trial had assay sensitivity (see
section 1.5). Note that fairness is not an issue when the purpose of
the trial is to show efficacy by demonstrating superiority to the
control (i.e., the trial will show such efficacy even if the
comparator is poorly used; such a trial will not, however, show an
advantage over the control).
Among aspects of study design that could unfairly favor one
treatment group are choice of dose or patient population and
selection and timing of endpoints.
1.4.2.1 Dose. In comparing the test drug with an active
control for the purpose of assessing relative benefit/risk, it is
important to choose an appropriate dose and dose regimen of the
control. In examining the results of a comparison of two drugs, it
is important to consider whether an apparently less effective
control drug has been used at too low a dose or whether the
apparently less well tolerated control drug has been used at too
high a dose. In some cases, to show superior efficacy or safety
convincingly it will be necessary to study several doses of the
control and perhaps of the test agent, unless the dose of test agent
chosen is superior to any dose (or the only recommended dose) of the
control and at least as well tolerated.
1.4.2.2 Patient population. Selection of subjects for an
active-control trial can affect outcome; the population studied
should be carefully considered in evaluating what the trial has
shown. For example, if subjects are drawn from a population of
nonresponders to the standard agents, there would be a bias in favor
of the new agent. The results of such a study could not be
generalized to the entire population of previously untreated
patients. The result is, however, still good evidence of the
efficacy of the new drug. Moreover, a formal study of a new drug in
nonresponders to other therapy, in which treatment failures are
randomized to either the new or failed therapy (so long as this does
not place the patients at risk), can provide an excellent
demonstration of the value of the new agent in such nonresponders, a
clinically valuable observation (see appendix).
Similarly, it is sometimes possible to identify patient subsets
more or less likely to have a favorable response or to have an
adverse response to a particular drug. For example, black patients
often respond poorly to the blood pressure-lowering effects of beta
blockers and
angiotensin-converting enzyme inhibitors, so that a comparison of a
new antihypertensive with these drugs in these patients would tend
to show superiority of the new drug. It would not be appropriate to
conclude that the new drug is generally superior. Again, however, a
planned study in a subgroup, with recognition of its limitations and
of what conclusion can properly be drawn, could be informative. See
the appendix for a general discussion of ``enrichment'' study
designs, studies that choose a subset of the overall population to
increase sensitivity of the study or to answer a specific, but
narrow, question.
1.4.2.3 Selection and timing of endpoints. When two treatments
are used for the same disease or condition, they may differentially
affect various outcomes of interest in that disease, particularly if
they represent different classes or modalities of therapy.
Therefore, when comparing them in a clinical trial, the choice and
timing of endpoints may favor one therapy or the other. For example,
thrombolytics in patients with acute myocardial infarction can
reduce mortality but increase stroke risk. If a new, more active
thrombolytic were compared with an older thrombolytic, the more
active drug might look better if the endpoint were mortality, but
worse if the endpoint were a composite of mortality and disabling
stroke. Similarly, in comparing two analgesics in the management of
dental pain, assigning a particularly heavy weight to pain at early
time points would favor the agent with more rapid onset over an
agent that provides greater or longer lasting relief.
1.5 Sensitivity-to-Drug-Effects and Assay Sensitivity of Studies
Intended to Show Noninferiority/Equivalence
As noted in section 1.4.1, use of an active-control
noninferiority/equivalence design to demonstrate efficacy poses a
particular problem, one not found in trials intended to show a
difference between treatments. A demonstration of efficacy by
showing noninferiority/equivalence of the new therapy to the
established effective treatment or, more accurately, by showing that
the difference between them is no larger than a specified size
(margin), rests on a critical assumption: that if there is a true
difference between the treatments, i.e., if the new drug has a much
smaller effect or no effect, the study would not have concluded
there was no such difference. This assumption, in turn, rests on the
assumption that the active-control drug will have had an effect of a
defined size in the study. If these assumptions are incorrect, an
erroneous conclusion that a drug is effective may be reached because
a trial seeming to support noninferiority will not in fact have done
so.
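
The two assumptions in the preceding paragraph can be written out
symbolically. The notation is introduced here for illustration and does
not appear in the guidance: mu_T, mu_C, and mu_P denote the true mean
responses (higher is better) to the test drug, the active control, and
placebo under the conditions of the trial, and Delta > 0 denotes the
noninferiority margin.

```latex
% Hypotheses tested in the noninferiority trial
H_0:\ \mu_T - \mu_C \le -\Delta
\qquad\text{versus}\qquad
H_1:\ \mu_T - \mu_C > -\Delta

% Assumption that cannot be verified without a placebo arm:
% the control had an effect of at least the margin in this trial
\mu_C - \mu_P \ge \Delta

% Only when both hold does efficacy of the test drug follow
\mu_T - \mu_P
  = (\mu_T - \mu_C) + (\mu_C - \mu_P)
  > -\Delta + \Delta
  = 0
```

If the control effect in the trial is in fact smaller than Delta, the
last line no longer follows, and rejecting H_0 does not establish that
the test drug is better than placebo.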
The ability of a specific trial to detect differences between
treatments if they exist has been called, and is here termed,
``assay sensitivity.'' In the noninferiority trial setting, assay
sensitivity requires that there be an effect of the control drug in
the trial of at least a specified size and that, because of the
presence of that effect, the trial has an ability not to declare
noninferiority of a new drug when the new drug is in fact inferior.
As noted, because the actual effect size of the control in the trial
is not measured, the presence of assay sensitivity must be deduced.
In this document, the term assay sensitivity, a property of a
particular trial, is distinguished from sensitivity-to-drug-effects.
Sensitivity-to-drug-effects is defined as the ability of
appropriately designed and conducted trials in a specific
therapeutic area, using a specific active drug (or other drugs with
similar effects), to reliably show a drug effect of at least a
minimum size under the conditions of the trial.
Sensitivity-to-drug-effects is determined from historical experience;
it will usually be
established by a determination that such trials, when adequately
powered, regularly distinguish active drugs from placebo.
Sensitivity-to-drug-effects, established in this way, will imply
that, in a similarly well-designed and conducted noninferiority
trial, there will be an ability not to find an ineffective agent to
be noninferior. Assay sensitivity, in contrast, applies to a
specific trial and requires the actual presence of a control drug
effect and thus the actual ability of the trial not to declare an
inferior drug noninferior. This ability depends on the details of
the design and conduct of a specific trial, as well as the presence
of sensitivity-to-drug-effects.
1.5.1 Need to Ensure Assay Sensitivity in Noninferiority
(Equivalence) Trials; Difference-Showing Versus Noninferiority
Studies
When designing a noninferiority study, study designers need to
consider the fundamental distinction between two kinds of clinical
trials: (1) Those that seek to demonstrate efficacy by showing
superiority of a treatment to a control (superiority trials) and (2)
those that seek to show efficacy by demonstrating that a new
treatment is as good as (not inferior by some specified amount to) a
treatment known to be effective. In the difference-showing trial,
the finding of a difference itself documents the assay sensitivity
of the trial and documents the efficacy of the superior treatment,
so long as the inferior treatment, if an active drug, is known to be
no worse than a placebo. In the noninferiority situation, in
contrast, a finding of noninferiority leaves unanswered the
question: Would the study have led to a conclusion of noninferiority
even if the study drug were inferior? In a noninferiority trial
without a placebo group, there is no internal standard (that is, a
showing of an active drug-placebo difference) to measure/ensure
assay sensitivity. The existence of assay sensitivity of the trial
therefore needs to be deduced or assumed based on past experience
(``historically'') with the control drug, generally from placebo-
controlled trials, establishing the sensitivity-to-drug-effects of
well-designed and conducted trials, together with evidence that the
trial was in fact well conducted.
The question of assay sensitivity, although particularly
critical in noninferiority studies, actually arises in any trial
that fails to detect a difference between treatments, including a
placebo-controlled trial. If a drug fails to show superiority to
placebo, for example, it means either that the drug was ineffective
or that the study was not capable of detecting the effect of the
drug. A straightforward solution to the problem of assay sensitivity
is the three-arm study, including both placebo and a known active
treatment, a study design with several advantages. Such a study
measures effect size (test drug versus placebo) and allows
comparison of test drug and active control in a setting where assay
sensitivity is established by the active control-placebo comparison.
The design is also particularly informative when the test drug and
placebo give similar results in the study. In that case, if the
active control is superior to placebo, the study did have assay
sensitivity and the study provides some evidence that the test drug
has little or no efficacy. On the other hand, if neither drug,
including the known effective active control, can be distinguished
from placebo with
[[Page 51771]]
respect to efficacy, the clinical study lacks assay sensitivity and
does not provide evidence that the drug is ineffective.
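The reading of a three-arm study described above can be summarized
as a small decision rule. The following is only an illustrative
sketch: the function name and the two boolean inputs (whether each
drug was statistically distinguished from placebo) are assumptions
made for the example and are not part of this guidance.

```python
def interpret_three_arm(test_beats_placebo: bool,
                        control_beats_placebo: bool) -> str:
    """Map the two placebo comparisons of a three-arm trial to the
    interpretation given in the text (hypothetical helper)."""
    if test_beats_placebo:
        # A demonstrated difference itself documents assay sensitivity.
        return "test drug distinguished from placebo: evidence of efficacy"
    if control_beats_placebo:
        # The active control-placebo difference shows the trial could
        # detect an effect, so the null result for the test drug is informative.
        return ("assay sensitivity shown: some evidence the test drug "
                "has little or no efficacy")
    # Neither drug separated from placebo: the trial could not detect
    # even a known effect, so nothing is learned about the test drug.
    return "no assay sensitivity: no evidence that the test drug is ineffective"


print(interpret_three_arm(False, True))
print(interpret_three_arm(False, False))
```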
1.5.2 Choosing the Noninferiority Margin
As noted earlier, most active-control ``equivalence'' trials
are really noninferiority trials intended to establish the efficacy
of a new drug. Analysis of the results of noninferiority trials is
discussed in the ICH guidances E9 and E3. Briefly, in such a trial,
new and established therapies are compared. Prior to the trial, an
equivalence or noninferiority margin, sometimes called a ``delta,''
is selected. This margin is the degree of inferiority of the test
drug compared to the control that the trial will attempt to exclude
statistically. If the confidence interval for the difference between
the test and control treatments excludes a degree of inferiority of
the test drug as large as, or larger than, the margin, the test drug
can be declared noninferior and thus effective; if the confidence
interval includes a difference as large as the margin, the test drug
cannot be declared noninferior and cannot be considered effective.
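The confidence interval comparison described above might be
sketched as follows. This is an illustration only, under assumptions
that are not drawn from this guidance: a continuous outcome for which
higher values are better, a known common standard deviation, a normal
approximation, and a two-sided 95 percent confidence interval. The
function name and all numbers are hypothetical; ICH E9 should be
consulted for the actual analysis.

```python
from math import sqrt
from statistics import NormalDist


def noninferiority_check(mean_test, mean_control, sd, n_test, n_control,
                         margin, alpha=0.05):
    """Return (lower, upper, noninferior) for the test-minus-control
    difference. Assumes higher outcomes are better and a known common SD."""
    diff = mean_test - mean_control
    se = sd * sqrt(1.0 / n_test + 1.0 / n_control)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower, upper = diff - z * se, diff + z * se
    # Noninferior only if the interval excludes inferiority as large as
    # the margin, i.e., the lower bound lies above -margin.
    return lower, upper, lower > -margin


# Hypothetical data: same observed difference, two different trial sizes.
print(noninferiority_check(9.5, 10.0, 4.0, 200, 200, margin=2.0))
print(noninferiority_check(9.5, 10.0, 4.0, 50, 50, margin=2.0))
```

In the hypothetical example, the same observed difference supports
noninferiority in the larger trial but not in the smaller one, because
the wider interval of the smaller trial includes a difference as large
as the margin.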
The margin chosen for a noninferiority trial cannot be greater
than the smallest effect size that the active drug would be reliably
expected to have compared with placebo in the setting of the planned
trial, but may be smaller based on clinical judgment. If a
difference between active control and new drug favors the control by
as much as or more than that amount, the new drug might have no
effect at all. The margin generally is identified based on past
experience in placebo-controlled trials of adequate design under
conditions similar to those planned for the new trial. Note that
exactly how to calculate the margin is not described in this
document, and there is little published experience on how to do
this. The determination of the margin is based on both statistical
reasoning and clinical judgment, should reflect uncertainties in the
evidence on which the choice is based, and should be suitably
conservative. If this is done properly, a finding that the
confidence interval for the difference between new drug and the
active control excludes a suitably chosen margin could provide
assurance that the drug has an effect greater than zero. In
practice, the margin chosen usually will be smaller than that
suggested by the smallest expected effect size of the active control
because of interest in ensuring that some particular clinically
acceptable effect size (or fraction of the control drug effect) was
maintained. This would also be true in a trial whose primary focus
is the therapeutic equivalence of a test drug and active control
(see section 1.4.2), where it would be usual to seek assurance that
the test and control drug were quite similar, not simply that the
new drug had any effect at all.
The fact that the choice of the margin to be excluded can only
be based on past experience gives the noninferiority trial an
element in common with a historically controlled (externally
controlled) study. This study design is appropriate and reliable
only when the historical estimate of an expected drug effect can be
well supported by reference to the results of previous studies of
the control drug. These studies should lead to the conclusion that
the active control can consistently be distinguished from placebo in
trials of design similar to the proposed trial (patient population,
study size, study endpoints, dose, concomitant therapy, etc.) and
should identify an effect size that represents the smallest effect
that the control can reliably be expected to have. If placebo-
controlled trials of a design similar to the one proposed more than
occasionally show no difference between the proposed active control
and placebo, and this cannot be explained by some characteristic of
the study, only superiority of the test drug would be interpretable.
Note that it is the estimated difference from placebo, not the total
change from baseline, that needs to be used to calculate the
expected effect of the control.
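The distinction between the difference from placebo and the change
from baseline can be illustrated with a short calculation. The trial
results below are invented for the example, and taking the smallest
placebo-adjusted effect across past trials is only one simple way of
reading "the smallest effect that the control can reliably be expected
to have"; this document does not prescribe a method of calculation.

```python
# Hypothetical mean improvement from baseline (higher = better) in three
# past placebo-controlled trials of the proposed active control.
historical_trials = [
    {"control_change": 12.0, "placebo_change": 7.0},
    {"control_change": 10.0, "placebo_change": 6.0},
    {"control_change": 14.0, "placebo_change": 8.0},
]


def placebo_adjusted_effects(trials):
    """Effect of the control in each trial, measured against placebo."""
    return [t["control_change"] - t["placebo_change"] for t in trials]


effects = placebo_adjusted_effects(historical_trials)
smallest_effect = min(effects)
smallest_raw_change = min(t["control_change"] for t in historical_trials)

print("placebo-adjusted effects:", effects)
print("smallest placebo-adjusted effect:", smallest_effect)
print("smallest change from baseline:", smallest_raw_change)
```

In this invented example the smallest placebo-adjusted effect is 4
units, whereas the smallest change from baseline is 10 units; a margin
based on the latter would greatly overstate the effect attributable to
the control drug.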
1.5.3 Sensitivity-to-Drug-Effects Is Difficult to Support in Many
Situations
Whether the historically based assurance of sensitivity-to-
drug-effects of a trial is supported in any given case is to some
degree a matter of judgment. There are many conditions, however, in
which drugs considered effective cannot regularly be shown superior
to placebo in well-controlled studies, and one therefore cannot
reliably determine a minimum effect the drug will have in the
setting of a specific trial. Such conditions tend to include those
in which there is substantial improvement and variability in placebo
groups, and/or in which the effects of therapy are small, or
variable, such as depression, anxiety, dementia, angina, symptomatic
congestive heart failure, seasonal allergies, and symptomatic
gastroesophageal reflux disease.
In all these cases, there is no doubt that the standard
treatments are effective because there are many well-controlled
studies of each of these drugs that have shown an effect. Based on
available experience, however, it would be difficult to describe
study conditions in which the drug would reliably have at least a
minimum effect (i.e., conditions in which there is sensitivity-to-
drug-effects) and that, therefore, could be used to identify an
appropriate margin. In some cases, the experience on which the
expectation of sensitivity-to-drug-effects is based may be of
questionable relevance, e.g., if standards of treatment and
diagnosis have changed substantially over time. If someone proposing
to use an active-control noninferiority design cannot provide
acceptable support for the sensitivity-to-drug-effects of the study
with the chosen noninferiority margin, a finding of noninferiority
cannot be considered informative with respect to efficacy or to a
showing of clinical comparability/equivalence.
1.5.4 Assay Sensitivity and Study Quality in Noninferiority
Designs
Even where historical experience indicates that studies in a
particular therapeutic area are likely to have sensitivity-to-drug-
effects, this likelihood can be undermined by the particular
circumstances under which the study was conducted. Great attention
therefore needs to be paid to how the trial was designed and
conducted to determine whether it actually did have assay
sensitivity. There are many factors that can reduce a trial's assay
sensitivity, such as:
1. Poor compliance with therapy
2. Poor responsiveness of the study population to drug effects
3. Use of concomitant medication or other treatment that
interferes with the test drug or that reduces the extent of the
potential response
4. A population that tends to improve spontaneously, leaving no
room for further drug-induced improvement
5. Poor diagnostic criteria (patients lacking the disease to be
studied)
6. Inappropriate (insensitive) measures of drug effect
7. Excessive variability of measurements
8. Biased assessment of endpoint because of knowledge that all
patients are receiving a potentially active drug, e.g., a tendency
to read blood pressure responses as greater than they actually are,
reducing the difference between test drug and control
Clinical researchers and trial sponsors intend to perform high
quality studies, and the publication of the Good Clinical Practices
guidance will enhance study quality. Nonetheless, it should be
appreciated that in trials intended to show a difference between
treatments there is a strong imperative to utilize a good study
design and minimize study errors, because trial imperfections
increase the likelihood of failing to show a difference between
treatments when one exists. In placebo-controlled trials, for
example, there is often a withdrawal period to be sure study
subjects actually have the disease for which treatment is intended,
and great care is taken in defining entry criteria to be sure
patients have an appropriate stage of the disease. It is common to
have a single-blind placebo run-in period to discover and eliminate
subjects who recover spontaneously, whose measurements are too
variable, or who are likely to comply poorly with the protocol.
There is close attention to trial conduct, including administration
of the correct treatments to patients, encouraging compliance with
medication use, controlling (or at least recording) concomitant drug
use and other concomitant illness, and use of standard procedures
for measurement (technique, timing, training periods). All of these
efforts will help ensure that an effective drug will be
distinguished from placebo. Nonetheless, in many clinical settings,
despite the strong stimulus and extensive efforts to ensure study
excellence and assay sensitivity, clinical studies are often unable
to reliably distinguish effective drugs from placebo.
In contrast, in trials intended to show that there is not a
difference of a particular size (noninferiority) between two
treatments, there is a much weaker stimulus to engage in many of
these efforts, which help ensure that differences will be detected,
i.e., ensure sensitivity, because failure to show a difference
greater than the margin is the desired outcome of the study.
Although some kinds of study error diminish observed differences
between treatments, other kinds of study error can
increase variance, which would decrease the likelihood of showing
noninferiority by widening the confidence interval so that a
[[Page 51772]]
test drug-control difference greater than the margin cannot be
excluded. There would therefore be a strong stimulus in these trials
to reduce variance, which might be caused, for example, by poor
measurement technique. Many errors of the kind described, however,
reduce the observed difference between treatments (and thus assay
sensitivity) without necessarily increasing variance. They therefore
increase the likelihood that an inferior drug will be found
noninferior.
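The effect of such errors can be illustrated numerically. The sketch
below rests on assumptions made only for the example: a normal
approximation, a fixed standard error that the errors do not change,
and a simple dilution model in which a given fraction of subjects in
both arms cannot respond to either drug (e.g., because of poor
compliance or absence of the disease), so that the true difference
shrinks proportionally. The function names and numbers are
hypothetical.

```python
from statistics import NormalDist


def prob_declared_noninferior(true_diff, margin, se, alpha=0.05):
    """Probability that the lower confidence bound for test minus control
    lies above -margin, given the true difference (negative = inferior)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf((true_diff + margin) / se - z)


def diluted(true_diff, fraction_unresponsive):
    """True difference remaining when a fraction of subjects in both
    arms shows no drug effect at all (assumed dilution model)."""
    return (1.0 - fraction_unresponsive) * true_diff


margin, se = 2.0, 0.5
true_diff = -2.0  # test drug truly inferior by the full margin

for fraction in (0.0, 0.25, 0.5):
    p = prob_declared_noninferior(diluted(true_diff, fraction), margin, se)
    print(f"unresponsive fraction {fraction:.2f}: "
          f"P(declared noninferior) = {p:.3f}")
```

With no dilution, the truly inferior drug is declared noninferior with
probability of about 0.025, the nominal one-sided error rate. Under
the assumed model, that probability rises steadily as the unresponsive
fraction grows, even though the variance is unchanged.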
When a noninferiority study is offered as evidence of
effectiveness of a new drug, both the sponsor and regulatory
authority need to pay particularly close attention to study quality.
Whether a given study has assay sensitivity often cannot be
determined, but the known reasons for failure to have such
sensitivity should be monitored. The design and conduct of the study
need to be shown to be similar to studies of the active control that
were successful in the past. To ensure that sensitivity-to-drug-
effects seen in past studies is likely to be present in the new
study, there should be close attention to critical design
characteristics such as the entry criteria and characteristics of
the study population (severity of medical condition, method of
diagnosis), the specific endpoint measured and timing of
assessments, and the use of washout periods to exclude patients
without disease or to exclude patients with spontaneous improvement.
Similarly, aspects of study conduct that could decrease assay
sensitivity should also be examined, including such characteristics
as compliance with therapy, monitoring of concomitant therapy,
enforcement of entry criteria, and prevention of study dropouts.
One other possibility should be considered. Even where a study
seems likely to have sensitivity-to-drug-effects based on prior
studies, the population studied or other aspects of study design or
conduct in a noninferiority study may be so different that results
with the active-control treatment are visibly atypical (e.g., cure
rate in an antibiotic trial that is unusually high or low). In that
case, the results of a noninferiority trial may not be persuasive.
2.0 Detailed Consideration of Types of Control
2.1 Placebo Control
2.1.1 Description (See Section 1.3.1)
In a placebo-controlled study, subjects are assigned, almost
always by randomization, either to a test drug or to a placebo. A
placebo is a ``dummy'' medication that appears as identical as
possible to the investigational or test drug with respect to
physical characteristics such as color, weight, taste and smell, but
that does not contain the test drug. Some trials may study more than
one dose of the test drug or include both an active control and
placebo. In these cases, it may be easier for the investigator to
use more than one placebo (``double-dummy'') than to try to make all
treatments look the same. The use of placebo facilitates, and is
almost always accompanied by, double-blinding (or double-masking).
The difference in measured outcome between the active drug and
placebo groups is the measure of drug effect under the conditions of
the study. Within this general description there is a wide variety
of designs that can be used successfully: Parallel or cross-over
designs (see ICH E9), single fixed dose or titration in the active
drug group, several fixed doses. Several designs meriting special
attention will be described below. Note that not every study that
includes a placebo is a placebo-controlled study. For example, an
active-control study could use a placebo for each drug (double-
dummy) to facilitate blinding; this is still an active-control
trial, not a placebo-controlled trial. A placebo-controlled trial is
one in which treatment with a placebo is compared with treatment
with an active drug.
2.1.2 Ability to Minimize Bias
The placebo-controlled trial, using randomization and blinding,
generally reduces subject and investigator bias maximally, but such
trials are not impervious to blind-breaking through recognition of
pharmacologic effects of one treatment (perhaps a greater concern in
cross-over designs); blinded outcome assessment can enhance bias
reduction in such cases.
2.1.3 Ethical Issues
When a new agent is tested for a condition for which no
effective treatment is known, there is usually no ethical problem
with a study comparing the new agent to placebo. Use of a placebo
control may raise problems of ethics, acceptability, and
feasibility, however, when an effective treatment is available for
the condition under study in a proposed trial. In cases where an
available treatment is known to prevent serious harm, such as death
or irreversible morbidity in the study population, it is generally
inappropriate to use a placebo control. There are occasional
exceptions, however, such as cases in which standard therapy has
toxicity so severe that many patients will refuse therapy.
In other situations, when there is no major health risk
associated with withholding or delay of effective therapy, it is
considered ethical to ask patients to participate in a placebo-
controlled trial, even if they may experience discomfort as a
result, provided the setting is noncoercive and they are fully
informed about available therapies and the consequences of delaying
treatment. Such trials, however, may pose important practical
problems. For example, deferred treatment of pain or other symptoms
may be unacceptable to patients or physicians and they may not want
to participate in such a study. Whether a particular placebo-
controlled trial of a new agent will be acceptable to subjects and
investigators when there is known effective therapy is a matter of
investigator, patient, and institutional review board (IRB)/
independent ethics committee (IEC) judgment, and acceptability may
differ among ICH regions. Acceptability could depend on the specific
design of the study and the patient population chosen, as will be
discussed below (see section 2.1.5).
Whether a particular placebo-controlled trial is ethical may,
in some cases, depend on what is believed to have been clinically
demonstrated and on the particular circumstances of the trial. For
example, a short term placebo-controlled study of a new
antihypertensive agent in patients with mild essential hypertension
and no end-organ disease might be considered generally acceptable,
while a longer study, or one that included sicker patients, probably
would not be.
It should be noted that use of a placebo or no-treatment
control does not imply that the patient does not get any treatment
at all. For instance, in an oncology trial, when no active drug is
approved, patients in both the placebo/no-treatment group and the
test drug group will receive needed palliative treatment, such as
analgesics.
2.1.4 Usefulness of Placebo-Controlled Trials and Quality/Validity
of Inference in Particular Situations
When used to show effectiveness of a treatment, the placebo-
controlled trial is as free of assumptions and need for external
(extra-study) information as it is possible to be. Most trial design
problems and careless errors result in failure to demonstrate a
treatment difference (and thereby establish efficacy), so that the
trial contains built-in incentives for study excellence. Even when
the primary purpose of a trial is comparison of two active agents or
assessment of dose-response, the addition of a placebo provides an
internal standard that enhances the inferences that can be drawn
from the other comparisons.
Placebo-controlled trials also provide the maximum ability to
distinguish adverse effects due to drug from those due to underlying
disease or intercurrent illness. Note that where they are used to
show similarity, for example, to show the absence of an adverse
effect, placebo-controlled trials have the same assay sensitivity
problem as any equivalence or noninferiority trial (see section
1.5.1). To interpret the result, one must know that if the study
drug caused an adverse event, it would have been observed.
2.1.5 Modifications of Design and Combinations With Other Controls
That Can Resolve Ethical, Practical, or Inferential Issues
It is often possible to address the ethical or practical
limitations of placebo-controlled trials by using modified study
designs that still retain the inferential advantages of these
trials. In addition, placebo-controlled trials can be made more
informative by inclusion of additional treatment groups, such as
multiple doses of the test agent or a known active-control
treatment.
2.1.5.1 Additional control groups.
2.1.5.1.1 Three-arm study; placebo and active control. As
noted in section 1.5.1, three-arm studies including an active-
control as well as a placebo-control group can readily assess
whether a failure to distinguish test drug from placebo implies
ineffectiveness of the test drug or simply a study that lacked the
ability to identify an active drug. The placebo-standard drug
comparison in such a trial provides internal evidence of assay
sensitivity. It is possible to make the active groups larger than
the placebo group in order to improve the precision of the active
drug comparison, if this is considered important. This may also make
the study more
[[Page 51773]]
appealing to patients, as there is less chance of being randomized
to placebo.
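The trade-off involved in enlarging the active groups can be seen in
a simple calculation of the standard error of a difference between
two group means. The sketch assumes a continuous outcome with a common
standard deviation and a fixed total number of subjects; the
allocation ratios, total sample size, and function names are
hypothetical.

```python
from math import sqrt


def arm_sizes(total, ratio):
    """Split `total` subjects across (test, active, placebo) arms in
    proportion to `ratio`. Uses floor division, so choose a total that
    the ratio divides evenly."""
    parts = sum(ratio)
    return [total * r // parts for r in ratio]


def se_difference(sd, n_a, n_b):
    """Standard error of the difference between two group means."""
    return sd * sqrt(1.0 / n_a + 1.0 / n_b)


sd, total = 1.0, 300
for ratio in ((1, 1, 1), (2, 2, 1)):
    n_test, n_active, n_placebo = arm_sizes(total, ratio)
    print(ratio,
          "test vs active SE:", round(se_difference(sd, n_test, n_active), 4),
          "test vs placebo SE:", round(se_difference(sd, n_test, n_placebo), 4),
          "chance of placebo:", round(n_placebo / total, 3))
```

For the same total, the 2:2:1 allocation narrows the test-versus-active
comparison and lowers each subject's chance of receiving placebo, at
the cost of a somewhat less precise comparison with placebo.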
2.1.5.1.2 Additional doses. Randomization among several fixed
doses of the test drug in addition to placebo allows assessment of
dose-response and may be particularly useful in a comparative trial
to ensure a fair comparison of treatments (see ICH E4: Dose-Response
Information to Support Drug Registration).
2.1.5.1.3 Factorial/combination studies. Factorial/
combination (response-surface) designs may be used to explore
several doses of the investigational drug as monotherapy and in
combination with several doses of another agent proposed for use in
combination with it. A single study of this type can define the
properties of a wide array of combinations. Such studies are common
in the evaluation of new antihypertensive therapies, but can be
considered in a variety of settings where more than one treatment is
used simultaneously. For example, the independent additive effects
of aspirin and streptokinase in preventing mortality after a heart
attack were shown in such a trial.
2.1.5.2 Changes in study design.
2.1.5.2.1 Add-on study, placebo-controlled; replacement study.
An ``add-on'' study is a placebo-controlled trial of a new agent
conducted in people also receiving standard therapy. Such studies
are useful when standard therapy is known to decrease mortality or
irreversible morbidity, so that the therapy cannot be withheld from
a patient population known to benefit from it, and when a
noninferiority trial with standard treatment as the active control
cannot be carried out or would be difficult to interpret (see
section 1.5). It is common to study anticancer, antiepileptic, and
anti-heart-failure drugs this way. This design is useful only when
standard therapy is not fully effective (which, however, is almost
always the case), and it has the advantage of providing evidence of
improved clinical outcomes (rather than ``mere'' noninferiority).
Efficacy is, of course, established by such studies only for
combination therapy, and the dose in a monotherapy situation might
be different from the dose found to be effective in combination. In
general, this approach is likely to succeed only when the new and
standard therapies utilize different pharmacologic mechanisms,
although there are exceptions. For example, AIDS combination
therapies may show a beneficial effect of pharmacologically-related
drugs because of delays in development of resistance.
A variation of this design, one that can sometimes give
information on monotherapy and that is particularly applicable in the
setting of chronic disease, is the replacement study, in which the new drug or
placebo is added by random assignment to conventional treatment
given at an effective dose and the conventional treatment is then
withdrawn, usually by tapering. The ability to maintain the
subjects' baseline status is then observed in the drug and placebo
groups using predefined success criteria. This approach has been
used to study steroid-sparing substitutions in steroid-dependent
patients without need for initial steroid withdrawal and
recrudescence of symptoms in a wash-out period, and has also been
used to study antiepileptic drug monotherapy.
2.1.5.2.2 ``Early escape''; rescue medication. It is possible
to design a study to plan for ``early escape'' from ineffective
therapy. Early escape refers to prompt removal of subjects whose
clinical status worsens or fails to improve to a defined level
(blood pressure not controlled by a prespecified time, seizure rate
greater than some prescribed value, blood pressure rising to a
certain level, angina frequency above a defined level, liver enzymes
failing to normalize by a preset time in patients with hepatitis),
who have a single event that treatment was intended to prevent
(first recurrence of unstable angina, grand mal seizure, paroxysmal
supraventricular arrhythmia), or who otherwise require added
therapy. In such cases, the need to change therapy becomes a study
endpoint. The criteria for deciding whether these endpoints have
occurred should be well specified, and the timing of measurements
should ensure that patients will not remain untreated with an active
drug while their disease is poorly controlled. The primary
difficulty with this trial design is that it may give information
only on short-term effectiveness. The randomized withdrawal trial
(see section 2.1.5.2.4), however, which can also incorporate early-
escape features, can give information on long-term effectiveness. It
should be noted that formal use of rescue medication in response to
clinical deterioration could be utilized similarly.
2.1.5.2.3 Limited placebo period. In a longer term active-
control trial, the addition of a placebo group treated for a short
period may establish assay sensitivity (at least for short-term
effects). The trial would then continue without the placebo group.
2.1.5.2.4 Randomized withdrawal. In a randomized withdrawal
study, subjects receiving an investigational therapy for a specified
time are randomly assigned to continued treatment with the
investigational therapy or to placebo (i.e., withdrawal of active
therapy). Subjects for such a trial could be derived from an
organized open single-arm study, from an existing clinical cohort
(but usually with a formal ``wash-in'' phase to establish the
initial on-therapy baseline), from the active arm of a controlled
trial, or from one or both arms of an active-control trial. Any
difference that emerges between groups receiving continued treatment
and placebo would demonstrate the effect of the active treatment.
The prerandomization observation period on drug can be of any
length; this approach can therefore be used to study long-term
persistence of effectiveness when long-term placebo treatment would
not be acceptable. The postwithdrawal observation period could be of
fixed duration or could use early escape or time to event (e.g.,
relapse of depression) approaches. As with the early-escape design,
procedures for monitoring patients and assessing study endpoints
need careful attention to ensure that patients failing on an
assigned treatment are identified rapidly.
The randomized withdrawal approach is suitable in several
situations. First, it may be suitable for drugs that appear to
resolve an episode of recurring illness (e.g., antidepressants), in
which case the withdrawal study is in effect a relapse-prevention
study. Second, it may be used for drugs that suppress a symptom or
sign (chronic pain, hypertension, angina), but where a long-term
placebo-controlled trial would be difficult; in this case, the study
can establish long-term efficacy. Third, the design can be used to
determine how long a therapy should be continued (e.g.,
postinfarction treatments with a beta-blocker).
The general advantage of randomized withdrawal designs, when
used with an early-escape endpoint, such as return of symptoms, is
that the period of placebo exposure with poor response that a
patient would have to undergo is short.
Dosing issues can be addressed by this type of design. After
all patients had received an initial fixed dose, they could be
randomly assigned in the ``withdrawal'' phase to several different
doses (as well as placebo), a particularly useful approach when
there is reason to think the initial and maintenance doses might be
different, either on pharmacodynamic grounds or because there is
substantial accumulation of active drug resulting from a long half
life of parent drug or active metabolite. Note that the randomized
withdrawal design could be used to assess dose-response after an
initial placebo-controlled titration study. The titration study is
an efficient design for establishing effectiveness, but does not
give good dose-response information. The randomized withdrawal
phase, with responders randomly assigned to several fixed doses and
placebo, will study dose-response rigorously while allowing the
efficiency of the titration design.
In utilizing randomized withdrawal designs, it is important to
appreciate the possibility of withdrawal phenomena, suggesting the
wisdom of relatively slow tapering. A patient may develop tolerance
to a drug such that no benefit is being accrued, but the drug's
withdrawal may lead to disease exacerbation, resulting in an
erroneous conclusion of persisting efficacy. It is also important to
realize that treatment effects observed in these studies may be
larger than those seen in the general population because randomized
withdrawal studies are ``enriched'' with responders (see appendix).
This phenomenon results when the study explicitly includes only
subjects who appear to have responded to the drug or includes only
people who have completed a previous phase of study (which is often
an indicator of a good response).
2.1.5.2.5 Other design considerations. In any placebo-
controlled study, unbalanced randomization (e.g., 2:1, study drug to
placebo) may enhance the safety database and may also make the
study more attractive to patients and/or investigators.
2.1.6 Advantages of Placebo-Controlled Trials
2.1.6.1 Ability to demonstrate efficacy credibly. As with other
difference-showing trials, the interpretation of the placebo-
controlled study relies neither on externally based
[[Page 51774]]
assumptions of sensitivity-to-drug-effects nor on an assessment of
assay sensitivity. These may be the only credible study designs in
situations where it is not possible to conclude that noninferiority
studies would have assay sensitivity (see section 1.5).
2.1.6.2 Measures ``absolute'' effectiveness and safety. The
placebo-controlled trial measures the absolute effect of treatment
and allows a distinction between adverse events due to the drug and
those due to the underlying disease or ``background noise.'' The
absolute effect size information is valuable in a three-group trial
(test, placebo, active), even if the primary purpose of the trial is
the test versus active control comparison.
2.1.6.3 Efficiency. Placebo-controlled trials are efficient in
that they can detect treatment effects with a smaller sample size
than any other type of concurrently controlled study. Active-control
trials intended to show superiority of the new treatment are
generally seeking smaller differences than the active-placebo
difference sought in a placebo-controlled trial, resulting in need
for a larger sample size. Noninferiority active-control trials also
need larger sample sizes because they must use conservative
assumptions about the effect size of the control drug to ensure that
noninferiority of the test drug would in fact demonstrate efficacy.
Designers of dose-response studies need to guess at the shape and
position of the dose-response curve and may wastefully assign some
subjects to several doses that have no effect or are on a response
plateau.
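The relation between the size of the difference sought and the
sample size can be sketched with the usual normal-approximation
formula for comparing two means. The example is illustrative and not
part of the guidance: the function name `n_per_arm`, the effect sizes
(0.5 and 0.25 standard deviations), the two-sided 5 percent
significance level, and the 80 percent power are all assumptions.

```python
import math
from statistics import NormalDist


def n_per_arm(delta, sd, alpha=0.05, power=0.80):
    """Approximate subjects per arm for a two-sided comparison of two
    means with equal allocation (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Hypothetical differences, in standard-deviation units:
# drug versus placebo, and new drug versus an active control.
placebo_controlled = n_per_arm(delta=0.5, sd=1.0)
active_superiority = n_per_arm(delta=0.25, sd=1.0)
```

With these assumed values, the placebo-controlled comparison needs
about 63 subjects per arm and the active-control superiority
comparison about 252; halving the difference sought roughly
quadruples the sample size.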
2.1.6.4 Minimizing the effect of subject and investigator
expectations. Use of a blinded placebo control may decrease the
amount of improvement resulting from subject or investigator
expectations because both are aware that some subjects will receive
no active drug. This may increase the ability of the study to detect
true drug effects.
2.1.7 Disadvantages of Placebo-Controlled Trials
2.1.7.1 Ethical concerns (see sections 2.1.3 and 2.1.4). When
effective therapy that is known to prevent harm exists for a
particular population, that population cannot usually be ethically
studied in placebo-controlled trials; the particular conditions and
populations for which this is true may be controversial. Ethical
concerns may also direct studies toward less ill subjects or cause
studies to examine short-term endpoints when long-term outcomes are
of greater interest. Where a placebo-controlled trial is unethical
and an active-control trial would not be credible, it may be very
difficult to study new drugs at all. For example, it would not be
considered ethical to carry out a placebo-controlled trial of a beta
blocker in postinfarction patients; yet it would be difficult to
conclude that a noninferiority trial would have sensitivity-to-drug-
effects. The designs described in section 2.1.5 may be useful in
some of these cases.
2.1.7.2 Patient and physician practical concerns. Physicians
and/or patients may be reluctant to accept the possibility that the
patient will be assigned to the placebo treatment, even if there is
general agreement that withholding or delaying treatment will not
result in harm. Subjects who sense they are not improving may drop
out of trials because they attribute lack of effect to having been
treated with placebo, complicating the analysis of the study. With
care, however, drop-out for lack of effectiveness can sometimes be
used as a study endpoint. Although this may provide some information
on drug effectiveness, such information is less precise than actual
information on clinical status in subjects receiving their assigned
treatment.
2.1.7.3 Generalizability. It is sometimes argued that any
controlled trial, but especially a placebo-controlled trial,
represents an artificial environment that gives results different
from true ``real world'' effectiveness. If study populations are
unrepresentative in placebo-controlled trials because of ethical or
practical concerns, questions about the generalizability of study
results can arise. For example, patients with more serious disease
may be excluded by protocol, investigator, or patient choice from
placebo-controlled trials. In some cases, only a limited number of
patients or centers may be willing to participate in studies.
Whether these concerns actually (as opposed to theoretically) limit
generalizability has not been established.
2.1.7.4 No comparative information. Placebo-controlled trials
lacking an active control give little useful information about
comparative effectiveness, information that is of interest and
importance in many circumstances. Such information cannot reliably
be obtained from cross-study comparisons, as the conditions of the
studies may have been quite different.
2.2 No-Treatment Concurrent Control (See Section 1.3.2)
The randomized no-treatment control is similar in its general
properties and its advantages and disadvantages to the placebo-
controlled trial. Unlike the placebo-controlled trial, however, it
cannot be fully blinded, and this can affect all aspects of the
trial, including subject retention, patient management, and all
aspects of observation (see section 1.2.2). This design is
appropriate in circumstances where a placebo-controlled trial would
be performed, except that blinding is not feasible because the
treatments themselves are so different, e.g., radiation therapy
versus surgery, or because the treatment side effects are so
different. When this design is used, it is desirable to have
critical decisions, such as eligibility and endpoint determination
or changes in management, made by an observer blinded to treatment
assignment. Decisions related to data analysis, such as inclusion of
patients in analysis sets, should also be made by individuals
without access to treatment assignment (see ICH E9 for further
discussion).
2.3 Dose-Response Concurrent Control (See Section 1.3.3)
2.3.1 Description
A dose-response study is one in which subjects are randomly
assigned to one of several dosing groups, with or without a placebo
group. Dose-response studies are carried out to establish the
relation between dose and efficacy/adverse effects and/or to
demonstrate efficacy. The first use is considered in ICH E4; the
latter is the subject of this guidance. Evidence of efficacy could
be based on significant differences in pair-wise comparisons between
dosing groups or between dosing groups and placebo, or on evidence
of a significant positive trend with increasing dose, even if no two
groups are significantly different. In the latter case, however,
further study may be needed to assess the effectiveness of the low
doses. As noted in ICH E9, the particular approach for the primary
efficacy analysis should be prespecified.
There are several advantages to inclusion of a placebo (zero-
dose) group in a dose-response study. First, it avoids studies that
are uninterpretable because all doses produce similar effects so
that one cannot assess whether all doses are equally effective or
equally ineffective. Second, the placebo group permits an estimate
of absolute size of effect, although the estimate may not be very
precise if the dosing groups are relatively small. Third, as the
drug-placebo difference is generally larger than inter-dose
differences, use of placebo may permit smaller sample sizes. The
size of various dose groups need not be identical; e.g., larger
samples could be used to give more precise information about the
effect of smaller doses or be used to increase the power of the
study to show a clear effect of what is expected to be the optimal
dose. Dose-response studies can include one or more doses of an
active-control agent. Randomized withdrawal designs can also assign
subjects to multiple dosage levels.
2.3.2 Ability to Minimize Bias
If the dose-response study is blinded, it shares with other
blinded designs an ability to minimize subject and investigator
bias. When a drug has pharmacologic effects that could break the
blind for some patients or investigators, it may be easier to
preserve blinding in a dose-response study than in a placebo-
controlled trial. Masking treatments may necessitate multiple
dummies or preparation of several different doses that look alike.
2.3.3 Ethical Issues
The ethical and practical concerns related to a dose-response
study are similar to those affecting placebo-controlled trials.
Where there is therapy known to be effective in preventing death or
irreversible morbidity, it is no more ethically acceptable to
randomize deliberately to subeffective therapy than it is to
randomize to placebo. Where therapy is directed at less serious
conditions or where the toxicity of the therapy is substantial
relative to its benefits, dose-response studies that use low,
potentially subeffective doses or placebo may be acceptable to
patients and investigators.
2.3.4 Usefulness of Dose-Response Studies and Quality/Validity of
Inference in Particular Situations
In general, a blinded dose-response study is useful for the
determination of efficacy and safety in situations where a placebo-
controlled trial would be useful and has similar credibility (see
section 2.1.4).
2.3.5 Modifications of Design and Combinations With Other Controls
That Can Resolve Ethical, Practical, or Inferential Problems
In general, the sorts of modification made to placebo-
controlled studies to mitigate ethical, practical, or inferential
problems are also applicable to dose-response studies (see section
2.1.5).
2.3.6 Advantages of Dose-Response Trials, Other Than Those Related
to Any Difference-Showing Study
2.3.6.1 Efficiency. Although a comparison of a large, fully
effective dose to placebo is maximally efficient for showing
efficacy, this design may produce unacceptable toxicity and gives no
dose-response information. When the dose-response is monotonic, the
dose-response trial is reasonably efficient in showing efficacy and
also yields dose-response information. If the optimally effective
dose is not known, it may be more prudent to study a range of doses
than to choose a single dose that may prove to be suboptimal or
toxic.
2.3.6.2 Possible ethical advantage. In some cases, notably
those in which there is likely to be dose-related efficacy and dose-
related important toxicity, the dose-response study may represent a
difference-showing trial that can be ethically or practically
conducted even where a placebo-controlled trial could not be,
because there is reason for patients and investigators to accept
lesser effectiveness in return for greater safety.
2.3.7 Disadvantages of Dose-Response Study
A potential problem that needs to be recognized is that a
positive dose-response trend (i.e., a significant correlation
between the dose and the efficacy outcome), without significant
pair-wise differences, can establish efficacy, but may leave
uncertainty as to which doses (other than the largest) are actually
effective. But, of course, a single-dose study poses a similar
problem with respect to doses below the one studied, giving no
information at all about such doses.
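The situation described above, a significant trend without
significant pair-wise differences, can be sketched numerically. The
example is illustrative and not part of the guidance. It assumes four
equally spaced dose levels, 20 subjects per group, a known common
standard deviation of 1, and no adjustment for multiple comparisons;
the function names and all values are hypothetical.

```python
import math


def pairwise_z(mean_a, mean_b, sd, n):
    """z statistic for the difference between two groups of size n."""
    return (mean_b - mean_a) / (sd * math.sqrt(2 / n))


def linear_trend_z(means, sd, n):
    """z statistic for a linear contrast across equally spaced doses."""
    k = len(means)
    center = (k - 1) / 2
    coefs = [i - center for i in range(k)]  # -1.5, -0.5, 0.5, 1.5 for k = 4
    estimate = sum(c * m for c, m in zip(coefs, means))
    se = sd * math.sqrt(sum(c * c for c in coefs) / n)
    return estimate / se


# Hypothetical mean responses at four increasing dose levels.
means = [0.0, 0.2, 0.4, 0.6]
sd, n = 1.0, 20

z_trend = linear_trend_z(means, sd, n)
z_pairs = [pairwise_z(means[i], means[j], sd, n)
           for i in range(len(means))
           for j in range(i + 1, len(means))]
critical = 1.96  # two-sided 5 percent level, unadjusted
```

Under these assumptions the trend statistic is about 2.0, above the
critical value, while the largest pair-wise statistic (highest versus
lowest dose) is about 1.90, below it. The trend supports the presence
of an effect, but no single dose is distinguished from any other.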
It should also be appreciated that it is not uncommon to show
no difference between doses in a dose-response study; if there is no
placebo group to provide a clear demonstration of an effect, this is
a very costly ``no test'' outcome.
If the therapeutic range is not known at all, the design may be
inefficient, as many patients may be assigned to subtherapeutic or
supratherapeutic doses.
Dose-response designs may be less efficient than placebo-
controlled titration designs for showing the presence of a drug
effect; they do, however, in most cases provide better dose-response
information (see ICH E4).
2.4 Active Control
2.4.1 Description (See Section 1.3.4)
An active-control (positive-control) trial is one in which an
investigational drug is compared with a known active drug. Such
trials are usually randomized and usually double-blind. The most
crucial design question is whether the trial is intended to show a
difference between the two drugs or to show noninferiority/
equivalence. A sponsor intending to demonstrate effectiveness by
means of a trial showing noninferiority of the test drug to a
standard agent needs to address the issue of the sensitivity-to-
drug-effects and assay sensitivity of the trial, as discussed in
section 1.5. In a noninferiority/equivalence trial, the active-
control agent needs to be of established efficacy at the dose used
and under the conditions of the study (see ICH E9: Statistical
Principles for Clinical Trials). In general, this means it should be
an agent acceptable in the region to which the studies will be
submitted for the same indication at the dose being studied. A
superiority study favoring the test drug, on the other hand, is
readily interpretable as evidence of efficacy, even if the dose of
active control is too low or the active control is of uncertain
benefit (but not if it could be harmful). Such a result, however--
superiority in the trial of the test agent to the control--is
interpretable as actual superiority of the test drug to the control
treatment only when the active control is used in appropriate
patients at an optimal dose and schedule (see section 1.4.2). Lack
of appropriate use of the control drug would also make the study
unusable as a noninferiority study if superiority of the test drug
is not shown, because assay sensitivity of the study would not be
ensured (see section 1.5.4).
2.4.2 Ability to Minimize Bias
A randomized and blinded active-control trial generally
minimizes subject and investigator bias, but a note of caution is
warranted. In a noninferiority trial, investigators and subjects
know that all subjects are getting active drug, although they do not
know which one. This could lead to a biased interpretation of
results in the form of a tendency toward categorizing borderline
cases as successes in partially subjective evaluations, e.g., in an
antidepressant study. Such biases may decrease variance and/or
treatment differences and thus can increase the likelihood of an
incorrect finding of equivalence.
2.4.3 Ethical Issues
Active-control trials are generally considered to pose fewer
ethical and practical problems than placebo-controlled trials
because all subjects receive active treatment. It should be
appreciated, however, that subjects getting a new agent are not
getting standard therapy (just as a placebo group is not) and may be
receiving an ineffective or harmful drug. This is an important
matter if the active-control therapy is known to improve survival or
decrease the occurrence of irreversible morbidity. There should
therefore be a sound rationale for the investigational agent. If
there is not strong reason to expect the new drug to be at least as
good as the standard, an add-on study (see section 2.1.5.2.1) may be
more appropriate, if the conditions allow such a design.
Using a very low dose, either of the active control or of the
test drug, may provide a de facto placebo that can be shown inferior
to the full dose of the test drug. This, however, is only considered
ethical where a placebo would also be ethical, unless there is a
legitimate reason to study such low doses.
2.4.4 Usefulness of Active-Control Trials and Quality/Validity of
Inference in Particular Situations
When a new drug shows an advantage over an active control, the
study has inferential properties regarding the presence of efficacy
equivalent to any other difference-showing trial, assuming that the
active control is not actually harmful. When an active-control trial
is used to show noninferiority/equivalence, there is the special
consideration of sensitivity-to-drug-effects and assay sensitivity,
which are considered above in section 1.5. If assay sensitivity is
established, either historically (by reference to past experience
with the control drug) or by including a placebo control as well as
active control, the active-control trial can assess comparative
efficacy.
2.4.5 Modifications of Design and Combinations With Other Controls
That Can Resolve Ethical, Practical, or Inferential Issues
As discussed earlier (section 2.1.5), active-control studies
can include a placebo group, multiple-dose groups of the test drug,
and/or other dose groups of the active control. Comparative dose-
response studies, in which there are several doses of both test and
active control, are typical in analgesic trials. The doses in
active-control trials can be fixed or titrated, and both cross-over
and parallel designs can be used. The assay sensitivity of a
noninferiority trial can sometimes be supported by a randomized
placebo-controlled withdrawal phase at the end (see section
2.1.5.2.4). Active-control superiority studies in selected
populations (nonresponders to other therapy) can be very useful and
are generally easy to interpret (see appendix), although the results
may not be generalizable.
2.4.6 Advantages of Active-Control Trials
2.4.6.1 Ethical/practical advantages. The active-control
design, whether intended to show noninferiority/equivalence or
superiority, reduces ethical concerns that arise from failure to use
drugs with documented important health benefits. It also addresses
patient and physician concerns about failure to use documented
effective therapy. Recruitment and IRB/IEC approval may be
facilitated, and it may be possible to study larger samples. There
may be fewer dropouts due to lack of effectiveness.
2.4.6.2 Information content. Where superiority to an active
treatment is shown, active-control studies are readily interpretable
regarding evidence of efficacy. The larger sample sizes needed are
sometimes more achievable and acceptable in active-control trials
and can provide more safety information. Active-control trials also
can, if properly designed, provide information about relative
efficacy.
2.4.7 Disadvantages of Active-Control Trials
2.4.7.1 Information content. See section 1.5 for discussion of
the problem of assay sensitivity and the ability of the trial to
support an efficacy conclusion in noninferiority/equivalence trials.
Even when assay sensitivity is supported and the study is suitable
for detecting efficacy, there is no direct assessment of absolute
effect size, and safety outcomes are more difficult to quantitate.
2.4.7.2 Large sample size. Generally, in noninferiority
trials, the margin of difference that needs to be excluded is chosen
conservatively, first, because the smallest effect of the active
control expected in trials will ordinarily be used as the estimate
of its effect and, second, because there will usually be an intent
to rule out loss of more than some reasonable fraction (see section
1.5.2) of the control drug effect, leading to a still smaller
margin. Because of the need for conservative assumptions about
control drug effect size, sample sizes may be very large. In a
difference-showing active-control trial, the difference between two
drugs is always smaller, often much smaller, than the expected
difference between drug and placebo, again leading to large sample
sizes.
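The two conservative steps described above, and their effect on
sample size, can be sketched as follows. The example is illustrative
and not part of the guidance. The past control-versus-placebo effects,
the choice to retain at least half of the control effect, the
one-sided 2.5 percent significance level, the 80 percent power, and
the function names are all assumptions; the calculation also assumes
that the test and control drugs are truly equally effective.

```python
import math
from statistics import NormalDist


def noninferiority_margin(past_control_effects, fraction_retained):
    """Margin = permitted loss of the smallest control effect seen."""
    smallest_effect = min(past_control_effects)
    return smallest_effect * (1 - fraction_retained)


def n_per_arm(difference, sd, alpha=0.025, power=0.80):
    """Approximate subjects per arm to exclude (or detect) `difference`
    at one-sided `alpha`, using the normal approximation."""
    z = NormalDist()
    z_total = z.inv_cdf(1 - alpha) + z.inv_cdf(power)
    return math.ceil(2 * (z_total * sd / difference) ** 2)


# Hypothetical control-versus-placebo effects (SD units) in past trials.
past_effects = [0.60, 0.45, 0.50]
margin = noninferiority_margin(past_effects, fraction_retained=0.5)
n_noninferiority = n_per_arm(margin, sd=1.0)
n_placebo_trial = n_per_arm(min(past_effects), sd=1.0)
```

With these assumed values the margin is 0.225 standard deviations,
and the noninferiority trial needs about 311 subjects per arm,
compared with about 78 per arm for a placebo-controlled trial
designed to detect an effect of 0.45.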
2.5 External Control (Historical Control)
2.5.1 Description
An externally controlled trial is one in which the control
group consists of patients who are not part of the same randomized
study as the group receiving the investigational agent, i.e., there
is no concurrently randomized comparative group. The control group
is thus not derived from exactly the same population as the treated
population. Usually, the control group is a well-documented
population of patients observed at an earlier time (historical
control) at another institution, or even at the same institution but
outside the study. An external-control study could be a superiority
study or an equivalence study. Sometimes certain patients from a
larger experience are selected as a control group on the basis of
particular characteristics that make them similar to the treatment
group; there may even be an attempt to ``match'' particular control
and treated patients.
So-called ``baseline-controlled studies'' are a variety of
externally controlled trials; these are sometimes thought to use
``the patient as his own control,'' but that is logically incorrect.
In fact, the comparator group is an estimate of what would have
happened in the absence of therapy to the patients. Both baseline-
controlled trials and studies that use a more complicated on-off-on
(cross-over) design, but that do not include a concurrently
randomized control group, are of this type. As noted, in these
studies the observed changes from baseline or between study periods
are always compared, at least implicitly, to some estimate of what
would have happened without the intervention. Such estimates are
generally made on the basis of ``general knowledge,'' without
reference to a specific control population. Although in some cases
this is plainly reasonable, e.g., when the effect is dramatic,
occurs rapidly following treatment, and is unlikely to have occurred
spontaneously (e.g., general anesthesia, cardioversion, measurable
tumor shrinkage), in most cases it is not so obvious and a specific
historical experience should be sought. Designers and analysts of
such trials need to be aware of the risks of this type of control
and should be prepared to support its use.
2.5.2 Ability to Minimize Bias
Inability to control bias is the major and well-recognized
limitation of externally controlled trials and is sufficient in many
cases to make the design unsuitable. It is always difficult, in many
cases impossible, to establish comparability of the treatment and
control groups and thus to fulfill the major purpose of a control
group (see section 1.2). The groups can be dissimilar with respect
to a wide range of factors, other than the study drug, that could
affect outcome, including demographic characteristics, diagnostic
criteria, stage or duration of disease, concomitant treatments, and
observational conditions (such as methods of assessing outcome,
investigator expectations). Blinding and randomization are not
available to minimize bias when external controls are used. It is
well documented that untreated historical-control groups tend to
have worse outcomes than an apparently similar control group in a
randomized study, primarily because of selection bias. Control
groups in a randomized study must meet certain criteria to be
entered into the study, criteria that are generally more stringent
and identify a less sick population than is typical of external-
control groups. The external-control group is often identified
retrospectively, leading to potential bias in its selection. A
consequence of the
recognized inability to control bias is that the persuasiveness of
findings from externally controlled trials depends on obtaining much
more extreme levels of statistical significance and much larger
estimated differences between treatments than would be considered
persuasive in concurrently controlled trials.
The inability to control bias restricts use of the external-
control design to situations in which the effect of treatment is
dramatic and the usual course of the disease highly predictable. In
addition, use of external controls should be limited to cases in
which the endpoints are objective and the impact of baseline and
treatment variables on the endpoint is well characterized.
As noted, the lack of randomization and blinding, and the
resultant problems with lack of assurance of comparability of test
group and control group, make the likelihood of substantial bias
inherent in this design and impossible to quantitate. Nonetheless,
some approaches to design and conduct of externally controlled
trials could lead them to be more persuasive and potentially less
biased. A control group should be chosen for which there is detailed
information, including, where needed, individual patient data
regarding demographics, baseline status, concomitant therapy, and
course on study. The control patients should be as similar as
possible to the population expected to receive the test drug in the
study and should have been treated in a similar setting and in a
similar manner, except with respect to the study therapy. Study
observations should utilize timing and methodology similar to those
used in the control patients. To reduce selection bias, selection of
the control group should be made before performing comparative
analyses; this may not always be feasible, as outcomes from these
control groups may have been published. Any matching on selection
criteria or adjustments made to account for population differences
should be specified prior to selection of the control and
performance of the study. Where no obvious single ``optimal''
external control exists, it may be advisable to study multiple
external controls, providing that the analytic plan specifies
conservatively how each will be utilized in drawing inferences
(e.g., study group should be substantially superior to the most
favorable control to conclude efficacy). In some cases, it may be
useful to have an independent set of reviewers reassess endpoints in
the control group and in the test group in a blinded manner
according to common criteria.
2.5.3 Ethical Issues
When a drug is intended to treat a serious illness for which
there is no satisfactory treatment, especially if the new drug is
seen as promising on the basis of theoretical considerations, animal
data, or early human experience, there may be understandable
reluctance to perform a comparative study with a concurrent control
group of patients who would not receive the new treatment. At the
same time, it is not responsible or ethical to carry out studies
that have no realistic chance of credibly showing the efficacy of
the treatment. It should be appreciated that many promising
therapies have had less dramatic effects than expected or have shown
no efficacy at all when tested in controlled trials. Investigators
may, in these situations, be faced with very difficult judgments. It
may be tempting in exceptional cases to initiate an externally
controlled trial, hoping for a convincingly dramatic effect, with a
prompt switch to randomized trials if this does not materialize.
Alternatively, and generally preferably, in dealing with
serious illnesses for which there is no satisfactory treatment, but
where the course of the disease cannot be reliably predicted, even
the earliest studies should be randomized. This is usually possible
when studies are carried out before there is an impression that the
therapy is effective. Studies can be monitored by independent data
monitoring committees so that dramatic benefit can be detected
early. Although an externally controlled trial uses only a single
treatment group, a placebo-controlled trial is usually the more
efficient design (needing fewer subjects) in such cases, because the
estimate of control group outcome in an externally controlled trial
generally needs to be made conservatively, which increases the
sample size needed. Great caution
(e.g., applying a more stringent significance level) is called for
because there are likely to be both identified and unidentified or
unmeasurable differences between the treatment and control groups,
often favoring treatment. The concurrently controlled trial can
detect extreme effects very rapidly and, in addition, can detect
modest, but still valuable, effects that would not be credibly
demonstrated by an externally controlled trial.
2.5.4 Usefulness of Externally Controlled Trials and Quality/
Validity of Inference in Particular Situations
An externally controlled trial should generally be considered
only when prior belief in the superiority of the test therapy to
all available alternatives is so strong that alternative designs
appear unacceptable and the disease or condition to be treated has a
well-documented, highly predictable course. It is often possible,
even in these cases, to utilize alternative, randomized,
concurrently controlled designs (see section 2.1.5 and appendix).
Externally controlled trials are most likely to be persuasive
when the study endpoint is objective, when the outcome on treatment
is markedly different from that of the external control and a high
level of statistical significance for the treatment-control
comparison is attained, when the covariates influencing outcome of
the disease are well characterized, and when the control closely
resembles the study group in all known relevant baseline, treatment
(other than study drug), and observational variables. Even in such
cases, however, there are documented examples of erroneous
conclusions arising from such trials.
When an external-control trial is considered, appropriate
attention to design and conduct may help reduce bias (see section
2.5.2).
2.5.5 Modifications of Design and Combinations With Other Controls
That Can Resolve Ethical, Practical, or Inferential Problems
The external-control design can incorporate elements of
randomization and blinding through use of a randomized placebo-
controlled withdrawal phase, often with early-escape provisions, as
described earlier (see section 2.1.5.2.4). The results of the
initial period of treatment, in which subjects who appear to respond
are identified and maintained on therapy, are thus ``validated'' by
a rigorous, largely assumption- and bias-free study.
2.5.6 Advantages of Externally Controlled Trials
The main advantage of an externally controlled trial is that
all patients can receive a promising drug, making the study more
attractive to patients and physicians.
The design has some potential efficiencies (smaller sample
size) because all patients are exposed to test drug, of particular
importance in rare diseases.
2.5.7 Disadvantages of Externally Controlled Trials
The externally controlled study cannot be blinded and is
subject to patient, observer, and analyst bias; these are major
disadvantages.
It is possible to mitigate these problems to a degree, but even the
steps suggested in section 2.5.2 cannot resolve such problems fully,
as treatment assignment is not randomized and comparability of
control and treatment groups at the start of treatment, and
comparability of treatment of patients during the trial, cannot be
ensured or well assessed. It is well documented that externally
controlled trials tend to overestimate efficacy of test therapies.
3.0 Choosing the Control Group
Figure 1 and Table 1 provide a decision tree for choosing among
different types of control groups. Although the table and figure
focus on the choice of control to demonstrate efficacy, some designs
also allow comparisons of test and control agents. The choice of
control can be affected by the availability of therapies and by
medical practices in specific regions. The potential usefulness of
the principal types of control (placebo, active, and dose-response)
in specific situations and for specific purposes is shown in Table
1. The table should be used with the text describing the details of
specific circumstances in which potential usefulness can be
realized. In all cases, it is presumed that studies are
appropriately designed. External controls are so distinct a case
that they are not included in the table. In the table, a P notation
refers to the need to make a convincing case that the study has
assay sensitivity.
In general, evidence of efficacy is most convincingly
demonstrated by showing superiority to a concurrent control
treatment. If a superiority trial is not feasible or is
inappropriate for ethical or practical reasons, and if a defined
treatment effect of the active control is regularly seen (e.g., as
it is for antibiotics in most situations), a noninferiority/
equivalence study can be utilized and can be persuasive. Use of this
design calls for close attention to the issue of sensitivity to drug
effects in active-control noninferiority trials of the condition
being studied and to the assay sensitivity of the particular study
carried out (see section 1.5).
[Figure 1 and Table 1: graphics omitted from this text version
(TN24SE99.000, TN24SE99.001).]
APPENDIX
Studies of Efficacy in Subsets of the Whole Population; Enrichment
1.0 Introduction
Ideally, the effect of a drug should be known in general and in
relevant demographic and other subsets of the population, such as
those defined by disease severity or other disease characteristics.
To the extent study patients are not a random sample of the patients
who will be treated with the drug once it is marketed, the
generalizability of the results can be questioned. Even if the
overall result is obtained in a representative sample, however, that
does not suggest the result is the same in all people. If subject
selection criteria can identify people more likely to respond to
therapy (e.g., high renin hypertensives to beta blockers), we
consider therapy more rational and the drug more useful.
Subjects entering clinical studies are in fact almost never a
random sample of the potential treatment population, and they are
not treated exactly as a nonstudy patient would be treated. They
must give informed consent, be able to follow instructions, and be
able to get to the clinic. They are sometimes assessed for
likelihood of complying with treatment. They are usually not very
debilitated and generally are without complicated or life-
threatening illness, unless those conditions are being studied. They
are usually selected using particularly stringent diagnostic
criteria that make it very certain they actually have the disease to
be treated (more likely than in clinical practice). Lead-in periods
are often used to exclude subjects who improve spontaneously or
whose relevant functional measures (blood pressure, exercise
tolerance) are too variable. Of course, the entire setting of trials
is artificial in varying degrees, generally directed toward reducing
unwanted variability and increasing study efficiency.
All of these departures from a truly unselected population of
people likely to receive the drug are directed at identifying and
including subjects likely to make a ``good assay population.'' They
can be considered methods of ``enrichment'' of the population,
modifications of a truly random sample of potential users to produce
a population of subjects more likely to discriminate between an
active and an inactive therapy. The kinds of enrichment described
above are widely accepted and ``benign,'' i.e., it seems likely that
results in such a population will be of general applicability, at
least to patients with good compliance. There is a view, however,
that in-use ``effectiveness'' may often be different from the
artificial ``efficacy'' established in these enriched ``efficacy''
trials.
There are other kinds of enrichment that could also be useful
but that would more clearly alter the inference that could be drawn
from the results. This should not discourage their use but should
encourage attention to what such studies do, and do not, show. Some
enrichments of potential value include:
1.1 Studies of Patients Nonresponsive to, or Intolerant of, Other
Therapy
In this kind of study, patients failing therapy on a drug, or
failing to tolerate it acceptably, are randomized to the failed or
poorly tolerated therapy or to the investigational treatment.
Greater efficacy (or better tolerance) of the new therapy shows that
the drug is useful in failures on the other therapy. This is a
valuable showing if, e.g., the drug is relatively toxic and intended
for a ``second-line'' use, but it does not show that the new therapy
is superior in general, and such studies need to be carefully
interpreted. By selecting study patients who will only infrequently
respond to the control agent or who are very likely to have a
particular adverse effect of the control drug, the design
facilitates showing the second drug's advantage in that
circumstance. A direct comparison of the two drugs in an unselected
population that could contain responders to both drugs would need to
be much larger to show a difference between the treatments, even if
there was an overall advantage of the new drug. Moreover, it could
be that each drug has a similar rate of nonresponders (although each
drug works in some of the other's nonresponders), so that no
difference could be seen in
a direct comparison in unselected subjects.
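The sample-size point above can be made concrete with a small Python sketch that uses the usual normal-approximation formula for comparing two proportions. The response rates are hypothetical, chosen only to contrast a population of prior nonresponders (where the control drug rarely works) with an unselected population (where both drugs often work); the function name is likewise an assumption for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_control, p_new, alpha=0.05, power=0.80):
    """Approximate subjects per arm for a two-sided test of two proportions."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_new * (1 - p_new)
    return ceil(z ** 2 * variance / (p_new - p_control) ** 2)


# Hypothetical response rates, for illustration only.
# Enriched: patients who already failed the control drug.
enriched = n_per_arm(p_control=0.10, p_new=0.40)
# Unselected: many patients respond to either drug, so the difference is small.
unselected = n_per_arm(p_control=0.50, p_new=0.60)

print("per arm, prior nonresponders:", enriched)
print("per arm, unselected population:", unselected)
```

Under these assumed rates the unselected comparison needs more than ten times as many subjects per arm, and if the two response rates were equal no sample size would show a difference.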
In this design, it is usually critical to randomize the
nonresponders or intolerants to both the new agent and the failed
agent, rather than simply place the failures on the new drug.
Patients who failed previously may ``respond'' to the failed drug
when it is readministered in a clinical trial, or may tolerate the
previously poorly tolerated drug in the new circumstance. Randomizing
to the failed or poorly tolerated drug can, however, present a
problem. In the ``intolerance'' case, although subjects
can be randomized to a drug that has caused certain kinds of
intolerance, they cannot be randomized to a drug that would endanger
them if administered (e.g., if the intolerance was anaphylaxis or
liver necrosis). Similarly, in the nonresponder case, patients
cannot be restudied on the failed drug if failure would lead to
harm. In some cases, the prior experience (e.g., failure of a tumor
to respond) may be an adequate control, which makes the study in
effect a baseline-controlled design.
1.2 Studies in Likely or Known Responders
If patients cannot respond to the main pharmacologic effect of
the drug, they cannot be expected to show a clinical response. Thus,
subjects with no blood pressure response to sublingual nitroglycerin
have been excluded from trials of organic nitrates, as they show no
ability to respond to the mechanism of action of these drugs and
including them would only dilute the drug effect. A similar approach
was used in the Cardiac Arrhythmia Suppression Trial (CAST). Only
subjects responding to encainide or flecainide with a 70 percent
reduction in ventricular premature beats (VPBs) were randomized to
the mortality phase of the study because there was no reason to
include people who could not possibly benefit (i.e., people with no
VPB reduction). It is important in such cases to record the number
of subjects screened in order to construct the study population so
that users of the drug will have a reasonable expectation of what
they will encounter. It will often be appropriate to incorporate
similar selection criteria in labeling the drug for use.
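The dilution of the drug effect by subjects who cannot respond can be written out as a short worked relation. This is an approximation under stated assumptions, not part of the guidance: f is the assumed fraction of enrolled subjects able to respond, \Delta_R is the mean drug-placebo difference among them, nonresponders contribute a difference of zero, and the outcome standard deviation \sigma is taken to be the same in both populations (the \text command assumes the amsmath package).

```latex
% f        : fraction of enrolled subjects able to respond (assumed)
% \Delta_R : mean drug-placebo difference in those able to respond (assumed)
% \sigma   : outcome standard deviation, assumed equal in both populations
\[
\Delta_{\text{mixed}} = f\,\Delta_R + (1-f)\cdot 0 = f\,\Delta_R ,
\qquad
n \;\propto\; \frac{\sigma^{2}}{\Delta^{2}}
\;\Longrightarrow\;
\frac{n_{\text{mixed}}}{n_{\text{enriched}}} \approx \frac{1}{f^{2}} .
\]
```

For example, if only half of the enrolled subjects can respond (f = 0.5), the average effect is halved and roughly four times as many subjects per arm are needed to detect it with the same power.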
The nitroglycerin and CAST enrichment approaches were generally
accepted. A potentially more controversial enrichment procedure
would be to identify responders in an initial open phase, withdraw
treatment, then carry out a randomized study in the responders. This
could be a useful approach when efficacy has proved difficult to
demonstrate. For example, it has been difficult to obtain evidence
that gut motility-modifying agents are effective in gastroesophageal
reflux disease, perhaps because there are unrecognized
pathophysiologic subsets of patients, some of which can respond and
some of which cannot. It seems possible that identifying apparent
responders clinically, then randomizing the apparent responders to
drug and placebo treatments, would best utilize both clinical
observation and rigorous design.
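How much an open selection phase enriches the randomized population depends on how often patients outside the responsive subset also appear to respond (for example, through spontaneous improvement). The Python sketch below applies Bayes' rule to hypothetical proportions; the 20 percent subset size, the response probabilities, and the function name are assumptions made only for illustration.

```python
def true_responder_share(f_subset, p_resp_subset, p_apparent_other):
    """Share of open-phase apparent responders who are in the responsive subset.

    f_subset         : fraction of all patients in the responsive subset
    p_resp_subset    : probability such a patient appears to respond
    p_apparent_other : probability any other patient appears to respond
                       (e.g., spontaneous improvement)
    """
    from_subset = f_subset * p_resp_subset
    from_other = (1 - f_subset) * p_apparent_other
    return from_subset / (from_subset + from_other)


# Hypothetical values, for illustration only.
share = true_responder_share(f_subset=0.20, p_resp_subset=0.70, p_apparent_other=0.20)
print(f"responsive subset before selection: 20%, after selection: {share:.0%}")
```

Under these assumptions the selected group is still a mixture, which is one reason the apparent responders would be randomized to drug and placebo rather than simply observed on treatment.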
In seeking dose-response information, little is to be learned
from studying the drug in a population of nonresponders (although
one would want to know the proportion of the population that is
nonresponsive). Such studies might better be carried out in known
responders to the drug. Similarly, in evaluating a drug of a
particular class, studies including only known responders to the
class might be more likely to detect an effect of the drug or to
show differences between members of the class.
Finally, it should be appreciated that randomized withdrawal
studies (see section 2.1.5.2.4), and studies of maintenance
treatment in general, are often studies in known responders and can
therefore be expected to show greater effect than studies in an
unselected population.
Dated: September 16, 1999.
Margaret M. Dotzel,
Acting Associate Commissioner for Policy
[FR Doc. 99-24855 Filed 9-23-99; 8:45 am]