[Federal Register Volume 59, Number 183 (Thursday, September 22, 1994)]
From the Federal Register Online via the Government Publishing Office [www.gpo.gov]
[FR Doc No: 94-23379]
[Federal Register: September 22, 1994]
_______________________________________________________________________
Part IX
Department of Health and Human Services
_______________________________________________________________________
Food and Drug Administration
_______________________________________________________________________
International Conference on Harmonisation; Guideline on Detection of
Toxicity to Reproduction for Medicinal Products; Availability; Notice
-----------------------------------------------------------------------
[Docket No. 93D-0140]
International Conference on Harmonisation; Guideline on Detection
of Toxicity to Reproduction for Medicinal Products; Availability
AGENCY: Food and Drug Administration, HHS.
ACTION: Notice.
-----------------------------------------------------------------------
SUMMARY: The Food and Drug Administration (FDA) is publishing a final
guideline entitled ``Guideline on Detection of Toxicity to Reproduction
for Medicinal Products.'' This guideline was prepared under the
auspices of the International Conference on Harmonisation of Technical
Requirements for Registration of Pharmaceuticals for Human Use (ICH).
The guideline is intended to reflect sound scientific principles for
reproductive toxicity testing. The guideline is applicable to sponsors
submitting applications to both the Center for Drug Evaluation and
Research (CDER) and the Center for Biologics Evaluation and Research
(CBER).
DATES: Effective September 22, 1994. Submit written comments at any
time.
ADDRESSES: Submit written comments on the guideline to the Dockets
Management Branch (HFA-305), Food and Drug Administration, rm. 1-23,
12420 Parklawn Dr., Rockville, MD 20857. Copies of the guideline are
available from the CDER Executive Secretariat Staff (HFD-8), Center for
Drug Evaluation and Research, Food and Drug Administration, 7500
Standish Pl., Rockville, MD 20855.
FOR FURTHER INFORMATION CONTACT:
Regarding the guideline: Joy A. Cavagnaro, Center for Biologics
Evaluation and Research (HFM-500), Food and Drug Administration, 1401
Rockville Pike, Rockville, MD 20852, 301-594-2860.
Regarding the ICH: Janet J. Showalter, Office of Health Affairs
(HFY-20), Food and Drug Administration, 5600 Fishers Lane, Rockville,
MD 20857, 301-443-1382.
SUPPLEMENTARY INFORMATION: In recent years, many important initiatives
have been undertaken by regulatory authorities and industry
associations to promote international harmonisation of regulatory
requirements. FDA has participated in many meetings designed to enhance
harmonisation and is committed to seeking scientifically based
harmonized technical procedures for pharmaceutical development. One of
the goals of harmonisation is to identify and then reduce differences
in technical requirements for drug development.
ICH was organized to provide an opportunity for tripartite
harmonisation initiatives to be developed with input from both
regulatory and industry representatives. FDA also seeks input from
consumer representatives and others. ICH is concerned with
harmonisation of technical requirements for the registration of
pharmaceutical products among three regions: the European Union, Japan,
and the United States. The six ICH sponsors are the European
Commission, the European Federation of Pharmaceutical Industry
Associations, the Japanese Ministry of Health and Welfare, the Japanese
Pharmaceutical Manufacturers Association, FDA, and the U.S.
Pharmaceutical Research and Manufacturers of America. The ICH
Secretariat, which coordinates the preparation of documentation, is
provided by the International Federation of Pharmaceutical
Manufacturers Associations (IFPMA).
The ICH Steering Committee includes representatives from each of
the ICH sponsors and the IFPMA, as well as observers from the World
Health Organization, the Canadian Health Protection Branch, and the
European Free Trade Association.
Harmonisation of reproductive toxicology testing was selected as a
priority topic during the early stages of the ICH initiative. In the
Federal Register of April 16, 1993 (58 FR 21074), FDA published a draft
tripartite guideline entitled, ``Guideline on Detection of Toxicity to
Reproduction for Medicinal Products.'' The notice gave interested
persons an opportunity to submit comments by May 17, 1993.
After consideration of the comments received and revisions to the
guideline, a final draft of the guideline was submitted to the ICH
Steering Committee in June 1993 and endorsed by the three participating
regulatory agencies. The final guideline was subsequently presented at
the second ICH meeting held in October 1993. The guideline provides
information applicable to sponsors submitting applications to both CDER
and CBER. Sponsors submitting future applications may be asked to
explain differences from the approach suggested in the guideline.
To help facilitate understanding of the guideline, the agency is
providing further clarification of important questions that have been
raised since initial general distribution of the document at ICH 2 by
both industry and regulatory scientists.
General Comments
First pass tests in the guideline are those tests that will likely
be performed as general screens (i.e., the three-study design or ``most
probable option'') to identify potential treatment-related effects.
Secondary tests are those designed to characterize, e.g., the nature,
scope, and/or origin of the toxic effect. In general, repeated dose
general toxicity studies of 2 to 4 weeks duration may provide a close
approximation of the doses to be used in the reproductive toxicology
studies.
Male Fertility
As stated in the introduction to the guideline, studies are ongoing
to optimize parameters to be used in fertility studies, including the
optimal treatment period for males prior to mating, histological
techniques for the evaluation of sex organs, and techniques to evaluate
sperm. It is expected that, in most cases, viability will be measured
indirectly by evaluating sperm motility. A variety of methods will be
acceptable to evaluate sperm, including vital dye staining, flow
cytometric analysis, and nonautomated and automated methods to measure
the percent of motile sperm. Sponsors should justify the methods used
and define the objective criteria established to assess the data
obtained. It is expected that improvements in methods to assess male
reproductive performance will evolve over the next few years.
The design of the study of fertility (ICH 4.1.1) assumes that,
especially for effects on spermatogenesis, use will be made of data
from repeated dose toxicity studies of at least 1-month duration. The
agency encourages the use of good pathological and histopathological
examination techniques in the repeated dose toxicity studies in
addition to the staging of spermatogenesis which is routinely employed.
The preservation of testes and epididymides from all animals from ICH
study 4.1.1 provides an opportunity for more detailed histopathological
examination on a case-by-case basis; for example, if unexpected effects
on sperm count or viability are observed. There may be cases in which,
because of species-specific effects or technical considerations (e.g.,
multiple samplings are required over time), sperm evaluation in
nonrodents may be more appropriate.
The duration of pretreatment for males in ICH study 4.1.1 is 4
weeks, unless data from other studies suggest that this should be
modified. Males should be treated throughout the mating period
(generally between 2 and 3 weeks) and at least through implantation of
the females. Thus, males will generally be sacrificed following at
least 7 to 9 weeks dosing. Evaluations should generally include organ
weights and macroscopic examinations of testis, epididymis, seminal
vesicle, and prostate. Sperm counts and sperm viability (e.g.,
motility) should be assessed. Tissues should be saved for potential
histological assessment, as such assessments may be required on a case-
by-case basis. If histological data are not available from previous
studies or the quality of the data is dubious, then histological
evaluation should be performed in this study.
Prenatal and Postnatal Development
When studying the effect on postnatal development, the reduction of
litter size by culling is still under discussion. If culling is
performed, it should be randomized. Whether or not it is performed, it
should be explained by the investigator. Observations on offspring in
ICH study 4.1.2 include sensory functions and reflexes and behavior,
consistent with previous guidelines from Japan and the European Union.
Specific functional tests have not been recommended in the ICH
guideline. Investigators are encouraged to use methods that will assess
sensory functions, motor activity, learning, and memory to help
characterize functional deficits in offspring. Under the terminology
section of the guideline, a three-generation study is defined as direct
exposure of the F0 generation, indirect and direct exposure of the F1
and F2, and indirect exposure of the F3 generation.
In the past, guidelines have generally been issued under
Sec. 10.90(b) (21 CFR 10.90(b)), which provides for the use of
guidelines to state procedures or standards of general applicability
that are not legal requirements but are acceptable to FDA. The agency
is now in the process of revising Sec. 10.90(b). Therefore, this
guideline is not being issued under the authority of Sec. 10.90(b), and
it does not create or confer any rights, privileges, or benefits for or
on any person, nor does it operate to bind FDA in any way.
As with all of FDA's guidelines, the public is encouraged to submit
written comments with new data or other new information pertinent to
this guideline. The comments in the docket will be periodically
reviewed, and, where appropriate, the guideline will be amended. The
public will be notified of any such amendments through a notice in the
Federal Register.
Interested persons may, at any time, submit written comments on the
guideline to the Dockets Management Branch (address above). Two copies
of any comments are to be submitted, except that individuals may submit
one copy. Comments are to be identified with the docket number found in
brackets in the heading of this document. The guideline and received
comments may be seen in the office above between 9 a.m. and 4 p.m.,
Monday through Friday.
The text of the guideline follows:
Guideline on Detection of Toxicity to Reproduction for Medicinal Products
1. Introduction
1.1 Purpose of the Guideline
There is a considerable overlap in the methodology that could be
used to test chemicals and medicinal products for potential
reproductive toxicity. As a first step to using this wider
methodology for efficient testing, this guideline attempts to
consolidate a strategy based on study designs currently in use for
testing of medicinal products; it should encourage the full
assessment of the safety of chemicals with respect to the development
of the offspring. It is perceived that tests in which animals are treated
during defined stages of reproduction better reflect human exposure
to medicinal products and allow more specific identification of
stages at risk. While this approach may be useful for most
medicines, long-term exposure to low doses does occur and may be
represented better by a one- or two-generation study approach.
The actual testing strategy should be determined by:
Anticipated drug use especially in relation to
reproduction,
The form of the substance and route(s) of
administration intended for humans, and
Making use of any existing data on toxicity,
pharmacodynamics, kinetics, and similarity to other compounds in
structure/activity.
To employ this concept successfully, flexibility is needed (Note
1). No guideline can provide sufficient information to cover all
possible cases. All persons involved should be willing to discuss
and consider variations in test strategy according to the state-of-
the-art and ethical standards in human and animal experimentation.
Areas where more basic research would be useful for optimization of
test designs are male fertility assessment, and kinetics and
metabolism in pregnant/lactating animals.
1.2 Aim of Studies
The aim of reproduction toxicity studies is to reveal any effect
of one or more active substance(s) on mammalian reproduction. For
this purpose, both the investigations and the interpretation of the
results should be related to all other pharmacological and
toxicological data available to determine whether potential
reproductive risks to humans are greater than, less than, or equal to
those posed by other toxicological manifestations. Further, repeated dose
toxicity studies can provide important information regarding
potential effects on reproduction, particularly male fertility. To
extrapolate the results to humans (assess the relevance), data on
likely human exposures, comparative kinetics, and mechanisms of
reproductive toxicity may be helpful.
The combination of studies selected should allow exposure of
mature adults and all stages of development from conception to
sexual maturity. To allow detection of immediate and latent effects
of exposure, observations should be continued through one complete
life cycle, i.e., from conception in one generation through
conception in the following generation. For convenience of testing,
this integrated sequence can be subdivided into the following
stages:
A. Premating to conception (adult male and female reproductive
functions, development and maturation of gametes, mating behavior,
fertilization).
B. Conception to implantation (adult female reproductive
functions, preimplantation development, implantation).
C. Implantation to closure of the hard palate (adult female
reproductive functions, embryonic development, major organ
formation).
D. Closure of the hard palate to the end of pregnancy (adult
female reproductive functions, fetal development and growth, organ
development and growth).
E. Birth to weaning (adult female reproductive functions,
neonate adaption to extrauterine life, preweaning development and
growth).
F. Weaning to sexual maturity (postweaning development and
growth, adaption to independent life, attainment of full sexual
function).
For timing conventions see Note 2.
1.3 Choice of Studies
The guideline addresses the design of studies primarily for
detection of effects on reproduction. When an effect is detected,
further studies to characterize fully the nature of the response
have to be designed on a case-by-case basis (Note 3). The rationale
for the set of studies chosen should be given and should include an
explanation for the choice of dosages.
Studies should be planned according to the ``state-of-the-art,''
and take into account preexisting knowledge of class-related effects
on reproduction. They should avoid suffering and should use the
minimum number of animals necessary to achieve the overall
objectives. If a preliminary study is performed, the results should
be considered and discussed in the overall evaluation (Note 4).
2. Animal Criteria
The animals used should be well defined with respect to their
health, fertility, fecundity, prevalence of abnormalities,
embryofetal deaths, and the consistency they display from study to
study. Within and between studies, animals should be of comparable
age, weight, and parity at the start; the easiest way to fulfill
these criteria is to use animals that are young, mature adults at
the time of mating with the females being virgin.
2.1 Selection and Number of Species
Studies should be conducted in mammalian species. It is
generally desirable to use the same species and strain as in other
toxicological studies. Reasons for using rats as the predominant
rodent species are practicality, comparability with other results
obtained in this species and the large amount of background
knowledge accumulated.
In embryotoxicity studies only, a second mammalian species
traditionally has been required, the rabbit being the preferred
choice as a ``nonrodent.'' Reasons for using rabbits in
embryotoxicity studies include the extensive background knowledge
that has accumulated, as well as availability and practicality.
Where the rabbit is unsuitable, an alternative nonrodent or a second
rodent species may be acceptable and should be considered on a case-
by-case basis (Note 5).
2.2 Other Test Systems
Other test systems are considered to be any developing mammalian
and nonmammalian cell systems, tissues, organs, or organism cultures
developing independently in vitro or in vivo. Integrated with whole
animal studies either for priority selection within homologous
series or as secondary investigations to elucidate mechanisms of
action, these systems can provide invaluable information and,
indirectly, reduce the numbers of animals used in experimentation.
However, they lack the complexity of the developmental processes and
the dynamic interchange between the maternal and the developing
organisms. These systems cannot provide assurance of the absence of
effect nor provide perspective in respect of risk/exposure. In
short, there are no alternative test systems to whole animals
currently available for reproduction toxicity testing with the aims
set out in the introduction (Note 6).
3. General Recommendations Concerning Treatment
3.1 Dosages
Selection of dosages is one of the most critical issues in
design of the reproductive toxicity study. The choice of the high
dose should be based on data from all available studies
(pharmacology, acute and chronic toxicity and kinetic studies, Note
7). A repeated dose toxicity study of about 2 to 4 weeks duration
provides a close approximation to the duration of treatment in
segmental designs of reproductive studies. When sufficient
information is not available, preliminary studies are advisable (see
Note 4).
Having determined the high dosage, lower dosages should be
selected in a descending sequence, the intervals depending on
kinetic and other toxicity factors. Whilst it is desirable to be
able to determine a ``no observed adverse effect level,'' priority
should be given to setting dosage intervals close enough to reveal
any dosage-related trends that may be present (Note 8).
3.2 Route and Frequency of Administration
In general the route or routes of administration should be
similar to those intended for human usage. One route of substance
administration may be acceptable if it can be shown that a similar
distribution (kinetic profile) results from different routes (Note
9).
The usual frequency of administration is once daily but
consideration should be given to use either more frequent or less
frequent administration taking kinetic variables into account (see
also Note 10).
3.3 Kinetics
It is preferable to have some information on kinetics before
initiating reproduction studies since this may suggest the need to
adjust choice of species, study design, and dosing schedules. At
this time, the information need not be sophisticated nor derived
from pregnant or lactating animals.
At the time of study evaluation, further information on kinetics
in pregnant or lactating animals may be required according to the
results obtained (Note 10).
3.4 Control Groups
It is recommended that control animals be dosed with the vehicle
at the same rate as test group animals. When the vehicle may cause
effects or affect the action of the test substance, a second (sham-
or untreated) control group should be considered.
4. Proposed Study Designs--Combination of Studies
All available pharmacological, kinetic, and toxicological data
for the test compound and similar substances should be considered in
deciding the most appropriate strategy and choice of study design.
It is anticipated that, initially, preference will be given to
designs that do not differ too radically from those of established
guidelines for medicinal products (the most probable option). For
most medicinal products, the three-study design will usually be
adequate. Other strategies, combinations of studies, and study
designs could be as valid as, or more valid than, the ``most probable
option'' according to circumstances. The key factor is that, in
total, they leave no gaps between stages and allow direct or
indirect evaluation of all stages of the reproductive process (Note
11).
Designs should be justified.
4.1 The Most Probable Option
The most probable option can be equated to a combination of
studies for effects on:
Fertility and early embryonic development,
Prenatal and postnatal development, including maternal
function, and
Embryo-fetal development.
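The requirement that a combination of studies leave no gaps between
stages can be checked mechanically. The following is a minimal sketch,
not part of the guideline: the stage letters are those of section 1.2,
the stages assigned to each study are taken from the Aim of sections
4.1.1, 4.1.2, and 4.1.3, and the dictionary layout and the function
name uncovered_stages are hypothetical.

```python
# Illustrative sketch only; names and data layout are assumptions.
STAGES = "ABCDEF"  # stages A to F as listed in section 1.2

# Stages evaluated by each study of the "most probable option",
# as stated in the Aim of sections 4.1.1, 4.1.2 and 4.1.3.
STUDY_STAGES = {
    "4.1.1 fertility and early embryonic development": "AB",
    "4.1.2 prenatal and postnatal development": "CDEF",
    "4.1.3 embryo-fetal development": "CD",
}


def uncovered_stages(studies):
    """Return the stages (A to F) not evaluated by any selected study."""
    covered = set()
    for name in studies:
        covered.update(STUDY_STAGES[name])
    return [stage for stage in STAGES if stage not in covered]


all_three = list(STUDY_STAGES)
print(uncovered_stages(all_three))      # the three-study design
print(uncovered_stages(all_three[:1]))  # the fertility study alone
```

Coverage of every stage is necessary but does not by itself show that
a design is adequate; for example, it says nothing about fetal
examinations or the use of a second species.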
4.1.1 Study of Fertility and Early Embryonic Development to
Implantation
Aim
To test for toxic effects/disturbances resulting from treatment
from before mating (males/females) through mating and implantation.
This comprises evaluation of stages A and B of the reproductive
process (see 1.2). For females this should detect effects on the
oestrous cycle, tubal transport, implantation, and development of
preimplantation stages of the embryo. For males it will permit
detection of functional effects (e.g., on libido, epididymal sperm
maturation) that may not be detected by histological examinations of
the male reproductive organs (Note 12).
Assessment of
Maturation of gametes,
Mating behavior,
Fertility,
Preimplantation stages of the embryo, and
Implantation.
Animals
At least one species, preferably rats.
Number of Animals
The number of animals per sex per group should be sufficient to
allow meaningful interpretation of the data (Note 13).
Administration Period
The design assumes that, especially for effects on
spermatogenesis, use will be made of data from repeated dose
toxicity studies of at least 1-month duration. Provided no effects
have been found that preclude this, a premating treatment interval
of 2 weeks for females and 4 weeks for males can be used (Note 12).
Selection of the length of the premating administration period
should be stated and justified (see also 1.1, pointing out the need
for research). Treatment should continue throughout mating to
termination of males and at least through implantation for females.
This will permit evaluation of functional effects on male fertility
that cannot be detected by histologic examination in repeated dose
toxicity studies and effects on mating behavior in both sexes. If
data from other studies show there are effects on weight or
histologic appearance of reproductive organs in males or females, or
if the quality of examinations is dubious or if there are no data
from other studies, then a more comprehensive study should be
designed (Note 12).
Mating
A mating ratio of 1:1 is advisable and procedures should allow
identification of both parents of a litter (Note 14).
Terminal Sacrifice
Females may be sacrificed at any point after midpregnancy.
Males may be sacrificed at any time after mating but it is
advisable to ensure successful induction of pregnancy before taking
such an irrevocable step (Note 15).
Observations
During study:
Signs and mortalities at least once daily;
Body weight and body weight changes at least twice
weekly (Note 16);
Food intake at least once weekly (except during
mating);
Record vaginal smears daily, at least during the mating
period, to determine whether there are effects on mating or
precoital time; and
Observations that have proved of value in other
toxicity studies.
At terminal examination:
Necropsy (macroscopic examination) of all adults;
Preserve organs with macroscopic findings for possible
histological evaluation; keep corresponding organs of sufficient
controls for comparison;
Preserve testes, epididymides, ovaries and uteri from
all animals for possible histological examination and evaluation on
a case-by-case basis; tissues can be discarded after completion and
reporting of the study;
Sperm count in epididymides or testes, as well as sperm
viability;
Count corpora lutea, implantation sites (Note 16); and
Live and dead conceptuses.
4.1.2 Study for Effects on Prenatal and Postnatal Development,
Including Maternal Function
Aim
To detect adverse effects on the pregnant/lactating female and
on development of the conceptus and the offspring following exposure
of the female from implantation through weaning. Since
manifestations of effect induced during this period may be delayed,
observations should be continued through sexual maturity (i.e.,
stages C to F listed in 1.2) (Notes 17 and 18).
Adverse Effects To Be Assessed
Enhanced toxicity relative to that in nonpregnant
females;
Prenatal and postnatal death of offspring;
Altered growth and development; and
Functional deficits in offspring, including behavior,
maturation (puberty), and reproduction (F1).
Animals
At least one species, preferably rats.
Number of Animals
The number of animals per sex per group should be sufficient to
allow meaningful interpretation of the data (Note 13).
Administration Period
Females are exposed to the test substance from implantation to
the end of lactation (i.e., stages C to E listed in 1.2).
Experimental Procedure
The females are allowed to deliver and rear their offspring to
weaning at which time one male and one female offspring per litter
should be selected (document method used) for rearing to adulthood
and mating to assess reproductive competence (Note 19).
Observations
During study (for maternal animals):
Signs and mortalities at least once daily,
Body weight and body weight change at least twice
weekly (Note 16),
Food intake at least once weekly at least until
delivery,
Observations that have proved of value in other
toxicity studies,
Duration of pregnancy, and
Parturition.
At terminal examination (for maternal animals and where
applicable for offspring):
Necropsy (macroscopic examination) of all adults;
Preservation and possibly histological evaluation of
organs with macroscopic findings; keep corresponding organs of
sufficient controls for comparison;
Implantations (Note 16);
Abnormalities;
Live offspring at birth;
Dead offspring at birth;
Body weight at birth;
Preweaning and postweaning survival and growth/body
weight (Note 20), maturation, and fertility;
Physical development (Note 21);
Sensory functions and reflexes (Note 21); and
Behavior (Note 21).
4.1.3 Study for Effects on Embryo-Fetal Development
Aim
To detect adverse effects on the pregnant female and development
of the embryo and fetus consequent to exposure of the female from
implantation to closure of the hard palate (i.e., stages C to D
listed in 1.2).
Adverse Effects To Be Assessed
Enhanced toxicity relative to that in nonpregnant
females,
Embryofetal death,
Altered growth, and
Structural changes.
Animals
Usually, two species: one rodent, preferably rats; one
nonrodent, preferably rabbits (Note 5). Justification should be
provided when using one species.
Number of Animals
The number of animals should be sufficient to allow meaningful
interpretation of the data (Note 13).
Administration Period
The treatment period extends from implantation to the closure of
the hard palate (i.e., end of C, see 1.2).
Experimental Procedure
Females should be sacrificed and examined about 1 day prior to
parturition. All fetuses should be examined for viability and
abnormalities. To allow subsequent assessment of the relationship
between observations made by different techniques, fetuses should be
individually identified (Note 22).
When using techniques requiring allocation to separate
examination for soft tissue or skeletal changes, it is preferable
that 50 percent of fetuses from each litter be allocated for
skeletal examination. A minimum of 50 percent rat fetuses should be
examined for visceral alterations, regardless of the technique used.
When using fresh microdissection techniques for soft tissue
alterations--which is the strongly preferred method for rabbits--100
percent of rabbit fetuses should be examined for soft tissue and
skeletal abnormalities.
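One way to apply the 50 percent allocation within a litter is
sketched below. This is an illustration under stated assumptions, not
a prescribed procedure: the alternating assignment, the function name
allocate_fetuses, and the fetus identifiers are hypothetical, and in
litters with an odd number of fetuses the extra fetus is assigned to
visceral examination so that the visceral share does not fall below
50 percent.

```python
def allocate_fetuses(fetus_ids):
    """Split one litter's fetuses between visceral and skeletal examination.

    Alternating assignment is an assumption made for illustration; the
    guideline states only the proportions, not the allocation method.
    """
    visceral = fetus_ids[0::2]  # receives the extra fetus in odd-sized litters
    skeletal = fetus_ids[1::2]
    return {"visceral": visceral, "skeletal": skeletal}


# Hypothetical litter of seven individually identified fetuses
litter = ["F1", "F2", "F3", "F4", "F5", "F6", "F7"]
allocation = allocate_fetuses(litter)
print(allocation["visceral"])
print(allocation["skeletal"])
```

Because each fetus keeps its individual identifier, findings from the
two examinations can later be related to one another.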
Observations
During study (for maternal animals):
Signs and mortalities at least once daily,
Body weight and body weight change at least twice
weekly (Note 16),
Food intake at least once weekly, and
Observations that have proved of value in other
toxicity studies.
At terminal examination:
Necropsy (macroscopic examination) of all adults;
Preserve organs with macroscopic findings for possible
histological evaluation; keep corresponding organs of sufficient
controls for comparison;
Count corpora lutea, numbers of live and dead
implantations (Note 16);
Individual fetal body weight;
Fetal abnormalities (Note 22); and
Gross evaluation of placenta.
4.2 Single Study Design (rodents)
If the dosing period of the fertility study and prenatal and
postnatal study are combined into a single investigation, this
comprises evaluation of stages A to F of the reproductive process
(see 1.2). If such a study includes fetal examinations and provides
clearly negative results at sufficiently high exposure, no
further reproduction studies in rodents should be required. Fetal
examinations for structural abnormalities can also be supplemented
with an embryo-fetal development study (or studies) to make a two-
study approach (Notes 3 and 11).
Results from a study for effects on embryo-fetal development in
a second species are expected (see also 4.1.3).
4.3 Two Study Design (rodents)
The simplest two-segment design would consist of the fertility
study and the prenatal and postnatal development study, if it
includes fetal examinations. It can be assumed, however, that if the
prenatal and postnatal development study provided no indication of
prenatal effects at adequate margins above human exposure, the
additional fetal examinations (see 4.1.3) are most unlikely to
provide a major change in the assessment of risk.
Alternatively, female treatment in the fertility study (4.1.1)
could be continued until closure of the hard palate and fetuses
examined according to the procedures of the embryo-fetal development
study (4.1.3). This, combined with the prenatal and postnatal study
(4.1.2), would provide all the examinations required in ``the most
probable option'' but use considerably fewer animals (Notes 3 and
11).
Results from a study for effects on embryo-fetal development in
a second species are expected (see also 4.1.3).
5. Statistics
Analysis of the statistics of a study is the means by which
results are interpreted. The most important part of this analysis is
to establish the relationship between the different variables and
their distribution (descriptive statistics), because these determine
how groups should be compared. The distributions of the endpoints
observed in reproductive tests are usually nonnormal and extend from
almost continuous to the extreme categorical.
When employing inferential statistics (determination of
statistical significance) the mating pair or litter, not the fetus
or neonate, should be used as the basic unit of comparison. The
tests used should be justified (Note 23).
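The use of the litter as the basic unit of comparison can be
illustrated as follows. This is a minimal sketch with invented
records: fetal counts are first reduced to one value per litter, and
group summaries are then based on the number of litters rather than
the number of fetuses. The record layout and the function name
litter_proportions are hypothetical, and the choice of any subsequent
inferential test still has to be justified by the investigator.

```python
from statistics import mean

# Hypothetical records: (group, litter_id, fetuses_examined, fetuses_affected)
records = [
    ("control", "C1", 12, 0),
    ("control", "C2", 10, 1),
    ("high", "H1", 11, 2),
    ("high", "H2", 8, 4),
]


def litter_proportions(rows):
    """Reduce fetal counts to one proportion per litter, keyed by group."""
    by_group = {}
    for group, _litter_id, examined, affected in rows:
        by_group.setdefault(group, []).append(affected / examined)
    return by_group


per_litter = litter_proportions(records)
for group, values in per_litter.items():
    # n counts litters, not fetuses
    print(group, "n =", len(values), "mean litter proportion =", round(mean(values), 3))
```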
6. Data Presentation
The key to good reporting is the tabulation of individual values
in a clear concise manner to account for every animal that was
entered into the study. A reader should be able to follow the
history of any individual animal from initiation to termination and
should be able to deduce with ease the contribution that the
individual has made to any group summary values. Group summary
values should be presented in a form that is biologically plausible
(i.e., avoid false precision) and that reflects the distribution of
the variable. Appendices or tabulations of individual values such as
body weight, food consumption, and litter values should be concise and,
as far as possible, consist of absolute rather than calculated
values; unnecessary duplication should be avoided.
For tabulation of low frequency observations such as clinical
signs, autopsy findings, abnormalities, etc., it is advisable to
group together the (few) individuals with a positive recording.
Especially in the presentation of data on structural changes (fetal
abnormalities) the primary listing (tabulation) should clearly
identify the litters containing abnormal fetuses, identify the
affected fetuses in the litter, and report all the changes observed
in the affected fetus. Secondary listings by type of change can be
derived from this, if necessary.
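The relationship between the primary listing and a secondary listing by type of change can be sketched as follows. The field names, the example findings, and the function secondary_by_change are hypothetical; the sketch only illustrates that the secondary view is derived from the primary one and not recorded separately.

```python
from collections import defaultdict

# Hypothetical primary listing: one row per affected fetus, with all
# changes observed in that fetus kept together.
primary = [
    {"group": "high", "litter": "H3", "fetus": 4, "changes": ["cleft palate", "fused ribs"]},
    {"group": "high", "litter": "H3", "fetus": 7, "changes": ["fused ribs"]},
    {"group": "control", "litter": "C5", "fetus": 2, "changes": ["fused ribs"]},
]

def secondary_by_change(rows):
    """Derive {change: sorted list of (group, litter, fetus)} from the primary listing."""
    out = defaultdict(list)
    for row in rows:
        for change in row["changes"]:
            out[change].append((row["group"], row["litter"], row["fetus"]))
    return {change: sorted(where) for change, where in out.items()}

secondary = secondary_by_change(primary)
```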
7. Terminology
Besides effects on the reproductive competence of adult animals,
toxicity to reproduction includes:
Developmental toxicity: Any adverse effect induced prior to
attainment of adult life. It includes effects induced or manifested
in the embryonic or fetal period and those induced or manifested
postnatally.
Embryotoxicity, fetotoxicity, embryo-fetal toxicity: Any adverse
effect on the conceptus resulting from prenatal exposure, including
structural or functional abnormalities or postnatal manifestations
of such effects. Terms like ``embryotoxicity'' or ``fetotoxicity''
relate to the timepoint/-period of induction of adverse effects,
irrespective of the time of detection.
One-, two-, or three-generation studies: Are defined according
to the number of adult breeding generations directly exposed to the
test material. For example, in a one-generation study there is
direct exposure of the F0 generation and indirect exposure (via the
mother) of the F1 generation, and the study is usually terminated at
the weaning of the F1 generation. In a two-generation study as used
for agro-chemicals and industrial chemicals there is direct exposure
of the F0 generation, indirect and direct exposure of the F1
generation and indirect exposure of the F2 generation. A three-
generation study is defined accordingly.
Body burden: The total internal dosage of an individual arising
from the administration of a substance, comprising parent compound
and metabolites, taking distribution and accumulation into account.
Kinetics: The term ``kinetics'' is used consistently throughout
this guideline, irrespective of whether pharmacokinetics,
toxicokinetics, or both are meant. No better single term was
available.
Notes
Note 1 (1.1) Scientific Flexibility
These guidelines are not mandatory rules; they are a starting
point rather than an endpoint. They provide a basis from which an
investigator can devise a strategy for testing according to
available knowledge of the test material and the state of the art.
For encouragement, some alternative test designs have been mentioned
in this document but there are others that can be sought out or
devised. In devising a strategy, the primary objective should be to
detect and bring to light any indication of toxicity to
reproduction.
Fine details of study design and technical procedures have been
omitted from the text. Such decisions rightly belong in the field of
the investigator since a technique that may be suitable for one
laboratory may not be suitable in another. The investigator needs to
utilize staff and resources to do the best he or she can achieve and
should know how to do this better than any outsider; human
attributes of attitude, ability, and consistency are more important
than material facilities. For necessary compliance with good
laboratory practices (GLP), reference is made to such regulations.
Note 2 (1.2) Timing Conventions
In this guideline the convention for timing of pregnancy is to
refer to the day that a sperm-positive vaginal smear and/or plug is
observed as day 0 of pregnancy even if mating occurs overnight.
Unless shown otherwise, it is assumed that, for rats, mice, and
rabbits, implantation occurs on day 6-7 of pregnancy, and closure of
the hard palate on day 15-18 of pregnancy.
Other conventions are equally acceptable if defined in reports.
Also, the investigator should be consistent in different studies to
ensure that no gaps in treatment occur. It is an advisable
precaution to provide an overlap of at least 1 day in the exposure
period of related studies.
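A minimal sketch of how the day 0 convention and the overlap precaution might be checked is given below. The function names, the example dosing periods, and the choice to count days inclusively are assumptions made for illustration only.

```python
def to_day0(day, sperm_positive_day_number):
    """Convert a pregnancy day to the day 0 convention used in this guideline.

    sperm_positive_day_number is the number a report assigns to the day the
    sperm-positive smear or plug is observed (0 here; some reports use 1).
    """
    return day - sperm_positive_day_number

def overlap_days(period_a, period_b):
    """Number of days shared by two inclusive (start, end) dosing periods."""
    start = max(period_a[0], period_b[0])
    end = min(period_a[1], period_b[1])
    return max(0, end - start + 1)

# Hypothetical dosing periods, both expressed in the day 0 convention.
study_a = (0, 6)    # dosing up to implantation
study_b = (6, 17)   # dosing from implantation onward
shared = overlap_days(study_a, study_b)
```

A result of 0 from overlap_days would indicate a gap or no shared day between the two exposure periods.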
The accuracy of the time of mating should be specified because
this will affect the variability of fetal and neonatal parameters.
Similarly, for reared litters, the day offspring are born will
be considered as postnatal or lactation day 0 unless otherwise
specified. However, particularly with regard to delays in, or
prolongation of, parturition, reference to a postcoital timeframe
may be useful.
Note 3 (1.3) First Pass and Secondary Testing
To a greater or lesser degree, all first pass (guideline) tests
are apical in nature, i.e., an effect on one endpoint may have
several different origins. A reduced litter size at birth may be due
to a reduced ovulation rate (corpora lutea count), higher rate of
preimplantation deaths, higher rate of postimplantation deaths, or
immediate postnatal deaths. In turn, these deaths may be the
consequence of an earlier physical malformation that can no longer
be observed due to subsequent secondary changes and so on.
Particularly for effects with a natural low frequency among
controls, discrimination between treatment-induced and coincidental
occurrence is dependent upon association with other types of
effects.
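The apical nature of litter size can be illustrated with a short sketch that splits the shortfall between ovulation and live birth into two stages. The loss calculations below follow common practice and are not specified by this guideline; the function name and the example counts are hypothetical.

```python
def loss_breakdown(corpora_lutea, implantations, live_at_birth):
    """Split the shortfall from ovulation to live birth into two stages.

    Deaths immediately after birth cannot be separated from
    postimplantation deaths using these three counts alone.
    """
    if not (corpora_lutea >= implantations >= live_at_birth >= 0):
        raise ValueError("expected corpora_lutea >= implantations >= live_at_birth >= 0")
    pre = corpora_lutea - implantations
    post = implantations - live_at_birth
    return {
        "preimplantation_loss": pre,
        "postimplantation_loss": post,
        "preimplantation_loss_fraction": pre / corpora_lutea if corpora_lutea else 0.0,
        "postimplantation_loss_fraction": post / implantations if implantations else 0.0,
    }

# Two hypothetical dams with the same litter size (10) but different origins.
dam_a = loss_breakdown(corpora_lutea=11, implantations=10, live_at_birth=10)
dam_b = loss_breakdown(corpora_lutea=16, implantations=15, live_at_birth=10)
```

Both dams deliver 10 live pups, yet the first reflects a low ovulation rate and the second a loss after implantation.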
A toxicant usually induces more than one type of effect in a
dose-dependent manner. For example, induction of malformation is
almost invariably associated with increased embryonic death and an
increased incidence of less severe structural changes. Given an
effect on one endpoint, secondary investigations for possible
associations should be considered, i.e., the nature, scope, and
origins of the substance's toxicity should be characterized.
Characterization should also include identification of dose-response
relationships to facilitate risk assessment; this is different from
the situation in first pass tests where the presence or absence of a
dose response assists discrimination between treatment-related and
coincidental differences.
Note 4 (1.3) Preliminary Studies
At the time most reproduction studies are planned or initiated
there is usually information available from acute and repeated dose
toxicity studies of at least 1-month duration. This information can
be expected to be sufficient in identifying doses for reproductive
studies. If adequate preliminary studies are performed, they are
part of the justification of the choice of dose for the main study.
In principle, such studies should be submitted regardless of their
GLP status. This may avoid unnecessary use of animals.
Note 5 (2.1) Selection of Species and Strains
In choosing an animal species and strain for reproductive
toxicity testing, care should be taken to select a relevant model.
Selection of the species and strain used in other toxicology studies
may avoid the need for additional preliminary studies. If it can be
shown--by means of kinetic, pharmacological, and toxicological
data--that the species selected is a relevant model for the human, a
single species can be sufficient. There is little value in using a
second species if it does not show the same similarities to humans.
Advantages and disadvantages of species (strains) should be
considered in relation to the substance to be tested, the selected
study design, and in the subsequent interpretation of the results.
All species have their advantages. Rats, and to a lesser extent
mice, are good general purpose models; the rabbit has been somewhat
neglected as a ``nonrodent'' species for repeated dose toxicity and
reproduction studies other than embryotoxicity testing. It has
attributes that would make it a useful model for fertility studies,
especially male fertility. For both rabbits and dogs (which are
often used as a second species for chronic toxicity studies) it is
feasible to obtain semen samples without resorting to painful
techniques (electroejaculation) for longitudinal semen analysis.
Most of the other species are not good, general purpose models and
probably are best used for very specific investigations only.
All species have their disadvantages, for example:
Rats: Sensitivity to sexual hormones, unsuitable for dopamine
agonists due to dependence on prolactin as the primary hormone for
establishment and maintenance of early pregnancy, highly susceptible
to nonsteroidal anti-inflammatory drugs in late pregnancy.
Mice: Fast metabolic rate, stress sensitivity, malformation
clusters (which occur in all species) particularly evident, small
fetus.
Rabbits: Often lack of kinetic and toxicity data, susceptibility
to some antibiotics and to disturbance of the alimentary tract,
clinical signs can be difficult to interpret.
Guinea pigs: Often lack of kinetic and toxicity data,
susceptibility to some antibiotics and to disturbance of the
alimentary tract, long fetal period, insufficient historical
background data.
Domestic and/or mini pigs: Malformation clusters with variable
background rate, large amounts of compound required, large housing
necessary, insufficient historical background data.
Ferrets: Seasonal breeder unless special management systems used
(success highly dependent on human/animal interaction), insufficient
historical background data.
Hamsters: Intravenous route difficult if not impossible, can
hide doses in the cheek pouches and can be very aggressive,
sensitive to intestinal disturbance, overly sensitive teratogenic
response to many chemicals, small fetus.
Dogs: Seasonal breeders, inbreeding factors, insufficient
historical background data.
Nonhuman primates: Kinetically they can differ from humans as
much as other species, insufficient historical background data,
often numbers too low for detection of risk. They are best used when
the objective of the study is to characterize a relatively certain
reproductive toxicant, rather than detect a hazard.
Note 6 (2.2) Uses of Other Test Systems Than Whole Animals
Other test systems have been developed and used in preliminary
investigations (``prescreening'' or priority selection) and
secondary testing.
For preliminary investigation of a range of analogue series of
substances, it is essential that the potential outcome in whole
animals is known for at least one member of the series to be studied
(by inference, effects are expected). With this strategy, substances
can be selected for higher level testing.
For secondary testing or further substance characterization,
other test systems offer the possibility to study some of the
observable developmental processes in detail, e.g., to reveal
specific mechanisms of toxicity, to establish concentration-response
relationships, to select ``sensitive periods,'' or to detect effects
of defined metabolites.
Note 7 (3.1) Selection of Dosages
Using similar doses in the reproductive toxicity studies as in
the repeated dose toxicity studies will allow interpretation of any
potential effects on fertility in context with general systemic
toxicity.
Some minimal toxicity is expected to be induced in the high-dose
dams.
According to the specific compound, factors limiting the high
dosage determined from repeat dose toxicity studies or from
preliminary reproduction studies could include:
Reduction in bodyweight gain;
Increased bodyweight gain, particularly when related to
perturbation of homeostatic mechanisms;
Specific target organ toxicity;
Haematology, clinical chemistry;
Exaggerated pharmacological response, which may or may
not be reflected as marked clinical reactions (e.g., sedation,
convulsions);
The physico-chemical properties of the test substance
or dosage formulation which, allied to the route of administration,
may impose practical limitations in the amount that can be
administered; under most circumstances 1 gram per kilogram per day
(g/kg/day) should be an adequate limit dose;
Kinetics can be useful in determining high-dose
exposure for low toxicity compounds; there is, however, little point
in increasing administered dosage if it does not result in increased
plasma or tissue concentration; and
Marked increase in embryo-fetal lethality in
preliminary studies.
Note 8 (3.1) Determination of Dose-Response Relationships
For many of the variables in reproduction studies the power to
discriminate between random variation and treatment effect is poor
and the presence or absence of a dosage-related trend can be a
critical means of determining the probability of a treatment effect.
It has to be kept in mind that in these studies dose responses may
be steep, and wide intervals between doses would be inadvisable. If
an analysis of dose-response relationships for the effects observed
is attempted in a single study, it is recommended to use at least
three dose levels and appropriate control groups. If in doubt, a
fourth dose group should be added to avoid excessive dosage
intervals. Such a strategy should provide a ``no observed adverse
effect level'' for reproductive aspects. If not, the implication is
that the test substance merits a greater depth of investigation and
further studies.
Note 9 (3.2) Exposure by Different Routes of Administration
If it can be shown that one route provides a greater body
burden, e.g., area under the curve (AUC), there seems little reason
to investigate routes that would provide a lesser body burden or
which present severe practical difficulties (e.g., inhalation).
Before designing new studies for a new route of administration,
existing data on kinetics should be used to determine the necessity
of another study.
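One way to compare the body burden achieved by two routes from existing kinetic data is to compute the AUC for each. The sketch below uses the linear trapezoidal rule, which is a common choice but is not prescribed by this guideline; the sampling times and concentrations are hypothetical.

```python
def auc_trapezoid(times, concentrations):
    """Area under the concentration-time curve by the linear trapezoidal rule."""
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two paired time/concentration points")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        area += dt * (concentrations[i] + concentrations[i - 1]) / 2
    return area

# Hypothetical plasma profiles (hours, arbitrary concentration units).
hours = [0, 1, 2, 4, 8]
oral = [0.0, 4.0, 6.0, 3.0, 1.0]
dermal = [0.0, 1.0, 2.0, 2.0, 1.0]

auc_oral = auc_trapezoid(hours, oral)
auc_dermal = auc_trapezoid(hours, dermal)
```

In this invented example the oral route gives the larger AUC, so under the reasoning above a separate dermal study would offer little additional information.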
Note 10 (3.3) Kinetics in Pregnant Animals
Kinetic investigations in pregnant and lactating animals may
pose some problems due to the rapid changes in physiology. It is
best to consider this as a two- or three-phase approach. In planning
studies, kinetic data (often from nonpregnant animals) provide
information on the general suitability of the species, and can
assist in deciding study designs and choice of dosage. During a
study, kinetic investigations can provide assurance of accurate
dosing or indicate marked deviations from expected patterns.
Note 11 (4) Examples for Choosing Other Options
For compounds causing no lethality at 2 g/kg and no evidence of
repeated dose toxicity at 1 g/kg, conduct of a single two-generation
study with one control and two test groups (0.5 and 1.0 g/kg) would
seem sufficient. However, it might pose the question as to whether
the correct species had been chosen or whether the compound was an
effective medicine.
For compounds that may be given as a single dose, once in a
lifetime (e.g., diagnostics, medicines used in operations), it may
be impossible to administer repeated dosages more than twice the
human therapeutic dosage for any length of time. A reduced period of
treatment allowing a higher dose would seem more appropriate. For
females, considerations of human exposure suggest little or no need
for exposures beyond the embryonic period.
For dopamine agonists or compounds reducing circulating
prolactin levels, female rats are poor models; the rabbit would
probably make a better choice for all the reproductive toxicity
studies, but it does not appear to have been attempted. This also
applies to other types of compound when the rabbit shows a pattern
of metabolism considerably closer to humans than the rat.
For drugs where alterations in plasma kinetics are seen
following repeated administration, the potential for adverse effects
on embryo-fetal development may not be fully evaluated in studies
according to 4.1.3. In such cases it may be desirable to extend the
period of drug administration to females in a 4.1.1 study to day 17.
With sacrifice at term, both fertility and embryo-fetal development
can be assessed.
Note 12 (4.1.1) Premating Treatment
The design of the fertility study, especially the reduction in
the premating period for males, is based on evidence accumulated and
reappraisal of the basic research on the process of spermatogenesis
that originally prompted the demand for a prolonged premating
treatment period. Compounds inducing selective effects on male
reproduction are rare; mating with females is an insensitive means
of detecting effects on spermatogenesis; good pathological and
histopathological examination (e.g., by employing Bouin's fixation,
paraffin embedding, transverse sections of 2 to 4 microns for
testes, longitudinal sections for epididymides, PAS, and
haematoxylin staining) of the male reproductive organs provides a
more sensitive and quicker means of detecting effects on
spermatogenesis; compounds affecting spermatogenesis almost
invariably affect postmeiotic stages; there is no conclusive example
of a male reproductive toxicant the effects of which could be
detected only by dosing males for 9 to 10 weeks and mating them with
females.
Information on potential effects on spermatogenesis can be
derived from repeated dose toxicity studies. This allows the
investigations in the fertility study to be concentrated on other,
more immediate, causes of effect. It is noted that the full sequence
of spermatogenesis (including sperm maturation) in rats lasts 63
days. When the available evidence, or lack of it, suggests that the
scope of investigations in the fertility study should be increased,
or extended from detection to characterization, appropriate studies
should be designed to further characterize the effects.
Note 13 (4.1.1, 4.1.2, 4.1.3) Number of Animals
There is very little scientific basis underlying specified group
sizes in past and existing guidelines or in this one. The numbers
specified are educated guesses governed by the maximum study size
that can be managed without undue loss of overall study control.
This is indicated by the fact that the more expensive the animal is
to obtain or keep, the smaller the group size proposed. Ideally, at
least the same group size should be required for all species and
there is a case for using larger group sizes for less frequently
used species such as primates.
It should also be made clear that the numbers required depend on
whether or not the group is expected to demonstrate an effect. For a
high frequency effect, few animals are required; to presume the
absence of an effect, the number required varies according to the
variable (endpoint) being considered, its prevalence in control
populations (rare or categorical events), or dispersion around the
central tendency (continuous or semicontinuous variables). See also
Note 23.
For all but the rarest events (such as malformations, abortions,
total litter loss), evaluation of between 16 and 20 litters for
rodents and rabbits tends to provide a degree of consistency between
studies. Below 16 litters per evaluation, between-study results
become inconsistent; above 20 to 24 litters per group, consistency
and precision are not greatly enhanced. These numbers relate to
evaluation. If groups are subdivided for different evaluations, the
number of animals starting the study should be doubled. Similarly,
in studies with 2 breeding generations, 16 to 20 litters would be
required for the final evaluation of the litters of the F1
generation. To allow for natural wastage, the starting group size of
the F0 generation must be larger.
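The arithmetic behind a starting group size can be sketched as follows. The expected yield (the fraction of starting females that produce an evaluable litter) is a laboratory-specific assumption and is not given in this guideline; the function name and the example figure of 0.9 are likewise hypothetical.

```python
import math

def starting_group_size(target_litters, expected_yield, evaluations=1):
    """Animals to start per group so that about target_litters reach each evaluation.

    expected_yield is the assumed fraction of starting females that produce
    an evaluable litter. evaluations is the number of subsets the group is
    divided into (2 doubles the starting number).
    """
    if not 0 < expected_yield <= 1:
        raise ValueError("expected_yield must be in (0, 1]")
    if evaluations < 1:
        raise ValueError("evaluations must be at least 1")
    return math.ceil(target_litters / expected_yield) * evaluations

# Hypothetical planning figures: 20 litters wanted, 90 percent yield assumed.
single = starting_group_size(20, 0.9)
split = starting_group_size(20, 0.9, evaluations=2)
```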
Note 14 (4.1.1) Mating
Mating ratios: When both the sexes are being dosed or are of
equal consideration in separate male and female studies, the
preferred mating ratio is 1:1 because this is the safest option in
respect of obtaining good pregnancy rates and avoiding incorrect
analysis and interpretation of results.
Mating period and practices: Most laboratories would use a
mating period of between 2 and 3 weeks; some remove females as soon
as a positive vaginal smear or plug is observed whilst others leave
the pairs together. Most rats will mate within the first 5 days of
cohabitation (i.e., at the first available estrus), but in some
cases females may become pseudopregnant. Leaving the female with the
male for about 20 days allows these females to restart estrus cycles
and become pregnant.
Note 15 (4.1.1) Terminal Sacrifice
Females
When exposure of the females ceases at implantation, termination
of females between days 13 and 15 of pregnancy in general is
adequate to assess effects on fertility or reproductive function,
e.g., to differentiate between implantation and resorption sites.
In general, for detection of adverse effects, it is not thought
necessary, in a fertility study, to sacrifice females at day 20/21
of pregnancy in order to gain information on late embryo loss, fetal
death, and structural abnormalities.
Males
It would be advisable to delay sacrifice of the males until the
outcome of mating is known. In the event of an equivocal result,
males could be mated with untreated females to ascertain their
fertility or infertility. The males treated as part of study 4.1.1
may also be used for evaluation of toxicity to the male reproductive
system if dosing is continued beyond mating and sacrifice delayed.
Note 16 (4.1.1, 4.1.2, 4.1.3) Observations
Daily weighing of pregnant females during treatment can provide
useful information. Weighing an animal more frequently than twice
weekly during periods other than pregnancy (premating, mating,
lactation) may also be advisable for some compounds.
For apparently nonpregnant rats or mice (but not rabbits),
ammonium sulphide staining of the uterus might be useful to identify
peri-implantation death of embryos.
Note 17 (4.1.2) Treatment of Offspring
Consequent to derivation from existing guidelines for medicines,
this guideline does not fully cover exposures from weaning through
puberty, nor does it deal with the possibility of reduced
reproductive life span.
To detect adverse effects for medicinal products that may be
used in infants and juveniles, special studies (case-by-case
designs) involving direct treatment of offspring, at ages to be
specified, should be considered.
Note 18 (4.1.2) Separate Embryotoxicity and Peripostnatal Studies
If a prenatal and postnatal study is separated into two studies,
one covering the embryonic period, the other the fetal period,
parturition, and lactation, postnatal evaluation of offspring is
required in both studies.
Note 19 (4.1.2) F1-Animals
The guideline suggests selection of one male and one female per
litter on the evidence that it is feasible to conduct behavioral and
other functional tests on the same F1 individuals that will be used
for assessment of reproductive function. This has the advantage of
allowing cross referencing of performance in different tests at the
individual level. It is recognized, however, that some laboratories
prefer to select separate sets of animals for behavior testing and
for assessment of reproductive function. Which is the most suitable
for an individual laboratory will depend upon the combination of
tests used and the resources available.
Note 20 (4.1.2) Reduction of Litter Size
The value of culling or not culling for detection of effects on
reproduction is still under discussion. Whether or not culling is
performed, it should be explained by the investigator.
Note 21 (4.1.2) Physical Development, Sensory Functions, Reflexes, and
Behavior
The best indicator of physical development is bodyweight.
Achievement of preweaning landmarks of development such as pinna
unfolding, coat growth, incisor eruption, etc., is highly correlated
with pup bodyweight. This weight is better related to postcoital
time than postnatal time, at least when significant differences in
gestation length occur. Reflexes, surface righting, auditory
startle, air righting, and response to light are also dependent on
physical development.
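The distinction between postnatal and postcoital time can be made concrete with a small calculation. The sketch assumes the day 0 conventions of Note 2 for both pregnancy and postnatal age; the function names and gestation lengths are hypothetical.

```python
def postcoital_day(gestation_length_days, postnatal_day):
    """Age counted from mating, with day 0 conventions for both scales."""
    return gestation_length_days + postnatal_day

def postnatal_day_at(postcoital_target, gestation_length_days):
    """Postnatal day on which a litter reaches a given postcoital age."""
    return postcoital_target - gestation_length_days

# Hypothetical litters: control delivered on day 22, treated on day 23.
control_age = postcoital_day(22, 4)              # control weighed on postnatal day 4
treated_match = postnatal_day_at(control_age, 23)
```

Here the treated litter reaches the same postcoital age one postnatal day earlier than the control litter, so a comparison at equal postnatal days would compare pups of different developmental ages.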
Two postweaning landmarks of development that are advised are
vaginal opening of females and cleavage of the balanopreputial gland
of males. The latter is associated with increasing testosterone
levels whereas testis descent is not. These landmarks indicate the
onset of sexual maturity and it is advised that bodyweight be
recorded at the time of attainment to determine whether any
differences from control are specific or related to general growth.
Functional tests: To date, functional tests have been directed
almost exclusively to behavior. Even though a great deal of effort
has been expended in this direction it is not possible to recommend
specific test methods. Investigators are encouraged to find methods
that will assess sensory functions, motor activity, learning, and
memory.
Note 22 (4.1.3) Individual Identification and Evaluation of Fetuses
It must be possible to relate all findings by different
techniques (i.e., body weight, external inspection, visceral, and/or
skeletal examinations) to a single specimen in order to detect
patterns of abnormalities. The examination of mid- and low-dose
fetuses for visceral and/or skeletal abnormalities may not be
necessary where the evaluation of the high-dose and the control
groups did not reveal any relevant differences. It is advisable,
however, to store the fixed specimens for possible later examination.
If fresh dissection techniques are normally used, difficulties with
later comparisons involving fixed fetuses should be anticipated.
Note 23 (5) Inferential Statistics
``Significance'' tests (inferential statistics) can be used only
as a support for the interpretation of results. The interpretation
itself is to be based on biological plausibility. It is unwise to
assume that a difference from control values is not biologically
relevant simply because it is not ``statistically significant.'' To
a lesser extent it can be unwise to assume that a ``statistically
significant'' difference must be biologically relevant. Particularly
for low frequency events (e.g., embryonic death, malformations) with
one-sided distributions, the statistical power of studies is low.
Confidence intervals for relevant quantities can indicate the likely
size of the effect. When using statistical procedures, experimental
units of comparison should be considered: the litter, not the
individual conceptus; the mating pair, when both sexes are treated;
and the mating pair of the parent generation in a two-generation
study.
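As one illustration of a confidence interval with the litter as the unit, the sketch below computes a Wilson score interval for the proportion of litters containing at least one affected fetus. The Wilson method is chosen here for illustration and is not prescribed by this guideline; the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(affected_litters, total_litters, level=0.95):
    """Wilson score interval for a proportion of litters (the litter is the unit)."""
    if total_litters <= 0 or not 0 <= affected_litters <= total_litters:
        raise ValueError("need 0 <= affected_litters <= total_litters and total_litters > 0")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    p = affected_litters / total_litters
    denom = 1 + z * z / total_litters
    centre = (p + z * z / (2 * total_litters)) / denom
    half = z * sqrt(p * (1 - p) / total_litters + z * z / (4 * total_litters ** 2)) / denom
    return centre - half, centre + half

# Hypothetical result: 2 of 20 litters contain at least one affected fetus.
low, high = wilson_interval(2, 20)
```

With 20 litters the interval runs from roughly 0.03 to 0.30, which is consistent with the low power noted above for low frequency events.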
Dated: September 15, 1994.
William K. Hubbard,
Interim Deputy Commissioner for Policy.
[FR Doc. 94-23379 Filed 9-21-94; 8:45 am]
BILLING CODE 4160-01-F