Excess Mortality in Two-Year Rodent Carcinogenicity Studies
Department of Statistics, Pfizer Global Research and Development, Groton, CT
This paper considers the impacts of various patterns of differential or excess mortality on the biological and statistical interpretation of 2-year rodent carcinogenicity studies. It provides suggestions on experimental design that are intended to maximize the value of such studies for carcinogenic risk assessment. Specifically, it recommends dose reduction, possibly to the level of dose cessation, when biologically feasible and considers the merits of termination of the entire study as alternatives to the commonly employed strategy of terminating particular dose groups. It then recommends statistical analysis modifications that are appropriate when these suggestions on experimental design are adopted. One of the recommended modifications is a new statistical test to determine whether a dose group exceeds the maximum tolerated dose (MTD) on the basis of mortality. While the authors provide recommendations for the most commonly occurring exigencies, they acknowledge the need for and strongly support the practice of active engagement of the appropriate regulatory agency, e.g., the FDA, prior to any action.
The assessment of the human safety of a pharmaceutical often includes the study of carcinogenic risk in 2-year rodent bioassays. The biological premise for this testing is that exposure for long duration at up to maximum tolerated doses in a relatively small number of animals will be informative about the risks of lower doses and shorter exposures in humans. Consequently, the standard designs of these studies employ lifetime exposures at up to maximum tolerated doses using sample sizes of at least 50 animals per dose per sex. In order to maximize statistical power, trend tests are commonly employed in the analysis of tumor incidence; an age adjustment is commonly incorporated in order to avoid bias.
Interpretation of these studies becomes more difficult on both biological and statistical grounds when the treatment groups differ substantially in their mortality rates and/or the mortality rates are extremely high. Biologically, high mortality rates in and of themselves have the obvious effect of limiting full lifetime assessment of the treatment in affected groups. When high mortality rates are coupled with tumor findings, interpretation becomes even more problematic; the challenge, as discussed in the next paragraph, is then to decide whether the tumors are relevant to human risk assessment.
In the most common situation where not all of the increased mortality is attributable to tumors, the dose is, by definition, above the MTD. Dosing at levels above the MTD is known to have the potential to perturb biochemical pathways and can result in tumor formation by nongenotoxic mechanisms. Current thinking considers tumors at such dose levels irrelevant to human risk assessment if and only if they occur at a high multiple of the anticipated human exposure and neither the tumors nor their non-neoplastic precursor lesions are observed at the MTD or below.
Conversely, when all of the increased mortality is due to tumors, the dose cannot be deemed above the MTD based on mortality. Unless there is other evidence that dosing has occurred above the MTD (e.g., a substantial body weight reduction) or there is a defensible argument that the tumors arise by a mechanism not applicable to humans, tumors arising in such circumstances are usually regarded as relevant to human risk assessment. In rare circumstances, such a conclusion can be mitigated (but not eliminated) by a lack of findings at lower doses when those doses result in exposures that are high multiples of the expected human exposure. Thus, determining whether the dose in question exceeds the MTD can be critical to the ultimate assessment of the compound's carcinogenic potential.
From a statistical perspective, any study mortality has a detrimental effect on the power of tests for dose response in tumor incidence rates. However, continuing the study to its scheduled completion will neither affect the validity of these tests nor exacerbate the problem of reduced power unless the sample sizes become both very small and extremely unbalanced. Quantitatively, one reasonable rule of thumb using purely statistical considerations would be to refrain from terminating a group unless either its sample size is less than 10 and some other group's sample size is at least 5 times as large, or its sample size is less than 5 and some other group's sample size is at least 3.5 times as large.* If the study has dual control groups, they should be pooled when determining this ratio. Note that the sample sizes specified in this rule of thumb are smaller than the FDA currently recommends allowing. If the study is continued to completion when this rule so dictates, the information gained in the last part of the study will usually provide at least a small amount of additional power even though relatively few animals remain alive.
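The rule of thumb above can be written out as a short check. This is a minimal Python sketch, not a validated tool: the function and parameter names are hypothetical, and it encodes only the purely statistical criterion, not the biological grounds on which termination is usually contemplated.

```python
def pool_controls(control_group_sizes):
    """Pool dual (or multiple) control groups into one sample size."""
    return sum(control_group_sizes)


def statistically_justifies_termination(group_n, other_group_ns):
    """Return True if the statistical rule of thumb would permit terminating a group.

    group_n        -- animals still alive in the group under consideration
    other_group_ns -- animals still alive in each other group, with any dual
                      control groups already pooled into a single entry
    """
    if not other_group_ns:
        return False
    largest_other = max(other_group_ns)
    # Condition 1: fewer than 10 animals and another group at least 5 times as large.
    small_and_unbalanced = group_n < 10 and largest_other >= 5 * group_n
    # Condition 2: fewer than 5 animals and another group at least 3.5 times as large.
    tiny_and_unbalanced = group_n < 5 and largest_other >= 3.5 * group_n
    return small_and_unbalanced or tiny_and_unbalanced


# Hypothetical survivor counts: 8 high-dose animals, dual controls of 22 and 21.
controls = pool_controls([22, 21])                       # 43 pooled controls
print(statistically_justifies_termination(8, [controls, 30, 35]))
```

With 8 high-dose survivors, the pooled control count of 43 is more than 5 times as large, so the first condition is met; against a single control group of 30 it would not be.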
Thus, early termination of one or more groups in the presence of high mortality is advisable for statistical reasons only under conditions that can never arise under current FDA policy and would occur in practice only rarely even under a more statistically optimal policy. It is usually contemplated exclusively on biological grounds, i.e., for the prevention of lost tissue due to autolysis. Below we consider the statistical ramifications of differential or excess mortality. We propose alternatives to early termination of particular dose groups in certain situations and suggest modifications to the standard statistical analyses when the various design modifications are employed. We discuss 4 situations involving increased mortality occurring in:
1. only the high-dose group;
2. the high dose and other treated groups (but not the controls);
3. a treated group or groups other than the high dose (but not the controls);
4. the controls only, or the controls and one or more treated groups.
The design modifications we consider are two different dose reduction strategies (one of which includes the possibility of dose cessation), treatment group termination with or without histological examination, and study termination. All discussion applies to 2-year bioassays in rats and mice. The understanding throughout is that the male and female data from these studies are analyzed separately, and thus recommendations are specific to the sex(es) with increased mortality issues. In all cases, increased mortality becomes problematic only when the absolute number of animals (not the percent surviving) in one or more groups becomes too small. Thus, the percentage of mortality that can be tolerated in a given study depends on the initial sample sizes of the treatment groups.
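The dependence of tolerable percent mortality on the initial group size is simple arithmetic, sketched below in Python. The minimum number of survivors is an assumption supplied by the user (the value 20 in the example is illustrative only), and the function name is hypothetical.

```python
def tolerable_mortality_fraction(initial_n, min_survivors):
    """Largest fraction of a group that can die while min_survivors remain alive.

    initial_n     -- animals in the group at study start
    min_survivors -- smallest absolute number of animals judged adequate
                     (an assumed threshold, not a value fixed by the text)
    """
    if initial_n <= 0:
        raise ValueError("initial_n must be positive")
    return max(0.0, (initial_n - min_survivors) / initial_n)


# Same assumed threshold of 20 survivors, different starting sample sizes.
for n in (50, 70, 100):
    print(n, round(100 * tolerable_mortality_fraction(n, 20), 1), "% mortality tolerable")
```

For a fixed absolute threshold, a larger starting group tolerates a higher percent mortality, which is why the criterion is better stated in animals than in percentages.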
Increased Mortality in Only the High-Dose Group
In (b) above and throughout the rest of this paper, we use the term "mid-dose" to represent the second highest dose, regardless of how many groups are present in the study.
The Design Modification Recommendations
If reducing the dose as described in the preceding paragraph fails to satisfactorily modulate the high dose mortality, or if dose reduction is not attempted because the study directors believe a priori that it will fail, the recommended design modification depends on how far the study has progressed.
If both of the above dose reduction strategies are either rejected or tried without success, the only remaining design modification choices are early termination of either just the high dose or all groups (in that sex). Early termination of the high dose (or any group) after 12–15 months but prior to the other groups creates problems, sometimes extreme, with the statistical analysis in general and with the trend test in particular. The strongly adverse effect on power is described in some detail in the next paragraph. On the other hand, early termination of all groups forgoes the potential information in the remaining weeks of exposure in the groups below the high dose. If this decision point arises extremely late in the study, so that very little time remains, perhaps the statistical benefits of sacrificing all groups at the same time might outweigh the potential information to be gained from lower doses by terminating only the high dose. Although there are no rigorous criteria to aid this choice, one reasonable rule of thumb might be to terminate all groups during or after Week 100 and to terminate just the high dose before Week 100.
If the decision is made to terminate the high dose group after 12–15 months but earlier than the other groups, either before or after trying one or both of the dose reduction strategies described above, we recommend sacrificing some control animals at the same time the high dose is terminated. The appropriate number of control animals is not clear-cut, but one reasonable rule of thumb is the lesser of 12 or the number of high dose animals remaining just prior to the group's termination. This is necessary to maximize the statistical power of the analysis of high dose tumor incidence. We note, however, that many (perhaps most) tumor types will still have insufficient power despite this action.
For example, suppose that we are dealing with the extreme case of an old-age tumor which rarely appears before Week 95, and suppose that there are 15 animals remaining in the high dose at Week 95. If we sacrifice the 15 high dose animals and 12 controls at that time, none of the animals that died earlier provide any information about this tumor type. For this design modification, our recommended statistical analysis, described below, is that high dose tumor incidence should be assessed via a two-group comparison versus controls. This is equivalent to doing a study with sample sizes of 15 and 12. Obviously, there would be almost no power for detecting an increase in the incidence of this tumor type in the high dose. Tumor types that have a greater frequency of early onset would have greater effective sample sizes and hence greater power; and it is these tumor types whose analyses can be helped somewhat by sacrificing some control animals at the same time that the high dose is terminated, as otherwise the high dose animals sacrificed at the group's termination would contain no statistically usable information for the two-group comparison. But even for these tumor types, the effective sample sizes would still be smaller than usual, and power is still hampered by the necessity for a two-group comparison rather than a trend test. These severe power issues are the main reason why we regard this option (early termination of just one group) as at best a last resort to be used only if none of the other alternatives are at all feasible.
The reduction in the number of control animals going the full two years via the sacrifice of some controls to coincide with the early termination of the high dose will usually not be problematic for the analysis of tumor incidence at the (lower) dose levels that receive the full two years of exposure, especially in cases where two control groups are maintained.
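The Week 95 example above can be explored numerically. The Python sketch below first applies the "lesser of 12 or the number of high dose animals remaining" rule, then computes the exact power of a two-group comparison with 15 high dose animals and 12 controls. Several things here are assumptions made for illustration: an unadjusted one-sided Fisher exact test stands in for the two-group comparison (actual analyses are typically age-adjusted), the tumor rates of 5% in controls and 25% in the high dose are invented, and all function names are hypothetical.

```python
from math import comb


def controls_to_sacrifice(high_dose_remaining, cap=12):
    """Rule of thumb: the lesser of 12 or the number of high dose animals remaining."""
    return min(cap, high_dose_remaining)


def binom_pmf(k, n, p):
    """Probability of exactly k tumor-bearing animals out of n at tumor rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def fisher_one_sided_p(x_treated, n_treated, x_control, n_control):
    """One-sided Fisher exact p-value for an increase in the treated group."""
    total_tumors = x_treated + x_control
    denom = comb(n_treated + n_control, total_tumors)
    upper = min(n_treated, total_tumors)
    numer = sum(
        comb(n_treated, k) * comb(n_control, total_tumors - k)
        for k in range(x_treated, upper + 1)
    )
    return numer / denom


def exact_power(n_treated, n_control, p_treated, p_control, alpha=0.05):
    """Probability that the test rejects, summed over all possible outcomes."""
    power = 0.0
    for xt in range(n_treated + 1):
        for xc in range(n_control + 1):
            if fisher_one_sided_p(xt, n_treated, xc, n_control) <= alpha:
                power += binom_pmf(xt, n_treated, p_treated) * binom_pmf(
                    xc, n_control, p_control
                )
    return power


n_high = 15
n_ctrl = controls_to_sacrifice(n_high)  # 12 controls sacrificed with the high dose
# Assumed (illustrative) tumor rates: 5% in controls, 25% in the high dose.
print("15 vs 12:", exact_power(n_high, n_ctrl, 0.25, 0.05))
print("50 vs 50:", exact_power(50, 50, 0.25, 0.05))
```

Under these assumed rates, no outcome with fewer than 5 tumor-bearing high dose animals can reach significance with 15 versus 12 animals, which caps the power well below what the same rates would give with 50 animals per group.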
While statistical theory mandates treating the two control groups as one in all statistical analyses, there is no harm in sacrificing animals so as to equalize the remaining sample sizes of each of the two groups. In studies with one control group equal in sample size to the treated groups, a bit more care is required in deciding how many control animals to sacrifice early, as there might be an occasional situation where the potential tradeoff between power for assessing tumor incidence at the high dose and power for assessing tumor incidence at lower doses turns out to be of practical importance.
The Analysis Modification Recommendations
To provide some context, in the more common situation where excess high dose mortality is not an issue, the almost universally used procedure is to perform the analysis of high dose tumor incidence using trend testing methodology and to regard this as the primary analysis. In such situations, there is no reason to perform any analysis to determine whether the high dose exceeds the MTD due to mortality, and the analysis of mid dose tumor incidence is typically done as a follow-up trend test only for those tumor types where significantly increased tumor incidence was detected in the high dose group. In situations where high-dose mortality is an issue, we recommend the following modifications to the analysis.
Increased Mortality in the High Dose and Other Treated Groups (but not the Controls)
Increased Mortality in a Treated Group or Groups Other than the High Dose (but not the Controls)
Increased Mortality in the Controls Only or in the Controls and One or More Treated Groups
The FDA Guidelines
We have proposed several recommendations for dealing with differential or excess mortality in 2-year rodent carcinogenicity studies. In general, we discourage early termination of dose groups if other options are biologically feasible. Our motivation is maximization of study utility for carcinogenic risk assessment. The reader will note that we have attempted to address the most common patterns of differential or excess mortality in a systematic fashion. We hope we have provided a useful conceptualization of the problem and some practical suggestions for addressing it. We do, however, acknowledge that no aspect of the conduct of an experiment as complex as the bioassay can be addressed by a simple algorithm. Consequently, we agree fully with the suggestion presented in the FDA Guidelines that sponsors confer with the agency when confronted with a study presenting mortality-related issues.
FDA Guidelines
"However, early termination of a study for mortality, even if unavoidable, may render a study uninformative, leaving too few animals living long enough to represent adequate exposure to the chemical. This is especially important in the evaluation of the design validity of a negative study. In general, a 50 percent survival rate to weeks 80 to 90 of the 50 initial animals in any treatment group is considered adequate. The percentage can be lower or higher if the number of animals used in each treatment/sex group is larger or smaller than 50, but between 20 to 30 animals should be still alive during these weeks (Lin and Ali 1994). Whether a study could be terminated before the scheduled termination date if the survival of any treatment group goes below 50 percent or 20 to 30 surviving animals (provided that sufficient numbers of animals were exposed through week 80 to 90) depends on the situation. For example, there is no reason to stop a study if the survival of only the low-dose group and/or the medium-dose group is altered, because the control vs. high-dose comparison will still be informative. If the survival of the high-dose group falls below 50% or 20–30 surviving animals after week 80, the study should be continued, either stopping dosing of animals in the high dose or terminating only the high dose group, because the comparison of at least the control and low/middle doses would still be informative (the high dose comparison would depend on the situation). A study could be terminated early if the survival of the control group (or groups) goes below 50 percent or 20–30 surviving animals after weeks 80 to 90 as the later comparisons would not be informative. Others have suggested, for example, that an experiment be terminated early when the survival of the control or low-dose group is reduced to 20–25 percent of the original number of animals.
If the mortality is increased only in the high-dose group, consideration can be given to early termination of that group (OFR 1985). Because early study termination poses complex problems, it is strongly recommended that a decision to terminate a study or a study group early be made with input from the Center and the medical division responsible for the review of the associated application. If in discussions with CDER, the Center approves the early termination of a study under this recommendation, the study's sponsor can be assured that the study will be considered by the Center as valid in terms of adequate duration of drug exposure."
* Although this rule of thumb was included only in a fairly late revision at the request of a reviewer, an even later reviewer strongly suggested that we also provide detail to justify it. After considering the length and complexity of a full explanation, we have decided not to provide one. Briefly, the combination of at least one very small sample size and one comparatively large one leads to both power and bias issues for the analysis, regardless of whether the best analysis turns out (see ensuing discussion) to be a trend test or a pairwise comparison against controls.
Toxicologic Pathology, Vol. 35, No. 7, 1040-1043 (2007)