The evaluation of endpoint variability and implications for study statistical power and sample size in conscious instrumented dogs.


Introduction
The sensitivity of a test to detect a treatment-induced effect in a variable of interest depends on the variability of that variable in the absence of treatment and on the number of observations made in the study (i.e., the number of animals). To evaluate test sensitivity for detecting drug-induced changes in myocardial contractility using the variable LVdP/dtmax, a HESI-supported consortium designed and conducted studies in chronically instrumented, conscious dogs using telemetry. This paper evaluated the inherent variability of the primary endpoint, LVdP/dtmax, over time within individual animals as well as the variability between animals within a given laboratory. An approach is described for evaluating test system variability, and thereby test sensitivity, which may be used to support the selection of the number of animals for a given study based on the desired test sensitivity.

Methods
A double 4 × 4 Latin square study, in which eight animals each received a vehicle control and three dose levels of a test compound, was conducted at six independent laboratories. LVdP/dtmax was assessed via implanted telemetry systems in Beagle dogs (N = 8) using the same protocol, and each of the six laboratories conducted between two and four studies. Vehicle data from each study were used to evaluate the between-animal and within-animal variability in different time-averaging windows. Simulations were conducted to evaluate statistical power and type I error for LVdP/dtmax based on the estimated variability and assumed treatment effects in hourly intervals, bi-hourly intervals, or drug-specific super intervals.
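
The simulation idea can be illustrated with a minimal sketch. It is not the consortium's analysis: it reduces the crossover to a paired comparison of vehicle against a single dose at one time window and tests it with a paired t-test, whereas the study evaluated time profiles across several doses. The function name `simulate_power` and every numeric input (effect size, between-animal and within-animal standard deviations) are hypothetical placeholders, not estimates from the study.

```python
import math
import random
import statistics

# Two-sided 5% critical values of Student's t for df = n_animals - 1
# (standard table values for df = 3 and df = 7).
T_CRIT = {3: 3.182, 7: 2.365}


def simulate_power(n_animals, effect, sd_between, sd_within, n_sim=2000, seed=0):
    """Monte Carlo rejection rate of a paired t-test in a simplified crossover.

    Assumption: each animal is measured once on vehicle and once on treatment,
    with a random animal offset shared by both measurements.
    """
    rng = random.Random(seed)
    t_crit = T_CRIT[n_animals - 1]
    rejections = 0
    for _ in range(n_sim):
        diffs = []
        for _ in range(n_animals):
            animal = rng.gauss(0.0, sd_between)  # between-animal offset
            vehicle = animal + rng.gauss(0.0, sd_within)
            treated = animal + effect + rng.gauss(0.0, sd_within)
            diffs.append(treated - vehicle)  # the animal offset cancels here
        mean_d = statistics.fmean(diffs)
        se_d = statistics.stdev(diffs) / math.sqrt(n_animals)
        if abs(mean_d / se_d) > t_crit:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    # Hypothetical inputs in mm Hg/s; effect = 0 estimates the type I error.
    for n in (4, 8):
        alpha = simulate_power(n, effect=0.0, sd_between=400.0, sd_within=250.0)
        power = simulate_power(n, effect=450.0, sd_between=400.0, sd_within=250.0)
        print(f"N={n}: type I error ~ {alpha:.3f}, power ~ {power:.3f}")
```

Because each animal serves as its own control, the between-animal offset cancels in the paired difference, so in this simplified setting the power depends on the within-animal variability and the number of animals only.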

Results
We observed that within-animal variability can be reduced by as much as 30% through the use of a larger time-averaging window. Laboratory is a significant source of animal-to-animal variability: between-animal variability is laboratory-dependent and is less affected by the choice of time-averaging window. The statistical power analysis shows that with N = 8, the double Latin square design has over 90% power to detect a minimal time profile with a maximum change of up to 15%, or approximately 450 mm Hg/s, in LVdP/dtmax. With N = 4, the single Latin square design has over 80% power to detect a minimal time profile with a maximum change of up to 20%, or approximately 600 mm Hg/s, in LVdP/dtmax.
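
Two pieces of arithmetic behind these figures can be checked with a short sketch. The helper names are hypothetical. The first function assumes that the values being averaged are equally correlated with correlation `rho`, which is a textbook simplification and not a model fitted in the study. The second assumes that the percentage and absolute changes refer to the same reference level.

```python
import math


def sd_ratio_after_averaging(k, rho=0.0):
    """SD of the mean of k equally correlated values, relative to the SD of one value.

    Uses Var(mean) = sigma^2 * (1 + (k - 1) * rho) / k.
    """
    return math.sqrt((1.0 + (k - 1) * rho) / k)


def implied_baseline(absolute_change, percent_change):
    """Reference level implied by an absolute change and its percentage equivalent."""
    return absolute_change / (percent_change / 100.0)


if __name__ == "__main__":
    # Averaging two independent windows versus two positively correlated ones.
    print(sd_ratio_after_averaging(2, rho=0.0))
    print(sd_ratio_after_averaging(2, rho=0.5))
    # Both reported thresholds point to a reference level near 3000 mm Hg/s.
    print(implied_baseline(450.0, 15.0))
    print(implied_baseline(600.0, 20.0))
```

Under independence, doubling the averaging window scales the standard deviation by 1/√2, a reduction of about 29%; positive correlation between adjacent windows makes the reduction smaller. Whether this accounts for the reduction observed in the study is not stated in the source.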

Discussion
We describe a statistical procedure to quantitatively evaluate acute cardiac effects from studies conducted across six sites and to objectively examine variability and sensitivity, which were difficult or impossible to calculate consistently from previous work. Although this report focuses on the evaluation of LVdP/dtmax, the approach is also appropriate for other variables such as heart rate, arterial blood pressure, or variables derived from the ECG.