Exercises for the BBS course ‘Advanced group-sequential and adaptive confirmatory clinical trial designs, with R practicals using rpact’ on 13Sep2022

Author

Kaspar Rufibach (Roche), Marc Vandemeulebroecke (Novartis), Gernot Wassmer (rpact), Marcel Wolbers (Roche)

Published

April 16, 2024

Purpose of this document

This R markdown file contains the exercises of the BBS course “Advanced group-sequential and adaptive confirmatory clinical trial designs, with R practicals using rpact”.

All materials related to this course are available on the BBS webpage at this link. Solutions to the exercises will also be available through that webpage after the course.

Getting started

  • Make sure that you have the latest version of rpact installed.
  • Download the raw R markdown exercise file by clicking on the “code” block of the right upper corner here.
  • Start up your local Rstudio installation. Then open the R markdown exercise file.
  • Start answering the questions below by adding text or your own R code to the R markdown file. Use the rpact help for guidance.

\(\Rightarrow\) Good Luck!

Load rpact

# Load rpact
library(rpact)
packageVersion("rpact") # version should be version 3.0 or higher
[1] '3.5.1'

Exercise 1 (Group-sequential survival trial with efficacy and futility interim analyses)

Assume we plan a phase III, randomized, multicenter, open-label, two-arm trial with a time-to-event endpoint of OS. The general assumptions for the sample size assessment are:

  • 2:1 randomization.
  • Uniform recruitment of 480 patients over 10 months (48 per month).
  • The dropout rate is 5% in both arms over 12 months.

The sample size section for OS states that the following additional assumptions were made:

  • Exponentially distributed OS in the control arm with a median of 12 months.
  • Median OS improvement vs. control of 4.9 months (medians 16.9 vs. 12 months, i.e. a HR of approximately 0.71).
  • Log-rank test at a two-sided significance level of 0.05, power 80%.
  • One interim analyses for efficacy (IA) and one final analysis using the O'Brien-Fleming boundaries approximated using the Lan-DeMets method. The first IA will be performed after 60% of information.

Exercise 1a (Sample size calculation)

Calculate the required number of events and timing of analysis for OS using the information fraction of 60%. Use the rpact functions getDesignGroupSequential and getSampleSizeSurvival.

Solution:

# basic parameters
infofrac <- c(0.6, 1)   # information fractions
alpha <- 0.05
beta <- 0.2
accrualTime <- c(0, 10)
accrualIntensity <- 48  # 48 pts over 10 months
randoratio <- 2         # 2:1 randomization
m2 <- 12                # median control
m1 <- 16.9              # median treatment
do <- 0.05              # dropout same in both arms
doTime <- 12            # time at which dropout happens

# provide your solution here

Exercise 1b (Addition of a futility interim analysis)

Now add an interim analysis for futility ONLY (i.e. no stopping for efficacy possible) after 30% of information where we stop the trial if the observed hazard ratio is above 1.

Hint: Use significance levels from design with efficacy only, add futility interim with minimal alpha-spending. The argument userAlphaSpending in getDesignGroupSequential helps.

Solution:

# provide your solution here

Exercise 1c (Power loss associated with the futility interim analysis)

How large is the power loss from adding this futility interim analysis, assuming we would not increase the number of events compared to the initial design above?

To compute the power loss of adding the futility, conservatively assuming it will be adhered to, i.e. we compute the power of the design with futility using the number of events of the design without futility.

Solution:

# provide your solution here

Bonus Exercise 1d (Timing of OS events)

How many OS events would be expected to occur until exactly 16 and 24 months, respectively, from first patient randomized?

Hint: getEventProbabilities.

Solution:

# provide your solution here

Exercise 2 (Adaptive trial with a continuous endpoint)

A confirmatory, randomized and blinded study of an investigational drug against Placebo is planned in mild to moderate Alzheimer’s disease. The primary endpoint is the change from baseline in ADAS-Cog, a neuropsychological test battery measuring cognitive abilities, assessed 6 months after treatment initiation. The ADAS-Cog has a range of 0-70; we reverse its scale so that greater values are good. We consider our primary endpoint as approximately normally distributed, and for simplicity we assume a known standard deviation of 10. We believe that the improvement in the primary endpoint that can be achieved with the investigational drug is at least 4 points better than that under Placebo; and we want to have 80% chance of achieving a significant result if this is indeed the case. However, if the investigational drug is no better than Placebo, we want to have no more than 2.5% chance to claim success. This yields a sample size of approximately n=100 per treatment group for a trial with fixed sample size.

Exercise 2a (the “alpha calculus”)

We want to build in a “sanity check” mid-way through the trial. More precisely, we implement an interim analysis using the inverse normal method, with the following characteristics (all with respect to the primary endpoint):

  • Stop for futility if the investigational drug appears worse than Placebo

  • Stop for efficacy if the investigational drug appears “very significantly better” than Placebo (\(p < 0.0001\))

Which set of \((\alpha,\alpha_0,\alpha_1,\alpha_2)\) satisfies these conditions?

Hint: Use getDesignInverseNormal with a user-defined alpha-spending function and a binding futility boundary.

Solution:

# type solution here

What regulatory issues could this raise?

Solution

Exercise 2b (early stopping and sample size adaptation)

  1. At the interim analysis after \(n_1\) = 50 patients per group, we observe an average ADAS-Cog improvement of 4 points under the investigational drug and of 1 point under Placebo. Should we stop or continue the trial?

Hint: getDataset to define the input dataset and getAnalysisResults to analyse it.

Solution

# type solution here
  1. At the same time, there is a change in strategy, and we now want 90% power at an improvement of 4 points over placebo. Determine the sample size per treatment group for the second stage of the trial, in light of the interim results.

Hint: Calculate second stage sample size using getSampleSizeMeans with type I error equal to the conditional rejection probability from the previous part.

Solution

# type solution here

Exercise 2c (final inference)

In the second stage of the trial, we observe an average ADAS-Cog improvement of only 3 points under the investigational drug and of 1 point under Placebo.

  1. Can we reject the null hypothesis and claim superiority of the investigational drug over placebo?

Solution

# type solution here
  1. Compute the overall (“exact”) p-value and confidence interval for the adaptive trial.

Solution

  1. What would a “naive” z-test have concluded, based on all observations and ignoring the adaptive nature of the trial? What is your interpretation of the situation?

Solution

# type solution here

Exercise 3 (Sample size calculation for testing proportions)

Suppose a trial should be conducted in 3 stages where at the first stage 50%, at the second stage 75%, and at the final stage 100% of the information should be observed. O’Brien-Fleming boundaries should be used with one-sided \(\alpha = 0.025\) and non-binding futility bounds 0 and 0.5 for the first and the second stage, respectively, on the z-value scale.

The endpoints are binary (failure rates) and should be compared in a parallel group design, i.e., the null hypothesis to be tested is \(H_0:\pi_1 - \pi_2 = 0\,,\) which is tested against the alternative \(H_1: \pi_1 - \pi_2 < 0\,.\)

Exercise 3a (sample size calculation)

What is the necessary sample size to achieve 90% power if the failure rates are assumed to be \(\pi_1 = 0.40\) and \(\pi_2 = 0.60\)? What is the optimum allocation ratio?

Solution

# type solution here

Exercise 3b (boundary plots)

Illustrate the decision boundaries on different scales.

Solution

# type solution here

Exercise 3c (power assessment)

Suppose that \(N = 280\) subjects were planned for the study. What is the power if the failure rate in the active treatment group is \(\pi_1 = 0.50\)?

Solution

# type solution here

Exercise 3d (power illustration)

Illustrate power, expected sample size, and early/futility stop for a range of alternative values.

Solution

# type solution here

Exercise 4 (Sample size reassessment for testing proportions)

Using an adaptive design, the sample size from Example 3 in the last interim can be increased up to a 4-fold of the originally planned sample size for the last stage. Conditional power 90% based on the observed effect sizes (failure rates) is used to increase the sample size.

Exercise 4a (assess power)

Use the inverse normal method to allow for the sample size increase and compare the test characteristics with the group sequential design from Example 3.

Solution

# type solution here

Exercise 4b (illustrate power difference)

Illustrate the gain in power when using the adaptive sample size recalculation.

Solution

# type solution here

Exercise 4c (histogram of sample sizes)

Create a histogram for the attained sample size of the study when using the adaptive sample size recalculation. How often will the maximum sample size be achieved?

Solution

# type solution here

Exercise 5 (Multi-armed design with continuous endpoint)

Suppose a trial is conducted with three active treatment arms (+ one control arm). An adaptive design using the equally weighted inverse normal method with two interim analyses using O’Brien & Fleming boundaries is chosen where in both interim analyses a selection of treatment arms is foreseen (overall \(\alpha = 0.025\) one-sided). It is decided to test the intersection tests in the closed system of hypotheses with Dunnett’s test. In the designing stage, it was decided to conduct the study with 20 patients per treatment arm and stage where at interim the sample size can be redefined.

Exercise 5a (First stage and conditional power)

Suppose, at the first stage, the following results were obtained:

arm n mean std
1 19 3.11 1.77
2 22 3.87 1.23
3 23 4.12 1.64
control 21 3.02 1.72

Perform the closed test and assess the conditional power in order to decide which treatment arm(s) should be selected and if the sample size should be redefined.

Solution

# Define input data (in rpact, the *last* group refers to control) and other parameters
dataExample <- getDataset(
  n1      = c(19),
  n2      = c(22),
  n3      = c(23),
  n4      = c(21),
  means1  = c(3.11),
  means2  = c(3.87),
  means3  = c(4.12),
  means4  = c(3.02),
  stDevs1 = c(1.77),
  stDevs2 = c(1.23),
  stDevs3 = c(1.64),
  stDevs4 = c(1.72)
)

alpha <- 0.025
intersectionTest <- "Dunnett"
varianceOption <- "overallPooled"
normalApproximation <- FALSE

# Now define the design via getDesignInverseNormal,
# then analyse the data using getAnalysisResults

# type solution here

Exercise 5b (Second stage)

Suppose it was decided to drop treatment arm 1 for stage 2 and leave the sample size for the remaining arms unchanged. For the second stage, the following results were obtained:

arm n mean std
2 23 3.66 1.11
3 19 3.98 1.21
control 22 2.99 1.82

Perform the closed test and discuss whether or not to stop the study and determine overall \(p\,\)-values and confidence intervals.

Solution

dataExample <- getDataset(
  n1      = c(19, NA),
  n2      = c(22, 23),
  n3      = c(23, 19),
  n4      = c(21, 22),
  means1  = c(3.11, NA),
  means2  = c(3.87, 3.66),
  means3  = c(4.12, 3.98),
  means4  = c(3.02, 2.99),
  stDevs1 = c(1.77, NA),
  stDevs2 = c(1.23, 1.11),
  stDevs3 = c(1.64, 1.21),
  stDevs4 = c(1.72, 1.82)
)

# Now analyse the data using getAnalysisResults

# type solution here

Exercise 5c (Intersection tests)

Would the Bonferroni and the Simes test intersection tests provide the same results?

Solution

# type solution here

Bonus Exercise 6 (Planning of survival design)

A survival trial is planned to be performed with one interim stage and using an O’Brien & Fleming type \(\alpha\)-spending approach at \(\alpha = 0.025\). The interim is planned to be performed after half of the necessary events were observed. It is assumed that the median survival time is 18 months in the treatment group, and 12 months in the control. Assume that the drop-out rate is 5% after 1 year and the drop-out time is exponentially distributed.

Exercise 6a (accrual and follow-up time given)

The patients should be recruited within 12 months assuming uniform accrual. Assume an additional follow-up time of 12 months, i.e., the study should be conducted within 2 years. Calculate the necessary number of events and patients (total and per month) in order to reach power 90% with the assumed median survival times if the survival time is exponentially distributed. Under the postulated assumption, estimate interim and final analysis time.

Solution

# type solution here

Exercise 6b (follow-up time and absolue intensity given)

Assume that 25 patients can be recruited each month and that there is uniform accrual. Estimate the necessary accrual time if the planned follow-up time remains unchanged.

Solution

# type solution here

Exercise 6c (accrual time and max number of patients given)

Assume that accrual stops after 16 months with 25 patients per month, i.e., after 400 patients were recruited. What is the estimated necessary follow-up time?

Solution

# type solution here

Exercise 6d (staggered patient entry)

How do the results change if in the first 3 months 15 patients, in the second 3 months 20 patients, and after 6 months 25 patients per month can be accrued?

Solution

# type solution here

Bonus Exercise 7 (Adaptive survival design)

Exercise 7a (verify results by simulation)

Assume that the study from Example 6 is planned with 257 events and 400 patients under the assumptions that accrual stops after 16 months with 25 patients per month. Verify by simulation the correctness of the results obtained by the analytical formulae.

Solution

# type solution here

Exercise 7b (assess adaptive survival design)

Assume now that a sample size increase up to a ten-fold of the originally planned number of events is foreseen. Conditional power 90% based on the observed hazard ratios is used to increase the number of events. Assess by simulation the magnitude of power increase when using the appropriate method.

Simulate the Type I error rate when using

  • the group sequential method

  • the inverse normal method

Hint: Make sure that enough subjects are used in the simulation (set maxNumberOfSubjects = 3000 and no drop-outs)

Solution

# type solution here

System: rpact 3.5.1, R version 4.3.3 (2024-02-29), platform: aarch64-apple-darwin20

To cite package ‘rpact’ in publications use:

Wassmer G, Pahlke F (2024). rpact: Confirmatory Adaptive Clinical Trial Design and Analysis. R package version 3.5.1, https://CRAN.R-project.org/package=rpact.