Nestlé rpact Training 2024, Lausanne
May 23, 2024
Assume we want to design a two-arm group sequential trial with a time-to-event endpoint, where the summary measure of interest is the hazard ratio \(\omega\).
Suppose we wish to test
\[ H_0: \omega \geq 1 \text{ against } H_1: \omega < 1 \]
We require a one-sided significance level of 2.5% and a power of 80% at a hazard ratio of 0.6.
Question 1
How many events would be required for a fixed sample size design?
Sample size calculation for a survival endpoint
Fixed sample analysis, significance level 2.5% (one-sided). The results were calculated for a two-sample logrank test, H0: hazard ratio = 1, H1: hazard ratio = 0.6, control pi(2) = 0.2, event time = 12, accrual time = 12, accrual intensity = 62.1, follow-up time = 6, power 80%.
| Stage | Fixed |
|---|---|
| Efficacy boundary (z-value scale) | 1.960 |
| Number of subjects | 745.0 |
| Number of events | 120.3 |
| Analysis time | 18.00 |
| Expected study duration | 18.00 |
| One-sided local significance level | 0.0250 |
| Efficacy boundary (t) | 0.700 |
Legend: (t) denotes the treatment effect scale, here the hazard ratio.
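The number of events and the critical hazard ratio in the table can be cross-checked by hand. The following is a minimal sketch, assuming Schoenfeld's approximation for the logrank test with 1:1 allocation; the function names are hypothetical and this is not rpact's own code.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def schoenfeld_events(hazard_ratio, alpha=0.025, power=0.80):
    """Events needed by a one-sided logrank test with 1:1 allocation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    return 4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2


def critical_hazard_ratio(events, alpha=0.025):
    """Largest observed hazard ratio that is still significant (1:1 allocation)."""
    return exp(-2 * NormalDist().inv_cdf(1 - alpha) / sqrt(events))


events = schoenfeld_events(0.6)
print(f"required events: {events:.1f}")
print(f"critical hazard ratio: {critical_hazard_ratio(events):.3f}")
```

Up to rounding, both values agree with the 120.3 events and the efficacy boundary (t) of 0.700 reported above. Note that the number of events depends only on the hazard ratio, the significance level and the power, not on the accrual assumptions.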
Required Number Of Subjects
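The required number of subjects above was derived from rpact's default parameters (control event probability pi(2) = 0.2 at event time 12, accrual time 12, follow-up time 6), none of which we specified ourselves.
For the survival time distributions, we assume exponentially distributed survival times with a median of 1 year on the control arm.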
Question 2
What is the sample size (number of subjects) for a fixed sample size design if the recruitment lasts 3 years and the (additional) follow-up time lasts 2 years, i.e., it is planned to conduct the study in 5 years?
Sample size calculation for a survival endpoint
Fixed sample analysis, significance level 2.5% (one-sided). The results were calculated for a two-sample logrank test, H0: hazard ratio = 1, H1: hazard ratio = 0.6, control median(2) = 1, accrual time = 3, accrual intensity = 48.7, follow-up time = 2, power 80%.
| Stage | Fixed |
|---|---|
| Efficacy boundary (z-value scale) | 1.960 |
| Number of subjects | 146.2 |
| Number of events | 120.3 |
| Analysis time | 5.00 |
| Expected study duration | 5.00 |
| One-sided local significance level | 0.0250 |
| Efficacy boundary (t) | 0.700 |
Legend: (t) denotes the treatment effect scale, here the hazard ratio.
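The number of subjects follows from the required events divided by the average probability of observing an event within the study. A minimal sketch, assuming exponential survival, uniform accrual over 3 years, 1:1 allocation and no dropouts (the helper `event_probability` is hypothetical):

```python
from math import exp, log


def event_probability(hazard, accrual_time, study_time):
    """P(event observed by study_time) for a subject entering uniformly on
    [0, accrual_time], with exponential survival and no dropouts."""
    a, t = accrual_time, study_time
    return 1 - (exp(-hazard * (t - a)) - exp(-hazard * t)) / (hazard * a)


required_events = 120.3            # number of events from the table above
lambda_control = log(2) / 1.0      # control median of 1 year
lambda_treatment = 0.6 * lambda_control

p_control = event_probability(lambda_control, accrual_time=3, study_time=5)
p_treatment = event_probability(lambda_treatment, accrual_time=3, study_time=5)
p_average = (p_control + p_treatment) / 2   # 1:1 allocation

subjects = required_events / p_average
print(f"subjects: {subjects:.1f}, accrual intensity: {subjects / 3:.1f} per year")
```

Under these assumptions the result matches the 146.2 subjects and the accrual intensity of 48.7 per year shown above, to rounding.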
For recruitment, we assume a constant accrual of 50 subjects per year.
Question 3
What is the expected study duration for a fixed sample size design?
Sample size calculation for a survival endpoint
Fixed sample analysis, significance level 2.5% (one-sided). The results were calculated for a two-sample logrank test, H0: hazard ratio = 1, H1: hazard ratio = 0.6, control median(2) = 1, accrual time = 3, accrual intensity = 50, power 80%.
| Stage | Fixed |
|---|---|
| Efficacy boundary (z-value scale) | 1.960 |
| Number of subjects | 150.0 |
| Number of events | 120.3 |
| Analysis time | 4.78 |
| Expected study duration | 4.78 |
| One-sided local significance level | 0.0250 |
| Efficacy boundary (t) | 0.700 |
Legend: (t) denotes the treatment effect scale, here the hazard ratio.
Question 4
Suppose recruitment lasts 3.5 years instead of 3 years. What would be the expected study duration for a fixed sample size design?
Sample size calculation for a survival endpoint
Fixed sample analysis, significance level 2.5% (one-sided). The results were calculated for a two-sample logrank test, H0: hazard ratio = 1, H1: hazard ratio = 0.6, control median(2) = 1, accrual time = 3.5, accrual intensity = 50, power 80%.
| Stage | Fixed |
|---|---|
| Efficacy boundary (z-value scale) | 1.960 |
| Number of subjects | 175.0 |
| Number of events | 120.3 |
| Analysis time | 4.20 |
| Expected study duration | 4.20 |
| One-sided local significance level | 0.0250 |
| Efficacy boundary (t) | 0.700 |
Legend: (t) denotes the treatment effect scale, here the hazard ratio.
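With the accrual rate fixed, the study duration is the calendar time at which the expected number of events reaches the target. The sketch below searches for that time by bisection, assuming exponential survival with a control median of 1 year, uniform accrual of 50 subjects per year, 1:1 allocation and no dropouts; all function names are hypothetical.

```python
from math import exp, log
from statistics import NormalDist


def expected_events(study_time, accrual_time, subjects_per_year, hazards):
    """Expected number of events at study_time (uniform accrual, equal
    allocation across the arms in `hazards`, exponential survival)."""
    recruiting = min(study_time, accrual_time)   # length of accrual so far
    per_arm = subjects_per_year * recruiting / len(hazards)
    total = 0.0
    for hazard in hazards:
        # Survival probability averaged over uniform entry times
        survived = (exp(-hazard * (study_time - recruiting))
                    - exp(-hazard * study_time)) / (hazard * recruiting)
        total += per_arm * (1 - survived)
    return total


def study_duration(target_events, accrual_time, subjects_per_year, hazards):
    """Bisection for the time at which target_events are expected."""
    low, high = 1e-6, 100.0
    for _ in range(80):
        mid = (low + high) / 2
        if expected_events(mid, accrual_time, subjects_per_year, hazards) < target_events:
            low = mid
        else:
            high = mid
    return (low + high) / 2


z = NormalDist().inv_cdf
target = 4 * (z(0.975) + z(0.80)) ** 2 / log(0.6) ** 2   # Schoenfeld events
hazards = [log(2), 0.6 * log(2)]                          # control, treatment

duration_3 = study_duration(target, 3.0, 50, hazards)
duration_35 = study_duration(target, 3.5, 50, hazards)
print(f"accrual 3 years: {duration_3:.2f}, accrual 3.5 years: {duration_35:.2f}")
```

The two durations agree with the analysis times of 4.78 and 4.20 years reported for Questions 3 and 4. Extending recruitment by half a year adds 25 subjects, so the same number of events accumulates earlier.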
We wish to add an interim analysis for efficacy. The interim should happen after approximately half the required number of events. An O’Brien-Fleming type alpha-spending function will be used. No futility analysis is considered (for now).
Question 5
Using the same design assumptions as above with a recruitment period of 3.5 years, perform the sample size calculation.
After how many events is the interim and final analysis to take place?
How long would it take to reach the required number of interim and final events according to the design assumptions?
What is the critical value for the hazard ratio at the interim and final analysis?
Sample size calculation for a survival endpoint
Sequential analysis with a maximum of 2 looks (group sequential design), overall significance level 2.5% (one-sided). The results were calculated for a two-sample logrank test, H0: hazard ratio = 1, H1: hazard ratio = 0.6, control median(2) = 1, accrual time = 3.5, accrual intensity = 50, power 80%.
| Stage | 1 | 2 |
|---|---|---|
| Planned information rate | 50% | 100% |
| Efficacy boundary (z-value scale) | 2.963 | 1.969 |
| Cumulative power | 0.1641 | 0.8000 |
| Number of subjects | 130.3 | 175.0 |
| Expected number of subjects under H1 | 167.7 | |
| Cumulative number of events | 60.4 | 120.8 |
| Analysis time | 2.61 | 4.22 |
| Expected study duration | 3.95 | |
| Cumulative alpha spent | 0.0015 | 0.0250 |
| One-sided local significance level | 0.0015 | 0.0245 |
| Efficacy boundary (t) | 0.466 | 0.699 |
| Exit probability for efficacy (under H0) | 0.0015 | |
| Exit probability for efficacy (under H1) | 0.1641 | |
Legend: (t) denotes the treatment effect scale, here the hazard ratio.
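The boundaries in the table can be traced back to the spending function. As a sketch, assuming the O'Brien-Fleming-type spending function has the usual Lan-DeMets form (the notation is introduced here: \(t\) is the information rate, \(u_k\) the stage-\(k\) efficacy boundary on the z-value scale, \(d_k\) the cumulative number of events at stage \(k\)):
\[ \alpha^*(t) = 2\left(1 - \Phi\left(\frac{z_{1-\alpha/2}}{\sqrt{t}}\right)\right), \qquad \alpha^*(0.5) = 2\left(1 - \Phi\left(\frac{2.241}{\sqrt{0.5}}\right)\right) \approx 0.0015 \]
\[ u_1 = \Phi^{-1}\left(1 - \alpha^*(0.5)\right) \approx 2.963, \qquad P_{H_0}\left(Z_1 < u_1,\ Z_2 \geq u_2\right) = 0.025 - \alpha^*(0.5) \;\Rightarrow\; u_2 \approx 1.969 \]
With 1:1 allocation, a boundary on the z-value scale translates approximately into a critical hazard ratio:
\[ \exp\left(-\frac{2 u_k}{\sqrt{d_k}}\right): \qquad \exp\left(-\frac{2 \cdot 2.963}{\sqrt{60.4}}\right) \approx 0.466, \qquad \exp\left(-\frac{2 \cdot 1.969}{\sqrt{120.8}}\right) \approx 0.699 \]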
What is the expected number of events under the null hypothesis and under the alternative?
What is the expected study duration according to the design assumptions (i.e., under the alternative hypothesis)?
What is the expected number of subjects according to the design assumptions (i.e., under the alternative hypothesis)?
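From the summary above, the expected study duration under the alternative is 3.95 years and the expected number of subjects under the alternative is 167.7.
The expected number of events weights the stage-wise event counts by the stopping probabilities: approximately 0.1641 × 60.4 + (1 − 0.1641) × 120.8 ≈ 110.9 under the alternative, and 0.0015 × 60.4 + (1 − 0.0015) × 120.8 ≈ 120.7 under the null hypothesis.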
Suppose at the time of the interim analysis we wish to add a futility stopping boundary.
A simple rule is considered: if the Z-statistic is above zero (where negative values of the Z-statistic indicate treatment benefit), then the trial is stopped for futility.
The rule is considered non-binding.
Question 6
Create a design object which includes this futility stopping rule.
Use getPowerSurvival() to calculate the overall power when adding the futility boundary under the same design assumptions as above. Assume recruitment lasts for 3.5 years (50 subjects per year) and that the maximum number of events is 121. What is the overall power? What are the expected study duration and the expected number of subjects?
Power calculation for a survival endpoint
Sequential analysis with a maximum of 2 looks (group sequential design), overall significance level 2.5% (one-sided). The results were calculated for a two-sample logrank test, H0: hazard ratio = 1, power directed towards smaller values, H1: hazard ratio = 0.6, control median(2) = 1, maximum number of events = 121, accrual time = 3.5, accrual intensity = 50.
| Stage | 1 | 2 |
|---|---|---|
| Planned information rate | 50% | 100% |
| Efficacy boundary (z-value scale) | 2.963 | 1.969 |
| Futility boundary (z-value scale) | 0 | |
| Cumulative power | 0.1645 | 0.7977 |
| Number of subjects | 130.5 | 175.0 |
| Expected number of subjects under H1 | 166.6 | |
| Expected number of events | 109.6 | |
| Cumulative number of events | 60.5 | 121.0 |
| Analysis time | 2.61 | 4.22 |
| Expected study duration | 3.92 | |
| Cumulative alpha spent | 0.0015 | 0.0250 |
| One-sided local significance level | 0.0015 | 0.0245 |
| Efficacy boundary (t) | 0.467 | 0.699 |
| Futility boundary (t) | 1.000 | |
| Overall exit probability (under H0) | 0.5015 | |
| Overall exit probability (under H1) | 0.1880 | |
| Exit probability for efficacy (under H0) | 0.0015 | |
| Exit probability for efficacy (under H1) | 0.1645 | |
| Exit probability for futility (under H0) | 0.5000 | |
| Exit probability for futility (under H1) | 0.0235 | |
Legend: (t) denotes the treatment effect scale, here the hazard ratio.
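The interim exit probabilities can be approximated directly from the normal distribution of the logrank statistic. A minimal sketch, assuming 1:1 allocation and the usual approximation that the Z statistic has mean −log(HR)·√d/2 after d events, with Z oriented so that positive values indicate benefit (the orientation in which the boundaries are displayed); `interim_exit_probabilities` is a hypothetical helper.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def interim_exit_probabilities(hazard_ratio, interim_events=60.5,
                               efficacy_bound=2.963, futility_bound=0.0):
    """Stage-1 stopping probabilities, Z oriented so positive = benefit."""
    # Approximate mean of the logrank Z statistic under 1:1 allocation
    drift = -log(hazard_ratio) * sqrt(interim_events) / 2
    p_efficacy = 1 - STD_NORMAL.cdf(efficacy_bound - drift)
    p_futility = STD_NORMAL.cdf(futility_bound - drift)
    return p_efficacy, p_futility


for hr in (0.6, 1.0):
    eff, fut = interim_exit_probabilities(hr)
    print(f"HR = {hr}: efficacy {eff:.4f}, futility {fut:.4f}")
```

For a hazard ratio of 0.6 this reproduces the exit probabilities of about 0.1645 (efficacy) and 0.0235 (futility); for a hazard ratio of 1 it gives 0.0015 and 0.5000. The small futility exit probability under the alternative explains why the overall power drops only slightly, from 0.8000 to 0.7977.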
Power calculation for a survival endpoint
Sequential analysis with a maximum of 2 looks (group sequential design), overall significance level 2.5% (one-sided). The results were calculated for a two-sample logrank test, H0: hazard ratio = 1, power directed towards smaller values, H1: hazard ratio = 1, control median(2) = 1, maximum number of events = 121, accrual time = 3.5, accrual intensity = 50.
| Stage | 1 | 2 |
|---|---|---|
| Planned information rate | 50% | 100% |
| Efficacy boundary (z-value scale) | 2.963 | 1.969 |
| Futility boundary (z-value scale) | 0 | |
| Cumulative power | 0.0015 | 0.0247 |
| Number of subjects | 118.7 | 175.0 |
| Expected number of subjects under H1 | 146.8 | |
| Expected number of events | 90.7 | |
| Cumulative number of events | 60.5 | 121.0 |
| Analysis time | 2.37 | 3.78 |
| Expected study duration | 3.08 | |
| Cumulative alpha spent | 0.0015 | 0.0250 |
| One-sided local significance level | 0.0015 | 0.0245 |
| Efficacy boundary (t) | 0.467 | 0.699 |
| Futility boundary (t) | 1.000 | |
| Overall exit probability (under H0) | 0.5015 | |
| Overall exit probability (under H1) | 0.5015 | |
| Exit probability for efficacy (under H0) | 0.0015 | |
| Exit probability for efficacy (under H1) | 0.0015 | |
| Exit probability for futility (under H0) | 0.5000 | |
| Exit probability for futility (under H1) | 0.5000 | |
Legend: (t) denotes the treatment effect scale, here the hazard ratio.
Together:
Power calculation for a survival endpoint
Sequential analysis with a maximum of 2 looks (group sequential design), overall significance level 2.5% (one-sided). The results were calculated for a two-sample logrank test, H0: hazard ratio = 1, power directed towards smaller values, H1: hazard ratio as specified, control median(2) = 1, maximum number of events = 121, accrual time = 3.5, accrual intensity = 50.
| Stage | 1 | 2 |
|---|---|---|
| Planned information rate | 50% | 100% |
| Efficacy boundary (z-value scale) | 2.963 | 1.969 |
| Futility boundary (z-value scale) | 0 | |
| Cumulative power, HR = 0.6 | 0.1645 | 0.7977 |
| Cumulative power, HR = 1 | 0.0015 | 0.0247 |
| Number of subjects, HR = 0.6 | 130.5 | 175.0 |
| Number of subjects, HR = 1 | 118.7 | 175.0 |
| Expected number of subjects under H1, HR = 0.6 | 166.6 | |
| Expected number of subjects under H1, HR = 1 | 146.8 | |
| Expected number of events, HR = 0.6 | 109.6 | |
| Expected number of events, HR = 1 | 90.7 | |
| Cumulative number of events | 60.5 | 121.0 |
| Analysis time, HR = 0.6 | 2.61 | 4.22 |
| Analysis time, HR = 1 | 2.37 | 3.78 |
| Expected study duration, HR = 0.6 | 3.92 | |
| Expected study duration, HR = 1 | 3.08 | |
| Cumulative alpha spent | 0.0015 | 0.0250 |
| One-sided local significance level | 0.0015 | 0.0245 |
| Efficacy boundary (t) | 0.467 | 0.699 |
| Futility boundary (t) | 1.000 | |
| Overall exit probability (under H0) | 0.5015 | |
| Overall exit probability (under H1) | 0.1880 | |
| Exit probability for efficacy (under H0) | 0.0015 | |
| Exit probability for efficacy (under H1), HR = 0.6 | 0.1645 | |
| Exit probability for efficacy (under H1), HR = 1 | 0.0015 | |
| Exit probability for futility (under H0) | 0.5000 | |
| Exit probability for futility (under H1), HR = 0.6 | 0.0235 | |
| Exit probability for futility (under H1), HR = 1 | 0.5000 | |
Legend: (t) denotes the treatment effect scale, here the hazard ratio.
Range of Plots (figures not shown): Boundaries Z Scale; Boundaries Effect Scale; Boundaries p Values Scale; Error Spending; Overall Power and Early Stopping; Number of Events; Overall Power; Overall Early Stopping; Expected Number of Events; Study Duration; Expected Number of Subjects; Analysis Time; Cumulative Distribution Function; Survival Function.
Suppose at the interim analysis, the observed number of events is 67 and the value of the Z-statistic is -1.10 (where negative values correspond to treatment benefit).
Question 7
Re-calculate the stopping boundary based on the observed 67 events at the interim analysis.
What is the interim analysis decision?
Test decision
Continue to the next stage, since the Z statistic is between 0 (futility bound) and -2.795 (efficacy bound)
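The updated efficacy bound can be reproduced by evaluating the spending function at the observed information rate 67/121. A minimal sketch, assuming the Lan-DeMets form of the O'Brien-Fleming-type spending function; the function names are hypothetical, and the observed Z statistic is sign-flipped so that positive values indicate benefit.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def obf_alpha_spent(information_rate, alpha=0.025):
    """One-sided O'Brien-Fleming-type (Lan-DeMets) alpha-spending function."""
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - STD_NORMAL.cdf(z / sqrt(information_rate)))


def interim_decision(z_benefit, efficacy_bound, futility_bound=0.0):
    """Decision for a Z statistic oriented so that positive values mean benefit."""
    if z_benefit >= efficacy_bound:
        return "stop for efficacy"
    if z_benefit <= futility_bound:
        return "stop for futility (non-binding)"
    return "continue"


alpha_spent = obf_alpha_spent(67 / 121)          # observed / planned maximum events
efficacy_bound = STD_NORMAL.inv_cdf(1 - alpha_spent)
decision = interim_decision(-(-1.10), efficacy_bound)   # flip sign of observed Z

print(f"alpha spent: {alpha_spent:.4f}, efficacy bound: {efficacy_bound:.3f}")
print(f"decision: {decision}")
```

Because the interim took place slightly later than planned (67 instead of 60.5 events), more alpha is spent at the interim and the efficacy bound is lower than the planned 2.963.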
Direction of test statistic
NOTE: the function getDesignGroupSequential() doesn’t know which direction of Z statistic indicates treatment benefit. By default, the critical values are displayed assuming positive Z is beneficial.
The direction is specified only when calling functions such as getSampleSize... or getPower....
Suppose at the final analysis, the observed number of events is 129 and the value of the Z-statistic is -2.00 (where negative values correspond to treatment benefit).
Question 8
Re-calculate the stopping boundary based on the observed 67 events at interim and 129 events at the final analysis.
Since we have deviated from the planned number of events, our actual alpha spent no longer follows the O’Brien-Fleming-type alpha-spending function. Use the argument typeOfDesign = "asUser" instead.
What is the final analysis decision?
Test decision
Reject the null hypothesis since Z < -1.976
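The final bound can be checked numerically. The sketch below assumes that the alpha spent at the interim is the O'Brien-Fleming-type value at 67/121, that the remaining alpha is spent at the final analysis, and that the two Z statistics are bivariate normal with correlation √(67/129) under the null hypothesis. It uses plain trapezoidal integration and bisection rather than rpact's own algorithm, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def obf_alpha_spent(information_rate, alpha=0.025):
    """One-sided O'Brien-Fleming-type (Lan-DeMets) alpha-spending function."""
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - STD_NORMAL.cdf(z / sqrt(information_rate)))


def final_bound(first_bound, information_rate, alpha_left, steps=2000):
    """Solve P(Z1 < first_bound, Z2 >= c) = alpha_left for c under H0."""
    rho = sqrt(information_rate)            # correlation of Z1 and Z2
    cond_sd = sqrt(1 - rho ** 2)            # sd of Z2 given Z1

    def crossing_probability(c):
        lower = -8.0                        # effectively minus infinity
        width = (first_bound - lower) / steps
        total = 0.0
        for i in range(steps + 1):          # trapezoidal rule over Z1
            z1 = lower + i * width
            weight = 0.5 if i in (0, steps) else 1.0
            tail = 1 - STD_NORMAL.cdf((c - rho * z1) / cond_sd)
            total += weight * STD_NORMAL.pdf(z1) * tail
        return total * width

    low, high = 0.0, 5.0
    for _ in range(40):                     # crossing probability decreases in c
        mid = (low + high) / 2
        if crossing_probability(mid) > alpha_left:
            low = mid
        else:
            high = mid
    return (low + high) / 2


alpha_interim = obf_alpha_spent(67 / 121)               # spent at the interim
interim_bound = STD_NORMAL.inv_cdf(1 - alpha_interim)
bound = final_bound(interim_bound, 67 / 129, 0.025 - alpha_interim)
decision = "reject H0" if 2.00 >= bound else "do not reject H0"   # Z = -2.00, sign flipped

print(f"final efficacy bound: {bound:.3f}, decision: {decision}")
```

The bound comes out at about 1.976, so the sign-flipped test statistic of 2.00 just exceeds it.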
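Alternatively, use maxInformation together with getAnalysisResults(). Note that the output below was computed for H1: hazard ratio > 1; because negative values of the Z statistic indicate benefit here, its overall p-value of 0.9772 and the test action "accept" do not reflect the test decision stated above.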
Analysis results for a survival endpoint
Sequential analysis with 2 looks (group sequential design). The results were calculated using a two-sample logrank test (one-sided, alpha = 0.025). H0: hazard ratio = 1 against H1: hazard ratio > 1.
| Stage | 1 | 2 |
|---|---|---|
| Planned information rate | 0.519 | 1 |
| Efficacy boundary (z-value scale) | 2.795 | 1.976 |
| Cumulative alpha spent | 0.0026 | 0.0250 |
| Stage level | 0.0026 | 0.0241 |
| Cumulative effect size | 0.764 | 0.703 |
| Overall test statistic | -1.100 | -2.000 |
| Overall p-value | 0.8643 | 0.9772 |
| Test action | continue | accept |
| Conditional rejection probability | <0.0001 | |
| 95% repeated confidence interval | [0.386; 1.513] | [0.496; 0.996] |
| Repeated p-value | >0.5 | |
| Final p-value | 0.9772 | |
| Final confidence interval | [0.498; 0.993] | |
| Median unbiased estimate | 0.703 | |
Let’s go back to the fixed sample size design with 3.5 years of recruitment (Question 4).
Suppose now that the control arm follows a piecewise exponential distribution. For the first year the hazard rate is 1.3, and thereafter the hazard rate is 0.5.
Question 9
What is the expected study duration?
Suppose that, in addition to the changing hazard rate on the control arm, the hazard ratio also changes.
Suppose that the hazard ratio during the first year is 0.6. Thereafter, the hazard ratio is 0.8.
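The piecewise assumptions above can be written down as survival functions for both arms. A minimal sketch with time measured in years; the helper `piecewise_survival` is hypothetical and only illustrates the assumed distributions, it is not part of rpact.

```python
from math import exp


def piecewise_survival(t, hazards, change_points):
    """Survival function for a piecewise constant hazard.

    hazards[i] applies from the previous change point (0 for i = 0) up to
    change_points[i]; the last hazard applies from the last change point on.
    """
    cumulative_hazard = 0.0
    start = 0.0
    for hazard, end in zip(hazards, list(change_points) + [float("inf")]):
        if t <= start:
            break
        cumulative_hazard += hazard * (min(t, end) - start)
        start = end
    return exp(-cumulative_hazard)


change_points = [1.0]                       # hazards change after the first year
control_hazards = [1.3, 0.5]
hazard_ratios = [0.6, 0.8]
# Treatment hazards are the control hazards times the period-specific hazard ratio
treatment_hazards = [h * r for h, r in zip(control_hazards, hazard_ratios)]

for year in (0.5, 1.0, 2.0):
    s_control = piecewise_survival(year, control_hazards, change_points)
    s_treatment = piecewise_survival(year, treatment_hazards, change_points)
    print(f"t = {year}: control {s_control:.3f}, treatment {s_treatment:.3f}")
```

Since the hazard ratio is no longer constant, the proportional hazards assumption behind the analytic sample size formula does not hold, which is why the power is obtained by simulation in the next question.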
Question 10
Use the function getSimulationSurvival() to calculate the power of the fixed sample size design with 3.5 years recruitment (50 subjects per year).
Question 11
Going back to the assumptions in Question 3, what is the expected study duration for a fixed sample size design if we specify in addition that 2% of subjects on each arm will drop out per year?
Sample size calculation for a survival endpoint
Fixed sample analysis, significance level 2.5% (one-sided). The results were calculated for a two-sample logrank test, H0: hazard ratio = 1, H1: hazard ratio = 0.6, control median(2) = 1, accrual time = 3, accrual intensity = 50, dropout rate(1) = 0.02, dropout rate(2) = 0.02, dropout time = 1, power 80%.
| Stage | Fixed |
|---|---|
| Efficacy boundary (z-value scale) | 1.960 |
| Number of subjects | 150.0 |
| Number of events | 120.3 |
| Analysis time | 4.98 |
| Expected study duration | 4.98 |
| One-sided local significance level | 0.0250 |
| Efficacy boundary (t) | 0.700 |
Legend: (t) denotes the treatment effect scale, here the hazard ratio.
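The effect of dropouts on the study duration can be sketched by treating dropout as a competing exponential risk. This assumes that a dropout rate of 2% at dropout time 1 corresponds to a dropout hazard of −log(1 − 0.02) per year on each arm, with uniform accrual, 1:1 allocation and exponential survival; the function names are hypothetical.

```python
from math import exp, log
from statistics import NormalDist


def expected_events(study_time, accrual_time, subjects_per_year, hazards, dropout_hazard):
    """Expected events at study_time with exponential dropout as a competing risk."""
    recruiting = min(study_time, accrual_time)
    per_arm = subjects_per_year * recruiting / len(hazards)
    total = 0.0
    for hazard in hazards:
        exit_hazard = hazard + dropout_hazard    # event or dropout, whichever is first
        still_at_risk = (exp(-exit_hazard * (study_time - recruiting))
                         - exp(-exit_hazard * study_time)) / (exit_hazard * recruiting)
        # A share hazard / exit_hazard of all exits are events
        total += per_arm * hazard / exit_hazard * (1 - still_at_risk)
    return total


def study_duration(target_events, accrual_time, subjects_per_year, hazards, dropout_hazard):
    """Bisection for the time at which target_events are expected."""
    low, high = 1e-6, 100.0
    for _ in range(80):
        mid = (low + high) / 2
        if expected_events(mid, accrual_time, subjects_per_year, hazards, dropout_hazard) < target_events:
            low = mid
        else:
            high = mid
    return (low + high) / 2


z = NormalDist().inv_cdf
target = 4 * (z(0.975) + z(0.80)) ** 2 / log(0.6) ** 2   # Schoenfeld events
hazards = [log(2), 0.6 * log(2)]                          # control median of 1 year
dropout_hazard = -log(1 - 0.02)                           # 2% dropout per year

with_dropout = study_duration(target, 3.0, 50, hazards, dropout_hazard)
without_dropout = study_duration(target, 3.0, 50, hazards, 0.0)
print(f"duration with dropout: {with_dropout:.2f}, without: {without_dropout:.2f}")
```

The required number of events is unchanged, but dropouts remove subjects from the risk set, so the analysis time increases from 4.78 to about 4.98 years.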

