The other vignettes describe two-arm randomised designs. Single-arm
trials – in which every subject receives the experimental therapy and
the comparator is an external benchmark – are common in early-phase
oncology, rare-disease, and proof-of-concept studies. This vignette
shows how to set up a Goldilocks single-arm design with
survival_adapt().
Two practical constraints apply to single-arm designs in this package:
- Single-arm mode is selected by setting hazard_control = NULL.
- Only method = "bayes" is supported for single-arm
trials. The frequentist tests (logrank, cox,
chisq) require two arms and will raise an error if used in
this mode.
In a single-arm trial there is no concurrent control, so the “treatment effect” is replaced by the cumulative event probability on the treatment arm itself:
\(\text{effect} \;=\; p_{\text{treatment}} \;=\; \Pr(\text{event by end\_of\_study} \mid \text{data}).\)
The argument h0 plays the role of a benchmark on this
scale: a target failure probability (or, equivalently, \(1 - h_0\) is a target survival probability)
drawn from external evidence such as a published rate, registry, or
historical cohort. With alternative = "less" and
prob_ha, the trial declares success when
\[\Pr(p_{\text{treatment}} < h_0 \mid \text{data}) \;>\; \texttt{prob\_ha},\]
i.e. when the posterior assigns enough mass to “the experimental
therapy has a lower failure rate than the benchmark”. Choosing
alternative = "greater" reverses the direction;
alternative = "two.sided" is not allowed for
method = "bayes".
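The success criterion above can be illustrated with a conjugate calculation in base R. This is a minimal sketch rather than the package's internal code: it assumes a single constant hazard with a Gamma(shape, rate) prior, so that the posterior after observing d events over a total exposure of E subject-months is Gamma(shape + d, rate + E). The helper name post_prob_less and the event and exposure counts are hypothetical.

```r
# Posterior probability that the cumulative event probability by
# `end_of_study` is below the benchmark h0, under a constant-hazard
# (exponential) model with a conjugate Gamma(shape, rate) prior.
# Hypothetical helper, for illustration only.
post_prob_less <- function(n_events, exposure, h0, end_of_study,
                           prior = c(0.1, 0.1)) {
  # p = 1 - exp(-hazard * end_of_study) is increasing in the hazard, so
  # p < h0 is equivalent to hazard < -log(1 - h0) / end_of_study.
  hazard_h0 <- -log(1 - h0) / end_of_study
  pgamma(hazard_h0,
         shape = prior[1] + n_events,
         rate  = prior[2] + exposure)
}

# Made-up data: 12 events over 1500 subject-months of follow-up
pp <- post_prob_less(n_events = 12, exposure = 1500,
                     h0 = 0.30, end_of_study = 24)
pp
pp > 0.95  # would this dataset meet a prob_ha of 0.95?
```

Because the event probability is a monotone function of the hazard, the benchmark can be moved to the hazard scale once and the posterior probability read directly from the Gamma distribution function, with no sampling needed in this simplified setting.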
The same posterior is used at each interim look to compute the
predictive probability of eventual success, which drives the futility
(Fn) and expected-success (Sn) stopping rules.
Predictive probabilities are obtained by imputing remaining follow-up
from the posterior predictive distribution of the
(piecewise-)exponential model and re-evaluating the success criterion on
each completed dataset.
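The imputation idea can be sketched in base R. The code below is an illustration under simplifying assumptions that are not taken from the package: a single constant hazard, a conjugate Gamma prior, no loss to follow-up, and every subject followed for end_of_study months from their own enrolment. The function pred_prob_success and the interim data are hypothetical.

```r
# Sketch of a predictive probability of success at an interim look.
# time/event: follow-up so far and event indicator for enrolled subjects.
# n_new: number of subjects still to be enrolled.
pred_prob_success <- function(time, event, n_new, h0, end_of_study,
                              prob_ha = 0.95, prior = c(0.1, 0.1),
                              n_impute = 200) {
  hazard_h0 <- -log(1 - h0) / end_of_study  # benchmark on the hazard scale
  censored  <- event == 0

  wins <- replicate(n_impute, {
    # 1. Draw one hazard from the interim posterior
    haz <- rgamma(1, shape = prior[1] + sum(event),
                  rate = prior[2] + sum(time))

    # 2. Impute residual time for subjects still at risk
    #    (the exponential model is memoryless)
    t_full <- time
    t_full[censored] <- time[censored] + rexp(sum(censored), rate = haz)

    # 3. Impute event times for subjects not yet enrolled
    t_full <- c(t_full, rexp(n_new, rate = haz))

    # 4. Apply administrative censoring at end_of_study
    e_final <- as.integer(t_full <= end_of_study)
    t_final <- pmin(t_full, end_of_study)

    # 5. Re-evaluate the success criterion on the completed dataset
    pp <- pgamma(hazard_h0, shape = prior[1] + sum(e_final),
                 rate = prior[2] + sum(t_final))
    pp > prob_ha
  })
  mean(wins)  # proportion of completed datasets that declare success
}

set.seed(1)
# Made-up interim data: 50 subjects with staggered follow-up (months)
follow_up <- runif(50, min = 1, max = 12)
t_event   <- rexp(50, rate = 0.0093)
time      <- pmin(t_event, follow_up)
event     <- as.integer(t_event <= follow_up)

ppp <- pred_prob_success(time, event, n_new = 30,
                         h0 = 0.30, end_of_study = 24)
ppp
```

Setting n_new = 0 projects the outcome if enrolment stopped at the interim look, while n_new > 0 projects the outcome at a larger sample size.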
Suppose the existing standard of care has a 30% event probability by 24 months, and we are testing a new agent that we hope will reduce this to 20%. We use an interim look at 50 of 80 enrolled subjects:
library(goldilocks) # provides survival_adapt(), prop_to_haz(), sim_trials()
end_of_study <- 24
benchmark <- 0.30 # external standard-of-care failure rate
target <- 0.20 # rate we hope the new therapy achieves
# Convert the target failure rate into a constant hazard (so we can simulate)
ht <- prop_to_haz(probs = target, endtime = end_of_study)
ht
#> [1] 0.009297648
Now we run survival_adapt():
out <- survival_adapt(
  hazard_treatment = ht,
  hazard_control = NULL, # single-arm
  cutpoints = 0,
  N_total = 80,
  lambda = 5, # enrolments per month (constant)
  lambda_time = 0,
  interim_look = 50,
  end_of_study = end_of_study,
  prior = c(0.1, 0.1), # Gamma(0.1, 0.1) on the hazard
  block = 2, # default; inert in single-arm mode
  rand_ratio = c(1, 1), # default; inert in single-arm mode
  prop_loss = 0.05,
  alternative = "less",
  h0 = benchmark, # benchmark failure probability
  Fn = 0.05,
  Sn = 0.95,
  prob_ha = 0.95,
  N_impute = 50,
  N_mcmc = 2000,
  method = "bayes")
out
#>   prob_threshold margin alternative N_treatment N_control N_enrolled N_max
#> 1           0.95    0.3        less          80         0         80    80
#>   post_prob_ha est_final ppp_success stop_futility stop_expected_success
#> 1       0.9985 0.1689997        0.86             0                     0
A few points to highlight in the output:
- N_control = 0: no concurrent control was simulated.
- margin = 0.30: this is the value of h0 that the trial is testing against. Note that it is on the cumulative-failure scale, not the survival scale.
- est_final is the posterior mean of \(p_{\text{treatment}}\) at end_of_study, not a treatment effect relative to control.
- post_prob_ha is the posterior probability that \(p_{\text{treatment}} < h_0\).
block and rand_ratio still appear
survival_adapt() shares its trial-data simulator with
the two-arm case. In single-arm mode the simulator skips
randomization() entirely and assigns every subject to the
treatment arm; block and rand_ratio are
therefore inert and can be left at their defaults. The
minimum-interim_look rule
(interim_look >= max(block)) only applies to two-arm
designs, so a single-arm trial can use any interim_look
strictly less than N_total.
A single trial does not tell you whether the design is
well-calibrated. To estimate power and type I error, we run the design
under each scenario using sim_trials(). The chunks below
are not run when knitting (each takes a few minutes) but illustrate the
workflow:
# Power: simulate under the alternative (true rate = 0.20)
out_power <- sim_trials(
  N_trials = 1000,
  hazard_treatment = ht,
  hazard_control = NULL,
  cutpoints = 0,
  N_total = 80,
  lambda = 5,
  lambda_time = 0,
  interim_look = 50,
  end_of_study = end_of_study,
  prior = c(0.1, 0.1),
  block = 2,
  rand_ratio = c(1, 1),
  prop_loss = 0.05,
  alternative = "less",
  h0 = benchmark,
  Fn = 0.05,
  Sn = 0.95,
  prob_ha = 0.95,
  N_impute = 50,
  N_mcmc = 2000,
  method = "bayes")
# Type I error: simulate under the null (true rate = benchmark = 0.30)
ht_null <- prop_to_haz(probs = benchmark, endtime = end_of_study)
out_t1error <- sim_trials(
  N_trials = 1000,
  hazard_treatment = ht_null,
  hazard_control = NULL,
  cutpoints = 0,
  N_total = 80,
  lambda = 5,
  lambda_time = 0,
  interim_look = 50,
  end_of_study = end_of_study,
  prior = c(0.1, 0.1),
  block = 2,
  rand_ratio = c(1, 1),
  prop_loss = 0.05,
  alternative = "less",
  h0 = benchmark,
  Fn = 0.05,
  Sn = 0.95,
  prob_ha = 0.95,
  N_impute = 50,
  N_mcmc = 2000,
  method = "bayes")
summarise_sims(list(out_power$sims, out_t1error$sims))
Calibration proceeds the same way as for two-arm designs: if the type
I error under the null (where the true rate equals the benchmark) is
above the desired level, raise prob_ha; if power is too
low, increase N_total or relax the
Fn/Sn thresholds.
The validity of a single-arm Goldilocks trial rests entirely on the
benchmark h0 being a fair representation of the population
the trial is enrolling. Drift in standard of care, differences in
patient mix, and unmeasured confounding all bias the comparison in a way
that randomisation would otherwise neutralise. A Bayesian framework can
incorporate uncertainty about the benchmark itself – e.g. by replacing a
fixed h0 with a prior distribution informed by historical
data – but this is outside the scope of the simple h0
scalar that survival_adapt() exposes, and would require a
custom analysis. When in doubt, simulating the design under several
plausible values of the true rate (including ones near the benchmark) is
a useful way to characterise its sensitivity.
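A rough version of such a sensitivity check can be sketched in base R. This is deliberately simplified and is not the adaptive design that sim_trials() simulates: it assumes a fixed sample size with no interim looks, no loss to follow-up, full follow-up of end_of_study months for every subject, and a conjugate Gamma posterior on a constant hazard. The function success_rate and the grid of true rates are hypothetical.

```r
# Proportion of simulated fixed-sample trials that declare success
# for a given true event probability by end_of_study.
success_rate <- function(true_rate, n = 80, h0 = 0.30, end_of_study = 24,
                         prob_ha = 0.95, prior = c(0.1, 0.1),
                         n_trials = 2000) {
  haz_true  <- -log(1 - true_rate) / end_of_study  # data-generating hazard
  hazard_h0 <- -log(1 - h0) / end_of_study         # benchmark hazard
  wins <- replicate(n_trials, {
    t_event <- rexp(n, rate = haz_true)
    time    <- pmin(t_event, end_of_study)  # administrative censoring
    event   <- t_event <= end_of_study
    pgamma(hazard_h0, shape = prior[1] + sum(event),
           rate = prior[2] + sum(time)) > prob_ha
  })
  mean(wins)
}

set.seed(2024)
true_rates <- c(0.15, 0.20, 0.25, 0.30)
sens <- data.frame(true_rate  = true_rates,
                   pr_success = sapply(true_rates, success_rate))
sens
```

In this simplified setting, the row where the true rate equals the benchmark approximates the type I error, and the rows below the benchmark show how quickly the chance of declaring success falls as the true rate approaches it.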
hazard_control = NULL and
pass a per-interval hazard_treatment vector).
?survival_adapt documents all arguments.