The goal of expertsurv
is to incorporate expert opinion
into an analysis of time to event data. expertsurv
uses
many of the core functions of the survHE
package (Baio 2020) and also the flexsurv
package (Jackson 2016). Technical details
of the implementation are detailed in (Cooney and
White 2023) and will not be repeated here.
The key function is fit.models.expert
and operates
almost identically to the fit.models
function of
survHE
.
You can install the released version of expertsurv from CRAN with:
install("expertsurv")
If we have elicited expert opinion of the survival probability at certain timepoint(s) and assigned distributions to these beliefs, we encode that information as follows:
#A param_expert object; which is a list of
#length equal to the number of timepoints
param_expert_example1 <- list()
#If we have 1 timepoint and 2 experts
#dist is the names of the distributions
#wi is the weight assigned to each expert (usually 1)
#param1, param2, param3 are the parameters of the distribution
#e.g. for norm, param1 = mean, param2 = sd
#param3 is only used for the t-distribution and is the degress of freedom.
#We allow the following distributions:
#c("normal","t","gamma","lognormal","beta")
param_expert_example1[[1]] <- data.frame(dist = c("norm","t"),
wi = c(0.5,0.5), # Ensure Weights sum to 1
param1 = c(0.1,0.12),
param2 = c(0.005,0.005),
param3 = c(NA,3))
param_expert_example1
#> [[1]]
#> dist wi param1 param2 param3
#> 1 norm 0.5 0.10 0.005 NA
#> 2 t 0.5 0.12 0.005 3
#Naturally we will specify the timepoint for which these probabilities where elicited
timepoint_expert <- 14
#In case we wanted a second timepoint -- Just for illustration
# param_expert_example1[[2]] <- data.frame(dist = c("norm","norm"),
# wi = c(1,1),
# param1 = c(0.05,0.045),
# param2 = c(0.005,0.005),
# param3 = c(NA,NA))
#
# timepoint_expert <- c(timepoint_expert,18)
If we wanted opinions at multiple timepoints we just include append
another list (i.e. param_expert_example1[[2]] with the relevant
parameters) and specify timepoint_expert
as a vector of
length 2 with the second element being the second timepoint.
For details on assigning distributions to elicited probabilities and
quantiles see the SHELF
package (Oakley 2021) and for an overview on
methodological approaches to eliciting expert opinion see (O’Hagan 2019). We can see both the individual
and pooled distributions using the following code (note that we could
have used the output of the fitdist
function from
SHELF
if we actually elicited quantiles from an
expert):
plot_opinion1 <- plot_expert_opinion(param_expert_example1[[1]],
weights = param_expert_example1[[1]]$wi)
ggsave("Vignette_Example 1 - Expert Opinion.png")
For the log pool we have a uni-modal distribution (in contrast to the bi-modal linear pool) which has a \(95\%\) credible interval between \(9.0−11.9\%\) calculated with the function below:
cred_int_val <- cred_int(plot_opinion1,val = "log pool", interval = c(0.025, 0.975))
Expert prior distributions
We load and fit the data as follows (in this example considering just
the Weibull and Gompertz models), with
pool_type = "log pool"
specifying that we want to use the
logarithmic pooling (rather than default “linear pool”). We do this as
we wish to compare the results to the penalized maximum likelihood
estimates in the next section.
data2 <- data %>% rename(status = censored) %>% mutate(time2 = ifelse(time > 10, 10, time),
status2 = ifelse(time> 10, 0, status))
#Set the opinion type to "survival"
example1 <- fit.models.expert(formula=Surv(time2,status2)~1,data=data2,
distr=c("wph", "gomp"),
method="bayes",
iter = 5000,
pool_type = "log pool",
opinion_type = "survival",
times_expert = timepoint_expert,
param_expert = param_expert_example1)
Both visual fit and model fit statistics highlight that the Weibull model is a poor fit to both the expert opinion and data (black line referring to the \(95\%\) confidence region for the experts prior belief).
model.fit.plot(example1, type = "dic")
#N.B. plot.expertsurv (ported directly from survHE) plots the survival function at the posterior mean parameter values
# while it is more robust to use the entire posterior sample (make.surv), however, in this case both results are similar.
plot(example1, add.km = T, t = 0:30)+
theme_light()+
scale_x_continuous(expand = c(0, 0), limits = c(0,NA), breaks=seq(0, 30, 2)) +
scale_y_continuous(expand = c(0, 0), limits = c(0, NA), breaks=seq(0, 1, 0.05))+
geom_segment(aes(x = 14, y = cred_int_val[1], xend = 14, yend = cred_int_val[2]))
Model Comparison
Survival function with Expert prior
We can also fit the model by Penalized Maximum Likelihood approaches
based on code taken from the flexsurv
package (Jackson 2016). All that is required that the
method="bayes"
is changed to method="mle"
with
the iter
argument now redundant. One argument that maybe of
interest is the method_mle
which is the optimization
procedure that flexsurv
uses. In case the optimization
fails, we can sometimes obtain convergence with the
Nelder-Mead
algorithm. If the procedure is still failing,
it may relate to the expert opinion being too informative.
It should be noted that the results will be similar to the Bayesian approach when the expert opinion is unimodal and relatively more informative, therefore we use the logarithmic pool which is unimodal.
We find that the AIC values also favour the Gompertz model by a large factor (not shown) and are very similar to the DIC presented for the Bayesian model.
In this situation we place an opinion on the comparator arm. In this
example id_St
is used to specify the row number in the data
frame of the comparator arm, indicating that the covariate pattern for
this row represents the group for which the expert opinion is
provided.
param_expert_example2[[1]] <- data.frame(dist = c("norm"),
wi = c(1),
param1 = c(0.1),
param2 = c(0.005),
param3 = c(NA))
#Check the coding of the arm variable
#Comparator is 0, which is our id_St
unique(data$arm)
#> [1] 0 1
survHE.data.model <- fit.models.expert(formula=Surv(time2,status2)~as.factor(arm),data=data2,
distr=c("wei"),
method="hmc",
iter = 5000,
opinion_type = "survival",
id_St = min(which(data2$arm ==0)), #We want our opinion to refer to the treatment called "0"
times_expert = timepoint_expert,
param_expert = param_expert_example2)
We can remove the impact of expert opinion omitting the arguments
relating to opinion_type,id_St,times_expert,param_expert
,
\(\dots\) with the function.
Survival function with Expert prior (left) and Vague prior (right)
The survival function for “arm 1” has been shifted downwards slightly, however the covariate for the accelerated time factor has markedly increased to counteract the lower survival probability for the reference (arm 0).
This example illustrates an opinion on the survival difference. For illustration we use the Gompertz, noting that a negative shape parameter will lead to a proportion of subjects living forever. Clearly the mean is not defined in these cases so the code automatically constrains the shape to be positive. It should also be noted that there are many potential combinations of survival curves which could produce a difference in the expected survival of 5 months. If this approach is considered, the number of iterations should be quite large to ensure convergence.
param_expert3 <- list()
#Prior belief of 5 "months" difference in expected survival
param_expert3[[1]] <- data.frame(dist = "norm", wi = 1, param1 = 5, param2 = 0.2, param3 = NA)
survHE.data.model <- fit.models.expert(formula=Surv(time2,status2)~as.factor(arm),data=data2,
distr=c("gom"),
method="hmc",
iter = 5000,
opinion_type = "mean",
id_trt = min(which(data2$arm ==1)), # Survival difference is Mean_surv[id_trt]- Mean_surv[id_comp]
id_comp = min(which(data2$arm ==0)),
param_expert = param_expert3)
Survival difference
As stated in the introduction this package relies on many of the core
functions of the survHE, flexsurv
packages Jackson (2016). Because future versions of
survHE
and flexsurv
may introduce conflicts
with the current implementation, we have directly ported the key
functions from these packages into the package so that
expertsurv
no longer imports survHE,flexsurv
(of course all credit for those functions goes to Jackson (2016)).
If you run in issues, bugs or just features which you feel would be useful, please let me know (phcooney@tcd.ie) and I will investigate and update as required.
As this is a Bayesian analysis convergence diagnostics should be performed. Poor convergence can be observed for many reasons, however, because of our use of expert opinion it may be a symptom of conflict between the observed data and the expert’s opinion.
Default priors should work in most situations, but still need to be considered. At a minimum the Bayesian results without expert opinion should be compared against the maximum likelihood estimates. If considerable differences are present the prior distributions should be investigated.
Because the analysis is done in JAGS and Stan we can leverage the
ggmcmc
package (ggmcmc.2016?):
#For Stan Models # Log-Normal, RP, Exponential, Weibull
ggmcmc(ggs(example1$models$`Exponential`), file = "Exponential.pdf")
#For JAGS Models # Gamma, Gompertz, Generalized Gamma
ggmcmc(ggs(as.mcmc(example1$models$`Gamma`)), file = "Gamma.pdf")