Case studies

Gautier Paux and Alex Dmitrienko

2019-05-08

Introduction

Several case studies have been created to facilitate the implementation of simulation-based Clinical Scenario Evaluation (CSE) approaches in multiple settings and help the user understand individual features of the Mediana package. Case studies are arranged in terms of increasing complexity of the underlying clinical trial setting (i.e., trial design and analysis methodology). For example, Case study 1 deals with a number of basic settings and increasingly more complex settings are considered in the subsequent case studies.

Case study 1

This case study serves as a good starting point for users who are new to the Mediana package. It focuses on clinical trials with simple designs and analysis strategies where power and sample size calculations can be performed using analytical methods.

  1. Trial with two treatment arms and single endpoint (normally distributed endpoint).
  2. Trial with two treatment arms and single endpoint (binary endpoint).
  3. Trial with two treatment arms and single endpoint (survival-type endpoint).
  4. Trial with two treatment arms and single endpoint (survival-type endpoint with censoring).
  5. Trial with two treatment arms and single endpoint (count-type endpoint).

Case study 2

This case study is based on a clinical trial with three or more treatment arms. A multiplicity adjustment is required in this setting and no analytical methods are available to support power calculations.

This example also illustrates a key feature of the Mediana package, namely, the option to define custom functions. For example, it shows how the user can define a new criterion in the Evaluation Model.

Clinical trial in patients with schizophrenia

Case study 3

This case study introduces a clinical trial with several patient populations (marker-positive and marker-negative patients). It demonstrates how the user can define independent samples in a data model and then specify statistical tests in an analysis model based on merging several samples, i.e., merging samples of marker-positive and marker-negative patients to carry out a test that evaluates the treatment effect in the overall population.

Clinical trial in patients with asthma

Case study 4

This case study illustrates CSE simulations in a clinical trial with several endpoints and helps showcase the package’s ability to model multivariate outcomes in clinical trials.

Clinical trial in patients with metastatic colorectal cancer

Case study 5

This case study is based on a clinical trial with several endpoints and multiple treatment arms and illustrates the process of performing complex multiplicity adjustments in trials with several clinical objectives.

Clinical trial in patients with rheumatoid arthritis

Case study 6

This case study is an extension of Case study 2 and illustrates how the package can be used to assess the performance of several multiplicity adjustments. The case study also walks the reader through the process of defining customized simulation reports.

Clinical trial in patients with schizophrenia

Case study 1

Case study 1 deals with a simple setting, namely, a clinical trial with two treatment arms (experimental treatment versus placebo) and a single endpoint. Power calculations can be performed analytically in this setting. Specifically, closed-form expressions for the power function can be derived using the central limit theorem or other approximations.

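Several distributions are illustrated in this case study: a normally distributed endpoint, a binary endpoint, a survival-type endpoint (with and without censoring) and a count-type endpoint.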

Normally distributed endpoint

Suppose that a sponsor is designing a Phase III clinical trial in patients with pulmonary arterial hypertension (PAH). The efficacy of experimental treatments for PAH is commonly evaluated using a six-minute walk test and the primary endpoint is defined as the change from baseline to the end of the 16-week treatment period in the six-minute walk distance.

Define a Data Model

The first step is to initialize the data model:

After the initialization, components of the data model can be added to the DataModel object incrementally using the + operator.

The change from baseline in the six-minute walk distance is assumed to follow a normal distribution. The distribution of the primary endpoint is defined in the OutcomeDist object:

The sponsor would like to perform power evaluation over a broad range of sample sizes in each treatment arm:

As a side note, the seq function can be used to compactly define sample sizes in a data model:

The sponsor is interested in performing power calculations under two treatment effect scenarios (standard and optimistic scenarios). Under these scenarios, the experimental treatment is expected to improve the six-minute walk distance by 40 or 50 meters compared to placebo, respectively, with the common standard deviation of 70 meters.

Therefore, the mean change in the placebo arm is set to μ = 0 and the mean changes in the six-minute walk distance in the experimental arm are set to μ = 40 (standard scenario) or μ = 50 (optimistic scenario). The common standard deviation is σ = 70.

Note that the mean and standard deviation are explicitly identified in each list. This is done mainly for the user’s convenience.
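Because power is available in closed form in this setting, the simulation results can be cross-checked analytically. The following is a minimal sketch using base R's power.t.test function. The effect sizes and standard deviation are taken from the scenarios above; the sample sizes per arm (50 to 70 in steps of 5, written with seq) and the one-sided significance level of 0.025 are illustrative assumptions.

```r
# Analytical power of a one-sided two-sample t-test (base R, stats package).
# Sample sizes per arm and the 0.025 level are assumptions for illustration.
n.per.arm = seq(50, 70, by = 5)
effects = c(standard = 40, optimistic = 50)
common.sd = 70

power.table = sapply(effects, function(delta) {
  sapply(n.per.arm, function(n) {
    power.t.test(n = n, delta = delta, sd = common.sd,
                 sig.level = 0.025, type = "two.sample",
                 alternative = "one.sided")$power
  })
})
rownames(power.table) = paste0("n = ", n.per.arm)
print(round(power.table, 3))
```

Each column corresponds to one treatment effect scenario and each row to one sample size per arm, which mirrors the grid of scenarios a data model would define.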

After having defined the outcome parameters for each sample, two Sample objects that define the two treatment arms in this trial can be created and added to the DataModel object:

Define an Analysis Model

Just like the data model, the analysis model needs to be initialized as follows:

Only one significance test is planned to be carried out in the PAH clinical trial (treatment versus placebo). The treatment effect will be assessed using the one-sided two-sample t-test:

According to the specifications, the two-sample t-test will be applied to Sample 1 (Placebo) and Sample 2 (Treatment). These sample IDs come from the data model defined earlier. As explained in the Analysis Model section of the manual, the sample order is determined by the expected direction of the treatment effect. In this case, an increase in the six-minute walk distance indicates a beneficial effect and a numerically larger value of the primary endpoint is expected in Sample 2 (Treatment) compared to Sample 1 (Placebo). This implies that the list of samples to be passed to the t-test should include Sample 1 followed by Sample 2. Note that, from version 1.0.6, it is possible to specify an option to indicate whether a larger numeric value is expected in Sample 2 (larger = TRUE) or in Sample 1 (larger = FALSE). By default, this argument is set to TRUE.

To illustrate the use of the Statistic object, the mean change in the six-minute walk distance in the treatment arm can be computed using the MeanStat statistic:

Define an Evaluation Model

The data and analysis models specified above collectively define the Clinical Scenarios to be examined in the PAH clinical trial. The scenarios are evaluated using success criteria or metrics that are aligned with the clinical objectives of the trial. In this case it is most appropriate to use regular power or, more formally, marginal power. This success criterion is specified in the evaluation model.

First of all, the evaluation model must be initialized:

Secondly, the success criterion of interest (marginal power) is defined using the Criterion object:

The tests argument lists the IDs of the tests (defined in the analysis model) to which the criterion is applied (note that more than one test can be specified). The test IDs link the evaluation model with the corresponding analysis model. In this particular case, marginal power will be computed for the t-test that compares the mean change in the six-minute walk distance in the placebo and treatment arms (Placebo vs treatment).

In order to compute the average value of the mean statistic specified in the analysis model (i.e., the mean change in the six-minute walk distance in the treatment arm) over the simulation runs, another Criterion object needs to be added:

The statistics argument of this Criterion object lists the ID of the statistic (defined in the analysis model) to which this metric is applied (e.g., Mean Treatment).

Perform Clinical Scenario Evaluation

After the clinical scenarios (data and analysis models) and evaluation model have been defined, the user is ready to evaluate the success criteria specified in the evaluation model by calling the CSE function.

To accomplish this, the simulation parameters need to be defined in a SimParameters object:

The function call for CSE specifies the individual components of Clinical Scenario Evaluation in this case study as well as the simulation parameters:
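Putting the pieces of this case study together, an end-to-end specification might look like the sketch below. It assumes the Mediana package is installed. The object and ID names ("Placebo vs treatment", "Mean Treatment", case.study1.results) follow the text; the argument names, the sample sizes, the alpha level, the number of simulations, the processor load and the seed are assumptions based on the package's documented interface and should be checked against the installed version.

```r
library(Mediana)

# Outcome parameters: placebo mean 0, treatment mean 40 (standard) or
# 50 (optimistic), common SD 70
outcome1.placebo = parameters(mean = 0, sd = 70)
outcome1.treatment = parameters(mean = 40, sd = 70)
outcome2.placebo = parameters(mean = 0, sd = 70)
outcome2.treatment = parameters(mean = 50, sd = 70)

# Data model (sample sizes per arm are illustrative assumptions)
case.study1.data.model = DataModel() +
  OutcomeDist(outcome.dist = "NormalDist") +
  SampleSize(seq(50, 70, 5)) +
  Sample(id = "Placebo",
         outcome.par = parameters(outcome1.placebo, outcome2.placebo)) +
  Sample(id = "Treatment",
         outcome.par = parameters(outcome1.treatment, outcome2.treatment))

# Analysis model: one t-test (Sample 1 first) and one descriptive statistic
case.study1.analysis.model = AnalysisModel() +
  Test(id = "Placebo vs treatment",
       samples = samples("Placebo", "Treatment"),
       method = "TTest") +
  Statistic(id = "Mean Treatment",
            method = "MeanStat",
            samples = samples("Treatment"))

# Evaluation model: marginal power and the average of the mean statistic
case.study1.evaluation.model = EvaluationModel() +
  Criterion(id = "Marginal power",
            method = "MarginalPower",
            tests = tests("Placebo vs treatment"),
            labels = c("Placebo vs treatment"),
            par = parameters(alpha = 0.025)) +
  Criterion(id = "Average Mean",
            method = "MeanSumm",
            statistics = statistics("Mean Treatment"),
            labels = c("Average Mean Treatment"))

# Simulation parameters (all three values are assumptions)
case.study1.sim.parameters = SimParameters(n.sims = 1000,
                                           proc.load = "full",
                                           seed = 42938001)

# Perform Clinical Scenario Evaluation
case.study1.results = CSE(case.study1.data.model,
                          case.study1.analysis.model,
                          case.study1.evaluation.model,
                          case.study1.sim.parameters)
```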

The simulation results are saved in a CSE object (case.study1.results). This object contains complete information about this particular evaluation, including the data, analysis and evaluation models specified by the user. The most important component of this object is the data frame contained in the list named simulation.results (case.study1.results$simulation.results). This data frame includes the values of the success criteria and metrics defined in the evaluation model.

Summarize the Simulation Results

Summary of simulation results in R console

To facilitate the review of the simulation results produced by the CSE function, the user can invoke the summary function. This function displays the data frame containing the simulation results in the R console:

If the user is interested in generating graphical summaries of the simulation results (using the ggplot2 package or other packages), this data frame can also be saved to an object:

Generate a Simulation Report

Presentation Model

A very useful feature of the Mediana package is the generation of a Microsoft Word-based report that summarizes the results of Clinical Scenario Evaluation.

To generate a simulation report, the user needs to define a presentation model by creating a PresentationModel object. This object must be initialized as follows:

Project information can be added to the presentation model using the Project object:

The user can easily customize the simulation report by defining report sections and specifying properties of summary tables in the report. The code shown below creates a separate section within the report for each set of outcome parameters (using the Section object) and sets the sorting option for the summary tables (using the Table object). The tables will be sorted by the sample size. Further, in order to define descriptive labels for the outcome parameter scenarios and sample size scenarios, the CustomLabel object needs to be used:

Report generation

Once the presentation model has been defined, the simulation report is ready to be generated using the GenerateReport function:

Download

The R code and the report that summarizes the results of Clinical Scenario Evaluation for this case study can be downloaded from the Mediana website.

Binary endpoint

Consider a Phase III clinical trial for the treatment of rheumatoid arthritis (RA). The primary endpoint is the response rate based on the American College of Rheumatology (ACR) definition of improvement. The trial’s sponsor is interested in performing power calculations using several treatment effect assumptions (Placebo 30% - Treatment 50%, Placebo 30% - Treatment 55% and Placebo 30% - Treatment 60%).
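As with the normally distributed endpoint, an analytical cross-check is available for the three response rate scenarios. The sketch below uses base R's power.prop.test; the sample size of 100 patients per arm and the one-sided level of 0.025 are hypothetical values chosen only for illustration.

```r
# Analytical power for comparing two proportions (base R, stats package).
placebo.rate = 0.30
treatment.rates = c(0.50, 0.55, 0.60)
n.per.arm = 100      # hypothetical sample size per arm
alpha = 0.025        # one-sided level (assumption)

power.by.scenario = sapply(treatment.rates, function(p2) {
  power.prop.test(n = n.per.arm, p1 = placebo.rate, p2 = p2,
                  sig.level = alpha, alternative = "one.sided")$power
})
names(power.by.scenario) = paste0("30% vs ", 100 * treatment.rates, "%")
print(round(power.by.scenario, 3))
```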

Define an Analysis Model

The analysis model uses a standard two-sample test for comparing proportions (method = "PropTest") to assess the treatment effect in this clinical trial example:

Define an Evaluation Model

Power evaluations are easily performed in this clinical trial example using the same evaluation model utilized in the case of a normally distributed endpoint, i.e., evaluations rely on marginal power:

An extension of this clinical trial example is provided in Case study 5. The extension deals with a more complex setting involving several trial endpoints and multiple treatment arms.

Download

The R code and the report that summarizes the results of Clinical Scenario Evaluation for this case study can be downloaded from the Mediana website.

Survival-type endpoint

If the trial’s primary objective is formulated in terms of analyzing the time to a clinically important event (progression or death in an oncology setting), data and analysis models can be set up based on an exponential distribution and the log-rank test.

As an illustration, consider a Phase III trial which will be conducted to evaluate the efficacy of a new treatment for metastatic colorectal cancer (MCC). Patients will be randomized in a 2:1 ratio to an experimental treatment or placebo (in addition to best supportive care).

The trial’s primary objective is to assess the effect of the experimental treatment on progression-free survival (PFS).

Define a Data Model

A single treatment effect scenario is considered in this clinical trial example. Specifically, the median time to progression is assumed to be:

  • Placebo : t0 = 6 months,

  • Treatment: t1 = 9 months.

Under an exponential distribution assumption (which is specified using the ExpoDist distribution), the median times correspond to the following hazard rates:

  • λ0 = log(2)/t0 = 0.116,

  • λ1 = log(2)/t1 = 0.077,

and the resulting hazard ratio (HR) is 0.077/0.116 = 0.67.
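The conversion from median times to hazard rates can be reproduced in a few lines of base R. This is only a worked check of the arithmetic above; the variable names are illustrative.

```r
# Median progression times (months) under the exponential assumption
median.placebo = 6
median.treatment = 9

# Hazard rate of an exponential distribution: lambda = log(2) / median
lambda.placebo = log(2) / median.placebo
lambda.treatment = log(2) / median.treatment
hazard.ratio = lambda.treatment / lambda.placebo

print(round(c(lambda0 = lambda.placebo,
              lambda1 = lambda.treatment,
              HR = hazard.ratio), 3))
```

Note that the hazard ratio equals the ratio of the medians, 6/9, so it does not depend on the time unit.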

It is important to note that, if no censoring mechanisms are specified in a data model with a time-to-event endpoint, all patients will reach the endpoint of interest (e.g., progression) and thus the number of patients will be equal to the number of events. Using this property, power calculations can be performed using either the Event object or SampleSize object. For the purpose of illustration, the Event object will be used in this example.

To define a data model in the MCC clinical trial, the total event count in the trial is assumed to range between 270 and 300. Since the trial’s design is not balanced, the randomization ratio needs to be specified in the Event object:
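For orientation, the power at these event counts can be approximated analytically. The sketch below uses a Schoenfeld-type normal approximation for the log-rank test, which is a swapped-in analytical technique and not the simulation approach used by the package; the one-sided level of 0.025 is an assumption.

```r
# Schoenfeld-type approximation to the power of the log-rank test
hazard.ratio = (log(2) / 9) / (log(2) / 6)   # treatment vs placebo
n.events = c(270, 300)                       # total event counts
prop.treatment = 2 / 3                       # 2:1 randomization
alpha = 0.025                                # one-sided level (assumption)

z = sqrt(n.events * prop.treatment * (1 - prop.treatment)) *
  abs(log(hazard.ratio)) - qnorm(1 - alpha)
approx.power = pnorm(z)
names(approx.power) = paste(n.events, "events")
print(round(approx.power, 3))
```

The term prop.treatment * (1 - prop.treatment) shows why the randomization ratio matters: an unbalanced allocation carries less information per event than a 1:1 allocation.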

It is worth noting that the primary endpoint’s type (i.e., the outcome.type argument in the OutcomeDist object) is not specified. By default, the outcome type is set to fixed, which means that a design with a fixed follow-up is assumed even though the primary endpoint in this clinical trial is clearly a time-to-event endpoint. This is due to the fact that, as was explained earlier in this case study, there is no censoring in this design and all patients are followed until the event of interest is observed. It is easy to verify that the same results are obtained if the outcome type is set to event.

Define an Analysis Model

The analysis model in this clinical trial is very similar to the analysis models defined in the case studies with normal and binomial outcome variables. The only difference is the choice of the statistical method utilized in the primary analysis (method = "LogrankTest"):

To illustrate the specification of a Statistic object, the hazard ratio will be computed using the Cox method. This can be accomplished by adding a Statistic object to the AnalysisModel as presented below.

Define an Evaluation Model

An evaluation model identical to that used earlier in the case studies with normal and binomial distributions can be applied to compute the power function at the selected event counts. Moreover, the average hazard ratio across the simulations will be computed.

Download

The R code and the report that summarizes the results of Clinical Scenario Evaluation for this case study can be downloaded from the Mediana website.

Survival-type endpoint (with censoring)

The power calculations presented in the previous case study assume an idealized setting where each patient is followed until the event of interest (e.g., progression) is observed. In this case, the sample size (number of patients) in each treatment arm is equal to the number of events. In reality, events are often censored and a sponsor is generally interested in determining the number of patients to be recruited in order to ensure a target number of events, which translates into desirable power.

The Mediana package can be used to perform power calculations in event-driven trials in the presence of censoring. This is accomplished by setting up design parameters such as the length of the enrollment and follow-up periods in a data model using a Design object.

In general, even though closed-form solutions have been derived for sample size calculations in event-driven designs, the available approaches force clinical trial researchers to make a variety of simplifying assumptions, e.g., assumptions on the enrollment distribution are commonly made, see, for example, Julious (2009, Chapter 15). A general simulation-based approach to power and sample size calculations implemented in the Mediana package enables clinical trial sponsors to remove these artificial restrictions and examine a very broad set of plausible design parameters.

Define a Data Model

Suppose, for example, that a standard design with a variable follow-up will be used in the MCC trial introduced in the previous case study. The total study duration will be 21 months, which includes a 9-month enrollment (accrual) period and a minimum follow-up of 12 months. The patients are assumed to be recruited at a uniform rate. The set of design parameters also includes the dropout distribution and its parameters. In this clinical trial, the dropout distribution is exponential with a rate determined from historical data. These design parameters are specified in a Design object:

Finally, the primary endpoint’s type is set to event in the OutcomeDist object to indicate that a variable follow-up will be utilized in this clinical trial.

The complete data model in this case study is defined as follows:

Define an Analysis Model

Since the number of events has been fixed in this clinical trial example and some patients will not reach the event of interest, it will be important to estimate the number of patients required to accrue the required number of events. In the Mediana package, this can be accomplished by specifying a descriptive statistic named PatientCountStat (this statistic needs to be specified in a Statistic object). Another descriptive statistic that would be of interest is the event count in each sample. To compute this statistic, EventCountStat needs to be included in a Statistic object.
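A rough analytical counterpart to these simulated statistics can be obtained from the design parameters. The sketch below computes the probability that a patient has an event before the end of the study and converts the target event counts into an approximate number of patients. It assumes uniform enrollment over 9 months, a 21-month study duration, exponential event times with the medians of 6 and 9 months, 2:1 randomization, and no dropout; ignoring dropout is a simplifying assumption, so the result is a lower bound on what a simulation with dropout would report.

```r
# Probability of observing the event by the end of the study for a patient
# enrolled uniformly over the enrollment period (no dropout assumed).
event.prob = function(rate, enroll.period = 9, study.duration = 21) {
  min.followup = study.duration - enroll.period
  1 - (exp(-rate * min.followup) - exp(-rate * study.duration)) /
    (rate * enroll.period)
}

rate.placebo = log(2) / 6
rate.treatment = log(2) / 9

# Average over the 2:1 (treatment:placebo) allocation
p.event = (1 / 3) * event.prob(rate.placebo) +
  (2 / 3) * event.prob(rate.treatment)

n.events = c(270, 300)
n.patients = ceiling(n.events / p.event)
print(round(p.event, 3))
print(n.patients)
```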

Define an Evaluation Model

In order to compute the average values of the two statistics (PatientCountStat and EventCountStat) in each sample over the simulation runs, two Criterion objects need to be specified, in addition to the Criterion object defined to obtain marginal power. The IDs of the corresponding Statistic objects will be included in the statistics argument of the two Criterion objects:

Download

The R code and the report that summarizes the results of Clinical Scenario Evaluation for this case study can be downloaded from the Mediana website.

Count-type endpoint

The last clinical trial example within Case study 1 deals with a Phase III clinical trial in patients with relapsing-remitting multiple sclerosis (RRMS). The trial aims at assessing the safety and efficacy of a single dose of a novel treatment compared to placebo. The primary endpoint is the number of new gadolinium enhancing lesions seen during a 6-month period on monthly MRIs of the brain and a smaller number indicates treatment benefit. The distribution of such endpoints has been widely studied in the literature and Sormani et al. (1999a, 1999b) showed that a negative binomial distribution provides a fairly good fit.

The expected treatment effect in the experimental treatment and placebo arms is as follows (note that the negative binomial distribution is parameterized using the mean rather than the probability of success in each trial). The mean number of new lesions is set to 13 in the Placebo arm and 7.8 in the Treatment arm, with a common dispersion parameter of 0.5.

The corresponding treatment effect, i.e., the relative reduction in the mean number of new lesions, is 100 * (13 − 7.8)/13 = 40%. These assumptions define a single outcome parameter set.
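The relative reduction formula and the mean-based parameterization can be checked with base R. In the sketch below, the dispersion parameter is assumed to correspond to the size argument of rnbinom, with mu giving the mean; this mapping is an assumption about how the endpoint is generated and is not stated in the text.

```r
# Means consistent with the relative reduction formula: 13 and 7.8
mean.placebo = 13
mean.treatment = 7.8
dispersion = 0.5

relative.reduction = 100 * (mean.placebo - mean.treatment) / mean.placebo
print(relative.reduction)

# Negative binomial outcomes parameterized by the mean (mu) and the
# dispersion (assumed to map to the 'size' argument)
set.seed(1999)
lesions.placebo = rnbinom(1e5, size = dispersion, mu = mean.placebo)
lesions.treatment = rnbinom(1e5, size = dispersion, mu = mean.treatment)
print(round(c(placebo = mean(lesions.placebo),
              treatment = mean(lesions.treatment)), 2))
```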

Define a Data Model

The OutcomeDist object defines the distribution of the trial endpoint (NegBinomDist). Further, a balanced design is utilized in this clinical trial and the range of sample sizes is defined in the SampleSize object (it is convenient to do this using the seq function). The Sample object includes the parameters required by the negative binomial distribution (dispersion and mean).

Define an Analysis Model

The treatment effect will be assessed in this clinical trial example using a negative binomial generalized linear model (NBGLM). In the Mediana package, the corresponding test is carried out using the GLMNegBinomTest method, which is specified in the Test object. It should be noted that, as a smaller value indicates a treatment benefit, the first sample defined in the samples argument must be Treatment.

Alternatively, from version 1.0.6, it is possible to specify the argument larger in the parameters of the method. If it is set to FALSE, a numerically lower value is expected in Sample 2.

Define an Evaluation Model

The objective of this clinical trial is identical to that of the clinical trials presented earlier on this page, i.e., evaluation will be based on marginal power of the primary endpoint test. As a consequence, the same evaluation model can be applied.

Download

The R code and the report that summarizes the results of Clinical Scenario Evaluation for this case study can be downloaded from the Mediana website.

Case study 2

Summary

This clinical trial example deals with settings where no analytical methods are available to support power calculations. However, as demonstrated below, simulation-based approaches are easily applied to perform a comprehensive assessment of the relevant operating characteristics within the clinical scenario evaluation framework.

Case study 2 is based on a clinical trial example introduced in Dmitrienko and D’Agostino (2013, Section 10). This example deals with a Phase III clinical trial in a schizophrenia population. Three doses of a new treatment, labelled Dose L, Dose M and Dose H, will be tested versus placebo. The trial will be declared successful if a beneficial treatment effect is demonstrated in any of the three dosing groups compared to the placebo group.

The primary endpoint is defined as the reduction in the Positive and Negative Syndrome Scale (PANSS) total score compared to baseline and a larger reduction in the PANSS total score indicates treatment benefit. This endpoint is normally distributed and the treatment effect assumptions in the four treatment arms are displayed in the next table.

Arm       Mean   SD
Placebo   16     18
Dose L    19.5   18
Dose M    21     18
Dose H    21     18

Define a Data Model

The treatment effect assumptions presented in the table above define a single outcome parameter set and the common sample size is set to 260 patients. These parameters are specified in the following data model:

Define an Analysis Model

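The analysis model, shown below, defines the three individual tests that will be carried out in the schizophrenia clinical trial. Each test corresponds to a dose-placebo comparison and to a null hypothesis of no effect:

  • H1: Dose L versus placebo,

  • H2: Dose M versus placebo,

  • H3: Dose H versus placebo.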

Each comparison will be carried out based on a one-sided two-sample t-test (TTest method defined in each Test object).

As indicated earlier, the overall success criterion in the trial is formulated in terms of demonstrating a beneficial effect at any of the three doses. Due to multiple opportunities to claim success, the overall Type I error rate will be inflated and the Hochberg procedure is introduced to protect the error rate at the nominal level.
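To see how the Hochberg procedure behaves on a single set of results, the equally weighted version can be reproduced with base R's p.adjust function. The raw p-values below are hypothetical and the 0.025 level is an assumption; p.adjust does not support unequal hypothesis weights, so this sketch covers only the equally weighted case.

```r
# Hypothetical one-sided raw p-values for the three dose-placebo comparisons
p.raw = c("Dose L" = 0.200, "Dose M" = 0.011, "Dose H" = 0.004)
alpha = 0.025

# Hochberg-adjusted p-values (equally weighted hypotheses)
p.hochberg = p.adjust(p.raw, method = "hochberg")
# Bonferroni-adjusted p-values, shown for comparison
p.bonferroni = p.adjust(p.raw, method = "bonferroni")

print(rbind(raw = p.raw, hochberg = p.hochberg, bonferroni = p.bonferroni))
print(p.hochberg <= alpha)
```

With these inputs the Hochberg procedure declares Dose M and Dose H significant, whereas the Bonferroni adjustment retains only Dose H.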

Since no procedure parameters are defined, the three significance tests (or, equivalently, three null hypotheses of no effect) are assumed to be equally weighted. The corresponding analysis model is defined below:

To request the Hochberg procedure with unequally weighted hypotheses, the user needs to assign a list of hypothesis weights to the par argument of the MultAdjProc object. The weights typically reflect the relative importance of the individual null hypotheses. Assume, for example, that 60% of the overall weight is assigned to H3 and the remainder is split between H1 and H2. In this case, the MultAdjProc object should be defined as follows:

It should be noted that the order of the weights must be identical to the order of the Test objects defined in the analysis model.

Define an Evaluation Model

An evaluation model specifies clinically relevant criteria for assessing the performance of the individual tests defined in the corresponding analysis model or composite measures of success. In virtually any setting, it is of interest to compute the probability of achieving a significant outcome in each individual test, e.g., the probability of a significant difference between placebo and each dose. This is accomplished by requesting a Criterion object with method = "MarginalPower".

Since the trial will be declared successful if at least one dose-placebo comparison is significant, it is natural to compute the overall success probability, which is defined as the probability of demonstrating treatment benefit in one or more dosing groups. This is equivalent to evaluating disjunctive power in the trial (method = "DisjunctivePower").

In addition, the user can easily define a custom evaluation criterion. Suppose that, based on the results of the previously conducted trials, the sponsor expects a much larger treatment difference at Dose H compared to Doses L and M. Given this, the sponsor may be interested in evaluating the probability of observing a significant treatment effect at Dose H and at least one other dose. The associated evaluation criterion is implemented in the following function:

The function’s first argument (test.result) is a matrix of p-values produced by the Test objects defined in the analysis model and the second argument (statistic.result) is a matrix of results produced by the Statistic objects defined in the analysis model. In this example, the criterion will only use the test.result argument, which will contain the p-values produced by the tests associated with the three dose-placebo comparisons. The last argument (parameter) contains the optional parameter(s) defined by the user in the par argument of the Criterion object. In this example, it contains the overall alpha level.

The case.study2.criterion function computes the probability of a significant treatment effect at Dose H (test.result[,3] <= alpha) and a significant treatment difference at Dose L or Dose M ((test.result[,1] <= alpha) | (test.result[,2]<= alpha)). Since this criterion assumes that the third test is based on the comparison of Dose H versus Placebo, the order in which the tests are included in the evaluation model is important.
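Based on this description, the function can be sketched as follows. The function name and argument names follow the text; the body is reconstructed from the description, and the matrix of p-values used to exercise it is hypothetical (one row per simulation run, columns ordered as Dose L, Dose M, Dose H).

```r
# Custom criterion: significant effect at Dose H (third test) and at
# least one of Dose L (first test) or Dose M (second test)
case.study2.criterion = function(test.result, statistic.result, parameter) {
  alpha = parameter
  significant = (test.result[, 3] <= alpha) &
    ((test.result[, 1] <= alpha) | (test.result[, 2] <= alpha))
  # Proportion of simulation runs in which the criterion is met
  mean(significant)
}

# Hypothetical p-values from four simulation runs
p.values = rbind(c(0.010, 0.300, 0.020),
                 c(0.300, 0.400, 0.010),
                 c(0.010, 0.010, 0.200),
                 c(0.500, 0.020, 0.001))
colnames(p.values) = c("Dose L", "Dose M", "Dose H")

print(case.study2.criterion(p.values, NULL, 0.025))
```

In this toy input the criterion is met in the first and fourth runs only, so the function returns 0.5.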

The following evaluation model specifies marginal and disjunctive power as well as the custom evaluation criterion defined above:

Another potential option is to apply the conjunctive criterion which is met if a significant treatment difference is detected simultaneously in all three dosing groups (method = "ConjunctivePower"). This criterion helps characterize the likelihood of a consistent treatment effect across the doses.

The user can also use the metric.tests parameter to choose the specific tests to which the disjunctive and conjunctive criteria are applied (the resulting criteria are known as subset disjunctive and conjunctive criteria). To illustrate, the following statement computes the probability of a significant treatment effect at Dose M or Dose H (Dose L is excluded from this calculation):
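The definitions of the disjunctive, conjunctive and subset disjunctive criteria can be illustrated directly on a matrix of p-values. The sketch below is a base R illustration of the definitions, not the package's implementation; the p-values and the 0.025 level are hypothetical.

```r
alpha = 0.025

# Hypothetical p-values: one row per simulation run,
# columns ordered as Dose L, Dose M, Dose H
p.values = rbind(c(0.010, 0.300, 0.200),
                 c(0.300, 0.400, 0.010),
                 c(0.200, 0.300, 0.400),
                 c(0.010, 0.020, 0.003))
significant = p.values <= alpha

# At least one significant dose-placebo comparison
disjunctive.power = mean(apply(significant, 1, any))
# All three comparisons significant simultaneously
conjunctive.power = mean(apply(significant, 1, all))
# Subset disjunctive criterion: Dose M or Dose H (Dose L excluded)
subset.disjunctive.power = mean(apply(significant[, 2:3], 1, any))

print(c(disjunctive = disjunctive.power,
        conjunctive = conjunctive.power,
        subset.disjunctive = subset.disjunctive.power))
```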

Download

The R code and the report that summarizes the results of Clinical Scenario Evaluation for this case study can be downloaded from the Mediana website.

Case study 3

Summary

This case study deals with a Phase III clinical trial in patients with mild or moderate asthma (it is based on a clinical trial example from Millen et al., 2014, Section 2.2). The trial is intended to support a tailoring strategy. In particular, the treatment effect of a single dose of a new treatment will be compared to that of placebo in the overall population of patients as well as a pre-specified subpopulation of patients with a marker-positive status at baseline (for compactness, the overall population is denoted by OP, marker-positive subpopulation is denoted by M+ and marker-negative subpopulation is denoted by M−).

Marker-positive patients are more likely to receive benefit from the experimental treatment. The overall objective of the clinical trial accounts for the fact that the treatment’s effect may, in fact, be limited to the marker-positive subpopulation. The trial will be declared successful if the treatment’s beneficial effect is established in the overall population of patients or, alternatively, the effect is established only in the subpopulation. The primary endpoint in the clinical trial is defined as an increase from baseline in the forced expiratory volume in one second (FEV1). This endpoint is normally distributed and improvement is associated with a larger change in FEV1.

Define a Data Model

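To set up a data model for this clinical trial, it is natural to define samples (mutually exclusive groups of patients) as follows:

  • Sample 1: marker-negative patients in the placebo arm (Placebo M-),

  • Sample 2: marker-positive patients in the placebo arm (Placebo M+),

  • Sample 3: marker-negative patients in the treatment arm (Treatment M-),

  • Sample 4: marker-positive patients in the treatment arm (Treatment M+).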

Using this definition of samples, the trial’s sponsor can model the fact that the treatment’s effect is most pronounced in patients with a marker-positive status.

The treatment effect assumptions in the four samples are summarized in the next table (expiratory volume in FEV1 is measured in liters). As shown in the table, the mean change in FEV1 is constant across the marker-negative and marker-positive subpopulations in the placebo arm (Samples 1 and 2). A positive treatment effect is expected in both subpopulations in the treatment arm but marker-positive patients will experience most of the beneficial effect (Sample 4).

Sample         Mean   SD
Placebo M-     0.12   0.45
Placebo M+     0.12   0.45
Treatment M-   0.24   0.45
Treatment M+   0.30   0.45

The following data model incorporates the assumptions listed above by defining a single set of outcome parameters. The data model includes three sample size sets (total sample size is set to 330, 340 and 350 patients). The sizes of the individual samples are computed based on historical information (40% of patients in the population of interest are expected to have a marker-positive status). Because the sample size differs from sample to sample, it is specified within each Sample object.
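The sizes of the four samples follow from the total sample size and the marker-positive prevalence. The sketch below works through the arithmetic in base R; the 1:1 randomization between placebo and treatment is an assumption, since the allocation ratio is not stated in the text.

```r
total.sample.size = c(330, 340, 350)
prop.marker.positive = 0.40

# Assumed 1:1 randomization: half of the patients in each arm
n.per.arm = total.sample.size / 2
n.mpos = round(prop.marker.positive * n.per.arm)
n.mneg = n.per.arm - n.mpos

sample.sizes = rbind("Placebo M-" = n.mneg,
                     "Placebo M+" = n.mpos,
                     "Treatment M-" = n.mneg,
                     "Treatment M+" = n.mpos)
colnames(sample.sizes) = paste("Total", total.sample.size)
print(sample.sizes)
```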

Define an Analysis Model

The analysis model in this clinical trial example is generally similar to that used in Case study 2 but there is an important difference which is described below.

As in Case study 2, the primary endpoint follows a normal distribution and thus the treatment effect will be assessed using the two-sample t-test.

Since two null hypotheses are tested in this trial (null hypotheses of no effect in the overall population of patients and subpopulation of marker-positive patients), a multiplicity adjustment needs to be applied. The Hochberg procedure with equally weighted null hypotheses will be used for this purpose.

A key feature of the analysis strategy in this case study is that the samples defined in the data model are different from the samples used in the analysis of the primary endpoint. As shown in the table above, four samples are included in the data model. However, from the analysis perspective, the sponsor is interested in examining the treatment effect in two samples, namely, the overall population and marker-positive subpopulation. To perform a comparison in the overall population, the t-test is applied to the following analysis samples:

  • Placebo arm: Placebo M- and Placebo M+ merged,

  • Treatment arm: Treatment M- and Treatment M+ merged.

Further, the treatment effect test in the subpopulation of marker-positive patients is carried out based on these analysis samples:

  • Placebo arm: Placebo M+,

  • Treatment arm: Treatment M+.

These analysis samples are specified in the analysis model below. The samples defined in the data model are merged using the c() or list() function, e.g., c("Placebo M-", "Placebo M+") defines the placebo arm and c("Treatment M-", "Treatment M+") defines the experimental treatment arm in the overall population test.
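The idea of merging data-model samples into analysis samples can be mimicked in base R. The sketch below simulates the four samples under the assumptions in the table, pools them with c() and runs one-sided t-tests. The sample sizes (99 and 66 per sample) and the seed are hypothetical, and the code does not use the package's Test objects.

```r
set.seed(1234)

# Simulate the four data-model samples (hypothetical sample sizes)
placebo.minus <- rnorm(99, mean = 0.12, sd = 0.45)
placebo.plus <- rnorm(66, mean = 0.12, sd = 0.45)
treatment.minus <- rnorm(99, mean = 0.24, sd = 0.45)
treatment.plus <- rnorm(66, mean = 0.30, sd = 0.45)

# Overall population: merge the marker-negative and marker-positive samples
placebo.all <- c(placebo.minus, placebo.plus)
treatment.all <- c(treatment.minus, treatment.plus)
p.overall <- t.test(treatment.all, placebo.all,
                    alternative = "greater", var.equal = TRUE)$p.value

# Marker-positive subpopulation: use the M+ samples only
p.plus <- t.test(treatment.plus, placebo.plus,
                 alternative = "greater", var.equal = TRUE)$p.value

print(c(overall = p.overall, marker.positive = p.plus))
```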

Define an Evaluation Model

It is reasonable to consider the following success criteria in this case study:

The following evaluation model applies the three criteria to the two tests listed in the analysis model:

Download

The R code and the report that summarizes the results of Clinical Scenario Evaluation for this case study can be downloaded on the Mediana website:

Case study 4

Summary

Case study 4 serves as an extension of the oncology clinical trial example presented in Case study 1. Consider again a Phase III trial in patients with metastatic colorectal cancer (MCC). The same general design will be assumed in this section; however, an additional endpoint (overall survival) will be introduced. The case of two endpoints helps showcase the package’s ability to model complex design and analysis strategies in trials with multivariate outcomes.

Progression-free survival (PFS) is the primary endpoint in this clinical trial and overall survival (OS) serves as the key secondary endpoint, which provides supportive evidence of treatment efficacy. A hierarchical testing approach will be utilized in the analysis of the two endpoints. The PFS analysis will be performed first at α = 0.025 (one-sided), followed by the OS analysis at the same level if a significant effect on PFS is established. The resulting testing procedure is equivalent to the fixed-sequence procedure and controls the overall Type I error rate (Dmitrienko and D’Agostino, 2013).

The treatment effect assumptions that will be used in clinical scenario evaluation are listed in the table below. The table shows the hypothesized median times along with the corresponding hazard rates for the primary and secondary endpoints. It follows from the table that the expected effect size is much larger for PFS compared to OS (PFS hazard ratio is lower than OS hazard ratio).

Endpoint                    Statistic               Placebo    Treatment
Progression-free survival   Median time (months)    6          9
                            Hazard rate             0.116      0.077
                            Hazard ratio            0.67
Overall survival            Median time (months)    15         19
                            Hazard rate             0.046      0.036
                            Hazard ratio            0.79
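The hazard rates and hazard ratios in the table follow from the median times if the event times are exponentially distributed, since the hazard rate then equals log(2) divided by the median. A minimal base R check, with hypothetical object names:

```r
# Hypothesized median times (months) from the table
median.time <- matrix(c(6, 9,
                        15, 19),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(c("PFS", "OS"),
                                      c("Placebo", "Treatment")))

# Exponential distribution: hazard rate = log(2) / median
hazard.rate <- log(2) / median.time

# Hazard ratio of treatment versus placebo for each endpoint
hazard.ratio <- hazard.rate[, "Treatment"] / hazard.rate[, "Placebo"]

print(round(hazard.rate, 3))
print(round(hazard.ratio, 2))
```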

Define a Data Model

In this clinical trial two endpoints are evaluated for each patient (PFS and OS) and thus their joint distribution needs to be listed in the general set.

A bivariate exponential distribution will be used in this example and samples from this bivariate distribution will be generated by the MVExpoPFSOSDist function which implements multivariate exponential distributions. The function utilizes the copula method, i.e., random variables that follow a bivariate normal distribution will be generated and then converted into exponential random variables.
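The copula idea can be sketched in a few lines of base R. This is an illustration of the general technique only, not the code behind MVExpoPFSOSDist: any additional constraints the package function may impose are not modeled, the correlation of 0.3 is a hypothetical value, and the rates are the placebo hazard rates implied by the median times above.

```r
set.seed(42)
n <- 10000

# Placebo hazard rates implied by median times of 6 and 15 months
rate <- c(pfs = log(2) / 6, os = log(2) / 15)

# Hypothetical correlation of the underlying normal variables
rho <- 0.3
corr <- matrix(c(1, rho,
                 rho, 1), 2, 2)

# Step 1: correlated standard normal variables via the Cholesky factor
z <- matrix(rnorm(n * 2), ncol = 2) %*% chol(corr)

# Step 2: transform each margin to a uniform variable
u <- pnorm(z)

# Step 3: apply the exponential quantile function to each margin
pfs <- qexp(u[, 1], rate = rate["pfs"])
os <- qexp(u[, 2], rate = rate["os"])

print(c(median.pfs = median(pfs), median.os = median(os),
        correlation = cor(pfs, os)))
```

The sample medians should be close to 6 and 15 months, and the two simulated endpoints remain positively correlated.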

The next several statements specify the parameters of the bivariate exponential distribution:

The hazard rates for PFS and OS in each treatment arm are defined based on the information presented in the table above (placebo.par and treatment.par) and the correlation matrix is specified based on historical information (corr.matrix). These parameters are combined to define the outcome parameter sets (outcome.placebo and outcome.treatment) that will be included in the sample-specific set of data model parameters (Sample object).

To define the sample-specific data model parameters, note first that a 2:1 randomization ratio will be used in this clinical trial, and thus the number of events as well as the randomization ratio are specified by the user in the Event object. Second, a separate sample ID needs to be assigned to each endpoint within the two samples (e.g., Placebo PFS and Placebo OS) corresponding to the two treatment arms. This will enable the user to construct analysis models for examining the treatment effect on each endpoint.

Define an Analysis Model

The treatment comparisons for both endpoints will be carried out based on the log-rank test (method = "LogrankTest"). Further, as was stated in the beginning of this page, the two endpoints will be tested hierarchically using a multiplicity adjustment procedure known as the fixed-sequence procedure. This procedure belongs to the class of chain procedures (proc = "ChainAdj") and the following figure provides a visual summary of the decision rules used in this procedure.

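The circles in this figure denote the two null hypotheses of interest:

H1: Null hypothesis of no difference between the two arms with respect to PFS.

H2: Null hypothesis of no difference between the two arms with respect to OS.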

The value displayed above a circle defines the initial weight of each null hypothesis. All of the overall α is allocated to H1 to ensure that the OS test will be carried out only after the PFS test is significant and the arrow indicates that H2 will be tested after H1 is rejected.

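More formally, a chain procedure is uniquely defined by specifying a vector of hypothesis weights (W) and matrix of transition parameters (G). Based on the figure, these parameters are given by W = [1, 0] and G = [0 1; 0 0] (the rows of G are separated by a semicolon), i.e., all of the weight is initially assigned to H1 and is fully transferred to H2 once H1 is rejected.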
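The decision rules of a chain procedure can be sketched in base R. The function below follows the standard sequentially rejective algorithm for weight-and-transition procedures and is not the code used by the package; chain.reject is a hypothetical name. The weights and transition matrix encode the rules described above: all of the weight starts on H1 and is passed to H2 once H1 is rejected.

```r
# Sequentially rejective chain procedure (illustrative sketch)
# p: raw p-values, w: hypothesis weights, g: transition matrix
chain.reject <- function(p, w, g, alpha) {
  m <- length(p)
  rejected <- rep(FALSE, m)
  repeat {
    # Hypotheses that can be rejected at their current local level
    candidates <- which(!rejected & w > 0 & p <= w * alpha)
    if (length(candidates) == 0) break
    j <- candidates[1]
    rejected[j] <- TRUE
    w.new <- w
    g.new <- g
    for (l in which(!rejected)) {
      # Pass the weight of the rejected hypothesis along the arrows
      w.new[l] <- w[l] + w[j] * g[j, l]
      for (k in which(!rejected)) {
        if (k == l) next
        denom <- 1 - g[l, j] * g[j, l]
        g.new[l, k] <- if (denom > 0) {
          (g[l, k] + g[l, j] * g[j, k]) / denom
        } else {
          0
        }
      }
    }
    w.new[j] <- 0
    g.new[j, ] <- 0
    g.new[, j] <- 0
    w <- w.new
    g <- g.new
  }
  rejected
}

# Fixed-sequence procedure for H1 (PFS) and H2 (OS)
weight <- c(1, 0)
transition <- matrix(c(0, 1,
                       0, 0), 2, 2, byrow = TRUE)

# Hypothetical p-values: both hypotheses are rejected
both <- chain.reject(c(0.010, 0.020), weight, transition, alpha = 0.025)
# H1 is not significant, so H2 cannot be tested
none <- chain.reject(c(0.030, 0.001), weight, transition, alpha = 0.025)
print(rbind(both, none))
```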

Two objects (named chain.weight and chain.transition) are defined below to pass the hypothesis weights and transition parameters to the multiplicity adjustment parameters.

As shown above, the two significance tests included in the analysis model reflect the two-fold objective of this trial. The first test focuses on a PFS comparison between the two treatment arms (id = "PFS test") whereas the other test is carried out to assess the treatment effect on OS (id = "OS test").

Alternatively, the fixed-sequence procedure can be implemented using the FixedSeqAdj method, which was introduced in version 1.0.4. This implementation is simpler because no parameters have to be specified.

Define an Evaluation Model

The evaluation model specifies the most basic criterion for assessing the probability of success in the PFS and OS analyses (marginal power). A criterion based on disjunctive power could be considered but it would not provide additional information.

Due to the hierarchical testing approach, the probability of detecting a significant treatment effect on at least one endpoint (disjunctive power) is simply equal to the probability of establishing a significant PFS effect.
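This identity is easy to verify numerically. The base R sketch below draws hypothetical p-values (the Beta distributions are placeholders, not the trial's assumptions), applies the fixed-sequence rule and compares disjunctive power with the marginal power of the PFS test.

```r
set.seed(2019)
n.sims <- 10000
alpha <- 0.025

# Hypothetical p-value distributions, used only to generate test outcomes
p.pfs <- rbeta(n.sims, 0.3, 3)
p.os <- rbeta(n.sims, 0.6, 3)

# Fixed-sequence rule: OS can be significant only if PFS is significant
reject.pfs <- p.pfs <= alpha
reject.os <- reject.pfs & (p.os <= alpha)

marginal.power <- c(PFS = mean(reject.pfs), OS = mean(reject.os))
disjunctive.power <- mean(reject.pfs | reject.os)

print(marginal.power)
print(disjunctive.power)
```

Because a significant OS result implies a significant PFS result, the disjunctive power coincides with the marginal power of the PFS test in every simulation setting.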

Download

The R code and the report that summarizes the results of Clinical Scenario Evaluation for this case study can be downloaded on the Mediana website:

Case study 5

Summary

This case study extends the straightforward setting presented in Case study 1 to a more complex setting involving two trial endpoints and three treatment arms. Case study 5 illustrates the process of performing power calculations in clinical trials with multiple, hierarchically structured objectives and “multivariate” multiplicity adjustment strategies (gatekeeping procedures).

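Consider a three-arm Phase III clinical trial for the treatment of rheumatoid arthritis (RA). Two co-primary endpoints will be used to evaluate the effect of a novel treatment on clinical response and on physical function. The endpoints are defined as follows:

  • Endpoint 1: Response rate based on the American College of Rheumatology definition of improvement (ACR20).

  • Endpoint 2: Change in the Health Assessment Questionnaire-Disability Index (HAQ-DI).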

The two endpoints have different marginal distributions. The first endpoint is binary whereas the second one is continuous and follows a normal distribution.

The efficacy profile of two doses of a new treatment (Dose L and Dose H) will be compared to that of a placebo and a successful outcome will be defined as a significant treatment effect at either or both doses. A hierarchical structure has been established within each dose so that Endpoint 2 will be tested if and only if there is evidence of a significant effect on Endpoint 1.

Three treatment effect scenarios for each endpoint are displayed in the table below. The scenarios define three outcome parameter sets. The first set represents a rather conservative treatment effect scenario, the second set is a standard (most plausible) scenario and the third set represents an optimistic scenario. Note that a reduction in the HAQ-DI score indicates a beneficial effect and thus the mean changes are assumed to be negative for Endpoint 2.

Endpoint              Outcome parameter set    Placebo         Dose L          Dose H
ACR20 (%)             Conservative             30%             40%             50%
                      Standard                 30%             45%             55%
                      Optimistic               30%             50%             60%
HAQ-DI (mean (SD))    Conservative             -0.10 (0.50)    -0.20 (0.50)    -0.30 (0.50)
                      Standard                 -0.10 (0.50)    -0.25 (0.50)    -0.35 (0.50)
                      Optimistic               -0.10 (0.50)    -0.30 (0.50)    -0.40 (0.50)

Define a Data Model

As in Case study 4, two endpoints are evaluated for each patient in this clinical trial example, which means that their joint distribution needs to be specified. The MVMixedDist method will be utilized for specifying a bivariate distribution with binomial and normal marginals (var.type = list("BinomDist", "NormalDist")). In general, this function is used for modeling correlated normal, binomial and exponential endpoints and relies on the copula method, i.e., random variables are generated from a multivariate normal distribution and converted into variables with pre-specified marginal distributions.
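For intuition, the same copula construction with one binary and one normal marginal can be written in base R. This is a sketch of the general technique rather than the MVMixedDist implementation. The marginal parameters are the placebo assumptions from the table, and the correlation of 0.5 is applied here to the underlying normal variables.

```r
set.seed(7)
n <- 10000

# Correlation of the underlying bivariate normal distribution
corr.normal <- matrix(c(1.0, 0.5,
                        0.5, 1.0), 2, 2)

# Correlated standard normal variables, then uniform margins
z <- matrix(rnorm(n * 2), ncol = 2) %*% chol(corr.normal)
u <- pnorm(z)

# Endpoint 1: binary response with a 30% response rate
acr20 <- qbinom(u[, 1], size = 1, prob = 0.3)

# Endpoint 2: normally distributed change with mean -0.10 and SD 0.5
haqdi <- qnorm(u[, 2], mean = -0.10, sd = 0.5)

print(c(response.rate = mean(acr20), mean.change = mean(haqdi),
        correlation = cor(acr20, haqdi)))
```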

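Three parameters must be defined to specify the joint distribution of Endpoints 1 and 2 in this clinical trial example:

  • Variable types (binomial and normal).

  • Outcome distribution parameters in each arm (proportion for Endpoint 1, mean and SD for Endpoint 2).

  • Correlation matrix of the multivariate normal distribution used in the copula method.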

These parameters are combined to define three outcome parameter sets (e.g., outcome1.placebo, outcome1.dosel and outcome1.doseh) that will be included in the Sample object in the data model.

# Variable types
var.type = list("BinomDist", "NormalDist")

# Outcome distribution parameters
placebo.par = parameters(parameters(prop = 0.3), 
                         parameters(mean = -0.10, sd = 0.5))

dosel.par1 = parameters(parameters(prop = 0.40), 
                        parameters(mean = -0.20, sd = 0.5))
dosel.par2 = parameters(parameters(prop = 0.45), 
                        parameters(mean = -0.25, sd = 0.5))
dosel.par3 = parameters(parameters(prop = 0.50), 
                        parameters(mean = -0.30, sd = 0.5))

doseh.par1 = parameters(parameters(prop = 0.50), 
                        parameters(mean = -0.30, sd = 0.5))
doseh.par2 = parameters(parameters(prop = 0.55), 
                        parameters(mean = -0.35, sd = 0.5))
doseh.par3 = parameters(parameters(prop = 0.60), 
                        parameters(mean = -0.40, sd = 0.5))

# Correlation between two endpoints
corr.matrix = matrix(c(1.0, 0.5,
                       0.5, 1.0), 2, 2)

# Outcome parameter set 1
outcome1.placebo = parameters(type = var.type, 
                              par = placebo.par, 
                              corr = corr.matrix)
outcome1.dosel = parameters(type = var.type, 
                            par = dosel.par1, 
                            corr = corr.matrix)
outcome1.doseh = parameters(type = var.type, 
                            par = doseh.par1, 
                            corr = corr.matrix)

# Outcome parameter set 2
outcome2.placebo = parameters(type = var.type, 
                              par = placebo.par, 
                              corr = corr.matrix)
outcome2.dosel = parameters(type = var.type, 
                            par = dosel.par2, 
                            corr = corr.matrix)
outcome2.doseh = parameters(type = var.type, 
                            par = doseh.par2, 
                            corr = corr.matrix)

# Outcome parameter set 3
outcome3.placebo = parameters(type = var.type, 
                              par = placebo.par, 
                              corr = corr.matrix)
outcome3.dosel = parameters(type = var.type,
                            par = dosel.par3,
                            corr = corr.matrix)
outcome3.doseh = parameters(type = var.type,
                            par = doseh.par3,
                            corr = corr.matrix)

These outcome parameter sets are then combined within each Sample object and the common sample size per treatment arm ranges between 100 and 120:

Define an Analysis Model

To set up the analysis model in this clinical trial example, note that the treatment comparisons for Endpoints 1 and 2 will be carried out based on two different statistical tests:

It was pointed out earlier in this page that the two endpoints will be tested hierarchically within each dose. The figure below provides a visual summary of the testing strategy used in this clinical trial. The circles in this figure denote the four null hypotheses of interest:

H1: Null hypothesis of no difference between Dose L and placebo with respect to Endpoint 1.

H2: Null hypothesis of no difference between Dose H and placebo with respect to Endpoint 1.

H3: Null hypothesis of no difference between Dose L and placebo with respect to Endpoint 2.

H4: Null hypothesis of no difference between Dose H and placebo with respect to Endpoint 2.

A multiple testing procedure known as the multiple-sequence gatekeeping procedure will be applied to account for the hierarchical structure of this multiplicity problem. This procedure belongs to the class of mixture-based gatekeeping procedures introduced in Dmitrienko et al. (2015). This gatekeeping procedure is specified by defining the following three parameters:

  • Families of null hypotheses (family).

  • Component procedures used in the families (component.procedure).

  • Truncation parameters used in the families (gamma).

These parameters are included in the MultAdjProc object defined below. The tests to which the multiplicity adjustment will be applied are defined in the tests argument. This argument is optional if the adjustment applies to all tests included in the analysis model. The argument family states that the null hypotheses will be grouped into two families:

  • Family 1: H1 and H2.

  • Family 2: H3 and H4.

Note that the order of the hypotheses corresponds to the order of the tests defined in the analysis model, unless the tests are explicitly listed in the tests argument of the MultAdjProc object.

The families will be tested sequentially and a truncated Holm procedure will be applied within each family (component.procedure). Lastly, the truncation parameter will be set to 0.8 in Family 1 and to 1 in Family 2 (gamma). The resulting parameters are included in the par argument of the MultAdjProc object and, as before, the proc argument is used to specify the multiple testing procedure (MultipleSequenceGatekeepingAdj).
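The role of the truncation parameter can be illustrated with the critical values of the truncated Holm procedure, which are a convex combination of the Holm and Bonferroni critical values. The base R sketch below evaluates them for a family of two hypotheses. The function name is hypothetical and a one-sided level of 0.025 is assumed for both families. In the actual gatekeeping procedure the level available to Family 2 depends on the outcome in Family 1, which is not modeled here.

```r
# Critical values of the truncated Holm procedure for the ordered
# p-values p(1) <= ... <= p(m) in a family of m hypotheses:
# (gamma / (m - i + 1) + (1 - gamma) / m) * alpha, i = 1, ..., m
truncated.holm.critical <- function(m, gamma, alpha) {
  i <- seq_len(m)
  (gamma / (m - i + 1) + (1 - gamma) / m) * alpha
}

alpha <- 0.025  # assumed one-sided level

# Family 1: truncation parameter 0.8
crit.family1 <- truncated.holm.critical(m = 2, gamma = 0.8, alpha = alpha)

# Family 2: truncation parameter 1 (regular Holm procedure)
crit.family2 <- truncated.holm.critical(m = 2, gamma = 1, alpha = alpha)

# gamma = 0 reduces to the Bonferroni procedure
crit.bonferroni <- truncated.holm.critical(m = 2, gamma = 0, alpha = alpha)

print(rbind(crit.family1, crit.family2, crit.bonferroni))
```

A truncation parameter below 1 makes the second step in Family 1 slightly more conservative than the regular Holm procedure, which is what leaves part of the error rate available for Family 2 when only one Family 1 hypothesis is rejected.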

The tests are then specified in the analysis model and the overall analysis model is defined as follows:

Recall that a numerically lower value indicates a beneficial effect for the HAQ-DI score and, as a result, the experimental treatment arm must be defined prior to the placebo arm in the samples argument of the Test objects corresponding to the HAQ-DI tests, e.g., samples = samples("DoseL HAQ-DI", "Placebo HAQ-DI").
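The effect of the sample order on a one-sided test can be seen in base R. The sketch assumes, as the ordering described above implies, that the one-sided test favors larger values in the second sample. The helper one.sided.p is hypothetical and is not a package function, and the data are simulated under the standard HAQ-DI scenario for Dose L.

```r
set.seed(11)

# Simulated HAQ-DI changes (lower values are better)
placebo <- rnorm(100, mean = -0.10, sd = 0.5)
dose.l <- rnorm(100, mean = -0.25, sd = 0.5)

# One-sided p-value for "mean of second sample > mean of first sample"
one.sided.p <- function(first, second) {
  t.test(second, first, alternative = "greater", var.equal = TRUE)$p.value
}

# Treatment arm listed first, so a larger placebo mean supports efficacy
p.treatment.first <- one.sided.p(first = dose.l, second = placebo)

# Reversed order tests the opposite direction
p.placebo.first <- one.sided.p(first = placebo, second = dose.l)

print(c(treatment.first = p.treatment.first, placebo.first = p.placebo.first))
```

The two p-values always sum to one, so listing the samples in the wrong order turns evidence of a beneficial effect into apparent evidence against it.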

Define an Evaluation Model

In order to assess the probability of success in this clinical trial, a hybrid criterion based on the conjunctive criterion (both trial endpoints must be significant) and disjunctive criterion (at least one dose-placebo comparison must be significant) can be considered.

This criterion will be met if a significant effect is established at one or two doses on Endpoint 1 (ACR20) and also at one or two doses on Endpoint 2 (HAQ-DI). However, due to the hierarchical structure of the testing strategy (see Figure), this is equivalent to demonstrating a significant difference between Placebo and at least one dose with respect to Endpoint 2. The corresponding criterion is a subset disjunctive criterion based on the two Endpoint 2 tests (subset disjunctive power was briefly mentioned in Case study 2).
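The equivalence can be checked on a small table of hypothetical test outcomes. In the base R sketch below each row is a simulation run and each column indicates whether H1 to H4 was rejected. The outcomes are made up for illustration, but they respect the hierarchy within each dose.

```r
# Hypothetical rejection indicators (rows = simulation runs)
reject <- matrix(c(
  TRUE,  TRUE,  TRUE,  FALSE,
  TRUE,  FALSE, FALSE, FALSE,
  FALSE, TRUE,  FALSE, TRUE,
  FALSE, FALSE, FALSE, FALSE,
  TRUE,  TRUE,  TRUE,  TRUE
), ncol = 4, byrow = TRUE,
dimnames = list(NULL, c("H1", "H2", "H3", "H4")))

# Hierarchy within each dose: H3 requires H1 and H4 requires H2
stopifnot(all(reject[, "H3"] <= reject[, "H1"]),
          all(reject[, "H4"] <= reject[, "H2"]))

# At least one dose is significant on each endpoint
endpoint1 <- reject[, "H1"] | reject[, "H2"]
endpoint2 <- reject[, "H3"] | reject[, "H4"]

hybrid.power <- mean(endpoint1 & endpoint2)
subset.disjunctive.e1 <- mean(endpoint1)
subset.disjunctive.e2 <- mean(endpoint2)
marginal.power <- colMeans(reject)

print(marginal.power)
print(c(hybrid = hybrid.power,
        endpoint1 = subset.disjunctive.e1,
        endpoint2 = subset.disjunctive.e2))
```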

In addition, the sponsor may also be interested in evaluating marginal power as well as subset disjunctive power based on the Endpoint 1 tests. The latter criterion will be met if a significant difference between Placebo and at least one dose is established with respect to Endpoint 1. Additionally, as in Case study 2, the user could consider defining custom evaluation criteria. The three resulting evaluation criteria (marginal power, subset disjunctive criterion based on the Endpoint 1 tests and subset disjunctive criterion based on the Endpoint 2 tests) are included in the following evaluation model.

Download

The R code and the report that summarizes the results of Clinical Scenario Evaluation for this case study can be downloaded on the Mediana website:

Case study 6

Summary

Case study 6 is an extension of Case study 2 where the objective of the sponsor is to compare several Multiple Testing Procedures (MTPs). The main difference is in the specification of the analysis model.

Define a Data Model

The same data model as in Case study 2 will be used in this case study. However, as shown in the table below, a new set of outcome parameters will be added in this case study (an optimistic set of parameters).

Outcome parameter set    Arm        Mean    SD
Standard                 Placebo    16      18
                         Dose L     19.5    18
                         Dose M     21      18
                         Dose H     21      18
Optimistic               Placebo    16      18
                         Dose L     20      18
                         Dose M     21      18
                         Dose H     22      18

Define an Analysis Model

As in Case study 2, each dose-placebo comparison will be performed using a one-sided two-sample t-test (TTest method defined in each Test object). The same nomenclature will be used to define the hypotheses, i.e.:

In this case study, as in Case study 2, the overall success criterion in the trial is formulated in terms of demonstrating a beneficial effect at any of the three doses, which inflates the overall Type I error rate. The sponsor is interested in comparing several Multiple Testing Procedures, such as the weighted Bonferroni, Holm and Hochberg procedures. These MTPs are defined below:

The mult.adj1 object, which specifies that no adjustment will be used, is defined in order to observe the decrease in power induced by each MTP.

It should be noted that for each weighted procedure, a higher weight is assigned to the test of Placebo vs Dose H (1/2), and the remaining weight is equally assigned to the two other tests (i.e. 1/4 for each test). These parameters are specified in the par argument of each MTP.
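To show how unequal weights enter the calculations, the base R sketch below computes weighted Bonferroni and weighted Holm adjusted p-values for three hypothetical p-values with weights 1/4, 1/4 and 1/2. The function weighted.holm is a hypothetical helper written for this illustration. It is not the package's implementation, and the weighted Hochberg procedure is not covered.

```r
# Weights for the Dose L, Dose M and Dose H tests
w <- c(L = 0.25, M = 0.25, H = 0.5)

# Hypothetical raw one-sided p-values
p <- c(L = 0.020, M = 0.030, H = 0.010)

# Weighted Bonferroni: each p-value is divided by its weight
p.bonferroni <- pmin(1, p / w)

# Weighted Holm: step-down version of the weighted Bonferroni procedure
weighted.holm <- function(p, w) {
  adj <- numeric(length(p))
  running.max <- 0
  remaining <- sum(w)  # total weight of hypotheses not yet processed
  for (i in order(p / w)) {
    running.max <- max(running.max, min(1, p[i] * remaining / w[i]))
    adj[i] <- running.max
    remaining <- remaining - w[i]
  }
  names(adj) <- names(p)
  adj
}
p.holm <- weighted.holm(p, w)

print(rbind(p.bonferroni, p.holm))
```

The Holm-adjusted p-values are never larger than the Bonferroni-adjusted ones, which is why the Holm procedure is uniformly at least as powerful.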

The analysis model is defined as follows:

For the sake of compactness, all MTPs are combined using a MultAdj object, but it is worth mentioning that each MTP could have been directly added to the AnalysisModel object using the + operator.

Define an Evaluation Model

As for the data model, the same evaluation model as in Case study 2 will be used in this case study. Refer to Case study 2 for more information.

The last Criterion object specifies the custom criterion which computes the probability of a significant treatment effect at Dose H and a significant treatment difference at Dose L or Dose M.
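The logic of this custom criterion can be written as a plain R function and checked on a few hypothetical adjusted p-values. The three-argument signature and the column order (Dose L, Dose M, Dose H) are assumptions made for this sketch. The function name and the p-values are made up for illustration.

```r
# Probability that Dose H is significant and that Dose L or Dose M
# is significant as well (columns assumed to be Dose L, Dose M, Dose H)
custom.criterion <- function(test.result, statistic.result, parameter) {
  alpha <- parameter
  significant <- (test.result[, 3] <= alpha) &
    ((test.result[, 1] <= alpha) | (test.result[, 2] <= alpha))
  mean(significant)
}

# Hypothetical adjusted p-values from four simulation runs
p.values <- matrix(c(
  0.010, 0.200, 0.005,
  0.300, 0.400, 0.010,
  0.020, 0.015, 0.100,
  0.500, 0.020, 0.001
), ncol = 3, byrow = TRUE)

power <- custom.criterion(p.values, statistic.result = NULL, parameter = 0.025)
print(power)
```

Only the first and last runs satisfy the criterion, so the estimated probability in this toy example is 0.5.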

Perform Clinical Scenario Evaluation

Using the data, analysis and evaluation models, simulation-based Clinical Scenario Evaluation is performed by calling the CSE function:

Generate a Simulation Report

This case study will also illustrate the process of customizing a Word-based simulation report. This can be accomplished by defining custom sections and subsections to provide a structured summary of the complex set of simulation results.

Create a Customized Simulation Report

Define a Presentation Model

Several presentation models will be used to produce customized simulation reports:

  • A report without subsections.

  • A report with subsections.

  • A report with combined sections.

First of all, a default PresentationModel object (case.study6.presentation.model.default) will be created. This object will include the common components of the report that are shared across the presentation models. The project information (Project object), sorting options in summary tables (Table object) and specification of custom labels (CustomLabel objects) are included in this object:

Report without subsections

The first simulation report will include a section for each outcome parameter set. To accomplish this, a Section object is added to the default PresentationModel object and the report is generated:

Report with subsections

The second report will include a section for each outcome parameter set and, in addition, a subsection will be created for each multiplicity adjustment procedure. The Section and Subsection objects are added to the default PresentationModel object as shown below and the report is generated:

Report with combined sections

Finally, the third report will include a section for each combination of outcome parameter set and each multiplicity adjustment procedure. This is accomplished by adding a Section object to the default PresentationModel object and specifying the outcome parameter and multiplicity adjustment in the section’s by argument.

Download

The R code and the report that summarizes the results of Clinical Scenario Evaluation for this case study can be downloaded on the Mediana website:
