This vignette provides a detailed explanation of the statistical methods implemented in the `bbssr` package. We cover the theoretical foundations of blinded sample size re-estimation (BSSR) and the five exact statistical tests supported by the package.
Traditional clinical trials use fixed sample sizes determined during the planning phase based on:

- Assumed treatment effect size
- Expected response rates in each group
- Desired power and significance level
However, these assumptions are often inaccurate, leading to:

- Underpowered studies when the assumed effect size is too optimistic
- Overpowered studies when the assumed effect size is too conservative
- Resource inefficiency due to incorrect sample size planning
- Ethical concerns about continuing underpowered trials
Blinded Sample Size Re-estimation (BSSR) addresses these issues by re-estimating the required sample size at an interim analysis using only the pooled (blinded) response rate: the treatment effect remains masked, the Type I error rate is preserved, and power is restored when the planning assumptions were mis-specified.
Let’s define the key parameters:

- \(p_1, p_2\): true response probabilities in the two groups
- \(X_1, X_2\): observed numbers of responders
- \(n_1, n_2\) (`N1`, `N2` in the code): group sample sizes
- \(\alpha\): one-sided significance level
- \(1 - \beta\): target power (`tar.power`)
The bbssr
package implements five exact statistical
tests, each with different characteristics and optimal use cases.
### Pearson chi-squared test (`Test = 'Chisq'`)

The one-sided Pearson chi-squared test uses the test statistic:
\[Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}\]
where \(\hat{p}_k = X_k / n_k\) are the observed group proportions and \(\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}\) is the pooled proportion.
P-value formula: \[p\text{-value} = P(Z \geq z_{\text{obs}}) = 1 - \Phi(z_{\text{obs}})\]
where \(\Phi(\cdot)\) is the standard normal cumulative distribution function.
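To make the formula concrete, here is a minimal base-R sketch (not part of `bbssr`) that computes \(Z\) and the one-sided p-value for a hypothetical pair of counts:

```r
# Minimal sketch (base R, hypothetical counts): pooled Z statistic and
# one-sided p-value, matching the formulas above.
x1 <- 20; n1 <- 30   # responders / sample size, group 1
x2 <- 12; n2 <- 30   # responders / sample size, group 2

p_hat <- (x1 + x2) / (n1 + n2)                # pooled proportion
z_obs <- (x1/n1 - x2/n2) /
  sqrt(p_hat * (1 - p_hat) * (1/n1 + 1/n2))   # pooled Z statistic
p_val <- pnorm(z_obs, lower.tail = FALSE)     # 1 - Phi(z_obs)
round(c(z = z_obs, p = p_val), 4)
```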
```r
# Example: Chi-squared test
library(bbssr)

power_chisq <- BinaryPower(
  p1 = 0.6, p2 = 0.4,
  N1 = 30, N2 = 30,
  alpha = 0.025,
  Test = 'Chisq'
)
print(paste("Chi-squared test power:", round(power_chisq, 3)))
#> [1] "Chi-squared test power: 0.349"
```
Characteristics:

- Good asymptotic properties for large samples
- Computationally efficient
- May be anti-conservative for small samples
### Fisher exact test (`Test = 'Fisher'`)

The Fisher exact test conditions on the total number of successes and uses the hypergeometric distribution.
P-value formula: \[p\text{-value} = P(X_1 \geq k | X_1 + X_2 = s) = \sum_{i=k}^{\min(n_1,s)} \frac{\binom{n_1}{i}\binom{n_2}{s-i}}{\binom{n_1+n_2}{s}}\]
where \(k\) is the observed number of successes in group 1, and \(s = X_1 + X_2\) is the total number of successes.
The conditional probability mass function is: \[P(X_1 = i | X_1 + X_2 = s) = \frac{\binom{n_1}{i}\binom{n_2}{s-i}}{\binom{n_1+n_2}{s}}\]
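As a didactic cross-check (base R, not `bbssr` code), the one-sided Fisher p-value follows directly from the hypergeometric tail, and `fisher.test()` returns the same value:

```r
# Minimal sketch: one-sided Fisher p-value via the hypergeometric tail,
# using the same hypothetical counts as above.
x1 <- 20; n1 <- 30
x2 <- 12; n2 <- 30
s  <- x1 + x2   # total successes (conditioned on)

# P(X1 >= x1 | X1 + X2 = s)
p_fisher <- phyper(x1 - 1, m = n1, n = n2, k = s, lower.tail = FALSE)
p_fisher

fisher.test(matrix(c(x1, n1 - x1, x2, n2 - x2), nrow = 2),
            alternative = "greater")$p.value   # same value from base R
```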
```r
# Example: Fisher exact test
power_fisher <- BinaryPower(
  p1 = 0.6, p2 = 0.4,
  N1 = 30, N2 = 30,
  alpha = 0.025,
  Test = 'Fisher'
)
print(paste("Fisher exact test power:", round(power_fisher, 3)))
#> [1] "Fisher exact test power: 0.257"
```
Characteristics:

- Exact Type I error control
- Conservative (actual α < nominal α)
- Widely accepted by regulatory agencies
- Conditional test
### Fisher mid-p test (`Test = 'Fisher-midP'`)

The Fisher mid-p test reduces the conservatism of the Fisher exact test by including only half the probability of the observed outcome.
P-value formula: \[p\text{-value} = P(X_1 > k | X_1 + X_2 = s) + 0.5 \cdot P(X_1 = k | X_1 + X_2 = s)\]
This can be expressed as: \[p\text{-value} = \sum_{i=k+1}^{\min(n_1,s)} \frac{\binom{n_1}{i}\binom{n_2}{s-i}}{\binom{n_1+n_2}{s}} + 0.5 \cdot \frac{\binom{n_1}{k}\binom{n_2}{s-k}}{\binom{n_1+n_2}{s}}\]
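The same hypergeometric functions give the mid-p value; a minimal base-R sketch with the hypothetical counts from above:

```r
# Minimal sketch: Fisher mid-p value = P(X1 > k | s) + 0.5 * P(X1 = k | s)
x1 <- 20; n1 <- 30
x2 <- 12; n2 <- 30
s  <- x1 + x2

p_midp <- phyper(x1, m = n1, n = n2, k = s, lower.tail = FALSE) +
  0.5 * dhyper(x1, m = n1, n = n2, k = s)
p_midp
```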
```r
# Example: Fisher mid-p test
power_midp <- BinaryPower(
  p1 = 0.6, p2 = 0.4,
  N1 = 30, N2 = 30,
  alpha = 0.025,
  Test = 'Fisher-midP'
)
print(paste("Fisher mid-p test power:", round(power_midp, 3)))
#> [1] "Fisher mid-p test power: 0.349"
```
Characteristics:

- Less conservative than Fisher exact
- Better power properties
- Maintains approximate Type I error control
### Z-pooled exact unconditional test (`Test = 'Z-pool'`)

This test uses the Z-statistic but calculates exact p-values by considering all possible values of the nuisance parameter \(\theta\) (the common success probability under the null hypothesis).
P-value formula: \[p\text{-value} = \max_{\theta \in [0,1]} P_{\theta}(Z \geq z_{\text{obs}})\]
where under the null hypothesis \(H_0: p_1 = p_2 = \theta\): \[P_{\theta}(Z \geq z_{\text{obs}}) = \sum_{(x_1,x_2): z(x_1,x_2) \geq z_{\text{obs}}} \binom{n_1}{x_1}\binom{n_2}{x_2}\theta^{x_1+x_2}(1-\theta)^{n_1+n_2-x_1-x_2}\]
The test statistic is: \[z(x_1,x_2) = \frac{\frac{x_1}{n_1} - \frac{x_2}{n_2}}{\sqrt{\frac{x_1+x_2}{n_1 n_2} \cdot \left(1 - \frac{x_1+x_2}{n_1+n_2}\right)}}\]
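The maximization over \(\theta\) can be illustrated with a didactic grid search over all outcome tables (`bbssr`'s internal algorithm may differ; this sketch only mirrors the definition, with deliberately small group sizes):

```r
# Minimal sketch: exact unconditional Z-pooled p-value by enumerating all
# (x1, x2) tables and maximizing over a grid of nuisance-parameter values.
n1 <- 10; n2 <- 10
z_stat <- function(x1, x2) {
  pp <- (x1 + x2) / (n1 + n2)
  if (pp == 0 || pp == 1) return(0)   # degenerate tables carry no evidence
  (x1/n1 - x2/n2) / sqrt(pp * (1 - pp) * (1/n1 + 1/n2))
}
tab   <- expand.grid(x1 = 0:n1, x2 = 0:n2)       # all possible outcomes
tab$z <- mapply(z_stat, tab$x1, tab$x2)

z_obs  <- z_stat(8, 3)                           # hypothetical observed table
reject <- tab$z >= z_obs                         # tables at least as extreme

theta <- seq(0.0001, 0.9999, length.out = 1000)  # nuisance-parameter grid
p_unc <- max(vapply(theta, function(th) {
  sum(dbinom(tab$x1[reject], n1, th) * dbinom(tab$x2[reject], n2, th))
}, numeric(1)))
p_unc
```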
```r
# Example: Z-pooled test
power_zpool <- BinaryPower(
  p1 = 0.6, p2 = 0.4,
  N1 = 30, N2 = 30,
  alpha = 0.025,
  Test = 'Z-pool'
)
print(paste("Z-pooled test power:", round(power_zpool, 3)))
#> [1] "Z-pooled test power: 0.33"
```
Characteristics:

- Unconditional test
- Good balance between power and conservatism
- Computationally more intensive than conditional tests
### Boschloo exact unconditional test (`Test = 'Boschloo'`)

The Boschloo test is the most powerful exact unconditional test. It maximizes the p-value over all possible values of the nuisance parameter, but uses the Fisher exact p-value as the test statistic.
P-value formula: \[p\text{-value} = \max_{\theta \in [0,1]} P_{\theta}(p_{\text{Fisher}}(X_1, X_2) \leq p_{\text{Fisher,obs}})\]
where \(p_{\text{Fisher}}(x_1, x_2)\) is the Fisher exact p-value for the observation \((x_1, x_2)\): \[p_{\text{Fisher}}(x_1, x_2) = P(X_1 \geq x_1 | X_1 + X_2 = x_1 + x_2)\]
Under the null hypothesis \(H_0: p_1 = p_2 = \theta\): \[P_{\theta}(p_{\text{Fisher}}(X_1, X_2) \leq p_{\text{Fisher,obs}}) = \sum_{\substack{(x_1,x_2): \\ p_{\text{Fisher}}(x_1,x_2) \leq p_{\text{Fisher,obs}}}} \binom{n_1}{x_1}\binom{n_2}{x_2}\theta^{x_1+x_2}(1-\theta)^{n_1+n_2-x_1-x_2}\]
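A matching sketch for Boschloo replaces the Z ordering with the Fisher p-value ordering (again purely didactic, not `bbssr`'s implementation):

```r
# Minimal sketch: Boschloo p-value, ordering tables by their Fisher p-value.
n1 <- 10; n2 <- 10
fisher_p <- function(x1, x2) {
  phyper(x1 - 1, m = n1, n = n2, k = x1 + x2, lower.tail = FALSE)
}
tab    <- expand.grid(x1 = 0:n1, x2 = 0:n2)
tab$pf <- mapply(fisher_p, tab$x1, tab$x2)

p_obs   <- fisher_p(8, 3)       # same hypothetical observed table
extreme <- tab$pf <= p_obs      # at least as extreme under the Fisher ordering

theta   <- seq(0.0001, 0.9999, length.out = 1000)
p_bosch <- max(vapply(theta, function(th) {
  sum(dbinom(tab$x1[extreme], n1, th) * dbinom(tab$x2[extreme], n2, th))
}, numeric(1)))
p_bosch
```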
```r
# Example: Boschloo test
power_boschloo <- BinaryPower(
  p1 = 0.6, p2 = 0.4,
  N1 = 30, N2 = 30,
  alpha = 0.025,
  Test = 'Boschloo'
)
print(paste("Boschloo test power:", round(power_boschloo, 3)))
#> [1] "Boschloo test power: 0.33"
```
Characteristics:

- Most powerful exact unconditional test
- Maintains exact Type I error control
- Computationally intensive
- Optimal choice when computational resources allow
The key insight is that the unconditional exact tests (Z-pool and Boschloo) recover most of the power lost by the conditional Fisher exact test while, unlike the chi-squared and mid-p tests, still guaranteeing exact Type I error control:
```r
# Compare all five tests
tests <- c('Chisq', 'Fisher', 'Fisher-midP', 'Z-pool', 'Boschloo')
powers <- sapply(tests, function(test) {
  BinaryPower(p1 = 0.6, p2 = 0.4, N1 = 30, N2 = 30, alpha = 0.025, Test = test)
})

comparison_df <- data.frame(
  Test = tests,
  Power = round(powers, 4),
  Type = c("Asymptotic", "Conditional", "Conditional", "Unconditional", "Unconditional"),
  Conservatism = c("Moderate", "High", "Moderate", "Moderate", "Low")
)
print(comparison_df)
#>                    Test  Power          Type Conservatism
#> Chisq             Chisq 0.3494    Asymptotic     Moderate
#> Fisher           Fisher 0.2571   Conditional         High
#> Fisher-midP Fisher-midP 0.3493   Conditional     Moderate
#> Z-pool           Z-pool 0.3298 Unconditional     Moderate
#> Boschloo       Boschloo 0.3298 Unconditional          Low
```
### Restricted design (`restricted = TRUE`)

In the restricted design, the final sample size must be at least the originally planned sample size:
\(N_{\text{final}} \geq N_{\text{planned}}\)
This approach is conservative and ensures that the study duration doesn’t exceed the originally planned timeline.
### Unrestricted design (`restricted = FALSE`)

The unrestricted design allows both increases and decreases in sample size based on the interim data:
\(N_{\text{final}} = \max(N_{\text{interim}}, N_{\text{recalculated}})\)
This provides maximum flexibility but may extend or shorten the study duration.
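The two rules differ only in the floor they impose on the final sample size; a minimal sketch with hypothetical numbers (`n_planned`, `n_interim`, and `n_reest` are illustrative names, not `bbssr` arguments):

```r
# Minimal sketch: restricted vs. unrestricted final sample size per group.
n_planned <- 48   # originally planned size (hypothetical)
n_interim <- 24   # already enrolled at the interim look (hypothetical)
n_reest   <- 56   # blinded re-estimate (hypothetical)

n_restricted   <- max(n_planned, n_reest)   # never below the original plan
n_unrestricted <- max(n_interim, n_reest)   # may fall below the plan
c(restricted = n_restricted, unrestricted = n_unrestricted)
```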
### Weighted approach (`weighted = TRUE`)

The weighted approach uses a weighted average across all possible interim scenarios:
\(N_{\text{final}} = \max\left(N_{\text{scenario}}, \sum_{h} w_h \cdot N_h\right)\)
where \(w_h\) are weights based on the probability of each interim scenario.
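As a toy illustration of the weighting (all names and the re-estimation rule below are hypothetical placeholders, not `bbssr` internals), the scenarios can be taken as the possible pooled interim responder counts, weighted by their binomial probabilities:

```r
# Toy sketch: weighted final sample size across interim scenarios.
n_int   <- 24                              # pooled interim sample size (hypothetical)
theta0  <- 0.5                             # assumed pooled response rate (hypothetical)
w_h     <- dbinom(0:n_int, n_int, theta0)  # scenario weights, sum to 1
reest_n <- function(h) 48 + 2 * abs(h - n_int * theta0)  # placeholder rule
n_h     <- vapply(0:n_int, reest_n, numeric(1))

n_scenario <- reest_n(12)                  # re-estimate for the observed scenario
n_weighted <- max(n_scenario, sum(w_h * n_h))
n_weighted
```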
```r
# Detailed BSSR example with different approaches
library(dplyr)   # for %>% and mutate()

bssr_results_list <- list()

# Restricted approach
bssr_results_list[["Restricted"]] <- BinaryPowerBSSR(
  asmd.p1 = 0.45, asmd.p2 = 0.09,
  p = seq(0.1, 0.9, by = 0.1),
  Delta.A = 0.36, Delta.T = 0.36,
  N1 = 24, N2 = 24, omega = 0.5, r = 1,
  alpha = 0.025, tar.power = 0.8,
  Test = 'Z-pool',
  restricted = TRUE, weighted = FALSE
) %>% mutate(approach = "Restricted")

# Unrestricted approach
bssr_results_list[["Unrestricted"]] <- BinaryPowerBSSR(
  asmd.p1 = 0.45, asmd.p2 = 0.09,
  p = seq(0.1, 0.9, by = 0.1),
  Delta.A = 0.36, Delta.T = 0.36,
  N1 = 24, N2 = 24, omega = 0.5, r = 1,
  alpha = 0.025, tar.power = 0.8,
  Test = 'Z-pool',
  restricted = FALSE, weighted = FALSE
) %>% mutate(approach = "Unrestricted")

# Weighted approach
bssr_results_list[["Weighted"]] <- BinaryPowerBSSR(
  asmd.p1 = 0.45, asmd.p2 = 0.09,
  p = seq(0.1, 0.9, by = 0.1),
  Delta.A = 0.36, Delta.T = 0.36,
  N1 = 24, N2 = 24, omega = 0.5, r = 1,
  alpha = 0.025, tar.power = 0.8,
  Test = 'Z-pool',
  restricted = FALSE, weighted = TRUE
) %>% mutate(approach = "Weighted")

# Combine results
bssr_results <- do.call(rbind, bssr_results_list)

# Summary statistics
bssr_summary <- bssr_results %>%
  group_by(approach) %>%
  summarise(
    mean_power_bssr = mean(power.BSSR),
    mean_power_trad = mean(power.TRAD),
    min_power_bssr = min(power.BSSR),
    max_power_bssr = max(power.BSSR),
    .groups = 'drop'
  )
print(bssr_summary)
#> # A tibble: 3 × 5
#>   approach     mean_power_bssr mean_power_trad min_power_bssr max_power_bssr
#>   <chr>                  <dbl>           <dbl>          <dbl>          <dbl>
#> 1 Restricted             0.837           0.791          0.786          0.932
#> 2 Unrestricted           0.805           0.791          0.771          0.873
#> 3 Weighted               0.830           0.791          0.786          0.921
```
```r
# Create comprehensive power comparison with vertical layout
library(tidyr)     # for pivot_longer()
library(ggplot2)

power_data <- bssr_results %>%
  select(approach, p, power.BSSR, power.TRAD) %>%
  pivot_longer(
    cols = c(power.BSSR, power.TRAD),
    names_to = "design_type",
    values_to = "power"
  ) %>%
  mutate(
    design_type = case_when(
      design_type == "power.BSSR" ~ "BSSR",
      design_type == "power.TRAD" ~ "Traditional"
    ),
    approach = factor(approach, levels = c("Restricted", "Unrestricted", "Weighted"))
  )

ggplot(power_data, aes(x = p, y = power, color = design_type)) +
  geom_line(linewidth = 1.2) +
  facet_wrap(~approach, ncol = 1, scales = "free_y") +  # vertical layout
  geom_hline(yintercept = 0.8, linetype = "dashed", color = "gray") +
  scale_color_manual(
    values = c("BSSR" = "#1F78B4", "Traditional" = "#E31A1C"),
    name = "Design Type"
  ) +
  scale_x_continuous(
    breaks = seq(0.2, 0.8, by = 0.2),
    labels = c("0.2", "0.4", "0.6", "0.8")
  ) +
  scale_y_continuous(
    breaks = seq(0.7, 1.0, by = 0.1),
    labels = c("0.7", "0.8", "0.9", "1.0")
  ) +
  labs(
    x = "Pooled Response Rate (θ)",
    y = "Power",
    title = "Power Comparison: Traditional vs BSSR Designs",
    subtitle = "Horizontal dashed line shows target power = 0.8"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 14, hjust = 0.5, margin = margin(b = 5)),
    plot.subtitle = element_text(size = 11, hjust = 0.5, margin = margin(b = 15)),
    strip.text = element_text(size = 12, face = "bold", margin = margin(t = 8, b = 8)),
    strip.background = element_rect(fill = "gray95", color = "gray80"),
    legend.position = "bottom",
    legend.title = element_text(size = 11),
    legend.text = element_text(size = 10),
    legend.margin = margin(t = 10),
    axis.title.x = element_text(size = 11, margin = margin(t = 10)),
    axis.title.y = element_text(size = 11, margin = margin(r = 10)),
    axis.text = element_text(size = 9),
    panel.grid.minor = element_blank(),
    panel.grid.major = element_line(color = "gray92", linewidth = 0.5),
    plot.margin = margin(t = 10, r = 10, b = 10, l = 10)
  )
```
| Scenario | Recommended Test | Rationale |
|---|---|---|
| Small samples (n < 30 per group) | Boschloo | Most powerful exact test |
| Moderate samples (30-100 per group) | Z-pool | Good balance of power and computation |
| Large samples (n > 100 per group) | Chisq | Asymptotically optimal, fast |
| Regulatory submission | Fisher | Widely accepted, conservative |
| Exploratory analysis | Fisher-midP | Less conservative than Fisher |
| Priority | Recommended Approach | Rationale |
|---|---|---|
| Timeline certainty | Restricted | Guarantees study doesn’t extend |
| Statistical efficiency | Unrestricted | Optimal sample size adaptation |
| Robust performance | Weighted | Consistent across scenarios |
```r
# Sample size planning example
planning_scenarios <- expand.grid(
  p1 = c(0.4, 0.5, 0.6),
  p2 = c(0.2, 0.3),
  test = c('Fisher', 'Z-pool', 'Boschloo')
) %>%
  filter(p1 > p2)

# Calculate sample sizes for each scenario
sample_size_results <- list()
for (i in 1:nrow(planning_scenarios)) {
  result <- BinarySampleSize(
    p1 = planning_scenarios$p1[i],
    p2 = planning_scenarios$p2[i],
    r = 1,
    alpha = 0.025,
    tar.power = 0.8,
    Test = planning_scenarios$test[i]
  )
  sample_size_results[[i]] <- result
}

# Combine results
final_results <- do.call(rbind, sample_size_results)
final_results <- final_results[, c("p1", "p2", "Test", "N1", "N2", "N", "Power")]
print(final_results)
#>     p1  p2     Test  N1  N2   N     Power
#> 1  0.4 0.2   Fisher  90  90 180 0.8016798
#> 2  0.5 0.2   Fisher  44  44  88 0.8020894
#> 3  0.6 0.2   Fisher  27  27  54 0.8024322
#> 4  0.4 0.3   Fisher 375 375 750 0.8010219
#> 5  0.5 0.3   Fisher 102 102 204 0.8061477
#> 6  0.6 0.3   Fisher  48  48  96 0.8004594
#> 7  0.4 0.2   Z-pool  84  84 168 0.8035668
#> 8  0.5 0.2   Z-pool  40  40  80 0.8096513
#> 9  0.6 0.2   Z-pool  23  23  46 0.8088250
#> 10 0.4 0.3   Z-pool 359 359 718 0.8001135
#> 11 0.5 0.3   Z-pool  95  95 190 0.8007528
#> 12 0.6 0.3   Z-pool  44  44  88 0.8010988
#> 13 0.4 0.2 Boschloo  84  84 168 0.8023435
#> 14 0.5 0.2 Boschloo  40  40  80 0.8096508
#> 15 0.6 0.2 Boschloo  23  23  46 0.8088248
#> 16 0.4 0.3 Boschloo 360 360 720 0.8004597
#> 17 0.5 0.3 Boschloo  95  95 190 0.8007528
#> 18 0.6 0.3 Boschloo  44  44  88 0.8010988
```
The exact tests in `bbssr` (Fisher, Z-pool, and Boschloo) maintain exact Type I error control:
```r
# Demonstrate Type I error control under null hypothesis
null_powers <- sapply(c('Fisher', 'Z-pool', 'Boschloo'), function(test) {
  BinaryPower(p1 = 0.3, p2 = 0.3, N1 = 30, N2 = 30, alpha = 0.025, Test = test)
})
names(null_powers) <- c('Fisher', 'Z-pool', 'Boschloo')
print("Type I error rates under null hypothesis:")
#> [1] "Type I error rates under null hypothesis:"
print(round(null_powers, 4))
#>   Fisher   Z-pool Boschloo
#>   0.0131   0.0208   0.0183
```
All values should be ≤ 0.025, confirming exact Type I error control.
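Because the attained size depends on the unknown null response rate, it is also worth scanning a grid of null rates; a quick sketch using the same `BinaryPower()` call:

```r
# Sketch: attained size of the Boschloo test across several null response rates.
null_p <- seq(0.1, 0.9, by = 0.2)
size_boschloo <- sapply(null_p, function(p) {
  BinaryPower(p1 = p, p2 = p, N1 = 30, N2 = 30, alpha = 0.025, Test = 'Boschloo')
})
max(size_boschloo) <= 0.025   # TRUE if exact control holds across the grid
```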
For regulatory submissions, document:

1. **Rationale for BSSR**: Why an adaptive design is appropriate
2. **Test selection**: Justification for the chosen statistical test
3. **Design approach**: Restricted vs. unrestricted rationale
4. **Simulation studies**: Demonstrate operating characteristics
5. **Implementation plan**: Detailed interim analysis procedures
```r
# Compare different allocation ratios
ratios <- c(1, 2, 3)
ratio_results <- sapply(ratios, function(r) {
  result <- BinarySampleSize(
    p1 = 0.5, p2 = 0.3, r = r,
    alpha = 0.025, tar.power = 0.8,
    Test = 'Boschloo'
  )
  c(N1 = result$N1, N2 = result$N2, N_total = result$N)
})
colnames(ratio_results) <- paste0("r=", ratios)
print("Sample sizes for different allocation ratios:")
#> [1] "Sample sizes for different allocation ratios:"
print(ratio_results)
#>         r=1 r=2 r=3
#> N1       95 142 189
#> N2       95  71  63
#> N_total 190 213 252
```
```r
# Sensitivity analysis for key parameters
sensitivity_data <- expand.grid(
  omega = c(0.3, 0.5, 0.7),
  alpha = c(0.01, 0.025, 0.05)
) %>%
  rowwise() %>%
  mutate(
    avg_power = mean(BinaryPowerBSSR(
      asmd.p1 = 0.45, asmd.p2 = 0.09,
      p = seq(0.2, 0.8, by = 0.1),
      Delta.A = 0.36, Delta.T = 0.36,
      N1 = 24, N2 = 24, omega = omega, r = 1,
      alpha = alpha, tar.power = 0.8,
      Test = 'Z-pool',
      restricted = FALSE, weighted = FALSE
    )$power.BSSR)
  )
print("Sensitivity analysis results:")
#> [1] "Sensitivity analysis results:"
print(sensitivity_data)
#> # A tibble: 9 × 3
#> # Rowwise:
#>   omega alpha avg_power
#>   <dbl> <dbl>     <dbl>
#> 1   0.3 0.01      0.796
#> 2   0.5 0.01      0.798
#> 3   0.7 0.01      0.802
#> 4   0.3 0.025     0.796
#> 5   0.5 0.025     0.805
#> 6   0.7 0.025     0.811
#> 7   0.3 0.05      0.807
#> 8   0.5 0.05      0.814
#> 9   0.7 0.05      0.823
```
The `bbssr` package provides a comprehensive toolkit for implementing blinded sample size re-estimation in clinical trials with binary endpoints. The choice of statistical test and design approach should be based on:

- The expected sample size per group
- Regulatory requirements and reviewer expectations
- Available computational resources
- The relative priority of timeline certainty versus statistical efficiency
All methods maintain exact statistical validity while providing the flexibility needed for efficient clinical trial conduct.