Method comparison studies are fundamental in clinical laboratories
and biotech research. When introducing a new analytical method, we must
demonstrate that it produces results comparable to an established
reference method. This vignette walks through a complete method
comparison workflow using the valytics package.
The statistical approaches implemented in valytics
follow well-established methodology from the clinical chemistry
literature. We focus on two complementary techniques: Bland-Altman
analysis for assessing agreement and Passing-Bablok regression for
evaluating systematic differences.
We will use the glucose_methods dataset included in the
package. This dataset contains paired measurements from a point-of-care
(POC) glucose meter and a laboratory reference analyzer on 60 patient
samples.
library(valytics)
library(ggplot2)
data("glucose_methods")
head(glucose_methods)
#> sample_id reference poc_meter
#> 1 GLU026 118 131
#> 2 GLU022 113 110
#> 3 GLU043 83 77
#> 4 GLU005 51 57
#> 5 GLU016 112 121
#> 6 GLU010 77 77
Before diving into statistical analysis, it is always good practice to visualize the raw data:
ggplot(glucose_methods, aes(x = reference, y = poc_meter)) +
  geom_point(alpha = 0.7) +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray50") +
  labs(
    x = "Reference Method (mg/dL)",
    y = "POC Meter (mg/dL)",
    title = "Glucose Method Comparison"
  ) +
  coord_fixed() +
  theme_minimal()
Scatter plot of POC vs laboratory glucose measurements with identity line.
The points cluster around the identity line, suggesting reasonable agreement. Now let us quantify this agreement using appropriate statistical methods.
Bland-Altman analysis, introduced by Bland and Altman (1986), assesses agreement between two measurement methods by examining the differences between paired measurements. Rather than correlation, which can be misleading for method comparison, this approach focuses on clinically meaningful questions: How large are the differences? Is there systematic bias?
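To illustrate why correlation alone can mislead, the following minimal sketch uses made-up values (hypothetical, not taken from the package). A new method that reads exactly 20 units higher than the reference correlates perfectly with it, yet the paired differences expose the bias immediately.

```r
# Hypothetical readings, for illustration only (not package data)
reference  <- c(60, 85, 110, 150, 220, 300)
new_method <- reference + 20   # constant offset of 20 units

# Correlation measures linear association, not agreement
r <- cor(reference, new_method)

# Bland-Altman analysis works with the paired differences (y - x)
differences <- new_method - reference
bias <- mean(differences)

cat("Correlation:", r, "\n")
cat("Mean difference (bias):", bias, "\n")
```

A correlation close to 1 therefore says nothing about whether two methods can be used interchangeably; the size of the differences does.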
The ba_analysis() function accepts paired measurements
as vectors or via a formula interface:
# Vector interface
ba <- ba_analysis(
  x = glucose_methods$reference,
  y = glucose_methods$poc_meter
)
# Alternative: formula interface
# ba <- ba_analysis(reference ~ poc_meter, data = glucose_methods)
ba
#>
#> Bland-Altman Analysis
#> ----------------------------------------
#> n = 60 paired observations
#>
#> Difference type: Absolute (y - x)
#> Confidence level: 95%
#>
#> Results:
#> Bias (mean difference): 5.700
#> 95% CI: [3.539, 7.861]
#> SD of differences: 8.365
#>
#> Limits of Agreement:
#> Lower LoA: -10.695
#> 95% CI: [-14.409, -6.982]
#> Upper LoA: 22.095
#> 95% CI: [18.382, 25.809]
The print output shows the mean difference (bias) and the 95% limits of agreement (LoA). If the differences are approximately normally distributed, about 95% of differences between the two methods are expected to fall within these limits.
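The arithmetic behind these numbers can be sketched from the printed summary statistics alone. The multiplier 1.96 and the standard-error formulas below are assumptions of this sketch (conventional choices from the Bland-Altman literature), not a statement about the package internals; with the printed bias and SD they reproduce the values above to within rounding.

```r
# Summary statistics as printed in the output above
n    <- 60
bias <- 5.700
sd_d <- 8.365   # SD of the paired differences

# Limits of agreement: bias +/- 1.96 * SD (assumed multiplier)
z   <- 1.96
loa <- bias + c(-1, 1) * z * sd_d

# 95% CI for the bias, based on the t distribution with n - 1 df
t_crit  <- qt(0.975, df = n - 1)
se_bias <- sd_d / sqrt(n)
ci_bias <- bias + c(-1, 1) * t_crit * se_bias

# Approximate standard error of a limit of agreement (assumed formula)
se_loa       <- sd_d * sqrt(1 / n + z^2 / (2 * (n - 1)))
ci_lower_loa <- loa[1] + c(-1, 1) * t_crit * se_loa
ci_upper_loa <- loa[2] + c(-1, 1) * t_crit * se_loa

print(round(loa, 3))
print(round(ci_bias, 3))
print(round(ci_lower_loa, 3))
print(round(ci_upper_loa, 3))
```

Note that the confidence intervals around the limits are wider than the one around the bias, because the limits also carry the uncertainty of the SD estimate.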
The summary() method provides additional statistical
details:
summary(ba)
#>
#> Bland-Altman Analysis - Detailed Summary
#> ==================================================
#>
#> Call:
#> ba_analysis(x = glucose_methods$reference, y = glucose_methods$poc_meter)
#>
#> Sample size: n = 60
#> Variables: x = 'x', y = 'y'
#> Difference type: Absolute (y - x)
#> Confidence level: 95%
#>
#> --------------------------------------------------
#> Descriptive Statistics:
#> --------------------------------------------------
#> Variable N Mean SD Median Min Max
#> x 60 131.0 73.54 111 48 323
#> y 60 136.7 74.19 118 54 342
#>
#> --------------------------------------------------
#> Agreement Statistics:
#> --------------------------------------------------
#> Statistic Estimate CI_Lower_95% CI_Upper_95%
#> Bias 5.7 3.539 7.861
#> Lower LoA -10.7 -14.409 -6.982
#> Upper LoA 22.1 18.382 25.809
#>
#> SD of differences: 8.3652
#>
#> --------------------------------------------------
#> Normality of Differences (Shapiro-Wilk test):
#> --------------------------------------------------
#> W = 0.9567, p-value = 3.25e-02
#> Note: p < 0.05 suggests differences may not be normally distributed.
#> Consider inspecting the Bland-Altman plot for patterns.
Key outputs to examine:
The Bland-Altman plot displays differences against the average of paired measurements:
Bland-Altman plot showing bias and 95% limits of agreement.
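For readers who want to see how such a plot is put together, here is a minimal base R sketch. It uses only the six pairs shown by head(glucose_methods) earlier, so the lines it draws will not match the full-data figure, and the 1.96 multiplier is an assumption of the sketch.

```r
# First six pairs shown by head(glucose_methods) above
reference <- c(118, 113, 83, 51, 112, 77)
poc_meter <- c(131, 110, 77, 57, 121, 77)

pair_mean <- (reference + poc_meter) / 2   # x-axis: average of the two methods
pair_diff <- poc_meter - reference         # y-axis: difference (y - x)

bias <- mean(pair_diff)
loa  <- bias + c(-1, 1) * 1.96 * sd(pair_diff)

plot(pair_mean, pair_diff,
     xlab = "Mean of methods (mg/dL)",
     ylab = "Difference: POC - reference (mg/dL)",
     ylim = range(c(pair_diff, loa)),
     pch  = 19)
abline(h = bias, lty = 1)                  # bias line
abline(h = loa,  lty = 2)                  # limits of agreement
abline(h = 0,    lty = 3, col = "gray50")  # line of no difference
```

Plotting against the pair mean rather than against the reference alone avoids building a spurious trend into the differences.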
This plot reveals several important features:
For publication-quality figures, you can use autoplot()
with additional ggplot2 customization:
Customized Bland-Altman plot.
When the magnitude of measurements varies widely, percentage differences can be more informative than absolute differences:
ba_pct <- ba_analysis(
  x = glucose_methods$reference,
  y = glucose_methods$poc_meter,
  type = "percent"
)
ba_pct
#>
#> Bland-Altman Analysis
#> ----------------------------------------
#> n = 60 paired observations
#>
#> Difference type: Percent (y - x)
#> Confidence level: 95%
#>
#> Results:
#> Bias (mean difference): 5.035
#> 95% CI: [3.516, 6.555]
#> SD of differences: 5.882
#>
#> Limits of Agreement:
#> Lower LoA: -6.494
#> 95% CI: [-9.105, -3.882]
#> Upper LoA: 16.564
#> 95% CI: [13.952, 19.175]
Bland-Altman plot with percentage differences.
Percentage-based LoA are particularly useful when acceptable differences scale with measurement magnitude.
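As a rough illustration of what a percentage difference means, the sketch below expresses each difference relative to the mean of the pair. That denominator is an assumption of this sketch (it is a common convention); the package documentation is the authority on how type = "percent" is actually computed. The data are the six pairs shown by head(glucose_methods) earlier.

```r
# First six pairs shown by head(glucose_methods) above
reference <- c(118, 113, 83, 51, 112, 77)
poc_meter <- c(131, 110, 77, 57, 121, 77)

# Percentage difference relative to the pair mean (assumed convention)
pair_mean <- (reference + poc_meter) / 2
pct_diff  <- 100 * (poc_meter - reference) / pair_mean

bias_pct <- mean(pct_diff)
loa_pct  <- bias_pct + c(-1, 1) * 1.96 * sd(pct_diff)

print(round(pct_diff, 2))
print(round(c(bias = bias_pct, lower = loa_pct[1], upper = loa_pct[2]), 2))
```

On this scale a 6 mg/dL difference at a low concentration counts for more than a 13 mg/dL difference at a higher one, which is the point of switching to percentages.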
While Bland-Altman analysis assesses overall agreement, Passing-Bablok regression (1983) specifically addresses two questions: Is there a constant bias (intercept different from 0)? Is there a proportional bias (slope different from 1)?
This non-parametric regression method is robust to outliers and does not assume that measurement errors occur in only one method, making it well-suited for method comparison studies.
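The point estimate can be sketched in a few lines of base R. This is a simplified illustration of the shifted-median idea described by Passing and Bablok (1983), not the code used by pb_regression(); the function name passing_bablok_sketch is hypothetical, and the six pairs are those shown by head(glucose_methods) earlier, far too few for a real analysis.

```r
# Simplified Passing-Bablok point estimate (illustrative sketch only)
passing_bablok_sketch <- function(x, y) {
  idx <- combn(length(x), 2)            # all pairs i < j
  dx  <- x[idx[2, ]] - x[idx[1, ]]
  dy  <- y[idx[2, ]] - y[idx[1, ]]

  # Pairwise slopes; identical points (0/0) are undefined and dropped
  slopes <- (dy / dx)[!(dx == 0 & dy == 0)]
  # Slopes of exactly -1 are excluded, then the rest are sorted
  slopes <- sort(slopes[slopes != -1])

  n_s <- length(slopes)
  k   <- sum(slopes < -1)               # offset that shifts the median

  # Shifted median of the sorted slopes
  if (n_s %% 2 == 1) {
    slope <- slopes[(n_s + 1) / 2 + k]
  } else {
    slope <- mean(slopes[n_s / 2 + k + c(0, 1)])
  }

  # Intercept: median of the residual offsets
  intercept <- median(y - slope * x)
  c(intercept = intercept, slope = slope)
}

reference <- c(118, 113, 83, 51, 112, 77)
poc_meter <- c(131, 110, 77, 57, 121, 77)
fit <- passing_bablok_sketch(reference, poc_meter)
print(fit)
```

Because the estimate is a median of pairwise slopes, a single extreme pair (such as the slope of -11 between two of these samples) has little influence on the result.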
pb <- pb_regression(
  x = glucose_methods$reference,
  y = glucose_methods$poc_meter
)
pb
#>
#> Passing-Bablok Regression
#> ----------------------------------------
#> n = 60 paired observations
#>
#> CI method: Analytical (Passing-Bablok 1983)
#> Confidence level: 95%
#>
#> Regression equation:
#> glucose_methods$poc_meter = 2.861 + 1.028 * glucose_methods$reference
#>
#> Results:
#> Intercept: 2.861
#> 95% CI: [2.074, 4.605]
#> (excludes 0: significant constant bias)
#>
#> Slope: 1.028
#> 95% CI: [1.013, 1.037]
#> (excludes 1: significant proportional bias)
summary(pb)
#>
#> Passing-Bablok Regression - Detailed Summary
#> ==================================================
#>
#> Data:
#> X variable: glucose_methods$reference
#> Y variable: glucose_methods$poc_meter
#> Sample size: 60
#>
#> Settings:
#> Confidence level: 95%
#> CI method: Analytical (Passing-Bablok 1983)
#>
#> Regression Coefficients:
#> --------------------------------------------------
#> Estimate 95% Lower 95% Upper
#> Intercept 2.8611 2.0741 4.6051
#> Slope 1.0278 1.0127 1.0370
#>
#> Regression equation:
#> glucose_methods$poc_meter = 2.8611 + 1.0278 * glucose_methods$reference
#>
#> Linearity Test (CUSUM):
#> --------------------------------------------------
#> Test statistic: 0.9834
#> Critical value (alpha = 0.05): 1.36
#> p-value: 0.2882
#> Result: Linearity assumption is satisfied (p >= 0.05)
#>
#> Interpretation:
#> --------------------------------------------------
#> Intercept: CI excludes 0 (2.074 to 4.605)
#> -> Significant positive constant bias of 2.861
#> Slope: CI excludes 1 (1.013 to 1.037)
#> -> Significant proportional bias of 2.8%
#>
#> Conclusion:
#> --------------------------------------------------
#> The two methods show SYSTEMATIC DIFFERENCES:
#> - Constant bias: 2.861 glucose_methods$poc_meter
#> - Proportional bias: 2.8%
#>
#> Residuals (perpendicular):
#> --------------------------------------------------
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> -20.804423 -2.702250 -0.009685 -0.558530 2.416529 14.315148
The summary provides hypothesis tests and confidence intervals for the regression parameters:
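When the confidence interval for the intercept includes 0 and the confidence interval for the slope includes 1, there is no statistically significant constant or proportional bias, and the methods are considered equivalent within the measured range.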
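This decision rule is simple enough to write down directly. The helper ci_includes below is a hypothetical name used only for this sketch; the interval bounds are the analytical 95% CIs printed by summary(pb) above.

```r
# TRUE when `value` lies inside the closed interval [lower, upper]
ci_includes <- function(lower, upper, value) {
  lower <= value && value <= upper
}

# Analytical 95% CIs as printed by summary(pb) above
intercept_ci <- c(lower = 2.0741, upper = 4.6051)
slope_ci     <- c(lower = 1.0127, upper = 1.0370)

# Constant bias: the intercept CI excludes 0
constant_bias <- !ci_includes(intercept_ci[["lower"]], intercept_ci[["upper"]], 0)
# Proportional bias: the slope CI excludes 1
proportional_bias <- !ci_includes(slope_ci[["lower"]], slope_ci[["upper"]], 1)

cat("Constant bias detected:    ", constant_bias, "\n")
cat("Proportional bias detected:", proportional_bias, "\n")
```

Statistical significance is only part of the picture: whether a bias of this size matters depends on the clinically acceptable performance criteria for the analyte.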
The scatter plot shows the fitted regression line with confidence band:
Passing-Bablok regression with 95% confidence band.
The dashed identity line (y = x) serves as a reference. If the methods were in perfect agreement, the regression line would coincide with the identity line.
Residual plots help assess model assumptions:
Perpendicular residuals from Passing-Bablok regression.
Residuals should scatter randomly around zero without obvious patterns. A trend suggests that the linearity assumption is violated, while a spread that changes with concentration (heteroscedasticity) indicates non-constant measurement error.
The CUSUM plot provides a visual assessment of linearity:
CUSUM plot for linearity assessment.
Points should remain within the boundary lines if the linear model is appropriate. Deviations suggest non-linear relationships that may require transformation or alternative modeling approaches.
For smaller sample sizes, or when the assumptions behind the analytical confidence intervals are questionable, bootstrap confidence intervals provide a robust alternative:
pb_boot <- pb_regression(
  x = glucose_methods$reference,
  y = glucose_methods$poc_meter,
  ci_method = "bootstrap",
  boot_n = 1999
)
summary(pb_boot)
#>
#> Passing-Bablok Regression - Detailed Summary
#> ==================================================
#>
#> Data:
#> X variable: glucose_methods$reference
#> Y variable: glucose_methods$poc_meter
#> Sample size: 60
#>
#> Settings:
#> Confidence level: 95%
#> CI method: Bootstrap BCa (n = 1999)
#>
#> Regression Coefficients:
#> --------------------------------------------------
#> Estimate 95% Lower 95% Upper
#> Intercept 2.8611 -2.9388 6.0000
#> Slope 1.0278 0.9867 1.0577
#>
#> Regression equation:
#> glucose_methods$poc_meter = 2.8611 + 1.0278 * glucose_methods$reference
#>
#> Linearity Test (CUSUM):
#> --------------------------------------------------
#> Test statistic: 0.9834
#> Critical value (alpha = 0.05): 1.36
#> p-value: 0.2882
#> Result: Linearity assumption is satisfied (p >= 0.05)
#>
#> Interpretation:
#> --------------------------------------------------
#> Intercept: CI includes 0
#> -> No significant constant (additive) bias
#> Slope: CI includes 1
#> -> No significant proportional (multiplicative) bias
#>
#> Conclusion:
#> --------------------------------------------------
#> The two methods are EQUIVALENT within the measured range.
#> No systematic differences detected.
#>
#> Residuals (perpendicular):
#> --------------------------------------------------
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
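#> -20.804423 -2.702250 -0.009685 -0.558530 2.416529 14.315148
The BCa (bias-corrected and accelerated) bootstrap method adjusts for potential bias and skewness in the bootstrap distribution. In this example the bootstrap intervals are wider than the analytical ones: the intercept CI [-2.939, 6.000] now includes 0 and the slope CI [0.987, 1.058] includes 1, so the conclusion changes from systematic differences to equivalence.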
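The general resampling idea can be sketched as follows. Note the substitutions: this sketch uses a plain percentile bootstrap rather than BCa, a simplified slope estimator rather than pb_regression(), and simulated data rather than glucose_methods. The names pb_slope and boot_slopes are hypothetical.

```r
set.seed(42)  # reproducible resampling

# Simulated paired data, for illustration only (not package data)
n <- 40
x <- runif(n, min = 50, max = 300)
y <- 2 + 1.03 * x + rnorm(n, sd = 5)

# Passing-Bablok-style slope: shifted median of all pairwise slopes
pb_slope <- function(x, y) {
  idx <- combn(length(x), 2)
  dx  <- x[idx[2, ]] - x[idx[1, ]]
  dy  <- y[idx[2, ]] - y[idx[1, ]]
  s   <- (dy / dx)[!(dx == 0 & dy == 0)]  # drop undefined 0/0 slopes
  s   <- sort(s[s != -1])
  if (length(s) == 0) return(NA_real_)
  k   <- sum(s < -1)
  m   <- length(s)
  pos <- if (m %% 2 == 1) (m + 1) / 2 + k else m / 2 + k + c(0, 1)
  mean(s[pos])
}

# Resample pairs with replacement and re-estimate the slope each time
boot_n <- 999
boot_slopes <- replicate(boot_n, {
  i <- sample.int(n, replace = TRUE)
  pb_slope(x[i], y[i])
})

# Percentile interval: 2.5% and 97.5% quantiles of the bootstrap slopes
estimate <- pb_slope(x, y)
ci <- quantile(boot_slopes, probs = c(0.025, 0.975), na.rm = TRUE)
print(estimate)
print(ci)
```

The pairs are resampled together, never x and y separately, so that each bootstrap sample preserves the pairing between the two methods.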
A complete method comparison report typically includes both analyses. Here is a summary workflow:
# 1. Load and inspect data
data("glucose_methods")
# 2. Bland-Altman analysis for agreement assessment
ba <- ba_analysis(reference ~ poc_meter, data = glucose_methods)
summary(ba)
plot(ba)
# 3. Passing-Bablok regression for systematic differences
pb <- pb_regression(reference ~ poc_meter, data = glucose_methods)
summary(pb)
plot(pb, type = "scatter")
plot(pb, type = "cusum")
# 4. Document conclusions
# - Bias and LoA from Bland-Altman
# - Slope and intercept CIs from Passing-Bablok
# - Clinical interpretation based on acceptable performance criteria
Both ba_analysis() and pb_regression() handle missing values through the na_action parameter:
# Create data with missing values for demonstration
glucose_missing <- glucose_methods
glucose_missing$poc_meter[c(5, 15, 25)] <- NA
# Default behavior: remove pairs with missing values
ba_complete <- ba_analysis(
  reference ~ poc_meter,
  data = glucose_missing,
  na_action = "omit"
)
# Require complete cases (will error if any NA present)
# ba_strict <- ba_analysis(
#   reference ~ poc_meter,
#   data = glucose_missing,
#   na_action = "fail"
# )
The package includes two additional datasets for exploring different scenarios:
# Creatinine: enzymatic vs Jaffe methods
data("creatinine_serum")
head(creatinine_serum)
#> sample_id enzymatic jaffe
#> 1 CREAT056 2.10 2.46
#> 2 CREAT022 1.20 1.47
#> 3 CREAT050 2.50 2.34
#> 4 CREAT024 0.83 1.07
#> 5 CREAT063 1.56 1.65
#> 6 CREAT039 0.58 0.77
# High-sensitivity troponin: two immunoassay platforms
data("troponin_cardiac")
head(troponin_cardiac)
#> sample_id platform_a platform_b
#> 1 TROP020 111.0 92.0
#> 2 TROP023 42.9 33.4
#> 3 TROP008 3.8 3.7
#> 4 TROP002 33.7 31.9
#> 5 TROP011 910.0 750.0
#> 6 TROP031 52.0 45.8
These datasets represent different analytical challenges: the creatinine data includes known interferences affecting the Jaffe method at low concentrations, while the troponin data covers a wide dynamic range typical of cardiac biomarkers.
Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307-310.
Bland JM, Altman DG. Measuring agreement in method comparison studies. Statistical Methods in Medical Research. 1999;8(2):135-160.
Passing H, Bablok W. A new biometrical procedure for testing the equality of measurements from two different analytical methods. Journal of Clinical Chemistry and Clinical Biochemistry. 1983;21(11):709-720.
Passing H, Bablok W. Comparison of several regression procedures for method comparison studies and determination of sample sizes. Journal of Clinical Chemistry and Clinical Biochemistry. 1984;22(6):431-445.