The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Robust test statistics with semTests

Introduction

semTests computes robust p-values for structural equation models fit with lavaan. The standard chi-square test of model fit is anti-conservative under non-normality, and the usual corrections (Satorra–Bentler and friends) only partly fix it. The eigenvalue-based tests in this package target the limiting null law of the test statistic directly: that law is a weighted sum of independent chi-squares, \(\sum_j \lambda_j \chi^2_1\), whose weights \(\lambda_j\) are the eigenvalues of a matrix built from the model and the asymptotic covariance of the sample moments. Because this holds for any minimum-discrepancy estimator, the tests are not specific to normal-theory maximum likelihood.

The penalized eigenvalue block-averaging (pEBA) and penalized regression (pOLS) methods were introduced by Foldnes, Moss, & Grønneberg (2024) and extended to nested model comparison by Foldnes, Grønneberg, & Moss (2026).

library("semTests")
library("lavaan")

Basic usage: goodness of fit

Fit a model with lavaan, then call pvalues(). Fit with a robust test (e.g. estimator = "MLM" or "MLR") so that lavaan exposes the asymptotic moment covariance the correction needs.

model <- "visual  =~ x1 + x2 + x3
          textual =~ x4 + x5 + x6
          speed   =~ x7 + x8 + x9"
fit <- cfa(model, HolzingerSwineford1939, estimator = "MLM")
pvalues(fit)
#>    peba4_rls 
#> 5.165737e-07 
#> estimator: ML | data: continuous | information: expected | df: 24

The result is a named vector that also prints a one-line provenance footer recording the estimator, data type, information matrix, and degrees of freedom actually used. The full option record is available with attr(x, "semtests").

The test-name grammar

Each requested test is a string of the form (test)_(ug?)_(ml|rls):

Several tests can be requested at once:

pvalues(fit, c("SB_ML", "pEBA4_ML", "pOLS2_ML"))
#>        sb_ml     peba4_ml     pols2_ml 
#> 4.416190e-08 1.529685e-07 1.519741e-07 
#> estimator: ML | data: continuous | information: expected | df: 24

The UG gamma and the RLS statistic are defined only for the classical case (continuous, complete data, estimator = "ML"); off that case they are refused with an informative error rather than silently degrading.

Nested model comparison

To compare two nested models, fit both and pass them to pvalues_nested() with the constrained model first. Here we test weak invariance of the textual loadings:

constrained <- "visual  =~ x1 + x2 + x3
                textual =~ a*x4 + a*x5 + a*x6
                speed   =~ x7 + x8 + x9"
m1 <- cfa(model, HolzingerSwineford1939, estimator = "MLM")
m0 <- cfa(constrained, HolzingerSwineford1939, estimator = "MLM")
pvalues_nested(m0, m1)
#> pall_ug_ml 
#>  0.0166158 
#> estimator: ML | data: continuous | information: expected | df: 2 | nested (method 2000)

pall is the recommended default for nested comparison (Foldnes, Grønneberg, & Moss, 2026). The reduction method defaults to "2000" (Satorra 2000); "2001" is also available for complete-data fits.

Other estimators and missing data (experimental)

The same machinery applies to other estimators. The non-ML paths below are experimental as of this release; the classical ML path is stable. See ?semTests-support for the authoritative matrix of what is supported.

GLS and ULS work directly (ULS needs a robust test for the asymptotic covariance):

gls <- cfa(model, HolzingerSwineford1939, estimator = "GLS")
pvalues(gls, "pEBA4")
#>     peba4_ml 
#> 8.854857e-07 
#> estimator: GLS | data: continuous | information: expected | df: 24

uls <- cfa(model, HolzingerSwineford1939, estimator = "ULS",
           test = "satorra.bentler")
pvalues(uls, "pEBA4")
#>     peba4_ml 
#> 2.527109e-08 
#> estimator: ULS | data: continuous | information: expected | df: 24

Missing data is handled through full-information maximum likelihood. Fit with missing = "fiml" and pass the fit to pvalues() as usual. FIML uses the biased gamma and the standard statistic, so the UG/RLS suffixes do not apply; the default suffix-free tests are the right choice.

HS <- HolzingerSwineford1939
set.seed(1)
HS$x1[sample(nrow(HS), 60)] <- NA
fit_fiml <- cfa(model, HS, missing = "fiml", estimator = "MLR")
pvalues(fit_fiml)
#>     peba4_ml 
#> 2.845788e-08 
#> estimator: ML (FIML) | data: continuous | information: observed | df: 24

FIML inference rests on the observed information under MAR (lavaan’s default for missing = "fiml"); a fit using expected information triggers a warning, since that is valid only under the stronger MCAR assumption.

References

Du, H., & Bentler, P. M. (2022). 40-Year Old Unbiased Distribution Free Estimator Reliably Improves SEM Statistics for Nonnormal Data. Structural Equation Modeling, 29(6), 872–887. https://doi.org/10.1080/10705511.2022.2063870

Foldnes, N., Moss, J., & Grønneberg, S. (2024). Improved goodness of fit procedures for structural equation models. Structural Equation Modeling: A Multidisciplinary Journal, 1–13. https://doi.org/10.1080/10705511.2024.2372028

Foldnes, N., Grønneberg, S., & Moss, J. (2026). Penalized eigenvalue block averaging: Extension to nested model comparison and Monte Carlo evaluations. Behavior Research Methods. https://doi.org/10.3758/s13428-026-02968-4

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.