Getting Started with smsncut

The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Renato de Paula, Helena Mouriño, Tiago Dias Domingues

2026-05-14

Overview

smsncut implements the parametric decision-theoretic framework for optimal diagnostic cutoff selection under scale mixtures of skew-normal (SMSN) distributions, described in:

de Paula, R., Mouriño, H., and Dias Domingues, T. (2026). Parametric ROC analysis and optimal cutoff selection under scale mixtures of skew-normal distributions: a decision-theoretic framework with asymptotic inference. arXiv:2605.07829. https://doi.org/10.48550/arXiv.2605.07829.

The package supports the skew-normal (SN) and skew-t (ST) families and provides tools for model fitting, ROC analysis, optimal cutoff estimation, asymptotic inference, and Monte Carlo validation.

1. Unconstrained parametrisation

All functions use the unconstrained vector:

Family	`theta`	Natural parameters
SN	`c(xi, log(omega), alpha)`	location, scale, skewness
ST	`c(xi, log(omega), alpha, log(nu-2))`	+ degrees of freedom

theta_sn <- c(0, log(1), 1.5)
smsn_density(c(-1, 0, 1, 2), theta_sn, "SN")
#> [1] 0.03233077 0.39894228 0.45161068 0.10783617

theta_st <- c(0, log(1), 1.5, log(8 - 2))
smsn_density(c(-1, 0, 1, 2), theta_st, "ST")
#> [1] 0.03820408 0.38669902 0.41701108 0.12293816

2. Model fitting

smsn_fit uses BFGS for the MLE and Richardson extrapolation (via numDeriv::hessian) for the observed Fisher information matrix, which is critical for accurate variance estimation (de Paula et al., 2026, Section 6.2).

set.seed(42)
x0 <- sn::rst(300, xi = 0, omega = 1, alpha = 1, nu = 8)
x1 <- sn::rsn(300, xi = 2, omega = 1, alpha = 1.5)

fit0 <- smsn_fit(x0, family = "ST")
fit1 <- smsn_fit(x1, family = "SN")
cat("Group 0 (ST) MLE:", round(fit0$par, 3), "\n")
#> Group 0 (ST) MLE: -0.224 0.114 1.619 2.648
cat("Group 1 (SN) MLE:", round(fit1$par, 3), "\n")
#> Group 1 (SN) MLE: 2.085 -0.079 1.198

3. Admissible interval and boundary conditions

ab <- cutoff_admissible_interval(
  fit0$par, fit1$par, family0 = "ST", family1 = "SN"
)
cat("Interval: [", round(ab["a"], 3), ",", round(ab["b"], 3), "]\n")
#> Interval: [ -1.457 , 4.678 ]

bc <- cutoff_boundary_check(
  ab["a"], ab["b"], fit0$par, fit1$par, "ST", "SN",
  lambda0 = 1, lambda1 = 3, pi0 = 0.9, pi1 = 0.1
)
cat("Boundary conditions satisfied:", bc$satisfied, "\n")
#> Boundary conditions satisfied: TRUE

4. Optimal cutoff

res <- cutoff_optimal(
  fit0$par, fit1$par, "ST", "SN",
  lambda0 = 1, lambda1 = 3, pi0 = 0.9, pi1 = 0.1,
  interval = ab
)
cat("Optimal cutoff c*:", round(res$cutoff, 4), "\n")
#> Optimal cutoff c*: 1.9301
cat("Risk at c*:       ", round(res$risk,   4), "\n")
#> Risk at c*:        0.1131
cat("Youden cutoff:    ", round(cutoff_youden(fit0$par, fit1$par, "ST", "SN",
                                               interval = ab), 4), "\n")
#> Youden cutoff:     1.5845

5. Local identifiability diagnostic

diag_val <- cutoff_identifiability(
  res$cutoff, fit0$par, fit1$par, "ST", "SN",
  lambda0 = 1, lambda1 = 3, pi0 = 0.9, pi1 = 0.1
)
cat("|d phi/dc| at c* =", round(diag_val, 4), "\n")
#> |d phi/dc| at c* = 0.3068

6. Asymptotic variance and Wald confidence interval

n0 <- length(x0); n1 <- length(x1)

V_hat <- cutoff_variance(
  res$cutoff, fit0$par, fit1$par, "ST", "SN",
  lambda0 = 1, lambda1 = 3, pi0 = 0.9, pi1 = 0.1,
  hess0 = fit0$hessian, hess1 = fit1$hessian, n0 = n0, n1 = n1
)

ci <- cutoff_ci(res$cutoff, V_hat, n = n0 + n1)
cat("95% Wald CI: [", round(ci["lower"], 4), ",", round(ci["upper"], 4), "]\n")
#> 95% Wald CI: [ 1.829 , 2.0312 ]

7. ROC curve and AUC

auc_val <- smsn_auc(fit0$par, fit1$par, "ST", "SN")
cat("Estimated AUC:", round(auc_val, 4), "\n")
#> Estimated AUC: 0.9594

r_grid <- seq(0.01, 0.99, by = 0.01)
tpr    <- smsn_roc(r_grid, fit0$par, fit1$par, "ST", "SN")
plot(r_grid, tpr, type = "l", lwd = 2,
     xlab = "False positive rate", ylab = "True positive rate",
     main = paste0("Parametric ROC (AUC = ", round(auc_val, 3), ")"))
abline(0, 1, lty = 2, col = "grey60")

8. Reproducing Table 3 (Scenario SN1, n = 200)

set.seed(123)
result <- mc_simulate_scenario(
  theta0_true = c(0, log(1), 1),  theta1_true = c(2, log(1), 1.5),
  family0 = "SN", family1 = "SN", c_true = 1.6101,
  lambda0 = 1, lambda1 = 1, pi0 = 0.5, pi1 = 0.5,
  n0 = 100L, n1 = 100L, B = 2000L  # use B=2000 for paper results
)

cat("Success rate:", result$success_rate, "\n")
cat("Mean c_hat:  ", round(result$mean_c_hat, 4), "\n")
cat("Bias:        ", round(result$bias,       4), "\n")
cat("RMSE:        ", round(result$rmse,       4), "\n")
cat("Coverage:    ", round(result$coverage,   3), "\n")  
cat("Ratio:       ", round(result$ratio,      3), "\n")  
cat("Bracket rate:", round(result$bracket_rate, 3), "\n")

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.