The betaregscale package provides maximum-likelihood estimation of beta regression models for responses derived from bounded rating scales. Common examples include pain intensity scales (NRS-11, NRS-21, NRS-101), Likert-type scales, product quality ratings, and any instrument whose response can be mapped to the open interval \((0,1)\).
The key idea is that a discrete score recorded on a bounded scale carries measurement uncertainty inherent to the instrument. For instance, a pain score of \(y=6\) on a 0–10 NRS is not an exact value but rather represents a range: after rescaling to \((0,1)\), the observation is treated as interval-censored in \([0.55,0.65]\). The package uses the beta distribution to model such data, building a complete likelihood that supports mixed censoring types within the same dataset.
The complete likelihood supports four censoring types, automatically classified by brs_check():
| \(\delta\) | Type | Likelihood contribution |
|---|---|---|
| 0 | Exact (uncensored) | \(f(y_i;a_i,b_i)\) |
| 1 | Left-censored (\(y=0\)) | \(F(u_i;a_i,b_i)\) |
| 2 | Right-censored (\(y=K\)) | \(1-F(l_i;a_i,b_i)\) |
| 3 | Interval-censored | \(F(u_i;a_i,b_i)-F(l_i;a_i,b_i)\) |
where \(f(\cdot)\) and \(F(\cdot)\) are the beta density and CDF, \([l_i,u_i]\) are the interval endpoints, and \((a_i,b_i)\) are the beta shape parameters derived from \(\mu_i\) and \(\phi_i\) via the chosen reparameterization.
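These four contributions can be sketched directly with base R's beta density and CDF; the helper name loglik_i below is illustrative, not part of the package:

```r
# Log-likelihood contribution of one observation, following the
# table above: delta selects the censoring type, (a, b) are the
# beta shape parameters, y the exact value, [l, u] the interval.
loglik_i <- function(delta, y, l, u, a, b) {
  switch(as.character(delta),
    "0" = dbeta(y, a, b, log = TRUE),                        # exact
    "1" = pbeta(u, a, b, log.p = TRUE),                      # left-censored
    "2" = pbeta(l, a, b, lower.tail = FALSE, log.p = TRUE),  # right-censored
    "3" = log(pbeta(u, a, b) - pbeta(l, a, b))               # interval
  )
}

# Under Beta(1, 1) (uniform), the interval [0.45, 0.55] has mass 0.1
exp(loglik_i(3, y = NA, l = 0.45, u = 0.55, a = 1, b = 1))  # ~0.1
```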
Scale observations are mapped to \((0,1)\) with midpoint uncertainty intervals:
\[y_t=y/K,\quad\text{interval }[y_t-h/K,y_t+h/K]\]
where \(K\) is the number of scale categories (ncuts) and \(h\) is the half-width (lim, default 0.5).
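The mapping can be sketched in a few lines of base R; map_score is an illustrative helper, not the package API (the package performs this step for you):

```r
# Map a raw score y on a 0..K scale to (0, 1) with a midpoint
# uncertainty interval of half-width h (in score units), clamped
# to the scale boundaries. Illustrative helper, not the package API.
map_score <- function(y, K, h = 0.5) {
  yt <- y / K
  data.frame(
    left  = pmax(0, yt - h / K),
    right = pmin(1, yt + h / K),
    yt    = yt
  )
}

map_score(6, K = 10)  # yt = 0.6, interval [0.55, 0.65]
```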
# Illustrate brs_check with a 0-10 NRS scale
y_example <- c(0, 3, 5, 7, 10)
cr <- brs_check(y_example, ncuts = 10)
kbl10(cr)

| left | right | yt | y | delta |
|---|---|---|---|---|
| 0.00 | 0.05 | 0.0 | 0 | 1 |
| 0.25 | 0.35 | 0.3 | 3 | 3 |
| 0.45 | 0.55 | 0.5 | 5 | 3 |
| 0.65 | 0.75 | 0.7 | 7 | 3 |
| 0.95 | 1.00 | 1.0 | 10 | 2 |
The delta column shows that \(y=0\) is left-censored (\(\delta=1\)), \(y=10\) is right-censored (\(\delta=2\)), and all interior values are
interval-censored (\(\delta=3\)).
brs_prep()

In practice, analysts may want to supply their own censoring indicators or interval endpoints rather than relying on the automatic classification of brs_check(). The brs_prep() function provides a flexible, validated bridge between raw analyst data and brs().
It supports four input modes:
# Equivalent to brs_check - delta inferred from y
d1 <- data.frame(y = c(0, 3, 5, 7, 10), x1 = rnorm(5))
kbl10(brs_prep(d1, ncuts = 10))

| left | right | yt | y | delta | x1 |
|---|---|---|---|---|---|
| 0.00 | 0.05 | 0.0 | 0 | 1 | 0.5769 |
| 0.25 | 0.35 | 0.3 | 3 | 3 | 1.1427 |
| 0.45 | 0.55 | 0.5 | 5 | 3 | -0.7074 |
| 0.65 | 0.75 | 0.7 | 7 | 3 | -0.6186 |
| 0.95 | 1.00 | 1.0 | 10 | 2 | -0.0083 |
# Analyst specifies delta directly
d2 <- data.frame(
y = c(50, 0, 99, 50),
delta = c(0, 1, 2, 3),
x1 = rnorm(4)
)
kbl10(brs_prep(d2, ncuts = 100))

| left | right | yt | y | delta | x1 |
|---|---|---|---|---|---|
| 0.500 | 0.500 | 0.50 | 50 | 0 | 0.3389 |
| 0.000 | 0.005 | 0.00 | 0 | 1 | 1.4054 |
| 0.985 | 1.000 | 0.99 | 99 | 2 | -0.8653 |
| 0.495 | 0.505 | 0.50 | 50 | 3 | -0.8740 |
When the analyst provides left and/or right
columns, censoring is inferred from the NA pattern:
d3 <- data.frame(
left = c(NA, 20, 30, NA),
right = c(5, NA, 45, NA),
y = c(NA, NA, NA, 50),
x1 = rnorm(4)
)
kbl10(brs_prep(d3, ncuts = 100))

| left | right | yt | y | delta | x1 |
|---|---|---|---|---|---|
| 0.0 | 0.05 | 0.025 | NA | 1 | 0.7743 |
| 0.2 | 1.00 | 0.600 | NA | 2 | -0.4019 |
| 0.3 | 0.45 | 0.375 | NA | 3 | -0.2151 |
| 0.5 | 0.50 | 0.500 | 50 | 0 | 0.6053 |
When the analyst provides y, left, and
right simultaneously, their endpoints are used directly
(rescaled by \(K\)):
d4 <- data.frame(
y = c(50, 75),
left = c(48, 73),
right = c(52, 77),
x1 = rnorm(2)
)
kbl10(brs_prep(d4, ncuts = 100))

| left | right | yt | y | delta | x1 |
|---|---|---|---|---|---|
| 0.48 | 0.52 | 0.50 | 50 | 3 | -0.6146 |
| 0.73 | 0.77 | 0.75 | 75 | 3 | -0.7244 |
brs()

Data processed by brs_prep() is automatically detected by brs(); the internal brs_check() step is skipped:
set.seed(42)
n <- 1000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim <- brs_sim(
formula = ~ x1 + x2, data = dat,
beta = c(0.2, -0.5, 0.3), phi = 0.3,
link = "logit", link_phi = "logit",
repar = 2
)
# Remove left, right, yt so brs_prep can rebuild them
prep <- brs_prep(sim[-c(1:3)], ncuts = 100)
fit_prep <- brs(y ~ x1 + x2,
data = prep, repar = 2,
link = "logit", link_phi = "logit"
)
summary(fit_prep, digits = 4)
#>
#> Call:
#> brs(formula = y ~ x1 + x2, data = prep, link = "logit", link_phi = "logit",
#> repar = 2)
#>
#> Quantile residuals:
#> Min 1Q Median 3Q Max
#> -3.1122 -0.6590 -0.0288 0.6409 3.6970
#>
#> Coefficients (mean model with logit link):
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.18254 0.04364 4.183 2.88e-05 ***
#> x1 -0.48490 0.04491 -10.797 < 2e-16 ***
#> x2 0.26206 0.04447 5.893 3.80e-09 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Phi coefficients (precision model with logit link):
#> Estimate Std. Error z value Pr(>|z|)
#> (phi) 0.26989 0.03992 6.76 1.38e-11 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> ---
#> Log-likelihood: -4072.5673 on 4 Df | AIC: 8153.1346 | BIC: 8172.7656
#> Pseudo R-squared: 0.1292
#> Number of iterations: 35 (BFGS)
#> Censoring: 796 interval | 74 left | 130 right

We simulate observations from a beta regression model with fixed dispersion, two covariates, and a logit link for the mean.
set.seed(4255)
n <- 1000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim_fixed <- brs_sim(
formula = ~ x1 + x2,
data = dat,
beta = c(0.3, -0.6, 0.4),
phi = 1 / 10,
link = "logit",
link_phi = "logit",
ncuts = 100,
repar = 2
)
kbl10(head(sim_fixed, 8))

| left | right | yt | y | delta | x1 | x2 |
|---|---|---|---|---|---|---|
| 0.165 | 0.175 | 0.17 | 17 | 3 | 1.9510 | 0.2888 |
| 0.985 | 0.995 | 0.99 | 99 | 3 | 0.7725 | 1.6103 |
| 0.000 | 0.005 | 0.00 | 0 | 1 | 0.7264 | 1.0542 |
| 0.575 | 0.585 | 0.58 | 58 | 3 | 0.0487 | -0.4923 |
| 0.095 | 0.105 | 0.10 | 10 | 3 | -0.5445 | -1.4098 |
| 0.065 | 0.075 | 0.07 | 7 | 3 | 0.3600 | -1.0116 |
| 0.025 | 0.035 | 0.03 | 3 | 3 | 0.7136 | 2.3083 |
| 0.355 | 0.365 | 0.36 | 36 | 3 | -0.7274 | 0.2061 |
Each observation is centered in its interval. For example, a score of 67 on a 0–100 scale yields \(y_t=0.67\) with the interval \([0.665,0.675]\).
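The arithmetic of that worked example can be verified directly:

```r
# Verify the worked example: a score of 67 on a 0-100 scale with
# half-width h = 0.5 maps to yt = 0.67 with interval [0.665, 0.675].
y <- 67; K <- 100; h <- 0.5
v <- c(left = y / K - h / K, yt = y / K, right = y / K + h / K)
v  # left 0.665, yt 0.670, right 0.675
```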
fit_fixed <- brs(
y ~ x1 + x2,
data = sim_fixed,
link = "logit",
link_phi = "logit",
repar = 2
)
summary(fit_fixed)
#>
#> Call:
#> brs(formula = y ~ x1 + x2, data = sim_fixed, link = "logit",
#> link_phi = "logit", repar = 2)
#>
#> Quantile residuals:
#> Min 1Q Median 3Q Max
#> -3.0753 -0.6600 0.0058 0.7071 3.3708
#>
#> Coefficients (mean model with logit link):
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.30604 0.04312 7.097 1.28e-12 ***
#> x1 -0.60332 0.04530 -13.317 < 2e-16 ***
#> x2 0.44135 0.04385 10.066 < 2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Phi coefficients (precision model with logit link):
#> Estimate Std. Error z value Pr(>|z|)
#> (phi) 0.10991 0.04057 2.709 0.00674 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> ---
#> Log-likelihood: -4035.2262 on 4 Df | AIC: 8078.4524 | BIC: 8098.0834
#> Pseudo R-squared: 0.2393
#> Number of iterations: 39 (BFGS)
#> Censoring: 809 interval | 65 left | 126 right

The summary output follows the betareg package style, showing separate coefficient tables for the mean and precision submodels, with Wald z-tests and \(p\)-values based on the standard normal distribution.
| logLik | AIC | BIC | pseudo_r2 |
|---|---|---|---|
| -4035.226 | 8078.452 | 8098.083 | 0.2393 |
The package supports several link functions for the mean submodel. We can compare them using information criteria:
links <- c("logit", "probit", "cauchit", "cloglog")
fits <- lapply(setNames(links, links), function(lnk) {
brs(y ~ x1 + x2, data = sim_fixed, link = lnk, repar = 2)
})
# Estimates
est_table <- do.call(rbind, lapply(names(fits), function(lnk) {
e <- brs_est(fits[[lnk]])
e$link <- lnk
e
}))
kbl10(est_table)

| variable | estimate | se | z_value | p_value | ci_lower | ci_upper | link |
|---|---|---|---|---|---|---|---|
| (Intercept) | 0.3060 | 0.0431 | 7.0968 | 0.0000 | 0.2215 | 0.3906 | logit |
| x1 | -0.6033 | 0.0453 | -13.3169 | 0.0000 | -0.6921 | -0.5145 | logit |
| x2 | 0.4413 | 0.0438 | 10.0661 | 0.0000 | 0.3554 | 0.5273 | logit |
| (phi) | 0.1099 | 0.0406 | 2.7095 | 0.0067 | 0.0304 | 0.1894 | logit |
| (Intercept) | 0.1866 | 0.0262 | 7.1224 | 0.0000 | 0.1352 | 0.2379 | probit |
| x1 | -0.3652 | 0.0266 | -13.7107 | 0.0000 | -0.4175 | -0.3130 | probit |
| x2 | 0.2669 | 0.0261 | 10.2177 | 0.0000 | 0.2157 | 0.3181 | probit |
| (phi) | 0.1118 | 0.0405 | 2.7603 | 0.0058 | 0.0324 | 0.1913 | probit |
| (Intercept) | 0.2766 | 0.0409 | 6.7671 | 0.0000 | 0.1965 | 0.3566 | cauchit |
| x1 | -0.5595 | 0.0493 | -11.3590 | 0.0000 | -0.6561 | -0.4630 | cauchit |
| | logLik | AIC | BIC | pseudo_r2 |
|---|---|---|---|---|
| logit | -4035.226 | 8078.452 | 8098.083 | 0.2393 |
| probit | -4035.685 | 8079.370 | 8099.001 | 0.2383 |
| cauchit | -4035.500 | 8079.000 | 8098.631 | 0.1918 |
| cloglog | -4037.563 | 8083.126 | 8102.757 | 0.1824 |
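A comparison table of this kind can be assembled from any named list of fitted models using only standard S3 generics (logLik(), AIC(), BIC(), which brs objects support); ic_table is an illustrative helper:

```r
# Collect information criteria from a named list of fitted models,
# using only the standard S3 generics.
ic_table <- function(fits) {
  data.frame(
    logLik = sapply(fits, function(f) as.numeric(logLik(f))),
    AIC    = sapply(fits, AIC),
    BIC    = sapply(fits, BIC)
  )
}

# Works for any model class with these generics, e.g. lm():
ic_table(list(small = lm(mpg ~ wt, mtcars), big = lm(mpg ~ wt + hp, mtcars)))
```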
The plot() method provides six diagnostic panels; by default, the first four are shown. For ggplot2 output, set the gg argument of plot() (requires the ggplot2 package).
# Fitted means
kbl10(
data.frame(mu_hat = head(predict(fit_fixed, type = "response"))),
digits = 4
)

| mu_hat |
|---|
| 0.3222 |
| 0.6343 |
| 0.5825 |
| 0.5148 |
| 0.5031 |
| 0.4115 |
# Conditional variance
kbl10(
data.frame(var_hat = head(predict(fit_fixed, type = "variance"))),
digits = 4
)

| var_hat |
|---|
| 0.1152 |
| 0.1224 |
| 0.1283 |
| 0.1317 |
| 0.1319 |
| 0.1277 |
# Quantile predictions
kbl10(predict(fit_fixed, type = "quantile", at = c(0.10, 0.25, 0.5, 0.75, 0.90)))

| q_0.1 | q_0.25 | q_0.5 | q_0.75 | q_0.9 |
|---|---|---|---|---|
| 0.0007 | 0.0170 | 0.1779 | 0.6024 | 0.8964 |
| 0.0711 | 0.3148 | 0.7508 | 0.9675 | 0.9980 |
| 0.0435 | 0.2311 | 0.6578 | 0.9398 | 0.9947 |
| 0.0208 | 0.1444 | 0.5288 | 0.8860 | 0.9856 |
| 0.0181 | 0.1318 | 0.5060 | 0.8745 | 0.9833 |
| 0.0048 | 0.0565 | 0.3311 | 0.7601 | 0.9538 |
| 0.1348 | 0.4660 | 0.8689 | 0.9905 | 0.9997 |
| 0.1221 | 0.4392 | 0.8517 | 0.9880 | 0.9996 |
| 0.0319 | 0.1895 | 0.6011 | 0.9185 | 0.9915 |
| 0.0288 | 0.1777 | 0.5834 | 0.9111 | 0.9902 |
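In effect, a quantile prediction inverts the fitted beta distribution at each observation's shape parameters, which is base R's qbeta(); the shapes below are an illustrative pair, not taken from the fit:

```r
# Sketch: a quantile prediction for a single observation is the
# beta quantile function evaluated at its fitted shapes (a, b).
# Beta(2.4, 1.6) is an illustrative pair (mean 0.6), not from fit_fixed.
p <- c(0.10, 0.25, 0.50, 0.75, 0.90)
q <- qbeta(p, shape1 = 2.4, shape2 = 1.6)
round(q, 4)
```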
Wald confidence intervals based on the asymptotic normal approximation:
| | 2.5 % | 97.5 % |
|---|---|---|
| (Intercept) | 0.2215 | 0.3906 |
| x1 | -0.6921 | -0.5145 |
| x2 | 0.3554 | 0.5273 |
| (phi) | 0.0304 | 0.1894 |
| | 2.5 % | 97.5 % |
|---|---|---|
| (Intercept) | 0.2215 | 0.3906 |
| x1 | -0.6921 | -0.5145 |
| x2 | 0.3554 | 0.5273 |
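Each interval is the familiar estimate ± z·SE construction. As a sketch, the (Intercept) row can be reproduced from the summary output of fit_fixed (estimate 0.30604, SE 0.04312):

```r
# Wald confidence interval from a point estimate and its standard error.
wald_ci <- function(est, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  c(lower = est - z * se, upper = est + z * se)
}

# (Intercept) of fit_fixed: estimate 0.30604, SE 0.04312
round(wald_ci(0.30604, 0.04312), 4)  # lower 0.2215, upper 0.3906
```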
In many applications, the dispersion parameter \(\phi\) may depend on covariates. The
package supports variable-dispersion models using the
Formula package notation:
y ~ x1 + x2 | z1 + z2, where the terms after |
define the linear predictor for \(\phi\). The same brs_sim()
function is used for fixed and variable dispersion; the second formula
part activates the precision submodel in simulation.
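The splitting itself can be seen with base R, since `|` has lower precedence than `+` and therefore parses as the top-level operator of the right-hand side (the package delegates this to the Formula package):

```r
# A two-part formula keeps `|` as the outermost call on the RHS,
# so the two linear predictors can be recovered from the parse tree:
f <- y ~ x1 + x2 | z1 + z2
rhs <- f[[3]]          # the expression `x1 + x2 | z1 + z2`
mean_part <- rhs[[2]]  # x1 + x2  (mean submodel)
prec_part <- rhs[[3]]  # z1 + z2  (precision submodel)
deparse(mean_part)     # "x1 + x2"
deparse(prec_part)     # "z1 + z2"
```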
set.seed(2222)
n <- 1000
dat_z <- data.frame(
x1 = rnorm(n),
x2 = rnorm(n),
x3 = rbinom(n, size = 1, prob = 0.5),
z1 = rnorm(n),
z2 = rnorm(n)
)
sim_var <- brs_sim(
formula = ~ x1 + x2 + x3 | z1 + z2,
data = dat_z,
beta = c(0.2, -0.6, 0.2, 0.2),
zeta = c(0.2, -0.8, 0.6),
link = "logit",
link_phi = "logit",
ncuts = 100,
repar = 2
)
kbl10(head(sim_var, 10))

| left | right | yt | y | delta | x1 | x2 | x3 | z1 | z2 |
|---|---|---|---|---|---|---|---|---|---|
| 0.265 | 0.275 | 0.27 | 27 | 3 | -0.3381 | -0.8235 | 0 | 0.0306 | -1.1222 |
| 0.175 | 0.185 | 0.18 | 18 | 3 | 0.9392 | -1.7563 | 0 | 0.1938 | -0.2891 |
| 0.000 | 0.005 | 0.00 | 0 | 1 | 1.7377 | -1.3148 | 1 | -0.4283 | 0.3479 |
| 0.525 | 0.535 | 0.53 | 53 | 3 | 0.6963 | -0.8196 | 0 | 0.3694 | 0.2811 |
| 0.085 | 0.095 | 0.09 | 9 | 3 | 0.4623 | -0.1183 | 1 | 0.0553 | -1.2184 |
| 0.955 | 0.965 | 0.96 | 96 | 3 | -0.3151 | -0.0648 | 1 | 0.7169 | -0.5613 |
| 0.165 | 0.175 | 0.17 | 17 | 3 | 0.1927 | -0.9985 | 0 | -0.2222 | 0.4225 |
| 0.385 | 0.395 | 0.39 | 39 | 3 | 1.1307 | 0.6330 | 1 | 1.2509 | -1.5492 |
| 0.005 | 0.015 | 0.01 | 1 | 3 | 1.9764 | -0.4887 | 0 | -1.8699 | 0.3679 |
| 0.515 | 0.525 | 0.52 | 52 | 3 | 1.2071 | -1.3548 | 1 | -0.2813 | -1.6754 |
fit_var <- brs(
y ~ x1 + x2 | z1,
data = sim_var,
link = "logit",
link_phi = "logit",
repar = 2
)
summary(fit_var)
#>
#> Call:
#> brs(formula = y ~ x1 + x2 | z1, data = sim_var, link = "logit",
#> link_phi = "logit", repar = 2)
#>
#> Quantile residuals:
#> Min 1Q Median 3Q Max
#> -3.6111 -0.6344 0.0131 0.6654 3.2463
#>
#> Coefficients (mean model with logit link):
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.27141 0.04245 6.394 1.62e-10 ***
#> x1 -0.54553 0.04375 -12.468 < 2e-16 ***
#> x2 0.17226 0.04141 4.160 3.18e-05 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Phi coefficients (precision model with logit link):
#> Estimate Std. Error z value Pr(>|z|)
#> (phi)_(Intercept) 0.24653 0.04233 5.823 5.77e-09 ***
#> (phi)_z1 -0.72230 0.04507 -16.028 < 2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> ---
#> Log-likelihood: -3922.2430 on 5 Df | AIC: 7854.4861 | BIC: 7879.0249
#> Pseudo R-squared: 0.1159
#> Number of iterations: 42 (BFGS)
#> Censoring: 744 interval | 105 left | 151 right

Notice the (phi)_ prefix in the precision coefficient names, following the betareg convention.
# Full parameter vector
round(coef(fit_var), 4)
#> (Intercept) x1 x2 (phi)_(Intercept)
#> 0.2714 -0.5455 0.1723 0.2465
#> (phi)_z1
#> -0.7223
# Mean submodel only
round(coef(fit_var, model = "mean"), 4)
#> (Intercept) x1 x2
#> 0.2714 -0.5455 0.1723
# Precision submodel only
round(coef(fit_var, model = "precision"), 4)
#> (phi)_(Intercept) (phi)_z1
#> 0.2465 -0.7223
# Variance-covariance matrix for the mean submodel
kbl10(vcov(fit_var, model = "mean"))

| | (Intercept) | x1 | x2 |
|---|---|---|---|
| (Intercept) | 0.0018 | -0.0001 | 0.0000 |
| x1 | -0.0001 | 0.0019 | 0.0000 |
| x2 | 0.0000 | 0.0000 | 0.0017 |
links <- c("logit", "probit", "cauchit", "cloglog")
fits_var <- lapply(setNames(links, links), function(lnk) {
brs(y ~ x1 + x2 | z1, data = sim_var, link = lnk, repar = 2)
})
# Estimates
est_var <- do.call(rbind, lapply(names(fits_var), function(lnk) {
e <- brs_est(fits_var[[lnk]])
e$link <- lnk
e
}))
kbl10(est_var)

| variable | estimate | se | z_value | p_value | ci_lower | ci_upper | link |
|---|---|---|---|---|---|---|---|
| (Intercept) | 0.2714 | 0.0424 | 6.3938 | 0 | 0.1882 | 0.3546 | logit |
| x1 | -0.5455 | 0.0438 | -12.4682 | 0 | -0.6313 | -0.4598 | logit |
| x2 | 0.1723 | 0.0414 | 4.1599 | 0 | 0.0911 | 0.2534 | logit |
| (phi)_(Intercept) | 0.2465 | 0.0423 | 5.8234 | 0 | 0.1636 | 0.3295 | logit |
| (phi)_z1 | -0.7223 | 0.0451 | -16.0277 | 0 | -0.8106 | -0.6340 | logit |
| (Intercept) | 0.1675 | 0.0260 | 6.4315 | 0 | 0.1165 | 0.2186 | probit |
| x1 | -0.3352 | 0.0262 | -12.8011 | 0 | -0.3865 | -0.2839 | probit |
| x2 | 0.1056 | 0.0253 | 4.1771 | 0 | 0.0561 | 0.1552 | probit |
| (phi)_(Intercept) | 0.2467 | 0.0423 | 5.8291 | 0 | 0.1638 | 0.3297 | probit |
| (phi)_z1 | -0.7229 | 0.0451 | -16.0344 | 0 | -0.8112 | -0.6345 | probit |
| | logLik | AIC | BIC | pseudo_r2 |
|---|---|---|---|---|
| logit | -3922.243 | 7854.486 | 7879.025 | 0.1159 |
| probit | -3922.227 | 7854.455 | 7878.994 | 0.1216 |
| cauchit | -3923.161 | 7856.323 | 7880.862 | 0.0902 |
| cloglog | -3922.296 | 7854.593 | 7879.132 | 0.0898 |
The package includes analyst-facing helpers for uncertainty quantification, effect interpretation, score-scale communication, and predictive validation.
set.seed(101)
boot_ci <- brs_bootstrap(
fit_fixed,
R = 100,
level = 0.95,
ci_type = "bca",
keep_draws = TRUE
)
kbl10(head(boot_ci, 10))

| parameter | estimate | se_boot | ci_lower | ci_upper | mcse_lower | mcse_upper | wald_lower | wald_upper | level |
|---|---|---|---|---|---|---|---|---|---|
| (Intercept) | 0.3060 | 0.0410 | 0.2420 | 0.4106 | 0.0058 | 0.0137 | 0.2215 | 0.3906 | 0.95 |
| x1 | -0.6033 | 0.0424 | -0.6863 | -0.5387 | 0.0059 | 0.0045 | -0.6921 | -0.5145 | 0.95 |
| x2 | 0.4413 | 0.0423 | 0.3641 | 0.5088 | 0.0100 | 0.0034 | 0.3554 | 0.5273 | 0.95 |
| (phi) | 0.1099 | 0.0396 | 0.0307 | 0.1714 | 0.0093 | 0.0039 | 0.0304 | 0.1894 | 0.95 |
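brs_bootstrap() refits the model on resampled data. The underlying mechanics, sketched on a simple statistic rather than a brs fit (using a percentile interval; the package additionally offers BCa via ci_type):

```r
# Nonparametric bootstrap of a sample mean: resample with
# replacement, recompute the statistic, then summarize the
# bootstrap distribution (SE and percentile interval).
set.seed(1)
x <- rnorm(200, mean = 1)
boots <- replicate(1000, mean(sample(x, replace = TRUE)))
c(se_boot = sd(boots), quantile(boots, c(0.025, 0.975)))
```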
set.seed(202)
ame <- brs_marginaleffects(
fit_fixed,
model = "mean",
type = "response",
interval = TRUE,
n_sim = 120,
keep_draws = TRUE
)
kbl10(ame)

| variable | ame | std.error | ci.lower | ci.upper | model | type | n |
|---|---|---|---|---|---|---|---|
| x1 | -0.1321 | 0.0088 | -0.1473 | -0.1123 | mean | response | 1000 |
| x2 | 0.0966 | 0.0093 | 0.0798 | 0.1140 | mean | response | 1000 |
| score_0 | score_1 | score_2 | score_3 | score_4 | score_5 |
|---|---|---|---|---|---|
| 0.1754 | 0.0657 | 0.0386 | 0.0288 | 0.0235 | 0.0201 |
| 0.0218 | 0.0190 | 0.0139 | 0.0117 | 0.0104 | 0.0095 |
| 0.0320 | 0.0249 | 0.0176 | 0.0145 | 0.0127 | 0.0115 |
| 0.0516 | 0.0342 | 0.0230 | 0.0185 | 0.0159 | 0.0142 |
| 0.0559 | 0.0360 | 0.0240 | 0.0192 | 0.0165 | 0.0146 |
| 0.1016 | 0.0509 | 0.0318 | 0.0246 | 0.0206 | 0.0180 |
| row_sum |
|---|
| 0.4263 |
| 0.1260 |
| 0.1605 |
| 0.2141 |
| 0.2245 |
| 0.3163 |
set.seed(303) # For cross-validation reproducibility
cv_res <- brs_cv(
y ~ x1 + x2,
data = sim_fixed,
k = 5,
repeats = 5,
  repar = 2
)
kbl10(cv_res)

| repeat | fold | n_train | n_test | log_score | rmse_yt | mae_yt | converged | error |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 800 | 200 | -4.0713 | 0.3472 | 0.3019 | TRUE | NA |
| 1 | 2 | 800 | 200 | -3.9984 | 0.3451 | 0.3019 | TRUE | NA |
| 1 | 3 | 800 | 200 | -3.9235 | 0.3395 | 0.2997 | TRUE | NA |
| 1 | 4 | 800 | 200 | -4.0902 | 0.3311 | 0.2874 | TRUE | NA |
| 1 | 5 | 800 | 200 | -4.1125 | 0.3420 | 0.2958 | TRUE | NA |
| 2 | 1 | 800 | 200 | -4.0795 | 0.3270 | 0.2892 | TRUE | NA |
| 2 | 2 | 800 | 200 | -4.2062 | 0.3504 | 0.2972 | TRUE | NA |
| 2 | 3 | 800 | 200 | -4.1858 | 0.3248 | 0.2818 | TRUE | NA |
| 2 | 4 | 800 | 200 | -3.8729 | 0.3457 | 0.3036 | TRUE | NA |
| 2 | 5 | 800 | 200 | -3.8752 | 0.3578 | 0.3159 | TRUE | NA |
kbl10(
data.frame(
metric = c("log_score", "rmse_yt", "mae_yt"),
mean = c(
mean(cv_res$log_score, na.rm = TRUE),
mean(cv_res$rmse_yt, na.rm = TRUE),
mean(cv_res$mae_yt, na.rm = TRUE)
)
),
digits = 4
)

| metric | mean |
|---|---|
| log_score | -4.0404 |
| rmse_yt | 0.3409 |
| mae_yt | 0.2974 |
The following standard S3 methods are available for objects of class "brs":

| Method | Description |
|---|---|
| print() | Compact display of call and coefficients |
| summary() | Detailed output with Wald tests and goodness-of-fit |
| coef(model=) | Extract coefficients (full, mean, or precision) |
| vcov(model=) | Variance-covariance matrix (full, mean, or precision) |
| confint(model=) | Wald confidence intervals |
| logLik() | Log-likelihood value |
| AIC(), BIC() | Information criteria |
| nobs() | Number of observations |
| formula() | Model formula |
| model.matrix(model=) | Design matrix (mean or precision) |
| fitted() | Fitted mean values |
| residuals(type=) | Residuals: response, pearson, rqr, weighted, sweighted |
| predict(type=) | Predictions: response, link, precision, variance, quantile |
| plot(gg=) | Diagnostic plots (base R or ggplot2) |
The package supports three reparameterizations of the beta distribution, controlled by the repar argument:

- Direct (repar = 0): the shape parameters \(a=\mu\) and \(b=\phi\) are used directly. This is rarely used in practice.
- Precision (repar = 1; Ferrari & Cribari-Neto, 2004): the mean \(\mu\in(0,1)\) and precision \(\phi>0\) yield \(a=\mu\phi\) and \(b=(1-\mu)\phi\). Higher \(\phi\) means less variability.
- Mean–variance (repar = 2): the mean \(\mu\in(0,1)\) and dispersion \(\phi\in(0,1)\) yield \(a=\mu(1-\phi)/\phi\) and \(b=(1-\mu)(1-\phi)/\phi\). Here \(\phi\) is the ratio of the variance to \(\mu(1-\mu)\): smaller \(\phi\) means less variability.
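The three mappings can be written out directly; shapes below is an illustrative helper following the definitions above, not the package's internal code:

```r
# Beta shape parameters (a, b) implied by (mu, phi) under each
# reparameterization.
shapes <- function(mu, phi, repar) {
  switch(as.character(repar),
    "0" = c(a = mu, b = phi),                      # direct
    "1" = c(a = mu * phi, b = (1 - mu) * phi),     # precision
    "2" = c(a = mu * (1 - phi) / phi,              # mean-variance
            b = (1 - mu) * (1 - phi) / phi)
  )
}

# The same Beta(2.4, 1.6) arises from phi = 4 under repar = 1
# and from phi = 0.2 under repar = 2:
shapes(0.6, 4, repar = 1)    # a = 2.4, b = 1.6
shapes(0.6, 0.2, repar = 2)  # a = 2.4, b = 1.6
```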
Ferrari, S. L. P., and Cribari-Neto, F. (2004). Beta regression for modelling rates and proportions. Journal of Applied Statistics, 31(7), 799–815. https://doi.org/10.1080/0266476042000214501

Smithson, M., and Verkuilen, J. (2006). A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables. Psychological Methods, 11(1), 54–71. https://doi.org/10.1037/1082-989X.11.1.54

Hawker, G. A., Mian, S., Kendzerska, T., and French, M. (2011). Measures of adult pain: Visual Analog Scale for Pain (VAS Pain), Numeric Rating Scale for Pain (NRS Pain), McGill Pain Questionnaire (MPQ), Short-Form McGill Pain Questionnaire (SF-MPQ), Chronic Pain Grade Scale (CPGS), Short Form-36 Bodily Pain Scale (SF-36 BPS), and Measure of Intermittent and Constant Osteoarthritis Pain (ICOAP). Arthritis Care and Research, 63(S11), S240–S252. https://doi.org/10.1002/acr.20543

Hjermstad, M. J., Fayers, P. M., Haugen, D. F., et al. (2011). Studies comparing numerical rating scales, verbal rating scales, and visual analogue scales for assessment of pain intensity in adults: a systematic literature review. Journal of Pain and Symptom Management, 41(6), 1073–1093. https://doi.org/10.1016/j.jpainsymman.2010.08.016