The likelihood.model package provides a flexible framework for specifying and using likelihood models for statistical inference in R. It supports:

- Standard R distributions (normal, Weibull, exponential, etc.)
- Censored data (left, right, and interval censoring)
- Custom likelihood contributions for complex models
- Maximum likelihood estimation with confidence intervals
- Bootstrap sampling distributions
This vignette provides a quick introduction to the main features.
Let’s start with a simple example: fitting a Weibull distribution to
survival data using the weibull_uncensored model, which
provides analytical derivatives for faster and more accurate
estimation.
# Generate synthetic survival data
set.seed(42)
n <- 150
true_shape <- 2.5
true_scale <- 3.0
df <- data.frame(x = rweibull(n, shape = true_shape, scale = true_scale))
# Create the likelihood model
model <- weibull_uncensored("x")
# View model assumptions
assumptions(model)
#> [1] "independent" "identically distributed"
#> [3] "exact observations"
# Fit the MLE
solver <- fit(model)
mle <- solver(df, par = c(1, 1)) # initial guess: shape=1, scale=1
# View results
summary(mle)
#> Maximum Likelihood Estimate (Fisherian)
#> ----------------------------------------
#>
#> Coefficients:
#> Estimate Std. Error 2.5% 97.5%
#> [1,] 2.152 0.131 1.896 2.409
#> [2,] 2.916 0.117 2.686 3.145
#>
#> Log-likelihood: -238.7
#> AIC: 481.4
#> Number of observations: 150

# Parameter estimates
cat("Estimated parameters:\n")
#> Estimated parameters:
print(coef(mle))
#> [1] 2.152 2.916
# Confidence intervals (Wald-based)
cat("\n95% Confidence Intervals:\n")
#>
#> 95% Confidence Intervals:
print(confint(mle))
#> 2.5% 97.5%
#> [1,] 1.896 2.409
#> [2,] 2.686 3.145
# Standard errors
cat("\nStandard Errors:\n")
#>
#> Standard Errors:
print(se(mle))
#> [1] 0.131 0.117
# Log-likelihood value
cat("\nLog-likelihood:", loglik_val(mle), "\n")
#>
#> Log-likelihood: -238.7
# AIC for model selection
cat("AIC:", aic(mle), "\n")
#> AIC: 481.4

A key feature of likelihood.model is proper handling of censored data. Let's simulate data with right-censoring and show how ignoring the censoring leads to biased estimates.
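To see the mechanics behind this, here is a minimal base-R sketch of a right-censored normal log-likelihood (an illustration, not the package's implementation): exact observations contribute the log density, while right-censored observations contribute the log survival probability.

```r
# Right-censored normal log-likelihood in base R.
# Exact observations contribute log f(x); right-censored ones log P(X > c).
cens_loglik <- function(par, x, censored) {
  mu <- par[1]; sigma <- par[2]
  if (sigma <= 0) return(-Inf)
  sum(dnorm(x[!censored], mu, sigma, log = TRUE)) +
    sum(pnorm(x[censored], mu, sigma, lower.tail = FALSE, log.p = TRUE))
}

# Same simulation setup as below: latent N(10, 2), censored at 11
set.seed(123)
x_latent <- rnorm(200, 10, 2)
censored <- x_latent > 11
x_obs <- pmin(x_latent, 11)

# Maximize with optim (fnscale = -1 turns minimization into maximization)
fit_cens <- optim(c(mean(x_obs), sd(x_obs)), cens_loglik,
                  x = x_obs, censored = censored,
                  control = list(fnscale = -1))
fit_cens$par  # mean near 10; sd pulled back up relative to the naive sd
```

The naive `mean(x_obs)` and `sd(x_obs)` are biased low because every censored value is recorded as 11; the survival term in the likelihood corrects for this.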
# Generate normal data with right-censoring
set.seed(123)
n <- 200
true_mean <- 10
true_sd <- 2
censor_time <- 11
# Generate latent (true) values
x_latent <- rnorm(n, true_mean, true_sd)
# Apply censoring
censored <- x_latent > censor_time
df_cens <- data.frame(
x = ifelse(censored, censor_time, x_latent),
censor = ifelse(censored, "right", "exact")
)
cat("Censoring rate:", mean(censored) * 100, "%\n")
#> Censoring rate: 27.5 %

# Create model that handles censoring
model_cens <- likelihood_name("norm", ob_col = "x", censor_col = "censor")
# Fit the model
solver_cens <- fit(model_cens)
mle_cens <- suppressWarnings(solver_cens(df_cens, par = c(0, 1)))
cat("MLE (accounting for censoring):\n")
#> MLE (accounting for censoring):
cat(" Mean:", coef(mle_cens)[1], "(true:", true_mean, ")\n")
#> Mean: 9.912 (true: 10 )
cat(" SD: ", coef(mle_cens)[2], "(true:", true_sd, ")\n")
#> SD: 1.754 (true: 2 )

# Naive estimates (ignoring censoring)
naive_mean <- mean(df_cens$x)
naive_sd <- sd(df_cens$x)
cat("\nComparison of estimates:\n")
#>
#> Comparison of estimates:
cat(" Mean SD\n")
#> Mean SD
cat(sprintf("True values: %7.3f %5.3f\n", true_mean, true_sd))
#> True values: 10.000 2.000
cat(sprintf("MLE (correct): %7.3f %5.3f\n", coef(mle_cens)[1], coef(mle_cens)[2]))
#> MLE (correct): 9.912 1.754
cat(sprintf("Naive (biased): %7.3f %5.3f\n", naive_mean, naive_sd))
#> Naive (biased): 9.617 1.353
cat("\nNote: Naive estimates are biased downward due to censoring!\n")
#>
#> Note: Naive estimates are biased downward due to censoring!

For more robust inference, especially with small samples, we can use bootstrap methods to estimate the sampling distribution of the MLE.
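For intuition about what a bootstrap sampler does, the same idea can be sketched in base R with `MASS::fitdistr` (MASS is already in this session's namespace): resample the data with replacement, refit, and take the standard deviation of the replicates. This is an illustration of the general technique, not the package's `sampler()` internals.

```r
# Nonparametric bootstrap of Weibull MLE standard errors, by hand
library(MASS)

set.seed(42)
x <- rweibull(150, shape = 2.5, scale = 3.0)

# 100 bootstrap replicates: resample rows, refit by maximum likelihood
boot_est <- t(replicate(100, {
  xb <- sample(x, replace = TRUE)
  suppressWarnings(fitdistr(xb, "weibull")$estimate)  # (shape, scale)
}))

apply(boot_est, 2, sd)  # bootstrap SEs for shape and scale
```

The spread of the replicate estimates approximates the sampling distribution of the MLE without relying on asymptotic normality.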
# Using the Weibull example from before
model <- weibull_uncensored("x")
df <- data.frame(x = rweibull(150, shape = 2.5, scale = 3.0))
# Create bootstrap sampler
boot_sampler <- sampler(model, df = df, par = c(1, 1))
# Generate bootstrap samples (100 replicates for speed)
boot_result <- boot_sampler(n = 100)
# Compare asymptotic vs bootstrap standard errors
mle <- fit(model)(df, par = c(1, 1))
asymp_se <- se(mle)
boot_se <- se(boot_result) # Uses vcov from bootstrap replicates
cat("Standard Error Comparison:\n")
#> Standard Error Comparison:
cat(" Asymptotic Bootstrap\n")
#> Asymptotic Bootstrap
cat(sprintf("Shape: %10.4f %9.4f\n", asymp_se[1], boot_se[1]))
#> Shape: 0.1623 0.2150
cat(sprintf("Scale: %10.4f %9.4f\n", asymp_se[2], boot_se[2]))
#> Scale: 0.1026 0.0989
# Bootstrap bias estimate
cat("\nBootstrap Bias Estimate:\n")
#>
#> Bootstrap Bias Estimate:
print(bias(boot_result))
#> [1] 0.031600 0.007698

At the MLE, the score function (gradient of the log-likelihood) should be approximately zero. This provides a useful diagnostic.
model <- weibull_uncensored("x")
df <- data.frame(x = rweibull(100, 2, 1.5))
# Fit MLE
mle <- fit(model)(df, par = c(1, 1))
# Evaluate score at MLE
score_func <- score(model)
score_at_mle <- score_func(df, coef(mle))
cat("Score at MLE (should be near zero):\n")
#> Score at MLE (should be near zero):
print(score_at_mle)
#> shape scale
#> -0.00338 0.01183
# The score is also stored in the MLE object
cat("\nScore stored in MLE object:\n")
#>
#> Score stored in MLE object:
print(score_val(mle))
#> shape scale
#> -0.00338 0.01183

The score values are very close to zero, confirming that the optimizer successfully found the MLE (a stationary point of the log-likelihood).
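The same idea extends to validating an analytic score itself: compare it against a numerical gradient of the log-likelihood. Here is a self-contained sketch using numDeriv (already in this session's namespace), with the Weibull score derived by hand; it is an illustration, not the package's code.

```r
# Sanity-check an analytic score against a numerical gradient
library(numDeriv)

set.seed(1)
x <- rweibull(100, shape = 2, scale = 1.5)

# Weibull log-likelihood in par = (shape k, scale s)
ll <- function(par) sum(dweibull(x, shape = par[1], scale = par[2], log = TRUE))

# Analytic score, obtained by differentiating the log-likelihood:
#   d/dk = n/k + sum(log z) - sum(z^k log z),  z = x/s
#   d/ds = (k/s) * (sum(z^k) - n)
score_analytic <- function(par) {
  k <- par[1]; s <- par[2]; n <- length(x); z <- x / s
  c(n / k + sum(log(z)) - sum(z^k * log(z)),
    (k / s) * (sum(z^k) - n))
}

theta <- c(2.1, 1.4)
max(abs(score_analytic(theta) - numDeriv::grad(ll, theta)))  # numerical noise only
```

Agreement to near machine precision is strong evidence the analytic derivatives are correct.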
The package supports pure likelihood-based inference in the Fisherian tradition. Instead of confidence intervals, we can compute likelihood intervals that make no probability statements about parameters.
# Fit a model
model <- weibull_uncensored("x")
df <- data.frame(x = rweibull(100, 2.0, 1.5))
mle <- fit(model)(df, par = c(1, 1))
# Support function: log relative likelihood
# S(theta) = logL(theta) - logL(theta_hat)
# Support at MLE is always 0
s_at_mle <- support(mle, coef(mle), df, model)
cat("Support at MLE:", s_at_mle, "\n")
#> Support at MLE: 0
# Support at alternative parameter values (negative = less supported)
s_alt <- support(mle, c(1.5, 1.0), df, model)
cat("Support at (1.5, 1.0):", s_alt, "\n")
#> Support at (1.5, 1.0): -18.64
# Relative likelihood = exp(support)
rl <- relative_likelihood(mle, c(1.5, 1.0), df, model)
cat("Relative likelihood at (1.5, 1.0):", rl, "\n")
#> Relative likelihood at (1.5, 1.0): 8.011e-09

Negative support and a very small relative likelihood indicate that the alternative parameter values \((1.5, 1.0)\) are poorly supported by the data compared to the MLE.
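The choice k = 8 in the next chunk is conventional: by Wilks' theorem, \(-2 \log R(\theta)\) is approximately chi-squared, so for a single parameter a 95% interval corresponds to a relative-likelihood cutoff of about 0.147, and \(1/8 = 0.125\) is slightly more conservative. A one-line check:

```r
# Relative-likelihood cutoff implied by a 95% chi-square(1) calibration:
# keep theta where R(theta) >= exp(-qchisq(0.95, df = 1) / 2)
cutoff_95 <- exp(-qchisq(0.95, df = 1) / 2)
round(cutoff_95, 4)  # 0.1465, so R >= 1/8 = 0.125 gives slightly wider intervals
```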
# Compute 1/8 likelihood interval (roughly equivalent to 95% CI)
# This is the set of theta where R(theta) >= 1/8
li <- likelihood_interval(mle, df, model, k = 8)
print(li)
#> 1/8 Likelihood Interval (R >= 0.125)
#> -----------------------------------
#> lower upper
#> [1,] 1.646 2.281
#> [2,] 1.299 1.623
#> attr(,"k")
#> [1] 8
#> attr(,"relative_likelihood_cutoff")
#> [1] 0.125
# Compare with Wald confidence interval
cat("\nWald 95% CI:\n")
#>
#> Wald 95% CI:
print(confint(mle))
#> 2.5% 97.5%
#> [1,] 1.644 2.255
#> [2,] 1.301 1.608

The exponential_lifetime model demonstrates a key design feature: specialized models can override fit() to bypass optim entirely when the MLE has a closed-form solution.
For Exponential(lambda) data, the MLE is simply lambda_hat = d / T, where d is the number of uncensored observations and T is the total observation time.
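The closed form is easy to verify by hand on a toy dataset (a base-R sketch, independent of the package). Note that a right-censored unit contributes to the total time on test but not to the event count:

```r
# Closed-form exponential MLE: lambda_hat = d / T, where d is the number of
# exact (uncensored) observations and T is the total observation time.
t_obs  <- c(1, 2, 3, 4)
status <- c("exact", "exact", "right", "exact")  # one right-censored unit

d          <- sum(status == "exact")  # d = 3 events
total_time <- sum(t_obs)              # T = 10 (censored time still counts)
lambda_hat <- d / total_time          # 3 / 10 = 0.3
lambda_hat
```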
# Generate exponential survival data
set.seed(99)
df_exp <- data.frame(t = rexp(200, rate = 3.0))
# Create model and fit -- no initial guess needed!
model_exp <- exponential_lifetime("t")
mle_exp <- fit(model_exp)(df_exp)
cat("Closed-form MLE (no optim):\n")
#> Closed-form MLE (no optim):
cat(" lambda_hat:", coef(mle_exp), "(true: 3.0)\n")
#> lambda_hat: 3.137 (true: 3.0)
cat(" SE:", se(mle_exp), "\n")
#> SE: 0.2218
# The score at the MLE is exactly zero (by construction)
cat(" Score at MLE:", score_val(mle_exp), "\n")
#> Score at MLE: 0

The model handles right-censoring naturally through the sufficient statistics.
# Generate data with right-censoring at time 0.5
set.seed(99)
true_lambda <- 3.0
x <- rexp(200, rate = true_lambda)
censored <- x > 0.5
df_cens_exp <- data.frame(
t = pmin(x, 0.5),
status = ifelse(censored, "right", "exact")
)
cat("Censoring rate:", mean(censored) * 100, "%\n")
#> Censoring rate: 20.5 %
model_cens_exp <- exponential_lifetime("t", censor_col = "status")
mle_cens_exp <- fit(model_cens_exp)(df_cens_exp)
cat("MLE (censored):", coef(mle_cens_exp), "(true:", true_lambda, ")\n")
#> MLE (censored): 3.336 (true: 3 )
cat("95% CI:", confint(mle_cens_exp)[1, ], "\n")
#> 95% CI: 2.817 3.854

The analytical model produces the same log-likelihood as the generic likelihood_name("exp", ...) approach:
# Compare log-likelihood values at the MLE
lambda_hat <- coef(mle_exp)
ll_analytical <- loglik(model_exp)(df_exp, lambda_hat)
# likelihood_name("exp") uses dexp(x, rate), so pass unnamed parameter
df_exp_name <- data.frame(t = df_exp$t, censor = rep("exact", nrow(df_exp)))
model_generic <- likelihood_name("exp", ob_col = "t", censor_col = "censor")
ll_generic <- loglik(model_generic)(df_exp_name, unname(lambda_hat))
cat("Analytical loglik:", ll_analytical, "\n")
#> Analytical loglik: 28.66
cat("Generic loglik: ", ll_generic, "\n")
#> Generic loglik: 28.66
cat("Match:", isTRUE(all.equal(unname(ll_analytical), ll_generic, tolerance = 1e-6)), "\n")
#> Match: TRUE

| Function | Description |
|---|---|
| `likelihood_name()` | Create model for standard R distributions |
| `weibull_uncensored()` | Weibull model with analytical derivatives (exact obs.) |
| `exponential_lifetime()` | Exponential model with closed-form MLE and optional censoring |
| `likelihood_contr_model$new()` | Custom likelihood contributions |
| Function | Description |
|---|---|
| `loglik()` | Get log-likelihood function |
| `score()` | Get score (gradient) function |
| `hess_loglik()` | Get Hessian of log-likelihood |
| `fim()` | Fisher information matrix |
| `observed_info()` | Observed information matrix |
| Function | Description |
|---|---|
| `fit()` | Create MLE solver (returns fisher_mle) |
| `sampler()` | Create bootstrap sampler (returns fisher_boot) |
| `coef()` | Extract parameter estimates |
| `vcov()` | Variance-covariance matrix |
| `confint()` | Confidence intervals |
| `se()` | Standard errors |
| Function | Description |
|---|---|
| `support()` | Log relative likelihood: S(θ) = logL(θ) − logL(θ̂) |
| `relative_likelihood()` | Likelihood ratio: R(θ) = L(θ)/L(θ̂) |
| `likelihood_interval()` | Interval where R(θ) ≥ 1/k |
| `profile_loglik()` | Profile log-likelihood |
| `evidence()` | Evidence for θ₁ vs θ₂ |
| Function | Description |
|---|---|
| `lrt()` | Likelihood ratio test |
| `aic()` | Akaike Information Criterion |
| `bic()` | Bayesian Information Criterion |
| `deviance()` | Deviance statistic |
For more advanced usage, see the other vignettes:

- likelihood_name with various censoring types

sessionInfo()
#> R version 4.3.3 (2024-02-29)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/Chicago
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] algebraic.dist_0.1.0 algebraic.mle_0.9.0 likelihood.model_0.9.1
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.39 R6_2.6.1 codetools_0.2-19
#> [4] numDeriv_2016.8-1.1 fastmap_1.2.0 xfun_0.54
#> [7] cachem_1.1.0 knitr_1.50 htmltools_0.5.8.1
#> [10] generics_0.1.4 rmarkdown_2.30 lifecycle_1.0.5
#> [13] mvtnorm_1.3-3 cli_3.6.5 sass_0.4.10
#> [16] jquerylib_0.1.4 compiler_4.3.3 boot_1.3-32
#> [19] tools_4.3.3 evaluate_1.0.5 bslib_0.9.0
#> [22] yaml_2.3.11 rlang_1.1.7 jsonlite_2.0.0
#> [25] MASS_7.3-60.0.1