Adaptive Huber Estimation and Regression
This package implements Huber-type estimators for the mean, the covariance matrix, regression and l1-regularized regression (Huber-Lasso). For all of these methods, the robustification parameter τ is calibrated via a tuning-free principle.
Specifically, for Huber regression, assume the observed data vectors (Y, X) follow a linear model Y = θ0 + X θ + ε, where Y is an n-dimensional response vector, X is an n × d design matrix, and ε is an n-vector of noise variables whose distributions can be asymmetric and/or heavy-tailed. The package computes the standard Huber’s M-estimator when d < n and the Huber-Lasso estimator when d > n. The vector of coefficients θ and the intercept term θ0 are estimated successively via a two-step procedure. See Wang et al., 2021 for more details.
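For orientation, the Huber loss behind these estimators is quadratic for residuals of magnitude at most τ and linear beyond that point. The base-R sketch below writes out the loss and its derivative. The names `huber_loss` and `huber_score` are illustrative and are not functions exported by adaHuber, and `tau` is set by hand here rather than calibrated by the package's tuning-free principle.

```r
# Illustrative helpers (not part of adaHuber): Huber loss and its derivative.
huber_loss <- function(r, tau) {
  # Quadratic for |r| <= tau, linear with matching slope beyond tau.
  ifelse(abs(r) <= tau, 0.5 * r^2, tau * abs(r) - 0.5 * tau^2)
}

huber_score <- function(r, tau) {
  # Derivative of the loss: the residual clipped to [-tau, tau].
  pmin(pmax(r, -tau), tau)
}

r <- c(-10, -1, 0, 0.5, 4)
huber_loss(r, tau = 2)
huber_score(r, tau = 2)
```

Because the derivative is bounded by τ, a single extreme residual can shift the estimating equation by at most τ, which is what makes the estimator robust to heavy tails.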
2022-03-04
Version 1.1 is submitted to CRAN.
Install adaHuber from CRAN:
install.packages("adaHuber")
Error: Compilation failed (with messages involving lgfortran, clang, etc.). Solution: This is a compilation error of Rcpp-based source packages. It occurs shortly after a new version is submitted to CRAN, because building the binary package usually takes 3-5 days. Please use an older version, or wait 3-5 days and then install the updated version.
Error: unable to load shared object.. Symbol not found: _EXTPTR_PTR. Solution: This issue is common in some specific versions of R when loading Rcpp-based libraries. It is an error in R caused by a minor change to EXTPTR_PTR. Upgrading R to 4.0.2 will solve the problem.
There are five functions in this package:
adaHuber.mean: Adaptive Huber mean estimation.
adaHuber.cov: Adaptive Huber covariance estimation.
adaHuber.reg: Adaptive Huber regression.
adaHuber.lasso: Adaptive Huber-Lasso regression.
adaHuber.cv.lasso: Cross-validated adaptive Huber-Lasso regression.
Help on the functions can be accessed by typing ?, followed by the function name, at the R command prompt. For example, ?adaHuber.reg will present detailed documentation with inputs, outputs and examples of the function adaHuber.reg.
First, we present an example of Huber mean estimation. We generate data from a t distribution, which is heavy-tailed. We estimate its mean by the tuning-free Huber mean estimator.
library(adaHuber)
n = 1000
mu = 2
X = rt(n, 2) + mu
fit.mean = adaHuber.mean(X)
fit.mean$mu
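To see why a Huber-type mean helps with heavy tails, the following base-R sketch computes a Huber mean with a fixed threshold and prints it next to the sample mean. It is only an illustration under stated assumptions: `huber_mean_fixed` is a hypothetical helper, the rule `tau <- 3 * mad(x)` is an arbitrary hand-picked choice, and neither reflects the tuning-free calibration that adaHuber.mean performs internally.

```r
# Illustrative sketch (not adaHuber's algorithm): Huber mean with a fixed tau.
huber_mean_fixed <- function(x, tau, tol = 1e-8, max_iter = 1000) {
  mu <- median(x)  # robust starting value
  for (iter in seq_len(max_iter)) {
    # Clip residuals to [-tau, tau], then take a gradient step of size 1.
    step <- mean(pmin(pmax(x - mu, -tau), tau))
    mu <- mu + step
    if (abs(step) < tol) break
  }
  mu
}

set.seed(1)
x <- rt(1000, 2) + 2   # heavy-tailed sample centered at 2
tau <- 3 * mad(x)      # hand-picked threshold (an assumption)
c(sample_mean = mean(x), huber_mean = huber_mean_fixed(x, tau))
```

With a very large `tau` no residual is clipped and the iteration returns the ordinary sample mean, so the threshold controls the trade-off between bias and robustness.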
Then we present an example of Huber covariance matrix estimation. We generate data from a t distribution with df = 3, which is heavy-tailed.
n = 100
p = 5
X = matrix(rt(n * p, 3), n, p)
fit.cov = adaHuber.cov(X)
fit.cov$cov
Next, we present an example of adaptive Huber regression. Here we generate data from a linear model Y = θ0 + X θ + ε, where ε follows a t distribution, and estimate the intercept and coefficients by tuning-free Huber regression.
n = 200
p = 10
beta = rep(1.5, p + 1)
X = matrix(rnorm(n * p), n, p)
err = rt(n, 2)
Y = cbind(1, X) %*% beta + err
fit.adahuber = adaHuber.reg(X, Y, method = "adaptive")
beta.adahuber = fit.adahuber$coef
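As a point of comparison, the base-R sketch below fits the same kind of simulated data by ordinary least squares and by a Huber regression with a fixed threshold, solved by iteratively reweighted least squares (IRLS). The helper `huber_reg_fixed`, the choice `tau = 2`, and the use of IRLS are all assumptions made for illustration; they are not the algorithm or the calibrated τ used by adaHuber.reg.

```r
# Illustrative sketch (not adaHuber's algorithm): Huber regression with a
# fixed tau, fitted by iteratively reweighted least squares (IRLS).
huber_reg_fixed <- function(X, Y, tau, tol = 1e-8, max_iter = 200) {
  Z <- cbind(1, X)                 # design matrix with an intercept column
  coef <- qr.solve(Z, Y)           # least-squares starting value
  for (iter in seq_len(max_iter)) {
    r <- as.vector(Y - Z %*% coef)
    # Huber weights: 1 for small residuals, tau / |r| for large ones.
    w <- pmin(1, tau / pmax(abs(r), .Machine$double.eps))
    new_coef <- qr.solve(Z * sqrt(w), Y * sqrt(w))  # weighted least squares
    done <- max(abs(new_coef - coef)) < tol
    coef <- new_coef
    if (done) break
  }
  as.vector(coef)
}

set.seed(1)
n <- 200; p <- 10
beta <- rep(1.5, p + 1)
X <- matrix(rnorm(n * p), n, p)
Y <- cbind(1, X) %*% beta + rt(n, 2)

beta.ols <- as.vector(qr.solve(cbind(1, X), Y))
beta.huber <- huber_reg_fixed(X, Y, tau = 2)

# Maximum absolute estimation error of each fit.
c(ols = max(abs(beta.ols - beta)), huber = max(abs(beta.huber - beta)))
```

The first element of each coefficient vector is the intercept, matching the layout of `beta` in the simulation.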
Finally, we illustrate the use of l1-regularized Huber regression. Again, we generate data from a linear model Y = θ0 + X θ + ε, where θ is a high-dimensional sparse vector and ε follows a t distribution. We estimate the intercept and coefficients by Huber-Lasso regression, where the regularization parameter λ is calibrated by K-fold cross-validation, and the robustification parameter τ is chosen by a tuning-free procedure.
n = 100; p = 200; s = 5
beta = c(rep(1.5, s + 1), rep(0, p - s))
X = matrix(rnorm(n * p), n, p)
err = rt(n, 2)
Y = cbind(rep(1, n), X) %*% beta + err
fit.lasso = adaHuber.cv.lasso(X, Y)
beta.lasso = fit.lasso$coef
License: GPL-3.0
System requirements: C++11
Authors: Xiaoou Pan (xip024@ucsd.edu), Wen-Xin Zhou (wez243@ucsd.edu)
Maintainer: Xiaoou Pan (xip024@ucsd.edu)
Eddelbuettel, D. and Francois, R. (2011). Rcpp: Seamless R and C++ integration. J. Stat. Softw. 40 1-18.
Fan, J., Liu, H., Sun, Q. and Zhang, T. (2018). I-LAMM for sparse learning: Simultaneous control of algorithmic complexity and statistical error. Ann. Statist. 46 814-841.
Ke, Y., Minsker, S., Ren, Z., Sun, Q. and Zhou, W.-X. (2019). User-friendly covariance estimation for heavy-tailed distributions. Statist. Sci. 34 454-471.
Pan, X., Sun, Q. and Zhou, W.-X. (2021). Iteratively reweighted l1-penalized robust regression. Electron. J. Stat. 15 3287-3348.
Sun, Q., Zhou, W.-X. and Fan, J. (2020). Adaptive Huber regression. J. Amer. Stat. Assoc. 115 254-265.
Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2021). A new principle for tuning-free Huber regression. Stat. Sinica 31 2153-2177.