Type: Package
Title: Optimization via Subsampling (OPTS)
Version: 0.1
Date: 2022-05-20
Maintainer: Mihai Giurcanu <giurcanu@uchicago.edu>
Author: Mihai Giurcanu [aut, cre], Marinela Capanu [aut, ctb], Colin Begg [aut], Mithat Gonen [aut]
Imports: MASS, cvTools, changepoint
Description: Subsampling-based variable selection for low-dimensional generalized linear models. The methods repeatedly subsample the data, minimizing an information criterion (AIC/BIC) over a sequence of nested models for each subsample. Reference: Marinela Capanu, Mihai Giurcanu, Colin B Begg, Mithat Gonen, Subsampling based variable selection for generalized linear models.
License: GPL-2
NeedsCompilation: no
Packaged: 2022-05-24 14:16:53 UTC; mgiurcanu
Repository: CRAN
Date/Publication: 2022-05-25 07:50:08 UTC

Optimization via Subsampling (OPTS)

Description

opts computes the OPTS maximum likelihood estimate (MLE) in the low-dimensional case.

Usage

opts(X, Y, m, crit = "aic", prop_split = 0.5, cutoff = 0.75, ...)

Arguments

X

n x p covariate matrix (without intercept)

Y

n x 1 binary response vector

m

number of subsamples

crit

information criterion to select the variables: (a) aic = minimum AIC and (b) bic = minimum BIC

prop_split

ratio of the subsample size to the sample size; default value = 0.5

cutoff

cutoff used to select the variables using the stability selection criterion, default value = 0.75

...

other arguments passed to the glm function, e.g., family = "binomial"
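
The way m, crit, and prop_split interact can be illustrated with a single subsampling step in base R. This is a minimal sketch under stated assumptions, not the package's implementation: the subsample is drawn without replacement, the predictors are ordered by the absolute z-statistic of a full fit on the subsample, and the names idx, ord, aic_path, and k_best are hypothetical.

```r
# Sketch only: one subsampling step with assumed details (not the opts source).
set.seed(1)
n <- 200
p <- 6
X <- matrix(rnorm(n * p), n, p)
Y <- rbinom(n, 1, plogis(0.5 + X[, 1] - X[, 2]))

prop_split <- 0.5
idx <- sample(n, size = floor(prop_split * n))  # assumed: without replacement
Xs <- X[idx, , drop = FALSE]
Ys <- Y[idx]

# Assumed ordering rule: rank predictors by |z| from the full subsample fit.
full <- glm(Ys ~ Xs, family = binomial)
z <- abs(summary(full)$coefficients[-1, "z value"])
ord <- order(z, decreasing = TRUE)

# Nested sequence: intercept only, then the top 1, 2, ..., p predictors.
aic_path <- sapply(0:p, function(k) {
  fit <- if (k == 0) {
    glm(Ys ~ 1, family = binomial)
  } else {
    glm(Ys ~ Xs[, ord[seq_len(k)], drop = FALSE], family = binomial)
  }
  AIC(fit)  # BIC(fit) would play the same role for crit = "bic"
})

k_best <- which.min(aic_path) - 1L           # model size with the smallest AIC
selected <- seq_len(p) %in% ord[seq_len(k_best)]
```

Repeating this step m times yields one logical selection vector per subsample, which is the raw material for the selection frequencies.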

Value

opts returns a list:

betahat

OPTS MLE of regression parameter vector

Jhat

estimated set of active predictors (TRUE/FALSE) corresponding to the OPTS MLE

SE

standard error of OPTS MLE

freqs

relative frequency of selection for all variables
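
The relationship between freqs, cutoff, and Jhat can be illustrated with a toy selection matrix. The rule used here (keep a variable when its selection frequency is at least cutoff) is an assumption about how the stability selection criterion is applied, and the matrix sel is hypothetical data.

```r
# Hypothetical selection indicators: 4 subsamples (rows) by 4 variables (columns).
sel <- rbind(
  c(TRUE, TRUE,  FALSE, FALSE),
  c(TRUE, FALSE, FALSE, TRUE),
  c(TRUE, TRUE,  FALSE, FALSE),
  c(TRUE, TRUE,  TRUE,  FALSE)
)

cutoff <- 0.75
freqs <- colMeans(sel)    # relative selection frequency of each variable
Jhat <- freqs >= cutoff   # assumed rule: keep variables selected often enough

print(freqs)
print(Jhat)
```

With these values the first two variables are retained, since they are selected in all and in three of the four subsamples, respectively.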

Examples

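library(MASS)  # provides mvrnorm
set.seed(1)    # make the simulated data reproducible
P = 15
N = 100
M = 20
BETA_vector = c(0.5, rep(0.5, 2), rep(0.5, 2), rep(0, P - 5))
MU_vector = numeric(P)
SIGMA_mat = diag(P)

X <- mvrnorm(N, MU_vector, Sigma = SIGMA_mat)
# BETA_vector has length P, so it multiplies X directly (no intercept column)
linearPred <- drop(X %*% BETA_vector)
Y <- rbinom(N, 1, plogis(linearPred))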

# OPTS-AIC MLE
opts(X, Y, 10, family = "binomial")


Threshold OPTimization via Subsampling (OPTS_TH)

Description

opts_th computes the threshold OPTS maximum likelihood estimate (MLE) in the low-dimensional case.

Usage

opts_th(X, Y, m, crit = "aic", type = "binseg", prop_split = 0.5,
  prop_trim = 0.2, q_tail = 0.5, ...)

Arguments

X

n x p covariate matrix (without intercept)

Y

n x 1 binary response vector

m

number of subsamples

crit

information criterion to select the variables: (a) aic = minimum AIC and (b) bic = minimum BIC

type

method used to minimize the trimmed and averaged information criterion: (a) min = observed minimum of the trimmed average subsample information criterion, (b) sd = observed minimum using the 0.25sd rule (corresponding to OPTS-min in the paper), (c) pelt = PELT changepoint algorithm (corresponding to OPTS-PELT in the paper), (d) binseg = binary segmentation changepoint algorithm (corresponding to OPTS-BinSeg in the paper), (e) amoc = AMOC (at most one change) changepoint method; default value = "binseg"

prop_split

ratio of the subsample size to the sample size; default value = 0.5

prop_trim

proportion that defines the trimmed mean; default value = 0.2

q_tail

quantile of the minimum and maximum p-values across the subsample cutpoints, used to define the range of cutpoints; default value = 0.5

...

other arguments passed to the glm function, e.g., family = "binomial"
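
The role of prop_trim with type = "min" can be illustrated in base R. This sketch assumes that prop_trim maps onto the trim argument of mean(), which drops that fraction of observations from each end; the AIC values, the cutpoint grid, and the names aic and aic_trim are hypothetical.

```r
# Sketch only: trimmed average of a subsample criterion over a cutpoint grid.
set.seed(2)
m <- 20
cutpoints <- c(0.01, 0.05, 0.10, 0.20, 0.50)

# Hypothetical AIC values: one row per subsample, one column per cutpoint.
base <- c(130, 126, 121, 123, 129)
aic <- matrix(rep(base, each = m) + rnorm(m * length(cutpoints)), nrow = m)
aic[1:3, 3] <- aic[1:3, 3] + 100  # three outlying subsamples at the third cutpoint

prop_trim <- 0.2
aic_trim <- apply(aic, 2, mean, trim = prop_trim)  # trimmed mean per cutpoint
cuthat <- cutpoints[which.min(aic_trim)]           # the "min" rule

# For comparison: the untrimmed mean is pulled away by the outliers.
cut_plain <- cutpoints[which.min(colMeans(aic))]
```

Here the trimmed average recovers the third cutpoint, while the plain average is distorted by the three outlying subsamples and picks the fourth.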

Value

opts_th returns a list:

betahat

threshold OPTS MLE of the regression parameters

SE

standard error of the threshold OPTS MLE

Jhat

set of active predictors (TRUE/FALSE) corresponding to the threshold OPTS MLE

cuthat

estimated cutpoint for variable selection

pval

marginal p-values from univariate fit

cutpoits

subsample cutpoints

aic_mean

mean subsample AIC

bic_mean

mean subsample BIC
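
How pval, cuthat, Jhat, betahat, and SE might relate to one another can be sketched in base R. Everything here is an assumption made for illustration: that a predictor is active when its marginal p-value is at most the cutpoint, that betahat includes an intercept, and that inactive coefficients are reported as zero. The cutpoint value is hypothetical rather than estimated.

```r
# Sketch only: thresholding marginal p-values and refitting on the active set.
set.seed(3)
n <- 300
p <- 5
X <- matrix(rnorm(n * p), n, p)
Y <- rbinom(n, 1, plogis(0.5 + 1.5 * X[, 1] - 1.5 * X[, 2]))

# Marginal p-values from univariate logistic fits, one per predictor.
pval <- sapply(seq_len(p), function(j) {
  summary(glm(Y ~ X[, j], family = binomial))$coefficients[2, "Pr(>|z|)"]
})

cuthat <- 0.001        # hypothetical cutpoint
Jhat <- pval <= cuthat # assumed rule for the active set

# Refit on the active predictors; inactive entries stay at zero (assumed).
fit <- glm(Y ~ X[, Jhat, drop = FALSE], family = binomial)
est <- summary(fit)$coefficients
keep <- c(TRUE, Jhat)  # intercept plus active predictors
betahat <- numeric(p + 1)
SE <- numeric(p + 1)
betahat[keep] <- est[, "Estimate"]
SE[keep] <- est[, "Std. Error"]
```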

Examples

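library(MASS)  # provides mvrnorm
set.seed(1)    # make the simulated data reproducible
P = 15
N = 100
M = 20
BETA_vector = c(0.5, rep(0.5, 2), rep(0.5, 2), rep(0, P - 5))
MU_vector = numeric(P)
SIGMA_mat = diag(P)

X <- mvrnorm(N, MU_vector, Sigma = SIGMA_mat)
# BETA_vector has length P, so it multiplies X directly (no intercept column)
linearPred <- drop(X %*% BETA_vector)
Y <- rbinom(N, 1, plogis(linearPred))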

# Threshold OPTS-BinSeg MLE
opts_th(X, Y, M, family = "binomial")
