Title: Mixed, Low-Rank, and Sparse Multivariate Regression on High-Dimensional Data
Version: 0.1.0
Description: Mixed, low-rank, and sparse multivariate regression ('mixedLSR') provides tools for performing mixture regression when the coefficient matrix is low-rank and sparse. 'mixedLSR' allows subgroup identification by alternating optimization with simulated annealing to encourage global optimum convergence. This method is data-adaptive, automatically performing parameter selection to identify low-rank substructures in the coefficient matrix.
License: MIT + file LICENSE
Encoding: UTF-8
RoxygenNote: 7.2.1
Depends: R (≥ 4.1.0)
Imports: grpreg, purrr, MASS, stats, ggplot2
Suggests: knitr, rmarkdown, mclust
VignetteBuilder: knitr
BugReports: https://github.com/alexanderjwhite/mixedLSR
URL: https://alexanderjwhite.github.io/mixedLSR/
NeedsCompilation: no
Packaged: 2022-11-04 10:33:31 UTC; whitealj
Author: Alexander White [aut, cre], Sha Cao [aut], Yi Zhao [ctb], Chi Zhang [ctb]
Maintainer: Alexander White <whitealj@iu.edu>
Repository: CRAN
Date/Publication: 2022-11-04 20:00:02 UTC

Compute Bayesian information criterion for a mixedLSR model

Description

Compute Bayesian information criterion for a mixedLSR model

Usage

bic_lsr(a, n, llik)

Arguments

a

A list of coefficient matrices.

n

The sample size.

llik

The log-likelihood of the model.

Value

The BIC.

Examples

n <- 50
simulate <- simulate_lsr(n)
model <- mixed_lsr(simulate$x, simulate$y, k = 2, init_lambda = c(1,1), alt_iter = 0)
bic_lsr(model$A, n = n, model$llik)
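
For orientation, the textbook Bayesian information criterion is -2 * llik + df * log(n). The base R sketch below illustrates that formula only; it is not the package's implementation. The helper name `bic_sketch` and the choice to count nonzero coefficients as the degrees of freedom are assumptions made for illustration.

```r
# Generic BIC sketch. Assumption: degrees of freedom = number of nonzero
# coefficients across all matrices; mixedLSR may count them differently.
bic_sketch <- function(a, n, llik) {
  df <- sum(vapply(a, function(mat) sum(mat != 0), numeric(1)))
  -2 * llik + df * log(n)
}

# Two toy coefficient matrices with two nonzero entries in total.
a <- list(matrix(c(1, 0, 0, 2), 2, 2), matrix(0, 2, 2))
bic_value <- bic_sketch(a, n = 50, llik = -120)
bic_value
```

Under this convention a lower value indicates a better trade-off between fit and sparsity.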

Internal Alternating Optimization Function

Description

Internal Alternating Optimization Function

Usage

fct_alt_optimize(
  x,
  y,
  k,
  clust_assign,
  lambda,
  alt_iter,
  anneal_iter,
  em_iter,
  temp,
  mu,
  eps,
  accept_prob,
  sim_N,
  verbose
)

Arguments

x

A matrix of predictors.

y

A matrix of responses.

k

The number of groups.

clust_assign

The current clustering assignment.

lambda

A vector of penalization parameters.

alt_iter

The maximum number of times to alternate between the classification expectation maximization algorithm and the simulated annealing algorithm.

anneal_iter

The maximum number of simulated annealing iterations.

em_iter

The maximum number of EM iterations.

temp

The initial simulated annealing temperature, temp > 0.

mu

The fraction by which the simulated annealing temperature is decreased, 0 < mu < 1. Once the best configuration can no longer be improved, the temperature T is reduced to mu * T.

eps

The final simulated annealing temperature, eps > 0.

accept_prob

The simulated annealing probability of accepting a new assignment, 0 < accept_prob < 1. When closer to 1, trial assignments are only small perturbations of the current assignment. When closer to 0, trial assignments are closer to random.

sim_N

The simulated annealing number of iterations for reaching equilibrium.

verbose

A boolean indicating whether to print to screen.

Value

A final fit of mixedLSR.


Internal Double Penalized Projection Function

Description

Internal Double Penalized Projection Function

Usage

fct_dpp(
  y,
  x,
  rank,
  lambda = NULL,
  alpha = 2 * sqrt(3),
  beta = 1,
  sigma,
  ptype = "grLasso",
  y_sparse = TRUE
)

Arguments

y

A matrix of responses.

x

A matrix of predictors.

rank

The rank, if known.

lambda

A vector of penalization parameters.

alpha

A positive constant DPP parameter.

beta

A positive constant DPP parameter.

sigma

An estimated standard deviation.

ptype

A group penalized regression penalty type. See grpreg.

y_sparse

Should Y coefficients be treated as sparse?

Value

A list containing estimated coefficients, covariance, and penalty parameters.


Internal EM Algorithm

Description

Internal EM Algorithm

Usage

fct_em(x, y, k, lambda, clust_assign, lik_track, em_iter, verbose)

Arguments

x

A matrix of predictors.

y

A matrix of responses.

k

The number of groups.

lambda

A vector of penalization parameters.

clust_assign

The current clustering assignment.

lik_track

A vector storing the log-likelihood by iteration.

em_iter

The maximum number of EM iterations.

verbose

A boolean indicating whether to print to screen.

Value

A mixedLSR model.


Internal Posterior Calculation

Description

Internal Posterior Calculation

Usage

fct_gamma(
  x,
  y,
  k,
  N,
  clust_assign,
  pi_vec,
  lambda,
  alpha,
  beta,
  y_sparse,
  rank,
  max_rank
)

Arguments

x

A matrix of predictors.

y

A matrix of responses.

k

The number of groups.

N

The sample size.

clust_assign

The current clustering assignment.

pi_vec

A vector of mixing probabilities for each cluster label.

lambda

A vector of penalization parameters.

alpha

A positive constant DPP parameter.

beta

A positive constant DPP parameter.

y_sparse

Should Y coefficients be treated as sparse?

rank

The rank, if known.

max_rank

The maximum allowed rank.

Value

A list with the posterior, coefficients, and estimated covariance.
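
As a rough illustration of a posterior calculation in a mixture model, the base R sketch below turns per-cluster log-likelihoods and mixing probabilities into posterior membership probabilities. It is not the package's code: the function name `posterior_sketch` and the N x k input matrix `loglik` are hypothetical, and the log-sum-exp stabilization is a standard technique chosen here for the example.

```r
# Posterior membership probabilities from per-cluster log-likelihoods.
# loglik: N x k matrix (hypothetical input); pi_vec: length-k mixing vector.
posterior_sketch <- function(loglik, pi_vec) {
  # Add the log mixing proportion of cluster j to column j.
  log_w <- sweep(loglik, 2, log(pi_vec), "+")
  # Subtract each row's maximum before exponentiating (numerical stability).
  log_w <- log_w - apply(log_w, 1, max)
  w <- exp(log_w)
  # Normalize so that every row sums to one.
  w / rowSums(w)
}

loglik <- matrix(c(-1, -5, -3, -2), nrow = 2)
gamma <- posterior_sketch(loglik, pi_vec = c(0.5, 0.5))
rowSums(gamma)
```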


Internal Partition Initialization Function

Description

Internal Partition Initialization Function

Usage

fct_initialize(k, N)

Arguments

k

The number of groups.

N

The sample size.

Value

A vector of assignments.
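
A random starting partition can be sketched in base R as follows. This is an illustration under assumptions, not the package's implementation: `init_sketch` is a hypothetical name, and labels are drawn uniformly without guaranteeing that every group is non-empty.

```r
# Draw N labels uniformly from 1..k (assumption: no balancing of groups).
init_sketch <- function(k, N) {
  sample(seq_len(k), size = N, replace = TRUE)
}

set.seed(1)
start_assign <- init_sketch(k = 2, N = 10)
table(start_assign)
```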


Internal Likelihood Function

Description

Internal Likelihood Function

Usage

fct_j_lik(
  x,
  y,
  k,
  clust_assign,
  lambda,
  alpha = 2 * sqrt(3),
  beta = 1,
  y_sparse = TRUE,
  max_rank = 3,
  rank = NULL
)

Arguments

x

A matrix of predictors.

y

A matrix of responses.

k

The number of groups.

clust_assign

A vector of cluster labels.

lambda

A vector of penalization parameters.

alpha

A positive constant DPP parameter.

beta

A positive constant DPP parameter.

y_sparse

Should Y coefficients be treated as sparse?

max_rank

The maximum allowed rank.

rank

The rank, if known.

Value

The weighted log-likelihood.


Internal Log-Likelihood Function

Description

Internal Log-Likelihood Function

Usage

fct_log_lik(mu_mat, sig_vec, y, N, m)

Arguments

mu_mat

The mean matrix.

sig_vec

A vector of sigma.

y

The output matrix.

N

The sample size.

m

The number of y features.

Value

A posterior matrix.
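
To make the arguments concrete, the sketch below evaluates a Gaussian log-likelihood per observation in base R. It assumes independent normal responses with one standard deviation per response feature, which may differ from what the package computes; `loglik_sketch` is a hypothetical name.

```r
# Per-observation Gaussian log-likelihood, summed over the m response features.
# Assumption: responses are independent with feature-specific sd sig_vec[j].
loglik_sketch <- function(mu_mat, sig_vec, y) {
  # Expand sig_vec (length m) so that column j of y uses sig_vec[j].
  sd_mat <- matrix(sig_vec, nrow = nrow(y), ncol = ncol(y), byrow = TRUE)
  dens <- stats::dnorm(y, mean = mu_mat, sd = sd_mat, log = TRUE)
  rowSums(matrix(dens, nrow = nrow(y)))
}

y <- matrix(0, nrow = 3, ncol = 2)
ll <- loglik_sketch(mu_mat = y, sig_vec = c(1, 1), y = y)
ll
```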


Internal Perturb Function

Description

Internal Perturb Function

Usage

fct_new_assign(assign, k, p)

Arguments

assign

The current clustering assignments.

k

The number of groups.

p

The acceptance probability.

Value

A perturbed assignment.
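
One way to read the role of p is sketched below in base R: each label is kept with probability p and otherwise redrawn uniformly, so values near 1 give small perturbations and values near 0 give nearly random assignments. The name `perturb_sketch` and this exact proposal rule are assumptions, not the package's implementation.

```r
# Keep each label with probability p; otherwise redraw it uniformly from 1..k.
perturb_sketch <- function(assign, k, p) {
  keep <- stats::runif(length(assign)) < p
  proposal <- sample(seq_len(k), length(assign), replace = TRUE)
  ifelse(keep, assign, proposal)
}

set.seed(1)
current <- rep(1:2, each = 5)
trial <- perturb_sketch(current, k = 2, p = 0.95)
sum(trial != current)  # number of labels that changed
```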


Internal Pi Function

Description

Internal Pi Function

Usage

fct_pi_vec(clust_assign, k, N)

Arguments

clust_assign

The current clustering assignment.

k

The number of groups.

N

The sample size.

Value

A mixing vector.
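
A mixing vector is commonly the share of observations carrying each label. The base R sketch below shows that calculation; `pi_sketch` is a hypothetical name, and the package may compute the vector differently.

```r
# Proportion of observations assigned to each of the k groups.
pi_sketch <- function(clust_assign, k, N) {
  tabulate(clust_assign, nbins = k) / N
}

pi_hat <- pi_sketch(c(1, 1, 2, 1), k = 2, N = 4)
pi_hat
```

Using `tabulate()` with `nbins = k` keeps a zero entry for any group that is currently empty.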


Internal Rank Estimation Function

Description

Internal Rank Estimation Function

Usage

fct_rank(x, y, sigma, eta)

Arguments

x

A matrix of predictors.

y

A matrix of responses.

sigma

An estimated noise level.

eta

A rank selection parameter.

Value

The estimated rank.
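
A common approach to rank selection in reduced-rank regression is to count the singular values of the fitted responses that exceed a noise-scaled threshold. The base R sketch below follows that idea; it is not taken from the package, and both the name `rank_sketch` and the threshold `eta * sigma` are assumptions.

```r
# Count singular values of the projected responses above eta * sigma.
rank_sketch <- function(x, y, sigma, eta) {
  # Project y onto the column space of x (least-squares fitted values).
  fitted <- qr.fitted(qr(x), y)
  sum(svd(fitted)$d > eta * sigma)
}

set.seed(1)
x <- matrix(rnorm(60), 20, 3)
b <- c(1, -1, 2) %o% c(1, 2, 3, 4)  # rank-1 coefficient matrix (3 x 4)
y <- x %*% b                        # noiseless responses
est_rank <- rank_sketch(x, y, sigma = 1, eta = 1e-6)
est_rank
```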


Internal Penalty Parameter Selection Function.

Description

Internal Penalty Parameter Selection Function.

Usage

fct_select_lambda(
  x,
  y,
  k,
  clust_assign = NULL,
  initial = FALSE,
  type = "all",
  verbose
)

Arguments

x

A matrix of predictors.

y

A matrix of responses.

k

The number of groups.

clust_assign

The current clustering assignment.

initial

An initial penalty parameter.

type

A type.

verbose

A boolean indicating whether to print to screen.

Value

A selected penalty parameter.


Internal Sigma Estimation Function

Description

Internal Sigma Estimation Function

Usage

fct_sigma(y, N, m)

Arguments

y

A matrix of responses.

N

The sample size.

m

The number of outcome variables.

Value

The estimated sigma.


Internal Simulated Annealing Function

Description

Internal Simulated Annealing Function

Usage

fct_sim_anneal(
  x,
  y,
  k,
  init_assign,
  lambda,
  temp,
  mu,
  eps,
  accept_prob,
  sim_N,
  track,
  anneal_iter = 1000,
  verbose
)

Arguments

x

A matrix of predictors.

y

A matrix of responses.

k

The number of groups.

init_assign

An initial clustering assignment.

lambda

A vector of penalization parameters.

temp

The initial simulated annealing temperature, temp > 0.

mu

The fraction by which the simulated annealing temperature is decreased, 0 < mu < 1. Once the best configuration can no longer be improved, the temperature T is reduced to mu * T.

eps

The final simulated annealing temperature, eps > 0.

accept_prob

The simulated annealing probability of accepting a new assignment, 0 < accept_prob < 1. When closer to 1, trial assignments are only small perturbations of the current assignment. When closer to 0, trial assignments are closer to random.

sim_N

The simulated annealing number of iterations for reaching equilibrium.

track

A likelihood tracking vector.

anneal_iter

The maximum number of simulated annealing iterations.

verbose

A boolean indicating whether to print to screen.

Value

An updated clustering vector.
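
The way the temperature arguments interact can be illustrated with a generic simulated annealing loop in base R. This is a sketch under stated assumptions rather than the package's algorithm: `anneal_sketch` and the objective `score` are hypothetical, the Metropolis acceptance rule is the textbook one, and the temperature is simply cooled by the factor mu after every sim_N proposals.

```r
# Generic simulated annealing over cluster labels, maximizing score().
anneal_sketch <- function(score, init_assign, k, temp = 10, mu = 0.9,
                          eps = 1e-3, accept_prob = 0.9, sim_N = 20,
                          anneal_iter = 1000) {
  current <- init_assign
  best <- init_assign
  iter <- 0
  while (temp > eps && iter < anneal_iter) {
    for (i in seq_len(sim_N)) {
      # Proposal: keep each label with probability accept_prob.
      keep <- stats::runif(length(current)) < accept_prob
      redraw <- sample(seq_len(k), length(current), replace = TRUE)
      trial <- ifelse(keep, current, redraw)
      delta <- score(trial) - score(current)
      # Metropolis rule: accept improvements, sometimes accept worse moves.
      if (delta >= 0 || stats::runif(1) < exp(delta / temp)) current <- trial
      if (score(current) > score(best)) best <- current
    }
    temp <- mu * temp  # cool down
    iter <- iter + 1
  }
  best
}

# Toy objective: agreement with a hidden labelling.
truth <- rep(1:2, each = 5)
set.seed(1)
start <- sample(1:2, 10, replace = TRUE)
fit <- anneal_sketch(function(z) sum(z == truth), start, k = 2)
sum(fit == truth)
```

A high temperature accepts many worse proposals and explores widely; as the temperature approaches eps the search becomes nearly greedy.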


Internal Weighted Log Likelihood Function

Description

Internal Weighted Log Likelihood Function

Usage

fct_weighted_ll(gamma)

Arguments

gamma

A posterior matrix.

Value

A weighted log-likelihood vector.


Mixed Low-Rank and Sparse Multivariate Regression for High-Dimensional Data

Description

Mixed Low-Rank and Sparse Multivariate Regression for High-Dimensional Data

Usage

mixed_lsr(
  x,
  y,
  k,
  nstart = 1,
  init_assign = NULL,
  init_lambda = NULL,
  alt_iter = 5,
  anneal_iter = 1000,
  em_iter = 1000,
  temp = 1000,
  mu = 0.95,
  eps = 1e-06,
  accept_prob = 0.95,
  sim_N = 200,
  verbose = TRUE
)

Arguments

x

A matrix of predictors.

y

A matrix of responses.

k

The number of groups.

nstart

The number of random initializations. The result with the maximum likelihood is returned.

init_assign

A vector of initial assignments, NULL by default.

init_lambda

A vector with the values to initialize the penalization parameter for each group, e.g., c(1,1,1). Set to NULL by default.

alt_iter

The maximum number of times to alternate between the classification expectation maximization algorithm and the simulated annealing algorithm.

anneal_iter

The maximum number of simulated annealing iterations.

em_iter

The maximum number of EM iterations.

temp

The initial simulated annealing temperature, temp > 0.

mu

The fraction by which the simulated annealing temperature is decreased, 0 < mu < 1. Once the best configuration can no longer be improved, the temperature T is reduced to mu * T.

eps

The final simulated annealing temperature, eps > 0.

accept_prob

The simulated annealing probability of accepting a new assignment, 0 < accept_prob < 1. When closer to 1, trial assignments are only small perturbations of the current assignment. When closer to 0, trial assignments are closer to random.

sim_N

The simulated annealing number of iterations for reaching equilibrium.

verbose

A boolean indicating whether to print to screen.

Value

A list containing the likelihood, the partition, the coefficient matrices, and the BIC.

Examples

simulate <- simulate_lsr(50)
mixed_lsr(simulate$x, simulate$y, k = 2, init_lambda = c(1,1), alt_iter = 0)

Heatmap Plot of the mixedLSR Coefficient Matrices

Description

Heatmap Plot of the mixedLSR Coefficient Matrices

Usage

plot_lsr(a, abs = TRUE)

Arguments

a

A coefficient matrix from a mixed_lsr model.

abs

A boolean for taking the absolute value of the coefficient matrix.

Value

A ggplot2 heatmap of the coefficient matrix, separated by subgroup.

Examples

simulate <- simulate_lsr()
plot_lsr(simulate$a)

Simulate Heterogeneous, Low-Rank, and Sparse Data

Description

Simulate Heterogeneous, Low-Rank, and Sparse Data

Usage

simulate_lsr(
  N = 100,
  k = 2,
  p = 30,
  m = 35,
  b = 1,
  d = 20,
  h = 0.2,
  case = "independent"
)

Arguments

N

The sample size, default = 100.

k

The number of groups, default = 2.

p

The number of predictor features, default = 30.

m

The number of response features, default = 35.

b

The signal-to-noise ratio, default = 1.

d

The singular value, default = 20.

h

The lower bound for the singular matrix simulation, default = 0.2.

case

The covariance case, "independent" or "dependent", default = "independent".

Value

A list of simulation values, including x matrix, y matrix, coefficients and true clustering assignments.

Examples

simulate_lsr()
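
To show what "low-rank and sparse" means for a coefficient matrix, the base R sketch below builds one by hand and generates responses from it. It does not reproduce the simulation design of `simulate_lsr()`; the number of active predictors `s`, the rank `r`, and the standard normal noise are all assumptions made for the illustration.

```r
set.seed(1)
N <- 100; p <- 30; m <- 35
r <- 2    # rank of the coefficient matrix (hypothetical choice)
s <- 10   # number of active predictors (hypothetical choice)

# Row-sparse left factor: only the first s predictors have nonzero rows.
U <- rbind(matrix(rnorm(s * r), s, r), matrix(0, p - s, r))
V <- matrix(rnorm(m * r), m, r)
A <- U %*% t(V)  # p x m coefficient matrix of rank r

x <- matrix(rnorm(N * p), N, p)
y <- x %*% A + matrix(rnorm(N * m), N, m)

qr(A)$rank           # numerical rank of A
sum(rowSums(A != 0) > 0)  # number of predictors with nonzero coefficients
```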
