The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Type: Package
Title: Scalable Spike-and-Slab
Version: 1.0
Date: 2022-05-13
Description: A scalable Gibbs sampling implementation for high dimensional Bayesian regression with the continuous spike-and-slab prior. Niloy Biswas, Lester Mackey and Xiao-Li Meng, "Scalable Spike-and-Slab" (2022) <doi:10.48550/arXiv.2204.01668>.
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Imports: Rcpp, stats, TruncatedNormal
LinkingTo: Rcpp, RcppEigen
RoxygenNote: 7.1.2
NeedsCompilation: yes
Packaged: 2022-05-17 20:40:23 UTC; niloybiswas
Author: Niloy Biswas ORCID iD [aut, cre], Lester Mackey [aut], Xiao-Li Meng [aut]
Maintainer: Niloy Biswas <niloy_biswas@g.harvard.edu>
Depends: R (≥ 3.5.0)
Repository: CRAN
Date/Publication: 2022-05-18 17:00:07 UTC

Riboflavin GWAS dataset

Description

Dataset of riboflavin production by Bacillus subtilis containing n = 71 observations of a one-dimensional response (riboflavin production) and p = 4088 predictors (gene expressions). The one-dimensional response corresponds to riboflavin production.

Usage

data(riboflavin)

Format

A data frame containing a vector y of length 71 (responses) and a matrix X of dimension 71 by 4088 (gene expressions)

Details

The processed dataset is the same as in the R packages qut and hdi.

References

Buhlmann, P., Kalisch, M. and Meier, L. (2014) High-dimensional statistics with a view towards applications in biology. Annual Review of Statistics and its Applications 1, 255–278

Examples

data(riboflavin)
y <- as.vector(riboflavin$y)
X <- as.matrix(riboflavin$x)

spike_slab_linear

Description

Generates Markov chain targeting the posterior corresponding to Bayesian linear regression with spike and slab priors

Usage

spike_slab_linear(
  chain_length,
  X,
  y,
  tau0,
  tau1,
  q,
  a0 = 1,
  b0 = 1,
  rinit = NULL,
  verbose = FALSE,
  burnin = 0,
  store = TRUE,
  Xt = NULL,
  XXt = NULL,
  tau0_inverse = NULL,
  tau1_inverse = NULL
)

Arguments

chain_length

Markov chain length

X

matrix of length n by p

y

Response

tau0

prior hyperparameter (non-negative real)

tau1

prior hyperparameter (non-negative real)

q

prior hyperparameter (strictly between 0 and 1)

a0

prior hyperparameter (non-negative real)

b0

prior hyperparameter (non-negative real)

rinit

initial distribution of Markov chain (default samples from the prior)

verbose

print iteration of the Markov chain (boolean)

burnin

chain burnin (non-negative integer)

store

store chain trajectory (boolean)

Xt

Pre-calculated transpose of X

XXt

Pre-calculated matrix X*transpose(X) (n by n matrix)

tau0_inverse

Pre-calculated matrix inverse(I + tau0^2*XXt) (n by n matrix)

tau1_inverse

Pre-calculated matrix inverse(I + tau1^2*XXt) (n by n matrix)

Value

Output from Markov chain targeting the posterior corresponding to Bayesian linear regression with spike and slab priors

Examples

# Synthetic dataset
syn_data <- synthetic_data(n=100,p=200,s0=5,error_std=2,type='linear')
X <- syn_data$X
y <- syn_data$y

# Hyperparamters
params <- spike_slab_params(n=nrow(X),p=ncol(X))

# Run S^3
sss_chain <- spike_slab_linear(chain_length=4e3,burnin=1e3,X=X,y=y,
tau0=params$tau0,tau1=params$tau1,q=params$q,a0=params$a0,b0=params$b0,
verbose=FALSE,store=FALSE)

# Use posterior probabilities for variable selection
sss_chain$z_ergodic_avg[1:10]

spike_slab_logistic

Description

Generates Markov chain targeting the posterior corresponding to Bayesian logistic regression with spike and slab priors

Usage

spike_slab_logistic(
  chain_length,
  X,
  y,
  tau0,
  tau1,
  q,
  rinit = NULL,
  verbose = FALSE,
  burnin = 0,
  store = TRUE,
  Xt = NULL,
  XXt = NULL
)

Arguments

chain_length

Markov chain length

X

matrix of length n by p

y

Response

tau0

prior hyperparameter (non-negative real)

tau1

prior hyperparameter (non-negative real)

q

prior hyperparameter (strictly between 0 and 1)

rinit

initial distribution of Markov chain (default samples from the prior)

verbose

print iteration of the Markov chain (boolean)

burnin

chain burnin (non-negative integer)

store

store chain trajectory (boolean)

Xt

Pre-calculated transpose of X

XXt

Pre-calculated matrix X*transpose(X) (n by n matrix)

Value

Output from Markov chain targeting the posterior corresponding to Bayesian logistic regression with spike and slab priors

Examples

# Synthetic dataset
syn_data <- synthetic_data(n=100,p=200,s0=5,error_std=2,type='logistic')
X <- syn_data$X
y <- syn_data$y

# Hyperparamters
params <- spike_slab_params(n=nrow(X),p=ncol(X))

# Run S^3
sss_chain <- spike_slab_logistic(chain_length=4e3,burnin=1e3,X=X,y=y,
tau0=params$tau0,tau1=params$tau1,q=params$q,verbose=FALSE,store=FALSE)

# Use posterior probabilities for variable selection
sss_chain$z_ergodic_avg[1:10]

spike_slab_params

Description

Generates hyperparameters for spike-and-slab

Usage

spike_slab_params(n, p)

Arguments

n

number of observations

p

number of covariates

Value

spike-and-slab hyperparameters q, tau0, tau1, a0, b0

Examples

hyper_params <- spike_slab_params(n=100,p=200)
print(hyper_params)

spike_slab_probit

Description

Generates Markov chain targeting the posterior corresponding to Bayesian probit regression with spike and slab priors

Usage

spike_slab_probit(
  chain_length,
  X,
  y,
  tau0,
  tau1,
  q,
  rinit = NULL,
  verbose = FALSE,
  burnin = 0,
  store = TRUE,
  Xt = NULL,
  XXt = NULL,
  tau0_inverse = NULL,
  tau1_inverse = NULL
)

Arguments

chain_length

Markov chain length

X

matrix of length n by p

y

Response

tau0

prior hyperparameter (non-negative real)

tau1

prior hyperparameter (non-negative real)

q

prior hyperparameter (strictly between 0 and 1)

rinit

initial distribution of Markov chain (default samples from the prior)

verbose

print iteration of the Markov chain (boolean)

burnin

chain burnin (non-negative integer)

store

store chain trajectory (boolean)

Xt

Pre-calculated transpose of X

XXt

Pre-calculated matrix X*transpose(X) (n by n matrix)

tau0_inverse

Pre-calculated matrix inverse(I + tau0^2*XXt) (n by n matrix)

tau1_inverse

Pre-calculated matrix inverse(I + tau1^2*XXt) (n by n matrix)

Value

Output from Markov chain targeting the posterior corresponding to Bayesian logistic regression with spike and slab priors

Examples

# Synthetic dataset
syn_data <- synthetic_data(n=100,p=200,s0=5,error_std=2,type='probit')
X <- syn_data$X
Xt <- t(X)
y <- syn_data$y

# Hyperparamters
params <- spike_slab_params(n=nrow(X),p=ncol(X))

# Run S^3
sss_chain <- spike_slab_probit(chain_length=4e3,burnin=1e3,X=X,y=y,
tau0=params$tau0,tau1=params$tau1,q=params$q,verbose=FALSE,store=FALSE)

# Use posterior probabilities for variable selection
sss_chain$z_ergodic_avg[1:10]

synthetic_data

Description

Generates synthetic linear and logistic regression data

Usage

synthetic_data(
  n,
  p,
  s0,
  error_std,
  type = "linear",
  scale = TRUE,
  signal = "constant"
)

Arguments

n

number of observations

p

number of covariates

s0

sparsity (number of non-zero components of the true signal)

error_std

Standard deviation of the Gaussian noise (linear regression only)

type

dataset type ('linear' or 'logistic')

scale

design matrix X has columns mean zero and standard deviation 1 (TRUE or FALSE)

signal

non-zero components of the true signal ('constant' or 'deacy')

Value

Design matrix, response and true signal vector for linear and logistic regression

Examples

syn_data <- synthetic_data(n=100,p=200,s0=5,error_std=2)

# syn_data$X is an n by p design matrix
dim(syn_data$X)

# syn_data$y is a length n response vector
length(syn_data$y) 

# syn_data$true_beta is a length n response vector with only the first s0 entries non-zero
all(syn_data$true_beta[1:5]!=0)
all(syn_data$true_beta[-c(1:5)]==0)

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.