| Type: | Package | 
| Title: | High-Dimensional Spatial Covariate-Augmented Overdispersed Poisson Factor Model | 
| Version: | 1.3 | 
| Date: | 2025-03-27 | 
| Author: | Wei Liu [aut, cre], Qingzhi Zhong [aut] | 
| Maintainer: | Wei Liu <liuwei8@scu.edu.cn> | 
| Description: | A spatial covariate-augmented overdispersed Poisson factor model is proposed for efficient latent representation learning on high-dimensional, large-scale spatial count data with additional covariates. | 
| License: | GPL-3 | 
| URL: | https://github.com/feiyoung/SpaCOAP | 
| BugReports: | https://github.com/feiyoung/SpaCOAP/issues | 
| Imports: | LaplacesDemon, stats, methods, Matrix, MASS, Rcpp (≥ 1.0.10) | 
| Depends: | irlba, R (≥ 3.5.0) | 
| Suggests: | knitr, rmarkdown | 
| LinkingTo: | Rcpp, RcppArmadillo | 
| VignetteBuilder: | knitr | 
| Encoding: | UTF-8 | 
| RoxygenNote: | 7.3.2 | 
| NeedsCompilation: | yes | 
| Packaged: | 2025-03-27 13:55:40 UTC; 10297 | 
| Repository: | CRAN | 
| Date/Publication: | 2025-03-27 14:30:01 UTC | 
Fit the SpaCOAP model
Description
Fit the spatial covariate-augmented overdispersed Poisson factor model
Usage
SpaCOAP(
  X_count,
  Adj_sp,
  H,
  Z = matrix(1, nrow(X_count), 1),
  offset = rep(0, nrow(X_count)),
  rank_use = 5,
  q = 15,
  epsELBO = 1e-08,
  maxIter = 30,
  verbose = TRUE,
  add_IC_inter = FALSE,
  seed = 1,
  algo = 1
)
Arguments
| X_count | a count matrix, the observed count matrix with shape n-by-p. | 
| Adj_sp | a sparse matrix, the weighted adjacency matrix. | 
| H | an n-by-d matrix, the covariate matrix with low-rank regression coefficient matrix. | 
| Z | an optional matrix, the fixed-dimensional covariate matrix with control variables; defaults to a column vector of ones if there are no additional covariates. | 
| offset | an optional vector, the offset for each unit; defaults to a vector of zeros. | 
| rank_use | an optional integer, the rank of the regression coefficient matrix; default is 5. | 
| q | an optional integer, the number of factors; default is 15. | 
| epsELBO | an optional positive value, the tolerance on the relative variation rate of the evidence lower bound (ELBO); default is '1e-8'. | 
| maxIter | the maximum number of iterations of the VEM algorithm; default is 30. | 
| verbose | a logical value, whether to print information during iteration; default is TRUE. | 
| add_IC_inter | a logical value, whether to add the identifiability condition within the iterative algorithm or add it after the algorithm converges; default is FALSE. | 
| seed | an integer, the random seed used in initialization; default is 1. | 
| algo | an optional integer taking value 1 or 2, selecting the algorithm used; default is 1, representing the variational EM algorithm. | 
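The Adj_sp argument expects a sparse weighted adjacency matrix. As a rough illustration of what such an object can look like, the sketch below builds a binary 4-neighbour adjacency matrix for a regular grid with Matrix::sparseMatrix. The helper name grid_adjacency and the unit weights are assumptions made for this example; the weighting scheme the package itself uses is not described here.

```r
library(Matrix)

# Hypothetical helper: binary 4-neighbour adjacency for a width x height grid.
# Spots are numbered column by column, so idx[x, y] is the row index of spot (x, y).
grid_adjacency <- function(width, height) {
  n <- width * height
  idx <- matrix(seq_len(n), nrow = width, ncol = height)
  # Horizontal neighbours: (x, y) and (x + 1, y)
  i_h <- as.vector(idx[-width, , drop = FALSE])
  j_h <- as.vector(idx[-1, , drop = FALSE])
  # Vertical neighbours: (x, y) and (x, y + 1)
  i_v <- as.vector(idx[, -height, drop = FALSE])
  j_v <- as.vector(idx[, -1, drop = FALSE])
  # Store each edge in both directions so the matrix is symmetric.
  i_all <- c(i_h, j_h, i_v, j_v)
  j_all <- c(j_h, i_h, j_v, i_v)
  sparseMatrix(i = i_all, j = j_all, x = rep(1, length(i_all)), dims = c(n, n))
}

Adj_demo <- grid_adjacency(width = 4, height = 3)
dim(Adj_demo)
```

A matrix built this way would only be meaningful as Adj_sp if its row order matches the row order of X_count.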
Details
None
Value
return a list including the following components:
-  F- the predicted factor matrix;
-  B- the estimated loading matrix;
-  bbeta- the estimated low-rank large coefficient matrix;
-  alpha0- the estimated regression coefficient matrix corresponding to Z;
-  invLambda- the inverse of the estimated variances of error;
-  eta- the estimated spatial autocorrelation parameter;
-  S- the approximated posterior covariance for each row of F;
-  ELBO- the ELBO value when algorithm stops;
-  ELBO_seq- the sequence of ELBO values;
-  time_use- the running time in model fitting of SpaCOAP.
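The ELBO_seq component can be used to inspect convergence after fitting. The sketch below computes the relative change between consecutive ELBO values and finds the first iteration at which it falls below a tolerance. The exact form of the stopping rule used internally is an assumption here, and the ELBO values are made-up numbers, not package output.

```r
# Hypothetical ELBO trace (illustrative numbers only).
elbo_seq <- c(-1500, -1200, -1100, -1099.5, -1099.49)

# Relative change between consecutive iterations (assumed form of the criterion).
rel_change <- abs(diff(elbo_seq)) / abs(head(elbo_seq, -1))

# First iteration at which the relative change drops below the tolerance.
eps <- 1e-4
stop_iter <- which(rel_change < eps)[1] + 1
stop_iter
```

With a real fit, elbo_seq would be replaced by the ELBO_seq component of the returned list.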
References
Liu W, Zhong Q. High-dimensional covariate-augmented overdispersed Poisson factor model. Biometrics. 2024 Jun;80(2):ujae031.
See Also
None
Examples
width <- 20; height <- 15; p <- 100
d <- 20; k <- 3; q <- 6; r <- 3
datlist <- gendata_spacoap(width=width, height=height, p=p, d=d, k=k, q=q, rank0=r)
fitlist <- SpaCOAP(X_count=datlist$X, Adj_sp=datlist$Adj_sp,
  H=datlist$H, Z=datlist$Z, q=q, rank_use=r)
str(fitlist)
Select the parameters in COAP models
Description
Select the number of factors and the rank of the coefficient matrix in the covariate-augmented overdispersed Poisson factor model
Usage
chooseParams(
  X_count,
  Adj_sp,
  H,
  Z = matrix(1, nrow(X_count), 1),
  offset = rep(0, nrow(X_count)),
  q_max = 15,
  r_max = 24,
  threshold = c(0.1, 0.01),
  verbose = TRUE,
  ...
)
Arguments
| X_count | a count matrix, the observed count matrix with shape n-by-p. | 
| Adj_sp | a sparse matrix, the weighted adjacency matrix. | 
| H | an n-by-d matrix, the covariate matrix with low-rank regression coefficient matrix. | 
| Z | an optional matrix, the fixed-dimensional covariate matrix with control variables; defaults to a column vector of ones if there are no additional covariates. | 
| offset | an optional vector, the offset for each unit; defaults to a vector of zeros. | 
| q_max | an optional integer, the upper bound for the number of factors; default is 15. | 
| r_max | an optional integer, the upper bound for the rank of the regression coefficient matrix; default is 24. | 
| threshold | an optional 2-dimensional positive vector, the thresholds used to filter the singular values of beta and B, respectively; default is c(0.1, 0.01). | 
| verbose | a logical value, whether to print information during iteration; default is TRUE. | 
| ... | other arguments passed to the function  | 
Details
The thresholds filter out singular values that carry little signal, which helps identify the underlying model structure.
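The idea of filtering singular values can be illustrated with base R. The sketch below counts the singular values of a noisy low-rank matrix that exceed a cutoff. It is only an illustration of the general technique: whether chooseParams compares raw or normalized singular values against the threshold is not stated, so raw values and the cutoff 0.1 are assumptions.

```r
set.seed(1)

# A 50 x 30 matrix of exact rank 3, plus very small noise.
A <- matrix(rnorm(50 * 3), 50, 3) %*% matrix(rnorm(3 * 30), 3, 30)
A_noisy <- A + matrix(rnorm(50 * 30, sd = 1e-3), 50, 30)

# Singular values, sorted in decreasing order by svd().
sv <- svd(A_noisy)$d

# Keep only singular values above a hypothetical cutoff.
threshold <- 0.1
est_rank <- sum(sv > threshold)
est_rank
```

The noise contributes only tiny singular values, so the count reflects the rank of the signal part.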
Value
return a named vector with names 'hr' and 'hq', the estimated rank and the estimated number of factors, respectively.
References
None
See Also
Examples
width <- 20; height <- 15; p <- 300
d <- 20; k <- 3; q <- 6; r <- 3
datlist <- gendata_spacoap(width=width, height=height, p=p, d=d, k=k, q=q, rank0=r)
set.seed(1)
para_vec <- chooseParams(X_count=datlist$X, Adj_sp=datlist$Adj_sp,
 H= datlist$H, Z = datlist$Z, r_max=6)
print(para_vec)
Generate simulated data
Description
Generate simulated data from spatial covariate-augmented Poisson factor models
Usage
gendata_spacoap(
  seed = 1,
  width = 20,
  height = 30,
  p = 500,
  d = 40,
  k = 3,
  q = 5,
  rank0 = 3,
  eta0 = 0.5,
  bandwidth = 1,
  rho = c(10, 1),
  sigma2_eps = 1,
  seed.beta = 1
)
Arguments
| seed | a positive integer, the random seed for reproducibility of the data generation process. | 
| width | a positive integer, the width of the spatial grid. | 
| height | a positive integer, the height of the spatial grid. | 
| p | a positive integer, the dimension of the count variables. | 
| d | a positive integer, the dimension of the covariate matrix with low-rank regression coefficient matrix. | 
| k | a positive integer, the dimension of the covariate matrix of control variables. | 
| q | a positive integer, the number of factors. | 
| rank0 | a positive integer, the rank of the coefficient matrix. | 
| eta0 | a real number between 0 and 1, the spatial autocorrelation parameter. | 
| bandwidth | a positive real value, the bandwidth used in calculating the weighted adjacency matrix. | 
| rho | a numeric vector of length 2 with positive elements, the signal strength of the loading matrix and the regression coefficient, respectively. | 
| sigma2_eps | a positive real, the variance of the overdispersion error. | 
| seed.beta | a positive integer, the random seed that fixes the regression coefficient matrix beta for reproducibility of the data generation process. | 
Details
None
Value
return a list including the following components:
-  X- the high-dimensional count matrix;
-  Z- the low-dimensional covariate matrix with control variables;
-  H- the high-dimensional covariate matrix;
-  Adj_sp- the weighted adjacency matrix;
-  alpha0- the regression coefficient matrix corresponding to Z;
-  bbeta0- the low-rank large regression coefficient matrix corresponding to H;
-  B0- the loading matrix;
-  F0- the latent factor matrix;
-  rank0- the true rank of bbeta0;
-  q- the true number of factors;
-  eta0- spatial autocorrelation parameter;
-  pos- spatial coordinates for each observation.
References
None
See Also
Examples
width <- 20; height <- 15; p <- 100
d <- 20; k <- 3; q <- 6; r <- 3
datlist <- gendata_spacoap(width=width, height=height, p=p, d=d, k=k, q=q, rank0=r)
str(datlist)