SCRIP provides two frameworks, based on the Gamma-Poisson and Beta-Gamma-Poisson distributions, for simulating scRNA-seq data. Both distributions model the overdispersion of scRNA-seq data; the Beta-Gamma-Poisson model additionally models the bursting effect. Dispersion is simulated by fitting the dependency between the mean and the biological coefficient of variation (BCV) with a generalized additive model (GAM). SCRIP also models other key characteristics of scRNA-seq data, including library size, zero inflation and outliers. This flexible modeling supports applications across different experimental designs and goals, including DE analysis, clustering analysis, trajectory-based analysis and bursting analysis.
Assuming you already have a count matrix of scRNA-seq data and want to simulate data based on it, only a few steps are needed to create a simulated dataset using SCRIP.
A dataset from the Xin data is used as an example.
## $start.arg
## $start.arg$shape
## [1] 0.833088
##
## $start.arg$rate
## [1] 0.09357466
##
##
## $fix.arg
## NULL
##
## $start.arg
## $start.arg$meanlog
## [1] 9.415808
##
## $start.arg$sdlog
## [1] 1.034692
##
##
## $fix.arg
## NULL
##
## $start.arg
## $start.arg$meanlog
## [1] 4.719586
##
## $start.arg$sdlog
## [1] 0.7954047
##
##
## $fix.arg
## NULL
## class: SingleCellExperiment
## dim: 1000 80
## metadata(13): Params method ... batch.facScale bcv.shrink
## assays(5): BatchCellMeans BaseCellMeans CellMeans TrueCounts counts
## rownames(1000): Gene1 Gene2 ... Gene999 Gene1000
## rowData names(4): Gene BaseGeneMean OutlierFactor GeneMean
## colnames(80): Cell1 Cell2 ... Cell79 Cell80
## colData names(3): Cell Batch ExpLibSize
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
SCRIP uses the estimation strategy from splatter, but provides additional parameters (fold change, dropout rates, library size, BCV degrees of freedom) to serve different experimental designs (e.g. simulation for differential expression analysis, clustering analysis and trajectory analysis). The other parameters are described in detail in other sections of this document.
The default simulation mode in SCRIP is “GP-trendedBCV”. You can also choose other modes (“GP-commonBCV”, “BGP-commonBCV”, “BP”, “BGP-trendedBCV”) through the mode argument of the SCRIPsimu() function. For single cell type simulation, set “method” to “single”, which is the default in SCRIPsimu().
GP-commonBCV is the model used by splatter. It applies the Gamma-Poisson mixture model, with the mean-BCV dependency fitted by a common BCV across genes.
########################### GP-commonBCV model/Splatter ##########################
##################################################################################
sim_GPcommon <- SCRIPsimu(data=acinar.data, params=params, mode="GP-commonBCV")
sim_GPcommon
## class: SingleCellExperiment
## dim: 1000 80
## metadata(13): Params method ... batch.facScale bcv.shrink
## assays(5): BatchCellMeans BaseCellMeans CellMeans TrueCounts counts
## rownames(1000): Gene1 Gene2 ... Gene999 Gene1000
## rowData names(4): Gene BaseGeneMean OutlierFactor GeneMean
## colnames(80): Cell1 Cell2 ... Cell79 Cell80
## colData names(3): Cell Batch ExpLibSize
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
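The Gamma-Poisson mixture with a common BCV can be illustrated in base R. This is only a minimal sketch of the statistical idea, not SCRIP's internal implementation; the values of `gene_mean` and `bcv` are hypothetical and chosen purely for illustration.

```r
set.seed(1)
n_cells   <- 5000
gene_mean <- 10    # hypothetical mean expression of a single gene
bcv       <- 0.5   # hypothetical common BCV shared by all genes

# Gamma-distributed cell-level means with mean `gene_mean` and
# coefficient of variation `bcv`: shape = 1 / bcv^2, scale = mean * bcv^2
cell_means <- rgamma(n_cells, shape = 1 / bcv^2, scale = gene_mean * bcv^2)

# Poisson sampling around the cell-level means gives Gamma-Poisson counts
gp_counts   <- rpois(n_cells, lambda = cell_means)
# Plain Poisson counts with the same mean, for comparison
pois_counts <- rpois(n_cells, lambda = gene_mean)

# Theoretical variance of the mixture: mean + bcv^2 * mean^2
expected_gp_var <- gene_mean + bcv^2 * gene_mean^2

c(poisson_var     = var(pois_counts),
  gp_var          = var(gp_counts),
  expected_gp_var = expected_gp_var)
```

The Gamma-Poisson counts keep roughly the same mean as the Poisson counts but have a much larger variance, which is the overdispersion the model is meant to capture.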
GP-trendedBCV is the main model of SCRIP. It uses the Gamma-Poisson mixture model with the mean-BCV dependency fitted by a GAM.
BP is the model for simulating the bursting effect using a Beta-Poisson mixture distribution, without considering the BCV effect.
#################################### BP model ####################################
##################################################################################
sim_BP <- SCRIPsimu(data=acinar.data, params=params, mode="BP")
## $start.arg
## $start.arg$shape1
## [1] 0.1792874
##
## $start.arg$shape2
## [1] 1.938204
##
##
## $fix.arg
## NULL
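The bursting behaviour captured by a Beta-Poisson mixture can be sketched in base R as follows. This is an illustrative sketch only, not SCRIP's implementation; `max_rate`, `shape1` and `shape2` are hypothetical values and are not the fitted parameters of the example data.

```r
set.seed(2)
n_cells  <- 5000
max_rate <- 50    # hypothetical expression rate while the gene is fully "on"
shape1   <- 0.2   # hypothetical Beta shape parameters; a small shape1
shape2   <- 2     # pushes most cells towards the "off" state

# Fraction of the maximum rate reached in each cell
on_fraction <- rbeta(n_cells, shape1, shape2)

# Poisson sampling around the cell-specific rate gives Beta-Poisson counts
bp_counts <- rpois(n_cells, lambda = max_rate * on_fraction)

# Poisson counts with the same mean, for comparison
matched_pois <- rpois(n_cells, lambda = mean(bp_counts))

c(mean              = mean(bp_counts),
  zero_frac_bp      = mean(bp_counts == 0),
  zero_frac_poisson = mean(matched_pois == 0))
```

Under these assumptions the Beta-Poisson counts contain far more zeros than a Poisson distribution with the same mean, with occasional high counts from cells caught in a burst.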
BGP-commonBCV is the model for simulating the bursting effect with a Beta-Gamma-Poisson mixture distribution. The mean-BCV dependency is fitted by a common BCV across genes.
############################## BGP-commonBCV model ##############################
##################################################################################
sim_BGPcommon <- SCRIPsimu(data=acinar.data, params=params, mode="BGP-commonBCV")
## $start.arg
## $start.arg$shape1
## [1] 0.1792874
##
## $start.arg$shape2
## [1] 1.938204
##
##
## $fix.arg
## NULL
BGP-trendedBCV is the model for simulating the bursting effect with a Beta-Gamma-Poisson mixture distribution. The mean-BCV dependency is fitted by a GAM.
############################## BGP-trendedBCV model ##############################
##################################################################################
sim_BGPtrend <- SCRIPsimu(data=acinar.data, params=params, mode="BGP-trendedBCV")
## $start.arg
## $start.arg$shape1
## [1] 0.1792874
##
## $start.arg$shape2
## [1] 1.938204
##
##
## $fix.arg
## NULL
Group simulation is useful for studying different experimental conditions, especially for differential expression (DE) analysis. To serve different applications in scRNA-seq analysis, SCRIP provides flexible simulation. It can simulate scRNA-seq data with different parameters from multiple cell groups (i.e. cell types), which is useful for evaluating the detection of global characteristics such as clustering. It also allows simulation of group difference in a single cell group, which is useful for evaluating typical DE analysis methods.
DE genes (DEGs) are simulated using multiplicative differential expression factors drawn from a log-normal distribution. The relevant parameters are the number of genes (nGenes), the path-specific proportion of DE genes (de.prob), the proportion of down-regulated DE genes (de.downProb), the DE location factor (de.facLoc) and the DE scale factor (de.facScale).
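The way these parameters interact can be sketched in base R. The helper `simulate_de_factors()` below is hypothetical and is not part of SCRIP; it is a simplified illustration of multiplicative log-normal DE factors, and the details may differ from what SCRIP and splatter do internally.

```r
set.seed(3)

# Hypothetical helper (not a SCRIP function): one multiplicative factor per gene
simulate_de_factors <- function(nGenes, de.prob, de.downProb,
                                de.facLoc, de.facScale) {
  factors <- rep(1, nGenes)                     # non-DE genes keep a factor of 1
  is_de   <- rbinom(nGenes, 1, de.prob) == 1    # select DE genes
  n_de    <- sum(is_de)
  magnitude <- rlnorm(n_de, meanlog = de.facLoc, sdlog = de.facScale)
  magnitude <- pmax(magnitude, 1 / magnitude)   # fold so every magnitude is >= 1
  is_down <- rbinom(n_de, 1, de.downProb) == 1  # select down-regulated DE genes
  factors[is_de] <- ifelse(is_down, 1 / magnitude, magnitude)
  factors
}

de_factors <- simulate_de_factors(nGenes = 1000, de.prob = 0.2,
                                  de.downProb = 0.5,
                                  de.facLoc = 0.2, de.facScale = 0.2)

# Hypothetical baseline gene means, scaled to group-specific means
base_means  <- rgamma(1000, shape = 0.8, rate = 0.1)
group_means <- base_means * de_factors

# -1 = down-regulated, 0 = not DE, 1 = up-regulated
table(direction = sign(log(de_factors)))
```

A larger de.facLoc shifts the fold changes further from 1, while de.facScale controls how variable they are across DE genes.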
Batch effect factors are also generated from a log-normal distribution with parameters including batchCells, batch.facLoc and batch.facScale.
batchCells: number of cells for each batch
batch.facLoc: Batch location factor in log-normal distribution for batch factor
batch.facScale: Batch scale factor in log-normal distribution for batch factor
sim.SCRIP3 <- SCRIPsimu(data = acinar.data, params = params, method = "groups",
                        batchCells = c(150, 150),
                        batch.facLoc = c(0.1, 0.1),
                        batch.facScale = c(0.1, 0.1),
                        group.prob = c(0.25, 0.25, 0.25, 0.25),
                        de.prob = c(0.2, 0.2, 0.2, 0.2),
                        de.downProb = c(0.5, 0.5, 0.5, 0.5),
                        de.facLoc = c(0.2, 0.3, 0.4, 0.5),
                        de.facScale = c(0.2, 0.2, 0.2, 0.2))
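To illustrate what the batch parameters control, the following base R sketch draws log-normal batch factors and applies them to gene means. It is a simplified illustration under stated assumptions, not SCRIP's internal code; the baseline `gene_means` are hypothetical, and the batch settings simply mirror the values used in the call above.

```r
set.seed(4)
nGenes         <- 1000
batchCells     <- c(150, 150)   # number of cells in each batch
batch.facLoc   <- c(0.1, 0.1)   # log-normal location, one value per batch
batch.facScale <- c(0.1, 0.1)   # log-normal scale, one value per batch

# One multiplicative factor per gene and batch (genes in rows, batches in columns)
batch_factors <- sapply(seq_along(batchCells), function(b) {
  rlnorm(nGenes, meanlog = batch.facLoc[b], sdlog = batch.facScale[b])
})

# Hypothetical baseline gene means
gene_means <- rgamma(nGenes, shape = 0.8, rate = 0.1)

# Batch label for every cell, then a genes x cells matrix of batch-adjusted means
batch_of_cell    <- rep(seq_along(batchCells), times = batchCells)
batch_cell_means <- gene_means * batch_factors[, batch_of_cell]

dim(batch_cell_means)
```

Cells in the same batch share the same per-gene factor, so a larger batch.facScale makes the batches more distinct from one another.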