CellDEEP reduces scRNA-seq sparsity by pooling cells into pseudocells before DE testing.
FindMarker.CellDEEP includes metadata preparation internally. Key parameters to set:
- group_id, sample_id, cluster_id: metadata column names in your Seurat object
- ident.1, ident.2: two groups to compare
- cell_selection: how to select cells for pooling ("kmean" or "random")
- readcounts: how to aggregate counts in pooled cells ("sum" or "mean")
- min_cells_per_subgroup: minimum cells required in each sample-cluster subgroup for pooling
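Because these parameters refer to metadata columns by name, it can help to confirm the columns exist before running the DE call. The following is a minimal base-R sketch under stated assumptions: `missing_metadata_columns()` is a hypothetical helper (not part of CellDEEP), and the toy data frame stands in for the metadata of a Seurat object.

```r
# Hypothetical helper (not part of CellDEEP): report which required
# metadata columns are absent from a metadata data frame.
missing_metadata_columns <- function(meta, required) {
  setdiff(required, colnames(meta))
}

# Toy metadata standing in for the metadata of a Seurat object
# (for a real object this would typically be sim@meta.data).
meta <- data.frame(
  Status     = c("Case", "Case", "Control", "Control"),
  DonorID    = c("D1", "D1", "D2", "D2"),
  cluster_id = c("c1", "c2", "c1", "c2")
)

# Map each CellDEEP argument to the column name used in this example.
required <- c(group_id = "Status", sample_id = "DonorID", cluster_id = "cluster_id")

missing <- missing_metadata_columns(meta, required)
if (length(missing) > 0) {
  stop("Missing metadata columns: ", paste(missing, collapse = ", "))
}

# ident.1 and ident.2 should both occur in the group column.
idents_ok <- all(c("Case", "Control") %in% unique(meta[[required[["group_id"]]]]))
```

If `missing` is non-empty or `idents_ok` is FALSE, the column names or group labels passed to the function probably do not match the object's metadata.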
de.test <- FindMarker.CellDEEP(
  sim,
  group_id = "Status",
  sample_id = "DonorID",
  cluster_id = "cluster_id",
  Pool = TRUE,
  test.use = "wilcox",
  n_cells = 3,
  min_cells_per_subgroup = 1,
  cell_selection = "random",
  readcounts = "sum",
  logfc.threshold = 0.25,
  ident.1 = "Case",
  ident.2 = "Control"
)
#> Start Pooling.....
#> Pooling...
#> Warning: Data is of class matrix. Coercing to dgCMatrix.
#> Pooling summary (random):
#> Input cells: 200
#> Cells kept in pooled pseudocells: 180
#> Cells not kept (approx): 20
#> Skipped empty groups: 0
#> Skipped empty clusters: 0
#> Skipped empty samples: 20
#> Skipped subgroups (<= min_cells_per_subgroup): 0
#> Dropped remainder cells (< n_cells) after random pooling: 20
#> FindMarker running.....
#> 1st ident is:
#> Case
#> 2nd ident is:
#> Control
#> group by:
#> group_id
#> Normalizing layer: counts
#> Finding variable features for layer counts
#> Centering and scaling data matrix
#> For a (much!) faster implementation of the Wilcoxon Rank Sum Test,
#> (default method for FindMarkers) please install the presto package
#> --------------------------------------------
#> install.packages('devtools')
#> devtools::install_github('immunogenomics/presto')
#> --------------------------------------------
#> After installation of presto, Seurat will automatically use the more
#> efficient implementation (no further action necessary).
#> This message will be shown once per session
#> 20
#> Gene1728 Gene1992 Gene1626 Gene1864 Gene1715 Gene1807
Use these functions if you want pooled objects without running DE immediately.
min_cells_per_subgroup means the minimum number of cells
required in each sample_id x cluster_id subgroup before
pooling is performed.
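To make the roles of n_cells, readcounts and min_cells_per_subgroup concrete, here is a rough base-R sketch of random pooling within each sample_id x cluster_id subgroup. It is an illustration only, not CellDEEP's implementation: `pool_random_sketch()` is a hypothetical function, and skipping subgroups at or below the threshold is an assumption taken from the "<= min_cells_per_subgroup" wording in the pooling summary.

```r
# Illustrative only: NOT CellDEEP's implementation.
# Pool the columns (cells) of a gene x cell count matrix into pseudocells
# of n_cells each, separately within every sample_id x cluster_id subgroup.
pool_random_sketch <- function(counts, sample_id, cluster_id,
                               n_cells = 3, min_cells_per_subgroup = 1,
                               readcounts = c("sum", "mean")) {
  readcounts <- match.arg(readcounts)
  subgroup <- paste(sample_id, cluster_id, sep = "_")
  pooled <- list()
  for (sg in unique(subgroup)) {
    idx <- which(subgroup == sg)
    # Assumption: subgroups at or below the threshold are skipped.
    if (length(idx) <= min_cells_per_subgroup) next
    # Shuffle, then keep only complete groups of n_cells;
    # the remainder (< n_cells) is dropped.
    if (length(idx) > 1) idx <- sample(idx)
    n_pools <- length(idx) %/% n_cells
    for (p in seq_len(n_pools)) {
      members <- idx[((p - 1) * n_cells + 1):(p * n_cells)]
      block <- counts[, members, drop = FALSE]
      pooled[[paste0(sg, "_pool", p)]] <-
        if (readcounts == "sum") rowSums(block) else rowMeans(block)
    }
  }
  do.call(cbind, pooled)
}

# Toy data: 4 genes, 11 cells; donor D1 has 7 cells, donor D2 has 4.
set.seed(1)
counts <- matrix(rpois(4 * 11, lambda = 2), nrow = 4,
                 dimnames = list(paste0("Gene", 1:4), paste0("Cell", 1:11)))
sample_id  <- c(rep("D1", 7), rep("D2", 4))
cluster_id <- rep("c1", 11)

pooled <- pool_random_sketch(counts, sample_id, cluster_id, n_cells = 3)
ncol(pooled)  # 7 %/% 3 + 4 %/% 3 = 2 + 1 = 3 pseudocells; 2 cells dropped
```

The same arithmetic appears to match the CellDEEP.Random output in this vignette, where 160 kept cells at n_cells = 5 correspond to 32 pseudocells.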
Pooling functions use standardized metadata fields
(sample_id, group_id,
cluster_id), so prepare once before pooling:
pool_input <- prepare_data(
  sim,
  sample_id = "DonorID",
  group_id = "Status",
  cluster_id = "cluster_id"
)
pooled_kmean <- CellDEEP.Kmean(
  pool_input,
  readcounts = "sum",
  n_cells = 3,
  min_cells_per_subgroup = 1,
  assay_name = "RNA"
)
#> Pooling...
#> Warning: Data is of class matrix. Coercing to dgCMatrix.
#> Drop out cell number during kmean pooling is:
#> 24
#> Pooling summary (kmean):
#> Input cells: 200
#> Cells kept in pooled pseudocells: 176
#> Cells not kept (approx): 24
#> Skipped empty groups: 0
#> Skipped empty clusters: 0
#> Skipped empty samples: 20
#> Skipped subgroups (<= min_cells_per_subgroup): 0
#> Dropped singleton cells after kmeans split: 24
pooled_kmean
#> An object of class Seurat
#> 2000 features across 56 samples within 1 assay
#> Active assay: RNA (2000 features, 0 variable features)
#> 1 layer present: counts
pooled_random <- CellDEEP.Random(
  pool_input,
  readcounts = "sum",
  n_cells = 5,
  min_cells_per_subgroup = 1,
  assay_name = "RNA"
)
#> Pooling...
#> Warning: Data is of class matrix. Coercing to dgCMatrix.
#> Pooling summary (random):
#> Input cells: 200
#> Cells kept in pooled pseudocells: 160
#> Cells not kept (approx): 40
#> Skipped empty groups: 0
#> Skipped empty clusters: 0
#> Skipped empty samples: 20
#> Skipped subgroups (<= min_cells_per_subgroup): 0
#> Dropped remainder cells (< n_cells) after random pooling: 40
pooled_random
#> An object of class Seurat
#> 2000 features across 32 samples within 1 assay
#> Active assay: RNA (2000 features, 0 variable features)
#> 1 layer present: counts
If no genes pass the adjusted p-value filter in this small example dataset, try a larger dataset or set full_list = TRUE.
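As a rough illustration of what an adjusted p-value filter does, the sketch below applies one to a toy DE table. The column names follow Seurat's FindMarkers output and are assumed here to match what FindMarker.CellDEEP returns. The 0.05 cutoff and the toy values are assumptions for illustration only, not results from this vignette.

```r
# Toy DE table; column names follow Seurat's FindMarkers output
# (assumed, not confirmed, to match FindMarker.CellDEEP's result).
de_toy <- data.frame(
  p_val      = c(0.0004, 0.03, 0.20),
  avg_log2FC = c(1.2, -0.6, 0.1),
  p_val_adj  = c(0.02, 0.60, 1.00),
  row.names  = c("GeneA", "GeneB", "GeneC")
)

alpha <- 0.05  # assumed cutoff for illustration

# Keep only genes whose adjusted p-value is below the cutoff.
sig <- de_toy[de_toy$p_val_adj < alpha, , drop = FALSE]

if (nrow(sig) == 0) {
  message("No genes pass p_val_adj < ", alpha, "; inspect the full table instead.")
}
rownames(sig)
```

When a filter like this leaves nothing, looking at the unfiltered table (ordered by p_val_adj) shows how far the top genes are from the cutoff.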