First, we generate the spatial transcriptomics data with lattice neighborhood, i.e. ST platform by using the function gendata_RNAExp
in DR.SC
package.
For function DR.SC
, users can specify the number of clusters \(K\) or set K
to be an integer vector by using modified BIC(MBIC) to determine \(K\). First, we try using user-specified number of clusters. Then we show the version chosen by MBIC.
### Given K
library(Seurat)
seu <- NormalizeData(seu)
# choose 2000 variable features using Seurat
seu <- FindVariableFeatures(seu, nfeatures = 400)
seu2 <- DR.SC(seu, K=4, platform = 'ST', verbose=F)
Using ARI to check the performance of clustering
Show the spatial scatter plot for clusters
Show the tSNE plot based on the extracted features from DR-SC.
Show the UMAP plot based on the extracted features from DR-SC.
Use MBIC to choose number of clusters:
First, we select the spatilly variable genes using funciton FindSVGs
.
### Given K
seu <- NormalizeData(seu, verbose=F)
# choose 400 spatially variable features using FindSVGs
seus <- FindSVGs(seu, nfeatures = 400, verbose = F)
seu2 <- DR.SC(seus, K=4, platform = 'ST', verbose=F)
Using ARI to check the performance of clustering
Show the spatial scatter plot for clusters
Show the tSNE plot based on the extracted features from DR-SC.
Show the UMAP plot based on the extracted features from DR-SC.
Use MBIC to choose number of clusters:
Conduct visualization of marker gene expression. ### Ridge plots Visualize single cell expression distributions in each cluster from Seruat.
dat <- FindAllMarkers(seu2)
suppressPackageStartupMessages(library(dplyr) )
# Find the top 1 marker genes, user can change n to access more marker genes
dat %>%group_by(cluster) %>%
top_n(n = 1, wt = avg_log2FC) -> top
genes <- top$gene
RidgePlot(seu2, features = genes, ncol = 2)
Visualize single cell expression distributions in each cluster
We extract tSNE based on the features from DR-SC and then visualize feature expression in the low-dimensional space
The size of the dot corresponds to the percentage of cells expressing the feature in each cluster. The color represents the average expression level
Single cell heatmap of feature expression
# standard scaling (no regression)
dat %>%group_by(cluster) %>%
top_n(n = 30, wt = avg_log2FC) -> top
### select the marker genes that are also the variable genes.
genes <- intersect(top$gene, seu2[['RNA']]@var.features)
## Change the HVGs to SVGs
# <- topSVGs(seu2, 400)
seu2 <- ScaleData(seu2, verbose = F)
DoHeatmap(subset(seu2, downsample = 500),features = genes, size = 5)