Read data and apply Surprisal analysis
library(SurprisalAnalysis)
# Load the example dataset shipped with the package
data <- read.csv(system.file("extdata", "helper_T_cell_0_test.csv", package = "SurprisalAnalysis"), header = TRUE)
results <- surprisal_analysis(data)
# The second element of the result holds the transcript weights
transcript_weights <- results[[2]]
percentile_GO <- 0.95 # change based on your preference
lambda_no <- 2 # change based on your preference; lambda #1 is the baseline state
Run GO analysis
GO.results <- GO_analysis_surprisal_analysis(transcript_weights, percentile_GO, lambda_no, key_type = "SYMBOL", flip = FALSE, species.db.str = "org.Mm.eg.db", top_GO_terms=15)
The function GO_analysis_surprisal_analysis() runs Gene Ontology (GO) enrichment on the most influential transcripts from a chosen Surprisal pattern. Below are the input arguments:
transcript_weights
A matrix of transcript weights, typically the second element ([[2]]) returned from the Surprisal analysis function.
percentile_GO
A numeric value between 0 and 1 specifying the quantile cutoff for transcript selection. Example: 0.95 means only the top 5% of transcripts (by absolute weight) in the chosen \(\lambda\) pattern are used.
lambda_no
An integer specifying which \(\lambda\) pattern to analyze. Note: \(\lambda_1\) represents the balance (baseline) state, while higher-order \(\lambda\) patterns capture additional constraints.
key_type
The type of transcript identifiers used in your data. Options include:
“SYMBOL” (gene symbols, e.g. TP53),
“ENTREZID” (Entrez gene IDs),
“ENSEMBL” (Ensembl IDs),
“PROBEID” (microarray probe IDs).
This must match the ID format in your input dataset.
flip
Logical (TRUE/FALSE). If TRUE, the transcript weights for the selected \(\lambda\) are multiplied by -1 before the top quantile is selected. This is useful for keeping the result consistent with the direction of the \(\lambda\) plots.
species.db.str
The organism database to use for gene mapping. Current options:
“org.Hs.eg.db” for Homo sapiens (human),
“org.Mm.eg.db” for Mus musculus (mouse)
ont
The GO ontology branch for enrichment analysis. Options:
“BP” – Biological Process (default),
“MF” – Molecular Function,
“CC” – Cellular Component
pAdjustMethod
The multiple testing correction method. Options include: “BH” (default), “bonferroni”, “holm”, “hochberg”, “hommel”, “BY”, “none”.
top_GO_terms
An integer specifying the number of top enriched GO terms to return (default: 15).
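The interaction between percentile_GO, lambda_no and flip can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the toy matrix, the gene names and the helper select_top_transcripts() are hypothetical, and the cutoff here is applied to the signed weights (the package may rank transcripts differently).

```r
# Toy weight matrix: rows are transcripts, columns are lambda patterns.
# Gene names and values are made up for illustration only.
set.seed(1)
toy_weights <- matrix(
  rnorm(200 * 3), nrow = 200, ncol = 3,
  dimnames = list(paste0("gene", 1:200), paste0("lambda", 1:3))
)

# Hypothetical helper: keep transcripts at or above the quantile cutoff
# for one lambda pattern, optionally after flipping the sign.
select_top_transcripts <- function(weights, percentile, lambda_no, flip = FALSE) {
  w <- weights[, lambda_no]                 # weights of the chosen pattern
  if (flip) w <- -w                         # sign flip, as described for `flip`
  cutoff <- quantile(w, probs = percentile) # e.g. 0.95 -> 95th percentile
  names(w)[w >= cutoff]
}

top_genes         <- select_top_transcripts(toy_weights, 0.95, 2)
top_genes_flipped <- select_top_transcripts(toy_weights, 0.95, 2, flip = TRUE)
```

With flip = TRUE the sketch picks the transcripts at the opposite end of the weight distribution, so the two selections do not overlap.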
library(ggplot2)
# Horizontal bar plot of the enriched GO terms, filled by adjusted p-value
ggplot(GO.results, aes(x = Description, y = Count, fill = p.adjust)) +
  geom_bar(stat = "identity") +
  scale_fill_gradient(low = "#790915", high = "#062c5c") +
  theme_minimal() +
  theme(
    # Remove panel border, background, and grid lines
    panel.border = element_blank(),
    panel.background = element_blank(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    # Add axis lines and hide the vertical axis title
    axis.line = element_line(colour = "black"),
    axis.title.y = element_blank(),
    plot.title = element_text(hjust = 0.5, size = 20),
    text = element_text(size = 18)
  ) +
  coord_flip() +
  labs(tag = "A", title = "GO analysis")
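ggplot2 draws a character column on a discrete axis in alphabetical order, so the bars above are not sorted by Count. One possible way to sort them is to turn Description into a factor ordered by Count before plotting. The sketch below assumes GO.results is a data frame with Description and Count columns, as the plot call implies; toy_go is a hypothetical stand-in with made-up values.

```r
# Hypothetical stand-in for GO.results with the columns the plot uses.
toy_go <- data.frame(
  Description = c("term A", "term B", "term C"),
  Count = c(12, 30, 7),
  p.adjust = c(0.01, 0.001, 0.04)
)

# reorder() returns a factor whose levels follow increasing Count,
# which is the order ggplot2 uses when it places the bars.
toy_go$Description <- reorder(toy_go$Description, toy_go$Count)
levels(toy_go$Description)
```

Applying the same reorder() call to GO.results$Description before the ggplot() call should place the term with the highest Count at the top of the flipped plot.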