This update has focused on improving the single-cell phyloset functionality.
The single-cell phylo expression object no longer depends on Seurat.
You can construct the ScPhyloExpressionSet either from a count matrix
(sparse or dense), using ScPhyloExpressionSet_from_matrix,
or from a Seurat object, using
ScPhyloExpressionSet_from_seurat. For consistency, one can
use BulkPhyloExpressionSet_from_df instead of
as_BulkPhyloExpressionSet.
One key functionality of the single-cell object is the ability to
switch between different identities when plotting (equivalent to the
Seurat::Idents functionality). This is done by setting the
::identities_label property of the object. The
::available_idents property can be used to see what options
the user has in setting the current identity. By setting
::idents_colours[[]], the user can choose a colour palette
for the different identities when plotting, which is saved across
different plotting calls.
The computation of TAI values for single-cell data is now cached. Moreover, we have re-added the C++ accelerated code for the computation of TAI, which profiling shows to be faster than the R version when handling more than 100000 cells (an adaptive function chooses the appropriate implementation for the object size).
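As a rough illustration of what is being computed and cached: the TAI of a cell is the expression-weighted mean of the phylostratum values of its genes. The sketch below uses base R only. The names `tai_sketch` and `cached_tai`, the toy matrix, and the environment-based cache are hypothetical and only stand in for the package's own implementation, which is not shown in this changelog.

```r
# Toy data (hypothetical): 4 genes in rows, 3 cells in columns.
phylostratum <- c(1, 1, 2, 3)  # gene age rank per gene
expr <- matrix(c(10, 0, 5,
                  4, 4, 4,
                  0, 6, 1,
                  2, 2, 0),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3)))

# TAI per cell: sum(ps_i * e_ic) / sum(e_ic), computed for all cells at once.
tai_sketch <- function(expr, phylostratum) {
  as.vector(crossprod(expr, phylostratum)) / colSums(expr)
}

# Minimal cache: store the result under a key and reuse it on later calls.
tai_cache <- new.env()
cached_tai <- function(key, expr, phylostratum) {
  if (is.null(tai_cache[[key]])) {
    tai_cache[[key]] <- tai_sketch(expr, phylostratum)
  }
  tai_cache[[key]]
}

tai <- cached_tai("toy", expr, phylostratum)
print(tai)
```

A second call with the same key returns the stored vector without recomputing it, which is the behaviour the caching is meant to provide.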
Some of the plotting functionality has been improved and more options were added for plotting (e.g. plot_gene_heatmap now allows passing a custom colour mapping for the rows (genes), instead of colouring them by their strata; plot_signature for single cell should be more readable).
Bug fixes:
- validation of S7 objects now works properly
- printing of object information now works properly (instead of dumping all the properties information)
Most importantly, new vignettes were added and the website has been updated. Many thanks to @LotharukpongJS
Functionality wise:
destroy_pattern
tfPS to tf_PS and fixed bug which
prevented strata transformations from happening (PhyloExpressionSet now
has an extra field @strata_values which keeps track of
the phylostratum values)
ScPhyloExpressionSet S7 class for single-cell data,
alongside BulkPhyloExpressionSet for bulk data.
conditions → identities (and conditions_label → identities_label),
counts/counts_collapsed → expression/expression_collapsed.
stat_ prefix:
flatline_test() → stat_flatline_test()
early_conservation_test() → stat_early_conservation_test()
late_conservation_test() → stat_late_conservation_test()
reductive_hourglass_test() → stat_reductive_hourglass_test()
reverse_hourglass_test() → stat_reverse_hourglass_test()
pairwise_test() → stat_pairwise_test()
generic_conservation_test() → stat_generic_conservation_test()
generate_conservation_txis() → stat_generate_conservation_txis()
genes_ prefix:
top_expression_genes() → genes_top_mean()
top_variance_genes() → genes_top_variance()
lowly_expressed_genes() → genes_lowly_expressed()
filter_dyn_expr() → genes_filter_dynamic()
gene_patterns.R → genes_patterns.R,
S7_utils.R → utils_S7.R).
expression_utils.R,
single_cell.R.
PhyloExpressionSet S7 class replaces the old
data.frame-based format
TestResult S7 class for standardized storage of
statistical test results
Function names have been updated to use snake_case convention:
FlatLineTest() → flatline_test()
ReductiveHourglassTest() → reductive_hourglass_test()
EarlyConservationTest() → early_conservation_test()
LateConservationTest() → late_conservation_test()
ReverseHourglassTest() → reverse_hourglass_test()
PairwiseTest() → pairwise_test()
PlotSignature() → plot_signature()
PlotPattern() → plot_signature()
PlotContribution() → plot_contribution()
PlotDistribution() → plot_distribution_strata()
PlotCategoryExpr() → plot_strata_expression()
PlotRE() → plot_relative_expression_line()
PlotBarRE() → plot_relative_expression_bar()
PlotGeneSet() → plot_gene_profiles()
PlotMeans, PlotVars, PlotMedians → plot_strata_expression(aggregate_FUN="mean"/"var"/"median")
PlotSignatureMultiple() → plot_signature_multiple()
PlotSignatureTransformed() → plot_signature_transformed()
PlotSignatureGeneQuantiles() → plot_signature_gene_quantiles()
TAI() → Computed property of PhyloExpressionSet (still accessible via TAI())
TDI() → Computed property of PhyloExpressionSet (still accessible via TDI())
TEI() → Computed property of PhyloExpressionSet (still accessible via TEI())
TPI() → Computed property of PhyloExpressionSet (still accessible via TPI())
pTAI(), pTDI → sTXI() (generalized for all transcriptomic indices)
pMatrix() → pTXI()
CollapseReplicates() → collapse, Built into PhyloExpressionSet constructor
Expressed() → lowly_expressed_genes()
MatchMap() → match_map()
SelectGeneSet() → select_genes()
TopExpressionGenes() → top_expression_genes()
TopVarianceGenes() → top_variance_genes()
REMatrix() → rel_exp_matrix()
RE() → relative_expression()
omitMatrix() → omit_matrix()
is.ExpressionSet() → Built into S7 validation
age.apply() → age.apply() (unchanged)
tf() → tf() or transform_counts()
tfPS() → tfPS() (unchanged)
tfStability() → tf_stability()
taxid() → taxid() (unchanged)
destroy_pattern(): Apply GATAI algorithm to identify
pattern-contributing genes
plot_signature_multiple(): Plot multiple signatures on the same plot
plot_signature_gene_quantiles(): Plot signature with gene expression quantiles
plot_signature_transformed(): Plot signatures with different transformations
plot_sample_space(): Visualize sample relationships using PCA or UMAP
plot_mean_var(): Mean-variance relationship plots
plot_gene_profiles(): Individual gene expression profiles
plot_distribution_expression(): Expression distribution plots
plot_distribution_pTAI(): Partial TXI distribution plots
plot_distribution_pTAI_qqplot(): Q-Q plots for pTXI distributions
plot_distribution_strata(): Phylostrata distribution plots
plot_gene_heatmap(): Gene expression heatmaps
plot_gene_space(): Gene space visualization
plot_cullen_frey(): Cullen-Frey plots for distribution assessment
plot_null_txi_sample(): Null TXI sample plots
as_PhyloExpressionSet(): Convert data to PhyloExpressionSet S7 object
get_sc_TAI(): Single-cell TAI computation for Seurat objects
diagnose_test_robustness(): Diagnose statistical test robustness
remove_genes(): Remove genes from PhyloExpressionSet
PS_colours(): Generate phylostratum color palettes
ConservationTestResult(): S7 class for conservation test results
TestResult(): S7 class for statistical test
results
To migrate from myTAI 1.x to 2.0:
Convert data format:
# Old format (data.frame)
old_phyex <- PhyloExpressionSetExample
# New format (S7 object)
new_phyex <- as_PhyloExpressionSet(old_phyex)
Update function calls:
# Old syntax
PlotSignature(phyex_set, TestStatistic = "FlatLineTest")
# New syntax
plot_signature(phyex_set, conservation_test = flatline_test)
Access computed properties:
# Old syntax
tai_values <- TAI(phyex_set)
# New syntax
tai_values <- phyex_set@TXI
# or
tai_values <- TAI(phyex_set)
tfPS() : Perform transformation of
phylostratum values, analogous to PS() which transforms
expression levels. Currently, tfPS() supports quantile rank
transformation.
PairwiseTest() : Statistically evaluate
the pairwise difference in the phylotranscriptomic pattern between two
contrasts based on TAI or TDI computations.
pairScore() : Compute pairwise difference
in TAI (or TDI) score.
taxonomy() and references to it due to the
deprecation of taxize(). This is needed for CRAN
submission.
New function PlotSignatureTransformed() : Plot
evolutionary signatures across transcriptomes and RNA-seq
transformations
New function tfStability(): Perform Permutation
Tests Under Different Transformations (to test the robustness of the
p-values of a given test (e.g. FlatLineTest(),
ReductiveHourglassTest(),
ReverseHourglassTest(),
EarlyConservationTest() and
LateConservationTest()) to expression data
transformations)
New function LateConservationTest() : Perform
Reductive Late Conservation Test (to test for a high-mid-low (or
high-high-low) TAI or TDI pattern)
New function lcScore() : Compute the Hourglass Score
for the LateConservationTest
internal rcpp functions are now using Eigen to
automatically enable parallelization
PlotSignature() updated to be able to perform the
TestStatistic = "LateConservationTest".
PlotSignature() now prints p-value as a subtitle
rather than via ggplot2::annotate().
tf() now has a pseudocount parameter,
which is useful for performing logarithmic transformations when there
are genes with 0 counts.
tf() now supports vst and
rlog transformations from DESeq2.
tf() now has an integerise parameter,
which is needed when applying vst or rlog
transformations.
tf() updated documentation for performing rank
transformation, which assigns ranks to the gene expression values within
each stage, based on their relative positions compared to other
values.
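The pseudocount and rank transformations described above can be sketched in base R as follows. This is only an illustration of the two ideas under stated assumptions: the toy matrix is hypothetical, and the code applies `log2()` and `rank()` directly rather than calling tf(), whose exact signature is not given in this changelog.

```r
# Hypothetical expression matrix: genes in rows, developmental stages in columns.
expr <- matrix(c(5, 0, 12,
                 1, 9, 4),
               nrow = 3,
               dimnames = list(c("geneA", "geneB", "geneC"),
                               c("stage1", "stage2")))

# Without a pseudocount, a zero count becomes -Inf under a log transformation.
log_raw <- log2(expr)

# Adding a pseudocount of 1 keeps every transformed value finite.
pseudocount <- 1
log_expr <- log2(expr + pseudocount)

# Rank transformation: rank the genes within each stage (column) separately.
rank_expr <- apply(expr, 2, rank)

print(log_expr)
print(rank_expr)
```

Ranking column by column means a gene's transformed value depends only on its position relative to the other genes in the same stage, not on the absolute expression scale.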
Improvements to existing test functions (ecScore(),
rhScore() and reversehourglassScore()) to give
a message when the phylotranscriptomic pattern is unlikely to follow the
test statistics.
FlatLineTest() - now returns the KS test
statistics for the gamma distribution fit
FlatLineTest() - improved fitting
FlatLineTest() - cpp functions are now
parallelized and a progress bar is implemented for the computation of
permutations
devtools::test() and devtools::check(), when
building this package, which has been accumulated from previous
updates.
TEI(): Compute the Transcriptome Evolutionary Index
pMatrixTEI: Compute Partial Transcriptome Evolutionary Index (TEI) Values
pStrataTEI: Compute Partial Transcriptome Evolutionary Index (TEI) Strata Values
bootTEI: Compute a Permutation Matrix of Transcriptome Evolutionary Index (TEI)
rcpp functions to support parallel C++ computations for TEI(), pMatrixTEI(), pStrataTEI(), bootTEI()
CollapseReplicates() now returns tibble objects
PlotCategoryExpr() received a new argument y.ticks
CollapseFromTo() now has an exception when a replicate number 1 is passed to the function -> previously this would cause an error to occur
std::random_shuffle() function to sample phylostratum or divergence stratum columns and replacing it with std::shuffle(). See full discussion here.
dplyr::funs() and tibble::is.tibble()
set.seed(123) which causes an error in the new R version 3.6.0 due to the switch from a non-uniform "Rounding" sampler to a "Rejection" sampler in the new R version; the corresponding unit test test-PlotEnrichment.R was adjusted accordingly.
Here is the CRAN statement: Note that this ensures using the (old) non-uniform "Rounding" sampler for all 3.x versions of R, and does not add an R version dependency. Note also that the new "Rejection" sampler which R will use from 3.6.0 onwards by default is definitely preferable over the old one, so that the above should really only be used as a temporary measure for reproduction of the previous behavior (and the run time tests relying on it).
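The sampler switch mentioned above can be reproduced with base R. The sketch assumes that the CRAN statement refers to `RNGversion("3.5.0")`, which is one documented way to select the old "Rounding" sampler; the changelog itself does not show the exact call used in the package tests.

```r
# Select the pre-3.6.0 "Rounding" sampler; R warns that it is non-uniform.
suppressWarnings(RNGversion("3.5.0"))
sampler_old <- RNGkind()[3]

# The same seed reproduces the same draw under the same sampler.
set.seed(123)
draw_old_1 <- sample(10, 3)
set.seed(123)
draw_old_2 <- sample(10, 3)

# Return to the "Rejection" sampler, the default from R 3.6.0 onwards.
RNGkind(sample.kind = "Rejection")
sampler_new <- RNGkind()[3]
set.seed(123)
draw_new <- sample(10, 3)

print(list(old = draw_old_1, new = draw_new))
```

Because the two samplers can return different values for the same seed, a unit test that hard-codes results from `set.seed(123)` needs either the old sampler or updated expectations.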
new function ReverseHourglassTest() to perform a
Reverse Hourglass Test. The Reverse Hourglass Test aims to
statistically evaluate the existence of a reverse hourglass pattern
based on TAI or TDI computations. The corresponding p-value quantifies
the probability that a given TAI or TDI pattern (or any
phylotranscriptomics pattern) does follow an hourglass like shape. A
p-value < 0.05 indicates that the corresponding phylotranscriptomics
pattern does rather follow a reverse hourglass (low-high-low)
shape.
new function reversehourglassScore() for computing
the Reverse Hourglass Score for the
Reverse Hourglass Test
PlotSignature() receives a new
TestStatistic
(TestStatistic = "ReverseHourglassTest") to perform a
reverse hourglass test (= testing the significance of a
low-high-low pattern)
tibble
PlotCIRatio() to compute and visualize
TAI/TDI etc patterns using bootstrapping and confidence intervals
(contributed by @ljljolinq1010)
tibble data as input ->
before there were errors thrown when input data wasn't in strict
data.frame format
is.ExpressionSet() now prints out more detailed
error messages when ExpressionSet is violated
adapt PlotContribution() to new version of
dplyr where summarise_each() is
deprecated.
Error message occurring after new dplyr release was:
PlotContribution() works properly with
DivergenceExpressionSet input… (@test-PlotContribution.R#16)
PlotContribution(DivergenceExpressionSetExample, legendName = "DS")
produced messages.
summarise_each() is deprecated. Use
summarise_all(), summarise_at() or
summarise_if() instead. To map funs over all
variables, use summarise_all()
summarise_each() is deprecated. Use
summarise_all(), summarise_at() or
summarise_if() instead. To map funs over all
variables, use summarise_all()
This is now fixed.
new function PlotSignature() allows users to plot
evolutionary signatures across transcriptomes (based on ggplot2 ->
new main visualization function aiming to replace the
PlotPattern() function)
new function TPI() allows users to compute the
Transcriptome Polymorphism Index introduced by
Gossmann et al., 2015.
new function PlotMedians() allows users to compute
and visualize the median expression of all age categories
new function PlotVars() allows users to compute and
visualize the expression variance of all age categories
PlotContribution() is now based on ggplot2 and loses
base graphics arguments
now R/RcppExports.R and src/rcpp_funcs.cpp are included in the package due to previous compilation problems (see also stackoverflow discussion)
MatchMap() is now based on
dplyr::inner_join() to match age category table with a gene
expression dataset
PlotCorrelation() has been extended and optimized
for producing high publication quality plots
PlotMeans() is now based on ggplot2 and lost all
base graphics arguments.
PlotRE() is now based on ggplot2 and lost all base
graphics arguments.
Introduction vignette: complete restructuring of the
Introduction
Introduction vignette: add new ggplot2 based
examples
PlotSelectedAgeDistr() allowing users
to visualize the PS or DS gene distribution of a subset of genes stored
in the input ExpressionSet object
PlotGroupDiffs() allowing users to plot
the significant differences between gene expression distributions of PS
or DS groups
GroupDiffs() allowing users to perform
statistical tests to quantify the gene expression level differences
between all genes of defined PS or DS groups
PlotDistribution() now uses ggplot2 to visualize the
PS or DS distribution and is also based on the new function
PlotSelectedAgeDistr(); furthermore it loses arguments
plotText and ... and gains a new argument
legendName
remove arguments ‘main.text’ and ‘…’ from
PlotCorrelation()
PlotCorrelation() is now based on ggplot2
PlotGroupDiffs() receives a new argument
gene.set allowing users to statistically quantify the group
specific PS/DS differences of a selected set of genes
analogously to PlotGroupDiffs() the function
GroupDiffs() also receives a new argument
gene.set allowing users to statistically quantify the group
specific PS/DS differences of a selected set of genes
Fixing wrong x-axis labeling in PlotCategoryExpr()
when type = "stage-centered" is specified
PlotCategoryExpr() now also prints out the PS/DS
absolute frequency distribution of the selected
gene.set
PlotCategoryExpr() to
Advanced Vignette
PlotReplicateQuality() to
Expression vignette
PlotCategoryExpr() allowing users to
plot the expression levels of each age or divergence category as
boxplot, dot plot or violin plot
PlotReplicateQuality() allowing users to
visualize the quality of biological replicates
PlotGeneSet() and SelectGeneSet() now have
a new argument use.only.map specifying whether or not
instead of using a standard ExpressionSet a
Phylostratigraphic Map or Divergence Map is
passed to the function.
adding new vignette Taxonomy providing step-by-step instructions on retrieving taxonomic information for any organism of interest
adding new vignette Expression Analysis
providing use cases to perform gene expression data analysis with
myTAI
adding new vignette Enrichment providing
step-by-step instructions on how to perform PS and DS enrichment
analyses with PlotEnrichment()
adding examples for pStrata(),
pMatrix(), pTAI(), pTDI(), and
PlotContribution() to the Introduction
Vignette
a new function taxonomy() allows users to retrieve
taxonomic information for any organism of interest; this function has
been taken from the biomartr package and was
removed from biomartr afterwards. Please note that in
myTAI version 0.1.0 the Introduction vignette referred to the
taxonomy() function in biomartr. This is no
longer the case (since myTAI version 0.2.0), because now
taxonomy() is implemented in myTAI.
the new taxonomy() function is based on the powerful
R package taxize.
a new function SelectGeneSet() allows users to
quickly select a subset of genes in an ExpressionSet
a new function DiffGenes() allows users to perform
differential gene expression analysis with ExpressionSet
objects
a new function EnrichmentTest() allows users to
perform a Fisher’s exact test based enrichment analysis of over or
underrepresented Phylostrata or Divergence Strata within a given gene
set without having to plot the result
a new function PlotGeneSet() allows users to
visualize the expression profiles of a given gene set
a new function PlotEnrichment() allows users to
visualize the Phylostratum or Divergence Stratum enrichment of a given
Gene Set as well as computing Fisher’s exact test to quantify the
statistical significance of enrichment
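The kind of test behind the enrichment functions above can be illustrated with base R's `fisher.test()`. The counts below are hypothetical, and the 2x2 layout (one stratum versus all others, gene set versus background) is an assumption about how such a table might be built, not a description of the package internals.

```r
# Hypothetical counts: how many genes of one phylostratum (PS1) and of all
# other strata fall inside a selected gene set versus the background.
enrichment_table <- matrix(c(30, 20,
                             200, 750),
                           nrow = 2,
                           dimnames = list(stratum = c("PS1", "other"),
                                           origin = c("gene_set", "background")))

# Two-sided Fisher's exact test: is PS1 over- or underrepresented in the set?
result <- fisher.test(enrichment_table, alternative = "two.sided")

print(result$p.value)   # significance of the enrichment
print(result$estimate)  # odds ratio; values above 1 suggest overrepresentation
```

Repeating this test for every stratum gives one p-value per Phylostratum or Divergence Stratum, which in practice would usually be corrected for multiple testing.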
a new function PlotContribution() allows users to
visualize the Phylostratum or Divergence Stratum contribution to the
global TAI/TDI pattern
a new function pTAI() allows users to compute the
phylostratum contribution to the global TAI pattern
a new function pTDI() allows users to compute the
divergence stratum contribution to the global TDI pattern
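To make the idea of stratum contributions concrete, the base R sketch below splits a TAI profile into per-stratum parts that sum back to the global TAI. The toy data and variable names are hypothetical, and the code illustrates the decomposition only; it does not call pTAI(), pTDI() or pStrata().

```r
# Toy data (hypothetical): 4 genes in rows, 3 stages in columns.
phylostratum <- c(1, 1, 2, 3)
expr <- matrix(c(10, 0, 5,
                  4, 4, 4,
                  0, 6, 1,
                  2, 2, 0),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("gene", 1:4), paste0("stage", 1:3)))

# Partial TAI of each gene in each stage: ps_i * e_is / sum_i(e_is).
partial <- sweep(expr * phylostratum, 2, colSums(expr), "/")

# Contribution of each stratum: sum the partial values of its genes.
strata_contrib <- rowsum(partial, group = phylostratum)

# The global TAI profile is the column sum over all strata.
tai <- colSums(strata_contrib)

print(strata_contrib)
print(tai)
```

Reading the rows of `strata_contrib` shows which age classes drive the overall pattern at each stage.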
FilterRNASeqCT() has been renamed to
Expressed() allowing users to apply this filter function to
RNA-Seq data as well as to microarray data
PlotRE() and PlotMeans() are now based on
colors from the RColorBrewer package (default)
PlotRE() and PlotMeans() now have a new
argument colors allowing users to choose custom colors for
the visualized relative or mean expression profiles
geom.mean() and harm.mean() now are
external functions accessible to the myTAI user
a new function pStrata() allows users to compute
partial TAI/TDI values for all Phylostrata or Divergence Strata
a new function CollapseReplicates() allows users to
combine replicate expression levels in ExpressionSet objects
a new function FilterRNASeqCT() allows users to
filter expression levels of ExpressionSet objects deriving
from RNA-Seq count tables
function MatchMap() now receives a new argument
remove.duplicates allowing users to delete duplicate gene
ids (that might be stored in the input PhyloMap or DivergenceMap) during
the process of matching a Map with an ExpressionSet
FlatLineTest(),
ReductiveHourglassTest(),
EarlyConservationTest(), and PlotPattern()
implement a new argument custom.perm.matrix allowing users
to pass their own (custom) permutation matrix to the corresponding
function. All subsequent test statistics and p-value/std.dev
computations are then based on this custom permutation matrix
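A custom permutation matrix of the kind described above could be built along these lines. This is a sketch under assumptions: the layout (one permuted TAI profile per row, one stage per column) and the simple empirical p-value are illustrative choices, and the data are simulated. The changelog does not show how the test functions consume the matrix internally.

```r
set.seed(42)

# Simulated data (hypothetical): 6 genes in rows, 4 stages in columns.
phylostratum <- c(1, 1, 2, 3, 3, 2)
expr <- matrix(runif(6 * 4, min = 1, max = 100), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("stage", 1:4)))

# TAI profile: expression-weighted mean phylostratum per stage.
tai_profile <- function(expr, ps) {
  as.vector(crossprod(expr, ps)) / colSums(expr)
}

# Permutation matrix: shuffle the phylostratum assignment and recompute
# the TAI profile; each row is one permutation, each column one stage.
n_perm <- 200
perm_matrix <- t(replicate(n_perm, tai_profile(expr, sample(phylostratum))))

# Simplified stand-in for a test statistic: variance of the TAI profile,
# compared against the variances of the permuted profiles.
observed_var <- var(tai_profile(expr, phylostratum))
perm_var <- apply(perm_matrix, 1, var)
p_empirical <- mean(perm_var >= observed_var)

print(dim(perm_matrix))
print(p_empirical)
```

Computing the matrix once and passing it to several tests avoids repeating the permutations and keeps the tests comparable, since they all share the same null sample.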
EarlyConservationTest() and
ReductiveHourglassTest() now have a new parameter
gof.warning allowing users to choose whether or not non
significant goodness of fit results should be printed as
warning
now when specifying TestStatistic = NULL in
PlotPattern() only the TAI/TDI profile is drawn (without
performing any test statistics); this is equivalent to performing:
plot(TAI(PhyloExpressionSetExample))
function combinatorialSignificance() is now named
CombinatorialSignificance()
changing the title and description of the myTAI
package
some minor changes in vignettes and within the documentation of functions
combinatorialSignificance(),
FlatLineTest(), ReductiveHourglassTest(), and
EarlyConservationTest() now support multicore
processing
MatchMap() has been entirely rewritten and is now
based on dplyr;
additionally it now has a new argument accumulate that
allows you to accumulate multiple expression levels to a unique
expression level for a unique gene id
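The effect of accumulating duplicate gene ids can be sketched in base R with `aggregate()`. The data frame and the choice of `mean` as the accumulation function are assumptions for illustration; the changelog does not state which functions the accumulate argument accepts.

```r
# Hypothetical expression table with duplicated gene ids.
expr <- data.frame(GeneID = c("g1", "g1", "g2", "g3", "g3"),
                   stage1 = c(10, 20, 5, 1, 3),
                   stage2 = c(30, 10, 7, 2, 6))

# Collapse all rows sharing a gene id into one row per gene,
# here by averaging the expression levels within each stage.
accumulated <- aggregate(. ~ GeneID, data = expr, FUN = mean)

print(accumulated)
```

After this step each gene id occurs exactly once, so the table can be joined with an age map without producing duplicated rows.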
All three Vignettes: Introduction,
Intermediate, and Advanced have been updated
and extended.
two small bugs in ReductiveHourglassTest() and
EarlyConservationTest() have been fixed that caused only 1 plot to be
generated instead of the expected 3 or 4 plots (par(mfrow=c(1,3)) or
par(mfrow=c(2,2)))
a small bug in PlotMeans() has been fixed that caused a wrong
y-axis label to be shown when plotting only one group of
Phylostrata or Divergence Strata
Introducing myTAI 0.0.1:
A framework to perform phylotranscriptomics analyses for Evolutionary Developmental Biology research.