The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
This vignette illustrates the most useful functions of yatah.
For this example, we use data from Zeller et al. (2014). It is the abundances of bacteria present in 199 stool samples.
abundances <- as_tibble(yatah::abundances)
print(abundances, max_extra_cols = 2)
#> # A tibble: 1,585 × 200
#> lineages `CCIS00146684ST-4-0` `CCIS00281083ST-3-0` `CCIS02124300ST-4-0`
#> <chr> <dbl> <dbl> <dbl>
#> 1 k__Bacteria 100. 99.8 96.3
#> 2 k__Viruses 0.00697 0.128 3.70
#> 3 k__Bacteria|p… 66.2 24.6 74.2
#> 4 k__Bacteria|p… 19.1 74.4 11.9
#> 5 k__Bacteria|p… 12.1 0.0428 7.22
#> 6 k__Bacteria|p… 1.86 0.428 0.765
#> 7 k__Bacteria|p… 0.758 0.388 2.28
#> 8 k__Viruses|p_… 0.00697 0.128 3.70
#> 9 k__Bacteria|p… 0.00155 0.00415 0
#> 10 k__Bacteria|p… 62.4 21.7 62.3
#> # ℹ 1,575 more rows
#> # ℹ 196 more variables: `CCIS02379307ST-4-0` <dbl>, `CCIS02856720ST-4-0` <dbl>,
#> # …
taxonomy <- select(abundances, lineages)
taxonomy
#> # A tibble: 1,585 × 1
#> lineages
#> <chr>
#> 1 k__Bacteria
#> 2 k__Viruses
#> 3 k__Bacteria|p__Firmicutes
#> 4 k__Bacteria|p__Bacteroidetes
#> 5 k__Bacteria|p__Actinobacteria
#> 6 k__Bacteria|p__Verrucomicrobia
#> 7 k__Bacteria|p__Proteobacteria
#> 8 k__Viruses|p__Viruses_noname
#> 9 k__Bacteria|p__Candidatus_Saccharibacteria
#> 10 k__Bacteria|p__Firmicutes|c__Clostridia
#> # ℹ 1,575 more rows
Here, we have all the present bacteria at all different ranks. As we
are just interested in genera that belong to the
Gammaproteobacteria class, we filter()
the
lineages with is_clade()
and is_rank()
. The
genus name is accessible with get_last_clade()
.
gammap_genus <-
taxonomy %>%
filter(is_clade(lineages, "Gammaproteobacteria"),
is_rank(lineages, "genus")) %>%
mutate(genus = get_last_clade(lineages))
gammap_genus
#> # A tibble: 26 × 2
#> lineages genus
#> <chr> <chr>
#> 1 k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Enterobacteria… Esch…
#> 2 k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pasteurellales… Haem…
#> 3 k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Enterobacteria… Ente…
#> 4 k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadale… Pseu…
#> 5 k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Enterobacteria… Ente…
#> 6 k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pasteurellales… Aggr…
#> 7 k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Enterobacteria… Hafn…
#> 8 k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pasteurellales… Acti…
#> 9 k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Xanthomonadale… Sino…
#> 10 k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Enterobacteria… Citr…
#> # ℹ 16 more rows
It is useful to have a taxonomic table. taxtable()
do
the job.
gammaprot_table <-
gammap_genus %>%
pull(lineages) %>%
taxtable()
as_tibble(gammaprot_table)
#> # A tibble: 26 × 6
#> kingdom phylum class order family genus
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 Bacteria Proteobacteria Gammaproteobacteria Enterobacteriales Enteroba… Esch…
#> 2 Bacteria Proteobacteria Gammaproteobacteria Pasteurellales Pasteure… Haem…
#> 3 Bacteria Proteobacteria Gammaproteobacteria Enterobacteriales Enteroba… Ente…
#> 4 Bacteria Proteobacteria Gammaproteobacteria Pseudomonadales Pseudomo… Pseu…
#> 5 Bacteria Proteobacteria Gammaproteobacteria Enterobacteriales Enteroba… Ente…
#> 6 Bacteria Proteobacteria Gammaproteobacteria Pasteurellales Pasteure… Aggr…
#> 7 Bacteria Proteobacteria Gammaproteobacteria Enterobacteriales Enteroba… Hafn…
#> 8 Bacteria Proteobacteria Gammaproteobacteria Pasteurellales Pasteure… Acti…
#> 9 Bacteria Proteobacteria Gammaproteobacteria Xanthomonadales Sinobact… Sino…
#> 10 Bacteria Proteobacteria Gammaproteobacteria Enterobacteriales Enteroba… Citr…
#> # ℹ 16 more rows
To have a tree, use taxtree()
with a taxonomic table in
input. By default, it collapses ranks with only one subrank.
gammaprot_tree <- taxtree(gammaprot_table)
gammaprot_tree
#>
#> Phylogenetic tree with 26 tips and 7 internal nodes.
#>
#> Tip labels:
#> Escherichia, Enterobacteriaceae_noname, Enterobacter, Hafnia, Citrobacter, Pantoea, ...
#> Node labels:
#> Gammaproteobacteria, Enterobacteriaceae, Pasteurellaceae, Pseudomonadales, Moraxellaceae, Xanthomonadales, ...
#>
#> Rooted; includes branch lengths.
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.