rotl provides an interface to the Open Tree of Life (OTL) API and allows users to query the API, retrieve parts of the Tree of Life and integrate these parts with other R packages.

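The OTL API provides services to access the Tree of Life, the Graph of Life, the Taxonomic name resolution services (TNRS), and the Studies.

In rotl, each of these services corresponds to functions with a different prefix: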

Service        rotl prefix
Tree of Life   tol_
Graph of Life  gol_
TNRS           tnrs_
Studies        studies_

rotl also provides a few other functions that can be used to extract relevant information from the objects returned by these functions.

Demonstration of a basic workflow

The most common use for rotl is probably to start from a list of species and get the relevant parts of the tree for these species. This is a two-step process:

  1. the species names need to be matched to their ott_ids (the Open Tree Taxonomy identifiers) using the Taxonomic name resolution services (TNRS)
  2. these ott_ids are then used to retrieve the relevant parts of the Tree of Life.

Step 1: Matching taxonomy to the ott_id

Let’s start by doing a search on a diverse group of taxa: a tree frog (genus Hyla), a fish (genus Salmo), a sea urchin (genus Diadema), and a nautilus (genus Nautilus).

library(rotl)
taxa <- c("Hyla", "Salmo", "Diadema", "Nautilus")
resolved_names <- tnrs_match_names(taxa)

It’s always a good idea to check that the resolved names match what you intended:

search_string  unique_name                            approximate_match  ott_id   is_synonym  is_deprecated  number_matches
hyla           Hyla                                   FALSE              1062216  FALSE       FALSE          1
salmo          Salmo                                  FALSE              982359   FALSE       FALSE          1
diadema        Diadema (genus in family Diademaceae)  FALSE              4930522  FALSE       FALSE          4
nautilus       Nautilus                               FALSE              616358   FALSE       FALSE          1

The column unique_name sometimes indicates the higher taxonomic level associated with the name. Here we queried on genus names, so the API indicates the family names associated with some of the genera. The column number_matches indicates the number of ott_ids that correspond to a given name. In our example, our search on Diadema returns 4 matches, and the one returned by default is a fungus and not the sea urchin that we want for our query.
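Names that matched more than one ott_id can be flagged with ordinary data frame subsetting. The sketch below rebuilds a small data frame from the values in the table above so that it runs without network access; the object name resolved is only illustrative, and in practice you would subset the object returned by tnrs_match_names() directly.

```r
# Values copied from the resolved-names table shown above; in practice,
# use the object returned by tnrs_match_names() instead of this stand-in.
resolved <- data.frame(
  search_string  = c("hyla", "salmo", "diadema", "nautilus"),
  unique_name    = c("Hyla", "Salmo",
                     "Diadema (genus in family Diademaceae)", "Nautilus"),
  ott_id         = c(1062216, 982359, 4930522, 616358),
  number_matches = c(1, 1, 4, 1),
  stringsAsFactors = FALSE
)

# Names that matched more than one ott_id deserve a manual check.
ambiguous <- resolved[resolved$number_matches > 1, ]
print(ambiguous$search_string)
```

Any name listed here is a candidate for a narrower search or a manual correction.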

The argument context_name allows you to limit the taxonomic scope of your search. In this case, if we limit our search to animals, the genus Diadema now matches the sea urchin (notice the family is Diadematidae):

resolved_names <- tnrs_match_names(taxa, context_name = "Animals")
search_string  unique_name                             approximate_match  ott_id   is_synonym  is_deprecated  number_matches
hyla           Hyla                                    FALSE              1062216  FALSE       FALSE          1
salmo          Salmo                                   FALSE              982359   FALSE       FALSE          1
diadema        Diadema (genus in family Diadematidae)  FALSE              631176   FALSE       FALSE          2
nautilus       Nautilus                                FALSE              616358   FALSE       FALSE          1

If you are trying to build a tree with taxa so deeply divergent that no single context_name can resolve all of them correctly, see “How to change the ott ids assigned to my taxa?” in the FAQ below.

Step 2: Getting the tree corresponding to our taxa

Now that we have the correct ott_id for our taxa, we can ask for the tree using the tol_induced_subtree() function. By default, the object returned by tol_induced_subtree is a phylo object (from the ape package), so we can plot it directly.

my_tree <- tol_induced_subtree(ott_ids = resolved_names$ott_id)
plot(my_tree, no.margin=TRUE)
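The two steps can also be wrapped in a small helper. The function below is a hypothetical convenience wrapper, not part of rotl: the name get_induced_tree is made up for this sketch, and it only relies on the two rotl calls used above and on the number_matches column. Extra arguments such as context_name are passed through to tnrs_match_names().

```r
# Hypothetical helper (not part of rotl): taxon names in, induced subtree out.
get_induced_tree <- function(taxa, ...) {
  # Step 1: match the names to ott_ids; `...` can carry context_name.
  matches <- rotl::tnrs_match_names(taxa, ...)

  # Warn about names that matched several ott_ids so they can be reviewed.
  ambiguous <- matches$search_string[which(matches$number_matches > 1)]
  if (length(ambiguous) > 0) {
    warning("Names with several matches, please check them: ",
            paste(ambiguous, collapse = ", "))
  }

  # Step 2: retrieve the part of the tree induced by these ott_ids.
  rotl::tol_induced_subtree(ott_ids = matches$ott_id)
}

# Usage (needs the rotl package and network access):
# my_tree <- get_induced_tree(c("Hyla", "Salmo", "Diadema", "Nautilus"),
#                             context_name = "Animals")
```

The warning does not stop the query: a name with several matches may still have been resolved correctly, as Diadema is once the search is limited to animals.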

FAQ

How to change the ott ids assigned to my taxa?

If you realize that tnrs_match_names() assigns the incorrect taxonomic group to your name (e.g., because of synonymy) and changing the context_name does not help, you can use the function inspect(). This function takes the object returned by tnrs_match_names(), and either the row number, the taxon name (the one you used in your search, in lowercase), or the ott_id returned by the initial query.

To illustrate this, let’s re-use the previous query but add a tree to our list of taxa (a maple, genus Acer), so that restricting our search to animals doesn’t work.

taxa <- c("Hyla", "Salmo", "Diadema", "Nautilus", "Acer")
resolved_names <- tnrs_match_names(taxa)
resolved_names
##   search_string                           unique_name approximate_match
## 1          hyla                                  Hyla             FALSE
## 2         salmo                                 Salmo             FALSE
## 3       diadema Diadema (genus in family Diademaceae)             FALSE
## 4      nautilus                              Nautilus             FALSE
## 5          acer                                  Acer             FALSE
##    ott_id is_synonym is_deprecated number_matches
## 1 1062216      FALSE         FALSE              1
## 2  982359      FALSE         FALSE              1
## 3 4930522      FALSE         FALSE              4
## 4  616358      FALSE         FALSE              1
## 5  948922      FALSE         FALSE              1
inspect(resolved_names, taxon_name = "diadema")
##   search_string                            unique_name approximate_match
## 1       diadema  Diadema (genus in family Diademaceae)             FALSE
## 2       diadema Diadema (genus in family Diadematidae)             FALSE
## 3       diadema                             Hypolimnas             FALSE
## 4       diadema                            Diademoides             FALSE
##    ott_id is_synonym is_deprecated number_matches
## 1 4930522      FALSE         FALSE              4
## 2  631176      FALSE         FALSE              4
## 3  643831       TRUE         FALSE              4
## 4 4024672       TRUE         FALSE              4

In our case, we want the second row in this data frame to replace the information that was initially matched for Diadema. We can now use the update() function to change to the correct taxon (the sea urchin, not the fungus):

resolved_names <- update(resolved_names, taxon_name = "diadema",
                         new_row_number = 2)

## we could also have used the ott_id to replace this taxon:
## resolved_names <- update(resolved_names, taxon_name = "diadema",
##                          new_ott_id = 631176)

And now our resolved_names data frame includes the taxon we want:

search_string  unique_name                             approximate_match  ott_id   is_synonym  is_deprecated  number_matches
hyla           Hyla                                    FALSE              1062216  FALSE       FALSE          1
salmo          Salmo                                   FALSE              982359   FALSE       FALSE          1
diadema        Diadema (genus in family Diadematidae)  FALSE              631176   FALSE       FALSE          4
nautilus       Nautilus                                FALSE              616358   FALSE       FALSE          1
acer           Acer                                    FALSE              948922   FALSE       FALSE          1
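Rather than reading the row number off the inspect() output by eye, it can be looked up from the unique_name column. The sketch below rebuilds the candidate matches from the inspect() output shown above so that it runs offline; the names candidates, urchin_row and urchin_id are only illustrative, and in practice the object returned by inspect() would be used instead.

```r
# Candidate matches copied from the inspect() output shown above; in practice
# this would be the object returned by inspect().
candidates <- data.frame(
  unique_name = c("Diadema (genus in family Diademaceae)",
                  "Diadema (genus in family Diadematidae)",
                  "Hypolimnas", "Diademoides"),
  ott_id      = c(4930522, 631176, 643831, 4024672),
  is_synonym  = c(FALSE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Row whose unique_name mentions the sea urchin family.
urchin_row <- grep("Diadematidae", candidates$unique_name, fixed = TRUE)

# The matching ott_id, usable as new_ott_id in update().
urchin_id <- candidates$ott_id[urchin_row]
```

Either urchin_row (as new_row_number) or urchin_id (as new_ott_id) could then be passed to update().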

How do I know that the taxon I’m asking for is the correct one?

The function taxonomy_taxon() takes ott_ids as arguments and returns taxonomic information about the taxa. This output can be passed to some helper functions to extract the relevant information. Let’s illustrate this with our Diadema example:

diadema_info <- taxonomy_taxon(631176)
tax_rank(diadema_info)
##  631176 
## "genus"
synonyms(diadema_info)
## $`631176`
## [1] "Diadema"      "Centrechinus"
ott_taxon_name(diadema_info)
##    631176 
## "Diadema"

In some cases, it might also be useful to investigate the taxonomic tree descending from an ott_id to check that it’s the correct taxon and to determine the species included in the Open Tree Taxonomy:

diadema_tax_tree <- taxonomy_subtree(631176)
diadema_tax_tree
## $tip_label
##  [1] "Diadema_sp._CS-2014_ott5502179"          
##  [2] "Diadema_africana_ott5502180"             
##  [3] "Diadema_ascensionis_ott4950423"          
##  [4] "Diadema_lobatum_ott4950422"              
##  [5] "Diadema_pseudodiadema_ott4950421"        
##  [6] "Diadema_africanum_ott4147369"            
##  [7] "Diadema_antillarum_antillarum_ott4147370"
##  [8] "Diadema_antillarum_scensionis_ott220009" 
##  [9] "Diadema_palmeri_ott836860"               
## [10] "Diadema_sp._DSM6_ott771059"              
## [11] "Diadema_mexicanum_ott639130"             
## [12] "Diadema_setosum_ott631175"               
## [13] "Diadema_sp._DSM4_ott587481"              
## [14] "Diadema_sp._dsm5_ott587480"              
## [15] "Diadema_sp._DSM2_ott587483"              
## [16] "Diadema_sp._DSM3_ott587482"              
## [17] "Diadema_sp._seto9_ott587485"             
## [18] "Diadema_sp._seto10_ott587484"            
## [19] "Diadema_sp._DSM7_ott587487"              
## [20] "Diadema_sp._DSM8_ott587486"              
## [21] "Diadema_sp._SETO15_ott587479"            
## [22] "Diadema_sp._seto17_ott587478"            
## [23] "Diadema_savignyi_ott395692"              
## [24] "Diadema_sp._seto16_ott312262"            
## [25] "Diadema_paucispinum_ott312263"           
## [26] "Diadema_sp._DSM1_ott219999"              
## [27] "Diadema_sp._seto18_ott66623"             
## [28] "Diadema_sp._seto35_ott66618"             
## [29] "Diadema_sp._DJN9_ott66626"               
## [30] "Diadema_sp._seto19_ott66624"             
## [31] "Diadema_sp._seto38_ott66625"             
## 
## $edge_label
## [1] "Diadema_antillarum_ott1022356" "Diadema_ott631176"

By default, this function returns all taxa (including self, and internal) descending from this ott_id, but it is also possible to return a phylo object by setting output_format = "phylo".
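The tip labels in the output above combine the taxon name and its ott_id, separated by "_ott". Assuming every label follows that pattern, the two parts can be separated with base R regular expressions. The labels below are copied from the output above so that the sketch runs without querying the API.

```r
# A few tip labels copied from the output above.
tip_labels <- c("Diadema_africanum_ott4147369",
                "Diadema_antillarum_antillarum_ott4147370",
                "Diadema_setosum_ott631175")

# The ott_id is the run of digits after the final "_ott".
ott_ids <- as.numeric(sub("^.*_ott([0-9]+)$", "\\1", tip_labels))

# The taxon name is everything before that suffix, with "_" separating words.
taxon_names <- gsub("_", " ", sub("_ott[0-9]+$", "", tip_labels))

print(data.frame(taxon_name = taxon_names, ott_id = ott_ids))
```

The extracted ott_ids can then be passed to functions such as taxonomy_taxon() to look at individual species.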

How do I get the tree for a particular taxonomic group?

If you are looking to get the tree for a particular taxonomic group, you need to first identify it by its node id or ott id, and then use the tol_subtree() function:

mono_id <- tnrs_match_names("Monotremes")
mono_tree <- tol_subtree(ott_id = mono_id$ott_id[1])
plot(mono_tree)

How do I find trees from studies focused on my favourite taxa?

The function studies_find_studies() allows the user to search for studies matching specific criteria. The function studies_properties() returns the list of properties that can be used in the search.

furry_studies <- studies_find_studies(property="ot:focalCladeOTTTaxonName", value="Mammalia")
furry_ids <- unlist(furry_studies$matched_studies)

Now that we know the study_id, we can ask for the metadata associated with this study:

furry_meta <- get_study_meta("pg_2550")
knitr::kable(get_publication(furry_meta))     ## The citation for the source of the study
O’Leary, Maureen A., Marc Allard, Michael J. Novacek, Jin Meng, and John Gatesy. 2004. “Building the mammalian sector of the tree of life: Combining different data and a discussion of divergence times for placental mammals.” In: Cracraft J., & Donoghue M., eds. Assembling the Tree of Life. pp. 490-516. Oxford, United Kingdom, Oxford University Press.
get_tree_ids(furry_meta)        ## This study has 10 trees associated with it
##  [1] "tree5513" "tree5515" "tree5516" "tree5517" "tree5518" "tree5519"
##  [7] "tree5520" "tree5521" "tree5522" "tree5523"
candidate_for_synth(furry_meta) ## None of these trees are yet included in the OTL
## NULL

Using get_study("pg_2550") would return a multiPhylo object (default) with all the trees associated with this particular study, while get_study_tree("pg_2550", "tree5513") would return one of these trees.