Examples

This vignette illustrates how to detect ambiguity and inconsistency in a merged taxonomy. Start by loading the 2000-row sample dataset that comes with taxonbridge:

library(taxonbridge)
sample <- load_sample()
dim(sample)
#> [1] 2000   20

Next, retrieve all rows that have lineage information in both the GBIF backbone and NCBI:

lineages <- get_lineages(sample)

Then validate the lineages at the kingdom and family taxonomic ranks, and collect the resulting tibbles in a list. The phylum, class, and order ranks may also be used. In this example, setting valid = FALSE returns the entries that failed validation.

kingdom <- get_validity(lineages, rank = "kingdom", valid = FALSE)
#> Term conversion carried out on kingdom taxonomic rank
family <- get_validity(lineages, rank = "family", valid = FALSE)
candidates <- list(kingdom, family)
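
To make the idea of rank-level validation more concrete, the following base R sketch compares a family name from two sources on a toy data frame. It is only an illustration of the concept and does not reproduce how get_validity() works internally; the column names gbif_family and ncbi_family and the toy rows are assumptions, not the taxonbridge schema.

```r
# Toy data: column names and rows are hypothetical, not taxonbridge's schema.
toy <- data.frame(
  canonicalName = c("Species a", "Species b", "Species c"),
  gbif_family   = c("Felidae", "Canidae", "Ursidae"),
  ncbi_family   = c("Felidae", "Canidae", "Mustelidae"),
  stringsAsFactors = FALSE
)

# A row "passes" when both sources agree on the family name.
agrees <- toy$gbif_family == toy$ncbi_family

# Keep the rows that disagree, similar in spirit to valid = FALSE.
failed <- toy[!agrees, ]
print(failed$canonicalName)
```

Only the third toy row disagrees between the two sources, so it is the only one retained.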

Finally, detect candidate incongruencies (excluding those with uninomial scientific names):

get_inconsistencies(candidates, uninomials = FALSE)
#> [1] "Gordonia neofelifaecis"  "Attheya septentrionalis"
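
The distinction between uninomial and binomial names can be sketched in base R. The example below assumes that a uninomial is simply a name without internal whitespace; this is a simplification for illustration and not necessarily the rule that get_inconsistencies() applies. The genus-only entry is added here purely for contrast.

```r
# Names from the output above, plus a genus-only name for contrast.
nms <- c("Gordonia neofelifaecis", "Attheya septentrionalis", "Gordonia")

# Assumption: a uninomial is a name with no internal whitespace.
is_uninomial <- !grepl("\\s", trimws(nms))

# Drop uninomials, keeping names with two or more words.
binomials <- nms[!is_uninomial]
print(binomials)
```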

Two binomial names exhibit incongruency. Consulting the literature and the individual entries shows the following:

Attheya septentrionalis has the status “synonym” in the GBIF data:

lineages[lineages$canonicalName == "Attheya septentrionalis", "taxonomicStatus"]
#> # A tibble: 1 × 1
#>   taxonomicStatus
#>   <chr>          
#> 1 synonym

Applying the get_status() function to keep only entries with the status "accepted" and rerunning the analysis leaves Gordonia neofelifaecis as the only binomial incongruency with biological provenance:

lineages <- get_status(get_lineages(sample), status = "accepted")
kingdom <- get_validity(lineages, rank = "kingdom", valid = FALSE)
#> Term conversion carried out on kingdom taxonomic rank
family <- get_validity(lineages, rank = "family", valid = FALSE)
candidates <- list(kingdom, family)
get_inconsistencies(candidates, uninomials = FALSE)
#> [1] "Gordonia neofelifaecis"
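
The effect of filtering by status can be approximated in base R on a toy data frame. This sketch assumes that filtering by status amounts to keeping rows whose taxonomicStatus equals the requested value; that is an assumption about the behaviour of get_status(), not a description of its implementation. The two rows mirror the statuses implied by the results above.

```r
# Toy stand-in for the lineages tibble, using only the two columns
# that appear in this vignette.
toy <- data.frame(
  canonicalName   = c("Attheya septentrionalis", "Gordonia neofelifaecis"),
  taxonomicStatus = c("synonym", "accepted"),
  stringsAsFactors = FALSE
)

# Assumed equivalent of a status filter: keep accepted entries only.
accepted <- toy[toy$taxonomicStatus == "accepted", ]
print(accepted$canonicalName)
```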