The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
taxalight provides a lightweight, lightning fast query
for resolving taxonomic identifiers to taxonomic names, and vice versa,
by using a Lightning Memory Mapped Database backend. Compared to
taxadb, it has few dependencies, fewer functions, and
faster performance.
If you just need to resolve scientific names to identifiers and vice
versa, taxalight is a fast and simple option.
taxalight currently supports names from Integrated
Taxonomic Information System (ITIS), National Center for Biotechnology
Information (NCBI), Global Biodiversity Information Facility (GBIF),
Catalogue of Life (COL), and Open Tree Taxonomy (OTT). Like
taxadb, taxalight uses annual stable version
snapshots from these providers and presents the naming data in the
simple and consistent tabular format of the Darwin Core Standard.
You can install the released version of taxalight from CRAN with:
install.packages("taxalight")And the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("cboettig/taxalight")taxalight needs to first download and import the
provider naming databases. This can take a while, but needs to only be
done once.
library(taxalight)
tl_create("itis")
#> Now we can look up species by names, IDs, or a mix. Even vernacular
names can be recognized as key. Note that only exact matches are
supported though! ITIS (itis) is the default provider, but
GBIF, COL, OTT, and NCBI are also available.
tl("Homo sapiens", provider = "itis")
#> # A tibble: 30 x 14
#>    taxonID   scientificName  acceptedNameUsag… taxonomicStatus taxonRank kingdom
#>    <chr>     <chr>           <chr>             <chr>           <chr>     <chr>  
#>  1 ITIS:944… Homo aethiopic… ITIS:180092       synonym         species   Animal…
#>  2 ITIS:944… Homo americanus ITIS:180092       synonym         species   Animal…
#>  3 ITIS:944… Homo arabicus   ITIS:180092       synonym         species   Animal…
#>  4 ITIS:944… Homo australas… ITIS:180092       synonym         species   Animal…
#>  5 ITIS:944… Homo cafer      ITIS:180092       synonym         species   Animal…
#>  6 ITIS:944… Homo capensis   ITIS:180092       synonym         species   Animal…
#>  7 ITIS:944… Homo columbicus ITIS:180092       synonym         species   Animal…
#>  8 ITIS:944… Homo drennani   ITIS:180092       synonym         species   Animal…
#>  9 ITIS:944… Homo grimaldii  ITIS:180092       synonym         species   Animal…
#> 10 ITIS:944… Homo hottentot… ITIS:180092       synonym         species   Animal…
#> # … with 20 more rows, and 8 more variables: phylum <chr>, class <chr>,
#> #   order <chr>, family <chr>, genus <chr>, specificEpithet <chr>,
#> #   infraspecificEpithet <lgl>, vernacularName <chr>id <- c("ITIS:180092", "ITIS:179913", "Dendrocygna autumnalis", "Snow Goose",
        provider = "itis")
tl(id)
#> # A tibble: 6 x 14
#>   taxonID   scientificName    acceptedNameUsa… taxonomicStatus taxonRank kingdom
#>   <chr>     <chr>             <chr>            <chr>           <chr>     <chr>  
#> 1 ITIS:180… Homo sapiens      ITIS:180092      accepted        species   Animal…
#> 2 ITIS:179… Mammalia          ITIS:179913      accepted        class     <NA>   
#> 3 ITIS:175… Dendrocygna autu… ITIS:175044      accepted        species   Animal…
#> 4 ITIS:175… Anser caerulesce… ITIS:175038      synonym         species   Animal…
#> 5 ITIS:175… Anser hyperboreus ITIS:175038      synonym         species   Animal…
#> 6 ITIS:175… Chen caerulescens ITIS:175038      accepted        species   Animal…
#> # … with 8 more variables: phylum <chr>, class <chr>, order <chr>,
#> #   family <chr>, genus <chr>, specificEpithet <chr>,
#> #   infraspecificEpithet <lgl>, vernacularName <chr>For convenience, we can request just the name or id as a character
vector (paralleling functionality in taxize). If the name
is recognized as an accepted name, the corresponding ID for the provider
is returned.
get_ids("Homo sapiens")
#>  Homo sapiens 
#> "ITIS:180092"get_names("ITIS:179913")
#> [1] "Mammalia"library(bench)sp <- c("Dendrocygna autumnalis", "Dendrocygna bicolor",
        "Chen canagica",          "Chen caerulescens"     )taxadb::td_create("itis", schema="dwc")
#> Registered S3 methods overwritten by 'readr':
#>   method           from 
#>   format.col_spec  vroom
#>   print.col_spec   vroom
#>   print.collector  vroom
#>   print.date_names vroom
#>   print.locale     vroom
#>   str.col_spec     vroom
#> Warning in overwrite_db(con, tablename): overwriting 2020_dwc_itis
#> Importing /home/cboettig/.local/share/R/contentid/data/ef/6a/ef6ae3b337be65c661d5e2d847613ebc955bb9d91d2d98d03cf8c53029cecc2a in 100000 line chunks:
#>  ...Done! (in 14.38502 secs)bench::bench_time(
  df_tb <- taxadb::filter_name(sp, "itis")
)
#> process    real 
#>   2.81s   2.84s
df_tb
#> # A tibble: 4 x 17
#>    sort taxonID   scientificName     taxonRank acceptedNameUsag… taxonomicStatus
#>   <int> <chr>     <chr>              <chr>     <chr>             <chr>          
#> 1     1 ITIS:175… Dendrocygna autum… species   ITIS:175044       accepted       
#> 2     2 ITIS:175… Dendrocygna bicol… species   ITIS:175046       accepted       
#> 3     3 ITIS:175… Chen canagica      species   ITIS:175042       accepted       
#> 4     4 ITIS:175… Chen caerulescens  species   ITIS:175038       accepted       
#> # … with 11 more variables: update_date <chr>, kingdom <chr>, phylum <chr>,
#> #   class <chr>, order <chr>, family <chr>, genus <chr>, specificEpithet <chr>,
#> #   infraspecificEpithet <chr>, vernacularName <chr>, input <chr>bench::bench_time(
  df_tl <- taxalight::tl(sp, "itis")
)
#> process    real 
#>  29.8ms  44.8ms
df_tl
#> # A tibble: 9 x 14
#>   taxonID  scientificName    acceptedNameUsag… taxonomicStatus taxonRank kingdom
#>   <chr>    <chr>             <chr>             <chr>           <chr>     <chr>  
#> 1 ITIS:17… Dendrocygna autu… ITIS:175044       accepted        species   Animal…
#> 2 ITIS:17… Dendrocygna bico… ITIS:175046       synonym         subspeci… Animal…
#> 3 ITIS:17… Dendrocygna bico… ITIS:175046       accepted        species   Animal…
#> 4 ITIS:17… Philacte canagica ITIS:175042       synonym         species   Animal…
#> 5 ITIS:17… Anser canagicus   ITIS:175042       synonym         species   Animal…
#> 6 ITIS:17… Chen canagica     ITIS:175042       accepted        species   Animal…
#> 7 ITIS:17… Anser caerulesce… ITIS:175038       synonym         species   Animal…
#> 8 ITIS:17… Anser hyperboreus ITIS:175038       synonym         species   Animal…
#> 9 ITIS:17… Chen caerulescens ITIS:175038       accepted        species   Animal…
#> # … with 8 more variables: phylum <chr>, class <chr>, order <chr>,
#> #   family <chr>, genus <chr>, specificEpithet <chr>,
#> #   infraspecificEpithet <lgl>, vernacularName <chr>bench::bench_time(
  id_tb <- taxadb::get_ids(sp, "itis")
)
#> process    real 
#>   2.42s   2.54s
id_tb
#> Dendrocygna autumnalis    Dendrocygna bicolor          Chen canagica 
#>          "ITIS:175044"          "ITIS:175046"          "ITIS:175042" 
#>      Chen caerulescens 
#>          "ITIS:175038"bench::bench_time(
  id_tl <- taxalight::get_ids(sp, "itis")
)
#> process    real 
#>  31.7ms    45ms
id_tl
#> Dendrocygna autumnalis    Dendrocygna bicolor          Chen canagica 
#>          "ITIS:175044"          "ITIS:175046"          "ITIS:175042" 
#>      Chen caerulescens 
#>          "ITIS:175038"Under the hood, taxalight consumes a DCAT2/PROV-O
based description of the data provenance which generates the
standard-format tables imported by taxalight (and
taxadb) from the original data published by the naming
providers. All data and scripts are identified by content-based
identifiers, which can be resolved by https://hash-archive.org or the R package,
contentid. This provides several benefits over resolving
data from a URL source:
Input data and scripts for transforming the data into the desired format are similarly archived and referenced by content identifiers in the provenance trace.
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.