The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
##Background Erroneous database entries and problematic geographic
coordinates are a central issue in biogeography and there is a set of
tools available to address different dimensions of the problem.
CoordinateCleaner focuses on the fast and reproducible flagging of large
amounts of records, and additional functions to detect dataset-level and
fossil-specific biases. In the R-environment the scrubr and biogeo offer
cleaning approaches complementary to CoordinateCleaner. The scrubr
package combines basic geographic cleaning (comparable to cc_dupl,
cc_zero and cc_count in CoordinateCleaner) but adds options to clean
taxonomic names (See also taxize) and date
information. biogeo includes some basic automated geographic cleaning
(similar to cc_val
, cc_count
and
cc_outl
) but rather focusses on correcting suspicious
coordinates on a manual basis using environmental information.
Table 1. Function by function comparison of CoordinateCleaner, scrubr and biogeo.
Functionality | CoordinateCleaner 2.0-2 | scrubr 0.1.1 | biogeo 1.0 | Percent overlap |
---|---|---|---|---|
Missing coordinates | cc_val | coord_incomplete | missingvalsexclude | 100% |
Coordinates outside CRS | cc_val | coord_impossible | - | 100% |
Duplicated records | cc_dupl | dedup | duplicatesexclude | The aim is identical, methods differ |
0/0 coordinates | cc_zero | coord_unlikely | - | 100% |
Identical lon/lat | cc_equ | - | - | 0% |
Country capitals | cc_cap | - | - | 0% |
Political unit centroids | cc_cen | - | - | 0% |
Coordinates in-congruent with additional location information | cc_count | coord_within | errorcheck, quickclean | 100% |
Coordinates assigned to GBIF headquarters | cc_gbif | - | - | 0% |
Coordinates assigned to the location of biodiversity institutions | cc_inst | - | - | 0% |
Coordinates outside natural range | cc_iucn | - | - | 0% |
Spatial outliers | cc_outl | - | outliers | 50%, biogeo uses environmental distance |
Coordinates within the ocean | cc_sea | - | - | 0% |
Coordinates in urban area | cc_urb | - | - | 0% |
Coordinate conversion error | dc_ddmm | - | - | 0% |
Rounded coordinates/rasterized collection | dc_round | - | precisioncheck | 20%, biogeo test for predefined rasters |
Fossils: invalid age range | tc_equal | - | - | 0% |
Fossils: excessive age range | tc_range | - | - | 0% |
Fossils: temporal outlier | tc_outl | - | - | 0% |
Fossils: PyRate interface | WritePyrate | - | - | 0% |
Wrapper functions to run all test | CleanCoordinates, CleanCoordinatesDS, CleanCoordinatesFOS | - | - | 0% |
Database of biodiversity institutions | institutions | - | - | 0% |
Taxonomic cleaning | - | tax_no_epithet | - | 0% |
Missing date | - | date_missing | - | 0% |
Add date | - | date_create | - | 0% |
Date format | - | date_standardize | - | 0% |
Reformatting coordinate annotation | - | - | a large set of functions | 0 % |
Correcting coordinates using guessing and environmental distance | - | - | a large set of functions | 0 % |
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.