This vignette shows how to use the ECDFniche package to reproduce the simulations from the original manuscript, comparing Mahalanobis distance–based suitability transformations using the chi-squared distribution and the empirical cumulative distribution function (ECDF).
We follow a virtual ecologist approach (Zurell et al. 2010) to evaluate how different transformations of the Mahalanobis distance recover a simulated niche across varying dimensionalities (number of predictor variables) and sample sizes (number of records), and then extend the analysis to a bivariate non-normal environmental space.
ecdf_compare_niche()

In the first set of simulations, we assume that the environmental predictors describing a species’ niche follow a multivariate normal distribution.
Dimensions \(p\) range from 1 to 5, and sample sizes \(n\) range from 20 to 500 in steps of 20, mimicking increasing numbers of occurrence records.
For each combination of \(p\) and \(n\), ecdf_compare_niche() transforms the squared Mahalanobis distance \(D^2\) of each record into two suitability values:
Theoretical chi-squared suitability: \[ S_{\chi^2} = 1 - F_{\chi^2_p}(D^2), \] where \(F_{\chi^2_p}\) is the cumulative distribution function of the chi-squared distribution with \(p\) degrees of freedom.
Empirical ECDF-based suitability: \[ S_{\text{ECDF}} = 1 - F_{\text{ECDF}}(D^2), \] where \(F_{\text{ECDF}}\) is the empirical cumulative distribution function of the distances.
This setup mimics a realistic ENM scenario where multivariate means and true covariances are unknown and must be estimated from occurrence records drawn from a correlated environmental space representing the species’ theoretical niche (sensu Hutchinson 1978).
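The two transformations can be sketched in base R for a single combination of \(p\) and \(n\). This is a minimal illustration, not the package's internal code: the covariance matrix `Sigma` and the object names are assumptions chosen for the example, and only functions from the stats package are used.

```r
set.seed(1)
p <- 3    # number of environmental predictors
n <- 200  # number of occurrence records

# Assumed covariance for illustration: correlation decays with index distance
Sigma <- 0.5^abs(outer(seq_len(p), seq_len(p), "-"))

# Draw n records from a multivariate normal with mean 0 and covariance Sigma
X <- matrix(rnorm(n * p), nrow = n, ncol = p) %*% chol(Sigma)

# Squared Mahalanobis distance using the estimated mean and covariance
D2 <- mahalanobis(X, center = colMeans(X), cov = cov(X))

S_chi  <- 1 - pchisq(D2, df = p)   # theoretical chi-squared suitability
S_ecdf <- 1 - ecdf(D2)(D2)         # empirical ECDF-based suitability

# Pearson correlation between the two suitability vectors
cor(S_chi, S_ecdf)
```

Both vectors decrease monotonically with \(D^2\), so any disagreement between them reflects how far the empirical distribution of distances departs from the chi-squared reference.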
set.seed(1991)
normal_res <- ecdf_compare_niche(
  p_vals = 1:5,
  n_vals = seq(20L, 500L, 20L),
  n_reps = 30L
)

cor_plot summarizes, for each \(p\) and \(n\), the mean and standard deviation of the
correlation between chi-squared and ECDF suitabilities across the 30
replicates, with individual replicate values shown as grey points. The
plot shows the correlation between suitability metrics estimated using
the chi-squared and the Empirical Cumulative Distribution Function
(ECDF) across sample sizes and numbers of environmental predictors. Each
panel shows Pearson correlation coefficients as a function of the number
of occurrence records for a given dimensionality \(p\) (1 to 5 variables).
Correlations are generally very high (rarely below 0.95), increasing with
sample size and slightly increasing with dimensionality.
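The kind of summary shown in cor_plot can be approximated with a short replicate loop. The sketch below is an assumption-laden illustration rather than the package's implementation: `sim_cor()` is a hypothetical helper, the predictors are drawn as independent normals for brevity, and the grid is much smaller than the one used above.

```r
# Hypothetical helper (not part of ECDFniche): one replicate for a given p and n
sim_cor <- function(p, n) {
  X  <- matrix(rnorm(n * p), nrow = n, ncol = p)  # independent predictors (assumption)
  D2 <- mahalanobis(X, center = colMeans(X), cov = cov(X))
  cor(1 - pchisq(D2, df = p), 1 - ecdf(D2)(D2))
}

set.seed(1)
grid <- expand.grid(p = 1:3, n = c(20, 100, 500))
reps <- 30

# One column of replicate correlations per (p, n) combination
cors <- mapply(function(p, n) replicate(reps, sim_cor(p, n)), grid$p, grid$n)

grid$mean_cor <- colMeans(cors)
grid$sd_cor   <- apply(cors, 2, sd)
grid
```

Each row of `grid` then holds the mean and standard deviation of the correlation across replicates for one combination of dimensionality and sample size.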
suit_plot pools observations across sample sizes for
each \(p\), recomputes an ECDF-based
suitability on the combined distances, and compares these to chi-squared
curves. The plot shows the relationship between squared Mahalanobis
distance (x-axis) and environmental suitability metrics (y-axis)
estimated using the chi-squared and the Empirical Cumulative
Distribution Function (ECDF) across different numbers of environmental
predictors. Grey curves represent the habitat suitability based on the
chi-squared distribution while red circles represent habitat suitability
via ECDF, highlighting that ECDF closely tracks the theoretical
chi-squared mapping over the distance range.
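The pooling step can be sketched as follows for a single dimensionality. This is a simplified illustration under stated assumptions (independent normal predictors, three sample sizes, one replicate each), and the object names are not taken from the package.

```r
set.seed(1)
p <- 2
n_vals <- c(20, 100, 500)

# Squared Mahalanobis distances computed within each sample, then pooled
D2_pooled <- unlist(lapply(n_vals, function(n) {
  X <- matrix(rnorm(n * p), nrow = n, ncol = p)
  mahalanobis(X, center = colMeans(X), cov = cov(X))
}))

# ECDF-based suitability recomputed on the combined distances
S_ecdf <- 1 - ecdf(D2_pooled)(D2_pooled)

# Chi-squared suitability evaluated at the same distances
S_chi <- 1 - pchisq(D2_pooled, df = p)

# Largest vertical gap between the two curves
max(abs(S_ecdf - S_chi))
```

A small maximum gap indicates that the pooled ECDF curve stays close to the chi-squared curve over the observed range of distances.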
ecdf_nonnormal_niche()

In many real ENM applications, environmental covariates are not jointly normal and species can show complex responses to environmental gradients (Anderson et al. 2022).
To explore this, ecdf_nonnormal_niche() simulates a
bivariate environmental space for temperature and precipitation using a
Gaussian copula with dependence parameter \(\rho\), and repeats the
comparison for each \(\rho\) across a grid of sample sizes.
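A Gaussian copula with a normal margin for temperature and a Weibull margin for precipitation can be sketched in base R as follows. This illustrates the general construction only and is an assumption about how such a space may be built, not a copy of the package internals; the marginal parameters mirror the arguments used in this vignette.

```r
set.seed(1)
n   <- 1000
rho <- 0.7

# Latent bivariate standard normal with correlation rho
z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)

# Map to uniform margins; the dependence structure is the Gaussian copula
u1 <- pnorm(z1)
u2 <- pnorm(z2)

# Apply the marginal quantile functions
temp <- qnorm(u1, mean = 20, sd = 5)          # normal temperature
prec <- qweibull(u2, shape = 2, scale = 10)   # right-skewed precipitation

env <- cbind(temp, prec)

# Observed Pearson correlation; it need not equal rho exactly
cor(temp, prec)

# Squared Mahalanobis distances in the non-normal space
D2 <- mahalanobis(env, center = colMeans(env), cov = cov(env))
```

Because precipitation is skewed, the joint distribution is no longer multivariate normal, so \(D^2\) is not guaranteed to follow a chi-squared distribution with 2 degrees of freedom.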
set.seed(1991)
nonnormal_res <- ecdf_nonnormal_niche(
  rho_vals = c(-0.7, -0.3, 0, 0.3, 0.7),
  n_vals = c(20L, 50L, 100L, 200L, 500L),
  n_reps = 10L,
  N_ref = 1e5,
  temp_function = "qnorm",
  temp_parameters = list(mean = 20, sd = 5),
  prec_function = "qweibull",
  prec_parameters = list(shape = 2, scale = 10)
)

suit_plot shows ECDF-based suitability (colored by \(\rho\)) and chi-squared suitability (red
points) as functions of \(D^2\),
faceted by sample size. Plots highlight the sensitivity of chi-squared
and ECDF suitability metrics to sample size and variable correlation in
non-normal bivariate data. Internal histograms represent the
distribution of ECDF-based values around the chi-squared-based
suitability. The ECDF estimator shows higher stochasticity in small
samples but converges to the chi-squared expectation in larger
samples.
nonnormal_res$suit_plot
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).
#> Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).
#> Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).
#> Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).
#> Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).

You can customize key aspects of both simulations to reproduce or extend the analyses:
ecdf_compare_niche():
p_vals: set of dimensions (e.g. 1:10) to evaluate high-dimensional behavior.
n_vals: sample size grid, e.g. smaller n to explore the effect of limited records.
n_reps: number of replicates per combination for more stable summaries.

res_custom_normal <- ecdf_compare_niche(
  p_vals = 2:4,
  n_vals = seq(20L, 200L, 20L),
  n_reps = 50L,
  seed = 42
)
res_custom_normal$cor_plot

ecdf_nonnormal_niche():
rho_vals: alternative dependence structures.
n_vals, n_reps: sample size and replication design.
N_ref: reference population size; larger values approximate the true parameters more closely.

res_custom_nonnorm <- ecdf_nonnormal_niche(
  rho_vals = c(-0.5, 0, 0.5),
  n_vals = c(30L, 100L, 300L),
  n_reps = 20L,
  N_ref = 5e4,
  temp_function = "qnorm",
  temp_parameters = list(mean = 20, sd = 5),
  prec_function = "qweibull",
  prec_parameters = list(shape = 2, scale = 10),
  seed = 123
)
res_custom_nonnorm$suit_plot
#> Warning: Removed 7 rows containing non-finite outside the scale range
#> (`stat_bin()`).
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).
#> Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).
#> Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).

Together, these simulations illustrate when the traditional chi-squared transformation is reliable and when the ECDF-based approach provides a more robust mapping from niche-based distances to habitat suitability, particularly when environmental covariates deviate from multivariate normality.