Introduction to SESraster

Neander Heming, Flávio Mota and Gabriela Alves-Ferreira

2023-06-22

Introduction

Species distributions are dynamic, varying across both time and space. Null models are often used to study the mechanisms underlying species distributions and the structure of biological communities. A null model is a simplified representation of how species distributions would look if specific ecological processes or factors were not operating. The SESraster package fills a gap by offering functions that randomize presence/absence raster data, with or without retaining spatial structure in the randomized patterns of species distribution.

Installation

install.packages("SESraster")
library(SESraster)

If you find a bug, let us know through SESraster Issues. The development version of SESraster can be installed from the SESraster repository on GitHub:

require(devtools)
devtools::install_github("HemingNM/SESraster", build_vignettes = TRUE)
library(SESraster)

Analysis

The SESraster package implements functions that randomize presence/absence species distribution raster data, with or without retaining spatial structure, to calculate the Standardized Effect Sizes (SES) needed for null hypothesis testing.

The function bootspat_naive() does not retain the spatial structure of community richness and species distribution size at the same time. It randomizes a raster stack according to the observed frequency of presences for each species (layer) using one of the following methods: sites (raster cells), species (raster layers) or both (layers and cells). Randomization that does not retain the spatial structure of the data is the most commonly used approach for community data.

The function bootspat_str() retains the spatial structure of community richness and species distribution size at the same time. It randomizes a raster stack while keeping species richness fixed across raster cells. To our knowledge, this method had not previously been implemented in R.
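For reference, an SES is conventionally computed as the observed value of a metric minus the mean of the values obtained from the randomizations, divided by their standard deviation. The snippet below is a minimal base R sketch of that arithmetic; the helper name `ses` and the numbers are made up for illustration and are not part of SESraster.

```r
# Hypothetical helper (not a SESraster function): standardized effect size
# of an observed metric relative to values obtained from randomized data.
ses <- function(obs, null_values) {
  (obs - mean(null_values)) / sd(null_values)
}

# Made-up metric values from five randomizations, for illustration only
null_values <- c(10, 12, 11, 9, 13)

# An observed value well above the null mean gives a positive SES
ses(15, null_values)

# An observed value equal to the null mean gives an SES of zero
ses(11, null_values)
```

In practice the same calculation is applied cell by cell, with the null values coming from repeated randomizations of the raster stack.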

Let’s see some examples.

Random species generation

First, we will create some random species distributions using the package terra.

library(SESraster)
library(terra)
#> terra 1.7.37
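# creating random species distributions:
# each of the 18 layers marks the cells whose elevation falls strictly
# between two randomly drawn elevation limits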
f <- system.file("ex/elev.tif", package="terra")
r <- rast(f)
set.seed(510)
r10 <- rast(lapply(1:18,
                function(i, r, mn, mx){
                  app(r, function(x, t){
                    sapply(x, function(x, t){
                       x<max(t) & x>min(t)
                    }, t=t)
                  }, t=sample(seq(mn, mx), 2))
                }, r=r, mn=minmax(r)[1]+10, mx=minmax(r)[2]-10))

names(r10) <- paste("sp", 1:nlyr(r10))
plot(r10)

With the distributions in hand, we can perform the spatial randomizations.

Spatially Unstructured Randomization

First, let’s randomize species distribution ignoring the spatial structure with the function bootspat_naive.

both, by site and species simultaneously

We can randomize the presences/absences (1s/0s) using the method both. This method combines randomization by site and species at the same time. It will shuffle all presences across cells and layers, changing site richness and species distribution sizes and location at the same time. Notice that NA cells are ignored.

srb <- bootspat_naive(r10, random = "both")
plot(srb, legend=FALSE)

plot(c(sum(r10), sum(srb)), main=c("observed", "randomized"))
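To make the idea concrete without rasters, here is a minimal base R sketch on a toy matrix in which rows stand for sites (cells) and columns for species (layers). It is an illustration of the concept under these assumptions, not the code used by bootspat_naive; the names `pa` and `shuffle_both` are hypothetical.

```r
set.seed(1)
# Toy presence/absence matrix: 10 sites (rows) by 4 species (columns)
pa <- matrix(rbinom(40, 1, 0.4), nrow = 10, ncol = 4)
pa[3, ] <- NA  # one site outside the study area

# Permute all non-NA values across rows and columns at once;
# NA positions are left untouched
shuffle_both <- function(m) {
  ok <- !is.na(m)
  m[ok] <- m[ok][sample.int(sum(ok))]
  m
}
pa_both <- shuffle_both(pa)

# The total number of presences is preserved ...
sum(pa, na.rm = TRUE) == sum(pa_both, na.rm = TRUE)
# ... but neither site richness nor species range sizes need to be
rbind(observed = colSums(pa, na.rm = TRUE),
      randomized = colSums(pa_both, na.rm = TRUE))
```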

by species

We can also randomize by species. This second method is applied to each layer (species) of the stack and randomizes the position of that species' presences in space. It is equivalent to the flatland model of Laffan & Crisp (2003): species richness changes at each cell, while the size of each species distribution is retained (unless randomization is performed by frequency). When running by frequency, presences/absences (1s/0s) are sampled at each pixel based on the probability (frequency) that the species is found within the study area. In that case, the randomized frequency of presences for each species is very similar to the observed frequency, but not exactly the same.

sr1 <- bootspat_naive(r10, random = "species")
plot(sr1, legend=FALSE)

plot(c(sum(r10), sum(sr1)), main=c("observed", "randomized"))

sr1b <- bootspat_naive(r10, random = "species", memory = FALSE)
#> The file does not fit on the memory. Randomization will be done by probability.

Check that the number of occupied pixels in the randomized distributions matches that of the observed distributions when randomizing by species, and is similar to it when randomizing by frequency.

cbind(observed=sapply(r10, function(x)freq(x)[2,3]),
      randomized=sapply(sr1, function(x)freq(x)[2,3]),
      randomized_freq=sapply(sr1b, function(x)freq(x)[2,3]))
#>       observed randomized randomized_freq
#>  [1,]      767        767             773
#>  [2,]     3443       3443            3452
#>  [3,]     1175       1175            1180
#>  [4,]      889        889             872
#>  [5,]      908        908             952
#>  [6,]     2160       2160            2143
#>  [7,]      548        548             544
#>  [8,]      133        133             127
#>  [9,]      122        122             118
#> [10,]     4174       4174            4171
#> [11,]     2565       2565            2502
#> [12,]     3031       3031            3029
#> [13,]       36         36              35
#> [14,]     4387       4387            4398
#> [15,]     3270       3270            3235
#> [16,]     2198       2198            2216
#> [17,]     2427       2427            2392
#> [18,]      235        235             229
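The difference between the two columns of randomized values can be reproduced on a toy matrix. The base R sketch below contrasts a permutation within each species with frequency-based sampling; it illustrates the two ideas described above and is not the package's implementation. The object names are hypothetical.

```r
set.seed(2)
# Toy presence/absence matrix: 100 sites (rows) by 4 species (columns)
pa <- matrix(rbinom(400, 1, 0.3), nrow = 100, ncol = 4)

# Permutation: shuffle each column (species) independently,
# so the number of presences per species is kept exactly
perm <- apply(pa, 2, function(col) col[sample.int(length(col))])

# Frequency-based: draw each cell from a Bernoulli distribution whose
# probability is the observed frequency of the species
fr <- colMeans(pa)
by_freq <- sapply(fr, function(p) rbinom(nrow(pa), 1, p))

# Permuted counts equal the observed ones; frequency-based counts only approximate them
rbind(observed = colSums(pa),
      permuted = colSums(perm),
      by_frequency = colSums(by_freq))
```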

by site

Now, we will randomize by site. This method randomizes the position (presence/absence) of the species within each site (cell) of the stack. It keeps species richness constant at each cell, but the size of each species distribution might change, as more or fewer pixels can be randomly assigned to each species (raster layer). Notice that, although the spatial structure of species richness is held constant, the number of pixels that each species occupies is completely randomized.

sr2 <- bootspat_naive(r10, random = "site")
plot(sr2, legend=FALSE)

plot(c(sum(r10), sum(sr2)), main=c("observed", "randomized"))
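The same behaviour can be seen on a toy matrix. In this minimal base R sketch (an illustration of the idea, not the code used by bootspat_naive; object names are hypothetical), species identities are shuffled within each row, so row sums are kept while column sums are free to change.

```r
set.seed(3)
# Toy presence/absence matrix: 15 sites (rows) by 4 species (columns)
pa <- matrix(rbinom(60, 1, 0.4), nrow = 15, ncol = 4)

# Shuffle values within each row (site); apply() returns the rows as
# columns, so t() restores the sites-by-species layout
by_site <- t(apply(pa, 1, function(row) row[sample.int(length(row))]))

# Richness per site is preserved
all(rowSums(by_site) == rowSums(pa))

# The number of occupied sites per species may differ from the observed one
rbind(observed = colSums(pa), randomized = colSums(by_site))
```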

Spatially Structured Randomization

Notice that randomization by site from bootspat_naive keeps the species richness fixed (i.e. equal to the input raster), but the size of the species’ distribution (i.e. number of pixels of each species) is completely randomized. On the other hand, randomization by species keeps the number of pixels of each species fixed, but the richness is completely randomized.

In the function bootspat_str we implement a spatially structured randomization that keeps both the species richness pattern and the distribution size of each species fixed. This method is based on the second null model of Laffan & Crisp (2003), but samples presences with a probability based on the frequency of presences of each species. Note also that, although the size of each species distribution is retained, the randomized distributions lack spatial structure.

Randomizations are based on frequencies (given or calculated from the input raster, a presence-absence SpatRaster) and, optionally, a probability raster stack. Both the frequencies and the probability raster stack control the probability that a given species is sampled in each raster cell. Frequencies control the probability of each species relative to all others. The probability raster stack controls the probability that each species is sampled in a given raster cell.

# bootstrapping once
fr.prob <- SESraster::fr2prob(r10)
prob <- terra::app(r10,
                   function(x){
                     ifelse(is.na(x), 0, 1)
                   })

randr10 <- bootspat_str(r10, rprob = prob, fr_prob = fr.prob)

The species distributions were spatially randomized according to the frequency of presence of each species. This method randomizes the position of species presences in space while keeping species richness constant and the number of occupied pixels in the randomized distributions very similar to that of the observed distributions.

plot(randr10, legend=FALSE)

Note that the spatial pattern of species richness is unchanged.

plot(c(sum(r10), sum(randr10)), main=c("observed", "randomized"))

Check that the number of occupied pixels in the randomized distributions is similar, though not identical, to that of the observed distributions.

cbind(observed=sapply(r10, function(x)freq(x)[2,3]),
      randomized=sapply(randr10, function(x)freq(x)[2,3]))
#>       observed randomized
#>  [1,]      767        681
#>  [2,]     3443       3532
#>  [3,]     1175       1119
#>  [4,]      889        849
#>  [5,]      908        832
#>  [6,]     2160       2138
#>  [7,]      548        461
#>  [8,]      133        104
#>  [9,]      122         95
#> [10,]     4174       4246
#> [11,]     2565       2602
#> [12,]     3031       3127
#> [13,]       36         32
#> [14,]     4387       4413
#> [15,]     3270       3379
#> [16,]     2198       2193
#> [17,]     2427       2483
#> [18,]      235        182
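The logic behind the spatially structured randomization can be sketched in base R on a toy matrix: for each site, as many species as the observed richness are drawn without replacement, with sampling weights given by species frequencies. This is a simplified illustration under stated assumptions, not the implementation of bootspat_str. It uses raw frequencies as weights, whereas the package derives its weights with fr2prob, and it omits the optional probability raster. All object names are hypothetical.

```r
set.seed(4)
# Toy presence/absence matrix: 200 sites (rows) by 4 species (columns),
# with species ranging from common to rare
pa <- cbind(rbinom(200, 1, 0.7), rbinom(200, 1, 0.4),
            rbinom(200, 1, 0.2), rbinom(200, 1, 0.1))

fr <- colMeans(pa)    # observed frequency of each species (sampling weight)
rich <- rowSums(pa)   # observed richness of each site

# For each site, draw `k` species without replacement, weighted by frequency
str_rand <- t(sapply(rich, function(k) {
  out <- integer(ncol(pa))
  if (k > 0) out[sample.int(ncol(pa), size = k, prob = fr)] <- 1L
  out
}))

# Richness per site is preserved exactly
all(rowSums(str_rand) == rich)

# Range sizes follow the observed ranking but are not reproduced exactly
rbind(observed = colSums(pa), randomized = colSums(str_rand))
```

Because richness is fixed per site and species are drawn by weight, common species tend to stay common and rare species rare, while the counts themselves drift from the observed values, much as in the table above.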

References

Laffan, Shawn W., and Michael D. Crisp. 2003. "Assessing Endemism at Multiple Spatial Scales, with an Example from the Australian Vascular Flora." Journal of Biogeography 30 (4): 511–20.