library(pfocal)
Moving window calculations are common in many fields, from image processing to raster data manipulation. The process of taking the values in the neighborhood of each cell in a grid, passing them through a function, and returning a reduced value is “embarrassingly parallel”. This package implements that process in parallel C code, providing a much faster option than other similarly aimed packages and functions.
Let’s start with a simple 1000 by 1000 square matrix.
size <- 1000
data <- matrix(nrow = size, ncol = size,
               data = runif(n = size*size, min = 0, max = 10))
image(data, asp = 1)
The package contains two types of functions: kernel generation functions and kernel application functions. The kernel generation functions make it easy to create distance, exponential, circular, and Gaussian kernels, each returned as a matrix. Here we generate a Chebyshev distance kernel and show its shape.
kernel <- chebyshev_distance_kernel(10)
image(kernel, asp = 1)
The second type of function, kernel application functions, can apply a kernel to a matrix, raster object, or terra object (although the most efficient approach is to pass a matrix to the function). The main application function is pfocal.
convoluted <- pfocal(data = data, kernel = kernel, edge_value = 0)
image(convoluted, asp = 1)
The pfocal function is not as flexible as other focal calculation functions that can accommodate any functional treatment of the values covered by the kernel (for example, raster::focal can take any R function). However, it is optimized for a few common combinations of “transform” and “reduce” functions. The transform function dictates how the data values d covered by the kernel k are treated with regard to the focal cell. To get the list of possible transform functions:
pfocal_info_transform()
#> [,1] [,2]
#> [1,] "MULTIPLY" "0"
#> [2,] "ADD" "1"
#> [3,] "R_EXP" "2"
#> [4,] "L_EXP" "3"
#> [,3]
#> [1,] "For data value 'd' and kernal value 'k', intermediate_value = (d * k)"
#> [2,] "For data value 'd' and kernal value 'k', intermediate_value = (d + k)"
#> [3,] "For data value 'd' and kernal value 'k', intermediate_value = (d ^ k)"
#> [4,] "For data value 'd' and kernal value 'k', intermediate_value = (k ^ d)"
The reduce function, on the other hand, dictates how the transformed values are reduced, the most common options being sum, product, minimum, and maximum. See the list of reduce functions:
pfocal_info_reduce()
#> [,1] [,2]
#> [1,] "SUM" "0"
#> [2,] "ABS_SUM" "1"
#> [3,] "PRODUCT" "2"
#> [4,] "ABS_PRODUCT" "3"
#> [5,] "MIN" "4"
#> [6,] "MAX" "5"
#> [,3]
#> [1,] "Accumulator starts at 0. For each intermediate value, in no particular order, acc = ( acc + iv )"
#> [2,] "Accumulator starts at 0. For each intermediate value, in no particular order, acc = ( acc + abs(iv) )"
#> [3,] "Accumulator starts at 1. For each intermediate value, in no particular order, acc = ( acc * iv )"
#> [4,] "Accumulator starts at 1. For each intermediate value, in no particular order, acc = ( acc * abs(iv) )"
#> [5,] "Accumulator starts at the highest possible value. For each intermediate value, in no particular order, acc = min( acc , iv )"
#> [6,] "Accumulator starts at the lowest possible value. For each intermediate value, in no particular order, acc = max( acc , iv )"
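The accumulator descriptions above can be mirrored in plain R. This sketch (base R only, with hypothetical intermediate values) folds values under the “SUM” rule:

```r
# Sketch of the "SUM" reduction: the accumulator starts at 0 and absorbs
# each intermediate value, in no particular order.
iv <- c(2, 5, 1)  # hypothetical intermediate values
acc <- 0
for (v in iv) acc <- acc + v
acc      # identical to sum(iv)
min(iv)  # the "MIN" reduction would instead keep a running minimum
```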
Finally, there is the option of dividing the final result, as a way to scale it. The default is no division. The different options can be seen with:
pfocal_info_mean_divisor()
#> [,1] [,2]
#> [1,] "ONE" "0"
#> [2,] "KERNEL_SIZE" "1"
#> [3,] "KERNEL_COUNT" "2"
#> [4,] "KERNEL_SUM" "3"
#> [5,] "KERNEL_ABS_SUM" "4"
#> [6,] "KERNEL_PROD" "5"
#> [7,] "KERNEL_ABS_PROD" "6"
#> [8,] "DYNAMIC_COUNT" "7"
#> [9,] "DYNAMIC_SUM" "8"
#> [10,] "DYNAMIC_ABS_SUM" "9"
#> [11,] "DYNAMIC_PROD" "10"
#> [12,] "DYNAMIC_ABS_PROD" "11"
#> [13,] "DYNAMIC_DATA_SUM" "12"
#> [14,] "DYNAMIC_DATA_ABS_SUM" "13"
#> [15,] "DYNAMIC_DATA_PROD" "14"
#> [16,] "DYNAMIC_DATA_ABS_PROD" "15"
#> [,3]
#> [1,] "Does not divide the final value by anything"
#> [2,] "Divide the final value at each point by nrow(k)*ncol(k)"
#> [3,] "Divide the final value at each point by sum(+!is.na(k))"
#> [4,] "Divide the final value at each point by sum(k[!is.na(k)])"
#> [5,] "Divide the final value at each point by sum(abs(k[!is.na(k)]))"
#> [6,] "Divide the final value at each point by prod(k[!is.na(k)])"
#> [7,] "Divide the final value at each point by prod(abs(k[!is.na(k)]))"
#> [8,] "Divide the final value at each point by sum(!is.na( intermediate_data )), recalculated at every point"
#> [9,] "Divide the final value at each point by sum(intermediate_data[!is.na( intermediate_data )]), recalculated at every point"
#> [10,] "Divide the final value at each point by sum(abs(intermediate_data[!is.na( intermediate_data )])), recalculated at every point"
#> [11,] "Divide the final value at each point by prod(intermediate_data[!is.na( intermediate_data )]), recalculated at every point"
#> [12,] "Divide the final value at each point by prod(abs(intermediate_data[!is.na( intermediate_data )])), recalculated at every point"
#> [13,] "Divide the final value at each point by sum(local_data[!is.na( intermediate_data )]), recalculated at every point"
#> [14,] "Divide the final value at each point by sum(abs(local_data[!is.na( intermediate_data )])), recalculated at every point"
#> [15,] "Divide the final value at each point by prod(local_data[!is.na( intermediate_data )]), recalculated at every point"
#> [16,] "Divide the final value at each point by prod(abs(local_data[!is.na( intermediate_data )])), recalculated at every point"
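For example (a base R sketch, not package code): combining the “MULTIPLY” transform, the “SUM” reduce, and the “KERNEL_COUNT” divisor with an all-ones kernel reproduces the neighbourhood mean, because the summed products are divided by the number of non-NA kernel cells.

```r
# Sketch: sum(d * k) / sum(!is.na(k)) for an all-ones 3 x 3 kernel
# equals the mean of the neighbourhood values.
d <- matrix(1:9, nrow = 3) / 10  # hypothetical 3 x 3 neighbourhood
k <- matrix(1, nrow = 3, ncol = 3)
sum(d * k) / sum(!is.na(k))  # same as mean(d)
```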
Other kernel application functions are available in the package; they bear the pfocal_fast prefix and wrap pfocal as shortcuts with preset default arguments.
fast_convoluted <- pfocal_fast_abs_rectangle(data = data, height = 10, width = 10)
image(fast_convoluted, asp = 1)
To compare with raster::focal, we will make use of the raster and microbenchmark packages:
library(raster)
#> Loading required package: sp
library(microbenchmark)
#> Warning: package 'microbenchmark' was built under R version 4.1.2
We will run each function 100 times on the same data grid as above. We show here how the mean() function can be replicated with pfocal.
data_raster <- raster::raster(data)

mbm <- microbenchmark(
  "pfocal" = {
    pfocal_results <- pfocal::pfocal(data = data_raster, kernel = kernel,
                                     transform_function = "MULTIPLY",
                                     reduce_function = "SUM",
                                     mean_divider = "KERNEL_COUNT")
    gc()
  },
  "raster_focal" = {
    raster_focal_results <- raster::focal(x = data_raster, w = kernel,
                                          fun = mean, pad = TRUE, padValue = 0)
    gc()
  },
  times = 100,
  unit = "s")
mbm
#> Unit: seconds
#> expr min lq mean median uq max neval
#> pfocal 0.2178424 0.2228699 0.2279975 0.2253432 0.2300356 0.2677352 100
#> raster_focal 0.7477754 0.7576559 0.7686383 0.7624928 0.7725864 0.9350483 100
plot(mbm)