The link2GI package provides a small linking tool to simplify the usage of GRASS GIS, SAGA GIS, Orfeo Toolbox (OTB) and GDAL binaries for R users. The focus is to make this software accessible to users who are neither operating system specialists nor highly experienced GIS geeks. It is in fact a direct result of numerous graduate courses with R(-GIS) beginners in the hostile world of university computer pools running under extremely restricted Windows systems.
This vignette shows how to use link2GI according to specific system requirements.
R has quite a lot of classes for storing and dealing with spatial data. For vector data the sp and, more recently, the great sf packages are well known, and the raster data world is widely covered by the raster package. Additionally, external spatial data formats are interfaced by wrapper packages such as rgdal or gdalUtils. For more specific needs, such as handling atmospheric modeling data, packages like ncdf4 are very helpful.
The spatial analysis itself is often supported by wrapper packages that integrate external libraries, command line tools or a mixture of both in an R-like syntax, for example rgeos, geosphere, Distance, maptools, igraph or spatstat.
A comprehensive introduction to the spatial R-biotope and its background is given in Geocomputation with R, which is highly recommended as a reference textbook.
Despite all these capabilities for spatial analysis and data handling in the world of R, it can be stated (at least from a non-R point of view) that there is still an enormous gap between R and the mature open source Geographic Information System (GIS) and, even more so, Remote Sensing (RS) software community. QGIS, GRASS GIS and SAGA GIS provide a comprehensive, growing and mature collection of highly sophisticated algorithms. These algorithms are fast, stable and most of them are well proven. Most R users who are somehow related to the GI community probably know that there are very good wrapper packages for bridging this gap. For GRASS GIS 7 it is rgrass7 and for SAGA GIS the RSAGA package. The RQGIS wrapper is the most recent development and provides simple access to the powerful QGIS command line interface.
In addition, there is no wrapper for the great OTB. It therefore seems at least convenient to provide a lightweight wrapping utility for using OTB modules from R.
Unfortunately one will run into a lot of technical problems depending on the chosen operating system (OS), library dependencies or GIS software versions. In the case of RSAGA, for example, the main problem has been that the SAGA GIS developers not only change the syntax and strategy of the command line interface (CLI), but the calls also differ from OS to OS within the same release. So the maintenance of RSAGA is at least laborious (but, thumbs up, it is running again). Another example is GRASS GIS, which is well known for its sophisticated setup of the environment and the spatial properties of the database. If you "just" want to use a specific GRASS algorithm from R, you will probably get lost in setting up all the OS dependencies that are necessary for a correct temporary or permanent GRASS environment from "outside". This is caused not only by the strict spatial and projection requirements of GRASS but much more by challenging OS environments, especially Windows.
In short, it is a bit cumbersome to deal with all of this if one just wants to start GRASS from the R command line, for example for a powerful random walk cost analysis (r.walk) as provided by GRASS.
Linking simply means providing all necessary environment settings that satisfy the existing wrapper packages and, in addition, give full access to the command line (CLI) APIs of the mentioned software tools. link2GI tries to analyze which software is installed in order to set up a temporary environment that meets these needs.
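What "providing environment settings" amounts to can be sketched in base R. The helper name `with_gis_env` and the folder used below are hypothetical; the sketch only illustrates a temporary, session-scoped change of `PATH` and is not how link2GI itself is implemented.

```r
# Minimal sketch (assumption: NOT link2GI's actual implementation).
# Temporarily prepend a binary folder to PATH, evaluate some code, restore PATH.
with_gis_env <- function(bin_dir, code) {
  old_path <- Sys.getenv("PATH")
  on.exit(Sys.setenv(PATH = old_path), add = TRUE)  # always restore on exit
  Sys.setenv(PATH = paste(bin_dir, old_path, sep = .Platform$path.sep))
  force(code)  # 'code' is evaluated lazily, i.e. with the modified PATH
}

# hypothetical binary folder, used for illustration only
bin_dir <- file.path(tempdir(), "fake_gis", "bin")
path_inside <- with_gis_env(bin_dir, Sys.getenv("PATH"))
```

Inside the call the folder is at the front of `PATH`; afterwards the original value is back, which is the sense in which such a link is "temporary".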
GRASS GIS has the most challenging requirements. It needs a bunch of environment and path variables as well as a correct setup of the geographical data parameters. The linkGRASS7 function tries to find all installations, lets you (optionally) choose the one you want to use and generates the necessary variables. As a result you can use both the rgrass7 package and the command line API of GRASS.
SAGA GIS is far easier to set up. Again the linkSAGA function tries to find all SAGA installations, lets you (optionally) choose one and generates the necessary variables. You may also use RSAGA, but you have to hand over the result of linkSAGA, as in RSAGA::rsaga.env(path = saga$sagaPath). For straightforward usage you may simply use the R system() call to interface R with the saga_cmd API.
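A direct call of the SAGA command line tool could look roughly like the following sketch. It assumes that `saga_cmd` is reachable via the `PATH`; with link2GI you would instead build the path from the folder returned by linkSAGA. The call is guarded so that nothing happens if no binary is found.

```r
# Sketch only: call saga_cmd through system2() if a binary can be located.
# Assumption: saga_cmd is on the PATH of the current session.
saga_bin <- unname(Sys.which("saga_cmd"))

if (nzchar(saga_bin)) {
  # ask the SAGA command line tool for its version
  saga_version <- system2(saga_bin, args = "--version", stdout = TRUE)
  print(saga_version)
} else {
  message("saga_cmd not found on PATH; skipping the call")
}
```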
The Orfeo Toolbox (OTB) is a very powerful remote sensing toolbox. It is widely used for classification, filtering and machine learning applications. You will find some of the implemented algorithms in different R packages, but they are always much slower or only run on small data chunks. link2GI searches for and connects all OTB installations in a given search path and provides the result as a clear list. Because there is no wrapper package, a list-based OTB module and function parser is also available; its result can be passed to the function runOTB for a convenient function call.
Although GDAL is perfectly integrated in R, in some cases it is beneficial to use system calls and address the binaries directly. In particular the evolution to GDAL 3.x and the various boxed versions of GDAL binaries working together with different Python and proj4/proj6 libraries sometimes make it difficult to get hold of the correct version of GDAL. link2GI generates a list of all paths and commands of all GDAL installations in the provided search path. With this list you can easily use all available API calls of each installation.
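Why several GDAL installations need to be told apart can be illustrated with a few lines of base R. The candidate folders below are assumptions chosen for illustration; link2GI would supply the folders found by its own search.

```r
# Sketch: look for a gdalinfo binary in several candidate folders and ask each
# one for its version. The folder list is an assumption for illustration.
candidate_dirs <- unique(c(dirname(Sys.which("gdalinfo")),
                           "/usr/bin", "/usr/local/bin"))
exe <- if (.Platform$OS.type == "windows") "gdalinfo.exe" else "gdalinfo"

gdal_bins <- file.path(candidate_dirs, exe)
gdal_bins <- gdal_bins[file.exists(gdal_bins)]  # keep only existing binaries

for (bin in gdal_bins) {
  # every installation reports its own version string
  cat(bin, ":", system2(bin, "--version", stdout = TRUE), "\n")
}
```

If no binary is found, the loop simply does nothing.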
The automatic search for installed GIS software binaries is performed by the find functions. Depending on your OS and the number of installed versions you will get a data frame providing the binary and module folders.
# find all SAGA GIS installations at the default search location
require(link2GI)
saga <- link2GI::findSAGA()
saga
The same works for GRASS and OTB:
# find all GRASS GIS and OTB installations at the default search locations
require(link2GI)
grass <- link2GI::findGRASS()
grass
otb <- link2GI::findOTB()
otb
The find functions provide an overview of the installed software. These functions do not establish any linkages or change any settings.
If you just call link2GI on the fly, that is for a single temporary operation, there is no need to set up folders and project structures. If you work on a more complex project, it is helpful to support this with a fixed structure. The same applies to existing GRASS projects, which need to be in specific mapsets and locations.
A straightforward (you may also call it dirty) approach is the initProj function, which creates folder structures (if they do not exist) and, if wanted, establishes global variables containing the paths as strings.
# create a project folder structure and export the paths as global variables
require(link2GI)
link2GI::initProj(projRootDir = tempdir(),
projFolders = c("data/",
"data/level0/",
"data/level1/",
"output/",
"run/",
"fun/"),
path_prefix = "path_to_" ,
global =TRUE)
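The idea behind such an initializer can be sketched in a few lines of base R. The helper `init_folders` is hypothetical and is not the implementation of initProj; it only illustrates creating folders and optionally exporting their paths as global variables.

```r
# Sketch of a project initializer (assumption: NOT link2GI::initProj).
init_folders <- function(root, folders, prefix = "path_", global = FALSE) {
  folders <- sub("/$", "", folders)            # drop trailing slashes
  paths <- file.path(root, folders)
  for (p in paths) dir.create(p, recursive = TRUE, showWarnings = FALSE)
  # variable names such as "path_to_data_level0"
  names(paths) <- paste0(prefix, gsub("/", "_", folders))
  if (global) {
    # export every path as a character variable in the global environment
    for (n in names(paths)) assign(n, paths[[n]], envir = globalenv())
  }
  invisible(paths)
}

paths <- init_folders(tempdir(), c("data/level0/", "output/"),
                      prefix = "path_to_")
```

Returning the named vector keeps the paths usable even when global variables are not wanted.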
In earlier times it was pretty cumbersome to link the correct SAGA GIS version. Since version 1.x.x of RSAGA things have become much better. The new RSAGA::rsaga.env() function picks up the first SAGA version it finds in the search path. For using RSAGA with link2GI it is strongly recommended to call RSAGA::rsaga.env() with the preferred path as provided by a findSAGA() call. It is also possible to provide the version number as shown below. Storing the result in suitable variables then even makes it possible to switch easily between different SAGA GIS installations.
saga1<-link2GI::linkSAGA(ver_select = 1)
saga1
sagaEnv1<- RSAGA::rsaga.env(path = saga1$sagaPath)
linkGRASS7 initializes the session environment and the system paths for easy access to GRASS GIS 7.x.
The correct setup of the spatial and projection parameters is performed automatically, either by using an existing and valid raster, sp or sf object, or manually by providing a list containing the minimum parameters needed. These properties are used to initialize either a temporary or a permanent rgrass7 environment including the correct GRASS 7 database structure. If you provide none of the aforementioned objects, linkGRASS7 will create an EPSG:4326 world-wide location.
The most time consuming part on Windows systems is the search process. This can easily take 10 minutes or more. To speed up this process you can also provide a correct parameter set. The best way to do so is to call findGRASS manually and then call linkGRASS7 with the returned version arguments of your choice.
The function linkGRASS7 tries to find all valid GRASS GIS binaries by analyzing the startup script files of GRASS GIS. After identifying the GRASS GIS binaries, all necessary system variables and settings are generated and passed to a temporary R environment.
If you have more than one valid installation and run linkGRASS7 with the argument ver_select = TRUE, you will be asked to select one.
The most common way to use GRASS is for just one call or algorithm, so the user is not interested in the cumbersome setup of all parameters. linkGRASS7(georeferenced-dataset) automatically searches for and finds all GRASS binaries, using the georeferenced-dataset object for spatial referencing and the other necessary settings.
NOTE: This is the highly recommended linking procedure for all on the fly calls of GRASS. Please note also: if more than one GRASS installation is found, the one with the highest version number is selected automatically.
Have a look at the following examples, which show a typical call for the well known sp and sf vector data objects.
Starting with sp.
# get meuse data as sp object and link it temporarily to GRASS
require(link2GI)
require(sp)
# get data
data(meuse)
# add georeference
coordinates(meuse) <- ~x+y
proj4string(meuse) <-CRS("+init=epsg:28992")
# Automatic search and find of GRASS binaries
# using the meuse sp data object for spatial referencing
# This is the highly recommended linking procedure for on the fly jobs
# NOTE: if more than one GRASS installation is found the highest version will be chosen
linkGRASS7(meuse)
Now do the same with sf based data.
require(link2GI)
require(sf)
# get data
nc <- st_read(system.file("shape/nc.shp", package="sf"))
# Automatic search and find of GRASS binaries
# using the nc sf data object for spatial referencing
# This is the highly recommended linking procedure for on the fly jobs
# NOTE: if more than one GRASS installation is found the highest version will be chosen
grass<-linkGRASS7(nc,returnPaths = TRUE)
The second most common situation is the usage of an existing GRASS location and project, either with existing data sets or with manually provided parameters.
library(link2GI)
require(sf)
# proj folders
projRootDir<-tempdir()
paths<-link2GI::initProj(projRootDir = projRootDir,
projFolders = c("project1/"))
# get data
nc <- st_read(system.file("shape/nc.shp", package="sf"))
# CREATE and link to a permanent GRASS folder at "projRootDir", location named "project1"
linkGRASS7(nc, gisdbase = projRootDir, location = "project1")
# ONLY LINK to a permanent GRASS folder at "projRootDir", location named "project1"
linkGRASS7(gisdbase = projRootDir, location = "project1", gisdbase_exist = TRUE )
# setting up GRASS manually with spatial parameters of the meuse data
proj4_string <- sp::CRS("+init=epsg:28992")@projargs
linkGRASS7(spatial_params = c(178605,329714,181390,333611,proj4_string))
# creating a GRASS gisdbase manually with spatial parameters of the nc data
# additionally using a permanent directory "projRootDir" and the location "nc_spatial_params"
proj4_string <- sp::CRS("+init=epsg:4267")@projargs
linkGRASS7(gisdbase = projRootDir,
location = "nc_spatial_params",
spatial_params = c(-84.32385, 33.88199,-75.45698,36.58965,proj4_string))
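Judging from the calls above, the manual parameter vector consists of xmin, ymin, xmax, ymax and a proj4 string. The following sketch assembles such a vector from plain coordinates; `make_spatial_params` is a hypothetical helper and not part of link2GI, and the element order is inferred from the examples.

```r
# Sketch: build the five-element vector (xmin, ymin, xmax, ymax, proj4 string).
# make_spatial_params() is a hypothetical helper, not part of link2GI.
make_spatial_params <- function(x, y, proj4) {
  # c() coerces the numeric extent to character because proj4 is a string
  c(min(x), min(y), max(x), max(y), proj4)
}

# corner coordinates of the meuse extent used in the manual call above
x <- c(178605, 181390)
y <- c(329714, 333611)
params <- make_spatial_params(x, y, "+init=epsg:28992")
params
```

The result is a character vector, which matches the mixed numeric and string input used in the manual calls.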
The full disk search can be cumbersome. Especially on Windows it can easily take 10 minutes or more, so it is helpful to provide a search path that narrows down the search. To search for GRASS installations in the home directory you may use the following command.
# Link the GRASS installation and define the search location
linkGRASS7(nc, search_path = "~")
If you have already run a full search and know your installation, for example from the findGRASS command, you can use the result directly for linking.
findGRASS()
#      instDir version installation_type
# 1 /opt/grass   7.8.1           grass78
# now linking it
linkGRASS7(nc,c("/opt/grass","7.8.1","grass78"))
# corresponding linkage on Windows
linkGRASS7(nc,c("C:/Program Files/GRASS GIS7.0.5","GRASS GIS 7.0.5","NSIS"))
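When several installations are listed, picking one programmatically avoids retyping the values. The sketch below uses a hand-written data frame that mirrors the printed columns; its second row is purely illustrative and the real result depends on your system.

```r
# Sketch: choose the newest entry from a findGRASS()-style data frame and turn
# the row into a character vector for the linking call.
# The data frame is a stand-in; the second row is invented for illustration.
installs <- data.frame(
  instDir = c("/opt/grass", "/usr/lib/grass76"),
  version = c("7.8.1", "7.6.0"),
  installation_type = c("grass78", "grass76"),
  stringsAsFactors = FALSE
)

# compare versions as versions, not as plain strings
v <- numeric_version(installs$version)
newest <- which(v == max(v))[1]

grass_args <- unlist(installs[newest, ], use.names = FALSE)
grass_args
# grass_args could then be handed to the linking call, e.g. linkGRASS7(nc, grass_args)
```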
Finally some more specific examples related to interactive selection or OS specific settings.
Choose the GRASS installation manually, additionally using the nc sf object for spatial referencing:
linkGRASS7(nc, ver_select = TRUE)
Creating and linking a permanent GRASS gisdbase (folder structure) at "~/temp3" with the standard mapset "PERMANENT" and the location named "project1". For all spatial attributes the nc sf object is used.
linkGRASS7(x = nc,
gisdbase = "~/temp3",
location = "project1")
Link to the permanent GRASS gisdbase (folder structure) at "~/temp3" with the standard mapset "PERMANENT" and the location named "project1". All spatial attributes are taken from the previously referenced nc sf object.
linkGRASS7(gisdbase = "~/temp3", location = "project1",
gisdbase_exist = TRUE)
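Whether to create or merely link can be decided by checking the folder structure on disk. The helper below is hypothetical; it relies only on the GRASS convention that every location contains a PERMANENT mapset folder.

```r
# Sketch: check whether a GRASS location already exists on disk.
# grass_location_exists() is a hypothetical helper, not part of link2GI.
grass_location_exists <- function(gisdbase, location) {
  dir.exists(file.path(path.expand(gisdbase), location, "PERMANENT"))
}

gisdbase <- file.path(tempdir(), "gisdbase_demo")  # illustrative folder
already_there <- grass_location_exists(gisdbase, "project1")
# already_there could then be passed on as gisdbase_exist = already_there
```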
Setting up GRASS manually with the spatial parameters of the meuse data:
# build the proj4 string piece by piece to avoid line breaks inside the string
proj4_string <- paste("+proj=sterea +lat_0=52.15616055555555",
"+lon_0=5.38763888888889 +k=0.9999079",
"+x_0=155000 +y_0=463000 +no_defs",
"+a=6377397.155 +rf=299.1528128",
"+towgs84=565.4171,50.3319,465.5524,-0.398957,0.343988,-1.8774,4.0725",
"+to_meter=1")
linkGRASS7(spatial_params = c(178605,329714,181390,333611, proj4_string))
link2GI supports the use of the Orfeo Toolbox with a simple list-based wrapper function. Two functions parse the syntax dumps of the modules and functions and generate a command list that is easy to modify with the necessary arguments.
Usually you have to get the module list first:
# link to the installed OTB
otblink<-link2GI::linkOTB()
# get the list of modules from the linked version
algo<-parseOTBAlgorithms(gili = otblink)
Based on the modules of the current version of OTB you can then choose the module(s) you want to use.
## for the example we use the edge detection,
algoKeyword<- "EdgeExtraction"
## extract the command list for the chosen algorithm
cmd<-parseOTBFunction(algo = algoKeyword, gili = otblink)
## print the current command
print(cmd)
Admittedly this is a very straightforward and preliminary approach. Nevertheless it provides you with a valid list of all OTB API calls that can easily be adapted to your needs. The following working example will give you an idea of how to use it.
require(link2GI)
require(raster)
require(listviewer)
otblink<-link2GI::linkOTB()
projRootDir<-tempdir()
data("rgb")
raster::plotRGB(rgb)
r<-raster::writeRaster(rgb,
filename=file.path(projRootDir,"test.tif"),
format="GTiff", overwrite=TRUE)
## for the example we use the edge detection,
algoKeyword<- "EdgeExtraction"
## extract the command list for the chosen algorithm
cmd<-parseOTBFunction(algo = algoKeyword, gili = otblink)
## get help using the convenient listviewer
listviewer::jsonedit(cmd$help)
## define the mandatory arguments, all others will be default
cmd$input <- file.path(projRootDir,"test.tif")
cmd$filter <- "touzi"
cmd$channel <- 2
cmd$out <- file.path(projRootDir,paste0("out",cmd$filter,".tif"))
## run algorithm
retStack<-runOTB(cmd,gili = otblink)
## plot filter raster on the green channel
plot(retStack)
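To see why a plain list is a convenient representation, the following sketch turns a named parameter list into a command line string. `build_otb_call` is hypothetical and not how runOTB works internally; the file names are placeholders, and mapping `input` to the OTB flag `-in` is an assumption made because `in` is a reserved word in R.

```r
# Sketch: convert a named parameter list into an OTB-style command line.
# build_otb_call() is hypothetical; it does not reproduce runOTB().
build_otb_call <- function(algo, params) {
  flags <- names(params)
  flags[flags == "input"] <- "in"   # assumed mapping, see lead-in
  values <- vapply(params, as.character, character(1))
  args <- paste0("-", flags, " ", values)
  paste(c(paste0("otbcli_", algo), args), collapse = " ")
}

params <- list(input = "test.tif", filter = "touzi",
               channel = 2, out = "outtouzi.tif")
call_string <- build_otb_call("EdgeExtraction", params)
call_string
```

Changing a single list element changes exactly one flag of the resulting call, which is what makes the list easy to manipulate.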
A typical example is the usage of an already existing project database in GRASS. GRASS organizes all data in an internal file structure: a gisdbase folder containing one or more locations, each of which holds one or more mapsets. All raster and vector data are stored inside this structure and the organisation is performed by GRASS. So a typical task could be to work on data sets that are already stored in an existing GRASS structure.
First of all we need some real world data, in this case the gridded 2011 micro census (Zensus 2011) population data of Germany. It has some nice aspects:
We also have to download a metadata description file (Excel sheet) for information about the projection, the data concepts and so on.
# we need some additional packages
require(link2GI)
require(curl)
# first of all we create a project folder structure
link2GI::initProj(projRootDir = paste0(tempdir(),"/link2GI_examples"),
projFolders = c("run/"),
path_prefix = "path_",
global = TRUE)
# set runtime directory
setwd(path_run)
# get some typical authority generated data
# (the URL is assembled with paste0 to avoid line breaks inside the string)
url <- paste0("https://www.zensus2011.de/SharedDocs/Downloads/DE/Pressemitteilung/",
"DemografischeGrunddaten/csv_Bevoelkerung_100m_Gitter.zip;",
"jsessionid=294313DDBB57914D6636DE373897A3F2.2_cid389?__blob=publicationFile&v=3")
res <- curl::curl_download(url, file.path(path_run,"testdata.zip"))
# unzip it
unzip(res,files = grep(".csv", unzip(res,list = TRUE)$Name,value = TRUE),
junkpaths = TRUE, overwrite = TRUE)
fn <- list.files(pattern = "[.]csv$", path = getwd(), full.names = TRUE)
After downloading the data we will use it for some demonstrations. If you have a look at it, the data is nothing more than x, y, z values with some assumed projection information.
# get the filename
# fast read with data.table
xyz <- data.table::fread(paste0(path_run,"/Zensus_Bevoelkerung_100m-Gitter.csv"))
head(xyz)
We can easily rasterize this data as it is intentionally gridded data. That means we have one value for each grid cell of 100 by 100 meters.
require(RColorBrewer)
require(raster)
require(mapview)
# clean dataframe
xyz <- xyz[,-1]
# rasterize it according to the projection
r <- raster::rasterFromXYZ(xyz,crs = sp::CRS("+init=epsg:3035"))
# map it
p <- colorRampPalette(brewer.pal(8, "Reds"))
# set resolution to 1 sqkm
mapview::mapviewOptions(mapview.maxpixels = r@ncols*r@nrows/10)
mapview::mapview(r, col.regions = p,
at = c(-1,10,25,50,100,500,1000,2500),
legend = TRUE)
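Before rasterizing a table of points it can be worth confirming that the coordinates really sit on a regular grid. The sketch below does this in base R on a tiny synthetic table that only mimics the x, y, value layout; it is not the census data.

```r
# Sketch: verify regular grid spacing of x/y coordinates before rasterizing.
# The data frame is synthetic and only mimics the (x, y, value) layout.
xyz_demo <- expand.grid(x = seq(0, 300, by = 100), y = seq(0, 200, by = 100))
xyz_demo$z <- seq_len(nrow(xyz_demo))

# distinct distances between neighbouring coordinate values
grid_step <- function(v) unique(diff(sort(unique(v))))

step_x <- grid_step(xyz_demo$x)
step_y <- grid_step(xyz_demo$y)
```

For a complete grid both results are a single number, the cell size. With sparse data several values may appear, and the smallest one is then the candidate cell size.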
So far nothing new. Now we create a new but permanent GRASS gisdbase using the spatial parameters from the raster object. As you know, the linkGRASS7 function performs a full search for one or more existing GRASS installations. If a valid GRASS installation exists, all parameters are set up and the package rgrass7 is linked.
Because gisdbase_exist is set to FALSE by default, a new structure is created according to the R object.
require(link2GI)
# initialize GRASS and set up a permanent structure
link2GI::linkGRASS7(x = r,
gisdbase = paste0(tempdir(),"/link2GI_examples"),
location = "microzensus2011")
Finally we can import the data into the GRASS gisdbase using the rgrass7 package functionality.
First we must convert the raster object to a GeoTIFF file. Any GDAL format is possible but GeoTIFF is very common and stable.
require(link2GI)
require(raster)
require(rgrass7)
# write it to geotiff
raster::writeRaster(r, paste0(path_run,"/Zensus_Bevoelkerung_100m-Gitter.tif"),
overwrite = TRUE)
# import raster to GRASS
rgrass7::execGRASS('r.external',
flags=c('o',"overwrite","quiet"),
input=paste0(path_run,"/Zensus_Bevoelkerung_100m-Gitter.tif"),
output="Zensus_Bevoelkerung_100m_Gitter",
band=1)
# check imported data set
rgrass7::execGRASS('r.info',
map = "Zensus_Bevoelkerung_100m_Gitter")
Now let us do the same import as a vector data set. First we create an sf object. Please note that this will take quite a while.
require(sf)
xyz_sf = st_as_sf(xyz,
coords = c("x_mp_100m", "y_mp_100m"),
crs = 3035,
agr = "constant")
#map points
sf::plot_sf(xyz_sf)
The GRASS gisdbase already exists. So we pass the argument gisdbase_exist=TRUE, as used by linkGRASS7, and import the xyz data as generic GRASS vector points.
require(sf)
require(sp)
require(link2GI)
sf2gvec(x = xyz_sf,
obj_name = "Zensus_Bevoelkerung_100m_",
gisdbase = paste0(tempdir(),"/link2GI_examples"),
location = "microzensus2011",
gisdbase_exist = TRUE)
# check imported data set
rgrass7::execGRASS('v.info', map = "Zensus_Bevoelkerung_100m_")
During the GEOSTAT 2018 in Prague some more complex use cases were presented:
SAGA and OTB calls - SAGA & OTB basic usecase
GRASS based cost analysis on a huge cost raster - Beetle spread over high asia