The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
GeoPackage is an open, standards-based, platform-independent, portable, self-describing, compact format for transferring geospatial information. The GeoPackage Encoding Standard describes a set of conventions for storing the following within an SQLite database:
vector features
tile matrix sets of imagery and raster maps at various scales
attributes (non-spatial data)
extensions
gpkg_write()
can handle a variety of different input
types. Here we start by adding two DEM (GeoTIFF) files.
library(gpkg)
library(terra)
#> terra 1.7.73
dem <- system.file("extdata", "dem.tif", package = "gpkg")
stopifnot(nchar(dem) > 0)
gpkg_tmp <- tempfile(fileext = ".gpkg")
if (file.exists(gpkg_tmp))
file.remove(gpkg_tmp)
# write a gpkg with two DEMs in it
gpkg_write(
dem,
destfile = gpkg_tmp,
RASTER_TABLE = "DEM1",
FIELD_NAME = "Elevation"
)
gpkg_write(
dem,
destfile = gpkg_tmp,
append = TRUE,
RASTER_TABLE = "DEM2",
FIELD_NAME = "Elevation",
NoData = -9999
)
We can also write vector data to GeoPackage. Here we use
gpkg_write()
to add a bounding box polygon layer derived
from extent of "DEM1"
.
Similarly, data.frame
-like objects (non-spatial
“attributes”) can be written to GeoPackage.
geopackage()
is a constructor that can create a simple
container for working with geopackages from several types of inputs.
Often you will have a character file path to a GeoPackage
(.gpkg) file.
g <- geopackage(gpkg_tmp, connect = TRUE)
g
#> <geopackage>
#> --------------------------------------------------------------------------------
#> # of Tables: 20
#>
#> DEM1, DEM2, bbox, gpkg_2d_gridded_coverage_ancillary,
#> gpkg_2d_gridded_tile_ancillary, gpkg_contents, gpkg_extensions,
#> gpkg_geometry_columns, gpkg_metadata, gpkg_metadata_reference,
#> gpkg_ogr_contents, gpkg_spatial_ref_sys, gpkg_tile_matrix,
#> gpkg_tile_matrix_set, myattr, rtree_bbox_geom, rtree_bbox_geom_node,
#> rtree_bbox_geom_parent, rtree_bbox_geom_rowid, sqlite_sequence
#> --------------------------------------------------------------------------------
#> <SQLiteConnection>
#> Path: /tmp/RtmpnDYffb/file2cdba26e44276.gpkg
#> Extensions: TRUE
class(g)
#> [1] "geopackage"
Other times you may have a list of tables and layers you want to be in a GeoPackage that does not exist yet.
g2 <- geopackage(list(dem = r, bbox = v))
g2
#> <geopackage>
#> --------------------------------------------------------------------------------
#> # of Tables: 18
#>
#> bbox, dem, gpkg_2d_gridded_coverage_ancillary,
#> gpkg_2d_gridded_tile_ancillary, gpkg_contents, gpkg_extensions,
#> gpkg_geometry_columns, gpkg_metadata, gpkg_metadata_reference,
#> gpkg_ogr_contents, gpkg_spatial_ref_sys, gpkg_tile_matrix,
#> gpkg_tile_matrix_set, rtree_bbox_geom, rtree_bbox_geom_node,
#> rtree_bbox_geom_parent, rtree_bbox_geom_rowid, sqlite_sequence
#> --------------------------------------------------------------------------------
#> <SQLiteConnection>
#> Path: /tmp/RtmpnDYffb/Rgpkg2cdba42f77d6e.gpkg
#> Extensions: TRUE
class(g2)
#> [1] "geopackage"
Note that a temporary GeoPackage
(/tmp/RtmpnDYffb/Rgpkg2cdba42f77d6e.gpkg) is automatically created when
using the geopackage(<list>)
constructor.
You also may have a DBIConnection to a GeoPackage database already opened that you want to use. In any case (character, list, SQLiteConnection) there is an S3 method to facilitate creating the basic geopackage class provided by {gpkg}. All other methods are designed to be able to work smoothly with geopackage class input.
We can list the table names in a GeoPackage with
gpkg_list_tables()
and fetch pointers (SpatRaster,
SpatVectorProxy, and lazy data.frame) to the data in them with
gpkg_table()
. We can check the status of the internal
geopackage
class SQLiteConnection
with
gpkg_is_connected()
and disconnect it with
gpkg_disconnect()
.
# enumerate tables
gpkg_list_tables(g)
#> [1] "DEM1" "DEM2"
#> [3] "bbox" "gpkg_2d_gridded_coverage_ancillary"
#> [5] "gpkg_2d_gridded_tile_ancillary" "gpkg_contents"
#> [7] "gpkg_extensions" "gpkg_geometry_columns"
#> [9] "gpkg_metadata" "gpkg_metadata_reference"
#> [11] "gpkg_ogr_contents" "gpkg_spatial_ref_sys"
#> [13] "gpkg_tile_matrix" "gpkg_tile_matrix_set"
#> [15] "myattr" "rtree_bbox_geom"
#> [17] "rtree_bbox_geom_node" "rtree_bbox_geom_parent"
#> [19] "rtree_bbox_geom_rowid" "sqlite_sequence"
# inspect tables
gpkg_tables(g)
#> $DEM1
#> class : SpatRaster
#> dimensions : 30, 31, 1 (nrow, ncol, nlyr)
#> resolution : 0.008333333, 0.008333333 (x, y)
#> extent : 6.008333, 6.266667, 49.69167, 49.94167 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326)
#> source : file2cdba26e44276.gpkg:DEM1
#> varname : file2cdba26e44276
#> name : DEM1
#> min value : 195
#> max value : 500
#>
#> $DEM2
#> class : SpatRaster
#> dimensions : 30, 31, 1 (nrow, ncol, nlyr)
#> resolution : 0.008333333, 0.008333333 (x, y)
#> extent : 6.008333, 6.266667, 49.69167, 49.94167 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326)
#> source : file2cdba26e44276.gpkg:DEM2
#> varname : file2cdba26e44276
#> name : DEM2
#> min value : 195
#> max value : 500
#>
#> $myattr
#> # Source: table<myattr> [10 x 2]
#> # Database: sqlite 3.45.0 [/tmp/RtmpnDYffb/file2cdba26e44276.gpkg]
#> a b
#> <int> <chr>
#> 1 1 A
#> 2 2 B
#> 3 3 C
#> 4 4 D
#> 5 5 E
#> 6 6 F
#> 7 7 G
#> 8 8 H
#> 9 9 I
#> 10 10 J
#>
#> $bbox
#> class : SpatVectorProxy
#> geometry : polygons
#> dimensions : 1, 0 (geometries, attributes)
#> extent : 6.008333, 6.266667, 49.69167, 49.94167 (xmin, xmax, ymin, ymax)
#> source : file2cdba26e44276.gpkg (bbox)
#> layer : bbox
#> coord. ref. : lon/lat WGS 84 (EPSG:4326)
# inspect a specific table
gpkg_table(g, "myattr", collect = TRUE)
#> a b
#> 1 1 A
#> 2 2 B
#> 3 3 C
#> 4 4 D
#> 5 5 E
#> 6 6 F
#> 7 7 G
#> 8 8 H
#> 9 9 I
#> 10 10 J
Note that the collect = TRUE
forces data be loaded into
R memory for vector and attribute data; this is the difference in result
object class of SpatVectorProxy/SpatVector and
tbl_SQLiteConnection/data.frame for vector and
attribute data, respectively.
gpkg_collect()
is a helper method to call
gpkg_table(..., collect = TRUE)
for in-memory loading of
specific tables.
Note that with grid data returned from gpkg_collect()
you get a table result with the tile contents in a blob column of a
data.frame instead of SpatRaster object.
The inverse function of gpkg_collect()
is
gpkg_tbl()
which always returns a
tbl_SQLiteConnection.
gpkg_tbl(g, "gpkg_contents")
#> # Source: table<gpkg_contents> [4 x 10]
#> # Database: sqlite 3.45.0 [/tmp/RtmpnDYffb/file2cdba26e44276.gpkg]
#> table_name data_type identifier description last_change min_x min_y max_x
#> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 DEM1 2d-gridded… DEM1 "" 2024-02-19… 6.01 49.7 6.27
#> 2 DEM2 2d-gridded… DEM2 "" 2024-02-19… 6.01 49.7 6.27
#> 3 bbox features bbox "" 2024-02-19… 6.01 49.7 6.27
#> 4 myattr attributes myattr "" 2024-02-19… -180 -90 180
#> # ℹ 2 more variables: max_y <dbl>, srs_id <int>
More on how to use this type of result next.
There are several other methods that can be used for working with tabular data in a GeoPackage in a “lazy” fashion.
gpkg_table_pragma()
gpkg_table_pragma()
is a low-frills
data.frame
result containing important table information,
but not values. The PRAGMA table_info()
is stored as a
nested data.frame table_info
. This representation has no
dependencies beyond {RSQLite} and is efficient for inspection of table
structure and attributes, though it is less useful for data
analysis.
head(gpkg_table_pragma(g))
#> dsn table_name nrow table_info.cid
#> 1 /tmp/RtmpnDYffb/file2cdba26e44276.gpkg DEM1 1 0
#> 2 /tmp/RtmpnDYffb/file2cdba26e44276.gpkg DEM1 1 1
#> 3 /tmp/RtmpnDYffb/file2cdba26e44276.gpkg DEM1 1 2
#> 4 /tmp/RtmpnDYffb/file2cdba26e44276.gpkg DEM1 1 3
#> 5 /tmp/RtmpnDYffb/file2cdba26e44276.gpkg DEM1 1 4
#> 6 /tmp/RtmpnDYffb/file2cdba26e44276.gpkg DEM2 1 0
#> table_info.name table_info.type table_info.notnull table_info.dflt_value
#> 1 id INTEGER 0 <NA>
#> 2 zoom_level INTEGER 1 <NA>
#> 3 tile_column INTEGER 1 <NA>
#> 4 tile_row INTEGER 1 <NA>
#> 5 tile_data BLOB 1 <NA>
#> 6 id INTEGER 0 <NA>
#> table_info.pk
#> 1 1
#> 2 0
#> 3 0
#> 4 0
#> 5 0
#> 6 1
gpkg_vect()
and
gpkg_query()
gpkg_vect()
is a wrapper around
terra::vect()
you can use to create ‘terra’
SpatVector
objects from the tables found in a
GeoPackage.
gpkg_vect(g, 'bbox')
#> class : SpatVector
#> geometry : polygons
#> dimensions : 1, 0 (geometries, attributes)
#> extent : 6.008333, 6.266667, 49.69167, 49.94167 (xmin, xmax, ymin, ymax)
#> source : file2cdba26e44276.gpkg (bbox)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326)
The table of interest need not have a geometry column, but this method does not work on GeoPackage that contain only gridded data, and some layer in the GeoPackage must have some geometry.
gpkg_vect(g, 'gpkg_ogr_contents')
#> class : SpatVector
#> geometry : none
#> dimensions : 0, 2 (geometries, attributes)
#> extent : 0, 0, 0, 0 (xmin, xmax, ymin, ymax)
#> source : file2cdba26e44276.gpkg (SELECT)
#> coord. ref. :
#> names : table_name feature_count
#> type : <chr> <int>
The SpatVectorProxy is used for “lazy” references to of vector and attribute contents of a GeoPackage; this object for vector data is analogous to the SpatRaster for gridded data. The ‘terra’ package provides “GDAL plumbing” for filter and query utilities.
gpkg_query()
by default uses the ‘RSQLite’ driver, but
the richer capabilities of OGR data sources can be harnessed with SQLite SQL
dialect. These additional features can be utilized with the
ogr=TRUE
argument to gpkg_query()
, or
gpkg_ogr_query()
for short. This assumes that GDAL is built
with support for SQLite (and ideally also with support for
Spatialite).
For example, we use built-in functions such as ST_MinX()
to calculate summaries for "bbox"
table, geometry column
"geom"
. In this case we expect the calculated quantities to
match the coordinates/boundaries of the bounding box:
gpkg_rast()
Using gpkg_rast()
you can quickly access references to
all tile/gridded datasets in a GeoPackage.
For example:
gpkg_rast(g)
#> class : SpatRaster
#> dimensions : 30, 31, 2 (nrow, ncol, nlyr)
#> resolution : 0.008333333, 0.008333333 (x, y)
#> extent : 6.008333, 6.266667, 49.69167, 49.94167 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326)
#> sources : file2cdba26e44276.gpkg:DEM1
#> file2cdba26e44276.gpkg:DEM2
#> varnames : file2cdba26e44276
#> file2cdba26e44276
#> names : DEM1, DEM2
#> min values : 195, 195
#> max values : 500, 500
gpkg_table()
With the gpkg_table()
method you access a specific table
(by name) and get a “lazy” tibble
object referencing that
table.
This is achieved via {dplyr} and the {dbplyr} database connection to the GeoPackage via the {RSQLite} driver. The resulting object’s data can be used in more complex analyses by using other {dbplyr}/{tidyverse} functions.
For example, we inspect the contents of the
gpkg_contents
table that contains critical information on
the data contained in a GeoPackage.
gpkg_table(g, "gpkg_contents")
#> # Source: table<gpkg_contents> [4 x 10]
#> # Database: sqlite 3.45.0 [/tmp/RtmpnDYffb/file2cdba26e44276.gpkg]
#> table_name data_type identifier description last_change min_x min_y max_x
#> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 DEM1 2d-gridded… DEM1 "" 2024-02-19… 6.01 49.7 6.27
#> 2 DEM2 2d-gridded… DEM2 "" 2024-02-19… 6.01 49.7 6.27
#> 3 bbox features bbox "" 2024-02-19… 6.01 49.7 6.27
#> 4 myattr attributes myattr "" 2024-02-19… -180 -90 180
#> # ℹ 2 more variables: max_y <dbl>, srs_id <int>
As a more complicated example we access the
gpkg_2d_gridded_tile_ancillary
table, and perform some data
processing.
We dplyr::select()
mean
and
std_dev
columns from the dplyr::filter()
ed
rows where tpudt_name == "DEM2"
. Finally we materialize a
tibble
with dplyr::collect()
:
Several helper methods are available for checking GeoPackage
SQLiteConnection
status, as well as connecting and
disconnecting an existing geopackage
object
(g
).
# still connected
gpkg_is_connected(g)
#> [1] TRUE
# disconnect geopackage
gpkg_disconnect(g)
# reconnect
gpkg_connect(g)
#> <geopackage>
#> --------------------------------------------------------------------------------
#> # of Tables: 21
#>
#> DEM1, DEM2, bbox, dummy_feature, gpkg_2d_gridded_coverage_ancillary,
#> gpkg_2d_gridded_tile_ancillary, gpkg_contents, gpkg_extensions,
#> gpkg_geometry_columns, gpkg_metadata, gpkg_metadata_reference,
#> gpkg_ogr_contents, gpkg_spatial_ref_sys, gpkg_tile_matrix,
#> gpkg_tile_matrix_set, myattr, rtree_bbox_geom, rtree_bbox_geom_node,
#> rtree_bbox_geom_parent, rtree_bbox_geom_rowid, sqlite_sequence
#> --------------------------------------------------------------------------------
#> <SQLiteConnection>
#> Path: /tmp/RtmpnDYffb/file2cdba26e44276.gpkg
#> Extensions: TRUE
# disconnect
gpkg_disconnect(g)
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.