The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Getting Started

Dyfan Jones

The noctua package aims to make it easier to work with data stored in AWS Athena. noctua package attempts to provide three levels of interacting with AWS Athena:

Installing noctua:

As noctua utilising the R AWS SDK paws the installation of noctua is pretty straight forward:

# cran version
install.packages("noctua")

# Dev version
remotes::install_github("dyfanjones/noctua")

Docker Example:

To help with users wishing to run noctua in a docker, a simple docker file has been created here. To set up the docker please refer to link. For demo purposes we will use the example docker and run it locally:

# build docker image
docker build . -t noctua

# start container with aws credentials passed from local
docker run \
      -e AWS_ACCESS_KEY_ID="$(aws configure get aws_access_key_id)" \
      -e AWS_SECRET_ACCESS_KEY="$(aws configure get aws_secret_access_key)" \
      -e AWS_SESSION_TOKEN="$(aws configure get aws_session_token)" \
      -e AWS_DEFAULT_REGION="$(aws configure get region)" \
      -it noctua

NOTE: readr isn’t required for noctua, however it has been included in the docker file to improve performance when querying AWS Athena.

Usage:

Low - Level API:

library(DBI)
library(noctua)

con <- dbConnect(athena())

# list all current work groups in AWS Athena
list_work_groups(con)

# Create a new work group
create_work_group(con, "demo_work_group", description = "This is a demo work group",
                  tags = tag_options(key= "demo_work_group", value = "demo_01"))

DBI:

library(DBI)

con <- dbConnect(noctua::athena())

# Get metadata 
dbGetInfo(con)

# $profile_name
# [1] "default"
# 
# $s3_staging
# [1] ######## NOTE: Please don't share your S3 bucket to the public
# 
# $dbms.name
# [1] "default"
# 
# $work_group
# [1] "primary"
# 
# $poll_interval
# NULL
# 
# $encryption_option
# NULL
# 
# $kms_key
# NULL
# 
# $expiration
# NULL
# 
# $region_name
# [1] "eu-west-1"
# 
# $paws
# [1] "0.1.6"
# 
# $noctua
# [1] "1.5.1"

# create table to AWS Athena
dbWriteTable(con, "iris", iris)

dbGetQuery(con, "select * from iris limit 10")
# Info: (Data scanned: 860 Bytes)
#  sepal_length sepal_width petal_length petal_width species
# 1:           5.1         3.5          1.4         0.2  setosa
# 2:           4.9         3.0          1.4         0.2  setosa
# 3:           4.7         3.2          1.3         0.2  setosa
# 4:           4.6         3.1          1.5         0.2  setosa
# 5:           5.0         3.6          1.4         0.2  setosa
# 6:           5.4         3.9          1.7         0.4  setosa
# 7:           4.6         3.4          1.4         0.3  setosa
# 8:           5.0         3.4          1.5         0.2  setosa
# 9:           4.4         2.9          1.4         0.2  setosa
# 10:          4.9         3.1          1.5         0.1  setosa

dplyr:

library(dplyr)

athena_iris <- tbl(con, "iris")

athena_iris %>%
  select(species, sepal_length, sepal_width) %>% 
  head(10) %>%
  collect()

# Info: (Data scanned: 860 Bytes)
# # A tibble: 10 x 3
# species  sepal_length sepal_width
# <chr>           <dbl>       <dbl>
# 1 setosa            5.1         3.5
# 2 setosa            4.9         3  
# 3 setosa            4.7         3.2
# 4 setosa            4.6         3.1
# 5 setosa            5           3.6
# 6 setosa            5.4         3.9
# 7 setosa            4.6         3.4
# 8 setosa            5           3.4
# 9 setosa            4.4         2.9
# 10 setosa           4.9         3.1

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.