R library for interacting with the Google Cloud Storage JSON API (see the API docs).
Google Cloud Storage charges you for storage (see the pricing page).
You can use your own Google project with a credit card added to create buckets, where the charges will apply. This can be done in the Google API Console after logging in here: https://console.developers.google.com
Once you have a Google project and have created a bucket with an object in it, you can download the object as below:
library(googleCloudStorageR)
options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/devstorage.full_control")
## optional, if you want to use your own Google project
# options("googleAuthR.client_id" = "YOUR_CLIENT_ID")
# options("googleAuthR.client_secret" = "YOUR_CLIENT_SECRET")
googleAuthR::gar_auth()
## get your project name from the API console
proj <- "your-project"
## get bucket info
buckets <- gcs_list_buckets(proj)
bucket <- "your-bucket"
bucket_info <- gcs_get_bucket(bucket)
## get object info
objects <- gcs_list_objects(bucket)
## save directly to an R object (warning: don't run out of RAM if it's a big object)
## the download type is guessed and parsed into an appropriate R object
parsed_download <- gcs_get_object(objects$name[[1]], bucket)
## if you want to do your own parsing, set parseObject to FALSE
## use httr::content() to parse afterwards
raw_download <- gcs_get_object(objects$name[[1]],
bucket,
parseObject = FALSE)
## save directly to a file in your working directory
## parseObject has no effect, it is a httr::content(req, "raw") download
gcs_get_object(objects$name[[1]], bucket, saveToDisk = "csv_downloaded.csv")
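As a sketch of the manual-parsing route mentioned above (assuming an authenticated session and that the object is a CSV; the object and bucket names are placeholders), the raw response can be parsed afterwards with httr::content():

```r
library(googleCloudStorageR)

## fetch the raw httr response without the library's automatic parsing
raw_download <- gcs_get_object("your-object.csv", "your-bucket",
                               parseObject = FALSE)

## parse it yourself - here telling httr to treat the body as CSV
df <- httr::content(raw_download, type = "text/csv", encoding = "UTF-8")
```

This is useful when the automatic type guessing picks the wrong parser, or when you want control over parsing options.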
Objects can be uploaded from files saved to disk, or passed in directly if they are data frames or list-type R objects. Data frames will be converted to CSV via write.csv(), lists to JSON via jsonlite::toJSON().
## upload a file - type will be guessed from the file extension, or supply type
filename <- "mtcars.csv"
write.csv(mtcars, file = filename)
gcs_upload(filename, "your-bucket")
## upload an R data.frame directly - will be converted to csv via write.csv
gcs_upload(mtcars, "your-bucket")
## upload an R list - will be converted to json via jsonlite::toJSON
gcs_upload(list(a = 1, b = 3, c = list(d = 2, e = 5)), "your-bucket")
You can change who can access objects via gcs_update_acl to one of READER or OWNER, on a user, group, domain, project, or public for all users or all authenticated users.
By default you are "OWNER" of all the objects and buckets you upload and create.
## update access of object to READER for all public users
gcs_update_acl("your-object.csv", "your-bucket", entity_type = "allUsers")
## update access of object for user joe@blogs.com to OWNER
gcs_update_acl("your-object.csv", "your-bucket", "joe@blogs.com", role = "OWNER")
## update access of object for Google Group members to READER
gcs_update_acl("your-object.csv", "your-bucket", "my-group@googlegroups.com", entity_type = "group")
## update access of object for all users on your Google Apps domain to OWNER
gcs_update_acl("your-object.csv",
               "your-bucket",
               "yourdomain.com",
               entity_type = "domain",
               role = "OWNER")
Use gcs_get_object_access() to see what the current access is for a given entity and entity_type.
## default entity_type is user
acl <- gcs_get_object_access("your-object.csv", "your-bucket", entity = "joe@blogs.com")
acl$role
[1] "OWNER"
## for allUsers and allAuthenticatedUsers, you don't need to supply entity
acl <- gcs_get_object_access("your-object.csv", "your-bucket", entity_type = "allUsers")
acl$role
[1] "READER"
Once a user (or group, or the public) has access, they can reach that object via a download link generated by gcs_download_url:
download_url <- gcs_download_url("your-object.csv", "your-bucket")
download_url
[1] "https://storage.cloud.google.com/your-bucket/your-object.csv"
The library is also compatible with Shiny authentication flows, so for example you can create Shiny apps that let users log in and upload their own data.
An example of that is shown below:
library("shiny")
library("googleAuthR")
library("googleCloudStorageR")
options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/devstorage.full_control")
## optional, if you want to use your own Google project
# options("googleAuthR.client_id" = "YOUR_CLIENT_ID")
# options("googleAuthR.client_secret" = "YOUR_CLIENT_SECRET")
## you need to start the Shiny app on port 1221,
## as that's what the default googleAuthR project expects for OAuth2 authentication
## options(shiny.port = 1221)
## print(source('shiny_test.R')$value) or push the "Run App" button in RStudio
shinyApp(
ui = shinyUI(
fluidPage(
googleAuthR::googleAuthUI("login"),
fileInput("picture", "picture"),
      textInput("filename", label = "Name on Google Cloud Storage", value = "myObject"),
actionButton("submit", "submit"),
textOutput("meta_file")
)
),
server = shinyServer(function(input, output, session){
access_token <- shiny::callModule(googleAuth, "login")
meta <- eventReactive(input$submit, {
message("Uploading to Google Cloud Storage")
# from googleCloudStorageR
with_shiny(gcs_upload,
file = input$picture$datapath,
# enter your bucket name here
bucket = "gogauth-test",
type = input$picture$type,
name = input$filename,
shiny_access_token = access_token())
})
output$meta_file <- renderText({
req(meta())
str(meta())
paste("Uploaded: ", meta()$name)
})
})
)
There are various functions to manipulate Buckets:
gcs_list_buckets
gcs_get_bucket
gcs_create_bucket
gcs_update_bucket
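As an illustrative sketch of combining the bucket functions listed above (the project and bucket names are placeholders, and the calls require an authenticated session):

```r
library(googleCloudStorageR)

proj <- "your-project"

## create a new bucket in your project (bucket names must be globally unique)
gcs_create_bucket("my-new-bucket-name", proj, location = "EU")

## list all the buckets the project now holds
buckets <- gcs_list_buckets(proj)

## fetch the metadata of one bucket
bucket_info <- gcs_get_bucket("my-new-bucket-name")
```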
You can get metadata about an object by passing meta = TRUE to gcs_get_object:
gcs_get_object("your-object", "your-bucket", meta = TRUE)
googleCloudStorageR has its own Google project which is used to call the Google Cloud Storage API, but it does not have access to the objects or buckets in your Google project unless you give the library permission to access your own buckets during the OAuth2 authentication process.
No other user, including the owner of the Google Cloud Storage API project, has access unless you have given them access, but you may want to use your own Google project instead (which may or may not be the same project that holds your buckets). The instructions below are for when you visit the Google API Console (https://console.developers.google.com/apis/).
For local use, modify these options after googleAuthR has been loaded:
options("googleAuthR.client_id" = "YOUR_CLIENT_ID")
options("googleAuthR.client_secret" = "YOUR_CLIENT_SECRET")
For Shiny use, add the URL(s) where your Shiny app will run (e.g. https://mark.shinyapps.io/searchConsoleRDemo/ for a deployed app, or http://127.0.0.1:1221 for local testing) to the authorized URLs in the API console.
In your Shiny script modify these options:
options("googleAuthR.webapp.client_id" = "YOUR_CLIENT_ID")
options("googleAuthR.webapp.client_secret" = "YOUR_CLIENT_SECRET")
shiny::runApp(port = 1221)
or set a Shiny option to default to it: options(shiny.port = 1221)
and launch via the "Run App" button in RStudio. Running on your Shiny Server will work only for the URL from step 3.
Set the googleAuthR option for the Google Cloud Storage scope:
options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/devstorage.full_control")
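Putting the local-use pieces together, a minimal setup sketch (with placeholder client credentials from your own API console project) might look like:

```r
library(googleAuthR)
library(googleCloudStorageR)

## your own client ID and secret from the Google API Console
options("googleAuthR.client_id" = "YOUR_CLIENT_ID")
options("googleAuthR.client_secret" = "YOUR_CLIENT_SECRET")

## the Google Cloud Storage scope
options(googleAuthR.scopes.selected =
          "https://www.googleapis.com/auth/devstorage.full_control")

## authenticate - opens a browser window for the OAuth2 flow
googleAuthR::gar_auth()
```

After authenticating once, the cached token is reused on subsequent calls to gar_auth().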