Once your delayed data object has been created as described in Delayed Data Objects, teal.data provides a useful set of functions to examine the object outside of a shiny application, i.e. in the global environment.
Below is an exhaustive list of all such functions:
| | TealDataset | TealDatasetConnector | TealDataConnector & TealData |
|---|---|---|---|
| Get Reproducible Code (Optionally Deparsed) | get_code | get_code | get_code |
| Get data.frame | get_raw_data | get_raw_data | get_raw_data |
| Get Dataset Name | get_dataname | get_dataname | get_dataname |
| Get Single Dataset Object | get_dataset | get_dataset | get_dataset |
| Get All Dataset Objects | - | - | get_datasets |
| Load Data | - | load_dataset | load_datasets |
| Check if Loaded | - | is_pulled | is_pulled |
| Mutate Single Dataset | mutate_dataset | mutate_dataset | mutate_dataset |
| Mutate All Datasets | - | - | mutate_data |
The most basic function, get_dataname, returns the name of the dataset or datasets in your delayed data object:
library(scda)
library(teal.data)

adsl_cf <- callable_function(function() synthetic_cdisc_data("latest")$adsl)
adsl <- cdisc_dataset_connector(
  dataname = "ADSL",
  pull_callable = adsl_cf,
  keys = get_cdisc_keys("ADSL")
)

get_dataname(adsl) # "ADSL"
## [1] "ADSL"

adae_cf <- callable_function(function() synthetic_cdisc_data("latest")$adae)
adae <- cdisc_dataset_connector(
  dataname = "ADAE",
  pull_callable = adae_cf,
  keys = get_cdisc_keys("ADAE")
)

delayed_data <- cdisc_data(adsl, adae)
get_dataname(delayed_data) # "ADSL" "ADAE"
## [1] "ADSL" "ADAE"
All of the delayed data objects described above also have a launch method, which can be used to test the data loading screen:
if (interactive()) {
  delayed_data$launch()
}
There is also a pull method to test that the data can be loaded without launching a shiny app. See Delayed Data Loading.
Alternatively, teal.data provides a load_dataset function for <...>Dataset<...> objects, which pulls the data without launching the delayed loading screen, and a load_datasets function for <...>Data<...> objects, which launches the delayed loading screen used to pull the datasets from the connection.
After loading the data, use the is_pulled function to check that it has been pulled successfully. In the example below, load_datasets runs only in an interactive session, so a non-interactive run reports FALSE:
if (interactive()) {
  load_datasets(delayed_data)
}

is_pulled(delayed_data)
## [1] FALSE
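A single connector can also be pulled and checked without any shiny session. The following is a minimal sketch rather than part of the original example: it rebuilds the ADSL connector under the illustrative names adsl_demo_cf and adsl_demo, assumes the scda synthetic data is installed, and uses only load_dataset and is_pulled as described above.

```r
library(scda)
library(teal.data)

# Illustrative names: adsl_demo_cf and adsl_demo are chosen for this sketch only.
adsl_demo_cf <- callable_function(function() synthetic_cdisc_data("latest")$adsl)
adsl_demo <- cdisc_dataset_connector(
  dataname = "ADSL",
  pull_callable = adsl_demo_cf,
  keys = get_cdisc_keys("ADSL")
)

pulled_before <- is_pulled(adsl_demo) # nothing has been pulled yet
load_dataset(adsl_demo)               # pulls the data; no loading screen is launched
pulled_after <- is_pulled(adsl_demo)

# Compare the state before and after loading
c(before = pulled_before, after = pulled_after)
```

Because the connector is pulled directly, this check does not depend on interactive().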
The default values of the input boxes on the loading page can be set using the set_ui_input method:
# pickerInput() comes from the shinyWidgets package
adae$set_ui_input(function(ns) {
  list(
    pickerInput(
      "name",
      label = "Version of the dataset",
      choices = ls_synthetic_cdisc_data(),
      selected = "latest"
    )
  )
})
Once the data are loaded, it is also possible to access the individual dataset objects using the get_dataset function or, for <...>Data<...> objects, to retrieve all dataset objects using the get_datasets function:
# pull every item directly, without going through the loading screen
lapply(delayed_data$get_items(), function(item) item$pull())
# return a particular dataset by name
get_dataset(delayed_data, dataname = "ADSL")
# or return all datasets
load_datasets(delayed_data)
get_datasets(delayed_data)
Note that when a connector is loaded, the result is a dataset object:
# "CDISCTealDatasetConnector" "TealDatasetConnector" "R6"
class(adsl)
## [1] "CDISCTealDatasetConnector" "TealDatasetConnector"
## [3] "R6"
# "CDISCTealDataset" "TealDataset" "R6"
class(get_dataset(adsl))
## [1] "CDISCTealDataset" "TealDataset" "R6"
To view the raw data.frame object, use the get_raw_data function:
# for a single <...>Dataset<...> object
head(get_raw_data(adsl), 3)
## # A tibble: 3 × 44
## STUDYID USUBJID SUBJID SITEID AGE AGEU SEX RACE ETHNIC COUNTRY DTHFL
## <chr> <chr> <chr> <chr> <int> <fct> <fct> <fct> <fct> <fct> <fct>
## 1 AB12345 AB12345-CH… id-128 CHN-3 32 YEARS M ASIAN NOT H… CHN N
## 2 AB12345 AB12345-CH… id-262 CHN-15 35 YEARS M BLAC… NOT H… CHN N
## 3 AB12345 AB12345-RU… id-378 RUS-3 30 YEARS F ASIAN NOT H… RUS N
## # … with 33 more variables: INVID <chr>, INVNAM <chr>, ARM <fct>, ARMCD <fct>,
## # ACTARM <fct>, ACTARMCD <fct>, TRT01P <fct>, TRT01A <fct>, REGION1 <fct>,
## # STRATA1 <fct>, STRATA2 <fct>, BMRKR1 <dbl>, BMRKR2 <fct>, ITTFL <fct>,
## # SAFFL <fct>, BMEASIFL <fct>, BEP01FL <fct>, RANDDT <date>, TRTSDTM <dttm>,
## # TRTEDTM <dttm>, EOSSTT <fct>, EOTSTT <fct>, EOSDT <date>, EOSDY <int>,
## # DCSREAS <fct>, DTHDT <date>, DTHCAUS <fct>, DTHCAT <fct>, LDDTHELD <int>,
## # LDDTHGR1 <fct>, LSTALVDT <date>, DTHADY <int>, study_duration_secs <dbl>
# or for a <...>Data<...> object containing multiple datasets, specify the name of the dataset of interest
raw <- get_raw_data(delayed_data, "ADSL")
head(raw, 3)
## # A tibble: 3 × 44
## STUDYID USUBJID SUBJID SITEID AGE AGEU SEX RACE ETHNIC COUNTRY DTHFL
## <chr> <chr> <chr> <chr> <int> <fct> <fct> <fct> <fct> <fct> <fct>
## 1 AB12345 AB12345-CH… id-128 CHN-3 32 YEARS M ASIAN NOT H… CHN N
## 2 AB12345 AB12345-CH… id-262 CHN-15 35 YEARS M BLAC… NOT H… CHN N
## 3 AB12345 AB12345-RU… id-378 RUS-3 30 YEARS F ASIAN NOT H… RUS N
## # … with 33 more variables: INVID <chr>, INVNAM <chr>, ARM <fct>, ARMCD <fct>,
## # ACTARM <fct>, ACTARMCD <fct>, TRT01P <fct>, TRT01A <fct>, REGION1 <fct>,
## # STRATA1 <fct>, STRATA2 <fct>, BMRKR1 <dbl>, BMRKR2 <fct>, ITTFL <fct>,
## # SAFFL <fct>, BMEASIFL <fct>, BEP01FL <fct>, RANDDT <date>, TRTSDTM <dttm>,
## # TRTEDTM <dttm>, EOSSTT <fct>, EOTSTT <fct>, EOSDT <date>, EOSDY <int>,
## # DCSREAS <fct>, DTHDT <date>, DTHCAUS <fct>, DTHCAT <fct>, LDDTHELD <int>,
## # LDDTHGR1 <fct>, LSTALVDT <date>, DTHADY <int>, study_duration_secs <dbl>
# note the raw data is now just a regular R table
class(raw)
## [1] "tbl_df" "tbl" "data.frame"
Call the get_code function to check that the processing code is as expected (and for reproducibility).
get_code(delayed_data)
## [1] "ADSL <- (function() synthetic_cdisc_data(\"latest\")$adsl)()\nADAE <- (function() synthetic_cdisc_data(\"latest\")$adae)()"
See the section on pre-processing Delayed Data to specify additional code instructions that transform your delayed data. These instructions are also added to the output of get_code.
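As a hedged illustration of that point, the sketch below adds one mutate_dataset instruction to a connector, loads it, and then inspects get_code. The names adsl_demo_cf, adsl_demo, and code_text are illustrative, the TRTDUR derivation is borrowed from the piping example in this article, and the scda synthetic data is assumed to be installed.

```r
library(scda)
library(teal.data)

# Illustrative connector; the names are chosen for this sketch only.
adsl_demo_cf <- callable_function(function() synthetic_cdisc_data("latest")$adsl)
adsl_demo <- cdisc_dataset_connector(
  dataname = "ADSL",
  pull_callable = adsl_demo_cf,
  keys = get_cdisc_keys("ADSL")
)

# Register a transformation, then pull the data so the transformation is applied
mutate_dataset(adsl_demo, "ADSL$TRTDUR <- round(as.numeric(ADSL$TRTEDTM - ADSL$TRTSDTM), 1)")
load_dataset(adsl_demo)

# The reproducible code should now contain the pull step and the mutate step
code_text <- get_code(adsl_demo)
cat(code_text)
```

Printing with cat makes the embedded newlines readable, which helps when comparing the recorded code against what was intended.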
The examples above covered some basic piping, but there is a natural sequence to the loading and inspection of a delayed data object. For this reason, the magrittr pipe %>% works well for many pre-processing tasks.
library(teal.data)
library(scda)
library(magrittr)

adsl_cf <- callable_function(function() synthetic_cdisc_data("latest")$adsl)
cdisc_dataset_connector(
  dataname = "ADSL",
  pull_callable = adsl_cf,
  keys = get_cdisc_keys("ADSL")
) %>%
  mutate_dataset("ADSL$TRTDUR <- round(as.numeric(ADSL$TRTEDTM - ADSL$TRTSDTM), 1)") %>%
  load_dataset() %>%
  get_raw_data() %>%
  head(n = 3)
## # A tibble: 3 × 45
## STUDYID USUBJID SUBJID SITEID AGE AGEU SEX RACE ETHNIC COUNTRY DTHFL
## <chr> <chr> <chr> <chr> <int> <fct> <fct> <fct> <fct> <fct> <fct>
## 1 AB12345 AB12345-CH… id-128 CHN-3 32 YEARS M ASIAN NOT H… CHN N
## 2 AB12345 AB12345-CH… id-262 CHN-15 35 YEARS M BLAC… NOT H… CHN N
## 3 AB12345 AB12345-RU… id-378 RUS-3 30 YEARS F ASIAN NOT H… RUS N
## # … with 34 more variables: INVID <chr>, INVNAM <chr>, ARM <fct>, ARMCD <fct>,
## # ACTARM <fct>, ACTARMCD <fct>, TRT01P <fct>, TRT01A <fct>, REGION1 <fct>,
## # STRATA1 <fct>, STRATA2 <fct>, BMRKR1 <dbl>, BMRKR2 <fct>, ITTFL <fct>,
## # SAFFL <fct>, BMEASIFL <fct>, BEP01FL <fct>, RANDDT <date>, TRTSDTM <dttm>,
## # TRTEDTM <dttm>, EOSSTT <fct>, EOTSTT <fct>, EOSDT <date>, EOSDY <int>,
## # DCSREAS <fct>, DTHDT <date>, DTHCAUS <fct>, DTHCAT <fct>, LDDTHELD <int>,
## # LDDTHGR1 <fct>, LSTALVDT <date>, DTHADY <int>, study_duration_secs <dbl>, …
Since these functions modify (operate on) the objects that are given to them, there is no need to assign the result.
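The sketch below illustrates this in-place behaviour, which follows from the objects being R6 objects (see the class output earlier). It keeps a reference to a connector under the illustrative name adsl_demo, runs a pipeline without assigning its result, and then checks the original object. It assumes the scda synthetic data is installed.

```r
library(scda)
library(teal.data)
library(magrittr)

# Illustrative connector; the names are chosen for this sketch only.
adsl_demo_cf <- callable_function(function() synthetic_cdisc_data("latest")$adsl)
adsl_demo <- cdisc_dataset_connector(
  dataname = "ADSL",
  pull_callable = adsl_demo_cf,
  keys = get_cdisc_keys("ADSL")
)

# The result of the pipeline is deliberately not assigned
adsl_demo %>%
  mutate_dataset("ADSL$TRTDUR <- round(as.numeric(ADSL$TRTEDTM - ADSL$TRTSDTM), 1)") %>%
  load_dataset()

# adsl_demo itself has been pulled and carries the derived column
was_pulled <- is_pulled(adsl_demo)
has_trtdur <- "TRTDUR" %in% names(get_raw_data(adsl_demo))
```

A side effect of this design is that two variables pointing at the same connector see the same changes, so take care when reusing a connector across experiments.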
For an introduction to pipes, refer to the documentation for %>% or other resources on pipes.