Type: | Package |
Title: | Sequential Trial Emulation |
Version: | 0.13.1 |
Description: | Implementation of sequential trial emulation for the analysis of observational databases. The 'SEQTaRget' software accommodates time-varying treatments and confounders, as well as binary and failure time outcomes. 'SEQTaRget' allows comparison of both static and dynamic strategies, can be used to estimate observational analogs of intention-to-treat and per-protocol effects, and can adjust for potential selection bias induced by losses-to-follow-up. (Paper to come). |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
Suggests: | rmarkdown, testthat (≥ 3.0.0) |
Imports: | data.table, doFuture, doRNG, fastglm, future, future.apply, ggplot2, knitr, methods, stringr, survival |
Config/testthat/edition: | 3 |
Depends: | R (≥ 4.1) |
URL: | https://causalinference.github.io/SEQuential/ |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2025-09-10 15:45:45 UTC; ryo766 |
Author: | Ryan O'Dea |
Maintainer: | Ryan O'Dea <ryanodea@hsph.harvard.edu> |
Repository: | CRAN |
Date/Publication: | 2025-09-15 08:40:02 UTC |
Function to return the internal data from a SEQuential object
Description
Function to return the internal data from a SEQuential object
Usage
SEQ_data(object)
Arguments
object |
SEQoutput object |
Value
data.table
Simulated observational example data for SEQuential
Description
Simulated observational example data for SEQuential
Usage
SEQdata
Format
A data frame with 54,687 rows and 13 columns:
- ID
Integer: Unique ID emulating individual patients
- time
Integer: Time of observation, always begins at 0, max time of 59. Should be continuous
- eligible
Binary: eligibility criteria for timepoints
- outcome
Binary: If an outcome is observed at this time point
- tx_init
Binary: If treatment is observed at this time point
- sex
Binary: Sex of the emulated patient
- N
Numeric: Normal random variable from N(10, 5)
- L
Numeric: 4% continuous increase from U(0, 1)
- P
Numeric: 2% continuous decrease from U(9, 10)
- excusedOne
Binary: Once one, always one variable emulating an excuse for treatment switch
- excusedZero
Binary: Once zero, always zero variable emulating an excuse for treatment switch
Simulated Lost-to-followup example data for SEQuential
Description
Simulated Lost-to-followup example data for SEQuential
Usage
SEQdata.LTFU
Format
A dataframe with 4,139 rows and 13 columns:
- ID
Integer: Unique ID emulating individual patients
- time
Integer: Time of observation, always begins at 0, max time of 59; however, if lost-to-followup, time is truncated at a random point
- eligible
Binary: eligibility criteria for timepoints
- outcome
Binary: If an outcome is observed at this time point
- tx_init
Binary: If treatment is observed at this time point
- sex
Binary: Sex of the emulated patient
- N
Numeric: Normal random variable from N(10, 5)
- L
Numeric: 4% continuous increase from U(0, 1)
- P
Numeric: 2% continuous decrease from U(9, 10)
- excusedOne
Binary: Once one, always one variable emulating an excuse for treatment switch
- excusedZero
Binary: Once zero, always zero variable emulating an excuse for treatment switch
- LTFU
Binary: Flag for losing a simulated ID to followup, if 1 there are no more records of the ID afterwards
Simulated multitreatment example data for SEQuential multinomial models
Description
Simulated multitreatment example data for SEQuential multinomial models
Usage
SEQdata.multitreatment
Format
A dataframe with 5,976 rows and 11 columns:
- ID
Integer: Unique ID emulating individual patients
- time
Integer: Time of observation, always begins at 0, max time of 59; however, if lost-to-followup, time is truncated at a random point
- eligible
Binary: eligibility criteria for timepoints
- outcome
Binary: If an outcome is observed at this time point
- tx_init
Integer: Which treatment is observed at this time point
- sex
Binary: Sex of the emulated patient
- N
Numeric: Normal random variable from N(10, 5)
- L
Numeric: 4% continuous increase from U(0, 1)
- P
Numeric: 2% continuous decrease from U(9, 10)
- excusedOne
Binary: Once one, always one variable emulating an excuse for treatment switch
- excusedZero
Binary: Once zero, always zero variable emulating an excuse for treatment switch
Estimate the (very rough) time to run SEQuential analysis on current machine
Description
Estimate the (very rough) time to run SEQuential analysis on current machine
Usage
SEQestimate(
data,
id.col,
time.col,
eligible.col,
treatment.col,
outcome.col,
time_varying.cols = list(),
fixed.cols = list(),
method,
options,
verbose = TRUE
)
Arguments
data |
data.frame or data.table, if not already expanded with |
id.col |
String: column name of the id column |
time.col |
String: column name of the time column |
eligible.col |
String: column name of the eligibility column |
treatment.col |
String: column name of the treatment column |
outcome.col |
String: column name of the outcome column |
time_varying.cols |
List: column names for time varying columns |
fixed.cols |
List: column names for fixed columns |
method |
String: method of analysis to perform |
options |
List: optional list of parameters from |
verbose |
Logical: if TRUE, prints progress to the console |
Value
A list of (very rough) estimates for the time required for SEQuential containing:
- modelTime
estimated time used when running models
- expansionTime
estimated time used when expanding data
- totalTime
sum of model and expansion time
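A minimal usage sketch for the call above, assuming the package is installed under the name 'SEQTaRget' (the name used in the package description) and that the bundled SEQdata example data is used. The guard skips the call when the package is unavailable; the element names come from the Value list above.

```r
# Assumption: the package is installed as 'SEQTaRget'.
if (requireNamespace("SEQTaRget", quietly = TRUE)) {
  library(SEQTaRget)

  # Same arguments as the SEQuential() example, but only timing is estimated.
  estimate <- SEQestimate(
    SEQdata,
    id.col = "ID",
    time.col = "time",
    eligible.col = "eligible",
    treatment.col = "tx_init",
    outcome.col = "outcome",
    time_varying.cols = c("N", "L", "P"),
    fixed.cols = "sex",
    method = "ITT",
    options = SEQopts(),
    verbose = FALSE
  )

  # Rough estimates only; see the Value section for the three elements.
  print(estimate$expansionTime)
  print(estimate$modelTime)
  print(estimate$totalTime)
}
```

Because the estimates are described as very rough, they are probably best used to decide whether a run is feasible rather than to predict its exact duration.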
Creates an expanded dataset for use with SEQuential
Description
Creates an expanded dataset for use with SEQuential
Usage
SEQexpand(params)
Arguments
params |
SEQparams object built in the SEQuential function |
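To illustrate what expanding a dataset means in sequential trial emulation, here is a toy base-R sketch. It is not the package's implementation: the helper expand_trials and the column name period are hypothetical, and the rule used (each eligible row starts a trial that follows the same ID forward in time) is an assumption about the general technique.

```r
# Toy person-time data: one row per ID per time point.
toy <- data.frame(
  ID       = c(1, 1, 1, 2, 2),
  time     = c(0, 1, 2, 0, 1),
  eligible = c(1, 1, 0, 0, 1)
)

# Hypothetical helper: every eligible row starts a new emulated trial,
# and each trial keeps that ID's rows from the trial start onward.
expand_trials <- function(df) {
  starts <- df[df$eligible == 1, c("ID", "time")]
  pieces <- lapply(seq_len(nrow(starts)), function(i) {
    rows <- df[df$ID == starts$ID[i] & df$time >= starts$time[i], ]
    data.frame(
      ID       = rows$ID,
      trial    = starts$time[i],               # time at which the trial began
      period   = rows$time,                    # original time point
      followup = rows$time - starts$time[i]    # time since trial start
    )
  })
  do.call(rbind, pieces)
}

expanded <- expand_trials(toy)
print(expanded)
```

In this toy input, ID 1 is eligible at times 0 and 1 and ID 2 at time 1, so three trials are created and the expanded table has more rows than the input.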
Parameter Builder for SEQuential Model and Estimates
Description
Parameter Builder for SEQuential Model and Estimates
Usage
SEQopts(
bootstrap = FALSE,
bootstrap.nboot = 100,
bootstrap.sample = 0.8,
cense = NA,
cense.denominator = NA,
cense.eligible = NA,
cense.numerator = NA,
compevent = NA,
covariates = NA,
data.return = FALSE,
denominator = NA,
deviation = FALSE,
deviation.col = NA,
deviation.conditions = c(NA, NA),
deviation.excused = FALSE,
deviation.excused_cols = c(NA, NA),
excused = FALSE,
excused.cols = c(NA, NA),
fastglm.method = 2L,
followup.class = FALSE,
followup.include = TRUE,
followup.max = Inf,
followup.min = -Inf,
followup.spline = FALSE,
hazard = FALSE,
indicator.baseline = "_bas",
indicator.squared = "_sq",
km.curves = FALSE,
multinomial = FALSE,
ncores = parallel::detectCores() - 1,
nthreads = data.table::getDTthreads(),
numerator = NA,
parallel = FALSE,
plot.colors = c("#F8766D", "#00BFC4", "#555555"),
plot.labels = NA,
plot.subtitle = NA,
plot.title = NA,
plot.type = "survival",
seed = NULL,
selection.first_trial = FALSE,
selection.prob = 0.8,
selection.random = FALSE,
subgroup = NA,
survival.max = Inf,
treat.level = c(0, 1),
trial.include = TRUE,
weight.eligible_cols = c(),
weight.lower = -Inf,
weight.lag_condition = TRUE,
weight.p99 = FALSE,
weight.preexpansion = TRUE,
weight.upper = Inf,
weighted = FALSE
)
Arguments
bootstrap |
Logical: defines if SEQuential should run bootstrapping, default is FALSE |
bootstrap.nboot |
Integer: number of bootstraps |
bootstrap.sample |
Numeric: proportion of data to use when bootstrapping, should be in [0, 1], default is 0.8 |
cense |
String: column name for additional censoring variable, e.g. loss-to-follow-up |
cense.denominator |
String: censoring denominator covariates to the right hand side of a formula object |
cense.eligible |
String: column name for indicator column defining which rows to use for censoring model |
cense.numerator |
String: censoring numerator covariates to the right hand side of a formula object |
compevent |
String: column name for competing event indicator |
covariates |
String: covariates to the right hand side of a formula object |
data.return |
Logical: whether to return the expanded dataframe with weighting information |
denominator |
String: denominator covariates to the right hand side of a formula object |
deviation |
Logical: create switch based on deviation from column |
deviation.col |
Character: column name for deviation |
deviation.conditions |
Character list: RHS evaluations of the same length as |
deviation.excused |
Logical: whether deviations should be excused by |
deviation.excused_cols |
Character list: excused columns for deviation switches |
excused |
Logical: in the case of censoring, whether there is an excused condition |
excused.cols |
List: list of column names for treatment switch excuses - should be the same length, and ordered the same as |
fastglm.method |
Integer: decomposition method for fastglm (1-QR, 2-Cholesky, 3-LDLT, 4-QR.FPIV) |
followup.class |
Logical: treat followup as a class, e.g. expands every time to its own indicator column |
followup.include |
Logical: whether or not to include 'followup' and 'followup_squared' in the outcome model |
followup.max |
Numeric: maximum time to expand about, default is Inf (no maximum) |
followup.min |
Numeric: minimum time to expand about, default is -Inf (no minimum) |
followup.spline |
Logical: treat followup as a cubic spline |
hazard |
Logical: hazard error calculation instead of survival estimation |
indicator.baseline |
String: identifier for baseline variables in |
indicator.squared |
String: identifier for squared variables in |
km.curves |
Logical: Kaplan-Meier survival curve creation and data return |
multinomial |
Logical: whether to expect multilevel treatment values |
ncores |
Integer: number of cores to use in parallel processing, default is one less than system max |
nthreads |
Integer: number of threads to use for data.table processing |
numerator |
String: numerator covariates to the right hand side of a formula object |
parallel |
Logical: define if the SEQuential process is run in parallel, default is FALSE |
plot.colors |
Character: Colors for output plot if |
plot.labels |
Character: Color labels for output plot if |
plot.subtitle |
Character: Subtitle for output plot if |
plot.title |
Character: Title for output plot if |
plot.type |
Character: Type of plot to create if |
seed |
Integer: starting seed |
selection.first_trial |
Logical: selects only the first eligible trial in the expanded dataset |
selection.prob |
Numeric: percent of total IDs to select for |
selection.random |
Logical: randomly selects IDs with replacement to run analysis |
subgroup |
Character: Column name to stratify outcome models on |
survival.max |
Numeric: maximum time for survival curves, default is Inf (no maximum) |
treat.level |
List: treatment levels to compare |
trial.include |
Logical: whether or not to include 'trial' and 'trial_squared' in the outcome model |
weight.eligible_cols |
List: list of column names for indicator columns defining which weights are eligible for weight models - in order of |
weight.lower |
Numeric: weights truncated at lower end at this weight |
weight.lag_condition |
Logical: whether weights should be conditioned on treatment lag value |
weight.p99 |
Logical: forces weight truncation at 1st and 99th percentile weights, will override provided |
weight.preexpansion |
Logical: whether weighting should be done on pre-expanded data |
weight.upper |
Numeric: weights truncated at upper end at this weight |
weighted |
Logical: whether or not to perform weighted analysis, default is FALSE |
Value
An object of class 'SEQopts'
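A short sketch of building an options object from the arguments listed above, assuming the package is installed under the name 'SEQTaRget'. The specific values (50 bootstrap replicates, seed 2025) are arbitrary illustrations, not recommendations.

```r
# Assumption: the package is installed as 'SEQTaRget'.
if (requireNamespace("SEQTaRget", quietly = TRUE)) {
  library(SEQTaRget)

  # Override only the options of interest; everything else keeps its default.
  opts <- SEQopts(
    bootstrap = TRUE,        # turn bootstrapping on
    bootstrap.nboot = 50,    # number of bootstraps
    bootstrap.sample = 0.8,  # proportion of data per bootstrap
    km.curves = TRUE,        # request survival curves and data
    seed = 2025              # starting seed
  )

  print(class(opts))
}
```

The resulting object is intended to be passed as the options argument of SEQuential or SEQestimate.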
An S4 class of user options to feed into the SEQuential processes and estimates
This class should match SEQopts
in file SEQopts.R
Description
An S4 class of user options to feed into the SEQuential processes and estimates
This class should match SEQopts
in file SEQopts.R
An S4 class used to hold the outputs for the SEQuential process
Description
An S4 class used to hold the outputs for the SEQuential process
Slots
params
SEQparams object
outcome
outcome covariates
numerator
numerator covariates
denominator
denominator covariates
outcome.model
list of length bootstrap.nboot containing outcome coefficients
hazard
hazard ratio
survival.curve
ggplot object for the survival curves
survival.data
data.table of survival data
risk.difference
risk difference calculated from survival data
risk.ratio
risk ratio calculated from survival data
time
time in minutes used for the SEQuential process
weight.statistics
information from the weighting process, containing weight coefficients and weight statistics
info
list of outcome and switch information (if applicable)
ce.model
list of competing event models if compevent is specified, NA otherwise
An internal S4 class to carry around parameters during the SEQuential process - inherits user facing parameters from SEQopts
Description
An internal S4 class to carry around parameters during the SEQuential process - inherits user facing parameters from SEQopts
Slots
data
pre expansion data
DT
post expansion data
id
id column as defined by the user
time
time column as defined by the user
eligible
eligible column as defined by the user
treatment
treatment column as defined by the user
time_varying
list of time varying columns as defined by the user
fixed
list of fixed columns as defined by the user
method
method of analysis as defined by the user
SEQuential trial emulation
Description
'SEQuential' is an all-in-one API to SEQuential analysis, returning a SEQoutput object of results. More specific examples can be found on pages at https://causalinference.github.io/SEQuential/
Usage
SEQuential(
data,
id.col,
time.col,
eligible.col,
treatment.col,
outcome.col,
time_varying.cols = list(),
fixed.cols = list(),
method,
options,
verbose = TRUE
)
Arguments
data |
data.frame or data.table, if not already expanded with |
id.col |
String: column name of the id column |
time.col |
String: column name of the time column |
eligible.col |
String: column name of the eligibility column |
treatment.col |
String: column name of the treatment column |
outcome.col |
String: column name of the outcome column |
time_varying.cols |
List: column names for time varying columns |
fixed.cols |
List: column names for fixed columns |
method |
String: method of analysis to perform |
options |
List: optional list of parameters from |
verbose |
Logical: if TRUE, prints progress to the console |
Details
Implementation of sequential trial emulation for the analysis of observational databases. The SEQuential software accommodates time-varying treatments and confounders, as well as binary and failure time outcomes. SEQ allows comparison of both static and dynamic strategies, can be used to estimate observational analogs of intention-to-treat and per-protocol effects, and can adjust for potential selection bias induced by losses-to-follow-up.
Value
An S4 object of class SEQoutput
Examples
data <- SEQdata
model <- SEQuential(data, id.col = "ID",
                    time.col = "time",
                    eligible.col = "eligible",
                    treatment.col = "tx_init",
                    outcome.col = "outcome",
                    time_varying.cols = c("N", "L", "P"),
                    fixed.cols = "sex",
                    method = "ITT",
                    options = SEQopts())
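As a follow-up sketch, the fitted object can be inspected with the accessor functions documented elsewhere in this manual. This assumes the package is installed under the name 'SEQTaRget'; the variable name fit is arbitrary, and the exact printed output is not shown because it depends on the package version.

```r
# Assumption: the package is installed as 'SEQTaRget'.
if (requireNamespace("SEQTaRget", quietly = TRUE)) {
  library(SEQTaRget)

  fit <- SEQuential(
    SEQdata,
    id.col = "ID",
    time.col = "time",
    eligible.col = "eligible",
    treatment.col = "tx_init",
    outcome.col = "outcome",
    time_varying.cols = c("N", "L", "P"),
    fixed.cols = "sex",
    method = "ITT",
    options = SEQopts(),
    verbose = FALSE
  )

  print(fit)                # uses the show method for SEQoutput
  print(covariates(fit))    # outcome, numerator, and denominator covariates
  models <- outcome(fit)    # list of all outcome models
  print(length(models))
}
```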
An internal S4 class to help transfer weight statistics out of internal_weights
Description
An internal S4 class to help transfer weight statistics out of internal_weights
Slots
weights
a data.table containing the estimated weights, either pre or post expansion
coef.n0
numerator zero model
coef.n1
numerator one model
coef.d0
denominator zero model
coef.d1
denominator one model
coef.ncense
numerator censoring model
coef.dcense
denominator censoring model
Function to clean out non needed elements from fastglm return
Description
Function to clean out non needed elements from fastglm return
Usage
clean_fastglm(model)
Arguments
model |
a fastglm model |
Function to return competing event models from a SEQuential object
Description
Function to return competing event models from a SEQuential object
Usage
compevent(object)
Arguments
object |
SEQoutput object |
Value
list of fastglm objects
Retrieves Outcome, Numerator, and Denominator Covariates
Description
Retrieves Outcome, Numerator, and Denominator Covariates
Usage
covariates(object)
Arguments
object |
object of class SEQoutput |
Value
list of SEQuential covariates
Internal Function to create 'default' loss-to-followup formula
Description
Internal Function to create 'default' loss-to-followup formula
Usage
create.default.LTFU.covariates(params, type)
Internal Function to create 'default' formula
Description
Internal Function to create 'default' formula
Usage
create.default.covariates(params)
Internal Function to create 'default' weighting formula
Description
Internal Function to create 'default' weighting formula
Usage
create.default.weight.covariates(params, type)
Internal function to pull Risk Ratio and Risk Difference from data when km.curves = TRUE
Description
Internal function to pull Risk Ratio and Risk Difference from data when km.curves = TRUE
Usage
create.risk(data, params)
Retrieves Denominator Models from SEQuential object
Description
Retrieves Denominator Models from SEQuential object
Usage
denominator(object)
Arguments
object |
object of class SEQoutput |
Value
List of both denominator models
Function to return diagnostic tables from a SEQuential object
Description
Function to return diagnostic tables from a SEQuential object
Usage
diagnostics(object)
Arguments
object |
SEQoutput object |
Value
list of diagnostic tables
Nicely cleans time for readability
Description
Nicely cleans time for readability
Usage
## S3 method for class 'time'
format(seconds)
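The idea of turning a raw number of seconds into a readable string can be sketched in base R as follows. The helper format_seconds and its output wording are hypothetical; the package's own method may format time differently.

```r
# Hypothetical helper, not the package's internal method.
format_seconds <- function(seconds) {
  total   <- as.integer(round(seconds))
  hours   <- total %/% 3600L
  minutes <- (total %% 3600L) %/% 60L
  secs    <- total %% 60L
  if (hours > 0L) {
    sprintf("%d hr %d min %d sec", hours, minutes, secs)
  } else if (minutes > 0L) {
    sprintf("%d min %d sec", minutes, secs)
  } else {
    sprintf("%d sec", secs)
  }
}

format_seconds(3725)  # "1 hr 2 min 5 sec"
format_seconds(125)   # "2 min 5 sec"
format_seconds(59)    # "59 sec"
```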
Function to return hazard ratios from a SEQuential object
Description
Function to return hazard ratios from a SEQuential object
Usage
hazard_ratio(object)
Arguments
object |
SEQoutput object |
Value
list of hazard ratios
Helper Function to inline predict a fastglm object
Description
Helper Function to inline predict a fastglm object
Usage
inline.pred(
model,
newdata,
params,
type,
case = "default",
multi = FALSE,
target = NULL
)
Arguments
model |
a fastglm object |
newdata |
filler for a .SD from data.table |
params |
parameter from SEQuential |
type |
type of prediction |
Internal analysis tool for handling parallelization/bootstrapping on multiple OS types
Description
Internal analysis tool for handling parallelization/bootstrapping on multiple OS types
Usage
internal.analysis(params)
Generic function to format a dataset for hazard ratio calculation
Description
Generic function to format a dataset for hazard ratio calculation
Usage
internal.hazard(model, params)
Internal function for fitting outcome models
Description
Internal function for fitting outcome models
Usage
internal.model(data, params)
Plotting for survival curves
Description
Plotting for survival curves
Usage
internal.plot(survival.data, params)
Arguments
survival.data |
Dataframe containing survival information |
params |
Params passed around SEQuential |
Internal function for creating survival curves
Description
Internal function for creating survival curves
Usage
internal.survival(params, outcome)
Internal function for defining weights
Description
Internal function for defining weights
Usage
internal.weights(DT, data, params)
Arguments
DT |
data.table after expansion |
data |
data.table for data before expansion |
params |
object of class SEQparams (defined in SEQuential) |
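The weight truncation options in SEQopts (weight.lower, weight.upper, weight.p99) can be pictured with a small base-R sketch. It runs on simulated weights, not on package output; the helper truncate_weights is hypothetical, and treating truncation as clamping values to the bounds is an assumption about how the option behaves.

```r
set.seed(1)
w <- rlnorm(1000, meanlog = 0, sdlog = 1)  # toy weights, not package output

# Hypothetical helper: clamp weights to [lower, upper]. With p99 = TRUE the
# supplied bounds are replaced by the 1st and 99th percentiles of the weights.
truncate_weights <- function(w, lower = -Inf, upper = Inf, p99 = FALSE) {
  if (p99) {
    bounds <- quantile(w, probs = c(0.01, 0.99), names = FALSE)
    lower <- bounds[1]
    upper <- bounds[2]
  }
  pmin(pmax(w, lower), upper)
}

w_p99    <- truncate_weights(w, p99 = TRUE)
w_manual <- truncate_weights(w, lower = 0.1, upper = 10)

print(range(w))         # untruncated range
print(range(w_p99))     # bounded by the 1st and 99th percentiles
print(range(w_manual))  # bounded by the supplied limits
```

Clamping keeps every observation in the analysis while limiting the influence of extreme weights, which is the usual motivation for truncation.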
Function to print Kaplan-Meier curves
Description
Function to print Kaplan-Meier curves
Usage
km_curve(
object,
plot.type = "survival",
plot.title,
plot.subtitle,
plot.labels,
plot.colors
)
Arguments
object |
SEQoutput object to plot |
plot.type |
character: type of plot to print |
plot.title |
character: defines the title of the plot |
plot.subtitle |
character: plot subtitle |
plot.labels |
length 2 character: plot labels |
plot.colors |
length 2 character: plot colors |
Value
ggplot object of the type given by plot.type
Function to return survival data from a SEQuential object
Description
Function to return survival data from a SEQuential object
Usage
km_data(object)
Arguments
object |
SEQoutput object |
Value
list of dataframes of survival values
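A sketch of retrieving survival and risk information, assuming the package is installed under the name 'SEQTaRget' and that an intention-to-treat fit with km.curves = TRUE on the bundled SEQdata runs with otherwise default options. The variable names km_fit, surv, and risk are arbitrary.

```r
# Assumption: the package is installed as 'SEQTaRget'.
if (requireNamespace("SEQTaRget", quietly = TRUE)) {
  library(SEQTaRget)

  km_fit <- SEQuential(
    SEQdata,
    id.col = "ID",
    time.col = "time",
    eligible.col = "eligible",
    treatment.col = "tx_init",
    outcome.col = "outcome",
    time_varying.cols = c("N", "L", "P"),
    fixed.cols = "sex",
    method = "ITT",
    options = SEQopts(km.curves = TRUE),  # request survival curves and data
    verbose = FALSE
  )

  surv <- km_data(km_fit)           # survival values
  risk <- risk_comparison(km_fit)   # risk ratios and differences at end of followup
  print(risk)
}
```

Confidence intervals in the risk table are described as available only when bootstrapping is enabled, so this unbootstrapped sketch should not be expected to include them.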
Helper function for nested logistic
Description
Helper function for nested logistic
Usage
multinomial(X, y, family = quasibinomial(), method)
Helper to predict from the nested logistic
Description
Helper to predict from the nested logistic
Usage
multinomial.predict(model, X, target = NULL)
Helper function to get the summary table from multinomial
Description
Helper function to get the summary table from multinomial
Usage
multinomial.summary(model)
Retrieves Numerator Models from SEQuential object
Description
Retrieves Numerator Models from SEQuential object
Usage
numerator(object)
Arguments
object |
object of class SEQoutput |
Value
List of both numerator models
Retrieves Outcome Models from SEQuential object
Description
Retrieves Outcome Models from SEQuential object
Usage
outcome(object)
Arguments
object |
object of class SEQoutput |
Value
List of all outcome models
Parameter Helper
Description
Parameter Helper
Usage
parameter.setter(
data,
DT,
id.col,
time.col,
eligible.col,
outcome.col,
treatment.col,
time_varying.cols,
fixed.cols,
method,
opts,
verbose
)
Simplifies parameters down for later use
Description
Simplifies parameters down for later use
Usage
parameter.simplifier(params)
Helper function to prepare data for fastglm
Description
Helper function to prepare data for fastglm
Usage
prepare.data(weight, params, type, model, case)
Arguments
weight |
data after undergoing preparation |
params |
parameter from SEQuential |
type |
type of model, e.g. d0 = "denominator" |
model |
model number, e.g. d0 = "zero model" |
Output constructor
Description
Output constructor
Usage
prepare.output(
params,
WDT,
outcome,
weights,
hazard,
survival.plot,
survival.data,
survival.ce,
risk,
runtime,
info
)
Function to return risk information from a SEQuential object
Description
Function to return risk information from a SEQuential object
Usage
risk_comparison(object)
Arguments
object |
SEQoutput object |
Value
a data frame of risk information at end of followup (risk ratios, risk differences and confidence intervals, if bootstrapped)
Function to return risk information from a SEQuential object
Description
Function to return risk information from a SEQuential object
Usage
risk_data(object)
Arguments
object |
SEQoutput object |
Value
a data table of risk information at every followup
Show method for S4 object - SEQoutput.
Description
Show method for S4 object - SEQoutput.
Usage
## S4 method for signature 'SEQoutput'
show(object)
Arguments
object |
A SEQoutput object - usually generated from |
Value
No return value, sends information about SEQoutput to the console