
RealSurvSim

RealSurvSim is an R package that provides several methods for simulating survival (time-to-event) datasets, intended for survival analysis research and simulation studies. The package includes non-parametric (kernel density estimation), parametric, and bootstrap-based approaches for generating realistic time-to-event data.

Features


Installation

1. From Source

If you have downloaded or cloned this repository:

# Install devtools if you don't already have it
install.packages("devtools")

# Then, from the root of the package directory:
devtools::install()
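
After installation, a quick check like the sketch below can confirm that R can find the package. It uses only base R functions (requireNamespace and utils::packageVersion). The messages are illustrative wording, not output produced by RealSurvSim itself.

```r
# Check whether the package can be found, without attaching it.
pkg <- "RealSurvSim"
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  message(pkg, " version ", as.character(utils::packageVersion(pkg)), " is installed.")
} else {
  message(pkg, " was not found; re-run the installation step.")
}
```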

Dependencies

This package uses several R libraries for density estimation, distribution fitting, and survival analysis. They will be automatically installed (if not already present) when installing RealSurvSim. Key dependencies include:


Usage

Below is an overview of the core functions, followed by usage examples. For details on parameters and return values, refer to each function's documentation.

Core Functions

  1. data_simul_KDE(orig_vals, n = NULL, kernel = "gaussian")
    Simulates data via kernel density estimation from a numeric vector of original values.
  2. data_simul_Estim(orig_vals, n = NULL, distrib = "exp")
    Fits a specified parametric distribution to orig_vals and draws new samples from the fitted distribution.
  3. data_simul_bootstr(dat, n = NULL, type = "cond")
    Bootstrap-based simulation of event and censoring times.
  4. RealSurvSim(dat, col_time, col_status, col_group, reps = 10000, random_seed = 123, n = NULL, simul_type, distribs = c("exp", "exp", "exp", "exp"))
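    The main wrapper function for simulating multiple survival datasets using one of four approaches, selected via simul_type: "KDE" (kernel density estimation), "distr" (parametric distribution), "cond" (conditional bootstrap), or "case" (case resampling).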
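
To make the ideas behind the first two functions concrete, here is a minimal base-R sketch of drawing from a Gaussian kernel density estimate and from a fitted exponential distribution. The helpers sim_kde_sketch() and sim_exp_sketch() are hypothetical names introduced for illustration. They are not part of RealSurvSim, and the package's own implementation may differ, for example in bandwidth choice, in how censoring is handled, or in how values are kept positive.

```r
# Illustrative sketch only: sim_kde_sketch() and sim_exp_sketch() are
# hypothetical helpers, not functions exported by RealSurvSim.

# Draw n values from a Gaussian kernel density estimate of x.
# Sampling from a Gaussian KDE is equivalent to resampling the observed
# values with replacement and adding N(0, bw^2) noise to each draw.
sim_kde_sketch <- function(x, n = length(x)) {
  bw <- stats::bw.nrd0(x)                        # default bandwidth rule of density()
  centers <- x[sample.int(length(x), n, replace = TRUE)]
  centers + stats::rnorm(n, mean = 0, sd = bw)
}

# Fit an exponential distribution by maximum likelihood (rate = 1 / mean,
# valid for uncensored values) and draw n values from the fitted distribution.
sim_exp_sketch <- function(x, n = length(x)) {
  rate_hat <- 1 / mean(x)
  stats::rexp(n, rate = rate_hat)
}

set.seed(1)
orig_vals <- rexp(200, rate = 0.1)               # stand-in for observed times
kde_draws <- sim_kde_sketch(orig_vals, n = 50)
exp_draws <- sim_exp_sketch(orig_vals, n = 50)

summary(kde_draws)
summary(exp_draws)
```

Note that a Gaussian kernel can place some mass below zero, so draws near the origin may come out negative. This sketch does not correct for that.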

Examples

Below are brief examples demonstrating how to simulate data. In practice, replace the placeholders (example_data, "time", etc.) with your actual dataset and column names.

library(RealSurvSim)

# Example dataset construction (for demonstration):
set.seed(123)
example_data <- data.frame(
  time = rexp(100, rate = 0.1),            # Times
  status = sample(0:1, 100, replace = TRUE), # 0=censored, 1=event
  group = sample(0:1, 100, replace = TRUE)   # Two groups, 0 or 1
)

# 1. Kernel Density Estimation Simulation
sim_kde <- RealSurvSim(
  dat = example_data,
  col_time   = "time",
  col_status = "status",
  col_group  = "group",
  reps       = 5,            # Simulate 5 datasets
  simul_type = "KDE"         # Use KDE-based simulation
)
str(sim_kde$datasets)  # Check the structure of generated datasets

# 2. Parametric Distribution Simulation
sim_distr <- RealSurvSim(
  dat = example_data,
  col_time   = "time",
  col_status = "status",
  col_group  = "group",
  reps       = 5,
  simul_type = "distr",
  distribs   = c("exp", "exp", "exp", "exp")
)
str(sim_distr$datasets)

# 3. Conditional Bootstrap
sim_cond <- RealSurvSim(
  dat = example_data,
  col_time   = "time",
  col_status = "status",
  col_group  = "group",
  reps       = 5,
  simul_type = "cond"
)
str(sim_cond$datasets)

# 4. Case Resampling
sim_case <- RealSurvSim(
  dat = example_data,
  col_time   = "time",
  col_status = "status",
  col_group  = "group",
  reps       = 5,
  simul_type = "case"
)
str(sim_case$datasets)

data(liang)
data(wu)
# 5. liang_kde <- RealSurvSim(liang, liang$V1, liang$V2, liang$V3, reps = 3, simul_type = "KDE")

# For arbitrary n
# 6. arbliang_distr <- RealSurvSim(liang, liang$V1, liang$V2, liang$V3, reps = 10, n = c(40, 50), simul_type = "distr", distribs = c("exp", "llogis", "llogis", "exp"))

# 7. arbwu_case <- RealSurvSim(wu, wu$V1, wu$V2, wu$V3, reps = 100, n = c(40, 50), simul_type = "case")
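
For intuition about the "case" option used above, case resampling can be sketched in base R as drawing whole rows with replacement, so that each time stays paired with its status and group. The helper case_resample_sketch() is a hypothetical name, not a RealSurvSim function. Resampling separately within each group is an assumption of this sketch, not a documented property of the package.

```r
# Hypothetical helper for illustration; not part of RealSurvSim.
# Resample whole rows (time, status, and group together) with replacement,
# separately within each group so that group sizes are preserved.
case_resample_sketch <- function(dat, col_group) {
  idx_by_group <- split(seq_len(nrow(dat)), dat[[col_group]])
  picked <- unlist(lapply(idx_by_group, function(idx) {
    # Index via sample.int() so a single-row group is handled correctly.
    idx[sample.int(length(idx), length(idx), replace = TRUE)]
  }), use.names = FALSE)
  out <- dat[picked, , drop = FALSE]
  rownames(out) <- NULL
  out
}

set.seed(123)
example_data <- data.frame(
  time = rexp(100, rate = 0.1),              # Times
  status = sample(0:1, 100, replace = TRUE), # 0=censored, 1=event
  group = sample(0:1, 100, replace = TRUE)   # Two groups, 0 or 1
)

boot_data <- case_resample_sketch(example_data, "group")
table(boot_data$group)   # same group sizes as in example_data
```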

References and Further Reading

Underlying Paper for the Package
Analysis and Methods for Survival Data (arXiv:2308.07842)

Data Reconstruction Algorithm
Guyot et al. (2012), describing the algorithm for reconstructing survival data from published Kaplan-Meier curves.

WebPlotDigitizer
WebPlotDigitizer for extracting data points from Kaplan-Meier curves.
