RealSurvSim is an R package that provides a variety of methods for simulating survival (time-to-event) datasets. It is particularly useful for survival analysis applications in research and simulation studies. The package includes non-parametric (kernel density estimation), parametric, and bootstrap-based simulation approaches for generating realistic time-to-event data.
Conditional bootstrap (cond): Splits event and censoring times, then resamples to preserve the observed event/censoring ratio.
Case resampling (case): Simple random resampling of entire observations with replacement.

If you have downloaded or cloned this repository:
# Install devtools if you don't already have it
install.packages("devtools")
# Then, from the root of the package directory:
devtools::install()

This package uses several R libraries for density estimation, distribution fitting, and survival analysis. They are installed automatically (if not already present) when you install RealSurvSim.
Below is an overview of the core functions and some example usages. For detailed information on parameters and return values, refer to the function documentation.
data_simul_KDE(orig_vals, n = NULL, kernel = "gaussian")
orig_vals: Numeric vector of original data values.
n: Number of observations to simulate (defaults to the length of orig_vals).
kernel: The kernel to use for KDE (currently supports "gaussian").

data_simul_Estim(orig_vals, n = NULL, distrib = "exp")
Fits a distribution to orig_vals and draws new samples from the fitted distribution.
distrib: Distribution to fit; supported values include "inverse_gamma", "gompertz", "llogis", "gumbel", "myMix", "exp".

data_simul_bootstr(dat, n = NULL, type = "cond")
dat: Dataframe containing at least V1 (time) and V2 (censor indicator, 0/1).
n: Number of observations to sample. Defaults to the same size as dat.
type: "cond" for conditional bootstrap or "case" for case-resampling.

RealSurvSim(dat, col_time, col_status, col_group, reps = 10000, random_seed = 123, n = NULL, simul_type, distribs = c("exp", "exp", "exp", "exp"))
Supported values of simul_type:
"cond": Conditional bootstrap
"case": Case resampling
"distr": Parametric distribution-based simulation
"KDE": Kernel density estimation-based simulation
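To make the two model-based options concrete, here is a minimal base-R sketch of the ideas behind "distr" and "KDE". It is not the package's implementation. It assumes an exponential model fitted by maximum likelihood (rate = 1 / mean) and a Gaussian kernel with the bandwidth returned by bw.nrd0(). The helper names sketch_distr_exp() and sketch_kde_gauss() are hypothetical, and censoring is ignored.

```r
# Hypothetical helpers; illustrative only, not part of RealSurvSim.

# "distr" idea: fit a parametric model, then sample from the fitted model.
# Here: exponential distribution, whose MLE of the rate is 1 / mean(x).
sketch_distr_exp <- function(orig_vals, n = length(orig_vals)) {
  rate_hat <- 1 / mean(orig_vals)
  rexp(n, rate = rate_hat)
}

# "KDE" idea: sample from a Gaussian kernel density estimate.
# Drawing from a Gaussian KDE is equivalent to resampling the data with
# replacement and adding N(0, bw^2) noise (a smoothed bootstrap).
sketch_kde_gauss <- function(orig_vals, n = length(orig_vals)) {
  bw <- bw.nrd0(orig_vals)                       # default rule-of-thumb bandwidth
  centers <- sample(orig_vals, n, replace = TRUE)
  rnorm(n, mean = centers, sd = bw)              # can fall below zero near the boundary
}

set.seed(123)
times <- rexp(200, rate = 0.1)                   # toy "observed" times
draws_exp <- sketch_distr_exp(times, n = 50)
draws_kde <- sketch_kde_gauss(times)
```

Because a Gaussian kernel has unbounded support, the smoothed draws can be negative for times close to zero. A real implementation would need to handle that, for example by truncation or by smoothing on the log scale.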
Parameters:
dat: Original (or reconstructed) dataset with time, status, and group columns.
col_time: Column name/index for time.
col_status: Column name/index for censoring indicator (1=event, 0=censored).
col_group: Column name/index for treatment/group identifier.
reps: Number of datasets to simulate (default 10,000).
random_seed: Random seed (default 123) for reproducibility.
n: Vector specifying sample sizes per group (optional).
simul_type: Single string specifying the simulation method ("cond", "case", "distr", "KDE").
distribs: Which distributions to use if simul_type = "distr".

Returns:
A list containing multiple simulated datasets (one for each repetition). Each dataset is a data.frame with columns V1 (time), V2 (status), and V3 (group).
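As a rough illustration of the structure described above, this base-R sketch builds a list of reps resampled datasets with columns V1, V2, and V3 using the two bootstrap variants. It is a sketch under stated assumptions rather than the package's code: all function names are hypothetical, resampling is done separately within each group, and group sizes are kept at their observed values.

```r
# Hypothetical helpers; illustrative only, not part of RealSurvSim.

# Case resampling: draw whole rows with replacement.
boot_case <- function(d, n = nrow(d)) {
  d[sample(nrow(d), n, replace = TRUE), , drop = FALSE]
}

# Conditional bootstrap: resample event rows and censored rows separately,
# keeping the observed number of each.
boot_cond <- function(d) {
  ev <- which(d$V2 == 1)
  ce <- which(d$V2 == 0)
  idx <- c(ev[sample.int(length(ev), length(ev), replace = TRUE)],
           ce[sample.int(length(ce), length(ce), replace = TRUE)])
  d[idx, , drop = FALSE]
}

# Build a list with one resampled data.frame per repetition.
simulate_list <- function(dat, reps = 5, type = c("cond", "case"), seed = 123) {
  type <- match.arg(type)
  set.seed(seed)
  groups <- split(dat, dat$V3)
  lapply(seq_len(reps), function(i) {
    parts <- lapply(groups, function(g) {
      if (type == "cond") boot_cond(g) else boot_case(g)
    })
    out <- do.call(rbind, parts)
    rownames(out) <- NULL
    out
  })
}

# Toy data: 60 rows, two groups of 30, two thirds events in each group.
set.seed(1)
toy <- data.frame(V1 = rexp(60, rate = 0.1),
                  V2 = rep(c(1, 1, 0), 20),
                  V3 = rep(0:1, each = 30))

sims <- simulate_list(toy, reps = 3, type = "cond")
event_share <- vapply(sims, function(d) mean(d$V2), numeric(1))
```

With type = "cond" every simulated dataset keeps the observed event share within each group, whereas type = "case" lets it vary from one repetition to the next.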
Below are brief examples demonstrating how to simulate data. In
practice, replace the placeholders (example_data,
"time", etc.) with your actual dataset and column
names.
library(RealSurvSim)

# Example dataset construction (for demonstration):
set.seed(123)
example_data <- data.frame(
  time = rexp(100, rate = 0.1),               # Times
  status = sample(0:1, 100, replace = TRUE),  # 0=censored, 1=event
  group = sample(0:1, 100, replace = TRUE)    # Two groups, 0 or 1
)

# 1. Kernel Density Estimation Simulation
sim_kde <- RealSurvSim(
  dat = example_data,
  col_time = "time",
  col_status = "status",
  col_group = "group",
  reps = 5,            # Simulate 5 datasets
  simul_type = "KDE"   # Use KDE-based simulation
)
str(sim_kde$datasets)  # Check the structure of generated datasets

# 2. Parametric Distribution Simulation
sim_distr <- RealSurvSim(
  dat = example_data,
  col_time = "time",
  col_status = "status",
  col_group = "group",
  reps = 5,
  simul_type = "distr",
  distribs = c("exp", "exp", "exp", "exp")
)
str(sim_distr$datasets)

# 3. Conditional Bootstrap
sim_cond <- RealSurvSim(
  dat = example_data,
  col_time = "time",
  col_status = "status",
  col_group = "group",
  reps = 5,
  simul_type = "cond"
)
str(sim_cond$datasets)

# 4. Case Resampling
sim_case <- RealSurvSim(
  dat = example_data,
  col_time = "time",
  col_status = "status",
  col_group = "group",
  reps = 5,
  simul_type = "case"
)
str(sim_case$datasets)
data(liang)
data(wu)
# 5. liang_kde <- RealSurvSim(liang, liang$V1, liang$V2, liang$V3, reps = 3, simul_type = "KDE")
# For arbitrary n
# 6. arbliang_distr <- RealSurvSim(liang, liang$V1, liang$V2, liang$V3, reps = 10, n = c(40, 50), simul_type = "distr", distribs = c("exp", "llogis", "llogis", "exp"))
# 7. arbwu_case <- RealSurvSim(wu, wu$V1, wu$V2, wu$V3, reps = 100, n = c(40, 50), simul_type = "case")

Underlying Paper for the Package
Analysis and Methods for Survival Data (arXiv:2308.07842)
Data Reconstruction Algorithm
Guyot et al. (2012), describing the algorithm for reconstructing survival data from published Kaplan-Meier curves.
WebPlotDigitizer
WebPlotDigitizer for extracting data points from Kaplan-Meier curves.