The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
Nonparametric analysis of clustered multistate process data.
clusteredMSM provides population-averaged transition
probability estimates, pointwise confidence intervals, simultaneous
confidence bands, and two-sample Kolmogorov-Smirnov-type tests for
multistate process data with cluster-correlated observations. Estimation
follows Bakoyannis
(2021); two-sample inference for the cluster-randomized and
independent-samples designs follows Bakoyannis &
Bandyopadhyay (2022). Both rest on the working-independence
Aalen-Johansen estimator with a cluster-bootstrap variance.
Unlike its predecessor (the clustered-multistate
repository, which relied on the mstate package),
clusteredMSM is self-contained (depending only on
survival) and supports non-monotone multistate
processes, including illness-death with recovery and other
models with cyclic transitions.
# install.packages("devtools")
devtools::install_github("gbakoyannis/clusteredMSM")After CRAN release:
install.packages("clusteredMSM")clusteredMSM exposes one main function,
patp(), modelled after survival::Surv():
library(clusteredMSM)
# Synthetic clustered illness-death-with-recovery data (40 subjects,
# 8 clusters); see ?example_msm.
data(example_msm)
# Define the transition structure (illness-death with recovery)
tmat <- trans_mat(list(c(2, 3), c(1, 3), integer(0)),
names = c("Healthy", "Ill", "Dead"))
# One-sample analysis: P(Ill at t | Healthy at 0)
fit <- patp(msm(Tstart, Tstop, Sstart, Sstop) ~ 1,
data = example_msm, tmat = tmat,
id = "id", cluster = "cluster",
h = 1, j = 2, s = 0,
B = 1000, cband = TRUE)
fitIf the formula’s right-hand side has a grouping variable,
patp() automatically estimates both group-specific curves
AND tests their equality:
# Two-sample analysis (estimate + test in one call)
tt <- patp(msm(Tstart, Tstop, Sstart, Sstop) ~ treatment,
data = example_msm, tmat = tmat,
id = "id", cluster = "cluster",
h = 1, j = 2, B = 1000)
ttThe same example is shipped as a CSV under
inst/extdata/, so you can mimic the typical workflow of
reading a user-supplied file:
f <- system.file("extdata", "example_data.csv", package = "clusteredMSM")
mydata <- read.csv(f)
head(mydata)Each row of your data represents one mutually-exclusive time interval for one subject, with columns:
| Column | Description |
|---|---|
Tstart |
Numeric start time of the interval |
Tstop |
Numeric end time of the interval |
Sstart |
Integer state occupied during the interval |
Sstop |
Integer state at Tstop (or equal to Sstart
if censored) |
id |
Subject identifier |
cluster |
(optional) cluster identifier |
| (group) | (optional) binary grouping variable |
The column names are arbitrary — msm(...) and the
id/cluster arguments tell the package which is
which.
Censoring is encoded as Sstart == Sstop
on the final row of a subject’s record. Subjects in
absorbing states have no row after them.
Within each subject, intervals must be: - Temporally
contiguous: Tstop[k] == Tstart[k+1] -
State contiguous:
Sstop[k] == Sstart[k+1]
Validation is strict and informative — any violation triggers an error with a clear message.
Progressive illness-death (subject who got ill, then died):
| id | Tstart | Tstop | Sstart | Sstop |
|---|---|---|---|---|
| 1 | 0.0 | 1.5 | 1 | 2 |
| 1 | 1.5 | 3.0 | 2 | 3 |
Subject censored healthy:
| id | Tstart | Tstop | Sstart | Sstop |
|---|---|---|---|---|
| 2 | 0.0 | 4.0 | 1 | 1 |
Recovery (Healthy → Ill → Healthy → censored):
| id | Tstart | Tstop | Sstart | Sstop |
|---|---|---|---|---|
| 3 | 0.0 | 1.0 | 1 | 2 |
| 3 | 1.0 | 2.0 | 2 | 1 |
| 3 | 2.0 | 3.5 | 1 | 1 |
| Function | Purpose |
|---|---|
patp() |
The main user-facing function — formula-based estimation and testing. |
msm() |
Constructor for multistate intervals; used inside the formula. |
trans_mat() |
Build a K x K transition matrix. |
validate_intervals() |
Validate user data (called automatically by patp();
usable directly). |
Bakoyannis, G. (2021). Nonparametric analysis of nonhomogeneous multistate processes with clustered observations. Biometrics, 77(2), 533-546. doi:10.1111/biom.13327
Bakoyannis, G., & Bandyopadhyay, D. (2022). Nonparametric tests for multistate processes with clustered data. Annals of the Institute of Statistical Mathematics, 74(5), 837-867. doi:10.1007/s10463-021-00819-x
You can retrieve the BibTeX entries within R via
toBibtex(citation("clusteredMSM")).
GPL-3
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.