Clinical programmers working in R often face a common challenge when
migrating from SAS: in SAS, a single IF ... THEN DO block
can assign multiple variables at once under one
condition. In R, traditional approaches like case_when() or
fifelse() force you to repeat the same condition
for every variable — increasing QC risk and reducing
readability.
sasif solves this by bringing SAS-style
IF / ELSE IF / ELSE control flow into R’s
data.table ecosystem. One condition governs all assignments
in a block — just like SAS.
This vignette walks through three real-world ADaM derivation scenarios, covering the ADSL, ADLB, and ADAE datasets.
In a typical ADSL derivation, when a subject is in the treatment arm, multiple variables need to be assigned simultaneously — population flags, treatment labels, numeric codes, and treatment dates.
In traditional R, every variable requires its own repeated condition:
# ❌ Traditional R: condition repeated for every variable
library(dplyr)  # provides %>%, mutate(), and case_when()
adsl <- adsl %>% mutate(
SAFFL = case_when(ACTARMCD == "TRTA" ~ "Y"),
SAFFLN = case_when(ACTARMCD == "TRTA" ~ 1),
TRT01A = case_when(ACTARMCD == "TRTA" ~ ACTARMCD),
TRT01AN = case_when(ACTARMCD == "TRTA" ~ 1),
ITTFL = case_when(ACTARMCD == "TRTA" ~ "Y"),
FASFL = case_when(ACTARMCD == "TRTA" ~ "Y"),
RANDFL = case_when(ACTARMCD == "TRTA" ~ "Y"),
PPFL = case_when(ACTARMCD == "TRTA" ~ "Y")
# Same condition written 8 times — high QC risk
)
If the condition ever changes, you must update it in 8 places. Miss one and your derivation silently diverges, a real risk in regulated environments.
library(data.table)
library(sasif)
# Create sample ADSL data
adsl <- data.table(
USUBJID = c("S01", "S02", "S03", "S04"),
ACTARMCD = c("TRTA", "TRTA", "SCRNFAIL", "TRTA"),
RFSTDTC = c("2024-01-10", "2024-01-15", NA, "2024-01-20"),
RFENDTC = c("2024-06-10", "2024-06-15", NA, "2024-06-20")
)
# ✅ sasif — condition written ONCE, governs all assignments
ADSL <- data_step(adsl,
if_do(ACTARMCD == "TRTA",
SAFFL = "Y",
SAFFLN = 1,
TRT01A = "Treatment A",
TRT01AN = 1,
TRTSDT = as.Date(RFSTDTC, "%Y-%m-%d"),
TRTEDT = as.Date(RFENDTC, "%Y-%m-%d"),
ITTFL = "Y",
FASFL = "Y",
RANDFL = "Y",
PPFL = "Y"
)
)
print(ADSL[, .(USUBJID, ACTARMCD, SAFFL, TRT01A, TRT01AN, ITTFL, FASFL)])
#> USUBJID ACTARMCD SAFFL TRT01A TRT01AN ITTFL FASFL
#> <char> <char> <char> <char> <num> <char> <char>
#> 1: S01 TRTA Y Treatment A 1 Y Y
#> 2: S02 TRTA Y Treatment A 1 Y Y
#> 3: S03 SCRNFAIL <NA> <NA> NA <NA> <NA>
#> 4: S04 TRTA Y Treatment A 1 Y Y
All 10 variables are derived from a single condition block. Clean, readable, and audit-friendly, exactly like SAS IF ... THEN DO.
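One way to QC a block like this is to confirm that every derived variable is populated on exactly the rows where the condition holds. The following is a minimal base R sketch and not part of sasif; the data frame `chk` and the helper `populated_where()` are hypothetical names, and the values are copied from the printed output above.

```r
# Hypothetical QC sketch in base R; values copied from the printed output above.
chk <- data.frame(
  USUBJID  = c("S01", "S02", "S03", "S04"),
  ACTARMCD = c("TRTA", "TRTA", "SCRNFAIL", "TRTA"),
  SAFFL    = c("Y", "Y", NA, "Y"),
  TRT01A   = c("Treatment A", "Treatment A", NA, "Treatment A"),
  TRT01AN  = c(1, 1, NA, 1),
  stringsAsFactors = FALSE
)

# TRUE when `var` is non-missing on exactly the rows where `cond` is TRUE.
populated_where <- function(data, var, cond) {
  identical(!is.na(data[[var]]), cond)
}

cond    <- chk$ACTARMCD == "TRTA"
derived <- c("SAFFL", "TRT01A", "TRT01AN")

# One logical result per derived variable, named after the variable.
qc <- vapply(derived, function(v) populated_where(chk, v, cond), logical(1))
print(qc)
```

A check of this kind only makes sense when the derived variables are assigned constants or other non-missing values; a date converted from a missing source, for example, would legitimately be missing on a row where the condition holds.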
When a study has multiple treatment arms, use the full IF / ELSE IF / ELSE chain. The first matching condition wins — all others are skipped:
adsl2 <- data.table(
USUBJID = c("S01", "S02", "S03", "S04", "S05"),
ACTARMCD = c("TRTA", "TRTB", "TRTC", "TRTA", "TRTB"),
AGE = c(35, 52, 67, 44, 58)
)
ADSL2 <- data_step(adsl2,
if_do(ACTARMCD == "TRTA",
TRT01A = "Treatment A",
TRT01AN = 1
),
else_if_do(ACTARMCD == "TRTB",
TRT01A = "Treatment B",
TRT01AN = 2
),
else_do(
TRT01A = "Placebo",
TRT01AN = 99
)
)
print(ADSL2[, .(USUBJID, ACTARMCD, TRT01A, TRT01AN)])
#> USUBJID ACTARMCD TRT01A TRT01AN
#> <char> <char> <char> <num>
#> 1: S01 TRTA Treatment A 1
#> 2: S02 TRTB Treatment B 2
#> 3: S03 TRTC Placebo 99
#> 4: S04 TRTA Treatment A 1
#> 5: S05 TRTB Treatment B 2
Notice that both TRT01A (character label) and TRT01AN (numeric code) are derived together under each condition, with no repetition needed.
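The first-match-wins rule can be pictured with plain vectors. The sketch below is a base R illustration of the idea, not sasif's implementation; the names `conds`, `unclaimed`, and `hit` are hypothetical, and treating a missing condition as "no match" is an assumption made for this sketch.

```r
# Base R sketch of first-match-wins assignment; not sasif's implementation.
arm <- c("TRTA", "TRTB", "TRTC", "TRTA", "TRTB")

# Ordered conditions and the values each one would assign.
conds  <- list(arm == "TRTA", arm == "TRTB")
labels <- c("Treatment A", "Treatment B")
codes  <- c(1, 2)

# Start from the ELSE values, then walk the chain in order.
trt01a    <- rep("Placebo", length(arm))
trt01an   <- rep(99, length(arm))
unclaimed <- rep(TRUE, length(arm))   # rows not yet matched by any condition

for (i in seq_along(conds)) {
  # Assumption: an NA condition counts as "no match".
  hit <- unclaimed & !is.na(conds[[i]]) & conds[[i]]
  trt01a[hit]  <- labels[i]
  trt01an[hit] <- codes[i]
  unclaimed <- unclaimed & !hit       # matched rows skip later conditions
}

print(data.frame(ACTARMCD = arm, TRT01A = trt01a, TRT01AN = trt01an))
```

Because a row leaves `unclaimed` as soon as it matches, reordering the conditions can change the result whenever two conditions overlap.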
Derive both the age category label and its numeric code in one chain:
adsl3 <- data.table(
USUBJID = c("S01", "S02", "S03", "S04", "S05"),
AGE = c(32, 45, 58, 71, 80)
)
ADSL3 <- data_step(adsl3,
if_do(AGE <= 45,
AGECAT = "YOUNG",
AGECATN = 1
),
else_if_do(AGE <= 70,
AGECAT = "MIDDLE",
AGECATN = 2
),
else_do(
AGECAT = "OLD",
AGECATN = 3
)
)
print(ADSL3[, .(USUBJID, AGE, AGECAT, AGECATN)])
#> USUBJID AGE AGECAT AGECATN
#> <char> <num> <char> <num>
#> 1: S01 32 YOUNG 1
#> 2: S02 45 YOUNG 1
#> 3: S03 58 MIDDLE 2
#> 4: S04 71 OLD 3
#> 5: S05 80 OLD 3
A common ADaM derivation is to categorise lab values as LOW, NORMAL, or HIGH based on reference ranges, deriving both the character and numeric category together:
adlb <- data.table(
USUBJID = c("S01", "S01", "S02", "S02", "S03"),
LBTESTCD = c("ALB", "ALB", "ALB", "ALB", "ALB"),
AVAL = c(2.8, 4.2, 5.6, 3.5, 1.9),
ANRLO = c(3.5, 3.5, 3.5, 3.5, 3.5),
ANRHI = c(5.0, 5.0, 5.0, 5.0, 5.0)
)
ADLB <- data_step(adlb,
if_do(LBTESTCD == "ALB" & AVAL < ANRLO,
ALBCAT = "LOW",
ALBCATN = 1
),
else_if_do(LBTESTCD == "ALB" & AVAL > ANRHI,
ALBCAT = "HIGH",
ALBCATN = 2
),
else_do(
ALBCAT = "NORMAL",
ALBCATN = 3
)
)
print(ADLB[, .(USUBJID, LBTESTCD, AVAL, ANRLO, ANRHI, ALBCAT, ALBCATN)])
#> USUBJID LBTESTCD AVAL ANRLO ANRHI ALBCAT ALBCATN
#> <char> <char> <num> <num> <num> <char> <num>
#> 1: S01 ALB 2.8 3.5 5 LOW 1
#> 2: S01 ALB 4.2 3.5 5 NORMAL 3
#> 3: S02 ALB 5.6 3.5 5 HIGH 2
#> 4: S02 ALB 3.5 3.5 5 NORMAL 3
#> 5: S03 ALB 1.9 3.5 5 LOW 1
Both ALBCAT and ALBCATN are always consistent: they are derived from the same condition, so they can never diverge.
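The sample data has no missing AVAL values. In R, a comparison involving NA evaluates to NA rather than FALSE, so it is worth deciding how missing results should be categorised before relying on the ELSE branch. How sasif treats an NA condition is not covered here; the base R sketch below only shows the underlying comparison behaviour, using made-up values.

```r
# Base R behaviour of comparisons with missing values (illustrative values).
aval  <- c(2.8, NA, 5.6)
anrlo <- 3.5
anrhi <- 5.0

low  <- aval < anrlo
high <- aval > anrhi
print(low)    # the second element is NA, not FALSE
print(high)

# An explicit guard keeps missing results out of every category.
albcat <- ifelse(is.na(aval), NA_character_,
          ifelse(low, "LOW", ifelse(high, "HIGH", "NORMAL")))
print(albcat)
```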
Flag adverse events that started on or after the treatment start date and on or before the treatment end date:
adae <- data.table(
USUBJID = c("S01", "S01", "S02", "S02", "S03"),
AEDECOD = c("Headache", "Nausea", "Fatigue", "Dizziness", "Rash"),
ASTDT = as.Date(c("2024-01-15", "2023-12-01",
"2024-01-20", "2024-02-10", "2024-01-25")),
TRTSDT = as.Date(c("2024-01-10", "2024-01-10",
"2024-01-15", "2024-01-15", "2024-01-20")),
TRTEDT = as.Date(c("2024-06-10", "2024-06-10",
"2024-06-15", "2024-06-15", "2024-06-20"))
)
ADAE <- data_step(adae,
if_do(ASTDT >= TRTSDT & ASTDT <= TRTEDT,
TRTEMFL = "Y",
TRTEMA = AEDECOD
)
)
print(ADAE[, .(USUBJID, AEDECOD, ASTDT, TRTSDT, TRTEMFL)])
#> USUBJID AEDECOD ASTDT TRTSDT TRTEMFL
#> <char> <char> <Date> <Date> <char>
#> 1: S01 Headache 2024-01-15 2024-01-10 Y
#> 2: S01 Nausea 2023-12-01 2024-01-10 <NA>
#> 3: S02 Fatigue 2024-01-20 2024-01-15 Y
#> 4: S02 Dizziness 2024-02-10 2024-01-15 Y
#> 5: S03 Rash 2024-01-25 2024-01-20 Y
Use delete_if() to remove rows explicitly. It mirrors the SAS DELETE statement and makes the intent clear in the code:
adlb2 <- data.table(
USUBJID = c("S01", "S02", "S03", "S04", "S05"),
LBTESTCD = c("ALB", NA, "ALB", "ALB", NA),
VISIT = c("WEEK 1", "WEEK 1", "UNSCHEDULED", "WEEK 2", "WEEK 4"),
AVAL = c(4.2, 3.8, 5.1, 4.0, 3.5)
)
ADLB2 <- data_step(adlb2,
delete_if(is.na(LBTESTCD)),
delete_if(VISIT == "UNSCHEDULED")
)
print(ADLB2)
#> USUBJID LBTESTCD VISIT AVAL
#> <char> <char> <char> <num>
#> 1: S01 ALB WEEK 1 4.2
#> 2: S04 ALB WEEK 2 4.0
Only records with valid test codes and scheduled visits are retained.
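For comparison, the same filter can be written as a keep-condition in base R. This is only a sketch of the equivalent logic, not how delete_if() is implemented; `lb`, `delete`, and `kept` are hypothetical names.

```r
# Base R sketch: drop rows with a missing test code or an unscheduled visit.
lb <- data.frame(
  USUBJID  = c("S01", "S02", "S03", "S04", "S05"),
  LBTESTCD = c("ALB", NA, "ALB", "ALB", NA),
  VISIT    = c("WEEK 1", "WEEK 1", "UNSCHEDULED", "WEEK 2", "WEEK 4"),
  AVAL     = c(4.2, 3.8, 5.1, 4.0, 3.5),
  stringsAsFactors = FALSE
)

# A row is deleted when either delete condition is TRUE.
delete <- is.na(lb$LBTESTCD) | lb$VISIT == "UNSCHEDULED"
kept   <- lb[!delete, ]
print(kept$USUBJID)
```

Writing the rule as "delete when" rather than "keep when" avoids negating each condition by hand, which is the readability point the delete-style syntax is making.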
Use if_independent() when conditions are
not mutually exclusive — each condition is evaluated on
its own, so multiple flags can apply to the same row simultaneously:
adsl4 <- data.table(
USUBJID = c("S01", "S02", "S03", "S04"),
AGE = c(30, 68, 45, 72),
WEIGHTKG = c(48, 72, 55, 43),
DIABFL = c("N", "Y", "N", "Y")
)
ADSL4 <- data_step(adsl4,
if_independent(AGE > 65, SENIORFL = "Y"),
if_independent(WEIGHTKG < 50, LOWWTFL = "Y"),
if_independent(DIABFL == "Y", COMORBFL = "Y")
)
print(ADSL4)
#> USUBJID AGE WEIGHTKG DIABFL SENIORFL LOWWTFL COMORBFL
#> <char> <num> <num> <char> <char> <char> <char>
#> 1: S01 30 48 N <NA> Y <NA>
#> 2: S02 68 72 Y Y <NA> Y
#> 3: S03 45 55 N <NA> <NA> <NA>
#> 4: S04 72 43 Y Y Y Y
Subject S04 (age 72, weight 43, diabetic) receives all three flags because all three conditions are TRUE for that row simultaneously.
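The contrast with a chain can be seen by applying each rule on its own in base R. The sketch below is an illustration of the logic rather than sasif code; `sl` and `NFLAGS` are hypothetical names added to count how many flags each subject collects.

```r
# Base R sketch of independent flags: each condition is applied on its own.
sl <- data.frame(
  USUBJID  = c("S01", "S02", "S03", "S04"),
  AGE      = c(30, 68, 45, 72),
  WEIGHTKG = c(48, 72, 55, 43),
  DIABFL   = c("N", "Y", "N", "Y"),
  stringsAsFactors = FALSE
)

# No condition blocks another, so one row can collect several flags.
sl$SENIORFL <- ifelse(sl$AGE > 65, "Y", NA_character_)
sl$LOWWTFL  <- ifelse(sl$WEIGHTKG < 50, "Y", NA_character_)
sl$COMORBFL <- ifelse(sl$DIABFL == "Y", "Y", NA_character_)

# Count how many flags each subject received.
sl$NFLAGS <- rowSums(!is.na(sl[, c("SENIORFL", "LOWWTFL", "COMORBFL")]))
print(sl[, c("USUBJID", "NFLAGS")])
```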
| Situation | Use |
|---|---|
| First matching condition should win | if_do() + else_if_do() + else_do() |
| Multiple conditions can apply to same row | if_independent() |
| Remove rows from dataset | delete_if() |
Important: Do not mix if_do() chains with if_independent() on the same variable. if_independent() runs after the chain and will overwrite earlier assignments. Use one approach consistently per variable.
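The overwrite hazard can be pictured with plain vectors. This is a base R illustration of a later assignment replacing an earlier one, not sasif code; the variable names and the "SENIOR" value are made up for the sketch.

```r
# Base R sketch of the overwrite hazard: a later assignment replaces an earlier one.
age <- c(30, 68, 72)

# Step 1: a chain-like derivation assigns a category to every row.
agegr <- ifelse(age <= 45, "YOUNG", ifelse(age <= 70, "MIDDLE", "OLD"))

# Step 2: an independent rule then writes to the same variable.
after <- agegr
after[age > 65] <- "SENIOR"

print(data.frame(AGE = age, CHAIN = agegr, AFTER = after))
```

Rows two and three lose the value the chain gave them, which is why keeping each variable under a single approach is the safer pattern.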
sasif brings three key benefits to clinical R
programming:
IF / ELSE IF / ELSE control flow that clinical programmers already know
For more information, see the package documentation.