As mentioned
elsewhere, case_match() and case_when() do
not return a factor. A typical
tidyverse solution for getting a factor out of
case_match() with the levels in a desired order is
something like this:
nhanes<-nhanes %>%
  mutate(
    country=factor(
      case_match(dmdborn4,1 ~ 'USA',2 ~ 'Other'),
      levels=c('USA','Other')
    )
  )

In this sort of solution, we have to type the level labels twice. The first occurrence defines the label-level mapping, while the second occurrence defines the order of the levels. I think this is inefficient.
Compare the above with the following base-R solution:
dmdborn4_codebook<-c('USA'=1,'Other'=2)
nhanes$country<-factor(nhanes$dmdborn4,levels=dmdborn4_codebook,
                       labels=names(dmdborn4_codebook))

Here, we only have to type the level labels once: that one occurrence defines both the label-level mapping and the order of the levels.
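The codebook idea can be tried without the nhanes data. The following is a minimal sketch on a made-up vector (the values, including the stray code 77, are assumptions for illustration and are not real NHANES records):

```r
# Toy stand-in for nhanes$dmdborn4 (made-up values, not real NHANES data)
dmdborn4 <- c(1, 2, 2, 1, 77, NA)

# One codebook: names are the labels, values are the codes,
# and the order of the entries is the order of the levels
dmdborn4_codebook <- c('USA' = 1, 'Other' = 2)

country <- factor(dmdborn4, levels = dmdborn4_codebook,
                  labels = names(dmdborn4_codebook))

levels(country)        # level order follows the codebook
as.character(country)  # codes absent from the codebook (here 77) become NA
```

Reordering the entries of the codebook is enough to reorder the levels; nothing else has to be retyped.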
My starting principle in writing baseverse is that one should only have to type the level labels once.
baseverse is an R package that uses base R to mimic dplyr’s case_match() and case_when(). Unlike the dplyr functions, base_match() and base_when() each return a factor, and the desired order of the levels is honored.
Install remotes if you don’t already have it:

install.packages('remotes')

Install the baseverse package:

remotes::install_github('yea-hung/baseverse')

Load the baseverse package, if you haven’t already loaded it:

library(baseverse)

Load the data:

data('nhanes')

base_match()

Using native piping:
nhanes<-nhanes |>
  transform(country=base_match(dmdborn4,'USA'=1,'Other'=2))

Using dollar-sign notation:

nhanes$country<-base_match(nhanes$dmdborn4,'USA'=1,'Other'=2)

base_when()

Using native piping:
nhanes<-nhanes |>
  transform(
    cholesterol=base_when(
      'Desirable' = (lbxtc<200),
      'Borderline high' = (lbxtc>=200)&(lbxtc<240),
      'High' = (lbxtc>=240)
    )
  )

Using dollar-sign notation:
nhanes$cholesterol<-base_when(
  'Desirable' = (nhanes$lbxtc<200),
  'Borderline high' = (nhanes$lbxtc>=200)&(nhanes$lbxtc<240),
  'High' = (nhanes$lbxtc>=240)
)

Despite the cute name, base_when() does not exactly mimic case_when(), and I do not intend it to. A key difference is that base_when() will evaluate all of the conditions passed to it, whereas case_when() will, for each position, stop at the first condition that is met.
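To make the distinction concrete, here is a rough base-R illustration. It is not the baseverse source, and it makes no claim about what base_when() does when conditions overlap. The toy vector lbxtc and the names conditions, hits, n_true, first_true and cholesterol_sketch are all made up for this sketch. Every condition is evaluated over the whole vector, so one can see both how many conditions hold at each position and which one holds first. A first-match rule only ever looks at the latter.

```r
# Toy stand-in for nhanes$lbxtc (made-up values, not real NHANES data)
lbxtc <- c(150, 210, 250, NA)

# Every condition is evaluated over the whole vector up front
conditions <- list(
  'Desirable'       = (lbxtc < 200),
  'Borderline high' = (lbxtc >= 200) & (lbxtc < 240),
  'High'            = (lbxtc >= 240)
)

# One column per condition, one row per position
hits <- do.call(cbind, conditions)

# How many conditions hold at each position; because all conditions
# were evaluated, any overlap would show up here as a count above 1
n_true <- rowSums(hits, na.rm = TRUE)

# Index of the first condition that holds at each position (NA if none);
# this is the only information a first-match rule relies on
first_true <- apply(hits, 1, function(row) which(row)[1])

# Turn the indices into a factor whose levels follow the order
# in which the conditions were written
cholesterol_sketch <- factor(names(conditions)[first_true],
                             levels = names(conditions))
```

With mutually exclusive conditions, as in the cholesterol example, n_true never exceeds 1 and the two views agree. They can only differ when conditions overlap.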