The recodes() function makes it easy to recode one or more variables in your data frame. The format is
newdata <- recodes(olddata, variables, from values, to values)
Consider the data set below. Let's make a series of changes to it.
sex | race | outcome | Q1 | Q2 | age | rating |
---|---|---|---|---|---|---|
1 | b | better | 20 | 15 | 12 | 1 |
2 | w | worse | 30 | 23 | 20 | 2 |
1 | a | same | 44 | 18 | 33 | 5 |
2 | b | same | 15 | 86 | 55 | 3 |
2 | w | better | 50 | 99 | 30 | 4 |
2 | h | worse | 99 | 35 | 100 | 5 |
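The calls in this walkthrough operate on a data frame named df. The source does not show how df was created, so the following is only a minimal sketch that rebuilds the table in base R; the column types (numeric codes, character labels) are an assumption.

```r
# Rebuild the example data set from the table above.
# Assumption: sex, Q1, Q2, age and rating are numeric; race and outcome are character.
df <- data.frame(
  sex     = c(1, 2, 1, 2, 2, 2),
  race    = c("b", "w", "a", "b", "w", "h"),
  outcome = c("better", "worse", "same", "same", "better", "worse"),
  Q1      = c(20, 30, 44, 15, 50, 99),
  Q2      = c(15, 23, 18, 86, 99, 35),
  age     = c(12, 20, 33, 55, 30, 100),
  rating  = c(1, 2, 5, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Inspect the structure before recoding
str(df)
```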
For sex, set 1 to “Male” and 2 to “Female”.
df <- recodes(data=df, vars="sex",
              from=c(1,2), to=c("Male", "Female"))
sex | race | outcome | Q1 | Q2 | age | rating |
---|---|---|---|---|---|---|
Male | b | better | 20 | 15 | 12 | 1 |
Female | w | worse | 30 | 23 | 20 | 2 |
Male | a | same | 44 | 18 | 33 | 5 |
Female | b | same | 15 | 86 | 55 | 3 |
Female | w | better | 50 | 99 | 30 | 4 |
Female | h | worse | 99 | 35 | 100 | 5 |
Recode race to “White” vs. “Other”.
df <- recodes(data=df, vars="race",
              from=c("w", "b", "a", "h"),
              to=c("White", "Other", "Other", "Other"))
sex | race | outcome | Q1 | Q2 | age | rating |
---|---|---|---|---|---|---|
Male | Other | better | 20 | 15 | 12 | 1 |
Female | White | worse | 30 | 23 | 20 | 2 |
Male | Other | same | 44 | 18 | 33 | 5 |
Female | Other | same | 15 | 86 | 55 | 3 |
Female | White | better | 50 | 99 | 30 | 4 |
Female | Other | worse | 99 | 35 | 100 | 5 |
Recode outcome to 1 (better) vs. 0 (not better).
df <- recodes(data=df, vars="outcome",
              from=c("better", "same", "worse"),
              to=c(1, 0, 0))
sex | race | outcome | Q1 | Q2 | age | rating |
---|---|---|---|---|---|---|
Male | Other | 1 | 20 | 15 | 12 | 1 |
Female | White | 0 | 30 | 23 | 20 | 2 |
Male | Other | 0 | 44 | 18 | 33 | 5 |
Female | Other | 0 | 15 | 86 | 55 | 3 |
Female | White | 1 | 50 | 99 | 30 | 4 |
Female | Other | 0 | 99 | 35 | 100 | 5 |
For Q1 and Q2, set values of 86 and 99 to missing.
df <- recodes(data=df, vars=c("Q1", "Q2"),
              from=c(86, 99), to=NA)
#> Note: 'from' is longer than 'to', so 'to' was recycled.
sex | race | outcome | Q1 | Q2 | age | rating |
---|---|---|---|---|---|---|
Male | Other | 1 | 20 | 15 | 12 | 1 |
Female | White | 0 | 30 | 23 | 20 | 2 |
Male | Other | 0 | 44 | 18 | 33 | 5 |
Female | Other | 0 | 15 | NA | 55 | 3 |
Female | White | 1 | 50 | NA | 30 | 4 |
Female | Other | 0 | NA | 35 | 100 | 5 |
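As a cross-check, the same "set these codes to missing in several columns" step can be written in base R. This is only a sketch of the idea, not how recodes() is implemented; the names qs, cols and missing_codes are made up for the illustration.

```r
# Stand-alone copy of the two questionnaire columns from the original table
qs <- data.frame(Q1 = c(20, 30, 44, 15, 50, 99),
                 Q2 = c(15, 23, 18, 86, 99, 35))

# Codes that should be treated as missing
missing_codes <- c(86, 99)

# Apply the same replacement to every listed column
cols <- c("Q1", "Q2")
qs[cols] <- lapply(qs[cols], function(x) replace(x, x %in% missing_codes, NA))

qs
```

The missing values land in the same cells as in the table above: row 6 of Q1 and rows 4 and 5 of Q2.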
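For age, set values below 20 or above 90 to missing, values from 20 to 30 to “Younger”, values above 30 and up to 50 to “Middle Aged”, and values above 50 and up to 90 to “Older”.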
You can use expressions in your from fields. When they are TRUE, the corresponding to values will be applied. We will use the dollar sign ($) to represent the variable (age in this case). The symbols | and & mean OR and AND, respectively.
df <- recodes(data=df, vars="age",
              from=c("$ < 20 | $ > 90",
                     "$ >= 20 & $ <= 30",
                     "$ > 30 & $ <= 50",
                     "$ > 50 & $ <= 90"),
              to=c(NA, "Younger", "Middle Aged", "Older"))
We can also write this as
df <- recodes(data=df, vars="age",
              from=c("$ < 20", "$ <= 30", "$ <= 50", "$ <= 90", "$ > 90"),
              to=c(NA, "Younger", "Middle Aged", "Older", NA))
This works because once the age value for an observation meets a criterion that is TRUE (working left to right), it is recoded. It isn’t changed again by later criteria in the same recodes statement.
sex | race | outcome | Q1 | Q2 | age | rating |
---|---|---|---|---|---|---|
Male | Other | 1 | 20 | 15 | NA | 1 |
Female | White | 0 | 30 | 23 | Younger | 2 |
Male | Other | 0 | 44 | 18 | Middle Aged | 5 |
Female | Other | 0 | 15 | NA | Older | 3 |
Female | White | 1 | 50 | NA | Younger | 4 |
Female | Other | 0 | NA | 35 | NA | 5 |
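The left-to-right, first-match-wins rule described above can be mimicked in a few lines of base R. The helper first_match_recode below is hypothetical and written only for illustration; it is not the recodes() source, and it assumes that each condition is a string in which $ stands for the variable.

```r
# Hypothetical helper illustrating "first TRUE condition wins".
# This is NOT the recodes() implementation; it only mimics the described behaviour.
first_match_recode <- function(x, conditions, to) {
  result <- rep(NA_character_, length(x))
  done   <- rep(FALSE, length(x))        # TRUE once a value has been recoded
  for (i in seq_along(conditions)) {
    # Swap the "$" placeholder for the vector name, then evaluate the condition
    expr <- gsub("$", "x", conditions[i], fixed = TRUE)
    hit  <- eval(parse(text = expr))
    hit  <- !is.na(hit) & hit & !done    # skip values that were already recoded
    result[hit] <- to[i]
    done[hit]   <- TRUE
  }
  result
}

age <- c(12, 20, 33, 55, 30, 100)
age_group <- first_match_recode(
  age,
  conditions = c("$ < 20", "$ <= 30", "$ <= 50", "$ <= 90", "$ > 90"),
  to         = c(NA, "Younger", "Middle Aged", "Older", NA)
)
age_group
```

Because a value is locked in as soon as one condition matches, an age of 12 is set to missing by the first condition and is never relabelled by the later "$ <= 30" test.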
Finally, for the rating variable, reverse the scoring so that 1 to 5 becomes 5 to 1.
df <- recodes(data=df, vars="rating", from=1:5, to=5:1)
sex | race | outcome | Q1 | Q2 | age | rating |
---|---|---|---|---|---|---|
Male | Other | 1 | 20 | 15 | NA | 5 |
Female | White | 0 | 30 | 23 | Younger | 4 |
Male | Other | 0 | 44 | 18 | Middle Aged | 1 |
Female | Other | 0 | 15 | NA | Older | 3 |
Female | White | 1 | 50 | NA | Younger | 2 |
Female | Other | 0 | NA | 35 | NA | 1 |
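For a numeric 1-to-5 scale, the reversed scores are easy to sanity-check in base R. The sketch below is independent of recodes() and assumes the rating values from the original table; it compares a lookup from 1:5 to 5:1 with the arithmetic shortcut 6 minus the rating.

```r
# Ratings from the original table
rating <- c(1, 2, 5, 3, 4, 5)

# Lookup version: find each rating's position in 1:5, take the same position in 5:1
reversed_lookup <- (5:1)[match(rating, 1:5)]

# Arithmetic version: on a 1-to-5 scale, reversing is 6 minus the value
reversed_arith <- 6 - rating

# Both approaches should agree
all(reversed_lookup == reversed_arith)
reversed_lookup
```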
Remember that recodes returns a data frame, not a variable.
df <- recodes(data=df, vars="rating", from=1:5, to=5:1)
is correct.
df$rating <- recodes(data=df, vars="rating", from=1:5, to=5:1)
is not.
This allows you to apply the same recoding scheme to more than one variable at a time (e.g., Q1 and Q2 above).
And that’s it (APPLAUSE, APPLAUSE)!