Not all sensitive data is recorded as strings: features such as age, date of birth, or income can make parts of a data set personally identifiable. To help with this, the package includes methods for ‘perturbing’ numeric data, that is, adding random noise.
Three types of random noise are included:
adaptive_noise [default] - random noise which scales with the standard deviation of the variable transformed.
white_noise - random noise at a set spread.
lognorm_noise - random multiplicative noise at a set spread.

NB: we set a random seed using set.seed here for reproducibility. We recommend users avoid this step when using the package in production code.
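To make the distinction between the three noise types concrete, here is a minimal base-R sketch. The helper names (sketch_adaptive, sketch_white, sketch_lognorm), their arguments, and the sample values are hypothetical and chosen for illustration; they are an assumption about what each description means, not the deident implementations.

```r
# Illustrative only: hypothetical helpers, NOT the functions exported by deident.

# Additive Gaussian noise whose spread is a fraction of the variable's own sd
sketch_adaptive <- function(x, ratio) {
  x + rnorm(length(x), mean = 0, sd = ratio * sd(x, na.rm = TRUE))
}

# Additive Gaussian noise with a fixed spread, independent of the data
sketch_white <- function(x, spread) {
  x + rnorm(length(x), mean = 0, sd = spread)
}

# Multiplicative noise: each value is scaled by a log-normal factor
sketch_lognorm <- function(x, spread) {
  x * rlnorm(length(x), meanlog = 0, sdlog = spread)
}

pay <- c(75, 160, 72, 205, 213, 153, 212)  # made-up values

sketch_adaptive(pay, ratio = 0.1)
sketch_white(pay, spread = 0.3)
sketch_lognorm(pay, spread = 0.1)
```

In this sketch the adaptive version adjusts automatically to the scale of the variable, whereas the fixed-spread version needs a spread chosen in the variable's own units.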
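library(deident)
set.seed(101)  # fixed seed so the output below is reproducible

# Build a pipeline that perturbs `Daily Pay` with the default noise (adaptive_noise)
perturb_pipe <- ShiftsWorked |>
  add_perturb(`Daily Pay`)

# Apply the pipeline to the data
apply_deident(ShiftsWorked, perturb_pipe)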
#> # A tibble: 3,100 × 7
#> `Record ID` Employee Date Shift `Shift Start` `Shift End` `Daily Pay`
#> <int> <chr> <date> <chr> <chr> <chr> <dbl>
#> 1 1 Maria Cook 2015-01-01 Night 17:01 00:01 75.1
#> 2 2 Stephen C… 2015-01-01 Day 08:01 16:01 160.
#> 3 3 Kimberly … 2015-01-01 Day 08:01 16:01 71.6
#> 4 4 Nathan Al… 2015-01-01 Day 08:01 15:01 205.
#> 5 5 Samuel Pa… 2015-01-01 Night 16:01 23:01 213.
#> 6 6 Scott Mor… 2015-01-01 Night 17:01 00:01 153.
#> 7 7 Nathan Sa… 2015-01-01 Rest <NA> <NA> 5.66
#> 8 8 Jose Lopez 2015-01-01 Night 17:01 00:01 212.
#> 9 9 Donna Bro… 2015-01-01 Night 16:01 00:01 228.
#> 10 10 George Ki… 2015-01-01 Night 16:01 00:01 240.
#> # ℹ 3,090 more rows
To change the noise, pass one of the noise functions, called with the desired level of noise, as the noise argument of add_perturb.
perturb_pipe_white_noise <- ShiftsWorked |>
  add_perturb(`Daily Pay`, noise = white_noise(sd = 0.3))
apply_deident(ShiftsWorked, perturb_pipe_white_noise)
#> # A tibble: 3,100 × 7
#> `Record ID` Employee Date Shift `Shift Start` `Shift End` `Daily Pay`
#> <int> <chr> <date> <chr> <chr> <chr> <dbl>
#> 1 1 Maria Cook 2015-01-01 Night 17:01 00:01 78.6
#> 2 2 Stephen C… 2015-01-01 Day 08:01 16:01 156.
#> 3 3 Kimberly … 2015-01-01 Day 08:01 16:01 77.8
#> 4 4 Nathan Al… 2015-01-01 Day 08:01 15:01 203.
#> 5 5 Samuel Pa… 2015-01-01 Night 16:01 23:01 210.
#> 6 6 Scott Mor… 2015-01-01 Night 17:01 00:01 142.
#> 7 7 Nathan Sa… 2015-01-01 Rest <NA> <NA> -0.460
#> 8 8 Jose Lopez 2015-01-01 Night 17:01 00:01 213.
#> 9 9 Donna Bro… 2015-01-01 Night 16:01 00:01 219.
#> 10 10 George Ki… 2015-01-01 Night 16:01 00:01 242.
#> # ℹ 3,090 more rows
perturb_pipe_heavy_adaptive_noise <- ShiftsWorked |>
  add_perturb(`Daily Pay`, noise = adaptive_noise(sd.ratio = 0.4))
apply_deident(ShiftsWorked, perturb_pipe_heavy_adaptive_noise)
#> # A tibble: 3,100 × 7
#> `Record ID` Employee Date Shift `Shift Start` `Shift End` `Daily Pay`
#> <int> <chr> <date> <chr> <chr> <chr> <dbl>
#> 1 1 Maria Cook 2015-01-01 Night 17:01 00:01 60.0
#> 2 2 Stephen C… 2015-01-01 Day 08:01 16:01 108.
#> 3 3 Kimberly … 2015-01-01 Day 08:01 16:01 60.1
#> 4 4 Nathan Al… 2015-01-01 Day 08:01 15:01 195.
#> 5 5 Samuel Pa… 2015-01-01 Night 16:01 23:01 229.
#> 6 6 Scott Mor… 2015-01-01 Night 17:01 00:01 118.
#> 7 7 Nathan Sa… 2015-01-01 Rest <NA> <NA> -51.7
#> 8 8 Jose Lopez 2015-01-01 Night 17:01 00:01 197.
#> 9 9 Donna Bro… 2015-01-01 Night 16:01 00:01 229.
#> 10 10 George Ki… 2015-01-01 Night 16:01 00:01 230.
#> # ℹ 3,090 more rows
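In the white_noise and heavier adaptive_noise outputs, the ‘Rest’ row ends up with a negative `Daily Pay` (-0.460 and -51.7). The base-R sketch below suggests why this can happen, using made-up values and plain rnorm/rlnorm calls rather than the deident internals: additive noise can push a small value below zero, while multiplying by a strictly positive log-normal factor cannot change the sign. The spreads used are arbitrary assumptions.

```r
# Illustrative only: made-up values, not the ShiftsWorked data or deident code.
set.seed(42)  # for a repeatable illustration only

pay <- c(0.5, 1, 2, 150, 200)

# Additive Gaussian noise: small values can cross zero
additive <- pay + rnorm(length(pay), mean = 0, sd = 20)

# Multiplicative log-normal noise: every factor is > 0, so signs are preserved
multiplicative <- pay * rlnorm(length(pay), meanlog = 0, sdlog = 0.2)

any(additive < 0)        # may be TRUE, depending on the draw
all(multiplicative > 0)  # TRUE whenever all inputs are positive
```

If negative values are not meaningful for a variable, multiplicative noise may be the more natural choice of the three types.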