
Perturb Example

Not all sensitive data is recorded as strings: features such as age, date of birth, or income can make individuals in a data set personally identifiable. To help with this, the package includes methods for ‘perturbing’ numeric data, that is, adding random noise to it.

Three types of random noise are included:

  1. adaptive_noise [default] - random noise whose spread scales with the standard deviation of the variable being transformed (set via sd.ratio).
  2. white_noise - random noise with a fixed spread (set via sd).
  3. lognorm_noise - random multiplicative noise with a fixed spread.
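To make the distinction between the three kinds of noise concrete, here is a minimal base-R sketch. The helper names (sketch_white_noise, sketch_adaptive_noise, sketch_lognorm_noise), their default values, and the choice of Gaussian and log-normal draws are assumptions made for illustration. They are not the deident implementations, which may differ in detail.

```r
# Illustrative sketch only: hypothetical helpers, not the deident functions.

sketch_white_noise <- function(x, sd = 0.1) {
  # Additive Gaussian noise with a fixed standard deviation.
  x + rnorm(length(x), mean = 0, sd = sd)
}

sketch_adaptive_noise <- function(x, sd.ratio = 0.1) {
  # Additive Gaussian noise whose spread is a fraction of sd(x).
  x + rnorm(length(x), mean = 0, sd = sd.ratio * stats::sd(x, na.rm = TRUE))
}

sketch_lognorm_noise <- function(x, sd = 0.1) {
  # Multiplicative noise: each value is scaled by a positive log-normal factor.
  x * rlnorm(length(x), meanlog = 0, sdlog = sd)
}

pay <- c(75, 160, 72, 205, 213, 0)  # made-up values for illustration

print(sketch_white_noise(pay, sd = 0.3))
print(sketch_adaptive_noise(pay, sd.ratio = 0.4))
print(sketch_lognorm_noise(pay, sd = 0.3))
```

In this sketch the additive variants can push values at or near zero below zero, while the multiplicative variant keeps the sign of each value and leaves exact zeros unchanged.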

NB: we set a random seed using set.seed here for reproducibility. We recommend users avoid this step when using the package in production code.
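One plausible reason for this recommendation, stated here as an assumption rather than as the package authors' rationale, is that noise drawn under a known seed can be regenerated and subtracted again. The base-R sketch below shows this for simple additive Gaussian noise; the values and the noise level are made up for illustration.

```r
# Illustrative sketch: additive noise drawn with a known seed can be undone.
x <- c(75, 160, 72)  # made-up values

set.seed(101)
noisy <- x + rnorm(length(x), sd = 10)

# Anyone who knows the seed and the noise settings can redraw the same noise.
set.seed(101)
recovered <- noisy - rnorm(length(x), sd = 10)

print(isTRUE(all.equal(recovered, x)))  # TRUE
```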

library(deident)
set.seed(101)

perturb_pipe <- ShiftsWorked |>
  add_perturb(`Daily Pay`)

apply_deident(ShiftsWorked, perturb_pipe)
#> # A tibble: 3,100 × 7
#>    `Record ID` Employee   Date       Shift `Shift Start` `Shift End` `Daily Pay`
#>          <int> <chr>      <date>     <chr> <chr>         <chr>             <dbl>
#>  1           1 Maria Cook 2015-01-01 Night 17:01         00:01             75.1 
#>  2           2 Stephen C… 2015-01-01 Day   08:01         16:01            160.  
#>  3           3 Kimberly … 2015-01-01 Day   08:01         16:01             71.6 
#>  4           4 Nathan Al… 2015-01-01 Day   08:01         15:01            205.  
#>  5           5 Samuel Pa… 2015-01-01 Night 16:01         23:01            213.  
#>  6           6 Scott Mor… 2015-01-01 Night 17:01         00:01            153.  
#>  7           7 Nathan Sa… 2015-01-01 Rest  <NA>          <NA>               5.66
#>  8           8 Jose Lopez 2015-01-01 Night 17:01         00:01            212.  
#>  9           9 Donna Bro… 2015-01-01 Night 16:01         00:01            228.  
#> 10          10 George Ki… 2015-01-01 Night 16:01         00:01            240.  
#> # ℹ 3,090 more rows

To change the type or level of noise, pass one of the noise functions, called with the desired level of noise, to the noise argument of add_perturb.

perturb_pipe_white_noise <- ShiftsWorked |>
  add_perturb(`Daily Pay`, noise = white_noise(sd=0.3))

apply_deident(ShiftsWorked, perturb_pipe_white_noise)
#> # A tibble: 3,100 × 7
#>    `Record ID` Employee   Date       Shift `Shift Start` `Shift End` `Daily Pay`
#>          <int> <chr>      <date>     <chr> <chr>         <chr>             <dbl>
#>  1           1 Maria Cook 2015-01-01 Night 17:01         00:01            78.6  
#>  2           2 Stephen C… 2015-01-01 Day   08:01         16:01           156.   
#>  3           3 Kimberly … 2015-01-01 Day   08:01         16:01            77.8  
#>  4           4 Nathan Al… 2015-01-01 Day   08:01         15:01           203.   
#>  5           5 Samuel Pa… 2015-01-01 Night 16:01         23:01           210.   
#>  6           6 Scott Mor… 2015-01-01 Night 17:01         00:01           142.   
#>  7           7 Nathan Sa… 2015-01-01 Rest  <NA>          <NA>             -0.460
#>  8           8 Jose Lopez 2015-01-01 Night 17:01         00:01           213.   
#>  9           9 Donna Bro… 2015-01-01 Night 16:01         00:01           219.   
#> 10          10 George Ki… 2015-01-01 Night 16:01         00:01           242.   
#> # ℹ 3,090 more rows

The default adaptive noise can be made heavier in the same way, here by raising sd.ratio:

perturb_pipe_heavy_adaptive_noise <- ShiftsWorked |>
  add_perturb(`Daily Pay`, noise = adaptive_noise(sd.ratio=0.4))

apply_deident(ShiftsWorked, perturb_pipe_heavy_adaptive_noise)
#> # A tibble: 3,100 × 7
#>    `Record ID` Employee   Date       Shift `Shift Start` `Shift End` `Daily Pay`
#>          <int> <chr>      <date>     <chr> <chr>         <chr>             <dbl>
#>  1           1 Maria Cook 2015-01-01 Night 17:01         00:01              60.0
#>  2           2 Stephen C… 2015-01-01 Day   08:01         16:01             108. 
#>  3           3 Kimberly … 2015-01-01 Day   08:01         16:01              60.1
#>  4           4 Nathan Al… 2015-01-01 Day   08:01         15:01             195. 
#>  5           5 Samuel Pa… 2015-01-01 Night 16:01         23:01             229. 
#>  6           6 Scott Mor… 2015-01-01 Night 17:01         00:01             118. 
#>  7           7 Nathan Sa… 2015-01-01 Rest  <NA>          <NA>              -51.7
#>  8           8 Jose Lopez 2015-01-01 Night 17:01         00:01             197. 
#>  9           9 Donna Bro… 2015-01-01 Night 16:01         00:01             229. 
#> 10          10 George Ki… 2015-01-01 Night 16:01         00:01             230. 
#> # ℹ 3,090 more rows
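In the white-noise and heavy adaptive-noise outputs above, row 7 (a rest day) ends up with a negative Daily Pay, which is possible because the noise is added to the value. The base-R sketch below contrasts additive noise with multiplicative log-normal noise on made-up data. It is an illustration of the general behaviour only and does not use or reproduce the deident functions.

```r
# Illustrative sketch: additive vs multiplicative noise near zero.
set.seed(101)
pay <- c(0, 0, 0, 75, 160)  # made-up values; zeros stand in for rest days

additive       <- pay + rnorm(length(pay), mean = 0, sd = 0.3)
multiplicative <- pay * rlnorm(length(pay), meanlog = 0, sdlog = 0.3)

print(additive)        # zeros are moved off zero and may become negative
print(multiplicative)  # zeros stay exactly zero, other values stay positive

print(any(multiplicative < 0))  # FALSE
```

Neither behaviour is ideal in every case: additive noise can create implausible negative values, while multiplicative noise leaves exact zeros visible, so the choice depends on the variable being perturbed.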
