The R package fairadapt is intended for removing bias from machine
learning algorithms. In particular, it implements the pre-processing
procedure described in Plecko & Meinshausen, 2019 (all the code used for
producing the figures in the paper can be found in the jmlr-paper folder). The
main idea is to adapt the training and testing data in a way which
prevents any further training procedure from learning an undesired bias.
The package currently offers the pre-processing step, after which the
user can use the adapted data to train any classifier. However, some
caution in the training step is still advised: for more involved
applications with resolving variables, the user should refer to the
original paper.
You can install the released version of fairadapt from CRAN with:
install.packages("fairadapt")
An example of how fairadapt can be used is demonstrated below on the UCI Adult dataset.
# loading the package
library(fairadapt)
vars <- c("sex", "age", "native_country", "marital_status", "education_num",
          "workclass", "hours_per_week", "occupation", "income")
# initialising the adjacency matrix
# (rows and columns follow the order of vars; a 1 in row i, column j
# encodes an edge from variable i to variable j)
adj.mat <- c(
  0, 0, 0, 1, 1, 1, 1, 1, 1, # sex
  0, 0, 0, 1, 1, 1, 1, 1, 1, # age
  0, 0, 0, 1, 1, 1, 1, 1, 1, # native_country
  0, 0, 0, 0, 1, 1, 1, 1, 1, # marital_status
  0, 0, 0, 0, 0, 1, 1, 1, 1, # education_num
  0, 0, 0, 0, 0, 0, 0, 0, 1, # workclass
  0, 0, 0, 0, 0, 0, 0, 0, 1, # hours_per_week
  0, 0, 0, 0, 0, 0, 0, 0, 1, # occupation
  0, 0, 0, 0, 0, 0, 0, 0, 0  # income
)
adj.mat <- matrix(adj.mat, nrow = length(vars), ncol = length(vars),
                  dimnames = list(vars, vars), byrow = TRUE)
# reading in the UCI Adult data
adult <- readRDS(
  system.file("extdata", "uci_adult.rds", package = "fairadapt")
)
n <- nrow(adult) / 2
mod <- fairadapt(income ~ .,
  train.data = head(adult[, vars], n = n),
  test.data = tail(adult[, vars], n = n),
  prot.attr = "sex", adj.mat = adj.mat,
  res.vars = "hours_per_week")
adapt.train <- adaptedData(mod)
adapt.test <- adaptedData(mod, train = FALSE)
summary(mod)
#>
#> Call:
#> fairadapt(formula = income ~ ., prot.attr = "sex", adj.mat = adj.mat,
#> train.data = head(adult[, vars], n = n), test.data = tail(adult[,
#> vars], n = n), res.vars = "hours_per_week")
#>
#> Protected attribute: sex
#> Protected attribute levels: Female, Male
#> Adapted variables: marital_status, education_num, workclass, occupation, income
#> Resolving variables: hours_per_week, age, native_country
#>
#> Number of training samples: 1000
#> Number of test samples: 1000
#> Quantile method: rangerQuants
#>
#> Total variation (before adaptation): -0.2014
#> Total variation (after adaptation): -0.01676
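The two total variation lines in the summary can be read as a gap in average outcome between the two levels of the protected attribute, which should move towards zero after adaptation. The base R sketch below illustrates that reading on a small made-up data frame. It assumes that the measure is a plain difference in group means; the helper outcome_gap(), the toy data, and the sign convention (first factor level minus second) are illustrative assumptions, not part of the fairadapt API.

```r
# Toy data, invented for illustration only (not the UCI Adult data).
toy <- data.frame(
  sex    = factor(c("Female", "Female", "Female", "Female",
                    "Male", "Male", "Male", "Male")),
  income = c(0, 0, 1, 0, 1, 1, 0, 1)
)

# Hypothetical helper: difference in mean outcome between the first and
# second level of a two-level protected attribute.
outcome_gap <- function(data, outcome, prot.attr) {
  lvls <- levels(data[[prot.attr]])
  stopifnot(length(lvls) == 2L)
  is_first <- data[[prot.attr]] == lvls[1L]
  mean(data[[outcome]][is_first]) - mean(data[[outcome]][!is_first])
}

gap <- outcome_gap(toy, "income", "sex")
gap
```

In this toy data one in four women and three in four men have a positive outcome, so the gap is negative. A value near zero would mean both groups have a similar average outcome.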