Title: Search Data Frames for Personally Identifiable Information
Version: 1.3.0
Maintainer: Jacob Patterson-Stein <jacobpstein@gmail.com>
Description: Check a data frame for personal information, including names, location, disability status, and geo-coordinates.
License: MIT + file LICENSE
Encoding: UTF-8
Depends: R (≥ 2.10), dplyr, stringr, uuid, utils
RoxygenNote: 7.3.2
Suggests: testthat (≥ 3.0.0)
Config/testthat/edition: 3
URL: https://github.com/jacobpstein/pii
BugReports: https://github.com/jacobpstein/pii/issues
NeedsCompilation: no
Packaged: 2025-01-11 19:55:50 UTC; jacobpstein
Author: Jacob Patterson-Stein [aut, cre]
Repository: CRAN
Date/Publication: 2025-01-13 15:40:06 UTC

Search Data Frames for Personally Identifiable Information

Description

Search Data Frames for Personally Identifiable Information

Usage

check_PII(df)

Arguments

df

a data frame object

Value

Returns a data frame of columns that potentially contain PII

Examples

# create a data frame containing various personally identifiable information
pii_df <- data.frame(
 lat = c(40.7128, 34.0522, 41.8781),
 long = c(-74.0060, -118.2437, -87.6298),
 first_name = c("John", "Michael", "Linda"),
 phone = c("123-456-7890", "234-567-8901", "345-678-9012"),
 age = sample(30:60, 3, replace = TRUE),
 email = c("test@example.com", "contact@domain.com", "user@website.org"),
 disabled = c("No", "Yes", "No"),
 stringsAsFactors = FALSE
)

check_PII(pii_df)
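
To give a feel for how a check of this kind can work, here is a minimal base R sketch that flags columns by name and by value. It is not the package's implementation: the function flag_columns_sketch(), its patterns, and its output columns are all hypothetical and chosen only for illustration.

```r
# Illustrative only: NOT how check_PII() is implemented.
# flag_columns_sketch() and both patterns are hypothetical.
flag_columns_sketch <- function(df) {
  # column names that hint at personal information
  name_pattern <- "name|phone|email|lat|long|disab"
  # values that look like email addresses
  email_pattern <- "^[^@[:space:]]+@[^@[:space:]]+\\.[A-Za-z]+$"

  # TRUE where the column name matches one of the hints
  by_name <- grepl(name_pattern, names(df), ignore.case = TRUE)

  # TRUE where a character column holds at least one email-like value
  by_value <- vapply(
    df,
    function(col) is.character(col) && any(grepl(email_pattern, col)),
    logical(1)
  )

  data.frame(
    column = names(df),
    flagged_by_name = by_name,
    flagged_by_value = unname(by_value),
    stringsAsFactors = FALSE
  )
}

toy_df <- data.frame(
  first_name = c("John", "Linda"),
  contact = c("test@example.com", "user@website.org"),
  age = c(34, 51),
  stringsAsFactors = FALSE
)

flags <- flag_columns_sketch(toy_df)
print(flags)
```

In this toy data, first_name is caught by its name, while contact is caught only because its values look like email addresses. Simple patterns like these produce false positives, so any flagged column still needs a manual review.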

Split Data Into PII and Non-PII Columns

Description

Split Data Into PII and Non-PII Columns

Usage

split_PII_data(df, exclude_columns = NULL)

Arguments

df

a data frame object

exclude_columns

columns to exclude from the data frame split

Value

Assigns two data frames to the global environment: one containing the PII columns and one containing the remaining columns. A unique merge key is added so the two can be joined again. The function then prints the flagged and split columns to the console.

Examples

# create a data frame containing various personally identifiable information
pii_df <- data.frame(
 lat = c(40.7128, 34.0522, 41.8781),
 long = c(-74.0060, -118.2437, -87.6298),
 first_name = c("John", "Michael", "Linda"),
 phone = c("123-456-7890", "234-567-8901", "345-678-9012"),
 age = sample(30:60, 3, replace = TRUE),
 email = c("test@example.com", "contact@domain.com", "user@website.org"),
 disabled = c("No", "Yes", "No"),
 stringsAsFactors = FALSE
)

split_PII_data(pii_df, exclude_columns = c("phone"))
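
The split-and-rejoin idea described under Value can be sketched in base R as follows. The object names (pii_part, non_pii_part), the key column name (merge_key), and the key format are assumptions made for this illustration; they are not the names that split_PII_data() actually creates.

```r
# Illustrative only: object and column names here are hypothetical,
# not the ones split_PII_data() assigns in the global environment.
toy_df <- data.frame(
  first_name = c("John", "Michael", "Linda"),
  age = c(34, 51, 47),
  score = c(0.2, 0.9, 0.5),
  stringsAsFactors = FALSE
)

# assume this column has already been flagged as PII
pii_cols <- "first_name"

# a simple row-wise key stands in for a unique identifier;
# the package depends on uuid, so it presumably uses UUIDs (assumption)
toy_df$merge_key <- sprintf("id-%03d", seq_len(nrow(toy_df)))

# one data frame keeps the key plus the PII columns,
# the other keeps the key plus everything else
pii_part <- toy_df[, c("merge_key", pii_cols), drop = FALSE]
non_pii_part <- toy_df[, setdiff(names(toy_df), pii_cols), drop = FALSE]

# the shared key lets the two parts be joined again when needed
rejoined <- merge(non_pii_part, pii_part, by = "merge_key")
print(rejoined)
```

Keeping the key in both parts means the non-PII data frame can be shared or analyzed on its own, while the PII part is stored separately and merged back only when required.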
