The `vld_` functions are used within the `chk_` functions. The `chk_` functions (and their `vld_` equivalents) can be divided into the following families.
The code examples here use the `vld_*` functions.
If you want to learn more about the logic behind some of the functions explained here, we recommend reading the book Advanced R (Wickham, 2019).
For reasons of space, the `x_name = NULL` argument is not shown. For a simplified list of the `chk` functions, see the Reference section.
`chk_` Functions
Check if the function input is missing or not
The `chk_missing()` function uses `missing()` to check if an argument has been left out when the function is called.
Function | Code |
---|---|
`chk_missing()` | `missing()` |
`chk_not_missing()` | `!missing()` |
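As a minimal base R sketch of the idea (the function `greet()` and its argument are hypothetical and not part of chk), `missing()` can be called inside a function body to detect an omitted argument:

```r
# Hypothetical function: reports whether `name` was supplied by the caller.
greet <- function(name) {
  if (missing(name)) {
    return("No name supplied")
  }
  paste("Hello,", name)
}

greet()       # "No name supplied"
greet("Ada")  # "Hello, Ada"
```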
`...` Checker
Check if the function input comes from `...` (dot-dot-dot) or not
The functions `chk_used(...)` and `chk_unused(...)` check if any arguments have been provided through `...` (called dot-dot-dot or ellipsis), which is commonly used in R to allow a variable number of arguments.
Function | Code |
---|---|
`chk_used(...)` | `length(list(...)) != 0L` |
`chk_unused(...)` | `length(list(...)) == 0L` |
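The expression in the Code column can be tried directly in base R. The helper name `dots_used()` below is an assumption made for illustration; it is not a chk function.

```r
# Hypothetical helper mirroring the expression shown in the table.
dots_used <- function(...) length(list(...)) != 0L

dots_used()        # FALSE: nothing was passed through ...
dots_used(1, "a")  # TRUE: two arguments were passed through ...
```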
Check if the function input is a valid external data source.
These `chk` functions check the existence of a file, the validity of its extension, and the existence of a directory.
Function | Code |
---|---|
`chk_file(x)` | `vld_string(x) && file.exists(x) && !dir.exists(x)` |
`chk_ext(x, ext)` | `vld_string(x) && vld_subset(tools::file_ext(x), ext)` |
`chk_dir(x)` | `vld_string(x) && dir.exists(x)` |
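The following base R sketch reproduces the logic of the table with a temporary file. The helper names (`is_string()`, `is_file()`, `has_ext()`, `is_dir()`) are hypothetical stand-ins for the `vld_` functions, written only to keep the example self-contained.

```r
# Hypothetical base R stand-ins for the expressions in the table.
is_string <- function(x) is.character(x) && length(x) == 1L && !anyNA(x)
is_file   <- function(x) is_string(x) && file.exists(x) && !dir.exists(x)
has_ext   <- function(x, ext) is_string(x) && all(tools::file_ext(x) %in% ext)
is_dir    <- function(x) is_string(x) && dir.exists(x)

# Create a throwaway file in the session's temporary directory.
path <- tempfile(fileext = ".csv")
invisible(file.create(path))

is_file(path)                    # TRUE
is_file(tempdir())               # FALSE: it exists, but it is a directory
has_ext(path, c("csv", "txt"))   # TRUE
has_ext(path, "xlsx")            # FALSE
is_dir(tempdir())                # TRUE
```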
Check if the function input is NULL or not
Function | Code |
---|---|
`chk_null(x)` | `is.null(x)` |
`chk_not_null(x)` | `!is.null(x)` |
Check if the function input is a scalar. In R, scalars are vectors of length 1.
Function | Code |
---|---|
`chk_scalar(x)` | `length(x) == 1L` |
The following functions check if the function inputs are vectors of length 1 of a particular data type. Each data type has a special syntax to create an individual value or “scalar”.
Function | Code |
---|---|
`chk_string(x)` | `is.character(x) && length(x) == 1L && !anyNA(x)` |
`chk_number(x)` | `is.numeric(x) && length(x) == 1L && !anyNA(x)` |
For logical data types, you can check flags using `chk_flag()`, which accepts only `TRUE` or `FALSE`, or use `chk_lgl()` to verify that a scalar is of type logical, which also accepts `NA`.
Function | Code |
---|---|
`chk_flag(x)` | `is.logical(x) && length(x) == 1L && !anyNA(x)` |
`chk_lgl(x)` | `is.logical(x) && length(x) == 1L` |
It is also possible to check if the user-provided argument is only `TRUE` or only `FALSE`:
Function | Code |
---|---|
`chk_true(x)` | `is.logical(x) && length(x) == 1L && !anyNA(x) && x` |
`chk_false(x)` | `is.logical(x) && length(x) == 1L && !anyNA(x) && !x` |
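To make the difference between the logical checks concrete, here is a small base R sketch. The helpers `is_lgl()`, `is_flag()` and `is_true()` are hypothetical names that simply wrap the expressions from the tables above.

```r
# Hypothetical helpers wrapping the expressions from the tables.
is_lgl  <- function(x) is.logical(x) && length(x) == 1L
is_flag <- function(x) is_lgl(x) && !anyNA(x)
is_true <- function(x) is_flag(x) && x

is_lgl(NA)              # TRUE: NA is a logical scalar
is_flag(NA)             # FALSE: a flag must not be missing
is_flag(FALSE)          # TRUE
is_true(FALSE)          # FALSE
is_flag(c(TRUE, TRUE))  # FALSE: length 2
is_flag(1)              # FALSE: numeric, not logical
```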
Check if the function input is of class Date or DateTime
Date and datetime classes can be checked with `chk_date()` and `chk_date_time()`.
Function | Code |
---|---|
`chk_date(x)` | `inherits(x, "Date") && length(x) == 1L && !anyNA(x)` |
`chk_date_time(x)` | `inherits(x, "POSIXct") && length(x) == 1L && !anyNA(x)` |
You can also check the time zone with `chk_tz()`. The available time zones can be retrieved using the function `OlsonNames()`.
Function | Code |
---|---|
`chk_tz(x)` | `is.character(x) && length(x) == 1L && !anyNA(x) && x %in% OlsonNames()` |
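A short base R sketch of these three checks follows. The helper names are hypothetical, and the list returned by `OlsonNames()` depends on the time zone database installed on the system, so no specific valid zone is assumed here.

```r
# Hypothetical helpers wrapping the expressions from the tables.
is_date      <- function(x) inherits(x, "Date") && length(x) == 1L && !anyNA(x)
is_date_time <- function(x) inherits(x, "POSIXct") && length(x) == 1L && !anyNA(x)
is_tz        <- function(x) {
  is.character(x) && length(x) == 1L && !anyNA(x) && x %in% OlsonNames()
}

d  <- as.Date("2024-01-31")
dt <- as.POSIXct("2024-01-31 12:00:00", tz = "UTC")

is_date(d)            # TRUE
is_date(dt)           # FALSE: POSIXct is not a Date
is_date_time(dt)      # TRUE
is_date_time(d)       # FALSE
is_date(as.Date(NA))  # FALSE: missing values are rejected
is_tz("Not/AZone")    # FALSE: not a name listed by OlsonNames()
```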
Check if the function input has a specific data structure.
Vectors are a family of data types that come in two forms: atomic vectors and lists. When vectors consist of elements of the same data type, they can be considered atomic, matrices, or arrays. The elements in a list, however, can be of different types.
To check if a function argument is a vector you can use `chk_vector()`.
Function | Code |
---|---|
`chk_vector(x)` | `(is.atomic(x) && !is.matrix(x) && !is.array(x)) || is.list(x)` |
Note that `chk_vector()` and `vld_vector()` differ from `is.vector()`, which returns FALSE if the vector has any attributes other than names.
vector <- c(1, 2, 3)
is.vector(vector) # TRUE
#> [1] TRUE
vld_vector(vector) # TRUE
#> [1] TRUE
attributes(vector) <- list("a" = 10, "b" = 20, "c" = 30)
is.vector(vector) # FALSE
#> [1] FALSE
vld_vector(vector) # TRUE
#> [1] TRUE
Function | Code |
---|---|
`chk_atomic(x)` | `is.atomic(x)` |
Notice that `is.atomic()` is TRUE for the types logical, integer, numeric, complex, character and raw. In R versions before 4.4.0 it was also TRUE for NULL; since R 4.4.0, `is.atomic(NULL)` returns FALSE, which is what the output here shows.
vector <- c(1, 2, 3)
is.atomic(vector) # TRUE
#> [1] TRUE
vld_vector(vector) # TRUE
#> [1] TRUE
is.atomic(NULL) # FALSE (TRUE before R 4.4.0)
#> [1] FALSE
vld_vector(NULL) # FALSE (TRUE before R 4.4.0)
#> [1] FALSE
The dimension attribute converts vectors into matrices and arrays.
Function | Code |
---|---|
`chk_array(x)` | `is.array(x)` |
`chk_matrix(x)` | `is.matrix(x)` |
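The role of the dimension attribute can be seen with a few lines of base R. This is only an illustration of `dim()`, `is.matrix()` and `is.array()`; it does not use chk.

```r
v <- 1:6
is.matrix(v)   # FALSE: a plain vector has no dim attribute
is.array(v)    # FALSE

# Two dimensions turn the vector into a matrix (which is also an array).
dim(v) <- c(2L, 3L)
is.matrix(v)   # TRUE
is.array(v)    # TRUE

# Three dimensions give an array that is no longer a matrix.
dim(v) <- c(1L, 2L, 3L)
is.matrix(v)   # FALSE
is.array(v)    # TRUE
```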
A vector composed of heterogeneous data types can be a list. Data frames are among the most important S3 vectors and are built on top of lists.
Function | Code |
---|---|
`chk_list(x)` | `is.list(x)` |
`chk_data(x)` | `inherits(x, "data.frame")` |
Be careful not to confuse the function `chk_data` with `check_data`. Please read the `check_` functions section below and the function documentation.
Check if the function input has a data type. You can use the function `typeof()` to confirm the data type.
Function | Code |
---|---|
`chk_environment(x)` | `is.environment(x)` |
`chk_logical(x)` | `is.logical(x)` |
`chk_character(x)` | `is.character(x)` |
For numbers there are four functions. R differentiates between doubles (`chk_double()`) and integers (`chk_integer()`). You can also use the generic function `chk_numeric()`, which will detect both. The third type of number is complex (`chk_complex()`).
Function | Code |
---|---|
`chk_numeric(x)` | `is.numeric(x)` |
`chk_double(x)` | `is.double(x)` |
`chk_integer(x)` | `is.integer(x)` |
`chk_complex(x)` | `is.complex(x)` |
Note that to explicitly create an integer in R, you need to use the suffix `L`.
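A quick base R illustration of the `L` suffix and of how the numeric type checks respond to it:

```r
typeof(3)       # "double": a bare number literal is a double
typeof(3L)      # "integer": the L suffix creates an integer
is.integer(3)   # FALSE
is.integer(3L)  # TRUE
is.double(3L)   # FALSE
is.numeric(3)   # TRUE: is.numeric() accepts doubles ...
is.numeric(3L)  # TRUE: ... and integers
is.complex(3i)  # TRUE
```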
These functions accept whole numbers, whether they are explicitly integers or double types without fractional parts.
Function | Code |
---|---|
`chk_whole_numeric` | `is.integer(x) || (is.double(x) && vld_true(all.equal(x[!is.na(x)], trunc(x[!is.na(x)]))))` |
`chk_whole_number` | `vld_number(x) && (is.integer(x) || vld_true(all.equal(x, trunc(x))))` |
`chk_count` | `vld_whole_number(x) && x >= 0` |
If you want to consider both 3.0 and 3L as integers, it is safer to use the function `chk_whole_numeric`. Here, `x` is valid if it’s an integer or a double that can be converted to an integer without changing its value.
# Integer vector
vld_whole_numeric(c(1L, 2L, 3L)) # TRUE
#> [1] TRUE
# Double vector representing whole numbers
vld_whole_numeric(c(1.0, 2.0, 3.0)) # TRUE
#> [1] TRUE
# Double vector with fractional numbers
vld_whole_numeric(c(1.0, 2.2, 3.0)) # FALSE
#> [1] FALSE
The function `chk_whole_number` is similar to `chk_whole_numeric`, but it additionally requires `x` to be a single non-missing number (`length(x) == 1L`).
# Integer vector
vld_whole_numeric(c(1L, 2L, 3L)) # TRUE
#> [1] TRUE
vld_whole_number(c(1L, 2L, 3L)) # FALSE
#> [1] FALSE
vld_whole_number(c(1L)) # TRUE
#> [1] TRUE
`chk_count()` is a special case of `chk_whole_number`: it additionally requires the value to be non-negative.
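The relationship between a whole number and a count can be sketched in base R as follows. The helpers `is_whole_number()` and `is_count()` are hypothetical; they follow the expressions in the table above, with `isTRUE()` standing in for `vld_true()`.

```r
# Hypothetical helpers following the expressions in the table.
is_whole_number <- function(x) {
  is.numeric(x) && length(x) == 1L && !anyNA(x) &&
    (is.integer(x) || isTRUE(all.equal(x, trunc(x))))
}
is_count <- function(x) is_whole_number(x) && x >= 0

is_count(3)         # TRUE: a double without a fractional part
is_count(0L)        # TRUE: zero is allowed
is_count(-1)        # FALSE: negative
is_count(2.5)       # FALSE: fractional part
is_count(c(1, 2))   # FALSE: not a scalar
is_count(NA_real_)  # FALSE: missing
```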
Check if the function input is a factor
Function | Code |
---|---|
`chk_factor` | `is.factor(x)` |
`chk_character_or_factor` | `is.character(x) || is.factor(x)` |
Factors can be especially confusing for users because, although they are displayed as characters, they are built on top of integer vectors. `chk` provides the function `chk_character_or_factor()`, which detects whether the argument provided by the user contains strings.
# Factor with specified levels
vector_fruits <- c("apple", "banana", "apple", "orange", "banana", "apple")
factor_fruits <- factor(c("apple", "banana", "apple", "orange", "banana", "apple"),
                        levels = c("apple", "banana", "orange"))
is.factor(factor_fruits) # TRUE
#> [1] TRUE
vld_factor(factor_fruits) # TRUE
#> [1] TRUE
is.character(factor_fruits) # FALSE
#> [1] FALSE
vld_character(factor_fruits) # FALSE
#> [1] FALSE
vld_character_or_factor(factor_fruits) # TRUE
#> [1] TRUE
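The integer representation underneath a factor can be inspected with base R, which may help explain why `is.character()` returns FALSE for factors:

```r
f <- factor(c("apple", "banana", "apple"))

typeof(f)        # "integer": the underlying storage type
levels(f)        # "apple" "banana": the labels shown when printing
as.integer(f)    # 1 2 1: the integer codes pointing at the levels
as.character(f)  # "apple" "banana" "apple"
```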
Check if the function input has a characteristic shared by all its elements.
If you want to apply any of the previously defined functions for `length(x) == 1L` to the elements of a vector, you can use `chk_all()`.
Function | Code |
---|---|
`chk_all(x, chk_fun, ...)` | `all(vapply(x, chk_fun, TRUE, ...))` |
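The element-wise pattern in the Code column can be sketched in base R. The names `all_elements()`, `is_string()` and `is_gte()` are hypothetical helpers introduced only for this example.

```r
# Hypothetical helper: applies a scalar check to every element of x.
all_elements <- function(x, chk_fun, ...) all(vapply(x, chk_fun, TRUE, ...))

# Hypothetical scalar checks, following expressions shown elsewhere in this article.
is_string <- function(x) is.character(x) && length(x) == 1L && !anyNA(x)
is_gte    <- function(x, value = 0) all(x[!is.na(x)] >= value)

all_elements(c("a", "b"), is_string)      # TRUE
all_elements(c("a", NA), is_string)       # FALSE: one element is missing
all_elements(list("a", 1), is_string)     # FALSE: one element is numeric
all_elements(c(2, 3), is_gte, value = 2)  # TRUE: extra arguments go through ...
```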
Check if the function input is another function
The `formals` argument is the expected number of formal arguments of the function; when it is `NULL`, the number of arguments is not checked.
Function | Code |
---|---|
`chk_function` | `is.function(x) && (is.null(formals) || length(formals(x)) == formals)` |
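A base R sketch of this check is shown below. The helper `has_formals()` and its argument `n_formals` are hypothetical names (the argument is renamed here to avoid shadowing the base function `formals()`), and `add()` is an example function.

```r
# Hypothetical helper following the expression in the table.
has_formals <- function(x, n_formals = NULL) {
  is.function(x) && (is.null(n_formals) || length(formals(x)) == n_formals)
}

add <- function(a, b) a + b  # example function with two formal arguments

has_formals(add)      # TRUE: any function passes when n_formals is NULL
has_formals(add, 2)   # TRUE
has_formals(add, 1)   # FALSE: wrong number of formal arguments
has_formals("add")    # FALSE: a string is not a function
```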
Check if the function input has names and whether they are valid
The `chk_named()` function works with vectors, lists, data frames, and matrices that have named columns or rows. Do not confuse it with `check_names()`.
The `chk_valid_name()` function is specifically designed to check if the elements of a character vector are valid R names. If you want to know what is considered a valid name, please refer to the documentation for the `make.names()` function.
Function | Code |
---|---|
`chk_named(x)` | `!is.null(names(x))` |
`chk_valid_name(x)` | `identical(make.names(x[!is.na(x)]), as.character(x[!is.na(x)]))` |
vld_valid_name(c("name1", NA, "name_2", "validName")) # TRUE
#> [1] TRUE
vld_valid_name(c(1, 2, 3)) # FALSE
#> [1] FALSE
vld_named(data.frame(a = 1:5, b = 6:10)) # TRUE
#> [1] TRUE
vld_named(list(a = 1, b = 2)) # TRUE
#> [1] TRUE
vld_named(c(a = 1, b = 2)) # TRUE
#> [1] TRUE
vld_named(c(1, 2, 3)) # FALSE
#> [1] FALSE
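To see why some names are rejected, it can help to look at what `make.names()` does to them. The helper `is_valid_name()` below is a hypothetical base R wrapper around the expression from the table.

```r
# make.names() repairs invalid names; valid names are returned unchanged.
make.names(c("name1", "my var", "1abc"))
# "name1"  "my.var" "X1abc"

# Hypothetical helper: a name is valid if make.names() leaves it untouched.
is_valid_name <- function(x) {
  x <- x[!is.na(x)]
  identical(make.names(x), as.character(x))
}

is_valid_name(c("name1", "name_2"))  # TRUE
is_valid_name(c("my var", "1abc"))   # FALSE: both would be altered
```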
Check if the function input is part of a range of values. The function input should be numeric.
Function | Code |
---|---|
`chk_range(x, range = c(0, 1))` | `all(x[!is.na(x)] >= range[1] & x[!is.na(x)] <= range[2])` |
`chk_lt(x, value = 0)` | `all(x[!is.na(x)] < value)` |
`chk_lte(x, value = 0)` | `all(x[!is.na(x)] <= value)` |
`chk_gt(x, value = 0)` | `all(x[!is.na(x)] > value)` |
`chk_gte(x, value = 0)` | `all(x[!is.na(x)] >= value)` |
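The expressions above drop missing values before comparing, which the following base R sketch illustrates. `in_range()` and `is_gt()` are hypothetical helpers written for this example only.

```r
# Hypothetical helpers following the expressions in the table.
in_range <- function(x, range = c(0, 1)) {
  x <- x[!is.na(x)]  # missing values are ignored
  all(x >= range[1] & x <= range[2])
}
is_gt <- function(x, value = 0) all(x[!is.na(x)] > value)

in_range(c(0, 0.5, NA, 1))     # TRUE: bounds are inclusive, NA is ignored
in_range(c(0.5, 1.5))          # FALSE: 1.5 is above the upper bound
in_range(5, range = c(1, 10))  # TRUE
is_gt(c(1, 2, NA))             # TRUE
is_gt(c(0, 1))                 # FALSE: 0 is not strictly greater than 0
```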
Check if the function input is equal or similar to a predefined object.
The functions `chk_identical()`, `chk_equal()`, and `chk_equivalent()` are used to compare two objects, but they differ in how strict the comparison is. `chk_identical()` requires the two objects to be exactly identical. `chk_equal()` and `chk_equivalent()` check if `x` and `y` are numerically equivalent within a specified tolerance, but `chk_equivalent()` ignores differences in attributes.
Function | Code |
---|---|
`chk_identical(x, y)` | `identical(x, y)` |
`chk_equal(x, y, tolerance = sqrt(.Machine$double.eps))` | `vld_true(all.equal(x, y, tolerance))` |
`chk_equivalent(x, y, tolerance = sqrt(.Machine$double.eps))` | `vld_true(all.equal(x, y, tolerance, check.attributes = FALSE))` |
If you want to compare the elements of a vector, you can use the `chk_all_*` functions.
Function | Code |
---|---|
`chk_all_identical(x)` | `length(x) < 2L || all(vapply(x, vld_identical, TRUE, y = x[[1]]))` |
`chk_all_equal(x, tolerance = sqrt(.Machine$double.eps))` | `length(x) < 2L || all(vapply(x, vld_equal, TRUE, y = x[[1]], tolerance = tolerance))` |
`chk_all_equivalent(x, tolerance = sqrt(.Machine$double.eps))` | `length(x) < 2L || all(vapply(x, vld_equivalent, TRUE, y = x[[1]], tolerance = tolerance))` |
vld_all_identical(c(1, 2, 3)) # FALSE
#> [1] FALSE
vld_all_identical(c(1, 1, 1)) # TRUE
#> [1] TRUE
vld_identical(c(1, 2, 3), c(1, 2, 3)) # TRUE
#> [1] TRUE
vld_all_equal(c(0.1, 0.12, 0.13))
#> [1] FALSE
vld_all_equal(c(0.1, 0.12, 0.13), tolerance = 0.2)
#> [1] TRUE
vld_equal(c(0.1, 0.12, 0.13), c(0.1, 0.12, 0.13)) # TRUE
#> [1] TRUE
vld_equal(c(0.1, 0.12, 0.13), c(0.1, 0.12, 0.4), tolerance = 0.5) # TRUE
#> [1] TRUE
x <- c(0.1, 0.1, 0.1)
y <- c(0.1, 0.12, 0.13)
attr(y, "label") <- "Numbers"
vld_equal(x, y, tolerance = 0.5) # FALSE
#> [1] FALSE
vld_equivalent(x, y, tolerance = 0.5) # TRUE
#> [1] TRUE
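The three levels of strictness can also be seen with the underlying base R functions. This sketch uses `identical()` and `all.equal()` directly, with `isTRUE()` standing in for `vld_true()`; the `"units"` attribute is made up for the example.

```r
# Exact identity distinguishes integer from double storage.
identical(1L, 1)                   # FALSE
isTRUE(all.equal(1L, 1))           # TRUE

# Floating point arithmetic: equal within tolerance, not identical.
identical(0.1 + 0.2, 0.3)          # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE

# Attributes matter unless check.attributes = FALSE.
a <- c(1, 2, 3)
b <- c(1, 2, 3)
attr(b, "units") <- "cm"           # hypothetical attribute
isTRUE(all.equal(a, b))                            # FALSE
isTRUE(all.equal(a, b, check.attributes = FALSE))  # TRUE
```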
Check if the function input is sorted in increasing order
The `chk_sorted()` function checks if `x` is sorted in non-decreasing order, ignoring any NA values.
Function | Code |
---|---|
`chk_sorted(x)` | `!is.unsorted(x, na.rm = TRUE)` |
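A brief base R sketch of this check; `is_sorted()` is a hypothetical wrapper around the expression in the table.

```r
# Hypothetical helper following the expression in the table.
is_sorted <- function(x) !is.unsorted(x, na.rm = TRUE)

is_sorted(c(1, 2, 2, 3))   # TRUE: ties are allowed (non-decreasing)
is_sorted(c(1, NA, 2, 3))  # TRUE: NA values are removed first
is_sorted(c(3, 1, 2))      # FALSE
```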
Check if the function input is composed of certain elements
The `setequal()` function in R is used to check if two vectors contain exactly the same elements, regardless of the order or number of repetitions.
Function | Code |
---|---|
`chk_setequal(x, values)` | `setequal(x, values)` |
vld_setequal(c(1, 2, 3), c(3, 2, 1)) # TRUE
#> [1] TRUE
vld_setequal(c(1, 2, 3), c(3, 2, 1, 4)) # FALSE
#> [1] FALSE
vld_setequal(c(1, 2, 3, 4), c(3, 2, 1)) # FALSE
#> [1] FALSE
vld_setequal(c(1, 2), c(1, 1, 1, 1, 1, 1, 2, 1)) # TRUE
#> [1] TRUE
The `%in%` operator is used to check whether the elements of a vector `x` are present in a specified set of values. This returns a logical vector, which is then reduced to a single value by `all()`, which checks if all values in the vector are TRUE. For `vld_subset()` and `chk_subset()`, a TRUE result indicates that all elements of `x` are present in `values`. Similarly, for `vld_superset()` and `chk_superset()`, it indicates that all elements of `values` are present in `x`.
Function | Code |
---|---|
`chk_subset(x, values)` | `all(x %in% values)` |
`chk_not_subset(x, values)` | `!any(x %in% values) || !length(x)` |
`chk_superset(x, values)` | `all(values %in% x)` |
# When both function inputs have the same elements,
# all functions return TRUE
vld_setequal(c(1, 2, 3), c(3, 2, 1)) # TRUE
#> [1] TRUE
vld_subset(c(1, 2, 3), c(3, 2, 1)) # TRUE
#> [1] TRUE
vld_superset(c(1, 2, 3), c(3, 2, 1)) # TRUE
#> [1] TRUE
vld_setequal(c(1, 2), c(1, 1, 1, 1, 1, 1, 2, 1)) # TRUE
#> [1] TRUE
vld_subset(c(1, 2), c(1, 1, 1, 1, 1, 1, 2, 1)) # TRUE
#> [1] TRUE
vld_superset(c(1, 2), c(1, 1, 1, 1, 1, 1, 2, 1)) # TRUE
#> [1] TRUE
# When there are elements present in one vector but not the other,
# `vld_setequal()` will return FALSE
vld_setequal(c(1, 2, 3), c(3, 2, 1, 4)) # FALSE
#> [1] FALSE
vld_setequal(c(1, 2, 3, 4), c(3, 2, 1)) # FALSE
#> [1] FALSE
# When some elements of the `x` input are not present in `values`,
# `vld_subset()` returns FALSE
vld_subset(c(1, 2, 3, 4), c(3, 2, 1)) # FALSE
#> [1] FALSE
vld_superset(c(1, 2, 3, 4), c(3, 2, 1)) # TRUE
#> [1] TRUE
# When some elements of the `values` input are not present in `x`,
# `vld_superset()` returns FALSE
vld_subset(c(1, 2, 3), c(3, 2, 1, 4)) # TRUE
#> [1] TRUE
vld_superset(c(1, 2, 3), c(3, 2, 1, 4)) # FALSE
#> [1] FALSE
# An empty set is considered a subset of any set, and any set is a superset of an empty set.
vld_subset(c(), c("apple", "banana")) # TRUE
#> [1] TRUE
vld_superset(c("apple", "banana"), c()) # TRUE
#> [1] TRUE
`chk_orderset()` validates whether the values in a vector `x` match a specified set of allowed values (given by `values`) while preserving the order of those values.
Function | Code |
---|---|
`chk_orderset` | `vld_equivalent(unique(x[x %in% values]), values[values %in% x])` |
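The expression in the table can be followed step by step in base R. The helper `same_order()` is a hypothetical name, and `all.equal(..., check.attributes = FALSE)` wrapped in `isTRUE()` stands in for `vld_equivalent()`.

```r
# Hypothetical helper following the expression in the table.
same_order <- function(x, values) {
  in_x      <- unique(x[x %in% values])  # allowed elements of x, in x's order
  in_values <- values[values %in% x]     # the same elements, in values' order
  isTRUE(all.equal(in_x, in_values, check.attributes = FALSE))
}

same_order(c("a", "b", "c"), c("a", "b", "c", "d"))  # TRUE: same relative order
same_order(c("a", "a", "b"), c("a", "b"))            # TRUE: repeats are collapsed
same_order(c("a", "z", "b"), c("a", "b"))            # TRUE: "z" is filtered out by %in%
same_order(c("b", "a"), c("a", "b"))                 # FALSE: order differs
```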
Check if the function input belongs to a class or type.
These functions check if `x` is an S3 or S4 object of the specified class.
Function | Code |
---|---|
`chk_s3_class(x, class)` | `!isS4(x) && inherits(x, class)` |
`chk_s4_class(x, class)` | `isS4(x) && methods::is(x, class)` |
`chk_is()` checks if `x` inherits from a specified class, regardless of whether it is an S3 or S4 object.
Function | Code |
---|---|
`chk_is(x, class)` | `inherits(x, class)` |
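The distinction between the three checks can be sketched with one S3 object and one S4 object. The S4 class `SketchPoint` is a made-up class defined only for this example.

```r
# An S3 object: data frames carry the S3 class "data.frame".
df <- data.frame(a = 1:2)
!isS4(df) && inherits(df, "data.frame")    # TRUE: passes the S3 expression
isS4(df) && methods::is(df, "data.frame")  # FALSE: not an S4 object

# A hypothetical S4 class and one instance of it.
methods::setClass("SketchPoint", slots = c(x = "numeric"))
p <- methods::new("SketchPoint", x = 1)

isS4(p) && methods::is(p, "SketchPoint")   # TRUE: passes the S4 expression
!isS4(p) && inherits(p, "SketchPoint")     # FALSE: fails the S3 expression
inherits(p, "SketchPoint")                 # TRUE: inherits() alone ignores S3 vs S4
```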
Check if the function input matches a regular expression (REGEX).
`chk_match(x, regexp = ".+")` checks if the regular expression pattern specified by `regexp` matches all the non-missing values in the vector `x`. If `regexp` is not specified by the user, `chk_match()` checks whether all non-missing values in `x` contain at least one character (`regexp = ".+"`).
Function | Code |
---|---|
`chk_match(x, regexp = ".+")` | `all(grepl(regexp, x[!is.na(x)]))` |
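A base R sketch of the matching logic follows; `matches_all()` is a hypothetical wrapper around the expression in the table, and the second pattern is an arbitrary example.

```r
# Hypothetical helper following the expression in the table.
matches_all <- function(x, regexp = ".+") all(grepl(regexp, x[!is.na(x)]))

matches_all(c("abc", NA, "de"))               # TRUE: NA is ignored
matches_all(c("abc", ""))                     # FALSE: "" has no characters
matches_all(c("a1", "b2"), "^[a-z][0-9]$")    # TRUE
matches_all(c("a1", "bb"), "^[a-z][0-9]$")    # FALSE: "bb" does not match
```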
Check if the function input meets some user-defined quality criteria.
The `chk_not_empty()` function checks if the length of the object is not zero. For a matrix, the length is the number of elements (not rows or columns); for a data frame, it is the number of columns; for a vector or list, it is the number of elements.
The `chk_not_any_na()` function checks if there are no NA values present in the entire object.
Function | Code |
---|---|
`chk_not_empty(x)` | `length(x) != 0L` |
`chk_not_any_na(x)` | `!anyNA(x)` |
vld_not_empty(c()) # FALSE
#> [1] FALSE
vld_not_empty(list()) # FALSE
#> [1] FALSE
vld_not_empty(data.frame()) # FALSE
#> [1] FALSE
vld_not_empty(data.frame(a = 1:3, b = 4:6)) # TRUE
#> [1] TRUE
vld_not_any_na(data.frame(a = 1:3, b = 4:6)) # TRUE
#> [1] TRUE
vld_not_any_na(data.frame(a = c(1, NA, 3), b = c(4, 5, 6))) # FALSE
#> [1] FALSE
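Because these checks rely on `length()`, it can be useful to recall what base R reports for different kinds of objects:

```r
length(c(1, 2, 3))                     # 3: number of elements
length(list(a = 1, b = "x"))           # 2: number of list elements
length(matrix(1:6, nrow = 2))          # 6: number of cells, not rows or columns
length(data.frame(a = 1:3, b = 4:6))   # 2: number of columns
length(NULL)                           # 0
```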
The `chk_unique()` function is designed to verify that there are no duplicate elements in a vector.
Function | Code |
---|---|
`chk_unique(x, incomparables = FALSE)` | `!anyDuplicated(x, incomparables = incomparables)` |
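The effect of the `incomparables` argument is easiest to see with repeated missing values. The helper `is_unique()` is a hypothetical base R wrapper around the expression in the table.

```r
# Hypothetical helper following the expression in the table.
is_unique <- function(x, incomparables = FALSE) {
  !anyDuplicated(x, incomparables = incomparables)
}

is_unique(c(1, 2, 3))                        # TRUE
is_unique(c(1, 2, 2))                        # FALSE
is_unique(c(1, NA, NA))                      # FALSE: by default NA duplicates NA
is_unique(c(1, NA, NA), incomparables = NA)  # TRUE: NA is never treated as a duplicate
```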
The function `chk_length()` checks whether the length of `x` is within a specified range. It ensures that the length is at least equal to `length` and no more than `upper`. It can be used with vectors, lists and data frames.
Function | Code |
---|---|
`chk_length(x, length = 1L, upper = length)` | `length(x) >= length && length(x) <= upper` |
vld_length(c(1, 2, 3), length = 2, upper = 5) # TRUE
#> [1] TRUE
vld_length(c("a", "b"), length = 3) # FALSE
#> [1] FALSE
vld_length(list(a = 1, b = 2, c = 3), length = 2, upper = 4) # TRUE
#> [1] TRUE
vld_length(list(a = 1, b = 2, c = 3), length = 4) # FALSE
#> [1] FALSE
# 2 columns
vld_length(data.frame(x = 1:3, y = 4:6), length = 1, upper = 3) # TRUE
#> [1] TRUE
vld_length(data.frame(x = 1:3, y = 4:6), length = 3) # FALSE
#> [1] FALSE
# length of NULL is 0
vld_length(NULL, length = 0) # TRUE
#> [1] TRUE
vld_length(NULL, length = 1) # FALSE
#> [1] FALSE
Another useful function is `chk_compatible_lengths()`. This function helps check whether vectors could be ‘strictly recycled’.
a <- integer(0)
b <- numeric(0)
vld_compatible_lengths(a, b) # TRUE
#> [1] TRUE
a <- 1
b <- 2
vld_compatible_lengths(a, b) # TRUE
#> [1] TRUE
a <- 1:3
b <- 1:3
vld_compatible_lengths(a, b) # TRUE
#> [1] TRUE
b <- 1
vld_compatible_lengths(a, b) # TRUE
#> [1] TRUE
b <- 1:2
vld_compatible_lengths(a, b) # FALSE
#> [1] FALSE
b <- 1:6
vld_compatible_lengths(a, b) # FALSE
#> [1] FALSE
The `chk_join()` function is designed to validate whether the number of rows in the resulting data frame from merging two data frames (`x` and `y`) is equal to the number of rows in the first data frame (`x`). This is useful when you want to ensure that a join operation does not change the number of rows in your main data frame.
Function | Code |
---|---|
`chk_join(x, y, by)` | `identical(nrow(x), nrow(merge(x, unique(y[if (is.null(names(by))) by else names(by)]), by = by)))` |
x <- data.frame(id = c(1, 2, 3), value_x = c("A", "B", "C"))
y <- data.frame(id = c(1, 2, 3), value_y = c("D", "E", "F"))
vld_join(x, y, by = "id") # TRUE
#> [1] TRUE
# Perform a join that reduces the number of rows
y <- data.frame(id = c(1, 2, 1), value_y = c("D", "E", "F"))
vld_join(x, y, by = "id") # FALSE
#> [1] FALSE
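The second result can be traced with base R by following the expression in the table for an unnamed `by`. This is a sketch of the steps, not the package's code path; the intermediate names `keys` and `joined` are made up.

```r
x <- data.frame(id = c(1, 2, 3), value_x = c("A", "B", "C"))
y <- data.frame(id = c(1, 2, 1), value_y = c("D", "E", "F"))

# Only the distinct key values of y are kept before merging.
keys <- unique(y["id"])
nrow(keys)                        # 2: ids 1 and 2

# The inner join drops the row of x whose id (3) has no match in y.
joined <- merge(x, keys, by = "id")
nrow(joined)                      # 2

identical(nrow(x), nrow(joined))  # FALSE: 3 rows in x versus 2 after the join
```

Note that a plain `merge(x, y, by = "id")` would return 3 rows here, because id 1 matches twice while id 3 is dropped; deduplicating the keys first is what exposes the unmatched row.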
`check_` functions
The `check_` functions combine several `chk_` functions internally. Read the documentation for each function to learn more about its specific use.
Function | Description |
---|---|
`check_values(x, values)` | Checks values and S3 class of an atomic object. |
`check_key(x, key = character(0), na_distinct = FALSE)` | Checks if columns have unique rows. |
`check_data(x, values, exclusive, order, nrow, key)` | Checks column names, values, number of rows and key for a data.frame. |
`check_dim(x, dim, values, dim_name)` | Checks dimension of an object. |
`check_dirs(x, exists)` | Checks if all directories exist (or if `exists = FALSE` do not exist as directories or files). |
`check_files(x, exists)` | Checks if all files exist (or if `exists = FALSE` do not exist as files or directories). |
`check_names(x, names, exclusive, order)` | Checks the names of an object. |
Wickham, H. (2019). Advanced R (2nd ed.). Chapman and Hall/CRC.