matsindf_apply() is a powerful and versatile function
that enables analysis with lists and data frames by applying
FUN in helpful ways. The function is called
matsindf_apply(), because it can be used to apply
FUN to a matsindf data frame, a data frame
that contains matrices as individual entries. (A
matsindf data frame can be created by calling
collapse_to_matrices(), as demonstrated below.)
But matsindf_apply() can apply FUN across
much more: data frames of single numbers, lists of matrices, lists of
single numbers, and individual numbers. This vignette demonstrates
matsindf_apply(), starting with simple examples and
proceeding toward sophisticated analyses.
The basis of all analyses conducted with
matsindf_apply() is a function (FUN) to be
applied across data supplied in .dat or ....
FUN must return a named list of variables as its result.
Here is an example function that both adds and subtracts its arguments,
a and b, and returns a list containing its
results, c and d.
example_fun <- function(a, b){
  return(list(c = matsbyname::sum_byname(a, b),
              d = matsbyname::difference_byname(a, b)))
}
Similar to lapply() and its siblings, additional
argument(s) to matsindf_apply() include the data over which
FUN is to be applied. These arguments can, in the first
instance, be supplied as named arguments to the ...
argument of matsindf_apply(). All arguments in
... must be named. The ... arguments to
matsindf_apply() are passed to FUN according
to their names. In this case, the output of
matsindf_apply() is the named list returned by
FUN.
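The simplest case described above, with every argument supplied by name through ..., can be sketched as follows. This assumes the matsindf and matsbyname packages are installed; example_fun is repeated so that the snippet stands alone.

```r
library(matsindf)

# Same example function as above: returns a named list with a sum and a difference.
example_fun <- function(a, b) {
  list(c = matsbyname::sum_byname(a, b),
       d = matsbyname::difference_byname(a, b))
}

# a and b are passed through ... and matched to FUN's arguments by name.
res <- matsindf_apply(FUN = example_fun, a = 2, b = 1)
res$c  # sum of a and b
res$d  # difference of a and b
```

Because nothing is supplied to .dat here, only the items returned by FUN (c and d) appear in the result.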
Passing an additional argument (z = 2) causes an unused
argument error, because example_fun does not have a
z argument.
tryCatch(
  matsindf_apply(FUN = example_fun, a = 2, b = 1, z = 2),
  error = function(e){e}
)
#> <simpleError in matsindf_apply_types(.dat, FUN, ...): In matsindf::matsindf_apply(), the following unused arguments appeared in ...: z>
Failing to pass a needed argument (b) causes an error
that indicates the missing argument.
tryCatch(
  matsindf_apply(FUN = example_fun, a = 2),
  error = function(e){e}
)
#> Warning in matsindf_apply_types(.dat, FUN, ...): In matsindf::matsindf_apply(),
#> the following named arguments to FUN were not found in any of .dat, ..., or
#> defaults to FUN: b. Set .warn_missing_FUN_args = FALSE to suppress this warning
#> if you know what you are doing.
#> <getvarError in (function (a, b) { return(list(c = matsbyname::sum_byname(a, b), d = matsbyname::difference_byname(a, b)))})(a = 2): argument "b" is missing, with no default>
Alternatively, arguments to FUN can be given in a named
list to .dat, the first argument of
matsindf_apply(). When a value is assigned to
.dat, the return value from matsindf_apply()
contains all named variables in .dat (in this case both
a and b) in addition to the results provided
by FUN (in this case both c and
d).
matsindf_apply(list(a = 2, b = 1), FUN = example_fun)
#> $a
#> [1] 2
#>
#> $b
#> [1] 1
#>
#> $c
#> [1] 3
#>
#> $d
#> [1] 1
Extra variables are tolerated in .dat, because
.dat is considered to be a store of data from which
variables can be drawn as needed.
matsindf_apply(list(a = 2, b = 1, z = 42), FUN = example_fun)
#> $a
#> [1] 2
#>
#> $b
#> [1] 1
#>
#> $c
#> [1] 3
#>
#> $d
#> [1] 1
In contrast, arguments to ... are named explicitly by
the user, so including an extra argument in ... is
considered an error, as shown above.
If a named argument is supplied by both .dat and
..., the argument in ... takes precedence,
overriding the argument in .dat.
matsindf_apply(list(a = 2, b = 1), FUN = example_fun, a = 10)
#> $a
#> [1] 10
#>
#> $b
#> [1] 1
#>
#> $c
#> [1] 11
#>
#> $d
#> [1] 9
When supplying both .dat and
..., ... can contain named strings of length
1 which are interpreted as mappings from named items in
.dat to arguments in the signature of FUN. In
the example below, a = "z" indicates that argument
a to FUN should be supplied by item
z in .dat.
matsindf_apply(list(a = 2, b = 1, z = 42),
               FUN = example_fun, a = "z")
#> $a
#> [1] 2
#>
#> $b
#> [1] 1
#>
#> $z
#> [1] 42
#>
#> $c
#> [1] 43
#>
#> $d
#> [1] 41
If a named argument appears in both .dat and the output
of FUN, a name collision occurs in the output of
matsindf_apply(), and a warning is issued.
tryCatch(
  matsindf_apply(list(a = 2, b = 1, c = 42), FUN = example_fun),
  warning = function(w){w}
)
#> <simpleWarning in matsindf_apply(list(a = 2, b = 1, c = 42), FUN = example_fun): Name collision in matsindf::matsindf_apply(). The following arguments appear both in .dat and in the output of `FUN`: c>
FUN can accept more than just numerics.
example_fun_with_string() accepts a character string and a
numeric. However, because a ... argument that is a character
string of length 1 has special meaning (namely mapping
variables in .dat to arguments of FUN),
passing a character string of length 1 can cause an error.
To get around the problem, wrap the single string in a list, as shown
below.
example_fun_with_string <- function(str_a, b) {
  a <- as.numeric(str_a)
  list(added = matsbyname::sum_byname(a, b),
       subtracted = matsbyname::difference_byname(a, b))
}
# Causes an error
tryCatch(
  matsindf_apply(FUN = example_fun_with_string, str_a = "1", b = 2),
  error = function(e){e}
)
#> Warning in matsindf_apply_types(.dat, FUN, ...): In matsindf::matsindf_apply(),
#> the following named arguments to FUN were not found in any of .dat, ..., or
#> defaults to FUN: str_a. Set .warn_missing_FUN_args = FALSE to suppress this
#> warning if you know what you are doing.
#> <evalError in (function (str_a, b) { a <- as.numeric(str_a) list(added = matsbyname::sum_byname(a, b), subtracted = matsbyname::difference_byname(a, b))})(b = 2): argument "str_a" is missing, with no default>
# To solve the problem, wrap "1" in list().
matsindf_apply(FUN = example_fun_with_string, str_a = list("1"), b = 2)
#> $added
#> [1] 3
#>
#> $subtracted
#> [1] -1
matsindf_apply(FUN = example_fun_with_string, str_a = list("1"), b = list(2))
#> $added
#> [1] 3
#>
#> $subtracted
#> [1] -1
matsindf_apply(FUN = example_fun_with_string,
               str_a = list("1", "3"),
               b = list(2, 4))
#> $added
#> $added[[1]]
#> [1] 3
#>
#> $added[[2]]
#> [1] 7
#>
#>
#> $subtracted
#> $subtracted[[1]]
#> [1] -1
#>
#> $subtracted[[2]]
#> [1] -1
matsindf_apply(.dat = list(str_a = list("1"), b = list(2)), FUN = example_fun_with_string)
#> $str_a
#> [1] "1"
#>
#> $b
#> [1] 2
#>
#> $added
#> [1] 3
#>
#> $subtracted
#> [1] -1
matsindf_apply(.dat = list(m = list("1"), n = list(2)), FUN = example_fun_with_string,
               str_a = "m", b = "n")
#> $m
#> [1] "1"
#>
#> $n
#> [1] 2
#>
#> $added
#> [1] 3
#>
#> $subtracted
#> [1] -1
matsindf_apply() and data frames
.dat can also contain a data frame (or tibble), both of
which are fancy lists. When .dat is a data frame or tibble,
the output of matsindf_apply() is a tibble, and
FUN acts like a specialized dplyr::mutate(),
adding new columns at the right of .dat.
matsindf_apply(.dat = data.frame(str_a = c("1", "3"), b = c(2, 4)),
               FUN = example_fun_with_string)
#> # A tibble: 2 × 4
#> str_a b added subtracted
#> <chr> <dbl> <dbl> <dbl>
#> 1 1 2 3 -1
#> 2 3 4 7 -1
matsindf_apply(.dat = data.frame(str_a = c("1", "3"), b = c(2, 4)),
               FUN = example_fun_with_string,
               str_a = "str_a", b = "b")
#> # A tibble: 2 × 4
#> str_a b added subtracted
#> <chr> <dbl> <dbl> <dbl>
#> 1 1 2 3 -1
#> 2 3 4 7 -1
matsindf_apply(.dat = data.frame(m = c("1", "3"), n = c(2, 4)),
               FUN = example_fun_with_string,
               str_a = "m", b = "n")
#> # A tibble: 2 × 4
#> m n added subtracted
#> <chr> <dbl> <dbl> <dbl>
#> 1 1 2 3 -1
#> 2 3 4 7 -1
Additional niceties are available when .dat is a data
frame or a tibble. matsindf_apply() works when the data
frame is filled with single numeric values, as is typical.
df <- data.frame(a = 2:4, b = 1:3)
matsindf_apply(df, FUN = example_fun)
#> # A tibble: 3 × 4
#> a b c d
#> <int> <int> <int> <int>
#> 1 2 1 3 1
#> 2 3 2 5 1
#> 3 4 3 7 1
But matsindf_apply() also works with
matsindf data frames, data frames in which each cell of the
data frame is filled with a single matrix. To demonstrate use of
matsindf_apply() with a matsindf data frame,
we’ll construct a simple matsindf data frame
(midf) using functions in this package.
# Create a tidy data frame containing data for matrices
tidy <- tibble::tibble(Year = rep(c(rep(2017, 4), rep(2018, 4)), 2),
                       matnames = c(rep("U", 8), rep("V", 8)),
                       matvals = c(1:4, 11:14, 21:24, 31:34),
                       rownames = c(rep(c(rep("p1", 2), rep("p2", 2)), 2),
                                    rep(c(rep("i1", 2), rep("i2", 2)), 2)),
                       colnames = c(rep(c("i1", "i2"), 4),
                                    rep(c("p1", "p2"), 4))) |>
  dplyr::mutate(
    # Namespace-qualified so that dplyr need not be attached.
    rowtypes = dplyr::case_when(
      matnames == "U" ~ "Product",
      matnames == "V" ~ "Industry",
      TRUE ~ NA_character_
    ),
    coltypes = dplyr::case_when(
      matnames == "U" ~ "Industry",
      matnames == "V" ~ "Product",
      TRUE ~ NA_character_
    )
  )
tidy
#> # A tibble: 16 × 7
#> Year matnames matvals rownames colnames rowtypes coltypes
#> <dbl> <chr> <int> <chr> <chr> <chr> <chr>
#> 1 2017 U 1 p1 i1 Product Industry
#> 2 2017 U 2 p1 i2 Product Industry
#> 3 2017 U 3 p2 i1 Product Industry
#> 4 2017 U 4 p2 i2 Product Industry
#> 5 2018 U 11 p1 i1 Product Industry
#> 6 2018 U 12 p1 i2 Product Industry
#> 7 2018 U 13 p2 i1 Product Industry
#> 8 2018 U 14 p2 i2 Product Industry
#> 9 2017 V 21 i1 p1 Industry Product
#> 10 2017 V 22 i1 p2 Industry Product
#> 11 2017 V 23 i2 p1 Industry Product
#> 12 2017 V 24 i2 p2 Industry Product
#> 13 2018 V 31 i1 p1 Industry Product
#> 14 2018 V 32 i1 p2 Industry Product
#> 15 2018 V 33 i2 p1 Industry Product
#> 16 2018 V 34 i2 p2 Industry Product
# Convert to a matsindf data frame
midf <- tidy |>
  dplyr::group_by(Year, matnames) |>
  collapse_to_matrices(rowtypes = "rowtypes", coltypes = "coltypes") |>
  tidyr::pivot_wider(names_from = "matnames", values_from = "matvals")
# Take a look at the midf data frame and some of the matrices it contains.
midf
#> # A tibble: 2 × 3
#> Year U V
#> <dbl> <list> <list>
#> 1 2017 <dbl [2 × 2]> <dbl [2 × 2]>
#> 2 2018 <dbl [2 × 2]> <dbl [2 × 2]>
midf$U[[1]]
#> i1 i2
#> p1 1 2
#> p2 3 4
#> attr(,"rowtype")
#> [1] "Product"
#> attr(,"coltype")
#> [1] "Industry"
midf$V[[1]]
#> p1 p2
#> i1 21 22
#> i2 23 24
#> attr(,"rowtype")
#> [1] "Industry"
#> attr(,"coltype")
#> [1] "Product"
With midf in hand, we can demonstrate use of tidyverse-style
functional programming to perform matrix algebra within a data frame.
The functions of the matsbyname package (such as
difference_byname() below) can be used for this
purpose.
result <- midf |>
  dplyr::mutate(
    W = matsbyname::difference_byname(matsbyname::transpose_byname(V), U)
  )
result
#> # A tibble: 2 × 4
#> Year U V W
#> <dbl> <list> <list> <list>
#> 1 2017 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
#> 2 2018 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
result$W[[1]]
#> i1 i2
#> p1 20 21
#> p2 19 20
#> attr(,"rowtype")
#> [1] "Product"
#> attr(,"coltype")
#> [1] "Industry"
result$W[[2]]
#> i1 i2
#> p1 20 21
#> p2 19 20
#> attr(,"rowtype")
#> [1] "Product"
#> attr(,"coltype")
#> [1] "Industry"
This way of performing matrix calculations works equally well within
a 2-row matsindf data frame (as shown above) or within a
1000-row matsindf data frame.
Programming with matsindf_apply()
Users can write their own functions using
matsindf_apply(). A flexible calc_W() function
can be written as follows.
calc_W <- function(.DF = NULL, U = "U", V = "V", W = "W") {
  # The inner function does all the work.
  W_func <- function(U_mat, V_mat){
    # When we get here, U_mat and V_mat will be single matrices or single numbers,
    # not a column in a data frame or an item in a list.
    if (length(U_mat) == 0 && length(V_mat) == 0) {
      # Tolerate zero-length arguments by returning
      # a list with the correct name and a zero-length value.
      return(list(numeric()) |> magrittr::set_names(W))
    }
    # Calculate W_mat from the inputs U_mat and V_mat.
    W_mat <- matsbyname::difference_byname(
      matsbyname::transpose_byname(V_mat),
      U_mat)
    # Return a named list.
    list(W_mat) |> magrittr::set_names(W)
  }
  # The body of the main function consists of a call to matsindf_apply
  # that specifies the inner function in the FUN argument.
  matsindf_apply(.DF, FUN = W_func, U_mat = U, V_mat = V)
}
This style of writing matsindf_apply() functions is
incredibly versatile, leveraging the capabilities of both the
matsindf and matsbyname packages. (Indeed, the
Recca package uses matsindf_apply() heavily
and is built upon the functions in the matsindf and
matsbyname packages.)
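A zero-length guard of the kind used inside an inner function is easy to get subtly wrong, so it can be worth checking in isolation. The sketch below uses only base R; both_empty() and misplaced_paren() are hypothetical helper names introduced for illustration and are not part of matsindf.

```r
# Hypothetical helper: TRUE only when both inputs are zero-length.
# `&&` is the scalar, short-circuiting operator intended for `if` conditions.
both_empty <- function(x, y) {
  length(x) == 0 && length(y) == 0
}

# A variant with the closing parenthesis in the wrong place.
# `length(y == 0)` counts the elements of a logical vector
# instead of testing whether y is empty.
misplaced_paren <- function(x, y) {
  length(x) == 0 & length(y == 0)
}

both_empty(numeric(), numeric())       # TRUE
both_empty(numeric(), 1:3)             # FALSE
misplaced_paren(numeric(), numeric())  # FALSE, because TRUE & 0 is FALSE
misplaced_paren(numeric(), 1:3)        # TRUE, because TRUE & 3 is TRUE
```

The misplaced parenthesis gives the wrong answer in both cases shown, which is why the guard is best written with `&&` and with each length() call closed before the comparison.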
Functions written like calc_W() can operate in ways
similar to matsindf_apply() itself. To demonstrate, we’ll
use calc_W() in all the ways that
matsindf_apply() can be used, going in the reverse order to
our demonstration of the capabilities of matsindf_apply()
above.
calc_W() can be used as a specialized
mutate function that operates on matsindf data
frames.
midf |> calc_W()
#> # A tibble: 2 × 4
#> Year U V W
#> <dbl> <list> <list> <list>
#> 1 2017 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
#> 2 2018 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
The added column could be given a different name from the default
(“W”) using the W argument.
midf |> calc_W(W = "W_prime")
#> # A tibble: 2 × 4
#> Year U V W_prime
#> <dbl> <list> <list> <list>
#> 1 2017 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
#> 2 2018 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
As with matsindf_apply(), column names in
midf can be mapped to the inputs of the calculation
through the U and V arguments of calc_W().
midf |>
  dplyr::rename(X = U, Y = V) |>
  calc_W(U = "X", V = "Y")
#> # A tibble: 2 × 4
#> Year X Y W
#> <dbl> <list> <list> <list>
#> 1 2017 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
#> 2 2018 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
calc_W() can operate on lists of single matrices, too.
This approach works, because the default values for the U
and V arguments to calc_W() are “U” and “V”,
respectively. The input list members (in this case
midf$U[[1]] and midf$V[[1]]) are returned with
the output, because list(U = midf$U[[1]], V = midf$V[[1]])
is passed to the .dat argument of
matsindf_apply().
calc_W(list(U = midf$U[[1]], V = midf$V[[1]]))
#> $U
#> i1 i2
#> p1 1 2
#> p2 3 4
#> attr(,"rowtype")
#> [1] "Product"
#> attr(,"coltype")
#> [1] "Industry"
#>
#> $V
#> p1 p2
#> i1 21 22
#> i2 23 24
#> attr(,"rowtype")
#> [1] "Industry"
#> attr(,"coltype")
#> [1] "Product"
#>
#> $W
#> i1 i2
#> p1 20 21
#> p2 19 20
#> attr(,"rowtype")
#> [1] "Product"
#> attr(,"coltype")
#> [1] "Industry"
It may be clearer to name the arguments as required by the
calc_W() function without wrapping in a list first, as
shown below. But in this approach, the input matrices are not returned
with the output, because arguments U and V are
passed to the ... argument of
matsindf_apply(), not the .dat argument of
matsindf_apply().
calc_W(U = midf$U[[1]], V = midf$V[[1]])
#> $W
#> i1 i2
#> p1 20 21
#> p2 19 20
#> attr(,"rowtype")
#> [1] "Product"
#> attr(,"coltype")
#> [1] "Industry"
calc_W() can operate on data frames containing single
numbers.
data.frame(U = c(1, 2), V = c(3, 4)) |> calc_W()
#> # A tibble: 2 × 3
#> U V W
#> <dbl> <dbl> <dbl>
#> 1 1 3 2
#> 2 2 4 2
Finally, calc_W() can be applied to single numbers, and
the result is a 1x1 matrix.
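A minimal sketch of the single-number case, assuming the matsindf and matsbyname packages are installed. calc_W_simple() is a hypothetical, pared-down variant of calc_W() (no zero-length guard, and stats::setNames() in place of magrittr), introduced only so that the snippet is self-contained.

```r
library(matsindf)

# Hypothetical simplified variant of calc_W(), for illustration only.
calc_W_simple <- function(.DF = NULL, U = "U", V = "V", W = "W") {
  W_func <- function(U_mat, V_mat) {
    # W = transpose(V) - U
    W_mat <- matsbyname::difference_byname(
      matsbyname::transpose_byname(V_mat),
      U_mat)
    # FUN must return a named list.
    stats::setNames(list(W_mat), W)
  }
  matsindf_apply(.DF, FUN = W_func, U_mat = U, V_mat = V)
}

# Single numbers are passed through ..., so only W is returned.
res <- calc_W_simple(U = 2, V = 3)
res$W  # numerically, 3 - 2
```

As with the matrix case, the inputs are not echoed in the result because they reach matsindf_apply() through ... and not through .dat.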
It is good practice to write internal functions that tolerate
zero-length inputs, as calc_W() does. Doing so enables
results from different calculations to be bound together by rows
(for example, with dplyr::bind_rows()).
calc_W(U = numeric(), V = numeric())
#> $W
#> numeric(0)
calc_W(list(U = numeric(), V = numeric()))
#> $U
#> numeric(0)
#>
#> $V
#> numeric(0)
#>
#> $W
#> numeric(0)
res <- calc_W(list(U = c(2, 3, 4, 5), V = c(3, 4, 5, 6)))
res0 <- calc_W(list(U = numeric(), V = numeric()))
dplyr::bind_rows(res, res0)
#> # A tibble: 4 × 3
#> U V W
#> <dbl> <dbl> <dbl>
#> 1 2 3 1
#> 2 3 4 1
#> 3 4 5 1
#> 4 5 6 1
This vignette demonstrated use of the versatile
matsindf_apply() function. Inputs to
matsindf_apply() can be matsindf data frames, data frames of
single numbers, lists of matrices, lists of single numbers, or
individual numbers.
matsindf_apply() can be used for programming, and
functions constructed as demonstrated above share characteristics with
matsindf_apply(): they can act as specialized dplyr::mutate()
operators, and they can be applied to lists and to individual values.