matsindf_apply()
is a powerful and versatile function
that enables analysis with lists and data frames by applying
FUN
in helpful ways. The function is called
matsindf_apply()
, because it can be used to apply
FUN
to a matsindf
data frame, a data frame
that contains matrices as individual entries. (A
matsindf
data frame can be created by calling
collapse_to_matrices()
, as demonstrated below.)
But matsindf_apply()
can apply FUN
across
much more: data frames of single numbers, lists of matrices, lists of
single numbers, and individual numbers. This vignette demonstrates
matsindf_apply()
, starting with simple examples and
proceeding toward sophisticated analyses.
The basis of all analyses conducted with
matsindf_apply()
is a function (FUN
) to be
applied across data supplied in .dat
or ...
.
FUN
must return a named list of variables as its result.
Here is an example function that both adds and subtracts its arguments,
a
and b
, and returns a list containing its
results, c
and d
.
example_fun <- function(a, b){
  return(list(c = matsbyname::sum_byname(a, b),
              d = matsbyname::difference_byname(a, b)))
}
Similar to lapply()
and its siblings, additional
argument(s) to matsindf_apply()
include the data over which
FUN
is to be applied. These arguments can, in the first
instance, be supplied as named arguments to the ...
argument of matsindf_apply()
. All arguments in
...
must be named. The ...
arguments to
matsindf_apply()
are passed to FUN
according
to their names. In this case, the output of
matsindf_apply()
is the named list returned by
FUN
.
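A minimal sketch of the call described above, assuming the matsindf and matsbyname packages are installed. example_fun is repeated here only so that the snippet is self-contained.

```r
library(matsindf)

# Same function as defined above: returns the sum and difference of a and b.
example_fun <- function(a, b) {
  list(c = matsbyname::sum_byname(a, b),
       d = matsbyname::difference_byname(a, b))
}

# a and b are supplied through ... and matched to the arguments of FUN by name.
res <- matsindf_apply(FUN = example_fun, a = 2, b = 1)
res$c  # sum of a and b
res$d  # difference a - b
```

Because no .dat is supplied, only the items returned by FUN (c and d) appear in the result.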
Passing an additional argument (z = 2
) causes an unused
argument error, because example_fun
does not have a
z
argument.
tryCatch(
matsindf_apply(FUN = example_fun, a = 2, b = 1, z = 2),
error = function(e){e}
)
#> <simpleError in matsindf_apply_types(.dat, FUN, ...): In matsindf::matsindf_apply(), the following unused arguments appeared in ...: z>
Failing to pass a needed argument (b
) causes an error
that indicates the missing argument.
tryCatch(
matsindf_apply(FUN = example_fun, a = 2),
error = function(e){e}
)
#> Warning in matsindf_apply_types(.dat, FUN, ...): In matsindf::matsindf_apply(),
#> the following named arguments to FUN were not found in any of .dat, ..., or
#> defaults to FUN: b. Set .warn_missing_FUN_args = FALSE to suppress this warning
#> if you know what you are doing.
#> <simpleError in (function (a, b) { return(list(c = matsbyname::sum_byname(a, b), d = matsbyname::difference_byname(a, b)))})(a = 2): argument "b" is missing, with no default>
Alternatively, arguments to FUN
can be given in a named
list to .dat
, the first argument of
matsindf_apply()
. When a value is assigned to
.dat
, the return value from matsindf_apply()
contains all named variables in .dat
(in this case both
a
and b
) in addition to the results provided
by FUN
(in this case both c
and
d
).
matsindf_apply(list(a = 2, b = 1), FUN = example_fun)
#> $a
#> [1] 2
#>
#> $b
#> [1] 1
#>
#> $c
#> [1] 3
#>
#> $d
#> [1] 1
Extra variables are tolerated in .dat
, because
.dat
is considered to be a store of data from which
variables can be drawn as needed.
matsindf_apply(list(a = 2, b = 1, z = 42), FUN = example_fun)
#> $a
#> [1] 2
#>
#> $b
#> [1] 1
#>
#> $c
#> [1] 3
#>
#> $d
#> [1] 1
In contrast, arguments to ...
are named explicitly by
the user, so including an extra argument in ...
is
considered an error, as shown above.
If a named argument is supplied by both .dat
and
...
, the argument in ...
takes precedence,
overriding the argument in .dat
.
matsindf_apply(list(a = 2, b = 1), FUN = example_fun, a = 10)
#> $a
#> [1] 10
#>
#> $b
#> [1] 1
#>
#> $c
#> [1] 11
#>
#> $d
#> [1] 9
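The precedence rule can be pictured with a small base R sketch. This is an illustration only, not the implementation used inside matsindf; merge_args() and add_and_subtract() are hypothetical names introduced for this example.

```r
# Hypothetical helper (not part of matsindf): merge .dat with ... so that
# values named in ... replace same-named values in .dat.
merge_args <- function(.dat, ...) {
  utils::modifyList(.dat, list(...))
}

# Plain-arithmetic stand-in for example_fun (an assumption for illustration).
add_and_subtract <- function(a, b) {
  list(c = a + b, d = a - b)
}

# a = 10 from ... overrides a = 2 from .dat; b = 1 is drawn from .dat.
merged <- merge_args(list(a = 2, b = 1), a = 10)
out <- do.call(add_and_subtract, merged)
```

The merged values (a = 10, b = 1) and the results (c = 11, d = 9) agree with the output of matsindf_apply() shown above.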
When supplying both .dat
and
...
, ...
can contain named strings of length
1
which are interpreted as mappings from named items in
.dat
to arguments in the signature of FUN
. In
the example below, a = "z"
indicates that argument
a
to FUN
should be supplied by item
z
in .dat
.
matsindf_apply(list(a = 2, b = 1, z = 42),
FUN = example_fun, a = "z")
#> $a
#> [1] 2
#>
#> $b
#> [1] 1
#>
#> $z
#> [1] 42
#>
#> $c
#> [1] 43
#>
#> $d
#> [1] 41
If a named argument appears in both .dat
and the output
of FUN
, a name collision occurs in the output of
matsindf_apply()
, and a warning is issued.
tryCatch(
matsindf_apply(list(a = 2, b = 1, c = 42), FUN = example_fun),
warning = function(w){w}
)
#> <simpleWarning in matsindf_apply(list(a = 2, b = 1, c = 42), FUN = example_fun): Name collision in matsindf::matsindf_apply(). The following arguments appear both in .dat and in the output of `FUN`: c>
FUN
can accept more than just numerics.
example_fun_with_string()
accepts a character string and a
numeric. However, because a ...
argument that is a character
string of length 1
has special meaning (namely mapping
variables in .dat
to arguments of FUN
),
passing a character string of length 1
can cause an error.
To get around the problem, wrap the single string in a list, as shown
below.
example_fun_with_string <- function(str_a, b) {
a <- as.numeric(str_a)
list(added = matsbyname::sum_byname(a, b), subtracted = matsbyname::difference_byname(a, b))
}
# Causes an error
tryCatch(
matsindf_apply(FUN = example_fun_with_string, str_a = "1", b = 2),
error = function(e){e}
)
#> Warning in matsindf_apply_types(.dat, FUN, ...): In matsindf::matsindf_apply(),
#> the following named arguments to FUN were not found in any of .dat, ..., or
#> defaults to FUN: str_a. Set .warn_missing_FUN_args = FALSE to suppress this
#> warning if you know what you are doing.
#> <simpleError in (function (str_a, b) { a <- as.numeric(str_a) list(added = matsbyname::sum_byname(a, b), subtracted = matsbyname::difference_byname(a, b))})(b = 2): argument "str_a" is missing, with no default>
# To solve the problem, wrap "1" in list().
matsindf_apply(FUN = example_fun_with_string, str_a = list("1"), b = 2)
#> $added
#> [1] 3
#>
#> $subtracted
#> [1] -1
matsindf_apply(FUN = example_fun_with_string, str_a = list("1"), b = list(2))
#> $added
#> [1] 3
#>
#> $subtracted
#> [1] -1
matsindf_apply(FUN = example_fun_with_string,
str_a = list("1", "3"),
b = list(2, 4))
#> $added
#> $added[[1]]
#> [1] 3
#>
#> $added[[2]]
#> [1] 7
#>
#>
#> $subtracted
#> $subtracted[[1]]
#> [1] -1
#>
#> $subtracted[[2]]
#> [1] -1
matsindf_apply(.dat = list(str_a = list("1"), b = list(2)), FUN = example_fun_with_string)
#> $str_a
#> [1] "1"
#>
#> $b
#> [1] 2
#>
#> $added
#> [1] 3
#>
#> $subtracted
#> [1] -1
matsindf_apply(.dat = list(m = list("1"), n = list(2)), FUN = example_fun_with_string,
str_a = "m", b = "n")
#> $m
#> [1] "1"
#>
#> $n
#> [1] 2
#>
#> $added
#> [1] 3
#>
#> $subtracted
#> [1] -1
matsindf_apply() and data frames
.dat
can also contain a data frame (or tibble), both of
which are fancy lists. When .dat
is a data frame or tibble,
the output of matsindf_apply()
is a tibble, and
FUN
acts like a specialized dplyr::mutate()
,
adding new columns at the right of .dat
.
matsindf_apply(.dat = data.frame(str_a = c("1", "3"), b = c(2, 4)),
FUN = example_fun_with_string)
#> # A tibble: 2 × 4
#> str_a b added subtracted
#> <chr> <dbl> <dbl> <dbl>
#> 1 1 2 3 -1
#> 2 3 4 7 -1
matsindf_apply(.dat = data.frame(str_a = c("1", "3"), b = c(2, 4)),
FUN = example_fun_with_string,
str_a = "str_a", b = "b")
#> # A tibble: 2 × 4
#> str_a b added subtracted
#> <chr> <dbl> <dbl> <dbl>
#> 1 1 2 3 -1
#> 2 3 4 7 -1
matsindf_apply(.dat = data.frame(m = c("1", "3"), n = c(2, 4)),
FUN = example_fun_with_string,
str_a = "m", b = "n")
#> # A tibble: 2 × 4
#> m n added subtracted
#> <chr> <dbl> <dbl> <dbl>
#> 1 1 2 3 -1
#> 2 3 4 7 -1
Additional niceties are available when .dat
is a data
frame or a tibble. matsindf_apply()
works when the data
frame is filled with single numeric values, as is typical.
df <- data.frame(a = 2:4, b = 1:3)
matsindf_apply(df, FUN = example_fun)
#> # A tibble: 3 × 4
#> a b c d
#> <int> <int> <int> <int>
#> 1 2 1 3 1
#> 2 3 2 5 1
#> 3 4 3 7 1
But matsindf_apply()
also works with
matsindf
data frames, data frames in which each cell of the
data frame is filled with a single matrix. To demonstrate use of
matsindf_apply()
with a matsindf
data frame,
we’ll construct a simple matsindf
data frame
(midf
) using functions in this package.
# Create a tidy data frame containing data for matrices
tidy <- tibble::tibble(Year = rep(c(rep(2017, 4), rep(2018, 4)), 2),
                       matnames = c(rep("U", 8), rep("V", 8)),
                       matvals = c(1:4, 11:14, 21:24, 31:34),
                       rownames = c(rep(c(rep("p1", 2), rep("p2", 2)), 2),
                                    rep(c(rep("i1", 2), rep("i2", 2)), 2)),
                       colnames = c(rep(c("i1", "i2"), 4),
                                    rep(c("p1", "p2"), 4))) |>
  dplyr::mutate(
    # case_when() is namespace-qualified so the code runs without attaching dplyr.
    rowtypes = dplyr::case_when(
      matnames == "U" ~ "Product",
      matnames == "V" ~ "Industry",
      TRUE ~ NA_character_
    ),
    coltypes = dplyr::case_when(
      matnames == "U" ~ "Industry",
      matnames == "V" ~ "Product",
      TRUE ~ NA_character_
    )
  )
tidy
#> # A tibble: 16 × 7
#> Year matnames matvals rownames colnames rowtypes coltypes
#> <dbl> <chr> <int> <chr> <chr> <chr> <chr>
#> 1 2017 U 1 p1 i1 Product Industry
#> 2 2017 U 2 p1 i2 Product Industry
#> 3 2017 U 3 p2 i1 Product Industry
#> 4 2017 U 4 p2 i2 Product Industry
#> 5 2018 U 11 p1 i1 Product Industry
#> 6 2018 U 12 p1 i2 Product Industry
#> 7 2018 U 13 p2 i1 Product Industry
#> 8 2018 U 14 p2 i2 Product Industry
#> 9 2017 V 21 i1 p1 Industry Product
#> 10 2017 V 22 i1 p2 Industry Product
#> 11 2017 V 23 i2 p1 Industry Product
#> 12 2017 V 24 i2 p2 Industry Product
#> 13 2018 V 31 i1 p1 Industry Product
#> 14 2018 V 32 i1 p2 Industry Product
#> 15 2018 V 33 i2 p1 Industry Product
#> 16 2018 V 34 i2 p2 Industry Product
# Convert to a matsindf data frame
midf <- tidy |>
  dplyr::group_by(Year, matnames) |>
  collapse_to_matrices(rowtypes = "rowtypes", coltypes = "coltypes") |>
  tidyr::pivot_wider(names_from = "matnames", values_from = "matvals")
# Take a look at the midf data frame and some of the matrices it contains.
midf
#> # A tibble: 2 × 3
#> Year U V
#> <dbl> <list> <list>
#> 1 2017 <dbl [2 × 2]> <dbl [2 × 2]>
#> 2 2018 <dbl [2 × 2]> <dbl [2 × 2]>
midf$U[[1]]
#> i1 i2
#> p1 1 2
#> p2 3 4
#> attr(,"rowtype")
#> [1] "Product"
#> attr(,"coltype")
#> [1] "Industry"
midf$V[[1]]
#> p1 p2
#> i1 21 22
#> i2 23 24
#> attr(,"rowtype")
#> [1] "Industry"
#> attr(,"coltype")
#> [1] "Product"
With midf
in hand, we can demonstrate use of tidyverse
-style
functional programming to perform matrix algebra within a data frame.
The functions of the matsbyname
package (such as
difference_byname()
below) can be used for this
purpose.
result <- midf |>
dplyr::mutate(
W = difference_byname(transpose_byname(V), U)
)
result
#> # A tibble: 2 × 4
#> Year U V W
#> <dbl> <list> <list> <list>
#> 1 2017 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
#> 2 2018 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
result$W[[1]]
#> i1 i2
#> p1 20 21
#> p2 19 20
#> attr(,"rowtype")
#> [1] "Product"
#> attr(,"coltype")
#> [1] "Industry"
result$W[[2]]
#> i1 i2
#> p1 20 21
#> p2 19 20
#> attr(,"rowtype")
#> [1] "Product"
#> attr(,"coltype")
#> [1] "Industry"
This way of performing matrix calculations works equally well within
a 2-row matsindf
data frame (as shown above) or within a
1000-row matsindf
data frame.
matsindf_apply()
Users can write their own functions using
matsindf_apply()
. A flexible calc_W()
function
can be written as follows.
calc_W <- function(.DF = NULL, U = "U", V = "V", W = "W") {
  # The inner function does all the work.
  W_func <- function(U_mat, V_mat){
    # When we get here, U_mat and V_mat will be single matrices or single numbers,
    # not a column in a data frame or an item in a list.
    if (length(U_mat) == 0 && length(V_mat) == 0) {
      # Tolerate zero-length arguments by returning a zero-length
      # list with the correct name and return type.
      return(list(numeric()) |> magrittr::set_names(W))
    }
    # Calculate W_mat from the inputs U_mat and V_mat.
    W_mat <- matsbyname::difference_byname(
      matsbyname::transpose_byname(V_mat),
      U_mat)
    # Return a named list.
    list(W_mat) |> magrittr::set_names(W)
  }
  # The body of the main function consists of a call to matsindf_apply
  # that specifies the inner function in the FUN argument.
  matsindf_apply(.DF, FUN = W_func, U_mat = U, V_mat = V)
}
This style of writing matsindf_apply()
functions is
incredibly versatile, leveraging the capabilities of both the
matsindf
and matsbyname
packages. (Indeed, the
Recca
package uses matsindf_apply()
heavily
and is built upon the functions in the matsindf
and
matsbyname
packages.)
Functions written like calc_W()
can operate in ways
similar to matsindf_apply()
itself. To demonstrate, we’ll
use calc_W()
in all the ways that
matsindf_apply()
can be used, going in the reverse order to
our demonstration of the capabilities of matsindf_apply()
above.
calc_W()
can be used as a specialized
mutate
function that operates on matsindf
data
frames.
midf |> calc_W()
#> # A tibble: 2 × 4
#> Year U V W
#> <dbl> <list> <list> <list>
#> 1 2017 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
#> 2 2018 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
The added column could be given a different name from the default
(“W
”) using the W
argument.
midf |> calc_W(W = "W_prime")
#> # A tibble: 2 × 4
#> Year U V W_prime
#> <dbl> <list> <list> <list>
#> 1 2017 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
#> 2 2018 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
As with matsindf_apply()
, column names in
midf
can be mapped to the arguments of
calc_W()
by the arguments to calc_W()
.
midf |>
dplyr::rename(X = U, Y = V) |>
calc_W(U = "X", V = "Y")
#> # A tibble: 2 × 4
#> Year X Y W
#> <dbl> <list> <list> <list>
#> 1 2017 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
#> 2 2018 <dbl [2 × 2]> <dbl [2 × 2]> <dbl [2 × 2]>
calc_W()
can operate on lists of single matrices, too.
This approach works, because the default values for the U
and V
arguments to calc_W()
are “U” and “V”,
respectively. The input list members (in this case
midf$U[[1]]
and midf$V[[1]]
) are returned with
the output, because list(U = midf$U[[1]], V = midf$V[[1]])
is passed to the .dat
argument of
matsindf_apply()
.
calc_W(list(U = midf$U[[1]], V = midf$V[[1]]))
#> $U
#> i1 i2
#> p1 1 2
#> p2 3 4
#> attr(,"rowtype")
#> [1] "Product"
#> attr(,"coltype")
#> [1] "Industry"
#>
#> $V
#> p1 p2
#> i1 21 22
#> i2 23 24
#> attr(,"rowtype")
#> [1] "Industry"
#> attr(,"coltype")
#> [1] "Product"
#>
#> $W
#> i1 i2
#> p1 20 21
#> p2 19 20
#> attr(,"rowtype")
#> [1] "Product"
#> attr(,"coltype")
#> [1] "Industry"
It may be clearer to name the arguments as required by the
calc_W()
function without wrapping in a list first, as
shown below. But in this approach, the input matrices are not returned
with the output, because arguments U
and V
are
passed to the ...
argument of
matsindf_apply()
, not the .dat
argument of
matsindf_apply()
.
calc_W(U = midf$U[[1]], V = midf$V[[1]])
#> $W
#> i1 i2
#> p1 20 21
#> p2 19 20
#> attr(,"rowtype")
#> [1] "Product"
#> attr(,"coltype")
#> [1] "Industry"
calc_W()
can operate on data frames containing single
numbers.
data.frame(U = c(1, 2), V = c(3, 4)) |> calc_W()
#> # A tibble: 2 × 3
#> U V W
#> <dbl> <dbl> <dbl>
#> 1 1 3 2
#> 2 2 4 2
Finally, calc_W()
can be applied to single numbers, and
the result is a 1x1 matrix.
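A minimal sketch of the single-number case, assuming matsindf and matsbyname are installed. calc_W() is restated so that the snippet is self-contained, and stats::setNames() is swapped in for the magrittr naming helper to keep dependencies small; the input values 2 and 3 are chosen only for illustration.

```r
library(matsindf)

# Restatement of calc_W() from above, with stats::setNames() used for naming.
calc_W <- function(.DF = NULL, U = "U", V = "V", W = "W") {
  W_func <- function(U_mat, V_mat) {
    if (length(U_mat) == 0 && length(V_mat) == 0) {
      # Tolerate zero-length inputs by returning a zero-length result.
      return(stats::setNames(list(numeric()), W))
    }
    W_mat <- matsbyname::difference_byname(
      matsbyname::transpose_byname(V_mat),
      U_mat)
    stats::setNames(list(W_mat), W)
  }
  matsindf_apply(.DF, FUN = W_func, U_mat = U, V_mat = V)
}

# U and V are single numbers passed through ..., so only W is returned.
res_single <- calc_W(U = 2, V = 3)
as.numeric(res_single$W)  # numeric value of V - U
```

as.numeric() is used here only to extract the value without depending on how the result is printed.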
It is good practice to write internal functions that tolerate
zero-length inputs, as calc_W()
does. Doing so enables
results from different calculations to be rbind
ed
together.
calc_W(U = numeric(), V = numeric())
#> $W
#> numeric(0)
calc_W(list(U = numeric(), V = numeric()))
#> $U
#> numeric(0)
#>
#> $V
#> numeric(0)
#>
#> $W
#> numeric(0)
res <- calc_W(list(U = c(2, 3, 4, 5), V = c(3, 4, 5, 6)))
res0 <- calc_W(list(U = numeric(), V = numeric()))
dplyr::bind_rows(res, res0)
#> # A tibble: 4 × 3
#> U V W
#> <dbl> <dbl> <dbl>
#> 1 2 3 1
#> 2 3 4 1
#> 3 4 5 1
#> 4 5 6 1
This vignette demonstrated use of the versatile
matsindf_apply()
function. Inputs to
matsindf_apply()
can be single numbers, lists of single numbers, lists of matrices, data frames of single numbers, or
matsindf
data frames.
matsindf_apply()
can be used for programming, and
functions constructed as demonstrated above share characteristics with
matsindf_apply()
: they can be used as specialized
dplyr::mutate()
operators, and they can be applied to lists of matrices, data frames of single numbers, and single numbers.