Packages used in this vignette.
This vignette demonstrates the use of two functions from the docxtools package:
- format_engr() for formatting numbers in engineering notation
- align_pander() for aligning table columns using a simple pander table style
The primary goal of format_engr() is to present numeric variables in a data frame in engineering format, that is, scientific notation with exponents that are multiples of 3. Compare:
syntax | expression |
---|---|
computer | \(1.011E+5\) |
mathematical | \(1.011\times10^{5}\) |
engineering | \(101.1\times10^{3}\) |
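The "multiple of 3" rule in the last row can be sketched in a few lines of base R. The helper engr_parts() is hypothetical and is not part of docxtools; it only illustrates the arithmetic and assumes nonzero, finite input.

```r
# Hypothetical helper (not a docxtools function): split a number into an
# engineering-notation mantissa and an exponent that is a multiple of 3.
# Assumes x is nonzero and finite.
engr_parts <- function(x, sigdig = 4) {
  x <- signif(x, sigdig)                 # round to the requested significant digits
  expo <- 3 * floor(log10(abs(x)) / 3)   # largest multiple of 3 not exceeding log10|x|
  list(mantissa = x / 10^expo, exponent = expo)
}

parts <- engr_parts(101100)
parts$mantissa  # 101.1
parts$exponent  # 3
```

The mantissa therefore always lies between 1 and 1000 in magnitude, which is what distinguishes engineering notation from the mathematical form in the row above it.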
This example uses a small data set, density, included with docxtools, with temperature in K, pressure in Pa, the gas constant in J kg^-1^ K^-1^, and density in kg m^-3^.
density
#> date trial T_K p_Pa R density
#> 1 2018-06-12 a 294.05 101100 287 1.197976
#> 2 2018-06-13 b 294.15 101000 287 1.196384
#> 3 2018-06-14 c 294.65 101100 287 1.195536
#> 4 2018-06-15 d 293.35 101000 287 1.199647
#> 5 2018-06-16 e 293.85 101100 287 1.198791
Four of the variables are numeric. The date variable is of type “double” but class “Date”, so it is not reformatted.
map_chr(density, class)
#> date trial T_K p_Pa R density
#> "Date" "character" "numeric" "numeric" "numeric" "numeric"
map_chr(density, typeof)
#> date trial T_K p_Pa R density
#> "double" "character" "double" "double" "double" "double"
Usage is format_engr(x, sigdig = NULL, ambig_0_adj = FALSE). The function returns a data frame with all numeric values reformatted as character strings in engineering format with math delimiters $...$.
density_engr <- format_engr(density)
density_engr
#> date trial T_K p_Pa R density
#> 1 2018-06-12 a $294.0$ ${101.1}\\times 10^{3}$ $287.0$ $1.198$
#> 2 2018-06-13 b $294.2$ ${101.0}\\times 10^{3}$ $287.0$ $1.196$
#> 3 2018-06-14 c $294.6$ ${101.1}\\times 10^{3}$ $287.0$ $1.196$
#> 4 2018-06-15 d $293.4$ ${101.0}\\times 10^{3}$ $287.0$ $1.200$
#> 5 2018-06-16 e $293.8$ ${101.1}\\times 10^{3}$ $287.0$ $1.199$
The formerly numeric variables are now characters. Non-numeric variables are returned unaltered.
map_chr(density_engr, class)
#> date trial T_K p_Pa R density
#> "Date" "character" "character" "character" "character" "character"
The math formatting is applied when the data frame is printed in the output document. For example, we can use knitr::kable() to print the formatted data.
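A call along the following lines would print such a table. This is a minimal sketch that assumes the docxtools and knitr packages are installed and that the chunk runs inside an R Markdown document, where the $...$ delimiters are rendered as math.

```r
library(docxtools)

# Load the example data set shipped with docxtools
data("density", package = "docxtools")

# Reformat the numeric columns, then hand the result to kable()
density_engr <- format_engr(density)
tbl <- knitr::kable(density_engr)
tbl
```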
date | trial | T_K | p_Pa | R | density |
---|---|---|---|---|---|
2018-06-12 | a | \(294.0\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.198\) |
2018-06-13 | b | \(294.2\) | \({101.0}\times 10^{3}\) | \(287.0\) | \(1.196\) |
2018-06-14 | c | \(294.6\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.196\) |
2018-06-15 | d | \(293.4\) | \({101.0}\times 10^{3}\) | \(287.0\) | \(1.200\) |
2018-06-16 | e | \(293.8\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.199\) |
The function is compatible with the pipe operator.
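As a sketch, the same table can be built in a single pipeline. The native pipe |> (R 4.1 or later) is an assumption made here to avoid an extra dependency; the magrittr pipe %>% is used the same way.

```r
library(docxtools)
data("density", package = "docxtools")

# Assumption: native pipe |> (R >= 4.1); magrittr's %>% works identically here
piped <- density |>
  format_engr() |>
  knitr::kable()
piped
```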
date | trial | T_K | p_Pa | R | density |
---|---|---|---|---|---|
2018-06-12 | a | \(294.0\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.198\) |
2018-06-13 | b | \(294.2\) | \({101.0}\times 10^{3}\) | \(287.0\) | \(1.196\) |
2018-06-14 | c | \(294.6\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.196\) |
2018-06-15 | d | \(293.4\) | \({101.0}\times 10^{3}\) | \(287.0\) | \(1.200\) |
2018-06-16 | e | \(293.8\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.199\) |
Comments:
format_engr() has three arguments:
- x, a data frame with at least one numerical variable.
- sigdig, an optional vector of significant digits. Default is 4.
- ambig_0_adj, an optional logical to adjust the notation in the event of ambiguous trailing zeros. Default is FALSE.
The sigdig argument can be a single value, applied to all numeric columns.
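For instance, a single value of 3 could be passed as follows. The value 3 is inferred from the three-digit table that follows, not stated in the text.

```r
library(docxtools)
data("density", package = "docxtools")

# One value of sigdig is applied to every numeric column
three_digits <- format_engr(density, sigdig = 3)
knitr::kable(three_digits)
```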
date | trial | T_K | p_Pa | R | density |
---|---|---|---|---|---|
2018-06-12 | a | \(294\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
2018-06-13 | b | \(294\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
2018-06-14 | c | \(295\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
2018-06-15 | d | \(293\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
2018-06-16 | e | \(294\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
Alternatively, significant digits can be assigned to each numeric column individually by passing a vector with one entry per numeric column. A zero returns the variable in its original form.
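A per-column call might look like the sketch below. The vector c(5, 4, 0, 5) is an assumption chosen to be consistent with the table that follows (five digits for T_K, four for p_Pa, zero for R, five for density); the original call is not shown in the text.

```r
library(docxtools)
data("density", package = "docxtools")

# One entry per numeric column, in column order: T_K, p_Pa, R, density.
# The specific digits are an assumption inferred from the printed table.
per_column <- format_engr(density, sigdig = c(5, 4, 0, 5))
knitr::kable(per_column)
```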
date | trial | T_K | p_Pa | R | density |
---|---|---|---|---|---|
2018-06-12 | a | \(294.05\) | \({101.1}\times 10^{3}\) | \(287\) | \(1.1980\) |
2018-06-13 | b | \(294.15\) | \({101.0}\times 10^{3}\) | \(287\) | \(1.1964\) |
2018-06-14 | c | \(294.65\) | \({101.1}\times 10^{3}\) | \(287\) | \(1.1955\) |
2018-06-15 | d | \(293.35\) | \({101.0}\times 10^{3}\) | \(287\) | \(1.1996\) |
2018-06-16 | e | \(293.85\) | \({101.1}\times 10^{3}\) | \(287\) | \(1.1988\) |
Subset the data to look at just the numerical variables.
Print the data with incrementally decreasing significant digits.
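These two steps can be sketched as follows. Base R indexing is used for the subset, which is an assumption; any method that returns a data frame of the four numeric columns would do. The object name x is illustrative.

```r
library(docxtools)
data("density", package = "docxtools")

# Keep only the four numeric variables
x <- density[, c("T_K", "p_Pa", "R", "density")]

# Start with four significant digits, then reduce sigdig step by step
four_digits <- format_engr(x, sigdig = 4)
knitr::kable(four_digits)
```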
T_K | p_Pa | R | density |
---|---|---|---|
\(294.0\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.198\) |
\(294.2\) | \({101.0}\times 10^{3}\) | \(287.0\) | \(1.196\) |
\(294.6\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.196\) |
\(293.4\) | \({101.0}\times 10^{3}\) | \(287.0\) | \(1.200\) |
\(293.8\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.199\) |
Three digits create no ambiguity.
T_K | p_Pa | R | density |
---|---|---|---|
\(294\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
\(294\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
\(295\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
\(293\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
\(294\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
With 2 digits, we have three columns with ambiguous trailing zeros.
T_K | p_Pa | R | density |
---|---|---|---|
\(290\) | \({100}\times 10^{3}\) | \(290\) | \(1.2\) |
\(290\) | \({100}\times 10^{3}\) | \(290\) | \(1.2\) |
\(290\) | \({100}\times 10^{3}\) | \(290\) | \(1.2\) |
\(290\) | \({100}\times 10^{3}\) | \(290\) | \(1.2\) |
\(290\) | \({100}\times 10^{3}\) | \(290\) | \(1.2\) |
Setting the ambig_0_adj argument to TRUE uses scientific notation to remove the ambiguity.
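A sketch of the adjusted call, assuming the same numeric subset as before (the name x is illustrative):

```r
library(docxtools)
data("density", package = "docxtools")
x <- density[, c("T_K", "p_Pa", "R", "density")]

# Two significant digits, with the trailing-zero adjustment switched on
adjusted <- format_engr(x, sigdig = 2, ambig_0_adj = TRUE)
knitr::kable(adjusted)
```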
T_K | p_Pa | R | density |
---|---|---|---|
\({0.29}\times 10^{3}\) | \({0.10}\times 10^{6}\) | \({0.29}\times 10^{3}\) | \(1.2\) |
\({0.29}\times 10^{3}\) | \({0.10}\times 10^{6}\) | \({0.29}\times 10^{3}\) | \(1.2\) |
\({0.29}\times 10^{3}\) | \({0.10}\times 10^{6}\) | \({0.29}\times 10^{3}\) | \(1.2\) |
\({0.29}\times 10^{3}\) | \({0.10}\times 10^{6}\) | \({0.29}\times 10^{3}\) | \(1.2\) |
\({0.29}\times 10^{3}\) | \({0.10}\times 10^{6}\) | \({0.29}\times 10^{3}\) | \(1.2\) |
The ambiguous trailing zero adjustment is applied only to those variables for which the condition exists. For example, if the pressure were known to only 2 digits, it would be the only variable with ambiguous trailing zeros.
T_K | p_Pa | R | density |
---|---|---|---|
\(294.0\) | \({100}\times 10^{3}\) | \(287\) | \(1.20\) |
\(294.2\) | \({100}\times 10^{3}\) | \(287\) | \(1.20\) |
\(294.6\) | \({100}\times 10^{3}\) | \(287\) | \(1.20\) |
\(293.4\) | \({100}\times 10^{3}\) | \(287\) | \(1.20\) |
\(293.8\) | \({100}\times 10^{3}\) | \(287\) | \(1.20\) |
With ambig_0_adj = TRUE, only the pressure variable has a reformatted power of ten.
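Combining a per-column sigdig vector with the adjustment might look like this. The digits c(4, 2, 3, 3) are an assumption inferred from the printed values; only the 2 for pressure is stated in the text.

```r
library(docxtools)
data("density", package = "docxtools")
x <- density[, c("T_K", "p_Pa", "R", "density")]

# Assumed digits per column: T_K = 4, p_Pa = 2, R = 3, density = 3
mixed <- format_engr(x, sigdig = c(4, 2, 3, 3), ambig_0_adj = TRUE)
knitr::kable(mixed)
```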
T_K | p_Pa | R | density |
---|---|---|---|
\(294.0\) | \({0.10}\times 10^{6}\) | \(287\) | \(1.20\) |
\(294.2\) | \({0.10}\times 10^{6}\) | \(287\) | \(1.20\) |
\(294.6\) | \({0.10}\times 10^{6}\) | \(287\) | \(1.20\) |
\(293.4\) | \({0.10}\times 10^{6}\) | \(287\) | \(1.20\) |
\(293.8\) | \({0.10}\times 10^{6}\) | \(287\) | \(1.20\) |
align_pander() uses pander() to print a table and panderOptions('table.alignment.default') to align columns. Usage is align_pander(x, align_idx = NULL, caption = NULL), where:
- x is the data frame
- align_idx is a string composed of any combination of “r”, “l”, and “c”. The default alignments are numeric right and everything else left
- caption is an optional string used as the pander() caption argument
Finally, the heading can be edited for presentation.
names(density_engr) <- c(
"Date",
"Trial",
"Temp (K)",
"Press (Pa)",
"R (J kg^-1^K^-1^)",
"$\\rho$ (kg/m^3^)"
)
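A call to align_pander() could then look like the sketch below. The alignment string "ccrrrr" and the caption text are illustrative assumptions, and the example is run on the default column names to stay self-contained; in a knitr chunk the output is typically emitted with results = "asis".

```r
library(docxtools)
data("density", package = "docxtools")
density_engr <- format_engr(density)

# align_idx has one letter per column: "c" center, "r" right, "l" left.
# Both the alignment string and the caption are hypothetical choices.
align_pander(
  density_engr,
  align_idx = "ccrrrr",
  caption = "Air density measurements"
)
```

Applying the same call after the names() assignment prints the edited headings instead.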
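In these headings, \\rho inside $...$ renders the Greek letter rho in math mode, and m^3^ renders the exponent as a superscript.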
These two functions provide the means for consistently rendering numbers with the desired number of significant digits, including trailing zeros, and aligning them in output tables without affecting character data in the same data frame.