A collection of miscellaneous methods to simplify various tasks, including plotting, data.frame and matrix transformations, environment functions, regular expression methods, and string and logical operations, as well as numerical and statistical tools. Most of the methods are simple but useful wrappers of common base R functions, which extend S3 generics or provide default values for important parameters.
Install the package from CRAN or from GitHub by using the following commands:
# from CRAN
install.packages("miscset")
# from github - latest version
devtools::install_github("setempler/miscset", build_vignettes = TRUE)
Development of the package can be followed on GitHub. If you find a bug or have other concerns about the package, feel free to open a GitHub issue.
More detailed help for each function is available on the R help pages. To view a page, prepend ? to the function's name, e.g.:
?duplicates
The following chapters describe the functions of the miscset package. To run all vignette examples, load the following packages and generate the sample data:
library(miscset)
library(ggplot2)
d <- data.frame(a=c(2,1,3,NA,1), b=2:6, c=5:1)
m <- matrix(letters[1:9], 3, 3, dimnames = list(1:3,1:3))
Plot a bar graph with error bars. Input data is a list of numeric vectors. The functions that calculate bar heights (mean by default) and error bar sizes (confint.numeric by default) can be changed (e.g. sd for error bars).
ciplot(d)
Arrange ggplots on a grid (plot window or pdf file). Supply a list of ggplot objects and define the number of rows and/or columns. If a path is supplied, the plot is written to that file instead of the internal graphics device.
ggplotGrid(list(
ggplot(d, aes(x=b,y=-c,col=b)) + geom_line(),
ggplot(d, aes(x=b,y=-c,shape=factor(b))) + geom_point()),
ncol = 2)
Generate a character vector of HTML color codes from a color hue, as in ggplot.
n <- length(d)
gghcl(n)
[1] "#F8766D" "#00BA38" "#619CFF"
ciplot(d, col = gghcl(n))
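The ggplot-style hue scale is commonly reproduced in base R by spacing n hues evenly around the color wheel. The following is a minimal sketch of that idea, assuming the usual settings (hues from 15 to 375 degrees, chroma 100, luminance 65); hue_colors is a hypothetical helper and not the gghcl source.

```r
# Hypothetical helper: n evenly spaced hues at fixed chroma and luminance.
hue_colors <- function(n, hue_range = c(15, 375), chroma = 100, luminance = 65) {
  # n + 1 points close the circle; the last one repeats the first hue, so drop it
  hues <- seq(hue_range[1], hue_range[2], length.out = n + 1)[seq_len(n)]
  grDevices::hcl(h = hues, c = chroma, l = luminance)
}

hue_colors(3)
```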
Create an empty plot. Useful for filling a cell of a layout.
plotn()
Sort data.frame objects. This extends the base R generic sort. Specify one or more columns by name, as a character vector or an expression.
d
   a b c
1  2 2 5
2  1 3 4
3  3 4 3
4 NA 5 2
5  1 6 1
sort(d, by = c("a", "c"))
a b c
5 1 6 1
2 1 3 4
1 2 2 5
3 3 4 3
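For comparison, multi-column sorting can be sketched in base R with order. The helper name sort_by_columns is hypothetical, and dropping rows with NA in a sort column (na.last = NA) is an assumption chosen to mirror the output above, not a statement about how the package implements it.

```r
d <- data.frame(a = c(2, 1, 3, NA, 1), b = 2:6, c = 5:1)

# Hypothetical helper: order rows by the columns named in `by`.
sort_by_columns <- function(x, by) {
  keys <- unname(as.list(x[by]))  # unnamed, so column names cannot clash with order() arguments
  idx <- do.call(order, c(keys, list(na.last = NA)))  # na.last = NA drops rows with NA keys
  x[idx, , drop = FALSE]
}

sort_by_columns(d, c("a", "c"))
```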
A wrapper function to row-bind data.frame objects in a list with do.call and rbind. Object names from the list are inserted as an additional column.
d[1:3,]
a b c
1 2 2 5
2 1 3 4
3 3 4 3
do.rbind(list(first=d[1:2,], second=d[1:3,]))
Name a b c
1 first 2 2 5
2 first 1 3 4
3 second 2 2 5
4 second 1 3 4
5 second 3 4 3
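The do.call and rbind pattern that this function wraps can be sketched in base R as follows. bind_named is a hypothetical helper, and the column name Name is taken from the output above.

```r
d <- data.frame(a = c(2, 1, 3, NA, 1), b = 2:6, c = 5:1)

# Hypothetical helper: prepend each list name as a column, then row-bind.
bind_named <- function(l) {
  parts <- Map(function(x, name) cbind(Name = name, x), l, names(l))
  out <- do.call(rbind, parts)
  rownames(out) <- NULL  # replace combined row names by 1, 2, ...
  out
}

bind_named(list(first = d[1:2, ], second = d[1:3, ]))
```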
Generate a pairwise list (data.frame) from a matrix, containing the row and column ids and the lower and upper triangle values.
m
1 2 3
1 "a" "d" "g"
2 "b" "e" "h"
3 "c" "f" "i"
enpaire(m)
row col lower upper
1 1 2 b d
2 1 3 c g
3 2 3 f h
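One way to build such a pairwise table in base R is to index the lower triangle and look up the mirrored cell for the upper value. This is only a sketch under that assumption; pairwise is a hypothetical helper that reports positions rather than dimnames.

```r
m <- matrix(letters[1:9], 3, 3, dimnames = list(1:3, 1:3))

# Hypothetical helper: one row per index pair (i, j) with i < j.
pairwise <- function(m) {
  idx <- which(lower.tri(m), arr.ind = TRUE)  # positions (j, i) below the diagonal
  data.frame(
    row = idx[, "col"],
    col = idx[, "row"],
    lower = m[idx],                       # value at (j, i)
    upper = m[idx[, c("col", "row")]],    # mirrored value at (i, j)
    stringsAsFactors = FALSE
  )
}

pairwise(m)
```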
Generate a symmetric (square) matrix from a non-symmetric one using column and row names. Fills empty cells with NA.
m[-1,]
1 2 3
2 "b" "e" "h"
3 "c" "f" "i"
squarematrix(m[-1,])
1 2 3
1 NA NA NA
2 "b" "e" "h"
3 "c" "f" "i"
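The idea can be sketched in base R by allocating an NA matrix over the union of row and column names and copying the known cells into it. square_by_names is a hypothetical helper, and sorting the names is an assumption made for this example.

```r
m <- matrix(letters[1:9], 3, 3, dimnames = list(1:3, 1:3))

# Hypothetical helper: expand a matrix to the union of its row and column names.
square_by_names <- function(m) {
  ids <- sort(union(rownames(m), colnames(m)))
  out <- matrix(NA, length(ids), length(ids), dimnames = list(ids, ids))
  out[rownames(m), colnames(m)] <- m  # copy known cells, the rest stays NA
  out
}

square_by_names(m[-1, ])
```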
Print a data.frame as a LaTeX table. Extends xtable by optionally including a LaTeX header and, if desired, writing the output directly to a file and calling a system command to convert it, for example, to a .pdf file.
textable(d, caption = 'miscset vignette example data.frame', as.document = TRUE)
% output by function 'textable' from package miscset 1.0.0
% latex table generated in R 3.2.3 by xtable 1.8-0 package
% Mon Jan 25 23:06:44 2016
\documentclass[a4paper,10pt]{article}
\usepackage[a4paper,margin=2cm]{geometry}
\begin{document}
\begin{table}[ht]
\centering
\caption{miscset vignette example data.frame}
\begin{tabular}{rrr}
\hline
a & b & c \\
\hline
2.00 & 2 & 5 \\
1.00 & 3 & 4 \\
3.00 & 4 & 3 \\
& 5 & 2 \\
1.00 & 6 & 1 \\
\hline
\end{tabular}
\end{table}
\end{document}
Show the help index page of a package (with the list of all help pages of a package).
help.index(miscset)
Load multiple R data files into a list. The list has the same length as the number of files provided, and each sublist contains the objects from the respective file. The result can be simplified if all object names are unique.
lload("folder/with/rdata/", "test*.RData")
Return the names, lengths, classes, modes and sizes of all objects in the current workspace (or any custom environment) as a data.frame.
lsall()
Environment: R_GlobalEnv
Objects:
  Name Length      Class      Mode   Size Unit
1    d      3 data.frame      list 1008.0 byte
2    m      9     matrix character    1.3   Kb
3    n      1    integer   numeric   48.0 byte
Remove all objects from the current or custom environment.
rmall()
Search for multiple patterns in a character vector. Merge results by (custom) logical functions (e.g. any, all) and use multicore support from the parallel package. Optionally return the index (as with which). Use identity to return a matrix with the results of each pattern per row.
mgrepl(c("a","b"), c("ab","ac","bc"), any)
[1] TRUE TRUE TRUE
mgrepl(c("a","b"), c("ab","ac","bc"), all)
[1] TRUE FALSE FALSE
mgrepl(c("a","b"), c("ab","ac","bc"), all, use.which = TRUE)
[1] 1
mgrepl(c("a","b"), c("ab","ac","bc"), identity)
[,1] [,2] [,3]
[1,] TRUE TRUE FALSE
[2,] TRUE FALSE TRUE
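The merge step can be illustrated in base R: run grepl once per pattern, then combine the hits per text element with the chosen function. multi_grepl is a hypothetical, single-core helper and not the package code.

```r
# Hypothetical helper: one grepl per pattern, combined per element of `text`.
multi_grepl <- function(patterns, text, combine = any) {
  hits <- vapply(patterns, grepl, logical(length(text)), x = text)
  # one row per element of `text`, one column per pattern
  hits <- matrix(hits, nrow = length(text))
  apply(hits, 1, combine)
}

multi_grepl(c("a", "b"), c("ab", "ac", "bc"), any)
multi_grepl(c("a", "b"), c("ab", "ac", "bc"), all)
```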
Retrieve the nth or "last" index of an expression found in a character string.
gregexprind(c("a"), c("ababa","ab","xyz",NA), 1)
[1] 1 1 NA NA
gregexprind(c("a"), c("ababa","ab","xyz",NA), 2)
[1] 3 NA NA NA
gregexprind(c("a"), c("ababa","ab","xyz",NA), "last")
[1] 5 1 NA NA
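A comparable lookup can be sketched with base R's gregexpr, which returns all match positions per string. nth_match is a hypothetical helper; returning NA for missing input and for strings with fewer than n matches is chosen to mirror the output above.

```r
# Hypothetical helper: position of the n-th (or "last") match in each string.
nth_match <- function(pattern, text, n = 1) {
  vapply(gregexpr(pattern, text), function(pos) {
    if (is.na(pos[1]) || pos[1] == -1) return(NA_integer_)  # NA input or no match
    i <- if (identical(n, "last")) length(pos) else n
    if (i > length(pos)) NA_integer_ else as.integer(pos[i])
  }, integer(1))
}

nth_match("a", c("ababa", "ab", "xyz", NA), 2)
nth_match("a", c("ababa", "ab", "xyz", NA), "last")
```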
Prepend 0 characters to numbers to generate equally sized strings.
leading0(c(9, 112, 5009))
[1] "0009" "0112" "5009"
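In base R the same padding can be approximated with formatC, using the widest number as the target width. pad_zeros is a hypothetical helper that assumes non-negative whole numbers.

```r
# Hypothetical helper: left-pad whole numbers with zeros to a common width.
pad_zeros <- function(x) {
  x <- as.integer(x)
  width <- max(nchar(as.character(x)))
  formatC(x, width = width, flag = "0")
}

pad_zeros(c(9, 112, 5009))
```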
Split strings by a separator (sep) and extract all substrings matching a pattern. Optionally allow multiple matches, and use multicore support from the parallel package.
strextr("xa,xb,xn,ya,yb", "n$", ",")
[1] "xn"
strextr("xa,xb,xn,ya,yb", "^x", ",", mult = TRUE)
[[1]]
[1] "xa" "xb" "xn"
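The split-then-match logic can be sketched in base R with strsplit and grep. extract_parts is a hypothetical helper that always returns a list and treats the separator as a fixed string, which is an assumption of this sketch.

```r
# Hypothetical helper: split each string, keep the pieces matching `pattern`.
extract_parts <- function(x, pattern, sep) {
  lapply(strsplit(x, sep, fixed = TRUE), function(parts) {
    grep(pattern, parts, value = TRUE)
  })
}

extract_parts("xa,xb,xn,ya,yb", "^x", ",")
```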
Similar to strextr, but substrings are extracted by an index value n. Optionally roll to the last substring if a string has fewer than n substrings.
strpart("xa,xb,xn,ya,yb", ",", 3)
[1] "xn"
Reverse each string of a character vector.
strrev(c("olleH", "!dlroW"))
[1] "Hello" "World!"
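A base R equivalent, shown only as a sketch, splits each string into characters, reverses them, and pastes them back together. reverse_strings is a hypothetical helper.

```r
# Hypothetical helper: reverse the characters of each string.
reverse_strings <- function(x) {
  vapply(strsplit(x, "", fixed = TRUE),
         function(chars) paste(rev(chars), collapse = ""),
         character(1))
}

reverse_strings(c("olleH", "!dlroW"))
```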
Determine duplicates. Return either a logical vector (duplicates) or an integer index (duplicatei). Extends the base method duplicated by also returning TRUE for the first occurrence of a value.
data.frame(
duplicate = d$a,
".d" = duplicated(d$a),
".s" = duplicates(d$a),
".i" = duplicatei(d$a))
duplicate .d .s .i
1 2 FALSE FALSE 1
2 1 FALSE TRUE 2
3 3 FALSE FALSE 3
4 NA FALSE FALSE 4
5 1 TRUE TRUE 2
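Both results can be reproduced in base R, which may help to see what the two columns mean: scanning from both ends marks every member of a duplicated group, and match returns the position of the first occurrence. The variable names are illustrative only.

```r
x <- c(2, 1, 3, NA, 1)

# TRUE for every value that occurs more than once, including its first occurrence
all_dups <- duplicated(x) | duplicated(x, fromLast = TRUE)

# index of the first occurrence of each value
first_idx <- match(x, x)

data.frame(x, all_dups, first_idx)
```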
Assign range symbols to values, e.g. convert p-values to significance characters.
p2star(c(0.003, 0.049, 0.092, 0.431))
[1] "**" "*" "." "n.s."
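Such a mapping can be sketched with base R's cut. The break points below (0.001, 0.01, 0.05, 0.1) are the conventional significance thresholds and are an assumption here; they are consistent with the output above but are not taken from the package source. p_to_symbol is a hypothetical helper.

```r
# Hypothetical helper: map p-values to symbols via right-closed intervals.
p_to_symbol <- function(p,
                        breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
                        symbols = c("***", "**", "*", ".", "n.s.")) {
  as.character(cut(p, breaks = breaks, labels = symbols, include.lowest = TRUE))
}

p_to_symbol(c(0.003, 0.049, 0.092, 0.431))
```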
Calculate confidence intervals. Extends the base method confint to numeric vectors.
d$a
[1] 2 1 3 NA 1
confint(d$a, ret.attr = FALSE)
[1] 0.8392064
Generate a series of triangular numbers of length n according to OEIS#A000217. The series for 12 rows of a triangle, for example, can be returned as follows.
ntri(12)
[1] 0 1 3 6 10 15 21 28 36 45 55 66
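The series follows the closed form T(k) = k(k + 1)/2 for k = 0, 1, ..., n - 1. A minimal base R sketch, with triangular as a hypothetical helper name:

```r
# Hypothetical helper: the first n triangular numbers, starting at T(0) = 0.
triangular <- function(n) {
  k <- seq_len(n) - 1
  k * (k + 1) / 2
}

triangular(12)
```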
Scale numeric vectors to a range of 0 to 1 with scale0 or to a custom output range r and input range b with scaler.
d$c
[1] 5 4 3 2 1
scale0(d$c)
[1] 1.00 0.75 0.50 0.25 0.00
scaler(d$c, c(2, 6), b = c(1, 10))
[1] 3.777778 3.333333 2.888889 2.444444 2.000000
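Both outputs are consistent with a linear map from the input range b to the output range r, that is (x - b[1]) / (b[2] - b[1]) * (r[2] - r[1]) + r[1]. The sketch below assumes this formula; rescale is a hypothetical helper, not the package code.

```r
# Hypothetical helper: map x linearly from input range b to output range r.
rescale <- function(x, r = c(0, 1), b = range(x, na.rm = TRUE)) {
  (x - b[1]) / (b[2] - b[1]) * (r[2] - r[1]) + r[1]
}

rescale(5:1)                            # input range taken from the data
rescale(5:1, r = c(2, 6), b = c(1, 10)) # custom output and input range
```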
Return the number (with nunique) or index (with uniquei) of unique values in a vector. Extends plyr::nunique by allowing NA values to be counted as a ‘level’.
d$a
[1] 2 1 3 NA 1
nunique(d$a)
[1] 4
nunique(d$a, FALSE)
[1] 3
uniquei(d$a)
[1] 1 2 3 4
uniquei(d$a, FALSE)
[1] 1 2 3
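The same counts and indices can be obtained in base R with unique and duplicated, shown here as a sketch. count_unique and index_unique are hypothetical helpers, and the argument name count_na is an assumption for this example.

```r
x <- c(2, 1, 3, NA, 1)

# Hypothetical helper: number of distinct values, optionally counting NA.
count_unique <- function(x, count_na = TRUE) {
  if (!count_na) x <- x[!is.na(x)]
  length(unique(x))
}

# Hypothetical helper: positions of the first occurrence of each distinct value.
index_unique <- function(x, count_na = TRUE) {
  keep <- !duplicated(x)
  if (!count_na) keep <- keep & !is.na(x)
  which(keep)
}

count_unique(x)
index_unique(x, count_na = FALSE)
```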