Let’s briefly show some of the capabilities of cppally, from its custom C++ scalars and vectors to its use of templates and concepts.
To make a C++ function available to R we use the
[[cppally::register]] tag.
#include <cppally.hpp>
using namespace cppally;

[[cppally::register]]
void hello_world(){
  print("Hello World!");
}

After tagging our functions we want to make them available to R. To do that we have a few routes.
After writing our hello world program in foo.cpp we can use
cpp_source() to compile and register the function to R.
Now the function is available in R
Similarly we can use the helper cpp_eval to run simple
expressions and return the result without needing to include cppally.hpp
and register the function.
Note - For the rest of the examples it is assumed that the following code is always included beforehand.
Since cppally is header-only, we can include the headers directly into our own package.
usethis::create_tidy_package()
cppally::use_cppally()
cppally::document()

This will automatically add the necessary package content needed to
start working with cppally. For continuous development, use
cppally::load_all() to compile and register cppally tagged
functions, including our hello world function.
Note: We aim to integrate cppally registration into
the devtools framework for ease-of-use.
cppally offers a rich set of R types in C++ that are NA-aware. This
means that common arithmetic and logical operations will account for
NA in a similar fashion to R.
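To make "NA-aware" concrete, here is a minimal standalone sketch of NA-propagating integer addition. It does not use cppally: the names `kNaInt`, `is_na_int` and `na_add` are hypothetical, and the only fact borrowed from R is that its C API defines `NA_INTEGER` as `INT_MIN`. How cppally implements this internally is not shown here.

```cpp
#include <cassert>
#include <climits>

// Hypothetical stand-in for an NA-aware integer; this is not cppally's r_int.
// R's C API defines NA_INTEGER as INT_MIN, so the sketch reuses that sentinel.
constexpr int kNaInt = INT_MIN;

constexpr bool is_na_int(int x) { return x == kNaInt; }

// Addition that propagates NA and maps overflow to NA, as R does for integers.
// Assumes long long is wider than int, which holds on mainstream platforms.
constexpr int na_add(int a, int b) {
  if (is_na_int(a) || is_na_int(b)) return kNaInt;
  const long long wide = static_cast<long long>(a) + b;
  // INT_MIN itself is reserved for NA, so it is not a valid result either.
  if (wide > INT_MAX || wide <= INT_MIN) return kNaInt;
  return static_cast<int>(wide);
}

inline void demo_na_add() {
  assert(na_add(2, 3) == 5);
  assert(is_na_int(na_add(kNaInt, 1)));   // NA + 1 gives NA
  assert(is_na_int(na_add(INT_MAX, 1)));  // overflow gives NA
}
```

The point of the sketch is that the NA check happens inside the operation, so calling code does not need to test for NA before every arithmetic step.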
r_lgl

cppally’s scalar version of logical, r_lgl,
can represent true, false or NA.
#> [1] TRUE
#> [1] FALSE
#> [1] NA
Logical operators work just like in R
[[cppally::register]]
r_vec<r_lgl> lgl_ops(){
  return make_vec<r_lgl>(
    r_true || r_false, // true
    r_true && r_false, // false
    r_na || r_true,    // true
    r_na && r_true,    // NA
    r_na && r_false,   // false
    r_na || r_na,      // NA
    r_na && r_na       // NA
  );
}

Using r_lgl in if-statements
For type-safety reasons r_lgl cannot be implicitly
converted to bool except in if-statements where an error is
thrown if the value is NA.
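The behaviour described above can be reproduced in plain C++ with an `explicit operator bool`, which permits use in `if (...)` but blocks other implicit conversions. The sketch below is an illustration under that assumption, not cppally's implementation; `Tri` and `tri_lgl` are hypothetical names. It also spells out the three-valued logic for `&&` and `||`.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical tri-state logical, loosely modelled on the behaviour described
// for r_lgl. The names are illustrative and not part of cppally's API.
enum class Tri { False, True, NA };

struct tri_lgl {
  Tri value;

  bool is_true() const { return value == Tri::True; }
  bool is_false() const { return value == Tri::False; }
  bool is_na() const { return value == Tri::NA; }

  // `explicit` rejects `bool b = x;` but still allows `if (x)`.
  // NA has no bool equivalent, so the conversion throws.
  explicit operator bool() const {
    if (is_na()) throw std::runtime_error("cannot convert NA to bool");
    return is_true();
  }
};

// Three-valued logic: a definite operand can decide the result on its own.
inline tri_lgl operator&&(tri_lgl a, tri_lgl b) {
  if (a.is_false() || b.is_false()) return {Tri::False};
  if (a.is_na() || b.is_na()) return {Tri::NA};
  return {Tri::True};
}

inline tri_lgl operator||(tri_lgl a, tri_lgl b) {
  if (a.is_true() || b.is_true()) return {Tri::True};
  if (a.is_na() || b.is_na()) return {Tri::NA};
  return {Tri::False};
}

inline void demo_tri_lgl() {
  const tri_lgl t{Tri::True}, f{Tri::False}, na{Tri::NA};
  assert((na || t).is_true());   // NA || TRUE  is TRUE
  assert((na && f).is_false());  // NA && FALSE is FALSE
  assert((na && t).is_na());     // NA && TRUE  is NA
  bool threw = false;
  try {
    if (na) { threw = false; }   // contextual conversion of NA throws
  } catch (const std::runtime_error&) {
    threw = true;
  }
  assert(threw);
}
```

Note that overloaded `&&` and `||` always evaluate both operands, so this sketch gives up short-circuit evaluation in exchange for NA handling.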
DON’T do this:
[[cppally::register]]
void bad_lgl_print(r_lgl condition){
  if (condition){
    print("true");
  } else {
    print("false");
  }
}

bad_lgl_print(TRUE)
#> true
bad_lgl_print(FALSE)
#> false
bad_lgl_print(NA) # Can't implicitly convert NA to bool
#> Error:
#> ! Cannot implicitly convert r_lgl NA to bool, please check

DO this:
[[cppally::register]]
void good_lgl_print(r_lgl condition){
  if (is_na(condition)){
    print("NA");
  } else if (condition){
    print("true");
  } else {
    print("false");
  }
}

good_lgl_print(TRUE)
#> true
good_lgl_print(FALSE)
#> false
good_lgl_print(NA) # NA is handled explicitly so no issues
#> NA

We can also use r_lgl members is_true() and
is_false() which return bool and are
equivalent to R’s isTRUE() and isFALSE()
[[cppally::register]]
void also_good_lgl_print(r_lgl condition){
  if (condition.is_true()){
    print("true");
  } else {
    print("not true");
  }
}

also_good_lgl_print(TRUE)
#> true
also_good_lgl_print(FALSE)
#> not true
also_good_lgl_print(NA) # Falls into 'not true' branch here as expected
#> not true

All cppally scalar types are implemented as structs that contain the underlying C/C++ types as well as other member functions.
| cppally type | Description | Implicitly converts to |
|---|---|---|
| r_lgl | Scalar logical | bool only in if-statements |
| r_int | Scalar integer | int |
| r_int64 | Scalar 64-bit integer | int64_t |
| r_dbl | Scalar double | double |
| r_str | Scalar string | SEXP |
| r_cplx | Scalar double complex | std::complex<double> |
| r_raw | Scalar raw | unsigned char |
| r_sym | Symbol | SEXP |
| r_date ¹ | Scalar date | double |
| r_psxct | Scalar date-time | double |
| r_sexp | Generic R object (SEXP) ² | SEXP |
NA values can be accessed via the template function
na<T>
| Type | Value | R C API Value | constexpr? ³ |
|---|---|---|---|
| r_lgl | na<r_lgl>()/r_na | NA_LOGICAL | Yes |
| r_int | na<r_int>() | NA_INTEGER | Yes |
| r_int64 | na<r_int64>() | Not applicable | Yes |
| r_dbl | na<r_dbl>() | NA_REAL | Yes |
| r_str | na<r_str>() | NA_STRING | No |
| r_cplx | na<r_cplx>() | Not applicable | Yes |
| r_sym | Not applicable | Not applicable | No |
| r_sexp ⁴ | na<r_sexp>/r_null | R_NilValue | No |
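As a rough illustration of how a single template function can hand out a different NA sentinel per type, here is a standalone sketch using `if constexpr`. The names `na_value` and `is_na_value` are hypothetical and the sentinels are assumptions: `INT_MIN` matches R's `NA_INTEGER`, `INT64_MIN` follows the convention used by 64-bit integer packages, and a plain quiet NaN stands in for R's `NA_REAL` (which is one specific NaN payload).

```cpp
#include <cassert>
#include <climits>
#include <cmath>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical per-type NA sentinel lookup; this is not cppally's na<T>().
template <class T>
constexpr T na_value() {
  if constexpr (std::is_same_v<T, int>) {
    return INT_MIN;    // same value as R's NA_INTEGER
  } else if constexpr (std::is_same_v<T, std::int64_t>) {
    return INT64_MIN;  // assumption: sentinel used for 64-bit integers
  } else if constexpr (std::is_same_v<T, double>) {
    // Simplification: R's NA_REAL is a NaN with a specific payload.
    return std::numeric_limits<double>::quiet_NaN();
  } else {
    static_assert(!sizeof(T), "no NA sentinel defined for this type");
  }
}

template <class T>
bool is_na_value(T x) {
  if constexpr (std::is_floating_point_v<T>) {
    return std::isnan(x);  // NaN never compares equal, so use isnan
  } else {
    return x == na_value<T>();
  }
}

// The integer sentinels are usable at compile time.
static_assert(na_value<int>() == INT_MIN);

inline void demo_na_value() {
  assert(is_na_value(na_value<int>()));
  assert(is_na_value(na_value<double>()));
  assert(!is_na_value(1.5));
}
```

Because the lookup is a template, generic code can write `na_value<T>()` without knowing which concrete type it is working with.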
cppally vectors are templated and can be thought of as containers of
scalar elements like r_int, r_dbl, etc.
We can create vectors like so
// Integer vector of size n
[[cppally::register]]
r_vec<r_int> new_integer_vector(int n){
  r_vec<r_int> int_vctr(n, /*fill = */ r_int(0));
  return int_vctr;
}

To create inline vectors, use make_vec<>
#> [1] 1.0 1.5 2.0 NA
We can add names on the fly with arg()
make_vec<r_dbl>(
  arg("first") = 1,
  arg("second") = 1.5,
  arg("third") = 2,
  arg("last") = na<r_dbl>()
)

#>  first second  third   last
#>    1.0    1.5    2.0     NA
In R a list is a generic vector, so cppally defines lists as
r_vec<r_sexp>, a vector of the generic type
r_sexp.
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] 2
#>
#> [[3]]
#> [1] 3
A list of all cppally vectors of length 0
[[cppally::register]]
r_vec<r_sexp> all_vectors(){
  return make_vec<r_sexp>(
    arg("logical") = r_vec<r_lgl>(),
    arg("integer") = r_vec<r_int>(),
    arg("integer64") = r_vec<r_int64>(), // Requires bit64
    arg("double") = r_vec<r_dbl>(),
    arg("character") = r_vec<r_str>(),
    arg("character") = r_vec<r_str_view>(),
    arg("raw") = r_vec<r_raw>(),
    arg("date") = r_vec<r_date>(),
    arg("date-time") = r_vec<r_psxct>(),
    arg("list") = r_vec<r_sexp>()
  );
}

all_vectors()
#> $logical
#> logical(0)
#>
#> $integer
#> integer(0)
#>
#> $integer64
#> integer64(0)
#>
#> $double
#> numeric(0)
#>
#> $character
#> character(0)
#>
#> $character
#> character(0)
#>
#> $raw
#> raw(0)
#>
#> $date
#> Date of length 0
#>
#> $`date-time`
#> POSIXct of length 0
#>
#> $list
#> list()

One of the most powerful features of C++20 is concepts. These allow users to write human-readable templates and constraints.
When writing your own templates, you are strongly encouraged to place them in headers so that cppally registration works correctly.
Let’s practice by creating an absolute function in C++ using
templates and the RMathType concept.
template <RMathType T>
[[cppally::register]]
T cpp_abs(T x){
  if (is_na(x)) return na<T>();
  if (x < 0){
    return -x;
  } else {
    return x;
  }
}

Works correctly for doubles
It also works for integers
The first line, template <RMathType T>, declares a
template whose parameter T must satisfy RMathType, a
concept that includes r_lgl, r_int,
r_int64 and r_dbl.
If x is NA then we immediately return NA via
na<T>(), a templated function that returns the NA value
of the input type T.
Without templates, writing C++ functions that accept flexible inputs is quite difficult because C++ is a statically-typed language. Usually one would write one absolute function for doubles and another for integers whereas here we don’t have to.
To correctly register templates, the [[cppally::register]] tag must always go directly above the function's return type and name, that is, below the template <...> line.
Explicit instantiation (from R) is unfortunately not possible and template types must be deduced from supplied arguments.
You may get a cryptic compiler error like this
error: no matching function for call to 'foo()'
[]<typename T>() -> decltype(cpp_to_sexp(::foo())) {

along with an equally cryptic note
This is because the parameter T cannot be automatically
deduced from any of the function inputs. Even though these kinds of
templates can be written with cppally, they cannot be exported to R.
An obvious, if somewhat ugly, workaround is to include a prototype argument from which the template parameter can be deduced.
// Return the default constructor result of RScalar types
template <RScalar T>
[[cppally::register]]
T scalar_default(T ptype){
  return T();
}

scalar_default(integer(1)) # Default is 0L
#> [1] 0
scalar_default(numeric(1)) # Default is 0.0
#> [1] 0
scalar_default(character(1)) # Default is ""
#> [1] ""

Exporting variadic templates is also not supported. The best
alternative is to use lists (r_vec<r_sexp>).
In the above example we used the RScalar concept which
includes all cppally scalar types (excluding r_sexp). For a
list of all cppally concepts, please see the Annex
To coerce from one scalar to another we can use
as<T>
We can also coerce from one vector type to another
Since as<T> is extremely flexible, we can also
coerce from a scalar to a vector or vice versa
cppally provides the useful string type r_str
We can create R strings easily
#> [1] "hello"
To get a C or C++ string, use the members c_str() and
cpp_str() respectively
C string via c_str()
#> [1] "hello"
C++ string_view via cpp_str()
This can be converted into a std::string via its constructor
Symbols have class r_sym and can be created directly
from a string literal
#> new_symbol
Or from a cppally string
#> symbol_from_string
cppally provides an efficient caching strategy for constructing cppally strings/symbols from string literals
cached_str<>
#> [1] "cached_string"
This initialises the string once, caches it (to R’s CHARSXP pool), and efficiently re-uses the cached string for each subsequent call.
We can cache symbols in a similar way
#> cached_symbol
r_sexp is generally interpreted as an “element of a
list” since lists are defined as r_vec<r_sexp>, a
vector that holds generic r_sexp elements.
The problem with a class like r_sexp is that it is by
design generic and therefore difficult to work with in C++. To
disambiguate the actual type we can use visit_vector() or
visit_sexp() via a C++ lambda.
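The visiting pattern has a close standard-library relative in `std::visit`, which may help if it is unfamiliar. The sketch below is an analogy in plain C++17, not cppally code: `any_vector`, `resize_each` and `describe` are hypothetical names, and a `std::variant` plays the role of the generic element.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical "generic element": one of several concrete vector types.
using any_vector = std::variant<std::vector<int>,
                                std::vector<double>,
                                std::vector<std::string>>;

// Resize every element of a list in place, whatever its concrete type is.
inline void resize_each(std::vector<any_vector>& list, std::size_t n) {
  for (any_vector& element : list) {
    // The generic lambda is instantiated once per alternative, so `vec`
    // has a concrete static type inside the body.
    std::visit([n](auto& vec) { vec.resize(n); }, element);
  }
}

// `if constexpr` selects a branch per concrete type at compile time.
inline std::string describe(const any_vector& element) {
  return std::visit([](const auto& vec) -> std::string {
    using elem_t = typename std::decay_t<decltype(vec)>::value_type;
    if constexpr (std::is_arithmetic_v<elem_t>) {
      return "numeric vector of length " + std::to_string(vec.size());
    } else {
      return "non-numeric vector of length " + std::to_string(vec.size());
    }
  }, element);
}

inline void demo_visit() {
  std::vector<any_vector> list{std::vector<double>{1.0}, std::vector<int>{}};
  resize_each(list, 3);
  assert(describe(list[0]) == "numeric vector of length 3");
  assert(describe(list[1]) == "numeric vector of length 3");
}
```

In both cases the lambda is written once with an `auto` parameter and the compiler generates one version per concrete type.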
Example: using visit_vector() to resize
every vector to length n in-place
[[cppally::register]]
r_vec<r_sexp> resize_all(r_vec<r_sexp> x, r_size_t n){
  r_size_t list_length = x.length();
  for (r_size_t i = 0; i < list_length; ++i){
    visit_vector(x.view(i), [&](auto vec) {
      x.set(i, vec.resize(n));
    });
  }
  return x;
}

When we pass a non-vector to visit_vector, it aborts and
explains that the input must be a vector
resize_all(list(mean_fn = mean), 1)
#> Error:
#> ! `x` must be a vector to be instantiated from an `r_sexp`

visit_sexp
This allows us to visit more types than just vectors, including
factors, symbols and (soon to be implemented) data frames. When an
object’s type can’t be deduced into a distinct type, r_sexp
is returned.
Example: Same example as above but with
visit_sexp()
[[cppally::register]]
r_vec<r_sexp> resize_all2(r_vec<r_sexp> x, r_size_t n){
  r_size_t list_length = x.length();
  for (r_size_t i = 0; i < list_length; ++i){
    visit_sexp(x.view(i), [&](auto vec) {
      using vec_t = decltype(vec); // type of object `vec`
      if constexpr (RVector<vec_t>){
        x.set(i, vec.resize(n));
      } else {
        abort("Cannot resize a non-vector");
      }
    });
  }
  return x;
}

We can create a factor via r_factors()
new_factor(letters)
#> [1] a b c d e f g h i j k l m n o p q r s t u v w x y z
#> Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z

In cppally, like R, factors are not vectors and therefore do not
satisfy the RVector concept. To access the underlying integer codes
vector, use the public codes() member function
Attributes can be manipulated via functions defined in the attr namespace.
Example: Converting a list of samples to a data frame
[[cppally::register]]
r_vec<r_sexp> list_as_df(r_vec<r_sexp> x){
  r_size_t n = x.length();
  if (n_unique(x.lengths()) > 1){
    abort("List must have vectors of equal length to be converted to a data frame");
  }
  r_vec<r_str> names(attr::get_attr(x, cached_sym<"names">()));
  if (names.is_null()){
    abort("list must have names to be converted to a data frame");
  }
  r_vec<r_sexp> out = shallow_copy(x);
  int nrow = 0;
  r_vec<r_int> row_names;
  if (n > 0){
    nrow = out.view(0).length();
    row_names = make_vec<r_int>(na<r_int>(), -nrow);
  }
  attr::set_attr(out, cached_sym<"row.names">(), row_names);
  attr::set_attr(out, cached_sym<"class">(), make_vec<r_str>("data.frame"));
  return out;
}

set.seed(42)
norm_samples <- lapply(1:5, \(x) rnorm(10, mean = x))
names(norm_samples) <- paste0("sample_", 1:5)
list_as_df(norm_samples)
#>     sample_1   sample_2 sample_3 sample_4 sample_5
#> 1  2.3709584  3.3048697 2.693361 4.455450 5.205999
#> 2  0.4353018  4.2866454 1.218692 4.704837 4.638943
#> 3  1.3631284  0.6111393 2.828083 5.035104 5.758163
#> 4  1.6328626  1.7212112 4.214675 3.391074 4.273295
#> 5  1.4042683  1.8666787 4.895193 4.504955 3.631719
#> 6  0.8938755  2.6359504 2.569531 2.282991 5.432818
#> 7  2.5115220  1.7157471 2.742731 3.215541 4.188607
#> 8  0.9053410 -0.6564554 1.236837 3.149092 6.444101
#> 9  3.0184237 -0.4404669 3.460097 1.585792 4.568554
#> 10 0.9372859  3.3201133 2.360005 4.036123 5.655648

More useful attribute helpers
get_attrs() - Returns a list of attributes (possibly r_vec<r_sexp>(r_null))
set_attrs() - Sets attributes to ones specified. Note: replaces any current attributes
clear_attrs() - Removes all attributes
set_attr() - Set a single attribute
get_attr() - Get a single attribute
inherits1() - Does object inherit class?
inherits_any() - Does object inherit at least one of the specified classes?
inherits_all() - Does object inherit all of the specified classes?
modify_attrs() - Modifies current attributes but doesn’t remove any existing ones

cppally also offers many useful and high-performance common functions in cppally/sugar
Example: n_unique() - fast calculation
of number of unique values.
template <RVector T>
[[cppally::register]]
r_int cpp_n_unique(T x){
  return as<r_int>(n_unique(x));
}

library(bench)
x <- sample(1:100, 10^5, replace = TRUE)
mark(
  base_n_unique = length(unique(x)),
  cppally_n_unique = cpp_n_unique(x)
)
#> # A tibble: 2 × 6
#>   expression            min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>       <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 base_n_unique       553µs    734µs     1247.    1.38MB     34.7
#> 2 cppally_n_unique    171µs    214µs     4149.        0B      0

More useful sugar functions
unique() - Like R’s unique() but with a
sort argument to return sorted unique values
identical() - A very fast identical function that
works for scalars and vectors. Use this for exact equality of any scalar
or vector.
match() - Like R’s match, but also faster
sequences() - Like sequence() but it
returns a list of sequences and also works with doubles.
order() - Like base R’s order but it internally uses
a hybrid approach of ska sort, count sorting, quick sort, etc.
make_groups() - An advanced function that returns a
struct containing group IDs and number of groups (i.e number of unique
group IDs). The groups struct contains the following
members:
recycle() - Recycles supplied vectors to common
length
r_vec<T>::subset() - Fast subsetting of
vectors
Scalar math functions
There is a rich suite of math functions. Some examples include
min(), max(), round(),
log(), floor(), ceiling() and
more.
Stats sugar functions
Some statistical summary functions that are all very highly optimised for speed
sum() - Sum of values
range() - Min and max range of values
abs() - Computes absolute values (there is also a scalar version)
var() and sd() - Variance and standard deviation
gcd() - Greatest common divisor
lcm() - Lowest common multiple

r_sym is unsupported in templates when it’s part of a
template argument but is supported when the argument is explicitly an
r_sym.
RIntegerType - Includes r_lgl, r_int,
r_int64
RMathType - Includes r_lgl, r_int,
r_int64 and r_dbl
RStringType - Includes r_str and
r_str_view
RScalar - Includes all cppally specific scalar types
RVal - Includes anything a cppally vector
(r_vec<>) can contain: RScalar + r_sexp
RVector - Includes r_vec<T> where
T is an RVal
RTimeType - Includes r_date and
r_psxct
RNumericType - Numeric types, including RMathType and RTimeType
RSortableType - Includes RNumericType and RStringType (strings can also be sorted)
RAtomicVector - A vector that contains RScalar elements
CppallyType - Any R type defined by R, including RVal, RVector, RFactor, RDataFrame, RSymbol
CppType - Anything that is not a CppallyType
CastableToRScalar - Anything that can be constructed or cast into an RScalar (which also includes RScalar)
CastableToRVal (questioning) - Anything that can
be constructed or cast into an RVal. This is more complicated as it
includes vectors, factors and data frames which can be cast to
r_sexp
Other useful type traits
unwrap_t - Returns the underlying unwrapped type
as_r_scalar_t - Returns the equivalent RScalar type
as_r_val_t - Returns the equivalent RVal type
common_r_t - Returns the common RVal type between 2
types. Generally this is a hierarchy where the common type is the type
that both values can be coerced to without complete loss of
information

While it is generally recommended not to access the underlying
objects, you can do so with unwrap() which returns the
underlying C/C++ value. For example, unwrap(r_int(5)) will
return an int of value 5.
To access the underlying type, use unwrap_t<>
which always aligns with unwrap()
The main reason for wanting to access underlying values would likely
be optimisation and so unwrap() and unwrap_t
allow this to be done consistently.
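A function and a type trait that always agree can be built in plain C++17 roughly as follows. This is a sketch of the general technique, not cppally's implementation: `wrapped_int`, `unwrap_type_t` and `unwrap_value` are hypothetical names, and passing plain types through unchanged is an assumption made for the example.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical wrapper standing in for a scalar such as r_int.
struct wrapped_int {
  using value_type = int;
  int value;
};

// Trait: wrappers expose their value_type, anything else maps to itself.
template <class T, class = void>
struct unwrap_type { using type = T; };

template <class T>
struct unwrap_type<T, std::void_t<typename T::value_type>> {
  using type = typename T::value_type;
};

template <class T>
using unwrap_type_t = typename unwrap_type<T>::type;

// Function: its return type is defined by the trait, so the two cannot drift.
template <class T>
constexpr unwrap_type_t<T> unwrap_value(const T& x) {
  if constexpr (std::is_same_v<unwrap_type_t<T>, T>) {
    return x;        // already a plain value
  } else {
    return x.value;  // peel off the wrapper
  }
}

static_assert(std::is_same_v<unwrap_type_t<wrapped_int>, int>);
static_assert(std::is_same_v<decltype(unwrap_value(wrapped_int{5})), int>);
static_assert(unwrap_value(wrapped_int{5}) == 5);

inline void demo_unwrap() {
  assert(unwrap_value(wrapped_int{7}) == 7);
  assert(unwrap_value(2.5) == 2.5);
}
```

Declaring the function's return type in terms of the trait is what guarantees that the trait and the function stay aligned.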
Example: Summing a double vector using
r_vec<T>::data() member
[[cppally::register]]
double primitive_sum(const r_vec<r_dbl>& x){
  // r_vec<T>::data_type always returns typename T
  using data_t = typename std::remove_cvref_t<decltype(x)>::data_type;
  using primitive_t = unwrap_t<data_t>;
  primitive_t *p_x = x.data();
  r_size_t n = x.length();
  double sum = 0;
  OMP_SIMD_REDUCTION1(+:sum)
  for (r_size_t i = 0; i < n; ++i){
    sum += p_x[i];
  }
  return sum;
}

1. Unlike r_str which is composite and holds
an r_sexp member, r_date and
r_psxct instead inherit directly from r_dbl.
This means that they can implicitly convert to r_dbl.
2. r_sexp represents a generic R object which
can include cppally vectors. We will explain how to disambiguate
r_sexp later, which is most useful when working with lists
and data frames.

3. In C++, constexpr is a keyword declaring that
a value can be evaluated at compile time, meaning it is known
before any code is run by the user. Since r_na is internally
a fixed int value (the smallest possible int, INT_MIN, which is the
value of R’s NA_LOGICAL) that does not change and is
known a priori, it is a compile-time constant.

4. Having an NA sentinel for
r_sexp is very useful when writing templates involving
vectors. For this reason the NA sentinel is
r_null. This doesn’t mean that is_na(r_null) is
true; it is intentionally not true because r_null is not a scalar and
therefore cannot be NA. As r_null represents
the absence of a tangible R object, it can be thought of as a
zero-length object and since all NA values are represented
as length-1 vectors (in R), is_na(r_null) should not return
true.