The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
openclPort namespaceThe openclPort namespace (declared in
openclPort.h and implemented in
opencl_kernel_runners.cpp) provides the generic OpenCL
infrastructure that underlies all of nmathopencl’s GPU
execution. It is designed to be completely independent of any particular
mathematical library: it knows only about OpenCL C types and standard
C++ types, not about nmath, distributions, or R data
structures.
This namespace is currently shipped inside nmathopencl
to make the package self-contained. It will migrate to a standalone
openclPort CRAN package in a future release (see Chapter 00
for the release roadmap). The design described here will remain stable
during that migration; downstream packages using these utilities via
#include "openclPort.h" will require only a change in their
LinkingTo field.
opencl_dbl_scalar_kernel_runnerThis is the single generic runner that powers every kernel in
nmathopencl.
| Parameter | Type | Description |
|---|---|---|
kernel_source |
const std::string& |
Complete, concatenated OpenCL program source (shims + library + kernel). Must be syntactically valid OpenCL C. |
kernel_name |
const char* |
Name of the __kernel void function to execute, e.g.,
"dnorm_kernel". |
dargs |
const std::vector<double>& |
Scalar double arguments to pass to the kernel, in the
order the kernel declares them. |
n_out |
int |
Number of double values to read back from the output
buffer. Also becomes the global work size. |
out_flat |
std::vector<double>& |
Output buffer; resized to n_out on entry and filled
with the kernel’s results. |
The runner performs the complete OpenCL lifecycle for a single kernel invocation:
clGetPlatformIDs and clGetDeviceIDs to select
the first available GPU device.cl_context and a cl_command_queue for the
selected device.clCreateProgramWithSource followed by
clBuildProgram. On failure, it retrieves the build log via
CL_PROGRAM_BUILD_LOG and throws a
std::runtime_error containing the compiler output.clCreateKernel
with the supplied kernel_name.CL_MEM_WRITE_ONLY device buffer of size
n_out * sizeof(double).dargs as a scalar kernel argument
(clSetKernelArg) in index order, then sets the output
buffer as the next argument, and finally n_out as the last
integer argument.n_out via
clEnqueueNDRangeKernel.clFinish
then reads the output buffer back to host memory via
clEnqueueReadBuffer.Error checking is performed after every OpenCL API call. Any
non-CL_SUCCESS status causes the runner to clean up all
already-created resources and throw a std::runtime_error
with a message that includes the status code, its symbolic name, and the
platform/device information.
The runner assumes the kernel is declared with the following argument signature:
__kernel void my_kernel(
double arg0, double arg1, ..., double argN,
__global double* out,
int n_out
)The dargs vector maps element-by-element to
arg0 through argN. The runner always appends
the output buffer and n_out as the last two arguments
automatically. Kernels in inst/cl/src/ that accept vector
inputs (e.g., dnorm_kernel) do not fit this layout and use
a separate, more specialized runner in kernel_runners.cpp;
the double-scalar runner handles the scalar-parameter kernels that
produce replicated output.
// Inside a kernel_wrappers.cpp function (nmathopencl namespace):
namespace nmathopencl {
Rcpp::NumericVector dnorm_opencl_impl(
int n, double mu, double sigma, bool give_log, bool verbose)
{
std::string src = openclPort::load_kernel_library("nmath") + "\n"
+ openclPort::load_kernel_source("src/dnorm_scalar_kernel.cl");
std::vector<double> out(n);
openclPort::opencl_dbl_scalar_kernel_runner(
src,
"dnorm_scalar_kernel",
{ mu, sigma, static_cast<double>(give_log) }, // dargs
n, // n_out
out
);
return Rcpp::NumericVector(out.begin(), out.end());
}
} // namespace nmathopenclFive inline functions in openclPort.h
provide consistent error reporting across all runners. Because they are
inline, they are available to any translation unit that
includes openclPort.h; downstream packages that
LinkingTo: nmathopencl get them for free.
opencl_status_nameReturns the symbolic name of an OpenCL error code as a C string:
CL_SUCCESS, CL_DEVICE_NOT_FOUND, CL_BUILD_PROGRAM_FAILURE, ...
Returns "UNKNOWN_OR_VENDOR_SPECIFIC" for codes not in
the table. Covers all error codes defined by the OpenCL 3.0
specification that are commonly encountered in practice.
opencl_status_hintReturns a plain-English diagnostic hint for the most actionable error codes:
| Status | Hint |
|---|---|
CL_OUT_OF_RESOURCES |
Device/runtime resource limit exceeded (watchdog, register pressure, …) |
CL_OUT_OF_HOST_MEMORY |
Host memory allocation failed |
CL_MEM_OBJECT_ALLOCATION_FAILURE |
Device memory allocation failed |
CL_BUILD_PROGRAM_FAILURE |
Kernel compilation failed; inspect build log |
CL_INVALID_CONTEXT |
OpenCL context is invalid |
CL_DEVICE_NOT_AVAILABLE |
Device present but temporarily unavailable |
Returns "No additional hint available." for all other
codes.
opencl_read_platform_info_str /
opencl_read_device_info_strinline std::string opencl_read_platform_info_str(
cl_platform_id platform, cl_platform_info param);
inline std::string opencl_read_device_info_str(
cl_device_id device, cl_device_info param);Query string-valued platform or device info (e.g.,
CL_PLATFORM_NAME, CL_DEVICE_NAME,
CL_DRIVER_VERSION) safely: they handle null handles, query
failures, and trailing null terminators, returning
"unknown" on any error.
Typical usage:
opencl_make_context_errorinline std::runtime_error opencl_make_context_error(
cl_int status, cl_platform_id platform, cl_device_id device);Constructs a std::runtime_error with a message of the
form:
OpenCL error at clCreateContext (status=-6, name=CL_OUT_OF_HOST_MEMORY).
platform_name=NVIDIA CUDA, platform_vendor=NVIDIA Corporation,
device_name=NVIDIA GeForce RTX 4090, driver_version=560.94.
This may indicate a transient driver/runtime context failure.
Used inside opencl_dbl_scalar_kernel_runner and
f2_f3_kernel_runner immediately after a
clCreateContext failure.
openclPortstd::vector conversionstd::vector<double> openclPort::flattenMatrix(const Rcpp::NumericMatrix& mat);
std::vector<double> openclPort::copyVector(const Rcpp::NumericVector& vec);flattenMatrix converts a column-major R matrix to a
row-major (or simply contiguous) std::vector<double>
suitable for passing as a device buffer. copyVector does
the same for a numeric vector. Both are implemented in
OpenCL_helper.cpp.
openclPort)At the R prompt, host GPU inventory uses
opencltools::gpu_names() (see Chapters
00–01). Inside the package DLL, openclPort
exposes related C++ helpers for kernel runners:
Rcpp::CharacterVector openclPort::gpu_names();
int openclPort::detect_num_gpus_internal();
bool openclPort::has_opencl();
int openclPort::get_opencl_core_count();openclPort::gpu_names() — OpenCL device names from the
runtime linked into this package (distinct from
opencltools::gpu_names() R helper).detect_num_gpus_internal() — GPU device count for
envelope-style runners.nmathopencl_has_opencl() — compile-time
USE_OPENCL flag in this DLL.openclPort::get_opencl_core_count() — internal C++
helper (delegates to opencltools); at the R prompt use
opencltools::get_opencl_core_count().struct openclPort::OpenCLConfig {
bool have_expm1;
bool have_log1p;
std::string buildOptions;
};
openclPort::OpenCLConfig openclPort::configureOpenCL(
cl_context context, cl_device_id device);configureOpenCL probes the device to determine which
standard math functions are available as OpenCL built-ins (specifically
expm1 and log1p). It returns a set of
buildOptions (e.g., -DHAVE_EXPM1 -DHAVE_LOG1P)
that are passed to clBuildProgram so that the nmath
.cl files can conditionally use the built-in or the ported
version.
openclPort from a downstream packageA package that LinkingTo: nmathopencl can include
openclPort.h directly:
// In a downstream package's source file:
#include "openclPort.h"
void my_kernel_runner(const std::string& program_source, int n) {
std::vector<double> out;
openclPort::opencl_dbl_scalar_kernel_runner(
program_source,
"my_kernel",
{ 1.0, 2.5 }, // scalar args
n,
out
);
}No additional CMake or autoconf work is needed beyond the
LinkingTo field, because openclPort.h is
installed to inst/include/ and the exported symbols are
visible in the shared library.
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.