- `Rcpp::asis` vignette engine. No rendering on CRAN runners.
- Removed `rmarkdown` from Suggests (no longer needed).
- `ggml_graph_print()` output captured in `test-graph-utils.R`; C-level broadcast warnings captured in the ONNX broadcast and resize-broadcast tests.
- `gguf_load(path)` — opens a GGUF file (v2/v3) and reads all metadata and tensor descriptors. Returns an S3 object of class `"gguf"`.
- `gguf_metadata(x)` — returns all key-value metadata pairs as a named list (architecture, tokenizer config, quantization info, etc.).
- `gguf_tensor_names(x)` — lists all tensor names in the file.
- `gguf_tensor_info(x, name)` — returns shape, type, and size in bytes for a single tensor.
- `gguf_tensor_data(x, name)` — dequantizes (if needed) and returns tensor weights as an R numeric array with the correct dimensions.
- `gguf_free(x)` — explicitly frees the GGUF context (also called by GC).
- `print.gguf()` method shows the file version, tensor count, and metadata count.
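The GGUF accessors above start by parsing the file's binary header. As a self-contained illustration (our own helpers, not the package internals; layout per the GGUF spec: magic `"GGUF"`, a uint32 version, then uint64 tensor and metadata-KV counts, little-endian), the header can be written and re-read in base R:

```r
# Write a minimal GGUF v3 header, then parse it back.
write_gguf_header <- function(path, version = 3L, n_tensors = 0L, n_kv = 0L) {
  con <- file(path, "wb"); on.exit(close(con))
  writeBin(charToRaw("GGUF"), con)
  writeBin(as.integer(version), con, size = 4, endian = "little")
  # R has no native int64: emit low/high 32-bit words (counts < 2^31)
  writeBin(c(as.integer(n_tensors), 0L), con, size = 4, endian = "little")
  writeBin(c(as.integer(n_kv), 0L), con, size = 4, endian = "little")
}

read_gguf_header <- function(path) {
  con <- file(path, "rb"); on.exit(close(con))
  magic <- rawToChar(readBin(con, "raw", n = 4))
  stopifnot(magic == "GGUF")
  version   <- readBin(con, "integer", n = 1, size = 4, endian = "little")
  n_tensors <- readBin(con, "integer", n = 2, size = 4, endian = "little")[1]
  n_kv      <- readBin(con, "integer", n = 2, size = 4, endian = "little")[1]
  list(version = version, n_tensors = n_tensors, n_kv = n_kv)
}

path <- tempfile(fileext = ".gguf")
write_gguf_header(path, version = 3L, n_tensors = 2L, n_kv = 5L)
hdr <- read_gguf_header(path)
```

After the header, a real reader would go on to decode the metadata KV pairs and tensor descriptors that `gguf_metadata()` and `gguf_tensor_info()` expose.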
- `VK_KHR_push_descriptor`: unchanged — when the extension is available and `maxPushDescriptors >= 12`, descriptor sets are pushed directly into the command buffer via `pushDescriptorSetKHR()`, eliminating descriptor pool overhead. Falls back to the traditional descriptor pool path on hardware without the extension.
- `fit()` now accepts a `callbacks` parameter for sequential models (passed through to `ggml_fit_sequential()`).
- Tests: `test-gguf.R`, `test-graph-utils.R`, `test-inplace-ops.R`, `test-keras-api.R`, `test-misc-ops.R`, `test-model-ops.R`, `test-print-methods.R`, `test-tensor-utils.R`, `test-threading.R`, `test-autograd-missing.R`, `test-nn-functional-missing.R`, `test-quants-missing.R`.
- `src/` and `inst/include/` headers: `configure` and `configure.win` now automatically sync all public headers from `src/` to `inst/include/` at install time. Previously, changes to `GGML_MAX_DIMS` (4→5) and other structs in `src/ggml.h` were not propagated to the exported headers, causing segfaults in downstream packages (e.g. sd2R).
- `tests/testthat/test-headers-sync.R` verifies that the `inst/include/` headers remain in sync with the `src/` headers and that `GGML_MAX_DIMS` is consistent.
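A plausible shape for the check in `test-headers-sync.R` (assumption: it compares file contents; the helper below is ours, not the test's code) is a byte-for-byte comparison of each header:

```r
# Headers are "in sync" iff every src/ header exists under inst/include/
# with identical contents (compared here via md5 checksums).
sync_ok <- function(src_dir, inc_dir) {
  hs <- list.files(src_dir, pattern = "\\.h$")
  length(hs) > 0 &&
    all(file.exists(file.path(inc_dir, hs))) &&
    all(unname(tools::md5sum(file.path(src_dir, hs))) ==
        unname(tools::md5sum(file.path(inc_dir, hs))))
}

src <- tempfile("src"); inc <- tempfile("inc")
dir.create(src); dir.create(inc)
writeLines("#define GGML_MAX_DIMS 5", file.path(src, "ggml.h"))
file.copy(file.path(src, "ggml.h"), inc)
in_sync <- sync_ok(src, inc)

# Simulate the old bug: the exported copy drifts out of date.
writeLines("#define GGML_MAX_DIMS 4", file.path(inc, "ggml.h"))
out_of_sync <- sync_ok(src, inc)
```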
- `ggml_view_5d()` — new API function for creating 5D views with explicit strides, extending the existing 1D–4D view family. Uses the existing `ggml_view_impl()` internally.
- `ggml_repeat_5d()` — new API function for tiling tensors up to 5D. CPU kernels (`ggml_compute_forward_repeat_f32`, `ggml_compute_forward_repeat_f16`) updated with a 5th loop dimension. Vulkan dispatch collapses dim3×dim4 into the push constants transparently (no shader changes needed — push constants remain at 128 bytes).
- `onnx_ggml.c` (~20 sites): `ne[GGML_MAX_DIMS]` arrays, switch with `case 5:` / `new_tensor_5d`.
- `onnx_broadcast_align`: all reshape/new_tensor calls use dimension-aware helpers.
- `onnx_reshape_nd()`.
- `ggml_repeat_5d()`.
- `tmap_put_nd()` and the slice_fill arrays updated to `GGML_MAX_DIMS`.
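The tiling semantics of `ggml_repeat_5d()` can be mirrored in base R for reference (our own sketch on plain 5-D arrays; the real function operates on ggml tensors):

```r
# Tile each of the five dimensions of x by an integer factor r[d]:
# along dimension d the index sequence 1..dim(x)[d] is repeated r[d] times.
repeat_5d <- function(x, r) {
  stopifnot(length(dim(x)) == 5, length(r) == 5)
  idx <- lapply(seq_len(5), function(d) rep(seq_len(dim(x)[d]), times = r[d]))
  do.call(`[`, c(list(x), idx, list(drop = FALSE)))
}

x <- array(1:2, dim = c(2, 1, 1, 1, 1))
y <- repeat_5d(x, c(2, 3, 1, 1, 1))   # 4 x 3 x 1 x 1 x 1
```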
- New helpers `onnx_reshape_nd()`, `onnx_new_tensor_nd()`, `ne_product()` — eliminate switch/case duplication.
- (`ggml_permute` API limitation).
- `ConstantOfShape` read the `value` TensorProto attribute as float regardless of `data_type`. When `data_type=7` (INT64), the 8-byte int64 was reinterpreted as a 4-byte float, producing garbage values (~1.4e-45 instead of 1). This broke attention mask generation (fill=0 instead of 1) and position ID generation (NonZero on zeros = empty).
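The misread is easy to reproduce in base R: the little-endian bytes of the int64 value 1, reinterpreted as a 4-byte float, yield the smallest subnormal rather than 1 (a demonstration of the bug described above, not package code):

```r
# Bytes of int64 value 1, little-endian (R lacks int64, so write two
# 4-byte words: low word 1, high word 0).
raw8  <- writeBin(c(1L, 0L), raw(), size = 4, endian = "little")

# Bug: the first 4 bytes reinterpreted as a 32-bit float -> ~1.4e-45
wrong <- readBin(raw8, what = "double", n = 1, size = 4, endian = "little")

# Fix: read the value with the width that data_type actually declares
right <- readBin(raw8, what = "integer", n = 2, size = 4, endian = "little")[1]
```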
- `ConstantOfShape` now checks `data_type` and correctly handles INT64, INT32, DOUBLE, and FLOAT value attributes.
- `ggml_get_rows` only supports 2D data. For axis=0 on rank>2 (e.g. the CaiT QKV split on `[48,576,6,3]`), the tensor is now reshaped to 2D, gathered, and reshaped back.
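The reshape-gather-reshape fallback can be sketched on plain R arrays (our own helper; 1-based indices, gathering along the first dimension, which stands in for the gather axis):

```r
# Collapse trailing dimensions to 2-D, gather rows, restore the shape.
gather_axis0 <- function(x, idx) {
  d <- dim(x)
  m <- matrix(x, nrow = d[1])                 # collapse dims 2..n (column-major)
  array(m[idx, , drop = FALSE], dim = c(length(idx), d[-1]))
}

x <- array(1:24, dim = c(4, 3, 2))
y <- gather_axis0(x, c(2, 4))                 # pick rows 2 and 4
```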
- `GGML_OP_SCATTER_ELEMENTS` added to the ggml engine with both a CPU kernel and a Vulkan compute shader.
- Vulkan shader (`scatter_elements.comp`): two variants compiled at install time — `scatter_elements_none` (overwrite) and `scatter_elements_add` (atomicAdd via `GL_EXT_shader_atomic_float`). Data is copied to the output via `vkCmdCopyBuffer` with a pipeline barrier before the scatter dispatch.
- `ScatterElements` op with `axis=0` and `reduction="none"/"add"` attributes. Indices are cast to I32, updates/data to F32 automatically.
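Reference semantics for the two reduction variants, as a 1-D base-R sketch (our own code; the real op scatters along `axis` on tensors):

```r
# "none" overwrites data[idx[k]] with upd[k]; "add" accumulates into it.
scatter_elements <- function(data, idx, upd, reduction = c("none", "add")) {
  reduction <- match.arg(reduction)
  out <- data
  for (k in seq_along(idx)) {
    i <- idx[k]
    out[i] <- if (reduction == "add") out[i] + upd[k] else upd[k]
  }
  out
}

d <- rep(0, 5)
overwrite  <- scatter_elements(d, c(2, 2, 4), c(1, 3, 7), "none")
accumulate <- scatter_elements(d, c(2, 2, 4), c(1, 3, 7), "add")
```

With duplicate indices, `"none"` keeps the last write in this sequential sketch; on the GPU that ordering is not guaranteed, while `"add"` accumulates deterministically up to float rounding (hence the atomicAdd variant).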
- `ggml_map_custom3` op. The CPU kernel computes the 2D relative position bias directly: `bias[b,hq,wq,hk,wk] = dot(x, W_h) + dot(x_transposed, W_w)`.
- `detect_pos_embed_blocks()` identifies contiguous node ranges with `/pos_embed/` in their output names, extracts the W_h/W_w initializer shapes to determine H, W, and C, and validates the F32 data type.
- In `onnx_ggml_run()`, input data is copied into pinned memory before `ggml_backend_tensor_set()` — the Vulkan driver detects the pinned source pointer and performs a direct DMA transfer to VRAM, bypassing the internal staging copy.
- If `ggml_backend_vk_host_buffer_type()` returns NULL or the buffer is too small, the standard staging path is used transparently.
- `onnx_device_info()`: added NULL guards for the `ctx->graph` and `n_nodes == 0` edge cases that caused a segfault when called on models before the first inference run.
- `ggml_predict()` with stochastic dropout: `nn_build_graph()` now receives `training = FALSE` during inference, so stochastic Bernoulli dropout is disabled at predict time. Previously, `stochastic = TRUE` dropout layers applied random masks during inference, degrading accuracy.
- `ggml_fit()` return value: the return value of `ggml_fit()` must be assigned back to `model` to obtain the trained weights (`model <- ggml_fit(...)`). This is now clarified in all examples and documentation. Using `history <- ggml_fit(...)` without reassigning `model` leaves the model with untrained weights.
- `ggml_evaluate()` return value: now includes `n_samples` in addition to `loss` and `accuracy`. Metrics are computed on all samples without truncation (via `ggml_predict()` internally).
- `inst/examples/titanic_classification.R` — new end-to-end binary classification example on the Titanic dataset. Demonstrates feature engineering (Title, FamilySize, IsAlone), stratified train/val split, one-hot encoding, dropout regularization, and manual validation metrics (accuracy, precision, recall, F1, confusion matrix). Achieves ~82% val accuracy.
- Weights are kept in `weight_buf` and never re-transferred between runs. The previous architecture reloaded all weights before every `onnx_run()` call — eliminated entirely.
- `ctx_weight` / `ctx` contexts: weight tensors live in a permanent GPU buffer that the scheduler never aliases; compute tensors are managed by `ggml_backend_sched` independently.
- `onnx_device_info()` — scheduler diagnostic: number of splits, GPU/CPU op counts, CPU-only op list.
- Benchmark script (`inst/examples/benchmark_onnx.R`): proper VRAM cleanup between models via `rm()` + `gc()`.
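The manual validation metrics mentioned for the Titanic example can be sketched as follows (our own helper, not the example's exact code):

```r
# Binary-classification metrics from 0/1 predictions and ground truth.
metrics <- function(pred, truth) {
  tp <- sum(pred == 1 & truth == 1); fp <- sum(pred == 1 & truth == 0)
  fn <- sum(pred == 0 & truth == 1); tn <- sum(pred == 0 & truth == 0)
  precision <- tp / (tp + fp); recall <- tp / (tp + fn)
  list(accuracy  = (tp + tn) / length(truth),
       precision = precision, recall = recall,
       f1        = 2 * precision * recall / (precision + recall),
       confusion = matrix(c(tn, fp, fn, tp), 2, 2,
                          dimnames = list(pred = 0:1, truth = 0:1)))
}

m <- metrics(pred = c(1, 1, 0, 0), truth = c(1, 0, 1, 0))
```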
- `onnx_load(path, device, input_shapes)` — load an ONNX model file, build a ggml computation graph, and allocate tensors on a Vulkan GPU or the CPU. Weights are loaded via a memory-mapped file (zero-copy where possible).
- `onnx_run(model, inputs)` — run inference on a loaded ONNX model with named input data.
- `onnx_inputs(model)` — list the expected input tensor names and shapes.
- `onnx_summary(model)` — return model metadata (IR version, opset, producer, ops used).
- `print.onnx_model()` — formatted summary of a loaded ONNX model.
- `input_shapes` parameter for models with dynamic dimensions: specify fixed shapes at load time (e.g. `input_shapes = list(image = c(1L, 3L, 224L, 224L))`).
- `auto_pad` attribute (SAME_UPPER, SAME_LOWER) supported for Conv and pooling ops.
- `input_shapes` (Conv, Reshape, Transpose)
- `input_shapes` (1180 nodes)
- `input_shapes` (482 nodes: MatMul, LayerNorm, GELU, Softmax)
- `inst/lib/libggml.a`, breaking static linking from dependent packages (e.g. llamaR).
- `dp_train(make_model, data, loss_fn, forward_fn, target_fn, n_gpu, n_iter, lr, max_norm, verbose)` — data-parallel training across multiple replicas. Weights are broadcast from replica 0 before the first step; gradients are averaged across replicas each iteration; weights are re-broadcast after each optimizer update. Returns `list(params, loss_history, model)`.
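The per-iteration cycle that `dp_train()` describes (average gradients, one optimizer update, re-broadcast) reduces to a few lines with scalar weights (a toy, single-process illustration; real replicas live on separate GPUs):

```r
# Average the per-replica gradients into one global gradient.
avg_grads <- function(grads) Reduce(`+`, grads) / length(grads)

w  <- 1.0                             # weights after broadcast from replica 0
lr <- 0.1
replica_grads <- list(0.2, 0.4, 0.6)  # one gradient per replica
g <- avg_grads(replica_grads)         # averaged gradient
w <- w - lr * g                       # single SGD update
replicas <- rep(list(w), 3)           # re-broadcast updated weights
```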
- `ag_mul` and `ag_sub` now support CPU broadcast: `[d×s] * [1×s]` and `[d×s] * [d×1]` shapes work correctly with proper gradient reduction.
- `ag_softmax_cross_entropy_loss` accepts integer target vectors (0-based class indices) and converts them to one-hot automatically.
- `ggml_sum_rows` f16 on Vulkan: F16→F16 dispatch is now supported natively (no CPU fallback).
- `ag_tensor()` / `ag_param()` — environment-backed tensors with reference semantics; in-place optimizer updates are visible to all references.
- `with_grad_tape({ ... })` — enables the global gradient tape for the enclosed forward pass.
- `backward(loss)` — reverse-mode automatic differentiation; returns a gradient environment keyed by tensor id.
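A scalar toy shows how a tape of this shape works; note how R environments supply the reference semantics mentioned for `ag_tensor()`. This is our own minimal sketch reusing the `ag_*` naming, not the package implementation:

```r
# Global tape: each op records a closure that propagates gradients backward.
tape <- new.env(); tape$ops <- list()

ag <- function(value) { t <- new.env(); t$value <- value; t$grad <- 0; t }

record <- function(out, backward_fn)
  tape$ops[[length(tape$ops) + 1L]] <- list(out = out, fn = backward_fn)

ag_mul <- function(a, b) {
  out <- ag(a$value * b$value)
  record(out, function(g) { a$grad <- a$grad + g * b$value
                            b$grad <- b$grad + g * a$value })
  out
}

ag_add <- function(a, b) {
  out <- ag(a$value + b$value)
  record(out, function(g) { a$grad <- a$grad + g; b$grad <- b$grad + g })
  out
}

backward <- function(loss) {
  loss$grad <- 1
  # Recording order is a topological order, so replay it in reverse.
  for (op in rev(tape$ops)) op$fn(op$out$grad)
}

x <- ag(3); y <- ag(4)
z <- ag_add(ag_mul(x, x), y)   # z = x*x + y
backward(z)                    # dz/dx = 2x = 6, dz/dy = 1
```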
- `ag_matmul`, `ag_add` (with bias broadcast), `ag_sub`, `ag_mul`, `ag_scale`.
- `ag_relu`, `ag_sigmoid`, `ag_tanh`, `ag_softmax`.
- `ag_sum`, `ag_mean`, `ag_log`, `ag_exp`, `ag_pow`, `ag_clamp`.
- `ag_reshape`, `ag_transpose`.
- `ag_mse_loss`, `ag_cross_entropy_loss`, `ag_softmax_cross_entropy_loss` (numerically-stable fused).
- `optimizer_sgd()` — SGD with optional momentum.
- `optimizer_adam()` — Adam with bias-corrected moment estimates.
- `ag_linear()` — Glorot-initialised dense layer (closure-based, returns `$forward`, `$params()`).
- `ag_gradcheck()` — central finite-difference gradient checker (like `torch.autograd.gradcheck`).
- `ag_sequential(...)` — ordered layer container; collects all parameters for the optimizer.
- `ag_dropout(rate)` — inverted dropout; identity in eval mode.
- `ag_batch_norm(num_features)` — batch normalisation with running statistics and learnable γ/β.
- `ag_embedding(vocab_size, dim)` — token lookup with scatter-add backward.
- `ag_train(model)` / `ag_eval(model)` — switch all sub-layers between train and eval mode.
- `ag_dataloader(x, y, batch_size, shuffle, col_major)` — mini-batch iterator with shuffle and an `$epoch()` helper.
- `lr_scheduler_step(optimizer, step_size, gamma)` — step-decay learning rate.
- `lr_scheduler_cosine(optimizer, T_max, lr_min, restart)` — cosine annealing (with optional SGDR warm restarts).
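The two decay rules can be written as pure functions of the epoch (formulas assumed from the conventional step and cosine schedules; the package's schedulers mutate the optimizer instead):

```r
# Step decay: multiply the base LR by gamma every step_size epochs.
lr_step <- function(lr0, epoch, step_size, gamma)
  lr0 * gamma^(epoch %/% step_size)

# Cosine annealing: glide from lr0 down to lr_min over T_max epochs.
lr_cosine <- function(lr0, epoch, T_max, lr_min = 0)
  lr_min + (lr0 - lr_min) * (1 + cos(pi * epoch / T_max)) / 2
```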
- `clip_grad_norm(params, grads, max_norm)` — clips all gradients by the global L2 norm in-place.
- `ggml_layer_lstm()` — LSTM recurrent layer (unrolled BPTT).
- `ggml_layer_gru()` — GRU recurrent layer (unrolled BPTT).
- `ggml_layer_global_max_pooling_2d()` — reduces `[H,W,C]` to `[C]` via max pooling.
- `ggml_layer_global_average_pooling_2d()` — reduces `[H,W,C]` to `[C]` via average pooling.
- `ggml_save_model()` — saves the full model (architecture + weights) to an RDS file.
- `ggml_load_model()` — restores a model saved with `ggml_save_model()`.
- `ggml_dense()`, `ggml_conv_2d()`, `ggml_conv_1d()`, `ggml_batch_norm()`, `ggml_embedding()`, `ggml_lstm()`, `ggml_gru()` — layer object constructors returning a reusable `ggml_layer` object.
- `ggml_apply(tensor, layer)` — applies a `ggml_layer` object to a tensor node; weights are shared by object identity.
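Global-norm clipping as described for `clip_grad_norm()` reduces to one scaling step (our sketch returns a copy rather than clipping in place):

```r
# Scale every gradient by max_norm / ||g||_2 when the global norm exceeds
# max_norm; otherwise leave the gradients untouched.
clip_grad_norm <- function(grads, max_norm) {
  total <- sqrt(sum(vapply(grads, function(g) sum(g^2), numeric(1))))
  if (total > max_norm)
    grads <- lapply(grads, function(g) g * max_norm / total)
  grads
}

g  <- list(c(3, 0), c(0, 4))   # global L2 norm = 5
g2 <- clip_grad_norm(g, 1)     # rescaled so the global norm is 1
```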
- `ggml_layer_dropout()` — dropout with deterministic or stochastic (per-epoch Bernoulli mask) mode.
- `ggml_layer_embedding()` — token embedding lookup for integer inputs.
- `ggml_input()` gains a `dtype` argument (`"float32"` or `"int32"`).
- `ggml_model()` and `ggml_predict()`.
- `ggml_input()` — declare a symbolic input tensor node (Functional API).
- `ggml_model()` — assemble a `ggml_functional_model` from input/output nodes.
- `ggml_layer_add()` — element-wise addition of tensor nodes (residual connections).
- `ggml_layer_concatenate()` — concatenate tensor nodes along an axis.
- `ggml_layer_*()` functions now accept a `ggml_tensor_node` as their first argument (Functional API mode).
- `ggml_compile()`, `ggml_fit()`, `ggml_evaluate()`, `ggml_predict()` are now S3 generics with methods for `ggml_functional_model`.
- `ggml_fit_opt()` — low-level optimizer loop with callbacks and learning-rate control.
- `ggml_callback_early_stopping()` — stops training when a metric stagnates.
- `ggml_schedule_step_decay()` — step learning-rate decay.
- `ggml_schedule_cosine_decay()` — cosine learning-rate annealing.
- `ggml_schedule_reduce_on_plateau()` — reduces the LR when the metric stops improving.
- `ggml_opt_init_for_fit()`, `ggml_opt_set_lr()`, `ggml_opt_get_lr()` — learning-rate control without recreating the optimizer context.
- `configure.win`.
- `ggml_layer_conv_1d()` — 1D convolution layer.
- `ggml_layer_batch_norm()` — batch normalization layer.
- `ggml_predict_classes()` — argmax wrapper returning 1-based class indices.
- `summary.ggml_sequential_model()` — detailed model summary with parameter counts.
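A patience-based stagnation rule of the kind `ggml_callback_early_stopping()` implements can be sketched as a closure (our own toy; the real callback's options are not documented here):

```r
# Stop when the monitored metric has not improved for `patience`
# consecutive evaluations (lower is better, e.g. validation loss).
early_stopper <- function(patience) {
  best <- Inf; wait <- 0
  function(metric) {
    if (metric < best) { best <<- metric; wait <<- 0 }
    else wait <<- wait + 1
    wait >= patience          # TRUE means: stop training now
  }
}

stop_now <- early_stopper(patience = 2)
r1 <- stop_now(1.00)   # improvement
r2 <- stop_now(0.90)   # improvement
r3 <- stop_now(0.95)   # first stagnant epoch
r4 <- stop_now(0.97)   # second stagnant epoch -> stop
```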
- `ggml_fit()` now returns `model$history` (class `ggml_history`) with `print` and `plot` methods.
- `ggml_model_sequential()`, `ggml_layer_dense()`, `ggml_layer_conv_2d()`, `ggml_layer_max_pooling_2d()`, `ggml_layer_flatten()`, `ggml_compile()`, `ggml_fit()`, `ggml_evaluate()`, `ggml_predict()`, `ggml_save_weights()`, `ggml_load_weights()`.
- `ggml_timestep_embedding()` — sinusoidal timestep embeddings.
- `ggml_set_f32_nd()`, `ggml_get_f32_nd()`, `ggml_set_i32_nd()`, `ggml_get_i32_nd()`.
- `ggml_tensor_nb()`, `ggml_tensor_num()`, `ggml_tensor_copy()`, `ggml_tensor_set_f32_scalar()`, `ggml_get_first_tensor()`, `ggml_get_next_tensor()`.
- `libggml.a` exported for linking by dependent packages.
- `gguf.cpp` added for GGUF file format support.
- Headers in `inst/include/` for LinkingTo.
- `ggml_opt_init()`, `ggml_opt_free()`, `ggml_opt_fit()`, `ggml_opt_epoch()`, `ggml_opt_eval()`.
- `ggml_opt_dataset_init()`, `ggml_opt_dataset_data()`, `ggml_opt_dataset_labels()`, `ggml_opt_dataset_shuffle()`.
- `ggml_opt_result_init()`, `ggml_opt_result_loss()`, `ggml_opt_result_accuracy()`,
`ggml_opt_result_pred()`.
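The sinusoidal embedding that `ggml_timestep_embedding()` produces can be sketched in base R (convention assumed: half cosine, half sine over log-spaced frequencies, as in common diffusion-model implementations):

```r
# Embed a scalar timestep t into `dim` values: cos(t * f_k) followed by
# sin(t * f_k), with frequencies spaced geometrically down from 1.
timestep_embedding <- function(t, dim, max_period = 10000) {
  half  <- dim %/% 2
  freqs <- exp(-log(max_period) * (0:(half - 1)) / half)
  args  <- t * freqs
  c(cos(args), sin(args))
}

e <- timestep_embedding(0, 8)   # t = 0: all cosines 1, all sines 0
```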