Sparse derivatives with sparsediff

The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

What sparsediff is

sparsediff is a thin R interface to the SparseDiffEngine C library — the sparse Jacobian and Hessian differentiation backend used by CVXPY for its Disciplined Nonlinear Programming (DNLP) extension. It is the R analog of the Python sparsediffpy package and wraps the same C library.

You build a nonlinear expression graph out of the sd_* atom constructors, assemble the pieces into a problem, and then evaluate, at any primal point,

the objective value and its dense gradient,
the constraint vector and its sparse Jacobian (in COO form), and
the lower-triangular Lagrangian Hessian (in COO form).

This is a low-level backend: the intended user is a higher-level modelling layer such as CVXR, not an analyst writing models by hand. The walk-through below shows the moving parts.

library(sparsediff)
engine_version()
#> [1] "0.3.0"

Building an expression graph

Leaves are created with sd_variable() (a slice of the differentiation vector) and sd_parameter() (fixed data). Everything else is an atom applied to other expressions. Each constructor returns an expression handle — an external pointer into the engine’s graph that frees itself when garbage collected.

Here is f(x) = sum(exp(x)) for a length-3 variable, with a single linear constraint g(x) = sum(x):

n   <- 3L
x   <- sd_variable(d1 = n, d2 = 1L, var_id = 0L, n_vars = n)  # 0-based var offset
obj <- sd_sum(sd_exp(x), axis = -1L)                          # sum(exp(x))  (axis -1 = all)
g1  <- sd_sum(x, axis = -1L)                                  # sum(x)

var_id is the 0-based, column-major offset of the variable’s first entry in the flat primal vector; n_vars is the total number of variables. Reduction axis follows the engine convention: -1 (all entries), 0 (down rows), 1 (across columns).

Assembling a problem and its derivative oracle

prob <- sd_problem(objective = obj, constraints = list(g1), verbose = FALSE)

# initialise the (structural) derivative layout once
sd_init_derivatives(prob)
sd_init_jacobian_coo(prob)
sd_init_hessian_coo(prob)

Sparsity patterns are structural and fixed after these sd_init_* calls, so you query them once and reuse them; only the values are recomputed at each point.

Evaluating value and derivatives

Evaluation is ordered: a forward pass populates the node values at a primal point u, after which the gradient, Jacobian values and Hessian values can be read.

u <- c(0, 0.5, 1)

sd_objective_forward(prob, u)   # value of sum(exp(x))
#> [1] 5.367003
sd_gradient(prob)               # gradient = exp(u)
#> [1] 1.000000 1.648721 2.718282

sd_constraint_forward(prob, u)  # value of sum(x)
#> [1] 1.5

The constraint Jacobian comes back in COO form (0-based row/column indices) plus the matching nonzero values:

sd_jacobian_sparsity(prob)      # rows / cols / nrow / ncol
#> $rows
#> [1] 0 0 0
#> 
#> $cols
#> [1] 0 1 2
#> 
#> $nrow
#> [1] 1
#> 
#> $ncol
#> [1] 3
sd_jacobian_values(prob)        # d/dx sum(x) = (1, 1, 1)
#> [1] 1 1 1

The Lagrangian Hessian is the lower triangle of obj_w * Hess(f) + sum_i w[i] * Hess(g_i). Here the objective weight is 1 and the constraint is linear (zero Hessian), so the result is diag(exp(u)):

sd_hessian_sparsity(prob)
#> $rows
#> [1] 0 1 2
#> 
#> $cols
#> [1] 0 1 2
#> 
#> $nrow
#> [1] 3
#> 
#> $ncol
#> [1] 3
sd_hessian_values(prob, obj_w = 1, w = 0)
#> [1] 1.000000 1.648721 2.718282

Parameters and fast re-evaluation

A sd_parameter() is fixed data that can be updated between evaluations without rebuilding the graph — the basis for CVXPY/CVXR’s Disciplined Parametrized Programming (DPP) fast path. Register the parameters once, then push new values with sd_update_params():

p    <- sd_parameter(d1 = 1L, d2 = 1L, param_id = 0L, n_vars = n, values = 2)
obj2 <- sd_sum(sd_scalar_mult(p, sd_exp(x)), axis = -1L)      # theta * sum(exp(x))
prob2 <- sd_problem(obj2, constraints = list(), verbose = FALSE)
sd_register_params(prob2, list(p))
sd_init_derivatives(prob2)

sd_objective_forward(prob2, u)        # theta = 2
#> [1] 10.73401
sd_update_params(prob2, 3)            # theta -> 3
sd_objective_forward(prob2, u)        # re-evaluated, no rebuild
#> [1] 16.10101

Where to go next

The full atom catalogue is grouped in the reference index: leaves, elementwise, affine/shape, bivariate, product-reduction, and parameter/constant-matrix atoms. For modelling at a higher level, see CVXR, which uses sparsediff as the derivative backend for its DNLP solver path.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.