
arcopt

R-CMD-check Lifecycle: experimental License: MIT

Robust nonlinear optimization for R

When optim() fails on your ill-conditioned model, arcopt is designed to succeed. It uses Adaptive Regularization with Cubics (ARC) to handle indefinite Hessians and escape saddle points automatically.

Installation

# Install from GitHub
pak::pak("marcus-waldman/arcopt")

# Or with devtools
devtools::install_github("marcus-waldman/arcopt")

Quick Start

library(arcopt)

# Rosenbrock function - a classic difficult optimization problem
result <- arcopt(
  x0 = c(-1.2, 1),
  fn = function(x) (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2,
  gr = function(x) c(
    -2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2),
    200 * (x[2] - x[1]^2)
  ),
  hess = function(x) matrix(c(
    1200 * x[1]^2 - 400 * x[2] + 2, -400 * x[1],
    -400 * x[1], 200
  ), 2, 2)
)

result$par
#> [1] 1 1
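Before handing an analytic gradient to any optimizer, it is worth checking it against a finite-difference approximation. A minimal base-R sketch (the `fd_grad` helper is illustrative, not part of arcopt):

```r
# Rosenbrock objective and analytic gradient, as in Quick Start
fn <- function(x) (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
gr <- function(x) c(
  -2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2),
  200 * (x[2] - x[1]^2)
)

# Central-difference gradient for comparison (illustrative helper)
fd_grad <- function(f, x, h = 1e-6) {
  vapply(seq_along(x), function(i) {
    e <- replace(numeric(length(x)), i, h)
    (f(x + e) - f(x - e)) / (2 * h)
  }, numeric(1))
}

x0 <- c(-1.2, 1)
max(abs(gr(x0) - fd_grad(fn, x0)))  # small (~1e-5 or below) if gr is correct
```

A large discrepancy here usually means a sign or indexing error in the hand-coded gradient, which would otherwise surface as mysterious non-convergence.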

Two Modes: Exact Hessian vs Quasi-Newton

arcopt offers two optimization strategies:

Mode 1: Exact Hessian (Default)

Best when you can compute the Hessian analytically or via automatic differentiation.

# Provide fn, gr, and hess
result <- arcopt(x0, fn, gr, hess)

Mode 2: Quasi-Newton (No Hessian Required)

Best when computing the Hessian is expensive or unavailable. Uses BFGS/SR1 approximations. arcopt seeds the initial approximation \(B_0\) with a one-time finite-difference Hessian computed from the supplied gradient (cost: \(2n\) gradient evaluations at startup), which dramatically improves saddle-escape behavior compared to the identity-initialization convention used in classical BFGS/SR1.
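The seeding step is internal to the package, but the idea can be sketched in a few lines of base R: build a central-difference Hessian column by column from 2n gradient evaluations, then symmetrize (the function name below is illustrative, not part of the arcopt API):

```r
# Illustrative sketch of B0 seeding -- not arcopt's internal code
fd_hessian_from_grad <- function(gr, x, h = 1e-5) {
  n <- length(x)
  B <- matrix(0, n, n)
  for (i in seq_len(n)) {
    e <- replace(numeric(n), i, h)
    B[, i] <- (gr(x + e) - gr(x - e)) / (2 * h)  # column i: d(grad)/dx_i
  }
  (B + t(B)) / 2  # symmetrize; finite-difference noise breaks exact symmetry
}

# Example: gradient of f(x) = x1^2 + 3*x2^2, whose Hessian is diag(2, 6)
gr_quad <- function(x) c(2 * x[1], 6 * x[2])
fd_hessian_from_grad(gr_quad, c(1, 1))  # ~ diag(2, 6)
```

Because this seed carries the true curvature signs at the starting point, a quasi-Newton run can detect negative curvature immediately instead of waiting for updates to accumulate it.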

# Just provide fn and gr - no Hessian needed
result <- arcopt(x0, fn, gr, control = list(use_qn = TRUE))

Which Mode Should I Use?

| Scenario | Recommendation |
|---|---|
| Analytic Hessian available | Use exact Hessian (default) |
| Hessian via autodiff (torch, Stan) | Use exact Hessian |
| Hessian expensive (n > 100) | Use `use_qn = TRUE` |
| No Hessian function | Use `use_qn = TRUE` |

Benchmark Results

Tested on 8 standard functions with 10 starting points each (80 test cases):

| Method | Convergence | Avg Iterations | Hessian Evals |
|---|---|---|---|
| Exact Hessian | 100% | 22 | 1,443 |
| BFGS QN | 100% | 28 | 0 |
| SR1 QN | 99% | 61 | 0 |

BFGS Quasi-Newton achieves the same 100% convergence as the exact-Hessian mode with roughly 27% more iterations and zero Hessian evaluations.

On problems with symmetric saddle points (e.g., growth-mixture models, factor models with rotational symmetry), qn_method = "hybrid" combines SR1’s ability to represent indefinite curvature with BFGS’s fast local convergence and matches full-Hessian solvers’ saddle-escape reliability using only gradient information.
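For example, hybrid mode can be enabled on the Rosenbrock problem from Quick Start using only the objective and gradient (control names as documented in Common Options):

```r
library(arcopt)

# Hybrid SR1/BFGS quasi-Newton: gradient-only, suited to saddle-prone problems
result <- arcopt(
  x0 = c(-1.2, 1),
  fn = function(x) (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2,
  gr = function(x) c(
    -2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2),
    200 * (x[2] - x[1]^2)
  ),
  control = list(use_qn = TRUE, qn_method = "hybrid")
)
```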

Box Constraints

# Optimize with bounds: 0 <= x <= 0.5
result <- arcopt(
  x0 = c(0.1, 0.1),
  fn = fn, gr = gr, hess = hess,
  lower = c(0, 0),
  upper = c(0.5, 0.5)
)

Why ARC?

Standard optimizers struggle when the Hessian is indefinite or nearly singular, a common situation in complex likelihoods and other nonconvex objectives.

Cubic regularization automatically handles negative curvature without line searches or trust-region adjustments. It provably escapes saddle points and converges to local minima.

The Key Idea

Instead of the quadratic model used by Newton’s method:

m(s) = f + g's + (1/2) s'Hs

ARC adds a cubic term:

m(s) = f + g's + (1/2) s'Hs + (σ/3)||s||³

This cubic term prevents unbounded steps when the Hessian has negative eigenvalues.
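A one-dimensional illustration in plain R (independent of arcopt): with negative curvature h < 0, the quadratic model decreases without bound, while the cubic model turns back up and has a finite minimizer.

```r
g <- 1; h <- -2; sigma <- 1  # gradient, (negative) curvature, regularization

m_quad  <- function(s) g * s + 0.5 * h * s^2
m_cubic <- function(s) m_quad(s) + (sigma / 3) * abs(s)^3

# The quadratic model runs off to -Inf as |s| grows ...
m_quad(c(1, 10, 100))  # 0, -90, -9900: decreasing without bound

# ... but the cubic term dominates for large |s|, so a finite minimizer exists
opt <- optimize(m_cubic, interval = c(-10, 10))
opt$minimum  # finite step, about -2.41 here
```

In n dimensions the same mechanism caps the step length: however negative the curvature along some direction, the (σ/3)||s||³ penalty eventually wins.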

Adaptive Solver-Mode Dispatch

arcopt() automatically adapts its subproblem solver to the local landscape:

  1. Cubic (default). Uses adaptive cubic regularization with a Newton-first fast path. Handles saddles and indefinite Hessians naturally.
  2. Trust-region (enabled by default). Fires a one-way switch from cubic when four runtime signals indicate flat-ridge or stuck-indefinite-saddle stagnation. Replaces the cubic penalty with a hard step-norm constraint so the step can walk off a near-singular ridge.
  3. QN-polish (opt-in). Fires a bidirectional switch from cubic when the iterate enters the quadratic attraction basin of a strict minimum. Replaces the cubic subproblem with Wolfe line-search BFGS warmstarted from the current exact Hessian; skips further hess() calls until convergence or revert, saving wall-clock time on expensive Hessians (Stan AD, finite differences).

result <- arcopt(x0, fn, gr, hess, control = list(
  tr_fallback_enabled = TRUE,    # cubic->TR on flat-ridge (default)
  qn_polish_enabled = TRUE       # cubic<->BFGS polish in basin (opt-in)
))

result$diagnostics$solver_mode_final   # "cubic", "tr", or "qn_polish"
result$diagnostics$ridge_switches      # count of cubic->TR transitions
result$diagnostics$qn_polish_switches  # count of cubic->polish transitions
result$diagnostics$qn_polish_reverts   # count of polish->cubic reversions

See design/design-principles.qmd §3a for the σ↔r duality framing, and todo/tr-fallback-hybrid-briefing.md / todo/qn-polish-mode-briefing.md for the detector designs.

Common Options

arcopt(x0, fn, gr, hess,
  control = list(
    # Quasi-Newton mode (no exact Hessian needed)
    use_qn = TRUE,
    qn_method = "hybrid",    # recommended for saddle-prone problems;
                             # also "bfgs", "sr1"

    # Convergence tolerances
    gtol_abs = 1e-5,         # Gradient norm
    maxit = 1000,            # Max iterations

    # Hybrid solver-mode dispatch
    tr_fallback_enabled = TRUE,   # cubic->TR on stagnation
    qn_polish_enabled   = FALSE,  # cubic<->polish in quadratic basin

    # Output
    trace   = 1,             # Depth of saved trace data (0/1/2/3)
    verbose = FALSE          # Print one line per iteration to console
  )
)

The full set of advanced controls (cubic regularization tuning, TR-fallback thresholds, polish-mode thresholds, QN routing) lives under ?arcopt_advanced_controls.

Comparison with optim()

| Feature | arcopt | optim(BFGS) | optim(L-BFGS-B) |
|---|---|---|---|
| Handles indefinite Hessian | ✓ | ✗ | ✗ |
| Escapes saddle points | ✓ | ✗ | ✗ |
| Box constraints | ✓ | ✗ | ✓ |
| Hessian-free mode | ✓ | ✓ | ✓ |
| Best for | Complex likelihoods | Smooth convex | Large bound-constrained |

When NOT to Use arcopt

arcopt is a local optimizer and is not suited to every problem.

Best for: 2-500 parameters, nonconvex objectives, and settings where robust convergence matters.

Return Value

result <- arcopt(x0, fn, gr, hess)

result$par                 # Optimal parameters
result$value               # Optimal function value
result$gradient            # Gradient at solution
result$hessian             # Hessian at solution (exact in cubic/tr mode;
                           # BFGS approximation if solver_mode_final == "qn_polish")
result$converged           # TRUE if converged
result$iterations          # Number of iterations
result$evaluations         # List: fn, gr, hess counts
result$message             # Why it stopped

# Solver-mode diagnostics (nested sublist; most users do not inspect this).
# See ?arcopt_advanced_controls for the full schema.
result$diagnostics$solver_mode_final           # "cubic", "tr", or "qn_polish"
result$diagnostics$ridge_switches              # count of cubic->TR transitions
result$diagnostics$radius_final                # final TR radius (NA if no switch)
result$diagnostics$qn_polish_switches          # count of cubic->qn_polish transitions
result$diagnostics$qn_polish_reverts           # count of qn_polish->cubic reversions
result$diagnostics$hess_evals_at_polish_switch # baseline for Hessian-eval savings

References

Adaptive Regularization with Cubics (ARC)

Cartis, C., Gould, N. I. M., & Toint, P. L. (2011). Adaptive cubic regularisation methods for unconstrained optimization. Part I: motivation, convergence and numerical results. Mathematical Programming, 127(2), 245-295.

Nesterov, Y., & Polyak, B. T. (2006). Cubic regularization of Newton method and its global performance. Mathematical Programming, 108(1), 177-205.

Quasi-Newton with Cubic Regularization

Kamzolov, D., Ziu, K., Agafonov, A., & Takáč, M. (2025). Accelerated Adaptive Cubic Regularized Quasi-Newton Methods. Journal of Optimization Theory and Applications, 208. https://doi.org/10.1007/s10957-025-02804-3

Kamzolov, D., Ziu, K., Agafonov, A., & Takáč, M. (2023). Cubic Regularization is the Key! The First Accelerated Quasi-Newton Method with a Global Convergence Rate of O(k^-2) for Convex Functions. arXiv:2302.04987.

Benson, H. Y., & Shanno, D. F. (2018). Cubic regularization in symmetric rank-1 quasi-Newton methods. Mathematical Programming Computation, 10(4), 457-494.

License

MIT © 2025 Marcus Waldman

Contributing

Issues and PRs welcome at: https://github.com/marcus-waldman/arcopt/issues
