In this vignette we elaborate on the abilities and workings of the `estimate` method.

The default arguments to the `estimate` method are:
```r
model$estimate(
  data,
  method = "ekf",
  ode.solver = "euler",
  ode.timestep = diff(data$t),
  loss = "quadratic",
  loss_c = NULL,
  control = list(trace = 1, iter.max = 1e5, eval.max = 1e5),
  use.hessian = FALSE,
  laplace.residuals = FALSE,
  unconstrained.optim = FALSE,
  estimate.initial.state = FALSE,
  silent = FALSE
)
```
## method

The `method` argument decides which estimation/filtering algorithm is used. Currently, the following methods are available:
- `method='lkf'`: the Linear Kalman Filter.
- `method='ekf'`: the Extended Kalman Filter.
- `method='laplace'`: the Laplace Approximation.
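As an illustration, the method is selected directly in the call to `estimate`. In this sketch, `model` is assumed to be a ctsmTMB model object and `data` a data.frame with a time column `t` and the observed variables, as in the other vignettes:

```r
# Estimate the same model with two different filtering algorithms.
fit_ekf     <- model$estimate(data, method = "ekf")      # Extended Kalman Filter
fit_laplace <- model$estimate(data, method = "laplace")  # Laplace Approximation
```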
There is a difference between the Kalman filtering methods and the Laplace method in the way they approach likelihood computations, and thus in what information the filters produce: the Kalman filters compute one-step residuals directly as part of the filtering algorithm, whereas for the Laplace method residuals must be requested separately via the `laplace.residuals` argument. These computations are very costly and slow. The user is referred to the documentation of `TMB::oneStepPredict` for further information.

## ode.solver
This argument is used by the Kalman filtering methods only, and determines the ODE integrator used to solve the moment differential equations, i.e.
\[ \dfrac{d\mathrm{E}\left[x_t\right]}{dt} = f(\mathrm{E}\left[x_t\right]) \]
\[ \dfrac{d\mathrm{V}\left[x_t\right]}{dt} = \dfrac{df}{dx}\bigg\vert_{(\mathrm{E}\left[x_t\right])}\mathrm{V}\left[x_t\right] + \mathrm{V}\left[x_t\right]\left[\dfrac{df}{dx}\bigg\vert_{(\mathrm{E}\left[x_t\right])}\right]^{T} + g(\mathrm{E}\left[x_t\right])g(\mathrm{E}\left[x_t\right])^{T} \]
The ctsmTMB package implements the explicit forward Euler (`euler`) and the explicit 4th-order Runge-Kutta (`rk4`) methods. In addition, all of the methods available for `deSolve::ode` are also available through the RTMBode package, but they are significantly slower than the two built-in solvers; they are primarily there to add the possibility of implicit and adaptive solvers.

We recommend starting with the fast, simple solvers when modelling.
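For instance, the solver can be switched in the call to `estimate` (a sketch, reusing the assumed `model` and `data` objects from above):

```r
# Use the 4th-order Runge-Kutta solver for the moment equations
# instead of the default forward Euler scheme.
fit <- model$estimate(data, method = "ekf", ode.solver = "rk4")
```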
## ode.timestep

This argument has two different implications depending on whether Kalman filtering or Laplace filtering is carried out. The method accepts either a single scalar value to be used as the global time-step, or a vector with one entry per time interval in the data (i.e. of the same length as `diff(data$t)`) specifying all individual time-steps. The input values are interpolated linearly between time-points if more than one step is taken.
- Kalman filters: the argument controls the time-step used by the `euler` and `rk4` ODE solvers.
- Laplace filter: the argument controls the number of intermediate time-points added between observations, each of which represents an additional state (random effect) at that particular time-point.
If a provided time-step \(\Delta t_{i}\) does not divide the corresponding time-difference \(t_{i+1}-t_{i}\) in the data, then it is rounded down such that it does. Consider the following example, where the time-difference between two observations in the data is 3 seconds, but a time-step of 0.7 is requested. This produces a non-integer number of steps, i.e.: \[ N_{i} = \dfrac{t_{i+1}-t_{i}}{\Delta t_i} = \dfrac{3}{0.7} = 4.28... \] Thus the number of steps is rounded up to \(N^{*}_i = \left\lceil N_i \right\rceil = \left\lceil 4.28... \right\rceil = 5\), and the corrected time-step becomes \[ \Delta t^{*}_{i} = \dfrac{3}{N^{*}_{i}} = \dfrac{3}{5} = 0.6 \]

Note: The exception to this rule is when the remainder is less than \(\epsilon = 10^{-3}\); if for instance \(N_i = 4.0001\), then the time-step is accepted and the number of steps is rounded down to \(N^{*}_{i}= \left\lfloor N_i \right\rfloor = 4\).
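The correction rule is easy to reproduce; a small sketch of the arithmetic above:

```r
# Illustration of the time-step correction described above (not package code).
dt_interval <- 3                # time between two observations
dt_request  <- 0.7              # requested ode.timestep
N  <- dt_interval / dt_request  # 4.2857...; non-integer number of steps
Ns <- ceiling(N)                # 5; round the number of steps up
dt_interval / Ns                # 0.6; the corrected (rounded-down) time-step
```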
## loss and loss_c

The `loss` argument only affects the Kalman filtering methods, and is used to regularize the likelihood contributions, removing the influence of large outliers. The following losses are currently available:

- `loss='quadratic'` (default)
- `loss='huber'`
- `loss='tukey'`
The ith likelihood contribution is given by: \[ -\log L_{i}(\theta) \propto f(r_i) \] where \(r_i = \sqrt{e_{i}^{T} \Sigma_{i}^{-1} e_{i}}\) is a normalized residual, \(e_{i}\) is the ith residual vector, and \(\Sigma_{i}^{-1}\) is the ith residual precision matrix.
The `loss` argument changes the function \(f\) as follows:

If `loss='quadratic'` then \(f\) is quadratic in the residuals, i.e.
\[ f(r) = r^2 \]
and the likelihood contributions are exactly those from a Gaussian.
If `loss='huber'` then \(f\) is the Huber \(\psi\) function given by
\[ \psi_{c}(r) = \begin{cases} r^2 & \text{for } r \leq c \\ c(2r-c) & \text{otherwise} \end{cases} \]
which is quadratic/linear in the residuals below/above the threshold determined by \(c\).
If `loss='tukey'` then \(f\) is the Tukey biweight function given by
\[ l_{c}(r) = \begin{cases} r^2 & \text{for } r \leq c \\ c^2 & \text{otherwise} \end{cases} \]
which is quadratic/constant in the residuals below/above the threshold determined by \(c\).
In practice, smooth approximations to both the Huber and Tukey losses are implemented using the construction \[ \tilde{\psi}_{c}(r) = r^2 (1-\sigma_{c}(r)) + c(2r-c) \sigma_{c}(r) \] \[ \tilde{l}_{c}(r) = r^2 (1-\sigma_{c}(r)) + c^2 \sigma_{c}(r) \] where \(\sigma_{c}(r)\) is the sigmoid function \[ \sigma_{c}(r) = \dfrac{1}{1+\exp(-5(r-c))} \] The plot below shows the exact and implemented loss functions (almost indistinguishable) for \(c=5\), with the threshold marked by a dashed line at \(r = c\).

*(Figure: Loss Functions)*
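The smoothed losses are straightforward to reproduce; a minimal sketch (illustrative only, not the package internals) of the construction above:

```r
# Smoothed Huber and Tukey losses as described above.
sigmoid <- function(r, c) 1 / (1 + exp(-5 * (r - c)))
huber_smooth <- function(r, c) r^2 * (1 - sigmoid(r, c)) + c * (2 * r - c) * sigmoid(r, c)
tukey_smooth <- function(r, c) r^2 * (1 - sigmoid(r, c)) + c^2 * sigmoid(r, c)

# Reproduce the plot for c = 5: quadratic below the threshold, then
# linear (Huber) or constant (Tukey) above it.
curve(huber_smooth(x, c = 5), from = 0, to = 10, ylab = "loss")
curve(tukey_smooth(x, c = 5), from = 0, to = 10, add = TRUE, lty = 2)
abline(v = 5, lty = 3)  # dashed line at the threshold r = c
```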
The `loss_c` argument is used to determine the value of \(c\). The default value is chosen based on the fact that, under the assumed (multivariate) normal distribution of the residuals, the squared (normalized) residuals follow a \(\chi^{2}_{m}\) distribution with degrees of freedom equal to the number of elements of \(e_{i}\), i.e.: \[ r_{i}^2 = e_{i}^{T} \Sigma_{i}^{-1} e_{i} \sim \chi^{2}_{m} \] It is therefore reasonable to choose the threshold (the value of \(c\)), which determines whether or not \(r_{i}\) is an outlier, as the level at which the null hypothesis \[ H_{0}: r_{i}^2 \sim \chi^{2}_{m} \] is rejected. Choosing the significance level \(\alpha = 0.05\), the appropriate threshold value becomes the \(1-\alpha\) quantile of the \(\chi^{2}_{m}\) distribution, where \(m\) is the number of observation equations; for \(m = 1\) this is \(3.841459\).
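This quantile can be reproduced directly in R; a small sketch, where `m` should match the number of observation equations in the model:

```r
# 1 - alpha quantile of a chi-squared distribution with m degrees of
# freedom; for m = 1 and alpha = 0.05 this gives the value quoted above.
alpha <- 0.05
m <- 1
qchisq(p = 1 - alpha, df = m)
## [1] 3.841459
```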
Note: The effective significance level will be higher than the chosen one if there are missing observations at some indices \(i\), for systems with multiple observation equations.
## use.hessian

This argument is a boolean which determines whether or not the likelihood Hessian, constructed by automatic differentiation in TMB, is used during the optimization procedure by providing it to the `hessian` argument of `stats::nlminb`. The default is `use.hessian=FALSE`. The argument has no effect if `method='laplace'`, due to restrictions in TMB/RTMB.
The effect of providing the Hessian is typically that an optimum is found in fewer iterations, but the cost of computing the Hessian is relatively large, so it is often faster to optimize using only the gradient.
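A sketch of enabling it, reusing the assumed `model` and `data` objects:

```r
# Let stats::nlminb use the AD Hessian during optimization. Typically
# fewer iterations, but each iteration is more expensive.
fit <- model$estimate(data, method = "ekf", use.hessian = TRUE)
```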
## laplace.residuals

This boolean controls whether or not model residuals are calculated with `TMB::oneStepPredict` when Laplace filtering is used (`method='laplace'`). This takes a considerable amount of time, typically much longer than the estimation itself.
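A sketch of requesting residuals together with Laplace estimation:

```r
# Compute one-step residuals via TMB::oneStepPredict after estimation;
# expect this to be slow.
fit <- model$estimate(data, method = "laplace", laplace.residuals = TRUE)
```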
## unconstrained.optim

This boolean allows for quick unconstrained estimation, removing the parameter boundaries specified by `setParameter`. This can be useful for quickly checking whether estimation issues are caused by poorly chosen boundaries, as sketched below.
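```r
# Re-estimate without the parameter bounds from setParameter, as a quick
# diagnostic for boundary-related optimization problems.
fit_free <- model$estimate(data, unconstrained.optim = TRUE)
```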
## estimate.initial.state

This boolean determines whether or not the initial-time state distribution (mean and covariance) should be estimated. By default `estimate.initial.state=FALSE`, and the distribution is not estimated but simply taken as the values provided by `setInitialState`. When `estimate.initial.state=TRUE`, the mean and covariance are estimated as the stationary solution of the moment differential equations, using the input values at the initial time-point.
The stationary solution is obtained by first solving \[ \dfrac{d\mathrm{E}\left[x_{\infty}\right]}{dt} = f(\mathrm{E}\left[x_\infty\right]) = 0 \] for \(\mathrm{E}\left[x_\infty\right]\) using Newton's method. This stationary mean is then used to solve \[ \dfrac{d\mathrm{V}\left[x_\infty\right]}{dt} = \dfrac{df}{dx}\bigg\vert_{(\mathrm{E}\left[x_\infty\right])}\mathrm{V}\left[x_\infty\right] + \mathrm{V}\left[x_\infty\right]\left[\dfrac{df}{dx}\bigg\vert_{(\mathrm{E}\left[x_\infty\right])}\right]^{T} + g(\mathrm{E}\left[x_\infty\right])g(\mathrm{E}\left[x_\infty\right])^{T} = 0 \] for the stationary covariance \(\mathrm{V}\left[x_\infty\right]\) by calling a linear solver on the vectorized system of equations (using Kronecker products).
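To illustrate the vectorization step, here is a minimal sketch, assuming matrices `A` (the Jacobian \(\frac{df}{dx}\) evaluated at \(\mathrm{E}\left[x_\infty\right]\)) and `G` (the diffusion \(g(\mathrm{E}\left[x_\infty\right])\)) are already available:

```r
# Solve the stationary Lyapunov equation A V + V A' + G G' = 0 by
# vectorization: (I %x% A + A %x% I) vec(V) = -vec(G G').
n <- nrow(A)
lhs <- kronecker(diag(n), A) + kronecker(A, diag(n))
V <- matrix(solve(lhs, -as.vector(G %*% t(G))), nrow = n, ncol = n)
```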
## control

This argument is a list which controls various settings of the `stats::nlminb` optimizer; see the documentation in `?stats::nlminb` for more information. The default is `list(trace=1, iter.max=1e5, eval.max=1e5)`, which prints the iteration steps, and increases the default number of iterations and function evaluations allowed before the optimization procedure terminates.
Note: The user should remember that disabling tracing by passing `control = list(trace=0)` will remove the `iter.max` and `eval.max` arguments, so they should be provided as well if needed.
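For example (a sketch):

```r
# Disable tracing but keep the larger iteration and evaluation limits.
fit <- model$estimate(data,
                      control = list(trace = 0, iter.max = 1e5, eval.max = 1e5))
```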
## silent

This boolean argument controls whether or not various information messages are printed by ctsmTMB during model building, compilation and estimation.
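As a closing sketch, a full call combining several of the arguments discussed above; the `model` and `data` objects are assumed, as in the other vignettes, and the argument values are chosen purely for illustration:

```r
# EKF estimation with the RK4 solver, robust Huber loss, AD Hessian,
# and message printing suppressed.
fit <- model$estimate(data,
                      method = "ekf",
                      ode.solver = "rk4",
                      loss = "huber",
                      use.hessian = TRUE,
                      silent = TRUE)
```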