Fast Kalman Filtering using Sequential Processing

Introduction

This document provides worked examples of Kalman filtering through the FKF.SP package and the fkf.SP function. The fkf function of the FKF package is a well-established implementation of the Kalman filter algorithm designed to maximize the computational efficiency of the filtering process. The fkf.SP function builds on this by making the additional assumption that the errors of the observations are independent, so that filtering can be performed through a univariate treatment of the multivariate series, further increasing computational efficiency in the general case. This vignette presents three examples that were first given in the FKF package and its associated vignette. The examples are structured to demonstrate the functionality of the package and the increase in computational speed of fkf.SP over fkf. They are relatively simple Kalman filtering problems; the relative gain in filtering speed would be greater for problems with higher-dimensional observations and state vectors.
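
As a brief aside on the idea behind sequential processing (a sketch, not part of the original examples): when the measurement error covariance GGt is diagonal, the \(d\) elements of each observation vector can be filtered one at a time, so the \(d \times d\) matrix inversion required by the standard filter is replaced by \(d\) scalar divisions. Loosely, for \(i = 1, \dots, d\),

\[
F_{t,i} = Z_{t,i} P_{t,i} Z_{t,i}' + g_{t,i}, \qquad
a_{t,i+1} = a_{t,i} + P_{t,i} Z_{t,i}' \, v_{t,i} / F_{t,i}, \qquad
P_{t,i+1} = P_{t,i} - P_{t,i} Z_{t,i}' Z_{t,i} P_{t,i} / F_{t,i},
\]

where \(Z_{t,i}\) is the \(i\)-th row of \(Z_t\), \(g_{t,i}\) is the \(i\)-th diagonal element of GGt, and \(v_{t,i}\) is the \(i\)-th univariate prediction error at time \(t\).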

library(FKF.SP)
##The package 'FKF' is required for this Vignette:
# install.packages("FKF")
library(FKF)

Example 1 - ARMA(2,1) model estimation.

Autoregressive Moving Average (ARMA) models can be estimated through Kalman filtering. See also help(makeARIMA) and help(KalmanRun).
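
For reference, the ARMA(2, 1) process simulated below is

\[
y_t = \mathrm{ar}_1 \, y_{t-1} + \mathrm{ar}_2 \, y_{t-2} + \epsilon_t + \mathrm{ma}_1 \, \epsilon_{t-1},
\qquad \epsilon_t \sim N(0, \sigma^2).
\]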

Set constants:

## Length of series
n <- 10000

## AR parameters
ar1 <- 0.6
ar2 <- 0.2
ma1 <- -0.2
sigma <- sqrt(0.2)

Sample from an ARMA(2, 1) process

a <- stats::arima.sim(model = list(ar = c(ar1, ar2), ma = ma1), n = n,
            innov = rnorm(n) * sigma)

Create a state space representation out of the four ARMA parameters
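
The state-space form constructed by arma21ss below (a standard representation of an ARMA(2, 1) process, written loosely here) is

\[
\alpha_{t+1} = \begin{pmatrix} \mathrm{ar}_1 & 1 \\ \mathrm{ar}_2 & 0 \end{pmatrix} \alpha_t
            + \begin{pmatrix} 1 \\ \mathrm{ma}_1 \end{pmatrix} \sigma \, \eta_t,
\qquad
y_t = \begin{pmatrix} 1 & 0 \end{pmatrix} \alpha_t,
\qquad \eta_t \sim N(0, 1),
\]

so Tt is the transition matrix, HHt is the covariance of the state disturbances, Zt is the measurement matrix, and GGt is zero because the ARMA series is observed without measurement error.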

arma21ss <- function(ar1, ar2, ma1, sigma) {
  ## Transition matrix of the state equation:
  Tt <- matrix(c(ar1, ar2, 1, 0), ncol = 2)
  ## Measurement matrix:
  Zt <- matrix(c(1, 0), ncol = 2)
  ## Intercepts of the measurement and transition equations:
  ct <- matrix(0)
  dt <- matrix(0, nrow = 2)
  ## Variance of the measurement disturbances (zero - the series is observed exactly):
  GGt <- matrix(0)
  ## Covariance of the state disturbances:
  H <- matrix(c(1, ma1), nrow = 2) * sigma
  HHt <- H %*% t(H)
  ## Initial state and its covariance - diffuse assumption:
  a0 <- c(0, 0)
  P0 <- matrix(1e6, nrow = 2, ncol = 2)
  return(list(a0 = a0, P0 = P0, ct = ct, dt = dt, Zt = Zt, Tt = Tt, GGt = GGt,
              HHt = HHt))
}

Parameter estimation is performed through maximum likelihood estimation (MLE). This involves optimizing the log-likelihood returned by the Kalman Filter through the ‘optim’ function.

# The objective function passed to 'optim'
objective <- function(theta, yt, SP = FALSE) {
  param <- arma21ss(theta["ar1"], theta["ar2"], theta["ma1"], theta["sigma"])
  if (SP) {
    # Kalman filtering through the fkf.SP function:
    ans <- -fkf.SP(a0 = param$a0, P0 = param$P0, dt = param$dt, ct = param$ct,
                   Tt = param$Tt, Zt = param$Zt, HHt = param$HHt, GGt = param$GGt,
                   yt = yt)
  } else {
    # Kalman filtering through the fkf function:
    ans <- -fkf(a0 = param$a0, P0 = param$P0, dt = param$dt, ct = param$ct,
                Tt = param$Tt, Zt = param$Zt, HHt = param$HHt, GGt = param$GGt,
                yt = yt)$logLik
  }
  ## 'optim' minimizes functions, so the negative log-likelihood is returned
  return(ans)
}

Perform MLE:

theta <- c(ar = c(0, 0), ma1 = 0, sigma = 1)
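
As a quick sanity check (not part of the original example), both objective functions can be evaluated once at the starting values; with complete data they should return the same negative log-likelihood:

##Evaluate the objective at the starting values with both functions:
objective(theta, yt = rbind(a), SP = TRUE)
objective(theta, yt = rbind(a), SP = FALSE)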

###FKF Package:
start <- Sys.time()
set.seed(1)
FKF_estimation <- optim(theta, objective, yt = rbind(a), hessian = TRUE, SP = F)
FKF_runtime = Sys.time() - start

###fkf.SP Package:
start <- Sys.time()
set.seed(1)
FKF.SP_estimation <- optim(theta, objective, yt = rbind(a), hessian = TRUE, SP = T)
FKF.SP_runtime <- Sys.time() - start

The MLE process has returned identical estimated parameters for both functions:

print(rbind(FKF.SP_estimation$par, FKF_estimation$par))
#>           ar1       ar2        ma1     sigma
#> [1,] 0.620943 0.1890581 -0.2029199 0.4528587
#> [2,] 0.620943 0.1890581 -0.2029199 0.4528587
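
The estimates can also be compared against the true values used to simulate the series (an additional check, not shown in the original example):

##Estimated versus true parameter values:
rbind(estimated = FKF.SP_estimation$par,
      true      = c(ar1 = ar1, ar2 = ar2, ma1 = ma1, sigma = sigma))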

Both functions also required an identical number of function evaluations:

print(c(FKF.SP = FKF.SP_estimation$counts[1], FKF = FKF_estimation$counts[1]))
#> FKF.SP.function    FKF.function 
#>             199             199

Utilizing Sequential Processing, however, has decreased processing time:


print(c(FKF.SP = FKF.SP_runtime, FKF = FKF_runtime))
#> Time differences in secs
#>    FKF.SP       FKF 
#> 0.5405231 0.9434748

The vignette of the FKF package shows how to filter the series with the estimated parameter values and develop plots for analysis purposes. fkf.SP is only appropriate for efficient parameter estimation, rather than for filtering under the estimated parameters.
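
For completeness, a minimal sketch of that filtering step (following the approach of the FKF vignette; the object names sp_par and filtered are introduced here for illustration) is to rebuild the state-space matrices at the fkf.SP estimates and pass them to fkf:

##Rebuild the state-space representation at the estimated parameters:
sp_par <- arma21ss(FKF.SP_estimation$par["ar1"], FKF.SP_estimation$par["ar2"],
                   FKF.SP_estimation$par["ma1"], FKF.SP_estimation$par["sigma"])
##Filter the simulated series with the fkf function; the filtered state
##estimates are returned in the 'att' element of the output:
filtered <- fkf(a0 = sp_par$a0, P0 = sp_par$P0, dt = sp_par$dt, ct = sp_par$ct,
                Tt = sp_par$Tt, Zt = sp_par$Zt, HHt = sp_par$HHt, GGt = sp_par$GGt,
                yt = rbind(a))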

Example 2 - Local level model for the Nile’s annual flow:

This example compares the computational time of the fkf.SP and fkf functions when applied to the Nile dataset. It also shows the difference in log-likelihood values returned by the two functions when NAs are present within the observations.

## Transition equation:
## alpha[t+1] = alpha[t] + eta[t], eta[t] ~ N(0, HHt)
## Measurement equation:
## y[t] = alpha[t] + eps[t], eps[t] ~  N(0, GGt)

##Complete Nile Data - no NA's
y_complete <- y_incomplete <- Nile
##Incomplete Nile Data - two NA's are present:
y_incomplete[c(3, 10)] <- NA


## Set constant parameters:
dt <- ct <- matrix(0)
Zt <- Tt <- matrix(1)
a0 <- y_incomplete[1]   # Estimation of the first year flow
P0 <- matrix(100)     # Variance of 'a0'

## Parameter estimation - maximum likelihood estimation:
Nile_MLE <- function(yt, SP){
  ##Initial estimates of the unknown parameters:
  GGt <- HHt <- var(yt, na.rm = TRUE) * .5
  set.seed(1)
  if (SP) {
    #fkf.SP function:
    return(suppressWarnings(optim(c(HHt = HHt, GGt = GGt),
           fn = function(par, ...)
             -fkf.SP(HHt = matrix(par[1]), GGt = matrix(par[2]), ...),
           yt = rbind(yt), a0 = a0, P0 = P0, dt = dt, ct = ct,
           Zt = Zt, Tt = Tt)))
  } else {
    #fkf function:
    return(optim(c(HHt = HHt, GGt = GGt),
           fn = function(par, ...)
             -fkf(HHt = matrix(par[1]), GGt = matrix(par[2]), ...)$logLik,
           yt = rbind(yt), a0 = a0, P0 = P0, dt = dt, ct = ct,
           Zt = Zt, Tt = Tt))
  }
}

Performing parameter estimation using complete data, the fkf and fkf.SP functions return identical results:

fkf.SP_MLE_complete = Nile_MLE(y_complete, SP = T)
fkf_MLE_complete = Nile_MLE(y_complete, SP = F)

fkf.SP:

print(fkf.SP_MLE_complete[1:3])
#> $par
#>       HHt       GGt 
#>  1300.777 15247.773 
#> 
#> $value
#> [1] 637.626
#> 
#> $counts
#> function gradient 
#>       57       NA

fkf:

print(fkf_MLE_complete[1:3])
#> $par
#>       HHt       GGt 
#>  1300.777 15247.773 
#> 
#> $value
#> [1] 637.626
#> 
#> $counts
#> function gradient 
#>       57       NA

Performing parameter estimation using incomplete data returns identical estimated parameters, but different log-likelihood values:

fkf.SP_MLE_incomplete = Nile_MLE(y_incomplete, SP = T)
fkf_MLE_incomplete = Nile_MLE(y_incomplete, SP = F)

fkf.SP:

print(fkf.SP_MLE_incomplete[1:3])
#> $par
#>       HHt       GGt 
#>  1385.066 15124.131 
#> 
#> $value
#> [1] 625.1676
#> 
#> $counts
#> function gradient 
#>       53       NA

fkf:

print(fkf_MLE_incomplete[1:3])
#> $par
#>       HHt       GGt 
#>  1385.066 15124.131 
#> 
#> $value
#> [1] 627.0055
#> 
#> $counts
#> function gradient 
#>       53       NA

The difference in log-likelihood values is 1.8378771, which is equal to:

#Number of NA values:
NA_values = length(which(is.na(y_incomplete)))

print( 0.5 * NA_values * log(2 * pi))
#> [1] 1.837877

This is because the fkf function instantiates its log-likelihood score with the constant term \(-0.5 \times n \times d \times \log(2\pi)\), where \(n\) is the number of columns of object yt and \(d\) is the number of rows of object yt. Because two of the observations are NA, however, this term should instead be \(-0.5 \times (n \times d - 2) \times \log(2\pi)\), which explains the difference in log-likelihood scores: the fkf function includes the constant term for two observations that are not actually observed.
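
This can be verified directly from the two optimization results returned above:

##Subtracting the constant for the two unobserved values from the fkf
##negative log-likelihood recovers the fkf.SP value:
fkf_MLE_incomplete$value - 0.5 * NA_values * log(2 * pi)
fkf.SP_MLE_incomplete$value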

Speed Comparison - Nile Data (10,000 iterations):

#This test uses estimated parameters of complete data. 
#Please run the complete chunk for a fair comparison:

set.seed(1)
start = Sys.time()
for(i in 1:1e4) fkf(a0, P0, dt, ct, Tt, Zt, HHt = matrix(fkf_MLE_complete$par[1]),
                    GGt = matrix(fkf_MLE_complete$par[2]), yt = rbind(y_complete))
FKF_runtime = Sys.time() - start

start = Sys.time()
set.seed(1)
for(i in 1:1e4) fkf.SP(a0, P0, dt, ct, Tt, Zt, HHt = matrix(fkf.SP_MLE_complete$par[1]),
                       GGt = matrix(fkf.SP_MLE_complete$par[2]), yt = rbind(y_complete))
fkf.SP_runtime = Sys.time() - start

print(c(FKF.SP = fkf.SP_runtime, FKF = FKF_runtime))
#> Time differences in secs
#>    FKF.SP       FKF 
#> 0.5814130 0.8487618

Utilizing Sequential Processing has decreased processing time.

Example 3 - Tree Ring Data:


## Transition equation:
## alpha[t+1] = alpha[t] + eta[t], eta[t] ~ N(0, HHt)
## Measurement equation:
## y[t] = alpha[t] + eps[t], eps[t] ~  N(0, GGt)

## tree-ring widths in dimensionless units
y <- treering

## Set constant parameters:
dt <- ct <- matrix(0)
Zt <- Tt <- matrix(1)
a0 <- y[1]            # Estimation of the first width
P0 <- matrix(100)     # Variance of 'a0'

##Time comparison - Estimate parameters 10 times:
start = Sys.time()
set.seed(1)
for(i in 1:10)  fit.fkf <- optim(c(HHt = var(y, na.rm = TRUE) * .5,
                     GGt = var(y, na.rm = TRUE) * .5),
                   fn = function(par, ...)
                     -fkf(HHt = array(par[1],c(1,1,1)), GGt = array(par[2],c(1,1,1)), ...)$logLik,
                   yt = rbind(y), a0 = a0, P0 = P0, dt = dt, ct = ct,
                   Zt = Zt, Tt = Tt)

run.time_FKF = Sys.time() - start


start = Sys.time()
##When fkf.SP is given poorly specified parameters (i.e. the
##log-likelihood is NA), it outputs a warning; these warnings are suppressed here:
set.seed(1)
suppressWarnings(
for(i in 1:10)  fit.fkf.SP <- optim(c(HHt = var(y, na.rm = TRUE) * .5,
                        GGt = var(y, na.rm = TRUE) * .5),
                      fn = function(par, ...)
                        -fkf.SP(HHt = array(par[1],c(1,1,1)), GGt = matrix(par[2]), ...),
                      yt = rbind(y), a0 = a0, P0 = P0, dt = dt, ct = ct,
                      Zt = Zt, Tt = Tt)
)
run.time_FKF.SP = Sys.time() - start

print(c(fkf.SP = run.time_FKF.SP, fkf = run.time_FKF))
#> Time differences in secs
#>   fkf.SP      fkf 
#> 1.063154 1.599721

## Filter tree ring data with estimated parameters using fkf:
fkf.obj <- fkf(a0, P0, dt, ct, Tt, Zt, HHt = array(fit.fkf$par[1],c(1,1,1)),
               GGt = array(fit.fkf$par[2],c(1,1,1)), yt = rbind(y))

Utilizing Sequential Processing has decreased processing time.
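
As a final illustration (not part of the original example, and assuming the fkf output stores the filtered state estimates in its 'att' element), the filtered local level can be plotted against the observed tree-ring widths:

##Plot the observed series and overlay the filtered local level:
plot(y, col = "darkgrey", ylab = "Tree ring width",
     main = "Tree ring widths and filtered local level")
lines(ts(fkf.obj$att[1, ], start = start(y), frequency = frequency(y)), col = "blue")
legend("topright", legend = c("Observed", "Filtered level"),
       col = c("darkgrey", "blue"), lty = 1)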