The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Nonparametric model

library(serosv)
library(magrittr)

Local estimation by polynomial

Proposed model

Within the local polynomial framework, the linear predictor \(\eta(a)\) is approximated locally at one particular value \(a_0\) for age by a line (local linear, degree \(p=1\)) or a parabola (local quadratic, degree \(p=2\)).

For a general degree \(p\), the linear predictor for a neighbor of \(a_0\), labeled \(a_i\) is equivalent to the Taylor approximation

\[ \eta(a_i) = \eta(a_0) + \eta^{(1)}(a_0)(a_i - a_0) + \frac{\eta^{(2)}(a_0)}{2}(a_i - a_0)^2 + ... + \frac{\eta^{(p)}(a_0)}{p!}(a_i - a_0)^p \]

\(\eta(a_i)\) can be estimated by maximizing

\[ \Sigma_{i=1}^{N} \ell_i \{Y_i, g^{-1} (\beta_0 + \beta_1(a_i-a_0)+ \beta_2(a_i-a_0)^2 ... + \beta_p(a_i-a_0)^p) \} K_h(a_i - a_0) \]

Where:

The estimator for the \(k\)-th derivative of \(\eta(a_0)\), for \(k = 0,1,…,p\) (degree of local polynomial) is thus

\[ \hat{\eta}^{(k)}(a_0) = k!\hat{\beta}_k(a_0) \]

The estimator for the prevalence at age \(a_0\) is then given by

\[ \hat{\pi}(a_0) = g^{-1}\{ \hat{\beta}_0(a_0) \} \]

The estimator for the force of infection at age \(a_0\) by assuming \(p \ge 1\) is as followed

\[ \hat{\lambda}(a_0) = \hat{\beta}_1(a_0) \delta \{ \hat{\beta}_0 (a_0) \} \]

Refer to Chapter 7.1 of the book by Hens et al. (2012) for a more detailed explanation of the method.

Fitting data

mump <- mumps_uk_1986_1987

To fit a local estimation by polynomials, use lp_model() function with model parameters include:

lp <- lp_model(mump, kern="tcub", nn = 0.5, deg=2)
lp
#> Local polynomial model 
#> 
#> Input type:  aggregated 
#> Configs:  nn=0.5, bandwidth(h)=0, degree=2, kernel=tcub 
#> 
#> Call:
#> locfit(formula = y ~ lp(age, deg = deg, nn = nn, h = h), family = "binomial", 
#>     kern = kern)
#> 
#> Number of observations:          44 
#> Family:  Logistic 
#> Fitted Degrees of freedom:       6.463 
#> Residual scale:                  1
plot(lp)

The users can also provide h or nn as a numeric vector, in which case the package will use the parameter value that minimize the generalized cross-validation (GCV) criterion.

# ----- Tune nearest neighbor (nn) parameter
lp_model(mump, kern="tcub", nn = seq(0.2, 0.8, 0.1), deg=2) %>% 
  plot()


# ----- Tune bandwidth (h) parameter
lp_model(mump, kern="tcub", h = seq(5,25), deg=2) %>% 
  plot()

Alternatively, use plot_gcv() to show GCV curves for the nearest neighbor method (left) and constant bandwidth (right) for manual parameter selection.

plot_gcv(
   mump$age, mump$pos, mump$tot,
   nn_seq = seq(0.2, 0.8, by=0.1),
   h_seq = seq(5, 25, by=1)
 )

Based on the plot, we can then fit the model using the parameter value that give the lowest GCV (nn = 0.3 and h = 14)

lp_model(mump, kern="tcub", nn = 0.3, deg=2) %>% 
  plot()


lp_model(mump, kern="tcub", h = 14, deg=2)  %>% 
  plot()

Hens, Niel, Ziv Shkedy, Marc Aerts, Christel Faes, Pierre Van Damme, and Philippe Beutels. 2012. Modeling Infectious Disease Parameters Based on Serological and Social Contact Data: A Modern Statistical Perspective. Statistics for Biology and Health. Springer New York. https://doi.org/10.1007/978-1-4614-4072-7.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.