The Weighted Interval Score (WIS) is a probabilistic measure for evaluating forecast accuracy, balancing three key aspects of forecast quality: dispersion (the sharpness of the prediction intervals), overprediction, and underprediction.
WIS penalizes wide intervals and intervals that fail to include the observed outcome, while rewarding narrow, well-calibrated intervals. This makes WIS an interpretable and robust metric for assessing both forecast accuracy and uncertainty.
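For reference, a common formulation of the score (the one used, for example, by the scoringutils package; shown here only as a sketch) starts from the interval score of a single central $(1-\alpha)\times 100\%$ prediction interval $[l, u]$ for an observation $y$,

$$IS_\alpha(F, y) = (u - l) + \frac{2}{\alpha}\,(l - y)\,\mathbf{1}(y < l) + \frac{2}{\alpha}\,(y - u)\,\mathbf{1}(y > u),$$

and combines $K$ such intervals with the absolute error of the predictive median $m$:

$$\mathrm{WIS}(F, y) = \frac{1}{K + 1/2}\left(\frac{1}{2}\,|y - m| + \sum_{k=1}^{K} \frac{\alpha_k}{2}\, IS_{\alpha_k}(F, y)\right).$$

The three terms of the interval score correspond to the dispersion ($u - l$), overprediction ($y < l$), and underprediction ($y > u$) components reported later in this vignette.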
As of version 1.1.0, the NobBS package can explicitly
return quantile estimates from the NobBS and
NobBS.strat functions; these estimates can be used to calculate the
WIS for a given nowcast. This vignette demonstrates how to calculate
and compare the WIS for different nowcasting models using quantile
estimates generated by the NobBS package, in combination
with the scoringutils package (version 2.0.0 or later).
To illustrate this process, we run two nowcasting models — a Poisson
model and a negative binomial model — for five nowcasting dates, using
the dengue dataset provided in the NobBS package. To obtain
quantile estimates, we set the desired quantiles via the new
quantiles parameter in the specs list
passed in the call to the NobBS and NobBS.strat
functions. By default, NobBS returns estimates for five
quantiles (0.025, 0.25, 0.5, 0.75, 0.975), representing the median along
with the 50% and 95% prediction intervals. Here, we use a more detailed
set of 23 quantile values to better capture the full predictive
distribution, ensuring a more granular assessment of forecast
uncertainty and improving the accuracy of the WIS calculation:
library(NobBS)
library(dplyr)
library(ggplot2)
library(scoringutils)
data(denguedat)
# Helper function to run nowcasting
run_nowcast <- function(data, now, win_size, specs) {
  NobBS(data, now, units = "1 week", onset_date = "onset_week",
        report_date = "report_week", moving_window = win_size, specs = specs)
}
quantiles <- c(0.025,seq(0.05,0.95,0.05),0.975)
q_len <- length(quantiles)
q_cols <- paste0('q_',quantiles)
# Poisson model specs
specs_poisson <- list(
  dist=c("Poisson"),
  quantiles=quantiles
)
# Negative binomial model specs
specs_nb <- list(
  dist=c("NB"),
  quantiles=quantiles
)
# Initialize lists to store nowcasts
nowcasts_pois <- list()   # Nowcasts using the Poisson model
nowcasts_nb <- list()     # Nowcasts using the negative binomial model
test_dates <- seq(as.Date("1994-08-29"),as.Date("1994-09-26"),by=7)
win_size <- 8
for (t in seq_along(test_dates)) {
  now <- test_dates[t]
  denguedat1 <- denguedat[denguedat$onset_week<=now,]
  nowcasts_pois[[t]] <- run_nowcast(denguedat1, now, win_size, specs_poisson)
  nowcasts_nb[[t]] <- run_nowcast(denguedat1, now, win_size, specs_nb)
}
Now let us plot and compare the estimates obtained from each of these nowcasts:
plot_estimates <- function(nowcast1, nowcast2, cases_per_date, now) {
  # Ensure input data is valid
  if (is.null(nowcast1$estimates) || is.null(nowcast2$estimates)) {
    stop("Nowcast estimates are missing.")
  }
  
  # Extract estimates and credible intervals for the Poisson-model nowcast
  onset_dates1 <- nowcast1$estimates$onset_date
  estimates1 <- nowcast1$estimates$estimate
  lower1 <- nowcast1$estimates$q_0.025  # 2.5% quantile (lower bound of 95% PI)
  upper1 <- nowcast1$estimates$q_0.975  # 97.5% quantile (upper bound of 95% PI)
  
  # Extract estimates and credible intervals for the negative binomial model nowcast
  onset_dates2 <- nowcast2$estimates$onset_date
  estimates2 <- nowcast2$estimates$estimate
  lower2 <- nowcast2$estimates$q_0.025
  upper2 <- nowcast2$estimates$q_0.975
  
  # Extract eventual case counts
  case_dates <- cases_per_date$onset_week
  case_counts <- cases_per_date$count
  
  # Calculate plot range
  min_val <- min(c(lower1, lower2, case_counts), na.rm = TRUE)
  max_val <- max(c(upper1, upper2, case_counts), na.rm = TRUE) * 1.35
  
  # Create the plot
  plot(
    onset_dates1, estimates1, col = 'blue', type = 'l',
    xlab = 'Onset Date', ylab = 'Cases',
    ylim = c(min_val, max_val), lwd = 2,
    main = paste0('Incidence Estimates for ', now)
  )
  lines(onset_dates2, estimates2, col = 'red', lwd = 2)
  
  # Add 95% PI shaded regions for both nowcasts
  polygon(c(onset_dates1, rev(onset_dates1)), c(lower1, rev(upper1)), 
          col = rgb(0, 0, 1, 0.2), border = NA) 
  polygon(c(onset_dates2, rev(onset_dates2)), c(lower2, rev(upper2)), 
          col = rgb(1, 0, 0, 0.2), border = NA) 
  
  # Add true case counts as points
  points(case_dates, case_counts, col = 'black', pch = 20)
  
  # Add a legend
  legend(
    'topleft',
    legend = c('Estimates Poisson model', 
               'Estimates negative binomial model', 
               '95% PI (Poisson model)', 
               '95% PI (negative binomial model)',
               'Eventual cases'),
    col = c('blue', 'red', rgb(0, 0, 1, 0.2), rgb(1, 0, 0, 0.2), 'black'),
    lty = c(1, 1, NA, NA, NA), lwd = c(2, 2, NA, NA, NA), 
    pch = c(NA, NA, 15, 15, 20),
    pt.cex = c(NA, NA, 1.5, 1.5, 1), 
    cex = 0.9
  )
}
cases_per_week <- denguedat %>% group_by(onset_week) %>% summarize(count=n())
for (t in seq_along(test_dates)) {
  
  now <- test_dates[t]
  nowcast_pois <- nowcasts_pois[[t]]
  nowcast_nb <- nowcasts_nb[[t]]
  cases_per_week1 <- cases_per_week[cases_per_week$onset_week <= now, ]
  plot_estimates(nowcast_pois, nowcast_nb, cases_per_week1, now)
}
As seen in these plots, the mean estimates of the two models are very similar, but the negative binomial model has wider prediction intervals, which appear to better capture the actual number of eventual cases.
Now, let us calculate the WIS for these nowcasting estimates. To do
this, we first need to extract the quantile estimates from the nowcast
results and use them to construct a data frame compatible with the
scoringutils package.
The quantile estimates can be found in the estimates
data frame within the list returned by the NobBS and
NobBS.strat functions. The following code iterates over the
nowcasting results, extracts the quantile estimates, and combines them
with the observed case counts into a unified data frame. From each
nowcast we extract the estimates for three horizons: 0 (the current
week), -1 (the previous week) and -2 (two weeks ago).
data <- data.frame(onset_week=as.Date(character()),
                   now=as.Date(character()),
                   horizon=numeric(),
                   quantile_level=numeric(),
                   predicted=numeric(),
                   observed=numeric(),
                   model=character())
cases_per_week <- denguedat %>% group_by(onset_week) %>% summarize(count=n())
horizons <- c(-2,-1,0)
for (t in seq_along(test_dates)) {
  now <- test_dates[t]
  nowcast_pois <- nowcasts_pois[[t]]
  nowcast_nb <- nowcasts_nb[[t]]
  for(h in horizons) {
    date <- now + h*7
    true_value <- cases_per_week[cases_per_week$onset_week==date,]$count
    q_est_pois <- unname(unlist(nowcast_pois$estimates[nowcast_pois$estimates$onset_date==date,q_cols]))
    q_est_nb <- unname(unlist(nowcast_nb$estimates[nowcast_nb$estimates$onset_date==date,q_cols]))
    data_est <- data.frame(onset_week=rep(date,q_len*2),
                           now=rep(now,q_len*2),
                           horizon=rep(h,q_len*2),
                           quantile_level=rep(quantiles,2),
                           predicted=c(q_est_pois,q_est_nb),
                           observed=rep(true_value,q_len*2),
                           model=c(rep('Pois_model',q_len),rep('NB_model',q_len)))
    data <- rbind(data,data_est)
  }
}
print(head(data))
#>   onset_week        now horizon quantile_level predicted observed      model
#> 1 1994-08-15 1994-08-29      -2          0.025       131      135 Pois_model
#> 2 1994-08-15 1994-08-29      -2          0.050       132      135 Pois_model
#> 3 1994-08-15 1994-08-29      -2          0.100       133      135 Pois_model
#> 4 1994-08-15 1994-08-29      -2          0.150       133      135 Pois_model
#> 5 1994-08-15 1994-08-29      -2          0.200       134      135 Pois_model
#> 6 1994-08-15 1994-08-29      -2          0.250       135      135 Pois_model
print(tail(data))
#>     onset_week        now horizon quantile_level predicted observed    model
#> 625 1994-09-26 1994-09-26       0          0.750    196.00      210 NB_model
#> 626 1994-09-26 1994-09-26       0          0.800    204.00      210 NB_model
#> 627 1994-09-26 1994-09-26       0          0.850    215.15      210 NB_model
#> 628 1994-09-26 1994-09-26       0          0.900    230.00      210 NB_model
#> 629 1994-09-26 1994-09-26       0          0.950    259.05      210 NB_model
#> 630 1994-09-26 1994-09-26       0          0.975    289.00      210  NB_model
To help interpret this table, consider the first row as an example. The 0.025 quantile level means that 131 cases was the lower bound of the 95% prediction interval for the onset week of August 15, 1994, nowcast on August 29, 1994 (horizon = -2). Since the actual observed value was 135 cases, the interval appropriately captured the lower tail of the distribution.
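As a quick illustration, the two rows that define this 95% interval can be pulled straight from the data frame constructed above (a small sketch using base R's subset):
# 95% PI bounds for the 1994-08-15 onset week, nowcast on 1994-08-29
# with the Poisson model
subset(data,
       onset_week == as.Date("1994-08-15") &
         now == as.Date("1994-08-29") &
         model == "Pois_model" &
         quantile_level %in% c(0.025, 0.975))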
Next, we utilize the scoringutils package to compute the
WIS for the Poisson and negative binomial models using the constructed
data frame.
nowcasts <- data %>%
          scoringutils::as_forecast_quantile()
scores <- scoringutils::score(nowcasts, 
             get_metrics(nowcasts,select=c("wis","overprediction","underprediction","dispersion")))
print(head(scores))
#>    onset_week        now horizon      model       wis overprediction
#>        <Date>     <Date>   <num>     <char>     <num>          <num>
#> 1: 1994-08-15 1994-08-29      -2 Pois_model  1.259524      0.4761905
#> 2: 1994-08-15 1994-08-29      -2   NB_model  1.035714      0.1428571
#> 3: 1994-08-22 1994-08-29      -1 Pois_model 10.921429      8.5714286
#> 4: 1994-08-22 1994-08-29      -1   NB_model  5.976190      2.9523810
#> 5: 1994-08-29 1994-08-29       0 Pois_model 16.635714      9.8095238
#> 6: 1994-08-29 1994-08-29       0   NB_model  9.364286      1.4761905
#>    underprediction dispersion
#>              <num>      <num>
#> 1:               0  0.7833333
#> 2:               0  0.8928571
#> 3:               0  2.3500000
#> 4:               0  3.0238095
#> 5:               0  6.8261905
#> 6:               0  7.8880952
To help interpret this table, consider the first row as an example. For the onset week of August 15, 1994, using the Poisson model, the nowcast made on August 29, 1994 (horizon = -2) resulted in a WIS of 1.259. The decomposition of the WIS shows that 0.476 of the score came from overprediction, 0 from underprediction, and 0.783 from dispersion. This suggests that the model slightly overestimated case counts but had a reasonable spread in its prediction intervals.
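Because the WIS is the sum of these three components, the decomposition can be verified directly on the scores object created above (a small sketch):
# The WIS equals overprediction + underprediction + dispersion; check row 1
with(scores[1, ],
     c(wis = wis,
       component_sum = overprediction + underprediction + dispersion))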
We can now summarize the results of the models according to the horizon:
scores_per_horizon <- scores %>%
          summarise_scores(by = c("horizon", "model"))
print(scores_per_horizon)
#>    horizon      model       wis overprediction underprediction dispersion
#>      <num>     <char>     <num>          <num>           <num>      <num>
#> 1:      -2 Pois_model  1.845238      0.3333333       0.6666667  0.8452381
#> 2:      -2   NB_model  1.695286      0.1619048       0.5533333  0.9800476
#> 3:      -1 Pois_model 24.635238      4.8380952      17.6571429  2.1400000
#> 4:      -1   NB_model 19.276190      2.6857143      13.0952381  3.4952381
#> 5:       0 Pois_model 24.351643      6.9238095      11.6238095  5.8040238
#> 6:       0   NB_model 20.149238      2.5714286       8.9904762  8.5873333
plot_wis(scores_per_horizon) + 
  facet_wrap(~ horizon, scales ='free') +
  ggtitle('WIS per Horizon')
As we have seen above, the negative binomial model gives wider estimates than the Poisson model, which is reflected in higher dispersion scores. However, the negative binomial model covers the true data better than the Poisson model, which is reflected in lower overprediction and underprediction scores. Overall, the WIS (the sum of these three components) of the negative binomial model is lower (better) than the WIS of the Poisson model for all three horizons.
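The coverage point can also be checked empirically. The sketch below uses dplyr on the quantile data frame built earlier (rather than any scoringutils helper) to compute the share of 95% prediction intervals that contained the eventual case count:
# Empirical 95% PI coverage per model; each onset_week/now pair is one interval
data %>%
  filter(quantile_level %in% c(0.025, 0.975)) %>%
  group_by(model, onset_week, now) %>%
  summarize(covered = observed[1] >= min(predicted) &
              observed[1] <= max(predicted),
            .groups = "drop") %>%
  group_by(model) %>%
  summarize(coverage_95 = mean(covered))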
We can also compare the model results according to another parameter - like the nowcasting date (the date on which the nowcast was performed):
scores_per_now_date <- scores %>%
          summarise_scores(by = c("now", "model"))
print(scores_per_now_date)
#>            now      model       wis overprediction underprediction dispersion
#>         <Date>     <char>     <num>          <num>           <num>      <num>
#>  1: 1994-08-29 Pois_model  9.605556       6.285714        0.000000   3.319841
#>  2: 1994-08-29   NB_model  5.458730       1.523810        0.000000   3.934921
#>  3: 1994-09-05 Pois_model 21.019841       0.000000       18.238095   2.781746
#>  4: 1994-09-05   NB_model 18.068254       0.000000       14.444444   3.623810
#>  5: 1994-09-12 Pois_model 11.625417       8.698413        0.000000   2.927004
#>  6: 1994-09-12   NB_model  8.753175       3.888889        0.000000   4.864286
#>  7: 1994-09-19 Pois_model 14.326984       5.174603        6.500000   2.652381
#>  8: 1994-09-19   NB_model 12.352698       3.619048        4.222222   4.511429
#>  9: 1994-09-26 Pois_model 28.142401       0.000000       25.174603   2.967798
#> 10: 1994-09-26   NB_model 23.901667       0.000000       19.065079   4.836587
plot_wis(scores_per_now_date) + 
  facet_wrap(~ now, scales ='free')  +
  ggtitle('WIS per Nowcasting Date')
We again see that, for each of these five nowcasting dates, the WIS of the negative binomial model is lower (better) than the WIS of the Poisson model. This is a common finding, because surveillance data are typically overdispersed.
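To condense this into one headline number per model, the same summarise_scores() helper can, for example, aggregate over all nowcast dates and horizons at once:
# Overall mean scores per model, across all nowcast dates and horizons
scores %>%
  summarise_scores(by = "model")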
The WIS provides an effective framework for assessing the accuracy
and uncertainty of probabilistic forecasts, making it an important tool
for evaluating nowcasting models. This vignette demonstrated how version
1.1.0 of the NobBS package integrates with the
scoringutils package to evaluate forecast accuracy using
WIS.