KRONXnbc implements a Clock of Regimes (COR) classifier: a Student-t Naive Bayes model designed for non-stationary financial market data. Three market regimes are distinguished:
| Regime | Economic intuition |
|---|---|
| Calm | Low volatility, mean-reverting returns |
| Steady | Moderate drift, controlled drawdowns |
| Stress | Fat-tailed returns, deep drawdowns, elevated ruin probability |
The distinguishing engineering choice is a profile grid search over the degrees-of-freedom parameter \(\nu\) of the Student-t likelihood. Rather than fixing \(\nu\) or solving a numerically fragile continuous optimisation, the model evaluates a discrete grid \(\nu \in \{3, 4, \ldots, 30, 40, 60, 100\}\) for every (class, feature) pair and selects the \(\nu\) that maximises the profile log-likelihood. This prevents the \(-\infty\) log-density underflow that collapses a standard Gaussian NBC when a crisis observation falls in the far tail.
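A minimal base-R sketch of such a profile grid search for a single (class, feature) pair. The helper name `profile_nu`, the use of the sample mean as location, and the variance-matched scale are assumptions made for illustration; the package's internal estimator may differ.

```r
# Grid of candidate degrees of freedom, as described in the text.
nu_grid <- c(3:30, 40, 60, 100)

# Hypothetical helper: profile log-likelihood of one feature within one class.
profile_nu <- function(x, grid = nu_grid) {
  mu <- mean(x)
  s  <- sd(x)
  loglik <- vapply(grid, function(nu) {
    # Assumption: scale chosen so the Student-t variance equals the
    # sample variance (valid because every grid value exceeds 2).
    scale <- s * sqrt((nu - 2) / nu)
    sum(dt((x - mu) / scale, df = nu, log = TRUE) - log(scale))
  }, numeric(1))
  list(nu = grid[which.max(loglik)], loglik = loglik)
}

set.seed(1)
fit <- profile_nu(rt(500, df = 4) * 0.01)  # synthetic fat-tailed sample
fit$nu                                     # grid value with the highest log-likelihood
```

Because the search is a finite loop over 31 candidates, it cannot diverge the way a continuous optimiser over \(\nu\) can; the cost is that the selected value is only as fine as the grid.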
The raw input is an hourly equity price series (e.g. E-mini S&P 500 futures, data.csv) paired with a file of decoded HMM regime labels (decoded_states.csv). The input2nbc.R pipeline constructs six continuous predictors over a 24-hour rolling window.
library(zoo)
es_data <- read.csv("data.csv", stringsAsFactors = FALSE)
decoded <- read.csv("decoded_states.csv", stringsAsFactors = FALSE)
es_data <- es_data[!is.na(es_data$ret), ] # drop leading NA
stopifnot(nrow(es_data) == nrow(decoded))
n_roll <- 24L # 24-hour window
cor_data <- data.frame(
timestamp = es_data$timestamp,
log_return = es_data$ret
)
Rolling volatility is the standard deviation of log-returns over the window, floored at 0.0001 to avoid zero-SD degeneracy on flat-market bars.
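The floored rolling standard deviation can be sketched in base R as follows. The helper name `roll_sd` and the trailing-window alignment are assumptions; the original pipeline loads zoo and may use its rolling functions instead.

```r
# Hypothetical helper: trailing-window standard deviation with a floor.
roll_sd <- function(x, n = 24L, floor_sd = 1e-4) {
  out <- rep(NA_real_, length(x))
  for (i in seq_along(x)) {
    if (i >= n) out[i] <- sd(x[(i - n + 1L):i])
  }
  pmax(out, floor_sd)  # NA entries (incomplete windows) stay NA
}

ret <- c(rep(0, 24), 0.01, -0.01)  # flat market followed by two moves
vol <- roll_sd(ret)
vol[24]  # the flat window hits the floor
vol[25]  # the first move lifts volatility above the floor
```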
Measures how far the current close has fallen from the rolling 24-hour peak. Values are zero or negative; a reading of \(-0.03\) means the price is 3 % below its recent high.
\[ \text{Drawdown}_t = \frac{\text{Close}_t - \max_{s \in [t-23,\, t]} \text{Close}_s} {\max_{s \in [t-23,\, t]} \text{Close}_s} \]
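The drawdown formula translates directly into a trailing-window loop. This is a sketch under the assumption that the window covers the current bar and the 23 bars before it, as the index range \([t-23,\, t]\) indicates; `roll_drawdown` is a hypothetical name.

```r
# Hypothetical helper: drawdown from the rolling n-bar peak.
roll_drawdown <- function(close, n = 24L) {
  out <- rep(NA_real_, length(close))
  for (i in seq_along(close)) {
    if (i >= n) {
      peak   <- max(close[(i - n + 1L):i])
      out[i] <- (close[i] - peak) / peak
    }
  }
  out
}

close <- c(rep(100, 24), 97)  # price drops 3 % below its recent high
dd <- roll_drawdown(close)
dd[24:25]
```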
Unlike rolling volatility — which treats up and down moves symmetrically — the downside semi-deviation isolates the left tail of the return distribution. It is the root-mean-square of negative returns only, making it highly sensitive to the onset of a Stress episode even when overall volatility is still moderate.
\[ \text{SemiDev}_t = \sqrt{\frac{1}{|\mathcal{N}|} \sum_{r \in \mathcal{N}} r^2}, \qquad \mathcal{N} = \{r_s : r_s < 0,\; s \in [t-23,\, t]\} \]
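A small sketch of the semi-deviation formula. The helper name `roll_semidev` is hypothetical, and returning 0 for a window with no negative returns is an assumption, since the formula is undefined when \(\mathcal{N}\) is empty.

```r
# Hypothetical helper: root-mean-square of the negative returns in each window.
roll_semidev <- function(ret, n = 24L) {
  out <- rep(NA_real_, length(ret))
  for (i in seq_along(ret)) {
    if (i >= n) {
      w   <- ret[(i - n + 1L):i]
      neg <- w[w < 0]
      # Assumption: an empty negative set maps to 0.
      out[i] <- if (length(neg) > 0L) sqrt(mean(neg^2)) else 0
    }
  }
  out
}

ret <- c(rep(0.001, 22), -0.03, -0.04)  # two large losses at the end
sdv <- roll_semidev(ret)
sdv[24]  # equals sqrt((0.03^2 + 0.04^2) / 2)
```

The 22 small positive returns do not dilute the result, which is why this measure reacts to a cluster of losses sooner than the symmetric rolling standard deviation.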
Counts consecutive hours spent in drawdown (defined as drawdown \(< -0.5\%\)). A long, uninterrupted drawdown streak signals structural regime persistence rather than a momentary spike.
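One way to count such a streak is a running counter that resets whenever the drawdown recovers above the threshold. The helper name `drawdown_streak`, the treatment of missing drawdowns as "not in drawdown", and the absence of a cap at the window length are all assumptions of this sketch.

```r
# Hypothetical helper: consecutive bars with drawdown below the threshold.
drawdown_streak <- function(dd, threshold = -0.005) {
  streak <- integer(length(dd))
  run <- 0L
  for (i in seq_along(dd)) {
    in_dd <- !is.na(dd[i]) && dd[i] < threshold
    run <- if (in_dd) run + 1L else 0L  # reset on recovery or missing value
    streak[i] <- run
  }
  streak
}

dd <- c(0, -0.006, -0.010, -0.002, -0.007, NA, -0.008)
drawdown_streak(dd)
```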
The probability of a \(-2\%\) or worse move under the current rolling distribution — i.e. \(\Phi\!\left(\frac{-0.02 - \hat\mu_t}{\hat\sigma_t}\right)\). This forward-looking tail-risk measure rises sharply just before a Stress transition.
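The expression above is a single call to the normal CDF. In this sketch `ruin_proxy` is a hypothetical function name, and \(\hat\mu_t\) and \(\hat\sigma_t\) are assumed to be the rolling mean and rolling standard deviation of log-returns.

```r
# Hypothetical helper: probability of a return at or below the threshold
# under a normal distribution with the rolling mean and SD.
ruin_proxy <- function(mu_hat, sigma_hat, threshold = -0.02) {
  pnorm((threshold - mu_hat) / sigma_hat)
}

ruin_proxy(mu_hat = 0, sigma_hat = 0.01)  # threshold is 2 SDs away
ruin_proxy(mu_hat = 0, sigma_hat = 0.02)  # threshold is 1 SD away, so higher
```

Note that the proxy itself uses a Gaussian CDF; the Student-t modelling applies later, to the distribution of this feature within each regime.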
# KRONX empirical label mapping (derived from HMM state ordering)
state_labels <- c("1" = "Stress", "2" = "Calm", "3" = "Steady")
cor_data$regime <- factor(
state_labels[as.character(decoded$state)],
levels = c("Calm", "Steady", "Stress")
)
cor_data <- cor_data[complete.cases(cor_data), ] # drop rolling-window NAs
write.csv(cor_data, file = "nbc_analysis_report.txt", row.names = FALSE)
A natural instinct for time-series data is to train on the first 80 % of observations and test on the last 20 %. For COR data this fails for a structural reason: financial regimes cluster.
Hourly market data exhibits strong regime persistence — a Stress episode may last 48–200 consecutive hours. A chronological cut therefore risks placing an entire regime cluster exclusively in the test set, leaving the training set with zero (or near-zero) Stress observations. The classifier then has no template for Stress and is forced to assign all Stress observations to the nearest alternative regime, producing classification collapse rather than a meaningful accuracy estimate.
Random 80/20 sampling breaks the temporal adjacency of observations, ensuring every regime class is represented in both partitions regardless of where in calendar time the Stress episodes happened to occur.
Trade-off acknowledged: random sampling leaks distributional information across the split boundary (observations from the same cluster appear in both train and test). For a production backtesting framework a purged, embargo-based cross-validation scheme (e.g. mlr3+PurgedCV) is preferred. For this diagnostic classifier the random split is the correct choice.
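The failure mode of the chronological cut is easy to reproduce on a synthetic label path. The regime lengths below are invented for illustration and place the only Stress cluster at the end of the sample.

```r
# Synthetic regime path: Stress occurs only as one late cluster.
regime <- factor(
  c(rep("Calm", 500), rep("Steady", 350), rep("Stress", 150)),
  levels = c("Calm", "Steady", "Stress")
)
n       <- length(regime)
n_train <- as.integer(round(0.8 * n))

# Chronological 80/20 cut: the training set never sees Stress.
chrono_train <- regime[seq_len(n_train)]
table(chrono_train)

# Random 80/20 split: every regime appears in both partitions.
set.seed(123)
idx <- sample(n, size = n_train)
table(regime[idx])
table(regime[-idx])
```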
cor_data <- read.csv("nbc_analysis_report.txt", stringsAsFactors = FALSE)
cor_data$regime <- factor(cor_data$regime, levels = c("Calm", "Steady", "Stress"))
cor_data <- cor_data[!is.na(cor_data$regime), ]
features <- c("log_return", "rolling_volatility", "drawdown",
"transition_stress", "residence_pressure", "ruin_proxy")
set.seed(123)
train_idx <- sample(seq_len(nrow(cor_data)), size = floor(0.80 * nrow(cor_data)))
train <- cor_data[ train_idx, ]
test <- cor_data[-train_idx, ]
x_train <- as.matrix(train[, features]); y_train <- train$regime
x_test <- as.matrix(test[, features]); y_test <- test$regime
A self-contained synthetic demonstration using the same six feature names:
library(kronxNBC)
set.seed(42L)
n <- 300L
mk <- n / 3L
# Mimic the distributional shape of each regime
X_syn <- rbind(
data.frame( # Calm
log_return = rnorm(mk, 0.0002, 0.003),
rolling_volatility = rnorm(mk, 0.004, 0.001),
drawdown = rnorm(mk, -0.002, 0.002),
transition_stress = abs(rnorm(mk, 0.001, 0.0005)),
residence_pressure = rpois(mk, 1),
ruin_proxy = rbeta(mk, 1, 20)
),
data.frame( # Steady
log_return = rnorm(mk, 0.0005, 0.005),
rolling_volatility = rnorm(mk, 0.008, 0.002),
drawdown = rnorm(mk, -0.008, 0.004),
transition_stress = abs(rnorm(mk, 0.003, 0.001)),
residence_pressure = rpois(mk, 3),
ruin_proxy = rbeta(mk, 2, 10)
),
data.frame( # Stress: fat-tailed
log_return = rt(mk, df = 3) * 0.012,
rolling_volatility = rnorm(mk, 0.022, 0.005),
drawdown = rnorm(mk, -0.030, 0.010),
transition_stress = abs(rnorm(mk, 0.015, 0.005)),
residence_pressure = rpois(mk, 12),
ruin_proxy = rbeta(mk, 5, 3)
)
)
X_syn <- as.matrix(X_syn)
y_syn <- factor(
rep(c("Calm", "Steady", "Stress"), each = mk),
levels = c("Calm", "Steady", "Stress")
)
set.seed(7L)
tr_idx <- sample(n, size = floor(0.8 * n))
x_train <- X_syn[ tr_idx, ]; y_train <- y_syn[ tr_idx]
x_test <- X_syn[-tr_idx, ]; y_test <- y_syn[-tr_idx]
model <- student_t_naive_bayes(x_train, y_train)
summary(model)
#>
#> ============================ Student-t Naive Bayes ============================
#>
#> - Call: student_t_naive_bayes(x = x_train, y = y_train)
#> - Samples: 240
#> - Features: 6
#> - nu grid range: 3 to 100
#> - Prior probabilities:
#> - Calm: 0.3417
#> - Steady: 0.3125
#> - Stress: 0.3458
#>
#> -------------------------------------------------------------------------------
tabs <- tables(model)
print(tabs)
#> $log_return
#> Calm Steady Stress
#> mu 3.768694e-04 4.757325e-05 -1.158370e-03
#> sd 3.091415e-03 4.948078e-03 1.447585e-02
#> nu 3.000000e+01 1.000000e+02 9.000000e+00
#>
#> $rolling_volatility
#> Calm Steady Stress
#> mu 3.954591e-03 7.661608e-03 2.174149e-02
#> sd 9.037302e-04 1.708582e-03 4.649135e-03
#> nu 1.000000e+02 4.000000e+01 1.000000e+02
#>
#> $drawdown
#> Calm Steady Stress
#> mu -0.002126564 -0.008043378 -0.029062709
#> sd 0.002051736 0.003979391 0.010660654
#> nu 100.000000000 27.000000000 100.000000000
#>
#> $transition_stress
#> Calm Steady Stress
#> mu 9.923345e-04 3.106482e-03 1.602354e-02
#> sd 4.205171e-04 1.150587e-03 5.069987e-03
#> nu 1.000000e+02 6.000000e+01 4.000000e+01
#>
#> $residence_pressure
#> Calm Steady Stress
#> mu 0.9445213 2.7786013 11.9450720
#> sd 0.9865471 1.6836134 3.4498315
#> nu 100.0000000 100.0000000 100.0000000
#>
#> $ruin_proxy
#> Calm Steady Stress
#> mu 0.04645572 0.16278727 0.64599386
#> sd 0.05494230 0.11070799 0.15403066
#> nu 6.00000000 15.00000000 100.00000000
#>
#> attr(,"class")
#> [1] "naive_bayes_tables"
#> attr(,"cond_dist")
#> log_return rolling_volatility drawdown transition_stress
#> "Student-t" "Student-t" "Student-t" "Student-t"
#> residence_pressure ruin_proxy
#> "Student-t" "Student-t"
coef(model)
#> Calm:mu Calm:sd Calm:nu Steady:mu Steady:sd
#> log_return 0.0003768694 0.0030914147 30 4.757325e-05 0.004948078
#> rolling_volatility 0.0039545908 0.0009037302 100 7.661608e-03 0.001708582
#> drawdown -0.0021265636 0.0020517360 100 -8.043378e-03 0.003979391
#> transition_stress 0.0009923345 0.0004205171 100 3.106482e-03 0.001150587
#> residence_pressure 0.9445213006 0.9865470622 100 2.778601e+00 1.683613368
#> ruin_proxy 0.0464557214 0.0549422995 6 1.627873e-01 0.110707985
#> Steady:nu Stress:mu Stress:sd Stress:nu
#> log_return 100 -0.00115837 0.014475850 9
#> rolling_volatility 40 0.02174149 0.004649135 100
#> drawdown 27 -0.02906271 0.010660654 100
#> transition_stress 60 0.01602354 0.005069987 40
#> residence_pressure 100 11.94507202 3.449831495 100
#> ruin_proxy 15 0.64599386 0.154030658 100
Observations where the posterior probability of the Stress regime exceeds 60 % trigger a COR Stress Alert, an actionable signal for risk managers to review position sizing or hedging.
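As a rough illustration of how a posterior of this kind is formed and thresholded, the sketch below combines per-class log-likelihoods with log-priors and normalises them with the log-sum-exp trick. All numbers and the names `loglik_toy`, `post_toy` and `alert_toy` are made up for the example; they are not outputs of the package.

```r
# Hypothetical joint log-likelihoods of two observations under each regime.
loglik_toy <- rbind(
  c(Calm = -2.0,  Steady = -3.5, Stress = -9.0),
  c(Calm = -12.0, Steady = -8.0, Stress = -6.5)
)
log_prior <- log(c(Calm = 0.34, Steady = 0.31, Stress = 0.35))

# Naive Bayes: log posterior is log prior plus log likelihood, up to a constant.
log_joint <- sweep(loglik_toy, 2, log_prior, `+`)

# Subtract the row maximum before exponentiating to avoid underflow.
row_max  <- apply(log_joint, 1, max)
post_toy <- exp(log_joint - row_max)
post_toy <- post_toy / rowSums(post_toy)

alert_toy <- post_toy[, "Stress"] > 0.60
alert_toy
```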
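# pred_prob is assumed to hold the posterior class probabilities for x_test,
# one column per regime; its construction is not shown in this section.
stress_prob <- pred_prob[, "Stress"]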
alert_flag <- ifelse(stress_prob > 0.60, "COR Stress Alert", "No Alert")
cat("\nCOR Stress Alert Summary (test period):\n")
#>
#> COR Stress Alert Summary (test period):
print(table(alert_flag))
#> alert_flag
#> COR Stress Alert No Alert
#> 17 43
cat("\nPosterior Stress probability — first 10 test observations:\n")
#>
#> Posterior Stress probability — first 10 test observations:
print(round(head(stress_prob, 10L), 4))
#> [1] 0 0 0 0 0 0 0 0 0 0
The most theoretically important output is the set of per-feature, per-class degrees-of-freedom estimates. Extracting them directly from the parameter matrices:
nu_df <- as.data.frame(t(model$params$nu))
colnames(nu_df) <- paste0("nu.", c("Calm", "Steady", "Stress"))
nu_df
#> nu.Calm nu.Steady nu.Stress
#> log_return 30 100 9
#> rolling_volatility 100 40 100
#> drawdown 100 27 100
#> transition_stress 100 60 40
#> residence_pressure 100 100 100
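#> ruin_proxy 6 15 100
Under a Student-t distribution, a smaller \(\nu\) means heavier tails, and as \(\nu \to \infty\) the distribution converges to the Gaussian.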
When fitted to real COR data, the Stress regime consistently receives \(\nu \approx 3\)–\(6\) on log_return and drawdown, while Calm receives \(\nu > 20\). This is not a modelling assumption; it is an empirical finding that emerges from the profile grid search.
This finding validates the core financial hypothesis:
Crisis episodes are not merely high-volatility Gaussian events. They are draws from a genuinely different, fat-tailed distribution that a Gaussian NBC cannot represent without catastrophic classification failure.
The grid search selects the \(\nu\) that best explains the observed data under the Student-t family. A low \(\nu\) on Stress features is therefore both a diagnostic of past crises and a structural reason why the KRONXnbc classifier is more reliable than a standard Gaussian Naive Bayes during market dislocations.
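The difference in tail behaviour can be checked directly with base R density functions. The sketch evaluates a standardised observation 40 scale units from the centre, an arbitrary value chosen to be far enough out that the Gaussian density underflows when it is computed on the natural scale before taking the log.

```r
z <- 40  # standardised distance of a hypothetical crisis observation

dnorm(z)             # underflows to exactly 0 in double precision
log(dnorm(z))        # -Inf: the class can never win the posterior comparison
dt(z, df = 3)        # small but strictly positive
log(dt(z, df = 3))   # finite, so the observation can still be classified
```

With polynomial rather than exponential tail decay, the Student-t density remains representable far beyond the point where the Gaussian density is lost to underflow.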
nu_stress_ret <- model$params$nu["Stress", "log_return"]
nu_calm_ret <- model$params$nu["Calm", "log_return"]
cat(sprintf(
"log_return: nu(Stress) = %.0f | nu(Calm) = %.0f\n",
nu_stress_ret, nu_calm_ret
))
#> log_return: nu(Stress) = 9 | nu(Calm) = 30
if (nu_stress_ret < nu_calm_ret) {
cat("=> Stress regime shows heavier tails on log_return, as hypothesised.\n")
} else {
cat("=> Note: with this synthetic data nu ordering may differ from empirical results.\n")
}
#> => Stress regime shows heavier tails on log_return, as hypothesised.
sessionInfo()
#> R version 4.6.0 (2026-04-24)
#> Platform: aarch64-apple-darwin23
#> Running under: macOS Sequoia 15.7.7
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.6/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.6/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
#>
#> locale:
#> [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> time zone: Europe/Riga
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] kronxNBC_0.1.1
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.39 R6_2.6.1 fastmap_1.2.0 xfun_0.57
#> [5] cachem_1.1.0 knitr_1.51 htmltools_0.5.9 rmarkdown_2.31
#> [9] lifecycle_1.0.5 cli_3.6.6 sass_0.4.10 jquerylib_0.1.4
#> [13] compiler_4.6.0 tools_4.6.0 evaluate_1.0.5 bslib_0.11.0
#> [17] yaml_2.3.12 otel_0.2.0 rlang_1.2.0 jsonlite_2.0.0