---
title: "Introduction to StochSimR"
author: "Ayush Kundu"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to StochSimR}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4,
  dpi = 100
)
set.seed(42)
```

## Overview

**StochSimR** is a modular simulation engine for stochastic processes. It
provides a unified interface for simulating, analysing, and visualising paths
from a wide range of processes used in finance, physics, queueing theory, and
applied probability.

Every simulator returns a `stoch_path` object that plugs directly into the
plotting and analysis functions.

```{r}
library(StochSimR)
```

## Classic Processes

### Brownian Motion

Standard Brownian motion (Wiener process) is the building block for most
continuous-time models. The `sim_brownian()` function supports both exact
simulation and Brownian bridge interpolation.

```{r brownian}
bm <- sim_brownian(T_max = 1, n_steps = 1000, mu = 0, sigma = 1,
                   n_paths = 100)
bm
plot_paths(bm, max_paths = 50, show_mean = TRUE, show_bands = TRUE)
```

The red line is the sample mean across 100 paths; the shaded band shows the
pointwise 2.5%--97.5% quantile envelope.

### Poisson Process

The counting process for random arrivals. Supports both homogeneous (constant
rate) and inhomogeneous (time-varying rate) specifications.

```{r poisson}
# Homogeneous
pp <- sim_poisson(T_max = 10, lambda = 3, n_paths = 20)
plot_paths(pp, show_bands = FALSE)
```

```{r poisson-inhom}
# Inhomogeneous with sinusoidal rate
pp_ih <- sim_poisson(
  T_max = 10,
  lambda = function(t) 3 + 2 * sin(2 * pi * t / 5),
  lambda_max = 5,
  n_paths = 20
)
plot_paths(pp_ih, show_bands = FALSE,
           title = "Inhomogeneous Poisson (sinusoidal rate)")
```

### Markov Chain

Discrete-time Markov chains are simulated exactly by sampling from the
transition matrix row at each step.

```{r markov}
P <- matrix(c(0.7, 0.2, 0.1,
              0.3, 0.4, 0.3,
              0.2, 0.3, 0.5), nrow = 3, byrow = TRUE)

mc <- sim_markov(P, n_steps = 300, x0 = 1, n_paths = 10)
plot_paths(mc, show_bands = FALSE, show_mean = FALSE)
```

Verify convergence to the stationary distribution:

```{r markov-stationary}
long_mc <- sim_markov(P, n_steps = 50000, x0 = 1, n_paths = 1)
freq <- table(factor(long_mc$values, levels = 1:3)) / length(long_mc$values)
freq
```

## Advanced Processes

### Geometric Brownian Motion (GBM)

The standard model for stock prices. Two methods are available: `"exact"` uses
the known log-normal solution; `"euler"` uses Euler--Maruyama discretisation.

```{r gbm}
stock <- sim_gbm(T_max = 1, n_steps = 252, mu = 0.08, sigma = 0.25,
                 x0 = 100, n_paths = 200)
plot_paths(stock, max_paths = 80)
plot_distribution(stock)
```

### Ornstein--Uhlenbeck Process

A mean-reverting diffusion. With `method = "exact"`, the known Gaussian
transition density is used -- no discretisation error.

```{r ou}
ou <- sim_ou(T_max = 10, n_steps = 1000, theta = 2, mu = 5,
             sigma = 0.5, x0 = 0, n_paths = 50)
plot_paths(ou, show_mean = TRUE, show_bands = TRUE)
```

Note how all paths converge toward the long-run mean of 5.

### Jump-Diffusion (Merton Model)

Combines GBM with compound Poisson jumps in log-price.

```{r jump-diffusion}
jd <- sim_jump_diffusion(
  T_max = 1, n_steps = 1000,
  mu = 0.05, sigma = 0.2,
  lambda = 3, mu_j = -0.02, sigma_j = 0.1,
  x0 = 100, n_paths = 50
)
plot_paths(jd, max_paths = 30)
```

### Hawkes (Self-Exciting) Process

A point process where each event temporarily increases the rate of future
events. The branching ratio `alpha/beta` controls the degree of clustering.

```{r hawkes}
hk <- sim_hawkes(T_max = 50, mu = 0.5, alpha = 0.8, beta = 1.2,
                 n_paths = 15)
plot_paths(hk, show_bands = FALSE, show_mean = TRUE)
```

### Levy Processes

Four families are available:

```{r levy-gamma}
# Gamma subordinator (non-decreasing)
gam <- sim_levy(T_max = 5, n_steps = 500, type = "gamma",
                shape = 2, rate = 1, n_paths = 20)
plot_paths(gam, show_bands = FALSE)
```

```{r levy-vg}
# Variance-Gamma
vg <- sim_levy(T_max = 1, n_steps = 500, type = "variance_gamma",
               theta = -0.1, sigma = 0.3, nu = 0.5, n_paths = 30)
plot_paths(vg, max_paths = 20)
```

## Summary Statistics

The `path_summary()` function computes terminal value distribution, extremes,
increment moments, skewness, and excess kurtosis:

```{r summary}
path_summary(stock)
```

The `plot_acf_paths()` function shows the autocorrelation of path increments,
useful for verifying independence (Brownian) or detecting dependence (OU):

```{r acf}
plot_acf_paths(ou, path_idx = 1)
```

## Variance Reduction

StochSimR includes four variance reduction techniques. Each returns the
estimate, standard error, and the variance reduction factor versus crude
Monte Carlo.

### Control Variates

Use the terminal stock price (whose expectation is known analytically) as a
control for pricing a European call:

```{r cv}
h_call <- function(vals) pmax(vals[nrow(vals), ] - 100, 0)
g_term <- function(vals) vals[nrow(vals), ]
E_g <- 100 * exp(0.05)

cv <- vr_control_variate(sim_gbm, h = h_call, g = g_term, E_g = E_g,
  n_paths = 5000, T_max = 1, n_steps = 252,
  mu = 0.05, sigma = 0.2, x0 = 100)

cat("Estimate:", round(cv$estimate, 4), "\n")
cat("SE:      ", round(cv$se, 4), "\n")
cat("Reduction:", round(cv$reduction_factor, 1), "x\n")
cat("Correlation(h, g):", round(cv$correlation, 3), "\n")
```

### Importance Sampling

Tilt the drift upward to estimate a tail probability more efficiently:

```{r is}
h_tail <- function(vals) as.numeric(vals[nrow(vals), ] > 150)
is_result <- vr_importance(sim_gbm, h = h_tail, drift_shift = 0.4,
  n_paths = 10000, T_max = 1, n_steps = 252,
  mu = 0.05, sigma = 0.2, x0 = 100)

cat("P(S_T > 150):", round(is_result$estimate, 6), "\n")
cat("ESS:         ", round(is_result$ess), "/", is_result$n_samples, "\n")
cat("Reduction:   ", round(is_result$reduction_factor, 1), "x\n")
```

## Monte Carlo Estimation

The `mc_estimate()` function provides a general-purpose estimator with
batch-means standard errors and a convergence trace:

```{r mc}
h_pos <- function(vals) pmax(vals[nrow(vals), ], 0)
mc <- mc_estimate(sim_brownian, h = h_pos, n_paths = 10000,
                  batch_size = 500, T_max = 1, n_steps = 200)

cat("E[max(B_1, 0)] =", round(mc$estimate, 4),
    "+/-", round(mc$se, 4), "\n")
cat("Exact answer:    ", round(1 / sqrt(2 * pi), 4), "\n")
```

## Method Comparison

Compare simulation methods on the same process to measure bias and speed:

```{r compare}
comp <- compare_methods(sim_ou, methods = c("exact", "euler"),
  n_paths = 500, n_reps = 20,
  T_max = 5, n_steps = 500, theta = 2, mu = 1, sigma = 0.5, x0 = 0)
print(comp)
```

The exact method uses the known Gaussian transition density and has zero
discretisation bias; the Euler method is faster per step but introduces
$O(\Delta t)$ weak error.

## Session Info

```{r session}
sessionInfo()
```