The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

fdars

R-CMD-check CRAN status CRAN downloads License: MIT codecov

Functional Data Analysis in Rust - A high-performance R package for functional data analysis with a Rust backend.

What is Functional Data Analysis?

Functional Data Analysis (FDA) is a branch of statistics that deals with data where each observation is a function, curve, or surface rather than a single number or vector. Examples include:

Traditional statistical methods treat each time point as a separate variable, losing the inherent smoothness and continuity of the data. FDA treats the entire curve as a single observation, enabling more powerful and interpretable analyses.

Features

fdars is a comprehensive toolkit for functional data analysis with a high-performance Rust backend providing 10-200x speedups over pure R implementations.

Data Representation

Depth & Centrality

Measure how “central” or “typical” each curve is: - Fraiman-Muniz (FM), Band depth (BD), Modified band depth (MBD) - Modal depth, Random projection (RP, RT, RPD) - Functional spatial depth (FSD, KFSD) - Depth-based median, trimmed mean, trimmed variance

Outlier Detection

Multiple approaches to identify anomalous curves: - Depth-based trimming and weighting - Likelihood ratio test (LRT) - Functional boxplot - Magnitude-Shape plot (magnitude vs shape outliers) - Outliergram (MEI vs MBD)

Distance & Similarity

Quantify differences between curves: - Lp distances (L1, L2, L∞) - Hausdorff distance - Dynamic time warping (DTW) - PCA-based and derivative-based semimetrics

Regression

Predict scalar outcomes from functional predictors: - Principal component regression (fregre.pc) - Basis expansion regression (fregre.basis) - Nonparametric kernel regression (fregre.np) - Cross-validation for model selection

Clustering

Group similar curves together: - K-means clustering with K-means++ initialization - Fuzzy C-means with soft membership - Automatic selection of optimal k (silhouette, CH, elbow)

Smoothing & Basis Expansion

Functional Statistics

Gaussian Process Simulation

Generate synthetic functional data: - Multiple covariance kernels (Gaussian, Matérn, Exponential, Periodic) - Kernel composition (addition, multiplication) - Brownian motion and Ornstein-Uhlenbeck processes

Group Comparison

Visualization

Installation

Prerequisites

From GitHub

# Install remotes if needed
install.packages("remotes")

# Install fdars (with documentation)
remotes::install_github("sipemu/fdars-r", build_vignettes = TRUE)

Note: On Windows, you may need Rtools installed.

From Binary Release (No Rust Required)

Download the pre-built binary from GitHub Releases:

# macOS
install.packages("path/to/fdars_x.y.z.tgz", repos = NULL, type = "mac.binary")

# Windows
install.packages("path/to/fdars_x.y.z.zip", repos = NULL, type = "win.binary")

From Source

# Clone the repository
git clone https://github.com/sipemu/fdars-r.git
cd fdars-r

# Build and install
R CMD build .
R CMD INSTALL fdars_*.tar.gz

Quick Start

library(fdars)

# Create functional data from a matrix (rows = observations, cols = time points)
t <- seq(0, 1, length.out = 100)
X <- matrix(0, 20, 100)
for (i in 1:20) {
  X[i, ] <- sin(2 * pi * t) + rnorm(100, sd = 0.1)
}
fd <- fdata(X, argvals = t)

# Compute depth - measures how "central" each curve is
depths <- depth(fd)  # default: FM method
depths <- depth(fd, method = "mode")  # or specify method

# Find the functional median (most central curve)
median_curve <- median(fd)  # default: FM method

# Detect outliers
outliers <- outliers.depth.trim(fd, trim = 0.1)

# Functional regression: predict scalar y from functional X
y <- rowMeans(X) + rnorm(20, sd = 0.1)
model <- fregre.pc(fd, y, ncomp = 3)
predictions <- predict(model, fd)

# Cluster curves into groups
clusters <- cluster.kmeans(fd, ncl = 2)

# Smooth noisy curves
S <- S.NW(t, h = 0.1)  # Nadaraya-Watson smoother
smoothed <- S %*% X[1, ]

Key Concepts

Functional Data Objects (fdata)

The fdata class stores functional data as a matrix where rows are observations and columns are evaluation points:

fd <- fdata(data_matrix, argvals = time_points, rangeval = c(0, 1))

Identifiers and Metadata

You can attach identifiers and metadata (covariates) to functional data objects:

# Create fdata with IDs and metadata
meta <- data.frame(
  group = factor(c("control", "treatment", ...)),
  age = c(25, 32, ...),
  response = c(0.5, 0.8, ...)
)
fd <- fdata(X, id = paste0("patient_", 1:n), metadata = meta)

# Access fields
fd$id              # Character vector of identifiers
fd$metadata$group  # Access metadata columns

# Subsetting preserves metadata
fd_sub <- fd[1:10, ]  # id and metadata are also subsetted

# View metadata info
print(fd)    # Shows metadata columns
summary(fd)  # Shows metadata types and ranges

Note: If metadata contains an id column or has non-default row names, they must match the fdata identifiers. An error is thrown on mismatch.

Depth Functions

Depth measures how “central” or “typical” a curve is relative to a sample. Higher depth = more central.

Use the unified depth() function with a method parameter:

depth(fd, method = "FM")     # Fraiman-Muniz depth (default)
depth(fd, method = "BD")     # Band depth
depth(fd, method = "MBD")    # Modified band depth
depth(fd, method = "mode")   # Modal depth (kernel density)
depth(fd, method = "RP")     # Random projection depth
depth(fd, method = "RT")     # Random Tukey depth
depth(fd, method = "FSD")    # Functional spatial depth
depth(fd, method = "KFSD")   # Kernel functional spatial depth
depth(fd, method = "RPD")    # Random projection with derivatives

Functional Regression

Predict a scalar response from functional predictors:

All models support predict() for new data.

Distance Metrics

Measure similarity between curves using metric() with a method parameter:

metric(fd, method = "lp")        # Lp distance (default, L2 = Euclidean)
metric(fd, method = "hausdorff") # Hausdorff distance
metric(fd, method = "dtw")       # Dynamic time warping
metric(fd, method = "pca")       # PCA-based semimetric
metric(fd, method = "deriv")     # Derivative-based semimetric

Individual functions are also available: metric.lp, metric.hausdorff, metric.DTW, semimetric.pca, semimetric.deriv.

Outlier Detection

Identify unusual curves:

Labeling Outliers by ID or Metadata

Both magnitudeshape and outliergram support labeling points by ID or metadata columns:

# Create fdata with IDs and metadata
fd <- fdata(X, id = paste0("patient_", 1:n),
            metadata = data.frame(subject_id = paste0("S", 1:n)))

# Outliergram with custom labels
og <- outliergram(fd)
plot(og, label = "id")           # Label outliers with patient IDs
plot(og, label = "subject_id")   # Label with metadata column
plot(og, label_all = TRUE)       # Label ALL points, not just outliers

# magnitudeshape with custom labels
magnitudeshape(fd, label = "id")        # Label outliers with patient IDs
magnitudeshape(fd, label = NULL)        # No labels

Functional Statistics

Covariance Functions and Gaussian Process Generation

Generate synthetic functional data from Gaussian processes with various covariance kernels:

# Smooth samples with Gaussian (squared exponential) kernel
fd_smooth <- make_gaussian_process(n = 20, t = seq(0, 1, length.out = 100),
                                   cov = kernel_gaussian(length_scale = 0.2))

# Rough samples with Matern kernel
fd_rough <- make_gaussian_process(n = 20, t = seq(0, 1, length.out = 100),
                                  cov = kernel_matern(nu = 1.5))

# Periodic samples
fd_periodic <- make_gaussian_process(n = 10, t = seq(0, 2, length.out = 200),
                                     cov = kernel_periodic(period = 0.5))

# Combine kernels: signal + noise
cov_total <- kernel_add(kernel_gaussian(variance = 1), kernel_whitenoise(variance = 0.1))
fd_noisy <- make_gaussian_process(n = 10, t = seq(0, 1, length.out = 100), cov = cov_total)

Available covariance functions: - kernel_gaussian - Squared exponential (RBF) kernel, infinitely smooth - kernel_exponential - Exponential kernel (Matern ν=0.5), rough - kernel_matern - Matern family with smoothness parameter ν - kernel_brownian - Brownian motion covariance (1D only) - kernel_linear - Linear kernel - kernel_polynomial - Polynomial kernel - kernel_whitenoise - Independent noise at each point - kernel_periodic - Periodic kernel (1D only) - kernel_add - Combine kernels by addition - kernel_mult - Combine kernels by multiplication

Depth-Based Medians and Trimmed Means

Use the unified functions with a method parameter:

# Median (curve with maximum depth)
median(fd)                          # default: FM method
median(fd, method = "mode")         # modal depth-based median

# Trimmed mean (mean of deepest curves)
trimmed(fd, trim = 0.1)             # default: FM method
trimmed(fd, trim = 0.1, method = "RP")  # RP depth-based trimmed mean

# Trimmed variance
trimvar(fd, trim = 0.1)             # default: FM method
trimvar(fd, trim = 0.1, method = "mode")

Visualization

Group Comparison

Clustering

Curve Registration

Feature Extraction

2D Functional Data (Surfaces)

fdars supports 2D functional data (surfaces/images). The following functions have full 2D support:

Category Functions
Depth depth (methods: FM, mode, RP, RT, FSD, KFSD)
Distance metric.lp, metric.hausdorff, semimetric.pca, semimetric.deriv
Statistics mean, var, sd, cov, gmed, deriv
Centrality median, trimmed, trimvar (all methods except BD, MBD, RPD)
Regression fregre.np (nonparametric)
Visualization plot (heatmap + contours)

Note: Band depths (BD, MBD), RPD, and DTW do not support 2D data.

# Create 2D functional data (e.g., 10 surfaces on a 20x30 grid)
n <- 10
m1 <- 20
m2 <- 30
s <- seq(0, 1, length.out = m1)
t <- seq(0, 1, length.out = m2)

# Generate surfaces: f(s,t) = sin(2*pi*s) * cos(2*pi*t) + noise
X <- array(0, dim = c(n, m1, m2))
for (i in 1:n) {
  for (si in 1:m1) {
    for (ti in 1:m2) {
      X[i, si, ti] <- sin(2*pi*s[si]) * cos(2*pi*t[ti]) + rnorm(1, sd = 0.1)
    }
  }
}

fd2d <- fdata(X, argvals = list(s, t), fdata2d = TRUE)

# All these work with 2D data:
mean_surface <- mean(fd2d)           # Mean surface
var_surface <- var(fd2d)             # Pointwise variance
depths <- depth(fd2d)                # Depth values
median_surface <- median(fd2d)       # Depth-based median
gmed_surface <- gmed(fd2d)           # Geometric median

# Plot 2D data (heatmap + contours)
plot(fd2d)

Converting DataFrames to 2D fdata

Use df_to_fdata2d() to convert long-format DataFrames to 2D functional data:

# DataFrame structure: id column, s-index column, t-value columns
df <- data.frame(
  id = rep(c("surf1", "surf2"), each = 5),
  s = rep(1:5, 2),
  t1 = rnorm(10), t2 = rnorm(10), t3 = rnorm(10)
)

# Convert to 2D fdata
fd2d <- df_to_fdata2d(df, id_col = 1, s_col = 2)

# With metadata (must have one row per surface)
meta <- data.frame(group = c("A", "B"), value = c(1.5, 2.3))
fd2d <- df_to_fdata2d(df, id_col = 1, s_col = 2, metadata = meta)

Examples

License

MIT

Author

Simon Mueller

Acknowledgments

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.