The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Type: Package
Title: Labeled Optimal Partitioning
Version: 2024.6.19
Author: Toby Dylan Hocking
Maintainer: Toby Dylan Hocking <toby.hocking@r-project.org>
Description: Change-point detection algorithm with label constraints and a penalty for each change outside of labels. Read TD Hocking, A Srivastava (2023) <doi:10.1007/s00180-022-01238-z> for details.
License: GPL-3
LinkingTo: Rcpp
URL: https://github.com/tdhock/LOPART
BugReports: https://github.com/tdhock/LOPART/issues
Imports: data.table, Rcpp
Suggests: ggplot2, testthat
RoxygenNote: 7.3.1
NeedsCompilation: yes
Packaged: 2024-06-20 11:31:04 UTC; tdhock
Repository: CRAN
Date/Publication: 2024-06-20 21:30:07 UTC

Labeled Optimal PARTitioning

Description

Compute an optimal segmentation (change in Gaussian mean model, square loss), which is consistent with the given labels, and with a penalty for each changepoint outside of labeled regions.

Usage

LOPART(
  x,
  labels,
  penalty_unlabeled,
  n_updates = length(x),
  penalty_labeled = penalty_unlabeled
)

Arguments

x

numeric vector of data to fit a Gaussian mean model.

labels

data frame with at least three columns: start, end, changes. start/end should be indices of x, from 1 to length(x). changes should be either 0 or 1. The prediced changepoints are guaranteed to be consistent with these labels.

penalty_unlabeled

non-negative penalty constant (larger for fewer changes, smaller for more changes). penalty=0 means a change in every unlabeled region, penalty=Inf means no changes in unlabeled regions.

n_updates

how many dynamic programming updates to compute? Must be at least 1 and at most length(x).

penalty_labeled

non-negative penalty constant to use for changes in positive labels.

Details

Provides a high-level interface to LOPART_interface R function and LOPART C code.

Value

list with named elements, all of which are data tables. loss has one row with loss/cost values. cost is the output from LOPART_interface. changes has one row for each predicted changepoint (e.g. change=1.5 means a change between data points 1 and 2). segments has one row for each segment.

Author(s)

Toby Dylan Hocking

Examples

set.seed(2)
library(data.table)
signal <- c(
  rnorm(25, mean = 10),
  rnorm(25, mean = 7),
  rnorm(25, mean = 8),
  rnorm(25, mean = 5))
#outliers
signal[86] <- 10
labels.dt <- data.table(
  start = c(20, 45, 80),
  end = c(30, 55, 90),
  changes = c(1, 1, 0))
signal.dt <- data.table(
  signal,
  position=seq_along(signal))
label.colors <- c(
  "1"="#ff7d7d",
  "0"="#f6c48f")
sig.color <- "grey50"
if(require(ggplot2)){
  gg.data <- ggplot()+
    geom_rect(aes(
      xmin=start, xmax=end,
      fill=paste(changes),
      ymin=-Inf, ymax=Inf),
      alpha=0.5,
      data=labels.dt)+
    geom_point(aes(
      position, signal),
      color=sig.color,
      data=signal.dt)+
    scale_x_continuous(
      "position",
      breaks=seq(0, 100, by=10))+
    scale_fill_manual("label", values=label.colors)+
    theme_bw()+
    theme(panel.spacing=grid::unit(0, "lines"))
  print(gg.data)
}

label.list <- list(
  OPART=labels.dt[0],
  LOPART=labels.dt)
seg.dt.list <- list()
change.dt.list <- list()
cost.dt.list <- list()
for(model.name in names(label.list)){
  label.dt <- data.table(label.list[[model.name]])
  fit <- LOPART::LOPART(signal, label.dt, 10)
  Algorithm <- factor(model.name, names(label.list))
  tau.dt <- fit$cost[, .(
    cost_candidates,
    tau=0:(.N-1),
    change=seq_along(cost_candidates)-0.5
  )]
  cost.dt.list[[model.name]] <- data.table(Algorithm, tau.dt)
  seg.dt.list[[model.name]] <- data.table(Algorithm, fit$segments)
  change.dt.list[[model.name]] <- data.table(Algorithm, fit$changes)
}
seg.dt <- do.call(rbind, seg.dt.list)
change.dt <- do.call(rbind, change.dt.list)
cost.dt <- do.call(rbind, cost.dt.list)

algo.sizes <- c(
  OPART=1,
  LOPART=0.5)
algo.colors <- c(
  OPART="deepskyblue",
  LOPART="black")
algo.shapes <- c(
  OPART=1,
  LOPART=2)
if(require(ggplot2)){
  gg.data+
    scale_size_manual(values=algo.sizes)+
    scale_color_manual(values=algo.colors)+      
    geom_vline(aes(
      xintercept=change,
      size=Algorithm,
      color=Algorithm),
      data=change.dt)+
    geom_segment(aes(
      start-0.5, mean,
      size=Algorithm,
      color=Algorithm,
      xend=end+0.5, yend=mean),
      data=seg.dt)
}

if(require(ggplot2)){
  ggplot()+
    geom_rect(aes(
      xmin=start, xmax=end,
      fill=paste(changes),
      ymin=-Inf, ymax=Inf),
      alpha=0.5,
      data=labels.dt)+
    scale_fill_manual("label", values=label.colors)+
    theme_bw()+
    theme(panel.spacing=grid::unit(0, "lines"))+
    scale_x_continuous(
      "position",
      breaks=seq(0, 100, by=10))+
    geom_point(aes(
      change, cost_candidates,
      color=Algorithm, shape=Algorithm),
      data=cost.dt)+
    scale_color_manual(values=algo.colors)+      
    scale_shape_manual(values=algo.shapes)
}

abbrev.vec <- c(
  data="data and models",
  cost="cost of last change")
yfac <- function(l){
  factor(abbrev.vec[[l]], abbrev.vec)
}
COST <- function(dt){
  data.table(y.var=yfac("cost"), dt)
}
DATA <- function(dt){
  data.table(y.var=yfac("data"), dt)
}
if(require(ggplot2)){
  ggplot()+
    geom_rect(aes(
      xmin=start, xmax=end,
      fill=paste(changes),
      ymin=-Inf, ymax=Inf),
      alpha=0.5,
      data=labels.dt)+
    scale_fill_manual("label", values=label.colors)+
    theme_bw()+
    theme(panel.spacing=grid::unit(0, "lines"))+
    facet_grid(y.var ~ ., scales="free")+
    geom_vline(aes(
      xintercept=change,
      size=Algorithm,
      color=Algorithm),
      data=change.dt)+
    geom_segment(aes(
      start-0.5, mean,
      size=Algorithm,
      color=Algorithm,
      xend=end+0.5, yend=mean),
      data=DATA(seg.dt))+
    geom_point(aes(
      position, signal),
      color=sig.color,
      shape=1,
      data=DATA(signal.dt))+
    scale_size_manual(values=algo.sizes)+
    scale_color_manual(values=algo.colors)+
    scale_shape_manual(values=algo.shapes)+
    ylab("")+
    scale_x_continuous(
      "position",
      breaks=seq(0, 100, by=10))+
    geom_point(aes(
      change, cost_candidates,
      color=Algorithm, shape=Algorithm),
      data=COST(cost.dt))
}


Labeled Optimal Partitioning interface

Description

Low-level interface to LOPART C code

Usage

LOPART_interface(
  input_data,
  input_label_start,
  input_label_end,
  input_label_changes,
  n_updates,
  penalty_unlabeled,
  penalty_labeled = 0
)

Arguments

input_data

numeric vector of N data to segment

input_label_start

integer vector of label start positions in 0, ..., N-2

input_label_end

integer vector of label end positions in 1, ..., N-1

input_label_changes

integer vector of 0/1, number of labeled changes

n_updates

number of dynamic programming updates to perform, usually should be number of input_data N, but can be less if you want to analyze/plot the cost/candidates at previous data.

penalty_unlabeled

non-negative numeric scalar (bigger for fewer changes in unlabeled regions, smaller for more changes)

penalty_labeled

non-negative numeric scalar (penalty for each change in a positive label).

Details

Avoid using this function and instead use the LOPART function.

Value

data frame with four columns: cost_candidates is the cost of each last segment start considered (from 1 to N) for the computation of the optimal cost up to the last data point (Inf means infeasible); cost_optimal is the optimal cost vector computed using dynamic programming; mean is the last segment mean of the optimal model ending at that data point; last_change is the optimal changepoints (negative numbers are not used).

Author(s)

Toby Dylan Hocking

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.