The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Mathematical Definitions of Sufficient Statistics

Overview

The redeem package models the intensities of interaction formation and dissolution using a log-linear formulation. The intensity (rate) for a dyad \((i,j)\) at time \(t\) is:

\[\lambda_{i,j}(t) = \exp(s_{i,j}(\mathscr{H}_t)^\top \beta + \alpha_i + \alpha_j + f(t, \gamma))\]

where:

This vignette provides precise mathematical definitions for all sufficient network statistics implemented in the package, as well as a guide for adding custom statistics.

The complete list of available formula terms is documented in the redeem_terms reference manual page. For model estimation, see the help pages for the models via dem and rem.


Statistic Transformations

Each sufficient statistic \(s_{i,j}(\mathscr{H}_t)\) is defined as a transformed count \(f(N(t))\), where \(N(t)\) is a raw network statistic. The package supports five standard transformations \(f(\cdot)\):

Transformation Mathematical Definition
identity \(f(s) = s\)
log \(f(s) = \log(1 + s)\)
recip \(f(s) = 1/(1+s)\)
bin \(f(s) = \mathbb{I}(s > 0)\)
sig \(f(s) = \frac{s}{s + K}\)

Endogenous Network Statistics

These statistics capture structural dependencies within the network evolution.

1. Intercept (Intercept / intercept)

A constant term representing the baseline log-intensity. \[s_{i,j}(\mathscr{H}_t) = 1\]

2. Inertia (inertia / number_interaction)

Counts how many times the dyad \((i,j)\) has initiated an interaction in the past. \[s_{i,j}(\mathscr{H}_t) = f(N_{i,j}(t))\] where \(N_{i,j}(t) = \sum_{k: t_k < t} \mathbb{I}(i_k = i, j_k = j)\) (or the windowed version \(N_{i,j}^w(t) = \sum_{k: t-w < t_k < t} \mathbb{I}(i_k = i, j_k = j)\)).

3. Reciprocity (reciprocity)

Models the tendency to reciprocate past interactions (directed only). \[s_{i,j}(\mathscr{H}_t) = f(N_{j,i}(t))\]

4. Duration (duration / current_interaction)

Measures the dependency on the time since the current interaction started (DEM only). \[s_{i,j}(\mathscr{H}_t) = \begin{cases} f(t - t_{\text{start, } i,j}) & \text{if dyad } (i,j) \text{ is interacting at } t \\ 0 & \text{otherwise} \end{cases}\] where \(t_{\text{start, } i,j}\) is the timestamp of the last formation event.

5. Participation Shifts (P-shifts)

P-shifts capture the sequential dependencies between consecutive events (REM only). Let the preceding event in the stream be \(A \to B\). For a candidate event \(i \to j\) at time \(t\):


Triadic Closure and Shared Partners

Triadic statistics capture structural closure. They can be calculated over active edges (designated as current_ in DEM) or historical event existence (designated as general_ in REM/DEM). Let \(\mathcal{A}_t\) represent the set of interacting dyads (for current_) or previously interacted dyads (for general_) at time \(t\).

Common Partners (general_common_partners / current_common_partners)

Counts the number of third-party actors \(k\) sharing a connection of a specified type with both \(i\) and \(j\):

Triangles (general_triangle / current_triangle)

Similar to common partners, but only non-zero if the focal dyad itself is active (directed only):


Degree and Centrality Statistics

These statistics capture actor-level activity or popularity. Let \(\mathcal{D}_t\) be the network state at time \(t\), and \(d_{i, \text{out}}(t)\), \(d_{i, \text{in}}(t)\) represent the out-degree and in-degree of \(i\) in \(\mathcal{D}_t\). Let \(c_{i, \text{out}}(t)\), \(c_{i, \text{in}}(t)\) be the total out-events and in-events involving \(i\).

Degree Statistics (general_degree_out_sender, etc.)

Count Statistics (general_count_out_sender, etc.)

Count statistics are identical to degree statistics but use total interaction counts (\(c\)) rather than binary degrees (\(d\)):

All of these degree and count statistics can be conveniently specified in model formulas using the degree() (or degrees()) and count() helper functions:


Exogenous Statistics

1. Dyadic Covariate (dyadic_cov)

\[s_{i,j}(\mathscr{H}_t) = f(X_{i,j}(t))\] where \(X(t)\) is an \(N \times N\) matrix.

2. Monadic Covariate (monadic_cov)

\[s_{i,j}(\mathscr{H}_t) = g(x_i(t), x_j(t))\] where \(x\) is a monadic vector and \(g(\cdot)\) is a user-defined function.


Developer Guide: Adding Custom Sufficient Statistics

To add a new sufficient statistic (e.g., my_stat) to the package, follow this two-step process:

Step 1: Implement the Update Logic in C++

Define a C++ function in the src/ directory (e.g., in a new file or in src/sufficient_statistics.cpp) with the signature ValidateFunction. This function computes the change to the statistic state matrix when an event occurs:

#include "redeem/sufficient_statistics.h"
#include "redeem/extension_api.hpp"

arma::uvec stat_my_stat(
    Data_DEM &object, 
    arma::mat &data, 
    unsigned int &from, 
    unsigned int &to, 
    unsigned int &number_event, 
    unsigned int col_number, 
    std::string transformation, 
    unsigned int type
) {
  if (from == 0 || to == 0) return arma::uvec();
  
  // 1. Calculate which dyad indices are affected and the change value (val)
  double val = (type == 1) ? 1.0 : -1.0; 
  arma::uvec affected_indices = { object.current_stats.find_from_to(from, to) };
  
  // 2. Apply the update using the helper (handles transformations internally)
  apply_update(object, affected_indices, col_number, val, transformation, data);
  
  // 3. Return the indices of modified dyads
  return affected_indices;
}

// 4. Register the term in the global C++ Registry
TERM_REGISTER("my_stat", stat_my_stat);

Step 2: Write the R Initializer Function

In R/init_terms.R, create an initialization function named InitRedeemTerm.my_stat. This function validates user-provided arguments and sets up initial values:

#' @keywords internal
InitRedeemTerm.my_stat <- function(arglist, n_nodes, model_type, directed, ...) {
  # Validate arguments against expected types and models
  arglist <- check.RedeemTerm(arglist,
    model_type = model_type, directed = directed,
    expected = list(transformation = c("identity", "log", "recip", "bin", "sig"), K = "numeric"),
    defaults = list(transformation = "identity", K = 1)
  )
  
  # Define initial statistic values at time 0 (optional)
  eval_at_zero <- matrix(0, n_nodes, n_nodes)
  
  # Return the configuration list
  list(
    base_name = "my_stat",
    transformation = arglist$transformation,
    eval_at_zero = eval_at_zero,
    event_stream = arglist$event_stream
  )
}

Once these steps are completed, run devtools::document() to update namespaces and documentation, and build the package. The new term is now ready for use in dem() and rem() by specifying it directly in the formula (e.g., ~ my_stat(transformation = "log")).

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.