The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
An example of how to estimate the dynamics of a latent factor. Assume that indicators are observed at two time points and you want to estimate how the period-2 value depends on the period-1 value. You need two things:
f₂ = α + β·f₁ + ε.factorana supports this with a two-stage workflow, and two helper functions make the setup a few lines:
define_dynamic_measurement() builds
the Stage 1 measurement model: a 2-factor independent system (or
k-factor, for k periods) with loadings and residual sigmas tied across
periods via equality constraints, but measurement intercepts left
period-specific.build_dynamic_previous_stage()
converts the Stage 1 estimation result into a
previous_stage object for Stage 2, carrying the anchor
period’s intercepts into every factor slot. This anchors the measurement
level under E[f₁] = 0 and lets the observed mean shift
between periods identify the structural intercept α.This vignette walks through a simulated example where α,
β, and σ²_ε are known, shows the recovery, and
explains one subtlety of the workflow (why the anchor-period intercepts
are the right ones to carry into Stage 2).
One latent construct, three linear indicators per period, two periods. The DGP:
\[ f_1 \sim N(0,\sigma^2_{f_1}), \qquad f_2 = \alpha + \beta\,f_1 + \varepsilon, \quad \varepsilon \sim N(0,\sigma^2_\varepsilon) \]
\[ Y_{t,m} = \tau_m + \lambda_m\,f_t + u_{t,m}, \quad u_{t,m} \sim N(0,\sigma_m^2) \]
with measurement parameters τ_m, λ_m,
σ_m shared across the two periods.
library(factorana)
set.seed(41)
n <- 1500
# Structural parameters (what we want to recover)
true_var_f1 <- 1.0
true_alpha <- 0.4
true_beta <- 0.6
true_sigma_e2 <- 0.5
# Shared measurement parameters
item_int <- c(1.5, 1.0, 0.8)
item_load <- c(1.0, 0.9, 1.1) # first loading fixed to 1 in the model
item_sigma <- c(0.7, 0.75, 0.65)
f1 <- rnorm(n, 0, sqrt(true_var_f1))
eps <- rnorm(n, 0, sqrt(true_sigma_e2))
f2 <- true_alpha + true_beta * f1 + eps
gen_Y <- function(f, i) {
item_int[i] + item_load[i] * f + rnorm(length(f), 0, item_sigma[i])
}
dat <- data.frame(
intercept = 1,
eval = 1L,
Y_t1_m1 = gen_Y(f1, 1), Y_t1_m2 = gen_Y(f1, 2), Y_t1_m3 = gen_Y(f1, 3),
Y_t2_m1 = gen_Y(f2, 1), Y_t2_m2 = gen_Y(f2, 2), Y_t2_m3 = gen_Y(f2, 3)
)The wide-format data has columns named
<prefix><item>: Y_t1_m1,
Y_t1_m2, Y_t1_m3 for wave 1, and
Y_t2_m1, … for wave 2.
define_dynamic_measurement()One function call builds the measurement model: a 2-factor independent system (one factor per period) with loadings and sigmas tied across periods, intercepts left free per period.
dyn <- define_dynamic_measurement(
data = dat,
items = c("m1", "m2", "m3"),
period_prefixes = c("Y_t1_", "Y_t2_"),
model_type = "linear",
evaluation_indicator = "eval"
)
# The wrapper generates 5 equality constraints: 2 for loadings (items
# m2, m3; item m1's loading is fixed to 1 on its factor slot) and 3
# for sigmas.
length(dyn$equality_constraints)
#> [1] 5Estimate Stage 1 with estimate_model_rcpp() exactly as
for any model system:
ctrl <- define_estimation_control(n_quad_points = 8, num_cores = 1)
result_stage1 <- estimate_model_rcpp(
model_system = dyn$model_system,
data = dat,
control = ctrl,
optimizer = "nlminb",
parallel = FALSE,
verbose = FALSE
)
result_stage1$convergence
#> [1] 0Inspect the intercepts: the wave-1 intercepts should match the DGP
τ_m (because E[f₁] = 0 by convention). The
wave-2 intercepts are shifted by λ_m · E[f₂], which is the
mean-drift artefact we are about to discard.
est <- result_stage1$estimates
tab <- data.frame(
m = 1:3,
DGP_tau = item_int,
wave_1 = round(c(est["Y_t1_m1_intercept"],
est["Y_t1_m2_intercept"],
est["Y_t1_m3_intercept"]), 3),
wave_2 = round(c(est["Y_t2_m1_intercept"],
est["Y_t2_m2_intercept"],
est["Y_t2_m3_intercept"]), 3)
)
knitr::kable(tab, row.names = FALSE)| m | DGP_tau | wave_1 | wave_2 |
|---|---|---|---|
| 1 | 1.5 | 1.488 | 1.905 |
| 2 | 1.0 | 1.000 | 1.332 |
| 3 | 0.8 | 0.778 | 1.234 |
A natural alternative is to stack period-1 and period-2 rows into one
long-format dataset and fit a single-factor CFA. The resulting
intercepts would be the pooled means of Y across
both periods, biased by λ_m · E[f₂]/2 (because the pooled
mean averages the period-1 and period-2 factor means). Plugging those
into Stage 2 under-estimates α by roughly
E[f₂]/2.
The wrapper avoids that: Stage 1 estimates period-specific
intercepts, then build_dynamic_previous_stage() carries
only the anchor period’s intercepts into Stage 2.
build_dynamic_previous_stage() +
SE_lineardummy <- build_dynamic_previous_stage(
dyn = dyn,
stage1_result = result_stage1,
data = dat,
anchor_period = 1L
)
fm_stage2 <- define_factor_model(
n_factors = 2,
n_types = 1,
factor_structure = "SE_linear"
)
ms_stage2 <- define_model_system(
components = list(), # measurement components prepended from previous_stage
factor = fm_stage2,
previous_stage = dummy
)
#> Two-stage SE estimation: Stage 1 (independent) -> Stage 2 (SE_linear)
#> Measurement parameters (loadings, thresholds) will be fixed from Stage 1
#> Factor distribution parameters will use Stage 2 structure
#> Fixing 16 measurement parameters, ignoring 2 factor distribution parameters
init_s2 <- initialize_parameters(ms_stage2, dat, verbose = FALSE)
init_s2$init_params["factor_var_1"] <- unname(dummy$estimates["factor_var_1"])
init_s2$init_params["se_intercept"] <- 0.0
init_s2$init_params["se_linear_1"] <- 0.5
init_s2$init_params["se_residual_var"] <- 0.5
result_stage2 <- estimate_model_rcpp(
model_system = ms_stage2,
data = dat,
init_params = init_s2$init_params,
control = ctrl,
optimizer = "nlminb",
parallel = FALSE,
verbose = FALSE
)
result_stage2$convergence
#> [1] 0est <- result_stage2$estimates
se <- result_stage2$std_errors
ps <- c("factor_var_1", "se_intercept", "se_linear_1", "se_residual_var")
tab <- data.frame(
param = ps,
true = c(true_var_f1, true_alpha, true_beta, true_sigma_e2),
est = round(unname(est[ps]), 4),
se = round(unname(se[ps]), 4)
)
tab$z <- round((tab$est - tab$true) / tab$se, 2)
knitr::kable(tab, row.names = FALSE)| param | true | est | se | z |
|---|---|---|---|---|
| factor_var_1 | 1.0 | 0.9104 | 0.0325 | -2.76 |
| se_intercept | 0.4 | 0.4036 | 0.0217 | 0.17 |
| se_linear_1 | 0.6 | 0.5880 | 0.0236 | -0.51 |
| se_residual_var | 0.5 | 0.5051 | 0.0261 | 0.20 |
se_intercept (the structural α) recovers essentially
exactly, which is the whole point of the workflow. The slope β, residual
variance σ²_ε, and input factor variance σ²_f₁ also recover within their
standard errors.
period_prefixes
can have length 3, 4, or more. The wrapper builds a k-factor independent
Stage 1 with all k · (n_items - 1) loadings and
k · n_items sigmas tied across periods. For Stage 2 SE
modelling, pick any two factors (e.g., periods t-1 and
t) and build a 2-factor SE model with
previous_stage.model_type = "oprobit" and n_categories = K.
The wrapper ties the K-1 cutpoints of each item across
periods (in place of sigma).covariates = c("intercept", "x1", ...). Covariate
coefficients on items are estimated per-period in Stage 1 (not tied by
this wrapper); the anchor-period values are carried into Stage 2.?define_factor_model:
factor_structure = "SE_quadratic" adds a quadratic term
γ·f_1² to the structural equation (trap/threshold
dynamics).?define_model_system: the underlying
equality_constraints argument, used by the wrapper.vignette("measurement_system", package = "factorana"):
basics of measurement-system estimation.vignette("roy_model", package = "factorana"): a
different structural setting (discrete sector choice plus continuous
outcomes sharing a latent ability).These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.