| Type: | Package |
| Title: | STATIS and STATIS DUAL Multivariate Methods |
| Version: | 1.0.1 |
| Description: | Provides tools for the integration and exploration of data tables measured on the same set of observational units. The package includes methods to assess similarities among tables, extract common patterns across variable blocks, and create visual summaries that highlight shared structures in multiblock data. |
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.3 |
| Imports: | ggforce, ggplot2, ggrepel |
| Depends: | R (≥ 4.1) |
| Language: | en-US |
| LazyData: | true |
| Suggests: | knitr, rmarkdown |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| Packaged: | 2025-11-28 20:49:01 UTC; promidat20 |
| Author: | Oldemar Rodriguez [aut, cre], Alejandro Vargas [ctb, prg] |
| Maintainer: | Oldemar Rodriguez <oldemar.rodriguez@ucr.ac.cr> |
| Repository: | CRAN |
| Date/Publication: | 2025-12-03 21:20:02 UTC |
STATIS and STATIS Dual Multivariate Methods
Description
STATIS is a multivariate technique developed in France in the 1970s by L'Hermier des Plantes and Escoufier to analyze multiple data tables. This package implements methods for the joint analysis of multiple data tables that share the same set of observations. It includes functions for performing interstructure and intrastructure analyses, as well as graphical representations of compromise structures and of the variability shared across different groups of variables.
Details
| Package: | statisR |
| Type: | Package |
| Version: | 1.0.1 |
| Date: | 2025-11-28 |
| License: | GPL (>=2) |
Author(s)
Oldemar Rodriguez Rojas
Maintainer: Oldemar Rodriguez Rojas <oldemar.rodriguez@ucr.ac.cr>
References
Abdi, H., & Valentin, D. (2007). The STATIS method. Encyclopedia of measurement and statistics, 955-962.
González, J., & Rodríguez, O. (1995). Algoritmo e implementación del método Statis. In IX Simposio Métodos Matemáticos Aplicados a las Ciencias. UCR-ITCR, Turrialba.
Elizondo, W. C., & Varela, J. G. (1998). Statis dual: software y análisis de datos reales. Revista de Matemática: Teoría y Aplicaciones, 5(2), 149-162.
Abdi, H. (2007). RV coefficient and congruence coefficient. Encyclopedia of measurement and statistics, 849(853), 92.
Abdi, H. (2007). Singular value decomposition (SVD) and generalized singular value decomposition. Encyclopedia of measurement and statistics, 907(912), 44.
Xiang, G. (2007). Fast algorithms for computing statistics under interval uncertainty, with applications to computer science and to electrical and computer engineering. The University of Texas at El Paso.
McHale, D., & Lavit, C. (1990). Analyse Conjointe de Tableaux Quantitatifs. Biometrics, 46(2), 542.
Physicochemical Variables Dataset for STATIS Analysis
Description
This dataset belongs to a project called “Development and application of effective, low-cost methods for the biological monitoring of Costa Rican rivers” by the National University, and contains measurements of various physicochemical variables collected across several sampling sites. It is intended to be used in the examples and demonstrations of the main functions of the package, particularly those related to multivariate analysis and STATIS methodology.
Usage
data(STATIS_TABLE1)
Format
A data frame with 14 columns and multiple rows (one per sampling site):
- NIT
Total nitrogen level.
- FOS
Phosphorus level.
- CAL
Calcium level.
- STO
Sodium level.
- PH
pH measurement.
- MN
Manganese concentration.
- ZN
Zinc concentration.
- SS
Suspended solids.
- ALC
Alkalinity.
- CL
Chlorine level.
- CAU
Exchangeable calcium or equivalent measurement.
- DBO
Biological oxygen demand.
- POR
Porosity or related percentage.
Examples
data(STATIS_TABLE1)
head(STATIS_TABLE1)
Physicochemical Variables Dataset for STATIS Analysis
Description
This dataset belongs to a project called “Development and application of effective, low-cost methods for the biological monitoring of Costa Rican rivers” by the National University, and contains measurements of various physicochemical variables collected across several sampling sites. It is intended to be used in the examples and demonstrations of the main functions of the package, particularly those related to multivariate analysis and STATIS methodology.
Usage
data(STATIS_TABLE2)
Format
A data frame with 14 columns and multiple rows (one per sampling site):
- NIT
Total nitrogen level.
- FOS
Phosphorus level.
- CAL
Calcium level.
- STO
Sodium level.
- PH
pH measurement.
- MN
Manganese concentration.
- ZN
Zinc concentration.
- SS
Suspended solids.
- ALC
Alkalinity.
- CL
Chlorine level.
- CAU
Exchangeable calcium or equivalent measurement.
- DBO
Biological oxygen demand.
- POR
Porosity or related percentage.
Examples
data(STATIS_TABLE2)
head(STATIS_TABLE2)
Physicochemical Variables Dataset for STATIS Analysis
Description
This dataset belongs to a project called “Development and application of effective, low-cost methods for the biological monitoring of Costa Rican rivers” by the National University, and contains measurements of various physicochemical variables collected across several sampling sites. It is intended to be used in the examples and demonstrations of the main functions of the package, particularly those related to multivariate analysis and STATIS methodology.
Usage
data(STATIS_TABLE3)
Format
A data frame with 14 columns and multiple rows (one per sampling site):
- NIT
Total nitrogen level.
- FOS
Phosphorus level.
- CAL
Calcium level.
- STO
Sodium level.
- PH
pH measurement.
- MN
Manganese concentration.
- ZN
Zinc concentration.
- SS
Suspended solids.
- ALC
Alkalinity.
- CL
Chlorine level.
- CAU
Exchangeable calcium or equivalent measurement.
- DBO
Biological oxygen demand.
- POR
Porosity or related percentage.
Examples
data(STATIS_TABLE3)
head(STATIS_TABLE3)
Physicochemical Variables Dataset for STATIS Analysis
Description
This dataset belongs to a project called “Development and application of effective, low-cost methods for the biological monitoring of Costa Rican rivers” by the National University, and contains measurements of various physicochemical variables collected across several sampling sites. It is intended to be used in the examples and demonstrations of the main functions of the package, particularly those related to multivariate analysis and STATIS methodology.
Usage
data(STATIS_TABLE4)
Format
A data frame with 14 columns and multiple rows (one per sampling site):
- NIT
Total nitrogen level.
- FOS
Phosphorus level.
- CAL
Calcium level.
- STO
Sodium level.
- PH
pH measurement.
- MN
Manganese concentration.
- ZN
Zinc concentration.
- SS
Suspended solids.
- ALC
Alkalinity.
- CL
Chlorine level.
- CAU
Exchangeable calcium or equivalent measurement.
- DBO
Biological oxygen demand.
- POR
Porosity or related percentage.
Examples
data(STATIS_TABLE4)
head(STATIS_TABLE4)
Physicochemical Quality Data (Tuis5_95)
Description
This dataset contains physicochemical measurements collected from a Sugarcane Fertilizer experiment in Costa Rica. The values represent indicators measured during a monitoring campaign. The dataset is useful for illustrating multivariate methods, STATIS analyses, or environmental data exploration workflows.
Usage
data(Tuis5_95)
Format
A data frame with 10 observations and 19 variables:
- Ph
pH value of the sample.
- Temp
Temperature (°C).
- Na
Sodium concentration.
- Ka
Potassium concentration.
- Ca
Calcium concentration.
- Mg
Magnesium concentration.
- Si02
Silica concentration.
- OD
Dissolved oxygen.
- DBO
Biochemical oxygen demand (BOD).
- SD
Dissolved solids.
- ST
Total solids.
- PO4
Phosphate concentration.
- Cl
Chloride concentration.
- NO3
Nitrate concentration.
- SO45
Sulfate concentration.
- HC03
Bicarbonate concentration.
- DT
Total hardness or related measurement.
- POD
Dissolved oxygen percentage or related measure.
- Cal
Calcium-related parameter (likely residual hardness).
Examples
data(Tuis5_95)
head(Tuis5_95)
Physicochemical Quality Data (Tuis5_96)
Description
This dataset contains physicochemical measurements collected from a Sugarcane Fertilizer experiment in Costa Rica. The values represent indicators measured during a monitoring campaign. The dataset is useful for illustrating multivariate methods, STATIS analyses, or environmental data exploration workflows.
Usage
data(Tuis5_96)
Format
A data frame with 12 observations and 19 variables:
- Ph
pH value of the water sample.
- Temp
Temperature (°C).
- Na
Sodium concentration.
- Ka
Potassium concentration.
- Ca
Calcium concentration.
- Mg
Magnesium concentration.
- Si02
Silica concentration.
- OD
Dissolved oxygen.
- DBO
Biochemical oxygen demand (BOD).
- SD
Dissolved solids.
- ST
Total solids.
- PO4
Phosphate concentration.
- Cl
Chloride concentration.
- NO3
Nitrate concentration.
- SO45
Sulfate concentration.
- HC03
Bicarbonate concentration.
- DT
Total hardness or related parameter.
- POD
Dissolved oxygen percentage or related measurement.
- Cal
Calcium-related parameter (e.g., residual hardness).
Examples
data(Tuis5_96)
head(Tuis5_96)
Physicochemical Quality Data (Tuis5_97)
Description
This dataset contains physicochemical measurements collected from a Sugarcane Fertilizer experiment in Costa Rica. The values represent indicators measured during a monitoring campaign. The dataset is useful for illustrating multivariate methods, STATIS analyses, or environmental data exploration workflows.
Usage
data(Tuis5_97)
Format
A data frame with 12 observations and 19 variables:
- Ph
pH value of the water sample.
- Temp
Temperature (°C).
- Na
Sodium concentration.
- Ka
Potassium concentration.
- Ca
Calcium concentration.
- Mg
Magnesium concentration.
- Si02
Silica concentration.
- OD
Dissolved oxygen.
- DBO
Biochemical oxygen demand (BOD).
- SD
Dissolved solids.
- ST
Total solids.
- PO4
Phosphate concentration.
- Cl
Chloride concentration.
- NO3
Nitrate concentration.
- SO45
Sulfate concentration.
- HC03
Bicarbonate concentration.
- DT
Total hardness or related parameter.
- POD
Dissolved oxygen percentage or related measure.
- Cal
Calcium-related parameter (e.g., residual hardness).
Examples
data(Tuis5_97)
head(Tuis5_97)
Physicochemical Quality Data (Tuis5_98)
Description
This dataset contains physicochemical measurements collected from a Sugarcane Fertilizer experiment in Costa Rica. The values represent indicators measured during a monitoring campaign. The dataset is useful for illustrating multivariate methods, STATIS analyses, or environmental data exploration workflows.
Usage
data(Tuis5_98)
Format
A data frame with 12 observations and 19 variables:
- Ph
pH value of the water sample.
- Temp
Temperature (°C).
- Na
Sodium concentration.
- Ka
Potassium concentration.
- Ca
Calcium concentration.
- Mg
Magnesium concentration.
- Si02
Silica concentration.
- OD
Dissolved oxygen.
- DBO
Biochemical oxygen demand (BOD).
- SD
Dissolved solids.
- ST
Total solids.
- PO4
Phosphate concentration.
- Cl
Chloride concentration.
- NO3
Nitrate concentration.
- SO45
Sulfate concentration.
- HC03
Bicarbonate concentration.
- DT
Total hardness or related parameter.
- POD
Dissolved oxygen percentage or related measurement.
- Cal
Calcium-related parameter (e.g., residual hardness).
Examples
data(Tuis5_98)
head(Tuis5_98)
Sensory Evaluation Data from Expert 1
Description
This dataset contains the ratings provided by Expert 1 for six wine samples. Each wine is evaluated according to three sensory attributes commonly used in descriptive analysis: fruity, woody, and coffee. The dataset is typically used in STATIS and multitable analyses to illustrate how different experts evaluate the same set of products.
Usage
data(expert1)
Format
A data frame with 6 rows (Wine1–Wine6) and 3 sensory attributes:
- fruity
Intensity of fruity aromas.
- woody
Intensity of woody/aged aromas.
- coffee
Perceived coffee-like notes.
References
Abdi, H., & Valentin, D. (2007). The STATIS method. Encyclopedia of measurement and statistics, 955-962.
Examples
data(expert1)
expert1
Sensory Evaluation Data from Expert 2
Description
This dataset contains the evaluations provided by Expert 2 for the same six wine samples. Unlike Expert 1, this expert uses four sensory descriptors: red_fruit, roasted, vanillin, and woody. The dataset demonstrates how experts may differ in terminology and profiling, and it is commonly used in STATIS, MFA, and other multitable comparison techniques.
Usage
data(expert2)
Format
A data frame with 6 rows (Wine1–Wine6) and 4 sensory attributes:
- red_fruit
Intensity of red fruit aromas.
- roasted
Intensity of roasted or toasted notes.
- vanillin
Perceived vanilla-related notes.
- woody
Intensity of woody/aged aromas.
References
Abdi, H., & Valentin, D. (2007). The STATIS method. Encyclopedia of measurement and statistics, 955-962.
Examples
data(expert2)
expert2
Sensory Evaluation Data from Expert 3
Description
This dataset contains the ratings given by Expert 3 for the same set of six wine samples. This expert evaluates wines using three sensory attributes: fruity, butter, and woody. The dataset is often used in multivariate and STATIS examples to highlight both agreement and divergence across panels of experts.
Usage
data(expert3)
Format
A data frame with 6 rows (Wine1–Wine6) and 3 sensory attributes:
- fruity
Intensity of fruity aromas.
- butter
Presence of buttery or lactic notes.
- woody
Intensity of woody/aged aromas.
References
Abdi, H., & Valentin, D. (2007). The STATIS method. Encyclopedia of measurement and statistics, 955-962.
Examples
data(expert3)
expert3
Plot a Correlation Circle (Unit Circle)
Description
This function generates a correlation circle plot from two-dimensional coordinates, commonly used in principal component analysis (PCA) or other multivariate methods.
Usage
## S3 method for class 'statis.circle'
plot(points, inertia = 100, labels = NULL, title = "")
Arguments
points |
A matrix or data frame with two numeric columns (x, y) representing the coordinates of the vectors. |
inertia |
A number between 0 and 100 representing the percentage of explained inertia. It is displayed in the title. |
labels |
A character vector with labels for the points (optional). If not specified, labels are left blank. |
title |
Optional text used as the main title of the plot. |
Details
Arrows are drawn from the origin to each point specified in points. A reference circle
with radius 1 is displayed. You can also show the percentage of explained inertia and point labels.
The inertia argument is flexible and can be passed as the second or third parameter
if argument names are omitted.
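To illustrate why the reference circle has radius 1, the sketch below computes variable-to-component correlations in base R; coordinates of this kind are what a correlation circle typically displays. It uses simulated data and `prcomp`, and it is only an illustration under those assumptions, not the package's own computation.

```r
# Minimal sketch (base R only, simulated data): coordinates for a correlation circle.
set.seed(1)
X <- matrix(rnorm(50 * 4), nrow = 50, ncol = 4,
            dimnames = list(NULL, c("v1", "v2", "v3", "v4")))
X[, 2] <- X[, 1] + 0.3 * X[, 2]   # make v2 correlated with v1

# Standardized PCA of the simulated table.
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Correlation of each original variable with the first two components.
points <- cor(X, pca$x[, 1:2])
colnames(points) <- c("x", "y")

# Length of each arrow; it cannot exceed 1 because the components are uncorrelated.
radius <- sqrt(rowSums(points^2))
print(round(points, 3))
print(all(radius <= 1 + 1e-12))
```

A matrix such as `points` has the two-column shape described for the `points` argument; variables whose arrows end close to the circle are well represented on the plotted axes.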
Value
A ggplot object with the generated plot.
See Also
Examples
data(expert1, expert2, expert3)
labels <- c("Expert 1", "Expert 2", "Expert 3")
# To select a specific table or row, set the corresponding parameters in the statis function.
res <- statis(list(expert1, expert2, expert3), table.labels = labels)
# Circle of correlations of all the tables
inter <- res$circle.inter
plot.statis.circle(inter$points, inter$inertia, inter$labels, inter$title)
# Circle of correlations of all variables evolution
intra <- res$circle.intra
plot.statis.circle(intra$points, intra$inertia, intra$labels, intra$title)
Bivariate PCA-style Scatter Plot
Description
This function generates a 2D scatter plot with support for multiple groups,
labels, arrows from the origin, reference circles, cross axes, and full style customization using ggplot2.
Usage
## S3 method for class 'statis.dual.circle'
plot(
points.list,
style.points = list(list(size = 3)),
style.circle = list(),
radius.circle = 1,
labels = "auto",
labels.style = NULL,
draw.labels = TRUE,
vars.direction = NULL,
style.vars = list(),
radius.vars = c(0.5, 1),
join.dots = FALSE,
style.join = list(),
base.colors = .base.colors,
axes = TRUE,
frame = TRUE,
hide.ticks = TRUE,
proportion = 1,
xlim = NULL,
ylim = NULL,
axes.xy = TRUE,
style.axes.xy = list(linewidth = 0.35, linetype = "dashed", color = "gray40"),
arrows.points = TRUE,
factor.arrow = 0.95,
style.arrows = list(color = "red", linewidth = 0.6, arrow = grid::arrow(length =
grid::unit(0.2, "cm")))
)
Arguments
points.list |
List of numeric objects (matrices, data.frames, or lists of vectors), each representing a group of 2D points. |
style.points |
List of |
style.circle |
List of styles for the reference circle (passed to |
radius.circle |
Radius (or vector of radii) to draw circles centered at the origin. Default value |
labels |
|
labels.style |
List of styles for the labels, passed to |
draw.labels |
Logical. If |
vars.direction |
Directions of projected variables. |
style.vars |
Style for projected variables. |
radius.vars |
Radii used to scale variable arrows. |
join.dots |
Logical or list. If |
style.join |
List of styles for connecting points (passed to |
base.colors |
Vector of base colors used for the groups. |
axes |
Logical. If |
frame |
Logical. Not directly used; may be reserved for future use. |
hide.ticks |
Logical. If |
proportion |
Fixed aspect ratio of the plot (to avoid distortion). |
xlim |
X-axis limits. |
ylim |
Y-axis limits. |
axes.xy |
Logical. If |
style.axes.xy |
List of styles for the XY cross axes (e.g., |
arrows.points |
Logical. If |
factor.arrow |
Factor to shorten the arrows. |
style.arrows |
List of styles for the arrows. |
Value
A ggplot object with the generated plot.
See Also
Examples
data(Tuis5_95, Tuis5_96, Tuis5_97, Tuis5_98)
labels <- c("95","96","97","98")
res <- statis.dual(list(Tuis5_95, Tuis5_96, Tuis5_97, Tuis5_98), labels.tables = labels)
# Interstructure
t <- ggplot2::ggtitle("Interstructure")
plot.statis.dual.circle(points.list = list(res$interstructure), labels = res$labels.tables) + t
# Circle of correlations (all variables)
t <- ggplot2::ggtitle("Correlation (all variables)")
plot.statis.dual.circle(list(res$supervariables), labels = row.names(res$supervariables)) + t
# Circle of correlations (variables selected)
selected.variables <- c("Ph", "Temp", "DBO", "ST", "PO4", "NO3", "POD", "Cal")
superv.sel.df <- select.super.variables(res$supervariables, res$vars.names, selected.variables)
t <- ggplot2::ggtitle("Correlation (selected variables)")
plot.statis.dual.circle(list(superv.sel.df), labels = row.names(superv.sel.df)) + t
Plot Variable Trajectories in STATIS DUAL
Description
Visualizes the evolution of one or more variables across the different tables in a STATIS DUAL analysis. Each trajectory represents the sequence of positions of a variable in the compromise space.
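One common way to obtain such trajectories is to project each table's correlation matrix onto the eigenvectors of the compromise. The base R sketch below follows that formulation on simulated data. It assumes uniform table weights for brevity, and the helper `trajectory` is hypothetical; the package's internal computation may differ.

```r
# Minimal base-R sketch on simulated data; uniform table weights are an
# assumption made here for brevity, not the package's weighting.
set.seed(3)
vars <- c("Ph", "Temp", "DBO")
tables <- lapply(1:4, function(k) {
  matrix(rnorm(12 * 3), 12, 3, dimnames = list(NULL, vars))
})
R <- lapply(tables, cor)                 # one correlation matrix per table
beta <- rep(1 / length(R), length(R))    # assumed uniform weights
Comp <- Reduce(`+`, Map(`*`, beta, R))   # compromise matrix

# Compromise space: first two eigenvectors and eigenvalues of the compromise.
e <- eigen(Comp, symmetric = TRUE)
U <- e$vectors[, 1:2]
lambda <- e$values[1:2]

# Position of every variable in the compromise (p x 2).
supervariables <- U %*% diag(sqrt(lambda))
rownames(supervariables) <- vars

# Position of every variable as seen by table k: R_k U diag(1 / sqrt(lambda)).
per.table <- lapply(R, function(Rk) Rk %*% U %*% diag(1 / sqrt(lambda)))

# Trajectory of one variable: its K positions stacked as a K x 2 matrix.
trajectory <- function(v) t(sapply(per.table, function(P) P[v, ]))
traj.Ph <- trajectory("Ph")
print(round(traj.Ph, 3))
```

Under this formulation the weighted average of a variable's positions across the tables coincides with its position in the compromise, which is why a trajectory can be read as the spread of a variable around its compromise point.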
Usage
## S3 method for class 'statis.dual.trajectories'
plot(
vars,
trajectories,
labels.tables,
.range = NULL,
style.line = list(linetype = 2, linewidth = 0.5, color = "orange"),
point.size = 3,
base.colors = c("red", "blue", "brown", "darkgreen", "purple", "orange", "cyan4",
"gold3", "black")
)
Arguments
vars |
Vector of variable names to plot (must match the names in |
trajectories |
List generated by |
labels.tables |
Vector of length K with the names or labels of the tables. |
.range |
List with axis limits: |
style.line |
List with line style for the trajectories. |
point.size |
Size of the points at each position along the trajectory. |
base.colors |
Vector of base colors to distinguish the variables. |
Value
A ggplot object showing the trajectories of the selected variables.
See Also
plot.statis.dual.circle, statis.dual
Examples
data(Tuis5_95, Tuis5_96, Tuis5_97, Tuis5_98)
labels <- c("95","96","97","98")
res <- statis.dual(list(Tuis5_95, Tuis5_96, Tuis5_97, Tuis5_98), labels.tables = labels)
# If you want to select some variables
vars.A <- c("Ph","ST","NO3")
t <- ggplot2::ggtitle(sprintf("Trajectories (%s)", paste(vars.A, collapse = ", ")))
plot.statis.dual.trajectories(vars.A, res$trajectories, res$labels.tables) + t
# If you want to select a specific variable
vars.1 <- "Temp"
t <- ggplot2::ggtitle(sprintf("Trajectory (%s)", vars.1))
plot.statis.dual.trajectories(vars.1, res$trajectories, res$labels.tables) + t
# All variables
t <- ggplot2::ggtitle("Trajectories (all variables)")
plot.statis.dual.trajectories(res$vars.names, res$trajectories, res$labels.tables) + t
Plot a Plane of Observations or 2D Projections
Description
This function generates a two-dimensional scatter plot with centered axes, useful for representing the results of multivariate analyses.
Usage
## S3 method for class 'statis.plane'
plot(points, inertia = 100, labels = NULL, title = "")
Arguments
points |
A matrix, data frame, or a length-2 vector with coordinates (x, y). Must have exactly two columns. |
inertia |
A number between 0 and 100 indicating the percentage of explained inertia (optional, defaults to 100). |
labels |
A character vector with labels for the points (optional). Must match the number of rows in |
title |
A text string to be used as the main title of the plot (optional). |
Details
The plot includes points, optional labels, Cartesian axes (centered at 0), and a title indicating the percentage of explained inertia.
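The percentage of explained inertia shown in the title is usually the share of the total variance carried by the two plotted axes. A minimal base R sketch, assuming the eigenvalues of a previous analysis are available (the values below are made up for illustration):

```r
# Hypothetical eigenvalues from a PCA-like decomposition (illustrative values only).
eigenvalues <- c(5, 3, 1, 0.5, 0.5)

# Percentage of inertia explained by each axis.
pct.axis <- 100 * eigenvalues / sum(eigenvalues)

# Inertia of the first plane (axes 1 and 2), the kind of value passed as `inertia`.
inertia <- sum(pct.axis[1:2])

print(pct.axis)
print(inertia)
```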
Value
A ggplot object with the generated plot.
See Also
Examples
data(expert1, expert2, expert3)
labels <- c("Expert 1", "Expert 2", "Expert 3")
# To select a specific table or row, set the corresponding parameters in the statis function.
res <- statis(list(expert1, expert2, expert3), table.labels = labels)
# Main Plane of Average Individuals
individuals <- res$plane.individuals
plot.statis.plane(individuals$points, individuals$inertia, individuals$labels, individuals$title)
# Main Plane of the Evolution of individuals
evolution <- res$plane.evolution
plot.statis.plane(evolution$points, evolution$inertia, evolution$labels, evolution$title)
Select and prepare a subset of variables from a supervariable matrix
Description
This function selects a predefined subset of variables (ETCal) from a
supervariable matrix (superv), checks dimension consistency, warns about
missing variables, and constructs a clean data frame containing the first two
coordinates typically used for PCA or STATIS correlation plots.
Usage
select.super.variables(superv, ET, ETCal)
Arguments
superv |
A numeric matrix or data frame where each row corresponds to a variable and columns represent coordinates ( |
ET |
A character vector containing the full list of expected variable names ( |
ETCal |
A character vector containing the subset of variables to be selected. |
Details
The function performs the following steps:
- Checks that the number of rows in superv matches the length of ET.
- Assigns the row names of superv using ET.
- Identifies whether any variables in ETCal are missing in superv; missing variables trigger a warning.
- Creates an ordered list of valid variables (ETCal_ok) based on their presence in superv.
- Extracts the corresponding rows from superv and constructs a data frame with columns x and y.
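The steps listed above can be sketched in base R as follows. The function name `select_super_sketch` is hypothetical and is not part of the package, the coordinates are made up, and keeping the valid names in the order requested is an assumption about what "ordered" means here.

```r
# Hypothetical re-implementation of the documented steps, for illustration only.
select_super_sketch <- function(superv, ET, ETCal) {
  superv <- as.matrix(superv)
  # Step 1: dimension check.
  if (nrow(superv) != length(ET)) {
    stop("nrow(superv) must equal length(ET)")
  }
  # Step 2: name the rows.
  rownames(superv) <- ET
  # Step 3: warn about requested variables that are absent.
  missing.vars <- setdiff(ETCal, ET)
  if (length(missing.vars) > 0) {
    warning("Variables not found: ", paste(missing.vars, collapse = ", "))
  }
  # Step 4: keep the valid names, in the order requested (assumed ordering).
  ETCal_ok <- ETCal[ETCal %in% ET]
  # Step 5: first two coordinates as a data frame.
  data.frame(x = superv[ETCal_ok, 1], y = superv[ETCal_ok, 2],
             row.names = ETCal_ok)
}

# Usage with made-up coordinates; "XYZ" is deliberately absent.
superv <- matrix(c(0.9, 0.1, -0.5, 0.2, 0.8, 0.4), ncol = 2)
ET <- c("Ph", "Temp", "DBO")
sel <- suppressWarnings(select_super_sketch(superv, ET, c("DBO", "Ph", "XYZ")))
print(sel)
```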
Value
A data frame with two columns:
- x
First coordinate (e.g., PC1).
- y
Second coordinate (e.g., PC2).
Row names correspond to the selected variables defined in ETCal.
Examples
data(Tuis5_95, Tuis5_96, Tuis5_97, Tuis5_98)
labels <- c("95","96","97","98")
res <- statis.dual(list(Tuis5_95, Tuis5_96, Tuis5_97, Tuis5_98), labels.tables = labels)
ETCal <- c("Ph","Temp","DBO","ST","PO4","NO3","POD","Cal")
df_selected <- select.super.variables(res$supervariables, res$vars.names, ETCal)
STATIS Method
Description
Applies the STATIS method to a set of matrices (data tables) that share the same rows. STATIS is a multivariate analysis technique for studying the common structure and the evolution of individuals and variables across multiple tables.
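As a rough illustration of the idea, the base R sketch below follows one common textbook formulation of STATIS on simulated data: cross-product matrices per table, RV coefficients between tables, table weights from the first eigenvector, and a weighted compromise. All object names are illustrative, and the package's internal steps may differ from this outline.

```r
# Minimal base-R sketch of the STATIS idea on simulated data (not the package's code).
set.seed(42)
n <- 6
tables <- list(matrix(rnorm(n * 3), n), matrix(rnorm(n * 4), n), matrix(rnorm(n * 3), n))

# Cross-product (proximity) matrix W_k = X_k X_k' for each column-centered table.
W <- lapply(tables, function(X) {
  Xc <- scale(X, center = TRUE, scale = FALSE)
  tcrossprod(Xc)
})

# RV coefficient: normalized scalar product between two cross-product matrices.
rv <- function(A, B) sum(A * B) / sqrt(sum(A * A) * sum(B * B))
K <- length(W)
RV <- matrix(0, K, K)
for (i in seq_len(K)) for (j in seq_len(K)) RV[i, j] <- rv(W[[i]], W[[j]])

# Interstructure: the first eigenvector of RV gives table weights (rescaled to sum to 1).
e <- eigen(RV, symmetric = TRUE)
alpha <- abs(e$vectors[, 1])
alpha <- alpha / sum(alpha)

# Compromise: weighted average of the W_k; its eigen-decomposition gives
# coordinates of the "average" individuals (intrastructure).
compromise <- Reduce(`+`, Map(`*`, alpha, W))
ce <- eigen(compromise, symmetric = TRUE)
coords <- ce$vectors[, 1:2] %*% diag(sqrt(ce$values[1:2]))

print(round(RV, 3))
print(round(alpha, 3))
```

Tables that agree with the majority receive larger weights in `alpha`, so the compromise reflects the structure the tables have in common.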
Usage
statis(
matrices,
selected.tables = NULL,
selected.rows = NULL,
table.labels = NULL
)
Arguments
matrices |
List of numeric matrices (at least 2), all with the same number of rows (individuals). |
selected.tables |
Select a subset of tables. If |
selected.rows |
Select a subset of rows. If |
table.labels |
Optional vector with names for the tables. It must have the same length as the number of tables. |
Value
A list with the following elements:
n |
Number of rows (individuals). |
r |
Number of tables. |
p |
Vector with the number of columns per table. |
S |
List of centered matrices. |
W |
List of proximity matrices. |
X |
Matrix of interstructure (vectorization of W). |
acp.inter |
PCA results of the interstructure: eigenvalues, eigenvectors, components, correlations. |
XT |
Weighted average of the matrices. |
acp.intra |
PCA results of the average: eigenvalues, eigenvectors, components, correlations. |
IND |
Concatenated matrix with all W stacked (individual evolution). |
Omega |
Projection of individual evolution onto the principal components. |
circle.inter |
Data to plot the correlation circle between tables. |
circle.intra |
Data to plot the variable evolution circle. |
plane.individuals |
Data to plot the average individuals plane. |
plane.evolution |
Data to plot the evolution of the individuals. |
See Also
plot.statis.circle, plot.statis.plane
Examples
data(expert1, expert2, expert3)
labels <- c("Expert 1", "Expert 2", "Expert 3")
# If you want to select a specific table or tables
res <- statis(list(expert1, expert2, expert3), selected.tables = c(1, 3), table.labels = labels)
# If you want to select a specific row or rows
res <- statis(list(expert1, expert2, expert3), selected.rows = c(1, 5), table.labels = labels)
# If you want to select some tables and rows at the same time
res <- statis(list(expert1, expert2, expert3), selected.tables = c(1, 3),
              selected.rows = c(1, 4), table.labels = labels)
# All tables and rows selected
res <- statis(list(expert1, expert2, expert3), table.labels = labels)
# How to use res
inter <- res$circle.inter
plot.statis.circle(inter$points, inter$inertia, inter$labels, inter$title)
evolution <- res$plane.evolution
plot.statis.plane(evolution$points, evolution$inertia, evolution$labels, evolution$title)
STATIS DUAL Analysis
Description
Implementation of the STATIS DUAL method for the joint analysis of multiple tables that share the same variables. This approach allows evaluating the common structure between tables (interstructure), building a compromise (weighted average of structures), and analyzing the trajectories of variables across the tables.
Usage
statis.dual(tables, labels.tables = NULL)
Arguments
tables |
A list of matrices or data frames with the same columns (variables). Each element represents a "table". |
labels.tables |
A vector of length equal to the number of tables, used to name the tables in the results. If |
Details
The STATIS DUAL method allows:
Evaluating structural coherence across multiple tables using the interstructure
Constructing a representative compromise of the set of tables
Analyzing the behavior of the variables across the set (trajectories)
Internally, the tables are centered and normalized considering uniform observation weights. The R matrices capture the internal structure of each table. The interstructure is based on scalar products between these matrices.
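The outline above can be sketched in base R on simulated data. This is a minimal illustration that assumes the R matrices are ordinary correlation matrices and that the table weights come from the first eigenvector of the scalar-product matrix; the object names mirror the returned elements for readability only, and the package's internals may differ.

```r
# Minimal base-R sketch on simulated data; the package's internals may differ.
set.seed(7)
p <- 4
vars <- c("a", "b", "c", "d")
# Three tables with the same variables but different numbers of observations.
tables <- lapply(c(10, 12, 12), function(n) {
  matrix(rnorm(n * p), n, p, dimnames = list(NULL, vars))
})

# Centering and normalizing with uniform weights yields a correlation matrix per table.
R <- lapply(tables, cor)

# Interstructure: scalar products between the R matrices.
K <- length(R)
S <- matrix(0, K, K)
for (i in seq_len(K)) for (j in seq_len(K)) S[i, j] <- sum(R[[i]] * R[[j]])

# Table weights from the first eigenvector of S, rescaled to sum to 1.
es <- eigen(S, symmetric = TRUE)
beta <- abs(es$vectors[, 1])
beta <- beta / sum(beta)

# Table coordinates in the interstructure space (K x 2).
interstructure <- es$vectors[, 1:2] %*% diag(sqrt(pmax(es$values[1:2], 0)))

# Compromise: weighted combination of the R matrices.
Comp <- Reduce(`+`, Map(`*`, beta, R))

print(round(S, 3))
print(round(beta, 3))
```

Because the weights sum to 1 and every correlation matrix has a unit diagonal, the compromise in this sketch also has a unit diagonal and can be read like a correlation matrix.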
Value
A list with the following elements:
labels.tables |
Vector with the table labels |
interstructure |
K x 2 matrix with the coordinates of the tables in the interstructure space |
supervariables |
p x 2 matrix with the coordinates of the variables in the compromise |
trajectories |
List of p matrices, each K x 2 (one per variable), showing the variable's trajectory across the tables |
vars.names |
Names of the variables (common columns) |
S |
Interstructure similarity matrix |
R |
List of R matrices for each table |
Comp |
Compromise matrix (weighted combination of R matrices) |
eigenvalues.compromise |
Eigenvalues of the compromise (inertia per axis) |
eigenvectors.compromise |
Eigenvectors of the compromise |
beta.weights |
Weights of each table in the construction of the compromise |
See Also
plot.statis.dual.circle, plot.statis.dual.trajectories
Examples
data(Tuis5_95, Tuis5_96, Tuis5_97, Tuis5_98)
labels <- c("95","96","97","98")
res <- statis.dual(list(Tuis5_95, Tuis5_96, Tuis5_97, Tuis5_98), labels.tables = labels)
# How to use res
t <- ggplot2::ggtitle("Correlation (all variables)")
plot.statis.dual.circle(list(res$supervariables), labels = row.names(res$supervariables)) + t
t <- ggplot2::ggtitle("Trajectories (all variables)")
plot.statis.dual.trajectories(res$vars.names, res$trajectories, res$labels.tables) + t