
detectXOR: XOR pattern detection and visualization in R

Provides tools for detecting XOR-like patterns in variable pairs. Includes visualizations for pattern exploration.

Overview

Traditional feature selection methods often miss complex non-linear relationships where variables interact to produce class differences. The detectXOR package specifically targets XOR patterns - relationships where class discrimination only emerges through variable interactions, not individual variables alone.
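To make the idea concrete, here is a minimal base R sketch that simulates an XOR-like pattern. The data are synthetic and all object names (`xor_demo`, `p_xy`, and so on) are illustrative assumptions, not part of the package. Each variable alone carries no class information by construction, while the product of the two separates the classes.

```r
# Simulate a two-variable XOR pattern (synthetic data, for illustration only).
set.seed(42)
n <- 400
x <- runif(n, -1, 1)
y <- runif(n, -1, 1)

# Class 1 when the signs of x and y differ, class 0 when they agree.
cls <- as.integer((x > 0) != (y > 0))
xor_demo <- data.frame(x = x, y = y, class = factor(cls))

# Marginal view: compare each variable on its own between the two classes.
p_x <- wilcox.test(x[cls == 0], x[cls == 1])$p.value
p_y <- wilcox.test(y[cls == 0], y[cls == 1])$p.value

# Interaction view: the product x * y is positive in class 0 and negative in class 1.
xy <- x * y
p_xy <- wilcox.test(xy[cls == 0], xy[cls == 1])$p.value
```

In this construction the marginal distributions of `x` and `y` are the same in both classes, so only the interaction term can discriminate them.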

Key capabilities

🔍 XOR pattern detection - Statistical identification using χ² and Wilcoxon tests
📈 Correlation analysis - Class-wise Kendall τ coefficients
📊 Visualization - Spaghetti plots and decision boundary visualizations
⚡ Parallel processing - Multi-core acceleration for large datasets
🔬 Robust statistics - Winsorization and scaling options for outlier handling

Installation

Install the development version from GitHub:

# Install devtools if needed
if (!requireNamespace("devtools", quietly = TRUE)) { install.packages("devtools") }
# Install detectXOR
devtools::install_github("JornLotsch/detectXOR")

Dependencies

The package requires R ≥ 3.5.0 and depends on:
- dplyr, tibble (data manipulation)
- ggplot2, ggh4x, scales (visualization)
- future, future.apply, pbmcapply, parallel (parallel processing)
- reshape2, glue (data processing and string manipulation)
- DescTools (statistical tools)
- Base R packages: stats, utils, methods, grDevices

Optional packages (suggested):
- testthat, knitr, rmarkdown (development and documentation)
- doParallel, foreach (additional parallel processing options)
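If an installation fails, it can help to see which of the listed dependencies are not yet available. The snippet below is a small sketch using base R only; the variable names are illustrative, and `parallel` is left out because it ships with R.

```r
# Packages listed as dependencies in this README (packages shipped with R omitted).
deps <- c("dplyr", "tibble", "ggplot2", "ggh4x", "scales", "future",
          "future.apply", "pbmcapply", "reshape2", "glue", "DescTools")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise.
installed <- vapply(deps, requireNamespace, logical(1), quietly = TRUE)
missing_deps <- deps[!installed]

if (length(missing_deps) > 0) {
  message("Missing packages: ", paste(missing_deps, collapse = ", "))
}
```

Anything reported as missing can then be installed with `install.packages(missing_deps)`.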

Quick start

Basic XOR detection

library(detectXOR)
# Load example data
data(XOR_data)
# Detect XOR patterns with default settings
results <- detect_xor(XOR_data, class_col = "class")
# View summary
print(results$results_df)

Usage with custom parameters

# Detection with custom thresholds and parallel processing
results <- detect_xor(
  data = XOR_data,
  class_col = "class",
  p_threshold = 0.01,
  tau_threshold = 0.4,
  max_cores = 4,
  extreme_handling = "winsorize",
  scale_data = TRUE
)

Function parameters

detect_xor() - Main detection function

| Parameter | Type | Default | Description |
|---|---|---|---|
| data | data.frame | required | Input dataset with variables and class column |
| class_col | character | "class" | Name of the class/target variable column |
| check_tau | logical | TRUE | Compute class-wise Kendall τ correlations |
| compute_axes_parallel_significance | logical | TRUE | Perform group-wise Wilcoxon tests |
| p_threshold | numeric | 0.05 | Significance threshold for statistical tests |
| tau_threshold | numeric | 0.3 | Minimum absolute τ for "strong" correlation |
| abs_diff_threshold | numeric | 20 | Minimum absolute difference for practical significance |
| split_method | character | "quantile" | Tile splitting method: "quantile" or "range" |
| max_cores | integer | NULL | Maximum cores for parallel processing (auto-detect if NULL) |
| extreme_handling | character | "winsorize" | Outlier handling: "winsorize", "remove", or "none" |
| winsor_limits | numeric vector | c(0.05, 0.95) | Winsorization percentiles |
| scale_data | logical | TRUE | Standardize variables before analysis |
| use_complete | logical | TRUE | Use only complete cases (remove NA values) |
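As an illustration of what the `extreme_handling = "winsorize"`, `winsor_limits`, and `scale_data` options describe, the following base R sketch clamps a vector to its 5th and 95th percentiles and then standardizes it. This is an assumption about the general technique, not the package's internal code (which may rely on DescTools); `winsorize_vec` is a hypothetical helper.

```r
# Hypothetical helper: clamp values to the chosen quantiles (winsorization).
winsorize_vec <- function(v, limits = c(0.05, 0.95)) {
  q <- quantile(v, probs = limits, na.rm = TRUE, names = FALSE)
  pmin(pmax(v, q[1]), q[2])
}

set.seed(1)
v <- c(rnorm(98), 25, -30)       # 98 regular values plus two extreme ones

v_w <- winsorize_vec(v)          # extremes are pulled in to the 5th/95th percentiles
v_s <- as.numeric(scale(v_w))    # centered to mean 0 and scaled to unit standard deviation
```

Winsorizing before scaling keeps a few extreme observations from inflating the standard deviation used for standardization.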

Output structure

The main detection function returns a list with two components:

results_df - Summary data frame

| Column | Description |
|---|---|
| var1, var2 | Variable pair names |
| xor_shape_detected | Logical: XOR pattern identified |
| chi_sq_p_value | χ² test p-value for tile independence |
| tau_class_0, tau_class_1 | Class-wise Kendall τ coefficients |
| tau_difference | Absolute difference between class τ values |
| wilcox_p_x, wilcox_p_y | Wilcoxon test p-values for each axis |
| significant_wilcox | Logical: significant group differences detected |
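A common next step is to keep only the pairs flagged as XOR-like and rank them. The sketch below uses a mock data frame with a subset of the documented columns; the values are invented purely for illustration and the object names are hypothetical.

```r
# Mock summary table with some of the documented columns (made-up values).
mock_results_df <- data.frame(
  var1 = c("A", "A", "B"),
  var2 = c("B", "C", "C"),
  xor_shape_detected = c(TRUE, FALSE, TRUE),
  chi_sq_p_value = c(0.001, 0.40, 0.03),
  stringsAsFactors = FALSE
)

# Keep detected pairs and order them by the chi-squared p-value.
hits <- subset(mock_results_df, xor_shape_detected)
hits <- hits[order(hits$chi_sq_p_value), ]
```

With real output, `results$results_df` would take the place of `mock_results_df`.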

pair_list - Detailed results

Contains comprehensive analysis for each variable pair including:
- Tile pattern analysis results
- Statistical test outputs
- Processed data subsets
- Intermediate calculations

Visualization functions

| Function | Description | Key Parameters |
|---|---|---|
| generate_spaghetti_plot_from_results() | Creates connected line plots showing variable trajectories for XOR-detected pairs | results, data, class_col, scale_data = TRUE |
| generate_xy_plot_from_results() | Generates scatter plots with decision boundary lines for detected XOR patterns | results, data, class_col, scale_data = TRUE, quantile_lines = c(1/3, 2/3), line_method = "quantile" |

Both functions return ggplot objects that can be displayed or saved manually.

# Generate plots
generate_spaghetti_plot_from_results(results, XOR_data) 
generate_xy_plot_from_results(results, XOR_data)

Example plots

Reporting functions

| Function | Description | Key Parameters |
|---|---|---|
| generate_xor_reportConsole() | Creates console-friendly formatted report with optional plots | results, data, class_col, scale_data = TRUE, show_plots = TRUE |
| generate_xor_reportHTML() | Generates comprehensive HTML report with interactive elements | results, data, class_col, output_file, open_browser = TRUE |

Example report

# Generate formatted report 
generate_xor_reportHTML(results, XOR_data, class_col = "class")

The report is opened automatically in the system's default web browser.

Methodology

XOR detection pipeline

  1. Pairwise dataset creation - Extract all variable pairs with preprocessing
  2. Tile pattern analysis - Divide variable space into 2×2 tiles and test for XOR-like distributions
  3. Statistical validation - Apply χ² tests for independence and Wilcoxon tests for group differences
  4. Correlation analysis - Compute class-wise Kendall τ to quantify relationship strength
  5. Result aggregation - Combine findings into interpretable summary format
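Steps 2 to 4 can be sketched for a single variable pair with base R alone. This is a simplified illustration on synthetic data, not the package's implementation: the median split stands in for the tile rule (the actual cut points used by the package are not described here), and all object names are assumptions.

```r
set.seed(123)
n <- 400
x <- runif(n, -1, 1)
y <- runif(n, -1, 1)
cls <- factor(as.integer((x > 0) != (y > 0)))   # synthetic XOR labels

# Step 2: assign each point to one of 2x2 tiles, using the medians as cut points.
tile <- interaction(
  ifelse(x > median(x), "x_high", "x_low"),
  ifelse(y > median(y), "y_high", "y_low")
)

# Step 3a: chi-squared test of independence between tile and class.
chi_p <- chisq.test(table(tile, cls))$p.value

# Step 3b: Wilcoxon tests for class differences along each single axis.
wilcox_p_x <- wilcox.test(x[cls == "0"], x[cls == "1"])$p.value
wilcox_p_y <- wilcox.test(y[cls == "0"], y[cls == "1"])$p.value

# Step 4: Kendall tau between x and y, computed separately within each class.
tau_by_class <- sapply(levels(cls), function(k) {
  cor(x[cls == k], y[cls == k], method = "kendall")
})
```

For this synthetic pattern the tile test is strongly significant and the two class-wise τ values have opposite signs, which is the signature the pipeline looks for.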

Statistical tests

Use cases

Machine learning

Technical details

Cross-platform compatibility

Package structure

detectXOR/
├── R/                 # Package source code
├── man/               # Package documentation
├── data/              # Example dataset
├── issues/            # Problem reporting
└── analyses/          # Files used to generate or plot publication data sets (not in library)

Contributing

Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests on GitHub.

License

GPL-3

Citation

For citation details or to request a formal publication reference, please contact the maintainer.
