Automating Item Removal Strategies with ItemRest

Introduction

The ItemRest package is designed to automate the process of evaluating item removal strategies in Exploratory Factor Analysis (EFA). It helps identify low-quality items (those with low factor loadings or significant cross-loadings) and assesses the impact of their removal on the model’s overall fit and structure. This guide provides a step-by-step walkthrough of the package’s core functionalities.

1. Loading the Package

To begin the analysis, we first load the ItemRest package.

library(ItemRest)

2. Preparing Example Data

For this demonstration, we use the bfi (Big Five Inventory) dataset from the psych package. Its first 25 columns hold responses to 25 personality items, five for each of the five factors; the remaining columns are demographic variables, which we drop. To keep the example simple, we also remove all cases with missing data (listwise deletion).

# Ensure the 'psych' package is available
if (requireNamespace("psych", quietly = TRUE)) {
  data(bfi, package = "psych")
  
  # Select the personality items (first 25 columns)
  analysis_data <- bfi[, 1:25]
  
  # Omit rows with missing values for this example
  analysis_data <- na.omit(analysis_data)
  
  # View the first few rows of the prepared data
  head(analysis_data)
}
#>       A1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O2 O3 O4
#> 61617  2  4  3  4  4  2  3  3  4  4  3  3  3  4  4  3  4  2  2  3  3  6  3  4
#> 61618  2  4  5  2  5  5  4  4  3  4  1  1  6  4  3  3  3  3  5  5  4  2  4  3
#> 61620  5  4  5  4  4  4  5  4  2  5  2  4  4  4  5  4  5  4  2  3  4  2  5  5
#> 61621  4  4  6  5  5  4  4  3  5  5  5  3  4  4  4  2  5  2  4  1  3  3  4  3
#> 61622  2  3  3  4  5  4  4  5  3  2  2  2  5  4  5  2  3  4  4  3  3  3  4  3
#> 61623  6  6  5  6  5  6  6  6  1  3  2  1  6  5  6  3  5  2  2  3  4  3  5  6
#>       O5
#> 61617  3
#> 61618  3
#> 61620  2
#> 61621  5
#> 61622  3
#> 61623  1
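
Before dropping incomplete rows, it can be useful to check how much data listwise deletion costs. The sketch below is a minimal illustration in base R; missing_summary() is a hypothetical helper written for this guide, not part of ItemRest or psych, and the toy data frame only stands in for the real item responses.

```r
# Hypothetical helper (not part of ItemRest): summarise what na.omit() would remove
missing_summary <- function(df) {
  complete <- stats::complete.cases(df)
  list(
    n_total = nrow(df),                  # rows before deletion
    n_complete = sum(complete),          # rows that na.omit() keeps
    n_dropped = sum(!complete),          # rows that na.omit() removes
    na_per_column = colSums(is.na(df))   # missing values per item
  )
}

# Toy data standing in for the item responses
toy <- data.frame(
  A1 = c(2, NA, 5, 4),
  A2 = c(4, 4, NA, 4),
  A3 = c(3, 5, 5, 6)
)

summary_toy <- missing_summary(toy)
summary_toy$n_dropped
```

With the data from this guide, the same helper could be applied to bfi[, 1:25] before calling na.omit().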

3. Running the Analysis

With the data prepared, we can run the main itemrest() function. Following the Big Five model, we set the n_factors argument to 5; cor_method = "pearson" requests Pearson correlations.

# Run the analysis
if (exists("analysis_data")) {
  results <- itemrest(
    data = analysis_data,
    n_factors = 5,
    cor_method = "pearson"
  )
}
#> Loading required namespace: GPArotation
#> Warning in psych::alpha(data, check.keys = FALSE): Some items were negatively correlated with the first principal component and probably 
#> should be reversed.  
#> To do this, run the function again with the 'check.keys=TRUE' option
#> Some items ( A1 C4 C5 E1 E2 N1 N2 N3 N4 N5 O2 O4 O5 ) were negatively correlated with the first principal component and 
#> probably should be reversed.  
#> To do this, run the function again with the 'check.keys=TRUE' option
#> --- Settings and Descriptive Statistics ---
#> Number of Factors (Manual): 5
#> Number of Items: 25
#> Number of Observations: 2436
#> Minimum Value: 1
#> Maximum Value: 6
#> 
#> --- Initial EFA Results (No items removed) ---
#> Cronbach's Alpha: 0.525
#> Total Explained Variance: % 42.36
#> Low-loading Items: None
#> Cross-loading Items: E3, N4, O4
#> 
#> All Identified Low-Quality Items: E3, N4, O4
#> 
#> [Info] Testing 7 different removal combinations for low-quality items...
#>   |                                                                      |   0%
#> Warning in psych::alpha(data, check.keys = FALSE): Some items were negatively correlated with the first principal component and probably
#> should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#> Some items ( A1 C4 C5 E1 E2 N1 N2 N3 N4 N5 O2 O4 O5 ) were negatively correlated with the first principal component and
#> probably should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#>   |=========                                                             |  12%
#> Warning in psych::alpha(data, check.keys = FALSE): Some items were negatively correlated with the first principal component and probably
#> should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#> Some items ( A2 A3 A4 A5 C1 C2 C3 E4 E5 O1 O3 ) were negatively correlated with the first principal component and
#> probably should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#>   |==================                                                    |  25%
#> Warning in psych::alpha(data, check.keys = FALSE): Some items were negatively correlated with the first principal component and probably
#> should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#> Some items ( A1 C4 C5 E1 E2 N1 N2 N3 N5 O2 O4 O5 ) were negatively correlated with the first principal component and
#> probably should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#>   |==========================                                            |  38%
#> Warning in psych::alpha(data, check.keys = FALSE): Some items were negatively correlated with the first principal component and probably
#> should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#> Some items ( A1 C4 C5 E1 E2 N1 N2 N3 N4 N5 O2 O5 ) were negatively correlated with the first principal component and
#> probably should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#>   |===================================                                   |  50%
#> Warning in psych::alpha(data, check.keys = FALSE): Some items were negatively correlated with the first principal component and probably
#> should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#> Some items ( A1 C4 C5 E1 E2 N1 N2 N3 N5 O2 O4 O5 ) were negatively correlated with the first principal component and
#> probably should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#>   |============================================                          |  62%
#> Warning in psych::alpha(data, check.keys = FALSE): Some items were negatively correlated with the first principal component and probably
#> should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#> Some items ( A2 A3 A4 A5 C1 C2 C3 E4 E5 O1 O3 ) were negatively correlated with the first principal component and
#> probably should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#>   |====================================================                  |  75%
#> Warning in psych::alpha(data, check.keys = FALSE): Some items were negatively correlated with the first principal component and probably
#> should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#> Some items ( A1 C4 C5 E1 E2 N1 N2 N3 N5 O2 O5 ) were negatively correlated with the first principal component and
#> probably should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#>   |=============================================================         |  88%
#> Warning in psych::alpha(data, check.keys = FALSE): Some items were negatively correlated with the first principal component and probably
#> should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#> Some items ( A1 C4 C5 E1 E2 N1 N2 N3 N5 O2 O5 ) were negatively correlated with the first principal component and
#> probably should be reversed.
#> To do this, run the function again with the 'check.keys=TRUE' option
#>   |======================================================================| 100%

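As itemrest() runs, it prints messages to the console: the settings and descriptive statistics, the initial EFA results, the low-quality items it identified (here E3, N4, and O4, all flagged for cross-loading), and a progress bar while it tests the removal combinations. The repeated warnings come from psych::alpha(), which is called with check.keys = FALSE and reports that some items correlate negatively with the first principal component. The Cronbach's alpha values in the output are therefore computed without reverse-scoring any items.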
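
The count of 7 removal combinations in the log matches the number of non-empty subsets of the three flagged items (2^3 - 1 = 7). The sketch below enumerates those subsets in base R. How itemrest() builds its combinations internally is not shown in this guide, so this is an illustration under that assumption rather than the package's own code.

```r
# Items flagged in the initial EFA output above
flagged <- c("E3", "N4", "O4")

# All non-empty subsets, built size by size with combn()
removal_sets <- unlist(
  lapply(seq_along(flagged), function(k) combn(flagged, k, simplify = FALSE)),
  recursive = FALSE
)

# Label each subset the way the report does, e.g. "N4-O4"
labels <- vapply(removal_sets, paste, character(1), collapse = "-")

length(removal_sets)  # 2^3 - 1 = 7
labels
```

The number of combinations doubles with every additional flagged item, so run time can grow quickly when many items are identified as low-quality.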

4. Reviewing the Results

After the analysis is complete, the results object contains all the output. We can use the print() method to display the summary tables.

Optimal Strategy Report

By default, the print() function displays the “optimal” report. This table shows the strategies that resulted in a clean factor structure (no cross-loadings), sorted by the highest total explained variance.

if (exists("results")) {
  # Print the default optimal report
  print(results, report = "optimal")
}
#> 
#> ==============================
#>  Item Removal Strategy Report
#> ==============================
#> 
#> --- Optimal Removal Strategies (No Cross-Loadings) ---
#>   Removed_Items Total_Explained_Var Factor_Loading_Range Cronbachs_Alpha
#> 1         N4-O4             % 42.86            0.42-0.83           0.470
#> 2      E3-N4-O4             % 42.84            0.42-0.83           0.443
#>   Cross_Loading
#> 1            No
#> 2            No
#> 
#> -----------------------------------------------------
#> Final Reminder: Let algorithms be your compass, not your captain. Valid item removal also requires theoretical competence.

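In the table above, look for the row where “Cross_Loading” is “No” and “Total_Explained_Var” is highest. Here that is row 1: removing N4 and O4 yields 42.86% explained variance, up from 42.36% with all items, and leaves no cross-loadings. This is often the most statistically sound removal strategy, but as the report's final reminder says, the choice should also be defensible on theoretical grounds.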
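
The Cronbach's alpha values reported so far are low (0.525 for the full item set, 0.470 and 0.443 for the two optimal strategies). The psych::alpha() warnings printed during the run suggest a likely reason: items keyed in opposite directions were not reverse-scored. The sketch below illustrates that effect on simulated toy data, not on the bfi. cronbach_alpha() is a hypothetical helper that implements the textbook formula; it is not the routine ItemRest or psych uses.

```r
# Cronbach's alpha from its textbook formula:
# alpha = k / (k - 1) * (1 - sum(item variances) / variance of total score)
cronbach_alpha <- function(items) {
  k <- ncol(items)
  item_vars <- apply(items, 2, stats::var)
  total_var <- stats::var(rowSums(items))
  k / (k - 1) * (1 - sum(item_vars) / total_var)
}

# Simulated 6-point responses (toy data): one latent trait,
# three positively keyed and three negatively keyed items
set.seed(1)
n <- 500
trait <- rnorm(n)
keys <- c(1, 1, 1, -1, -1, -1)
toy <- sapply(keys, function(key) {
  raw <- 3.5 + 1.2 * key * trait + rnorm(n)
  pmin(pmax(round(raw), 1), 6)  # round and clip to the 1-6 scale
})
colnames(toy) <- paste0("item", 1:6)

# Reverse-score the negatively keyed items: (min + max) - x = 7 - x
toy_reversed <- toy
toy_reversed[, keys == -1] <- 7 - toy_reversed[, keys == -1]

alpha_raw <- cronbach_alpha(toy)
alpha_reversed <- cronbach_alpha(toy_reversed)
c(raw = alpha_raw, reversed = alpha_reversed)
```

In the toy data, alpha is far higher once the negatively keyed items are reversed. Whether reverse-scoring is appropriate for a given scale is a substantive decision that depends on the scale's documented keying.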

All Strategies Report

If you wish to see the results for every combination that was tested, you can set the report argument to "all".

if (exists("results")) {
  # Print the report for all tested strategies
  print(results, report = "all")
}
#> 
#> ==============================
#>  Item Removal Strategy Report
#> ==============================
#> 
#> --- Results of All Removal Strategies ---
#>   Removed_Items Total_Explained_Var Factor_Loading_Range Cronbachs_Alpha
#> 1         N4-O4             % 42.86            0.42-0.83           0.470
#> 2      E3-N4-O4             % 42.84            0.42-0.83           0.443
#> 3            O4             % 43.12            0.41-0.82           0.508
#> 4         E3-O4             % 43.11            0.42-0.83           0.492
#> 5          None             % 42.36            0.30-0.83           0.525
#> 6            E3             % 42.31            0.31-0.84           0.510
#> 7            N4             % 42.07            0.34-0.84           0.487
#> 8         E3-N4             % 42.02            0.33-0.84           0.462
#>   Cross_Loading
#> 1            No
#> 2            No
#> 3           Yes
#> 4           Yes
#> 5           Yes
#> 6           Yes
#> 7           Yes
#> 8           Yes
#> 
#> -----------------------------------------------------
#> Final Reminder: Let algorithms be your compass, not your captain. Valid item removal also requires theoretical competence.
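
The rule behind the optimal report (keep strategies without cross-loadings, then sort by explained variance) can be reproduced by hand from the table above. The sketch below transcribes the reported values into a data frame, because this guide does not show whether the results object exposes them directly; the data frame and its column types are therefore an assumption made for illustration.

```r
# Values transcribed from the "all" report above
strategies <- data.frame(
  Removed_Items = c("N4-O4", "E3-N4-O4", "O4", "E3-O4", "None", "E3", "N4", "E3-N4"),
  Total_Explained_Var = c(42.86, 42.84, 43.12, 43.11, 42.36, 42.31, 42.07, 42.02),
  Cronbachs_Alpha = c(0.470, 0.443, 0.508, 0.492, 0.525, 0.510, 0.487, 0.462),
  Cross_Loading = c("No", "No", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes"),
  stringsAsFactors = FALSE
)

# Keep strategies without cross-loadings, then sort by explained variance
clean <- strategies[strategies$Cross_Loading == "No", ]
optimal <- clean[order(clean$Total_Explained_Var, decreasing = TRUE), ]

optimal$Removed_Items[1]
```

Removing only O4 gives the highest explained variance overall (43.12%), but that solution still contains cross-loadings, so the filter excludes it and N4-O4 comes out on top.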

This guide has demonstrated how to use the ItemRest package to automate and evaluate the item removal process in an EFA.
