The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
Matrix Factorization (also known as Decomposition) is a technique used to reduce a complex dataset into a smaller set of fundamental “factors” or patterns. In the context of Formal Concept Analysis (FCA), this corresponds to finding a small subset of concepts that can explain or reconstruct the original data with minimal error.
Given an object-attribute matrix \(I\) of size \(n \times m\), we seek to factorize it into two matrices:
The goal is that the composition of these matrices approximates the original data: \[ I \approx A \circ B \]
fcaR implements two powerful algorithms for this
purpose:
Let’s use a fuzzy dataset describing different dog breeds and their characteristics. We want to see if we can reduce these breeds to a few “archetypes” (factors).
# Create a fuzzy matrix (6 breeds x 5 attributes)
I <- matrix(c(
0.9, 0.9, 0.0, 0.0, 0.2, # Labrador
0.8, 0.9, 0.1, 0.0, 0.1, # Golden Ret.
0.2, 0.2, 0.9, 0.9, 0.8, # German Shepherd
0.1, 0.1, 0.8, 0.9, 0.9, # Rottweiler
0.9, 0.2, 0.2, 0.1, 0.2, # Beagle
0.2, 0.1, 0.1, 0.1, 0.9 # Chihuahua
), nrow = 6, byrow = TRUE)
rownames(I) <- c("Labrador", "Golden Ret.", "G. Shepherd", "Rottweiler", "Beagle", "Chihuahua")
colnames(I) <- c("Friendly", "Playful", "Guard", "Aggressive", "Small")
fc <- FormalContext$new(I)
# Use Lukasiewicz logic for fuzzy operations
fc$use_logic("Lukasiewicz") We apply GreConD+. We can tune the weight
w (penalty for overcovering) or the stopping threshold.
# Factorize using GreConD+
factors <- fc$factorize(method = "GreConD", w = 1.0)
# The result contains two new FormalContext objects
A <- factors$object_factor
B <- factors$factor_attribute
print("Matrix A (Object-Factor):")
#> [1] "Matrix A (Object-Factor):"
print(A$incidence())
#> F1 F2 F3 F4 F5 F6 F7
#> Labrador 0.2 1.0 0.3 0.9 0.1 1.0 0.1
#> Golden Ret. 0.3 0.9 0.2 0.8 0.1 1.0 0.2
#> G. Shepherd 1.0 0.3 0.9 0.2 1.0 0.3 1.0
#> Rottweiler 1.0 0.2 1.0 0.1 1.0 0.2 0.9
#> Beagle 0.4 0.3 0.3 0.9 0.2 0.3 0.3
#> Chihuahua 0.3 0.2 1.0 0.2 0.2 0.2 0.2
print("Matrix B (Factor-Attribute):")
#> [1] "Matrix B (Factor-Attribute):"
print(B$incidence())
#> Friendly Playful Guard Aggressive Small
#> F1 0.1 0.1 0.8 0.8 0.8
#> F2 0.9 0.9 0.0 0.0 0.2
#> F3 0.1 0.1 0.1 0.1 0.9
#> F4 1.0 0.3 0.1 0.1 0.3
#> F5 0.1 0.1 0.8 0.9 0.8
#> F6 0.8 0.9 0.0 0.0 0.1
#> F7 0.2 0.2 0.9 0.8 0.8We can verify the quality of the decomposition by reconstructing the matrix using the Lukasiewicz product (\(A \circ B\)).
# Reconstruct I' = A o B
rec_I <- A$incidence() %*% B$incidence() # Note: standard matrix product is just an approximation
# For exact fuzzy reconstruction we would loop using the T-norm, but let's check the error:
# (In a real scenario, we use the logic's operators)
mae <- mean(abs(I - A$incidence() %*% B$incidence())) # Simplified check
# For exact reconstruction, GreConD+ guarantees I <= A o B if w is high.For large binary datasets, ASSO is a classic heuristic. It uses pairwise association confidence to generate candidate factors.
# Create a binary dataset
I_bin <- matrix(c(
1, 1, 1, 0, 0,
1, 1, 1, 0, 0,
0, 0, 0, 1, 1,
0, 0, 0, 1, 1,
1, 0, 0, 0, 1
), nrow = 5, byrow = TRUE)
rownames(I_bin) <- paste0("O", 1:5)
colnames(I_bin) <- paste0("A", 1:5)
fc_bin <- FormalContext$new(I_bin)
# Factorize using ASSO
# threshold: confidence threshold for candidate generation
res_asso <- fc_bin$factorize(method = "ASSO", threshold = 0.6)
print(res_asso$factor_attribute$incidence())
#> A1 A2 A3 A4 A5
#> F1 1 1 1 0 0
#> F2 0 0 0 1 1These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.