Hinton diagrams were introduced in the 1980s by Geoffrey Hinton, one of the founders of deep learning, as a practical tool for debugging neural network weights. They appeared in textbooks on neural networks and connectionist models and became a standard visualisation in that literature. Despite this long history, they remain underused in modern data analysis toolkits, in part because no convenient, ggplot2-native implementation existed.
gghinton aims to fix that.
Suppose you are training a neural network and you want to inspect a weight matrix: to understand which connections are large, which are small, and which are inhibitory versus excitatory. The standard tool is a heatmap:
library(ggplot2)
library(gghinton)

set.seed(7)
nr <- 10
nc <- 18
W <- matrix(rnorm(nr*nc, sd = 0.4), nrow = nr, ncol = nc)
rownames(W) <- paste0("neuron_", 1:nr)
colnames(W) <- paste0("input_", 1:nc)
# The standard heatmap approach
df <- as.data.frame(as.table(W))
names(df) <- c("row", "col", "value")
ggplot(df, aes(x = col, y = row, fill = value)) +
  geom_tile() +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red",
                       midpoint = 0) +
  coord_fixed() +
  theme_minimal() +
  theme(panel.grid = element_blank(),
        axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  labs(title = "Weight matrix as a heatmap")

This works, but it has weaknesses:
Is 0.35 more than twice 0.16? With colour, you can’t easily tell.

Now the same data as a Hinton diagram:
df_h <- matrix_to_hinton(W,
                         rowname_col = "row", colname_col = "col", value_col = "weight")
ggplot(df_h, aes(x = col, y = row, weight = weight)) +
  geom_hinton() +
  scale_fill_hinton() +
  scale_x_continuous(breaks = seq_along(colnames(W)), labels = colnames(W)) +
  scale_y_continuous(breaks = seq_along(rownames(W)), labels = rev(rownames(W))) +
  coord_fixed() +
  theme_hinton() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  labs(title = "Weight matrix as a Hinton diagram")

The key differences:
A large body of research in visual perception (Mackinlay 1986; Cleveland & McGill 1984) ranks visual encoding channels by how accurately humans can decode quantitative information. The consensus ranking for magnitude:
Heatmaps use colour saturation (the worst channel for magnitude). Hinton diagrams use area (a dramatically better channel). The improvement is most pronounced when:
0.01 to 0.99): tiny vs large squares are unmistakable; pale vs saturated blue is not.

For correlation matrices or weight matrices where sign matters, Hinton diagrams have an additional advantage. A heatmap must choose a diverging colour scheme, map its midpoint correctly to zero, and hope that readers can distinguish near-zero from slightly-positive from slightly-negative.
A Hinton diagram encodes sign with the most basic visual distinction possible: black vs white. There is no perceptual ambiguity.
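The two encodings described above (area for magnitude, black or white for sign) can be sketched in a few lines of base R. This is only an illustration: `hinton_encoding()` is a hypothetical helper, not part of gghinton, and it assumes the usual Hinton convention that a square's area is proportional to the absolute value relative to the largest magnitude shown. The scaling gghinton actually applies may differ in detail.

```r
# Hypothetical helper (not a gghinton function): maps values to the
# geometry of a Hinton square under the stated assumptions.
hinton_encoding <- function(values, max_abs = max(abs(values))) {
  rel_area <- abs(values) / max_abs          # share of the unit cell, in [0, 1]
  data.frame(
    value = values,
    area  = rel_area,
    side  = sqrt(rel_area),                  # side length that yields that area
    fill  = ifelse(values >= 0, "white", "black"),  # sign as white vs black
    stringsAsFactors = FALSE
  )
}

enc <- hinton_encoding(c(0.35, 0.16, -0.35))
print(enc)

# Under this assumption the area ratio equals the value ratio, so
# "is 0.35 more than twice 0.16?" can be read from the squares themselves.
area_ratio <- enc$area[1] / enc$area[2]
print(area_ratio)  # about 2.19
```

Note that the side length grows with the square root of the value, so it is the area, not the side, that should be compared by eye.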
set.seed(3)
# A small illustrative correlation matrix (hard-coded values, not estimated from data)
S <- matrix(c(
   1.00,  0.72, -0.35,  0.15,
   0.72,  1.00, -0.21,  0.08,
  -0.35, -0.21,  1.00, -0.58,
   0.15,  0.08, -0.58,  1.00
), 4, 4)
vars <- c("IQ", "Memory", "Anxiety", "Stress")
rownames(S) <- colnames(S) <- vars
df_cor <- matrix_to_hinton(S)
ggplot(df_cor, aes(x = col, y = row, weight = weight)) +
  geom_hinton() +
  scale_fill_hinton() +
  scale_x_continuous(breaks = 1:4, labels = vars) +
  scale_y_continuous(breaks = 1:4, labels = rev(vars)) +
  coord_fixed() +
  theme_hinton() +
  labs(title = "Correlation matrix",
       subtitle = "White = positive, black = negative")

Notice how the Anxiety-Stress negative correlation is immediately visible as a large black square, while the small positive IQ-Stress correlation is nearly absent.
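In practice a correlation matrix is usually estimated from data rather than typed in. The sketch below computes one with `cor()` and plots it with the same sequence of calls as the example above. The choice of the built-in `mtcars` data and of these four columns is an assumption made purely for illustration, and the gghinton calls are copied from the example above rather than checked against the package documentation.

```r
# Correlations among four numeric columns of the built-in mtcars data
# (dataset and columns chosen only for illustration).
car_vars <- c("mpg", "hp", "wt", "qsec")
cor_mat <- cor(mtcars[, car_vars])
print(round(cor_mat, 2))

# Plot only when both packages are installed; the calls mirror the
# correlation example above.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("gghinton", quietly = TRUE)) {
  library(ggplot2)
  library(gghinton)
  df_cars <- matrix_to_hinton(cor_mat)
  p <- ggplot(df_cars, aes(x = col, y = row, weight = weight)) +
    geom_hinton() +
    scale_fill_hinton() +
    scale_x_continuous(breaks = seq_along(car_vars), labels = car_vars) +
    scale_y_continuous(breaks = seq_along(car_vars), labels = rev(car_vars)) +
    coord_fixed() +
    theme_hinton() +
    labs(title = "mtcars correlations",
         subtitle = "White = positive, black = negative")
  print(p)
}
```

Because every correlation matrix has ones on its diagonal, the diagonal squares set the largest size and the off-diagonal squares are read relative to them.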
Hinton diagrams are not universally superior. Other visualisations may be better when: