Introduction

Questions are often taken from the Stack Overflow dendrogram tag.

How to colour the labels of a dendrogram by an additional factor variable

Asked [here](http://stackoverflow.com/questions/27485549/how-to-colour-the-labels-of-a-dendrogram-by-an-additional-factor-variable-in-r).

Solution: use the labels_colors function.

# install.packages("dendextend")
library(dendextend)

dend <- as.dendrogram(hclust(dist(USArrests[1:5,])))
# Like: 
# dend <- USArrests[1:5,] %>% dist %>% hclust %>% as.dendrogram

# By default, the labels of dend have no colors
labels_colors(dend)
## NULL
par(mfrow = c(1,2))
plot(dend, main = "Original dend")

# let's add some color:
labels_colors(dend) <- 1:5
# Now each state has a color
labels_colors(dend) 
##   Arkansas    Arizona California    Alabama     Alaska 
##          1          2          3          4          5
plot(dend, main = "A color for every state")


Instead of using 1:5, we can use colors based on another factor that describes the labels themselves. In that case, we must map between the order of the labels in the dendrogram and the order of the items in the original dataset. Here is another example, based on the iris dataset:

# install.packages("dendextend")
library(dendextend)

small_iris <- iris[c(1, 51, 101, 2, 52, 102), ]
dend <- as.dendrogram(hclust(dist(small_iris[,-5])))
# Like: 
# dend <- small_iris[,-5] %>% dist %>% hclust %>% as.dendrogram

# By default, the labels of dend have no colors
labels_colors(dend)
## NULL
par(mfrow = c(1,2))
plot(dend, main = "Original dend")

# let's add some color:
colors_to_use <- as.numeric(small_iris[,5])
colors_to_use
## [1] 1 2 3 1 2 3
# But sort them based on their order in dend:
colors_to_use <- colors_to_use[order.dendrogram(dend)]
colors_to_use
## [1] 1 1 2 2 3 3
# Now we can use them
labels_colors(dend) <- colors_to_use
# Now each observation has a color, based on its Species
labels_colors(dend) 
##   1   2  51  52 101 102 
##   1   1   2   2   3   3
plot(dend, main = "A color for every Species")

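The reordering step in this example can be checked with base R alone. The following is a minimal sketch, assuming only the stats and datasets packages that ship with R; the variable names (leaf_rows, species_in_leaf_order, and so on) are illustrative and not part of dendextend. It shows that order.dendrogram returns, for each leaf from left to right, the row of the original data that the leaf came from, so indexing any per-row vector with it puts that vector in leaf order.

```r
small_iris <- iris[c(1, 51, 101, 2, 52, 102), ]
dend <- as.dendrogram(hclust(dist(small_iris[, -5])))

# Row positions (within small_iris) of the leaves, read left to right
leaf_rows <- order.dendrogram(dend)

# Indexing a per-row vector by leaf_rows reorders it to match the leaves
labels_from_data <- rownames(small_iris)[leaf_rows]
species_in_leaf_order <- as.character(small_iris$Species)[leaf_rows]

# One row per leaf: the leaf label next to its species
data.frame(label = labels(dend), species = species_in_leaf_order)
```

A vector built this way (or a color vector derived from it) is in the order that labels_colors<- expects.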

How to color a dendrogram’s branches/labels based on cluster (i.e.: cutree result)

Use the color_branches and color_labels functions, with the k (or h) parameter:

# install.packages("dendextend")
library(dendextend)

dend <- as.dendrogram(hclust(dist(USArrests[1:5,])))
# Like: 
# dend <- USArrests[1:5,] %>% dist %>% hclust %>% as.dendrogram

dend1 <- color_branches(dend, k = 3)
dend2 <- color_labels(dend, k = 3)

par(mfrow = c(1,2))
plot(dend1, main = "Colored branches")
plot(dend2, main = "Colored labels")

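To see which leaves fall into which cluster before coloring anything, one option is base R's cutree on the hclust object. This is a minimal sketch assuming only the stats package; the variable names are illustrative. Note that the ids returned by cutree here are numbered in the order of the original data, so they need not increase from left to right across the plotted tree.

```r
hc <- hclust(dist(USArrests[1:5, ]))

# Cluster id for each state, in the order of the original data
clusters <- cutree(hc, k = 3)

# The same ids, rearranged to follow the leaves from left to right
clusters_in_leaf_order <- clusters[hc$order]
clusters_in_leaf_order
```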

Change dendrogram’s labels

Use the assignment function labels<-:

# install.packages("dendextend")
library(dendextend)

dend <- as.dendrogram(hclust(dist(USArrests[1:5,])))
# Like: 
# dend <- USArrests[1:5,] %>% dist %>% hclust %>% as.dendrogram

labels(dend)
## [1] "Arkansas"   "Arizona"    "California" "Alabama"    "Alaska"
labels(dend) <- 1:5
labels(dend)
## [1] 1 2 3 4 5
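As the output shows, the new labels are assigned in the left-to-right order of the leaves. For comparison, leaf labels can also be rewritten without dendextend by walking the tree with base R's dendrapply. The following is only a sketch of that idea, not a description of how labels<- is implemented in the package; upper_case_leaf is a hypothetical helper name.

```r
dend <- as.dendrogram(hclust(dist(USArrests[1:5, ])))

# Hypothetical helper: upper-case the label of every leaf, leave inner nodes alone
upper_case_leaf <- function(node) {
  if (is.leaf(node)) attr(node, "label") <- toupper(attr(node, "label"))
  node
}

# dendrapply applies the helper to every node and rebuilds the dendrogram
dend_upper <- dendrapply(dend, upper_case_leaf)
labels(dend_upper)
```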

Larger font for leaves in a dendrogram

Asked [here](http://stackoverflow.com/questions/26965390/larger-font-and-spacing-between-leaves-in-r-dendrogram).

Solution: use the set function, with the “labels_cex” parameter.

# install.packages("dendextend")
library(dendextend)

dend <- as.dendrogram(hclust(dist(USArrests[1:5,])))
# Like: 
# dend <- USArrests[1:5,] %>% dist %>% hclust %>% as.dendrogram

# By default, no label text size is set in dend (showing only the first leaf)
get_leaves_nodePar(dend)[[1]]
## [1] NA
par(mfrow = c(1,2), mar = c(10,4,4,2))
plot(dend, main = "Original dend")

# let's increase the size of the labels:
dend <- set(dend, "labels_cex", 2)
# Now each state has a larger label
get_leaves_nodePar(dend)[[1]]
## $lab.cex
## [1] 2
## 
## $pch
## [1] NA
plot(dend, main = "A larger font for labels")


(Note that changing the spacing between the labels is currently not implemented.)

How to view attributes of a dendrogram

Asked [here](http://stackoverflow.com/questions/26240200/how-to-access-attributes-of-a-dendrogram-in-r), and [here](http://stackoverflow.com/questions/25664911/r-hclust-height-of-final-merge).

It depends on which attribute we want to view. For “midpoint” (or “height”), use the get_nodes_attr function, passing the name of the attribute as the second argument.

# install.packages("dendextend")
library(dendextend)

dend <- as.dendrogram(hclust(dist(USArrests[1:5,])))
# Like: 
# dend <- USArrests[1:5,] %>% dist %>% hclust %>% as.dendrogram

# midpoint for all nodes
get_nodes_attr(dend, "midpoint")
## [1] 1.25   NA 1.50 0.50   NA   NA 0.50   NA   NA
# Maybe also the height:
get_nodes_attr(dend, "height")
## [1] 108.85   0.00  63.01  23.19   0.00   0.00  37.18   0.00   0.00
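To make the order of these values easier to read, here is a base R sketch that visits the nodes parent first, then each child from left to right, and collects one attribute along the way. It assumes only the stats package; collect_attr is a hypothetical helper written for illustration and is not a dendextend function, nor a claim about how get_nodes_attr is implemented.

```r
dend <- as.dendrogram(hclust(dist(USArrests[1:5, ])))

# Hypothetical helper: collect one attribute from every node, parent before children
collect_attr <- function(node, name) {
  value <- attr(node, name, exact = TRUE)
  if (is.null(value)) value <- NA       # the node does not carry this attribute
  if (is.leaf(node)) return(value)
  for (i in seq_along(node)) {          # recurse into each child, left to right
    value <- c(value, collect_attr(node[[i]], name))
  }
  value
}

heights <- collect_attr(dend, "height")
round(heights, 2)
```

The first element is the height of the root, which is the height of the final merge; leaves have height 0.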

To change an attribute, you can use the various assign functions from the package: assign_values_to_leaves_nodePar, assign_values_to_leaves_edgePar, assign_values_to_nodes_nodePar, assign_values_to_branches_edgePar, remove_branches_edgePar, and remove_nodes_nodePar.

How to color the branches in heatmap.2?

Asked [here](http://stackoverflow.com/questions/29265536/how-to-color-the-branches-and-tick-labels-in-the-heatmap-2?).

Solution: use the color_branches function (or the set function, with the “branches_k_color”, “k”, and “value” parameters).

(The data for this example comes from the [original SO question](http://stackoverflow.com/questions/29265536/how-to-color-the-branches-and-tick-labels-in-the-heatmap-2?).)

test <- test0
rnames <- test[,1] 
test <- data.matrix(test[,2:ncol(test)]) # to matrix
rownames(test) <- rnames                 
test <- scale(test, center = TRUE, scale = TRUE) # data standardization
test <- t(test) # transpose


## Creating a color palette & color breaks

my_palette <- colorRampPalette(c("forestgreen", "yellow", "red"))(n = 299)

col_breaks = c(seq(-1,-0.5,length=100),  # forestgreen
               seq(-0.5,0.5,length=100), # yellow
               seq(0.5,1,length=100))    # red

# distance & hierarchical clustering
distance = dist(test, method ="euclidean")    
hcluster = hclust(distance, method ="ward.D")


dend1 <- as.dendrogram(hcluster)

# Get the dendextend package
if(!require(dendextend)) install.packages("dendextend")
library(dendextend)
# get some colors
cols_branches <- c("darkred", "forestgreen", "orange", "blue")
# Set the colors of 4 branches
dend1 <- color_branches(dend1, k = 4, col = cols_branches)
# or with:
# dend1 <- set(dend1, "branches_k_color", k = 4, value = cols_branches)

# get the colors of the tips of the dendrogram:
# col_labels <- cols_branches[cutree(dend1, k = 4)] # this may need tweaking in various cases - the following is a more general solution.

col_labels <- get_leaves_branches_col(dend1)
# But due to the way heatmap.2 works - we need to fix it to be in the 
# order of the data!   
col_labels <- col_labels[order(order.dendrogram(dend1))]

dend1
# plot(dend1)
# a <- heights_per_k.dendrogram(dend1)
# library(dendextendRcpp)
# a2 <- heights_per_k.dendrogram(dend1)
# nleaves(dend1)


# Creating Heat Map
# if(!require(gplots)) install.packages("gplots")
library(gplots)
heatmap.2(test,  
    main = "test",
        trace="none",          
        margins =c(5,7),      
        col=my_palette,        
        breaks=col_breaks,     
        dendrogram="row",      
        Rowv = dend1,  
        Colv = FALSE, # do not reorder the columns
        key.xlab = "Concentration (index)",
        cexRow =0.6,
        cexCol = 0.8,
        na.rm = TRUE,
        RowSideColors = col_labels # to add nice colored strips     
      # colRow = col_labels # to add nice colored labels - only for gplots 2.17.0 and higher
        ) 
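The step col_labels[order(order.dendrogram(dend1))] can look cryptic, so here is a small base R sketch of what it does, using USArrests as a stand-in for the heatmap data (an assumption made for illustration, since test0 comes from the SO question). order.dendrogram gives the data row of each leaf, and order() of that vector is its inverse permutation, which sends a vector from leaf order back to data order.

```r
dend <- as.dendrogram(hclust(dist(USArrests[1:5, ])))

leaf_rows <- order.dendrogram(dend)  # data row of each leaf, left to right
in_leaf_order <- labels(dend)        # stands in for any per-leaf vector, e.g. leaf colors

# order(leaf_rows) is the inverse permutation: leaf order back to data order
in_data_order <- in_leaf_order[order(leaf_rows)]
in_data_order
```

Here in_data_order matches the row names of the input, which is the order in which heatmap.2 expects RowSideColors.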

For package developers - how to use imported functions from dendextend 0.18.3?

If you are developing a package and wish to use dendextend as an imported package (that is, without attaching it to the search path), you should run:

dendextend::assign_dendextend_options()
# This populates the dendextend::dendextend_options() space

Run this before using any of its functions (for example, dendextend::color_branches). As of dendextend version 1.0.0, this is no longer required.